Matrix Cayley Transform

Some time ago, in my research, I came across a matrix map that I find interesting. It is a generalisation of the complex map

z \mapsto \dfrac{z-i}{z+i}

defined on the complex projective line \mathbb{C}\mathrm{P}^1 (the Riemann sphere).

Let \mathbf{F} be a field. Given a skew-symmetric matrix X \in M_n\left(\mathbf{F}\right), i.e. X^T = -X, one can define the Cayley transform of X, denoted by C(X), as

C(X) = (I-X)(I+X)^{-1},

for all X such that I+X is invertible.

If we are somewhat more restrictive about \mathbf{F} and take it to be an ordered field, then I+X is invertible for every skew-symmetric matrix X \in M_n(\mathbf{F}). To see this, suppose it is not. Then there exists a non-zero \mathbf{u} \in \mathbf{F}^n such that (I+X)\mathbf{u} = \mathbf{0}, i.e. X\mathbf{u} = -\mathbf{u}. However,

\mathbf{u}^T\mathbf{u} = \left(-X\mathbf{u}\right)^T\mathbf{u} = -\mathbf{u}^TX^T\mathbf{u} = \mathbf{u}^TX\mathbf{u} = -\mathbf{u}^T\mathbf{u},

which gives 2\mathbf{u}^T\mathbf{u} = 0 and hence \mathbf{u}^T\mathbf{u} = 0. Since \mathbf{u}^T\mathbf{u} is a sum of squares and we are working in an ordered field, this forces \mathbf{u} = \mathbf{0}, a contradiction.
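The ordering really is needed here. As a quick illustration (the example is my own choice), take \mathbf{F} = \mathbf{C} and

X = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}, \qquad \det(I+X) = \det\begin{pmatrix} 1 & i \\ -i & 1 \end{pmatrix} = 1 + i^2 = 0,

so I+X is singular even though X is skew-symmetric. The kernel vector \mathbf{u} = (-i, 1)^T is non-zero but satisfies \mathbf{u}^T\mathbf{u} = 0, which is exactly the step of the argument that fails without an ordering.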

It can be shown that C(X) is an orthogonal matrix, i.e. C(X)^TC(X) = I where I is the identity matrix of the same size. It’s not hard to see this by direct computation:

\begin{aligned} C(X)^TC(X) &= \left(\left(I+X\right)^T\right)^{-1}\left(I-X\right)^T(I-X)(I+X)^{-1} \\ &= \left(I-X\right)^{-1}(I+X)(I-X)\left(I+X\right)^{-1} \\ &= \left(I-X\right)^{-1}(I-X)(I+X)\left(I+X\right)^{-1} \\ &= I. \end{aligned}

In the proof above, we used the fact that I-X commutes with I+X. Moreover, \det C(X) = 1 because

\det(I-X) = \det\left(I+X\right)^T = \det\left(I+X\right).
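As a sanity check on the two facts above, here is a minimal sketch that computes C(X) in exact arithmetic. It assumes \mathbf{F} = \mathbf{Q} and uses Python's `fractions.Fraction`. The helper names (`identity`, `matmul`, `transpose`, `inverse`, `det`, `cayley`) and the sample matrix are my own, chosen only for illustration.

```python
from fractions import Fraction


def identity(n):
    """n x n identity matrix with exact rational entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*A)]


def inverse(A):
    """Gauss-Jordan elimination over the rationals; raises if A is singular."""
    n = len(A)
    I = identity(n)
    # Augment A with the identity and reduce the left half to I.
    M = [list(A[i]) + I[i] for i in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]


def det(A):
    """Determinant by expansion along the first row (fine for small n)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = Fraction(0)
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total


def cayley(X):
    """C(X) = (I - X)(I + X)^{-1}, computed exactly."""
    n = len(X)
    I = identity(n)
    I_minus = [[I[i][j] - X[i][j] for j in range(n)] for i in range(n)]
    I_plus = [[I[i][j] + X[i][j] for j in range(n)] for i in range(n)]
    return matmul(I_minus, inverse(I_plus))


# A sample skew-symmetric matrix over Q (entries chosen arbitrarily).
X = [[Fraction(v) for v in row] for row in
     [[0, 1, 2],
      [-1, 0, 3],
      [-2, -3, 0]]]

C = cayley(X)
print(matmul(transpose(C), C) == identity(3))  # exact equality, no tolerance
print(det(C))
```

Because every entry is a `Fraction`, the orthogonality check is an exact equality rather than a floating-point comparison with a tolerance.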

Proposition. The map C is an involution, i.e. C^2 = \mathrm{id}.
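Proof sketch. This is one short way to see it, assuming that 2 is invertible in \mathbf{F} and reading C^2 as the same formula applied to C(X). Since

I + C(X) = \left[(I+X) + (I-X)\right](I+X)^{-1} = 2(I+X)^{-1}, \qquad I - C(X) = \left[(I+X) - (I-X)\right](I+X)^{-1} = 2X(I+X)^{-1},

we get

C(C(X)) = \left(I - C(X)\right)\left(I + C(X)\right)^{-1} = 2X(I+X)^{-1}\cdot\tfrac{1}{2}(I+X) = X.

In particular, I + C(X) is always invertible, so C(X) never has -1 as an eigenvalue.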

Example. If n = 2, then

X = \begin{pmatrix} 0 & a \\ -a & 0 \end{pmatrix}

and

C(X) = \dfrac{1}{1+a^2}\begin{pmatrix} 1-a^2 & -2a \\ 2a & 1-a^2 \end{pmatrix},

where a \in \mathbf{F} satisfies a^2 + 1 \neq 0. If n = 3, then

X = \begin{pmatrix} 0 & a & b \\ -a & 0 & c \\ -b & -c & 0 \end{pmatrix}

so

C(X) = \dfrac{1}{1+a^2+b^2+c^2}\begin{pmatrix} 1-a^2-b^2+c^2 & -2(a+bc) & 2(-b+ac) \\ 2(a-bc) & 1-a^2+b^2-c^2 & -2(c+ab) \\ 2(b+ac) & 2(c-ab) & 1+a^2-b^2-c^2\end{pmatrix},

for all a,b,c\in\mathbf{F} such that a^2+b^2+c^2+1\neq 0.

In Lie group and Lie algebra notation,

C \enspace \colon \enspace \mathfrak{o}(n) \longrightarrow \mathrm{SO}(n) \le \mathrm{O}(n),

where \mathfrak{o}(n) is the Lie algebra of skew-symmetric matrices equipped with the usual Lie bracket, \mathrm{O}(n) is the Lie group of orthogonal matrices, and \mathrm{SO}(n) is the subgroup of \mathrm{O}(n) consisting of orthogonal matrices with determinant 1. We can then think of the Cayley transformation as a map from \mathfrak{o}(n) into its Lie group \mathrm{O}(n).

This map is nice because it allows us to parametrise rotation matrices! Generally, if one wishes to study a Lie group, one might do so by studying its Lie algebra, which behaves much more nicely because we can do linear algebra stuff there.

One thing I want to mention about C is that it does not require infinite processes. The formula is precise and does not require approximation. Compare this with the more commonly studied exponential map \exp, which is also a map from a Lie algebra \mathfrak{g} to its Lie group G:

\exp(X) = \displaystyle\sum_{k\ge 0} \dfrac{X^k}{k!}

for X \in \mathfrak{g}. In this setting the underlying field is usually taken to be \mathbf{R}, so given X \in \mathfrak{g}, one needs to take a limit (which requires infinite processes) to compute \exp(X) exactly. It shares a similar property with the Cayley transformation: if X is skew-symmetric, then \exp(X) is orthogonal. In some ways the exponential map is stronger, because we have things like

\exp(aX)\exp(bX) = \exp\left((a+b)X\right),

for example, while we don’t have an analogue of this for the Cayley transformation. By that I mean it is not possible to find t \in \mathbf{F} as a function of r,s\in\mathbf{F} alone that satisfies

C(rX)C(sX) = C(tX)

for all X \in \mathfrak{o}(n). Nevertheless, this does not make the Cayley transformation less interesting. I feel that this map deserves to be better studied, especially because it might provide an alternative, more algebraic theory connecting Lie algebras with Lie groups.
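To make the claim about C(rX)C(sX) concrete, here is a short computation (my own sketch, not taken from a reference). All the factors below are polynomials in X, so they commute, and

C(rX)C(sX) = (I-rX)(I-sX)\left[(I+rX)(I+sX)\right]^{-1} = \left(I + rsX^2 - (r+s)X\right)\left(I + rsX^2 + (r+s)X\right)^{-1} = C(Y), \qquad Y = (r+s)X\left(I+rsX^2\right)^{-1},

whenever I + rsX^2 is invertible. The matrix Y is skew-symmetric, but in general it is not a scalar multiple of X. Even for n = 2, where X^2 = -a^2I, we get Y = tX with

t = \dfrac{r+s}{1-rsa^2},

which depends on a and hence on X. Over \mathbf{Q}, for instance, r = 1/2 and s = 1/3 give t = 1 when a = 1 but t = 5/2 when a = 2. Since C is injective (it is an involution), no t depending on r and s alone can work for all X.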

I will discuss a generalisation of this Cayley transform to a more general geometric setting in my next post. Roughly speaking (I will make this more rigorous in the next post), given a symmetric, non-degenerate matrix B, we can define a symmetric bilinear form \left(\mathbf{x},\mathbf{y}\right) = \mathbf{x}^TB\mathbf{y} for all \mathbf{x}, \mathbf{y} \in \mathbf{F}^n. This gives a new geometry, with the case B = I giving the Euclidean geometry back. The matrix B = \mathrm{diag}(1,1,\ldots,1,-1) gives relativistic geometry.

This generalised Cayley transform depends on B. We will also see modified notions of being skew-symmetric and orthogonal with respect to B in the next post.

Brisbane, 28 June 2019.