San José State University
Thayer Watkins
Silicon Valley & Tornado Alley
applet-magic.com
This is an introduction to the concepts and procedures of tensor analysis, using the more familiar methods and notation of matrices. First it is worthwhile to review the concept of a vector space and the space of linear functionals on a vector space.
A vector space is a set of elements V and a number of associated operations. There is an addition operation defined such that for any two elements u and v in V there is an element w=u+v. Furthermore there is an element of V, call it the zero vector 0, such that for any element u of V, u+0=u. And for any element u of V there is an element of V, say v, such that u+v=0. The element v is called the additive inverse of u and is denoted as −u. There is a set of scalars Λ for a vector space such that for any element v of V and any element λ of Λ there is an element of V, denoted as λv and called the scalar product of λ and v. Usually the set of scalars is the real numbers or the complex numbers.
For a vector space there is a set of elements of V, called a basis, such that any vector in V can be represented as a linear combination of the basis elements. The linear combination is given by the scalar coefficients for the basis elements. Thus any vector in V can be represented by an ordered array of scalar elements, which are called the components of the vector. Such an ordered array can be considered a column vector of the scalar elements. The basis of a vector space may or may not be finite.
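As a concrete illustration, the components of a vector with respect to a basis can be found by solving a linear system. The following is a minimal Python sketch; the basis and the vector are hypothetical choices made only for the example.

```python
import numpy as np

# A hypothetical basis for 2-dimensional space, given as the columns of B.
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# An arbitrary vector v in that space.
v = np.array([3.0, 2.0])

# The components c of v with respect to the basis solve B c = v.
c = np.linalg.solve(B, v)
print(c)        # [1. 2.]  i.e., v = 1*b1 + 2*b2
print(B @ c)    # reproduces v
```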
Linear functionals on the vector space are functions which map the elements of V into the set of scalars Λ. Addition and scalar multiplication are defined on the set of linear functionals, and likewise a zero element and additive inverses. Thus the linear functionals over the set of vectors V form a vector space of their own, called the dual space of V.
If the vector space V is of finite dimension n then so is its dual space, and thus the dual space has the same structure as V. If the vector space V is not of finite dimension then the dual space is not necessarily of the same structure.
For the vector space of n-dimensional column vectors the dual space may also be thought of as n-dimensional vectors. Then the linear functionals may be considered as matrix products; i.e., if X is an element of V and F is an element of the dual space then the linear functional is the matrix product of the transpose of F with X; i.e.,

z = FᵀX
Now consider a change in the coordinate system for V such that the vector of components X gets changed into the vector of components Y by multiplication by an invertible matrix M; i.e.,

Y = MX
The linear functional z = FᵀX gets transformed to z = GᵀY. Since Y = MX and X = M⁻¹Y,

z = FᵀM⁻¹Y = ((Mᵀ)⁻¹F)ᵀY

and therefore

G = (Mᵀ)⁻¹F
The dual vectors get transformed by the inverse of the transpose of the transformation of the vectors. Thus there are some vectors which get transformed by one rule and other vectors which get transformed by an associated alternate rule. A similar arrangement occurs in tensor analysis, in which some tensors are called covariant and transform according to one rule and others are called contravariant and transform according to an alternate but related rule.
It is very tempting to identify the vectors of the dual space as row vectors. This would illustrate how the dual space could have the same mathematical structure as the primal (original) space yet be distinct. However there is no provision in tensor analysis for identifying one vector as a column vector and another one as a row vector.
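The invariance of the functional's value under this pair of rules can be checked numerically. The sketch below uses an arbitrary matrix M (assumed invertible, which holds for this random seed) and arbitrary X and F:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
M = rng.random((n, n)) + np.eye(n)   # change of coordinates, assumed invertible
X = rng.random(n)                    # a vector of V
F = rng.random(n)                    # a linear functional on V

Y = M @ X                            # transformed vector components
G = np.linalg.inv(M.T) @ F           # dual vector transformed by the inverse transpose

# The value of the functional is unchanged by the change of coordinates.
print(F @ X, G @ Y)                  # the two numbers agree
```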
A tensor is an entity which is represented in any coordinate system by an array of numbers called its components. The components change from coordinate system to coordinate system in a systematic way described by rules. The arrays of numbers are not the tensor; they are only the representation of the tensor in a particular coordinate system. A vector is a special case of a tensor. A vector is an entity which has direction and magnitude and is represented by a one dimensional array of numbers. Unfortunately it is common to consider any one dimensional array of numbers as a vector.
A more general notion of a vector involves not only its direction and magnitude but also its point of application. Thus a vector can be represented by two points: one point representing its point of application and the line between the points giving its direction and magnitude. The points in n-dimensional space can be considered invariant, but their representations in a coordinate system are given by two n-dimensional arrays of numbers, say X1 and X2. If the coordinate system is changed by shifting the origin (translation), rotating the axes and/or stretching the axes (dilation), then the components of the representation of the points change. The translation of the origin is given by an n-dimensional array, say A, and the rotation and dilation by a matrix, say B. The new representation of a point P is Y, where

Y = A + BX

and X is the representation in the old coordinate system.
The allowable transformations are the ones which have inverses; i.e., those for which

X = B⁻¹(Y − A)
The conditions for the existence of an inverse transformation can be stated in terms of the Jacobian determinant for the transformation. Let Ji,j = ∂yi/∂xj and let J be the matrix formed from the Ji,j's. The condition for the existence of an inverse transformation within some region R is that det(J) ≠ 0 within that region.
If the x-coordinates are transformed to y-coordinates and the y-coordinates are then transformed to z-coordinates, then the matrix for the transformation of the x-coordinates to the z-coordinates is just the product of the matrices for the two intermediate transformations; i.e., if J is the matrix for the transformation from the x's to the y's and K is the matrix of the transformation from the y's to the z's, then by the chain rule the matrix of the transformation from the x's to the z's is the product KJ. Thus the Jacobian for the transformation from the x's to the z's is det(K)det(J), and hence if det(J) and det(K) are both nonzero in some region R then so is det(KJ). Thus if both of the intermediate transformations are allowable then so is their composition.
There is the trivial transformation yi=xi, called the identity transformation. The Jacobian of this transformation is equal to unity.
Since the composition of a transformation with its inverse is just the identity transformation, it follows that the product of the Jacobian of a transformation with the Jacobian of the inverse transformation is equal to unity, and hence det(J⁻¹) = det(J)⁻¹.
All of this adds up to the mathematically interesting proposition that the set of allowable transformations of coordinate systems forms a group.
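These group properties are easy to verify numerically for linear transformations. A minimal sketch, using two hypothetical Jacobian matrices J and K:

```python
import numpy as np

# Jacobian matrices of two hypothetical transformations at some point.
J = np.array([[2.0, 1.0],
              [0.0, 1.0]])   # x -> y
K = np.array([[1.0, 0.0],
              [3.0, 2.0]])   # y -> z

KJ = K @ J                   # chain rule: Jacobian of the composition x -> z

# The determinant of the composition is the product of the determinants,
# so the composition of two allowable transformations is allowable.
print(np.linalg.det(KJ), np.linalg.det(J) * np.linalg.det(K))

# The identity transformation has Jacobian I with determinant 1, and
# det(J^-1) = 1/det(J).
print(np.linalg.det(np.linalg.inv(J)), 1 / np.linalg.det(J))
```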
Illustration of a Transformation and its Jacobian Matrix

The transformation from cylindrical coordinates to Euclidean coordinates will be used for the illustration. First these coordinates will be given in their more familiar form and then converted to the subscripted notation. The cylindrical coordinates are (r, θ, z) and the Euclidean coordinates are (x, y, z), where

x = r cos(θ)
y = r sin(θ)
z = z

In subscripted notation,

x1 = r
x2 = θ
x3 = z

and

y1 = x
y2 = y
y3 = z

Thus the transformation is

y1 = x1 cos(x2)
y2 = x1 sin(x2)
y3 = x3

Therefore the elements of the Jacobian matrix Ji,j = ∂yi/∂xj are

J1,1 = cos(x2)    J1,2 = −x1 sin(x2)    J1,3 = 0
J2,1 = sin(x2)    J2,2 = x1 cos(x2)     J2,3 = 0
J3,1 = 0          J3,2 = 0              J3,3 = 1

The Jacobian determinant reduces to x1[cos²(x2) + sin²(x2)] and hence det(J) = x1. In the more familiar notation the Jacobian matrix is

| cos(θ)   −r sin(θ)   0 |
| sin(θ)    r cos(θ)   0 |
| 0         0          1 |

and the Jacobian of the transformation is r. The Jacobian matrix of the inverse transformation, the inverse of the above matrix, is then

|  cos(θ)     sin(θ)     0 |
| −sin(θ)/r   cos(θ)/r   0 |
|  0          0          1 |
Its determinant is 1/r, which of course is the reciprocal of the determinant of the Jacobian matrix of the original transformation.
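This computation can be reproduced symbolically. A minimal sketch using the sympy library:

```python
import sympy as sp

r, theta, z = sp.symbols('r theta z', positive=True)

# The transformation from cylindrical to Euclidean coordinates.
y = sp.Matrix([r*sp.cos(theta), r*sp.sin(theta), z])

# Jacobian matrix J[i,j] = dy_i/dx_j with (x1, x2, x3) = (r, theta, z).
J = y.jacobian([r, theta, z])
print(J)                           # [[cos θ, -r sin θ, 0], [sin θ, r cos θ, 0], [0, 0, 1]]
print(sp.simplify(J.det()))        # r
print(sp.simplify(J.inv().det()))  # 1/r
```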
Let X be a representation of a vector in n-dimensional space and let h=f(X) be a scalar function of X.
If the X representation is transformed to another coordinate system by the transformation T(), such that
Y=T(X), where T is a vector-valued function, then the scalar function f(X) gets transformed into
g(Y), where g(Y) = f(T⁻¹(Y)). For simplicity let the inverse transformation T⁻¹(Y) be represented as S(Y). Thus g(Y) = f(S(Y)).
Consider the vector whose components are {∂f/∂xi, i=1,…,n}.
This is called the gradient of the scalar function f. Now consider the gradient of the
scalar function g(Y); i.e., {∂g/∂yj, j=1,…,n}. How are the
components of the gradient of g related to the components of the gradient of f? Calculus gives the answer as

∂g/∂yj = Σi (∂f/∂xi)(∂xi/∂yj)
Summations arise almost always when there is a repeated index, such as i in the above equation. Albert Einstein promoted the convention that, since a repeated index almost always means summation over that index, the summation sign Σ can be dispensed with. Therefore in tensor notation the above equation is

∂g/∂yj = ∂f/∂xi(∂xi/∂yj)

or, in better form,

∂g/∂yj = (∂xi/∂yj)(∂f/∂xi)
Let (∂xi/∂yj) be denoted as the (i,j) element of a matrix M and let (∂f/∂xi) and (∂g/∂yj) be the components of the vectors ∇xf and ∇yf, respectively. Then in matrix notation the above transformation of the gradient vector is represented as

∇yf = Mᵀ∇xf
Any entity whose components transform in this way is called a covariant tensor of
rank 1. There are other entities which transform in a similar but different way.
Consider the differentials dyj and dxi for i and j ranging from 1 to n. From calculus,

dyj = Σi (∂yj/∂xi)dxi

or, in tensor notation,

dyj = (∂yj/∂xi)dxi
The quantities (∂yj/∂xi) are the elements of what previously had been referred to as the Jacobian matrix of the transformation. Denoting the Jacobian matrix as J, the transformation of the vector of infinitesimals dX to dY is represented as

dY = JdX
This transformation is called contravariant and a vector transforming in this way
would be called a contravariant tensor of rank 1.
What was denoted above as the matrix M is the Jacobian matrix of the inverse transformation; i.e., M = J⁻¹. The covariant transformation of the gradient vectors expressed previously could therefore be represented as

∇yf = (Jᵀ)⁻¹∇xf
These two rules for transformation, covariant and contravariant, are the defining characteristic of tensors. They can be extended to multiple indices. The indices for covariance are conventionally denoted as subscripts and those for contravariance as superscripts.
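Both rules can be checked symbolically for the transformation from polar coordinates (taken as the X's) to Euclidean coordinates (taken as the Y's). The scalar function and the path used to generate differentials below are arbitrary choices made only for the check:

```python
import sympy as sp

r, theta = sp.symbols('r theta', positive=True)

# Old coordinates X = (r, theta); new coordinates Y = T(X) (Euclidean).
u = r*sp.cos(theta)
v = r*sp.sin(theta)
J = sp.Matrix([u, v]).jacobian([r, theta])     # J[j,i] = dy_j/dx_i

# A hypothetical scalar function given in the new coordinates,
# and the same function expressed in the old coordinates.
yu, yv = sp.symbols('yu yv')
f = yu**2 + yu*yv
g = f.subs({yu: u, yv: v})

grad_new = sp.Matrix([f.diff(yu), f.diff(yv)]).subs({yu: u, yv: v})
grad_old = sp.Matrix([g.diff(r), g.diff(theta)])

# Covariant rule for gradients: grad_new = (J^T)^-1 grad_old.
print(sp.simplify(grad_new - (J.T).inv() * grad_old))        # zero vector

# Contravariant rule for differentials: dY = J dX, using the
# velocity of a hypothetical path in the old coordinates.
t = sp.symbols('t')
rp, thp = 1 + t, t
dX = sp.Matrix([sp.diff(rp, t), sp.diff(thp, t)])
dY = sp.Matrix([u, v]).subs({r: rp, theta: thp}).diff(t)
print(sp.simplify(dY - J.subs({r: rp, theta: thp}) * dX))    # zero vector
```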
The Metric Tensor

One scalar quantity of importance in geometry is distance. In Euclidean coordinates for an n-dimensional space the formula for the length ds of an infinitesimal line segment is

(ds)² = Σi (dxi)²

or, using the summation convention,

(ds)² = dxidxi

If dX represents the column vector of dxi's then the formula for the line length could be expressed as

(ds)² = (dX)ᵀ(dX)
If the coordinate system is changed from the Euclidean X's to some coordinate system of Y's then

dxi = (∂xi/∂yj)dyj

In matrix notation this may be expressed as

dX = MdY

where M is the matrix whose mi,j element is (∂xi/∂yj). Thus

(dX)ᵀ = (dY)ᵀMᵀ

and hence

(ds)² = (dY)ᵀMᵀM dY = (dY)ᵀG dY

where the matrix G is MᵀM. The elements of G are given by

gi,k = (∂xj/∂yi)(∂xj/∂yk)

with summation over the repeated index j.
The two dimensional array of the gi,k's is called the metric tensor. That it is in fact a tensor, and a covariant one at that, is something that needs to be proven. It is the metric tensor for the coordinate system Y. The metric tensor for the Euclidean coordinate system is such that gi,k = δi,k, where δi,k = 0 if i≠k and δi,k = 1 if i=k.
For cylindrical coordinates the inverse transform of Euclidean to cylindrical, i.e., cylindrical to Euclidean, is, in familiar notation,

x = r cos(θ)
y = r sin(θ)
z = z

Thus the elements of the matrix M are

m1,1 = ∂x/∂r = cos(θ),   m1,2 = ∂x/∂θ = −r sin(θ),   m1,3 = ∂x/∂z = 0
m2,1 = ∂y/∂r = sin(θ),   m2,2 = ∂y/∂θ = r cos(θ),    m2,3 = ∂y/∂z = 0
m3,1 = ∂z/∂r = 0,        m3,2 = ∂z/∂θ = 0,           m3,3 = ∂z/∂z = 1

so the matrix M is

| cos(θ)   −r sin(θ)   0 |
| sin(θ)    r cos(θ)   0 |
| 0         0          1 |

and its transpose is

|  cos(θ)     sin(θ)    0 |
| −r sin(θ)   r cos(θ)  0 |
|  0          0         1 |

Thus the metric matrix G = MᵀM is

| 1   0    0 |
| 0   r²   0 |
| 0   0    1 |

The length of an infinitesimal line element is then

(ds)² = (dr)² + (r dθ)² + (dz)²
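This computation of G and of the line element can be reproduced symbolically:

```python
import sympy as sp

r, theta, z = sp.symbols('r theta z', positive=True)
Y = [r, theta, z]                      # cylindrical coordinates

# Euclidean coordinates as functions of the cylindrical ones.
X = sp.Matrix([r*sp.cos(theta), r*sp.sin(theta), z])

M = X.jacobian(Y)                      # m[i,j] = dx_i/dy_j
G = sp.simplify(M.T * M)               # metric tensor G = M^T M
print(G)                               # diag(1, r**2, 1)

# The line element (ds)^2 = (dr)^2 + (r dtheta)^2 + (dz)^2 follows from G.
dY = sp.Matrix(sp.symbols('dr dtheta dz'))
print(sp.expand((dY.T * G * dY)[0]))   # dr**2 + r**2*dtheta**2 + dz**2
```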
The inverse of the metric tensor is also significant. In the case of the cylindrical coordinate system this inverse G⁻¹ is

| 1   0      0 |
| 0   1/r²   0 |
| 0   0      1 |

In tensor analysis the metric tensor is denoted as gᵢⱼ and its inverse is denoted as gⁱʲ. This latter notation suggests that the inverse has something to do with contravariance.
For a column vector X in the Euclidean coordinate system its components in another coordinate system are given by Y = MX. Now consider G⁻¹X. Since G = MᵀM,

G⁻¹ = M⁻¹(Mᵀ)⁻¹ = M⁻¹(M⁻¹)ᵀ

Thus the transformation of G⁻¹X would be given by

Y = MG⁻¹X = M(M⁻¹(M⁻¹)ᵀ)X = (M⁻¹)ᵀX

but this is equivalent to

Y = (Mᵀ)⁻¹X

This is the transformation rule for a contravariant tensor. It is also the rule for what in the Introduction was referred to as a vector of the dual space. Thus multiplication of a covariant tensor by the contravariant metric tensor creates a contravariant tensor. The operation also works in the other direction: multiplication of a contravariant tensor by the metric tensor produces a covariant tensor.
Suppose the transformation of the coordinate system is such that Y = MX; then the linear functional H in the new coordinate system corresponding to F in the original coordinate system is given by

H = (Mᵀ)⁻¹F

Now consider GF, where G = MᵀM. Then

H = (Mᵀ)⁻¹GF = (Mᵀ)⁻¹(MᵀM)F

which reduces to

H = MF

This is the transformation rule for a covariant tensor. Thus multiplication of a contravariant tensor by the metric tensor produces a covariant tensor.
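Numerically, raising and lowering indices at a sample point is just multiplication by G⁻¹ and G. A sketch using hypothetical covariant components at the point r = 2 of the cylindrical system:

```python
import numpy as np

r = 2.0                              # a sample point in cylindrical coordinates
G = np.diag([1.0, r**2, 1.0])        # metric tensor there
Ginv = np.linalg.inv(G)              # contravariant (inverse) metric

A_cov = np.array([1.0, 3.0, -2.0])   # hypothetical covariant components

A_contra = Ginv @ A_cov              # raising the index with the inverse metric
print(A_contra)                      # [ 1.    0.75 -2.  ]

print(G @ A_contra)                  # lowering with the metric recovers A_cov
```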
The Christoffel Symbols

The Christoffel 3-index symbol of the first kind is defined as

[ij,k] = ½[∂gik/∂xj + ∂gjk/∂xi − ∂gij/∂xk]

A property of these symbols, obvious from the definition, is that

[ij,k] = [ji,k]
When the Christoffel symbols of the first kind are multiplied by the elements of the inverse of the metric tensor and the results summed over the third index, a different set of symbols is generated, called the Christoffel 3-index symbols of the second kind. There are different notations used for these symbols, but the typographically most convenient is, in analogy with the symbols of the first kind, {ij,k}. I. S. Sokolnikoff in his Tensor Analysis uses a two-level symbol impossible to create in HTML. The Princeton school uses a symbol of the form Γᵏᵢⱼ, but the superscripts and subscripts cannot be properly lined up using HTML. Therefore {ij,k} will be used here. The definition is

{ij,k} = gᵏᵖ[ij,p]

where, of course, there is a summation over the index p.
It follows from the symmetry of [ij,k] with respect to the interchange of the first two indices that

{ij,k} = {ji,k}
The Christoffel symbols are essential in defining the derivative of tensors. However the Christoffel
symbols themselves do not represent a tensor. That is to say they do not transform upon a change
in coordinate system according to either the covariant or contravariant rules of transformation.
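The symbols are straightforward to compute from the definition. The following sketch computes the nonzero Christoffel symbols of the second kind for the cylindrical metric derived above, with indices 1, 2, 3 corresponding to r, θ, z:

```python
import sympy as sp

r, theta, z = sp.symbols('r theta z', positive=True)
y = [r, theta, z]
g = sp.diag(1, r**2, 1)              # metric tensor for cylindrical coordinates
ginv = g.inv()
n = 3

# Christoffel symbols of the first kind [ij,k].
first = [[[sp.Rational(1, 2)*(g[i, k].diff(y[j]) + g[j, k].diff(y[i]) - g[i, j].diff(y[k]))
           for k in range(n)] for j in range(n)] for i in range(n)]

# Christoffel symbols of the second kind {ij,k} = g^kp [ij,p].
second = [[[sp.simplify(sum(ginv[k, p]*first[i][j][p] for p in range(n)))
            for k in range(n)] for j in range(n)] for i in range(n)]

for i in range(n):
    for j in range(n):
        for k in range(n):
            if second[i][j][k] != 0:
                print(f'{{{i+1}{j+1},{k+1}}} =', second[i][j][k])
# {22,1} = -r  and  {12,2} = {21,2} = 1/r
```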
Differentiation of Tensors

Previously it was noted that the gradient vector for a scalar function is a covariant tensor. The gradient is the set of partial derivatives of a scalar function, say f(x1, …, xn); i.e., (∂f/∂xi) for i=1 to n. Consider now the second derivatives of the scalar function,

∂²f/∂xj∂xi
When the coordinate system is changed from X's to Y's the gradient components transform according to the tensor formula

∂f/∂yi = (∂f/∂xp)(∂xp/∂yi)

However, the formula for the transformation of the second derivatives is

∂²f/∂yj∂yi = (∂/∂yj)((∂f/∂xp)(∂xp/∂yi))
= (∂²f/∂xp∂xq)(∂xq/∂yj)(∂xp/∂yi) + (∂f/∂xp)(∂²xp/∂yj∂yi)
If there were only the first term on the right-hand side then the transformation would be tensorial
but the second term spoils it.
E. B. Christoffel derived an important formula concerning the second derivatives in a coordinate transformation, namely

(∂²xm/∂yj∂yi) = y{ij,r}(∂xm/∂yr) − x{pq,m}(∂xp/∂yj)(∂xq/∂yi)

where x{ij,k} and y{ij,k} stand for the Christoffel symbols of the second kind in the X coordinate system and the Y coordinate system, respectively.
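Since the Christoffel symbols vanish in Euclidean coordinates, for the polar-coordinate transformation Christoffel's formula reduces to (∂²xm/∂yj∂yi) = y{ij,r}(∂xm/∂yr), which can be verified symbolically:

```python
import sympy as sp

r, theta = sp.symbols('r theta', positive=True)
y = [r, theta]
x = sp.Matrix([r*sp.cos(theta), r*sp.sin(theta)])   # Euclidean x as functions of y

M = x.jacobian(y)                                   # dx_m/dy_r
g = sp.simplify(M.T * M)                            # metric in the y coordinates
ginv = g.inv()
n = 2

def gamma2(i, j, k):
    """Christoffel symbol of the second kind {ij,k} in the y coordinates."""
    return sp.simplify(sum(ginv[k, p]*sp.Rational(1, 2) *
                           (g[i, p].diff(y[j]) + g[j, p].diff(y[i]) - g[i, j].diff(y[p]))
                           for p in range(n)))

# In Euclidean coordinates the Christoffel symbols vanish, so Christoffel's
# formula reduces to  d2x_m/dy_j dy_i = {ij,r} dx_m/dy_r.
for m in range(n):
    for i in range(n):
        for j in range(n):
            lhs = x[m].diff(y[j]).diff(y[i])
            rhs = sum(gamma2(i, j, rr)*M[m, rr] for rr in range(n))
            assert sp.simplify(lhs - rhs) == 0
print('Christoffel second-derivative formula verified for polar coordinates')
```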
The formula for the transformation of any covariant vector A in X coordinates to vector B in the Y coordinates is

Bi = Ap(∂xp/∂yi)

so

(∂Bi/∂yj) = (∂Ap/∂xq)(∂xp/∂yi)(∂xq/∂yj) + Ap(∂²xp/∂yj∂yi)

When the Christoffel formula, with appropriate indices, is substituted for the second term on the right the result is

(∂Bi/∂yj) = (∂Ap/∂xq)(∂xp/∂yi)(∂xq/∂yj) + y{ij,m}(∂xp/∂ym)Ap − x{mq,p}(∂xm/∂yi)(∂xq/∂yj)Ap

In the second term on the right-hand side (∂xp/∂ym)Ap is just equal to Bm, so the above formula reduces to

(∂Bi/∂yj) = (∂Ap/∂xq)(∂xp/∂yi)(∂xq/∂yj) + y{ij,m}Bm − x{mq,p}(∂xm/∂yi)(∂xq/∂yj)Ap

Upon rearrangement this takes the form

(∂Bi/∂yj) − y{ij,m}Bm = [(∂Ap/∂xq) − x{pq,m}Am](∂xp/∂yi)(∂xq/∂yj)
This is the transformation rule for a covariant tensor of rank two. Thus the quantity

∂Aᵢ/∂xⱼ − {ij,p}Aₚ

is a covariant tensor of rank two and is denoted as Aᵢ,ⱼ. It is called the covariant derivative of a covariant vector.
Likewise the derivative of a contravariant vector Aⁱ can be defined as

∂Aⁱ/∂xⱼ + {pj,i}Aᵖ

and this is denoted as Aⁱ,ⱼ.
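As an illustration, the covariant derivative of a hypothetical covariant vector field in polar coordinates can be computed directly from the definition:

```python
import sympy as sp

r, theta = sp.symbols('r theta', positive=True)
y = [r, theta]
g = sp.diag(1, r**2)                 # polar-coordinate metric
ginv = g.inv()
n = 2

def gamma2(i, j, k):
    """Christoffel symbol of the second kind {ij,k}."""
    return sp.simplify(sum(ginv[k, p]*sp.Rational(1, 2) *
                           (g[i, p].diff(y[j]) + g[j, p].diff(y[i]) - g[i, j].diff(y[p]))
                           for p in range(n)))

# A hypothetical covariant vector field A_i in polar coordinates.
A = [r**2, r*sp.sin(theta)]

# Covariant derivative A_i,j = dA_i/dx_j - {ij,p} A_p.
covderiv = [[sp.simplify(A[i].diff(y[j]) - sum(gamma2(i, j, p)*A[p] for p in range(n)))
             for j in range(n)] for i in range(n)]
print(sp.Matrix(covderiv))
```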
(To be continued.)