Very often in a physics context, we see a very abstract definition of a tensor based on coordinate transformation:
$$T'^i{}_j = \frac{\partial x'^i}{\partial x^k} \frac{\partial x^l}{\partial x'^j}\, T^k{}_l$$
But where does this definition come from?
It may be easier to start with an example:
e.g.) Dot product: (0, 2) tensor on $\mathbb{R}^n$
For example, consider the dot product of two vectors $\vec{u}, \vec{v} \in \mathbb{R}^n$:
$$\vec{u} \cdot \vec{v} = \sum_i u^i v^i$$
where $u^i$ and $v^i$ are the components of $\vec{u}$ and $\vec{v}$ in some basis.
It is called a (0, 2) tensor because it takes two vectors as input and produces a real number (a scalar) as output.
Importantly, the dot product is multilinear: for $a, b \in \mathbb{R}$ and $\vec{u}, \vec{v}, \vec{w} \in \mathbb{R}^n$,
$$(a\vec{u} + b\vec{v}) \cdot \vec{w} = a\,(\vec{u} \cdot \vec{w}) + b\,(\vec{v} \cdot \vec{w}), \qquad \vec{u} \cdot (a\vec{v} + b\vec{w}) = a\,(\vec{u} \cdot \vec{v}) + b\,(\vec{u} \cdot \vec{w})$$
This is an essential part of the definition of a tensor: multilinearity.
In general, a tensor is a multilinear map that takes vectors as input and produces a scalar as output.
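To make this concrete, here is a minimal numerical sketch (assuming NumPy; the vectors and scalars are arbitrary choices) checking linearity of the dot product in each slot:

```python
# Minimal check that the dot product on R^3 is linear in each slot.
import numpy as np

rng = np.random.default_rng(0)
u, v, w = rng.standard_normal((3, 3))  # three arbitrary vectors in R^3
a, b = 2.0, -0.5                       # arbitrary scalars

# Linearity in the first slot: (a u + b v) . w == a (u . w) + b (v . w)
lhs = np.dot(a * u + b * v, w)
rhs = a * np.dot(u, w) + b * np.dot(v, w)
print(np.isclose(lhs, rhs))  # True

# Linearity in the second slot: u . (a v + b w) == a (u . v) + b (u . w)
print(np.isclose(np.dot(u, a * v + b * w),
                 a * np.dot(u, v) + b * np.dot(u, w)))  # True
```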
Vector Space
Before going further, let’s briefly review the definition of a vector space, which is essential to understand tensors in this linear algebraic context.
Definition: Vector Space
A vector space over $\mathbb{R}$ (or $\mathbb{C}$) is a set $V$ equipped with two operations:
Vector Addition: An operation $+ : V \times V \to V$:
For $\vec{u}, \vec{v} \in V$, $\vec{u} + \vec{v} \in V$.
Scalar Multiplication: An operation $\cdot : \mathbb{R} \times V \to V$:
For $a \in \mathbb{R}$ and $\vec{v} \in V$, $a\vec{v} \in V$.
$V$ must satisfy the following axioms for all $\vec{u}, \vec{v}, \vec{w} \in V$ and $a, b \in \mathbb{R}$:
Commutativity: $\vec{u} + \vec{v} = \vec{v} + \vec{u}$
Associativity: $(\vec{u} + \vec{v}) + \vec{w} = \vec{u} + (\vec{v} + \vec{w})$
Additive Identity: There exists an element $\vec{0} \in V$ such that $\vec{v} + \vec{0} = \vec{v}$
Additive Inverse: For every $\vec{v} \in V$, there exists $-\vec{v} \in V$ such that $\vec{v} + (-\vec{v}) = \vec{0}$
We may also write $\vec{u} - \vec{v} := \vec{u} + (-\vec{v})$.
Distributivity (1): $a(\vec{u} + \vec{v}) = a\vec{u} + a\vec{v}$
Distributivity (2): $(a + b)\vec{v} = a\vec{v} + b\vec{v}$
Multiplicative Identity:
There exists an element $1 \in \mathbb{R}$ such that $1\vec{v} = \vec{v}$
Many things can be vector spaces.
Our usual $n$-dimensional $\mathbb{R}^n$ with the usual addition and scalar multiplication is a vector space.
The set of all polynomials on $\mathbb{R}$ is also a vector space with pointwise addition and scalar multiplication:
$$(p + q)(x) = p(x) + q(x), \qquad (ap)(x) = a\,p(x)$$
This means that when we add $p$ and $q$ and evaluate at $x$, it is the same as evaluating $p$ and $q$ at $x$ first, then adding the results.
A tensor can be defined on such a polynomial vector space as well:
e.g.) Tensor on Polynomial Vector Space
Consider the vector space $P$ of all polynomials on $\mathbb{R}$.
A (0, 2) tensor $T : P \times P \to \mathbb{R}$ can be defined as, for example,
$$T(p, q) = p'(0)\, q'(0)$$
From the linearity of differentiation, it is easy to check that $T$ is multilinear, and hence a (0, 2) tensor on $P$.
For example, for $p(x) = 1 + 2x$ and $q(x) = 3x + x^2$,
$$T(p, q) = p'(0)\, q'(0) = 2 \cdot 3 = 6$$
If we take $r(x) = x^2$, then we can also see that
$$T(p + r, q) = (p + r)'(0)\, q'(0) = (2 + 0) \cdot 3 = T(p, q) + T(r, q)$$
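As a quick sanity check, here is a small sketch of this tensor (assuming NumPy's polynomial module; the polynomials above are reused as illustrative inputs):

```python
# Sketch: T(p, q) = p'(0) q'(0) as a (0, 2) tensor on polynomials.
from numpy.polynomial import Polynomial as P

def T(p, q):
    # Product of the derivatives evaluated at x = 0
    return p.deriv()(0.0) * q.deriv()(0.0)

p = P([1, 2])     # p(x) = 1 + 2x
q = P([0, 3, 1])  # q(x) = 3x + x^2
r = P([0, 0, 1])  # r(x) = x^2, so r'(0) = 0

print(T(p, q))                          # 2 * 3 = 6
print(T(p + r, q), T(p, q) + T(r, q))   # additivity in the first slot: 6, 6
print(T(2 * p, q), 2 * T(p, q))         # homogeneity in the first slot: 12, 12
```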
Linear Maps
So far, we have seen what a vector space is, but we also had another keyword in the definition of a tensor: (multi)linear map.
Definition: Linear Map
A map $f : V \to W$ between two vector spaces $V$ and $W$ over $\mathbb{R}$ is called a linear map if, for all $\vec{u}, \vec{v} \in V$ and $a \in \mathbb{R}$:
Additivity: $f(\vec{u} + \vec{v}) = f(\vec{u}) + f(\vec{v})$
Homogeneity: $f(a\vec{v}) = a\,f(\vec{v})$
where the addition and scalar multiplication on the left-hand sides are those of $V$, and those on the right-hand sides are those of $W$. Note that $V$ and $W$ may be different vector spaces.
This linear map can be defined between any two vector spaces: $V = W = \mathbb{R}^n$, $V = \mathbb{R}^n$ and $W = \mathbb{R}^m$, or even $V = P$ and $W = \mathbb{R}$.
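For instance, any matrix gives a linear map between coordinate spaces of different dimensions. A minimal sketch (assuming NumPy; the matrix entries are arbitrary):

```python
# Sketch: any 2x3 matrix A defines a linear map f(v) = A v from R^3 to R^2.
import numpy as np

A = np.array([[1.0, 0.0, 2.0],
              [0.0, -1.0, 3.0]])  # arbitrary entries

def f(v):
    return A @ v  # linear map R^3 -> R^2

u = np.array([1.0, 2.0, 3.0])
v = np.array([-1.0, 0.5, 4.0])
a = 2.5

print(np.allclose(f(u + v), f(u) + f(v)))  # additivity: True
print(np.allclose(f(a * u), a * f(u)))     # homogeneity: True
```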
Dual Vector Space
Now, before properly defining tensors, we need one more concept: dual vector space.
Consider a column vector $\vec{v} \in \mathbb{R}^3$:
$$\vec{v} = \begin{pmatrix} v^1 \\ v^2 \\ v^3 \end{pmatrix}$$
A "natural" linear map from $\mathbb{R}^3$ to $\mathbb{R}$ is the dot product with another vector $\vec{u}$:
$$f_{\vec{u}}(\vec{v}) = \vec{u} \cdot \vec{v} = u^1 v^1 + u^2 v^2 + u^3 v^3$$
Since $\vec{v}$ is a column vector, one way we can represent $f_{\vec{u}}$ is to use a row vector:
$$f_{\vec{u}}(\vec{v}) = \begin{pmatrix} u^1 & u^2 & u^3 \end{pmatrix} \begin{pmatrix} v^1 \\ v^2 \\ v^3 \end{pmatrix}$$
so that the linear map can be expressed simply as matrix multiplication.
But since row vectors are also just triples of real numbers, they also form a vector space.
This vector space formed by such row vectors is a special case of dual vector space:
Definition: Dual Vector Space
Given a vector space $V$ over $\mathbb{R}$, its dual vector space $V^*$ is the set of all linear maps from $V$ to $\mathbb{R}$:
$$V^* = \{ f : V \to \mathbb{R} \mid f \text{ is linear} \}$$
with addition $(f + g)(\vec{v}) = f(\vec{v}) + g(\vec{v})$ and scalar multiplication $(af)(\vec{v}) = a\,f(\vec{v})$.
An element of a dual vector space is called a covector or 1-form (closely related to the differential forms of differential geometry).
e.g.) Covector on $P$
Consider the vector space $P$ of all polynomials on $\mathbb{R}$.
One example of a linear map is the integral $I : P \to \mathbb{R}$:
$$I(p) = \int_0^1 p(x)\, dx$$
This kind of linear map is also called a functional, where the input is a function and the output is a scalar.
From the linearity of integration, it is easy to check that $I$ is a linear map, and hence an element of the dual vector space $P^*$, i.e. a covector.
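A small sketch of this functional (assuming NumPy's polynomial module; the interval $[0, 1]$ follows the example above):

```python
# Sketch: I(p) = integral of p over [0, 1] as a linear functional on P.
from numpy.polynomial import Polynomial as P

def I(p):
    F = p.integ()          # an antiderivative of p
    return F(1.0) - F(0.0)

p = P([1, 2])     # p(x) = 1 + 2x,  so I(p) = 1 + 1 = 2
q = P([0, 0, 3])  # q(x) = 3x^2,    so I(q) = 1

print(I(p), I(q))              # 2.0 1.0
print(I(p + q), I(p) + I(q))   # linearity: 3.0 3.0
```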
Also, the gradient of a scalar function is a covector:
e.g.) Gradient as a Covector
Consider a scalar function $f : \mathbb{R}^3 \to \mathbb{R}$.
Its gradient is defined as
$$\nabla f = \left( \frac{\partial f}{\partial x^1}, \frac{\partial f}{\partial x^2}, \frac{\partial f}{\partial x^3} \right)$$
For a vector $\vec{v} \in \mathbb{R}^3$, the directional derivative of $f$ along $\vec{v}$ is given by
$$\nabla f \cdot \vec{v} = \sum_i \frac{\partial f}{\partial x^i}\, v^i$$
This defines a linear map from $\mathbb{R}^3$ to $\mathbb{R}$, and hence $\nabla f$ can be considered as a covector in the dual vector space $(\mathbb{R}^3)^*$. (More on this later.)
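A quick numerical sketch of the gradient acting as a covector (the function $f$, the point, and the direction are arbitrary choices; a finite difference is used only as a cross-check):

```python
# Sketch: grad f at a point eats a vector v and returns the directional derivative.
import numpy as np

def f(x):
    return x[0] ** 2 + 3.0 * x[1] * x[2]

def grad_f(x):
    return np.array([2.0 * x[0], 3.0 * x[2], 3.0 * x[1]])

x0 = np.array([1.0, 2.0, -1.0])  # arbitrary point
v = np.array([0.5, 1.0, 2.0])    # arbitrary direction

exact = grad_f(x0) @ v  # covector action: grad f . v

# Cross-check with a central finite difference along v
h = 1e-6
numeric = (f(x0 + h * v) - f(x0 - h * v)) / (2 * h)
print(exact, numeric)  # both ~ 10.0
```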
e.g.) Covector on $\mathbb{R}^3$
Consider $V = \mathbb{R}^3$.
If we consider a function $f : \mathbb{R}^3 \to \mathbb{R}$ defined as
$$f(\vec{v}) = a_1 v^1 + a_2 v^2 + a_3 v^3$$
for fixed real numbers $a_1, a_2, a_3$, then $f$ is a covector.
Now, this is essentially the same as the row vector $\begin{pmatrix} a_1 & a_2 & a_3 \end{pmatrix}$, so if we take column vectors as elements of $\mathbb{R}^3$, then the dual vector space $(\mathbb{R}^3)^*$ can be identified with the set of row vectors.
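In code, this identification is just matrix shapes: a covector is a $1 \times 3$ row, a vector is a $3 \times 1$ column, and the action is matrix multiplication (the components below are arbitrary):

```python
# Sketch: covectors as row vectors, vectors as column vectors.
import numpy as np

a = np.array([[2.0, 3.0, -1.0]])      # a covector: 1x3 row (arbitrary a_i)
v = np.array([[1.0], [0.5], [2.0]])   # a vector: 3x1 column

print(a @ v)  # [[1.5]] -- the scalar a_i v^i via matrix multiplication
```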
Definition of Tensor
In the beginning, we defined tensors as multilinear maps that take vectors as input and produce a scalar as output.
With the concept of dual vector space, we can finally fill the first slot (the 0 in "(0, 2)"):
Definition: (M, N) Tensor
Given a vector space $V$ over $\mathbb{R}$, an $(M, N)$ tensor $T$ is a multilinear map:
$$T : \underbrace{V^* \times \cdots \times V^*}_{M \text{ copies}} \times \underbrace{V \times \cdots \times V}_{N \text{ copies}} \to \mathbb{R}$$
that takes $M$ covectors and $N$ vectors as input and produces a scalar as output.
Components of a Vector
Now that we have properly defined tensors, let us try to understand where the “transformation definition” of tensor is coming from.
To do this, it is useful to look back on vectors both geometrically and algebraically.
Geometrically speaking, a vector is an “arrow”.
We can give this arrow an algebraic representation by choosing a coordinate system and a basis of the vector space:
Definition: Basis
For an $n$-dimensional vector space $V$, a set of $n$ vectors $\{\vec{e}_1, \ldots, \vec{e}_n\}$ that are linearly independent and span the whole space is called a basis.
This means that any vector $\vec{v} \in V$ can be expressed as a linear combination of the basis vectors:
$$\vec{v} = v^i\, \vec{e}_i$$
where $v^i$ are called the components of $\vec{v}$ in this basis. (Here and below, a repeated upper and lower index implies summation: $v^i \vec{e}_i = \sum_i v^i \vec{e}_i$.)
As for the coordinate system, we can really take any one we like: Cartesian, polar, spherical, or even general curvilinear coordinates.
As for the basis, in many situations we work with an orthogonal or orthonormal basis.
For a vector space $V$ with basis $\{\vec{e}_i\}$, its dual space $V^*$ can have a corresponding dual basis defined as:
Definition: Dual Basis
Given a basis $\{\vec{e}_i\}$ of a vector space $V$,
its dual basis $\{\epsilon^i\}$ in the dual vector space $V^*$ is defined such that
$$\epsilon^i(\vec{e}_j) = \delta^i_j$$
Using this dual basis, any covector $\omega \in V^*$ can be expressed as a linear combination of the dual basis vectors:
$$\omega = \omega_i\, \epsilon^i$$
Now, consider, as an example, a (1, 1) tensor $T : V^* \times V \to \mathbb{R}$, with $\omega \in V^*$ and $\vec{v} \in V$.
Let us further choose an arbitrary basis $\{\vec{e}_j\}$ for $V$ and its dual basis $\{\epsilon^i\}$ for $V^*$.
Then, we can express $\omega$ and $\vec{v}$ in terms of the basis and dual basis:
$$\omega = \omega_i\, \epsilon^i, \qquad \vec{v} = v^j\, \vec{e}_j$$
Then from the multilinearity, the tensor can be evaluated as:
$$T(\omega, \vec{v}) = T(\omega_i \epsilon^i,\, v^j \vec{e}_j) = \omega_i\, v^j\, T(\epsilon^i, \vec{e}_j)$$
Here, we can define the components of the tensor in this basis as:
$$T^i{}_j = T(\epsilon^i, \vec{e}_j)$$
so that we can write the output of the tensor as a linear combination of the input components and the tensor components:
$$T(\omega, \vec{v}) = \omega_i\, T^i{}_j\, v^j$$
A (0, 1) tensor takes a vector as input and produces a scalar as output, which is exactly the definition of a covector/1-form:
$$T(\vec{v}) = T_j\, v^j = \omega_j\, v^j$$
where we defined the components of the covector as $\omega_j = T(\vec{e}_j)$.
Here, we can change the perspective and think of $\vec{v}$ as a linear map from $V^*$ to $\mathbb{R}$, which takes a covector as input and produces a scalar as output:
$$\vec{v}(\omega) := \omega(\vec{v}) = \omega_i\, v^i$$
Thus, a (1, 0) tensor, which takes a covector as input and produces a scalar as output:
$$T : V^* \to \mathbb{R}, \qquad T(\omega) = \omega_i\, T^i$$
is exactly the same as a vector; a vector is a (1, 0) tensor.
Now, what happens if we only input a vector to a (1, 1) tensor?
Let us see:
$$T(\,\cdot\,, \vec{v}) : V^* \to \mathbb{R}, \qquad T(\omega, \vec{v}) = \omega_i \left( T^i{}_j\, v^j \right)$$
If we set $u^i = T^i{}_j\, v^j$, then we can see that $T(\,\cdot\,, \vec{v})$ has the same form as a covector acting on a vector:
$$T(\omega, \vec{v}) = \omega_i\, u^i = \omega(\vec{u})$$
This means that $T(\,\cdot\,, \vec{v})$ is a vector $\vec{u}$, or a (1, 0) tensor.
Similarly, if we only input a covector to a (1, 1) tensor, it becomes a covector, or a (0, 1) tensor, with components $\mu_j = \omega_i\, T^i{}_j$.
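These slot-filling operations are plain index contractions. A sketch using `np.einsum` (random components in some fixed basis):

```python
# Sketch: slot-filling a (1, 1) tensor by index contraction.
import numpy as np

rng = np.random.default_rng(1)
T = rng.standard_normal((3, 3))  # components T^i_j in some fixed basis
w = rng.standard_normal(3)       # covector components w_i
v = rng.standard_normal(3)       # vector components v^j

scalar = np.einsum("i,ij,j->", w, T, v)  # T(w, v) = w_i T^i_j v^j

u = np.einsum("ij,j->i", T, v)   # feed only v: a vector,   u^i  = T^i_j v^j
mu = np.einsum("i,ij->j", w, T)  # feed only w: a covector, mu_j = w_i T^i_j

print(np.isclose(scalar, w @ u), np.isclose(scalar, mu @ v))  # True True
```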
Note that we never used any special property of the chosen coordinate system or basis, so these expressions hold for any choice, while the scalar $T(\omega, \vec{v})$ itself is independent of that choice.
So, what happens if we change the coordinate system and basis?
To explain this, we need to take some ideas from differential geometry.
Tangent Space
Consider $\mathbb{R}^3$.
If a particle with position vector $\vec{x}(t)$ is moving in $\mathbb{R}^3$, then its path is a curve in $\mathbb{R}^3$, parametrized by time $t$:
$$\vec{x}(t) = \left( x^1(t), x^2(t), x^3(t) \right)$$
Roughly speaking, each component of the position vector has units of length (e.g. meters).
In simple mechanics, we consider the velocity vector $\vec{v}(t) = \frac{d\vec{x}}{dt}$, which has units of length/time (e.g. meters/second).
Now, we ask: are these vectors $\vec{x}$ and $\vec{v}$ really in the same vector space $\mathbb{R}^3$?
They have different units, so it may be more natural to think that they belong to different vector spaces.
We know that $\vec{x}$ is in $\mathbb{R}^3$, but where does $\vec{v}$ belong?
The answer is the tangent space at the point $\vec{x}$, denoted by $T_{\vec{x}}\mathbb{R}^3$.
Definition: Tangent Space
Given a manifold $M$ and a point $p \in M$, the tangent space at $p$, denoted by $T_p M$, is the vector space consisting of all tangent vectors at the point $p$.
Intuitively, we think that on the curve $\vec{x}(t)$, the change in position within time $dt$ is given by
$$d\vec{x} = \vec{v}\, dt$$
At each point $\vec{x}(t)$ on the curve, we can consider the tangent space $T_{\vec{x}}\mathbb{R}^3$ where the velocity vector lives.
Physically, we may only have one velocity vector for each point on the curve $\vec{x}(t)$, but mathematically, a velocity is just a vector with 3 components.
So, at each point $\vec{x}$, we can imagine any velocity vector (not just the actual, real velocity of the particle at that point).
This is actually not constrained to a single curve.
When we think of, for example, a fluid flow in $\mathbb{R}^3$, at each point $\vec{x}$ there is a velocity vector $\vec{v}(\vec{x})$, which lives in the tangent space "attached" to $\vec{x}$.
In electromagnetism, we see electromagnetic fields $\vec{E}(\vec{x})$ and $\vec{B}(\vec{x})$ defined at every point in $\mathbb{R}^3$, which by the same logic also live in the tangent space at each point.
Tangent Vector Bundle
Now that we have this set of points $\vec{x}$ and their corresponding tangent spaces $T_{\vec{x}}\mathbb{R}^3$, we can combine them together to form the tangent vector bundle, or simply the tangent bundle.
We denote the tangent bundle by $TM$:
Definition: Tangent Bundle
Given a manifold $M$, the tangent bundle $TM$ is the disjoint union of all tangent spaces at every point in $M$:
$$TM = \bigsqcup_{p \in M} T_p M = \left\{ (p, \vec{v}) \mid p \in M,\ \vec{v} \in T_p M \right\}$$
If you have heard of "configuration space" in classical mechanics, the space of positions and velocities built on it is essentially the tangent bundle of the position space.
We have every possible pair of position and velocity vectors $(\vec{x}, \vec{v})$ there, which is exactly what the tangent bundle is.
The collection of pairs $(\vec{x}, \vec{v})$ is not limited to the actual motion of a particle in our universe; it also includes all possible, and even "impossible" or "imaginary", pairs of position and velocity vectors.
When we want to solve for the motion of a particle, we need to carefully choose the correct pairs $(\vec{x}, \vec{v})$ that satisfy the equations of motion.
This is called selecting a section of the tangent bundle: for every $\vec{x}$, we choose one value of $\vec{v}$.
Definition: Section of a Bundle
Given a tangent bundle $TM$ over a manifold (our physical space) $M$, a section is a map $\sigma$ from the manifold $M$ to the tangent bundle $TM$:
$$\sigma : M \to TM, \qquad \vec{x} \mapsto (\vec{x}, \vec{v}(\vec{x}))$$
This velocity vector is not limited to the velocity of a particle moving in $\mathbb{R}^3$; it can be any "vector field" defined on $\mathbb{R}^3$.
In the context of vector fields, a section is equivalent to assigning a vector to each point:
$$\vec{x} \mapsto \vec{v}(\vec{x}) \in T_{\vec{x}}\mathbb{R}^3$$
Basis of Tangent Space
Now, coming back to the example of a particle moving in $\mathbb{R}^3$: at each point $\vec{x}$, we have the tangent space $T_{\vec{x}}\mathbb{R}^3$.
In the tangent space $T_{\vec{x}}\mathbb{R}^3$, we can choose a basis.
An intuitive choice is the basis vectors of the usual Cartesian coordinates:
$$\hat{e}_1 = (1, 0, 0), \quad \hat{e}_2 = (0, 1, 0), \quad \hat{e}_3 = (0, 0, 1)$$
It is certainly a valid choice, but in the context of differential geometry, we can introduce a more "natural", or robust, choice of basis.
Consider the derivative of a scalar function $f$ along the path of the particle $\vec{x}(t)$:
$$\frac{df}{dt} = \frac{\partial f}{\partial x^i} \frac{dx^i}{dt} = v^i\, \partial_i f$$
where $v^i = \frac{dx^i}{dt}$ are the components of the velocity vector (the tangent of the curve).
Note that the velocity vector takes 3 real numbers as components $(v^1, v^2, v^3)$, which makes the tangent space look just like $\mathbb{R}^3$.
Notice that $v^i \partial_i$ is a differential operator acting on $f$, and the 3 scalars $(v^1, v^2, v^3)$ uniquely determine this operator (which follows from the linear independence of the partial derivative operators $\partial_1, \partial_2, \partial_3$).
This suggests that the set of directional derivative operators
$$\left\{ v^i \partial_i \mid (v^1, v^2, v^3) \in \mathbb{R}^3 \right\}$$
forms a vector space with three dimensions, just like $T_{\vec{x}}\mathbb{R}^3$.
This leads to an isomorphism between the tangent space $T_{\vec{x}}\mathbb{R}^3$ and the vector space formed by the directional derivative operators at the point $\vec{x}$ (not limited to the particle path).
In short, if we take the basis of $T_{\vec{x}}\mathbb{R}^3$ as
$$\vec{e}_i = \partial_i = \frac{\partial}{\partial x^i}$$
the two vector spaces, $T_{\vec{x}}\mathbb{R}^3$ and the space of directional derivative operators, are completely identical in terms of structure.
This isomorphism is very useful, because when we change the basis and coordinates in the tangent space, we can just calculate how the partial derivative operators change using the chain rule.
Let us see how this works in a slightly more general case.
Assume that we transform our coordinates from $x^i$ to $x'^i = x'^i(x^1, x^2, x^3)$.
Further assume that the transformation is bijective and smooth, so that there is an inverse transformation $x^i = x^i(x'^1, x'^2, x'^3)$ and differentiation is well-defined.
Then, from the chain rule,
$$\frac{\partial}{\partial x'^i} = \frac{\partial x^j}{\partial x'^i} \frac{\partial}{\partial x^j}$$
Since we can switch between the usual basis and the derivative operator basis using this isomorphism, the new basis vectors in the new coordinate system can be expressed in terms of the old basis vectors as:
$$\vec{e}\,'_i = \frac{\partial x^j}{\partial x'^i}\, \vec{e}_j$$
This is how basis vectors in the tangent space transform under coordinate transformation.
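As a concrete example, here is a sketch computing the factors $\frac{\partial x^j}{\partial x'^i}$ for the Cartesian-to-polar transformation in 2D (assuming SymPy):

```python
# Sketch: basis transformation factors for Cartesian (x, y) -> polar (r, theta).
import sympy as sp

r, theta = sp.symbols("r theta", positive=True)
x = r * sp.cos(theta)
y = r * sp.sin(theta)

# J[i, j] = dx^j / dx'^i, with x' = (r, theta) and x = (x, y):
# the new basis vectors are e'_i = (dx^j / dx'^i) e_j
J = sp.Matrix([[sp.diff(x, r),     sp.diff(y, r)],
               [sp.diff(x, theta), sp.diff(y, theta)]])
sp.pprint(J)
# [[cos(theta), sin(theta)], [-r*sin(theta), r*cos(theta)]]
# e.g. e'_r = cos(theta) e_x + sin(theta) e_y, the familiar radial direction
```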
Any vector that has the same transformation property as the basis vectors is called a covariant vector (meaning, "transforming together with the basis"), and we denote it with lower indices:
Definition: Covariant Vector
A vector is called a covariant vector if its components transform under a coordinate transformation as:
$$a'_i = \frac{\partial x^j}{\partial x'^i}\, a_j$$
Now, we mentioned that this transformation of basis vectors has an inverse transformation.
Remember that for a coordinate transformation $x^i \to x'^i$, the inverse transformation $x'^i \to x^i$ undoes it, so composing the two must give the identity transformation. By the chain rule,
$$\frac{\partial x^i}{\partial x'^k} \frac{\partial x'^k}{\partial x^j} = \frac{\partial x^i}{\partial x^j}$$
Let us then compare this with the identity transformation (doing nothing):
$$\frac{\partial x^i}{\partial x^j} = \delta^i_j$$
By comparison, we get our relation between the forward and inverse transformations:
$$\frac{\partial x^i}{\partial x'^k} \frac{\partial x'^k}{\partial x^j} = \delta^i_j$$
That is, the factor $\frac{\partial x'^i}{\partial x^j}$ is the inverse of the basis vector transformation factor $\frac{\partial x^j}{\partial x'^i}$.
We can follow the same logic when we consider the identity transformation for the new coordinates:
$$\frac{\partial x'^i}{\partial x^k} \frac{\partial x^k}{\partial x'^j} = \delta^i_j$$
Geometrically speaking, vectors are "arrows", independent of basis.
When we change the basis, for the two representations of the arrow in different bases to represent the same arrow, the components must also change to compensate:
$$\vec{v} = v^j\, \vec{e}_j = v'^i\, \vec{e}\,'_i = v'^i\, \frac{\partial x^j}{\partial x'^i}\, \vec{e}_j$$
By comparison,
$$v^j = \frac{\partial x^j}{\partial x'^i}\, v'^i$$
we notice that the partial derivative factor is identical to the basis transformation factor, so multiplying both sides by its inverse $\frac{\partial x'^k}{\partial x^j}$ gives the transformation rule for the vector components (note we switch to $k$ as the dummy index):
$$v'^i = \frac{\partial x'^i}{\partial x^k}\, v^k$$
Let us compare this with the covariant vector transformation rule:
$$a'_i = \frac{\partial x^k}{\partial x'^i}\, a_k$$
Again, as we saw, these transformation factors are inverses of each other.
A vector whose components transform like this is called a contravariant vector (meaning, "transforming against the basis"):
Definition: Contravariant Vector
A vector is called a contravariant vector if its components transform under a coordinate transformation as:
$$v'^i = \frac{\partial x'^i}{\partial x^j}\, v^j$$
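Continuing the polar-coordinate sketch, we can check that contravariant components use the inverse Jacobian factors, so that $v'^i\, \vec{e}\,'_i$ and $v^j\, \vec{e}_j$ are the same arrow (assuming SymPy):

```python
# Sketch: contravariant components use the inverse factors dx'^i/dx^j,
# so the arrow v'^i e'_i equals v^j e_j. Continues the polar example.
import sympy as sp

r, theta = sp.symbols("r theta", positive=True)
x, y = r * sp.cos(theta), r * sp.sin(theta)

# Basis transformation: e'_i = J[i, j] e_j with J[i, j] = dx^j/dx'^i
J = sp.Matrix([[sp.diff(x, r),     sp.diff(y, r)],
               [sp.diff(x, theta), sp.diff(y, theta)]])

K = sp.simplify(J.inv().T)  # K[i, j] = dx'^i/dx^j, the inverse factors

vx, vy = sp.symbols("v_x v_y")
v = sp.Matrix([vx, vy])     # old (Cartesian) components v^j
v_new = sp.simplify(K * v)  # new components v'^i = (dx'^i/dx^j) v^j

# Same arrow: v^j = (dx^j/dx'^i) v'^i should recover the old components
print(sp.simplify(J.T * v_new - v))  # Matrix([[0], [0]])
```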
Cotangent Space
Similarly to the linear algebra case, to properly define tensors, we also need the dual space of the tangent space, called the cotangent space.
If we take $V = T_{\vec{x}}\mathbb{R}^3$, then its dual space is called the cotangent space at the point $\vec{x}$, denoted by $T^*_{\vec{x}}\mathbb{R}^3$:
Definition: Cotangent Space
Given a manifold $M$ and a point $p \in M$, the cotangent space at $p$, denoted by $T^*_p M$, is the dual vector space of the tangent space $T_p M$:
$$T^*_p M = (T_p M)^*$$
Now, let us come back to a vector $\vec{v}$ in the tangent space $T_{\vec{x}}\mathbb{R}^3$.
Due to the isomorphism, any vector $\vec{v}$ is equivalent to a directional derivative operator $v^i \partial_i$ at the point $\vec{x}$.
Given a smooth scalar function $f$, let us denote the directional derivative at the point $\vec{x}$ in the direction $\vec{v}$ as $df_{\vec{x}}(\vec{v})$:
$$df_{\vec{x}}(\vec{v}) = v^i\, \partial_i f \big|_{\vec{x}}$$
Notice how the RHS, $v^i \partial_i f$, is linear in $\vec{v}$.
So on the LHS, we can think of $df_{\vec{x}}$ as a map from the tangent space to $\mathbb{R}$: $df_{\vec{x}} : T_{\vec{x}}\mathbb{R}^3 \to \mathbb{R}$, taking a vector as input and producing a scalar as output.
This is exactly the definition of a covector in the dual space of $T_{\vec{x}}\mathbb{R}^3$, which is the cotangent space $T^*_{\vec{x}}\mathbb{R}^3$.
Definition: Differential as a Covector
Given a smooth scalar function $f$, its differential at the point $\vec{x}$, denoted by $df_{\vec{x}}$, is a covector in the cotangent space $T^*_{\vec{x}}\mathbb{R}^3$ defined as:
$$df_{\vec{x}}(\vec{v}) = v^i\, \partial_i f \big|_{\vec{x}}$$
Now we ask: what is a good choice of basis for the cotangent space $T^*_{\vec{x}}\mathbb{R}^3$?
We just saw that covectors can be written as directional derivatives of some function $f$:
$$df_{\vec{x}}(\vec{v}) = v^j\, \partial_j f$$
If we take a function that picks out a single coordinate component, which we might denote by $x^i$ (i.e. $f(\vec{x}) = x^i$), then its differential at the point $\vec{x}$ is $dx^i$.
By the linearity of the directional derivative,
$$dx^i_{\vec{x}}(\vec{v}) = v^j\, dx^i(\partial_j) = v^j\, \delta^i_j = v^i$$
which means that the $dx^i$ satisfy the dual basis property:
$$dx^i(\partial_j) = \partial_j x^i = \delta^i_j$$
so $\{ dx^i \}$ is a natural basis for the cotangent space, dual to $\{ \partial_i \}$.
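A one-liner sketch of this property (assuming SymPy): applying $\partial_j$ to the coordinate function $x^i$ gives exactly $\delta^i_j$:

```python
# Sketch: dx^i(d_j) = dx^i/dx^j = delta^i_j, checked symbolically.
import sympy as sp

coords = sp.symbols("x1 x2 x3")
delta = sp.Matrix(3, 3, lambda i, j: sp.diff(coords[i], coords[j]))
sp.pprint(delta)  # the 3x3 identity matrix
```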
Tensor Field
Now that we have properly defined tangent space and cotangent space, we can finally define “tensor fields”.
Definition: (m, n) Tensor Field
On $\mathbb{R}^3$, an $(m, n)$ tensor field $T$ is a multilinear map defined at each point $\vec{x}$:
$$T_{\vec{x}} : \underbrace{T^*_{\vec{x}}\mathbb{R}^3 \times \cdots \times T^*_{\vec{x}}\mathbb{R}^3}_{m \text{ copies}} \times \underbrace{T_{\vec{x}}\mathbb{R}^3 \times \cdots \times T_{\vec{x}}\mathbb{R}^3}_{n \text{ copies}} \to \mathbb{R}$$
where $T_{\vec{x}}\mathbb{R}^3$ and $T^*_{\vec{x}}\mathbb{R}^3$ are the tangent space and cotangent space at the point $\vec{x}$, respectively.
With this knowledge, we can finally understand the transformation definition of tensors.
Assume, as before, that we have a (1, 1) tensor field $T$ on $\mathbb{R}^3$.
At each point $\vec{x}$, the tensor field takes a covector $\omega$ and a vector $\vec{v}$ as input and produces a scalar as output:
$$T(\omega, \vec{v}) = \omega_i\, T^i{}_j\, v^j$$
We have seen that the bases for $T_{\vec{x}}\mathbb{R}^3$ and $T^*_{\vec{x}}\mathbb{R}^3$ transform as:
$$\vec{e}\,'_j = \frac{\partial x^l}{\partial x'^j}\, \vec{e}_l, \qquad dx'^i = \frac{\partial x'^i}{\partial x^k}\, dx^k$$
Due to the multilinearity, the tensor components, which take the basis (co)vectors as input, must transform under the combined transformation:
$$T'^i{}_j = T(dx'^i, \vec{e}\,'_j) = \frac{\partial x'^i}{\partial x^k} \frac{\partial x^l}{\partial x'^j}\, T(dx^k, \vec{e}_l) = \frac{\partial x'^i}{\partial x^k} \frac{\partial x^l}{\partial x'^j}\, T^k{}_l$$
A tensor that transforms like this under coordinate transformation is called a (1, 1) tensor, or a mixed tensor (since it has both covariant and contravariant indices).
Similarly, a (0, 2) tensor takes two vectors as input and produces a scalar as output:
$$T(\vec{u}, \vec{v}) = T_{ij}\, u^i v^j, \qquad T_{ij} = T(\vec{e}_i, \vec{e}_j)$$
Its components transform as:
$$T'_{ij} = \frac{\partial x^k}{\partial x'^i} \frac{\partial x^l}{\partial x'^j}\, T_{kl}$$
This is called a (0, 2) tensor, or 2nd rank covariant tensor, because its transformation has 2 covariant factors.
Similarly, a (2, 0) tensor takes two covectors as input and produces a scalar as output:
$$T(\omega, \mu) = T^{ij}\, \omega_i \mu_j, \qquad T^{ij} = T(dx^i, dx^j)$$
Its components transform as:
$$T'^{ij} = \frac{\partial x'^i}{\partial x^k} \frac{\partial x'^j}{\partial x^l}\, T^{kl}$$
This is called a (2, 0) tensor, or 2nd rank contravariant tensor, because its transformation has 2 contravariant factors.
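As a final sanity check, here is a numerical sketch (assuming NumPy; a random invertible matrix stands in for the Jacobian $\frac{\partial x'^i}{\partial x^k}$) verifying that the (1, 1) transformation rule keeps the scalar $T(\omega, \vec{v})$ unchanged:

```python
# Sketch: the (1, 1) transformation rule preserves the scalar T(w, v).
import numpy as np

rng = np.random.default_rng(2)
K = rng.standard_normal((3, 3))  # K[i, k] = dx'^i/dx^k (a stand-in Jacobian)
Kinv = np.linalg.inv(K)          # Kinv[l, j] = dx^l/dx'^j

T = rng.standard_normal((3, 3))  # old components T^k_l
w = rng.standard_normal(3)       # old covector components w_k
v = rng.standard_normal(3)       # old vector components v^l

T_new = np.einsum("ik,lj,kl->ij", K, Kinv, T)  # T'^i_j
w_new = np.einsum("li,l->i", Kinv, w)          # w'_i = (dx^l/dx'^i) w_l
v_new = np.einsum("jk,k->j", K, v)             # v'^j = (dx'^j/dx^k) v^k

old = np.einsum("k,kl,l->", w, T, v)
new = np.einsum("i,ij,j->", w_new, T_new, v_new)
print(np.isclose(old, new))  # True: the scalar is coordinate-independent
```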