In our last article, Local Flatness or Local Inertial Frames and SpaceTime curvature, we introduced the Riemann tensor and said that its importance stems from the fact that nonzero components are the hallmark of spacetime curvature.
But before delving into more detail and giving a complete formulation of the most important tensor in General Relativity, it seems reasonable to get a better understanding of the concept of a tensor itself.
Definition

Tensor definition
Let us start by giving a definition first:
A tensor of rank n is an array of 4^{n} values (in four-dimensional spacetime) called "tensor components" that combine with multiple directional indicators (basis vectors) to form a quantity that does NOT vary as the coordinate system is changed^{[1]}. So we will have to think of tensors as objects with components that transform between coordinate systems in specific and predictable ways^{[2]}.
Corollary 1: Combined with the principle of General Covariance, which extends the Principle of Relativity ^{[3]} to say that the form of the laws of physics should be the same in all inertial and accelerating frames, it means that if we have a valid tensor equation that is true in special relativity (in an inertial frame), this equation will remain valid in general relativity (in an accelerating frame).
Corollary 2: A null tensor in one coordinate system is null in all other coordinate systems. In other words, a quantity that we can nullify by coordinate system transformation is NOT a tensor.
Vector rotation and contravariant components
A good place to begin is to consider a vector, which is nothing other than a tensor of rank one, and to ask this question: "What happens to a vector when you change the coordinate system in which you're representing this vector?" The quick answer is that nothing at all happens to the vector itself, but the vector's components may be different in the new coordinate system.
Let us consider the simple rotation of the two-dimensional Cartesian coordinate system shown below. In this transformation, the location of the origin has not changed, but both the x and y axes have been tilted counterclockwise by an angle θ. The rotated axes are labeled x' and y' and are drawn in red to distinguish them from the original axes.
Our aim is to express the components A'^{x} and A'^{y}^{[4]} of the vector A in the primed/rotated coordinate system in terms of the components A^{x} and A^{y} in the unprimed/untransformed coordinate system, defined as follows:
If you think about how the components A^{x} and A^{y} of the vector A change, you might come to realize that the vector component A'^{x} in the rotated coordinate system cannot depend entirely on the component A^{x} in the original system. Actually, as you can see in the figure above, A'^{x} can be considered to be made up of two segments, labeled L_{1} and L_{2}.
So A'^{x} = L_{1} + L_{2}.
You can see that A^{x} is the hypotenuse of a right triangle formed by drawing a perpendicular from the end of A^{x} to the x'-axis. Then it is easy to see that the length of L_{1} (the projection of A^{x} onto the x'-axis) is A^{x} cos θ.
L_{1} = A^{x} cos (θ)
To find the length of L_{2}, consider the right triangle formed by sliding A'^{x} upward along the y'-axis and then drawing a perpendicular from the tip of A'^{x} to the x-axis. From this triangle, we should be able to see that
L_{2} = A^{y} cos (π/2 - θ)
where (π/2 - θ) is the angle formed by the tips of A'^{x} and A^{y} (which is also the angle between the x'-axis and the y-axis, as you can see from the parallelogram).
So we can finally write A'^{x} = A^{x} cos θ + A^{y} cos (π/2 - θ)
A similar analysis for A'^{y}, the y-component of vector A in the rotated coordinate system, gives:
A'^{y} = A^{x} cos (π/2 + θ) + A^{y} cos (θ)
The relationship between the components of the vector in the rotated and nonrotated systems is conveniently expressed using matrix notation as:
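As a reconstruction from the two component equations above, and using cos(π/2 - θ) = sin θ and cos(π/2 + θ) = -sin θ, the matrix form would read:

```latex
\begin{pmatrix} A'^{x} \\ A'^{y} \end{pmatrix}
=
\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} A^{x} \\ A^{y} \end{pmatrix}
```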
It is very important to understand that the above transformation equation does not rotate or change the initial vector in any way; it determines the values of the components of the vector in the new coordinate system. More specifically, the new components are weighted linear combinations of the original components.
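To make this point concrete, here is a minimal Python sketch built on the two component equations above. The function name rotate_components and the sample values are illustrative assumptions. It computes the components in the rotated system and compares the length of the vector in both systems.

```python
import math

def rotate_components(ax, ay, theta):
    """Components of the SAME vector in axes rotated counterclockwise by theta."""
    ax_new = ax * math.cos(theta) + ay * math.cos(math.pi / 2 - theta)
    ay_new = ax * math.cos(math.pi / 2 + theta) + ay * math.cos(theta)
    return ax_new, ay_new

theta = math.radians(30)   # sample rotation angle (assumption)
ax, ay = 3.0, 4.0          # sample components in the original system (assumption)
ax_new, ay_new = rotate_components(ax, ay, theta)

# The vector itself is untouched, so its length must agree in both systems.
length_old = math.hypot(ax, ay)
length_new = math.hypot(ax_new, ay_new)
print(ax_new, ay_new, length_old, length_new)
```

The components change with θ, but the length computed from them should not, which is one way to see that only the description of the vector has changed.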
As a final simplification, we can use the Einstein index notation by writing the equation as follows:
This last equation tells you that the components of a vector in the primed/transformed coordinate system are weighted linear combinations of the components of the same vector in the unprimed/original coordinate system. The weighting factors a_{ij} are the elements of the transformation matrix.
So in our example, we could write the transformation matrix a_{ij} as follows:
Basis-vector transformation
Let us now try to figure out how a basis vector transforms from the unprimed to the primed coordinate system when the original basis vector is rotated through an angle θ. We have to be very careful about the meaning of "transformation" when referring to basis vectors: we are not looking at how the components of the same vector transform from an original to a new coordinate system (the a_{ij} transformation matrix of the example above), but at how to find the components of the new (rotated) vector in the original coordinate system.
We could easily show, through geometric constructions such as those used previously, that the components A'^{x} and A'^{y} of the new rotated vector (A') in the original coordinate system are:
Multiplying the two matrices, i.e. the transformation matrix for finding the components of the same vector as the coordinate system is rotated through angle θ and the transformation matrix for finding the new basis vectors by rotating the original basis vectors through angle θ, reveals the nature of the relationship between them:
There is clearly an inverse relationship between the basis-vector transformation matrix and the vector-transformation matrix, so we can say in that case that the vector components transform "inversely to" or "against" the manner in which the basis vectors transform. That is exactly why we call these components contravariant components and why we use the superscript notation.
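The inverse relationship can be checked numerically with a short Python sketch. The component matrix follows from the component equations derived earlier. The basis-vector matrix is assumed here to be the standard counterclockwise rotation matrix, and the helper matmul2 and the sample angle are illustrative.

```python
import math

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

theta = math.radians(25)   # sample angle (assumption)
c, s = math.cos(theta), math.sin(theta)

# Components of the same vector when the axes are rotated by theta.
component_matrix = [[c, s], [-s, c]]
# New basis vectors obtained by rotating the original ones by theta
# (standard counterclockwise rotation, assumed).
basis_matrix = [[c, -s], [s, c]]

product = matmul2(component_matrix, basis_matrix)
print(product)
```

Up to floating-point rounding, the product should be the 2x2 identity matrix for any angle.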
Quantities that transform contravariantly
For any coordinate system in which a linear relationship exists between differential length elements ds, writing the equations that transform between the systems is quite straightforward. If you call the differentials of one coordinate system dx, dy and dz and those of the other coordinate system dx', dy' and dz', the transformation equations from the unprimed to the primed system come directly from the rules of partial differentiation:
which once again, using the Einstein summation convention could be written as:
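A sketch of that compact form, assuming the usual convention that the repeated index j is summed over:

```latex
dx'^{i} = \frac{\partial x'^{i}}{\partial x^{j}}\, dx^{j}
```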
But the ∂x'^{i}/∂x^{j} terms are also the components of the basis vectors tangent to the original (unprimed) coordinate axes, expressed in the new (primed) coordinate system.
Let us confirm this with the example of the transformation from 2D polar (r,θ) to Cartesian (x,y) coordinates. In this case, we have x'^{1}=x, x'^{2}=y, x^{1}=r, and x^{2}=θ. We also know that x = r cos θ and y = r sin θ.
Calculating the appropriate derivatives, we get
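For reference, with x = r cos θ and y = r sin θ the four partial derivatives work out to:

```latex
\frac{\partial x}{\partial r} = \cos\theta, \qquad
\frac{\partial x}{\partial \theta} = -r\sin\theta, \qquad
\frac{\partial y}{\partial r} = \sin\theta, \qquad
\frac{\partial y}{\partial \theta} = r\cos\theta
```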
Are these really the components of the tangent vectors to the original (r,θ) coordinate axes? We can confirm that by writing these components in the primed (Cartesian in this case) coordinate system
The first of these expressions is a vector pointing radially outward (along the r-direction in polar coordinates) and the second is a vector pointing perpendicular to the radial direction (along the θ-direction). This demonstrates that the partial derivatives do represent the components of the original (unprimed, here polar) covariant basis vectors in the new (primed, here Cartesian) coordinate system.
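The same check can be done numerically. The following Python sketch estimates the partial derivatives of x = r cos θ and y = r sin θ by central differences and tests the two geometric claims. The function names, the step size h and the sample point are illustrative assumptions.

```python
import math

def to_cartesian(r, theta):
    return r * math.cos(theta), r * math.sin(theta)

def tangent_vectors(r, theta, h=1e-6):
    """Central-difference estimates of d(x, y)/dr and d(x, y)/dtheta."""
    x_rp, y_rp = to_cartesian(r + h, theta)
    x_rm, y_rm = to_cartesian(r - h, theta)
    x_tp, y_tp = to_cartesian(r, theta + h)
    x_tm, y_tm = to_cartesian(r, theta - h)
    e_r = ((x_rp - x_rm) / (2 * h), (y_rp - y_rm) / (2 * h))
    e_theta = ((x_tp - x_tm) / (2 * h), (y_tp - y_tm) / (2 * h))
    return e_r, e_theta

r, theta = 2.0, math.radians(40)   # sample point (assumption)
e_r, e_theta = tangent_vectors(r, theta)
x, y = to_cartesian(r, theta)

# e_r should be parallel to the position vector (zero 2D cross product),
# and e_theta should be perpendicular to it (zero dot product).
cross_r = e_r[0] * y - e_r[1] * x
dot_theta = e_theta[0] * x + e_theta[1] * y
print(e_r, e_theta, cross_r, dot_theta)
```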
But since we know from the previous section that the transformation of contravariant vector components combines with that of the covariant basis vectors to produce the identity, differential length elements must transform as contravariant vector components.
And we should now understand why the transformation equation for contravariant components of vector A is often written as
Contravariant components and dual basis vectors
In a Cartesian coordinate system such as the one used previously, there is no ambiguity when you consider the process of projecting a vector onto a coordinate axis.
Now imagine a two-dimensional coordinate system in which the x and y axes are not perpendicular to one another. In such cases, the process of projecting a vector onto one of the coordinate axes could be done parallel to the coordinate axes, or perpendicular to the axes.
In the diagram below, to understand parallel projections, we have to consider the basis vectors e_{1} and e_{2} pointing along the non-orthogonal coordinate axes and the projections X^{1} and X^{2} of the vector X onto those directions.
In this case, vector X may be written as:
where, as seen above, X^{1} and X^{2} represent the parallel-projection (contravariant) components of vector X.
Now if we project vector X orthogonally onto the axes, we obtain the X_{1} and X_{2} components of the vector. The first thing to note is that the "parallel" projections and the "orthogonal" projections do not have the same length, and that adding X_{1} and X_{2} along e_{1} and e_{2} by the rules of vector addition does not give vector X. The perpendicular projections simply do not add up as vectors to give the original vector.
It is then reasonable to wonder whether there are alternative basis vectors to e_{1} and e_{2} that would allow the perpendicular-projection components to form a vector in a manner analogous to the contravariant components.
There are, and those alternative basis vectors are called "reciprocal" or "dual" basis vectors. These have two defining characteristics:
 Each one must be perpendicular to all original basis vectors with different indices. So if we call the dual basis vectors e^{1} and e^{2} to distinguish them from the original basis vectors e_{1} and e_{2}, you have to make sure that e^{1} is perpendicular to e_{2} (which lies along the y-axis in this case). Likewise, e^{2} must be perpendicular to e_{1} (and thus perpendicular to the x-axis in this case).
 The second defining characteristic of the dual basis vectors is that the dot product between each dual basis vector and the original basis vector with the same index must equal one, so e^{1}·e_{1} = 1 and e^{2}·e_{2} = 1.
With the covariant components X_{1} and X_{2} taken along the directions of the dual basis vectors rather than along the directions of the original basis vectors, vector X can then be written as follows:
We use superscript notation to denote the dual basis vectors because the inverse transformation matrix has to be used when these basis vectors are transformed to a new coordinate system, as is the case for the contravariant vector components X^{1} and X^{2}.
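The two defining characteristics are enough to compute the dual basis explicitly. The following Python sketch does so for a sample non-orthogonal basis and rebuilds the vector from both sets of components. The 60° angle between the axes, the vector X and the helper names dot and dual_basis are illustrative assumptions.

```python
import math

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def dual_basis(e1, e2):
    """Return (e^1, e^2) such that e^i . e_j is 1 if i == j and 0 otherwise."""
    det = e1[0] * e2[1] - e1[1] * e2[0]
    return (e2[1] / det, -e2[0] / det), (-e1[1] / det, e1[0] / det)

phi = math.radians(60)                      # angle between the axes (assumption)
e1, e2 = (1.0, 0.0), (math.cos(phi), math.sin(phi))
d1, d2 = dual_basis(e1, e2)

X = (2.0, 1.5)                              # sample vector (assumption)
contra = (dot(X, d1), dot(X, d2))           # X^1, X^2
co = (dot(X, e1), dot(X, e2))               # X_1, X_2

# Rebuild X both ways: X^i e_i and X_i e^i.
from_contra = tuple(contra[0] * e1[k] + contra[1] * e2[k] for k in range(2))
from_co = tuple(co[0] * d1[k] + co[1] * d2[k] for k in range(2))
print(contra, co, from_contra, from_co)
```

Both reconstructions should return the original X, while the contravariant and covariant components themselves differ because the axes are not orthogonal.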
Conclusion:
So a vector A represents the same entity whether it is expressed using contravariant components A^{i} or covariant components A_{i}:
where e_{i} represents a covariant basis vector and e^{i} represents a contravariant basis vector.
In transforming between coordinate systems, a vector with contravariant components A^{j} in the original (unprimed) coordinate system and contravariant components A'^{i} in the new (primed) coordinate system transforms as:
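Written with the partial-derivative terms described just below, and assuming summation over the repeated index j, this transformation would read:

```latex
A'^{i} = \frac{\partial x'^{i}}{\partial x^{j}}\, A^{j}
```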
where the ∂x'^{i}/∂x^{j} terms represent the components in the new coordinate system of the basis vectors tangent to the original axes.
Likewise, for a vector with covariant components A_{j} in the original (unprimed) coordinate system and covariant components A'_{i} in the new (primed) coordinate system, the transformation equation is:
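Written with the partial-derivative terms described just below, and again assuming summation over the repeated index j, the covariant transformation would read:

```latex
A'_{i} = \frac{\partial x^{j}}{\partial x'^{i}}\, A_{j}
```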
where the ∂x^{j}/∂x'^{i} terms represent the components in the new coordinate system of the (dual) basis vectors perpendicular to the original axes.
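As an illustration of the two transformation rules, the Python sketch below applies them to the polar-to-Cartesian example (x = r cos θ, y = r sin θ). It then compares the sum A^{j}B_{j} before and after the change of coordinates. The sample point and the component values of A and B are arbitrary assumptions.

```python
import math

r, theta = 1.5, math.radians(35)   # sample point (assumption)
c, s = math.cos(theta), math.sin(theta)

# J[i][j] = d x'^i / d x^j with x' = (x, y) and x = (r, theta).
J = [[c, -r * s], [s, r * c]]
# K[j][i] = d x^j / d x'^i, the inverse of J.
K = [[c, s], [-s / r, c / r]]

A = [0.7, -0.2]   # contravariant components A^j in polar coordinates (assumption)
B = [1.1, 0.4]    # covariant components B_j in polar coordinates (assumption)

# Contravariant rule: A'^i = (d x'^i / d x^j) A^j
A_new = [sum(J[i][j] * A[j] for j in range(2)) for i in range(2)]
# Covariant rule: B'_i = (d x^j / d x'^i) B_j
B_new = [sum(K[j][i] * B[j] for j in range(2)) for i in range(2)]

scalar_old = sum(A[j] * B[j] for j in range(2))
scalar_new = sum(A_new[i] * B_new[i] for i in range(2))
print(scalar_old, scalar_new)
```

Because the two rules use mutually inverse matrices, the two sums should agree up to rounding even though the individual components change.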
[1] Definition given by Daniel Fleisch in his Student's Guide to Vectors and Tensors, Chapter 5, Higher-rank tensors, p. 134
[2] In more formal mathematical terms, a transformation of the variables of a tensor changes the tensor into another whose components are linear homogeneous functions of the components of the original tensor (reference MathWorld article Homogeneous Function).
[3] We recall that according to the Principle of Relativity, laws of physics are the same in any inertial frame of reference.
[4] We will see in the next part of the article why we are using superscript index notation for the 'x' and 'y' there; let us just say for now that it is because they represent the contravariant components of the vector, and this distinguishes them from the covariant components A_{x} and A_{y}.
Metric tensor example

Metric Tensor
Let us try to illustrate this with the tensor that we have used extensively so far, at least since our article Generalisation of the metric tensor in pseudo-Riemannian manifold, i.e. the metric tensor.
where ξ^{α} are the coordinates in an inertial reference frame and x^{μ} the coordinates in an arbitrary reference frame.
If we now try to express the metric tensor components g'_{μν} in another arbitrary reference frame R' with coordinates x'^{μ}, we get:
which indeed conforms to the transformation equation for the covariant components of a second-rank tensor:
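Using the symbols defined in the next paragraph, and assuming summation over the repeated indices α and β, that transformation equation can be written as:

```latex
T'_{\mu\nu} = \frac{\partial x^{\alpha}}{\partial x'^{\mu}}\,
              \frac{\partial x^{\beta}}{\partial x'^{\nu}}\, T_{\alpha\beta}
```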
In this expression, T'_{μν} are the covariant tensor components in the new coordinate system, T_{αβ} are the covariant tensor components in the original coordinate system, and ∂x^{α}/∂x'^{μ} as well as ∂x^{β}/∂x'^{ν} are elements of the transformation matrix between the original and new coordinate systems. These elements of the transformation matrix represent the dual basis vectors perpendicular to the original coordinate axes.
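A small Python sketch can illustrate this rule on a simple case chosen here as an assumption: the original system is 2D Cartesian, where the metric is the identity, and the new system is polar (r, θ). Applying the second-rank covariant rule should then give the familiar polar metric with diagonal entries 1 and r^{2}.

```python
import math

r, theta = 2.0, math.radians(50)   # sample point (assumption)
c, s = math.cos(theta), math.sin(theta)

# dx[a][m] = d x^a / d x'^m with x = (x, y) original and x' = (r, theta) new.
dx = [[c, -r * s], [s, r * c]]
g = [[1.0, 0.0], [0.0, 1.0]]       # Euclidean metric in Cartesian coordinates

# g'_{mn} = (d x^a / d x'^m) (d x^b / d x'^n) g_{ab}, summed over a and b.
g_new = [[sum(dx[a][m] * dx[b][n] * g[a][b]
              for a in range(2) for b in range(2))
          for n in range(2)]
         for m in range(2)]
print(g_new)
```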
Index raising and lowering
One of the very useful functions of the metric tensor is to convert between the covariant and contravariant components of the other tensors.
So imagine that you are given the contravariant components and original basis vectors of a tensor and you wish to determine the covariant components. One approach would be to determine the dual basis vectors and perform the perpendicular projections as seen above, but with the metric tensor you have the shorter option of using relations such as
If you wish to convert from a covariant index to a contravariant index, you can use the inverse of g_{ij} (which is just g^{ij}) to perform operations like
This same process also works for higher-order tensors
This is a consequence of a more general mechanism called contraction, by which a tensor can have its rank lowered by multiplying it by another tensor with an equal index in the opposite position, i.e. by summing over the two indices. In this example, the upper and lower α indices are summed over:
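To illustrate index lowering, index raising and contraction together, here is a minimal Python sketch. It assumes the plane polar metric with diagonal entries 1 and r^{2} as a concrete g_{ij}. The component values and the mixed tensor T are arbitrary examples.

```python
r = 2.0
g = [[1.0, 0.0], [0.0, r ** 2]]              # g_ij for plane polar coordinates (assumed)
g_inv = [[1.0, 0.0], [0.0, 1.0 / r ** 2]]    # g^ij, the inverse of g_ij

A_up = [3.0, 0.5]                            # contravariant components A^j
# Lower the index: A_i = g_ij A^j
A_down = [sum(g[i][j] * A_up[j] for j in range(2)) for i in range(2)]
# Raise it again: A^i = g^ij A_j
A_back = [sum(g_inv[i][j] * A_down[j] for j in range(2)) for i in range(2)]

# Contraction of a mixed rank-2 tensor T^a_b over its upper and lower index.
T = [[1.0, 2.0], [3.0, 4.0]]
trace = sum(T[a][a] for a in range(2))

# Contracting A^i with A_i lowers the rank to zero and leaves a scalar.
length_sq = sum(A_up[i] * A_down[i] for i in range(2))
print(A_down, A_back, trace, length_sq)
```

Raising after lowering should return the original components, and each contraction removes one upper and one lower index.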