This web site is aimed at the general reader who is keen to discover Einstein's theories of special and general relativity, and who may also like to tackle the essential underlying mathematics.
Einstein's Relativity is too beautiful and too engaging to be restricted to the professionals!
Have fun!
"I have no special talents
I am only passionately curious"
Albert Einstein
]]>
Quantum computers are built on top of single-qubit and two-qubit operators. In the last two articles, we covered a few single-qubit gates, in particular the Hadamard gate, which puts a qubit into superposition.
Here, we will explore the two-qubit operators and, more precisely, look at how to entangle qubits with the CNOT gate.
The Controlled-NOT gate (CNOT) is analogous to the XOR gate (Exclusive OR) in classical computing. We already presented the XOR truth table in our previous article Introduction to quantum logic gates: the gate gives a true (1 or HIGH) output when the number of true inputs is odd.
The quantum CNOT gate has two inputs, and thus two outputs. The target input is negated only if the control input is set to 1. If the control input is 0, the gate has no effect. The control qubit is not changed by the gate.
Below is a snapshot of both classical and quantum diagrams. We verify easily that they mirror each other: the quantum Target output column matches the y+x column of the classical XOR gate.
As we already know, each gate/operator can be expressed as a matrix. Since the CNOT gate takes two qubits as input and returns two qubits as output, its matrix is 4x4.
There is a useful technique to transform a truth table into a matrix. Starting at row 0, column 0, you label the columns and rows consecutively in binary, from 00 to 11 for example. You then place a 1 in a cell if the input maps to the output, and a 0 otherwise. That's it, you're left with a matrix for your gate.
Let's try to apply the CNOT gate, for example, to the |00> state, by multiplying the basis state vector by the CNOT matrix
We observe that it is the expected result ;)
If we now try to apply the CNOT gate to the |10> state, by definition, as the control qubit is |1>, the second qubit should be flipped from |0> to |1>. Let's observe it:
As expected, we get |11>.
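The truth-table-to-matrix technique and the two multiplications above can be checked numerically. The following is a minimal sketch, assuming the basis ordering |00>, |01>, |10>, |11> with the first qubit as control; the helper names `basis_state` and `apply_cnot` are ours and purely illustrative.

```python
import numpy as np

# CNOT in the basis |00>, |01>, |10>, |11> (assumption: first qubit = control,
# second qubit = target). Column j is the output for input basis state j.
CNOT = np.array([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
])

labels = ["00", "01", "10", "11"]

def basis_state(label):
    """Return the column vector of the two-qubit basis state |label>."""
    vec = np.zeros(4, dtype=int)
    vec[labels.index(label)] = 1
    return vec

def apply_cnot(label):
    """Apply CNOT to |label> and return the label of the resulting basis state."""
    out = CNOT @ basis_state(label)
    return labels[int(np.argmax(out))]

# Map every basis state through the gate.
results = {label: apply_cnot(label) for label in labels}
print(results)
```

The target bit is flipped only for the inputs |10> and |11>, that is, only when the control qubit is |1>.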
Let's now try to prepare a two-qubit system, for example in the |00> state, and then:
 apply the Hadamard gate to the first qubit so we get a 50/50 superposition state as detailed in Introduction to quantum logic gates
 then apply the CNOT gate with the second qubit |0> acting as the control qubit
]]>
If we need to construct the tensor product of two operators, and we do already know the matrix elements of the building blocks, we can combine them directly.
Here is the rule for combining, for example, 2x2 matrices to form 4x4 matrices:
or
Remark 1: the same pattern works for matrices of any size. In general, the tensor product of an m x n matrix and a p x q matrix is an mp x nq matrix.
Remark 2: All of this applies perfectly well to column and row vectors, which are just a particular kind of matrix.
So we would write, for example, for the basis vector |uu> of a two-qubit system:
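As a quick numerical illustration of the combination rule, here is a minimal sketch using NumPy's `np.kron`, which computes exactly this block-wise product. The matrices A and B contain arbitrary illustrative numbers, and the column-vector convention |u> = (1, 0), |d> = (0, 1) is an assumption on our side.

```python
import numpy as np

# Two arbitrary 2x2 matrices (illustrative values only).
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 5],
              [6, 7]])

# Tensor (Kronecker) product: each entry a_ij of A is replaced by the block a_ij * B.
AB = np.kron(A, B)
print(AB)

# Remark 1: an (m x n) matrix combined with a (p x q) matrix gives an (mp x nq) matrix.
C = np.ones((2, 3))
D = np.ones((4, 5))
print(np.kron(C, D).shape)

# Remark 2: the same rule applies to vectors (assumed convention for |u> and |d>).
u = np.array([1, 0])
d = np.array([0, 1])
uu = np.kron(u, u)   # basis vector |uu>
ud = np.kron(u, d)   # basis vector |ud>
print(uu, ud)
```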
]]>
In the previous article, we looked at single-qubit quantum gates, like the fundamental Hadamard gate.
We have detailed how it applies to one qubit. But let's see how it operates on two qubits, for example on |00>.
As we know already from our previous article Expectation value of a product state, the final state will be the tensor product of the two transformed single-qubit states.
which can be developed to:
So by applying the Hadamard gate to two qubits, we generate the superposition of all four basis states of the tensor product of the substates, each of the basis states |00>, |01>, |10> and |11> being observed with an equal probability of (1/2)^{2} = 0.25.
Below is this quantum circuit visualized in the Quirk browser. Quirk confirms that the probability that the result state collapses to any one of the four basis vectors equals 25%.
Remark 1: More generally, starting with a state of n qubits |0...0_{n}⟩, if we apply the Hadamard gate to each qubit, it results in the following state:
We end up with 2^{n} observable basis states. Many quantum algorithms use the Hadamard transform as an initial step, since it maps n qubits initialized to |0> to a superposition of all 2^{n} orthogonal states in the |0>, |1> basis with equal weight.
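The 25% figure and its n-qubit generalization can be reproduced with a few lines of NumPy. This is a minimal sketch: the function name `hadamard_all` is hypothetical, and the state |0...0> is assumed to be the first vector of the computational basis.

```python
import numpy as np
from functools import reduce

# Single-qubit Hadamard gate.
H = np.array([[1, 1],
              [1, -1]]) / np.sqrt(2)

def hadamard_all(n):
    """Return the amplitudes of (H x ... x H)|0...0> for n qubits (2**n values)."""
    H_n = reduce(np.kron, [H] * n)      # tensor product of n Hadamard gates
    zero_state = np.zeros(2 ** n)
    zero_state[0] = 1.0                 # |0...0> is the first basis vector
    return H_n @ zero_state

state = hadamard_all(2)
probabilities = state ** 2              # amplitudes are real here
print(state)
print(probabilities)
```

For n = 2 every amplitude is 1/2, so every basis state has probability 1/4; for general n each probability is 1/2^{n}.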
]]>
In our previous article Introduction to quantum computing, we saw that the real value of quantum computing lies in the fact that a quantum system can do processing while the qubits are in a superposition state. Hence, the operations defined by quantum algorithms do not manipulate, for example, just 3 bits; they manipulate 2^{3} probability^{[1]} values. One step in a quantum algorithm on a quantum computer with 3 qubits therefore modifies 8 values. Adding one qubit doubles the processing capabilities of the quantum computer. This explains the term "exponential" that is often used together with quantum computing: with N qubits, the processing power is proportional to 2^{N}.
While it should be clear by now that a quantum computer allows some kind of parallel processing, what do we mean more precisely by processing a qubit?
In classical computing, all operations ultimately come down to a sequence of simple manipulations of the bits in the computer system. Those low-level operations are achieved using gates. It can be shown that with a limited number of gates, all possible scenarios can be achieved.
A very simple classical gate is the NOT gate, also known as the inverter. This gate has one input bit and one output bit. The output bit of the gate is the inverse of the input bit. If the input is "0", the output will be "1". If the input is "1", the output will be "0".
The behavior of gates is often explained via simple tables where the possible combinations of input bits are listed, and the resulting output is listed in the last column. The following table shows the behaviour of the NOT gate
When the input of the gate is '0', the output is '1'. When the input of the gate is '1', the output is '0'.
The NOT gate involves a single bit only, but other gates involve more bits. The XOR (or EXOR) gate, for example, takes 2 input bits and outputs a value that is '1' when exactly one of the 2 input bits is '1' and the other is '0'.
If we want to process data in quantum computing, we have to use gates as well. Quantum gates are the building blocks of quantum circuits, like classical logic gates are for conventional digital circuits.
But unlike many classical logic gates, quantum logic gates should be reversible. That is, it should always be possible to apply another gate and go back to the state of the system before the first gate was applied. This restriction does not apply to classical gates. For example, the XOR gate is not reversible: if the result of an XOR gate is '1', it is impossible to know whether it was the first bit or the second bit that was '1'^{[2]}.
The quantum equivalent of the NOT gate for classical computers is called the Pauli-X gate. It acts on a single qubit. If the qubit is in the |0> or |1> state, the Pauli-X gate flips |0> into |1> and vice versa.
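To make the contrast between the reversible Pauli-X gate and the irreversible XOR gate concrete, here is a minimal sketch. It assumes the usual column-vector convention |0> = (1, 0) and |1> = (0, 1); the variable names are illustrative.

```python
import numpy as np

# Pauli-X gate and the two computational basis states (assumed convention).
X = np.array([[0, 1],
              [1, 0]])
ket0 = np.array([1, 0])
ket1 = np.array([0, 1])

flipped0 = X @ ket0      # X|0> gives |1>
flipped1 = X @ ket1      # X|1> gives |0>
round_trip = X @ X       # applying X twice gives the identity: X undoes itself

# Classical XOR: two different inputs produce the output 1,
# so the inputs cannot be recovered from the output alone.
xor_inputs_giving_1 = [(a, b) for a in (0, 1) for b in (0, 1) if a ^ b == 1]

print(flipped0, flipped1)
print(round_trip)
print(xor_inputs_giving_1)
```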
]]>We have so far explained in an accessible way the theoretical basis of entanglement. How cool would it be to illustrate these abstract articles with a quantum program? Let's first give a quick overview of quantum computing.
Quantum computing has been an increasingly popular research field and source of hype over the last few years. Very recently, Google has even officially announced that it has achieved quantum supremacy in a new article published in the scientific journal Nature. Google says that its 54-qubit Sycamore processor was able to perform a calculation in 200 seconds that would have taken the world’s most powerful supercomputer 10,000 years.
Almost everybody has heard that quantum programs work by asking quantum computing hardware for quantum bits or qubits, quantum analogues of classical bits that we can use to perform some computations. But what exactly is a qubit and how does it differ from a classical bit?
Qubits are the basic unit of information in a quantum computer and represent the simplest quantum systems, i.e. the ones described by a two-dimensional Hilbert space. They can be physically implemented by any system that has two states: this could be the quantum coin described in our previous article Introduction to quantum entanglement or, more likely, the electron spin as presented in Expectation value of a product state, which is fully described by a combination of the two basis vectors up and down, |u> and |d>, or equivalently |1> and |0>.
Say we have a classical 3-bit computer: at any moment in time it will be able to store only one of the 2^{3} possible combinations in memory, for example 101.
By contrast, a 3-qubit quantum computer will be able to store any combination α|000> + β|001> + γ|010> + δ|011> + ε|100> + ζ|101> + η|110> + θ|111> with |α|^{2} + |β|^{2} + |γ|^{2} + |δ|^{2} + |ε|^{2} + |ζ|^{2} + |η|^{2} + |θ|^{2} = 1^{[1]}
For example, our 3-qubit system could be in the following state at a given time t
At a given time, the eight complex numbers of the second column are the exact content of the memory of the quantum computer. Of course, it is impossible for us to know anything about these complex numbers while the algorithm is running; it is only at the end that a measurement will output a triplet of classical bits, for example 110.
If we had N qubits, we would have 2^{N} lines in the above table. And with around 300 qubits, we would have more lines than the number of atoms in the entire observable universe!
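The memory content of such a 3-qubit register can be mimicked classically with eight complex amplitudes. The sketch below is only an illustration under our own assumptions: the amplitudes are random numbers (not the ones of the article's table), and 10^{80} is used as a rough, commonly quoted order of magnitude for the number of atoms in the observable universe.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n_qubits = 3
dim = 2 ** n_qubits                      # 8 basis states, from |000> to |111>

# Hypothetical register content: random complex amplitudes, then normalized
# so that the squared moduli sum to 1.
amplitudes = rng.normal(size=dim) + 1j * rng.normal(size=dim)
amplitudes /= np.linalg.norm(amplitudes)

probabilities = np.abs(amplitudes) ** 2
for index, (amp, prob) in enumerate(zip(amplitudes, probabilities)):
    print(f"|{index:03b}>  amplitude = {amp:.3f}  probability = {prob:.3f}")

# A measurement returns a single triplet of classical bits, drawn with these probabilities.
outcome = rng.choice(dim, p=probabilities)
print("measured:", format(outcome, "03b"))

# Number of table lines for 300 qubits versus a rough atom count (assumed ~10**80).
atoms_estimate = 10 ** 80
print(2 ** 300 > atoms_estimate)
```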
]]>So far we have described the entangled state as this very puzzling composite state where we ignore everything about each subsystem taken separately but where the measure of one subsystem gives us direct (or real in the EPR vocabulary) information about the measurement of the other subsystem.
The previous article Expectation value of an entangled state exposed the first part: the outcome of the measurement of each subsystem is completely random. It's time now to look at the second part, the correlation.
For this, we have to introduce a new kind of observable, wider than the ones that Alice and Bob can measure separately, each using only their own detector. The measurement of this new family of observables, the composite observables, requires both detectors.
More precisely, the composite observable is an observable that is mathematically represented by first applying Alice's observable and then Bob's observable.
To make it more concrete, let's take the example of the previous entangled state
If we ask Alice to measure σ_{Az}, Bob to measure σ_{Bz} and then to compare their results, that's what we have to calculate to predict the result:
We have done half the work in the previous article:
Applying now σ_{Bz}, we get:
or more simply:
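A numerical version of this calculation can serve as a sanity check. The entangled state used in the article is not recoverable from the text here, so the sketch below assumes two common candidates, (|uu> + |dd>)/√2 and the singlet (|ud> - |du>)/√2, with the convention |u> = (1, 0), |d> = (0, 1); all variable names are ours.

```python
import numpy as np

sigma_z = np.array([[1, 0],
                    [0, -1]])
u = np.array([1, 0])   # |u> (assumed convention)
d = np.array([0, 1])   # |d>

# Two candidate entangled states (assumptions, see the lead-in).
psi_plus = (np.kron(u, u) + np.kron(d, d)) / np.sqrt(2)
singlet = (np.kron(u, d) - np.kron(d, u)) / np.sqrt(2)

# Composite observable: Alice's sigma_z combined with Bob's sigma_z.
composite = np.kron(sigma_z, sigma_z)

corr_plus = psi_plus @ composite @ psi_plus
corr_singlet = singlet @ composite @ singlet
print(corr_plus, corr_singlet)
```

In both cases the expectation value of the product has magnitude 1: the two individual results are random, yet their product is fully determined, which is precisely the correlation we are after.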
]]>
More precisely, we have decided to start this new series of articles with a presentation of the famous EPR Paradox and will then dive further into the explanation of the puzzling concept of entanglement.
Some basic concepts of quantum mechanics are assumed to be known, but don't worry: we will progress gently, without too much mathematical abstraction.
Enjoy your reading!
]]>After having spent some time on the expectation value of the sigma observable for a product state, let's try to do the same for an entangled state.
Let's choose our entangled state to be for instance^{[1]}:
For simplicity, let's write each composite vector as |uu> and |dd> (still remembering that such vectors are nothing else than a tensor product of the basis vectors of each subspace) and so simplify this equation from now on as:
Let's look first at the expectation value of Alice's σ of this state along the z-axis. We know already all the machinery we need to compute it
or
so finally
Let's look now at the expectation value of Alice's σ of this state along the y-axis
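Both expectation values can be verified numerically. This is a minimal sketch which assumes that the entangled state is (|uu> + |dd>)/√2 (the exact coefficients of the article's state are not recoverable here) and uses the standard Pauli matrices; the helper `expectation` is a hypothetical name.

```python
import numpy as np

sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]])
identity = np.eye(2)

u = np.array([1, 0], dtype=complex)   # |u> (assumed convention)
d = np.array([0, 1], dtype=complex)   # |d>

# Assumed entangled state built from |uu> and |dd>.
psi = (np.kron(u, u) + np.kron(d, d)) / np.sqrt(2)

def expectation(operator, state):
    """Return <state|operator|state> (real for a Hermitian operator)."""
    return np.vdot(state, operator @ state).real

# Alice's observables act on the first factor only, hence the identity on Bob's side.
exp_z = expectation(np.kron(sigma_z, identity), psi)
exp_y = expectation(np.kron(sigma_y, identity), psi)
print(exp_z, exp_y)
```

Under this assumption both values vanish: taken on its own, Alice's measurement outcome is completely random, whichever axis she chooses.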
]]>
After having introduced the basic concept of entanglement in our previous post Introduction to quantum entanglement by considering a composite system made up of a coin and a die, let's consider a new, more realistic system built up from two spins, each attached to a particle at a different location in space.
Alice has a set of spin operators, labeled σ_{A}, that act on her system and Bob has a similar set for his system, which can be labeled σ_{B}, so we don't mix them up with Alice's ones.
Each operator/apparatus can be independently oriented along any axis, so that the full set of components for Alice's and Bob's are: σ_{Ax}, σ_{Ay}, σ_{Az} and σ_{Bx}, σ_{By}, σ_{Bz}
Knowing that any vector of a given Hilbert space can be expressed in any orthonormal basis, let's choose the basis along the z axis. Along this axis, each spin can be either in the up state |u> or in the down state |d>.
As per our previous article, the basis vectors of the composite system are nothing else than the tensor products of the basis vectors of each subspace, i.e. the tensor product H_{A} x H_{B}, with H_{A} = {u_{A}, d_{A}} and H_{B} = {u_{B}, d_{B}}:
which can be also written in a more abbreviated way as
We know that the simplest state for the composite system is a product state (the result of completely independent preparations by Alice and Bob) and is of the form:
where the left factor represents Alice's state and the second factor represents Bob's.
We recall also from our basic quantum mechanics lesson how the spin operators act on the distinct states of a single spin
In this paragraph, we want to focus on the expectation value of a product state.
Let's start with a very simple product state with both spins of Alice and Bob prepared in the up state^{[1]}:
By definition, the expectation value of this state along the z axis from Alice's observable σ_{A} is^{[2]}:
The only thing we have to know is how an operator of a subsystem acts on a tensor product state. The answer is very simple: when Alice's operator acts on |u_{A}u_{B}>, |u_{A}d_{B}>, |d_{A}u_{B}> or |d_{A}d_{B}>, it just ignores Bob's half of the state label^{[3]}.
It's easy to see when we replace the operator by the tensor product of the operator with the identity operator I, as it should be.
So that finally
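The rule "Alice's operator ignores Bob's half" and the resulting expectation value can be checked with a short computation. This is a sketch under the usual assumptions (|u> = (1, 0), |d> = (0, 1), σ_{z} = diag(1, -1)); the names are illustrative only.

```python
import numpy as np

sigma_z = np.array([[1, 0],
                    [0, -1]])
identity = np.eye(2, dtype=int)
u = np.array([1, 0])   # |u> (assumed convention)
d = np.array([0, 1])   # |d>

# Alice's sigma_z acting on the composite system: sigma_z on her factor, identity on Bob's.
sigma_Az = np.kron(sigma_z, identity)

# Product state with both spins prepared up.
uu = np.kron(u, u)
expectation_uu = uu @ sigma_Az @ uu

# The operator only looks at Alice's label: |ud> keeps its sign, |du> changes sign.
ud = np.kron(u, d)
du = np.kron(d, u)
print(expectation_uu)
print(sigma_Az @ ud, sigma_Az @ du)
```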
]]>In our previous article EPR Paradox, we exposed the counterintuitive predictions of quantum mechanics about strongly correlated systems as they were first discussed by Albert Einstein in 1935, in a joint paper with Boris Podolsky and Nathan Rosen.
However, the three scientists did not coin the word entanglement, nor did they generalize the special properties of the state they considered. Following the EPR paper, Schrödinger wrote a letter^{[1]} to Einstein in German in which he used the word Verschränkung (translated later by himself as entanglement) to describe
the correlations between two particles that interact and then separate, as in the EPR experiment.
In this article, we would like to expose gently^{[2]} the basis of the theory which underlies this very special correlation.
Even if the EPR paper uses formal language such as System I and System II, we will prefer lighter-weight, informal language such as Alice and Bob.
We assume that Alice's system is described by a space of states called S_{A}, and similarly Bob's system is described by a state space called S_{B}.
Although Einstein's EPR paper was about describing two particles with the observables position and momentum, let's suppose for simplicity that Alice's system consists of a quantum coin with a finite^{[3]} two-dimensional space of states defined by the two basis vectors H (for heads) and T (for tails). As it is a quantum mechanical coin, it can exist in superposition and we can thus write its state as:
We can choose Bob's system to be a coin as well, but let's assume it is something different, like a six-sided die. Bob's Hilbert space of states S_{B} would then be six-dimensional, with each state being decomposed on the following basis:
Now if we were to consider these two systems no longer as existing independently, but as a composite system, how could we construct the state space S_{AB} for this combined system?
]]>From 1927 onward, although Einstein was considered to have played a fundamental pioneering role in the birth of quantum physics and himself recognized the quantum theory’s significant achievements, he began to have reservations about some of its fundamental aspects and constantly argued against the claim of its founders and proponents to have settled on a definitive theory^{[1]}.
How could the great scientist accept the essentially statistical nature of the quantum theory, which ceases to be deterministic as soon as we try to measure an observable^{[2]}? How could he give full credit to a theory which gives such an important place to observation, or more precisely to the act of measuring?
Einstein started to wonder whether it was possible, at least in principle, to ascribe certain properties to a quantum system in the absence of measurement: both the indeterminism and the irrealism so far inherent to the quantum interpretation would then begin to crack.
The so-called 1935 EPR^{[3]} paper Can Quantum-Mechanical Description of Physical Reality Be Considered Complete? (Phys. Rev. 47, 777 (1935) – Published 15 May 1935) challenges more particularly the prediction of quantum mechanics that it is impossible to know simultaneously both the exact position and the exact momentum of a quantum particle.
The article begins with the logical disjunction of the two following assertions as a first premise; one or the other of these must hold:
 (1) the description of reality given by the wave function in quantum mechanics is not complete
 (2) two physical quantities described by non-commuting operators cannot have simultaneous reality
According to the Copenhagen interpretation of quantum mechanics, (1) is false^{[4]} and (2) is true.
The aim of the EPR article is for the authors to show, on the contrary, that (1) is true and (2) is false.
]]>After having derived the Einstein equation in the historical way (refer to Einstein Field Equations) and through the more modern Lagrangian road (see Einstein-Hilbert action), it is now time to seek a solution.
Minkowski spacetime is usually known as a 'trivial' solution to the Einstein field equations, trivial meaning that both the energy-momentum tensor and the Riemann curvature tensor are equal to zero.
Finding an exact solution to the Einstein equations that does not reduce to flat Minkowski spacetime is notoriously difficult. Einstein himself used approximation methods when working out the predictions of general relativity^{[1]}, in the same way that it is common to use an approximation when looking at the Newtonian limit (weak, static gravitational fields and slow-moving particles) by assuming the metric to be g_{μν} = η_{μν} + h_{μν}; refer for example to our previous article Geodesic equation in the Newtonian Limit.
That's why Einstein was so pleasantly surprised when in 1916, shortly after he had proposed his general theory of relativity, the German astrophysicist Karl Schwarzschild published an exact solution to the field equations. This is how Einstein's letter of 16 January 1916 to Schwarzschild^{[2]} begins:
"Highly esteemed Colleague, I examined your paper with great interest. I would not have expected that the exact solution to the problem could be formulated so simply. The mathematical treatment of the subject appeals to me exceedingly. Next Thursday I am going to deliver the paper before the Academy with a few words of explanation."
]]>An alternative route to Einstein's equation is through the principle of least action, as we did previously to deduce the geodesic equation in curved spacetime in Geodesic equation from the principle of least action.
In this article, we will therefore go through the process of deriving the Einstein equations in vacuum and then in the presence of matter using the variational approach.
The derivation of the action from a set of equations of motion is very hard, not always possible, and there is no systematic way to do it. We therefore will begin by guessing the action and show that it gives the right answer.
So we will first seek an action S for gravitation that leads to the field equations of general relativity in the absence of matter and energy (in vacuum), that is, we will guess something like:
where L is a scalar Lagrange density and d^{4}V is the element of 4-volume. We thus need both a scalar and the 4-volume element.
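For reference, the kind of guess described above can be written out as follows. This is a sketch in standard textbook notation rather than a transcription of the article's own formula: the invariant volume element √(-g) d^{4}x and the choice of the Ricci scalar R as the simplest curvature scalar are the usual Einstein-Hilbert ingredients, and overall normalization constants are deliberately left out.

```latex
% Generic form: a scalar Lagrange density integrated over the invariant 4-volume
S = \int \mathcal{L} \, d^{4}V , \qquad d^{4}V = \sqrt{-g} \, d^{4}x

% Simplest candidate scalar built from the metric and its derivatives: the Ricci scalar R
% (standard Einstein-Hilbert choice, written here up to a constant factor)
S_{EH} = \int R \, \sqrt{-g} \, d^{4}x
```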
]]>This article looks at the process of deriving the so-called Palatini equation and follows the demonstration found in D'Inverno, Introducing Einstein's Relativity, Chapter 11.1 (General Relativity from a variational principle, The Palatini equation).
Many tensor identities are derived most easily using the technique of geodesic coordinates in a Local Inertial Frame, where we choose an arbitrary point P at which the Christoffel symbols vanish, which in D'Inverno's notation could be written as:
As we know from our article Riemann curvature tensor part III: Symmetries and independant components, in this particular case the Riemann tensor reduces to:
Looking now at a variation of the connection Γ^{a}_{bc} to a new connection Γ^{a}_{bc}(hat):
Then δΓ being the difference of two connections, is a tensor of type (1,2), and this variation results in a change in the Riemann tensor between two coordinate systems as:
since partial differentiation commutes with variation and is equivalent to covariant differentiation in geodesic coordinates.
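Written out, the chain of steps described above reads as follows. This is a sketch in our own LaTeX transcription: the index placement follows the convention R^{a}_{bcd} = ∂_{c}Γ^{a}_{bd} - ∂_{d}Γ^{a}_{bc} + ..., which we assume matches D'Inverno's, and the symbol ≐ is used here for "equal at P in geodesic coordinates".

```latex
% Geodesic coordinates at the point P: the connection vanishes there
\Gamma^{a}_{bc} \doteq 0

% so the Riemann tensor reduces at P to its derivative terms
R^{a}{}_{bcd} \doteq \partial_{c}\Gamma^{a}_{bd} - \partial_{d}\Gamma^{a}_{bc}

% Varying the connection, \Gamma^{a}_{bc} \to \hat{\Gamma}^{a}_{bc} = \Gamma^{a}_{bc} + \delta\Gamma^{a}_{bc}:
\delta R^{a}{}_{bcd} \doteq \partial_{c}(\delta\Gamma^{a}_{bd}) - \partial_{d}(\delta\Gamma^{a}_{bc})
                     \doteq \nabla_{c}(\delta\Gamma^{a}_{bd}) - \nabla_{d}(\delta\Gamma^{a}_{bc})

% Both sides are tensors, so the relation holds in any coordinate system (Palatini equation)
\delta R^{a}{}_{bcd} = \nabla_{c}(\delta\Gamma^{a}_{bd}) - \nabla_{d}(\delta\Gamma^{a}_{bc})
```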
]]>It should be clear that General Relativity describes gravitation in terms of curvature of spacetime and reduces to the Special Theory of Relativity in a Local Inertial Frame (LIF). However, it is important to explicitly check that the description reduces to the Newtonian treatment when we select the correct boundary conditions.
These conditions, referred to as the Newtonian limit, are applicable to physical systems exhibiting:
Mathematically, this leads to the following approximations:
]]>Just as Maxwell's equations govern how the electric and magnetic fields respond to charges and currents, the Einstein Field Equation governs how the metric responds to energy and momentum.
There are traditionally two ways of deriving this equation:
Let us consider a matrix of a general form
Then the trace of this matrix, as for any square matrix, is the sum of the elements on the main diagonal (the diagonal from the upper left to the lower right), so
Trace(A) = tr(A)=a_{0} + a_{1}
so that
If we now consider the exponential matrix e^{A} as:
then the determinant of this matrix, which for a diagonal matrix is the product of the elements on the main diagonal, can be expressed as:
so that finally we can write
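The relation between determinant and trace reached here, det(e^{A}) = e^{tr(A)}, is easy to check numerically. The sketch below assumes, as the definitions above suggest, that A is a diagonal 2x2 matrix with entries a_{0} and a_{1}; the numerical values are arbitrary.

```python
import math

# Hypothetical diagonal entries of A (arbitrary illustrative values).
a0, a1 = 0.3, -1.2

# Trace of A: sum of the diagonal elements.
trace_A = a0 + a1

# For a diagonal matrix, the exponential is obtained by exponentiating each diagonal entry.
exp_A_diagonal = (math.exp(a0), math.exp(a1))

# Determinant of the diagonal matrix e^A: product of its diagonal entries.
det_exp_A = exp_A_diagonal[0] * exp_A_diagonal[1]

print(det_exp_A, math.exp(trace_A))
```

Both printed numbers agree, since e^{a_{0}} e^{a_{1}} = e^{a_{0} + a_{1}}.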
]]>As we know from Newton's first law of motion, a free particle in motion travels in a straight line with constant velocity.
In this article our aim is to express this law, sometimes referred to as the law of inertia, in terms of the concepts of special relativity. More precisely, we will try to demonstrate that in the flat Minkowski spacetime of special relativity, a free particle has the maximum proper time of all possible world lines that connect two events.
As a starting point, let us recall the expression of the Proper Time as measured by a clock between two events A and B in any inertial reference frame, with t and v corresponding respectively to the time and the speed of the clock as measured in this frame (here we assume c=1 for readability):
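With c = 1, the proper time along a world line is the integral of √(1 - v^{2}) dt between the two events. The sketch below evaluates it for world lines made of constant-speed segments and compares a clock at rest with a clock that travels away and comes back; the function name `proper_time` and the numerical values are illustrative assumptions.

```python
import math

def proper_time(segments):
    """Proper time along a world line made of constant-speed segments (c = 1).

    segments: list of (duration, speed) pairs measured in the inertial frame.
    """
    return sum(dt * math.sqrt(1.0 - v ** 2) for dt, v in segments)

# Both world lines connect the same two events, separated by 10 units of coordinate time.
straight = proper_time([(10.0, 0.0)])            # free particle at rest in this frame
detour = proper_time([(5.0, 0.6), (5.0, 0.6)])   # out at speed 0.6, then back at speed 0.6

print(straight, detour)
```

The straight world line accumulates 10 units of proper time, the detour only 8: any departure from uniform motion between the two events reduces the proper time.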
]]>The unravelling of Einstein's erratic quest for the correct field equations can be traced in a dramatic series of weekly communications from Einstein to the Prussian Academy of Science.
On the four Thursdays of November 1915, he presented to the Prussian Academy of Science a paper of fundamental importance
1915, November 4: Einstein abandons the Entwurf theory^{[1]} and submits to the Prussian Academy of Science the first of a series of papers, titled "On the General Theory of Relativity".
1915, November 11: Einstein submits "On the General Theory of Relativity (Addendum)", in which he introduces the hypothesis that macroscopic matter could eventually be reduced to "purely electromagnetic processes"^{[2]}.
]]>As our 2017 New Year's resolution, we have decided to undertake the French translation of the entire website.
From now on, each new article will be offered in both languages, whereas each existing article will be translated into French over time.
If you are interested in contributing to this task, feel free to contact me at the email address specified in the About us section.
Happy new year!
]]>In General Relativity, geodesics are the curves that a free particle (that is, a particle upon which no force acts, where ‘force’ in this case excludes gravity, since the effects of gravity are felt entirely through the curvature of spacetime) will follow in a curved spacetime.
A geodesic could be equivalently defined as:
[1] Refer to our article Geodesics as proper time maximization to see how this applies also in Special Relativity.
]]>Einstein's Zur Elektrodynamik bewegter Körper (On the Electrodynamics of Moving Bodies), his third paper published in the stellar year of 1905, was received on June 30 and published on September 26 in the scientific journal Annalen der Physik.
The fact that the paper had been accepted and published without difficulty three months later by this prestigious journal was a good sign for Einstein and gave the hitherto completely unknown physicist good hope of receiving some reactions very shortly, even if they were severe criticisms.
Unfortunately, despite his efforts to carefully go through the following issues of Annalen der Physik, Einstein could not find a single reference to his theory. In retrospect, this sounds almost incredible!
]]>On October 17, 1933 Albert Einstein and his wife Elsa moved to the US and Albert took up a position at the Institute for Advanced Study at Princeton, New Jersey.
In 1935 Einstein decided to stay in the US and became a citizen in 1940. His affiliation with the Institute for Advanced Study would last until his death in 1955.
Einstein moves to US October 17, 1933 from EDN web site
]]>The saga of Einstein's search for his general theory of relativity could have successfully ended as early as 1913, with the completion of the Outline of a Generalized Theory of Relativity and of a Theory of Gravitation^{[1]}, also known as the "Entwurf".
Indeed, in this paper Einstein opens Section 5 in a very promising way by writing the field equations in the following general form:
where k is a constant, Θ_{μν} is the contravariant stress-energy tensor and Γ_{μν} is the still to be found gravitational tensor "which has to be derived from the fundamental tensor g_{μν} by differential operations. In line with the Newton-Poisson law one would be inclined to require that these equations be second-order."
]]>Empty spacetime is flat: it looks exactly like the Minkowski spacetime of Special Relativity.
In Einstein's geometric General Theory of Relativity, a mass (or equivalently energy) that we place in a region of space will lead to a distortion of spacetime, commonly referred to as spacetime curvature.
In curved spacetime, there are no straight lines, just as there are no straight lines on the surface of a sphere. The closest we can get to the notion of a straight line is a geodesic, a spacetime curve that is as straight as possible.
The tendency of objects in free fall along geodesics to approach or recede from one another, in other words the fact that the initially parallel geodesics of two objects deviate from each other (referred to as geodesic deviation), is the signature of a curved spacetime.
Another way of measuring spacetime curvature is by parallel transporting a vector (moving a vector along a path while keeping it constant all the while). See the article Riemann curvature tensor part I: derivation from covariant derivative commutator, where we derive the curvature by using the covariant derivative commutator.
]]> "You told us how an almost church-like atmosphere is pervading your desolate house now. And justifiably so, for unusual divine powers are at work in there." Besso to Einstein, 30 Oct 1915 
In the last two articles, we have derived the G^{μν} (Einstein tensor) and T^{μν} (energy-momentum tensor) components of the Einstein equation:
We have yet to determine the constant k.
To achieve this, we need to show that the Einstein equation reduces to Newton’s law of gravity for weak and static gravitational fields (Newtonian limit).
The first step consists in writing the previous Einstein equation in a slightly different form that is sometimes more practical to use in calculations.
It is actually in this second form that Einstein published it in his article "The Field Equations of Gravitation", submitted on November 25, 1915 to the Königlich Preussische Akademie der Wissenschaften.
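The passage from the first form to the second one is a short exercise in taking the trace. The sketch below is our own LaTeX transcription, assuming a four-dimensional spacetime and the definition G^{μν} = R^{μν} - ½ g^{μν} R of the Einstein tensor; the constant k is left undetermined, as in the text.

```latex
% First form of the Einstein equation
R^{\mu\nu} - \tfrac{1}{2} g^{\mu\nu} R = k \, T^{\mu\nu}

% Contract with g_{\mu\nu}, using g_{\mu\nu} g^{\mu\nu} = 4 and T = g_{\mu\nu} T^{\mu\nu}
R - 2R = k \, T \quad \Longrightarrow \quad R = -k \, T

% Substitute R back: the second ("trace-reversed") form
R^{\mu\nu} = k \left( T^{\mu\nu} - \tfrac{1}{2} g^{\mu\nu} T \right)
```

In vacuum, where T^{μν} = 0, this second form reduces immediately to R^{μν} = 0, which is one reason it is often more convenient in calculations.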
]]>If our aim is to find the relativistic generalization of Poisson's equation for the gravitational field:
where Φ refers to the gravitational potential and ρ to the mass density, we are half way.
Indeed, we have already seen in our previous article The energy-momentum tensor that the generalisation of the mass density (right-hand term of the equation) corresponds to the energy-momentum tensor T^{μν}.
It seems reasonable then to assume that our equation should take the form of:
where k stands for a scalar and G^{μν}, called the Einstein tensor, represents a rank-2 tensor describing the spacetime curvature.
]]>Almost one year after its launch, the stats show an increasing interest in einsteinrelativelyeasy.com.
We are now approaching 900 unique visitors per month, with almost 1500 visits.
Our most frequent visitors are from Great Britain, India, Spain and United States.
I hope this growing trend continues over the coming months!
]]>As our ultimate goal is to formulate a relation between the spacetime geometry and its content, we first have to find the right mathematical tool to describe this spacetime content.
In Special Relativity, we have seen in our article Introduction to Four-momentum vector and E = mc2 that mass, energy and momentum are all related, as expressed in the energy-momentum relation:
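In its standard form this relation reads E^{2} = (pc)^{2} + (mc^{2})^{2}. As a quick numerical check, the sketch below computes E = γmc^{2} and p = γmv for a particle and verifies the relation; the function name, the mass and the speed are arbitrary illustrative choices.

```python
import math

c = 299_792_458.0   # speed of light in m/s

def energy_and_momentum(mass, speed):
    """Return the relativistic energy E = gamma*m*c^2 and momentum p = gamma*m*v."""
    gamma = 1.0 / math.sqrt(1.0 - (speed / c) ** 2)
    return gamma * mass * c ** 2, gamma * mass * speed

# Hypothetical particle: mass 1 kg moving at 80% of the speed of light.
mass = 1.0
E, p = energy_and_momentum(mass, 0.8 * c)

lhs = E ** 2
rhs = (p * c) ** 2 + (mass * c ** 2) ** 2
print(lhs, rhs)
```

Setting the speed to zero gives p = 0 and the relation reduces to the rest energy E = mc^{2}.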
]]>"The general theory of relativity must be capable of treating every coordinate system, whatever its state of motion relative to others may be, as "at rest", i.e., the general laws of nature must be expressed by identical equations relative to all other systems, whichever way they are moving." Einstein, Fundamental Ideas and Methods of the Theory of Relativity, Presented in Their Development (1920) 
The term covariance implies a formalism in which the laws of physics maintain the same form under a specified set of transformations.
]]>Einstein's complex general theory of relativity was accessible only to professional colleagues when it was first published in 1916. For Nature four years later (1920), Einstein sought to recount, for the general reader, the process by which he reached his revolutionary conclusions.
"On the occasion of the finding of the gravitational curvature of light rays by the British expedition that was sent to observe the eclipse of the sun, I have been urged by many to give a brief description to nonmathematicians of the theory and its development."
]]>As we have seen in our previous article, Einstein first published in 1905 a demonstration attempting to establish the equivalence of inertia and energy, or mass and energy.
However, it is only in 1912, in the Manuscript of Special Relativity, that the famous formula E=mc^{2} appears in its well-known form.
]]>
We thus find the occurrence of a gravitational field connected with a spacetime variability of the g_{στ}. [Einstein, The Foundation of the General Theory of Relativity, Annalen der Physik, vol. XLIX, 1916; The Collected Papers of Albert Einstein, doc. 30] 
Once you have arithmetized a space with an arbitrary coordinate system, there is one tensor that allows you to define fundamental quantities such as lengths and time in a consistent manner, no matter which coordinate system you employ.
That tensor, the one that "provides the metric" for a given coordinate system in the space of interest, is called the metric tensor, and is represented by the lowercase letter g.
]]>I have recently replied to a comment posted by Ingeniero concerning the relation between contravariant/covariant vector components and basis vectors transformation matrix.
Hope it makes things clearer.
See the comment
]]>
In 1907, two years after the appearance of his first paper on the theory of relativity, Einstein published in the then prestigious Jahrbuch der Radioaktivität und Elektronik an extensive survey article on the subject entitled "On the Relativity Principle and the Conclusions Drawn from It"^{[1]}.
In particular, in §V. Principle of Relativity and Gravitation section 17. Accelerated Reference System and Gravitational Field, Einstein assumes "the complete physical equivalence of a gravitational field and a corresponding acceleration of the reference system."
]]>Albert Einstein's paper Does the inertia of a body depend upon its energy content?^{[1]} was published in the journal "Annalen der Physik" on November 21, 1905.
In this paper, Einstein revealed the relationship between energy and mass that would eventually lead to the famous mass-energy equivalence formula E=mc^{2} (energy equals mass times the velocity of light squared).
To be more exact, in this article, the equation E=mc^{2} is not originally written as a formula but as a sentence in German saying that "if a body releases the energy L in the form of radiation, its mass diminishes by L/V^{2}". Einstein uses V to mean the speed of light in a vacuum and L to mean the energy lost by a body in the form of radiation.
]]>Our aim is to get more familiar with the Riemann curvature tensor and to calculate its components for the two-dimensional surface of a sphere of radius r.
First, let us remark that for a two-dimensional space such as the surface of a sphere, the Riemann curvature tensor has only one independent nonzero component.
As we know from our previous article The Riemann curvature tensor part III: Symmetries and independent components, the first pair and the last pair of indices must each consist of different values for the component to be (possibly) nonzero. Therefore, in a two-dimensional space where each index can only take the values 0 and 1, the only possibility is for each pair to contain the distinct indices 0 and 1, which here label the angular coordinates θ and φ on the sphere.
]]>As we will see later, the Bianchi Identity equation will be of fundamental importance to find the Einstein equation.
Also, the complete, unaltered form of the Riemann curvature tensor doesn't appear in the Einstein field equations. Instead, it is contracted to give two other important measures of the curvature, known as the Ricci tensor and the Ricci scalar.
In this article, our aim is to define these three important quantities derived from the Riemann tensor.
]]>The definition of a manifold captures the idea that the coordinate systems are local, and constrains the permitted transformations between these local coordinates. In this sense, a surface is a two-dimensional manifold; spacetime is a four-dimensional manifold.
The Metric tensor g allows one to define and compute the infinitesimal distance on the manifold:
We recall that the physical significance of this is that if we have a small displacement in spacetime (dt, dx, dy, dz), then ds is the total distance moved, and also that this quantity is an invariant, i.e. all observers in any frame of reference will agree on it.
A metric is positive definite if ds^{2} is always positive, and Riemannian manifolds have a metric that is positive definite.
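To make the invariance of ds^{2} concrete, here is a minimal Python sketch for the flat (Minkowski) case. The (+, −, −, −) sign convention, the sample displacement and the function names are assumptions chosen for illustration only.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def interval_squared(dt, dx, dy, dz):
    """ds^2 in flat spacetime, assuming the (+, -, -, -) sign convention."""
    return (C * dt) ** 2 - dx ** 2 - dy ** 2 - dz ** 2

def boost_x(dt, dx, dy, dz, v):
    """Lorentz boost of a displacement along x, for a relative speed v."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return (gamma * (dt - v * dx / C ** 2), gamma * (dx - v * dt), dy, dz)

displacement = (1.0e-6, 100.0, 20.0, 5.0)  # (dt, dx, dy, dz), arbitrary values
boosted = boost_x(*displacement, v=0.6 * C)

# Both observers should agree on ds^2, up to rounding errors.
print(interval_squared(*displacement))
print(interval_squared(*boosted))
```

The two printed values should agree up to floating-point rounding, even though the individual components dt and dx differ between the two frames.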
]]>In April 2017, eight observatories around the world—including the Atacama Large Millimeter Array (ALMA) and the South Pole Telescope—will simultaneously observe Sagittarius A* , the supermassive black hole at the center of Milky Way.
]]>In our two previous articles, we have deduced the rather complicated expression of the Riemann curvature tensor, a glorious mixture of derivatives and products of connection coefficients, with 256 (=4^4) components in four-dimensional spacetime.
But we have also demonstrated in our article Local Flatness or Local Inertial Frames and SpaceTime curvature that a suitable choice of coordinate system can nullify all but 20 of the second derivatives of a given metric in a curved spacetime. Our aim in this article is to demonstrate that the Riemann tensor has only 20 independent components and that these components are precisely combinations of those 20 non-vanishing second derivatives.
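As a quick sanity check on these counts, the short Python sketch below uses the standard closed-form result n^{2}(n^{2} − 1)/12 for the number of independent components of the Riemann tensor in n dimensions. The formula is quoted here as a known result rather than derived, and the function names are purely illustrative.

```python
def total_components(n):
    """Number of components of a rank-4 tensor in n dimensions."""
    return n ** 4

def independent_riemann_components(n):
    """Standard count n^2 (n^2 - 1) / 12, quoted here rather than derived."""
    return n * n * (n * n - 1) // 12

for n in (2, 3, 4):
    print(n, total_components(n), independent_riemann_components(n))
```

For n = 4 this gives 256 components in total, of which 20 are independent; for the two-dimensional surface of a sphere it gives a single independent component.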
]]>As suggested by Richard P. Feynman in his lecture dedicated to Special Relativity, we should look at the Lorentz transformations in the same way as what is happening for an 'ordinary' object: when we look at an object, its depth and width are only apparent and will be different if we step aside and look at the same object from a different optical angle.
By analogy, we should think about space and time coordinates as only 'apparent' properties depending on our velocity; in other words, in the space measurements of one man there is mixed in a little bit of the time, as seen by the other and vice versa. We don't easily get this intuition because as we have had no effective experience of going nearly as fast as light in the past, our brain is unable to apprehend the fundamental unique nature of the concepts of space and time, and to recombine them together as it would do for width and length when we move around an object to a new position.
]]>Commitment: 12 weeks of study, 46 hours per week
More information there Introduction into General Theory of Relativity
]]>In the previous article The Riemann curvature tensor part I: derivation from covariant derivative commutator, we have shown a way to derive the Riemann tensor from the covariant derivative commutator, which physically corresponds to the difference of parallel transporting a vector first in one way and then the other, versus the opposite.
Another interpretation is in terms of the relative acceleration of nearby particles in free fall.
]]>In our previous article Local Flatness or Local Inertial Frames and SpaceTime curvature, we have come to the conclusion that in a curved spacetime, it was impossible to find a frame for which all of the second derivatives of the metric tensor could be null.
We have also mentioned the name of the most important tensor in General Relativity, i.e. the tensor in which all this curvature information is embedded: the Riemann tensor (named after the nineteenth-century German mathematician Bernhard Riemann), or curvature tensor. In other words, the vanishing of the Riemann tensor is both a necessary and sufficient condition for Euclidean (flat) space.
In this article, our aim is to try to derive its exact expression from the concept of parallel transport of vectors/tensors.
]]>In the previous article Covariant differentiation exercise 1: calculation in cylindrical coordinates, we have deduced the expression of the covariant derivative of a tensor of rank 1, i.e. of a contravariant vector, type (1,0), or of a covariant vector, type (0,1).
It can be shown that the covariant derivatives of higher rank tensors are constructed from the following building blocks:
Following the three rules given above, we obtain for tensors of rank 2, respectively of type (1,1) T^{μ}_{ν} , (0,2) T_{μν} and (2,0) T^{μν}:
We recall from our article Minkowski's FourDimensional SpaceTime the Euclidean metric tensor's expression for Cartesian coordinates
And substituting g_{ij} in the second above equality dedicated to type (0,2) tensor gives
As all the terms g_{ij} are constant, the first term ∂g_{ij}/∂x^{β} is zero. And because the metric components are constant in Cartesian coordinates, all the Christoffel symbols vanish as well, so that the complete right-hand side of the equation equals zero, and therefore
But this is the magic and clever thing about tensors: if this equation holds true in one particular coordinate system, here the Cartesian coordinates, it must be true for the Euclidean metric in ALL coordinate systems.
Let's try to verify this by calculating one component of the covariant differentiation in the spherical coordinates.
We recall from our article that in spherical coordinates, the metric's expression is
If we were to calculate the component g_{φφ;θ}, we should then write
But g_{αφ} ≠ 0 only if α = φ, based on the above expression, so we can simplify this equation
Or
And from the previous article Christoffel symbol exercise: calculation in polar coordinates part II, we know the expression of the Christoffel symbols Γ^{φ}_{θφ} = Γ^{φ}_{φθ} = cosθ/sinθ
so that
Finally, we confirm that this component of the covariant derivative with respect to θ also equals zero in the spherical coordinate system, as expected
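The same check can be sketched numerically in Python. The snippet assumes g_{φφ} = r^{2}sin^{2}θ and Γ^{φ}_{φθ} = cosθ/sinθ, so that g_{φφ;θ} = ∂g_{φφ}/∂θ − 2Γ^{φ}_{φθ}g_{φφ}; the partial derivative is approximated by a central finite difference, and the function names are illustrative.

```python
import math

def g_phiphi(r, theta):
    """Metric component g_phiphi = r^2 sin^2(theta) in spherical coordinates."""
    return (r * math.sin(theta)) ** 2

def gamma_phi_phi_theta(theta):
    """Christoffel symbol Gamma^phi_{phi theta} = cos(theta) / sin(theta)."""
    return math.cos(theta) / math.sin(theta)

def g_phiphi_cov_theta(r, theta, h=1e-6):
    """g_{phiphi;theta} = d(g_phiphi)/d(theta) - 2 Gamma^phi_{phi theta} g_phiphi."""
    partial = (g_phiphi(r, theta + h) - g_phiphi(r, theta - h)) / (2 * h)
    return partial - 2 * gamma_phi_phi_theta(theta) * g_phiphi(r, theta)

# Should be zero up to finite-difference and rounding errors.
print(g_phiphi_cov_theta(2.0, 0.7))
```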
]]>
To make the meaning of the equations of covariant differentiation seen in last article Introduction to Covariant Differentiation more explicit, we will consider the covariant derivative of vector V with respect to θ in cylindrical coordinates (so x^{1}=r, x^{2}=θ, and x^{3}=z).
Setting β=2 in the following equation, since we are interested in the covariant derivative with respect to θ:
we get
We know the values of the first two Christoffel symbols, as we have already calculated them in the previous article Christoffel symbol exercise: calculation in polar coordinates part I
so that we already know that
We know also that since
all the symbols from the following form vanish
thus we end up with this equality
which says that a change in the r-component of vector V caused by a change in θ is caused both by a change in V^{r} with respect to θ and by a change in the basis vectors which causes a portion of V that was originally in the θ-direction to now point in the r-direction.
Likewise, for the change in V^{θ} as the value of θ is changed, we have
which says that a change in the θ-component of vector V caused by a change in θ is caused both by a change in V^{θ} with respect to θ and by a change in the basis vectors which causes a portion of V that was originally in the r-direction to now point in the θ-direction.
Thus finally the covariant derivative of vector V with respect to θ in cylindrical coordinates is
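A small numerical sketch in Python can illustrate the two non-trivial components. It assumes the coordinate-basis Christoffel symbols Γ^{r}_{θθ} = −r and Γ^{θ}_{rθ} = 1/r, so that V^{r}_{;θ} = ∂V^{r}/∂θ − rV^{θ} and V^{θ}_{;θ} = ∂V^{θ}/∂θ + V^{r}/r. As a test field (an assumption made for illustration) we take the constant Cartesian unit vector along x, whose cylindrical components are V^{r} = cosθ and V^{θ} = −sinθ/r; a constant field should have a vanishing covariant derivative.

```python
import math

def v_r(r, theta):
    """r-component of the constant Cartesian unit vector along x."""
    return math.cos(theta)

def v_theta(r, theta):
    """theta-component (coordinate basis) of the same constant vector."""
    return -math.sin(theta) / r

def d_theta(f, r, theta, h=1e-6):
    """Central finite-difference approximation of df/d(theta)."""
    return (f(r, theta + h) - f(r, theta - h)) / (2 * h)

def covariant_derivative_theta(r, theta):
    """Return (V^r_{;theta}, V^theta_{;theta}) in cylindrical coordinates."""
    dv_r = d_theta(v_r, r, theta) - r * v_theta(r, theta)       # Gamma^r_{theta theta} = -r
    dv_theta = d_theta(v_theta, r, theta) + v_r(r, theta) / r   # Gamma^theta_{r theta} = 1/r
    return dv_r, dv_theta

# Both components should vanish up to numerical error.
print(covariant_derivative_theta(2.0, 0.9))
```

The ordinary partial derivatives of the components are not zero here; it is only after adding the Christoffel correction terms that the result vanishes, as it should for a vector field that does not actually change.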
]]>
"For much of the period between 1912 and 1916 he was truly lost in the tensors, quite completely on the wrong path, accompanied by erroneous reasons he claimed to be fundamental.
And yet, quite singularly, in the course of a month he abandoned his errors and their justifications.
The moral, perhaps, is that a certain fickleness is more conducive to theoretical progress than is any abundance of conceptual clarity  at least if one is Einstein."
Lost in the Tensors: Einstein's Struggles with Covariance Principles, 1912-1916
]]>
I really enjoyed watching these videos on General Relativity and would like to share them with you:
]]>After spending some time looking at tensors, we can now expose the problem of how to differentiate a tensor.
Consider a vector V = V^{α}e_{α} (i.e. the vector has contravariant components V^{α} and coordinate basis vectors e_{α}). Using the product rule of differentiation, the rate of change of the vector V with respect to x^{β} is:
But we recall from our article Christoffel Symbol or Connection coefficient that the connection coefficients are defined by:
Substituting this expression in the above equation gives
The right-hand term has two dummy indices (i.e. indices to be summed over), α and γ. We can improve the formula by swapping α and γ in that term to give:
and factoring out e_{α} gives
This expression indicates the rate of change of V^{α} in each of the directions β of the coordinate system x^{β}, and is known as the covariant derivative of the contravariant vector V. The nabla symbol is used to denote the covariant derivative
In words: the covariant derivative is the usual derivative along the coordinates with correction terms which tell how the coordinates change.
The interesting property of the covariant derivative is that, as opposed to the usual directional derivative, this quantity transforms like a tensor, i.e. it is independent of the manner in which it is expressed in a coordinate system.
Remark 1: As we have seen in our articles Local Flatness or Local Inertial Frames and SpaceTime curvature and Local Inertial Frame (LIF), in an inertial frame of reference, the vanishing of the partial derivatives of the metric tensor at any point of M is equivalent to the vanishing of the Christoffel symbols, and we can then write this fundamental equality in the context of any inertial or local inertial frame:
Remark 2: the fact that the Christoffel symbol by itself does NOT transform as a tensor can be easily deduced from the fact that we can always find a (local) inertial frame in which its value equals zero, which should not be possible for a nonzero tensor.
Remark 3: we can also find these equivalent notations for the covariant differentiation. In particular, a common notation for the covariant derivative is to use a semicolon (;) in front of the index with respect to which the covariant derivative is being taken (β in this case)
Let us take the scalar product A^{μ}B_{μ} of two arbitrary vectors, one contravariant (A) and the other covariant (B). Applying the rules of differentiation, we then have:
But as the value of a scalar at a point in spacetime does not depend on the basis vectors, the covariant derivative of a scalar equals its ordinary derivative:
Comparing these last two equations and renaming some of the dummy indices gives:
As this equation should hold true for any arbitrary vector A, the quantity in brackets must necessarily be zero. So we have shown that the expression of the covariant derivative of the covariant components of a vector B is as below:
Note that the term involving Christoffel symbols is subtracted in this case.
In the same manner as with the contravariant vectors, the second term vanishes in the context of an inertial frame of reference. We have then:
]]>
In our last article Local Flatness or Local Inertial Frames and SpaceTime curvature, we have introduced the concept of Riemann tensor, saying that the importance of this tensor stems from the fact that nonzero components are the hallmark of the spacetime curvature.
But before delving into more details and giving a complete formulation of the most important tensor in General Relativity, it seems reasonable to get a better understanding of the concept of a tensor itself.
Let us start by giving a definition first:
A tensor of rank n is an array of 4^{n} values (in four-dimensional spacetime) called "tensor components" that combine with multiple directional indicators (basis vectors) to form a quantity that does NOT vary as the coordinate system is changed^{[1]}.
So we will have to think of tensors as objects with components that transform between coordinate systems in specific and predictable ways^{[2]}.
Corollary 1: Combined with the principle of General Covariance, which extends the Principle of Relativity^{[3]} to say that the form of the laws of physics should be the same in all frames, inertial and accelerating, it means that if we have a valid tensor equation that is true in special relativity (in an inertial frame), this equation will remain valid in general relativity (in an accelerating frame).
Corollary 2: A null tensor in one coordinate system is null in all other coordinate systems. In other words, a quantity that we can nullify by coordinate system transformation is NOT a tensor.
A good place to begin is to consider a vector, which is nothing else than a tensor of rank one, and to consider this question: "What happens to a vector when you change the coordinate system in which you're representing this vector?" The quick answer is that nothing at all happens to the vector itself, but the vector's components may be different in the new coordinate system.
Let us consider the simple rotation of the twodimensional Cartesian coordinate system shown below. In this transformation, the location of the origin has not changed, but both the x and y axis have been tilted counterclockwise by an angle of θ. The rotated axes are labeled x' and y' and are drawn using red color to distinguish them from the original axes.
Our aim is to express the components A'^{x} and A'^{y}^{[4]} of the vector A in the primed/rotated coordinate system relative to the components A^{x} and A^{y} in the unprimed/untransformed coordinate system, defined as follows:
If you think about the changes to the components A^{x} and A^{y} of the vector A, you may come to realize that the vector component A'^{x} in the rotated coordinate system cannot depend only on the component A^{x} in the original system. Actually, as you can see in the figure above, A'^{x} can be considered to be made up of two segments, labeled L_{1} and L_{2}.
So A'^{x} = L_{1} + L_{2}.
You can see that A^{x} is the hypotenuse of a right triangle formed by drawing a perpendicular from the end of A^{x} to the x'-axis. Then it is easy to see that the length of L_{1} (the projection of A^{x} onto the x'-axis) is A^{x} cos θ.
L_{1} = A^{x} cos(θ)
To find the length of L_{2}, consider the right triangle formed by sliding A'^{x} upward along the y'-axis and then drawing a perpendicular from the tip of A'^{x} to the x-axis. From this triangle, we should be able to see that
L_{2} = A^{y} cos(π/2 − θ)
where (π/2 − θ) is the angle formed by the tips of A'^{x} and A^{y} (which is also the angle between the x'-axis and the y-axis, as you can see from the parallelogram)
So we can finally write A'^{x} = A^{x} cos(θ) + A^{y} cos(π/2 − θ)
A similar analysis for A'^{y}, the y-component of vector A in the rotated coordinate system, gives:
A'^{y} = A^{x} cos(π/2 + θ) + A^{y} cos(θ)
The relationship between the components of the vector in the rotated and nonrotated systems is conveniently expressed using matrix notation as:
It is very important to understand that the above transformation equation does not rotate or change the initial vector in any way; it determines the values of the components of the vector in the new coordinate system. More specifically, the new components are weighted linear combinations of the original components.
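The two component equations derived above can be sketched in a few lines of Python. The sample vector (3, 4) and the 30° angle are arbitrary choices, and the function names are illustrative.

```python
import math

def component_matrix(theta):
    """Matrix giving the components of a fixed vector in axes rotated by theta."""
    return [[math.cos(theta), math.cos(math.pi / 2 - theta)],
            [math.cos(math.pi / 2 + theta), math.cos(theta)]]

def transform(a, components):
    """New components as weighted linear combinations of the original ones."""
    return [sum(a[i][j] * components[j] for j in range(2)) for i in range(2)]

A = [3.0, 4.0]                      # components A^x, A^y in the original axes
theta = math.radians(30)            # rotation angle of the axes
A_prime = transform(component_matrix(theta), A)

print(A_prime)                      # components A'^x, A'^y in the rotated axes
print(math.hypot(*A), math.hypot(*A_prime))  # the length of the vector is unchanged
```

The components change, but the length computed from them does not, which reflects the fact that the vector itself has not been rotated.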
As a final simplification, we can use the Einstein index notation by writing the equation as follows:
This last equation tells you that the components of a vector in the primed/transformed coordinate system are weighted linear combinations of the components of the same vector in the unprimed/original coordinate system. The weighting factors a_{ij} are the elements of the transformation matrix.
So in our example, we could write the transformation matrix a_{ij} as follows:
Let us now try to figure out how a basis vector transforms from the unprimed to the primed coordinate system when the original basis vector is rotated through an angle θ. We have to be very careful about the meaning of transformation when referring to basis vectors: we are not looking at how the components of the same vector transform from an original to a new coordinate system (the a_{ij} transformation matrix of the example above), but at how to find the components of the new (rotated) vector in the original coordinate system.
We could easily show, through geometric constructions such as those shown previously, that the components A'^{x} and A'^{y} of the new rotated vector A' in the original coordinate system are:
Multiplying the two matrices, i.e. the transformation matrix for finding the components of the same vector as the coordinate system is rotated through an angle θ, and the transformation matrix for finding new basis vectors by rotating the original basis vectors through an angle θ, reveals the nature of the relationship between them:
There is clearly an inverse relationship between the basis-vector transformation matrix and the vector-component transformation matrix, so we can say in that case that the vector components transform "inversely to" or "against" the manner in which the basis vectors transform. That is exactly why we qualify these components as contravariant components and why we use the superscript notation.
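This inverse relationship can be checked with a short Python sketch. It assumes the usual forms of the two matrices: [[cosθ, sinθ], [−sinθ, cosθ]] for the components of a fixed vector in rotated axes, and [[cosθ, −sinθ], [sinθ, cosθ]] for a vector rotated within fixed axes. The function names are illustrative.

```python
import math

def component_matrix(theta):
    """Components of a fixed vector when the axes are rotated by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

def basis_matrix(theta):
    """Components, in the original axes, of a vector rotated by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

product = matmul(component_matrix(0.4), basis_matrix(0.4))
print(product)  # expected to be the 2x2 identity matrix, up to rounding
```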
For any coordinate system in which a linear relationship exists between differential length elements ds, writing the equations which transform between the systems is quite straightforward. If you call the differentials of one coordinate system dx, dy and dz and those of the other coordinate system dx', dy' and dz', the transformation equations from the unprimed to the primed system come directly from the rules of partial differentiation:
which once again, using the Einstein summation convention could be written as:
But the ∂x'^{i}/∂x^{j} terms are also the components of the basis vectors tangent to the original (unprimed) coordinate axes, expressed in the new (primed) coordinate system.
Let us confirm this by an example of the transformation from 2d polar (r,θ) to cartesian (x,y) coordinates. In such case, we have x'^{1}=x, x'^{2}=y, x^{1}=r, and x^{2}=θ. We know also that x=rcosθ and y=rsinθ
Calculating the appropriate derivatives, we get
Are these really the components of the tangent vector to the original (r,θ) coordinate axes? We can confirm that by writing these components in the primed (cartesian in this case) coordinate system
The first of these expressions is a vector pointing radially outward (along the rdirection in polar coordinates) and the second is a vector pointing perpendicular to the radial direction (along the θdirection). This demonstrates that the partial derivatives do represent components of the original (unprimed here polar) covariant basis vector in the new (primed here cartesian) coordinate system.
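As a rough numerical cross-check, the Python sketch below approximates the partial derivatives of x = r cosθ and y = r sinθ by central finite differences and returns the two tangent vectors. The analytic values it should reproduce are (cosθ, sinθ) for the r-direction and (−r sinθ, r cosθ) for the θ-direction; the function names and the sample point are assumptions made for illustration.

```python
import math

def to_cartesian(r, theta):
    """Map polar coordinates (r, theta) to Cartesian coordinates (x, y)."""
    return (r * math.cos(theta), r * math.sin(theta))

def tangent_vectors(r, theta, h=1e-6):
    """Finite-difference estimates of d(x, y)/dr and d(x, y)/d(theta)."""
    x_rp, y_rp = to_cartesian(r + h, theta)
    x_rm, y_rm = to_cartesian(r - h, theta)
    x_tp, y_tp = to_cartesian(r, theta + h)
    x_tm, y_tm = to_cartesian(r, theta - h)
    e_r = ((x_rp - x_rm) / (2 * h), (y_rp - y_rm) / (2 * h))
    e_theta = ((x_tp - x_tm) / (2 * h), (y_tp - y_tm) / (2 * h))
    return e_r, e_theta

e_r, e_theta = tangent_vectors(2.0, math.radians(40))
print(e_r)      # points radially outward
print(e_theta)  # perpendicular to e_r, with length r
```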
But since we know from the above paragraph that contravariant vector components transform inversely to the covariant basis vectors (the product of the two transformation matrices is the identity), differential length elements must transform as contravariant vector components.
And we should now understand why the transformation equation for contravariant components of vector A is often written as
In a Cartesian coordinate system such as the one used previously, there is no ambiguity when you consider the process of projection of a vector onto a coordinate axis.
Now imagine a two-dimensional coordinate system in which the x and y axes are not perpendicular to one another. In such cases, the process of projecting a vector onto one of the coordinate axes could be done parallel to the coordinate axes, or perpendicular to the axes.
In the diagram below, to understand parallel projections, we have to consider the basis vectors e_{1} and e_{2} pointing along the non-orthogonal coordinate axes and the projections X^{1} and X^{2} of the vector X onto those directions.
In this case, vector X may be written as:
where as seen above, X^{1} and X^{2} represent the parallelprojection (contravariant) components of vector X.
Now if we project vector X orthogonally onto the axes, we obtain the X_{1} and X_{2} components of the vector. The first thing to note is that the "parallel" projections and the "orthogonal" projections do not have quite the same length, and that X_{1} and X_{2} obviously do not combine, under the rules of vector addition, to form vector X. The perpendicular projections simply don't add up as vectors to give the original vector.
It is then reasonable to wonder whether there are alternative basis vectors to e_{1} and e_{2} that would allow the perpendicular-projection components to form a vector in a manner analogous to the contravariant components.
There are, and those alternative basis vectors are called "reciprocal" or "dual" basis vectors. These have two defining characteristics:
 Each one must be perpendicular to all original basis vectors with different indices. So if we call the dual basis vectors e^{1} and e^{2} to distinguish them from the original basis vectors e_{1} and e_{2}, you have to make sure that e^{1} is perpendicular to e_{2} (which lies along the y-axis in this case). Likewise, e^{2} must be perpendicular to e_{1} (and thus perpendicular to the x-axis in this case).
 The second defining characteristic of the dual basis vectors is that the dot product between each dual basis vector and the original basis vector with the same index must equal one, so e^{1}·e_{1} = 1 and e^{2}·e_{2} = 1.
The covariant components X_{1} and X_{2}, projected onto the directions of the dual basis vectors rather than onto the directions of the original basis vectors, can then be written as follows:
We use superscript notation to denote the dual basis vectors because the inverse transformation matrix has to be used when these basis vectors are transformed to a new coordinate system, as it is for the contravariant vector components X^{1} and X^{2}.
So a vector A represents the same entity whether it is expressed using contravariant components A^{i} or covariant components A_{i}:
where e_{i} represents a covariant basis vector and e^{i} represents a contravariant basis vector.
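The two defining properties of the dual basis, and the fact that both sets of components rebuild the same vector, can be illustrated with a small Python sketch. The basis (axes at 60° to each other), the sample vector and the function names are assumptions chosen for the example; the dual vectors are obtained from the inverse of the matrix of dot products e_{i}·e_{j}.

```python
import math

def dot(u, v):
    """Ordinary dot product of two 2D vectors given in Cartesian components."""
    return u[0] * v[0] + u[1] * v[1]

def dual_basis(e1, e2):
    """Dual vectors d1, d2 with d_i . e_j = 1 if i == j, else 0."""
    g11, g12, g22 = dot(e1, e1), dot(e1, e2), dot(e2, e2)
    det = g11 * g22 - g12 * g12
    d1 = [(g22 * e1[k] - g12 * e2[k]) / det for k in range(2)]
    d2 = [(-g12 * e1[k] + g11 * e2[k]) / det for k in range(2)]
    return d1, d2

e1 = [1.0, 0.0]
e2 = [math.cos(math.radians(60)), math.sin(math.radians(60))]  # non-orthogonal axis
d1, d2 = dual_basis(e1, e2)

X = [2.0, 1.5]
contra = [dot(X, d1), dot(X, d2)]   # X^1, X^2 (parallel projections)
co = [dot(X, e1), dot(X, e2)]       # X_1, X_2 (perpendicular projections)

from_contra = [contra[0] * e1[k] + contra[1] * e2[k] for k in range(2)]
from_co = [co[0] * d1[k] + co[1] * d2[k] for k in range(2)]
print(from_contra, from_co)         # both should reproduce X
```

Combining the covariant components with the original basis vectors e_{1} and e_{2} instead would not give back X, which is the point made in the text above.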
In transforming between coordinate systems, a vector with contravariant components A^{j} in the original (unprimed) coordinate system and contravariant components A'^{i} in the new (primed) coordinate system transforms as:
where the ∂x'^{i}/∂x^{j} terms represent the components in the new coordinate system of the basis vectors tangent to the original axes.
Likewise, for a vector with covariant components A_{j} in the original (unprimed) coordinate system and covariant components A'_{i} in the new (primed) coordinate system, the transformation equation is:
where the ∂x^{j}/∂x'^{i} terms represent the components in the new coordinate system of the (dual) basis vectors perpendicular to the original axes.
[1] Definition given by Daniel Fleisch in his Student's Guide to Vectors and Tensors, Chapter 5, Higher-rank tensors, p. 134
[2] In more formal mathematical terms, a transformation of the variables of a tensor changes the tensor into another whose components are linear homogeneous functions of the components of the original tensor (reference MathWorld article Homogeneous Function).
[3] We recall that according to the Principle of Relativity, laws of physics are the same in any inertial frame of reference.
[4] We will see in the next part of the article why we are using superscript index notation for the 'x' and 'y' here; let us just say for now that it is because they represent the contravariant components of the vector, and that this distinguishes them from the covariant components A_{x} and A_{y}.
Let us try to illustrate this with the tensor that we have used extensively so far, at least since our article Generalisation of the metric tensor in pseudo-Riemannian manifold, i.e. the metric tensor.
where ξ^{α} are the coordinates in an inertial referential and x^{μ} the coordinates in an arbitrary referential.
If we now try to express the metric tensor components g'_{μν} in another arbitrary referential R' with coordinates x'^{μ}, we get:
which actually conforms to the transformation equation of the covariant components of a second-rank tensor
In this expression, T'_{μν} are the covariant tensor components in the new coordinate system, T_{αβ} are the covariant tensor components in the original coordinate system, and ∂x^{α}/∂x'^{μ} as well as ∂x^{β}/∂x'^{ν} are elements of the transformation matrix between the original and new coordinate systems. These elements of the transformation matrix represent the dual basis vectors perpendicular to the original coordinate axes.
One of the very useful functions of the metric tensor is to convert between the covariant and contravariant components of the other tensors.
So imagine that you are given the contravariant components and original basis vectors of a tensor and you wish to determine the covariant components. One approach could be to determine the dual basis vectors, performing the perpendicular projections as seen above, but with the metric tensor you have the shorter option of using relations such as
If you wish to convert from a covariant index to a contravariant index, you can use the inverse of g_{ij} (which is just g^{ij}) to perform operations like
This same process also works for higher-order tensors
This is the consequence of a more general mechanism called contraction, by which a tensor can have its rank lowered by multiplying it by another tensor with an equal index in the opposite position, i.e. by summing over the two indices. In this example, the upper and lower α indices are summed over:
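As a minimal illustration of lowering and raising an index, the Python sketch below uses the metric of plane polar coordinates, g_{ij} = diag(1, r^{2}), taken here as an assumed example. It lowers an index with A_{i} = g_{ij}A^{j}, raises it back with A^{i} = g^{ij}A_{j}, and contracts A^{i}A_{i} to obtain a scalar. The function names and sample values are illustrative.

```python
def polar_metric(r):
    """Metric g_ij of plane polar coordinates (r, theta): diag(1, r^2)."""
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_polar_metric(r):
    """Inverse metric g^ij: diag(1, 1 / r^2)."""
    return [[1.0, 0.0], [0.0, 1.0 / (r * r)]]

def apply_metric(g, components):
    """Sum over the repeated index: result_i = g_ij * components_j."""
    return [sum(g[i][j] * components[j] for j in range(2)) for i in range(2)]

r = 3.0
A_up = [2.0, 0.5]                                     # contravariant A^i
A_down = apply_metric(polar_metric(r), A_up)          # covariant A_i
A_back = apply_metric(inverse_polar_metric(r), A_down)
norm_squared = sum(A_up[i] * A_down[i] for i in range(2))  # contraction A^i A_i

print(A_down, A_back, norm_squared)
```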
]]>
In our previous article Introduction to Four-velocity vector, we have presented the spacetime velocity vector equivalent to the classical three-dimensional velocity vector.
The next logical question should be: what could be the expression of the spacetime momentum vector?
Let's be naive and suppose first that our spacetime momentum vector could be written as p = mv, with v the classical Newtonian velocity vector.
Let us consider what is commonly called an inelastic collision, i.e. two objects of the same kind, moving in opposite directions with equal speed v, hit each other and stick together to become some new, stationary object (to help visualize this, we can imagine the two particles as being pieces of chewing gum...).
Let us estimate the momentum of the system before and after the collision in referentials R and R', with R' attached to the first (left) particle.
In referential R:
In referential R', the important thing to remember is that the speed of the right ball relative to the left ball is NOT the Galilean sum v + v = 2v because in special relativity, as seen in our last article Introduction to Four-velocity vector, velocities do not transform according to the Galilean rule, but according to the following expression, with v being the relative speed of the referentials.
So in our case, in R' referential, we can write:
Obviously, as momentum must be conserved in all referentials^{[1]}, our hypothesis was wrong, and the relativistic momentum could not equal the particle's mass m multiplied by its ordinary spatial velocity v, as it is defined in Newtonian mechanics.
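The failure of the naive definition p = mv can be sketched numerically in Python. The sketch assumes the one-dimensional velocity transformation u' = (u − v)/(1 − uv/c^{2}), two particles of mass m with velocities +v and −v in R, and a final lump of mass 2m at rest in R (the Newtonian assumption being tested). Function names are illustrative.

```python
C = 299_792_458.0  # speed of light in m/s

def transform_velocity(u, v):
    """Velocity u measured in R, as seen from a frame R' moving at v relative to R."""
    return (u - v) / (1.0 - u * v / C ** 2)

def naive_momentum_in_r_prime(m, v):
    """Total p = m u before and after the collision, evaluated in R'."""
    u_left = transform_velocity(v, v)     # 0: R' rides along with the left particle
    u_right = transform_velocity(-v, v)   # NOT -2v
    u_lump = transform_velocity(0.0, v)   # the lump at rest in R moves at -v in R'
    before = m * u_left + m * u_right
    after = 2.0 * m * u_lump
    return before, after

print(naive_momentum_in_r_prime(1.0, 0.6 * C))  # the two values clearly differ
print(naive_momentum_in_r_prime(1.0, 30.0))     # nearly equal at everyday speeds
```

At v = 0.6c the two totals differ markedly, while at 30 m/s the discrepancy is far too small to notice, which is why the Newtonian definition works so well in everyday situations.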
It would certainly be a good idea to multiply the four-velocity by the rest mass m of a particle to get our four-momentum definition:
Also, we have seen previously that the scalar product of the four-velocity with itself is given by η_{μν}U^{μ}U^{ν}=c^{2}
Therefore, as the four-momentum is given by P^{μ}=mU^{μ}, we can write: η_{μν}P^{μ}P^{ν}=m^{2}η_{μν}U^{μ}U^{ν}=m^{2}c^{2}
and our four-momentum with norm mc can be represented with its time and spatial components (respectively γmc and γmv) as follows on a spacetime diagram:
The proposed momentum arrow always has a length equal to mc and points off in the direction of travel of the object in spacetime.
The part of the momentum spacetime vector that points in the space direction has a length equal to γmv and that part of the momentum vector that points off in the time direction has a length equal to γmc.
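A short Python sketch can confirm that the Minkowski length of this arrow is always mc, whatever the speed. It assumes motion along a single spatial direction, so that the norm is the square root of (γmc)^{2} − (γmv)^{2}; the function names are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def four_momentum(m, v):
    """Time and space components (gamma m c, gamma m v) for motion along one axis."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * m * C, gamma * m * v

def minkowski_norm(p_time, p_space):
    """Length of the arrow with the (+, -) signature used in the text."""
    return math.sqrt(p_time ** 2 - p_space ** 2)

for beta in (0.1, 0.8, 0.99):
    p_time, p_space = four_momentum(2.0, beta * C)
    print(beta, minkowski_norm(p_time, p_space) / (2.0 * C))  # ratio to m c
```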
Remark 1: In the Newtonian limit, i.e. if the speed v of our object is much less than the speed of light c, then γ is very close to one. In that case, we recover the old-fashioned momentum, namely the product of the mass and the velocity, p = mv.
Remark 2: With this new expression γmv, what happens if a constant force (which is the rate of change of momentum) acts on a body for a long time? In Newtonian mechanics, where the mass is constant, the body keeps picking up speed until it goes faster than light. In relativity, by contrast, the body keeps picking up not speed but momentum, which can continually increase because the (relativistic) mass γm is increasing, while the velocity can never reach the speed of light.
In his book Six Not-So-Easy Pieces^{[2]}, Richard Feynman illustrates this form of 'inertia' when v is nearly as great as c with the example of the synchrotron used at Caltech to deflect high-speed electrons: it needs a magnetic field that is 2000 times stronger than would be expected on the basis of Newton's laws.
The law of momentum conservation tells us that the total sum of all the new arrows must be exactly the same as the sum total of the original arrows. This in turn means that the sum total of the portions of each of the arrows pointing in the space direction must be conserved, as should the sum of the portions pointing in the time direction.
So we appear to have two new laws of physics: both γmv and γmc are conserved quantities.
So what does it mean exactly that γmc should be conserved?
Since c is a universal constant upon which everyone always agrees, the conservation of γmc means first of all the conservation of γm (a slightly modified law of mass conservation, with γ as a new factor).
But if γmc is conserved, then so too is γmc^{2}, simply because c is a constant. Then in the limit of small speeds (v<<c), we can write:
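One way to make this limit explicit, assuming the usual definition γ = 1/√(1 - v^{2}/c^{2}), is a binomial expansion of γ to first order in v^{2}/c^{2}. The sketch below is a reconstruction under that assumption, not necessarily the exact form used originally:

```latex
\gamma m c^{2}
  = \frac{m c^{2}}{\sqrt{1 - v^{2}/c^{2}}}
  \approx m c^{2}\left(1 + \frac{1}{2}\,\frac{v^{2}}{c^{2}}\right)
  = m c^{2} + \frac{1}{2}\, m v^{2}
  \qquad (v \ll c)
```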
We have just discovered that there is a conserved quantity equal to something (mc^{2}) plus the kinetic energy (1/2)mv^{2}. It makes sense to refer to this conserved quantity as the total energy of the body, which now has two parts.
Thus, even if an object is standing still (v=0), it still has energy associated with it, and that energy is given by Einstein's famous mass-energy equation E = mc^{2}.
To be more precise, if we assume with Einstein that the energy of a body always equals mc^{2} (refer to Einstein paper outlines E=mc2, November 21, 1905), or that the mass is equal to the total energy content divided by c^{2}, we should then write, where the "rest mass" m_{0} represents the mass of a body that is not moving:
The total relativistic energy E consists of the particle's relativistic kinetic energy plus the second term m_{0}c^{2}, which is the particle's mass energy E_{0}. Provided that no external forces act, the total relativistic energy is conserved in all inertial frames, irrespective of whether mass or kinetic energy is conserved. In high-speed particle collisions, for example, mass, kinetic energy and even the total number of particles may not be conserved, but the total relativistic energy of the system will be.
If we come back to our example of the collision of two lumps of clay, applying the conservation of the total relativistic energy gives:
Let's assume that each lump of clay has a mass of 5 kg and travels at 3000 m/s; the increase in the mass of the system will be:
which is an exceedingly small amount!
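As a rough numerical check, the following sketch assumes a head-on collision in which the two lumps arrive with equal and opposite velocities and the merged lump ends up at rest, so that energy conservation gives Mc^{2} = 2γmc^{2}. The function names are illustrative only.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma_minus_one(v, c=C):
    """Return gamma - 1, computed in a numerically stable way for v << c."""
    beta2 = (v / c) ** 2
    s = math.sqrt(1.0 - beta2)
    # gamma - 1 = (1 - s) / s, and 1 - s = beta2 / (1 + s)
    return beta2 / (s * (1.0 + s))


def mass_increase(m, v, c=C):
    """Mass gained when two equal lumps (mass m, speeds +v and -v) merge at rest.

    Assumption: M c^2 = 2 gamma m c^2, hence M - 2m = 2 m (gamma - 1).
    """
    return 2.0 * m * gamma_minus_one(v, c)


delta_m = mass_increase(m=5.0, v=3000.0)
print(f"mass increase = {delta_m:.3e} kg")
```

Under these assumptions the increase is of the order of 5 x 10^{-10} kg, to be compared with the 10 kg of clay involved.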
We still have to elucidate a last question: the formula for the total relativistic energy and the mass-energy equivalence equation that we have established above cannot apply to massless particles such as photons.
From above, we know that the norm of the four-momentum is a constant in all inertial frames, equal to the rest mass times c, i.e. m_{0}c.
But we also know from the preceding paragraph that the time component of the four-momentum vector, γm_{0}c, can also be expressed as E/c, since the total relativistic energy is E = γm_{0}c^{2}
And for a photon, which has zero 'rest mass' but still does have energy and momentum, we get
which describes the energy-momentum relation for a photon.
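The step from the norm of the four-momentum to the photon relation can be sketched as follows. Here p denotes the magnitude of the spatial part of the four-momentum (a notation assumed for this sketch), and only the invariant norm m_{0}c and the time component E/c are used:

```latex
\left(\frac{E}{c}\right)^{2} - p^{2} = (m_{0}c)^{2}
\quad\Longrightarrow\quad
E^{2} = p^{2}c^{2} + m_{0}^{2}c^{4}
\quad\xrightarrow{\; m_{0} = 0 \;}\quad
E = pc
```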
As the energy of a photon is given by E = hc/λ, where h = 6.63x10^{-34} J.s (Planck's constant) and λ is the wavelength, we can easily calculate, for example, the momentum of a photon of blue light, which has a wavelength of 450 nm
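The calculation can be sketched in a few lines, combining E = hc/λ with E = pc so that p = h/λ. The rounded value of c and the function names are assumptions made for this illustration.

```python
H = 6.63e-34  # Planck's constant in J.s (value quoted in the text)
C = 3.00e8    # speed of light in m/s (rounded value, assumption)


def photon_energy(wavelength_m):
    """Energy of a photon, E = h c / lambda, in joules."""
    return H * C / wavelength_m


def photon_momentum(wavelength_m):
    """Momentum of a photon, p = E / c = h / lambda, in kg.m/s."""
    return photon_energy(wavelength_m) / C


wavelength = 450e-9  # blue light, 450 nm
energy = photon_energy(wavelength)
momentum = photon_momentum(wavelength)
print(f"E = {energy:.3e} J, p = {momentum:.3e} kg.m/s")
```

With these inputs the momentum comes out at roughly 1.47 x 10^{-27} kg.m/s.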
[1] Should it not be the case, then the fundamental Principle of Relativity would be violated, as we would have a way to distinguish a 'rest' frame where the conservation of momentum law would be valid (our referential R above) from a 'moving' frame where this law would not hold (our R' referential).
[2] Richard Feynman, Six Not-So-Easy Pieces, Chapter 3 - The Special Theory of Relativity, §3.8 Relativistic dynamics
]]>Chapter 3 is dedicated to a general introduction to special Relativity and to Lorentz transformations
Chapter 4 introduces the notion of relativistic energy
Chapter 5 describes the spacetime notion
And finally Chapter 6 takes the curvature of spacetime as its main object of study.
]]>
]]>
In non-relativistic physics, the velocity of an object is a three-dimensional vector whose components give the object's speed in each of three directions (the directions depend on the coordinate system).
The path of a particle moving in ordinary three-dimensional Euclidean space can be described using three functions of time t, one for x, one for y and one for z. The three functions x=f(t), y=g(t), z=h(t) are called parametric equations, and their derivatives with respect to t give a vector whose components represent the object's spatial velocity in the three x, y, z directions.
The spatial velocity of the particle is a tangent vector to the path and can be written as:
In special relativity, a four-vector A is a vector with a "timelike" component and three "spacelike" components, and can be written in various equivalent notations^{[1]}
More precisely, a point in Minkowski space is a time and spatial position, called an event, or sometimes the position four-vector, four-position or 4-position, described in some reference frame by a set of four coordinates:
where r is the three-dimensional space position vector. If r is a function of coordinate time t in the same frame, i.e. r = r(t), this corresponds to a sequence of events as t varies. Also the definition x^{0} = ct ensures that all the coordinates have the same units (of distance). These coordinates are the components of the position four-vector for the event.
Mathematically speaking, one can define a 4-vector to be anything one wants; in special relativity, however, our 4-vectors are only those quantities which transform from one inertial frame of reference to another by Lorentz transformations.
The relativistic counterpart of velocity, which is a three-dimensional vector in space, is a four-vector called the four-velocity.
As we have seen in Proper time, a clock fastened to a particle moving along a worldline in four-dimensional spacetime will measure the particle's proper time τ, and therefore it makes sense to use τ as the parameter along the path. The four-velocity of a particle is then defined as the rate of change of its four-position with respect to proper time, and is also the tangent vector to the particle's worldline
To determine the components of the four-velocity vector, we recall that a process that takes a proper time Δτ in its own rest frame has a longer duration Δt as measured by another observer moving relative to the rest frame, i.e.
Δτ = (Δt / γ)
Taking the derivative with respect to proper time, we can then write:
We can use the chain rule to find the spatial components of U^{μ} for μ = i = 1,2,3:
But dx^{i}/dt is the particle's ordinary spatial velocity v = (dx^{1}/dt, dx^{2}/dt, dx^{3}/dt) = (v_{x}, v_{y}, v_{z}), so that the particle's four-velocity is finally given by:
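Collecting the steps above, and using dt/dτ = γ from Δτ = Δt/γ, the components can be summarised as in the following sketch (a reconstruction from the surrounding definitions):

```latex
U^{0} = \frac{d(ct)}{d\tau} = \gamma c,
\qquad
U^{i} = \frac{dx^{i}}{d\tau} = \frac{dx^{i}}{dt}\,\frac{dt}{d\tau} = \gamma v^{i}
\quad\Longrightarrow\quad
U^{\mu} = \gamma\,(c,\; v_{x},\; v_{y},\; v_{z})
```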
[1] We have already come across this index notation in our article The Lorentz transformations Part IV - Lorentz transformation matrix (tab index notation).
To confirm that this four-velocity vector is indeed a four-vector, we have to check that it transforms correctly under a Lorentz transformation.
Let's consider a particle p which has four-velocities U and U' in the referentials R and R' respectively
If we assume that the referentials R and R', in standard configuration, move with a relative velocity v_{r/r'} along the x axis, characterized by a Lorentz factor γ, then the Lorentz transformation between the two four-velocities can be written as:
If we consider the first two lines of the matrix product:
Now let's tackle the velocity transformation problem using the Lorentz transformation. If we have two events in spacetime there will be a difference between the corresponding time and spatial coordinates, the intervals Δt, Δx, Δy, Δz.
For example, if we had two events E_{1}=(t_{1},x_{1},y_{1},z_{1}) = (3,1,0,0) and E_{2}=(t_{2},x_{2},y_{2},z_{2}) = (5,4,0,0), then the time interval would be Δt = 5 - 3 = 2 and the spatial interval Δx = 4 - 1 = 3.
We can easily show that intervals obey the following transformation rules:
If we bring the two events on the x axis closer and closer together, eventually as Δx and Δt approach 0, the quantities Δx/Δt and Δx'/Δt' become the instantaneous velocities v_{x} and v'_{x} of an object moving through the two events E_{1} and E_{2} respectively in referentials R and R'.
We can then confirm that this expression of the velocity transformation is strictly equivalent to the one we found above for the x component U_{1} of the four-velocity vector U.
Similarly, if we consider the components U_{2} and U'_{2} along the y and y' axes:
If we use again the Lorentz transformation rules we get:
which transforms exactly as the U_{2} component of the four-velocity does, as expected.
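The equivalence of the two routes can also be checked numerically. The sketch below assumes units in which c = 1 and a referential R' moving at +v along the x axis of R; the function names and the sample velocities are illustrative only.

```python
import math

C = 1.0  # units in which c = 1 (assumption)


def lorentz_factor(speed):
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)


def four_velocity(vx, vy):
    """Components (U0, U1, U2) of a particle with velocity (vx, vy) in R."""
    g = lorentz_factor(math.hypot(vx, vy))
    return g * C, g * vx, g * vy


def boost_x(u, v):
    """Lorentz boost of (U0, U1, U2) to R', moving at +v along x relative to R."""
    g = lorentz_factor(v)
    beta = v / C
    u0, u1, u2 = u
    return g * (u0 - beta * u1), g * (u1 - beta * u0), u2


def velocity_from_intervals(vx, vy, v):
    """Velocity in R' from dx' = g(dx - v dt), dy' = dy, dt' = g(dt - v dx / c^2)."""
    g = lorentz_factor(v)
    denom = 1.0 - v * vx / C**2
    return (vx - v) / denom, vy / (g * denom)


vx, vy, v = 0.5, 0.2, 0.3
u0p, u1p, u2p = boost_x(four_velocity(vx, vy), v)
from_four_velocity = (C * u1p / u0p, C * u2p / u0p)
from_intervals = velocity_from_intervals(vx, vy, v)
print(from_four_velocity, from_intervals)
```

Both tuples should agree to within rounding error, which is the numerical counterpart of the statement that the four-velocity components transform like the coordinate intervals.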
We have just shown that the four-velocity vector is indeed a quantity which transforms according to the Lorentz transformation:
In special relativity, the scalar product of two four-vectors A and B is defined by applying the Minkowski metric to the two four-vectors, as follows:
One result of the above formula is that the squared norm of a non-zero vector in Minkowski space may be either positive, zero, or negative.
If A^{2}<0, the four-vector A^{μ} is said to be timelike; if A^{2}>0, A^{μ} is said to be spacelike; and if A^{2}=0, A^{μ} is said to be lightlike. The subset of Minkowski space consisting of all vectors whose squared norm is zero is known as the light cone.
As a direct consequence, the scalar product of the four-velocity vector with itself, i.e. its squared norm, is given by:
which is obviously invariant in all inertial referentials.
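A short numerical illustration of this invariance is given below. It assumes the signature (-,+,+,+), which is consistent with the sign rule stated above (timelike vectors have a negative squared norm), and units in which c = 1; both are choices made for this sketch.

```python
import math

C = 1.0                      # units in which c = 1 (assumption)
ETA = (-1.0, 1.0, 1.0, 1.0)  # Minkowski metric, signature (-,+,+,+) (assumption)


def four_velocity(vx, vy, vz):
    """Four-velocity gamma * (c, vx, vy, vz) of a particle with velocity (vx, vy, vz)."""
    speed2 = vx * vx + vy * vy + vz * vz
    g = 1.0 / math.sqrt(1.0 - speed2 / C**2)
    return (g * C, g * vx, g * vy, g * vz)


def minkowski_dot(a, b):
    """Scalar product eta_{mu nu} a^mu b^nu."""
    return sum(e * x * y for e, x, y in zip(ETA, a, b))


velocities = [(0.0, 0.0, 0.0), (0.6, 0.0, 0.0), (0.3, 0.4, 0.5)]
norms = [minkowski_dot(u, u) for u in (four_velocity(*vel) for vel in velocities)]
print(norms)
```

Whatever the velocity, the squared norm stays at -c^{2} (here -1), so the four-velocity is always timelike in this convention.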
]]>
For any point m in a spacetime Riemannian manifold M, there exists a local coordinate system at m such that:
We call such a coordinate system a local inertial frame or a normal frame.
Remark 1: the existence of this local referential is fully demonstrated in our article Local Flatness or Local Inertial Frames and SpaceTime curvature.
Remark 2: as all the first-order derivatives of the metric are null, given the Christoffel symbol expression:
then, in a local inertial referential, the vanishing of the partial derivatives of the metric tensor at any point of M is equivalent to the vanishing of the Christoffel symbols at that point, and in this referential the geodesics are straight lines.
Remark 3: while the first derivatives of the metric can always be made to vanish, this is not the case for all the second derivatives, which can only be made to vanish in a flat spacetime (refer to the same article above for more details)
Remark 4: below is a link to an excellent YouTube tutorial which gives an overview of how a local inertial frame (black colour) can be obtained by a general coordinate transformation at any point P of a spacetime manifold (blue colour)
]]>
So far we have seen how free particles move in spacetime, and it led us to the concept of geodesic. But we have to understand in more detail how these two concepts are related.
We recall that the geodesic equation for a particle with mass is
As we know that in General Relativity gravity is equivalent to spacetime curvature, can we then affirm that if all the Christoffel symbols are null, then spacetime is flat, as we verified in our article Geodesic exercise part I: calculation for 2-dimensional Euclidean space for a two-dimensional Euclidean space, in which geodesics are straight lines?
In fact, what we are looking for is the generalization to four-dimensional spacetime of the two-dimensional Gaussian curvature, i.e. a way of determining an intrinsic measure of curvature, depending only on distances that are measured on the surface, not on the way it is embedded in any space.
If you take a flat piece of paper and bend it gently, it bends in only one direction at a time. At any point on the paper, you can find at least one direction through which there is a straight line on the surface. You can bend it into a cylinder, or into a cone, but you can never bend it, without crumpling or distorting it, into a portion of the surface of a sphere.
In Gaussian terms, the curvature of the cylinder, defined as the product K_{1}K_{2} of the two principal curvatures, is null because one of these principal curvatures (the one along the axis of the cylinder) is null.
In this article, we want to find a test similar to the piece of paper bending used in two dimensions.
In four-dimensional spacetime, the test will be to try to transform a general metric g_{μν} to the Minkowski form η_{μν} by an appropriate coordinate transformation x^{μ} -> x'^{μ}
From our previous article Generalisation of the metric tensor in pseudo-Riemannian manifold, we know that the metric g_{μν} has 10 independent components (16 - 6, as the tensor is symmetric).
Also, concerning the coordinate transformation x^{α} -> x'^{α}, we are free to choose a function which relates x'^{α} to (x^{0}, x^{1}, x^{2}, x^{3}) for each value 0, 1, 2 and 3 of α.
So we have 4 degrees of freedom to satisfy 10 conditions, which is generally not possible: as expected, it is generally impossible to reduce a metric describing a curved spacetime to that of a flat spacetime by a coordinate transformation.
It appears that we have to be less ambitious and try to find a primed coordinate system in which the metric takes the Minkowski form only locally, i.e. around an event P in spacetime
If we take the analogy of the curved surface of the earth, we have to isolate lots of little square patches, each patch being pretty near flat.
We recall from our article Generalisation of the metric tensor in pseudo-Riemannian manifold that an arbitrary metric could be expressed as:
and more generally that the components of a metric tensor in primed coordinate system could be expressed in non primed coordinates as:
Each of the partial derivatives is a function of the primed coordinates so, for a region close to the event point P, we can expand these derivatives in Taylor series:
where
These three quantities represent constant coefficients that we are allowed to choose freely in order to find a local inertial frame around the event P.
Let's try to count the degrees of freedom we have at each step of this Taylor series expansion.
In the case of the coefficient a_{μ}^{α}, we can freely choose the 16 coefficients (4 possible values for α x 4 possible values for μ).
Concerning the coefficient b_{μν}^{α}, we still have 4 possible values for α but only 10 distinct choices out of 16 for μ and ν, as the order of the partial derivatives doesn't matter here (for example, ∂^{2}x^{α}/∂x'^{1}∂x'^{2} = ∂^{2}x^{α}/∂x'^{2}∂x'^{1}^{[1]}). There are then 4x10 = 40 values for b_{μν}^{α} that we can freely choose.
When it comes to c_{μνβ}^{α}, again the order of the 3 partial derivatives doesn't matter, so for each value of α we can have μ=ν=β, or two indices equal with the third different, or all three indices different. We then have to count how many times each possible value (0, 1, 2, 3) appears in the triplet (μ,ν,β). For example, the three combinations c^{1}_{023} = c^{1}_{032} = c^{1}_{302} are all equivalent to one 0, no 1, one 2 and one 3. This can be represented with the notation x||x|x, where the bars | separate four cells standing for the values 0, 1, 2 and 3, and each x marks one occurrence of the corresponding value. Another example would be the three coefficients c^{1}_{002} = c^{1}_{020} = c^{1}_{200}, which would be represented in a unique way as xx||x| (two 0s in the first cell, no 1 in the second cell, one 2 in the third cell and no 3 in the last cell). The number of independent coefficients c_{μνβ}^{α} for a given α is then the number of distinct arrangements of three x and N-1 bars | in N+2 positions, which is by definition (N+2)!/(3! x (N-1)!) = (N+2)(N+1)N/6.
As α can take N values, we finally get (N+2)(N+1)N^{2}/6 possibilities, so for N=4 the number of independent coefficients c_{μνβ}^{α} is 16x5 = 80.
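The three counts can be cross-checked by brute-force enumeration. The sketch below lists the distinct unordered index sets with Python's itertools; the helper name is illustrative only.

```python
from itertools import combinations_with_replacement

N = 4  # spacetime dimension


def symmetric_index_sets(rank, n=N):
    """Distinct unordered index tuples of the given rank, e.g. (0, 2, 3)."""
    return list(combinations_with_replacement(range(n), rank))


count_a = N * N                             # a: alpha and mu are independent
count_b = N * len(symmetric_index_sets(2))  # b: symmetric in (mu, nu)
count_c = N * len(symmetric_index_sets(3))  # c: symmetric in (mu, nu, beta)
print(count_a, count_b, count_c)
```

Each tuple returned by `symmetric_index_sets(3)` corresponds to exactly one arrangement of three x and three bars, so the enumeration and the closed formula count the same objects.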
We can also think of the unprimed metric as a function of the primed coordinates, and expand it in a Taylor series as well:
We can now substitute these expansions of the partial derivatives and of the non-primed metric into the transformation law and collect terms. Due to the large number of indices floating about, it's easier to use a condensed notation for this step.
The transformation now becomes, up to second order terms:
If we finally expand g(x'_{p} + Δ'x) in a Taylor series, where every factor is now a function of x', we get:
We therefore have, taking each term separately and restoring the indices:
As we have seen above, every choice of the two indices α and μ gives an independent quantity, so there are 16 different variables, and thus we have 6 more degrees of freedom than we need to solve g'_{μν} = η_{μν} (η_{μν} being symmetric, it only requires 10 independent coefficients).
Again, as g'_{μν} is symmetric, we have 10 distinct values which have to be multiplied by the 4 distinct values of the partial derivative, which gives 40 independent equations. Do we have enough degrees of freedom to ensure that ∂'_{γ}g'_{μν} = 0 around the point P?
Recalling from above that there are 4x10 = 40 values for b_{μν}^{α} that we can freely choose, we have just enough degrees of freedom to satisfy this condition.
We have then just defined what is called a Local Inertial Frame (LIF): mathematically, it is a coordinate frame in which the metric is locally equal to the Minkowski metric and for which we impose the constraint that all the first derivatives of the metric are locally equal to zero. Physically, it is the free-falling referential of our famous Equivalence Principle, in which gravity seems to disappear locally.
[1] In the case of a symmetric matrix of dimension N, we can show easily that the number of distinct values is equal to N(N+1)/2. If N=4, the number of distinct values is then (4x5)/2 = 10.
The last equations to be checked concern the second order derivatives of the metric:
Given that both the metric and its second partial derivatives are symmetric under a swap of their indices, this line actually represents 10 x 10 = 100 independent equations.
Once again, the values of the non primed metric and of its derivatives are given, and as we have already determined the values of the different a and b, our freedom is limited to the 80 independent values of c^{α}_{μνγ} (see above for demonstration)
It means that there will remain 20 second derivatives that we will not be able to set to zero: it turns out that this is a result of the inherent or intrinsic curvature of spacetime, so it's not surprising that we can't get rid of them.
We have then found a test equivalent to the K=k_{1}k_{2} Gaussian curvature for a four-dimensional Riemannian manifold: if spacetime is curved, we cannot find a frame in which all of the second derivatives of the metric are null. And we will see in a later article that these 20 extra derivatives are precisely embedded in an object called the Riemann tensor, which is of fundamental importance in assessing spacetime curvature.
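The same bookkeeping can be repeated for any dimension N. The sketch below compares the number of second derivatives that cannot be cancelled with N^{2}(N^{2}-1)/12, the standard count of independent components of the Riemann tensor (a result quoted here, not derived in this article); the function name is illustrative only.

```python
from math import comb


def uncancelled_second_derivatives(n):
    """Second-derivative equations minus the free coefficients c, in dimension n."""
    sym_pairs = n * (n + 1) // 2       # independent components of a symmetric pair
    equations = sym_pairs * sym_pairs  # second derivatives of g'_{mu nu}
    free_c = n * comb(n + 2, 3)        # coefficients c, symmetric in 3 lower indices
    return equations - free_c


for n in (2, 3, 4):
    print(n, uncancelled_second_derivatives(n), n * n * (n * n - 1) // 12)
```

For N=2 the count is 1, which matches the single Gaussian curvature of a surface, and for N=4 it gives the 20 derivatives found above.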
]]>(Source: Le Destin de L'Univers by Jean Pierre Luminet Chapter 2 Relativites)
In the previous article, we presented one kind of 'twin paradox', i.e. how two persons born at the same time but at different locations in the Earth referential, one of them flying by in a rocket moving at constant velocity v, would perceive each other's ageing process, given that in special relativity time dilation is symmetrical. We resolved this apparent paradox by showing that, from the moving referential's perspective, the two persons were not born at the same time (the well-known relativity of simultaneity in special relativity).
In this article, we will focus on the better-known version of the so-called 'twin paradox', which again involves two twins, this time born at the same time and at the same location in the Earth referential, with one twin remaining on Earth while the other takes off on a trip to some distant star at relativistic speed and comes back. The second difference from the previous article's thought experiment is that the travelling twin has to accelerate to take off and decelerate to land.
This is sometimes considered a paradox in that each twin sees the other twin as travelling, and so, it is argued, each should see the other ageing more slowly. But in fact this is based on a misunderstanding of relativity (only inertial observers are subject to the same laws of physics): because only one twin experiences acceleration and deceleration, he is the only one of the two who ages less.
We will consider that the spaceship is travelling from Earth to a star at a distance d of 10 light-years, at a speed v = (√3/2)c, so that the time dilation factor γ equals 2, as in the previous article.
Let's use the spacetime diagrams again to try to visualize the situation from each twin's perspective.
For the rest of the article, we will name T the traveller twin and S the sedentary twin staying on Earth (no luck for him ;)
What does the Traveller measure
In theory, one "should" take 11.54 years at the speed of (√3/2)c to travel 10 light-years.
But as we know, from the traveller T's perspective, his proper time, i.e. the time measured at rest in his own referential, is reduced by the Lorentz factor, so in our case twin T will reach the star after only 11.54/2 = 5.77 years (blue line in the diagram below)
Moreover, upon arrival at the star, T will see the Earth as it was 11.54 - 10 = 1.54 years after his take-off from Earth, as by definition it takes 10 years for light to travel 10 light-years (green line in the diagram)
Conclusion: T has seen S's clock ticking 5.77/1.54 = 3.74 times more slowly.
What does the twin on Earth measure
S knows that T, flying at a velocity of (√3/2)c, should have reached the star 10 light-years away after 11.54 years.
However, the light takes 10 years to travel from the star back to the Earth.
On Earth, the twin S therefore sees his twin T reaching the star after 11.54 + 10 = 21.54 years.
Conclusion: For S, T's clock has run 21.54/5.77 = 3.74 times slower also.
What does the Traveller measure
T reaches planet Earth after a 5.77-year journey (exactly symmetric to the outbound trip = blue line in the diagram below)
However, during this time, he has observed (2x11.54) - 1.54 = 23.08 - 1.54 = 21.54 years passing on Earth.
Conclusion: T has seen S's clock ticking 21.54/5.77= 3.74 times faster.
What does the twin S on Earth measure
S sees the whole return trip of T happening in 23.08 - 21.54 = 1.54 years and gives his twin brother a hug after 23.08 years.
Conclusion: S has seen T's clock ticking 5.77/1.54 = 3.74 times faster.
So to put it in a nutshell, when the twins T and S meet each other again on Earth, S's clock has ticked 23.08 years whereas T's clock has only ticked 11.54 years.
To put it in other words, when T lands on Earth after his space travel, it turns out that he has aged γ times less than his sedentary twin, and the time dilation symmetry is broken. T's proper time is shorter than that of S.
During the outbound trip, we have seen that the situation stays symmetrical for both twins because the two inertial referentials of T and S are in uniform motion at velocity v relative to each other.
In the same way, during the return trip, the respective inertial referentials of T and S are in uniform motion with velocity v relative to each other.
But if we consider the trip as a whole, the paths are no longer symmetric, because T has changed inertial referential at E (or, in other words, he has had to accelerate).
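The figures used throughout this article can be recomputed with the short sketch below. The variable names are illustrative, and the comparison with the relativistic Doppler factor √((1+β)/(1-β)) is an addition made for this sketch, not a step taken in the text.

```python
import math

beta = math.sqrt(3) / 2                 # v / c, as in the text
distance_ly = 10.0                      # Earth-star distance in light-years
gamma = 1.0 / math.sqrt(1.0 - beta**2)  # Lorentz factor, very close to 2

leg_earth = distance_ly / beta                # one-way duration on S's clock (years)
leg_traveller = leg_earth / gamma             # one-way duration on T's clock (years)
seen_by_T_outbound = leg_earth - distance_ly  # Earth time T has seen on arrival
seen_by_S_outbound = leg_earth + distance_ly  # Earth time when S sees T arrive

ratio_outbound = leg_traveller / seen_by_T_outbound
doppler = math.sqrt((1 + beta) / (1 - beta))  # relativistic Doppler factor

total_S = 2 * leg_earth      # S's age gain over the whole trip
total_T = 2 * leg_traveller  # T's age gain over the whole trip
print(f"legs: {leg_earth:.2f} y (S), {leg_traveller:.2f} y (T)")
print(f"ratio: {ratio_outbound:.3f}, Doppler factor: {doppler:.3f}")
print(f"totals: {total_S:.2f} y (S), {total_T:.2f} y (T)")
```

With unrounded intermediate values the observed rate ratio is 2 + √3, about 3.73; the 3.74 quoted above comes from working with durations rounded to two decimals.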
]]>https://www.facebook.com/einsteinrelativelyeasy
It will be a place where I will post updates about any new article, any interesting reading or, more generally, any interesting news about Relativity.
]]>In this article, we will present an apparent paradox related to two twins, but without considering any frame acceleration (the so-called 'twin paradox' will be discussed in the next article, in part 2).
The apparent paradox is the following: if two persons are born at the same time in a given 'rest' referential R, but with one of them moving with uniform velocity v, and as each of the referentials R (rest referential) and R' (moving referential attached to the second person) sees time dilated in the other, how different will the twins' ageing processes be in each referential?
Here we consider two persons born at the same time t=0 in the same referential, but one of them, P', is born in a rocket moving uniformly with velocity v relative to the rest frame (considered to be the Earth), whereas the other one, P, is born on Earth, say on the x axis at the point B with coordinates (x=L, t=0), and stays immobile there, watching rockets flying in space for the rest of his life.
We assume that the relative velocity is such that the Lorentz factor γ = 2.
If we were to draw the spacetime diagrams of these two people P and P', we should then recall from our previous article The Lorentz transformations Part V - 2nd observer in Minkowski spacetime diagram that if an object has a velocity v defined as
But as we know the value γ = 2, we deduce immediately that 1 - β^{2} = 1/4 => β^{2} = 3/4 => β = v/c = √3/2.
So the tangent of the angle between the worldline of the person born in the rocket and the ct axis equals √3/2, and the equation of the worldline is ct = (2/√3)x (blue line)
And without any calculation, we can draw the worldline of the second person P, staying immobile on the surface of the Earth, as a vertical line with abscissa x = L (red line)
The green line, with equation ct = x, is the worldline of light.
We are now interested in the moment when observer P sees P' in the sky above him.
Suppose that this moment is the 20th birthday of P in referential R. How old would P' be in the same Earth referential, i.e. how many candles would P see through the rocket window as P' is celebrating his own birthday?
As we know, the time of a moving referential relative to another one considered to be 'at rest' is dilated by the factor γ, so in this case it is slowed down by a factor of 2. So the 20-year-old P sees a 10-year-old child in the rocket, even though they were born on the same day (t=0) in the same referential R.
As an exercise, we can try to visualize this time dilation factor graphically.
We have already given an insight into the method in the previous article Minkowski's FourDimensional SpaceTime (tab Time dilation visualization), but let's explain it again here.
We know the equation on which all pairs of events with a given interval lie: ds^{2} = c^{2}t^{2} - x^{2}.
If we choose the unit interval, we get the equation of a hyperbola, 1 = c^{2}t^{2} - x^{2}, which with the convention c=1 gives
t^{2} = x^{2} + 1 (red hyperbola on the diagram below)
We also have to remember that the blue worldline of P' in the R referential is at the same time the ct' axis of the R' referential (x'=0).
Then the hyperbola can be used to calibrate one axis (ct) with respect to the other (ct'). In the diagram, the hyperbola intersects the time axis of the various observers/referentials, and marks off the points at which each observer measures his local time variable to be 1.
These are the point with coordinates (x=0, t=1) in the R referential and the point marked B in the diagram below, with coordinates (x'=0, t'=1) in the R' referential.
The horizontal dotted blue line (ct=2) passing through events A and B is a line of simultaneity for observer P in R, meaning all events on that line have the same time value of ct = 2.
Observer P' measures event B occurring at time ct' = 1 on his ct' axis. However, observer P measures the same event occurring at time ct = 2 on his ct axis. We then have confirmation that, from the point of view of P, the clocks in frame R' belonging to P' are running two times slower.
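This calibration can be checked numerically. The sketch below assumes c = 1 and γ = 2 as in the text, and maps the event (x'=0, t'=1) back to R with the inverse Lorentz transformation; the function name is illustrative only.

```python
import math

gamma = 2.0
beta = math.sqrt(1.0 - 1.0 / gamma**2)  # sqrt(3)/2, with c = 1


def to_rest_frame(ct_prime, x_prime):
    """Inverse Lorentz transformation from R' to R (units with c = 1)."""
    ct = gamma * (ct_prime + beta * x_prime)
    x = gamma * (x_prime + beta * ct_prime)
    return ct, x


ct_B, x_B = to_rest_frame(1.0, 0.0)  # event B: unit tick on the ct' axis
interval = ct_B**2 - x_B**2          # must equal 1 if B lies on the hyperbola
print(ct_B, x_B, interval)
```

Event B has ct = 2 in R while still satisfying t^{2} - x^{2} = 1, which is exactly what the intersection of the hyperbola with the ct' axis expresses.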
We can also calculate the distance L at which the worldlines of P and P' cross in the R referential.
That's easy to find as by definition we have t = L/v = L/βc
We find L = βct, and by expressing the distance L in light-years, we find that L = 20β, as we know by hypothesis that P sees P' passing by when he celebrates his 20th birthday.
So L = 20 x √3/2 = 10√3 = 17.32 light-years in the R referential.
So far, we have seen that in referential R, the 20-year-old person P will see the 10-year-old person P' passing by in the rocket.
We should now consider this event in referential R' and make sure that from this perspective we still have the same situation, i.e. a 10-year-old person P' seeing a 20-year-old person P looking at him through the rocket window.
First, let's try to determine the coordinates of the birth of P (point B with coordinates x=L, t=0 in R), this time in the R' referential.
Applying the Lorentz transformation, we get:
So in the R' referential, the person P is not born at t'=0 (as in the R referential), but at t' = -30 years.
But we also know that when person P' sees person P through his window, he is 10 years old.
So the time elapsed in the R' referential between the birth of P and the crossing of the two persons is 10 - (-30) = 40 years.
Finally, we can deduce the age of person P in the R' referential: as time is dilated by a factor γ = 2, person P will be 40/2 = 20 years old.
The physical ageing of both observers is consistent from both perspectives, R and R'.
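The whole chain of numbers can be reproduced with the sketch below, which works in years and light-years (so that c = 1) and uses the Lorentz transformation t' = γ(t - βx). The variable names are illustrative only.

```python
import math

gamma = 2.0
beta = math.sqrt(1.0 - 1.0 / gamma**2)  # sqrt(3)/2

age_P_at_meeting = 20.0        # years, measured in R
L = beta * age_P_at_meeting    # position of P in light-years


def t_prime(t, x):
    """Time coordinate in R' of the event (t, x) of R (units with c = 1)."""
    return gamma * (t - beta * x)


t_birth_P = t_prime(0.0, L)              # birth of P, seen from R'
t_meeting = t_prime(age_P_at_meeting, L) # crossing of P and P', seen from R'
elapsed_in_R_prime = t_meeting - t_birth_P
age_P_from_R_prime = elapsed_in_R_prime / gamma
print(L, t_birth_P, t_meeting, elapsed_in_R_prime, age_P_from_R_prime)
```

The meeting happens at t' = 10 years, the birth of P at t' = -30 years, and dividing the 40 elapsed years by γ gives back the 20 years that P measures for himself.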
Let's try to visualize this graphically, by drawing the x' axis (red line, equation ct = βx) and the ct' axis (blue line, equation ct = (1/β)x) of the R' referential.
We get the t'_{b} coordinate of the birth of person P (point B) by drawing the parallel to the x' axis through B (red dashed line): we can check that this line crosses the ct' time axis at the coordinate ct = -60 in R, which means ct' = -30 in R' (as we have shown in the paragraph above).
Also, the meeting point M has the time coordinate ct = 20 in R, i.e. ct' = 10 in R'.
So the time separation on the ct' axis between the two events (the birth of person P and the crossing of persons P and P') is 10 + 30 = 40 years.
]]>
"The special theory of relativity is nothing but a contradictionfree amalgamation of the results of MaxwellLorentz electrodynamics and those of classical mechanics." Einstein [January 1920] Fundamentals and Methods of the Theory of Relativity 
In 1905, Albert Einstein published the theory of Special Relativity, which explains how to interpret motion between different Inertial Frames of Reference.
This theory can be seen as a synthesis of both Newton's laws of motion and Maxwell's equations of electromagnetism.
It takes place in a new pseudo-flat four-dimensional spacetime, Minkowski's SpaceTime, in which the absolute space and time invariants Δd and Δt of classical Newtonian mechanics are replaced by these two new invariants^{[1]}:
From the law of conservation of momentum, Einstein was also able to derive in this new context the equivalence between mass and energy, the famous E = mc^{2} equation. See Introduction to Four-momentum vector and E = mc2 for this last point.
[1] A quantity is invariant in special relativity if it has the same value in all inertial frames. The speed of light in a vacuum remains the universal constant, whereas space shrinks and time slows down when two observers are uniformly speeding either toward or away from each other. Space and time are different in each reference frame.
]]>In our previous article Gravitational redshift Part III - Experiments, we mentioned the Pound–Rebka experiment, proposed in 1959 by Robert Pound and his graduate student Glen A. Rebka Jr, to test the gravitational redshift or Einstein effect, predicted as early as 1907 in Einstein's paper "On the relativity Principle and the conclusions drawn from it".
This experiment, successful as it was (the result confirmed the predictions of general relativity at the 10% level), still had two limitations:
- it only tested gravitational time dilation
- it was not measured with macroscopic clocks
]]>A few days ago, there was a discussion about the interval transformations under Lorentz transformations.
Some contributors/moderators there don't seem to believe that the space and time coordinate differences Δx or Δt transform under Lorentz transformations (which is obvious given the linearity of the transformation), or that these intervals can even be used to demonstrate the length contraction or time dilation phenomena, which simply shows that they don't understand even the basics and the real meaning of the Lorentz transformation!
]]>So how can we measure the gravitational redshift?
As we recall from our previous article Gravitational redshift or Einstein effect - Part I, the expected amount of redshift for light from the surface of a massive object reaching a distant observer is proportional to the object's mass M divided by its radius r.
One might first think that the gravitational redshift of light coming to us from the Sun would be a good candidate, but the accuracy is not very good because of gas motions on the solar surface: in this case, the Doppler shifts due to the moving gas are somewhat larger than the gravitational redshift due to the light having to climb out of the gravitational field of the Sun.
So we have to think of another candidate with a potentially stronger redshift effect, meaning a bigger mass-to-radius ratio.
Actually, the stars that astronomers call white dwarfs, which are formed when low-mass stars like our Sun have exhausted their nuclear fuel, are interesting candidates for observation: white dwarfs have masses close to that of the Sun, but radii smaller by factors near 100.
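To make the comparison concrete, here is a minimal sketch that evaluates the weak-field redshift z ≈ GM/(rc^{2}) for the Sun and for an illustrative white dwarf. The physical constants are rounded textbook values, and the white dwarf (one solar mass, radius 100 times smaller) is an assumption taken from the orders of magnitude quoted above, not a specific star.

```python
# Weak-field gravitational redshift z ~ G*M / (r * c^2) for light
# leaving the surface of a body and reaching a distant observer.
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg
R_SUN = 6.957e8    # solar radius, m


def surface_redshift(mass_kg, radius_m):
    """Fractional redshift of light emitted at the surface (weak-field limit)."""
    return G * mass_kg / (radius_m * C**2)


z_sun = surface_redshift(M_SUN, R_SUN)
# Illustrative white dwarf (assumption): solar mass, radius 100 times smaller.
z_white_dwarf = surface_redshift(M_SUN, R_SUN / 100)

print(f"Sun:         z = {z_sun:.2e}")
print(f"White dwarf: z = {z_white_dwarf:.2e}")
```

With these assumptions the white dwarf redshift is simply 100 times the solar one, which is why such stars are attractive targets.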
The following illustration shows the relative sizes of our Sun, the Earth, and a white dwarf star. The Sun is so large that only part of its disc can be shown here.
The first observation of the gravitational redshift in the spectral lines of a white dwarf was the measurement of the shift of Sirius B, the white dwarf companion of the star Sirius, by W.S. Adams in 1925 at Mt. Wilson Observatory.
Adams observed a shift of 0.007%, which was exactly the value predicted theoretically by Sir Arthur Eddington one year before, but that prediction was based on wrong data for the mass and the radius of Sirius B! Moreover, it was only decades later that anyone noticed or complained that about half the light he was studying was really scattered from the much brighter Sirius A.
The correct value of about 0.03% was only measured in 1971, by Greenstein, Oke and Shipman. Indeed, with the correct values of the mass and radius of Sirius B, we can write:
Although this measurement, as well as later measurements of the spectral shift on other white dwarf stars, agreed with the prediction of relativity, it could be argued that the shift could possibly stem from some other cause, and hence experimental verification using a known terrestrial source was preferable.
Experimental verification of gravitational redshift using terrestrial sources took several decades, because it is difficult to find clocks (to measure time dilation) or sources of electromagnetic radiation (to measure redshift) with a frequency that is known well enough that the effect can be accurately measured.
The Pound–Rebka experiment, proposed by Robert Pound and his graduate student Glen A. Rebka Jr. in 1959, is considered to be the one that definitively confirmed the gravitational redshift with high accuracy, i.e. with an error of only one percent in the end.
But before delving into the details of this famous experiment, let's first mention Bondi's thought experiment, named after the physicist Hermann Bondi, which will make the Pound experiment more meaningful.
Bondi imagined a machine consisting of a series of buckets attached to a conveyor belt. Each bucket contains a single atom, with those on the right in an excited state (red atoms) and those on the left (blue atoms) in a lower energy state. As they reach the bottom of the belt, the excited atoms emit light which is focused by two curved mirrors onto the atom at the top of the belt; the one at the bottom falls into the lower state and the one at the top is excited. Because energy is equivalent to mass, those on the right, which have more energy, should be heavier. The force of gravity should therefore keep the belt rotating in perpetuity.
So what is wrong with this imaginary experiment? Precisely that the photons are redshifted as they climb up through the gravitational field, preventing their absorption by the atoms at the top of the belt.
Indeed, this experiment is based on the principle that when an atom transitions from an excited state to a ground state, it emits a photon with a characteristic frequency and energy. Conversely, when an atom of the same species, in its ground state, encounters a photon with the same characteristic frequency and energy, it will absorb the photon and transition to the excited state. If the photon's frequency and energy are different by even a small amount, the atom cannot absorb it.
The idea of Pound and Rebka was to offset, or cancel, the expected redshift: the atom at the bottom of the tower still emits a photon up toward the top of the tower, but the receiving atom is made to move down, so that the redshift due to gravitational time dilation is compensated by the blueshift of the relativistic Doppler effect caused by the emitter and the absorber approaching each other.
The test was carried out at Harvard University's Jefferson laboratory. A solid sample containing iron (^{57}Fe) emitting gamma rays was placed in the center of a loudspeaker cone which was placed near the roof of the building. Another sample containing ^{57}Fe was placed in the basement. The distance between this source and absorber was 22.5 meters (73.8 ft).
From our article Gravitational redshift Part II  Derivation from the Equivalence Principle we know that given a height of 22.5m, Pound and his collaborator had to detect a fractional redshift of:
which is infinitesimal!
With the relativistic Doppler effect given by the following formula:
By making the speaker cone vibrate, Pound and Rebka moved the gamma ray source with varying speed, thus creating varying Doppler shifts: they then 'just' needed to observe the speed v for which the gravitational shift was cancelled out, and to verify that this speed v agreed with:
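As a rough numerical illustration of the orders of magnitude involved, the sketch below evaluates the fractional shift gh/c^{2} for the 22.5 m tower, and the source speed whose first-order Doppler shift v/c would cancel it. The value of g and the first-order approximation are assumptions made for this example.

```python
# Orders of magnitude for the Pound-Rebka setup (illustrative values).
G_EARTH = 9.81   # gravitational acceleration at the surface, m/s^2 (assumed)
C = 2.998e8      # speed of light, m/s
HEIGHT = 22.5    # distance between source and absorber, m (from the text)

# Fractional gravitational frequency shift over the height of the tower.
fractional_shift = G_EARTH * HEIGHT / C**2

# Speed whose first-order Doppler shift (delta_f / f ~ v / c) cancels it.
cancelling_speed = fractional_shift * C

print(f"fractional shift  = {fractional_shift:.3e}")
print(f"cancelling speed  = {cancelling_speed:.3e} m/s")
```

The speed comes out below a micrometer per second, which gives a feeling for how delicate the measurement was.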
]]>
Physically, this principle (sometimes referred to as the EEP = Einstein's Equivalence Principle) postulates that there is no experiment done in a small confined space which can tell the difference between a uniform gravitational field and an equivalent acceleration.
Mathematically, this important observation states that in the presence of a gravitational field, small enough freely falling frames will be inertial, and that in these Local Inertial Frames (LIF), where the metric g_{μν} reduces to η_{μν}, the laws of physics from Special Relativity will hold true.
Refer to the article The Equivalence Principle for more details.
Refer to the YouTube tutorial below to get a good overview of the mathematical expression of the principle, i.e. how a local inertial frame (black colour) can be obtained by a general coordinate transformation at any point P of a manifold (blue colour).
Refer to the article 1907 Equivalence Principle first published expression to read Einstein's first formulation of this principle.
]]>
By applying the Equivalence Principle, Einstein was able to obtain important results of the general theory of relativity even before he could solve the corresponding field equations.
You can read this demonstration in the 1907 article "On the relativity Principle and the conclusions drawn from it". Actually, as outlined in our article 1907 Equivalence Principle first published mention, the equivalence of gravitation and acceleration is mentioned there for the first time by Einstein.
We will try to demonstrate that the gravitational redshift, i.e. the fact that the frequency of light changes when entering or leaving a gravitational field (which was derived from the metric tensor in the Newtonian limit in the previous article), can also be derived from the Principle of Equivalence.
Consider light traveling from the bottom to the top of a rocket undergoing constant acceleration a. Let point A be at the bottom of the spacecraft, and point B at the top. The separation distance measured in the reference frame of the rocket is H.
When light first leaves point A, the velocity of the rocket is v_{A} with respect to another reference frame (the Earth, for example), and let's call T the time for light to travel to point B, so:
The time T for light to reach B (from the Earth perspective) is:
We approximate T by H/c, since the velocity gained by the rocket during the flight of the light is small compared with c ('small' acceleration).
Also the change in velocity of the rocket between emission and reception is:
In the Earth reference frame (external to the rocket), the observer at point B is moving away from the light at an additional relative velocity of v. In this external frame, we can therefore use the results of Einstein’s Special Theory of Relativity, and particularly the Doppler shift equation for receding velocities:
But the Principle of Equivalence tells us that there is no experiment done in a small confined space which can tell the difference between a uniform gravitational field and an equivalent acceleration.
So the exact same redshift phenomenon should be observed on the Earth, if we replace the rocket by a tower of height H and the constant acceleration a of the rocket by the gravitational field g on the Earth.
Let's try to confirm that the formulation of the gravitational shift derived in the previous article from the metric tensor equals this new formulation.
Based on the previous article, the frequency redshift is equal to:
The Equivalence Principle, almost without any calculus, just by using the Doppler effect of special relativity in flat spacetime, has led us to exactly the same result!
Note: You can find a scintillating explanation of the derivation of time dilation (equivalent to the gravitational redshift) in a gravitational field in Feynman's lecture dedicated to Curved Space; more particularly, section 42-6, The speed of clocks in a gravitational field.
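The agreement between the two routes can also be checked numerically. The sketch below assumes the usual relativistic Doppler ratio for a receding receiver, f_B/f_A = sqrt((1 - β)/(1 + β)), with the velocity gain Δv = aH/c of the rocket argument, and compares it with the tower expression gH/c^{2}. Function names and numerical values are hypothetical choices for this example.

```python
import math

C = 2.998e8  # speed of light, m/s


def doppler_redshift(v):
    """Fractional frequency decrease 1 - f_B/f_A for a receiver receding at speed v."""
    beta = v / C
    # f_B / f_A = sqrt((1 - beta) / (1 + beta)); log1p and expm1 keep
    # full precision when beta is extremely small.
    log_ratio = 0.5 * (math.log1p(-beta) - math.log1p(beta))
    return -math.expm1(log_ratio)


def rocket_redshift(acceleration, height):
    """Redshift seen at the top of the accelerating rocket."""
    velocity_gain = acceleration * height / C  # delta_v = a * T with T ~ H / c
    return doppler_redshift(velocity_gain)


def tower_redshift(g, height):
    """First-order gravitational redshift over a tower of the given height."""
    return g * height / C**2


rocket = rocket_redshift(9.81, 22.5)
tower = tower_redshift(9.81, 22.5)
print(f"rocket: {rocket:.6e}   tower: {tower:.6e}")
```

For everyday accelerations the two numbers agree to far better than any measurable precision, as the equivalence argument requires.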
Feynman's lecture about curved space
]]>
Suppose that an observer is standing on the surface of Earth and is pointing a torch in the sky. A pair of events might be the successive peaks of the light wave leaving the torch.
From our article Generalisation of the metric tensor in pseudo-Riemannian manifold, we know that in the non-inertial reference frame of the Earth, the spacetime line element between the two events can be written as:
where the indices μ and ν run over 0, 1, 2, 3 for spacetime.
But as the observer is at rest in his own reference frame, the only coordinate that changes between the two events is x^{0}, so that the square of the line element simplifies to:
If our observer is at rest in his own reference frame, we also know how to express the spacetime distance in terms of the proper time τ (tau), as explained in the Proper time article:
But we know from our previous article The Geodesic equation in the Newtonian Limit that at the surface of the Earth, the metric tensor can be expanded in terms of the gravitational potential as follows
which then gives, by writing out explicitly the Newtonian gravitational potential at a point in a gravitational field:
In the above equation, the infinitesimal interval dt can be considered as the time interval observed in a reference frame without gravitational effects, or, put another way, in an ideal reference frame located at an infinite distance.
We can therefore write:
where dτ_{∞} refers to the period of the light wave as measured by a distant observer without gravity and where dτ is the period of the wave measured where it is emitted, i.e. at the surface of the Earth.
This equation tells us that clocks run slower in a gravitational field as seen by a distant observer. This effect is known as gravitational time dilation.
As a direct consequence, because frequency is the reciprocal of the period, we can say:
where f_{∞} is the frequency of the wave as measured by a distant observer and f_{emission} is the frequency of the wave measured at the point of emission.
This equation tells us that the frequency of a wave as recorded by a distant observer is less than the frequency recorded by an observer located where the events occurred in the gravitational field. This phenomenon is known as the gravitational redshift, because a reduction in frequency means a shift toward the longer wavelengths or 'red' end of the electromagnetic spectrum.
We can think of the photons as losing energy as they climb out of the gravitational field, with the loss of energy corresponding to a drop in frequency.
Remark1: we can simplify the equation by replacing the quantity 2GM/rc^{2} by R_{s}/r, where R_{s} = 2GM/c^{2} is the Schwarzschild radius
Remark2: we have considered so far the case of a distant observer upon which no gravitational field acts. A more realistic scenario is that the second observer is also under the effect of the gravitational field.
Let's assume that the observer A pointing the torch stands at the surface of the Earth at a distance r_{a} from the center of the Earth and that the second observer B stands, for example, at the top of a tower at the distance r_{b} = r_{a} + h
We can then write:
If we suppose, as is the case on Earth, that R_{s} << r_{a} (= Earth radius), and therefore that R_{s} << r_{b}, the redshift can be approximated by a binomial expansion to become:
This confirms that, when observed in a region of weaker gravitational field (r_{b} > r_{a}), electromagnetic radiation originating from a source in a gravitational field is reduced in frequency (f_{b} < f_{a}), or redshifted.
Remark3: we demonstrate in the article Newtonian limit how this gravitational redshift, i.e. the time curvature, is enough to ensure that free-falling bodies follow their Newtonian trajectories. In other words, Newtonian gravitation is equivalent to time curvature, which is the same as gravitational redshift!
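As a hedged numerical check of the binomial approximation, the sketch below assumes the frequency ratio between the two observers takes the form f_b/f_a = sqrt((1 - R_s/r_a)/(1 - R_s/r_b)), which follows from applying the time dilation factor sqrt(1 - R_s/r) at each radius, and compares it with the first-order expression R_s(r_b - r_a)/(2 r_a r_b). Constants are rounded values and the 22.5 m height is only an example.

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8         # speed of light, m/s
M_EARTH = 5.972e24  # mass of the Earth, kg
R_EARTH = 6.371e6   # mean radius of the Earth, m


def schwarzschild_radius(mass_kg):
    """R_s = 2GM / c^2."""
    return 2 * G * mass_kg / C**2


def redshift_exact(r_a, r_b, r_s):
    """1 - f_b/f_a with f_b/f_a = sqrt((1 - r_s/r_a) / (1 - r_s/r_b))."""
    # log1p / expm1 avoid the loss of precision of a direct subtraction.
    log_ratio = 0.5 * (math.log1p(-r_s / r_a) - math.log1p(-r_s / r_b))
    return -math.expm1(log_ratio)


def redshift_first_order(r_a, r_b, r_s):
    """Binomial (weak-field) approximation of the same quantity."""
    return r_s * (r_b - r_a) / (2 * r_a * r_b)


r_s = schwarzschild_radius(M_EARTH)
exact = redshift_exact(R_EARTH, R_EARTH + 22.5, r_s)
approx = redshift_first_order(R_EARTH, R_EARTH + 22.5, r_s)
print(f"R_s (Earth) = {r_s * 1e3:.2f} mm")
print(f"exact = {exact:.6e}   first order = {approx:.6e}")
```

Because R_s for the Earth is only a few millimetres, the first-order expression is indistinguishable from the exact one in practice.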
]]>Newtonian gravity consists of two equations: one tells us how matter responds to gravity, and the other tells us how matter produces gravity.
The first equation, derived from Newton's second law of motion, says
where the vector a is the acceleration through space, ∇ (nabla) is the Euclidean gradient operator and Φ is the gravitational potential.
In this article, we will focus on this first equation, and we will try to derive an approximation of the Newtonian gravitational equation with the mathematics of general relativity.
In the Newtonian limit, we make three assumptions: the particle moves slowly compared with the speed of light, the gravitational field is static, and the gravitational field is weak.
As the geodesic equation describes the worldline of a particle acted upon only by gravity, our goal is therefore to show that, in the context of the Newtonian limit, the geodesic equation reduces to Newton's first gravity equation, as expressed above.
We recall from our previous article Geodesic equation and Christoffel symbols that the geodesic equation, using proper time as the parameter of the worldline is:
The second term hides a sum in μ and ν over all indices (16 terms). But because the particle in question is moving slowly (first assumption of the Newtonian limit), the time component (i.e. the 0th component of the particle's four-velocity) dominates the other (spatial) components, and every term containing one or two spatial four-velocity components will then be dwarfed by the term containing two time components.
We can therefore take the approximation:
If we restrict ourselves to the Newtonian 3D space, meaning that we assign β to spatial dimensions only, we can then replace β by the Latin letter i (i = x, y, z), giving:
From our article Christoffel symbols in terms of the metric tensor, we know how to calculate the Christoffel symbol with respect to the components of a given metric:
But because the field is supposed to be static (second assumption of the Newtonian limit), the time derivative g_{j0,0} is zero, so that the Christoffel symbol can be simplified to:
Now, if the gravitational field is weak enough (third assumption of the Newtonian limit), then spacetime will be only slightly deformed from the gravity-free Minkowski space of special relativity, and we can consider the spacetime metric as a small perturbation of the Minkowski metric η_{μν}:
At this step, Equation 1 becomes:
If we now define g^{ij} = η^{ij} - h^{ij}, we find that g^{μσ} g_{σν} = δ^{μ}_{ν} to first order in h_{ij}, which defines an inverse metric.
We obtain then
But η^{ij} is nonzero only if j = i, for which η^{ii} = -1 (i refers to the spatial components), so that:
We now need to change the derivative on the left hand side from τ (tau) to t.
For this, let's first replace i by 0 in the above equation:
With this result in hand, we still need to play around with the partial derivative with respect to tau:
So finally, expressing this in vectorial form:
which is another way of writing the Newtonian gravitational Equation from the beginning.
By writing the metric component g_{00} as:
we can see the direct link between the metric tensor (component _{00}) on the left side and the gravitational potential Φ on the right side.
We can calculate the value of h_{00} on the Earth and check that it is extremely small, meaning that the deviation from the Minkowski metric due to the gravitational field is negligible.
In the same way, we would calculate a correction to the Minkowski metric of order 10^{-6} at the surface of the Sun and of order 10^{-4} at the surface of a white dwarf. We conclude that the weak-field limit is an excellent approximation.
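These orders of magnitude can be reproduced with a short sketch, assuming |h_{00}| = 2GM/(rc^{2}), i.e. twice the Newtonian potential divided by c^{2}. The constants are rounded values, and the white dwarf (one solar mass, one hundredth of the solar radius) is an illustrative assumption.

```python
# Size of the weak-field correction |h_00| = 2*G*M / (r * c^2).
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s


def h00_magnitude(mass_kg, radius_m):
    """Magnitude of the deviation of g_00 from the Minkowski value."""
    return 2 * G * mass_kg / (radius_m * C**2)


M_SUN, R_SUN = 1.989e30, 6.957e8
bodies = {
    "Earth": (5.972e24, 6.371e6),
    "Sun": (M_SUN, R_SUN),
    # Illustrative white dwarf (assumption): solar mass, R_sun / 100.
    "White dwarf": (M_SUN, R_SUN / 100),
}

corrections = {name: h00_magnitude(m, r) for name, (m, r) in bodies.items()}
for name, value in corrections.items():
    print(f"{name:12s} |h_00| = {value:.1e}")
```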
]]>We have already shown how to derive the geodesic equation directly from the Equivalence Principle in our article Geodesic equation and Christoffel symbols.
Here our aim is to focus on the second definition of the geodesic (path of longest proper time^{[1]}) to derive the geodesic equation from a variational approach, using the Principle of Least Action. That is actually how Einstein deduced it in his 1916 synthesis paper The Foundation of the General Theory of Relativity.
]]>We will show thereafter that for a surface of a sphere like the Earth:
We know from our previous article Geodesic exercise part I: calculation for 2dimensional Euclidean space that the geodesic equation can be written as
In spherical polar coordinates for a two-dimensional surface where r is constant, the index i equals 1 or 2 and x^{1} = θ, x^{2} = φ.
We then need the connection coefficients for a surface of a sphere as calculated previously in Christoffel symbol exercise: calculation in polar coordinates part II
In the geodesic equation, if we replace x^{i} by θ, we have to sum over four elements in the form of Γ^{θ}_{jk}.
But the only nonzero connection coefficient of the form Γ^{θ}_{jk} is Γ^{θ}_{φφ} = -sinθ cosθ, so that the geodesic equation becomes (i=θ, j=φ, k=φ):
Equation 1
Now we have to replace x^{i} by φ in the geodesic equation, and therefore we have to sum over four elements of the form Γ^{φ}_{jk}.
As the only two nonzero connection coefficients of the form Γ^{φ}_{jk} are Γ^{φ}_{φθ} = Γ^{φ}_{θφ} = cosθ/sinθ, the geodesic equation becomes (i=φ, j=θ, k=φ and i=φ, j=φ, k=θ):
Equation 2
If we parameterise our meridian by saying that θ = λ where 0 <= λ <= π/2, and φ = 0
then dθ/dλ = 1 => d^{2}θ/dλ^{2} = 0; also, as φ = 0, dφ/dλ = d^{2}φ/dλ^{2} = 0.
So equation 1 becomes 0 - sinθ cosθ x 0^{2} = 0, i.e. 0 = 0, which holds true.
In the same way, equation 2 becomes 0 + 2(cosθ/sinθ) x 1 x 0 = 0 <=> 0 = 0, which is verified.
Both equations 1 and 2 are true (the left-hand side equals the right-hand side), so the geodesic equations are satisfied: we have shown that the meridian is a geodesic.
If we parameterise our equator by saying that θ = π/2 and φ = λ where 0 <= λ <= 2π
then dφ/dλ = 1 => d^{2}φ/dλ^{2} = 0; also, as θ = π/2, dθ/dλ = d^{2}θ/dλ^{2} = 0, cosθ = cos(π/2) = 0 and sinθ = sin(π/2) = 1.
So equation 1 becomes 0 - 1 x 0 x 1^{2} = 0, i.e. 0 = 0, which holds true.
In the same way, equation 2 becomes 0 + 2 x 0 x 0 x 1 = 0 <=> 0 = 0, which is verified.
Both equations 1 and 2 are true (the left-hand side equals the right-hand side), so the geodesic equations are satisfied: we have shown that the equator is a geodesic.
Let's now take the example of a circle on the top part of the sphere defined by θ = π/4 and φ = λ where 0 <= λ <= 2π
Then dφ/dλ = 1 => d^{2}φ/dλ^{2} = 0; also, as θ = π/4, dθ/dλ = d^{2}θ/dλ^{2} = 0 and cosθ = cos(π/4) = sinθ = sin(π/4) = √2/2.
So equation 1 becomes 0 - (√2/2) x (√2/2) x 1^{2} = -1/2 ≠ 0 (not true!)
Equation 2 becomes 0 + 2 x 1 x 0 x 1 = 0 <=> 0 = 0, which is verified.
As equation 1 is not verified, even though equation 2 is, this top circle of latitude is not a geodesic.
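The three checks above can be reproduced numerically. The sketch below evaluates the left-hand sides of the two geodesic equations on the sphere, d^{2}θ/dλ^{2} - sinθ cosθ (dφ/dλ)^{2} and d^{2}φ/dλ^{2} + 2 (cosθ/sinθ)(dθ/dλ)(dφ/dλ), for each parameterisation. The function name and the sample point chosen on the meridian are hypothetical choices for this example.

```python
import math


def geodesic_residuals(theta, dtheta, d2theta, dphi, d2phi):
    """Left-hand sides of geodesic equations 1 and 2 on a sphere.

    Both values must vanish along a geodesic.
    """
    eq1 = d2theta - math.sin(theta) * math.cos(theta) * dphi**2
    eq2 = d2phi + 2 * (math.cos(theta) / math.sin(theta)) * dtheta * dphi
    return eq1, eq2


# Meridian: theta = lambda, phi = 0 (evaluated at the sample point theta = pi/3).
meridian = geodesic_residuals(math.pi / 3, dtheta=1, d2theta=0, dphi=0, d2phi=0)
# Equator: theta = pi/2, phi = lambda.
equator = geodesic_residuals(math.pi / 2, dtheta=0, d2theta=0, dphi=1, d2phi=0)
# Circle of latitude: theta = pi/4, phi = lambda.
latitude = geodesic_residuals(math.pi / 4, dtheta=0, d2theta=0, dphi=1, d2phi=0)

print("meridian:", meridian)
print("equator: ", equator)
print("latitude:", latitude)
```

Only the circle of latitude leaves a residual of about -1/2 in the first equation, so it is the only one of the three curves that is not a geodesic.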
]]>
In Cartesian coordinates and in two-dimensional space, as there is no z coordinate, the Euclidean line element becomes:
dl^{2} = dx^{2} + dy^{2}
Therefore, the corresponding metric is (we are using Latin indices as we are not working in spacetime):
We also know from our previous article Geodesic equation and Christoffel symbols that the geodesic equation can be written as
But the proper time is clearly not a convenient parameter in the case of the propagation of photons (the proper time is not defined for massless particles).
It is better to use a so-called affine parameter λ, as shown below:
In order to calculate the eight Christoffel symbols (2*2*2 in 2D space), we need to use the equation given in Christoffel symbols in terms of the metric tensor
But as the values of the metric are constant (equal to 0 or 1 as pointed out above), the partial derivatives g_{ij,k} = 0 for all values of i, j and k. Therefore Γ_{jk}^{i} = 0 for all values of i, j, k and the geodesic equation simply becomes:
The function x^{i} = aλ + b, where a and b are constants, is obviously a solution to this equation, since differentiating it twice gives 0.
As we are using Cartesian coordinates, where x^{i} equals x and y, the above solution becomes x = aλ + b and y = cλ + d.
Solving the first equation for λ and substituting into the second gives y = (c/a)x + (ad - bc)/a,
which is the equation of a straight line with gradient c/a and constant (ad - bc)/a.
]]>In this article, we will calculate the Euclidean metric tensor for the surface of a sphere in spherical coordinates in two ways, as seen in the previous article Generalisation of the metric tensor
In this Euclidean three-dimensional space, the line element is given by:
dl^{2} = dr^{2} + r^{2}dθ^{2} + r^{2}sin^{2}θdΦ^{2}
If we set the polar coordinate r to be some constant R we lose the dr term (because r is now constant) and the line element now becomes:
dl^{2} = R^{2}dθ^{2} + R^{2}sin^{2}θdΦ^{2}
which describes a twodimensional surface using the two polar coordinates (θ, Φ)
Now, we know from the previous article that this line element can be written as:
dl^{2} = g_{ij}dx^{i}dx^{j}
We can deduce immediately that the metric and inverse metric for this surface, using coordinates x^{0}=θ and x^{1}=Φ, are:
This was the easy part. Let's try to calculate the same metric by using the formula of the coordinates derivatives product.
We should recall that we also defined the metric tensor as the product of derivatives with respect to another coordinate system (in the previous article, it was a Minkowski inertial reference frame).
Now, the Cartesian coordinates and spherical coordinates are linked together by the following equations:
At this point we can confirm that, using both the space line element and the product of coordinate derivatives, we have found exactly the same components for the metric of a two-dimensional surface of a sphere in polar coordinates.
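The second method can be sketched numerically. Assuming the usual relations x = R sinθ cosφ, y = R sinθ sinφ, z = R cosθ, the code below builds the metric as the sum over the Cartesian coordinates of the products of their derivatives with respect to θ and φ, and compares it with diag(R^{2}, R^{2}sin^{2}θ). The function name and sample values are hypothetical.

```python
import math


def sphere_metric_from_derivatives(radius, theta, phi):
    """g_ij = sum over m of (d x^m / d u^i) * (d x^m / d u^j), with u = (theta, phi)."""
    # Derivatives of (x, y, z) with respect to theta.
    d_theta = (
        radius * math.cos(theta) * math.cos(phi),
        radius * math.cos(theta) * math.sin(phi),
        -radius * math.sin(theta),
    )
    # Derivatives of (x, y, z) with respect to phi.
    d_phi = (
        -radius * math.sin(theta) * math.sin(phi),
        radius * math.sin(theta) * math.cos(phi),
        0.0,
    )
    tangents = (d_theta, d_phi)
    return [
        [sum(a * b for a, b in zip(t_i, t_j)) for t_j in tangents]
        for t_i in tangents
    ]


R, THETA, PHI = 2.0, 0.8, 1.3
metric = sphere_metric_from_derivatives(R, THETA, PHI)
print(metric)
print("expected diagonal:", R**2, (R * math.sin(THETA)) ** 2)
```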
]]>In this article, our aim is to calculate the Christoffel symbols for a twodimensional surface of a sphere in polar coordinates.
We have already calculated some Christoffel symbols in Christoffel symbol exercise: calculation in polar coordinates part I, but with the Christoffel symbol defined as the product of coordinate derivatives, and for a two-dimensional Euclidean plane.
Here we will be using the formulation relative to the metric found in the previous article Christoffel symbols in terms of the metric tensor and we will apply it for the surface of a sphere.
From the previous article Metric tensor exercise: calculation for the surface of a sphere, we know that the metric and the inverse metric describing the surface of a sphere are respectively
In polar coordinates, we know that we have to find the eight following symbols:
Let's start by calculating the four symbols with θ as upper index. We can write:
The four first symbols are now very easy to deduce:
These four values could be summarized in our first matrix:
It's now time for us to calculate the four symbols with φ as upper index. We can write:
From there, we can easily deduce the last four connection coefficients:
So finally for a surface of a sphere, the eight Christoffel symbols are equal to:
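As a cross-check, the sketch below evaluates the Christoffel symbols of the sphere numerically from the metric, assuming the standard expression Γ^{i}_{jk} = ½ g^{il}(∂_{k}g_{lj} + ∂_{j}g_{lk} - ∂_{l}g_{jk}) and approximating the derivatives by central finite differences. Indices 0 and 1 stand for θ and φ; all function names are hypothetical.

```python
import math


def sphere_metric(coords, radius=1.0):
    """Metric of the sphere in (theta, phi) coordinates."""
    theta, _phi = coords
    return [[radius**2, 0.0], [0.0, (radius * math.sin(theta)) ** 2]]


def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def metric_derivative(metric, coords, k, step=1e-6):
    """Central finite difference of g_ij with respect to coordinate k."""
    forward, backward = list(coords), list(coords)
    forward[k] += step
    backward[k] -= step
    g_plus, g_minus = metric(forward), metric(backward)
    return [[(g_plus[i][j] - g_minus[i][j]) / (2 * step) for j in range(2)]
            for i in range(2)]


def christoffel(metric, coords):
    """gamma[i][j][k] = 0.5 * g^il * (d_k g_lj + d_j g_lk - d_l g_jk)."""
    g_inv = inverse_2x2(metric(coords))
    dg = [metric_derivative(metric, coords, k) for k in range(2)]  # dg[k][i][j]
    return [[[0.5 * sum(g_inv[i][l] * (dg[k][l][j] + dg[j][l][k] - dg[l][j][k])
                        for l in range(2))
              for k in range(2)]
             for j in range(2)]
            for i in range(2)]


THETA = math.pi / 3
gamma = christoffel(sphere_metric, (THETA, 0.5))
print("Gamma^theta_phi_phi =", gamma[0][1][1])   # compare with -sin(theta) cos(theta)
print("Gamma^phi_theta_phi =", gamma[1][0][1])   # compare with cos(theta) / sin(theta)
```

The finite-difference step is a pragmatic choice; a symbolic package would give exact expressions, at the cost of an extra dependency.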
]]>As all the information about the spacetime structure is contained in the metric, it should be possible to express the Christoffel symbols in terms of this metric.
The following calculation is a little bit long and requires special attention (although it is not particularly difficult).
So far, we have defined both the metric tensor and the Christoffel symbols as respectively:
Let's begin by rewriting our metric tensor in the slightly different form g_{αμ}:
Now, in this second step, we want to calculate the partial derivative of g_{αμ} by x^{ν}:
Now let's try to rewrite the Christoffel symbol by multiplying each part of the equation by the partial derivative of ξ^{σ} relative to x^{β}:
We can now rewrite the partial derivative of g_{αμ} by x^{ν} as follows:
Now, we recognize from our previous article Generalisation of the metric tensor that
If we now do the operation (3) + (4) - (5) we get:
Finally the last step consists in multiplying both sides of the equations by the inverse metric g^{βα} to isolate the Christoffel symbol
Usually, we adopt the following convention for writing partial derivatives:
Phew!
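For reference, the standard expression of the Christoffel symbols in terms of the metric that this derivation leads to is written out below. The index labels are an assumption, chosen to match the g_{αμ}, x^{ν} and g^{βα} used in the steps above; the last relation is the usual comma shorthand for partial derivatives.

```latex
\Gamma^{\beta}_{\mu\nu}
  = \frac{1}{2}\, g^{\beta\alpha}
    \left(
      \frac{\partial g_{\alpha\mu}}{\partial x^{\nu}}
      + \frac{\partial g_{\alpha\nu}}{\partial x^{\mu}}
      - \frac{\partial g_{\mu\nu}}{\partial x^{\alpha}}
    \right)
  = \frac{1}{2}\, g^{\beta\alpha}
    \left( g_{\alpha\mu,\nu} + g_{\alpha\nu,\mu} - g_{\mu\nu,\alpha} \right),
\qquad
g_{\alpha\mu,\nu} \equiv \frac{\partial g_{\alpha\mu}}{\partial x^{\nu}}
```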
]]>In the same way as we have generalized the formulation of the geodesic equation from an inertial reference frame to an arbitrary reference frame (see Geodesic equation and Christoffel symbols), our first goal in this article is to generalize the definition of the metric tensor from Minkowski spacetime (see The Minkowski metric) to that of a so-called pseudo-Riemannian manifold, which is the mathematical structure by which the spacetime of General Relativity can be modelled.
If we adopt the same convention as in the article Geodesic equation and Christoffel symbols, by naming the spacetime coordinates in the local inertial reference frame ξ^{α}, i.e. ξ^{0} = ct, ξ^{1} = x, ξ^{2} = y, ξ^{3} = z, we can then write the Minkowski line element as follows
Let's call x^{μ} the coordinates in the new arbitrary (non-inertial) reference frame. Given that
ξ^{α} = ξ^{α}(x^{0}, x^{1}, x^{2}, x^{3}), an infinitesimal variation dξ^{α} is:
The metric tensor has also the following properties:
The metric tensor g_{μν} is of fundamental importance: it contains all the information about spacetime and, because spacetime curvature is equivalent to gravitation, the metric contains all the information about the gravitational field. The goal of general relativity could therefore be described as calculating this metric. For symmetry reasons, it is easy to see that the 16 metric components reduce to only 10 independent values.
To get familiar with this metric formulation, we will dedicate the next article Metric tensor exercise: calculation for the surface of a sphere to the calculation of the metric of the surface of a sphere of a radius R in polar coordinates.
[1] There is a precise reason why the inverse metric is written with upper indices: as we will see later in our article dedicated to the Introduction to Tensors, it is a rank-2 contravariant tensor, whereas the metric tensor itself, which is a rank-2 covariant tensor, is written with lower indices.
]]>The equation giving the distance between two points in a particular space is called the metric. Once we know the metric of a space, we know almost everything about the geometry of the space, which is why the metric is of fundamental importance.
We have already met the function that defines the distance between two points in Minkowski spacetime (see Minkowski's FourDimensional SpaceTime article): it's the spacetime interval given by the formula
ds^{2} = c^{2}Δt^{2} - Δx^{2} - Δy^{2} - Δz^{2}
This can be written as ds^{2} = 1xc^{2}Δt^{2} - 1xΔx^{2} - 1xΔy^{2} - 1xΔz^{2}
And +1, -1, -1, -1 can be defined as the diagonal elements of a 4x4 matrix, denoted by η_{μν}:
The indices μ,ν after the η symbol identify the elements of the matrix by reference to its rows (μ) and its columns (ν). The convention is that the indices run from 0 to 3, so η_{00}=1, η_{11}=-1, η_{22}=-1, η_{21}=0, etc.
This matrix simply tells us how to multiply the differentials cdt, dx, dy, dz to obtain the spacetime interval equation. We can see it by writing the matrix product as follows:
So finally, as only η_{00}, η_{11}, η_{22} and η_{33} are nonzero, the product is equal to
1xc^{2}Δt^{2} - 1xΔx^{2} - 1xΔy^{2} - 1xΔz^{2} = c^{2}Δt^{2} - Δx^{2} - Δy^{2} - Δz^{2}
which is the spacetime interval defining Minkowski spacetime.
Finally, using the index notation, the Einstein convention (implicit summation on repeated indices) and the Minkowski metric η_{μν}, we can write the Minkowski line element in a more compact form:
Recall that the symbol η_{μν} refers specifically to the Minkowski metric.
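The double sum over μ and ν can be made explicit with a minimal sketch, assuming the (+, -, -, -) sign convention used in this article. The function name and the sample displacements are hypothetical.

```python
# Minkowski metric with signature (+, -, -, -); indices run from 0 to 3.
ETA = [
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, -1],
]


def interval_squared(displacement):
    """ds^2 = eta_mu_nu dx^mu dx^nu, with displacement = (c*dt, dx, dy, dz)."""
    return sum(
        ETA[mu][nu] * displacement[mu] * displacement[nu]
        for mu in range(4)
        for nu in range(4)
    )


timelike = interval_squared((5.0, 3.0, 0.0, 0.0))   # 25 - 9
lightlike = interval_squared((1.0, 1.0, 0.0, 0.0))  # a light ray along x
print(timelike, lightlike)
```

Only the four diagonal terms contribute, which is exactly the simplification described above.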
]]>The brief lessons cover seven areas in modern physics — relativity, quantum mechanics, the structure of the universe, particle physics, quantum gravity, probability and black holes, and finally, how all these topics relate to us mere mortals.
]]>In this section, as an exercise, we will calculate the Christoffel symbols using polar coordinates for a two-dimensional Euclidean plane.
and given the fact that, as stated in Geodesic equation and Christoffel symbols
we are then ready to calculate the Christoffel symbols in polar coordinates. As we know from the definition of Christoffel Symbol or Connection coefficient, in 2 dimensional space, we have to find 2x2x2 = 8 connection coefficients, and only 6 distinct values because of the symmetry on the lower indices.
The eight Christoffel symbols to find are summarized in the two matrices below, with the symbols being symmetric in the lower indices (meaning that the connection coefficients that are linked below by the blue arrow are equal).
Let's start by populating the four values of the first matrix with r as upper index:
So finally we get the first matrix equal to:
Let's now calculate the four following coefficients, all with θ as upper index:
The calculation for the last coefficient gives:
Finally the last four Christoffel symbols can be summarized as follows:
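The eight values can be verified with a short sketch that assumes the coordinate-derivative definition Γ^{i}_{jk} = (∂x^{i}/∂ξ^{m})(∂^{2}ξ^{m}/∂x^{j}∂x^{k}), with Cartesian coordinates ξ = (x, y) = (r cosθ, r sinθ) and polar coordinates (r, θ) labelled 0 and 1. The function name and the sample point are hypothetical.

```python
import math


def polar_christoffel(r, theta):
    """Christoffel symbols gamma[i][j][k] of the plane in polar coordinates.

    Index 0 stands for r and index 1 for theta.
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # dx_dxi[i][m] = d x^i / d xi^m (inverse of the Jacobian of xi = (x, y)).
    dx_dxi = [[cos_t, sin_t], [-sin_t / r, cos_t / r]]
    # d2xi[m][j][k] = d^2 xi^m / (d x^j d x^k).
    d2xi = [
        [[0.0, -sin_t], [-sin_t, -r * cos_t]],  # xi^0 = x = r cos(theta)
        [[0.0, cos_t], [cos_t, -r * sin_t]],    # xi^1 = y = r sin(theta)
    ]
    return [[[sum(dx_dxi[i][m] * d2xi[m][j][k] for m in range(2))
              for k in range(2)]
             for j in range(2)]
            for i in range(2)]


symbols = polar_christoffel(2.0, 0.7)
print("Gamma^r_theta_theta =", symbols[0][1][1])   # compare with -r
print("Gamma^theta_r_theta =", symbols[1][0][1])   # compare with 1/r
```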
]]>
We have mentioned that relativistic effects have so far been observed only in the realm of subatomic particles, because those are the only objects that physicists are able to accelerate to very high speeds in their laboratories.
But in theory, nothing prevents these effects of special relativity from applying equally well to things the size of humans. Even more, as Brian Cox puts it in his book E=mc^{2}, we could rely on these strange effects for our own survival!
As we now know, the Sun is roughly middle-aged and will remain fairly stable for more than another five billion years. However, after hydrogen fusion in its core has stopped, the Sun will undergo severe changes and become a red giant. It is calculated that the Sun will become sufficiently large to engulf the current orbits of Mercury, Venus, and possibly Earth. It means that if humans have not become extinct for some other reason by then, it will become necessary for us to escape our beloved home and journey to some other stars. But which stars?
If we did not know anything about special relativity, we would say that there is very little chance of being able to travel to any of the 200 billion stars that the Milky Way contains, for the very simple reason that our galaxy has a diameter usually considered to be about 100,000–120,000 light-years: we could hardly expect to travel to the edges of a galaxy that light itself takes 100,000 years to cross.
But imagine that one day we could build a spaceship that could send us into space at speeds very close to the speed of light (and we have 4 billion years to figure out how); then the theory of special relativity tells us that the distance to the stars would shrink, all the more so as we get closer to the speed of light.
In this context, it even becomes conceivable to reach the neighboring Andromeda Galaxy, almost 2.5 million light-years away.
As an exercise, let's try to figure out how fast we would need to travel if we don't want to age more than 1 year on the journey to Andromeda.
If we assume that our spaceship is moving with constant velocity relative to the Earth frame (ct,x,y,z), then special relativity applies and we want our proper time to be equal to 1 year, so:
Provided we have a very fast spaceship, Einstein's theory of relativity and the resulting length contraction make travel to distant parts of the Universe imaginable in a way that it never was before!
We end up with this very paradoxical result: by travelling at 0.99999999999992c, just a little less than the speed of light, it takes us only 1 year of our own proper time to reach a galaxy 2.5 million light-years away, whereas by definition it takes light itself 2.5 million years, as measured from Earth, to travel the same distance.
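The Andromeda exercise can be checked numerically. The sketch below assumes a constant-velocity trip, works in light-years and years (so that c = 1), and uses Python's decimal module because 1 - v/c is of order 10^{-13}; the function name required_speed is ours and purely illustrative. It starts from the invariant interval τ^{2} = t^{2} - d^{2}, where d is the Earth-frame distance and t the Earth-frame travel time.

```python
from decimal import Decimal, getcontext

getcontext().prec = 40  # plenty of digits: 1 - v/c is of order 1e-13


def required_speed(distance_ly, proper_time_yr):
    """Return (beta, gamma) for a constant-velocity trip.

    Units are light-years and years, so c = 1.
    """
    d = Decimal(distance_ly)
    tau = Decimal(proper_time_yr)
    # Invariant interval: tau^2 = t^2 - d^2, hence the Earth-frame duration t
    t = (d * d + tau * tau).sqrt()
    beta = d / t      # v/c
    gamma = t / tau   # time dilation factor
    return beta, gamma


beta, gamma = required_speed("2.5e6", "1")
print("1 - v/c     =", 1 - beta)
print("1 - v^2/c^2 =", 1 - beta * beta)
print("gamma       =", gamma)
```

Under these assumptions the Lorentz factor is about 2.5 million, 1 - v^{2}/c^{2} is about 1.6 × 10^{-13}, and 1 - v/c is half of that, about 8 × 10^{-14}.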
]]>In our previous articles, we have seen that the time dilation factor, also known as the Lorentz factor γ, becomes measurably larger than 1 only for objects or reference frames that move close to the speed of light.
Ideally, we would like to find an object with a known lifetime and then see whether we can prolong that lifetime simply by making it move fast.
The muon^{[1]} meets these requirements as it has a very short lifetime: it decays via the weak interaction into an electron and a pair of other subatomic particles called neutrinos after 2.2 microseconds on average. Also, because this particle is very small, it is very easy to accelerate to very high speeds.
In the late 1990s, scientists at Brookhaven National Laboratory on Long Island, New York, used the Alternating Gradient Synchrotron (AGS) to produce beams of muons circulating around a 14-meter-diameter ring at a speed of 99.94 percent of the speed of light.
If muons lived for only 2.2 microseconds at this speed, they would manage no more than 15 laps of the ring before they decayed.
In reality they managed around 433 laps, which means their lifetime was extended by a factor of 28.8 to just over 60 microseconds.
If Einstein is right and the special theory of relativity holds true, we should find the same value by calculating the Lorentz factor for a speed of 99.94 percent of the speed of light:
So Einstein's prediction was right: special relativity predicts that the accelerated muon should live about 29 times longer than a muon standing still, and this is exactly what was observed by the scientists at Brookhaven!
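The muon numbers can be reproduced in a few lines. This is a minimal sketch that treats the ring as a circle of diameter 14 m and uses the 2.2 microsecond lifetime quoted in the text; the constant and function names are our own choices.

```python
import math

C = 299_792_458.0        # speed of light in m/s
MUON_LIFETIME = 2.2e-6   # lifetime at rest, in seconds (value used in the text)
RING_DIAMETER = 14.0     # metres, as quoted in the text


def lorentz_factor(beta):
    """gamma = 1 / sqrt(1 - v^2/c^2), with beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def laps(beta, lifetime):
    """Number of laps completed during `lifetime` seconds of laboratory time."""
    circumference = math.pi * RING_DIAMETER
    return beta * C * lifetime / circumference


beta = 0.9994
gamma = lorentz_factor(beta)
print(f"gamma                 = {gamma:.2f}")
print(f"laps without dilation = {laps(beta, MUON_LIFETIME):.1f}")
print(f"laps with dilation    = {laps(beta, gamma * MUON_LIFETIME):.0f}")
```

The Lorentz factor comes out just under 29, and multiplying the rest lifetime by it turns roughly 15 laps into roughly 433.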
That is a great result, but we should still take a moment to clarify a point: how many laps would we complete if we could speed around the ring alongside the muons?
As we are standing still relative to the muon and the muon lives for 2.2 microseconds in its own rest frame (that's the muon's proper time), our watch would also measure a 2.2-microsecond time interval. Nevertheless, we would still manage to make 433 laps of the ring. How is this possible?
That's because of the second effect of special relativity, the length contraction in the direction of motion, as seen in The Lorentz transformations Part III - Measuring-Rods and Clocks in motion. From the muon's perspective, the ring is travelling at 99.94 percent of the speed of light and therefore its length should shrink by the same factor γ. So in the muon's rest frame, we are flying more than 400 times around a ring of only 14/29 m ≈ 0.5 m! There is no contradiction!
Below is a spacetime diagram which shows the length contraction from one reference frame relative to another: if we suppose that, in our context, the frame R' is attached to the Earth and the muon's frame is R, we can see that the length measured in R is contracted.
To visualise the length contraction effect, draw a tangent (dotted black line) to the space hyperbola at the point where the object's spatial axis (x') crosses the hyperbola. This line is parallel to the ct' axis and therefore has a constant value of x'=1. The distance from the origin to where this tangent crosses the observer's spatial axis (x) is equal to the contracted length. From the point of view of the observer in R, i.e. from the muon's perspective in our case, the ring of length 1 has contracted to a length < 1.
Remark: on the diagram, the ring is contracted to roughly 0.8 to keep it visible, but in reality it should be contracted to 1/29 = 0.0345. Recalling our last article The Lorentz transformations Part V - 2nd observer in Minkowski spacetime diagram, the x' axis should be much closer to the 45° light worldline, as we should have tan θ = v/c = 0.9994, so θ = 44.98°, which would make the tangent line much steeper.
[1] The muon is an elementary particle similar to the electron, with an electric charge of -1 and a spin of 1/2, but with a much greater mass. It is classified as a lepton, along with the electron, the tau and the three neutrinos.
]]>In our previous article Minkowski spacetime and time dilation calculation, we explained how to visualise the time dilation effect between two inertial frames, Ziga and Ranja, moving at constant velocity v relative to each other, using the Minkowski spacetime diagram.
However, we did not really explain how to draw the second inertial frame, Ranja, belonging to a second observer O'. How do we draw the ct' and x' axes of the Ranja frame?
The set of all events in frame Ranja for which the spatial coordinate x' equals zero forms the ct' axis. So we just have to consider the point x'=0: this point is moving along the x axis with velocity v (as the frame Ranja is moving at this velocity).
We could state that if an object is travelling with a constant velocity v, then that velocity equals the distance travelled divided by the time taken, and is given by
But we could have found this equation directly using the Lorentz transformations:
Similarly, if we want to find the equation of the x' axis, which is the line where ct'=0, we deduce from the Lorentz transformation of ct':
The figure below shows the lines ct = (v/c) x and ct = (c/v) x, which are the equations of the x' and ct' axes of a frame R' moving with speed v relative to S.
The angle between ct' and ct is equal to the angle between x' and x, and both are given by tan θ = v/c.
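As a quick numerical illustration of these two lines, the sketch below returns the slopes of the primed axes in the (x, ct) plane and the tilt angle θ for a few speeds. The function names are ours, and β stands for v/c.

```python
import math


def primed_axes_slopes(beta):
    """Slopes d(ct)/dx of the x' axis and of the ct' axis in the (x, ct) plane."""
    # x' axis:  ct = (v/c) x   ;   ct' axis:  ct = (c/v) x
    return beta, 1.0 / beta


def axis_tilt_degrees(beta):
    """Angle between ct' and ct (and between x' and x), from tan(theta) = v/c."""
    return math.degrees(math.atan(beta))


for beta in (0.3, 0.6, 0.9994):
    slope_x, slope_ct = primed_axes_slopes(beta)
    print(f"v/c = {beta}: x' slope = {slope_x:.4f}, "
          f"ct' slope = {slope_ct:.4f}, tilt = {axis_tilt_degrees(beta):.2f} deg")
```

As v approaches c both axes close in on the 45° light line from opposite sides, which is why the tilt never exceeds 45°.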
]]>
It's important for us to understand what Christoffel symbols exactly mean from a physical point of view, as we know that in General Relativity, the paths of particles and light beams in free fall are calculated by solving the geodesic equations, in which the Christoffel symbols explicitly appear (refer to Geodesic equation and Christoffel symbols).
In many applications, it is important to know how a vector field changes as we move from one location to another.
For vectors expressed using Cartesian coordinates, taking the derivative of a vector is quite straightforward: we simply take the derivative of each of the vector's components.
Let's consider a vector field V (x,y, z) representing air moving in a room. We can imagine an arbitrary function describing our vector field:
V = (3xy) ê_{x} + (x+ 4y + 3z) ê_{y} + (2y) ê_{z}
If we are now asked to find the rate of change of the air flow with respect to the (x, y, z) coordinate system, we can easily take the partial derivatives of V:
Here we don't need to worry about differentiating the basis vectors ê_{x}, ê_{y}, ê_{z} because they are constant: each is one unit long and points along the x, y and z axes respectively.
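For the example field above, the component-by-component derivatives can be written down by hand and cross-checked numerically. The sketch below does both; the helper names and the sample point are arbitrary choices of ours, and the numerical part uses a simple central difference.

```python
def V(x, y, z):
    """Cartesian components (Vx, Vy, Vz) of the example field from the text."""
    return (3 * x * y, x + 4 * y + 3 * z, 2 * y)


def exact_partials(x, y, z):
    """Hand-derived partial derivatives of V, one tuple per coordinate."""
    return {
        "x": (3 * y, 1, 0),   # dV/dx
        "y": (3 * x, 4, 2),   # dV/dy
        "z": (0, 3, 0),       # dV/dz
    }


def partial(field, point, axis, h=1e-6):
    """Central-difference estimate of d(field)/d(coordinate number `axis`)."""
    plus, minus = list(point), list(point)
    plus[axis] += h
    minus[axis] -= h
    return tuple((a - b) / (2 * h) for a, b in zip(field(*plus), field(*minus)))


point = (1.0, 2.0, 3.0)
for axis, name in enumerate("xyz"):
    print(f"dV/d{name}: exact = {exact_partials(*point)[name]}, "
          f"numeric = {partial(V, point, axis)}")
```

Because the Cartesian basis vectors are constant, these component derivatives are the whole story; no extra terms appear.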
But here is the crux of the problem: in most coordinate systems (the exception being Cartesian coordinates, such as those used in the Minkowski space of special relativity), the basis vectors point in different directions as we move around the space, so we must also consider the derivatives of the basis vectors.
So if we consider a vector V = V^{α}e_{α}, the exact derivative should give us:
In which the index α specifies the basis vector for which the derivative is being taken, the index β denotes the coordinate being varied to induce this change in the αth basis vector, and the index γ identifies the direction in which this component of the derivative points:
The following example in polar coordinates should help make things clearer
This equation should be read as follows: the change in e_{r} caused by a change in θ has zero magnitude in the e_{r} direction, and the change in e_{r} caused by a change in θ varies inversely with distance in the e_{θ} direction.
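This reading can be verified numerically. The sketch below assumes the coordinate (non-normalised) polar basis, e_{r} = (cos θ, sin θ) and e_{θ} = (-r sin θ, r cos θ) in Cartesian components, which is the convention under which the e_{θ} coefficient equals 1/r. It differentiates e_{r} with respect to θ and decomposes the result on the basis; all function names are ours.

```python
import math


def e_r(r, theta):
    """Radial coordinate basis vector (unit length)."""
    return (math.cos(theta), math.sin(theta))


def e_theta(r, theta):
    """Angular coordinate basis vector (not normalised: its length is r)."""
    return (-r * math.sin(theta), r * math.cos(theta))


def d_dtheta(basis, r, theta, h=1e-6):
    """Central-difference derivative of a basis vector with respect to theta."""
    plus, minus = basis(r, theta + h), basis(r, theta - h)
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))


def components(vec, r, theta):
    """Decompose a Cartesian vector on the orthogonal basis (e_r, e_theta)."""
    er, et = e_r(r, theta), e_theta(r, theta)
    comp_r = vec[0] * er[0] + vec[1] * er[1]               # |e_r|^2 = 1
    comp_theta = (vec[0] * et[0] + vec[1] * et[1]) / r**2  # |e_theta|^2 = r^2
    return comp_r, comp_theta


r, theta = 2.0, 0.7
G_r, G_theta = components(d_dtheta(e_r, r, theta), r, theta)
print("component along e_r     :", G_r)      # expected to vanish
print("component along e_theta :", G_theta)  # expected to equal 1/r
```

At r = 2 the e_{θ} component is 0.5, and doubling r halves it, which is the "varies inversely with distance" behaviour described above.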
We recall from our article Geodesic equation and Christoffel symbols that the Christoffel symbol can also be calculated from the transformation between one coordinate system ξ^{α} and another x^{α} as
You can refer to the article Christoffel symbol exercise: calculation in polar coordinates, in which we calculate all the Christoffel coefficients in polar coordinates for a two-dimensional Euclidean space.
Finally, the Christoffel symbols have the following characteristics:
- they are symmetric in the lower indices, i.e. Γ^{γ}_{αβ} = Γ^{γ}_{βα} (that's evident from the above definition)^{[1]}
- at each point of an N-dimensional spacetime, as each of the three indices (lower and upper) can take N values, N x N x N Christoffel symbols are defined.
- in a four-dimensional coordinate system, 4x4x4 = 64 different Christoffel symbols should theoretically be defined, but because of the symmetry of the lower indices, and as there are only n(n+1)/2 = 10 ways to choose a pair among 4 coordinates when the order does not matter, we finally get only 4x10 = 40 distinct values.
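The counting argument in the last item is easy to check by brute force. The sketch below computes the number of independent symbols from the formula and, independently, by enumerating index triples with ordered lower indices; the function names are ours.

```python
def independent_christoffel_count(n):
    """Independent symbols in n dimensions, given symmetric lower indices."""
    symmetric_lower_pairs = n * (n + 1) // 2  # unordered pairs (alpha, beta)
    return n * symmetric_lower_pairs          # one such set per upper index


def count_by_enumeration(n):
    """Same count, by listing (gamma, alpha, beta) with alpha <= beta."""
    return sum(1 for g in range(n) for a in range(n) for b in range(a, n))


for n in (2, 3, 4):
    print(n, n**3, independent_christoffel_count(n), count_by_enumeration(n))
```

For n = 4 both functions give 40 out of the 64 possible index combinations; for the two-dimensional polar example they give 6 out of 8.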
[1] From a more mathematical perspective, these Christoffel symbols, called 'of the second kind', are the connection coefficients (in a coordinate basis) of the Levi-Civita connection, and since this connection has zero torsion, the connection coefficients are symmetric in this basis.
]]>
Another common way of expressing the Lorentz transformation is in matrix form:
Recalling the rules for matrix multiplication we see that:
We have found exactly the same Lorentz transformation equations as described in The Lorentz transformations Part I - Presentation
The two ways of expressing the equations are strictly equivalent.
We can write the Lorentz transformation in an even more compact form by using index notation (see tab Index Notation):
where the indices μ and ν each run over the spacetime dimensions, i.e. from 0 to 3.
So the components of x'^{μ} are (x'^{0}, x'^{1}, x'^{2}, x'^{3}) = (ct', x', y', z')
And those of x^{μ} are (x^{0}, x^{1}, x^{2}, x^{3}) = (ct, x, y, z)
Concerning the matrix,
the μ index refers to the μth row and the ν index refers to the νth column
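To make the row and column convention concrete, here is a minimal sketch in plain Python. It assumes the standard boost along the x axis acting on the column (ct, x, y, z), with β = v/c; the function names are ours. The `apply` function is a literal transcription of the index form: for each row μ it sums over the column index ν.

```python
import math


def boost_matrix(beta):
    """4x4 Lorentz boost along x, acting on (ct, x, y, z)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [g, -g * beta, 0.0, 0.0],
        [-g * beta, g, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def apply(matrix, vector):
    """x'^mu = sum over nu of Lambda^mu_nu x^nu (mu = row, nu = column)."""
    return [sum(matrix[mu][nu] * vector[nu] for nu in range(4)) for mu in range(4)]


def boost_equations(beta, event):
    """The same transformation written as four separate equations."""
    ct, x, y, z = event
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [g * (ct - beta * x), g * (x - beta * ct), y, z]


event = [2.0, 1.0, 3.0, -1.0]  # (ct, x, y, z) in arbitrary length units
print(apply(boost_matrix(0.6), event))
print(boost_equations(0.6, event))
```

Both calls print the same four numbers, which is the sense in which the matrix form and the explicit equations are equivalent.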
]]>
The proper time, denoted by convention τ (tau), is by definition the time measured by an observer in their own rest frame, i.e. the time between two events as measured in a frame where both events occur at the same position.
In the case of a particle, proper time would then be the time given by an imaginary clock strapped to the particle, a kind of 'internal clock'.
We should understand 'proper' here in the sense of 'own' (as in 'property'), not as a synonym of 'correct' (in Special Relativity, as we know, there is no 'correct' observer or reference frame).
Consider Δt as a short lifetime interval of a particle measured in its own rest frame. The spacetime interval between any two positions of the particle in such a frame is given by
Δs^{2} = c^{2}Δt^{2} - Δx^{2} - Δy^{2} - Δz^{2} = c^{2}Δt^{2} - 0^{2} - 0^{2} - 0^{2} = c^{2}Δt^{2}
But as the proper time Δτ is, by definition, the time measured by an observer in their own rest frame, we can say Δτ = Δt and therefore
Δs^{2} = c^{2}Δt^{2} = c^{2}Δτ^{2}
Since the interval is assumed timelike (refer to the article Minkowski Space-Time for the definition), one may take the square root of the above expression
Δs = cΔτ or Δτ = Δs/c
and the proper time interval is defined as the integral along the worldline, Δτ = ∫_{P} dτ = ∫_{P} ds/c,
where P is the worldline from some initial event to some final event
Also, as the spacetime separation of events is an invariant quantity, i.e. it is measured the same by all inertial observers, the quantity c^{2}Δτ^{2} keeps the same value not only when the events occur at the same position, but also when the same pair of events is measured from any other frame R', where they are separated in both time and space
c^{2}Δτ^{2} = Δs'^{2} = c^{2}Δt'^{2} - Δx'^{2} - Δy'^{2} - Δz'^{2}
We have demonstrated in our article Minkowski's Four-Dimensional Space-Time (tab time dilation calculation) that the invariance of this interval gives us the relation between Δτ and Δt'. We have found that:
Δτ = (Δt' / γ)
As γ>1, another way of seeing it is that a process that takes a certain proper time (Δτ, measured by definition in its own rest frame) has a longer duration Δt' when measured by another observer moving relative to the rest frame, i.e. moving clocks run slow.
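The relation between the invariant interval and Δτ = Δt'/γ can be checked with numbers. In the sketch below a clock moves at 0.6c through frame R' for Δt' = 1.25 s; these sample values and the function name are ours.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def proper_time(dt, dx, dy=0.0, dz=0.0):
    """Proper time between two events from their coordinate separations."""
    s2 = (C * dt) ** 2 - dx**2 - dy**2 - dz**2  # invariant interval squared
    if s2 < 0:
        raise ValueError("spacelike interval: no proper time is defined")
    return math.sqrt(s2) / C


beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta**2)
dt_prime = 1.25                  # seconds elapsed in R'
dx_prime = beta * C * dt_prime   # distance covered by the clock in R'

tau = proper_time(dt_prime, dx_prime)
print("from the interval :", tau)
print("from dt'/gamma    :", dt_prime / gamma)
```

Both lines give one second: the moving clock records less time than the 1.25 s measured in R'.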
To summarize,
c^{2}Δτ^{2} = Δs^{2} is an invariant quantity in Minkowski spacetime (which is equivalent to saying that it is unchanged by a Lorentz transformation) => For two inertial frames in uniform rectilinear motion with respect to each other, proper time flows at the same rate in each (the proper clocks age at the same pace)^{[1]}, whereas each observer measures the other's clock as running slow.
You can refer to the article Length contraction use case - Destination Andromeda! for an example of a proper time calculation.
[1] This is no longer true when one frame is accelerating relative to the other: the two proper times are no longer equal, and this is how we resolve the famous Twin paradox.
In General Relativity, as we have shown in the article Gravitational redshift or Einstein effect - Part I, the infinitesimal proper time dτ is defined as follows (by replacing η_{μν} of special relativity with the general metric tensor g_{μν})
In the same way that coordinates can be chosen such that x^{1}, x^{2}, x^{3} = const in special relativity, this can be done in general relativity too. Then, in these coordinates
]]>
As per the considerations of the Equivalence Principle, if we were to describe the motion of an object in the Earth's gravitational field, we would have to follow these steps:
Note: You can refer to the article The Lorentz transformations Part IV - Lorentz transformation matrix (tab index notation) for a gentle introduction to index notation and the Einstein summation convention if you are not familiar with these concepts.
In the freely falling frame, let's name the spacetime coordinates ξ^{α} in index notation, i.e.:
The motion of the free particle is described by a vanishing four-acceleration, so using index notation:
Here τ is the time measured by an observer in their own rest frame, also called the proper time, which is defined for a massive particle.
Let's name x^{μ} the coordinates in the new arbitrary (non-inertial) frame
So applying the chain rule to our initial freefalling equation we get:
Now, the Kronecker delta is defined as 1 only if β = μ and 0 otherwise, which means we can replace the index μ by β in the last term
Remark 1: The geodesic equation in the (accelerating) laboratory frame shows that the particle's motion is no longer a straight line, because some kind of 'inertial force', represented by the term with the Christoffel symbol or connection coefficient, is now acting on it.
At this point, the coordinate transformation does not seem to have anything to do with the gravitational force. However, if our frame in ξ is freely falling in a gravitational field, then in the fixed laboratory frame the inertial force f^{β} is the gravitational force. And the motion of a body in the laboratory can be determined if we know the components of the Christoffel symbol. That's precisely one of the goals of general relativity.
Remark 2: This equation generalizes the notion of a "straight line" to curved spacetime. Actually, we will see in Christoffel symbols in terms of the metric tensor how the Christoffel symbol (at the heart of the gravitational force) can be calculated from the space time metric.
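A toy example may help to see the geodesic equation at work. The sketch below is not a gravitational calculation: it integrates the geodesic equation in the flat Euclidean plane described in polar coordinates, where the only non-zero symbols are Γ^{r}_{θθ} = -r and Γ^{θ}_{rθ} = Γ^{θ}_{θr} = 1/r. The coordinates (r, θ) follow a curved-looking path, yet the curve is an ordinary straight line. The integrator is a basic fourth-order Runge-Kutta scheme, and the function names, step count and initial conditions are our own choices.

```python
import math


def geodesic_rhs(state):
    """Right-hand side of the geodesic equations in plane polar coordinates.

    state = (r, theta, dr, dtheta); derivatives are taken with respect to
    the curve parameter.
    """
    r, theta, dr, dtheta = state
    ddr = r * dtheta**2               # -Gamma^r_{theta theta} * dtheta^2
    ddtheta = -2.0 * dr * dtheta / r  # -2 * Gamma^theta_{r theta} * dr * dtheta
    return (dr, dtheta, ddr, ddtheta)


def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(s, k, f):
        return tuple(a + f * b for a, b in zip(s, k))

    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, h / 2))
    k3 = geodesic_rhs(shifted(state, k2, h / 2))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(
        s + h / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(state, total, steps):
    h = total / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state


# Start at (x, y) = (1, 0), moving along +y with unit speed.
r, theta, _, _ = integrate((1.0, 0.0, 0.0, 1.0), total=1.0, steps=1000)
x, y = r * math.cos(theta), r * math.sin(theta)
print("final position:", x, y)
```

Converted back to Cartesian coordinates, the end point lies on the line x = 1 at y = 1, as expected for uniform straight-line motion; the Christoffel terms only account for the curvilinear coordinates.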
Remark 3: This equation actually represents 4 distinct equations (the index β runs from 0 to 3) of 11 terms each (the 16 terms involving the connection reduce to 10 by symmetry).
Remark 4: from the same geodesic equation, we will show later in The Geodesic equation in the Newtonian Limit how, under specific conditions known as the Newtonian limit, we can derive the direct relation between the metric tensor and the gravitational potential.
Finally, in a later article Geodesic equation from the principle of least action, we will demonstrate how to derive the same geodesic equation from the principle of least action and the Euler-Lagrange equation.
Remark 5: you can follow this demonstration live via this excellent tutorial: everything is worth watching, but the deduction of the geodesic equation from the Equivalence Principle starts at 32:20
]]>
Einstein's journey to Special Relativity was triggered by a simple question: what would it mean if the speed of light were the same for all observers?
The start of his long and tortuous journey to General Relativity began with an equally simple observation: in a gravitational field and in the absence of air resistance, all things fall to the ground at the same rate of acceleration, regardless of their mass. That's it...!
Formulated more precisely, Einstein assumed the equality of inertial mass, which appears in Newton's F = ma and describes how difficult it is to accelerate an object, and (passive) gravitational mass, which describes the strength with which gravity acts on it (this is the so-called weak version of the Equivalence Principle).
Indeed, if we have an object with inertial and gravitational masses m and M, respectively, and if the only force acting on the object comes from a gravitational field g, combining Newton's second law and the law of gravitation (ma = Mg) gives the acceleration a = (M/m) g,
and if we postulate m = M, then it follows that a = g.
After Galileo, a more precise confirmation of the equality of the two masses came from the Austro-Hungarian physicist Loránd Eötvös (to a precision of order 5×10^{−9}).
He measured the torsion on a wire suspending a balance beam between two nearly identical masses, under the acceleration of gravity and the rotation of the Earth. The idea was that if two bodies were fixed to opposite ends of a rod suspended at its centre by a thin wire, and if they had unequal inertial masses but equal gravitational masses, then the attraction of the Earth would be the same for both, but the acceleration corrections would be different: the rod should then twist.
Currently, as the foundation stone of General Relativity, the Weak Equivalence Principle is being probed in space at the 10^{–15} level by the MICROSCOPE satellite mission of ONERA and CNES. It is worth pointing out that observing a deviation from the universality of free fall would imply that Einstein’s purely geometrical description of gravity needs to be completed or amended.
Below is a summary diagram of the tests of the WEP throughout the 20th century:
On the same subject, the video of Apollo 15 commander David Scott, who in 1971 dropped a feather and a hammer in the high vacuum of the lunar surface, is also spectacular and worth watching! (with a funny analysis as a bonus..)
If we are in a laboratory on Earth, a mass that is released will fall, or accelerate, downward due to the gravitational attraction of the Earth with a rate of acceleration g = 9.80665 m/s^{2}.
Now, let us move this laboratory into space, away from the gravitational influence of any planet. We now take the same object and release it within an accelerating rocket whose acceleration a is the same as the Earth's gravitational acceleration g.
In such a situation the rocket will 'push' on the laboratory floor and move this floor towards the 'falling' mass. As far as the observations of the two motions of the mass relative to the floor are concerned, the accelerated motion in the two cases will be exactly the same. If the laboratory had no windows, the observer could not distinguish between an acceleration due to gravity (e.g. on Earth) and an acceleration due to a 'push' from, e.g., a rocket in space^{[1]}.
Conversely, if gravity and acceleration are equivalent, we should be able to 'cancel' the effects of gravity by choosing a frame in free fall: locally, this frame becomes inertial and special relativity holds true in its coordinate system.
In a lift in free fall, provided that an observer makes observations only in a local spacetime neighbourhood, there are no experiments that he can do inside the lift that are able to distinguish between the possibilities that he is plummeting toward Earth or floating in empty outer space.
These small freely falling lifts or frames are referred to as Local Inertial Frames (LIF), and will turn out to be of major importance in the development of general relativity.
Actually, Einstein referred to this 'reduction' of the physics of curved/gravitational spacetime, over small freely falling/flat regions, to the physics of special relativity as "the happiest thought of his life".
Thus Einstein did suggest a sense in which his ideas on gravitation 'generalized' the theory of relativity, i.e a way in which the Principle of Equivalence could generalize the Principle of Relativity.
For if an unaccelerated frame of reference subject to a homogeneous gravitational field is physically equivalent to an accelerated frame, then, at least when there is gravitation, it seemed to Einstein that there is no such thing as absolute acceleration.
Just as the special theory eliminates absolute velocity, its generalization through the principle of equivalence eliminates absolute acceleration.
You can watch this very well made video from Dr Physics A for an introduction to Equivalence Principle
General Relativity: An Introduction  Part 1 of 2
General Relativity: An Introduction  Part 2 of 2
[1] We should note that likening a frame at rest on Earth to an accelerating frame in empty space is another way of disqualifying it as a proper inertial frame. We could also refer to our article Inertial Frame of Reference, where a frame on Earth was already considered non-inertial because of the Coriolis force.
]]>You can already read the Introduction to General Relativity
From now on, we will be writing articles related to both Special and General Relativity theories in parallel.
]]>For those who don't know what an RSS feed is, RSS is a way to syndicate website content for use on other websites and to publish frequently changing content, such as news headlines, forum posts, blog comments, video content and calendar events.
]]>
So far, we have deduced the time dilation factor in 3 different ways:
- by simple geometry and the light clock example: Constant Speed of light - Introduction to Time Dilation and Lorentz factor
- by the invariance of the Minkowski spacetime interval: Minkowski's Four-Dimensional Space-Time
- and finally by using the Lorentz transformation equations: The Lorentz transformations Part III - Measuring-Rods and Clocks in motion
]]>
I am a software engineer with strong interest in science and especially in physics.
I also hold a master's degree in philosophy from University of ParisSorbonne.
You can contact me at cyril@einsteinrelativelyeasy.com
]]>
You can find below a list of resources which I have found very useful for getting familiar with the special and general theories of relativity.
 http://www.einsteinonline.info/ Max Planck Institute's web portal which provides information about Albert Einstein's theories of relativity and their coolest applications, from the smallest particles to black holes and cosmology.
 http://www.phys.vt.edu/~takeuchi/relativity/ Special Relativity Lecture Notes of Department of Physics, College of Science, Virginia Tech.
 Einstein Light: A brief illumination of relativity. Multimedia presentation of the basics of special relativity  includes many helpful animations.
If you are interested in the subject of space and time, and in special relativity, then this is a must read.
Brian Cox and Jeff Forshaw do a superb job of explaining Einstein's famous formula and theories in a way that is as easy and enjoyable to read as possible with very low content in mathematics.
It is mostly focused on special relativity, with the last chapter introducing general relativity.
The book to read once you are familiar with the special theory of relativity. Here we go through the mathematics (to paraphrase Euclid, there is no royal road to relativity: you have to do the mathematics). But we are taken gently by the hand, with even a crash course in foundation mathematics for those with a minimal mathematical background. Chapters 1 to 3 are dedicated to special relativity, the rest of the book exploring general relativity in great detail.
Although unanimously criticized for its bad translation, this book allows one to follow Einstein's actual thought process in arriving at these theories.
Reading the words of the master is an experience that should not be missed, even if sometimes some calculation details are left out during the demonstration.
Hopefully the reader will find some help in our articles, as they are supposed to go step by step.
You can find the book in html and pdf format at this location
 Reflections on Relativity by Kevin Brown Very complete presentation and stimulating thoughts on Relativity
 John Baez' general relativity tutorial : Easygoing introduction to general relativity  including formulae and mathematical concepts
 Introduction to General Relativity by Gerard 't Hooft
A General Relativity Workbook is a textbook intended to support a one-semester undergraduate course on general relativity. It's very well written and Moore has a gift for language that few other scientists have. Through its workbook-based design, it enables us to develop a mastery of both the physics and the supporting tensor calculus.
The mathematics is introduced gradually and in a completely physical context.
A must have!
One small regret however: the unavailability of solutions to the Exercises and/or Problems
The collected papers of Albert Einstein: Princeton University Press presents The Digital Einstein Papers, an open-access site for The Collected Papers of Albert Einstein, the ongoing publication of Einstein's massive written legacy comprising more than 30,000 unique documents.
]]>Note: The consequences of the Lorentz transformations regarding the behaviour of measuring-rods and clocks, as presented in this article, are described by Einstein in Chapter 12 of his book Relativity.
We mention here two additional consequences: the invariance of the spacetime interval and the constancy of the speed of light in different inertial frames.
You can find a synthetic demonstration of all these effects in the following contribution also: Thread about Lorentz transformation
Suppose a metre-rod is placed along the x' axis of R' in such a manner that one end (the beginning) coincides with the point x'=0 whilst the other end (the end of the rod) coincides with the point x'=1. What is the length of the metre-rod relative to the system R?
As we know, we just have to take a snapshot of the rod in our frame R at a given time. Let's choose t=0, as we are now used to.
But the metre-rod is moving with the velocity v relative to R, so the rigid rod is shorter when in motion than when at rest, and the more quickly it is moving, the shorter the rod.
For the velocity v=c we should have √(1 - v^{2}/c^{2}) = 0,
and for still greater velocities the square root becomes imaginary. From this we conclude that in the theory of relativity the velocity c plays the part of a limiting velocity, which can neither be reached nor exceeded by any real body.
For illustration purposes, let's suppose that a frame R' is moving at 0.6 times the speed of light relative to frame R; it is now very easy to calculate by how much the rod is shortened from the point of view of frame R
Let us consider a seconds-clock which is permanently situated at the origin x'=0 of R'.
t'=0 and t'=1 are two successive ticks of this clock.
The first and fourth equations of the Lorentz transformation give for these two ticks :
As judged from R, the clock is moving with the velocity v; as judged from this reference-body, the time which elapses between two strokes of the clock is not one second but γ seconds, i.e. a somewhat longer time.
As a consequence of its motion the clock goes more slowly than when at rest.
Remark: we confirm here that the time dilation factor γ found with the Lorentz transformation is the same as the one we found geometrically in our introductory article about time dilation, Constant Speed of light - Introduction to Time Dilation and Lorentz factor
As we earlier calculated the length contraction for a frame moving at 0.6 times the speed of light, let's now calculate the time dilation observed from the 'fixed' frame R
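Both numerical examples for v = 0.6c fit in a few lines. The sketch below uses the contraction factor √(1 - v^{2}/c^{2}) and its inverse γ; the function names are ours.

```python
import math


def contracted_length(rest_length, beta):
    """Length of a moving rod as measured in R, with beta = v/c."""
    return rest_length * math.sqrt(1.0 - beta * beta)


def dilated_interval(proper_interval, beta):
    """Time between two ticks of a moving clock as measured in R."""
    return proper_interval / math.sqrt(1.0 - beta * beta)


beta = 0.6
print("metre-rod seen from R :", contracted_length(1.0, beta), "m")
print("one-second tick from R:", dilated_interval(1.0, beta), "s")
print("rod at v = c          :", contracted_length(1.0, 1.0), "m")
```

At 0.6c the metre-rod measures 0.8 m and one second on the moving clock lasts 1.25 s in R, while the contraction factor drops to zero as v reaches c.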
To check the correctness of the Lorentz transformations, we should make sure that the second postulate of special relativity, i.e. that the speed of light is the same in all inertial frames, holds true.
If a light-signal is sent along the positive x axis of the reference frame R, its motion will obey the equation
x = ct with c defined as the speed of light
According to the equations of the Lorentz transformation, this simple relation between x and t implies a relation between x' and t'.
Calculating the speed of light in the frame R' is now straightforward
We have just verified that the speed of light remains the same in two reference frames related by the Lorentz equations.
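The same check can be done numerically. The sketch below assumes the standard Lorentz transformation for a boost along x, places an event on the light-signal's path x = ct, transforms it to R' and computes x'/t'; the function names are ours.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz(t, x, v):
    """Standard Lorentz transformation of (t, x) for a boost of velocity v along x."""
    g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return g * (t - v * x / C**2), g * (x - v * t)


def light_speed_in_moving_frame(v, t=1.0):
    """Speed of a light signal x = ct as measured in the frame R'."""
    x = C * t
    t_prime, x_prime = lorentz(t, x, v)
    return x_prime / t_prime


for fraction in (0.1, 0.6, 0.99):
    print(fraction, light_speed_in_moving_frame(fraction * C))
```

Whatever the relative speed of the two frames, the ratio x'/t' comes back as c, up to floating-point rounding.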
We have seen in one of our first articles that the simultaneity of events could depend on the observer's reference frame.
This can now be shown algebraically using the interval transformation rule
Remark 1: The time interval grows as the relative speed and/or the distance increases.
Remark 2: Conversely, for very low speeds (v << c), the time interval tends to 0.
As an exercise, let's now calculate the time interval observed by the cosmonaut between the launches of the two rockets, as seen in our article Constant Speed of light - Introduction to End of Simultaneity
Suppose that the cosmonaut travels at 0.9 times the speed of light (noted c) and that the two rockets are separated by a distance of 1 km = 10^{3} m.
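A minimal numerical sketch of this exercise follows. It assumes the standard transformation of a time interval, Δt' = γ(Δt - vΔx/c^{2}), with the two launches simultaneous in the ground frame (Δt = 0); the sign of the result only tells which rocket is launched first for the cosmonaut and depends on the direction of travel. The function name is ours.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def time_interval_in_moving_frame(dt, dx, beta):
    """dt' = gamma * (dt - v dx / c^2), with beta = v/c."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return g * (dt - beta * dx / C)


dt_prime = time_interval_in_moving_frame(dt=0.0, dx=1.0e3, beta=0.9)
print(f"interval for the cosmonaut: {dt_prime:.4e} s")

# The effect disappears at everyday speeds (here v = 1e-7 c, about 30 m/s):
print(f"interval at low speed     : {time_interval_in_moving_frame(0.0, 1.0e3, 1e-7):.4e} s")
```

With these inputs the two launches are separated by about 6.9 microseconds for the cosmonaut, even though they are simultaneous on the ground.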
]]>
Note: the simple derivation of the Lorentz transformation presented in this article is given by Einstein in Appendix 1 of his book Relativity. We give here a slightly more detailed calculation.
Again we take the hypothesis of two frames R and R' in standard configuration.
We need to find x' and t' when x and t are given, assuming that R' is moving along the x axis relative to R.
A light-signal, which is proceeding along the positive x axis of the frame R, is transmitted according to the equation x = ct, i.e. x - ct = 0
Likewise, since the same light-signal has to be transmitted relative to R' with the same velocity c, the propagation relative to the system R' will be represented by the analogous formula x' - ct' = 0
The only way that the same event could satisfy the two equations is if
x' - ct' = λ(x - ct)   (1)
where λ is a constant, so that the vanishing of (x - ct) implies the vanishing of (x' - ct')
If we now apply quite similar considerations to light rays which are being transmitted along the negative x axis, we obtain the equivalent formula
x' + ct' = μ(x + ct)   (2)
If we now add equations (1) and (2) we get:
Subtracting (1) from (2) gives the equation of ct'
By defining a and b as follows
we can rewrite the two equations simply like
Our problem becomes now to find a and b.
If we look for the motion of the origin of R', given by x'=0, with reference to R, we just have to use the first equation
We can now express the speed of the frame R' in terms of a, b and c (the speed of light).
If we now want to express, from the frame R, the length of a solid stick measuring 1 m in the frame R', we just have to take a snapshot of this stick at a given time, say at t=0 in R. Using the first equation, we get
(equation 3)
Let's do the same exercise from the R' perspective; we need to take a snapshot of the stick at a given time t' in the R' referential. We guess that we have to choose t'=0
(equation 4)
But we know that the stick length in R' seen from R should be equals to its length in R observed from R', hence equation 3 = equation 4
we can then rewrite our two main equations expressing x' and t', and verify that they conform to the Lorentz expression as given in the precedent article The Lorentz transformations Part I  Presentation
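The result of this derivation can be checked numerically. The sketch below assumes the two linear relations of Einstein's Appendix 1, x' = a·x - b·c·t and c·t' = a·c·t - b·x, with a = 1/√(1 - v^{2}/c^{2}) and b = a·v/c; the function names are hypothetical and chosen only for illustration. It verifies that a light signal in R remains a light signal in R', and that the origin of R' moves at speed v in R.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def coefficients(v):
    """Return (a, b), assuming a = gamma and b = gamma * v / c."""
    a = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    b = a * v / C
    return a, b


def transform(x, t, v):
    """Apply x' = a x - b c t and c t' = a c t - b x; return (x', t')."""
    a, b = coefficients(v)
    x_prime = a * x - b * C * t
    ct_prime = a * C * t - b * x
    return x_prime, ct_prime / C


v = 0.6 * C

# A light signal along +x in R satisfies x = c t ...
t_light = 2.0
x_light = C * t_light
xp, tp = transform(x_light, t_light, v)
# ... and must satisfy x' = c t' in R' as well.
light_residual = xp - C * tp

# The origin of R' (x' = 0) is at x = v t in R.
origin_xp, _ = transform(v * 3.0, 3.0, v)

print(light_residual, origin_xp)
```

Both printed values should be zero up to floating-point rounding.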
]]>
Note: the presentation of the Lorentz transformation is given by Einstein in chapter 11 of his book Relativity
Remember that since the beginning, we have been trying to find the relationships between the coordinates of an event in two inertial frames R and R', with R' moving with a velocity v with respect to R.
After having spent time giving some insight into the strange nature of spacetime and of its counterintuitive effects, it is now time to give a more precise algebraic formulation of how coordinates change for different inertial observers.
This set of equations is called the Lorentz transformations, named after the Dutch physicist Hendrik Lorentz (Nobel Prize 1902)
Given the assumptions that R and R' are in standard configuration, i.e:
If an observer in R records an event t, x, y, z, then an observer in R' records the same event with coordinates t', x', y', z' defined as below:
t' = γ(t - vx/c^{2})
x' = γ(x - vt)
y' = y
z' = z
With γ = 1/√(1 - v^{2}/c^{2}) the Lorentz factor, already defined in the previous article Constant Speed of light - Introduction to Time Dilation and Lorentz factor
According to the principle of relativity, there is no privileged frame of reference, so that the transformations from R to R' must take exactly the same form as the transformation from R' to R.
The only difference is that R moves with velocity -v relative to R' (same magnitude but oppositely directed). So if an observer in R' records an event t', x', y', z', then an observer in R records the same event with coordinates
t = γ(t' + vx'/c^{2})
x = γ(x' + vt')
y = y'
z = z'
The reciprocal transformations from each referential relative to the other can be summarised as follows:
Remark 1: Compared to the classical Galilean physics with absolute time, we begin to understand what Minkowski meant by the union of the two concepts.
Remark 2: When v << c, i.e. when the relativistic effects fade away, we can verify that the Galilean transformation can be derived from the Lorentz transformations.
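Both remarks can be explored with a few lines of code. The sketch below assumes the standard form of the Lorentz transformation along x, t' = γ(t - vx/c^{2}) and x' = γ(x - vt), with the inverse obtained by changing v into -v; the function names are illustrative, not taken from any library.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def gamma(v):
    """Lorentz factor for a speed v (|v| < c)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def lorentz(t, x, v):
    """Coordinates (t', x') in R' of an event (t, x) recorded in R."""
    g = gamma(v)
    return g * (t - v * x / C ** 2), g * (x - v * t)


def inverse_lorentz(t_prime, x_prime, v):
    """Same form as lorentz(), with the velocity reversed."""
    return lorentz(t_prime, x_prime, -v)


def galilean(t, x, v):
    """Classical transformation with absolute time."""
    return t, x - v * t


# Round trip R -> R' -> R at 0.8 c gives back the original event.
t, x = 2.0, 5.0e8
t_p, x_p = lorentz(t, x, 0.8 * C)
t_back, x_back = inverse_lorentz(t_p, x_p, 0.8 * C)

# At 30 m/s the Lorentz and Galilean results are practically identical.
t_l, x_l = lorentz(10.0, 1000.0, 30.0)
t_g, x_g = galilean(10.0, 1000.0, 30.0)

print(t_back, x_back)
print(t_l - t_g, x_l - x_g)
```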
]]>An event is something that happens instantaneously at a single point in spacetime, such as a light flashing, or a position on a moving object passing another point.
All events in spacetime are defined using the four coordinates t, x , y, z.
It is often referred to as x^{μ} using index notation, with μ = {0,1,2,3}
The quantity x^{μ} is known as the four-position (it is one example of a four-vector) as it describes an event or position in spacetime using the four components (ct, x, y, z).
We can think of a particle moving through spacetime as a succession of events, and if we link all these events together we obtain a line representing the particle's travel through spacetime. This line is called the particle's world line.
]]>
What to remember from this article
Minkowski spacetime is the most common mathematical structure on which special relativity is formulated. It has the following features:

As already explained in our introduction, the special theory of relativity describes the relationship between physical observations made by different inertial or non-accelerating observers, in the absence of gravity.
Each such observer labels events in spacetime by four inertial coordinates t, x, y, z.
In Newtonian mechanics, events are described using a three-dimensional Euclidean space plus an independent scale of absolute time.
In special relativity (as in general relativity), space and time are fused together into a single four-dimensional entity known as spacetime.
The words of Hermann Minkowski, a German mathematician and also Einstein's professor at Zurich Polytechnic, delivered at the 80th Assembly of German Natural Scientists and Physicians (21 September 1908), and by which he introduced spacetime to the world, are now famous:
"The views of space and time which I wish to lay before you have sprung from the soil of experimental physics, and therein lies their strength. They are radical. Henceforth space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality."
Although we still use the Cartesian coordinates (x, y, z) as spatial coordinates to specify where the event happened plus a time (t) coordinate describing when it happened, we have to move away from the familiar Euclidean space we have become used to using so far. Let's try to see why.
Consider two events in spacetime, let's say the start and the end of the famous cyclosportive La Marmotte, covering a distance of 174.4 km (108.4 mi). Let's suppose that it takes you eight hours to finish the race: what does it mean exactly that, in the Earth's referential, your bike computer has measured the exact distance between Bourg d'Oisans and L'Alpe d'Huez to be 174.4 km, and that you measured the time interval using a clock at Bourg d'Oisans and another, synchronized, one displaying the arrival time in Alpe d'Huez, and found 8 h?
Don't forget that we already know that these two distances, both in space and in time, are not universally agreed upon. Someone able to watch your race from a space rocket would say neither that it took you eight hours to finish the race nor that the total distance of La Marmotte is 174.4 km.
It is as if Einstein had upset the order of things, so that we can no longer rely on space or time separately to build a reliable picture of the universe.
It's certainly time to make a conjecture to save the edifice: if the two concepts of time and space are relative to a moving observer, can we not search for some kind of unification of the two which would not vary between two different referentials?
Let's make the radical conjecture that the distance in spacetime is invariant for all observers; let's suppose that distance is a measure that we can all agree upon.
But how do we visualise a distance in a four-dimensional spacetime? That's almost impossible, even for Stephen Hawking^{[1]}, and we usually reduce our spacetime to a graph of two dimensions, with a vertical time (t) axis and a horizontal spatial (x) axis.
We also have to agree on units, which should be different from the SI ones: given the speed of light, if we were to choose seconds and meters for our two axes, the path of light would be represented by an almost horizontal line.
To get around this problem, we multiply the time in seconds by the speed of light in meters per second and use this quantity ct as the unit for the vertical time axis. One unit of ct is the speed of light (3×10^{8} m s^{-1}) multiplied by the time t ((1/3)×10^{-8} s) that it takes light to travel 1 meter^{[2]}.
With this choice, light covers one unit of x for every unit of ct, and we can now draw the path of a light ray as a line with a slope of 45°.
Any particle moving at a lower speed than light will then have a world line steeper than the 45° one since, for a given position on the x axis, it takes the particle more time than light to reach it. Any particle travelling at half the speed of light will have a world line twice as steep as that of light.
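The relation between speed and steepness can be made concrete. In a diagram with x on the horizontal axis and ct on the vertical axis, a particle moving at constant speed v has a world line of slope d(ct)/dx = c/v. The short sketch below uses this relation, with the speed expressed as a fraction beta = v/c; the function names are illustrative.

```python
import math


def worldline_slope(beta):
    """Slope d(ct)/dx of the world line of a particle with v = beta * c."""
    return 1.0 / beta


def angle_from_x_axis_deg(beta):
    """Angle between the world line and the x axis, in degrees."""
    return math.degrees(math.atan(worldline_slope(beta)))


for beta in (1.0, 0.5, 0.1):
    print(beta, worldline_slope(beta), round(angle_from_x_axis_deg(beta), 2))
```

Light (beta = 1) gives a slope of 1, i.e. the 45° line, while beta = 0.5 gives a slope of 2; slower particles approach the vertical ct axis.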
Now let's go back to our initial question to figure out how to calculate the distance in our Minkowski spacetime.
Say that we naively take the formula which holds true in our well-known Euclidean space: in our two-dimensional context, we would write this as S^{2} = x^{2} + c^{2}t^{2}, which is also the equation of the circle of radius S and center O.
In other words, for two given events A and O which happened in spacetime, two observers in relative motion could disagree on the position on the x axis and on the t axis at which each event occurs, but if our hypothesis that the spacetime distance remains the same holds true, then we must also accept the constraint that each event A' (seen by observer 2) or A'' (seen by observer 3) lies somewhere on the circumference of the circle of radius S.
Consider the event A'' : it is on the circle so obviously sits at an equal distance from the event O as the event A.
But there is a big difference: A'' has happened in the past of A, as it sits on the negative part of the time axis.
If we think about it a little, something is wrong here: if we consider again our example of the Marmotte race and define the two events O and A respectively as the start and the end of the race, we can concede that two observers could see each event at a different time and a different position, but are we ready to exchange the order of the two events?
If we were to accept this Euclidean formulation of the distance, it would mean that some inertial observer could see the arrival of the race before it has even started! The basic causality relation would then be broken, which is quite annoying!
Let's try to see what it would look like to transform the + sign into a - sign, i.e. to imagine a spacetime mathematically described by S^{2} = x^{2} - c^{2}t^{2}
Pictorially, the locus of points whose squared distance from the origin is ±1 consists of the two hyperbolas labeled +1 and -1 in the figure below:
In this spacetime, we can observe that the events situated at a given distance from the origin along the time direction lie either all on the upper hyperbola (t positive) or all on the lower hyperbola (t negative), which marks real progress compared to our circle: no change of observer can move an event from the future of O to its past.
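A quick numerical experiment illustrates why the minus sign is the right choice. The sketch below assumes the standard Lorentz boost along x, written in units where time is measured as ct and the speed as beta = v/c, and compares the two candidate distances before and after a change of observer; the function names are illustrative.

```python
import math


def boost(ct, x, beta):
    """Coordinates (ct', x') of an event for an observer moving at beta * c."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return g * (ct - beta * x), g * (x - beta * ct)


def minkowski_s2(ct, x):
    """Candidate distance with the minus sign: S^2 = x^2 - (ct)^2."""
    return x ** 2 - ct ** 2


def euclidean_s2(ct, x):
    """Naive candidate with the plus sign: S^2 = x^2 + (ct)^2."""
    return x ** 2 + ct ** 2


ct, x = 5.0, 3.0                   # event A as seen by the first observer
ct2, x2 = boost(ct, x, 0.6)        # same event for a second observer

print(minkowski_s2(ct, x), minkowski_s2(ct2, x2))  # same value twice
print(euclidean_s2(ct, x), euclidean_s2(ct2, x2))  # two different values
print(ct > 0, ct2 > 0)             # the event stays in the future of O
```

The quantity with the minus sign is the same for both observers, the one with the plus sign is not, and the sign of ct is preserved, so the event stays on the same branch of the hyperbola.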
Another great feature of Minkowski diagrams is that they let us visualize quite easily the time dilation effect between two inertial frames.
Consider the diagram below with two inertial frames named Ziga and Ranja, with coordinate axes (ct, x) and (ct', x'). We want to find the (coordinate) elapsed time between two events.
Let those two events be the origin and event A (ct = 1, x = 0) in the Ziga referential. To find the coordinate elapsed time in the primed/Ranja coordinate system, draw a line from A, parallel to the x axis: this is the line of simultaneity for Ziga for all events happening at ct = 1. This line crosses the ct' axis of Ranja at the event marked A'. You can clearly see that OA' = Δt' < 1', so from Ziga's perspective, time in Ranja runs slower.
The Lorentz factor or time dilation factor as experienced from Ziga's perspective in the moving referential Ranja can then be visualized as the distance between the projections of A' and B' on the Ziga axis.
The apparently paradoxical reciprocal proposition, i.e. that time in Ziga runs slower from Ranja's perspective, can be visualized in exactly the same way.
Let's define the event B' happening at ct' = 1 and x' = 0. To find the corresponding event in the unprimed Ziga referential, we have to draw the tangent to the hyperbola at this point: this line defines the line of simultaneity in the primed coordinate system. We have marked this event as B in the diagram below. Again, and paradoxically, we can see that OB < 1 in the Ziga referential, which means that for a Ranja observer, time runs slower in the Ziga referential than in its own.
[2] Using ct units of time means we are measuring time in metres.
In the previous tab, we have seen how the spacetime interval ds^{2} = c^{2}Δt^{2} - Δx^{2} - Δy^{2} - Δz^{2} led us to the hyperbola spacetime diagram, where time dilation γ could be graphically represented.
Here we will use the invariance of this spacetime interval for all observers to calculate the exact value of γ. Hopefully we should find the same value as the one found geometrically in the previous article Constant Speed of light - Introduction to Time Dilation and Lorentz factor
Let's suppose that a person Z is at rest in the Ziga referential, sitting in a train moving with horizontal speed v relative to the Ranja referential.
As Z is at rest in the Ziga referential, the values Δx, Δy and Δz are all equal to zero, and the spacetime distance that Z has traveled during the time interval Δt in the Ziga referential is therefore ds^{2} = c^{2}Δt^{2}. The time measured by an observer using their own clock is called proper time.
Now let's consider the same journey from Ranja's perspective. For a person R belonging to the Ranja referential, person Z has traveled a distance v×Δt', where Δt' is the time interval Δt as seen from Ranja's perspective. So in the Ranja referential, person Z has traveled a spacetime distance ds'^{2} = c^{2}Δt'^{2} - Δx'^{2} = c^{2}Δt'^{2} - (v×Δt')^{2}.
The crucial point here is that both R and Z should agree on the spacetime distance of the journey.
That means that ds^{2} = ds'^{2}, or also that c^{2}Δt^{2} = c^{2}Δt'^{2} - (v×Δt')^{2}.
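The step from this equality to the time dilation formula can be written out explicitly. The short worked derivation below uses only the quantities already introduced (Δt measured in Ziga, Δt' measured in Ranja, relative speed v), with γ denoting the Lorentz factor:

```latex
\begin{aligned}
c^{2}\,\Delta t^{2} &= c^{2}\,\Delta t'^{2} - v^{2}\,\Delta t'^{2} \\
\Delta t^{2} &= \Delta t'^{2}\left(1 - \frac{v^{2}}{c^{2}}\right) \\
\Delta t' &= \frac{\Delta t}{\sqrt{1 - v^{2}/c^{2}}} = \gamma\,\Delta t
\end{aligned}
```

Since γ is always greater than or equal to 1, the interval Δt' measured in Ranja is never shorter than the proper time Δt measured by Z.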
We can check that we have found exactly the same formula for time dilation as the one which emerged by thinking about light clocks in the previous article.
]]>
When Einstein first hit upon special relativity, he thought one effect had special importance, so much so that it fills the first section of his "On the Electrodynamics of Moving Bodies." It is the relativity of simultaneity - see Kinematical Part - Definition of simultaneity.
According to it, inertial observers in relative motion would disagree on the timing of events at different places.
Let's try to illustrate this with the following scenario, first in a classical Newtonian (non-relativistic) context.
In the exact middle of a train coach, a speaker has been set up to emit at time t = t' = 0 a sound wave travelling with speed v towards each end of the coach.
Suppose that a referential R is attached to the ground, with the origin O located at time t = 0 at the middle of the coach, i.e. at the place of the speaker.
Suppose also that a referential R' is attached to the coach in a standard configuration, which means:
If the train is at rest in frame R, i.e. the velocity u = 0, and if L is the length of the coach, we deduce immediately that the time T_{a} at which the observer in R hears the sound reaching the left side A of the coach and the time T_{b} at which the observer in R hears the sound reaching the right side B of the coach are given by:
T_{a} = T_{b} = L/(2v)
We now suppose that the train is moving towards the right at a constant speed u relative to the observer on the platform.
The time T_{b} at which the observer on the platform will hear the sound reaching point B of the coach is given by the following equation:
(v + u) T_{b} = L/2 + u T_{b}, hence T_{b} = L/(2v)
Let's now calculate the time T_{a} at which the observer on the platform will hear the sound reaching the left side A of the coach:
(v - u) T_{a} = L/2 - u T_{a}, hence T_{a} = L/(2v)
In both situations, the train at rest or in movement, the observer in the fixed referential will hear the sound hitting both sides of the coach at exactly the same time.
The left sound has less distance to travel to reach point A as the coach moves towards it, but this is compensated by its lower speed v - u, as heard from the platform.
Conversely, the right sound has further to travel as the coach moves away from it, but this is cancelled by its faster speed u + v, as heard from the platform.
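This compensation can be checked with a small numerical sketch. It assumes classical (Galilean) addition of velocities: seen from the platform, the sound travels at v + u towards the right and at v - u towards the left, while both ends of the coach move to the right at u. The function name and the sample values are illustrative only.

```python
import math


def arrival_times(length, v_sound, u_train):
    """Times at which the sound reaches ends A (left) and B (right),
    as measured from the platform, using Galilean velocity addition."""
    half = length / 2.0
    # Right end B recedes at u while the sound chases it at v + u.
    t_b = half / ((v_sound + u_train) - u_train)
    # Left end A approaches at u while the sound moves left at v - u.
    t_a = half / ((v_sound - u_train) + u_train)
    return t_a, t_b


# Coach of 20 m, sound at 340 m/s, train at rest and then at 30 m/s.
print(arrival_times(20.0, 340.0, 0.0))
print(arrival_times(20.0, 340.0, 30.0))
```

In both cases the two times are equal to L/(2v): in the Newtonian picture, simultaneity does not depend on the motion of the train.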
On an orbital station, it has been agreed that two rockets should take off exactly at the time when they both receive a light signal emitted by a control tower located at the same distance from each of them.
Inertial frame of reference = Surface of the planet
As the light travels the same distance at the same speed, the light would reach the two rockets simultaneously, and the observer watching this scene from the ground would see the two rockets taking off exactly at the same time, as expected.
Inertial frame of reference = Spaceship travelling at constant high speed v to the right
Now suppose that we choose as a new referential frame an observer piloting a spaceship travelling with a very high and constant speed to the right.
The cosmonaut will still see the two light signals leaving the control tower at exactly the same time (two events happening at the same time at the same place in an inertial frame of reference will always be seen as simultaneous in any other frame of reference).
But as the spaceship is travelling to the right, the planet and the orbital station are, relative to it, shifting to the left. The right rocket is therefore moving towards the point where the light was emitted, whereas the left rocket is moving away from it.
The right ray of light will then have less distance to travel to reach the right rocket, and recalling the second postulate of Special Relativity that the speed of light is constant in any inertial frame, the right light ray will reach the right rocket first.
Only a tiny bit later (we will see in another article that this time interval depends both on the speed of the spaceship and on the distance L between the two rockets) will the pilot see the left ray of light hitting the left rocket.
Therefore, from the spaceship point of view, the takeoff of the two rockets is not simultaneous.
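The reasoning can be turned into numbers. The sketch below works entirely in the spaceship frame and relies on two assumptions that go beyond this article: light travels at c in both directions, and the separation between the rockets is length-contracted to L/γ in that frame. The function name and the sample values (0.9 c, 1 km) are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def arrival_times_in_ship_frame(beta, rest_separation_m):
    """Times at which the light reaches the right and left rockets,
    measured in the frame of a spaceship moving right at beta * c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    v = beta * C
    # Assumption: each rocket starts half a contracted separation away
    # from the point where the light is emitted.
    half = rest_separation_m / (2.0 * gamma)
    t_right = half / (C + v)  # right rocket moves towards the emission point
    t_left = half / (C - v)   # left rocket moves away from it
    return t_right, t_left


t_right, t_left = arrival_times_in_ship_frame(0.9, 1.0e3)
delay = t_left - t_right
print(t_right, t_left, delay)
```

The right rocket is reached first, and the delay grows with both the speed of the spaceship and the distance between the rockets.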
This shows that events that appear simultaneous in one reference frame may not be so in another, and that what we perceive as the present only corresponds to what is occurring simultaneously to us, in our reference frame.
]]>The coordinate system from which an observer takes measurements of events in space and time (classical mechanics) or in spacetime (special and general relativity) is called a frame of reference.
Within the realm of Newtonian mechanics, an inertial frame or inertial reference frame, is one in which Newton's first law of motion is valid.
Newton's first law:
An object will remain at rest or in uniform motion in a straight line unless acted upon by an external force. 
So then you might be wondering, when could Newton's first law ever not appear to be true?
Imagine you are on Earth and assume that you could remove both friction and air resistance.
Now hit a ball gently so it moves slowly along a perfectly smooth road with uniform velocity. If we were in a strict inertial frame, the ball would move in a straight line along the road, but it's not the case: due to the Earth's rotation, the ball's path is ever so slightly curved.
This effect, which deflects moving objects on the surface of the Earth to the right (with respect to the direction of travel) in the Northern Hemisphere and to the left in the Southern Hemisphere, is known as the Coriolis effect; along with the centrifugal force, it is precisely what makes the rotating Earth a non-inertial frame^{[1]}.
More generally, apparent forces which are not caused by any physical interaction but are due to an observer using a non-inertial frame of reference (rotating or accelerating) are known as inertial or fictitious forces^{[2]}.
It leads to our second, more general definition of an inertial frame of reference:
Inertial frame of reference:
The inertial frame of reference is the one where the fictitious or inertial forces vanish. In other words, the laws of physics in the inertial frame are simpler because unnecessary forces are not present. 
An inertial frame obeys the following properties:
Note 1: In special relativity, inertial frames are known as Lorentz frames. In common with classical Newtonian inertial frames, they obey Newton's first law, but they differ in how they deal with gravity.
Unlike Newtonian inertial frames which treat gravity just like any other force, Lorentz frames can only be constructed in flat spacetime, known as Minkowski spacetime, one which is precisely not curved by the presence of mass/energy.
Note 2: As we will see in the General Relativity course, the gravity field is equivalent to a non-inertial, uniformly accelerating frame, and the Earth's surface can be considered as accelerating upwards.
In this sense, the freely falling apple and not the Earth must constitute an inertial frame. In our article Geodesic equation and Christoffel symbols we show how gravity appears as an additional force due to the non-uniform relative motion of two reference frames (the free-falling inertial frame of reference and the upward-accelerating Earth frame). You can also read this definition of an Inertial Observer.
Note 3: concerning the Earth taken as referential, one should distinguish between the Earth-Centered Inertial (ECI) frames, which have their origins at the center of mass of the Earth but do not rotate with it, and so can be considered as inertial, and the Earth-Centered, Earth-Fixed (ECEF) frames, which rotate with the surface of the Earth
[1] You can find here a great video about the Coriolis effect.
[2] Another commonly experienced example of inertial force would be the one that pushes you to the back of the seat in an accelerating car.
]]>Before the advent of general relativity, Newton's law of universal gravitation had been accepted for more than two hundred years as a valid description of the gravitational force between masses, even though Newton himself did not regard the theory as the final word on the nature of gravity.
However, the classical Newton's theory suffered from two major inconsistencies:
From this last point of view, General Relativity could be seen as an attempt to reconcile gravity and special relativity. It took Einstein a journey of no less than eight years of hard conceptual and mathematical^{[2]} work before he finally succeeded in resolving these difficulties.
The first step was to postulate the Equivalence Principle, introduced in 1907 in the long review article on relativity commissioned by the editor of the Jahrbuch der Radioaktivität.
[1] You can watch this nice video for more information about the Mysterious Orbit of Mercury.
[2] In his paper The Foundation of the Generalised Theory of Relativity, Einstein paid special tribute to his friend Grossmann for his priceless help in these terms: "Finally in this place I thank my friend Grossmann, by whose help I was not only spared the study of the mathematical literature pertinent to this subject, but who also aided me in the researches on the field equations of gravitation."
]]>
What is Special Relativity?
It's a theory proposed by Albert Einstein in 1905 in the paper "On the Electrodynamics of Moving Bodies"^{[1]}, that corrects Newton's laws of mechanics in situations involving motions approaching the speed of light.
The problem Einstein is trying to resolve is conceptually quite simple: "How are events that happen in space and time measured in different frames of reference moving in a constant motion relative to each other, in the absence of gravity?".
Einstein based his theory on two fundamental postulates:
These two fundamental assumptions, and in particular the revolutionary one, i.e. the constancy of the speed of light, have the following drastic and counterintuitive consequences when relative velocities are close to the speed of light.
The theory is 'special'^{[3]} in that it only applies in the case where spacetime is flat, and not curved by the presence of mass/energy.
The equivalence of space time curvature and gravity will be established by Einstein himself in the General Relativity Theory formulated 10 years later, in 1915.
[1] "Zur Elektrodynamik bewegter Körper" under its original German title.
[2] The exact value of the speed of light has been established since 1983 as 299792458 metres per second.
[3] In 1905, Einstein did not mention the expression "special theory of relativity" but only the "Principle of Relativity". The expression "theory of relativity" is credited to the physicist A.H Bucherer. The adjective "special" was retrospectively added in 1915, after Einstein published his theory of General Relativity.
]]>"The special theory of relativity owes its origin to Maxwell's equations of the electromagnetic field. Conversely, the latter can be grasped formally in satisfactory fashion only by way of the special theory of relativity." Einstein, written in 1946 for "Autobiographical Notes" 
Newtonian mechanics assumes time and space to be unrelated absolutes^{[1]}, and is fundamentally based on the principle of relativity first enunciated by Galileo Galilei in 1632 in his Dialogue Concerning the Two Chief World Systems. In its modern formulation, this principle states that the laws of mechanics are the same for all inertial observers, or from a more theoretical point of view, that these laws are invariant under Galilean transformations. There is no special concern about the speed of light.
In 1865, in his Dynamical Theory of the Electromagnetic Field, the Scottish scientist James Clerk Maxwell (1831-1879) formulated the theory of classical electromagnetism, bringing together for the first time electricity, magnetism and light as the same phenomenon.
In Maxwell's equations as formulated by Lorentz in 1895, only in one "aether" frame is the speed of light constant in all directions and independent of the speed of the source. In all other inertial frames, the equations are not the same, meaning that the Maxwell equations are not invariant under Galilean transformations.
One of the two theories, the old respectable Newtonian mechanics or the more recent Maxwell's electromagnetism, then had to be wrong.
Einstein, who had boundless admiration for the elegance of these four differential equations, took this constancy of the speed of light as a postulate, generalized to all inertial frames, and abandoned the aether hypothesis. For this, he had to replace the Galilean transformations with a new set of equations called the Lorentz transformations, which leave the speed of light the same when we switch between different inertial frames, as required by the second postulate. But with a terrible price to pay: distances in space and intervals in time do change under Lorentz transformations!
It was all the easier for Einstein to rule out this stationary aether, as the so-called Michelson-Morley experiment performed over the summer of 1887 failed to detect it^{[2]}.
In 1912 Einstein wrote:
"It is impossible to base a theory of the transformation laws of space and time on the principle of relativity alone. As we know, this is connected with the relativity of the concepts of "simultaneity" and "shape of moving bodies." To fill this gap, I introduced the principle of the constancy of the velocity of light, which I borrowed from H. A. Lorentz’s theory of the stationary luminiferous ether, and which, like the principle of relativity, contains a physical assumption that seemed to be justified only by the relevant experiments (experiments by Fizeau, Rowland, etc.)"
"Einstein's inspiration on the road to special relativity was the mathematical beauty of Maxwell's equations, which impressed him to such a degree that he decided to take seriously the prediction that speed of light is a constant" (Extract from Why does E=mc2 from Brian Cox and Jeff Forshaw)
You can read an interesting thread with the following subject: Was Einstein's postulate for the speed of light a consequence of Maxwell's equations?
An interesting contextualization to Einstein's special theory can be found here: Spacetime of special relativity
[1] The term absolute means two different things here: space and time are the same for every observer, but also space and time are not affected by the presence of matter or energy. The first aspect is the one which will be called into question by Special Relativity theory whereas the second aspect will be examined by the General Relativity.
[2] The extent to which the null result of the Michelson–Morley experiment influenced Einstein is disputed. Alluding to some statements of Einstein, many historians argue that it played no significant role in his path to special relativity, while other statements of Einstein probably suggest that he was influenced by it.
]]>"Spacetime tells matter how to move; matter tells spacetime how to curve" John Archibald Wheeler
What is General Relativity?
It is the theory that Einstein developed starting from 1907 in order to incorporate the gravitational field within the framework of special relativity, and which he presented in its final version to the Prussian Academy of Science on 25th November 1915 in the paper The Field Equations of Gravitation.
In much the same way as the theory of special relativity was grounded on two postulates, that of relativity and that of invariant light speed in vacuum, Einstein based his theory of general relativity on two fundamental postulates:
Let us imagine an observer in free fall, measuring events in a small neighbourhood: according to the Principle of Equivalence, the gravity is cancelled out, the frame attached to this observer reduces locally to an inertial frame, and all the electromagnetic phenomena still obey the laws of Special Relativity.
In this new context, gravity will not be seen as a 'force' exerted on a massive body, but rather as the differential gravitational acceleration of nearby free-falling local inertial observers^{[2]}.
It turns out that this relative acceleration could be interpreted as the curvature of the spacetime, and so the final result of Einstein's general relativity is an equation known as Einstein's equation, which is able to quantify exactly how much warping there should be in the presence of matter and energy.
To put it in a nutshell, what Newton calls gravitation is called curvature of spacetime by Einstein, and the heart of Einstein's work was finding the correct relationship in the form of A = kB, where A describes the curvature of spacetime, and B on the right hand side describes the mass-energy source of that curvature^{[3]}.
In his article from 1916 The Foundation of the Generalised Theory of Relativity, using approximate solutions of his equations, Einstein proposed three tests of general relativity which were not predicted by Newton's theory of gravity and were observable in the solar system^{[4]}:
To date, these three predictions have been confirmed by experiments, and whenever the predictions of Einstein have been found to differ from the ideas of Newtonian mechanics, Nature has chosen Einstein's: space–time ceases to be an absolute, non-dynamical framework as envisaged by the Newtonian view, and instead becomes a dynamical structure that is deformed by the presence of mass-energy.
Therefore, a century after its inception, General Relativity has established itself as the standard theoretical description of gravity, with applications ranging from the Global Positioning System and the dynamics of the solar system, to the realm of galaxies and the primordial universe.
[1] That is why Einstein applied the word General to this extension of his theory of relativity because this new theory would not be restricted to the nonaccelerating reference frames of Special Relativity.
[2] From the point of view of someone standing, for example, on the Earth's surface, the relative acceleration corresponds to the difference of g values at two nearby locations. In the classical Newtonian context, this relative acceleration would be referred to as tidal forces.
[3] These equations represent a set of Partial Differential Equations, where the object G_{μν} on the left hand side, called the Einstein tensor, is a mathematical object (a tensor) with 10 independent components, and where the object T_{μν}, called the energy-momentum tensor, describes the source of this curvature.
[4] Two other, later predictions of general relativity were the existence of gravitational waves (1916) and the existence of black holes, but Einstein's position regarding these two predictions is a little more complex, as:
 even though Einstein predicted the existence of gravitational waves from the equations of General Relativity, he was convinced that they were too weak to be of physical significance and that they would never be discovered.
 Einstein himself did not believe in the existence/reality of black holes: he tried to demonstrate it in an article published in 1939 in the journal Annals of Mathematics, On a Stationary System with Spherical Symmetry Consisting of Many Gravitating Masses