# Lorentz transform: A Derivation and Commentary

The Lorentz transform and the Postulate of Relativity form the basis for the Special Theory of Relativity, which was the first major advancement in understanding our physical world since Newton.

On a personal level, when my daughter was born, I decided that there were only two concepts which a civilized person needed to know. One was a proper proof of the Pythagorean Theorem and the other was a solid derivation of the Lorentz transform. Middle schools still don’t give a proper proof of the Pythagorean Theorem, and most people don’t have a clue about the Lorentz transform. But these two statements are central to any rigorous understanding of our physical world.

I have worked with my daughter on the Pythagorean Theorem, but this paper will be my prelude to the Lorentz transform.

Prerequisites:

There is surprisingly little that one needs to know in order to get a good basic understanding of the Lorentz transform. Anyone who has taken a college prep course which includes Algebra II should have sufficient mathematical skills. One might even be able to get by with Algebra I, assuming that they are reasonably good at math.

Terminology:

The only terminology that the reader may need to understand is the term Reference Frame.

If you, or anyone you know, have a watch and a ruler then you can establish your own reference frame. You can lay out a Cartesian coordinate system with your ruler and measure the elapsed time of events taking place in that coordinate system with your watch.

Large Assumption*:
The speed of light is a constant in all reference frames.

This may sound simple. But it is not. The more you understand the consequences of this statement the less intuitive it becomes. The problem starts to appear once you consider moving reference frames.

Consider the following diagram:

Here the rectangular box is assumed to be moving at a velocity of v along the x axis of the Reference frame. The yellow ball is assumed to be moving with a different velocity, w, along the x axis of the Reference frame. For now the reader might consider the rectangular box to be a car and the yellow ball a motorcycle.

A natural question to ask is how fast the ball is traveling relative to an observer in the box. A normal answer would be w - v.

But light is a different kind of entity. If we consider the yellow ball (motorcycle) to be moving at the speed of light, c, then the observer in the box would still measure its speed to be c, not c - v. More bizarre yet, it would not matter how fast the rectangular box was moving. Even if it were moving at 95 percent of the speed of light, as measured by the Reference frame, the answer would be the same! Basically, one can never catch up with light.

This means that we have to rethink our answer concerning the relative speed of the rectangular box and the yellow ball. We cannot use the answer w - v.

How can we describe a moving reference frame in terms of a Reference frame at rest?

Here we start by assuming that there are 4 dimensions that describe our Rest frame. The 3 space dimensions and one time dimension. We use the normal x, y, and z coordinates to denote spatial dimensions and t to denote the time coordinate. Correspondingly, we will denote x’, y’, z’, and t’ as the coordinates in the moving Reference frame.

Additionally, we will assume that x’, y’, z’, and t’ are defined only in terms of x, y, z, and t.

Mathematically this means that x’ = f[size=85]1[/size](x, y, z, t), y’ = f[size=85]2[/size](x, y, z, t), z’ = f[size=85]3[/size](x, y, z, t), and t’ = f[size=85]4[/size](x, y, z, t) for some functions f[size=85]1[/size], f[size=85]2[/size], f[size=85]3[/size], and f[size=85]4[/size].

But we don’t know what these functions are. So the question becomes how can we get a better idea of what these functions are?

The following derivation almost exactly duplicates pages 56 through 60 of the textbook “Introduction To Special Relativity” by Robert Resnick.

We will assume that the moving Reference frame moves with a nonzero speed v along a common x-x’ axis, and that t = t’ = 0 at the instant when the origins coincide.

To help visualize this, consider the following diagram:

Figure 1:

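Because we don’t want space to be lumpy (Resnick refers to this as the homogeneity assumption) [Assumption 1], we will want all the variables to be of the first degree. It should be noted that this is a requirement of Special Relativity, where the reference frames experience no “pulls” or “spins”. In General Relativity this is not the case, but here we treat the simpler case of Special Relativity. So we will write:

x’ = a[size=85]11[/size]x + a[size=85]12[/size]y + a[size=85]13[/size]z + a[size=85]14[/size]t
y’ = a[size=85]21[/size]x + a[size=85]22[/size]y + a[size=85]23[/size]z + a[size=85]24[/size]t
z’ = a[size=85]31[/size]x + a[size=85]32[/size]y + a[size=85]33[/size]z + a[size=85]34[/size]t
t’ = a[size=85]41[/size]x + a[size=85]42[/size]y + a[size=85]43[/size]z + a[size=85]44[/size]t

where each a[size=85]ij[/size] is a constant coefficient, i represents the i-th row, and j represents the j-th column.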

The following is exactly quoted from Resnick’s book except for my square bracketed blue insertions.

Resnick writes:

“If the equations were not linear, we would violate the homogeneity assumption [Assumption 1]. For example, suppose that x’ depended on the square of x, that is, as

x’ = a[size=85]11[/size]x²

Then the distance between two points in the primed frame would be related to the location of these points in the unprimed frame by

x[size=85]2[/size]’ - x[size=85]1[/size]’ = a[size=85]11[/size](x[size=85]2[/size]² - x[size=85]1[/size]²).

Suppose now that a rod of unit length in S had its end points at x[size=85]2[/size] = 2 and x[size=85]1[/size] = 1; then x[size=85]2[/size]’ - x[size=85]1[/size]’ = 3a[size=85]11[/size]. If, instead, the same rod happens to be located at x[size=85]2[/size] = 5 and x[size=85]1[/size] = 4, we would obtain x[size=85]2[/size]’ - x[size=85]1[/size]’ =9a[size=85]11[/size]. That is, the measured length of the rod would depend on where it is in space. Likewise, we can reject any dependence on t that is not linear, for the time interval of an event should not depend on the numerical setting of the hands of the observer’s clock. The relationship must be linear then in order not to give the choice of origin of our space-time coordinates (or some other point) a physical preference over all other points.

How then do we determine the values of these sixteen coefficients? Basically, we use the postulates of relativity, [Einstein’s postulates] namely (1) The principle of Relativity - that no preferred inertial system exists, the laws of physics being the same in all inertial systems - and (2) The Principle of the Constancy of the Speed of Light - that the speed of light in free space has the same value c in all inertial systems…”
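The rod example in the quoted passage can be checked with a few lines of Python. This is only an illustrative sketch: the function name `primed_length` and the choice a11 = 1 are assumptions made here, not anything taken from Resnick.

```python
def primed_length(x1, x2, a11=1.0, power=2):
    """Length x2' - x1' under the hypothetical law x' = a11 * x**power."""
    return a11 * x2**power - a11 * x1**power

# Quadratic law: the same unit rod gets a different primed length
# depending on where it sits in S.
print(primed_length(1, 2))  # 3.0, i.e. 3 * a11
print(primed_length(4, 5))  # 9.0, i.e. 9 * a11

# Linear law: the primed length no longer depends on position.
print(primed_length(1, 2, power=1))  # 1.0
print(primed_length(4, 5, power=1))  # 1.0
```

With the quadratic law the measured length changes with location, which is exactly the lumpiness that the homogeneity assumption rules out.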

To simplify further Resnick writes:

“The x-axis coincides continuously with the x’-axis. This will be so only if for y = 0, z = 0 (which characterizes points on the x-axis) it always follows that y’ = 0, z’ = 0 (which characterizes points on the x’-axis). Hence the transformation formulas for y and z must be of the form

y’ = a[size=85]22[/size]y + a[size=85]23[/size]z and z’ = a[size=85]32[/size]y + a[size=85]33[/size]z

That is, the coefficients a[size=85]21[/size], a[size=85]24[/size], a[size=85]31[/size], and a[size=85]34[/size] must be 0. Likewise, the x-y plane (which is characterized by z = 0) should transform over to the x’-y’ plane (which is characterized by z’ = 0); similarly, for the x-z and x’-z’ planes, y = 0 should give y’ = 0. Hence it follows that a[size=85]23[/size] and a[size=85]32[/size] are zero so that

y’ = a[size=85]22[/size]y and z’ = a[size=85]33[/size]z.

These remaining coefficients, a[size=85]22[/size] and a[size=85]33[/size], can be evaluated using the relativity postulate [Assumption 2]. We illustrate for a[size=85]22[/size]. Suppose that we have a rod lying along the y-axis, measured by S to be of unit length. According to the S’ observer the rod’s length will be a[size=85]22[/size], (i.e., y’ = a[size=85]22[/size] x 1). Now suppose that the very same rod is brought to rest along side the y’ axis of the S’-frame. The primed observer must measure the same length (unity) for this rod when it is at rest in his frame as the unprimed observer measures when the rod is at rest with respect to him; otherwise there would be an asymmetry in the frames. In this case, however, the S-observer would measure the rod’s length to be 1/a[size=85]22[/size] (i.e., y = (1/a[size=85]22[/size]) x y’ = (1/a[size=85]22[/size]) x 1). Now, because of the reciprocal nature of these length measurements, the first postulate requires that the measurements be identical, for otherwise the frames would not be equivalent physically. Hence, we must have a[size=85]22[/size] = 1/a[size=85]22[/size] or a[size=85]22[/size] = 1. The argument is identical in determining that a[size=85]33[/size] = 1. Therefore, our middle transformation equations become

y’ = y and z’ = z. (2-2)

There remain transformation equations for x’ and t’, namely,

x’ = a[size=85]11[/size]x + a[size=85]12[/size] y + a[size=85]13[/size] z + a[size=85]14[/size]t and
t’ = a[size=85]41[/size]x + a[size=85]42[/size]y + a[size=85]43[/size]z + a[size=85]44[/size]t.

Let us look at the t’-equation. For reasons of symmetry, we assume that t’ does not depend on y and z. Otherwise, clocks placed symmetrically in the y-z plane (such as at +y, -y or +z, -z) about the x-axis would appear to disagree as observed from S’, which would contradict the isotropy of space [Assumption 3]. Hence a[size=85]42[/size] = a[size=85]43[/size] = 0. As for the x’-equation, we know that a point having x’ = 0 appears to move in the direction of the positive x-axis with speed v, so that the statement x’ = 0 must be identical to the statement x = vt. Therefore, we expect x’ = a[size=85]11[/size](x - vt) to be the correct transformation equation. (That is, x = vt always gives x’ = 0 in this equation.) Hence x’ = a[size=85]11[/size]x - a[size=85]11[/size]vt = a[size=85]11[/size]x + a[size=85]14[/size]t. This gives us a[size=85]14[/size] = -va[size=85]11[/size] and our four equations have now been reduced to

Equations (2-3)

x’ = a[size=85]11[/size](x - vt)
y’ =y
z’ = z
t’ = a[size=85]41[/size]x + a[size=85]44[/size]t.

There remains the task of determining the three coefficients a[size=85]11[/size], a[size=85]41[/size], and a[size=85]44[/size]. To do this, we use the principle of the constancy of the velocity of light [Assumption 4]. Let us assume that at time t = 0 a spherical electromagnetic wave leaves the origin of S, which coincides with the origin of S’ at that moment. The wave propagates with a speed c in all directions in each inertial frame. Its progress, then, is described by the equation of a sphere whose radius expands with time at a rate c in terms of either the primed or unprimed set of coordinates. That is,

x² + y² + z² = c²t² (2-4)
x’² + y’² + z’² = c²t’² (2-5)

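If now we substitute into Eq. 2-5 the transformation equations (Eqs. 2-3) we get

a[size=85]11[/size]²(x - vt)² + y² + z² = c²(a[size=85]41[/size]x + a[size=85]44[/size]t)².

Rearranging the terms gives us

(a[size=85]11[/size]² - c²a[size=85]41[/size]²)x² + y² + z² - 2(va[size=85]11[/size]² + c²a[size=85]41[/size]a[size=85]44[/size])xt = (c²a[size=85]44[/size]² - v²a[size=85]11[/size]²)t².

In order for this equation to agree with Eq. 2-4, which represents the same thing, we must have

c²a[size=85]44[/size]² - v²a[size=85]11[/size]² = c²
a[size=85]11[/size]² - c²a[size=85]41[/size]² = 1
va[size=85]11[/size]² + c²a[size=85]41[/size]a[size=85]44[/size] = 0

[These equations balance the coefficients of the t² term and the x² term, respectively.

(The xt term needs to vanish.)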

Resnick simply gives the solutions for a[size=85]44[/size], a[size=85]11[/size], and a[size=85]41[/size] at this point. This is a common treatment of mathematical problems in a physics book. I personally find this unsatisfactory, and will give a full derivation of the solutions.

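From the above equations we know that va[size=85]11[/size]² = -c²a[size=85]41[/size]a[size=85]44[/size]. Substituting this into the other two equations and dividing out the common factors we get:

a[size=85]44[/size]² + va[size=85]41[/size]a[size=85]44[/size] = 1
a[size=85]41[/size]a[size=85]44[/size] + va[size=85]41[/size]² = -v/c²

Factoring terms we get:

a[size=85]44[/size](a[size=85]44[/size] + va[size=85]41[/size]) = 1 Eq A
a[size=85]41[/size](a[size=85]44[/size] + va[size=85]41[/size]) = -v/c² Eq B

If (a[size=85]44[/size] + va[size=85]41[/size]) = 0 then a[size=85]44[/size] = -va[size=85]41[/size], by subtracting va[size=85]41[/size] from each side.

By substituting –va[size=85]41[/size] for a[size=85]44[/size] in Eq A we get:

0 = 1, which is a contradiction.

We would also get a contradiction if we substituted into Eq B, since v is not 0.

Therefore (a[size=85]44[/size] + va[size=85]41[/size]) is not 0, and we need to look at the equation obtained by dividing Eq B by Eq A:

a[size=85]41[/size]/a[size=85]44[/size] = -v/c², or a[size=85]44[/size] = -(c²/v)a[size=85]41[/size].

Substituting this into Eq B and solving for a[size=85]41[/size] we get:

a[size=85]41[/size]² = (v²/c⁴)/(1 - v²/c²), so a[size=85]41[/size] = ±(v/c²)/√(1 - v²/c²).

Substituting the positive root for a[size=85]41[/size] in Eq B we get:

a[size=85]44[/size] = -1/√(1 - v²/c²).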

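Since t’ = a[size=85]41[/size]x + a[size=85]44[/size]t, any observer at the origin (x = 0) would see

t’ = a[size=85]44[/size]t = -t/√(1 - v²/c²).

Since √(1 - v²/c²) is positive (v is less than c or else the quantity becomes either undefined or an imaginary number), this means that an observer at x = 0 would see the t’ clock running backwards. Since this does not agree with actual observation, we conclude that we must choose the negative root for a[size=85]41[/size]. That is to say:

a[size=85]41[/size] = -(v/c²)/√(1 - v²/c²) and a[size=85]44[/size] = 1/√(1 - v²/c²).

Now consider the equation c²a[size=85]44[/size]² - v²a[size=85]11[/size]² = c². Subtracting the left summand from each side, multiplying by -1, expanding the squared term a[size=85]44[/size]² = 1/(1 - v²/c²), and canceling like terms we get:

a[size=85]11[/size]² = 1/(1 - v²/c²) Eq 4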

Here we need to determine if the root is positive or negative.

Let’s recall the setup, from Figure 1

We know from the first equation in 2-3 that x’ = a[size=85]11[/size]x + a[size=85]14[/size]t. At x’ = 0 we have:

0 = a[size=85]11[/size]x + a[size=85]14[/size]t. Subtracting a[size=85]11[/size]x from each side we get:

-a[size=85]11[/size]x = a[size=85]14[/size]t. Multiplying by -1 and dividing by a[size=85]11[/size] we get:

x = - a[size=85]14[/size]t/a[size=85]11[/size] Eq 5

We also know that the origin of S’ is located at x’ = 0, which corresponds to x = vt in S. See figure above.

Therefore vt = -a[size=85]14[/size]t/a[size=85]11[/size]. Dividing both sides by t we get:

v = - a[size=85]14[/size]/a[size=85]11[/size] Eq 6

Since x’ = a[size=85]11[/size]x + a[size=85]14[/size]t, we can factor out a[size=85]11[/size] to get:

x’ = a[size=85]11[/size](x + (a[size=85]14[/size]/a[size=85]11[/size])t), and substituting –v for a[size=85]14[/size]/a[size=85]11[/size] (Eq 6 multiplied by -1) we get:

x’ = a[size=85]11[/size](x – vt) Eq 7

We know by the setup (see Figure 1) that anything to the right of the S’ origin is a positive x’.

Therefore if x > vt then x – vt > 0 and x’ > 0. Since x’ = a[size=85]11[/size](x – vt), dividing both sides by (x – vt) we get:

a[size=85]11[/size] = x’/(x - vt).

Since both x’ and (x – vt) are greater than 0, we must have a[size=85]11[/size] is greater than 0 when x > vt. But a[size=85]11[/size] is a constant, therefore we must conclude that a[size=85]11[/size] is always greater than 0.

Now that we know that a[size=85]11[/size] is the positive root of Eq 4 we write:

a[size=85]11[/size] = 1/√(1 - v²/c²).

All that remains is to determine a[size=85]14[/size].

From Eq 7 we know, x’ = a[size=85]11[/size](x – vt). Multiplying a[size=85]11[/size] out we get:

x’ = a[size=85]11[/size]x – a[size=85]11[/size]vt.

But since a[size=85]14[/size] is the coefficient of t, we must have a[size=85]14[/size] = -a[size=85]11[/size]v.

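Recapping we have:

a[size=85]11[/size] = 1/√(1 - v²/c²)
a[size=85]14[/size] = -v/√(1 - v²/c²)
a[size=85]41[/size] = -(v/c²)/√(1 - v²/c²)
a[size=85]44[/size] = 1/√(1 - v²/c²)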

My insertion ends here.]

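Here we have three equations in three unknowns, whose solution (as the student can verify by substitution into the three equations above) is

a[size=85]44[/size] = 1/√(1 - v²/c²), a[size=85]11[/size] = 1/√(1 - v²/c²), and a[size=85]41[/size] = -(v/c²)/√(1 - v²/c²).

By substituting these values into Eqs. 2-3 we obtain, finally, the new sought-after transformation equations,

x’ = (x - vt)/√(1 - v²/c²)
y’ = y
z’ = z
t’ = (t - (v/c²)x)/√(1 - v²/c²)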

".

[Resnick’s quoted material ends here]
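For readers who would rather check the result numerically than by hand, the following is a small Python sketch. It assumes units in which c = 1 by default, and the function names `lorentz_coefficients` and `residuals` are made up for this illustration. It plugs the standard solution into the three coefficient conditions obtained from the light sphere and prints how far each is from being satisfied.

```python
import math

def lorentz_coefficients(v, c=1.0):
    """Return (a11, a41, a44) for relative speed v, with |v| < c."""
    gamma = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    a11 = gamma
    a41 = -(v / c**2) * gamma
    a44 = gamma
    return a11, a41, a44

def residuals(v, c=1.0):
    """Left side minus right side for the three coefficient conditions."""
    a11, a41, a44 = lorentz_coefficients(v, c)
    r_t2 = c**2 * a44**2 - v**2 * a11**2 - c**2   # t^2 coefficient
    r_x2 = a11**2 - c**2 * a41**2 - 1.0           # x^2 coefficient
    r_xt = v * a11**2 + c**2 * a41 * a44          # xt term must vanish
    return r_t2, r_x2, r_xt

for v in (0.1, 0.5, 0.95):
    print(v, residuals(v))
```

All three residuals should come out as zero up to floating point rounding for any speed below c.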

The transforms for the unprimed coordinates in terms of the primed coordinates are given by:

x = (x’ + vt’)/√(1 - v²/c²)
y = y’
z = z’
t = (t’ + (v/c²)x’)/√(1 - v²/c²)
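As a quick sanity check, the inverse transform is the forward Lorentz transform with v replaced by -v, so applying one after the other should return the original event. The sketch below assumes c = 1 by default and motion along the x-axis only; the function names are illustrative.

```python
import math

def lorentz(x, t, v, c=1.0):
    """Forward transform: (x, t) in S to (x', t') in S'."""
    g = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    return g * (x - v * t), g * (t - v * x / c**2)

def inverse_lorentz(xp, tp, v, c=1.0):
    """Inverse transform: (x', t') in S' back to (x, t) in S."""
    g = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    return g * (xp + v * tp), g * (tp + v * xp / c**2)

x, t = 3.0, 2.0
xp, tp = lorentz(x, t, v=0.6)
print(xp, tp)
print(inverse_lorentz(xp, tp, v=0.6))  # approximately (3.0, 2.0)
```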

Space and Time Contractions:

A simple consequence of the Lorentz transform is that lengths in the S reference frame, along the x-axis, are contracted in the S’ reference frame along the x’-axis.

Proof:

The length of an object in the S reference frame is given by x[size=85]2[/size] – x[size=85]1[/size], where x[size=85]2[/size] is greater than x[size=85]1[/size].

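Substituting the primed coordinates we get:

x[size=85]2[/size] – x[size=85]1[/size] = [(x’[size=85]2[/size] + vt’[size=85]2[/size]) – (x’[size=85]1[/size] + vt’[size=85]1[/size])]/√(1 - v²/c²) = [(x’[size=85]2[/size] – x’[size=85]1[/size]) + (vt’[size=85]2[/size] – vt’[size=85]1[/size])]/√(1 - v²/c²).

But the length of an object is measured at a single instant in S’, so we can assume that t’[size=85]2[/size] = t’[size=85]1[/size], or vt’[size=85]2[/size] = vt’[size=85]1[/size], and thus vt’[size=85]2[/size] – vt’[size=85]1[/size] = 0.

Therefore,

x[size=85]2[/size] – x[size=85]1[/size] = (x’[size=85]2[/size] – x’[size=85]1[/size])/√(1 - v²/c²).

Since 1/√(1 - v²/c²) is greater than 1 (v is always greater than 0, due to the setup), we must have x[size=85]2[/size] – x[size=85]1[/size] is greater than x’[size=85]2[/size] – x’[size=85]1[/size].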
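The contraction can also be seen with numbers. The Python sketch below assumes c = 1 by default, a rod at rest in S between x = 0 and x = 1, and a relative speed of v = 0.6; the helper `primed_position_at` is a hypothetical name introduced here. It locates both ends of the rod at the same S’ time and subtracts.

```python
import math

def gamma(v, c=1.0):
    return 1.0 / math.sqrt(1.0 - v**2 / c**2)

def primed_position_at(tp, x, v, c=1.0):
    """x' of a point that sits at fixed x in S, read off at S' time tp."""
    g = gamma(v, c)
    # Solve t' = g * (t - v*x/c^2) for t, then apply x' = g * (x - v*t).
    t = tp / g + v * x / c**2
    return g * (x - v * t)

v = 0.6
x1, x2 = 0.0, 1.0
length_in_s_prime = primed_position_at(0.0, x2, v) - primed_position_at(0.0, x1, v)
print(length_in_s_prime)       # about 0.8
print((x2 - x1) / gamma(v))    # same value from the contraction formula
```

Both lines give roughly 0.8, which is shorter than the unit length measured in S.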

We can get a similar result with time

A proof of this statement can be found at:

en.wikipedia.org/wiki/Time_dilation

Commentary:

As mentioned in the introduction, the Lorentz transform and the Postulate of Relativity form the basis of Special Relativity. In fact one only needs to assume these two statements to develop, discard, or modify any macro physical law.

For example, let’s assume that F(x, y, z, t) is some physical Law in one inertial Reference frame. As examples the reader might assume that F(x, y, z, t) is Newton’s second Law (i.e. Force equals mass times acceleration). It could also stand for some statement about fields. Then we must have:

Postulate of Relativity:
If F(x, y, z, t) is a law in one inertial Reference frame then F(x’, y’, z’, t’), whose primed variables are related to the unprimed variables by the Lorentz transform, must also be a law in any other inertial Reference frame.

Examples of failing transforms:

The Galilean transform (used by Galileo and Newton)

x’ = x – vt
t’ = t

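A light sphere is given by:

x’² + y’² + z’² = c²t’²

Substituting the Galilean transform (with y’ = y and z’ = z) and expanding the square we get:

x² - 2xvt + v²t² + y² + z² = c²t², or x² + y² + z² - 2xvt = (c² - v²)t².

Since this must have the same form as the equation x² + y² + z² = c²t² of the light sphere in the unprimed coordinate system, we must have the coefficients match.

Therefore -2xvt = 0 and c² – v² = c². The only way that this can be true is if v = 0, which violates our setup.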

Therefore the Galilean transform

x’ = x – vt
t’ = t

is NOT valid.
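The failure of the Galilean transform can be illustrated numerically. In this Python sketch (assuming c = 1, an event on the x-axis, and made-up function names) we take an event that lies on the light sphere in S, apply the Galilean transform, and compute how far the result is from lying on the light sphere in S’.

```python
def galilean(x, t, v):
    """Galilean transform along the x-axis."""
    return x - v * t, t

def light_sphere_residual(x, t, c=1.0):
    """x^2 - c^2 t^2 for an event on the x-axis (y = z = 0); zero on the light sphere."""
    return x**2 - c**2 * t**2

c = 1.0
x, t = c * 1.0, 1.0  # an event on the light sphere in S
for v in (0.0, 0.1, 0.5):
    xp, tp = galilean(x, t, v)
    print(v, light_sphere_residual(xp, tp, c))
```

The residual is zero only for v = 0. For this event it equals v² - 2v, which is what the leftover -2xvt and v²t² terms become when x = t = 1.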

Many people believe, justified or not, that Special Relativity produces paradoxes, primarily due to the space and time contractions. Since Special Relativity is founded on the Lorentz transform, there have been efforts to modify or discredit it.

An example of a more complex transform is the following:

Again writing the light sphere in S’ we get:

Expanding the square on the right hand side of the equation we get:

Since there are no more x squared terms on the right side of the equation the coefficient of the x squared term must be equal to 1.

Therefore

Which is in contradiction to our setup. Therefore the transforms

are also NOT valid.

In general, a good technique for falsifying any proposed alternative transform is to see if the alternate transform preserves the form of the light sphere.
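This technique lends itself to a small numerical tester. The Python sketch below is one possible way to do it, under stated assumptions: c = 1, a fixed relative speed of 0.6, a handful of sample directions, and the hypothetical function name `preserves_light_sphere`. Passing the test on sampled events is only a necessary check, not a proof, but a single failure is enough to reject a proposed transform.

```python
import math

def lorentz(x, y, z, t, v=0.6, c=1.0):
    g = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    return g * (x - v * t), y, z, g * (t - v * x / c**2)

def galilean(x, y, z, t, v=0.6):
    return x - v * t, y, z, t

def preserves_light_sphere(transform, c=1.0, tol=1e-9):
    """Check x'^2 + y'^2 + z'^2 = c^2 t'^2 on sampled light-sphere events."""
    directions = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, 0, 1), (0.6, 0.8, 0)]
    for t in (1.0, 2.5):
        for nx, ny, nz in directions:
            # Event on the light sphere in S: distance c*t from the origin.
            xp, yp, zp, tp = transform(c * t * nx, c * t * ny, c * t * nz, t)
            if abs(xp**2 + yp**2 + zp**2 - c**2 * tp**2) > tol:
                return False
    return True

print(preserves_light_sphere(lorentz))   # True
print(preserves_light_sphere(galilean))  # False
```

Any alternative transform can be dropped in as a function of (x, y, z, t) that returns the four primed coordinates.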

Some Questions Answered:

What fails and what succeeds? A very large failure is Newton’s second law (Force equals mass times acceleration). A very large success is the Maxwell equations.

Why does it seem like Newton is right anyway? If you look at the transforms, you might see that if v is very small compared to c, then the transform looks like the Galilean transform which in turn means that the Newtonian physics holds in this case.
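To get a feel for how small the difference is at everyday speeds, the following Python sketch compares the Lorentz and Galilean values of x’ for one sample event. It works in units where c = 1, so `beta` stands for v/c; the event (x = 2, t = 1) and the function names are arbitrary choices for illustration.

```python
import math

def gamma(beta):
    """Lorentz factor for beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta**2)

def x_prime_gap(beta, x=2.0, t=1.0):
    """Lorentz x' minus Galilean x' for one event, in units where c = 1."""
    galilean_x = x - beta * t
    lorentz_x = gamma(beta) * galilean_x
    return lorentz_x - galilean_x

for beta in (1e-4, 1e-2, 0.1, 0.5, 0.9):
    print(beta, gamma(beta), x_prime_gap(beta))
```

For beta = 0.0001, which is still about 30 kilometers per second, the two transforms already agree to roughly eight decimal places, while at beta = 0.9 they differ by more than the size of the coordinates themselves.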

On a Foundational Matter:

There is a very elegant derivation of the Lorentz transform on Wikipedia, independent of the assumption of the constancy of the speed of light (our Assumption 4), using only the assumptions of Isotropy and the Postulate of Relativity. It is located at

en.wikipedia.org/wiki/Lorentz_transformation

I would encourage anyone with even the briefest exposure to Group Theory to look at it.

As exercises, one can verify that the light sphere maintains its form under the Lorentz transform; and the Euclidean metric given by: