CHAPTER 1:
INTRODUCTION
Riding on a light beam was one of Albert Einstein’s most famous gedankenexperimente, or thought experiments. In this experiment, he imagined himself going fast enough to catch up with a beam of light, seeing a light beam frozen in time, stopped in mid-cycle. This book will show why that thought experiment challenged even that great mind: because while light propagates in all directions, we, and all other matter, are the ones propagating in a single direction, that of time, at the speed of light. Thus, all observers will see light propagating at the same speed toward or away from them, regardless of the speed of the source.
Perhaps no other scientific theory is so mathematically simple, yet so difficult to visualize, as the Special Theory of Relativity. Just a handful of algebraic equations comprise the theory, but these simple equations seem to warp the very units of measurement of time and space into something made of rubber.
Who is my intended audience for this book? Anyone who has a good understanding of algebra, geometry and trigonometry can read this book and come away with a good understanding of relativity physics. But those with a PhD in physics can also come away with a new understanding of the Special Theory, and perhaps insights to find new ways to expand my approach. I am not going to oversimplify the Special Theory with analogies that limp, but provide insights that will remove a great deal of the apparent mystery of this subject.
To introduce myself, I graduated from the US Naval Academy in 1970 with a bachelor’s degree in Aerospace Engineering, and later got a master’s in the same subject from the US Naval Postgraduate School. My main specialization has been in all aspects of voice and data communications. I am not a physicist by profession, but by passion. The Special Theory has been a lifelong fascination for me.
I first encountered the theory of relativity in the 1950s and 1960s by reading science fiction, some good, some more fiction than science. In high school my physics teacher was a truly remarkable nun, Mother Katherine Winters. When I expressed an interest in the theory of relativity, she responded by giving me a book by Albert Einstein[i] which I read from end to end, and still have today. I grasped the principles easily enough, the universality of the speed of light, the relativity of simultaneity, the difference in measurements of time and distance from different reference frames, the increase in mass with increasing speed. What I could not grasp was - why? While I could visualize in my head Newtonian physics, I could not visualize at all what was happening with the Special Theory. Being a teenager, I assumed I just wasn’t ready for it yet.
At the Naval Academy, I took a course in Modern Physics, which included the Special Theory. The Naval Academy was where Albert Michelson, himself a graduate of the Class of 1873, first measured the speed of light while an instructor of physics at the Academy in 1879. He came up with a figure of 299,940 kilometers/second (km/s), within 0.05% of the modern measurement[ii]. The Academy named the Science and Engineering Building Michelson Hall in his honor, and it was here that I was to take my Modern Physics course. Surely, I would find the understanding that had eluded me in my self-education five years before. But alas, when I asked the professor to explain why the Special Theory worked as it did, with contracting rulers, mass increasing as the speed increased, time dilation and so forth, his response was along the lines of ‘it just is’. I went on to get an A in the course, easily applying the equations I had known for years already. But it was the only course for which I had to rely on rote memory, rather than any understanding. It seemed something to be taken on faith, almost as if divinely revealed to only a select few, rather than understandable through logic. For those of you readers who are learning about the Special Theory for the first time, and for those of you who work with it but have yet to fully visualize it, let me give you an example.
Sally leaves Earth in her spaceship at noon, ‘zero hour’. En route, she will pass Fred in his space station, which is stationary at a distance of 1 light-hour (about 1.08 billion km) away. Sam, in Mission Control on Earth, is tracking Sally on his powerful radar, verifying that she is going at half the speed of light, and that Fred remains stationary. Sally in turn is tracking Earth receding at half the speed of light and the space station approaching at the same speed. At 1PM, Sam calls Sally and Fred, to inform them that he expects Sally’s passage at 2PM, and that they should both report the time of Sally’s closest point of approach. Sally and Fred receive the message, read their respective clocks and report as directed. At 3PM, Sam’s radar shows their two blips merging on his screen, which he knows actually happened at 2PM, the reflection taking an hour to return. Their reports both arrive at that time also. Fred reports her passage at 2PM as Sam expected, because that is what his radar is showing. However, Sally reports her arrival at 1:44PM, and wonders why the space station moved, since she has covered only 87% of a light-hour of distance, by her clock. Sam and Fred see her clock running slow; she sees the space station closer than expected.
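The numbers in this example can be checked with a few lines of arithmetic. The sketch below assumes the standard time-dilation and length-contraction relations of the Special Theory, which this book has not yet derived; the function names `lorentz_factor` and `sally_trip` are purely illustrative.

```python
import math


def lorentz_factor(beta: float) -> float:
    """Return 1 / sqrt(1 - beta^2) for a speed given as a fraction of c."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)


def sally_trip(distance_light_hours: float = 1.0, beta: float = 0.5):
    """Return (earth_hours, sally_hours, sally_distance_light_hours).

    Assumes constant speed and the textbook Special Theory relations.
    """
    gamma = lorentz_factor(beta)
    # Elapsed time on Sam's and Fred's clocks: distance / speed.
    earth_hours = distance_light_hours / beta
    # Elapsed time on Sally's own clock (time dilation).
    sally_hours = earth_hours / gamma
    # Earth-to-station distance as Sally measures it (length contraction).
    sally_distance = distance_light_hours / gamma
    return earth_hours, sally_hours, sally_distance


earth_hours, sally_hours, sally_distance = sally_trip()
minutes_past_one = (sally_hours - 1.0) * 60.0

print(f"Sam and Fred: passage {earth_hours:.2f} h after noon")
print(f"Sally: passage at 1:{round(minutes_past_one):02d} PM by her clock")
print(f"Sally: distance covered {sally_distance:.3f} light-hours")
```

Under these assumptions Sally's clock reads about 1.73 hours at the passage, and the distance she measures is about 0.87 light-hours, in line with the 1:44PM and 87% figures quoted above.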
I love insoluble problems. I nibbled at the bone of the Special Theory for decades, trying and discarding various approaches. I reconsidered the discarded all-pervading ether theory. Was mass rise really a compressibility effect, akin to approaching the speed of sound in air? And what was speed doing to measurements of time and space, anyway? I bought pads of yellow paper by the case, scribbling on them, trying in vain to find the key to understanding these simple equations by adapting aerodynamic principles.
One day, I had a flash of insight. What does it mean that something is located at a point x at a time ct? How do I know it is there, and when it is there? Coordinate systems are meaningless by themselves. We observe events which emit light, and it is from that observation that we assign the event to a time and place in our coordinate system.
Figure 1. Sally and Sam and the Takeoff Roll
Take a look at Figure 1, which is a low-speed example of the same problem we just discussed. This begins with Sally sitting at the controls of an airplane at the end of a 2-km runway, and Sam, the observer, in the control tower, also at her end of the runway. At time 12:00, Sally begins her takeoff roll, averaging 240 kilometers per hour (km/h). Sam observes her takeoff roll at once, since he is collocated at the end of the runway; Fred at the far end will see her begin to roll 6.66 microseconds (μs) after 12:00, when the light from that event reaches his eye.
At 12:00 plus 30 seconds, Sally lifts off, and reports via radio that she was ‘off the deck’ at that time. Fred, at the end of the runway, likewise reports her lift-off at the same time, since he is now the one with zero distance separating him from Sally. However, he believes her takeoff roll is 6.66 μs shorter than the 30 seconds Sally measured it as being, because he observed her roll starting that much later.
Sam in the tower consults his high-precision aviator’s wrist watch and notes that he received the two reports at 12:00 plus 30 seconds and 6.66 μs, the time it took light from the takeoff, and Fred’s radio message, to reach him from the end of the runway. However, Sam’s estimate of her takeoff roll was longer by 6.66 μs than Sally measured it, because he thinks she took off later than she did, and 13.33 μs longer than Fred’s estimate of her run.
This is not a relativity problem, since the speeds are far too slow to affect each other’s measurement of the length of the runway and intervals between clock ticks. But this example establishes that there is a delay between when an event actually happened at some location and when an observer learns that it happened. All observers can easily correct their observation: divide the distance to the event by the speed of light (~300,000 km/s), and subtract that delay from the time of observation to get the actual takeoff time. The radar in the tower performed the same kind of calculation to determine Sally’s correct distance: the pulse that located the aircraft at the end of the runway left the tower 6.66 microseconds before that event, and the reflection returned to the tower 6.66 microseconds after that event, along with Sally’s takeoff report by radio. The radar multiplied the difference in time between the outgoing and reflected pulse by the speed of light and divided it by 2, to determine the aircraft was 2 km away at time of reflection.
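The two corrections described here, removing the light delay from an observation and ranging by radar echo, can be sketched as follows. The helper names are hypothetical. The exact delay over 2 km is about 6.67 μs, which the text quotes as 6.66.

```python
C_KM_PER_S = 299_792.458  # speed of light in vacuum, km/s


def light_delay_s(distance_km: float) -> float:
    """Time for light (or radio) to cross the given distance."""
    return distance_km / C_KM_PER_S


def event_time_s(observed_time_s: float, distance_km: float) -> float:
    """Correct an observation time back to when the event actually happened."""
    return observed_time_s - light_delay_s(distance_km)


def radar_range_km(sent_s: float, received_s: float) -> float:
    """Range from a radar echo: half the round-trip time multiplied by c."""
    return C_KM_PER_S * (received_s - sent_s) / 2.0


runway_km = 2.0
delay_s = light_delay_s(runway_km)

# Sam sees the lift-off at 30 s plus the delay; correcting recovers 30 s.
liftoff_s = event_time_s(30.0 + delay_s, runway_km)

# The radar pulse leaves before the lift-off and returns after it.
range_km = radar_range_km(30.0 - delay_s, 30.0 + delay_s)

print(f"one-way delay: {delay_s * 1e6:.2f} microseconds")
print(f"corrected lift-off time: {liftoff_s:.6f} s after 12:00")
print(f"radar range: {range_km:.3f} km")
```

All times are in seconds after 12:00, so the same functions serve Sam, Fred, or any other observer once the distance to the event is known.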
Secondly, Figure 1 also demonstrates that the passage of time is real motion, normal to the spatial axes, though in order to make a three-dimensional depiction of this I must omit one spatial axis, in this case z, the vertical axis. We shall see later that this real motion in time is the source of all relative motion. Relative motion is not movement in space independent of time, but the projection of an event’s real movement in time onto another observer’s spatial axes, and that is the key to unraveling the mystery of the Special Theory.
CHAPTER 2:
CLASSICAL REFERENCE FRAMES
For the most part, very few people regularly deal with the Special Theory. Physicists dealing with fast subatomic particles, astronomers dealing with fast-moving but distant objects, or engineers dealing with precision atomic clocks in orbit about the earth for global navigation, these people work day-to-day with the Special and General Theories. Most of the rest of us can spend our whole lives in engineering, science or mathematics and not encounter a single problem in the Special Theory outside of the classroom. The world we live in is the non-relativistic world where velocities are very small with respect to the speed of light. But before we move on to the strange new relativistic world, let us review the basics of this non-relativistic world, how light travels from a moving object, what reference frames are, and how we assign coordinates to events.
As the intended audience of this book includes everyone from those not quite graduated from high school to those with PhDs, I am going to review some basic concepts. For those of you for whom this is very familiar, please skim, but do not skip. I am covering some very important fundamentals about measurement that will be critical later.
2.1 The Speed of Light
The speed of light is central to the Special Theory. In fact, the Special Theory evolved from the Michelson-Morley Experiment[iii], which attempted, but failed, to detect the Earth’s relative motion with respect to the ether, the then-prevalent theory of an all-pervading medium in which light was believed to propagate. This relative motion was to show up as a change in the speed of light based on the direction and speed of the Earth through the ether. The experiment showed that the speed of light was unaffected by relative motion, and it was subsequently determined that the speed of light in a vacuum, c=299,792,458 m/s, is a universal constant[iv], as shown in Equation 1.
Equation 1
Light will expand as a sphere of radius s in all directions in the observer’s spatial coordinates x, y, and z. The radius s is equal to the elapsed time Δt from the time of emission by a point source, multiplied by c, the speed of light, unaffected by the relative velocity of the emitter with respect to the observer.
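The relation described in this paragraph can be written compactly. The form below is a reconstruction from the prose alone, assuming a point source located at the origin of the observer's coordinates at the moment of emission; it may not match the original layout of Equation 1.

```latex
s^2 = x^2 + y^2 + z^2 = (c\,\Delta t)^2
```

Here s is the radius of the expanding sphere of light, x, y, and z are the observer's spatial coordinates of any point on that sphere, and Δt is the time elapsed since emission.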
What about a beam of light? After all, isn’t that, according to the title, what we are trying to ride? The answer is that light, as shown by Equation 1, expands in all directions. We may focus it into a directional beam with lenses or mirrors, or collimate it as a laser, but despite all focusing possible, some light always escapes in all directions as backscatter, sidelobes, or those pesky photons, wandering in quantum uncertainty in whatever direction they choose, regardless of improbability. But very weak backscatter or very improbable photons are not slower, just weaker or not as many. We are not going to deal with focused light here, but with light always propagating in all directions at c.
Figure 2. Sally, Sam and the Speed of Light
Let’s return to Sally, who has completed her takeoff, and engaged a new hyperdrive which allows her jet to travel at half the speed of light. She is conducting tests to prepare for the example with which we opened. In Figure 2, we see how this speed affects the propagation of light. Sally’s plane is equipped with a beacon which flashes at 1-second intervals, at ct1=1, 2, etc, as shown in Figure 2. Sam observes Sally’s beacon flash immediately on takeoff at 0,0, then at ct1=1.5, 3, and 4.5 seconds, each flash increasingly delayed because Sally’s aircraft has moved an additional distance of 0.5 light-seconds between flashes. Correcting for Sally’s movement, Sam determines that the flashes occurred at 1-second intervals by his clock. Looking at Figure 2, we see that light spreads out in circular rings on the x1-y1 plane centered around each emission’s point of origin. While the point of origin moved with Sally, her velocity was not added to the speed of light, which remained constant about its point of emission, and cΔt1 is the radius of a circle expanding at the speed of light in space from that point. The light emitted by the beacon at 0,0 has expanded to a radius of 5 light-seconds (1.5 million km) at ct1=5 seconds after departure, as required by Equation 1. The light emitted at ct1=1 has a Δt1 of 4, and a radius of 4, etc.
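The figures in this paragraph can be reproduced with a short calculation. The sketch below works in seconds and light-seconds, takes Sam to be at the origin with Sally receding along x at half the speed of light, and ignores relativistic effects, as this chapter does; the function name `beacon_flashes` is illustrative only.

```python
def beacon_flashes(count: int, period_s: float = 1.0,
                   beta: float = 0.5, now_s: float = 5.0):
    """Tabulate Sally's beacon flashes in Sam's frame.

    Each row is (emitted_s, position_ls, seen_by_sam_s, radius_ls):
    emission time, Sally's distance from Sam at emission, the time the
    flash reaches Sam, and the radius of that flash's ring at now_s.
    """
    rows = []
    for k in range(count):
        emitted_s = k * period_s
        # Sally moves beta light-seconds per second away from Sam.
        position_ls = beta * emitted_s
        # Light covers one light-second per second on its way back to Sam.
        seen_by_sam_s = emitted_s + position_ls
        # The ring grows at c about its own point of emission.
        radius_ls = max(now_s - emitted_s, 0.0)
        rows.append((emitted_s, position_ls, seen_by_sam_s, radius_ls))
    return rows


for emitted_s, position_ls, seen_s, radius_ls in beacon_flashes(4):
    print(f"flash at t={emitted_s:.1f} s, x={position_ls:.1f} ls: "
          f"seen by Sam at {seen_s:.1f} s, ring radius {radius_ls:.1f} ls at t=5")
```

The reception times come out as 0, 1.5, 3, and 4.5 seconds, and the ring radii at 5 seconds as 5, 4, 3, and 2 light-seconds, matching the description of Figure 2.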
What about Sally? If she did not add her speed to the speed of the emitted light, then she must see herself moving with respect to the light spheres, as we see her doing in Figure 2. But she does not see that. She sees herself at rest at the center of all the spheres, and Sam moving with respect to them. But we will have to unravel the Special Theory before we can untangle this apparent paradox.
Note that an observer ahead of Sally will see the light flashing at a faster rate, every 0.5 seconds, while Sam, behind Sally, sees the light flashing slower, every 1.5 seconds. This is the non-relativistic Doppler effect on light, in which the time intervals between the observation of two events, as measured by the observer’s clock, are shortened or lengthened by relative velocity toward or away from the observer. We saw this in the example of Figure 1 in the introduction, in which Sam measured Sally’s takeoff roll at 30 seconds plus 6.66 μs, because she was receding from him; Fred measured her takeoff roll as 30 seconds minus 6.66 μs, because she was approaching him.
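Both the beacon intervals and the takeoff-roll discrepancy follow from one rule: the observed interval is the emitted interval stretched or shrunk by the fraction v/c. The sketch below assumes a stationary observer and a source moving directly away from or toward that observer, with no relativistic correction; `observed_interval_s` is a hypothetical helper name.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum, m/s


def observed_interval_s(emitted_interval_s: float, speed_m_per_s: float,
                        receding: bool) -> float:
    """Non-relativistic Doppler: interval seen by a stationary observer."""
    beta = speed_m_per_s / C_M_PER_S
    factor = 1.0 + beta if receding else 1.0 - beta
    return emitted_interval_s * factor


half_c = 0.5 * C_M_PER_S
behind = observed_interval_s(1.0, half_c, receding=True)   # Sam's view
ahead = observed_interval_s(1.0, half_c, receding=False)   # observer ahead

# Takeoff roll from Figure 1: 30 s at an average of 240 km/h.
roll_speed = 240.0 * 1000.0 / 3600.0  # m/s
sam_roll = observed_interval_s(30.0, roll_speed, receding=True)
fred_roll = observed_interval_s(30.0, roll_speed, receding=False)

print(f"beacon interval behind Sally: {behind:.2f} s, ahead: {ahead:.2f} s")
print(f"Sam's excess: {(sam_roll - 30.0) * 1e6:.2f} microseconds")
print(f"Fred's deficit: {(30.0 - fred_roll) * 1e6:.2f} microseconds")
```

With Sally averaging 240 km/h for 30 seconds, the excess and deficit are each about 6.67 μs, the same light-travel delay over the 2-km runway that appears in the introduction.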
This is exactly the same as the Doppler effect on sound: an observer hears the horn of an approaching car higher-pitched while approaching, and lower after it passes. And that Doppler effect of sound is generated by exactly the same process, and governed by the same equation, because like light, the speed of sound is not affected by the speed of the source, only by the temperature, pressure, and density of air in which it propagates. An observer seeing a fighter aircraft approaching supersonically will hear nothing at all, just an eerie silence, although he may be close enough to see the shock wave radiating from the jet’s nose and tail as a misty cone of condensation. The sound will not reach the observer’s ears until the aircraft’s shock cone trailing behind the jet passes him. He will then hear first the explosive shock wave, then the roar of the engines in afterburner as the aircraft recedes.
We will deal extensively with the analysis of Doppler later. Doppler, in which two observers see the other emitting a higher or a lower pitch, plays a major role in explaining the apparent paradoxes of the Special Theory, in which two observers see each other’s clocks running slower, and each other’s rulers growing shorter. There is both a relativistic and non-relativistic Doppler, and we shall show when and how to use them.
2.2 Reference Frames
I have a reference frame attached to me. Straight ahead of me is my x direction, to my left is my y direction, and above me is my z direction. On my left wrist is my cheap Timex, which marks my motion in the t direction. I am the observer located at the origin of those four orthogonal axes. My wife has a similar set of coordinates attached to her, and she is the observer in that reference frame. But she cannot be an observer in my reference frame; she is a measurement in mine, of some displacement in distance, and I am a measurement in hers. Using her distance from me, I can correct her measurement, say of the cat, to a measurement in my reference frame. This insistence that the observer always be at the origin of the reference frame will be critical when we deal with relativistic reference frames, because all measurements of events must be reduced to a ‘proper’ event, a measurement in which there is only time, and all spatial coordinates are zero.
[i] Albert Einstein, Relativity: The Special and the General Theory (Crown Publishers, New York, 1961), pp. 1-157
[ii] "Michelson's 1879 Determinations of the Speed of Light". Department of Statistics and Actuarial Science, University of Waterloo (Canada), 24 May 2000.
[iii] Richard Staley, Einstein's Generation. The Origins of the Relativity Revolution (University of Chicago Press, Chicago, 2009), p. 27
[iv] Gabriel Bergmann, Introduction to the Theory of Special Relativity (Dover Publications, New York, 1976), p. 34