Cosmology I
M.Sc. in Particles, Strings & Cosmology
CPT, University of Durham

Rodrigo Alonso
IPPP OC307
rodrigo.alonso-de-pablo@durham.ac.uk
A patch of the night sky augmented so that what might be mistaken for stars or nebulae with the naked eye reveal themselves as distant galaxies: some elliptical, some irregular, some in profile, some showing their spiral arms.
Figure 1: Credit: NASA, ESA, and S. Beckwith (STScI) and the HUDF Team

These are the notes for Cosmology I, the MSc course in Durham’s Centre for Theoretical Physics (CPT) Institute, a joint venture of the Physics and the Mathematics departments in the University of Durham. Here an introduction to the smooth expanding universe is presented and the history of the universe is traced back in time. The basis of the current model of cosmology is laid out and the evidence that led up to this theory is reviewed. Problems and references are to be found throughout the text. Although we will review all concepts from fundamentals, prior knowledge of general relativity, thermodynamics and statistical mechanics will be useful. As for the mathematics, familiarity with differential equations and differential geometry will facilitate the derivation of results.

These notes are meant to be self-contained, yet for additional reference the following is the recommended bibliography:

  • Modern Cosmology, Scott Dodelson [5]

  • Cosmology notes at Amsterdam University and Nikhef

  • An Introduction to Modern Cosmology, Andrew Liddle [6]

  • Cosmology, Steven Weinberg [7]

Previous lecturer notes can be found here for reference.

1 The dawn of modern cosmology

A rough picture with the main features of cosmology can be drawn directly from experimental evidence, with a little guidance, by a reader with virtually no knowledge of physics. This picture is an expanding universe where everything we see originated from a primordial hot plasma. This first lecture will provide this little guidance and help ease us into cosmology with barely an equation. At the same time this introduction will serve to establish the context for the more quantitative computations to follow.

Before laying out the evidence that led to modern cosmology it is pertinent to set the stage, in particular the size of the stage in cosmology. We have known for some time that our world is a rocky planet orbiting at 150 million kilometres, or roughly 5 micro parsec (1 pc $=3.0856\times10^{13}$ km), from a mid-sized, midlife star some 8 kilo pc (kpc) away from the centre of a galaxy of 50 kpc diameter. The nearest neighbouring galaxy, Andromeda, is 770 kpc away and is the largest in our local group of 20 galaxies, placed in the outskirts of the Virgo supercluster of some 30 Mpc in size, this supercluster being one of the many known. The furthest objects we have seen are early stages of galaxies known as quasars, some Gpc away, like 3C273.

A 2D slice of the distribution of galaxies showing them aligned along filaments that come together in nodes.
(a) 2d map of galaxies as determined by the Sloan Digital Sky Survey

Signpost with distances from us: Galactic edge 17 kpc, Andromeda $8\times10^{5}$ pc, Virgo cluster edge $3\times10^{7}$ pc, 3C273 quasar $10^{9}$ pc.
(b) Cosmological signpost

Cosmology concerns itself with the largest distances today, those above the Mpc, or $10^{22}$ m. However much the previous paragraph has managed to convey, this scale is hard to grasp; for comparison, the exploration of the sub-atomic world has gone roughly (yet coincidentally) as many orders of magnitude below our human scale, with the LHC searching the $(14\,\text{TeV})^{-1}\sim10^{-20}$ m scale. It is nonetheless good to state the obvious: the ‘ruler’ in cosmology being this large, within its closest tick-marks fall many galaxies and intergalactic dust, all of whose internal structure and particulars we ignore to extract instead simple averages. For example we will be interested in how much matter there is in one box of 10 Mpc side regardless of whether this matter consists of stars, dust, planets or carbon-based life forms; this information is lost, just as details smaller than the pixel size are lost in a photo.

Distances between cosmological objects, however, were not always the same, as we learned as recently as the first half of the last century. Let us now turn to the first piece of evidence for the present theory of cosmology.

1.1 Hubble Diagram

In 1929 Edwin Hubble put out an article with the distances and relative velocities of a few dozen galaxies he and his assistant Milton Humason had measured, combined in a plot that was to make history (fig. 1(c)). These results were however much less simple to obtain than the previous statement would suggest, just as their interpretation within the present model of cosmology took decades to be established.

Hubble's data for distance vs velocity showing a few dozen points fitted into a linear function as published in 1929
(c) Velocity to distance correlation of Galaxies (y axis is km/s). Hubble, E. P. (1929) Proc. Natl. Acad. Sci. USA 15, 168–173
Modern day Hubble diagram showing many more data points and an excellent fit to a line
(d) Modern day version of Hubble diagram (with swapped axes) from Betoule et al.
Figure 2: Original and current Hubble diagram.

When Hubble started his research, these galaxies were not even thought to be so; part of the scientific community believed they were nebulae within the Milky Way, which gives an idea of how little was known about distances at the turn of the twentieth century. The confusion was there for several reasons; for one, the poor resolution of the instruments made them appear as cloudy patches in the sky. Indeed one can, on an exceptionally clear night far away from any artificial light, spot Andromeda with the naked eye, but it will not look much like a galaxy, just a hazy spot of faint light. Objects of this type had been catalogued in the eighteenth century, when in 1771 Charles Messier published Nebulae and Star Clusters. Its intent was basically to mark the place of these uninteresting objects so they would not get in the way of identifying the more relevant comets that were sought at the time. There are some hundred objects in this catalogue, which is still used as reference today (Andromeda is Messier 31, or M31 for short), and as we now know, although they might look similar at low resolution, a third of them are galaxies while the rest are actual nebulae.

To find out the distances of these objects, Hubble used the 100 inch telescope at Mt Wilson Observatory in California to observe each of them for a prolonged period. In this way he was able to identify Cepheid stars within them. These are stars whose luminosity varies periodically with a frequency known to be tightly correlated with absolute luminosity (the power, or how much light the star produces per unit time). By measuring their apparent luminosity, that is how much light reaches us, one can work out how far away the star is by how diluted the power seems. In this way Hubble estimated that M31 (Andromeda) was 300 kpc away. The size of the Milky Way had just recently been determined by the work of Harlow Shapley around 1920 to be 30 kpc, so this result showed that these formations were well outside the Milky Way.

For the other axis of the plot in fig. 1(c) one needs the velocity relative to us. In order to find this out one can use the spectrum, instead of the intensity, of the light reaching us. The spectral emission lines of common elements can at times be computed from first principles, as with hydrogen, or simply, and more accurately, measured here in the laboratory. Then, when looking at the spectrum of light from a galaxy, one can identify emission lines and hence deduce information about the elements making up its composition. For example we know that the transition between the two hyperfine split 1S states of hydrogen has a wavelength of 21 cm. These spectral lines however do not sit exactly on top of the ones we measure here on earth but are displaced up or down. This up or down is actually given a name: if the line appears at higher frequency (smaller wavelength) we say it is blue shifted, whereas if it is at lower frequency (longer wavelength) it is red shifted. As you might have guessed, the naming comes from the visible spectrum of light and its ultraviolet and infrared limits.

The reason for this shift is the velocity relative to us, in a phenomenon dubbed the Doppler effect. This is the effect that makes approaching ambulances sound at a higher pitch than those going away. In astronomy Vesto Slipher had already made use of this effect to estimate velocities and Hubble employed the same methodology. The relation between the spectrum displacement and the relative velocity is given as

$$z=\frac{\nu_{\text{emit}}}{\nu_{\text{obs}}}-1=\frac{\lambda_{\text{obs}}}{\lambda_{\text{emit}}}-1\simeq\frac{v}{c} \tag{1.1}$$

where $z$ is called the redshift and the last equality holds in the non-relativistic limit, within which all the data that Hubble took fall. Finally, with these two sets of data Hubble put together the plot of fig. 1(c). It is perhaps not a very convincing linear correlation, yet the present day version of the plot to the right, 1(d), should convince the sceptics.

The slope of the linear correlation is called the Hubble rate $H_0$, with a 0 subindex to mark that it is measured today, again because it was not the same in the past. Numerically it is parametrized as

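$$H_0=100\times h\,\frac{\text{km}}{\text{s}}\frac{1}{\text{Mpc}}\overset{[4]}{=}100\times(0.67\pm0.01)\,\frac{\text{km}}{\text{s}}\frac{1}{\text{Mpc}} \tag{1.2}$$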

where we took the value for $h$ from [2] in the second equality; given that this value differs notably when taken from other sources, it is common to leave $h$ unspecified (to avoid confusion with Planck’s constant, here we will only use $\hbar$). This linear correlation tells us that velocity is proportional to distance, $v=H_0d$, i.e. that galaxies are receding from us, and the faster the farther away they are. We will elaborate on the simple consequences of this at the end of this lecture; for now let’s review the other main piece of evidence for modern cosmology.

1.2 Cosmic Microwave Background

Back in the 1960s Arno Penzias and Robert Wilson were given access to a giant horn antenna designed to detect microwaves at Bell Telephone Laboratory. The original purpose of the antenna was to receive radiation from the Echo satellite in an early commercial test of satellite communications. The project was however done by 1964 and Penzias and Wilson were given control of the antenna to study cosmic microwave light. Microwave light has a wavelength from around millimetre to metre size, and Penzias and Wilson started looking at wavelengths around 7 cm. As in any experiment, one of the first steps is to study systematic errors in the apparatus. There was a background noise getting in the way of their experiment which they tried to pin down for a year. They tested for this noise coming from the electronics of the equipment by cooling it, for radiation in the atmosphere by pointing the antenna in different directions, and even evacuated nesting pigeons and cleaned their mess. None of these seemed to account for this static, this noise, yet at least this revealed that the noise was isotropic, i.e. coming with equal intensity from all directions in the sky, and hence likely of extra-galactic origin.

Still they hesitated to put out their results until a fateful phone call. In conversation with a colleague, when asked how his work with the antenna was going, Penzias replied that everything was well except for part of the results, which they did not understand. His colleague then recalled that he had been told by another colleague of a talk given by P. J. Peebles about some cosmic radiation coming from the early stages of the universe’s evolution and by now redshifted into the microwave range. The prediction actually predates the work of Peebles and his colleague Dicke; it was first made by Ralph Alpher and Robert Herman in 1948, although Peebles and Dicke did not know about that earlier work.

Penzias and Wilson got in contact with Peebles and two papers were produced and sent to the Astrophysical Journal, one by Penzias and Wilson presenting the results (but cautious not to speculate on the cause of the noise other than a mention of a companion paper) and another by Peebles, Dicke, Roll, and Wilkinson interpreting the result in terms of the cosmological big bang theory. Today we call this isotropic cosmic noise the Cosmic Microwave Background (CMB), and it is one of the pillars of observational cosmology.

Penzias and Wilson’s horn antenna, like a gigantic hearing aid
(a) Penzias, Wilson and the Horn antenna.
The remarkably precise fit of CMB data to black body radiation
(b) CMB spectrum from COBE
Figure 3: The cosmic microwave background

There is a wealth of data to be found in the CMB; for now what is relevant to underline is the shape of its spectrum. This can be seen in fig. 2(b), where experimental data is superimposed on the prediction. The prediction is that of black body radiation, the same shape that one would get on the inside of a hermetic furnace in equilibrium. The distinctive feature of this type of radiation is that the shape depends on only one parameter, the temperature $T$. In particular the expression for the intensity as a function of radiation frequency reads

$$I(\nu)=\frac{4\pi\hbar\nu^{3}}{e^{2\pi\hbar\nu/k_BT}-1} \tag{1.3}$$

where $k_B$ is Boltzmann’s constant, $k_B=8.617\times10^{-5}$ eV/K with K degrees Kelvin, and Planck’s reduced constant is $\hbar=6.582\times10^{-16}$ eV s. The fit is staggeringly accurate as shown in fig. 2(b); it is in fact the most ideal black body radiation we have ever observed. The associated temperature that one can extract from this plot is

$$T_0=2.725\ \text{Kelvin}\qquad\text{CMB temperature today} \tag{1.4}$$
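To get a feeling for eq. (1.3), the following minimal sketch evaluates the black body shape at $T_0$ and scans for the frequency at which it peaks, which should land in the microwave range discussed above. The constants are the ones quoted in the text; the scan range, the step count and the function names are illustrative assumptions, and the overall normalisation of the intensity plays no role in locating the peak.

```python
import math

K_B = 8.617e-5     # Boltzmann constant in eV/K (value quoted in the text)
HBAR = 6.582e-16   # reduced Planck constant in eV s (value quoted in the text)
T_CMB = 2.725      # CMB temperature today in Kelvin, eq. (1.4)
C_M_S = 2.998e8    # speed of light in m/s, used only to quote a wavelength


def intensity(nu, temperature):
    """Black body shape of eq. (1.3) for a frequency nu in Hz."""
    x = 2.0 * math.pi * HBAR * nu / (K_B * temperature)
    return 4.0 * math.pi * HBAR * nu**3 / math.expm1(x)


def peak_frequency(temperature, nu_min=1e9, nu_max=1e12, steps=20000):
    """Brute-force scan (an illustrative choice) for the maximum of I(nu)."""
    best_nu, best_val = nu_min, 0.0
    for i in range(steps + 1):
        nu = nu_min + (nu_max - nu_min) * i / steps
        val = intensity(nu, temperature)
        if val > best_val:
            best_nu, best_val = nu, val
    return best_nu


nu_peak = peak_frequency(T_CMB)
print(f"I(nu) peaks near {nu_peak / 1e9:.0f} GHz")
print(f"wavelength at that frequency: {C_M_S / nu_peak * 1e3:.1f} mm")
```

Because the shape depends only on $\nu/T$, doubling the temperature passed to `peak_frequency` doubles the peak frequency, which is one way to see that a single parameter fixes the whole curve.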

1.3 The early universe

Let us finally put together these two experimental findings. First from Hubble’s discovery one has that if at cosmic distances everything is moving away from us, things used to be much closer and we can estimate when.

Take any galaxy at a distance d from us, one has, naively extrapolating its velocity to be the same in time, that if it is moving away at a velocity v, it used to be next to us some time t ago

$$t=\frac{d}{v}=\frac{d}{H_0d}=\frac{1}{H_0} \tag{1.5}$$

where in the second equality we have just substituted in Hubble’s law. The distance, the one label that the galaxy we chose had, drops out of the equation; this means that this time $t$ is the same for any galaxy. Roughly all of the known cosmos used to be clumped together a time $1/H_0$ ago! What is this time if we convert (1.2) into years? The current estimate for the age of the universe is 14 thousand million years; how does it compare with the naive estimate?
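The conversion asked for here is pure unit bookkeeping, and can be sketched as follows. The parsec value is the one quoted earlier in the text and $h=0.67$ is the central value of eq. (1.2); the length of the year (a Julian year) and the function name are assumptions made for this illustration.

```python
KM_PER_MPC = 3.0856e13 * 1e6            # 1 pc = 3.0856e13 km, so this is 1 Mpc in km
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # Julian year, an assumed convention


def hubble_time_years(h):
    """Naive age estimate 1/H0 of eq. (1.5), with H0 = 100 h km/s/Mpc."""
    h0_per_second = 100.0 * h / KM_PER_MPC   # km cancels, leaving 1/s
    return 1.0 / h0_per_second / SECONDS_PER_YEAR


t_hubble = hubble_time_years(0.67)
print(f"1/H0 = {t_hubble / 1e9:.1f} thousand million years")
```

Since $1/H_0\propto1/h$, the uncertainty on $h$ translates directly into the same relative uncertainty on this estimate.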

If one takes this conclusion further, when all the matter in the known universe was compressed together the energy per unit volume of the system must have been exceedingly large, just as a gas heats up when compressed. In fact if the universe had been hot enough, atoms would have ‘melted’, the electrons would be stripped away from the nucleus and for higher temperatures nuclei would have dissociated into neutrons and protons. Implicit in this discussion of a hot universe there is of course temperature.

Well, we have already talked about temperature, but in the context of the microwave radiation discovered by Penzias and Wilson. Although this talk of temperature was simply used to characterize the radiation, it proved to be a temperature that we can trace back to this hot and dense initial universe; whereas electrons and nucleons have undergone processing since the early universe to appear as intergalactic gas or galaxies themselves, the light from this early universe has travelled undisturbed to reach us today. Not only that, but the fact that the spectrum follows so closely the shape of a black body tells us that the early universe was in nearly perfect equilibrium.

These are the two basic concepts that thread through most of cosmology and whose balance accounts for many of the events in cosmological history: (i) the universe is expanding; (ii) in its early stage it used to be in thermal equilibrium with temperature $T$.

Problems

I Let’s do some unit conversions; from the next chapter on, we will use natural units $\hbar=c=k_B=1$ in our formulae to reduce the number of letters in them. However we will want to restore powers of these constants to express final quantities in e.g. cm, years or Kelvin. So, inserting powers of $\hbar,c,k_B$, calculate:

$$H_0^{-1}\ \text{in years},\qquad H_0^{-1}\ \text{in parsecs},\qquad T_0\ \text{in eV},\qquad T_0^{3}\ \text{in cm}^{-3} \tag{1.6}$$

II Deduce Doppler’s shift for the relativistic case to find

$$z=\sqrt{\frac{1+v/c}{1-v/c}}-1,\qquad v>0\ \text{for receding emitter} \tag{1.7}$$

[Hint: In the emitter frame the ‘crests’ of the wave are emitted every $1/\nu$ seconds and so a distance apart (= wavelength) $c/\nu$; convert this time interval $1/\nu$ to the observer frame to find the distance between crests she sees]. Looking at the data in fig. 2(a), is the non-relativistic approximation justified? In fig. 2(b), looking at the horizontal axis, is it still?
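As a numerical companion to this problem (not a substitute for the derivation), the sketch below compares the non-relativistic limit of eq. (1.1) with the relativistic expression of eq. (1.7). The three velocities are illustrative choices, not values read off the figures, and the function names are made up for this example.

```python
import math

C_KM_S = 299_792.458  # speed of light in km/s


def z_nonrelativistic(v_km_s):
    """Last equality of eq. (1.1): z ~ v/c."""
    return v_km_s / C_KM_S


def z_relativistic(v_km_s):
    """Eq. (1.7), with v > 0 for a receding emitter."""
    beta = v_km_s / C_KM_S
    return math.sqrt((1.0 + beta) / (1.0 - beta)) - 1.0


for v in (1_000.0, 30_000.0, 150_000.0):  # illustrative velocities in km/s
    z_nr, z_r = z_nonrelativistic(v), z_relativistic(v)
    print(f"v = {v:9.0f} km/s: z_nonrel = {z_nr:.4f}, z_rel = {z_r:.4f}, "
          f"relative difference = {(z_r - z_nr) / z_r:.1%}")
```

The two expressions agree to first order in $v/c$, so the relative difference grows roughly like $v/2c$ for small velocities.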

2 Homogeneous & Isotropic

The Hubble diagram did not break down galaxies by where in the sky they were located; whether the galaxy one is looking at is in the direction of Cassiopeia or in the vicinity of the Southern Cross, they all follow the same law for their velocity vs distance. Furthermore one would roughly find the same number of galaxies looking into a fixed-size solid angle in any part of the sky. This is to say that the universe seems isotropic, the same in all directions, at cosmological scales. It is also homogeneous on these scales.

A theory argument to arrive at a homogeneous universe from an isotropic one is to assume our place in it is not special (a.k.a. the cosmological principle); if a universe looks the same in all directions from any point of reference it must be homogeneous too. One need not rely on theory however; measuring the distances of galaxies from us and their angular coordinates in the sky we can create a map, which is what galaxy surveys do. The size of this ‘census’ is quite formidable; as an example, the Sloan Digital Sky Survey has mapped nearly a million galaxies and figure 1(a) shows a 2d slice of the map that the experiment produced.

This homogeneity and isotropy seem counter-intuitive from the human or even astrophysical perspective; right where we are there seems to be a lot of matter around, whereas if one goes a few tens of thousands of light years out there is comparatively nothing. The answer to this is two-fold. First one should remember what our ruler is in cosmology; consider water: it does seem homogeneous to us humans, yet we know that at a microscopic level it is not a continuum but is made of molecules and the space between them. Similarly in cosmology there are as many galaxies in a box a few dozen megaparsecs on a side around us as in the box next to it. Secondly, what we ‘see’ is far from all there is; as the dynamics will reveal in two lectures’ time, the weight of galaxies is negligible, a mere 0.5% of the total energy density. Even within visible matter, galaxies constitute 10% of the whole, with the majority being intergalactic dust. The use we will find for galaxies is as probes into the cosmological expansion (tracers) with negligible effect (back-reaction) on the dynamics.

Before moving on to codify this homogeneity and isotropy mathematically, it is worth pondering for a minute the fact that the universe is so ‘simple’ at the largest scale when it needed not be so. It is indeed a clue to early-time dynamics, but we do not yet have the tools to explore this. Let us then for now embrace this happy circumstance that will simplify the mathematics for us.

2.1 The metric

The previous lecture gave the evidence for an expanding universe where, we have just learned, no point is special. Therefore the distance between any two given points, let us take them to be two distant galaxies, will grow with time. We describe this effect by introducing the scale factor a(t) as

$$\ell=a(t)\,|\Delta\bar{x}| \tag{2.1}$$

where $\ell$ is the physical distance, $\Delta\bar{x}=\bar{x}_A-\bar{x}_B$ with $\bar{x}_A=(x_A^1,x_A^2,x_A^3)$, $|\Delta\bar{x}|^2=\sum_i(\Delta x^i)^2$, and we have used $a$ to ‘factor out’ time from a space-like variable $\bar{x}$. There is some ambiguity in this splitting of distance into $\bar{x}$ and $a$, as happens when one defines a physical quantity in terms of two constructs, so let us for practical purposes normalize. We take $|\Delta\bar{x}|$ to be the physical distance as measured today ($t_0$)

$$|\Delta\bar{x}|\equiv\ell(t_0),\qquad a(t_0)=1 \tag{2.2}$$

where the zero index denotes present time; for reference $t_0=13.9$ thousand million years. Lastly, to implement the rate of expansion as measured in the Hubble law one has

$$\frac{d\ell(t)}{dt}=\frac{d\big(a(t)|\Delta\bar{x}|\big)}{dt}=\frac{da}{dt}\frac{\ell}{a(t)}=H(t)\,\ell,\qquad H(t)\equiv\frac{1}{a(t)}\frac{da(t)}{dt} \tag{2.3}$$

This gives $H$ in terms of the scale factor and presents us with the Hubble rate $H$ as the rate of growth relative to the current size. In other words the inverse of $H$ gives the time scale it takes for the universe to double in size at the current rate of growth. You can go back to eq. (1.2) for the value of Hubble today and check how much time this would take. The value of $H_0$ together with the convention of eq. (2.2) gives the scale factor and its first derivative today; these are the final (rather than initial in this case) conditions that we will later evolve back in time with the dynamics of general relativity.

The exercise above has informally set a coordinate system, $x^\mu=(t,\bar{x})$, convenient to describe inertial observers caught in the expansion; a 4-dimensional trajectory in $x^\mu$ of one such observer is $(t,0,0,1\,\text{Mpc})$. One need then only specify the ‘label’ $\bar{x}$, which stays constant (and, remember, corresponds to physical distance today), whilst all such observers would agree on the time coordinate, defining a common ‘clock’ ticking at the same rate. This coordinate system can of course be used to describe any other trajectory, and indeed not all galaxies adjust exactly to the Hubble law; Andromeda is moving towards us so we would have $d\bar{x}_{\text{Andromeda}}/dt\neq0$. We call $\bar{x}$ the co-moving coordinate and we say Andromeda has a co-moving velocity. We will nonetheless neglect this type of effect in the following.

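Diagram of a galaxy at constant co-moving coordinate, $(t,\bar{x})=(t_0,1\,\text{Mpc},0)$, $(1.5t_0,1\,\text{Mpc},0)$ and $(2t_0,1\,\text{Mpc},0)$, with its physical distance $a(t)|\bar{x}|$ growing.
Figure 4: In our coordinate system galaxies stay at constant $\bar{x}$ yet distance grows.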

The system above is useful for its concreteness while capturing the experimental evidence, yet it is important to realize it is a conventional choice. One could rescale $a$ and $\bar{x}$ by inverse factors, $a\to Ca$, $\bar{x}\to\bar{x}/C$, and the physical distance would stay the same, or one could prefer a boosted frame or change to a different time variable. These would all be different ways to describe the same system, the same physics on which every observer should agree. Although this statement sounds as vague as it sounds sensible, it does have a mathematical rewriting based on invariants. Invariants are observables whose measurement would yield the same value for every observer. The first instance of this we encounter is related to space-time distance, the 4-dimensional line element (we set the speed of light to 1, $c=1$):

$$ds^2=dt^2-a(t)^2d\bar{x}^2=dt^2-a(t)^2\left(dr^2+r^2\big(d\theta^2+\sin(\theta)^2d\phi^2\big)\right)=g_{\mu\nu}dx^\mu dx^\nu \tag{2.4}$$

with d being a differential, the infinitesimal version of Δ, and gμν is the metric, the link between our coordinate system and physical space-time distances. Already above one has a change of variables; albeit a familiar one, it does serve to illustrate that what we call the metric is different in each frame

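$$g_{\mu\nu}=\mathrm{diag}\big(1,\,-a^2(t),\,-a^2(t),\,-a^2(t)\big),\qquad x^\mu=(t,x^1,x^2,x^3) \tag{2.5}$$
$$g'_{\mu\nu}=\mathrm{diag}\big(1,\,-a(t)^2,\,-a(t)^2r^2,\,-a(t)^2r^2s_\theta^2\big),\qquad (x')^\mu=(t,r,\theta,\phi) \tag{2.6}$$

with $s_\theta\equiv\sin\theta$. The connection with distances as we have discussed them initially can be obtained from the difference of two space-time points, $\Delta x^\mu=x_a^\mu-x_b^\mu=(0,\bar{x}_a-\bar{x}_b)$, as $\ell^2=-g_{\mu\nu}\Delta x^\mu\Delta x^\nu$.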

Curvature
Thus far, for the sake of introducing one concept at a time, we have kept an implicit Euclidean geometry in the 3 space coordinates. This need not be the case, and the differential geometry language of metrics allows for a simple implementation of deviations from the ‘flat’ case.

Indeed the metric gμν could be a function of space and not only time. It cannot be however an arbitrary function of space given the requirement of homogeneity and isotropy; if space is not flat and has some curvature it must be the same everywhere and in every direction. One such geometric object is the sphere (although the one we can visualize is two dimensional), so to incorporate the effects of curvature into a metric let us first contemplate the case of positive curvature or spherical geometry. This is perhaps easiest if one goes to a one-extra-dimension Euclidean space and defines our space as a hyper-surface. This means a metric and constraint as:

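$$(x^4)^2+\bar{x}^2=R_U^2,\qquad d\ell^2=dx_4^2+d\bar{x}^2 \tag{2.7}$$

where $x^4$ is an artificial extra dimension that we promptly dispose of by solving for $x^4$ as $x^4(\bar{x})$ and substituting back in the metric to obtain

$$d\ell^2=\frac{(\bar{x}\cdot d\bar{x})^2}{R_U^2-\bar{x}^2}+d\bar{x}^2=\frac{dr^2}{1-r^2/R_U^2}+r^2\big(d\theta^2+\sin(\theta)^2d\phi^2\big) \tag{2.8}$$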

with r=|x¯|. One has therefore that distances in the θ,ϕ direction are as in flat space, yet the r direction has an r-dependent factor. Despite the appearances the point r=0 is in no way special and has the same curvature as any other.

The same exercise can be repeated for negative curvature with a Minkowskian metric

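$$-(x^4)^2+\bar{x}^2=-R_U^2,\qquad d\ell^2=-dx_4^2+d\bar{x}^2 \tag{2.9}$$

to obtain instead

$$d\ell^2=-\frac{(\bar{x}\cdot d\bar{x})^2}{R_U^2+\bar{x}^2}+d\bar{x}^2=\frac{dr^2}{1+r^2/R_U^2}+r^2\big(d\theta^2+\sin(\theta)^2d\phi^2\big) \tag{2.10}$$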

Although the math is straightforward and a natural extension of the positive curvature case, the non-positive-definite metric of the embedding space makes this case harder to visualize. Standard 2-dimensional surfaces to convey the three possibilities for curvature are sketched in fig. 5.

The three possibilities for curvature, a sphere, a plane and a saddle
Figure 5: Negative, null and positive curvature surfaces.

The previous results can be synthesized in the line elements

$$ds^2=dt^2-a(t)^2\left(\frac{dr^2}{1-Kr^2}+r^2\big(d\theta^2+\sin(\theta)^2d\phi^2\big)\right),\qquad K=\begin{cases}1/R_U^2\\0\\-1/R_U^2\end{cases} \tag{2.11}$$

the three cases are referred to as positive curvature or closed universe K>0, flat universe K=0, and negative curvature or open universe K<0 . In order to extend the co-moving distance concept to the curved case we perform a change of variables from r to χ as

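$$ds^2=dt^2-a(t)^2\left(d\chi^2+S_K^2(\chi)\big(d\theta^2+\sin(\theta)^2d\phi^2\big)\right),\qquad S_K(\chi)=\begin{cases}R_U\sin(\chi/R_U)&K>0\\\chi&K=0\\R_U\sinh(\chi/R_U)&K<0\end{cases} \tag{2.12}$$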

In this way co-moving distance and physical distance keep their linear relation, so $\ell=a\Delta x$ now reads $\ell=a\Delta\chi$. Also note that the $R_U\to\infty$ limit returns a flat geometry, as it should.
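The three branches of $S_K(\chi)$ in eq. (2.12), and the way they merge in the flat limit, can be checked numerically with a minimal sketch like the one below. Parametrising the function by the curvature $K=\pm1/R_U^2$ and the particular values of $\chi$ and $R_U$ are choices made for this illustration.

```python
import math


def s_k(chi, curvature):
    """S_K(chi) of eq. (2.12), with curvature K = +1/R_U^2, 0 or -1/R_U^2."""
    if curvature > 0:
        r_u = 1.0 / math.sqrt(curvature)
        return r_u * math.sin(chi / r_u)
    if curvature < 0:
        r_u = 1.0 / math.sqrt(-curvature)
        return r_u * math.sinh(chi / r_u)
    return chi


chi = 1.0  # co-moving distance in arbitrary units (illustrative value)
for r_u in (2.0, 10.0, 1000.0):
    closed = s_k(chi, +1.0 / r_u**2)
    opened = s_k(chi, -1.0 / r_u**2)
    print(f"R_U = {r_u:7.1f}: closed {closed:.6f}  flat {s_k(chi, 0.0):.6f}  open {opened:.6f}")
```

As $R_U$ grows, the closed and open values approach the flat one from below and from above respectively, with a difference of order $\chi^3/R_U^2$.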

This metric, known as Friedmann-Robertson-Walker, is the most general one adhering to the homogeneity and isotropy assumptions and is described by one function of time and a constant $R_U$ for the curvature of 3-space. Simple as it might seem, it is worth spending some time familiarizing oneself with it, which is what we do next.

2.2 The path of light

How does one turn observations into data in our coordinate system? First note the rather obvious yet unavoidable fact that we are very much stuck in the same place for cosmological observations; even the farthest distances we have sent missions to are negligible in cosmology (the Voyager mission is $10^{-9}$ Mpc from us). We therefore rely on information reaching earth from far away. The information available at present comes overwhelmingly in the form of light, yet we have also ‘seen’ the cosmos through neutrinos and more recently gravitational waves. All these are the fastest means possible of transmission since photons and gravitons travel at the speed of light and neutrinos nearly so. Let us then examine how waves of massless particles, be it light or gravity waves, propagate in our coordinate system.

The statement of the speed of light being the same for all observers translates into the defining equation for the invariant line element of the trajectory of light:

$$\big(t(\sigma),\bar{x}(\sigma)\big)_{\text{light trajectory}}:\qquad ds(\sigma)^2=0 \tag{2.13}$$

Let us for simplicity assume a radial ray of light in the coordinate system of eq. (2.12) to write it explicitly as

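$$ds(\sigma)^2=\left[\left(\frac{dt}{d\sigma}\right)^2-a(t(\sigma))^2\left(\frac{d\chi}{d\sigma}\right)^2\right](d\sigma)^2=0\quad\Rightarrow\quad dt=\pm a(t)\,d\chi \tag{2.14}$$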

With this much input and without solving for σ one can already ask questions; the first one we pose is how far has light emitted some time Δt=t0-t ago travelled to reach us? The answer is

$$\ell(t)=\int d\ell=\int a(t)\,d\chi=\int dt=c\,(t_0-t) \tag{2.15}$$

where we momentarily restored the speed of light in our equation. The result is then that light travels a physical distance equal to the time interval Δt in our frame (times the speed of light). If we ask instead what is the co-moving distance (equal to physical distance today) to the object that emitted that light we find

$$\Delta\chi=\int d\chi=\int_t^{t_0}\frac{dt'}{a(t')} \tag{2.16}$$

which one cannot compute unless we are given the form of a(t).

Diagram showing the drifting away of a galaxy that emitted long ago a signal reaching us today
Figure 6: Time (t) vs distance (d) for the path of a photon from a distant galaxy.

What we can conclude nonetheless, if the universe has been expanding ever since $t$, is that one distance is greater than the other, $\Delta\chi_{\text{light}}>\Delta\ell_{\text{light}}$. This is simple to understand; light takes some time to travel, at the time when light was emitted the universe was smaller, and the emitter is farther away today than it was at time $t$. This is also the case in Minkowski space-time for inertial observers moving away from each other. Nevertheless our case is not Minkowski and it should differ from it. This is relevant to realize because one could naively picture the expansion of the universe as an ‘explosion’ where fragments are moving away from each other but in an otherwise special-relativistic background. If this were true, the distance light would have travelled would simply be the distance of the object from us at the time of emission. This distance at time $t$ one can obtain from the co-moving distance by rescaling with $a(t)/a(t_0)$. It then follows (given a monotonically increasing $a(t)$):

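$$\text{distance at emission}=a(t)\int_t^{t_0}\frac{dt'}{a(t')}$$
$$\text{distance travelled by light}=\int_t^{t_0}dt'$$
$$a(t)\int_t^{t_0}\frac{dt'}{a(t')}<\int_t^{t_0}dt'<a(t_0)\int_t^{t_0}\frac{dt'}{a(t')} \tag{2.17}$$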

with $t<t_0$. That is, from the first inequality, (distance at the time of emission) < (distance travelled). Space itself has expanded in the intervening time $\Delta t$!
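The chain of inequalities in eq. (2.17) can be checked numerically for any monotonically increasing scale factor. The sketch below does so for a hypothetical power law $a(t)=(t/t_0)^{1/2}$, which is not taken from the text and is chosen only for illustration; times are measured in units of $t_0$ with $c=1$, and the integral of eq. (2.16) is estimated with a simple midpoint rule.

```python
T_NOW = 1.0  # present time t0; all times are in units of t0 and c = 1


def scale_factor(t):
    """Hypothetical scale factor a(t) = (t/t0)^(1/2), normalised so a(t0) = 1."""
    return (t / T_NOW) ** 0.5


def comoving_distance(t_emit, steps=100_000):
    """Midpoint-rule estimate of eq. (2.16): integral of dt'/a(t') from t_emit to t0."""
    dt = (T_NOW - t_emit) / steps
    return sum(dt / scale_factor(t_emit + (i + 0.5) * dt) for i in range(steps))


t_emit = 0.25                               # emission time, an illustrative choice
chi = comoving_distance(t_emit)
at_emission = scale_factor(t_emit) * chi    # left-hand side of eq. (2.17)
travelled = T_NOW - t_emit                  # middle term, eq. (2.15)
today = scale_factor(T_NOW) * chi           # right-hand side of eq. (2.17)

print(f"distance at emission : {at_emission:.4f}")
print(f"distance travelled   : {travelled:.4f}")
print(f"distance today       : {today:.4f}")
```

Swapping in a different increasing function for `scale_factor` changes the numbers but should preserve their ordering.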

2.3 Measuring distance

As outlined in the previous lecture there are a number of methods for determining distances in cosmology; our coordinate system allows us to put together all the different measurements into coordinates and compare them. We next outline two methods of distance determination and what co-moving coordinates they correspond to.

Luminosity distance. Some astrophysical objects have a known luminosity (energy emitted per unit time in the form of light), which one can compare with how much energy in radiation reaches the area of our detector per unit time.

The energy emitted will dilute into a shell around the emission point of radius, today, $a(t_0)\chi=\chi$. The area of this shell depends on the geometry; a positively curved geometry will have, for given $\chi$, an area smaller than the flat case $4\pi\chi^2$. Exercises for circles instead of spheres with a pen, an orange and peeling will convince the dubious. Conversely the negatively curved case has a larger area than $4\pi\chi^2$. The area in the curved case can be computed by realizing that $\chi/R_U$ acts as the polar angle in our 3d sphere and a flat slice through our constant $\chi$ surface has radius $R_U\sin(\chi/R_U)$ in the hyper-plane. The three possibilities read:

$$A=4\pi\begin{cases}R_U^2\sin^2(\chi/R_U)\\\chi^2\\R_U^2\sinh^2(\chi/R_U)\end{cases}=4\pi S_K^2 \tag{2.18}$$

In addition to this effect one has that the rate at which one receives photons is dilated with respect to the rate of emission, whereas the energy of an individual photon, proportional to the frequency, also diminishes with the expansion of the universe. This last point we will demonstrate explicitly in the next chapter; intuitively it is a consequence of the stretching of space and the energy of a photon being inversely proportional to the wavelength, $E\propto1/\lambda$. Each of these two effects contributes one factor of $a(t)$. All in all the flux we observe on earth is related to the luminosity $L$ as

$$F=\frac{a(t)^2L}{4\pi S_K^2(\chi)}\equiv\frac{L}{4\pi d_L^2} \tag{2.19}$$

where we have defined the luminosity distance $d_L$ as is conventional. This equation can be inverted to find $\chi$. One has then that the same star seems dimmer in an open universe and brighter in a closed one.

Trigonometry on a sphere to deduce angular distance
Figure 7: Trigonometry on a sphere

Angular distance. Another possibility is looking at objects whose dimensions are known, together with trigonometry. Note that angles do not change in time in a homogeneously and isotropically expanding universe. For simplicity consider a unidimensional rod of length $l$ laid perpendicular to the line of sight at a distance much greater than its size, so that the angle $\theta\ll1$ in radians.

Tracing time back to when the light reaching us was emitted, the object was at a distance a(t)χ (co-moving distance scaled back). The relation to the length l is then given by trigonometry in curved space. For the flat case a simple sketch leads to the relation θa(t)χ=l. The positive curvature case takes a bit more sketching, although the derivation for luminosity distance above offers clues. The co-moving coordinate over R gives the polar angle whose sine gives the distance to the axis at χ=0. This is sketched in fig. 7 and leads to the relation a(t)Rsin(χ/R)θ=l. This generalizes to the expression

l=a(t)SK(χ)θdAθ (2.20)

One has then that the same object at the same distance χ from subtends and angle

θ=la(t){(RUsin(χ/RU))-1χ-1(RUsinh(χ/RU))-1 (2.21)

That means, since RUsin(χ/RU)<χ<RUsinh(χ/RU) objects seem larger (larger θ) in a positive curvature space and smaller in a negative curvature one. To get a sense of why this is the case intuitively you can think of a sphere with the observer in the north pole and the object south of the equator.

If one knows both the luminosity and dimensions of an object, extracting χ from both methods above provides a useful consistency check. Let us collect here both definitions

$$d_L=\sqrt{\frac{L}{4\pi F}}=\frac{S_K(\chi)}{a(t)}\qquad d_A=\frac{l}{\theta}=a(t)S_K(\chi) \tag{2.22}$$

where χ itself can be written as a function of the emission time t as in eq. (2.16)
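
The consistency check mentioned above can be sketched as follows: from a measured flux and a measured angle one gets two independent estimates of $S_K(\chi)$ through eq. (2.22), which should agree. The helper names and the synthetic numbers in the usage lines are assumptions made for illustration only.

```python
import math

def d_L_from_flux(L, F):
    """Luminosity distance from known luminosity L and measured flux F, eq. (2.22)."""
    return math.sqrt(L / (4.0 * math.pi * F))

def d_A_from_angle(l, theta):
    """Angular distance from known size l and measured small angle theta (radians)."""
    return l / theta

def S_K_estimates(L, F, l, theta, a_emit):
    """Two estimates of S_K(chi): a * d_L and d_A / a, which eq. (2.22) says coincide."""
    return a_emit * d_L_from_flux(L, F), d_A_from_angle(l, theta) / a_emit

if __name__ == "__main__":
    # Synthetic "observations" generated from S_K = 3 and a = 0.5 (arbitrary units).
    S_true, a = 3.0, 0.5
    L, l = 2.0, 0.01
    F = a**2 * L / (4.0 * math.pi * S_true**2)
    theta = l / (a * S_true)
    print(S_K_estimates(L, F, l, theta, a))
```

Note that the two definitions imply $d_L=d_A/a(t)^2$ whatever the geometry.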

Problems

I The change of variables r(χ) from equation (2.11) to (2.12) satisfies, by comparing the two equations,

$$d\chi=\frac{dr}{\sqrt{1-Kr^2}} \tag{2.23}$$

Solve this relation with $r(0)=0$ for the three values of $K$. How is the expression you found related to $S_K(\chi)$?

Data on the Voyager mission, 11 billion miles from us
Trigonometry for the two voyager missions and earth

II Given the scale factor $a(t)=(t/t_0)^{2/3}$, compute the physical distance light travels from $t$ to $t_0$ (2.15), the co-moving distance $\Delta\chi$ (2.16) and the distance at emission $a(t)\Delta\chi$. Is relation 2.2 satisfied? You can check by taking $t=t_0/2$ for instance.

III Use the two Voyager missions and the earth for a determination of the universe geometry via trigonometry. Assume the two missions are the same distance χ away from us today, using the data in the figure (careful, this is NASA so billion $=10^9$); the lines of sight of the two form a relative angle θ as seen from earth, whereas on board the angle between earth and the other spacecraft is α. Take a spherical geometry for your computations. Determine α in terms of $\chi,\theta,R_U$ to be:

$$\cos(\alpha)=\frac{\sin(\theta/2)\cos(\chi/R_U)}{\sqrt{1-\sin^2(\theta/2)\sin^2(\chi/R_U)}} \tag{2.24}$$

Expand in $\chi/R_U$ and, assuming we measure angles with $10^{-16}$ degree accuracy, find a bound on $R_U$.

3 Geodesics, Horizons & Redshift

The previous lecture laid out the metric used in cosmology and connected it to observations. In particular the scale factor measuring the growth of the universe has a first derivative related to the Hubble rate as $\frac{1}{a}\frac{da}{dt}=H_0$ today, and we set $a(t_0)=1$ so that the co-moving distance $|\chi|$ is the present-time physical distance. An additional parameter of the metric accounted for a possible curvature of space, $K$, whose current experimental value is compatible with 0 but which we keep general for now.

3.1 Geodesics

Although no reference to gravity has been made so far, the metric incorporates all its effects in the homogeneous and isotropic approximation we are taking here. This will be made explicit in the next lecture when the equations of general relativity show how matter sources the dynamics of the metric. For now this section studies the motion of test particles or probes in this background metric.

Since gravity is the single relevant interaction in the cosmological regime today the metric must suffice to determine dynamics. Let us consider a point particle of mass m travelling in our FRW universe. As we did with light we describe its 4 dimensional path with one parameter σ such that the first derivative is the 4-momentum:

$$x^\mu(\sigma)\qquad \frac{dx^\mu}{d\sigma}=P^\mu=(E,P^i)\qquad g_{\mu\nu}P^\mu P^\nu=m^2 \tag{3.1}$$

The path is then given by solving the equation of motion yet no force of gravity is in sight. This is so because in the geometric language of general relativity the gravitational force is in the connection or first derivative of the metric and the equation of motion is the geodesic equation or path of minimum space-time distance. Since this formulation might be unfamiliar to the reader, here first we take a detour to introduce geodesics as describing dynamics.


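    To ease us into geodesic equations and the connection let us resort momentarily to flat space $x^i$ and the simple equation for inertial motion, i.e. in the absence of any force:

    $$\frac{d^2x^i}{d\tau^2}=0\qquad ds^2=\delta_{ij}dx^idx^j \tag{3.2}$$

    where $\delta_{ij}$ is the Kronecker delta. If one were to describe the system in another set of coordinates $y^i$,

    $$\frac{d}{d\tau}\left(\frac{dy^j}{d\tau}\frac{\partial x^i}{\partial y^j}\right)=0\qquad ds^2=\frac{\partial x^i}{\partial y^j}dy^j\frac{\partial x^i}{\partial y^k}dy^k\equiv g_{jk}dy^jdy^k \tag{3.3}$$

    and rewrite the equations of motion (EoM) in terms of the metric $g_{jk}$, which can be done by multiplying by $\partial x^i/\partial y^l$, summing over $i$, and a little trickery

    $$\begin{aligned}
    0&=\frac{\partial x^i}{\partial y^l}\frac{\partial x^i}{\partial y^j}\frac{d^2y^j}{d\tau^2}+\left[\frac{\partial x^i}{\partial y^l}\frac{\partial^2x^i}{\partial y^j\partial y^k}\right]\frac{dy^j}{d\tau}\frac{dy^k}{d\tau}\\
    &=\frac{\partial x^i}{\partial y^l}\frac{\partial x^i}{\partial y^j}\frac{d^2y^j}{d\tau^2}+\frac12\left[\frac{\partial}{\partial y^j}\left(\frac{\partial x^i}{\partial y^l}\frac{\partial x^i}{\partial y^k}\right)+\frac{\partial}{\partial y^k}\left(\frac{\partial x^i}{\partial y^l}\frac{\partial x^i}{\partial y^j}\right)-\frac{\partial}{\partial y^l}\left(\frac{\partial x^i}{\partial y^j}\frac{\partial x^i}{\partial y^k}\right)\right]\frac{dy^j}{d\tau}\frac{dy^k}{d\tau}\\
    &=g_{lj}\frac{d^2y^j}{d\tau^2}+\frac12\left[\frac{\partial g_{kl}}{\partial y^j}+\frac{\partial g_{lj}}{\partial y^k}-\frac{\partial g_{jk}}{\partial y^l}\right]\frac{dy^j}{d\tau}\frac{dy^k}{d\tau}\\
    &\equiv g_{lm}\left(\frac{d^2y^m}{d\tau^2}+\Gamma^m_{jk}\frac{dy^j}{d\tau}\frac{dy^k}{d\tau}\right)
    \end{aligned} \tag{3.4}$$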

    we obtain a second derivative term as in the original case plus a term quadratic in first derivatives with a coefficient which we call the connection or Christoffel symbols. This fictitious force has appeared in the process of changing to another set of coordinates or observer frame. The case of gravity itself is not unlike this, one can always find a local frame in which there are no forces (the connection vanishes), a local free falling frame. Nevertheless even such an observer would be able to tell the presence of gravity by the non-vanishing second derivative of the metric through tidal forces.

 

The dynamics then will be given by the connection or first derivatives of the metric which in our case reads

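$$ds^2=dt^2-a(t)^2\left(d\chi^2+S_K^2(\chi)\left(d\theta^2+\sin^2(\theta)\,d\phi^2\right)\right)=dt^2-a^2(t)\,\hat g_{ij}dx^idx^j \tag{3.5}$$

where we have defined for convenience the internal space metric $\hat g_{ij}=\mathrm{Diag}(1,S_K^2(\chi),S_K^2(\chi)s_\theta^2)$, with $s_\theta\equiv\sin\theta$. The definition of the connection one derives from eq. (3.4) is

$$\Gamma^\mu_{\nu\rho}=\frac{g^{\mu\kappa}}{2}\left(\frac{\partial g_{\rho\kappa}}{\partial x^\nu}+\frac{\partial g_{\nu\kappa}}{\partial x^\rho}-\frac{\partial g_{\nu\rho}}{\partial x^\kappa}\right) \tag{3.6}$$

where the upper-index metric is the inverse, $g^{\mu\kappa}g_{\kappa\nu}=\delta^\mu_\nu$.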
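
To see the definition (3.6) at work, the following sketch evaluates it by central finite differences for the spatially flat metric $ds^2=dt^2-a(t)^2d\vec x^2$. The power law $a(t)=t^{2/3}$ (in units with $t_0=1$), the step size and all function names are assumptions chosen for illustration; the code relies on the metric being diagonal.

```python
def metric(x, a=lambda t: t ** (2.0 / 3.0)):
    """Diagonal entries of ds^2 = dt^2 - a(t)^2 dx^2 at the point x = (t, x1, x2, x3)."""
    s = a(x[0]) ** 2
    return [1.0, -s, -s, -s]

def d_metric(kappa, index, x, h=1e-5):
    """Central finite difference of the diagonal entry g_{index index} along x^kappa."""
    xp, xm = list(x), list(x)
    xp[kappa] += h
    xm[kappa] -= h
    return (metric(xp)[index] - metric(xm)[index]) / (2.0 * h)

def christoffel(mu, nu, rho, x):
    """Gamma^mu_{nu rho} from eq. (3.6), specialised to a diagonal metric."""
    g = metric(x)

    def dg(k, i, j):
        # derivative along x^k of g_{ij}; off-diagonal entries vanish identically
        return d_metric(k, i, x) if i == j else 0.0

    # For a diagonal metric the inverse is 1/g_{mu mu}, so only kappa = mu contributes.
    return 0.5 / g[mu] * (dg(nu, rho, mu) + dg(rho, nu, mu) - dg(mu, nu, rho))

if __name__ == "__main__":
    x = [1.0, 0.0, 0.0, 0.0]
    print(christoffel(0, 1, 1, x))  # compare with a * da/dt at t = 1
    print(christoffel(1, 0, 1, x))  # compare with (da/dt) / a at t = 1
```

At $t=1$ both printed components should be close to $2/3$, the values of $a\,da/dt$ and $(da/dt)/a$ for this power law.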

A difference with the derivation above is that time itself is a coordinate and the geodesic equation has 4 components while the first derivative (momentum) is constrained to be on shell as in eq. (3.1). The system to be solved is then

$$\left[\frac{dx^\nu}{d\sigma}\nabla_\nu\right]\frac{dx^\mu}{d\sigma}=\frac{dx^\nu}{d\sigma}\left(\frac{\partial}{\partial x^\nu}\frac{dx^\mu}{d\sigma}+\Gamma^\mu_{\nu\alpha}\frac{dx^\alpha}{d\sigma}\right)=\frac{d^2x^\mu}{d\sigma^2}+\Gamma^\mu_{\nu\alpha}\frac{dx^\nu}{d\sigma}\frac{dx^\alpha}{d\sigma}=0 \tag{3.7}$$
$$g_{\mu\nu}\frac{dx^\mu}{d\sigma}\frac{dx^\nu}{d\sigma}=m^2 \tag{3.8}$$

where we have introduced the covariant derivative $\nabla_\nu$ and have used the chain rule $(dx^\nu/d\sigma)\,\partial/\partial x^\nu=d/d\sigma$. The covariant derivative is more than a compact way of writing the geodesic equation: it is the derivative which yields covariantly transforming objects (i.e. tensors), which one in turn needs to build the theory of gravity as the local theory of coordinate transformations. The similarity with gauge theories (as is the case of the Standard Model), which also present a covariant derivative, is in fact indicative of a more profound connection between the two, yet this falls outside of our scope.

Relevant for our discussion is the fact that conservation laws which one may be used to with ordinary differentiation in the presence of gravity must be promoted, for example:

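$$\text{Special Relativity}\qquad\qquad\text{General Relativity} \tag{3.9}$$
$$\partial_\mu J^\mu=0\qquad\qquad \nabla_\mu J^\mu=\partial_\mu J^\mu+\Gamma^\mu_{\mu\nu}J^\nu=0 \tag{3.10}$$
$$\partial_\mu T^\mu{}_\nu=0\qquad\qquad \nabla_\mu T^\mu{}_\nu=\partial_\mu T^\mu{}_\nu+\Gamma^\mu_{\mu\alpha}T^\alpha{}_\nu-\Gamma^\alpha_{\mu\nu}T^\mu{}_\alpha=0 \tag{3.11}$$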

where we see that the covariant derivative acts on every index and with a minus relative sign on lower indices:

$$\nabla_\mu A_\nu=\partial_\mu A_\nu-\Gamma^\alpha_{\mu\nu}A_\alpha. \tag{3.12}$$

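Back to cosmology: with our FRW metric, differentiation and a little patience yield a connection which can be summarised as

$$\Gamma^0_{ij}=a\,\partial_ta\,\hat g_{ij}\qquad \Gamma^i_{0j}=\Gamma^i_{j0}=\frac1a\,\partial_ta\,\delta^i_j\qquad \Gamma^i_{jj}=-\frac12\hat g^{ii}\frac{\partial\hat g_{jj}}{\partial x^i}\qquad \Gamma^i_{ij}=\frac12\hat g^{ii}\frac{\partial\hat g_{ii}}{\partial x^j} \tag{3.13}$$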

with every other element being 0. One can be more explicit and give the space-like components

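$$\Gamma^\chi_{\theta\theta}=-S_K\frac{\partial S_K}{\partial\chi}\qquad \Gamma^\chi_{\phi\phi}=-S_K\frac{\partial S_K}{\partial\chi}s_\theta^2\qquad \Gamma^\theta_{\phi\phi}=-s_\theta c_\theta \tag{3.14}$$
$$\Gamma^\theta_{\theta\chi}=\Gamma^\theta_{\chi\theta}=\frac{1}{S_K}\frac{\partial S_K}{\partial\chi}\qquad \Gamma^\phi_{\phi\chi}=\Gamma^\phi_{\chi\phi}=\frac{1}{S_K}\frac{\partial S_K}{\partial\chi}\qquad \Gamma^\phi_{\phi\theta}=\Gamma^\phi_{\theta\phi}=\frac{c_\theta}{s_\theta} \tag{3.15}$$

where we note that the partial derivatives are actually total derivatives $d/dt$, $d/d\chi$, given that the functions they act on depend only on one variable. Finally we can write the differential equation for our trajectory in (3.1),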

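$$\frac{d^2x^\mu}{d\sigma^2}+\Gamma^\mu_{\nu\rho}\frac{dx^\nu}{d\sigma}\frac{dx^\rho}{d\sigma}=\frac{d}{d\sigma}P^\mu+\Gamma^\mu_{\nu\rho}P^\nu P^\rho=0 \tag{3.16}$$

explicitly by substituting in Γ. Let's do it for the 0th component; recalling the notation $P^\mu=(E,P^i)$ one finds

$$\frac{dE}{d\sigma}+2\Gamma^0_{0i}EP^i+\Gamma^0_{ij}P^iP^j=\frac{dt}{d\sigma}\frac{dE}{dt}+a\frac{da}{dt}\hat g_{ij}P^iP^j=0 \tag{3.17}$$

which can be turned into an equation for the energy alone by using the on-shell condition and the definition of momentum as

$$\frac{dx^0}{d\sigma}=\frac{dt}{d\sigma}=E\qquad E^2-a^2\hat g_{ij}P^iP^j=m^2 \tag{3.18}$$

and substitution back in eq. (3.17)

$$E\frac{d}{dt}E+\frac1a\frac{da}{dt}\left(E^2-m^2\right)=0 \tag{3.19}$$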

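The ultra-relativistic ($m\to0$) and non-relativistic ($E\simeq m$) limits of this differential equation are simple to solve

$$\text{Ultra Rel.}\qquad \frac1E\frac{dE}{dt}+\frac1a\frac{da}{dt}=0\qquad E=\frac{a(t_i)}{a(t)}E(t_i) \tag{3.20}$$
$$\text{Non Rel.}\qquad \frac{dE}{dt}=0\qquad E=m \tag{3.21}$$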

Hence we have that the energy of massless particles (e.g. light) scales inversely with the expansion of the universe while the energy of slowly moving massive particles stays the same. This result one might derive with considerably less effort just by noting that the energy of a photon is:

$$E_{\rm photon}=2\pi\hbar\nu=\frac{2\pi\hbar c}{\lambda} \tag{3.22}$$

with ν and λ the frequency and wavelength respectively. If, as stated in the previous lecture, space itself is expanding, wavelengths expand with it as well and this greater wavelength means smaller energy. On the other hand, for a point particle with mass $m$ the leading contribution to the energy is the mass itself, which is independent of $a$ and $t$.
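
The two limits can also be checked numerically by integrating eq. (3.19) directly. The sketch below uses a fixed-step fourth-order Runge-Kutta scheme and, purely as an assumption for illustration, the power law $a(t)=t^{2/3}$ in units with $t_0=1$; the function names are hypothetical.

```python
def hubble(t, p=2.0 / 3.0):
    """(1/a) da/dt for the illustrative power law a(t) = t**p."""
    return p / t

def dE_dt(t, E, m):
    """Eq. (3.19) solved for dE/dt."""
    return -hubble(t) * (E * E - m * m) / E

def evolve_energy(E_i, m, t_i, t_f, steps=10000):
    """Integrate eq. (3.19) from t_i to t_f with a fixed-step RK4 scheme."""
    h = (t_f - t_i) / steps
    t, E = t_i, E_i
    for _ in range(steps):
        k1 = dE_dt(t, E, m)
        k2 = dE_dt(t + h / 2, E + h * k1 / 2, m)
        k3 = dE_dt(t + h / 2, E + h * k2 / 2, m)
        k4 = dE_dt(t + h, E + h * k3, m)
        E += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return E

if __name__ == "__main__":
    t_i, t_f = 0.1, 1.0
    # massless particle: E * a should stay constant, eq. (3.20)
    print(evolve_energy(1.0, 0.0, t_i, t_f), (t_i / t_f) ** (2.0 / 3.0))
    # slow massive particle: E should stay close to m, eq. (3.21)
    print(evolve_energy(1.0001, 1.0, t_i, t_f))
```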

3.2 Redshift & Horizons

The fact that the evolution of the energy in light tracks the (inverse of the) expansion of the universe allows for a simple conversion of experimental observations into dating. The first lecture outlined how the shift in the frequency of light described as the Doppler effect can be used to determine relative velocity. We can now relate this shift to the scale factor recalling the definition in eq. (1.1) which now reads:

$$z=\frac{\nu(t)}{\nu(t_0)}-1=\frac{a(t_0)}{a(t)}-1\qquad \frac{a(t)}{a(t_0)}=\frac{1}{1+z} \tag{3.23}$$

Some of the furthest away objects we have seen are quasars at $z=6$, which bring information from back when the universe was a seventh of its current size. We can even see further back to higher redshifts, only then the universe did not look anything like now; for one, galaxies had not formed!
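
Eq. (3.23) translates directly into a pair of one-line conversions; the function names below are illustrative only.

```python
def redshift(nu_emitted, nu_observed):
    """z from the emitted and observed frequencies, eq. (3.23)."""
    return nu_emitted / nu_observed - 1.0

def scale_factor_at_emission(z, a_today=1.0):
    """a(t) at emission for light observed today with redshift z, eq. (3.23)."""
    return a_today / (1.0 + z)

if __name__ == "__main__":
    # the quasar example in the text: z = 6 gives a = 1/7
    print(scale_factor_at_emission(6.0))
```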

This brings us to the daring question: how far would we have to look to see the ‘birth’ of the known universe, if such a thing were possible? Recall that we obtained in the first lecture an estimate of the age of the universe by approximating when all the galaxies and everything else we see was ‘in the same place’ which is what we mean by birth. This corresponds to a=0 and one would expect this limit to be singular or ill defined yet the question we posed has a finite answer, in comoving distance it is:

$$\eta(t_0)=\int_0^{t_0}\frac{dt}{a(t)} \tag{3.24}$$

where even though $a(t)\to0$ it does so in general mildly enough that the integral converges. This distance $\eta(t_0)$ is called the co-moving horizon today since we believe we have no way of knowing about anything beyond this radius.

However our initial scepticism about this $t\to0$ limit is justified: it is a convenient abstraction, yet we should not take it literally; the physical laws that we know can only be taken back in time so far before the conditions of the universe are such that they break down. In the integral above, however, this regime makes a tiny contribution to the total, which is dominated by late times, and for all practical purposes we can take the limit.

One can make the answer to this question time dependent to introduce what is known as conformal time:

$$\eta(t)=\int_0^{t}\frac{dt'}{a(t')} \tag{3.25}$$

This gives the horizon at time $t$: to reiterate, the maximal co-moving distance that information about an event at $t=0$ can travel by time $t$. At the same time it doubles as an alternative time variable replacing $t$. The use of changing variables once more to this time η is that causality is especially transparent to discuss in terms of η.
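
A minimal numerical sketch of eq. (3.25) follows. It assumes a scale factor that vanishes as a power of $t$, and uses the substitution $t'=t\,u^3$ (a choice made here, not taken from the notes) to soften the integrable singularity at $t'=0$ before applying the midpoint rule.

```python
def conformal_time(t, a, n=2000):
    """eta(t) = int_0^t dt'/a(t'), eq. (3.25), via the midpoint rule in u with t' = t u^3."""
    total = 0.0
    for i in range(n):
        u = (i + 0.5) / n  # midpoints never touch the singular endpoint u = 0
        total += 3.0 * t * u * u / a(t * u**3)
    return total / n

if __name__ == "__main__":
    # illustrative power law a(t) = (t/t0)^(2/3) with t0 = 1
    a = lambda t: t ** (2.0 / 3.0)
    print(conformal_time(1.0, a))   # co-moving horizon today in units of t0
    print(conformal_time(0.5, a))   # horizon at half the present age
```

For this particular power law the integral can be done by hand, $\eta(t)=3\,t_0^{2/3}t^{1/3}$, which gives a check on the numerics.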
Causality

Consider two events occurring at times $t_1$ and $t_2$ ($>t_1$) at two different co-moving coordinate points $\chi_1,\chi_2$; let's say for concreteness event one is a supernova explosion in a distant galaxy and event two is the decision of a scientific committee on whether or not to build an underground neutrino detector here on earth. The fastest that information from the supernova can travel is the speed of light and so by $t_2$ this information has covered a co-moving distance:

$$\text{Co-moving horizon of event 1 by time } t_2\ \equiv\int_{t_1}^{t_2}\frac{dt}{a(t)}=\eta_2-\eta_1 \tag{3.26}$$

Assume that the two points are far enough apart that $\eta_2-\eta_1<|\chi_2-\chi_1|$, that is, the horizon of event 1 by the time of event 2 is smaller than the distance between them. In this case information about the supernova could not have reached the committee at the time of the decision; we say the two events are causally disconnected: they couldn't have influenced one another in any way. The light of the supernova will reach the planet later and neutrinos a bit later still; if the detector was built these would provide useful data, but the committee couldn't possibly have known at the time! As history has it, a number of government agencies around the world took the right decision, even if they did not have supernovas in mind, see SN 1987A.

As a last discussion on causality which will be of use later, let’s consider the possible relation between events A,B in opposite directions in the sky which occurred at time t and whose light is just reaching us today on earth. The fact that the light from these events is just reaching us now, if it has travelled unperturbed, means that comoving distance and comoving horizon coincide:

$$|\chi_A-\chi_{\rm us}|=|\chi_B-\chi_{\rm us}|=\eta(t_0)-\eta(t) \tag{3.27}$$

(in practice we can pick simultaneous events by selecting equal redshifts). Say now that the two events look strangely similar and so we ask, could they have had a common cause? Posed mathematically, the question is: can they be fitted on the same light cone? Fig. 8 tries to depict the possible situations; shifting about the two simultaneous events in the presence of a vertex will convince you that the earliest the two positions could have obtained the same information is if it was emitted from the midpoint at $t=0$. By the time $t$ when the events took place the possible original signal would have covered a diameter of $2(\eta(t)-\eta(0))=2\eta(t)$, which we can compare with their distance derived from eq. (3.27):

$$2(\eta(t_0)-\eta(t))\ \ \text{vs}\ \ 2\eta(t)\qquad\begin{cases}\eta(t)\geq\eta(t_0)/2 & \text{Causally connected}\\ \eta(t)<\eta(t_0)/2 & \text{Causally disconnected}\end{cases} \tag{3.28}$$

The rather simple answer we obtain is that events that we see at opposite ends of the sky cannot be causally connected if they occurred in the first half of the conformal lifetime of the universe. On the other hand, when we say causally connected we mean they could have been, but they need not be.
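
The criterion of eq. (3.28) is simple enough to write as a one-function sketch; the function name and the sample conformal times in the usage lines are illustrative.

```python
def could_share_a_cause(eta_t, eta_0):
    """Criterion of eq. (3.28) for two simultaneous events seen in opposite directions.

    eta_t: conformal time of the events; eta_0: conformal time today.
    """
    separation = 2.0 * (eta_0 - eta_t)  # co-moving distance between the events, eq. (3.27)
    reach = 2.0 * eta_t                 # diameter covered by a signal emitted at t = 0
    return reach >= separation

if __name__ == "__main__":
    print(could_share_a_cause(0.6, 1.0))  # second half of conformal history
    print(could_share_a_cause(0.4, 1.0))  # first half of conformal history
```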

Light-cones to discuss causality
Figure 8: Events in the past causally connected LHS and disconnected RHS

Problems

I Show, given the action of the covariant derivative on lowered indices, $\nabla_\mu A_\nu=\partial_\mu A_\nu-\Gamma^\alpha_{\mu\nu}A_\alpha$, that the derivative acting on the invariant object $J_\mu J^\mu$ satisfies

$$\nabla_\rho(J_\mu J^\mu)=(\nabla_\rho J_\mu)J^\mu+J_\mu(\nabla_\rho J^\mu)=\partial_\rho(J_\mu J^\mu) \tag{3.29}$$

II Compute the connection elements $\Gamma^0_{ij}$, $\Gamma^i_{0j}$ for the space-curvature-less ($K=0$) metric:

$$ds^2=dt^2-a(t)^2\left((dx^1)^2+(dx^2)^2+(dx^3)^2\right) \tag{3.30}$$

Check that any other element is either related by $\Gamma^\mu_{\nu\rho}=\Gamma^\mu_{\rho\nu}$ or zero.

III Take a universe with $a=(t/t_0)^p$ with $p<1$ and compute explicitly conformal time and redshift, $\eta(t)$, $z(t)$. What time $t_{1/2}$ and redshift correspond to $\eta(t_{1/2})=\eta(t_0)/2$?

IV Turn eq. (3.19) into a differential equation for $dE/da$ and solve for $E(a)$ with initial conditions $E(a_i)=E_i$. If initially the energy was much above the mass, $E_i\gg m$, what value of $a$ approximately separates the regions where $E\simeq E_ia_i/a$ and $E\simeq m$?

V Derive the geodesic equation from the variational principle applied to:

$$\int d\sigma\,g_{\mu\nu}(x(\sigma))\frac{dx^\mu}{d\sigma}\frac{dx^\nu}{d\sigma}$$

4 Einstein’s equations

Thus far we have taken the metric as a given, a background where we studied the evolution of light and matter, the position of our horizons and the galaxies. The metric is however dynamic and it is affected by what and how much matter is there and perturbations on the metric propagate as light does to first order.

In loose terms the dynamics of the metric will be given by an equation of motion which will contain second order derivatives, just as Maxwell's equations contain second order derivatives of the gauge potential $A_\mu$. These equations of motion will not be arbitrary but admit a covariant form, like the geodesic equation. This means that the equations of motion themselves are in tensor form and obey simple transformation rules which guarantee they look formally the same to all observers.

4.1 Curvature

The covariant second-order derivative that can be built out of the metric is the Riemann tensor defined through the commutator of covariant derivatives:

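$$(\nabla_\mu\nabla_\nu-\nabla_\nu\nabla_\mu)V^\rho\equiv R^\rho{}_{\alpha\mu\nu}V^\alpha \tag{4.1}$$

which reads, in terms of the connection

$$R^\alpha{}_{\beta\mu\nu}=\partial_\mu\Gamma^\alpha_{\beta\nu}-\partial_\nu\Gamma^\alpha_{\beta\mu}+\Gamma^\alpha_{\mu\rho}\Gamma^\rho_{\beta\nu}-\Gamma^\alpha_{\nu\rho}\Gamma^\rho_{\beta\mu} \tag{4.2}$$

The Riemann tensor is antisymmetric under exchange of the last two indices by definition, but also under exchange of the first two with lowered indices, $R_{\alpha\beta\mu\nu}=-R_{\beta\alpha\mu\nu}$, symmetric under exchange of the two pairs, $R_{\alpha\beta\mu\nu}=R_{\mu\nu\alpha\beta}$, and has a vanishing cyclic combination of the last three indices, $R_{\alpha\beta\mu\nu}+R_{\alpha\nu\beta\mu}+R_{\alpha\mu\nu\beta}=0$, which means that, all in all, the number of independent components in four dimensions is 20.

Contracting two of the indices we obtain a two-index tensor which is called the Ricci tensor:

$$R_{\mu\nu}\equiv R^\alpha{}_{\mu\alpha\nu}=\partial_\alpha\Gamma^\alpha_{\mu\nu}-\partial_\nu\Gamma^\alpha_{\alpha\mu}+\Gamma^\alpha_{\alpha\rho}\Gamma^\rho_{\mu\nu}-\Gamma^\alpha_{\nu\rho}\Gamma^\rho_{\alpha\mu} \tag{4.3}$$

which is symmetric, $R_{\mu\nu}=R_{\nu\mu}$. Finally the Ricci tensor can be reduced to a scalar by contracting the two remaining indices

$$R=R^\mu{}_\mu \tag{4.4}$$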

which is an invariant under coordinate transformations.

For the second ingredient in the dynamics, the fact that matter sources gravity would mean equating a combination of the tensors above to a tensor describing matter. Rather than trying out different possible tensor structures, there is a more direct and constraining route in the use of a variational principle. All the types of fundamental dynamics known to us can be described by a variational principle, which states that the classical solution to the EoM is that which minimizes the action functional $S$. The action functional, for a common description of physics for all observers, is an invariant built out of the metric and matter fields. This still does not say what this functional form is; however, if we assume that the EoM are linear in the curvature there is only one answer:

$$S[g_{\mu\nu}]=\int d^4x\sqrt{|g|}\left(-\frac{R}{16\pi G_N}-\frac{\Lambda}{8\pi G_N}+\mathcal{L}_m\right) \tag{4.5}$$

where $G_N$ is Newton's gravitational constant (in $\hbar=c=1$ units, $G_N=6.708\times10^{-39}\,\mathrm{GeV}^{-2}$), in the matter Lagrangian $\mathcal{L}_m$ we have hidden the matter fields, we included a cosmological constant term in Λ, and $d^4x\sqrt{|g|}$ is the covariant measure (you can think of the square root as the Jacobian).

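As customary, the EoM for $g_{\mu\nu}$ is now obtained by taking an infinitesimal variation (footnote 3: unfortunately, the opposite convention for the metric $(-,+,+,+)$ means that in Einstein's equations $\Lambda\to-\Lambda$):

$$\frac{\delta}{\delta g_{\mu\nu}}S[g]=0\qquad R^{\mu\nu}-\frac12g^{\mu\nu}R-\Lambda g^{\mu\nu}=8\pi G_NT^{\mu\nu} \tag{4.6}$$

where we have defined the stress energy tensor as

$$T^{\mu\nu}\equiv-\frac{2}{\sqrt{|g|}}\frac{\delta}{\delta g_{\mu\nu}}\left(\sqrt{|g|}\,\mathcal{L}_m\right). \tag{4.7}$$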

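These are the celebrated Einstein field equations relating the curvature of space-time to matter. The curvature dependent piece of the equation is called the Einstein tensor $G_{\mu\nu}$, which one can compute explicitly with eqs. (4.2-4.4) for our metric

$$ds^2=dt^2-a(t)^2\left(d\chi^2+S_K^2(\chi)\left(d\theta^2+\sin^2(\theta)\,d\phi^2\right)\right) \tag{4.8}$$

with the connection we computed in eqs. (3.13-3.15). Again calculus and patience yield:

$$R_{\mu\nu}=\begin{pmatrix}-3\dfrac{\ddot a}{a}&0\\[4pt]0&\left(a\ddot a+2\dot a^2+2K\right)\hat g_{ij}\end{pmatrix} \tag{4.9}$$

so that

$$R_{\mu\nu}-\frac12Rg_{\mu\nu}=\begin{pmatrix}3\dfrac{K+\dot a^2}{a^2}&0\\[4pt]0&-\left(2\ddot aa+\dot a^2+K\right)\hat g_{ij}\end{pmatrix} \tag{4.10}$$

where we introduced the notation $\dot a=da/dt$.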
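
The step from eq. (4.9) to eq. (4.10) is pure algebra and can be spot-checked numerically. The sketch below takes arbitrary sample values of $a,\dot a,\ddot a,K$ (an assumption for illustration), builds $R$ using $g_{00}=1$, $g_{ij}=-a^2\hat g_{ij}$, and returns the two independent entries of the Einstein tensor.

```python
def ricci_components(a, adot, addot, K):
    """R_00 and the coefficient of g-hat_ij in R_ij, as given in eq. (4.9)."""
    return -3.0 * addot / a, a * addot + 2.0 * adot**2 + 2.0 * K

def ricci_scalar(a, adot, addot, K):
    """R = g^{00} R_00 + g^{ij} R_ij with g^{00} = 1 and g^{ij} = -g-hat^{ij} / a^2."""
    R00, Rs = ricci_components(a, adot, addot, K)
    return R00 - 3.0 * Rs / a**2

def einstein_components(a, adot, addot, K):
    """G_00 and the coefficient of g-hat_ij in G_ij, from R_mu_nu - R g_mu_nu / 2."""
    R00, Rs = ricci_components(a, adot, addot, K)
    R = ricci_scalar(a, adot, addot, K)
    G00 = R00 - 0.5 * R        # g_00 = 1
    Gs = Rs + 0.5 * R * a**2   # g_ij = -a^2 g-hat_ij
    return G00, Gs

if __name__ == "__main__":
    a, adot, addot, K = 2.0, 0.3, -0.1, 0.5
    print(einstein_components(a, adot, addot, K))
    # entries of eq. (4.10) for comparison
    print(3.0 * (K + adot**2) / a**2, -(2.0 * addot * a + adot**2 + K))
```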

4.2 Matter

The stress energy tensor as defined in eq. (4.7) is not very useful to us given that we do not know a priori what type of matter there is. To determine the form of the stress energy tensor for different types of matter, the QFT-minded physicist would still begin from (4.7) and evaluate it, say, on a state with a particle of mass $m$. This can certainly be done and the problem section outlines how, but it is a rather cumbersome road to extract results for the present case.

One can instead start from the implications of homogeneity and isotropy to determine the form of Tμν. The homogeneous and isotropic nature of the metric means the curvature is also so, but Einstein field equations relate the curvature to the stress-energy tensor so for consistency it must have the same form

$$T^\mu{}_\nu=\begin{pmatrix}\rho&0\\0&-\mathcal{P}\hat g_{ij}\end{pmatrix} \tag{4.11}$$

where 𝒫 is the pressure and ρ the energy density.

The stress energy tensor's pressure and energy density are related in the limiting cases of massless and non-relativistic matter. To derive this relation let us consider an isotropic distribution of particles with fixed energy $E$ and abundance $n=(\text{number of particles})/(\text{Volume})$. Then the energy density is simply

$$\rho_{\rm monochr.}=E\times n \tag{4.12}$$

What about the pressure? It is the force per unit area, so consider a tiny sphere of radius $r$; the number of particles inside is $n\times4\pi r^3/3$, each with momentum $p$. After some time they would all have left the sphere, having carried out a total momentum of $dp=p\times n\times4\pi r^3/3$. This time would be when the last particles from the centre have escaped, $dt\,v=r$. Let's put it all together then in the definition of pressure:

$$\mathcal{P}=(\text{Force})\times(\text{Area})^{-1}=\frac{dp}{dt}\frac{1}{4\pi r^2}=\frac{pnv}{3}=\frac{p^2}{3E_p}n \tag{4.13}$$

Particles in the universe nonetheless do not come in one single energy; assume we are given the distribution (we will later compute it for the equilibrium case) of number of particles per volume per three-momentum f. Our approximation of isotropy means this distribution can only depend on the modulus |p| or alternatively energy of the particle so the total number, energy density and pressure read:

$$n=\int f(E)\frac{d^3p}{(2\pi)^3}\qquad \rho=\int E\,f(E)\frac{d^3p}{(2\pi)^3}\qquad \mathcal{P}=\int\frac{p^2}{3E}f(E)\frac{d^3p}{(2\pi)^3} \tag{4.14}$$

where we note that these magnitudes read like averages, and so, as in statistical mechanics, the square of the average of $E$ differs from the average of $E^2$, although for order-of-magnitude approximations this is an error we can live with.

It is now easy to take the approximation of massless and non-relativistic matter to find:

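$$T_r^{\mu\nu}=\begin{pmatrix}\rho&\\&\dfrac{1}{3a^2}\rho\end{pmatrix}\qquad T_m^{\mu\nu}=\begin{pmatrix}\rho&\\&0\end{pmatrix} \tag{4.15}$$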

where with an abuse of notation we will refer to non-relativistic matter as matter with an m subindex and to ultra-relativistic matter as radiation with an r subindex.
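
The two limits can be checked by evaluating the integrals of eq. (4.14) numerically. In the sketch below the distribution $f(E)=e^{-E/T}$ is an assumption made only for illustration (the equilibrium distributions are computed later in the notes), as are the momentum cut-off and the function name.

```python
import math

def energy_density_and_pressure(m, T, steps=20000):
    """rho and P of eq. (4.14) for the illustrative distribution f(E) = exp(-E/T).

    Isotropy gives d^3p = 4 pi p^2 dp; the p integral uses the midpoint rule up to 40 T.
    """
    p_max = 40.0 * T
    dp = p_max / steps
    rho = 0.0
    P = 0.0
    for i in range(steps):
        p = (i + 0.5) * dp
        E = math.sqrt(p * p + m * m)
        weight = math.exp(-E / T) * 4.0 * math.pi * p * p * dp / (2.0 * math.pi) ** 3
        rho += E * weight
        P += p * p / (3.0 * E) * weight
    return rho, P

if __name__ == "__main__":
    for m in (0.0, 20.0):
        rho, P = energy_density_and_pressure(m, T=1.0)
        print("m/T =", m, " P/rho =", P / rho)
```

For $m=0$ the ratio is $1/3$, while for $m\gg T$ it becomes small, of order $T/m$, matching the radiation and matter entries of eq. (4.15).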

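In addition any energy momentum tensor satisfies a conservation law, in parallel with the current conservation of electromagnetism $\partial_\mu J^\mu=0$, only now the derivative is covariant:

$$\nabla_\mu T^{\mu\nu}=\partial_\mu T^{\mu\nu}+\Gamma^\mu_{\mu\alpha}T^{\alpha\nu}+\Gamma^\nu_{\mu\alpha}T^{\mu\alpha}=0 \tag{4.16}$$

Our tensor then is subject to the constraint, selecting the 0th component and according to eq. (3.13):

$$\nabla_\mu T^{\mu0}=\partial_tT^{00}+\Gamma^i_{i0}T^{00}+\Gamma^0_{ij}T^{ij}=\frac{\partial\rho}{\partial t}+3\frac{\dot a}{a}\rho+3\frac{\dot a}{a}\mathcal{P}=0 \tag{4.17}$$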

which tells us that if we know the time dependence of ρ the time dependence of pressure is given and there’s one less differential equation to solve.

In the cases of radiation and matter it becomes an even stronger constraint since it is a closed equation for the time evolution of the energy density:

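$$\frac{\partial\rho}{\partial t}+\frac{4}{a}\frac{\partial a}{\partial t}\rho=0\qquad \rho_{\rm rad}=\frac{a^4(t_i)}{a^4(t)}\rho_{\rm rad}(t_i) \tag{4.18}$$
$$\frac{\partial\rho}{\partial t}+\frac{3}{a}\frac{\partial a}{\partial t}\rho=0\qquad \rho_{\rm matt}=\frac{a^3(t_i)}{a^3(t)}\rho_{\rm matt}(t_i) \tag{4.19}$$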
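
As a quick check of eqs. (4.18) and (4.19), the continuity equation (4.17) can be integrated numerically once the pressure is tied to the density. The shorthand $w=\mathcal{P}/\rho$ ($w=1/3$ for radiation, $w=0$ for matter) is introduced here for convenience and is not notation used in the notes; the integrator is a simple second-order Runge-Kutta in $\ln a$.

```python
import math

def evolve_density(rho_i, a_i, a_f, w, steps=10000):
    """Integrate d rho / d ln a = -3 (1 + w) rho, i.e. eq. (4.17) with P = w rho."""
    dlna = math.log(a_f / a_i) / steps
    rho = rho_i
    for _ in range(steps):
        # midpoint (second-order Runge-Kutta) step in ln a
        k1 = -3.0 * (1.0 + w) * rho
        k2 = -3.0 * (1.0 + w) * (rho + 0.5 * dlna * k1)
        rho += dlna * k2
    return rho

if __name__ == "__main__":
    # doubling the scale factor: expect 1/16 for radiation and 1/8 for matter
    print(evolve_density(1.0, 1.0, 2.0, w=1.0 / 3.0))
    print(evolve_density(1.0, 1.0, 2.0, w=0.0))
```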

4.3 Einstein field equations

After all we have learned let's come back to the Einstein field equations for the relation of metric, dynamics and matter in an FRW universe

$$R_{\mu\nu}-\frac{R}{2}g_{\mu\nu}-\Lambda g_{\mu\nu}=8\pi G_NT_{\mu\nu} \tag{4.20}$$
$$\begin{pmatrix}3\dfrac{K+\dot a^2}{a^2}-\Lambda&0\\[4pt]0&-\left(2\ddot aa+\dot a^2+K-a^2\Lambda\right)\hat g_{ij}\end{pmatrix}=8\pi G_N\begin{pmatrix}\rho&0\\0&a^2\mathcal{P}\hat g_{ij}\end{pmatrix}$$

One can then pull out the two equations from the tensor structure to write:

$$\left(\frac{\dot a}{a}\right)^2=\frac{8\pi}{3}G_N\rho-\frac{K}{a^2}+\frac{\Lambda}{3}\qquad 2\frac{\ddot a}{a}+\frac{\dot a^2}{a^2}=-8\pi G_N\mathcal{P}-\frac{K}{a^2}+\Lambda \tag{4.21}$$

These remarkable relations give us the reaction of the metric to the energy content of the universe, and given the fact that gravity couples to everything the RHS of these equations has an implicit sum over ρ and $\mathcal{P}$ of all components. In particular the first equation gives the Hubble rate squared as the total energy density of the universe plus the curvature and cosmological constant. This sheds new light on a quantity we are already familiar with: inadvertently we have been carrying around a measure of the 'weight' of the universe:

$$H_0^2=\frac{8\pi G_N}{3}\sum_i\rho_i(t_0)-\frac{K}{a(t_0)^2}+\frac{\Lambda}{3} \tag{4.22}$$

Problems

I The Einstein field equations for our metric give two differential equations for a single function $a(t)$. For consistency they should not be independent. Differentiate the 00 component equation, use eq. (4.17) to substitute $\partial_t\rho$ and the 00 equation for a further substitution to arrive at the second equation. In general the consistency is ensured by the Einstein tensor satisfying $\nabla_\mu(R^{\mu\nu}-g^{\mu\nu}R/2)=0$.
II Compute the 00 component of the Ricci tensor given the connection we computed in eq. (3.13) with $\hat g_{ij}=\delta_{ij}$.
III Derive the curvature dependent part of the Einstein field equations using

$$\frac{\delta R_{\alpha\beta}}{\delta g_{\mu\nu}}=0\qquad \frac{\delta\sqrt{|g|}}{\delta g_{\mu\nu}}=\frac12\sqrt{|g|}\,g^{\mu\nu} \tag{4.23}$$

and recalling $g^{\mu\nu}$ is the inverse of $g_{\mu\nu}$, $g^{\mu\alpha}g_{\alpha\nu}=\delta^\mu_\nu$.
IV Let's count the number of independent components of the Riemann tensor in $n$ dimensions. The conditions $R_{\mu\nu\alpha\beta}=-R_{\nu\mu\alpha\beta}$, $R_{\mu\nu\alpha\beta}=-R_{\mu\nu\beta\alpha}$ mean that there are $(n^2-n)/2$ values for the first two indices and the same for the second two. Further, the condition $R_{\mu\nu\alpha\beta}=R_{\alpha\beta\mu\nu}$ means these two sets are interchangeable; show that this means a counting:

$$\frac{q^2-q}{2}+q=\frac{q^2+q}{2}\qquad q=\frac{n^2-n}{2} \tag{4.24}$$

Finally, for the cyclic condition $R_{\mu\nu\alpha\beta}+R_{\mu\beta\nu\alpha}+R_{\mu\alpha\beta\nu}=0$, show that if any two of the indices are the same the previous conditions already ensure the vanishing. Then this only brings as many new conditions as combinations of 4 different indices and the final counting is:

$$\frac{n^2(n^2-1)}{12} \tag{4.25}$$

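V Derive the energy momentum tensor - Lagrangian relation

$$S_m=\int d^4x\sqrt{|g|}\left(\frac12g^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi-\frac12m^2\phi^2\right)\qquad T_{\mu\nu}=\partial_\mu\phi\,\partial_\nu\phi-\frac12g_{\mu\nu}\left((\partial\phi)^2-m^2\phi^2\right) \tag{4.26}$$

Now the form of this tensor depends on the state it is evaluated on; of relevance here will be a thermal ensemble, a homogeneous classical field $\phi(t)$ and a single particle state. Here let's take a single particle state in Minkowski space, $g_{\mu\nu}=\mathrm{Diag}(1,-1,-1,-1)$, with the quantized field as ($P^\mu=(E_p,p^i)$ and $V$ is the volume)

$$|p\rangle=\frac{a_p^\dagger}{\sqrt{2E_pV}}|0\rangle,\qquad \phi(x)=\int\frac{d^3p}{2E_p(2\pi)^3}\left(a_pe^{-iP_\mu x^\mu}+\text{h.c.}\right),\qquad [a_k,a_p^\dagger]=2E_k(2\pi)^3\delta^3(p-k)$$

Compute $\langle p|T_{\mu\nu}|p\rangle$. What is the energy density $T_{00}$ for this state?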

5 The universe we live in

Starting from the description of the expanding universe with a time-dependent metric, we derived the curvature of space-time, which the Einstein field equations relate to the total energy density of the universe. The remarkable equation we found is known as the Friedman equation, and we will make use of it, evaluated at the present time, to find out what makes up the known universe.

The first of Friedman’s equations can be written in terms of the Hubble rate as:

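$$H^2(t)=\frac{8\pi G_N}{3}\rho(t)-\frac{K}{a^2(t)}+\frac{\Lambda}{3} \tag{5.1}$$

it is customary to define the critical density as

$$\rho_{cr}=\frac{3H_0^2}{8\pi G_N}=4.7\left(\frac{h}{0.67}\right)^2\frac{\mathrm{keV}}{\mathrm{cm}^3}=8.5\times10^{-30}\left(\frac{h}{0.67}\right)^2\frac{\mathrm{g}}{\mathrm{cm}^3} \tag{5.2}$$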

so that Friedman’s equation can be rewritten in a weighted form

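$$H(t)^2=\frac{8\pi G_N}{3}\rho_{cr}\left(\frac{\Omega_r}{a(t)^4}+\frac{\Omega_m}{a(t)^3}+\frac{\Omega_K}{a(t)^2}+\Omega_\Lambda\right) \tag{5.3}$$
$$H(t)^2=H_0^2\left(\frac{\Omega_r}{a(t)^4}+\frac{\Omega_m}{a(t)^3}+\frac{\Omega_K}{a(t)^2}+\Omega_\Lambda\right)\qquad \sum_i\Omega_i=\sum_i\frac{\rho_i(t_0)}{\rho_{cr}}=1 \tag{5.4}$$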
Pie Chart for the composition of the universe 68% dark energy, 27% dark matter and 5% matter
Figure 9: The energy composition of the universe, Ωi, i=Λ, m, b, ν,γ

with the Ω terms giving the fractional contribution of each type of substance and where we assumed matter is either ultra-relativistic (radiation) or non-relativistic (matter) as is indeed the case today.
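
Eq. (5.4) is straightforward to evaluate. The sketch below is only an illustration: the function name is hypothetical, and the sample values in the usage lines are the rounded fractions of figure 9 (68% dark energy, 27% + 5% matter) with radiation and curvature set to zero, which is an assumption made to keep the example short.

```python
import math

def hubble_rate(a, H0, omega_r, omega_m, omega_K, omega_L):
    """H as a function of the scale factor a, from eq. (5.4)."""
    return H0 * math.sqrt(
        omega_r / a**4 + omega_m / a**3 + omega_K / a**2 + omega_L
    )

if __name__ == "__main__":
    H0 = 67.0  # km/s/Mpc, i.e. h = 0.67
    for a in (1.0, 0.5, 0.1):
        print(a, hubble_rate(a, H0, omega_r=0.0, omega_m=0.32, omega_K=0.0, omega_L=0.68))
```

Since the fractions add up to one, the function returns $H_0$ at $a=1$; going back in time the matter term, and eventually the radiation term, takes over from the cosmological constant.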

There are a few caveats when taking this definition. The first is that not all components necessarily add up constructively to the total; in particular the curvature and the cosmological constant could give negative Ω, as we have, per the definition in eq. (5.4),

$$\rho_\Lambda=\frac{\Lambda}{8\pi G_N}\qquad \Omega_\Lambda=\frac{\Lambda}{3H_0^2}, \tag{5.5}$$
$$\Omega_K=-\frac{K}{H_0^2} \tag{5.6}$$

where we note that $\rho_\Lambda$ is indeed what appears in the action of eq. (4.5) and can be taken as the energy density when nothing else is there, i.e. the vacuum. One also sees that living on a sphere, $K=1/R_U^2$, would mean there is a negative contribution. That being said, we believe in our universe all contributions happen to be positive.

Next we turn to review each of the possible contributions and what the present value is.

5.1 Photons

Light permeates the universe in the form of a Cosmic Microwave background which is, as sketched in the introduction, one of the pillars of cosmology. The Friedman equation allows for comparison with the other defining observational evidence in the Hubble rate so let us compare the energy density in CMB photons to the critical density derived from H0. Given that there is only one parameter that characterizes the CMB, the temperature, we can estimate the energy density by dimensional analysis:

$$\rho_\gamma(t_0)=0.66\,T_0^4 \tag{5.7}$$

where with a justified abuse of notation we assumed the majority of energy density of photons γ is given by those in the CMB and anticipating results from chapter 6 we borrowed the constant out front to be 0.66. Then one has

$$\Omega_\gamma(t_0)=5.5\times10^{-5}\left(\frac{0.67}{h}\right)^2 \tag{5.8}$$

and so the contribution to the total at present time is negligible; it is not radiation that is driving the expansion of the universe.
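
The numbers in eqs. (5.2) and (5.8) can be reproduced in SI units, as in the sketch below. It assumes the usual convention $H_0=100\,h$ km/s/Mpc and a CMB temperature $T_0=2.725$ K, neither of which is quoted in this section, and uses the radiation constant $4\sigma/c$, which is the SI counterpart of the prefactor 0.66 in eq. (5.7).

```python
import math

G_NEWTON = 6.674e-11              # m^3 kg^-1 s^-2
SPEED_OF_LIGHT = 2.99792458e8     # m s^-1
MPC_IN_M = 3.0857e22              # metres per megaparsec
KEV_IN_J = 1.602177e-16           # joules per keV
RADIATION_CONSTANT = 7.5657e-16   # J m^-3 K^-4, equal to 4 sigma / c

def critical_energy_density(h):
    """rho_cr c^2 in J/m^3, eq. (5.2), assuming H0 = 100 h km/s/Mpc."""
    H0 = 100.0e3 * h / MPC_IN_M   # in s^-1
    return 3.0 * H0**2 * SPEED_OF_LIGHT**2 / (8.0 * math.pi * G_NEWTON)

def critical_density_kev_per_cm3(h):
    """The same quantity expressed in keV/cm^3."""
    return critical_energy_density(h) / KEV_IN_J * 1.0e-6

def omega_photons(T0_kelvin, h):
    """Fraction of the critical density in black-body photons at temperature T0."""
    return RADIATION_CONSTANT * T0_kelvin**4 / critical_energy_density(h)

if __name__ == "__main__":
    print(critical_density_kev_per_cm3(0.67))  # compare with eq. (5.2)
    print(omega_photons(2.725, 0.67))          # compare with eq. (5.8)
```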

Nevertheless as we recall from solving the continuity equation for relativistic matter we have

$$\rho_\gamma(t)=\rho_{cr}\frac{\Omega_\gamma}{a^4(t)} \tag{5.9}$$

and so the energy density used to be much larger in the early universe.

5.2 Neutrinos

There is sound theoretical evidence for a Cosmic Neutrino Background (CNB) yet the elusiveness of neutrinos has prevented its direct detection thus far. The detection of this background would allow exploration of even earlier times than those the CMB probes and would constitute a landmark discovery in cosmology.

Neutrinos fall in a peculiar category cosmologically since we believe their contribution to Friedman’s equation today is that of non-relativistic matter yet they have not had time to cluster like other known matter: they preserve an equilibrium distribution with temperature Tν in parallel with the CMB. The contribution to the energy density is not known currently but only bounded from above and below given that at the elemental level we have not completed the picture of neutrino masses yet. This in turn means the possibility of cosmology itself completing this picture in one of the most explicit cases of particle physics and cosmology interplay.

What we are concerned with here nonetheless is the fractional contribution to the total coming from neutrinos. With their hybrid nature we have that an estimate of the number density is given by their temperature, yet their energy is dominated today by the mass:

$$\rho_\nu=\Big(\sum_im_{\nu_i}\Big)\,0.18\times T_\nu^3 \tag{5.10}$$

the temperature, given that the neutrinos used to be in equilibrium with the rest of matter, is ballpark the photon temperature; we will work out their explicit relation in chapter 7. The fractional contribution of neutrinos is then

$$\Omega_\nu=\frac{\sum m_\nu}{42\,\mathrm{eV}}\left(\frac{0.67}{h}\right)^2 \tag{5.11}$$

Laboratory-based bounds from tritium decay set $m_\nu<2\,\mathrm{eV}$ and will improve to $0.2\,\mathrm{eV}$ [3], so one has that neutrinos contribute at most a percentage to the total. Note however that this is still sizeable, and cosmology provides a better bound on the sum of neutrino masses, $<0.12\,\mathrm{eV}$ [2], and has the potential to complete the neutrino spectrum picture. On a more speculative note, the fact that the contribution falls not too far off the critical density has stimulated research with extra neutrinos playing the role of dark matter.

5.3 Baryonic matter

There is (yet again) a slight abuse of notation in cosmology when referring to baryonic matter, since it also includes the electrons that surround atoms and yield an on-average charge-neutral universe. That being said, the possible misunderstanding is of no practical consequence for the energy density given that protons are nearly two thousand times heavier than electrons.

This type of matter is the most evident to the human eye in the night sky and indeed there are multiple reliable independent observations of the amount of baryonic matter in the known universe; some are

  • Most baryonic matter on cosmological scales is in the form of intergalactic dust, with the stars in galaxies making up a fractional amount. This dust or gas is heated by ambient light, which we can use to estimate how much of it there is.

  • Another similar method in essence is to look at distant quasar’s light to estimate how much light was absorbed by intergalactic dust in the course of the light reaching us.

  • The CMB contains information about small variations on energy density which are influenced by the amount of baryonic matter and this provides an indirect probe on Ωb.

  • The ratios of light elements in the universe were determined by dynamics in the early plasma which depend on the baryon density.

All of these determinations agree on a contribution to the pie as [2]

$$\Omega_b = 0.050 \left(\frac{0.67}{h}\right)^2 \tag{5.12}$$

again falling quite short of the total critical density given by Hubble rate.

Recall that the density of non-relativistic matter scales as $a^{-3}$, so one used to have a larger contribution:

$$\rho_b = \frac{\rho_{\rm cr}\,\Omega_b}{a^3(t)} \tag{5.13}$$

We have now exhausted the list of contributions from known particles, so whatever remains to make up the critical density is news to us.

5.4 Dark matter

The methods outlined above all rely on the interaction of light and ordinary matter, but they miss any substance that photons don't interact with. There are other methods available that do not rely on this interaction, and they agree within errors on the striking fact that there is roughly 5 times more matter than what we see in baryons. Let us sketch some of these observations:

  • A technique to estimate the amount of mass in a given system is to determine its gravitational potential. Just like all energy contributes to the Hubble rate squared in Friedmann's equation, the gravitational potential will be generated by the total mass (most systems are non-relativistic). This method applied to clusters of galaxies reveals that luminous matter is insufficient to keep the bound system together. The pressure of the same system, deduced from X-ray observations, leads to the same conclusion.

  • The power spectrum in the distribution of galaxies (the correlation of distance and density) is influenced by Ωm. This we can measure directly via galaxy surveys and the correlation of distributions of galaxies but also from the small anisotropies in the Cosmic microwave background.

  • In astrophysics the same method of measuring the gravitational potential can be applied within galaxies. Here one expects that in the outskirts of the galaxy, where most of the luminous matter is enclosed within the orbit, the potential will fall as 1/r and so the average of the squared orbital velocities will also fall as 1/r (remember the virial theorem). Measured rotation curves instead stay roughly flat far beyond the luminous disc, which points to additional unseen mass.

  • Finally, events like the collision of two clusters of galaxies (e.g. the Bullet Cluster), in which the galaxies mostly passed through each other, are interpreted as the luminous matter being dragged along by some non-interacting matter that makes up the bulk of the mass.

In these notes we will take the value [2]

$$\Omega_{\rm dm} = 0.268 \tag{5.14}$$

5.5 Curvature

Experimental determination of the components in Friedmann's equation leaves no room, within errors, for purely spatial curvature. The universe, as far as we can see, seems spatially flat. There is a priori no theoretical reason for this to be the case; as outlined in the second lecture, any value of the curvature K is compatible with our assumptions of homogeneity and isotropy. In words more commonplace in high energy physics, the $K=0$ case is not special since it has the same amount of symmetry as $K\neq 0$. As we shall see when introducing inflation, this circumstance might have arisen due to early time dynamics.

In terms of the room left for curvature, we have ([1])

$$|\Omega_K| \lesssim 0.005 \tag{5.15}$$

Although the bound on curvature is often simply given in terms of Ω, it is good not to lose track of the fact that curvature is not a substance just because we hid it under a definition. Via the expression for curvature in eq. (5.5), the bound translates into

$$R_U \gtrsim 60\,\mathrm{Gpc} \tag{5.16}$$

To put this bound in perspective, given we have sent probes up to distances of $10^{-3}\,\mathrm{pc}$, the proportions would be the same as if a microbe which has explored some 10 μm made deductions about the radius of the earth.
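As a quick numerical cross-check of eq. (5.16), here is a minimal Python sketch. It assumes that eq. (5.5), which is not reproduced in this section, amounts to $|\Omega_K| = (R_U H_0)^{-2}$, and it takes $h=0.67$; the function name is purely illustrative.

```python
import math

C_KM_S = 299_792.458  # speed of light in km/s

def curvature_radius_gpc(omega_k_abs, h=0.67):
    """Curvature radius R_U = (c/H0) / sqrt(|Omega_K|) in Gpc.

    Assumes |Omega_K| = 1/(R_U H0)^2, our reading of eq. (5.5).
    """
    hubble_radius_mpc = C_KM_S / (100.0 * h)  # c/H0 in Mpc
    return hubble_radius_mpc / math.sqrt(omega_k_abs) / 1000.0

r_u = curvature_radius_gpc(0.005)
print(f"R_U > {r_u:.0f} Gpc")
```

Under these assumptions the bound comes out a little above 60 Gpc, in line with the rounded value quoted in the text.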

5.6 Cosmological Constant

In one of the greatest landmark discoveries in cosmology, and one that the community is still scrambling to make sense of, we learned about the vacuum by looking at the largest scales accessible.

First the evidence. The method we review here is based on luminosity distance as given in sec. 2.3. Recall this is the distance we deduce from comparing the apparent brightness of an object to its a priori known luminosity. In this case the objects are type Ia supernovae and the quantity used experimentally to measure brightness is the apparent magnitude (m), which is given in terms of the absolute magnitude (M) and -5/2 times the decimal logarithm of flux, so ([4])

$$m - M = 5\log_{10}\left(\frac{d_L}{\mathrm{Mpc}}\right) + 25 \tag{5.17}$$

Given the observed redshift of a supernova one can determine its comoving distance and luminosity distance as (take t as the time of emission)

$$d_L = \frac{\chi(t)}{a(t)} = (1+z)\,\chi(z) = (1+z)\int_t^{t_0}\frac{dt'}{a(t')} = (1+z)\int_0^{z}\frac{dz'}{H(z')} \tag{5.18}$$

Type Ia supernovae occur when white dwarfs exceed the Chandrasekhar mass limit and explode, in a process which is believed to be fairly independent of the formation history. One can then take supernovae and, given that they have the same absolute magnitude, assume that differences in their apparent magnitude must be given simply by how far from us they are, in particular by the luminosity distance, which as made explicit above depends on the type of universe via the different z dependence of H(z).

Let us flesh this out further and take two actual data points with the same absolute magnitude M: supernova 1992P with z=0.026 and m=16.08, and 1997ap with z=0.83 and m=24.32. The first supernova is close enough that we can approximate H(z) by the Hubble rate today, so $d_L\simeq z/H_0$ ($z\ll 1$) for 1992P. The other supernova is far enough to be sensitive to the different cosmological scenarios; from experiment alone, with the previous approximation, we find

$$m_{\rm 1997ap} - m_{\rm 1992P} = 5\log_{10}\left(\frac{d_L^{\rm 1997ap}}{d_L^{\rm 1992P}}\right) = 8.24 \quad\Rightarrow\quad d_L^{\rm 1997ap} = \frac{1.16}{H_0} \tag{5.19}$$

This is a quantity that we can compute with eq. (5.18), taking the limits of integration to be z=0 and z=0.83 and the universe to be dominated by our best-guess substance. Let us take two cases:

$$\Omega_m = 1: \qquad d_L^{\rm 1997ap} = \frac{0.95}{H_0} \tag{5.20}$$
$$\Omega_m = 0.3,\ \Omega_\Lambda = 0.7: \qquad d_L^{\rm 1997ap} = \frac{1.23}{H_0} \tag{5.21}$$

Neither of them falls squarely on top of the experimental observation, but we have overlooked statistical and systematic errors, so we would not expect them to. What we observe is that the universe with a cosmological constant falls 3 times closer to the experimental value, and indeed a proper analysis, such as that found in the original papers, together with other measurements provides the evidence for a cosmological constant term Λ. The central value for this contribution we will take from [2] as

$$\Omega_\Lambda = 0.68 \tag{5.22}$$

Which completes the cosmological pie in fig. 9.
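The supernova comparison of eqs. (5.19) to (5.21) can be reproduced numerically. The following is a minimal sketch that integrates eq. (5.18) with Simpson's rule for a flat universe; distances are in units of $1/H_0$, and the function name and step count are illustrative choices rather than anything prescribed by the notes.

```python
import math

def luminosity_distance(z, omega_m, omega_l, steps=2000):
    """d_L in units of 1/H0 for a flat universe, eq. (5.18), via Simpson's rule."""
    def inv_e(zp):
        # H0/H(z) for matter plus cosmological constant
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)

    h = z / steps  # steps must be even for Simpson's rule
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    return (1.0 + z) * total * h / 3.0

# Value inferred from the magnitude difference, eq. (5.19), using d_L = z/H0 for 1992P
d_obs = 0.026 * 10 ** ((24.32 - 16.08) / 5.0)
d_matter = luminosity_distance(0.83, 1.0, 0.0)   # eq. (5.20)
d_lambda = luminosity_distance(0.83, 0.3, 0.7)   # eq. (5.21)

print(f"observed {d_obs:.2f}, matter only {d_matter:.2f}, with Lambda {d_lambda:.2f}")
```

The sketch ignores measurement errors altogether, so it only illustrates which scenario lies closer to the data point.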

This substance then rules the present evolution of the universe, yet cosmology is the only evidence we have for it. If we go back to the action in eq. 4.5 we find that $\rho_\Lambda$, as defined in the ratio Ω, appears as a constant in the action; $\rho_\Lambda$ is the energy density when there is nothing else: the energy density of the vacuum. It is in a sense an elementary term, yet in most applications we do not see this energy density because we are interested in differences in energy. We study the kinetic energies of particles or differences in potential energies when we change their separation. Gravity however, through Einstein's field equations, is sensitive to the total energy density including that of empty space.

The description of this energy density as a substance in the stress energy tensor can be made in full by defining

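$$T^{\Lambda}_{\mu\nu} \equiv \rho_\Lambda\, g_{\mu\nu} = \frac{\Lambda}{8\pi G_N}\, g_{\mu\nu}, \qquad \rho = \frac{\Lambda}{8\pi G_N}, \qquad \mathcal{P} = -\frac{\Lambda}{8\pi G_N} \tag{5.23}$$

Unlike other contributions this density is independent of a(t), which means a constant Hubble rate. This in turn has the rather dramatic effect of an accelerated expansion, as we see if we go back to the second Friedmann equation

$$2\,\frac{\ddot{a}}{a} = -8\pi G_N\left(\mathcal{P} + \frac{\rho}{3}\right) \tag{5.24}$$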

where we used the first equation to substitute $H^2$. One has therefore that all positive or zero pressure substances produce a deceleration in the expansion. A cosmological constant however, with a negative pressure as in eq. (5.23), overcomes the ρ contribution and gives an accelerated expansion instead. To recall what negative pressure means, here is a refresher of thermodynamics, where the change in internal energy (an extensive quantity) for a change in volume is

$$dE = -\mathcal{P}\,dV \tag{5.25}$$

So for negative pressure the internal energy (again an extensive quantity) increases with volume.

As a last word to spice up the discussion and give a sense of how little we know on the subject, we can take our theory of Nature today, the pinnacle of human knowledge that is the Standard Model and try to estimate Λ. A prediction strictly speaking is not possible since we take the action to be our definition of the model so Λ is an input not an output; we can however estimate the quantum corrections on top of this input that give the effective Λ. In an exercise in dimensional analysis and 4π counting which you should be familiar with by the end of the MSc one can estimate

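$$\delta\left(\frac{\Lambda}{8\pi G_N}\right) \sim \frac{m_H^4}{16\pi^2} \tag{5.26}$$

You can amuse yourself by seeing how far off this estimate is, given that $m_H$ is the Higgs mass, 125 GeV, and $\Lambda\simeq 4.3\times 10^{-66}\,\mathrm{eV}^2$.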
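To put a number on the mismatch, here is a rough sketch in natural units. It assumes $G_N = 1/M_{\rm Pl}^2$ with $M_{\rm Pl}\simeq 1.22\times 10^{28}$ eV, a value not quoted in these notes, and compares the vacuum energy density implied by the quoted Λ with the estimate of eq. (5.26).

```python
import math

M_PLANCK_EV = 1.2209e28   # Planck mass in eV (assumed value), G_N = 1/M_Planck^2
LAMBDA_EV2 = 4.3e-66      # cosmological constant quoted in the text, in eV^2
M_HIGGS_EV = 125e9        # Higgs mass in eV

# Observed vacuum energy density rho_Lambda = Lambda / (8 pi G_N), in eV^4
rho_observed = LAMBDA_EV2 * M_PLANCK_EV**2 / (8.0 * math.pi)

# Dimensional-analysis estimate of the quantum correction, eq. (5.26), in eV^4
rho_estimate = M_HIGGS_EV**4 / (16.0 * math.pi**2)

orders = math.log10(rho_estimate / rho_observed)
print(f"observed {rho_observed:.2e} eV^4, estimate {rho_estimate:.2e} eV^4")
print(f"mismatch of about {orders:.0f} orders of magnitude")
```

With these inputs the estimated correction exceeds the observed value by some fifty orders of magnitude.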

Problems

I (i) Given the average energy of a photon and of a proton, $T_0$ and $m_p\simeq 1\,\mathrm{GeV}$ respectively, use the value of the critical density and the contributions in the pie chart to estimate how many photons and protons there are on average per cm³. (ii) Neutrinos have a temperature a factor 0.7 below that of the photons and their abundance is given by $0.18\,T_\nu^3$; estimate their abundance. (iii) Assume dark matter is made of a single species of non-relativistic particles with mass $m_{\rm dm}$ and derive an expression for the abundance of dark matter $n_\chi$ as a function of the mass.
II Derive yourself the bound on the universe radius starting from eq.  and using the critical density.
III Compute the luminosity distance for two values of z. Start by recasting the Friedmann equation into the dependence of Hubble on redshift, H(z), and work out the integral for a matter dominated and a cosmological constant dominated universe (both flat) to obtain

$$d_L = \frac{1+z}{H_0}\, 2\left(1 - \frac{1}{\sqrt{1+z}}\right) \quad (\Omega_m = 1), \qquad d_L = \frac{1+z}{H_0}\, z \quad (\Omega_\Lambda = 1) \tag{5.27}$$

Plot these two lines and the experimental points for 1992P and 1997ap. Which line falls closer?

6 The universe in equilibrium

Electromagnetic radiation in the present composition of the universe is a tiny part of the total yet it holds the key to the early epoch of our universe. The reason for this is twofold, i) this radiation is what remains of the primordial plasma in an almost frozen-in-time form and ii) due to the scaling with a(t) of radiation energy density, this used to be the dominant component in Friedmann equation at early times and ruled the expansion.

6.1 Equilibrium distribution

As outlined in the introduction, studying the CMB is studying a substance in equilibrium, so first, for completeness, we go through an overview of statistical mechanics (we momentarily restore $\hbar$ in our equations to remind us of the quantum nature). Consider phase space $(\vec{x},\vec{p})$ for a particle species with energy E given to a good approximation by that of a free particle, $E^2=m^2+p^2$. Heisenberg's uncertainty principle states that the tightest one can 'pack' a particle in phase space is to have it inside $\Delta x^3\Delta p^3=(2\pi\hbar)^3$. This is the unit of phase-space volume used to count particles, which is to say

$$V n = \int \frac{d^3x\, d^3p}{(2\pi\hbar)^3}\, f(x,p) \tag{6.1}$$

with f(x,p) a probability distribution in phase space and V, n the volume and particle abundance. Alternatively, if you find this argument too vague, you can put particles in a box of length L to find the density of states through the quantization of momentum:

$$p_i = \frac{2\pi\hbar}{L}\,k_i, \qquad d(\text{States}) = (dk)^3 = \frac{L^3}{(2\pi\hbar)^3}\,(dp)^3 = \frac{d^3x\, d^3p}{(2\pi\hbar)^3} \tag{6.2}$$

with $k_i$ a 3-vector with non-negative integer values, $k_{1,2,3}=(0,1,2,\dots)$.

Next one has that in a thermal distribution the number of particles is not conserved, and we consider placing any number of particles in any of the states in phase space. Obviously the options are different whether one has a boson or a fermion, so the two are discussed in parallel for Bose-Einstein (BE) or Pauli-Dirac (PD, more commonly called Fermi-Dirac) distributions respectively. Consider first the partition function at fixed energy E; for a boson we can have any number of particles in a given state, call it N, and so the energy is N×E. The partition function is then

$$Z(E)_{\rm BE} = \sum_N e^{-N(E-\mu)/(T k_B)} = \frac{1}{1 - e^{-(E-\mu)/(T k_B)}} \tag{6.3}$$

where we also included a chemical potential μ for completeness and we used the geometric series to perform the sum. The sum for fermions is much simpler, one or none:

$$Z(E)_{\rm PD} = 1 + e^{-(E-\mu)/(T k_B)} \tag{6.4}$$

In the following we will set the Boltzmann constant

$$k_B = 8.617\times 10^{-5}\,\mathrm{eV\,K^{-1}} \tag{6.5}$$

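to 1, implying temperature and energy have the same units. Then the expected number of particles per unit of phase space for a given energy E is the sum over N of the number of particles N times the respective probability $e^{-N(E-\mu)/T}$, divided by the sum of probabilities Z. This can be synthesized in the derivative:

$$f(x,p) = \frac{T}{Z}\frac{\partial Z}{\partial\mu} = T\,\frac{\partial}{\partial\mu}\log Z(\mu,E) = \begin{cases} \left(e^{(E-\mu)/T} - 1\right)^{-1} & \text{BE} \\ \left(e^{(E-\mu)/T} + 1\right)^{-1} & \text{PD} \end{cases} \tag{6.6}$$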

This completes all components we need to describe the thermal ensemble of particles and one can put it to use right away in the CMB today. Experimental evidence indicates that the chemical potential can be neglected for the CMB and indeed this will be the case for substances in chemical equilibrium as we shall see in chapter 8. The number density of photons in the bath is then

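$$n_\gamma = g_\gamma \int \frac{d^3p}{(2\pi)^3}\,\frac{1}{e^{E/T} - 1} = g_\gamma\,\frac{4\pi}{(2\pi)^3}\,T^3 \int_0^\infty \frac{y^2\,dy}{e^{y} - 1} = \frac{g_\gamma}{\pi^2}\,\zeta(3)\,T^3 \tag{6.7}$$

where the degeneracy factor $g_\gamma=2$ comes from the 2 possible polarizations of photons, we changed variables to $y=p/T$ and ζ is Riemann's zeta function with $\zeta(3)\simeq 1.202$. The energy per unit volume in the CMB is on the other hand the integral over the distribution function times the energy

$$\rho_\gamma = g_\gamma \int \frac{d^3p}{(2\pi)^3}\,E(p)\,f_{\rm BE}(x,p) = g_\gamma\,\frac{(4\pi)\,T^4}{(2\pi)^3} \int_0^\infty dy\,\frac{y^3}{e^{y} - 1} = g_\gamma\,\frac{\pi^2}{30}\,T^4 \tag{6.8}$$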

where in this case Riemann’s zeta function has an exact expression which we substituted in, ζ(4)=π4/90. We have therefore now computed the proportionality constant that we took as a given in chapter 5, eq. 5.7.
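Eqs. (6.7) and (6.8) can be put to use right away for the CMB today. The sketch below assumes a present photon temperature $T_0 = 2.725$ K and the conversion factor $\hbar c \simeq 1.973\times 10^{-5}$ eV cm, neither of which is quoted in this section; the function names are illustrative.

```python
import math

K_B_EV_PER_K = 8.617e-5     # Boltzmann constant, eq. (6.5)
HBAR_C_EV_CM = 1.97327e-5   # hbar*c in eV cm (assumed), converts eV^3 to cm^-3
ZETA_3 = 1.2020569          # Riemann zeta(3)

def photon_number_density(t_ev, g=2):
    """n_gamma = g zeta(3) T^3 / pi^2, eq. (6.7), returned in cm^-3."""
    return g * ZETA_3 / math.pi**2 * (t_ev / HBAR_C_EV_CM) ** 3

def photon_energy_density(t_ev, g=2):
    """rho_gamma = g pi^2 T^4 / 30, eq. (6.8), returned in eV cm^-3."""
    return g * math.pi**2 / 30.0 * t_ev * (t_ev / HBAR_C_EV_CM) ** 3

T0 = 2.725 * K_B_EV_PER_K   # assumed CMB temperature today, in eV
n_gamma = photon_number_density(T0)
rho_gamma = photon_energy_density(T0)

print(f"n_gamma   = {n_gamma:.0f} photons per cm^3")
print(f"rho_gamma = {rho_gamma:.3f} eV per cm^3")
print(f"mean photon energy = {rho_gamma / n_gamma / T0:.2f} T0")
```

The ratio of the two densities gives the average photon energy in units of the temperature, which is independent of the assumed value of $T_0$.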

The difference in the case of relativistic fermions is slight but computable exactly

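$$\rho_\nu = g_\nu \int \frac{d^3p}{(2\pi)^3}\,E(p)\,f_{\rm PD}(x,p) = g_\nu\,\frac{(4\pi)\,T^4}{(2\pi)^3} \int_0^\infty dz\,\frac{z^3}{e^{z} + 1} = \frac{7}{8}\,g_\nu\,\frac{\pi^2}{30}\,T^4 \tag{6.9}$$

where we took a neutrino to be our massless fermion and in the Standard Model $g_\nu=2$ for the two spin states. In reality there are three neutrinos and they are massive, the consequences of which we will see in due time.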

One can put the two cases together in an expression for relativistic particles as:

$$\rho_{\rm rad} = g\,\frac{\pi^2}{30}\,T^4, \qquad g = \sum_{\text{boson species}} g_b + \frac{7}{8}\sum_{\text{fermion species}} g_f \tag{6.10}$$

assuming all particles have the same temperature T. If some species have a different temperature $T_i$ one can still use the above definition with a reference temperature T and the degeneracy g of those species rescaled by $(T_i/T)^4$.

Another quantity we will find useful is the entropy density for a particle with negligible chemical potential which is defined as

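$$s = \frac{\rho + \mathcal{P}}{T} = \int \frac{d^3p}{(2\pi)^3}\,f(p,x)\left(\frac{E}{T} + \frac{p^2}{3ET}\right) \tag{6.11}$$

recalling the thermodynamic relation $TS = V\rho + V\mathcal{P}$. For all relevant processes of study here the entropy will be dominated by relativistic particles, in which case $Ts = 4\rho/3$.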

Let us turn now to the other type of contributions in Friedmann’s equation. For a non-relativistic particle the distribution function can be approximated as:

$$f(E) = \frac{1}{e^{\sqrt{m^2+p^2}/T} \mp 1} \;\overset{p,\,T\,\ll\, m}{\simeq}\; \frac{1}{e^{m/T + p^2/(2mT)}} \tag{6.12}$$

where we see that in this approximation the distribution is independent of the bosonic or fermionic nature. The approximation is off for the high energy tail $p^2\gg mT$, where the distribution falls as an exponential $e^{-p/T}$ and not as a Gaussian as the approximation above gives; this can be appreciated in fig. 10. Nonetheless for integrated quantities this part of the spectrum contributes an exponentially small share, and we can approximate for non-relativistic ($m\gg T$), $\mu=0$ particles in the plasma:

$$n_{\rm N.R.} = g\,\frac{4\pi}{(2\pi)^3}\,e^{-m/T}\,(mT)^{3/2}\int_0^\infty y^2 e^{-y^2/2}\,dy = g\,e^{-m/T}\left(\frac{mT}{2\pi}\right)^{3/2} \tag{6.13}$$

Similarly, any other quantity will have an exponential suppression $e^{-m/T}$ and units set by powers of $mT$. We can understand the exponentially small number of particles in simple terms: the average energy T available for interactions is not enough to produce the species of mass m; only those particles in the high p tail of the distribution have enough energy, but their abundance is smaller by precisely this exponential factor.
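How good the non-relativistic approximation of eq. (6.13) is can be checked numerically. The following sketch, with illustrative function names, integrates the full distribution with Simpson's rule in units where T=1 and compares it with the closed form; the cut-off at many Gaussian widths $\sqrt{mT}$ is an assumption of the sketch.

```python
import math

def exact_density(m_over_t, sign, p_max_factor=20.0, steps=20000):
    """n/(g T^3) from the full distribution; sign=-1 for BE, +1 for PD."""
    p_max = p_max_factor * math.sqrt(m_over_t)  # many widths sqrt(mT), in units of T
    h = p_max / steps

    def integrand(p):
        return p * p / (math.exp(math.sqrt(m_over_t**2 + p * p)) + sign)

    total = integrand(0.0) + integrand(p_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0 / (2.0 * math.pi**2)

def approx_density(m_over_t):
    """n/(g T^3) from the non-relativistic formula, eq. (6.13)."""
    return math.exp(-m_over_t) * (m_over_t / (2.0 * math.pi)) ** 1.5

r10 = exact_density(10.0, +1) / approx_density(10.0)
r100 = exact_density(100.0, +1) / approx_density(100.0)
boson_vs_fermion = exact_density(10.0, -1) / exact_density(10.0, +1)

print(f"exact/approx at m/T=10:  {r10:.3f}")
print(f"exact/approx at m/T=100: {r100:.4f}")
print(f"BE/PD at m/T=10:         {boson_vs_fermion:.6f}")
```

In this sketch the closed form is off by roughly 20% at m/T=10 and by about 2% at m/T=100, while bosons and fermions are practically indistinguishable, so the exponential suppression is captured well even though the prefactor receives corrections of order T/m.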

Black body spectrum plot
Figure 10: Black body spectrum for particle density as a function of x=p/T for m=0 LHS and m/T=10 with the approximation of eq. in yellow.

6.2 Equilibrium in an expanding universe

Thus far the discussion has overlooked the fact that this thermal bath is placed in an expanding universe. What keeps the bath in equilibrium is the continuous interaction of particles with one another; if the expansion rate is slower than the interaction rate of the particles, the equilibrium distribution is preserved. The two quantities to compare are therefore i) the Hubble rate H(t), whose inverse gives the time it takes for the universe to double in size, versus ii) the interaction rate Γ, whose inverse gives the average time between interactions. The latter can be estimated as $\sigma v n$, with σ the cross section, v the relative velocity and n the number density of the particles that the species we are interested in interacts with.

For now let us assume $\Gamma\gg H$ so equilibrium is preserved. The adiabatic approximation then applies, which in practice means the expressions we were using are valid but now the quantities change with time t.

Let us connect first the expression for the time evolution of radiation energy derived in the first chapters with that obtained here for a distribution in equilibrium:

$$\rho_\gamma(t) = \frac{a(t_i)^4}{a(t)^4}\,\rho(t_i) = g\,\frac{\pi^2}{30}\,T^4 \tag{6.14}$$

which is consistent if the temperature scales as

$$T(t) = \frac{a(t_i)}{a(t)}\,T(t_i) \tag{6.15}$$

That is, the plasma cools as time passes and space expands; when the universe doubles in size the temperature falls by a half.

The same matching does not apply to non-relativistic particles however; eq. (6.13) scales with $T^{3/2}$, which given the above would imply $n\propto a^{-3/2}$, yet we know that for matter $n\propto a^{-3}$. The answer to this apparent contradiction is many-fold and we shall unfold it in subsequent chapters. The short explanation is that we cannot neglect the chemical potential for matter, which today is very far from equilibrium.

The abundance in eq. (6.13) today can be understood as produced by photons so high up in energy in the tail of the distribution that they can pair produce non-relativistic particles, say baryons. Photons of this type are however so rare that the abundance of these thermal baryons is negligible for all purposes, $n_p^{\rm eq}/n_\gamma\sim e^{-m/T_0}\sim 10^{-10^{12}}$.

Finally, as pointed out in the beginning, the fact that the energy density in radiation scales with the most inverse powers of the scale factor means in turn that this type of energy used to dominate the total budget of the universe. The Friedmann equation then reads in this regime

$$\frac{\dot{a}^2}{a^2} = \frac{8\pi G_N}{3}\,g\,\frac{\pi^2}{30}\,T^4, \qquad z \gtrsim 10^4 \tag{6.16}$$

where g counts the number of relativistic degrees of freedom as in eq. (6.10).
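To get a feel for the numbers in eq. (6.16), here is a minimal sketch that evaluates the Hubble rate for a given temperature. It assumes $G_N = 1/M_{\rm Pl}^2$ with $M_{\rm Pl}\simeq 1.22\times 10^{28}$ eV and $\hbar\simeq 6.58\times 10^{-16}$ eV s; the value g=10 in the example is a round illustrative number, not one computed in these notes.

```python
import math

M_PLANCK_EV = 1.2209e28   # Planck mass in eV (assumed), G_N = 1/M_Planck^2
HBAR_EV_S = 6.582e-16     # hbar in eV s, converts eV to 1/s

def hubble_radiation(t_ev, g):
    """Hubble rate in eV from eq. (6.16): H^2 = (8 pi G_N / 3) g pi^2 T^4 / 30."""
    return math.sqrt(8.0 * math.pi**3 * g / 90.0) * t_ev**2 / M_PLANCK_EV

# Example: a plasma at T = 1 MeV with an illustrative g = 10
h_ev = hubble_radiation(1.0e6, 10.0)
h_per_second = h_ev / HBAR_EV_S

print(f"H = {h_ev:.2e} eV = {h_per_second:.2f} per second")
print(f"1/H = {1.0 / h_per_second:.1f} s")
```

Under these assumptions a plasma at MeV temperatures expands on a time scale of order one second.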

6.3 Entropy conservation

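One useful conservation equation is that of entropy for relativistic particles. One can derive this from the vanishing of the divergence of the stress energy tensor, $\nabla_\mu T^{\mu}{}_{\nu}=0$; in particular for the first component we have, from eq. (4.17),

$$\nabla_\mu T^{\mu}{}_{0} = \frac{\partial\rho}{\partial t} + \frac{\dot{a}}{a}\left(3\rho + 3\mathcal{P}\right) = a^{-3}\,\frac{\partial}{\partial t}\left(a^3(\rho + \mathcal{P})\right) - \frac{\partial\mathcal{P}}{\partial t} = 0 \tag{6.17}$$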

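Using either the Bose-Einstein or the Pauli-Dirac distribution, with a dependence solely on the energy, one can derive for relativistic particles

$$\frac{\partial\mathcal{P}}{\partial T} = \frac{\partial}{\partial T}\int \frac{d^3p}{(2\pi)^3}\,\frac{p^2}{3E(p)}\,f(E/T) \;\overset{m\,\ll\, T}{=}\; \int \frac{d^3p}{(2\pi)^3}\,\frac{p}{3}\left(-\frac{p}{T}\right)\frac{\partial}{\partial p}f(p/T) \tag{6.18}$$

Next, integration by parts, where the boundary term vanishes, gives the relation $\partial\mathcal{P}/\partial T=(\rho+\mathcal{P})/T$, which substituted back in the conservation equation yields

$$\frac{1}{T}\nabla_\mu T^{\mu}{}_{0} = \frac{1}{T a^3}\,\frac{\partial}{\partial t}\left(a^3(\rho + \mathcal{P})\right) - \frac{dT}{dt}\,\frac{\rho + \mathcal{P}}{T^2} = a^{-3}\,\frac{\partial}{\partial t}\left(\frac{a^3(\rho + \mathcal{P})}{T}\right) = \frac{1}{a^3}\,\frac{\partial (a^3 s)}{\partial t} = 0 \tag{6.19}$$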

which says the time dependence of the entropy density $s=(\rho+\mathcal{P})/T$ is given by the factor $a^{-3}(t)$. You might be familiar with the conservation law $\partial_\mu T^{\mu\nu}=0$, which comes from the invariance of the action under space-time shifts and gives in other settings total energy-momentum conservation, as in, say, QFT in Minkowski space. Here however the situation is more aptly described by fluid mechanics and thermodynamics: radiation pressure means the substance does work with the expansion, and it is entropy instead that is conserved.

Problems

I Now that we have the distributions for relativistic spin 1 and 1/2 particles compute the exact expression for the abundance of photons and neutrinos given Tν= (recall there’s 3 ν in the SM with one chirality only.) [Hint: nPD=3nBE/4]
II Check the result in eq. 6.6 explicitly using the expressions for ZBE, ZPD.
III At temperatures above the MeV scale photons, neutrinos, electrons and positrons are present in the plasma with the same temperature; what is g then? At the EW scale, 246 GeV, all particles in the SM can be taken to be relativistic; what would g be then?
IV Show that the Pauli-Dirac and Bose-Einstein distributions are related as $n_{\rm PD}=\tfrac{3}{4}n_{\rm BE}$ and $\rho_{\rm PD}=\tfrac{7}{8}\rho_{\rm BE}$, as part of the series of averages of powers of energy

$$\int \frac{d^3p}{(2\pi)^3}\,\frac{p^n}{e^{p/T} + 1} = \left(1 - \frac{1}{2^{n+2}}\right)\int \frac{d^3p}{(2\pi)^3}\,\frac{p^n}{e^{p/T} - 1} \tag{6.20}$$

[Hint: Use $f_{\rm BE}(E)-f_{\rm PD}(E)=2f_{\rm BE}(2E)$ and a change of variables; you do not have to perform the integrals explicitly]

7 Turning back time

The framework has now been laid for the study of the universe at its largest scale. The observations that determine the state of the cosmos today, together with the evolution equations in the form of Friedmann's equations, allow us to turn back the dial of time and reconstruct the history of the universe. The next chapters will take us back in time as far as solid evidence can take us.

Reviewing the components of the universe we found that the main contribution is in the form of a cosmological constant, followed by the matter contribution (neither of them being a substance we can produce in a lab). We can then take the short term evolution of the universe to be given by

$$\left(\frac{1}{a}\frac{da}{dt}\right)^2 = H_0^2\left(\Omega_\Lambda + \frac{\Omega_m}{a(t)^3}\right) \tag{7.1}$$

with the definition of Ω and critical density as in (5.4). One can solve this differential equation exactly to find the growth of the universe in the future, but also how the growth used to be. Although the exact solution can be found with a little work and a table integral, the two limiting behaviours are good approximations in the respective regimes

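$$\left(\frac{1}{a}\frac{da}{dt}\right)^2 = H_0^2\,\frac{\Omega_m}{a(t)^3}, \quad t\ll H_0^{-1}; \qquad \left(\frac{1}{a}\frac{da}{dt}\right)^2 = H_0^2\,\Omega_\Lambda, \quad t\gg H_0^{-1} \tag{7.2}$$

The one thing left to specify for the explicit solutions is the initial condition, which is taken as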

a(0) =0 (7.3)

given that we define the origin of time as the singular point. The solutions in the two regimes then are simple to find

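$$a(t) = \left(\frac{3}{2}\sqrt{\Omega_m}\,H_0 t\right)^{2/3}, \quad t\ll H_0^{-1}; \qquad a(t) = a(t_+)\,e^{H_0\sqrt{\Omega_\Lambda}\,(t - t_+)}, \quad t\gg H_0^{-1}. \tag{7.4}$$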
Time dependence of the scale factor for a matter and cosmological constant dominated universe, good fits for the times before and after ours, the true scale factor interpolates between the two
Figure 11: Time dependence of the scale factor for a matter and cosmological constant dominated universe

where $t_+$ is a reference time in the future. The initial condition is imposed at early times, so the solution in that regime is fully specified. For the late time region the constants in the solution as given depend on an integral which covers the full range from $t=0$ to $t_+\gg H_0^{-1}$. This piece of information we borrow from the exact solution to find

$$a(t) = \left[\frac{\Omega_m}{4\Omega_\Lambda}\right]^{1/3} e^{H_0\sqrt{\Omega_\Lambda}\,t}, \quad t\gg H_0^{-1}. \tag{7.5}$$

The plot in fig. 11 shows the shape of the exact solution and approximations. As can be seen just one half of the Hubble time into either the future or the past puts us in a regime in which the matter only or Λ only solution is an excellent one.

Given our present knowledge the fate of the universe is to fall into an accelerated expansion with an exponential growth. This statement must however be put in context. First, the daily reminder that the scales we talk about here are cosmological, so for example our galaxy and Andromeda will not be pulled apart, not to mention our galaxy itself. This is because the gravitational field that binds us together is much stronger than the Hubble drag. Second, as the solution shows, we enter a fully accelerated phase in half the Hubble time, 7000 million years from now; making extrapolations so far in time might well be an overreach. That being said, the information we have today points to this acceleration taking over till the end of time.

7.1 The epochs of the universe

Next let's turn our gaze back. First observe that the present time is derived from the solution rather than put in. One has $a(t_0)=1$ which, given the solution a(t) with $a(0)=0$, turns into an equation that one solves to find $t_0$. The exact solution then yields a value for $t_0$ given by the point where the horizontal line and the blue line cross in fig. 11. This returns

$$t_0 = 0.949\,H_0^{-1} = \frac{0.949}{100\,h}\,\mathrm{km^{-1}\,s\,Mpc} = 13.9\times 10^{9}\ \text{years} \tag{7.6}$$

You can derive this expression for the age of the universe yourself following the problems at the end of this chapter.
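For a numerical cross-check of eq. (7.6), the sketch below uses the closed-form solution of eq. (7.1) with a(0)=0 for a flat universe, $a(t) = (\Omega_m/\Omega_\Lambda)^{1/3}\sinh^{2/3}\!\big(\tfrac{3}{2}\sqrt{\Omega_\Lambda}H_0 t\big)$, quoted here without derivation. It assumes $\Omega_m = \Omega_b + \Omega_{\rm dm} = 0.318$, $\Omega_\Lambda = 1-\Omega_m$ and $h=0.67$; the function names are illustrative.

```python
import math

def scale_factor(h0_t, omega_m, omega_l):
    """Exact a(t) for flat matter + Lambda, as a function of H0*t."""
    return (omega_m / omega_l) ** (1.0 / 3.0) * math.sinh(
        1.5 * math.sqrt(omega_l) * h0_t) ** (2.0 / 3.0)

def age_in_hubble_times(omega_m, omega_l):
    """H0*t0 obtained by solving a(t0) = 1 in the exact solution."""
    return 2.0 / (3.0 * math.sqrt(omega_l)) * math.asinh(math.sqrt(omega_l / omega_m))

OMEGA_M = 0.050 + 0.268          # baryons plus dark matter, eqs. (5.12) and (5.14)
OMEGA_L = 1.0 - OMEGA_M          # flatness assumed
HUBBLE_TIME_GYR = 9.778 / 0.67   # 1/H0 in Gyr for h = 0.67

h0_t0 = age_in_hubble_times(OMEGA_M, OMEGA_L)
age_gyr = h0_t0 * HUBBLE_TIME_GYR

print(f"H0 t0 = {h0_t0:.3f}, t0 = {age_gyr:.1f} Gyr")
print(f"check: a(t0) = {scale_factor(h0_t0, OMEGA_M, OMEGA_L):.6f}")
```

At early times the hyperbolic sine reduces to its argument and the expression reproduces the matter dominated solution of eq. (7.4); at late times it reproduces the normalization of eq. (7.5).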

Again from fig. 11 one infers that before half the present age of the universe, which corresponds to $a<1/2$, one can neglect the cosmological constant contribution since at this time

$$\frac{\Omega_m}{a^3(t_0/2)} \simeq 8\,\Omega_m \simeq 2.4 \quad\text{vs}\quad \Omega_\Lambda \simeq 0.7 \tag{7.7}$$

This follows from the fact that matter density dilutes with $a^{-3}$ and it used to be much larger in the past. By the same token radiation dilutes even faster, with $a^{-4}$, so it must have been even more relevant at early times.

Indeed we can estimate at what scale factor the matter and radiation energy densities were about the same:

$$\frac{\Omega_m}{a(t)^3} = (z+1)^3\,\Omega_m = \frac{\Omega_r}{a(t)^4} = (1+z)^4\,\Omega_r \tag{7.8}$$

This returns $z_{\rm eq}\simeq 3.4\times 10^3$, with this event in time known as matter-radiation equality or sometimes just equality. The universe was 3400 times smaller, which corresponds to it being about a hundred thousand years old. Before this time therefore radiation, in the form of the CMB photons and neutrinos, dictated the expansion of the universe. The CMB had a temperature then of:

$$T(t_{\rm eq}) = (z_{\rm eq} + 1)\,T_0 = 0.80\,\mathrm{eV} \tag{7.9}$$

The Hubble rate and growth of the universe then was determined by the energy in radiation, which we can express in terms of the temperature as given in eq. (6.16):

$$H(t)^2 = \frac{8\pi G_N}{3}\,g\,\frac{\pi^2}{30}\,T^4 \tag{7.10}$$

where g counts the effective degrees of freedom.

At such early times the universe looked very different; matter had not begun to condense and collapse into galaxies due to gravity, since the interaction with photons was so frequent and energetic that it kept matter in thermal equilibrium at the same temperature. Earlier still, at temperatures of $T\gtrsim 2m_e$, the average energy of photons was high enough that pairs of electrons and positrons were produced, with abundances given by the Pauli-Dirac distributions of chapter 6. This also means that the number of relativistic degrees of freedom g used to be different. Earlier still, at higher temperatures, other known particles would be produced as mass thresholds are crossed; as to which ones, one can get this information from particle physics.

This gives you an idea of why particle physics and cosmology are connected fields. The primordial plasma is a ‘particle physics’ experiment where the evolution and properties of the plasma can be traced with simple mathematics to fundamental physics. Further still, evolution in cosmology is ruled by gravity and offers an experimentally testable set-up to look into an interaction which in conventional particle physics experiments is far too weak to be relevant.

Case in point: one should be able to infer g at $T=1\,\mathrm{eV}$ from particle physics by looking at how many species there are lighter than that value. Photons in the CMB with two polarizations and $g_\gamma=2$ of course, but a look at particle listings reminds us there are also neutrinos, with an upper bound on their masses of precisely about an eV. We know there are 3 species of neutrinos $\nu_e,\nu_\mu,\nu_\tau$, each with two spin states, $g_\nu=3\times 2$. However, do they have the same temperature as the CMB?

The answer is no, they are not in thermal equilibrium with the photons and do not share the same temperature, they are actually cooler.

Neutrinos are very weakly interacting particles and they fell out of touch with the plasma at temperatures around the scale $T\sim\mathrm{MeV}$; as to why this value, we will find out in the next chapter. At that temperature the plasma included electrons and positrons ($m_e\simeq 0.5\,\mathrm{MeV}$), yet they were on their way out: when the temperature dipped below the MeV they annihilated into photons, $e^+e^-\to\gamma\gamma$. This process heated the plasma and we will estimate by how much using entropy conservation.

At times $t_-$, when the temperature was above $m_e$ and $e^+e^-$ were abundant but neutrinos had already decoupled, the entropy of the plasma was

$$s(t_-) = \frac{2\pi^2}{45}\,g_{\rm bath}(t_-)\,T^3(t_-) = \frac{2\pi^2}{45}\left(g_\gamma + \frac{7}{8}g_{e^+} + \frac{7}{8}g_{e^-}\right)T^3(t_-) \tag{7.11}$$

whereas after the annihilation the entropy would read

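$$s(t_+) = \frac{2\pi^2}{45}\,g_\gamma\,T_\gamma^3 \tag{7.12}$$

where we called $T_\gamma=T(t_+)$ the temperature of photons in the CMB after $e^+e^-$ annihilation. Entropy conservation tells us $sa^3$ is conserved and so

$$\left(a(t_-)\,T(t_-)\right)^3 g_{\rm bath}(t_-) = g_\gamma\left(a(t_+)\,T_\gamma\right)^3 \tag{7.13}$$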

In the meantime neutrinos were decoupled, watching things unfold from the sideline; this means their distribution was Pauli-Dirac with a temperature $T_\nu$ even if they were non-interacting, much like the CMB today. This neutrino temperature simply scales with the expansion, oblivious to what is going on in the bath. Furthermore at early times $t_-$ this temperature was the same as that of the bath, so

$$T_\nu(t) = \frac{a(t_-)\,T(t_-)}{a(t)} \tag{7.14}$$

so one can rewrite eq. (7.13) to read

$$\frac{T_\nu^3(t_+)}{T_\gamma^3(t_+)} = \frac{g_\gamma}{g_{\rm bath}(t_-)} = \frac{4}{11} \tag{7.15}$$

which means neutrinos are a factor $(4/11)^{1/3}\simeq 0.7$ cooler than photons today.
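The counting behind eq. (7.15) is short enough to spell out. The sketch below assumes two spin states each for the electron and the positron (which is what the 4/11 implies) and a present photon temperature of 2.725 K, a value not quoted in this section.

```python
from fractions import Fraction

g_photon = 2     # two polarizations
g_electron = 2   # two spin states (assumed)
g_positron = 2   # two spin states (assumed)

# Degrees of freedom in the bath before e+ e- annihilation, fermions weighted by 7/8
g_bath_before = g_photon + Fraction(7, 8) * (g_electron + g_positron)

# Entropy conservation, eq. (7.15): (T_nu / T_gamma)^3 = g_gamma / g_bath
ratio_cubed = Fraction(g_photon) / g_bath_before
ratio = float(ratio_cubed) ** (1.0 / 3.0)

T_GAMMA_TODAY_K = 2.725   # assumed CMB temperature today
t_nu_today_k = ratio * T_GAMMA_TODAY_K

print(f"g_bath = {g_bath_before}, (T_nu/T_gamma)^3 = {ratio_cubed}")
print(f"T_nu/T_gamma = {ratio:.3f}, T_nu today = {t_nu_today_k:.2f} K")
```

Using exact fractions keeps the 11/2 and 4/11 visible instead of hiding them behind floating point numbers.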

Therefore after $T\sim\mathrm{MeV}$ ($e^+e^-$ annihilation) and before $T\sim\mathrm{eV}$ (when neutrinos turn non-relativistic) the energy density in radiation is

$$\rho_r = \frac{\pi^2}{30}\,g_\gamma\,T_\gamma^4 + \frac{\pi^2}{30}\,g_\nu\,\frac{7}{8}\,T_\nu^4 = \rho_\gamma\left(1 + \frac{7}{8}\left(\frac{4}{11}\right)^{4/3} N_{\rm eff}\right) \tag{7.16}$$

with $N_{\rm eff}=3$ for the 3 neutrino species. We went to the trouble of introducing $N_{\rm eff}$ because in general there might be other relativistic species contributing to the total, which would manifest as $N_{\rm eff}\neq 3$, and this variable is a commonplace observable defined to reflect this possibility; at present cosmological data set $N_{\rm eff}=2.99\pm 0.34$ [2].
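Combining eq. (7.16) with eq. (7.8) gives the redshift of equality. The sketch below is a rough estimate under several assumptions not stated in this section: $T_0 = 2.725$ K, $h = 0.67$, $\Omega_m = 0.318$, $M_{\rm Pl} \simeq 1.22\times 10^{28}$ eV with $G_N = 1/M_{\rm Pl}^2$, and neutrinos treated as massless throughout.

```python
import math

K_B_EV_PER_K = 8.617e-5      # Boltzmann constant, eq. (6.5)
M_PLANCK_EV = 1.2209e28      # Planck mass in eV (assumed), G_N = 1/M_Planck^2
H100_EV = 2.1332e-33         # hbar * (100 km/s/Mpc) in eV (assumed conversion)

def omega_radiation(t0_kelvin=2.725, h=0.67, n_eff=3.0):
    """Omega_r today from eq. (7.16), with neutrinos treated as massless."""
    t0 = t0_kelvin * K_B_EV_PER_K
    rho_gamma = math.pi**2 / 15.0 * t0**4                          # eq. (6.8), g = 2
    rho_crit = 3.0 * (H100_EV * h) ** 2 * M_PLANCK_EV**2 / (8.0 * math.pi)
    neutrino_factor = 1.0 + 7.0 / 8.0 * (4.0 / 11.0) ** (4.0 / 3.0) * n_eff
    return rho_gamma / rho_crit * neutrino_factor

OMEGA_M = 0.318              # baryons plus dark matter (assumed sum)
omega_r = omega_radiation()
z_eq = OMEGA_M / omega_r - 1.0                                     # from eq. (7.8)
t_eq_ev = (1.0 + z_eq) * 2.725 * K_B_EV_PER_K                      # eq. (7.9)

print(f"Omega_r = {omega_r:.2e}, z_eq = {z_eq:.0f}, T_eq = {t_eq_ev:.2f} eV")
```

Setting `n_eff=0.0` in `omega_radiation` shows how much later equality would appear to occur if only photons were counted as radiation.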

7.2 Interchangeable labels for history

The labels to refer to events in the universe’s history have been piling up: time t, conformal time η, scale factor a, redshift z, and now temperature T. All of them either increase or decrease monotonically and so any two of them are univocally connected by a mapping, which might involve solving an ODE or a given algebraic relation. Yet if after all they are interchangeable labels why don’t we stick to one and write everything else in terms of our chosen variable?

The answer is that depending on the event and epoch one is looking at one of these variables will be specially suited to describe the physics. It is therefore good to know how to change variables (labels) and what is the use of each one, so here is a rough guide

  • Time t is the time as inertial observers would measure it, and for the radiation and matter dominated eras its inverse gives an order of magnitude estimate of Hubble, $H\sim t^{-1}$.

  • Conformal time is the time coordinate for a transformed metric as in eq. 3.25 in conformal form. It is useful for discussing causality and as we shall see evolution of perturbations.

  • Scale factor a(t) tracks the size of our universe and given the present pie-chart breakdown of energy density is a useful marker for the different eras of the universe: radiation/matter/CC dominated.

  • Redshift z is related to the scale factor algebraically, $1+z=1/a$, so at early enough times it is simply its inverse; it is also easy to extract experimentally by observing standardized luminous objects.

  • Temperature T which unless otherwise stated refers to photon temperature, signals which species are in thermal equilibrium and scales, away from mass thresholds, with the inverse of the scale factor.

A plot of the relativistic effective degrees of freedom in the  history of the universe to appreciate their step-like decrease as we underwent transitions
(a) Relativistic degrees of freedoms in the plasma
Diagram of the exchangeable labels for the history of the universe: conformal time, scale factor, redshift and temperature
(b) Our labels for the history of the universe

Problems

I Consider a radiation dominated universe, $\Omega_r=1$ in Friedmann's equation, and with the initial condition $a(0)=0$ compute a(t), z(t), η(t), H(t). Assuming radiation follows a black body distribution with temperature T and g degrees of freedom, compute T(t). Which of these variables do and don't depend on $H_0$?
II Determine the redshift of equality by substituting in eq. 7.8 the energy density of the CMB, eq. 5.8. Do you get the same result $z_{\rm eq}=3400$? What is the temperature at that redshift and what is wrong with the calculation? Use eq. (7.16) to compute $z_{\rm eq}$ instead.
III Let’s solve exactly eq. (7.1) with a(0)=0 and find the age of the universe and the normalization of the asymptotic limits. You can do it in your favourite way but here’s a hint

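$$ \frac{d}{dx}\mathrm{ArcTanh}\left(\frac{1}{\sqrt{1+L(x)}}\right)=-\frac{1}{2}\frac{1}{\sqrt{1+L(x)}}\frac{d\log(L(x))}{dx} \tag{7.17} $$

with ArcTanh the inverse hyperbolic tangent, $\mathrm{ArcTanh}(x)=\frac{1}{2}\log((1+x)/(1-x))$. Use the values of Ω from chapter to find 0.949.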

8 Boltzmann Equations

The primordial plasma, back when the temperature was around an MeV, contained photons, $e^+$, $e^-$ and neutrinos with chemical equilibrium ($\mu=0$) distributions. As we shall see it also contained the baryons that formed the known universe, in thermal but not chemical equilibrium ($\mu\neq0$). How does one determine what particles were in equilibrium in the original soup and for how long? Looking into this question will lead into another, more ambitious one: what aspects of the universe can one account for from the fundamental properties of particles and the evolution of the original plasma?

Figure 12: Reaction with particles i, j coming in and particles k, l coming out

The answer lies in Boltzmann equations, applied to our homogeneous and isotropic approximation to the universe. Let us compute the change with time in the number of particles of a given species i. The first change in the abundance of species i, one we are already familiar with, is the decrease with the expansion of the universe. One can make this explicit with the conservation equation for the current $J^\mu_i=(n_i,\vec{0})$

$$ \nabla_\mu J^\mu=\frac{\partial n_i}{\partial t}+\Gamma^{k}_{0k}\,n_i=\frac{\partial n_i}{\partial t}+3\frac{\dot a}{a}n_i \tag{8.1} $$

This will equal 0 in the absence of interactions, with the solution $n_i\propto a^{-3}$ that represents the volume dilution with expansion, as intuition suggests. If the number of i particles is not conserved in the presence of interactions, that is at the fundamental level, there will be a source term. Consider first a process in the soup with 2 different species of particles i, j coming in and 2 others k, l coming out. The process has an amplitude $\mathcal{M}$ which one can compute in quantum field theory given the interaction Lagrangian but here we take it as a given. This process we assume is the leading reaction which depletes the number of i particles. However the inverse reaction k, l to i, j must also be possible and this will increase the number of i particles. One has then the two competing reactions, each possible for any momenta $p_i,p_j,p_k,p_l$ such that the total is conserved, so we integrate over momentum space as

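$$ \frac{1}{a^3}\frac{d(a^3n_i)}{dt}=\int\frac{d^3p_i}{(2\pi)^32E_i}\frac{d^3p_j}{(2\pi)^32E_j}\frac{d^3p_k}{(2\pi)^32E_k}\frac{d^3p_l}{(2\pi)^32E_l}(2\pi)^4\delta^4(p_i+p_j-p_k-p_l) $$
$$ \times|\mathcal{M}|^2\left(f_kf_l(1\pm f_i)(1\pm f_j)-f_if_j(1\pm f_k)(1\pm f_l)\right) \tag{8.2} $$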

The LHS above is a rewriting of eq. (8.1) whereas the collision term on the RHS has the integral over Lorentz invariant phase space which you might already be familiar with from QFT. In addition now there are factors of the distributions $f$; the second term in the second line is proportional to $f_i$ and $f_j$ since they are the abundances of the initial states in $i,j\to k,l$, whereas the extra $(1\pm f_k)$ terms arise from Pauli blocking ($-$) when the particle k is a fermion and the state might be occupied already, or Bose enhancement ($+$) for a boson. The first term is the inverse reaction and the relative sign comes from one reaction increasing and the other decreasing the number of i particles. Finally we assumed $|\mathcal{M}|^2$ is the same for the inverse and direct process, which happens if CP (charge conjugation×parity) symmetry is preserved.

This is Boltzmann’s equation for the abundance of particle species i. At face value it is a formidable equation to formulate, let alone solve; it contains derivatives in time but integrals in momentum of the functions $f(t,p)$ we want to solve for, and to obtain a closed system one needs the coupled equations for the other species distributions $f_j,f_k,f_l$. Luckily a number of approximations hold in the processes of interest which greatly simplify the system.

First, even when a particle abundance diverges from the equilibrium value, processes which do not change the particle number (like Compton scattering) ensure a Bose-Einstein or Fermi-Dirac distribution is maintained, only with a time-dependent chemical potential $\mu(t)$. One can then substitute $f$ as a function of $\mu$, $E$ as given in chapter 6 and translate the differential equation into one for $\mu$. In addition, the relevant part of the distributions will be at $E-\mu$ well above $T$, so the ± factors of bosonic or fermionic statistics can be neglected in favour of Boltzmann’s distribution.

All this makes the equation collapse into (footnote 4: we set all degeneracies to $g=1$ here for simplicity; restoring them is straightforward):

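$$ \frac{1}{a^3}\frac{d(a^3n_i)}{dt}=\int\frac{d^3p_i}{(2\pi)^32E_i}\frac{d^3p_j}{(2\pi)^32E_j}\frac{d^3p_k}{(2\pi)^32E_k}\frac{d^3p_l}{(2\pi)^32E_l}(2\pi)^4\delta^4(p_i+p_j-p_k-p_l) $$
$$ \times|\mathcal{M}|^2e^{-(E_i+E_j)/T}\left(e^{(\mu_k+\mu_l)/T}-e^{(\mu_i+\mu_j)/T}\right) \tag{8.3} $$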

where we used energy conservation to factor out the exponential in initial energy. Next one can take the chemical potential exponentials out of the integrals and define the prefactor with use of the equilibrium, zero potential distribution:

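$$ n_i^{(0)}\equiv\int\frac{d^3p_i}{(2\pi)^3}e^{-E_i/T} \tag{8.4} $$
$$ \langle\sigma v\rangle\equiv\frac{1}{n_i^{(0)}n_j^{(0)}}\int\frac{d^3p_i}{(2\pi)^3}e^{-E_i/T}\int\frac{d^3p_j}{(2\pi)^3}e^{-E_j/T}\,\sigma v \tag{8.5} $$
$$ \sigma v=\int\frac{d^3p_k}{(2\pi)^32E_k}\frac{d^3p_l}{(2\pi)^32E_l}(2\pi)^4\delta^4(p_i+p_j-p_k-p_l)\frac{1}{4E_iE_j}|\mathcal{M}|^2 \tag{8.6} $$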

where the third line contains the definition of the cross section times velocity $\sigma v$ which you might be familiar with from QFT. The difference with the thermally averaged cross section times velocity $\langle\sigma v\rangle$ is that in the latter there is a distribution of initial states which one also integrates over; that being said, the familiar cross section with centre of mass energy $E\sim T$ gives a good estimate for the case of temperatures above particle masses. These definitions convert Boltzmann’s equation into

$$ \frac{1}{a^3}\frac{d(a^3n_i)}{dt}=n_i^{(0)}n_j^{(0)}\langle\sigma v\rangle\left(\frac{n_kn_l}{n_k^{(0)}n_l^{(0)}}-\frac{n_in_j}{n_i^{(0)}n_j^{(0)}}\right) \tag{8.7} $$

where we used $n_i/n_i^{(0)}=e^{\mu_i/T}$. Although all we have done from eq. (8.3) to eq. (8.7) is to hide the integrals in definitions, these new magnitudes have physical meaning and are useful in discussing the behaviour and outcome of these equations for the dynamics.

The magnitude on the LHS has units of particle abundance per unit time and is of order $n_iH$, with $H$ the Hubble rate, while on the RHS we have $n_in_j\langle\sigma v\rangle$. Whether the expansion or the reactions dominate will determine the evolution of a particle abundance, so the dimensionless ratio $n_j\langle\sigma v\rangle/H$ is the key to equilibrium. The rate of interaction $\Gamma=n_j\langle\sigma v\rangle$ is given by the number of collisions particle i experiences in the plasma per unit time, that is the number of particles $n_j$ that a particle i sweeps in time $dt$ within its cross section:

$$ N_{col}\,dt=\sigma\times v\,dt\times n_j \tag{8.8} $$

(Diagram: the volume of cross-sectional area $\sigma$ and length $v\,dt$ swept by particle i.)

So the inverse $\Gamma^{-1}$ is the average time between collisions. On the other hand $H^{-1}$ gives the time scale for the universe to double in size. It is illustrative to see how these two quantities relate to gravity and the gauge interactions of the Standard Model; at high temperature, higher than the mass scale of the Standard Model, $T\gg100$ GeV, we can estimate cross sections via dimensional analysis whereas the Hubble rate depends on temperature squared:

$$ (T\gg100\,\mathrm{GeV})\qquad\langle\sigma v\rangle\times n_{eq}\sim\frac{4\pi\alpha^2}{T^2}\times T^3=4\pi\alpha^2T,\qquad H\sim\frac{T}{M_{pl}}T \tag{8.9} $$

where $G_N=M_{pl}^{-2}$ and $\alpha$ is a typical SM gauge coupling, of the order of the fine structure constant. Therefore at temperatures below $10^{16}$ GeV the SM interaction rates are larger than the Hubble rate and all known elementary particles would be present and in equilibrium. In particular equilibrium abundances scale with $T^3$ so that $n(t)a^3(t)$ is time independent and the LHS in eq. (8.7) vanishes; for consistency the RHS should vanish as well giving

$$ \text{chemical equilibrium:}\qquad\frac{n_kn_l}{n_k^{(0)}n_l^{(0)}}=\frac{n_in_j}{n_i^{(0)}n_j^{(0)}}\quad\Rightarrow\quad\mu_i+\mu_j=\mu_k+\mu_l \tag{8.10} $$

which is called the chemical equilibrium condition. Note that if enough processes occur much faster than $H$ the combined linear system only has the trivial solution $\mu_i=0$.
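The comparison of rates in eq. (8.9) is easy to put numbers on. The sketch below takes $\alpha$ to be the fine structure constant and $M_{pl}=1.22\times10^{19}$ GeV; both values are our assumptions for illustration, and order-one factors are dropped exactly as in the dimensional estimate.

```python
import math

ALPHA = 1 / 137.036   # fine structure constant, standing in for a typical SM gauge coupling
M_PL_GEV = 1.22e19    # Planck mass from G_N = M_pl^-2 (assumed value)


def interaction_rate(T):
    """Gamma ~ <sigma v> n_eq ~ 4 pi alpha^2 T, with T in GeV."""
    return 4 * math.pi * ALPHA**2 * T


def hubble_rate(T):
    """H ~ T^2 / M_pl, with T in GeV and order-one factors dropped."""
    return T**2 / M_PL_GEV


# Temperature at which the two rates cross: 4 pi alpha^2 M_pl
T_cross = 4 * math.pi * ALPHA**2 * M_PL_GEV
for T in (1e3, 1e10, 1e18):
    print(T, interaction_rate(T) / hubble_rate(T))
print(T_cross)
```

The crossing temperature lands within a factor of a few of $10^{16}$ GeV, and the ratio $\Gamma/H$ grows as the temperature drops below it.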

One final comment on the equation is that the RHS depends on the temperature through the definitions of the averaged cross section and the zero-potential equilibrium abundance. Temperature itself changes with time and this dependence should be substituted for a closed equation. It is often the case that temperature is used as the variable by changing the derivative in time to

$$ T^3(-HT)\frac{d}{dT}\frac{n_i}{T^3}=n_i^{(0)}n_j^{(0)}\langle\sigma v\rangle\left(\frac{n_kn_l}{n_k^{(0)}n_l^{(0)}}-\frac{n_in_j}{n_i^{(0)}n_j^{(0)}}\right),\qquad T=\frac{a(t_i)}{a(t)}T(t_i) \tag{8.11} $$

where for most of the processes of interest we will be in the radiation domination era so

$$ H^2=\frac{8\pi G_N}{3}g_*\frac{\pi^2}{30}T^4 \tag{8.12} $$
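The change of variable from time to temperature on the LHS can be spelled out step by step. This is a sketch assuming $T\propto1/a$, i.e. away from mass thresholds, so that $\dot T/T=-H$:

```latex
\begin{align}
\frac{1}{a^3}\frac{d(a^3 n_i)}{dt}
  &= T^3 \frac{d}{dt}\left(\frac{n_i}{T^3}\right)
  && \text{since } a^3 \propto T^{-3} \\
  &= T^3 \frac{dT}{dt}\frac{d}{dT}\left(\frac{n_i}{T^3}\right)
  && \text{chain rule} \\
  &= T^3 (-HT) \frac{d}{dT}\left(\frac{n_i}{T^3}\right)
  && \text{since } \frac{\dot T}{T} = -\frac{\dot a}{a} = -H
\end{align}
```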

With this much information we can turn now to why neutrinos decoupled at a temperature $T\sim0.5$ MeV. As opposed to the high temperature/energy cross section for neutrinos, at low temperatures/energies neutrino interactions are given by a contact 4 fermion interaction. Recalling a bit of Feynman rules jargon, the propagator $(T^2-M_W^2)^{-1}$ for temperatures $T<M_W$ can be expanded and the cross section will be proportional to $(1/M_W^2)^2\propto G_F^2$. So once more dimensional analysis dictates a thermal cross section

$$ \langle\sigma v\rangle n\sim G_F^2T^5,\qquad H=\sqrt{\frac{4\pi^3g_*}{45}}\frac{T^2}{M_{Pl}} \tag{8.13} $$

so the interaction rate $\Gamma$ falls fast with decreasing $T$, crossing below the Hubble rate at about $T\sim$ MeV, which is when the universe was about a second old. The decoupling of neutrinos is the simplest case to study because the crossover happens at temperatures much above neutrino masses. When a relativistic species decouples it simply maintains its equilibrium abundance and distribution. In fact the same happened when photons decoupled from matter to form the ’frozen in time’ CMB as we observe it today, and just like for photons we believe there is a cosmic neutrino background (CNB).
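The crossover of the two rates in eq. (8.13) can be located numerically. The following sketch assumes $M_{Pl}=1.22\times10^{19}$ GeV and $g_*=10.75$, treats the dimensional estimate $G_F^2T^5$ as an equality, and converts to seconds with $t=1/(2H)$ for radiation domination; none of the names below come from the notes.

```python
import math

G_F = 1.166e-5          # Fermi constant in GeV^-2
M_PL = 1.22e19          # Planck mass in GeV (assumed value)
G_STAR = 10.75          # photons, electrons, positrons and neutrinos
HBAR_GEV_S = 6.582e-25  # hbar in GeV s, to convert GeV^-1 into seconds


def hubble(T):
    """Hubble rate in GeV for temperature T in GeV (radiation domination)."""
    return math.sqrt(4 * math.pi**3 * G_STAR / 45) * T**2 / M_PL


def weak_rate(T):
    """Dimensional estimate Gamma ~ G_F^2 T^5 in GeV."""
    return G_F**2 * T**5


# Gamma = H  =>  T^3 = sqrt(4 pi^3 g/45) / (G_F^2 M_pl)
T_dec = (math.sqrt(4 * math.pi**3 * G_STAR / 45) / (G_F**2 * M_PL)) ** (1 / 3)
t_dec = HBAR_GEV_S / (2 * hubble(T_dec))  # age in seconds, t = 1/(2H)
print(T_dec * 1e3, "MeV", t_dec, "s")
```

Since only dimensional analysis went into the rate, the output should be read as an order of magnitude: a temperature of order an MeV and an age of a fraction of a second.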

Problems

I At around the eV scale the only particles in the plasma in chemical equilibrium were photons whereas protons and electrons were in thermal equilibrium. Write down the Boltzmann equation for the change in proton abundance due to Hydrogen formation $p+e\to H+\gamma$ according to (8.7). Use the equilibrium abundance formula for non-relativistic particles (6.13) to find

$$ \frac{n_H}{n_p}=n_e\left(\frac{2\pi m_H}{m_em_pT}\right)^{3/2}e^{(m_p+m_e-m_H)/T} \tag{8.14} $$

II Given the cross section $4\pi\alpha^2s/M^4$ for massless particles, so that $s=(P_1+P_2)^2=2p_1p_2(1-\cos\theta)$, compute the thermally averaged cross section:

$$ \langle\sigma v\rangle=\frac{1}{n_1^{(0)}n_2^{(0)}}\int\frac{d^3p_1}{(2\pi)^3}\frac{d^3p_2}{(2\pi)^3}e^{-p_1/T}e^{-p_2/T}\sigma v=\frac{72\alpha^2\pi T^2}{\zeta(3)^2M^4} \tag{8.15} $$

III Boltzmann’s equation for neutrinos can be parametrized as

$$ \frac{1}{a^3}\frac{d(n_\nu a^3)}{dt}=\beta\left(n_\nu^{(0)}\right)^2G_F^2T^2\left(1-\frac{n_\nu^2}{(n_\nu^{(0)})^2}\right) \tag{8.16} $$

Define $Y_\nu=n_\nu/n_\nu^{(0)}$ assuming $T\gg m_\nu$ and substitute back in the differential equation assuming $T(t)\propto1/a(t)$. Check that, given our approximations, $Y_\nu=1$ is a solution. The approximation is no longer valid when temperatures drop below an eV; by then what is the inverse of the interaction rate $\Gamma^{-1}=(\beta n_\nu^{(0)}G_F^2T^2)^{-1}$, that is the time between interactions (you can take $\beta\sim1$), and how does it compare with the age of the universe?

9 Nucleosynthesis

We now turn back in time to temperatures of an MeV to derive one crucial result of the current theory of cosmology. We call this crucial not because it was an event driving the dynamics of the expansion but rather because it is one of the predictions of the theory that has been tested experimentally. Although experimental evidence reviewed this far, like the CMB discovery, points towards an early hot universe whose dynamics we have gone into in some depth, extrapolating back in time makes some assumptions which remain so until tested. Big bang nucleosynthesis is one very valuable such test, which allows us to extrapolate our history with confidence back to MeV temperatures or mere seconds after the big bang.

At temperatures above the MeV, photons, electrons, positrons and neutrinos were in chemical equilibrium ($\mu=0$) with thermal abundances $\propto T^3$. In kinetic equilibrium with them, meaning non-number-changing interactions like $p\gamma\to p\gamma$ kept them at the same temperature $T$, there was a small amount of baryons. This smallness we quantify typically as the baryon to entropy ratio

$$ \eta_b\equiv\frac{n_b-n_{\bar b}}{s}=\frac{n_b}{2\pi^2g_*T^3/45}=(8.75\pm0.23)\times10^{-11} \tag{9.1} $$

which also gives an idea of the baryon to photon ratio: one baryon per thousand million photons. This magnitude will be our input or initial condition for predictions and so it is convenient to keep $\eta_b$ arbitrary, just bearing in mind it is much smaller than one. About this number there are two things to note. First, it is small but much larger than the equilibrium abundance, which would be down by the exponential factor $e^{-m/T}\sim10^{-400}$. Second, it is also qualitatively different in that the equilibrium abundance for baryons and antibaryons would be the same whereas $\eta_b$ gives an imbalance between matter and antimatter. One of the main areas in current research is to derive this number from some process at earlier times. This would require new physics and violation of Baryon number and CP symmetry; candidate theories include Leptogenesis and Electroweak Baryogenesis.
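Both numbers quoted above can be checked quickly. The sketch below assumes a proton mass of 938.27 MeV and $T=1$ MeV for the Boltzmann factor, and converts the baryon to entropy ratio into a baryon to photon ratio using $n_\gamma=2\zeta(3)T^3/\pi^2$ together with an assumed present-day entropy count of $43/11$ degrees of freedom (photons plus cooler neutrinos); these inputs are ours, not taken from this section.

```python
import math

M_PROTON_MEV = 938.27     # proton mass (assumed input)
T_MEV = 1.0               # temperature at which the suppression is evaluated
ETA_B = 8.75e-11          # baryon to entropy ratio from eq. (9.1)
ZETA3 = 1.2020569         # Riemann zeta(3)
G_STAR_S_TODAY = 43 / 11  # assumed entropy degrees of freedom today

# log10 of the equilibrium suppression factor exp(-m/T)
log10_boltzmann = -M_PROTON_MEV / T_MEV / math.log(10)

# entropy density over photon number density, both proportional to T^3
entropy_per_photon = (2 * math.pi**2 * G_STAR_S_TODAY / 45) / (2 * ZETA3 / math.pi**2)
baryons_per_photon = ETA_B * entropy_per_photon

print(log10_boltzmann, entropy_per_photon, baryons_per_photon)
```

Under these assumptions the suppression exponent is close to $-400$ and the baryon to photon ratio is a few times $10^{-10}$, in line with the order of magnitude quoted in the text.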

The reason to rewind back to MeV energies is that these are typical nuclear binding energies, for example in the case of Deuterium, $m_p+m_n-m_D=2.22$ MeV. One would expect that at temperatures around this value protons and neutrons will start binding into nuclei. This is not the case and the reason is the small baryon abundance; even at MeV temperatures a great number of photons per baryon have energies above an MeV and can break apart any nucleus that might form. To quantify this let us look at the equilibrium condition for Deuterium synthesis

$$ p+n\leftrightarrow D+\gamma \tag{9.2} $$
$$ \frac{1}{a^3}\frac{d(a^3n_D)}{dt}=n_D^{(0)}n_\gamma^{(0)}\langle\sigma v\rangle\left(\frac{n_pn_n}{n_p^{(0)}n_n^{(0)}}-\frac{n_Dn_\gamma}{n_D^{(0)}n_\gamma^{(0)}}\right) \tag{9.3} $$

with $\Delta m_D=m_p+m_n-m_D$. The photon abundance is the equilibrium one so it drops out inside the parenthesis to yield an equilibrium condition (footnote 5: here we put in $g_D=3$ since deuterium has spin 1; there is a spin zero state but it is more massive)

$$ \frac{n_D}{n_p}=n_n\frac{n_D^{(0)}}{n_p^{(0)}n_n^{(0)}}=n_n\frac{3}{4}\left(\frac{2\pi m_D}{m_nm_pT}\right)^{3/2}e^{\Delta m_D/T}\sim\eta_b\frac{\pi^2g_*}{30}\left(\frac{4\pi T}{m_p}\right)^{3/2}e^{\Delta m_D/T} \tag{9.4} $$

So for MeV energies the ratio was still small and most neutrons and protons had not been converted yet. To determine when this ratio reaches order one we can take the logarithm to find

$$ \frac{\Delta m_D}{T}\sim-\log(\eta_b)-\frac{3}{2}\log(T/m_p) \tag{9.5} $$

which gives $T=0.07$ MeV instead. Once this temperature is reached other reactions can combine Deuterium into heavier elements, like $D+p\to\gamma+{}^3\mathrm{He}$, but the bottleneck of no stable mass number 5 isotope means ${}^4\mathrm{He}+p\to X+\gamma$ is not viable and the chain stops at ${}^4$He. The production of heavier elements like carbon had to wait till the matter domination era to take place in stars. This in turn means that one can break down the primordial formation of nuclei into stages, each governed by a simple 2 to 2 process.
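The transcendental condition for $n_D/n_p\sim1$ has no closed form solution but is quickly solved by bisection. The sketch below assumes $m_p=938.27$ MeV and solves it twice: once in the bare logarithmic form of eq. (9.5) and once keeping the prefactor $\frac{\pi^2g_*}{30}(4\pi)^{3/2}$ of eq. (9.4) with $g_*=10.75$. The function names are ours.

```python
import math

DELTA_M_D = 2.22   # deuterium binding energy in MeV
M_P = 938.27       # proton mass in MeV (assumed input)
ETA_B = 8.75e-11   # baryon to entropy ratio, eq. (9.1)
G_STAR = 10.75
PREFACTOR = math.pi**2 * G_STAR / 30 * (4 * math.pi) ** 1.5  # constant in eq. (9.4)


def log_ratio(T, keep_prefactor):
    """log(n_D/n_p) from eq. (9.4); the bare form of eq. (9.5) drops the constant."""
    value = DELTA_M_D / T + math.log(ETA_B) + 1.5 * math.log(T / M_P)
    if keep_prefactor:
        value += math.log(PREFACTOR)
    return value


def onset_temperature(keep_prefactor, lo=0.01, hi=1.0):
    """Bisection for log_ratio = 0; the function decreases with T on [lo, hi]."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if log_ratio(mid, keep_prefactor) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


T_bare = onset_temperature(keep_prefactor=False)
T_full = onset_temperature(keep_prefactor=True)
print(T_bare, T_full)
```

The bare logarithmic estimate gives roughly 0.06 MeV and keeping the constant prefactor moves it to about 0.07 MeV, so either way the onset sits well below the 2.22 MeV binding energy.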

9.1 Protons and neutrons

Let us start from the beginning with protons and neutrons. The proton-neutron changing interaction is the weak force responsible for radioactive decay. The process in particular is given by the 4 fermion interaction proposed by Fermi which gave us neutrino decoupling in the previous chapter. One now has

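$$ p+e^-\leftrightarrow n+\nu,\qquad p+\bar\nu\leftrightarrow n+e^+ \tag{9.6} $$
$$ \frac{1}{a^3}\frac{d(a^3n_n)}{dt}=n_n^{(0)}n_\nu^{(0)}\langle\sigma_{n\nu}v\rangle\left(\frac{n_pn_e}{n_p^{(0)}n_e^{(0)}}-\frac{n_nn_\nu}{n_n^{(0)}n_\nu^{(0)}}\right)+n_n^{(0)}n_e^{(0)}\langle\sigma_{ne}v\rangle\left(\frac{n_pn_\nu}{n_p^{(0)}n_\nu^{(0)}}-\frac{n_nn_e}{n_n^{(0)}n_e^{(0)}}\right) $$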

At these temperatures the neutrinos have not decoupled yet (although they will shortly after) so all lepton abundances sit at their equilibrium abundance and factor out so that the two terms combine into

$$ \frac{1}{a^3}\frac{d(a^3n_n)}{dt}=n_l^{(0)}\langle\sigma v\rangle\left(\frac{n_pn_n^{(0)}}{n_p^{(0)}}-n_n\right) \tag{9.7} $$

where the ratio of equilibrium abundances can be computed to find

$$ \frac{n_n^{(0)}}{n_p^{(0)}}=\left(\frac{m_n}{m_p}\right)^{3/2}e^{-(m_n-m_p)/T}\simeq e^{-\Delta m_n/T} \tag{9.8} $$

Note that in this case there is no small number to contradict our intuition that it is around $T=\Delta m_n=m_n-m_p$ that neutrons will be sizeably depleted into protons. One can also see that at temperatures above this one has as many protons as neutrons, since $e^0=1$.

Equation (9.7) is therefore what needs to be solved to determine the abundance of protons and neutrons by the time that temperature has dropped to 0.07MeV. What follows is simply a recasting of this equation in a form better suited for its solution and the magnitude observed experimentally. First one is interested in the ratio of neutrons to total baryons and second we should cast all terms of the equation in terms of one variable only, not time here and temperature there. We define then our function and variable in our differential equation as

$$ X_n=\frac{n_n}{n_n+n_p},\qquad x=\frac{\Delta m_n}{T} \tag{9.9} $$

given that no heavier nuclei are present we have $n_n+n_p\simeq\eta_bs\propto T^3$, something that we will use below. The variable $x$ marks the onset of neutron depletion at $x=1$ while it grows with time, which better fits our preset state of mind of variables evolving towards the right in plots. To rewrite derivatives in terms of $x$:

$$ \frac{d}{dt}=\frac{dT}{dt}\frac{dx}{dT}\frac{d}{dx}=(-HT)\left(-\frac{x}{T}\right)\frac{d}{dx} \tag{9.10} $$

Hubble itself depends on temperature since we are in the radiation domination era as

$$ H^2=\frac{8\pi G_N}{3}g_*\frac{\pi^2}{30}T^4 \tag{9.11} $$

given that the equilibrium particles are photons, electrons and neutrinos you can amuse yourself to compute $g_*=10.75$, so back in our equation

$$ x\frac{dX_n}{dx}=\frac{n_l(x)\langle\sigma v\rangle(x)}{H(x)}\left((1-X_n)e^{-x}-X_n\right) \tag{9.12} $$

the dependence on $x$ of the pre-factors we can work out for Hubble and estimate for the rate $\Gamma$ as

$$ H=\sqrt{\frac{4\pi^3G_Ng_*}{45}}\,\Delta m^2x^{-2},\qquad\Gamma\equiv n_l\langle\sigma v\rangle=\beta\frac{\Delta m^3}{x^3}G_F^2\Delta m^2P(x) \tag{9.13} $$

with $\beta=0.385$ and $P(x)$ a polynomial in $x^{-1}$ which one can compute from the Fermi interaction as $1+6x^{-1}+12x^{-2}$. Let us write the equation then in terms of the interaction and Hubble rates to see explicitly how the ratio determines the evolution

$$ x\frac{dX_n}{dx}=\frac{\Gamma(x)}{H(x)}\left(e^{-x}-X_n(1+e^{-x})\right) \tag{9.14} $$
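To get a feeling for the sizes involved one can evaluate the two rates of eq. (9.13) numerically. The sketch below counts $g_*$ with the usual weights (1 per bosonic and 7/8 per fermionic degree of freedom, which we take as given from the thermodynamics chapter), and assumes $M_{Pl}=1.22\times10^{19}$ GeV and $\Delta m=1.293$ MeV; the conversion to inverse seconds uses $\hbar$.

```python
import math

BETA = 0.385
G_F = 1.166e-5          # GeV^-2
DELTA_M = 1.293e-3      # neutron-proton mass difference in GeV
M_PL = 1.22e19          # GeV, from G_N = M_pl^-2 (assumed value)
HBAR_GEV_S = 6.582e-25  # hbar in GeV s

# photons (2) plus 7/8 for fermions: e+ and e- (4) and three neutrino species (3 x 2)
G_STAR = 2 + (7 / 8) * (4 + 3 * 2)


def hubble(x):
    """H(x) in GeV, first expression of eq. (9.13) with G_N = 1/M_pl^2."""
    return math.sqrt(4 * math.pi**3 * G_STAR / 45) * DELTA_M**2 / (M_PL * x**2)


def gamma(x):
    """Gamma(x) in GeV, second expression of eq. (9.13)."""
    poly = 1 + 6 / x + 12 / x**2
    return BETA * G_F**2 * DELTA_M**5 * poly / x**3


print(G_STAR, hubble(1.0) / HBAR_GEV_S, gamma(1.0) / hubble(1.0))
```

With these inputs the Hubble rate at $x=1$ is about one inverse second and the ratio $\Gamma/H$ there is of order a few, dropping below one at larger $x$.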

One can obtain the equilibrium distribution given by the vanishing of the RHS, this is the distribution that neutrons follow at high temperatures/early times

$$ X_n^{eq}(x)=\frac{e^{-x}}{1+e^{-x}},\qquad X_n(x\ll1)=X_n^{eq}(x)\simeq\frac{1}{2} \tag{9.15} $$

one can see that this solution is an attractor at early times; if the abundance $X_n$ exceeds (is below) the equilibrium abundance, $X_{eq}+\delta$ ($X_{eq}-\delta$), the differential equation gives a negative (positive) derivative for the deviation $\delta$ which drives the solution back down (up) to equilibrium.
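The attractor argument can be made explicit by substituting $X_n=X_n^{eq}+\delta$ into the evolution equation for $X_n$ written in terms of $\Gamma/H$. A short sketch, using only the fact that $X_n^{eq}$ makes the RHS vanish:

```latex
\begin{align}
X_n &= X_n^{eq} + \delta \\
x\frac{d\delta}{dx} &= -\frac{\Gamma(x)}{H(x)}\left(1+e^{-x}\right)\delta - x\frac{dX_n^{eq}}{dx} \\
x\frac{d\delta}{dx} &\simeq -\frac{\Gamma(x)}{H(x)}\left(1+e^{-x}\right)\delta
  \qquad (x \ll 1,\ X_n^{eq}\ \text{nearly constant})
\end{align}
```

The coefficient of $\delta$ is negative and large at early times, so deviations of either sign decay quickly.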

At times x>1 however the equilibrium distribution is not approximately constant and the LHS does not vanish like the RHS does so it is no longer a solution. At late times instead the back reaction which produces neutrons from protons is shut off (as marked by the exponential suppression) and neutron abundance starts to decrease according to

$$ x\frac{dX_n}{dx}=-\frac{\Gamma(x)}{H(x)}X_n,\qquad X_n=X_n(x_f)\exp\left(-\int_{x_f}^{x}\frac{dx'}{x'}\frac{\Gamma(x')}{H(x')}\right) \tag{9.16} $$

If the interaction rate over Hubble drops with $x$ this integral converges; a look at eq. (9.13) for large $x$ shows the ratio decreases as $x^{-1}$ and therefore the convergence means the relative abundance of neutrons will ’freeze’ at a constant value. To estimate this value one can make an estimate for the value $x_f$, which is given by the point when the solution deviates significantly from equilibrium, and one can take $X_n(x_f)\simeq X_n^{eq}(x_f)$. The larger the value of the ratio $\Gamma/H$ at $x=1$, the more frequent the interactions that deplete $X_n$, and so the longer the distribution tracks the equilibrium value and the fewer neutrons are left at the end of the day.

You might have noticed however that the definition of $x_f$ is not very precise; is it when $X/X_{eq}$ differs by 1, 1/2, 2? In practice one can solve the equation either numerically or, as in this case, exactly and compare against the choice of $x_f$ in the approximation taken. Let’s then here simply give the constant value of neutron abundance at late $x$,

$$ X_n(x\gg1)=\exp\left(-\int_{x_i}^{\infty}\frac{dx}{x}\frac{\Gamma}{H}(1+e^{-x})\right)\int_{x_i}^{\infty}dx\left(-\frac{dX_{eq}}{dx}\right)\exp\left(\int_{x_i}^{x}\frac{dx'}{x'}\frac{\Gamma}{H}(1+e^{-x'})\right) $$

this gives 0.15 and, inspecting the $x$ dependence of the exact solution, this constant value is reached rather quickly so

$$ X_n(x>2.6)=0.15 \tag{9.17} $$

which corresponds to a temperature $T\simeq0.5$ MeV. We have then seen our first instance of ‘freeze out’ and how a particle falling out of equilibrium reaches a constant $X_n$, or particle number per comoving volume.
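The late-time value can be reproduced with a plain numerical quadrature. The sketch below writes the frozen abundance as the integral of $-dX_{eq}/dx$ weighted by the damping from all later interactions, $\exp(-\int_x^\infty\frac{dx'}{x'}\frac{\Gamma}{H}(1+e^{-x'}))$, which is the same expression as above with the exponentials combined. It assumes $M_{Pl}=1.22\times10^{19}$ GeV, starts at $x_i=0.1$ and truncates the integrals at a large but finite $x$; the grid and the function names are our choices.

```python
import math

# Inputs quoted in the text and problems; M_PL is an assumed value for G_N = M_pl^-2.
BETA = 0.385
G_F = 1.166e-5       # GeV^-2
DELTA_M = 1.293e-3   # GeV, neutron-proton mass difference
G_STAR = 10.75
M_PL = 1.22e19       # GeV (assumption)


def rate_over_hubble(x):
    """Gamma/H built from the two rates of eq. (9.13): a constant times P(x)/x."""
    prefactor = BETA * G_F**2 * DELTA_M**3 * M_PL / math.sqrt(4 * math.pi**3 * G_STAR / 45)
    return prefactor * (1 + 6 / x + 12 / x**2) / x


def frozen_neutron_fraction(x_i=0.1, x_max=1000.0, n=4000):
    """Late-time X_n from trapezoid-rule integrals on a log-spaced grid."""
    xs = [x_i * (x_max / x_i) ** (k / n) for k in range(n + 1)]
    kernel = [rate_over_hubble(x) * (1 + math.exp(-x)) / x for x in xs]
    # tail[k] approximates the integral of the kernel from xs[k] up to x_max
    tail = [0.0] * (n + 1)
    for k in range(n - 1, -1, -1):
        tail[k] = tail[k + 1] + 0.5 * (kernel[k] + kernel[k + 1]) * (xs[k + 1] - xs[k])
    # -dX_eq/dx = e^-x / (1 + e^-x)^2, damped by the interactions still to come
    integrand = [
        math.exp(-x) / (1 + math.exp(-x)) ** 2 * math.exp(-tail[k])
        for k, x in enumerate(xs)
    ]
    return sum(
        0.5 * (integrand[k] + integrand[k + 1]) * (xs[k + 1] - xs[k]) for k in range(n)
    )


print(frozen_neutron_fraction())
```

The result should come out close to the 0.15 quoted above; the contribution from the initial condition at $x_i$ is exponentially suppressed and has been dropped.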

9.2 Deuterium and Helium

The next stage in the evolution is therefore the neutrons with relative abundance no longer modified by interactions with the bath (fallen out of chemical equilibrium) at temperatures below 0.5 MeV, where nucleosynthesis has not quite started yet given the estimate in eq. (9.5). In this brief period neutrons can spontaneously decay, which is what a neutron does if left alone. This is accounted for by the simple radioactive decay law $dX_n/dt=-X_n/\tau_n$ with $\tau_n$ the neutron lifetime so

$$ X_n(2.6<x<18.6)=0.15\,e^{-(t-t_i)/\tau_n} \tag{9.18} $$

One needs to convert between time and temperature (or $x$) to determine the times that correspond to the initial and final stages in this period; from Hubble we can get

$$ H=\left(\frac{8\pi G_N}{3}g_*\frac{\pi^2}{30}\right)^{1/2}T^2 \tag{9.19} $$

with the solution for radiation domination $a(t)\propto\sqrt{t}$, $H(t)=1/(2t)$, and taking $g_*$ after $e^+e^-$ annihilation:

$$ t=132\,\mathrm{sec}\left(\frac{0.1\,\mathrm{MeV}}{T}\right)^2 \tag{9.20} $$

so that given $\tau_n=886.7$ sec

$$ X_n(2.6<x<18.6)=0.15\exp\left[-0.00088\left(x^2-2.6^2\right)\right] \tag{9.21} $$

By the time of the onset of nuclei creation, the neutron abundance is:

$$ X_n(T\simeq0.07\,\mathrm{MeV})=0.11 \tag{9.22} $$
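The numbers in this stage follow from the time-temperature relation and the decay law alone. The sketch below assumes $g_*=3.36$ after $e^+e^-$ annihilation (photons plus cooler neutrinos, a value not quoted in the text) and $M_{Pl}=1.22\times10^{19}$ GeV, and like the text it applies the same relation over the whole interval $2.6<x<18.6$.

```python
import math

M_PL = 1.22e19          # GeV (assumed value)
HBAR_GEV_S = 6.582e-25  # hbar in GeV s
G_STAR_LATE = 3.36      # assumed: photons plus cooler neutrinos after e+e- annihilation
TAU_N = 886.7           # neutron lifetime in seconds
DELTA_M_MEV = 1.293     # neutron-proton mass difference in MeV


def time_seconds(T_mev):
    """Age of the universe t = 1/(2H) at temperature T (in MeV), radiation domination."""
    T = T_mev * 1e-3  # convert to GeV
    H = math.sqrt(4 * math.pi**3 * G_STAR_LATE / 45) * T**2 / M_PL
    return HBAR_GEV_S / (2 * H)


def neutron_fraction(x, x_start=2.6, X_start=0.15):
    """Free neutron decay from the frozen value X_start at x_start up to x."""
    t = time_seconds(DELTA_M_MEV / x)
    t_i = time_seconds(DELTA_M_MEV / x_start)
    return X_start * math.exp(-(t - t_i) / TAU_N)


print(time_seconds(0.1), neutron_fraction(18.6))
```

Under these assumptions the first number reproduces the 132 seconds in the time-temperature relation and the second the neutron fraction of about 0.11 at the onset of nucleosynthesis.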

This is the period of nucleosynthesis proper, when neutrons and protons combined into nuclei. As we have seen it is energetically favourable for a proton and a neutron to make deuterium, $p+n\to D+\gamma$, since $m_p+m_n-m_D>0$; similarly, once enough deuterium has formed the fact that $m_p+m_D-m_{He3}>0$ favours conversion of Deuterium into Helium by the reaction $D+p\to\gamma+{}^3\mathrm{He}$. One might think this iteration will go on until the most stable nucleus, Iron, is reached. Nevertheless there is a local maximum of stability for ${}^4$He and the unstable mass number 5 isotopes do not live long enough to capture another nucleon and continue the chain.

It is then a good approximation to assume that all neutrons are ‘burned’ into ${}^4$He which gives a Helium abundance

$$ X_{He4}\equiv\frac{n_{He}}{n_b}\simeq\frac{X_n}{2}\simeq0.055 \tag{9.23} $$

Observationally we can corroborate this expectation by looking at what are called systems with low metallicities, that is regions of the cosmos where heavier elements have not been produced by stellar combustion. This gives a range

$$ X_{He4}\in(0.055,0.063) \tag{9.24} $$

consistent with our estimate for Helium abundance which nonetheless can use a few refinements for a more precise result.

Not all Deuterium is burned however, for the same reason that not all neutrons got converted into protons originally; although quite efficient the reaction rates eventually drop below the Hubble rate and some value of Deuterium freezes out. The fact that this value is very sensitive to the details of the process and in particular the baryon entropy ratio makes Deuterium abundance ratios a great probe of nucleosynthesis.

This same success of nucleosynthesis, sometimes referred to as big bang nucleosynthesis (BBN), means in turn that modifications of our picture will not be consistent with observations, which sets stringent constraints on models of new physics. An obvious example is the baryon to entropy abundance which cannot be changed by much, as can be seen in fig. 13; all values of abundances fall into place at $(3$-$4)\times10^{-31}$ g cm$^{-3}$. Implicit physics that we have used is also subject to constraints since e.g. the evolution is sensitive to the number of relativistic degrees of freedom.

Plot to show the dependence of the light element abundances on the baryon asymmetry and how they lead to a consistent prediction for it
(a) Light elements abundance as a function of the baryon to photon ratio
Abundance of light elements as a function of time, the most common being helium 4 and the least lithium 6
(b) Light element abundance evolution with time
Figure 13: Synthesis of nucleosynthesis results.

Problems

I Revisiting problem I of chapter 8, now that we know the equilibrium abundance of baryons as in eq. (9.1) and assuming a neutral universe, $n_e\simeq n_p=1.5\times10^{-10}T^3$, and given that the mass difference or binding energy of Hydrogen is 13.6 eV, what was the ratio at $T=14$ eV, 1.4 eV and 0.14 eV? Determine the temperature when $n_H/n_p=1$ and the universe turned neutral.
II Use Friedmann’s equation to reproduce the temperature-time relation of eq. (9.20), recalling that only photons and neutrinos are relativistic at these temperatures (but with different temperatures!)
III Rewrite the RHS of the differential equation as $\Gamma(x)(1+e^{-x})(X_n^{eq}(x)-X_n(x))/H(x)$. Given the initial condition $X(x_i)=X_n^{eq}(x_i)$ show that the solution reads

$$ X_n(x)=X_n^{eq}(x_i)e^{-(I(x)-I(x_i))}+e^{-I(x)}\int_{x_i}^{x}dx'\,X_{eq}(x')\frac{dI(x')}{dx'}e^{I(x')} \tag{9.25} $$
$$ I(x)-I(x_i)\equiv\int_{x_i}^{x}\frac{dx'}{x'}\frac{\Gamma(x')}{H(x')}(1+e^{-x'}) \tag{9.26} $$

where in the second term we have cancelled out the dependence on $I(x_i)$. Use integration by parts on the second term to rewrite it as

$$ X_n(x)=X_n^{eq}(x)-e^{-I(x)}\int_{x_i}^{x}dx'\,\frac{dX_{eq}}{dx'}e^{I(x')} \tag{9.27} $$

At late times $x\gg1$ the equilibrium contribution can be neglected and $I(x)\to0$, so compute numerically the integral to obtain $X(x\gg1)=0.15$ (you can take $x_i=0.1$, $\beta=0.385$, $\Delta m_n=1.293$ MeV, $G_F=1.166\times10^{-5}$ GeV$^{-2}$ and $g_*=10.75$)

10 Thermal dark matter production

This section turns to dark matter and a possible mechanism for its production in the abundance observed today. Nucleosynthesis provided a valuable check of our theory of the hot early universe since observations could be compared with definite predictions based on our knowledge of all particles and interactions involved; in contrast dark matter remains an unknown substance not composed of any of the particles we know of and whose couplings to matter and to itself are unknown. Applying the thermal production mechanism we just saw at play in BBN to dark matter is therefore an unchecked hypothesis albeit one based on a process which has occurred in nature before. In addition, as is often the case in research, extension of a tested idea to new domains helps us develop a scenario where computations can be made and where enough experimental inputs can prove or disprove the original idea.

The basis for the production of thermal dark matter is the interaction with the primordial plasma as described by the Boltzmann equation. As our study of this equation shows, the relevant parameters to discuss the outcome are the mass or energy threshold, in this case the dark matter mass, and the interactions, that is the cross section. We will denote dark matter properties like mass or coupling with a dm subindex.

One important consideration about particle dark matter is that it must be very long lived or completely stable since otherwise it would not stick around long enough to be present today. There are a few examples in nature for inspiration of stable massive particles: the proton is the lightest particle with what we call baryon number and hence stable; the electron is the lightest electromagnetically charged particle so it too is stable. We say they are stable due to symmetry, or explicitly because they are the lightest particle to carry a certain conserved number. This too can be applied to dark matter by assuming it has some new conserved quantum number which prevents its decay; for example in supersymmetric extensions of the standard model dark matter is the lightest particle to possess R-parity. Here we will assume it is stable or at least $\tau_{dm}>H_0^{-1}$.

10.1 Freeze out

The case that follows nucleosynthesis most closely is referred to as dark matter freeze out. One starts with an equilibrium abundance of dark matter $n_{dm}\propto T^3$ and as the temperature drops below the mass of the dark matter particle, $T<m_{dm}$, the abundance falls and then freezes at some value once equilibrium is lost.

The reaction responsible for dark matter equilibrium with the primordial bath should respect the symmetries we have imposed on the theory; if we have given dark matter a conserved quantum number as assumed here, it cannot be produced one at a time but instead in an amount with no net quantum number. The simplest case is then that it is produced in particle anti-particle pairs so that assuming $n_{dm}=n_{anti\text{-}dm}$

$$ dm+(anti\text{-}dm)\leftrightarrow a+b \tag{10.1} $$
$$ \frac{1}{a^3}\frac{d(a^3n_{dm})}{dt}=\left(n_{dm}^{(0)}\right)^2\langle\sigma v\rangle\left(\frac{n_an_b}{n_a^{(0)}n_b^{(0)}}-\frac{n_{dm}^2}{(n_{dm}^{(0)})^2}\right) \tag{10.2} $$

with particles a and b being either known or other proposed particles but in any case in equilibrium with the bath. This means that the dependence on their abundance and equilibrium abundance cancels out to give

$$ a^{-3}\frac{d(a^3n_{dm})}{dt}=\langle\sigma v\rangle\left(\left(n_{dm}^{(0)}\right)^2-n_{dm}^2\right) \tag{10.3} $$

which is a closed system for the evolution of dark matter abundance. This simplifies remarkably the production and gives the mechanism part of its appeal in the prediction of abundance as a function of only its couplings σv and mass mdm given that the initial conditions are set by assuming equilibrium.

It remains then to ready up the equation for solution with the same steps we are already familiar with now defining

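$$ Y\equiv\frac{n_{dm}}{T^3},\qquad x=\frac{m_{dm}}{T} \tag{10.4} $$

where $Y$ is the yield which gives roughly the number of dark matter particles per particle in the bath. With these definitions

$$ xH(x)\frac{dY}{dx}=\frac{m_{dm}^3\langle\sigma v\rangle}{x^3}\left(\left(Y^{(0)}\right)^2-Y^2\right) \tag{10.5} $$
$$ x\frac{dY}{dx}=\frac{\Gamma(x)}{H(x)}\left(\left(Y^{(0)}\right)^2-Y^2\right) \tag{10.6} $$

where we defined the rate and equilibrium yield as

$$ \Gamma(x)=\frac{m^3}{x^3}\langle\sigma v\rangle,\qquad Y^{(0)}=g_\chi\frac{x^3}{m^3}\int e^{-E/T}\frac{d^3p}{(2\pi)^3}=\frac{g_\chi}{2\pi^2}\int z^2dz\,e^{-\sqrt{x^2+z^2}} \tag{10.7} $$

with $z=p/T$, so that $Y^{(0)}$ is approximately a constant at low $x$ but it decays exponentially at large $x$, signalling that the production of $\chi+\chi$ pairs is no longer energetically viable. In general $\langle\sigma v\rangle$ itself depends on $x$ but for renormalizable interactions this dependence is mild for the range of interest and the estimates we derive here will be for a constant cross section.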

Freeze-out dark matter density lines as we cross its mass threshold in temperature; the larger the cross section the smaller the abundance

The two limiting behaviours of the abundance are again simple to study. At early times $x\ll1$ the equilibrium distribution over temperature cubed is a constant, which makes the RHS in eq. (10.6) vanish in accordance with zero derivative on the LHS, $Y(x)=Y^{(0)}(x)$. As a side comment, although a plot of $Y^{(0)}(x)$ for $x<1$ does not seem constant, the derivative in eq. (10.6) is a logarithmic derivative $x\frac{d}{dx}=d/d\log(x)$; in terms of $\log(x)$ the distribution does resemble a constant for $x<1$, $\log(x)<0$.

At late times or low temperatures $x\gg1$, the equilibrium yield falls off exponentially and can be neglected to yield an equation

$$ x\frac{dY}{dx}=-\frac{\Gamma(x)}{H(x)}Y^2=-\frac{\lambda}{x}Y^2,\qquad\lambda=\frac{m\langle\sigma v\rangle}{\sqrt{4\pi^3G_Ng_*/45}} \tag{10.8} $$

where given our approximations $\lambda$ is a dimensionless constant which gives the rate to Hubble ratio at $x=1$. This should influence the final abundance in that the larger $\lambda$ the smaller the abundance of dark matter particles left. This is simple to see play out in solving the equation to find

$$ \frac{1}{Y_f}-\frac{1}{Y}=\frac{\lambda}{x}-\frac{\lambda}{x_f},\qquad Y(x\gg x_f)\simeq\frac{x_f}{\lambda} \tag{10.9} $$

where in the second expression we assumed the final abundance is much smaller than that when the distribution deviates from equilibrium, $Y_f^{-1}\ll Y^{-1}(x\gg x_f)$ (that is $Y_f\gg Y(x\gg x_f)$).

This shows the simple inverse relation that gives a larger abundance for weakly coupled particles, yet the exact value depends on $x_f$. The differential equation which we are working with now does not have a known analytic solution and so the estimates for $x_f$ rely on numerical studies. An approximation often used, which works surprisingly well for its simple-mindedness, is to take $x_f=10$ independent of $\lambda$; in the problem set you can derive a more precise relation yourself. Then we have a yield frozen out at the value

$$ Y(x\gg x_f)=\frac{x_f}{\lambda},\qquad n_{dm}=\frac{x_f}{\lambda}T(x)^3=\frac{x_f}{\lambda}T(x_1)^3\frac{a(t_1)^3}{a(t)^3} \tag{10.10} $$

where we express it in terms of the scale factor rather than temperature since, as we have seen already, the scaling $T\propto a^{-1}$ is not followed when going through thresholds.
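The freeze-out behaviour described above can be checked with a small numerical integration of the yield equation, $x\,dY/dx=(\lambda/x)((Y^{(0)})^2-Y^2)$. The sketch below is illustrative only: it assumes a constant cross section, uses the non-relativistic equilibrium form $Y^{(0)}=g(x/2\pi)^{3/2}e^{-x}$ with $g=1$ from the start at $x=1$, picks $\lambda=10^9$ arbitrarily, and steps with an implicit (backward Euler) scheme in $\log x$ because the equation is stiff while the yield tracks equilibrium.

```python
import math


def y_equilibrium(x, g=1.0):
    """Assumed non-relativistic equilibrium yield n^(0)/T^3 = g (x / 2 pi)^(3/2) e^(-x)."""
    return g * (x / (2 * math.pi)) ** 1.5 * math.exp(-x)


def frozen_yield(lam, x_start=1.0, x_end=1000.0, steps=20000):
    """Integrate x dY/dx = (lam / x) (Y0^2 - Y^2) with backward Euler in log x."""
    h = math.log(x_end / x_start) / steps
    y = y_equilibrium(x_start)  # start on the equilibrium solution
    for k in range(1, steps + 1):
        x = x_start * math.exp(k * h)
        a = h * lam / x
        c = y + a * y_equilibrium(x) ** 2
        # implicit step: a*Y^2 + Y - c = 0, positive root in a cancellation-free form
        y = 2 * c / (1 + math.sqrt(1 + 4 * a * c))
    return y


lam = 1e9  # illustrative rate-to-Hubble ratio at x = 1
y_inf = frozen_yield(lam)
print(y_inf * lam)  # effective x_f in the estimate Y = x_f / lambda
```

The product $Y\lambda$ printed at the end plays the role of $x_f$ in the estimate $Y\simeq x_f/\lambda$ and comes out of order ten for this choice of $\lambda$, with only a logarithmic sensitivity to it.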

This is relevant since the abundance of dark matter is measured today as

$$ \rho_{dm}=m_{dm}n_{dm}(t_0)=0.27\rho_{cr}=1.3\left(\frac{h}{0.67}\right)^2\frac{\mathrm{keV}}{\mathrm{cm}^3} \tag{10.11} $$

so we evolve the abundance with a bit of trickery as

$$ \rho_{dm}(t)=m_{dm}\frac{x_f}{\lambda}T_1^3\frac{a(t_1)^3}{a(t)^3}=m_{dm}\frac{x_f}{\lambda}T_0^3\left(\frac{a_1T_1}{a(t)T(t)}\right)^3=m_{dm}\frac{x_f}{\lambda}T_0^3\frac{g(t)}{g(t_1)} \tag{10.12} $$

where there is a dependence on the relativistic degrees of freedom at decoupling time. One can write the equation above in terms of the cross section and with appropriate units as

$$ \left(\frac{h}{0.67}\right)^2\Omega_{dm}=\frac{2x_f}{\sqrt{g_*}}\frac{10^{-38}\,\mathrm{cm}^2}{\langle\sigma v\rangle} \tag{10.13} $$

where the mass drops out (neglecting the mild dependence in $g_*$). So the right abundance would be reproduced for a cross section around a femtobarn (barn $=10^{-24}$ cm$^2$); this value is familiar from collider physics as the typical weak process cross section. To be precise this weak cross section would involve the mass too, as $G_F^2m_\chi^2$, so the mass of dark matter if the interaction is the weak force would be around 10 GeV. The fact that the required values are around a typical weak cross section (like that of the neutrino and proton to neutron conversion) and electroweak scale masses was taken to be a strong indication for models of WIMP dark matter, like those based on supersymmetry, going as far as to call it a miracle. It is indeed strong theoretical evidence (although divine intervention is not needed) that the solutions to two independent open problems in physics, i.e. the electroweak hierarchy problem and dark matter, point towards particles of the same mass/couplings.

10.2 Freeze-in

We can revisit our initial conditions to find a thermal production mechanism with qualitative different features while still described by Boltzmann equation. The freeze out mechanism described above assumed an initial equilibrium abundance; this is a plausible possibility since as we have seen the equilibrium solution is locally an attractor and we believe that all standard model particles used to be in this state. For the standard model particles this was derived considering their typical couplings, α, as compared to the Hubble rate so for temperatures Teq4πα2MPl1016GeV and below an equilibrium abundance T3 was reached. However for dark matter we do not know what is the value of the couplings but what we can say is that the smaller it is, the longer it would take to attain equilibrium Teq4παdm2MPl. In particular if this temperature is smaller than the mass of the dark matter particle equilibrium is never attained since by temperature Teq the production is not energetically viable. So roughly speaking

$$\alpha_{dm}^2M_{Pl}>m_{dm}\qquad\text{equilibrium is attained}$$ (10.14)
$$\alpha_{dm}^2M_{Pl}<m_{dm}\qquad\text{equilibrium is not attained}$$ (10.15)
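As a quick numerical illustration of the rough criterion in eqs. (10.14) and (10.15), the sketch below compares $\alpha_{dm}^2M_{Pl}$ with $m_{dm}$. The function name `attains_equilibrium`, the value $M_{Pl}=1.22\times10^{19}$ GeV and the sample couplings and mass are assumptions made here for illustration only.

```python
M_PL_GEV = 1.22e19  # Planck mass in GeV (assumed value)


def attains_equilibrium(alpha_dm: float, m_dm_gev: float) -> bool:
    """Rough criterion of eqs. (10.14)-(10.15): compare alpha_dm^2 M_Pl with m_dm."""
    return alpha_dm**2 * M_PL_GEV > m_dm_gev


# An order-1e-2 coupling thermalises a 100 GeV particle, a feeble one does not.
print(attains_equilibrium(1e-2, 100.0))   # alpha^2 M_Pl is of order 1e15 GeV
print(attains_equilibrium(1e-15, 100.0))  # alpha^2 M_Pl is of order 1e-11 GeV
```

The second case is the feeble-coupling regime explored in the rest of this section.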
Dark matter density lines as we cross its mass threshold in temperature for freeze in; the larger the interaction, the more abundant
Figure 14: Freeze in

The second possibility follows from feeble (i.e. small) couplings and is what we explore next. Although there are a few possibilities for the reaction that produces feeble dark matter, including three-body interactions, here we will take the one closest to what we have done thus far, the reaction $C+\chi\leftrightarrow A+B$, where all particles $A,B,C$ are in thermal equilibrium with the bath. For simplicity we take the dark matter mass to be the only relevant mass of the four particles.

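We have by now used the basic tools to describe this scenario a good number of times; the Boltzmann equation reads

$$a^{-3}\frac{d\left(a^3(t)\,n_{dm}\right)}{dt}=n_{dm}^{(0)}n_C^{(0)}\langle\sigma v\rangle\left(1-\frac{n_{dm}}{n_{dm}^{(0)}}\right)$$ (10.16)

Our customary refurbishment of the equation gives

$$x\frac{dY}{dx}=\frac{n_C(x)\langle\sigma v\rangle(x)}{H(x)}\left(Y^{(0)}-Y\right)\equiv\frac{\Gamma(x)}{H(x)}\left(Y^{(0)}-Y\right)$$ (10.17)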

This is a linear inhomogeneous first-order equation and hence we can solve it exactly; nonetheless, given our assumption that equilibrium is never quite reached, one does not even need the full solution. Neglecting $Y$ on the RHS, which is assumed never to reach $Y^{(0)}$ before $x=1$, one can approximate

$$x\frac{dY}{dx}=\frac{\Gamma}{H}Y^{(0)}\quad\Rightarrow\quad Y(x)=\int\frac{dx}{x}\frac{\Gamma}{H}Y^{(0)}$$ (10.18)
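For reference, the exact solution of the linear equation (10.17) can be written with an integrating factor. The following is a sketch, writing $r(x)\equiv\Gamma(x)/H(x)$ and assuming an initial condition $Y(x_i)$ at some early $x_i$; both pieces of notation are introduced here for illustration only.

```latex
Y(x) = Y(x_i)\,\exp\!\left(-\int_{x_i}^{x}\frac{dx'}{x'}\,r(x')\right)
     + \int_{x_i}^{x}\frac{dx'}{x'}\,r(x')\,Y^{(0)}(x')\,
       \exp\!\left(-\int_{x'}^{x}\frac{dx''}{x''}\,r(x'')\right)
```

For $r\ll1$ the exponentials are close to one, and with $Y(x_i)=0$ the second term reduces to the approximation of eq. (10.18).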

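Given that $Y^{(0)}\simeq g_\chi\pi^2/30$ for $x\ll1$ and $Y^{(0)}\sim e^{-x}$ for $x\gg1$, we can further approximate it by a step function, whereas taking the cross section to be $\langle\sigma v\rangle\simeq4\pi\alpha_{dm}^2/T^2$ one obtains

$$Y(x)\simeq g_\chi g_C\left(\frac{\pi^2}{30}\right)^2\frac{M_{Pl}\,4\pi\alpha_{dm}^2}{m_{dm}\sqrt{4\pi^3g_*/45}}\int dx\,\Theta(x_{fi}-x)$$ (10.19)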

So in this rough picture the abundance initially grows linearly with x up until x=xfi when it reaches a plateau. The actual shape can be appreciated in fig. 14.
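The step-function estimate of eq. (10.19) can be evaluated directly. The sketch below is only an illustration: the function name `freeze_in_yield`, the default degrees of freedom, the choice $x_{fi}=1$ and the value of $M_{Pl}$ are assumptions, not values fixed by the notes.

```python
import math

M_PL_GEV = 1.22e19  # Planck mass in GeV (assumed value)


def freeze_in_yield(x, alpha_dm, m_dm_gev, g_chi=1.0, g_c=1.0,
                    g_star=106.75, x_fi=1.0):
    """Step-function estimate of eq. (10.19): linear growth in x up to x_fi."""
    prefactor = (g_chi * g_c * (math.pi**2 / 30.0) ** 2
                 * M_PL_GEV * 4.0 * math.pi * alpha_dm**2
                 / (m_dm_gev * math.sqrt(4.0 * math.pi**3 * g_star / 45.0)))
    # The integral of Theta(x_fi - x') from 0 to x is min(x, x_fi).
    return prefactor * min(x, x_fi)


# Illustrative inputs: alpha_dm = 1e-15 and m_dm = 100 GeV.
for x in (0.25, 0.5, 1.0, 10.0):
    print(x, freeze_in_yield(x, 1e-15, 100.0))
```

The printed values double between $x=0.25$ and $x=0.5$ and stop changing beyond $x_{fi}$, which is the linear growth followed by a plateau described above.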

Converting the yield to present-day energy density involves the same manipulation of scale factors and temperature as for freeze out, so one has:

$$\rho_{dm}(t)=m_{dm}Y(x\gg x_{fi})\,T_0^3\frac{g_*(t)}{g_*(t_1)}=g_{dm}g_C\left(\frac{\pi^2}{30}\right)^2\frac{M_{Pl}\,4\pi\alpha_{dm}^2\,x_{fi}}{\sqrt{4\pi^3g_*/45}}\,T_0^3\frac{g_*(t)}{g_*(t_1)}$$ (10.20)

which gives a relative contribution of

$$\Omega_{dm}=2.6\left(\frac{2x_{fi}\,g_{dm}g_C}{g_*^{3/2}}\right)\left(\frac{4\pi\alpha_{dm}^2}{10^{-28}}\right)$$ (10.21)

which again is independent of the dark matter mass, while the prefactor points to the right relic abundance for a very small coupling $\alpha_{dm}\sim10^{-15}$.

Finally, one can synthesize the two mechanisms in terms of the same variables with $\langle\sigma v\rangle(x_{fi})=4\pi\alpha_{dm}^2/m_{dm}^2$ so that

$$Y\sim\left.\frac{H}{\Gamma}\right|_{x_f}\sim\frac{1}{m_{dm}M_{Pl}\langle\sigma v\rangle(x_f)}\qquad\text{Freeze out}$$ (10.22)
$$Y\sim\left.\frac{\Gamma}{H}\right|_{x_{fi}}\sim m_{dm}M_{Pl}\langle\sigma v\rangle(x_{fi})\qquad\text{Freeze in}$$ (10.23)

which can be regarded as the outcome of expanding in a small parameter: a small rate of interactions over Hubble for freeze in, or a small Hubble over the interaction rate for freeze out. Note that in both cases the yield is proportional to the small parameter and $Y\ll1$. At the same time, a valuable property that these mechanisms share is the independence of the initial time or temperature $x_i$ at which one assumes the evolution starts.
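To get a feeling for the orders of magnitude in eqs. (10.22) and (10.23), the sketch below evaluates both scalings. The helper names, the mass of 100 GeV, the value of $M_{Pl}$ and the two sample cross sections (a weak-like $G_F^2m_{dm}^2$ and a feeble $4\pi\alpha_{dm}^2/m_{dm}^2$ with $\alpha_{dm}=10^{-15}$) are assumptions chosen for illustration, and order-one factors are ignored.

```python
import math

M_PL_GEV = 1.22e19   # Planck mass in GeV (assumed value)
G_F = 1.166e-5       # Fermi constant in GeV^-2


def small_parameter(m_dm_gev, sigma_v):
    """m_dm * M_Pl * <sigma v>, roughly Gamma/H around x ~ 1 (sigma_v in GeV^-2)."""
    return m_dm_gev * M_PL_GEV * sigma_v


def yield_freeze_out(m_dm_gev, sigma_v):
    """Order-of-magnitude yield of eq. (10.22), Y ~ H/Gamma."""
    return 1.0 / small_parameter(m_dm_gev, sigma_v)


def yield_freeze_in(m_dm_gev, sigma_v):
    """Order-of-magnitude yield of eq. (10.23), Y ~ Gamma/H."""
    return small_parameter(m_dm_gev, sigma_v)


m = 100.0                                     # illustrative mass in GeV
weak = G_F**2 * m**2                          # weak-like cross section
feeble = 4.0 * math.pi * (1e-15) ** 2 / m**2  # feeble coupling alpha_dm ~ 1e-15
print(yield_freeze_out(m, weak), yield_freeze_in(m, feeble))
```

Both printed yields are far below one, each being controlled by its own small ratio of rates.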

Problems

I Substitute $\langle\sigma v\rangle=G_F^2m_{dm}^2$ in eq. (10.13) and find the mass of dark matter as a function of $x_f$ and $g_*$.
II For eq. (10.21) of freeze in, substitute $4\pi\alpha_{dm}^2=m_{dm}^2\langle\sigma v\rangle$ and derive a cross section-mass relation. For a cross section around $10^{-12}$ ab, with ab $=10^{-15}$ b and b $=10^{-24}\,\mathrm{cm}^2$, what would the dark matter mass be? You can take $g_*=106.75$.
III A more accurate estimate of the freeze-out time is $x_f$ such that

$$n^{(0)}(x_f)\langle\sigma v\rangle=H(x_f)$$ (10.24)

substitute in the equilibrium abundance for a non-relativistic particle to find the explicit equation for $x_f$.
IV Both freeze in and freeze out predict a small yield $Y$; can you recall any instance in which a particle retains an order-one yield after decoupling? If we use this third case to produce dark matter, what would be the approximate mass required (you can neglect changes in $g_*$)?

11 Inflation

Experimental evidence shows that the Einstein field equations describe the dynamics of the universe back to a hot bath of particles of temperature at least $T\sim$ MeV, when the expansion was dominated by radiation. One can take the scale factor for this period, that is for $t<t_{eq}$, as:

$$a(t)=\frac{1}{z_{eq}}\left(\frac{t}{2.1\times10^{12}\,\mathrm{s}}\right)^{1/2}$$ (11.1)

so matter-radiation equality occurred when the universe was about a hundred thousand years old. Radiation domination at early enough times is a consequence of the scaling with $a$ of its energy density $\rho_r$ being the steepest of the known substances; it would seem reasonable then to extrapolate radiation domination back to arbitrarily small time $t\to0$. If we do this, however, we run into the horizon problem.
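Eq. (11.1) is simple enough to evaluate by hand, but a short sketch makes the numbers explicit. The value $z_{eq}=3400$, the conversion factor for a year and the function name are assumptions introduced here for illustration.

```python
SECONDS_PER_YEAR = 3.156e7
T_REF_S = 2.1e12   # reference time appearing in eq. (11.1), in seconds
Z_EQ = 3400.0      # assumed redshift of matter-radiation equality


def scale_factor_radiation(t_s, z_eq=Z_EQ):
    """Scale factor of eq. (11.1), valid for t < t_eq."""
    return (t_s / T_REF_S) ** 0.5 / z_eq


print(T_REF_S / SECONDS_PER_YEAR)   # reference time in years
print(scale_factor_radiation(1.0))  # scale factor around BBN, t ~ 1 s
```

The reference time comes out at several times $10^4$ years, and the scale factor at one second is of order $10^{-10}$, i.e. a redshift of a few times $10^9$.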

Recall from sec. 3 that events that occurred in opposite directions in the sky at time $t$, and whose light is reaching us now, could have been in causal contact or not depending on

$$\eta(t)\geq\eta(t_0)/2\quad\Rightarrow\quad\text{causally connected}$$ (11.2)
$$\eta(t)<\eta(t_0)/2\quad\Rightarrow\quad\text{causally disconnected}$$ (11.3)

Take for example two photons from the CMB. Decoupling occurred at $t_*\simeq380{,}000$ y, a bit into the matter era; taking this into account we compute $\eta(t_*)$ and $\eta(t_0)$ as (explicitly we take $z_*\simeq10^3$)

$$\int_0^{t_*}\frac{dt}{a(t)}=0.075\,H_0^{-1}\simeq10^9\,\mathrm{y}\qquad\int_0^{t_0}\frac{dt}{a(t)}\simeq3.2\,H_0^{-1}\simeq45\times10^9\,\mathrm{y}$$ (11.4)

The ratio is much less than 1/2, so these two photons had not been in causal contact; how come then that the temperature of the CMB is the same in all directions to one part in a hundred thousand? (Footnote 6: It is worth remarking that even though we have evidence of events prior to decoupling, this is the furthest we can see; before decoupling light did not travel in a straight line.)
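The two conformal times in eq. (11.4) can be estimated numerically from $\eta=\int da/(a^2H)$ with $H^2=H_0^2(\Omega_ra^{-4}+\Omega_ma^{-3}+\Omega_\Lambda)$. The sketch below does this with a midpoint rule; the density parameters are assumed, illustrative values for a flat universe, so the outputs should only be expected to agree with the quoted numbers at the level of the approximations involved.

```python
import math

# Assumed present-day density parameters (illustrative values, flat universe).
OMEGA_R, OMEGA_M, OMEGA_L = 9.0e-5, 0.31, 0.69


def conformal_time(a_end, steps=20000):
    """Return H0 * eta(a_end) = integral_0^a_end da / sqrt(O_r + O_m a + O_L a^4).

    The substitution a = u^2 smooths the integrand near a = 0; the integral
    over u is then done with the midpoint rule.
    """
    u_end = math.sqrt(a_end)
    du = u_end / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du
        a = u * u
        total += 2.0 * u / math.sqrt(OMEGA_R + OMEGA_M * a + OMEGA_L * a**4) * du
    return total


eta_star = conformal_time(1.0 / 1001.0)  # decoupling, taking z* ~ 10^3
eta_0 = conformal_time(1.0)              # today
print(eta_star, eta_0, eta_star / eta_0)
```

With these inputs the ratio comes out at the level of a few percent, far below the 1/2 needed for causal contact.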

11.1 Solving the horizon problem

To restore causality we have to give up our extrapolation of radiation domination all the way back to $t=0$ and substitute it by another $a(t)$ behaviour. In order to make the events we see in the sky causally connected we should push them into the upper half of $\eta(t_0)$, which can be accomplished if we make $\eta$ grow much faster than ordinary time in an early phase. In other words, make the first integral in eq. (11.4) for some early time interval much larger than the interval itself: inflate it. As to how much larger, given that the universe is radiation dominated at least as early as BBN ($t\sim1$ s), we want the conformal time for an event at some time $t_e$ smaller than a second to be already larger than half of $\eta(t_0)=3.2H_0^{-1}=3.4\,t_0$, i.e.

$$\int_0^{t_e}\frac{dt}{a(t)}\geq\frac{3.4}{2}t_0$$ (11.5)

Let us cast this in a slightly more readable form as

$$\int\frac{da}{a}\frac{a_eH_e}{aH}\geq a_eH_e\frac{3.4}{2}t_0\simeq10^8\sqrt{\frac{\mathrm{s}}{t_e}}$$ (11.6)

where for $a_eH(t_e)$ we used the radiation domination formula, assuming the universe has just started that phase. The integral on the LHS has a first factor which is $d\log a$, so the contribution during an e-fold, $a(t')/a(t)=e$, is roughly given by the remaining integrand if it can be taken as a constant. The integral must yield a hierarchy of at least 8 orders of magnitude; in reality, typical models solving the horizon problem operate so early ($t\sim10^{-32}$ to $10^{-40}$ s) that the hierarchy to be accounted for corresponds to 50 or 60 e-folds. What could possibly produce such a hierarchy?

A very mild requirement for this integral to be much larger in the past is to have the integrand be larger the further back in time, i.e. the smaller $a$, that is

$$\frac{d}{dt}\frac{1}{aH}<0\quad\Rightarrow\quad\frac{d}{dt}\frac{1}{\dot a(t)}=-\frac{\ddot a}{(\dot a)^2}<0$$ (11.7)

That implies accelerated expansion, $d^2a/dt^2>0$; none of radiation, matter or even curvature can reproduce this behaviour, but we have encountered a case that does the job.

Assume then an epoch of cosmological constant domination; we would then have

$$a(t)=a(t_e)\,e^{\sqrt{\Lambda_I/3}\,(t-t_e)}\qquad t<t_e$$ (11.8)

Then the conformal time integral, with $H_I=\sqrt{\Lambda_I/3}$, reads

$$a_eH_e\int\frac{dt}{a(t)}=H_e\int_{t_i}^{t_e}dt\,e^{H_I(t_e-t)}=\frac{H_e}{H_I}\left(e^{H_I(t_e-t_i)}-1\right)$$ (11.9)

If one approximates the Hubble rate at the end of and during the acceleration to be approximately the same, the expression simplifies to

$$e^{H_I\Delta t}=\frac{a(t_e)}{a(t_i)}>10^{22}\sqrt{\frac{10^{-32}\,\mathrm{s}}{t_e}}\sim10^{10}\frac{T}{\mathrm{MeV}}$$ (11.10)

Hence one can account for the large hierarchy with an exponential of the fundamental parameters of the theory; it suffices that the time interval lasts some 60 times the Hubble time during inflation. In terms of the growth of the universe, it has to be in an accelerated phase for that many e-folds, i.e. it inflated by a factor of $10^{27}$. Note that we were led to this solution by the mild condition of eq. (11.7).
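The number of e-folds implied by the bound in eq. (11.10) is just the logarithm of its right-hand side. The sketch below evaluates it; the function name `min_efolds` and the two sample end times are assumptions chosen to bracket the range of $t_e$ quoted above.

```python
import math


def min_efolds(t_e_s):
    """Natural log of the bound in eq. (11.10): 10^22 * sqrt(1e-32 s / t_e)."""
    return 22.0 * math.log(10.0) + 0.5 * math.log(1e-32 / t_e_s)


for t_e in (1e-32, 1e-40):
    print(t_e, min_efolds(t_e))
```

The two cases land close to 50 and 60 e-folds respectively, the range usually quoted.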

Diagram of scale factor vs time to show how inflation requires a very fast increase
Figure 15: Inflating a to solve the horizon problem, the blue and red areas are supposed to be the same

We turn next to the implementation of this mechanism in quantum field theory.

11.2 Realization from QFT

One of the differences between the accelerated expansion we have today and the phase which might have happened at very early times is that the latter has to end and give way to radiation domination. Another, quantitative, difference is that the cosmological constant during inflation and nowadays differ by many, many orders of magnitude (see problem sheet). These differences point towards something not as ‘simple’ as a cosmological constant, so one should explore the possibility of a transient, very large cosmological constant.

To explore this possibility we look into the evolution of the universe when Hubble is dominated by the energy density of a scalar field with a potential. We only know of one (possibly) elementary scalar field in nature which could play this role, the Higgs boson. Nevertheless the Standard Model as we know it does not have the right couplings to produce inflation, so here we will assume our scalar is another scalar particle, conventionally called the inflaton $\phi$. We derive our dynamics from the action:

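$$\int d^4x\sqrt{|g|}\left(-\frac{R}{16\pi G_N}+\frac12\partial_\mu\phi\,\partial^\mu\phi-V(\phi)\right)$$ (11.11)

This action has an energy-momentum tensor:

$$T_{\mu\nu}=\partial_\mu\phi\,\partial_\nu\phi-g_{\mu\nu}\left[(\partial\phi)^2/2-V(\phi)\right]$$ (11.12)

so that the Einstein field equations, with $\mathcal{L}_\phi=\frac12\partial_\mu\phi\,\partial^\mu\phi-V(\phi)$ the scalar part of the Lagrangian, together with the scalar EOM read

$$R_{\mu\nu}-\frac12g_{\mu\nu}R=8\pi G_N\left(\frac{2}{\sqrt{|g|}}\frac{\delta\left(\sqrt{|g|}\,\mathcal{L}_\phi\right)}{\delta g^{\mu\nu}}\right)=8\pi G_N\left(\partial_\mu\phi\,\partial_\nu\phi-g_{\mu\nu}\left[(\partial\phi)^2/2-V(\phi)\right]\right)$$ (11.13)
$$\nabla_\mu\nabla^\mu\phi=-V'(\phi)$$ (11.14)

Substituting in now our beloved homogeneous approximation for the metric and the scalar field, $\phi=\phi(t)$, $\partial_\mu\phi=(\dot\phi,\mathbf{0})$, one finds

$$T_{\mu\nu}=\begin{pmatrix}\frac12\dot\phi^2+V(\phi)&\\&a^2\left(\frac12\dot\phi^2-V(\phi)\right)\delta_{ij}\end{pmatrix}=\begin{pmatrix}\rho&\\&a^2\mathcal{P}\,\delta_{ij}\end{pmatrix}$$ (11.15)

where the associated energy density and pressure can be read off. In particular one sees that if the potential term dominates (and is positive) there can be negative pressure and a phase of acceleration.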

The EOM with the homogeneous ansatz collapse to three equations

$$3H^2=8\pi G_N\left((\partial_t\phi)^2/2+V(\phi)\right)$$ (11.16)
$$-3H^2-2\dot H=8\pi G_N\left((\partial_t\phi)^2/2-V(\phi)\right)$$ (11.17)
$$\ddot\phi+3H\dot\phi=-V'(\phi)$$ (11.18)

where the Einstein field equations can be linearly combined into

$$H^2=\frac{8\pi}{3}G_N\left(\frac{\dot\phi^2}{2}+V(\phi)\right)\qquad\dot H=-4\pi G_N(\dot\phi)^2$$ (11.19)

If an accelerated expansion takes place, the $H^2$ term dominates and is approximately constant, which is to say that the ratio $\epsilon$ is small

$$\epsilon\equiv-\frac{\dot H}{H^2}$$ (11.20)

so one of the conditions for inflation is $\epsilon\ll1$. Nonetheless one should also inspect the EOM for $\phi$ to ensure that a nearly constant value of $\phi$, yielding a nearly constant $V(\phi)$, is dynamically viable. The EOM for $\phi$ resembles that of a point mass in one dimension subject to a potential but moving in a viscous substance that adds a drag $H$. The field can then move slowly if the drag dominates over the acceleration term. This gives the second condition for inflation, which is needed for it to last long enough: the $\phi$ EOM should be dominated by the friction term, i.e.

$$\delta\equiv-\frac{\ddot\phi}{H\dot\phi}$$ (11.21)

should also be small. If it is indeed small and so is $\epsilon$, one can approximate:

$$3H\dot\phi\simeq-V'\qquad\epsilon\simeq\frac{3\dot\phi^2}{2V}=\frac{V'^2}{6H^2V}=\frac{1}{16\pi G_N}\left(\frac{V'}{V}\right)^2$$ (11.22)
$$3\dot H\dot\phi+3H\ddot\phi\simeq-V''\dot\phi\qquad-3\epsilon-3\delta\simeq-\frac{V''}{H^2}=-\frac{3}{8\pi G_N}\frac{V''}{V}$$ (11.23)

While the two conditions $\delta,\epsilon\ll1$ are satisfied there will be an inflationary phase.
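The slow-roll expressions of eqs. (11.22) and (11.23) can be evaluated for any potential with finite differences. The sketch below is only an illustration: it works in units $G_N=M_{Pl}^{-2}=1$, and the quartic potential $V=\lambda\phi^4/4$, the value of $\lambda$, the field value and the function name `slow_roll` are all assumptions chosen here as an example.

```python
import math

G_N = 1.0  # units with M_Pl = G_N^(-1/2) = 1 (assumption of this sketch)


def slow_roll(V, phi, h=1e-4):
    """Return (epsilon, delta) from eqs. (11.22)-(11.23) by central differences."""
    v0 = V(phi)
    dv = (V(phi + h) - V(phi - h)) / (2.0 * h)
    d2v = (V(phi + h) - 2.0 * v0 + V(phi - h)) / h**2
    eps = (dv / v0) ** 2 / (16.0 * math.pi * G_N)
    eps_plus_delta = d2v / v0 / (8.0 * math.pi * G_N)
    return eps, eps_plus_delta - eps


def quartic(phi, lam=1e-12):
    """Illustrative potential V = lam * phi^4 / 4 (not a model used in the notes)."""
    return 0.25 * lam * phi**4


eps, delta = slow_roll(quartic, 5.0)
print(eps, delta)
```

For this example the analytic values are $\epsilon=1/(\pi\phi^2)$ and $\delta=1/(2\pi\phi^2)$ in Planck units, so both parameters are small only for field values above the Planck mass.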

Nonetheless, inflation has to end too! One can estimate the duration of inflation from the potential, and it should be such that the growth of the scale factor is the huge value given earlier, so let us write

$$N_{e\text{-folds}}=51-\frac12\log\left(\frac{t_e}{10^{-32}\,\mathrm{s}}\right)=\int H(t)\,dt=\int\frac{H}{\dot\phi}\,d\phi=\int_{\phi_i}^{\phi_e}\sqrt{\frac{4\pi G_N}{\epsilon}}\,d\phi$$ (11.24)

where $\phi_i$ is the value of the field at the beginning of inflation and the value at the end, $\phi_e=\phi(t_e)$, we determine from

$$M_{Pl}\frac{V'(\phi_e)}{V(\phi_e)}\sim1$$ (11.25)

The scenario we described is referred to as slow roll, yet this might be quite misleading if we use our intuition for balls rolling down slopes; the physical system here does not have quite the same dynamics. For one, the Hubble friction depends on the potential itself and is larger the higher $V$, so for example a $\phi^2$ potential, which you might think is too steep, can do the job.

These are the basic dynamics of inflation at the homogeneous and isotropic level; we have seen that there is the possibility of a scalar field stuck at a nearly constant value for some period and eventually accelerating out of this phase. The inflationary phase is so violent that it dilutes all possible initial abundances and possible curvature; the latter is actually a convenient by-product since, as we know, the universe is observed to have no curvature. As for the dilution of all particle abundances, this poses a potential problem since the universe should enter a hot radiation era after inflation; the abundance of Standard Model particles should increase up to their equilibrium abundances. This is assumed to happen in the reheating period, when the inflaton decays and produces Standard Model particles. Although successful reheating is not guaranteed and deserves its own study, here we shall not go any deeper into it.

11.3 Initial perturbations

Primordial tilt vs tensor to scalar ratio showing the experimentally allowed regions and how they exclude some models while others like R squared are still compatible
Figure 16: Bounds on inflation parameters: primordial tilt $n_s=1-6\epsilon+4\delta$ and tensor-to-scalar ratio $r=16\epsilon$ for different models.

The uses of inflation extend beyond solving the horizon problem which lends further credibility to the theory. Inflation also provides the initial perturbations that seeded the distribution of matter as we observe it today. One of the consequences of the violence of inflation is that the causality of general relativity interferes with quantum mechanics.

Vacuum fluctuations of momentum $\Delta p$ are allowed by the uncertainty principle to live within a distance $\Delta x\sim1/\Delta p$; however, inflation is stretching out space and does it so fast that it might causally disconnect the region $\Delta x$. In that case the perturbation cannot evolve in time any longer and is said to be frozen out, or to have grown larger than the horizon. The basis for this prediction is then the computation of the spectrum of vacuum oscillations, but this has to be computed in a de-Sitter vacuum $|dS\rangle$, that is, in an accelerating universe

$$\langle dS|\phi(x)\phi(y)|dS\rangle$$ (11.26)

You might be familiar with this object in the case of QFT with a Minkowski vacuum; it is the field propagator or two-point function $\int d^4k\,e^{ik(x-y)}/(k^2-m^2)$. The computation here requires quantization in curved space-time; we have not yet acquired the tools to carry it out, so we will not reproduce the result here. The promise of this computation is nevertheless awe-inspiring in its ambition: to account for the seeds of the universe as we know it, us included. The predicted initial perturbations can be compared with the inhomogeneities and anisotropies observed today. Although the theory has indeed passed some non-trivial tests, at the moment we do not know much about the potential of inflation and we have not stress tested the theory. Measurements of inhomogeneities can be translated to parameters of inflation, namely combinations of $\epsilon$ and $\delta$, as reported in fig. 16.

Problems

I Present conformal time, assuming no inflationary phase, is dominated by late times and is approximately $\eta(t_0)=3.2H_0^{-1}$. Compute $t_{1/2}$ such that $\eta(t_{1/2})=\eta(t_0)/2$, assuming you can take a matter dominated universe for the whole integration interval. What are the redshift and CMB temperature at $t_{1/2}$? Had decoupling occurred ($T_*\simeq0.3$ eV)?
II Take the potential to be a mass term $V(\phi)=m^2\phi^2/2$ and compute $\epsilon,\delta$. For small parameters does the value of the field have to be above or below $M_{Pl}$? For that value is $H$ above $M_{Pl}$? Compute the e-folds in eq. (11.24) taking $\phi_e$ from eq. (11.25), and write it as a function of $\epsilon(\phi_i)$. If we approximate $\epsilon$, $\delta$ during inflation by their values at this slow-rolling value $\phi_i$, we have a prediction for $n_s$ and $r$; where do they fall in fig. 16 (take $N=50$ and $60$)?

12 Cosmology as a field theory

The smooth hot expanding universe study has produced a number of tested predictions that confirmed the theory together with other untested yet plausible ones as we have seen in the previous lectures. We have indeed managed to learn a good deal and to turn back the dial of time to find what the universe used to look like to a good approximation.

This description nonetheless does not account for the fact that the universe is not exactly the same everywhere and the stress energy tensor and curvature vary from place to place, even at the largest scales.

Nonetheless, both at the largest scales and at early times this space dependence of energy density and curvature is a small correction, of the order of one part in a hundred thousand, which means we can roll out the familiar machinery of perturbation theory. Even in the regime in which perturbation theory does not hold, like in structure formation, one can resort to numerical solutions and simulations, although we won’t cover these here.

Moving into this more detailed examination of the cosmos it is important to highlight a distinction. There will be aspects of what we see that we cannot account for, like why our group of galaxies contains a few dozen members; what theory will be able to do is trace our current situation back in time, or tell us how big groups of galaxies are on average.

This is to say, as we resolve spatial distributions, one makes use of statistical physics to treat the data whereas on the theory side full fledged quantum field theory in curved space time is put to use.

12.1 Perturbation theory for Cosmology

The advantage of perturbation theory is that the groundwork has been laid down with our first order study of the universe and we’ve made the journey once from theory to experiment. Now one needs only trace back our steps and add in a small space-dependent modification. In practice the algebra will get considerably more complicated, partly because we are aiming at describing more complex experimental data, partly due to the need to revisit our variables with new terms in our metric and the structure of general relativity.

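So one starts from a metric with small deviations $\Psi,\Phi,B,h$ from FRW as:

$$g_{\mu\nu}=\bar g_{\mu\nu}+\delta g_{\mu\nu}=\begin{pmatrix}1+2\Psi&B_i\\B_i&-a^2\left((1+2\Phi)\delta_{ij}+h_{ij}\right)\end{pmatrix}$$ (12.1)

where $\bar g$ is the metric we have been working with thus far, $\bar g=\mathrm{Diag}(1,-a^2,-a^2,-a^2)$. Filtering this through the machinery of Riemannian geometry and keeping only linear terms in the perturbations one finds

$$\Gamma^\mu_{\nu\rho}=\bar\Gamma^\mu_{\nu\rho}-\delta g^{\mu\mu'}\bar\Gamma_{\mu'\nu\rho}+\bar g^{\mu\mu'}\frac12\left(\partial_{\{\nu}\delta g_{\mu'\rho\}}-\partial_{\mu'}\delta g_{\nu\rho}\right)$$ (12.2)
$$R=\bar R+\delta R$$ (12.3)

to arrive at the modified Einstein field equations

$$\delta R_{\mu\nu}-\frac{\bar R}{2}\delta g_{\mu\nu}-\frac{\delta R}{2}\bar g_{\mu\nu}=8\pi G_N\,\delta T_{\mu\nu}$$ (12.4)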

The LHS can be obtained following a familiar yet now computationally intensive procedure.

The decomposition theorem tells us that perturbations with different space indices, that is to say in different representations of the rotation subgroup of the Lorentz group, evolve independently at the linear level. For simplicity we focus on the scalar perturbations next, yet we note that the two-index (traceless) object $h_{ij}$ describes possible gravitational wave perturbations.

The RHS of the Einstein field equations must nonetheless be revised as well, and to do so we go back to the Boltzmann equation. For a collisionless set of particles, the change with time of the distribution function at some phase-space point $(x,p)$ is given by the particles that ‘move into’ $(x,p)$ minus the particles that leave it. The particles that move in were at position $x-v\,\delta t$ and momentum $p-(dp/dt)\,\delta t$ an instant $\delta t$ before, so the change in the distribution function is:

$$\delta f=f\left(x-\frac{dx}{dt}\delta t,\;p-\frac{dp}{dt}\delta t\right)-f(x,p)$$ (12.5)
$$\frac{\partial f}{\partial t}=-\frac{dx}{dt}\cdot\nabla_xf-\frac{dp}{dt}\cdot\nabla_pf$$ (12.6)

Or in other terms, the total derivative vanishes, $df/dt=0$. This holds for any system whose dynamics are given by a Hamiltonian (itself coming from a Lagrangian), and the statement that the distribution function has zero total time derivative follows from Liouville’s theorem.

Nonetheless in our system there are collisions, and the expansion of the universe makes the definition of our variables not evident a priori. In particular recall that the momentum satisfies, for a particle of mass $m$:

$$g_{\mu\nu}P^\mu P^\nu=(1+2\Psi)(P^0)^2-a^2(1+2\Phi)P^iP^j\delta_{ij}=m^2$$ (12.7)

it is conventional to define

$$p^2=a^2(1+2\Phi)(P^i)^2\qquad E^2=(1+2\Psi)(P^0)^2$$ (12.8)

and find that, after some algebra and chain rules,

$$\frac{\partial f}{\partial t}+\frac{v^i}{a}\frac{\partial f}{\partial x^i}-E\left(\frac{v^i}{a}\frac{\partial\Psi}{\partial x^i}+v^2\left(H+\partial_t\Phi\right)\right)\frac{\partial f}{\partial E}=C[f]$$ (12.9)

where obtaining this equation required the modified geodesic equation; we changed to polar coordinates in $p^i$, traded the derivative with respect to the modulus $p$ for one with respect to the energy $E$, and neglected derivatives with respect to the angle since they give a second-order term. The RHS will depend on what species and regime one is looking at, e.g. for photons after $e^+e^-$ annihilation the relevant interaction is Compton scattering.

One can then close the system with

$$6H\left(\partial_t\Phi-H\Psi\right)-\frac{2\nabla^2}{a^2}\Phi=8\pi G_N\,\bar\rho\,\delta$$ (12.10)

where we expanded the energy density as $\rho=\bar\rho(1+\delta)$. Two simplifications help see clearly the physics at play. First, Fourier transforming in $x$ to $k$ (not to be confused with $p$), one realizes that the PDEs turn into independent ODEs,

$$\Phi(x,p)=\int\frac{d^3k}{(2\pi)^3}e^{ik\cdot x}\,\tilde\Phi(k,p)$$ (12.11)

and second, since all space derivatives in co-moving coordinates come with $1/a$, we can simplify this $t$-dependence by using conformal time instead.

All in all the distribution function equation takes the schematic form

$$\frac{d}{d\eta}\tilde f+i\frac{k\cdot p}{E}\tilde f=a\,C[f]+J$$ (12.12)

where $J$ is a source term linear in the remaining perturbations, and the collision term $C[f]$, which we did not make explicit, is nonetheless expanded to be linear in $\tilde f$. A word of warning about the literature: the tilde denoting the Fourier transform is often dropped.

Without being explicit and solving differential equations, the behaviour of the solutions can be understood from two observations:

  • First, linearity means all different $k$-modes evolve independently, that is, an initial distribution $f(\eta_0,k)=\delta^3(k-k_0)$ will maintain its monochromatic shape. This is true in our linear approximation, where we can use our intuition from quantum mechanics, or linear superposition, to understand that the modes evolve independently.

  • The solution to the homogeneous collision-less part is

    $$f=C\,e^{-ik\mu\eta}\qquad\mu=\hat k\cdot v=\frac{\hat k\cdot p}{E}$$ (12.13)

    where we note that for modes $k$ with $k\eta\ll1$ there is no time evolution

    $$\eta k\ll1\;\Leftrightarrow\;\eta\,\frac{2\pi}{L}\ll1\;\Rightarrow\;e^{ik\eta}\simeq\text{constant}$$ (12.14)

    This does indeed match one’s intuition from causality. Perturbations on scales $L$ larger than the horizon are static up until causal physics can operate. The fact that this behaviour can be accounted for with causality means that the conclusions are more general and apply to the general solution.

We have seen therefore that conformal time and Fourier analysis significantly clarify the evolution of perturbations and allow for distinction of two regimes as dictated by causality.
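The two regimes can be seen by evaluating the free-streaming phase of eq. (12.13) directly. In the sketch below the function name `free_streaming_phase` and the sample values of $k$, $\mu$ and $\eta$ (in arbitrary but consistent units) are assumptions for illustration.

```python
import cmath


def free_streaming_phase(k, mu, eta):
    """Phase factor exp(-i k mu eta) of the collision-less solution, eq. (12.13)."""
    return cmath.exp(-1j * k * mu * eta)


# Super-horizon mode, k * eta << 1: the phase has barely moved away from 1.
print(free_streaming_phase(1.0, 1.0, 1e-3))
# Sub-horizon mode, k * eta of order one or larger: the phase oscillates.
print(free_streaming_phase(1.0, 1.0, cmath.pi))
```

The first mode is effectively frozen, while the second has already completed half an oscillation.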

12.2 Observational data

One can look experimentally for inhomogeneities and anisotropies in a number of ways. If looking at matter distributions one has

$$\delta_m(x)=\frac{\rho_m(x)-\bar\rho_m}{\bar\rho_m}$$ (12.15)

Matter is nonetheless predominantly dark, an observational difficulty which is overcome by assuming baryonic matter follows dark matter, or by looking at the gravitational influence of dark matter, like lensing effects. Let us assume then that via galaxy surveys or other means we resolve a 3-dimensional spatial distribution of matter in the universe. Contrasting this distribution with theory is still not straightforward; as we outlined, theory cannot predict that an overdensity will develop here or there, and one should look instead at averaged or relative magnitudes. A way of doing this while maintaining a 3-dimensional data set is to look at correlations of the field squared; with enough statistics these will depend only on the relative distance:

$$\langle\langle\delta(x)\delta(y)\rangle\rangle=G(|x-y|)$$ (12.16)

where, at the risk of being pedantic, we introduced double angle brackets to signify averaging over distributions. It must indeed be so for the RHS to match the LHS, since every single element in the average on the LHS is the product of a function of $x$ and a function of $y$, but the outcome on the RHS is a function of the difference $x-y$.

This brings us to statistics and the ‘problem’ of having only one universe to look at (this is the way physicists actually think). This problem is only acute at the largest scales; if one is interested in, say, scales of 10 Mpc ($k\sim(10\,\mathrm{Mpc})^{-1}$) and below in our Gpc-sized visible universe, one can chop up the distribution into $(1000/10)^3$ boxes and take each one as a different distribution, so that there are enough statistics to obtain a meaningful correlation. It is then at the largest scales where having only one universe means we do not have enough data to compute eq. (12.16). This statistical limitation is at times referred to as cosmic variance. At the same time, going to smaller scales, i.e. larger $k$, we deal with perturbations which have undergone more processing to form structure like clusters of galaxies and have entered a non-linear regime. This makes extraction of the original perturbations at earlier times a more complicated problem. It is for this reason that testing the largest scales, i.e. smaller $k$, is a more direct probe of early-time conditions like inflation.

This wave-number discussion also serves to delimit the validity of perturbation theory and the scales at which it breaks down and structure forms, which is very interesting in itself.

Given our preference for Fourier analysis, the conventional definition for field-squared correlations is given instead in wave-number space in terms of the power spectrum $P$ as:

$$\langle\langle\tilde\delta(k)\tilde\delta(k')\rangle\rangle=P(|k|)\,(2\pi)^3\delta^3(k-k')$$ (12.17)

Much of the talk in cosmology is about this function $P$, the power spectrum, and to what degree it is scale invariant as inflation predicts. The program nonetheless is to compare the power spectrum today with the power spectrum when the perturbations were seeded, for which one needs to evolve them (back) in time with the equations above.

The other major experimental source of input on the perturbations is the CMB itself, which provides a snapshot of the universe at $t\simeq380{,}000$ y. The presence of inhomogeneities is clearly seen as anisotropies in the CMB, in a celebrated image whose resolution has been steadily improving over the last two decades to take the form of fig. 17.

Map of temperatures variations for the CMB, looks like a corrugated egg
Figure 17: Can you see the l’s for BAO?

These are areas slightly colder or hotter than average and offer a picture of the spatial distribution of matter at the time when the CMB photons were able to free stream. So it is best for comparison to write the perturbation of our photon distribution as:

$$f_\gamma=\frac{1}{e^{E/(T(t)+\Theta(t,x,p))}-1}\simeq\frac{1}{e^{E/T}-1}+\frac{e^{E/T}}{\left(e^{E/T}-1\right)^2}\frac{\Theta E}{T^2}$$ (12.18)

Unlike for galaxy surveys and other matter-related observables, the perturbations stayed small in the case of radiation, so one can extend the wave-number analysis to smaller scales. Nonetheless the information we have on the CMB is not fully 3-dimensional but two-dimensional, given by the direction one is looking in. So our Fourier decomposition is now discrete and given as

$$\Theta(x,p,\eta)=\sum_{l,m}a_{l,m}(x,\eta)\,Y_{l,m}(\hat p)\qquad\int d\Omega_{\hat p}\,Y^*_{l,m}Y_{l'm'}=\delta_{ll'}\delta_{mm'}$$ (12.19)

where $\hat p=p/|p|$ and now the two-field correlation reads:

$$\langle\langle a^*_{lm}a_{l'm'}\rangle\rangle=\delta_{ll'}\delta_{mm'}C_l$$ (12.20)

with $m$-independent variance $C_l$. One can then take the average over the different $m$, and the statistical limitation at the largest scales translates into the first few $l$'s having just $2l+1$ terms in $m$. This in turn can be related to the power spectrum of the radiation field

$$C_l=\int d\Omega_{\hat p}\,d\Omega_{\hat p'}\,Y_{lm}(\hat p)\,Y^*_{lm}(\hat p')\,\langle\langle\Theta(p)\Theta(p')\rangle\rangle=\frac{2}{\pi}\int k^2dk\,P_\Theta(k)$$ (12.21)

which can itself be related to a matter power spectrum, see [5].
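The averaging over $m$ behind eq. (12.20), and the limitation from having only $2l+1$ values of $m$ at each $l$, can be sketched as follows. The estimator and function names are assumptions introduced here, and the fractional scatter $\sqrt{2/(2l+1)}$ is the standard estimate for a Gaussian field, quoted as an outside input rather than derived in these notes.

```python
import math


def estimate_cl(alm):
    """Average |a_lm|^2 over the values of m available at fixed l."""
    return sum(abs(a) ** 2 for a in alm) / len(alm)


def cosmic_variance_fraction(l):
    """Standard Gaussian-field estimate Delta C_l / C_l = sqrt(2 / (2l + 1))."""
    return math.sqrt(2.0 / (2 * l + 1))


# Toy coefficients for l = 1 (three values of m), chosen by hand for illustration.
print(estimate_cl([1.0, -1.0, 1j]))
for l in (2, 10, 100):
    print(l, 2 * l + 1, cosmic_variance_fraction(l))
```

The fractional uncertainty shrinks as $l$ grows, which is the quantitative version of the statement that only the first few multipoles are limited by having a single sky.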

Fourier decomposition for the temperature variation in the CMB

In addition, there is more information in the CMB if one can measure the polarization of the photons, which is an objective that experiments such as BICEP are pursuing.

12.3 Initial perturbations from inflation

The theory of inflation does more than grow the conformal time to solve the horizon problem; it provides a possible source of the initial perturbations. These are the seeds that through dynamics grow into the structure of everything we see today, although to explain ‘us’ inflation does not suffice: we also need the matter over anti-matter asymmetry. The origin of these perturbations is, in this context, quantum fluctuations of the fields during inflation.

The process works approximately like this: the relevant fields and perturbations are those of the metric and the inflaton field. The inflaton field is driving inflation as some $\bar\phi(t)$ to leading order, which we refer to as the background field, just like there is a background metric with $a(t)$. On top of this evolution, fluctuations of the field are allowed at the quantum level by the uncertainty principle, so we have $\phi=\bar\phi+\delta\phi$ with

$$\langle dS|\delta\phi|dS\rangle=0\qquad\langle dS|\delta\phi(k)\,\delta\phi(k')|dS\rangle=P_\phi(k)\,(2\pi)^3\delta^3(k-k')$$ (12.22)

where by $dS$ we mean the de-Sitter vacuum, that is, a universe undergoing accelerated expansion. So to compute the power spectrum one should compute the two-point correlator of the field in a curved and dynamic background space. Although in an unfamiliar setting, this is a process which you have seen in QFT, and here it can be simplified by using conformal time and some field redefinitions; it falls outside the range of these lectures, but you can consult [5].

This spectrum of perturbations is however intrinsically quantum, so the fluctuations do not stick around. Another key aspect of the background is the acceleration of the scale factor. This means that scales $k$ in our spectrum will become larger than the horizon at some time. After this, causal physics cannot operate and the perturbation stays frozen up until the inflationary phase stops and conformal time grows large enough that $\eta k\sim1$. Borrowing results from the full computation, the primordial power spectrum reads

$$P_\phi(k,\eta)=\frac{(2\pi)^3}{k^{3+n}}\qquad\text{at }k\eta=1$$ (12.23)

So the fluctuations are snatched away by causality and come back. The evolution is then simplest for those perturbations which evolved the least, that is, those that re-entered the horizon the latest. These are the largest scales, which is why we say that inflation is probed on the small-$l$ range of the angular power spectrum.

Finally we note that perturbations are produced in this way for all fields involved in inflation, and this includes gravitons, which we may hope to detect given that detection of gravitational waves has proven possible for black hole and neutron star mergers.

References

  • [1] P.A.R. Ade et al. (2016) Planck 2015 results. XIII. Cosmological parameters. Astron. Astrophys. 594, A13. arXiv:1502.01589.
  • [2] N. Aghanim et al. (2020) Planck 2018 results. I. Overview and the cosmological legacy of Planck. Astron. Astrophys. 641, A1. arXiv:1807.06205.
  • [3] M. Aker et al. (2020) First operation of the KATRIN experiment with tritium. Eur. Phys. J. C 80 (3), 264. arXiv:1909.06069.
  • [4] E. J. Copeland, M. Sami, and S. Tsujikawa (2006) Dynamics of dark energy. Int. J. Mod. Phys. D 15, pp. 1753–1936. arXiv:hep-th/0603057.
  • [5] S. Dodelson (2003) Modern cosmology. Academic Press, An Imprint of Elsevier, San Diego, California. ISBN 9780080511979.
  • [6] A. Liddle (2015) An introduction to modern cosmology. Wiley, Chichester, West Sussex, United Kingdom. ISBN 1118690257.
  • [7] S. Weinberg (2008) Cosmology. Oxford University Press, Oxford New York. ISBN 9780191523601.