General covariance, diffeomorphism invariance, and background independence

My attempts to understand the significance of diffeomorphism invariance in general relativity have been hampered by the confusion surrounding active vs. passive transformations, invariance vs. (general) covariance, background independence, etc. This post comprises my ambitious attempt to settle the matter once and for all.

Let’s start with invariance vs. covariance, which is relatively straightforward. A quantity is invariant under a transformation if it remains unchanged; that is, if {F} is a functional of the fields {\phi}, and we make the transformation {\phi\rightarrow\phi'}, then {F[\phi]=F[\phi']} means that {F} is invariant under this transformation. Effectively, invariant quantities transform as scalars.

Covariance is the invariance of the form of physical laws under a given transformation. General covariance is a slight extension of this, in which their form is invariant under arbitrary (differentiable) coordinate transformations. For example, the action of a real scalar field {\phi} is invariant under a Lorentz transformation, while the Klein-Gordon equation is Lorentz covariant (meaning that if {\phi} satisfies the equation of motion, then so will {\phi'}).
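To make the Klein-Gordon statement concrete, here is a short sketch of why a Lorentz-transformed solution is again a solution. The explicit form of the equation, the mass {m}, and the choice {\phi'(x)\equiv\phi(\Lambda x)} are assumptions made for illustration; the argument does not depend on the signature convention.

```latex
% Assumed form of the Klein-Gordon equation (mass m, flat metric \eta):
(\eta^{\mu\nu}\partial_\mu\partial_\nu + m^2)\,\phi(x) = 0 .

% Transformed field and chain rule, with y = \Lambda x:
\phi'(x) \equiv \phi(\Lambda x), \qquad
\partial_\mu\phi'(x) = \Lambda^{\rho}_{~\mu}\,(\partial_\rho\phi)(y) .

% Defining property of a Lorentz transformation:
\eta^{\mu\nu}\,\Lambda^{\rho}_{~\mu}\,\Lambda^{\sigma}_{~\nu} = \eta^{\rho\sigma} .

% Hence the operator acting on \phi' reproduces the operator acting on \phi:
(\eta^{\mu\nu}\partial_\mu\partial_\nu + m^2)\,\phi'(x)
  = \big[(\eta^{\rho\sigma}\partial_\rho\partial_\sigma + m^2)\,\phi\big](\Lambda x) = 0 .
```

The same steps go through with {\Lambda} replaced by {\Lambda^{-1}}, since the inverse of a Lorentz transformation is again a Lorentz transformation.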

On to passive vs. active transformations. A passive transformation is merely a change of coordinates. In the case of the Lorentz group, which takes {x^\mu} to {x'^\mu=\Lambda^\mu_{~\nu}x^\nu}, we define {\phi'(x)=\phi(x')=\phi(\Lambda x)}. In other words, we think of {\phi} and {\phi'} as the same field configuration, such that the new function in the original coordinates is the same as the original function in the new coordinates.

Active transformations, though ostensibly more abstract, are formally easier to understand. Consider two manifolds {M} and {N}, respectively equipped with coordinate charts {x} and {x'}. Let {\phi: M\rightarrow\mathbb{R}}, and {\phi':N\rightarrow\mathbb{R}}. Now consider a diffeomorphism {f:M\rightarrow N}. Then the original field {\phi} is related to the transformed field {\phi'} via the pullback, {\phi(x)=(f^*\phi')(x)\equiv(\phi'\circ f)(x)}. In general, of course, {\phi(x)} and {\phi'(x')} may map to different points in {\mathbb{R}}. But if we demand {\phi'(x')=\phi(x)}, one can see from this picture that the new field configuration (on the new manifold) is required to map to the same point in {\mathbb{R}}. (Incidentally, the fact that this picture doesn’t work for “passive diffeomorphisms” leads me to think that they don’t exist, i.e., that a diffeomorphism is “active” by definition. In part for this reason, we shall henceforth take the unqualified “diffeomorphism” to mean an active diffeomorphism, and relegate “passive diffeomorphism” to “coordinate transformation”.)

In short, {\phi'(x)=\phi(x')=\phi(\Lambda x)} is a passive (Lorentz) transformation, while {\phi'(x')=\phi(x)\implies\phi'(x)=\phi(\Lambda^{-1}x)} is an active (Lorentz) transformation. The former amounts to a mere coordinate redefinition, while the latter specifies an entirely new field configuration; this answer [1] on Stack Exchange contains a helpful illustration of the difference (see also [2]).
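As a rough numerical illustration of the two conventions, the sketch below uses a 1+1 dimensional boost in units with {c=1}. The sample field `phi`, the velocity `v`, and all function names are assumptions chosen for the example, not anything from the references.

```python
import math

def boost(v):
    """Return the 1+1 dimensional Lorentz boost (c = 1) as a map (t, x) -> (t', x')."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    def apply(t, x):
        return g * (t - v * x), g * (x - v * t)
    return apply

def phi(t, x):
    """Hypothetical sample field: a static bump centred on x = 1."""
    return math.exp(-(x - 1.0) ** 2)

v = 0.6
Lam = boost(v)        # Lambda
Lam_inv = boost(-v)   # Lambda^{-1}

def phi_passive(t, x):
    # Passive convention from the text: phi'(x) = phi(Lambda x)
    return phi(*Lam(t, x))

def phi_active(t, x):
    # Active convention from the text: phi'(x) = phi(Lambda^{-1} x),
    # equivalently phi'(Lambda x) = phi(x)
    return phi(*Lam_inv(t, x))

if __name__ == "__main__":
    p = (2.0, 1.0)  # an event on the peak of the original bump
    print("phi at p:                 ", phi(*p))
    print("phi_active at Lambda p:   ", phi_active(*Lam(*p)))
    print("phi_passive at p:         ", phi_passive(*p))
    print("phi_active at p:          ", phi_active(*p))
```

The active field evaluated at the boosted event reproduces the original value at the original event, which is the condition {\phi'(x')=\phi(x)}; evaluated at the same event, the passive and active fields generally differ.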

In practical terms, the distinction between passive and active transformations amounts to a choice of convention. But when discussing diffeomorphism invariance, general covariance, or background independence in the context of general relativity, the distinction is important, and the failure to accord it due care can lead to a great deal of confusion. In particular, the salient feature of (active) diffeomorphisms is that they generate new metrics, while (passive) coordinate transformations merely re-express the original metric in new terms. To highlight the difference, consider the wave equation in curved spacetime,

\displaystyle \left( g^{\mu\nu}\nabla_\mu\nabla_\nu+\xi R\right)\phi(x)=0~, \ \ \ \ \ (1)

where {\xi} is the coupling to the Ricci scalar {R} (minimal coupling corresponds to {\xi=0}). Obviously, a coordinate change will preserve solutions to this equation: they’ll simply be mapped from one coordinate system to another. It is therefore generally covariant. But it is not diffeomorphism invariant, since a diffeomorphism changes the metric {g}, which here is a fixed background rather than something we solve for. In contrast, in Einstein’s equations

\displaystyle R_{\mu\nu}-\frac{1}{2}Rg_{\mu\nu}+\Lambda g_{\mu\nu}=\frac{8\pi G}{c^4}T_{\mu\nu} \ \ \ \ \ (2)

the metric is the very thing we’re solving for, so (2) is diffeomorphism invariant by construction.

When Einstein first introduced GR, he emphasized the background independence (“no prior geometry”) under the guise of general covariance. But as alluded to above, all laws of physics, properly formulated, are generally covariant! Writing the wave equation in Cartesian or spherical coordinates does not alter its content. Thus to emphasize general covariance as the defining or special feature of GR is both misleading and rather void of content. (Misner, Thorne, & Wheeler’s classic textbook suggests that at the time, mathematics was not sufficiently advanced to properly distinguish background independence from coordinate independence, so Einstein’s choice of phrasing is only confusing in retrospect.) Rather, the special feature of GR is that it is background independent: in contrast to the wave equation above, where the metric plays the role of a fixed background (spoiling diff invariance in the process), in Einstein’s equations the metric is a dynamical variable. This is what is meant by “no prior geometry”.

However, lest we be misled by the above example, it is important to emphasize that diffeomorphism invariance is not the same as background independence, even though the two appear hand-in-hand in Einstein’s equations. To illustrate this, compare the action for Maxwell’s electromagnetism,

\displaystyle S_{M}[A]=-\frac{1}{4}\int\mathrm{d}^4x\sqrt{-g}F_{\mu\nu}F^{\mu\nu}~, \ \ \ \ \ (3)

with the Einstein-Hilbert action

\displaystyle S_{EH}[g]=\frac{1}{16\pi G}\int\mathrm{d}^4x\sqrt{-g}R^{\mu\nu}g_{\mu\nu}~. \ \ \ \ \ (4)

(Example taken from [3]). Both of these depend on the metric tensor {g_{\mu\nu}(x)}, and are manifestly covariant (i.e., all metric indices are contracted). But in the former, the metric appears as a fixed background, and only the electromagnetic potential {A_\mu(x)} appears as an argument of the functional {S_M[A]}. In contrast, in the Einstein-Hilbert action {S_{EH}[g]}, the metric is a dynamical variable, and thus the background is a solution to the equations rather than something given externally at the outset. Thus diffeomorphism invariance simply means that the manifold on which the theory is formulated is irrelevant (modulo isomorphisms) to the underlying physics (or, to take the passive view, that we can choose any coordinate patch we like), while background independence is the stronger statement that the manifold itself is not fixed a priori. And this is what makes general relativity special.

P.S. Note that reference [2], which states that diffeomorphisms do not generally map geodesics to geodesics, is misleadingly phrased. A diffeomorphism {f:M\rightarrow M} will certainly map geodesics {\gamma} for some metric {g} on {M} to geodesics {f\circ\gamma} for the new metric {(f^{-1})^*g} (the pushforward of {g} along {f}). What the author means is that, except in the special case where the diffeomorphism is an isometry (i.e., {f^*g=g}; note isometry {\neq} isomorphism!), {f\circ\gamma} will not be a geodesic for the original metric {g}.
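A small numerical check of this last point, as a sketch under simplifying assumptions: take the flat plane with {g=\mathrm{d}x^2+\mathrm{d}y^2} in Cartesian coordinates, where the geodesic equation reduces to zero acceleration. The particular maps `rotation` and `shear`, and all other names, are hypothetical choices for illustration.

```python
import math

def second_derivative(curve, t, h=1e-3):
    """Central finite-difference estimate of the acceleration of a plane curve at t."""
    (x0, y0), (x1, y1), (x2, y2) = curve(t - h), curve(t), curve(t + h)
    return ((x0 - 2 * x1 + x2) / h**2, (y0 - 2 * y1 + y2) / h**2)

def gamma(t):
    """A straight line: an affinely parametrized geodesic of the flat metric."""
    return (t, 0.0)

def rotation(p, angle=0.5):
    """An isometry of the flat metric."""
    x, y = p
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)

def shear(p):
    """A diffeomorphism of the plane that is not an isometry: (x, y) -> (x, y + x^2)."""
    x, y = p
    return (x, y + x * x)

def is_flat_geodesic(curve, t=0.7, tol=1e-6):
    """Flat Christoffel symbols vanish in Cartesian coordinates, so test for zero acceleration."""
    ax, ay = second_derivative(curve, t)
    return math.hypot(ax, ay) < tol

if __name__ == "__main__":
    print("rotated line is a geodesic of g:", is_flat_geodesic(lambda t: rotation(gamma(t))))
    print("sheared line is a geodesic of g:", is_flat_geodesic(lambda t: shear(gamma(t))))
```

The rotated line stays straight, while the sheared line becomes the parabola {(t,t^2)}, which has nonzero acceleration and so fails to be a geodesic of the original flat metric.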


3 Responses to General covariance, diffeomorphism invariance, and background independence

  1. francisco says:

    Thanks very much for your nice explanation of this confusing subject. However, the paragraph after Eq. (4) is not clear to me. You say that diffeomorphism (diff) invariance is not the same as background independence, since in Maxwell theory the metric is fixed and in Einstein theory it is not. But then Maxwell theory is not diff invariant, since the metric changes under a diff. So I would guess that background independence implies diff invariance but the converse is not true. And what about Einstein-Maxwell theory?



    • Thanks francisco, I’m very glad you found it helpful!

      First, let me clarify what is meant by “fixed”. The metric in eq. (3) is fixed insofar as it isn’t an argument (unlike the potential, {A}), but nothing stops us from varying it as a free parameter. In other words, by “fixed” we simply mean that one must pick a metric in order to evaluate the action (but you can pick any metric you like). In contrast, in eq. (4), the metric is an argument, not a fixed parameter one picks at the outset. That is, the action is a functional of the metric, viewed as a dynamical variable (whose dynamics one could obtain by, e.g., extremizing the action and solving the resulting Euler-Lagrange equation, which in this case is Einstein’s field equation).

      Second, the action for free electromagnetism given by eq. (3) is diff invariant. Under a diffeomorphism, the metric — and hence {\sqrt{-g}} — will certainly change, but the total action is invariant provided the diffeomorphism vanishes at the boundaries (or at infinity, as the case may be). In fact this is a general result: let {\Psi} be any scalar, possibly constructed out of the tensor fields of the problem (so in this case, {\Psi=F_{\mu\nu}F^{\mu\nu}}, which transforms as a scalar since all indices are contracted). Then {\int\!\mathrm{d}^4x\sqrt{-g}\Psi} is invariant under a diffeomorphism that vanishes at the boundaries. There’s a great explanation of this in Edmund Bertschinger’s superb notes on Symmetry Transformations, the Einstein-Hilbert Action, and Gauge Invariance, but going through it here would require an aside on Lie derivatives (as the generators of infinitesimal diffeomorphisms).
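      For orientation, the core of that argument can be sketched in a few lines. The generator {\xi^\mu} and the variation symbol {\delta_\xi} are notation assumed here, and the overall sign of the variations depends on convention; this is an outline, not a substitute for the notes.

      ```latex
      % Infinitesimal diffeomorphism generated by a vector field \xi^\mu (Lie derivatives):
      \delta_\xi g_{\mu\nu} = \nabla_\mu\xi_\nu + \nabla_\nu\xi_\mu , \qquad
      \delta_\xi\Psi = \xi^\mu\partial_\mu\Psi .

      % Using \delta\sqrt{-g} = \tfrac{1}{2}\sqrt{-g}\,g^{\mu\nu}\,\delta g_{\mu\nu}:
      \delta_\xi\sqrt{-g} = \sqrt{-g}\,\nabla_\mu\xi^\mu .

      % The integrand therefore changes by a total derivative:
      \delta_\xi\big(\sqrt{-g}\,\Psi\big)
        = \sqrt{-g}\,\nabla_\mu(\xi^\mu\Psi)
        = \partial_\mu\big(\sqrt{-g}\,\xi^\mu\Psi\big) ,

      % so the action is unchanged when \xi^\mu vanishes on the boundary:
      \delta_\xi\!\int\!\mathrm{d}^4x\,\sqrt{-g}\,\Psi
        = \int\!\mathrm{d}^4x\,\partial_\mu\big(\sqrt{-g}\,\xi^\mu\Psi\big) = 0 .
      ```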

      (Note here that while {\sqrt{-g}} changes under the diff, {\mathrm{d}^4x} does not. This stands in contrast to a mere coordinate change, in which the volume element {\sqrt{-g}\mathrm{d}^4x} is invariant by construction: the determinant of the Jacobian from the change of coordinates in the measure is cancelled by the same factor from the change in the metric, so the combination transforms as a scalar. At a practical level, this is one key distinction between diffeomorphism invariance and general covariance. Again, I’ll refer you to Bertschinger’s notes for more details).
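      The cancellation in the coordinate-change case can be written out explicitly. The symbol {J} for the Jacobian determinant is notation assumed here.

      ```latex
      % Coordinate change x^\mu \to x'^\mu(x), with Jacobian determinant
      J \equiv \det\!\left(\frac{\partial x'^\mu}{\partial x^\nu}\right), \qquad
      \mathrm{d}^4x' = |J|\,\mathrm{d}^4x .

      % Tensor transformation law for the metric and its determinant:
      g'_{\mu\nu}(x') = \frac{\partial x^\alpha}{\partial x'^\mu}
                        \frac{\partial x^\beta}{\partial x'^\nu}\,g_{\alpha\beta}(x)
      \;\;\Longrightarrow\;\;
      \det g' = J^{-2}\det g , \qquad
      \sqrt{-g'} = |J|^{-1}\sqrt{-g} .

      % The two Jacobian factors cancel:
      \sqrt{-g'}\,\mathrm{d}^4x' = \sqrt{-g}\,\mathrm{d}^4x .
      ```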

      Since both eq. (3) and (4) are diff invariant, but only eq. (4) is background independent, we can safely conclude that the former does not necessarily imply the latter. Conversely, I’m not sure what “background independent” means outside the context of general relativity (GR)—as mentioned in the post, this is one of its defining features. And in GR, the action is a scalar (and the Lagrangian must be covariant on physical grounds), from which diff invariance follows as explained above. Which is to say that I know of no physical action which is background independent but not diff invariant.

      As for Einstein-Maxwell theory, this is precisely what you get when you make Maxwell electromagnetism background independent: you simply couple it to gravity! (See ref. [3]).

