In an earlier post, we sketched the basic mathematical description of quantum mechanics, culminating in the general description of quantum states as (reduced) density matrices. We also claimed that generic measurements are not orthogonal projections, and that evolution is not unitary. We shall here expand upon the aforementioned infrastructure to explain these statements, resolving some unanswered questions in the process. We shall again draw from Preskill’s Quantum Information and Computation course notes, as well as a lecture given by Mario Flory on POVMs and superoperators.
The naïve picture is that, as a consequence of the Schmidt decomposition, one can write the density matrix for a mixed state as an ensemble of orthogonal pure states, the eigenvalues of which are interpreted as the probabilities of their occurring. When we measure the system, we project onto one of these eigenstates; hence the notion of measurements as orthogonal projections. And indeed this works fine for isolated systems; but as explained previously, this is an idealization. The problem that demands a more general notion of measurement is that an orthogonal measurement on a tensor product space $\mathcal{H}_A\otimes\mathcal{H}_B$ is not necessarily orthogonal if we restrict to the subsystem $\mathcal{H}_A$ alone.
Let us first make the notion of orthogonal projections a bit more precise, following von Neumann’s treatment thereof. To perform a measurement of an observable $A$, we couple the system to some classical pointer variable that we can actually observe, in the literal sense of the word. In particular, we assume that the pointer is sufficiently heavy that the spreading of its wavepacket can be neglected during the measurement process (it is classical, after all). The Hamiltonian describing the interaction of the pointer with the system is then approximated by
$$H=\lambda\,P\,A~,$$
where $\lambda$ is the coupling between the pointer’s momentum $P$ and the observable $A$ under study. The time evolution operator is therefore
$$U(t)=e^{-i\lambda t\,PA}=\sum_a|a\rangle\langle a|\,e^{-i\lambda t\,aP}~,$$
where in the second equality we’ve expanded in the diagonal basis, $A|a\rangle=a|a\rangle$. (Note that we are implicitly assuming that either $[H_0,A]=0$, where $H_0$ is the original, unperturbed Hamiltonian, or that the measurement occurs so quickly that free evolution of the system can be neglected throughout. We’re also suppressing hats/bold-print on the operators, since this is clear from context.)
Since $P$ is the generator of translations for the pointer, it shifts the position-space wavepacket thereof by some amount $\lambda ta$:
$$e^{-i\lambda t\,aP}\,\psi(x)=\psi(x-\lambda ta)~.$$
Thus, if the system is initially in a superposition of $A$ eigenstates, unentangled with the state of the pointer $\psi(x)$, then after time $t$ it will evolve to
$$U(t)\Big(\sum_a\alpha_a|a\rangle\otimes\psi(x)\Big)=\sum_a\alpha_a|a\rangle\otimes\psi(x-\lambda ta)~.$$
Now the position of the pointer is correlated with the value of the observable $A$. Thus, provided the pointer’s wavepacket is sufficiently narrow that we can resolve all values of $a$ (namely, $\Delta x\ll\lambda t\,\delta a$, where $\delta a$ is the minimum spacing between eigenvalues; this can be guaranteed by making the pointer sufficiently massive, since the rate at which the wavepacket spreads scales as $\Delta p/m\sim1/(m\,\Delta x)$), observing that the position of the pointer has shifted by $\lambda ta$ is tantamount to measuring the eigenstate $|a\rangle$, which occurs with probability $|\alpha_a|^2$. In this manner, the initial state of the quantum system, call it $|\psi\rangle=\sum_a\alpha_a|a\rangle$, is projected to $|a\rangle$ with probability $|\langle a|\psi\rangle|^2=|\alpha_a|^2$. This is von Neumann’s model of orthogonal measurement, which involves so-called projection-valued measures, or PVMs.
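For concreteness, here is a minimal numerical sketch of the resolvability condition above (the numbers, and indeed the use of Python, are purely illustrative and not from the original discussion): two Gaussian pointer wavepackets shifted by $\pm\lambda t$ for a qubit observable with eigenvalues $\pm1$ are essentially orthogonal when the shift greatly exceeds the packet width $\Delta x$, and fail to distinguish the eigenvalues otherwise.

```python
# Illustrative sketch of the von Neumann pointer (names and numbers are mine).
import numpy as np

x = np.linspace(-10, 10, 4001)
dx_grid = x[1] - x[0]
sigma = 0.2              # pointer wavepacket width Delta x
lam_t = 2.0              # coupling lambda times interaction time t
eigvals = [+1.0, -1.0]   # eigenvalues a of the measured observable A

def packet(center):
    """Normalized Gaussian pointer wavepacket psi(x - center)."""
    psi = np.exp(-(x - center) ** 2 / (4 * sigma ** 2))
    return psi / np.sqrt(np.sum(np.abs(psi) ** 2) * dx_grid)

shifted = [packet(lam_t * a) for a in eigvals]

# Overlap of the two shifted packets: ~0 means the pointer position resolves
# the eigenvalues, so reading it off projects the system onto |a>.
overlap = np.sum(shifted[0] * shifted[1]) * dx_grid
print(f"overlap when lam*t*da >> Delta x: {overlap:.1e}")       # ~2e-22, effectively zero

# If the shift is comparable to the packet width, the outcomes are not resolved.
overlap_bad = np.sum(packet(0.1) * packet(-0.1)) * dx_grid
print(f"overlap when lam*t*da ~  Delta x: {overlap_bad:.2f}")   # ~0.88
```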
Of course, in principle the measurement process could project out some superposition of eigenstates, rather than a single position eigenstate as in the above example. Indeed, if we can couple any observable to a pointer, then we can perform any orthogonal projection in Hilbert space. Thus, to formulate the above more generally, consider a set of projection operators $\{E_a\}$ such that
$$E_a^\dagger=E_a~,\qquad E_aE_b=\delta_{ab}E_a~,\qquad \sum_aE_a=1~.$$
Carrying out the measurement procedure above takes the initial (pure) state $\rho=|\psi\rangle\langle\psi|$ to
$$\rho\;\longrightarrow\;\frac{E_a\,\rho\,E_a}{\mathrm{tr}\left(E_a\,\rho\right)}$$
with probability $p_a=\mathrm{tr}\left(E_a\,\rho\right)$ as usual.
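As a quick sanity check, one can verify the projective case numerically; the following sketch (notation and numbers mine) measures the Pauli-$X$ observable on the state $|0\rangle$, and confirms that the projectors are complete and orthogonal, that the outcome probabilities are $\mathrm{tr}(E_a\rho)$, and that repeating the measurement returns the same outcome with certainty.

```python
# Projective (PVM) measurement of Pauli-X on |0> -- an illustrative sketch.
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
vals, vecs = np.linalg.eigh(X)                      # eigenvalues -1, +1
E = [np.outer(v, v.conj()) for v in vecs.T]         # orthogonal projectors E_a

assert np.allclose(sum(E), np.eye(2))               # completeness
assert np.allclose(E[0] @ E[1], 0)                  # orthogonality

rho = np.array([[1, 0], [0, 0]], dtype=complex)     # |0><0|
probs = [np.trace(Ea @ rho).real for Ea in E]
print("outcome probabilities:", probs)              # [0.5, 0.5]

# Post-measurement state for the first outcome, and the probability of
# obtaining the same outcome again: 1, as expected for idempotent projectors.
rho_post = E[0] @ rho @ E[0] / probs[0]
print("repeat probability:", np.trace(E[0] @ rho_post).real)   # 1.0
```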
Thus far we have been referring to measurements on a single isolated Hilbert space, for which PVMs suffice. But in practice we only ever deal with subsystems, for which our concept of measurement must be suitably extended. As we shall see, the relevant entities for the job are positive operator valued measures, or POVMs. The key difference between a POVM and a PVM is that the latter are a subset of the former whose elements are mutually orthogonal projectors by construction.
Mathematically, a POVM is a measure (basically, a partition of unity) whose values are non-negative self-adjoint operators on Hilbert space. That is, denoting the set of operators that comprise the POVM by $\{F_a\}$, it has the properties
$$F_a^\dagger=F_a~,\qquad \langle\psi|F_a|\psi\rangle\geq0\;\;\forall\,|\psi\rangle~,\qquad \sum_aF_a=1~,$$
where $a$ runs over the possible measurement outcomes. The idea is that a POVM element $F_a$ is assigned to every possible measurement result such that $p_a=\mathrm{tr}\left(F_a\,\rho\right)$ is the probability of obtaining that result (hence the requirement that these sum to 1).
Given the positivity of the operators $F_a$, there exists a (not necessarily unique) set of so-called measurement operators $M_a$ such that $F_a=M_a^\dagger M_a$. Introducing these operators allows one to express the state immediately after measurement in the usual manner:
$$\rho\;\longrightarrow\;\frac{M_a\,\rho\,M_a^\dagger}{\mathrm{tr}\left(M_a\,\rho\,M_a^\dagger\right)}=\frac{M_a\,\rho\,M_a^\dagger}{\mathrm{tr}\left(F_a\,\rho\right)}~.$$
Note that this expression is precisely the same as that given for PVMs above; in other words, the outcome probabilities are computed as $p_a=\mathrm{tr}\left(F_a\,\rho\right)$ identically. The difference here is that in the case of a POVM, repeated measurement will not necessarily yield the same result. This is because unlike the $E_a$, which are idempotent orthogonal projection operators, the $M_a$ are not projectors, and hence the state after measurement does not exist in a single orthogonal eigenstate. The PVM $\{E_a\}$, which is used in decomposing an observable $A=\sum_a a\,E_a$, corresponds to the special case of a POVM with $M_a=E_a$ (whence $F_a=E_a$).
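The non-uniqueness of the measurement operators is easy to see explicitly. In the following sketch (the particular POVM is an arbitrary choice of mine), we take $M_a=\sqrt{F_a}$ and also $M_a'=V\sqrt{F_a}$ for some unitary $V$: both satisfy $F_a=M_a^\dagger M_a$ and hence yield the same probabilities, but they lead to different post-measurement states.

```python
# Non-uniqueness of measurement operators for a non-projective qubit POVM.
import numpy as np

# A two-outcome qubit POVM whose elements are positive but not projectors.
F0 = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)
F1 = np.eye(2) - F0
assert np.all(np.linalg.eigvalsh(F0) >= 0) and np.all(np.linalg.eigvalsh(F1) >= 0)

def msqrt(F):
    """Hermitian square root of a positive matrix."""
    w, U = np.linalg.eigh(F)
    return U @ np.diag(np.sqrt(w)) @ U.conj().T

M0 = msqrt(F0)                                   # one valid measurement operator
theta = 0.7
V = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]], dtype=complex)   # arbitrary unitary
M0_alt = V @ M0                                  # another equally valid choice

assert np.allclose(M0.conj().T @ M0, F0)         # both reproduce F0 = M^dag M
assert np.allclose(M0_alt.conj().T @ M0_alt, F0)

rho = np.array([[1, 0], [0, 0]], dtype=complex)  # |0><0|
p0 = np.trace(F0 @ rho).real
print("p(0) =", p0)                              # 0.7, identical for both choices
print((M0 @ rho @ M0.conj().T / p0).round(3))    # post-measurement states differ...
print((M0_alt @ rho @ M0_alt.conj().T / p0).round(3))
```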
To elaborate on this slightly further, let us take the familiar example of a tensor product space $\mathcal{H}_A\otimes\mathcal{H}_B$, containing an initial state $\rho_{AB}$ and a PVM given by a set of projectors $\{E_a\}$ on the full space. We now wish to restrict our attention to $\mathcal{H}_A$, so we define a new set of operators $\{F_a\}$ acting thereupon that faithfully reproduces the outcome labeled by index $a$ of a measurement on $\mathcal{H}_A\otimes\mathcal{H}_B$, namely:
$$\mathrm{tr}_A\!\left(F_a\,\rho_A\right)=\mathrm{tr}_{AB}\!\left(E_a\,\rho_{AB}\right)~.$$
We may obtain an explicit expression for $F_a$ by writing this expression in component form. Recall that a reduced density matrix can be written in terms of basis vectors as
$$\rho_A=\mathrm{tr}_B\,\rho_{AB}=\sum_\mu\langle\mu|\,\rho_{AB}\,|\mu\rangle~.$$
Since $\mu$ is a dummy index, this requires two indices when written in matrix notation, $(\rho_A)_{ij}$. This implies that four indices will label the tensor product, $(\rho_{AB})_{i\mu,j\nu}$. The quantity $F_a$ therefore carries two free indices (since $F_a$ is a map from $\mathcal{H}_A$ to $\mathcal{H}_A$), and similarly $E_a$ carries four, all of which will be summed over when taking the appropriate traces. Hence the above expression, in component form, is
$$\sum_{i,j}(F_a)_{ji}\,(\rho_A)_{ij}=\sum_{i,j,\mu,\nu}(E_a)_{j\nu,i\mu}\,(\rho_{AB})_{i\mu,j\nu}~,$$
where $|i\rangle,|j\rangle$ and $|\mu\rangle,|\nu\rangle$ are orthonormal bases for $\mathcal{H}_A$ and $\mathcal{H}_B$, respectively. With this expression for $F_a$ in hand, one can show (see, e.g., Preskill p. 87) that the $F_a$ do indeed satisfy the properties claimed for them above, namely Hermiticity, positivity (non-negativity), and completeness, $\sum_aF_a=1$. As we have emphasized however, they are not necessarily orthogonal, which is again the crucial difference between POVMs and PVMs. Indeed, the number of $F_a$’s is limited by the dimension of the total Hilbert space $\mathcal{H}_A\otimes\mathcal{H}_B$, which may be arbitrarily greater than that of $\mathcal{H}_A$.
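To make the dimension counting above concrete, here is a small numerical sketch (the conventions and numbers are mine): a PVM consisting of four rank-1 projectors on a two-qubit space $\mathcal{H}_A\otimes\mathcal{H}_B$, with $\rho_{AB}=\rho_A\otimes|0\rangle\langle0|_B$, induces four operators $F_a$ on the single qubit $\mathcal{H}_A$ that reproduce the outcome probabilities and sum to the identity, but outnumber $\dim\mathcal{H}_A$ and are not projectors.

```python
# POVM on H_A induced by an orthogonal measurement on H_A x H_B (sketch).
import numpy as np
rng = np.random.default_rng(0)

dA, dB = 2, 2
# Random orthonormal basis of H_A x H_B -> four rank-1 projectors E_a.
Q, _ = np.linalg.qr(rng.normal(size=(dA * dB, dA * dB))
                    + 1j * rng.normal(size=(dA * dB, dA * dB)))
E = [np.outer(Q[:, a], Q[:, a].conj()) for a in range(dA * dB)]

rho_A = np.array([[0.6, 0.2], [0.2, 0.4]], dtype=complex)
ket0_B = np.array([1, 0], dtype=complex)
rho_AB = np.kron(rho_A, np.outer(ket0_B, ket0_B.conj()))

# (F_a)_{ij} = <i,0| E_a |j,0>: sandwich E_a between |0>_B on the B factor.
iso = np.kron(np.eye(dA), ket0_B.reshape(dB, 1))      # embeds H_A into H_A x H_B
F = [iso.conj().T @ Ea @ iso for Ea in E]

assert np.allclose(sum(F), np.eye(dA))                # completeness on H_A
for Ea, Fa in zip(E, F):                              # same outcome probabilities
    assert np.isclose(np.trace(Fa @ rho_A), np.trace(Ea @ rho_AB))

print("number of POVM elements:", len(F), "> dim H_A =", dA)
print("F_0 is a projector?", np.allclose(F[0] @ F[0], F[0]))    # False
```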
As one might have expected given that POVMs act on subsystems, a POVM can be lifted to a PVM by expanding the Hilbert space of the former and performing the latter in the resulting superspace. This is the content of Neimark’s (sometimes transliterated from the Cyrillic “Наймарк” as “Neumark”) theorem. Note that the converse also holds: any PVM on a Hilbert space reduces to a POVM on any subsystem thereof. This means that one can realize a POVM as a PVM on an enlarged Hilbert space, which allows one to obtain the correct measurement probabilities (by which we mean, the relative weights in the ensemble; see below) by performing orthogonal projections. Conversely, an orthogonal measurement of a bipartite system $\mathcal{H}_A\otimes\mathcal{H}_B$ may act as a nonorthogonal POVM on $\mathcal{H}_A$ alone.
In addition to the crucial role they play in measurement, POVMs are useful for formulating a suitable generalization of evolution that applies to subsystems. By way of example, suppose the initial state in $\mathcal{H}_A\otimes\mathcal{H}_B$ is given by $\rho_{AB}=\rho_A\otimes|0\rangle\langle0|_B$. Since evolution of the total bipartite system is unitary, it is described by the action of a unitary operator $U_{AB}$,
$$\rho_{AB}\;\longrightarrow\;U_{AB}\,\rho_{AB}\,U_{AB}^\dagger~,$$
whereupon the density matrix of subsystem $A$ is
$$\rho_A'=\mathrm{tr}_B\!\left[U_{AB}\left(\rho_A\otimes|0\rangle\langle0|_B\right)U_{AB}^\dagger\right]=\sum_\mu M_\mu\,\rho_A\,M_\mu^\dagger~,$$
where $\{|\mu\rangle\}$ is an orthonormal basis for $\mathcal{H}_B$, and
$$M_\mu\equiv\langle\mu|\,U_{AB}\,|0\rangle$$
is an operator acting on $\mathcal{H}_A$. Note that it follows from the unitarity of $U_{AB}$ that
$$\sum_\mu M_\mu^\dagger M_\mu=\sum_\mu\langle0|\,U_{AB}^\dagger\,|\mu\rangle\langle\mu|\,U_{AB}\,|0\rangle=\langle0|\,U_{AB}^\dagger U_{AB}\,|0\rangle=1~.$$
We may thus express the evolution of the subsystem succinctly as
$$\rho_A\;\longrightarrow\;\rho_A'=\$(\rho_A)\equiv\sum_\mu M_\mu\,\rho_A\,M_\mu^\dagger~,$$
where $\$$ is a linear map that takes density matrices to density matrices (linear operators to linear operators). Such a map, when the above property of the $M_\mu$ is satisfied, is called a superoperator, which we’ve written here in the so-called operator sum or Kraus representation. The operator sum representation of a given superoperator $\$$ is not unique, since performing the trace over $\mathcal{H}_B$ in a different basis would lead to different measurement operators. However, any two operator sum representations of the same superoperator are related by a unitary change of basis, e.g., $N_\nu=\sum_\mu U_{\nu\mu}M_\mu$ (in other words, the $M_\mu$ may be thought of as a particular choice of the measurement operators $M_a$ considered above).
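The operator sum representation is straightforward to verify numerically. The following sketch (names and conventions mine) draws a random unitary $U_{AB}$ on two qubits, extracts the Kraus operators $M_\mu=\langle\mu|U_{AB}|0\rangle$, and checks both the completeness relation and the fact that the Kraus sum reproduces the partial trace.

```python
# Kraus operators from a unitary on the bipartite space (illustrative sketch).
import numpy as np
rng = np.random.default_rng(1)

dA, dB = 2, 2
# Random unitary on H_A x H_B via QR of a complex Gaussian matrix.
U, _ = np.linalg.qr(rng.normal(size=(dA * dB, dA * dB))
                    + 1j * rng.normal(size=(dA * dB, dA * dB)))

ket0_B = np.array([1, 0], dtype=complex)
basis_B = np.eye(dB, dtype=complex)
inj0 = np.kron(np.eye(dA), ket0_B.reshape(dB, 1))        # |psi>_A -> |psi>_A |0>_B

# Kraus operators M_mu = <mu|_B U |0>_B act on H_A alone.
M = [np.kron(np.eye(dA), basis_B[mu].reshape(1, dB)) @ U @ inj0 for mu in range(dB)]
assert np.allclose(sum(m.conj().T @ m for m in M), np.eye(dA))   # completeness

rho_A = np.array([[0.6, 0.2], [0.2, 0.4]], dtype=complex)
rho_AB = U @ np.kron(rho_A, np.outer(ket0_B, ket0_B.conj())) @ U.conj().T

# Partial trace over B (row index = (i, mu), column index = (j, nu)).
rho_A_out = rho_AB.reshape(dA, dB, dA, dB).trace(axis1=1, axis2=3)
kraus_sum = sum(m @ rho_A @ m.conj().T for m in M)
print(np.allclose(rho_A_out, kraus_sum))    # True: the two descriptions agree
```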
The mapping $\$$ inherits the usual properties from $\rho$: it preserves Hermiticity, is positive, and is trace-preserving ($\mathrm{tr}\,\$(\rho)=1$ if $\mathrm{tr}\,\rho=1$). But these are not quite sufficient to ensure that $\$$ describes a physically sensible evolution. The basic reason is that we are limiting our attention to subsystem $A$, and have no guarantee that there does not exist some additional system $C$, uncoupled from $A$ but possibly entangled with it, for which the extended evolution would screw things up. To amend this, we demand that $\$$ instead satisfy complete positivity: given any extension of $\mathcal{H}_A$ to $\mathcal{H}_A\otimes\mathcal{H}_C$, $\$$ is completely positive in $\mathcal{H}_A$ if $\$\otimes1_C$ is positive for all such extensions. For an example of the necessity of this requirement, see Preskill pp. 97-98 for an exposition of the transposition operator, $T:\rho\mapsto\rho^T$, which is a positive map that is not completely positive.
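For a quick numerical illustration of the same point (the example is the standard one; the code itself is mine): applying the transpose to one half of a Bell pair yields an operator with a negative eigenvalue, so $T\otimes1$ is not positive even though $T$ is.

```python
# The transpose map is positive but not completely positive (sketch).
import numpy as np

# Bell state |phi+> = (|00> + |11>)/sqrt(2) on H_A x H_C.
phi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho = np.outer(phi, phi.conj())

# Apply the transpose to subsystem A only (partial transpose).
rho_pt = rho.reshape(2, 2, 2, 2).transpose(2, 1, 0, 3).reshape(4, 4)

print(np.linalg.eigvalsh(rho).round(3))      # [0, 0, 0, 1]          : a valid state
print(np.linalg.eigvalsh(rho_pt).round(3))   # [-0.5, 0.5, 0.5, 0.5] : not positive
```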
In addition to these three necessary properties, it is also customary to assume that $\$$ is linear. As alluded to in the previous post on the subject, non-linear evolution is difficult to reconcile with the ensemble interpretation, due to the inherently linear nature of probability. In some sense, linearity is demanded by the probabilistic interpretation (and indeed, as explained in Preskill, non-linear evolution can lead to rather strange consequences), but I’m not aware of any rigorous proof. Nonetheless, for the time being we shall demand this property of superoperators as well.
Unitary evolution, for an isolated system, is described by the Schrödinger equation. The analogous equation for general evolution by superoperators is called the Master equation. Preskill elaborates on this in some detail in section 3.5, but we will restrain ourselves from getting involved in such details here. Instead, we merely observe that unitary evolution can be thought of as the special case in which the operator sum contains only a single term. Under unitary evolution, pure states can only evolve to pure states:
$$|\psi\rangle\langle\psi|\;\longrightarrow\;U\,|\psi\rangle\langle\psi|\,U^\dagger=|\psi'\rangle\langle\psi'|~,\qquad |\psi'\rangle\equiv U|\psi\rangle~,$$
and similarly mixed states remain mixed. But superoperators allow the evolution of pure states to mixed states. This is called decoherence. It is the process by which initially pure states become entangled with their environment, and consequently, it plays a fundamental role in both the mathematics of quantum mechanics and the (philosophical) interpretation thereof.
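As a concrete (and entirely standard) example, a dephasing channel with Kraus operators $\sqrt{1-p}\,1$ and $\sqrt{p}\,Z$ takes the pure state $|+\rangle$ to a mixed state; the sketch below (parameters mine) tracks the purity $\mathrm{tr}\,\rho^2$, which any single-Kraus (unitary) evolution would leave equal to 1.

```python
# Decoherence of a pure state under a dephasing channel (sketch).
import numpy as np

Z = np.diag([1, -1]).astype(complex)
p = 0.3
kraus = [np.sqrt(1 - p) * np.eye(2), np.sqrt(p) * Z]      # M_0, M_1

plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
rho = np.outer(plus, plus.conj())                          # pure state |+><+|

rho_out = sum(M @ rho @ M.conj().T for M in kraus)         # operator-sum evolution
print("purity before:", np.trace(rho @ rho).real)          # 1.0
print("purity after: ", np.trace(rho_out @ rho_out).real)  # 0.58 < 1: now mixed
```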
To connect back to our earlier example, suppose we perform a POVM on $\mathcal{H}_A$. By (11) and (12), this is tantamount to evolving the system with a superoperator that takes
$$\rho_A\;\longrightarrow\;\sum_aM_a\,\rho_A\,M_a^\dagger~.$$
By Neimark’s theorem, the POVM has a unitary representation on the bipartite space $\mathcal{H}_A\otimes\mathcal{H}_B$, meaning that there exists a unitary $U_{AB}$ such that
$$U_{AB}\left(|\psi\rangle_A\otimes|0\rangle_B\right)=\sum_aM_a|\psi\rangle_A\otimes|a\rangle_B~,$$
where $\{|a\rangle_B\}$ is an orthonormal basis of $\mathcal{H}_B$ labeling the outcomes. In other words, the bipartite system undergoes a unitary transformation that entangles $\mathcal{H}_A$ with $\mathcal{H}_B$,
$$\rho_A\otimes|0\rangle\langle0|_B\;\longrightarrow\;\sum_{a,b}M_a\,\rho_A\,M_b^\dagger\otimes|a\rangle\langle b|_B~.$$
We could thus describe the measurement by a PVM on $\mathcal{H}_B$ that projects onto $|a\rangle_B$ with probability
$$p_a=\mathrm{tr}\!\left(M_a\,\rho_A\,M_a^\dagger\right)=\mathrm{tr}\!\left(F_a\,\rho_A\right)~,$$
where the second equality follows from comparison with (6). Normalizing the final state accordingly, we may write (14) as
$$\rho_A\;\longrightarrow\;\sum_ap_a\,\frac{M_a\,\rho_A\,M_a^\dagger}{\mathrm{tr}\!\left(M_a\,\rho_A\,M_a^\dagger\right)}~.$$
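Here is a small numerical sketch of this dilation for a two-outcome qubit POVM (the specific $F_a$ are an arbitrary choice of mine): the isometry $V|\psi\rangle_A=\sum_aM_a|\psi\rangle_A\otimes|a\rangle_B$, which can always be completed to a unitary $U_{AB}$ on the enlarged space, entangles the system with the ancilla, and projecting the latter onto $|a\rangle_B$ reproduces the POVM probabilities $\langle\psi|F_a|\psi\rangle$.

```python
# Neimark dilation of a two-outcome qubit POVM (illustrative sketch).
import numpy as np

F0 = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)    # POVM elements
F1 = np.eye(2) - F0

def msqrt(F):
    """Hermitian square root, giving measurement operators M_a with F_a = M_a^dag M_a."""
    w, U = np.linalg.eigh(F)
    return U @ np.diag(np.sqrt(w)) @ U.conj().T
M = [msqrt(F0), msqrt(F1)]

ket_B = np.eye(2, dtype=complex)        # |0>_B, |1>_B label the outcomes
V = sum(np.kron(Ma, ket_B[a].reshape(2, 1)) for a, Ma in enumerate(M))
assert np.allclose(V.conj().T @ V, np.eye(2))              # V is an isometry

psi = np.array([0.8, 0.6], dtype=complex)                  # a normalized qubit state
out = V @ psi                                              # entangled state on H_A x H_B

# Projecting the B register onto |a>_B reproduces the POVM statistics.
for a, Fa in enumerate([F0, F1]):
    proj_B = np.kron(np.eye(2), np.outer(ket_B[a], ket_B[a].conj()))
    p_pvm = (out.conj() @ proj_B @ out).real
    p_povm = (psi.conj() @ Fa @ psi).real
    print(f"outcome {a}: PVM prob = {p_pvm:.4f}, POVM prob = {p_povm:.4f}")
```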
We mentioned previously that for POVMs, repeated measurements will not necessarily yield the same result. Now we see why: the result of such a general measurement (that is, on a subsystem) is given by an ensemble of pure states, and thus we require a description in terms of a density matrix rather than a single (orthogonal) eigenstate.
This is also the description we would use if we knew only that a measurement had been performed, but were ignorant of the results. For example, suppose we perform a measurement by probing the system with a single particle (say, a photon from a laser). Immediately after the interaction with the probe, but before the interaction with the classical detector that records it, the system is in an entangled state. We would thus describe the process as evolution by a superoperator that produces a density matrix/ensemble as above. In other words, the system has slightly decohered: if the initial state were pure, some of the coherence has been lost upon evolution to a mixed state. The subsequent interaction with the (classical) detector that we colloquially think of as “measurement” is simply the same process of decoherence on a hugely expanded scale: the (now mixed) state becomes entangled with the trillions of particles that comprise the detector, decohering essentially instantaneously to a classical state. All the uniquely quantum information of the system has now been lost.
This is what is referred to as “collapse of the wavefunction” in the Copenhagen interpretation. The reason for the invalidity of this interpretation is that it posits a projection onto a single eigenstate as a result of observation (by which we simply mean, interaction with the measurement apparatus; anthropocentric language aside, consciousness is emphatically not involved in any fundamental way). But as we’ve seen above, a proper description of measurement is that of entanglement with the environment under evolution via superoperators. The measurement process proceeds by POVMs, not PVMs, on the (sub)system under study. And while at the end of the day one does arrive at an eigenstate in the expanded Hilbert space (that includes the measurement apparatus/detector/observer/etc), this is a consequence of decohering to a classical state, rather than directly projecting to it. Decoherence can thus be thought of as giving the appearance of wavefunction collapse; but as evidenced by the countless reams of confused literature on quantum foundations and related areas, it is most dangerous to indulge in such simplifications so blithely. (We note in passing that the “wavefunction of the universe” never decoheres, since evolution in an isolated system is unitary).
Another important fact that no doubt contributes to the collapse confusion is that decoherence is irreversible. Consider composing two superoperators to form a third: if $\$_1$ describes the evolution from $t_0$ to $t_1$, and $\$_2$ describes the evolution from $t_1$ to $t_2$, then $\$_2\circ\$_1$ is a superoperator describing the evolution from $t_0$ to $t_2$. But the inverse of a superoperator is only a superoperator if it is unitary. This is in stark contrast to unitary evolution, which is perfectly invertible: we can run the equations backwards as well as forwards. Not so for superoperators: inverting $\$_1$ will not, in general, result in a superoperator that evolves backwards from $t_1$ to $t_0$. In other words, decoherence implies an arrow of time, and an irrevocable loss of quantum information. And while the former has philosophical ramifications which we shall not digress upon here, the latter is not at all surprising: as stated above, decoherence is the process by which quantum states become classical.
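One can see this failure of invertibility explicitly. In the sketch below (the construction is mine), we build the linear inverse of the dephasing channel used earlier and compute its Choi matrix $J=\sum_{ij}|i\rangle\langle j|\otimes\$^{-1}(|i\rangle\langle j|)$; the inverse exists as a linear map, but its Choi matrix has a negative eigenvalue, so it is not completely positive and hence not a superoperator.

```python
# The inverse of a (non-unitary) dephasing channel is not completely positive.
import numpy as np

p = 0.3
Z = np.diag([1, -1]).astype(complex)
kraus = [np.sqrt(1 - p) * np.eye(2), np.sqrt(p) * Z]

def channel(rho):
    """Dephasing channel rho -> (1-p) rho + p Z rho Z."""
    return sum(M @ rho @ M.conj().T for M in kraus)

def choi(mapping):
    """Choi matrix J = sum_{ij} |i><j| (x) mapping(|i><j|)."""
    blocks = [[mapping(np.outer(np.eye(2)[i], np.eye(2)[j])) for j in range(2)]
              for i in range(2)]
    return np.block(blocks)

# Transfer matrix of the channel on (row-major) vectorized 2x2 matrices.
basis = [np.outer(np.eye(2)[i], np.eye(2)[j]) for i in range(2) for j in range(2)]
T = np.column_stack([channel(E).reshape(-1) for E in basis])
T_inv = np.linalg.inv(T)

def inverse_map(rho):
    """Linear inverse of the channel (not a physical evolution)."""
    return (T_inv @ rho.reshape(-1)).reshape(2, 2)

print(np.linalg.eigvalsh(choi(channel)).round(3))      # all >= 0 : completely positive
print(np.linalg.eigvalsh(choi(inverse_map)).round(3))  # one < 0  : not completely positive
```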
Several open questions remain. Perhaps chief among them is our failure to fully resolve the “disconcerting dualism” between deterministic evolution and probabilistic measurement. Insofar as probability is a statement of our ignorance and thus fundamentally epistemic, any formulation of quantum mechanics that relies thereupon is doomed to suffer the same characterization, for what does it mean to say that nature is fundamentally probabilistic? We may ask whether the associated lack of predictivity in quantum mechanics stems from the fact that there does not exist a state which is an eigenstate of all observables. One also wonders whether it is possible to formulate a consistent theory with non-linearly evolving superoperators, and what the interpretation thereof would be vis-à-vis probabilistic ensembles (that is, to what extent we can free ourselves from probability if we distance ourselves from the linearity it imposes). Zurek’s work on decoherence contains some clarifying insight into this issue, but that’s a subject for another post.
It is tempting to speculate that the issue of how to properly describe measurement and evolution lies at the heart of the black hole information paradox, wherein a black hole formed from the collapse of an initially pure state appears to evolve to a mixed state, in violation of the supposedly unitary S-matrix. Indeed, for various reasons, this picture is almost certainly too naïve. In particular, evolution is not unitary, but it remains to be shown precisely how a more ontologically accurate rendition of the problem would solve it.