It is to be emphasized that Low's work allows one to classify the stable caustics of small wave fronts, but not directly of (big) wave fronts. Clearly, a (big) wave front is a one-parameter family of small wave fronts. A qualitative change of a small wave front, in dependence on a parameter, is called a “metamorphosis” in the English literature and a “perestroika” in the Russian literature. Combining Low's results with the theory of metamorphoses, or perestroikas, could lead to a classification of the stable caustics of (big) wave fronts. However, this has not been worked out until now.
Wave fronts in general relativity have been studied in a long series of articles by Newman, Frittelli, and collaborators. For some aspects of their work see Sections
2.9 and 3.4 . In the quasi-Newtonian approximation formalism of lensing, the classification of caustics is treated in great detail in the book by Petters, Levine, and Wambsganss [
274]
. Interesting related material can also be found in Blandford and Narayan [
34]
. For a nice exposition of caustics in ordinary optics see Berry and Upstill [
28]
.
A light source that comes close to the caustic of the observer's past light cone is seen strongly magnified. For a point source whose worldline passes exactly through the caustic, the ray-optical treatment even gives an infinite brightness (see Section
2.6 ). If a light source passes behind a compact deflecting mass, its brightness increases and decreases in the course of time, with a maximum at the moment of closest approach to the caustic. Such microlensing events are routinely observed by monitoring a large number of stars in the bulge of our Galaxy, in the Magellanic Clouds, and in the Andromeda Galaxy (see, e.g., [
226]
for an overview). In his millennium essay on future perspectives of gravitational lensing, Blandford [
33]
mentioned the possibility of observing a chosen light source strongly magnified over a period of time with the help of a space-borne telescope. The idea is to guide the spacecraft such that the worldline of the light source remains in (or close to) the one-parameter family of caustics of past light cones of the spacecraft over a period of time. This futuristic idea of “caustic surfing” was further discussed mathematically by Frittelli and Petters [
127]
.
2.3 Optical scalars and Sachs equations
For the calculation of distance measures, of image distortion, and of the brightness of images one has to study the Jacobi equation (= equation of geodesic deviation) along lightlike geodesics.
This is usually done in terms of the
optical scalars which were introduced by Sachs et al. [
171,
290]
.
Related background material on lightlike geodesic congruences can be found in many textbooks (see, e.g., Wald [
341]
, Section 9.2). In view of applications to lensing, a particularly useful exposition was given by Seitz, Schneider and Ehlers [
301]
. In the following the basic notions and results will be summarized.
Infinitesimally thin bundles.
Let
$s\mapsto \lambda \left(s\right)$
be an affinely parametrized lightlike geodesic with tangent vector field
$K=\dot{\lambda}$
. We assume that
$\lambda $
is past-oriented, because in applications to lensing one usually considers rays from the observer to the source. We use the summation convention for capital indices
$A,B,...$
taking the values 1 and 2. An infinitesimally thin bundle (with elliptical cross-section) along
$\lambda $
is a set
$$\begin{array}{c}\mathcal{B}=\left\{{c}^{A}{Y}_{A}\,\big|\,{c}^{1},{c}^{2}\in \mathbb{R},\;{\delta}_{AB}{c}^{A}{c}^{B}\le 1\right\}.\end{array}$$ 
(8)

Here
${\delta}_{AB}$
denotes the Kronecker delta, and
${Y}_{1}$
and
${Y}_{2}$
are two vector fields along
$\lambda $
with
$$\begin{array}{ccc}{\nabla}_{K}{\nabla}_{K}{Y}_{A}& =& R(K,{Y}_{A},K),\end{array}$$ 
(9)

$$\begin{array}{ccc}g(K,{Y}_{A})& =& 0,\end{array}$$ 
(10)

such that
${Y}_{1}\left(s\right)$
,
${Y}_{2}\left(s\right)$
, and
$K\left(s\right)$
are linearly independent for almost all
$s$
. As usual,
$R$
denotes the curvature tensor, defined by
$$\begin{array}{c}R(X,Y,Z)={\nabla}_{X}{\nabla}_{Y}Z-{\nabla}_{Y}{\nabla}_{X}Z-{\nabla}_{[X,Y]}Z.\end{array}$$ 
(11)

Equation ( 9 ) is the Jacobi equation. It is a precise mathematical formulation of the statement that “the arrowhead of
${Y}_{A}$
traces an infinitesimally neighboring geodesic”. Equation ( 10 ) guarantees that this neighboring geodesic is, again, lightlike and spatially related to
$\lambda $
.
Sachs basis.
For discussing the geometry of infinitesimally thin bundles it is usual to introduce a Sachs basis, i.e., two vector fields
${E}_{1}$
and
${E}_{2}$
along
$\lambda $
that are orthonormal, orthogonal to
$K=\dot{\lambda}$
, and parallelly transported,
$$\begin{array}{c}g({E}_{A},{E}_{B})={\delta}_{AB},g(K,{E}_{A})=0,{\nabla}_{K}{E}_{A}=0.\end{array}$$ 
(12)

Apart from the possibility to interchange them,
${E}_{1}$
and
${E}_{2}$
are unique up to transformations
$$\begin{array}{ccc}{\stackrel{~}{E}}_{1}& =& \cos\alpha \,{E}_{1}+\sin\alpha \,{E}_{2}+{a}_{1}K,\end{array}$$ 
(13)

$$\begin{array}{ccc}{\stackrel{~}{E}}_{2}& =& -\sin\alpha \,{E}_{1}+\cos\alpha \,{E}_{2}+{a}_{2}K,\end{array}$$ 
(14)

where
$\alpha $
,
${a}_{1}$
, and
${a}_{2}$
are constant along
$\lambda $
. A Sachs basis determines a unique vector field
$U$
with
$g(U,U)=-1$
and
$g(U,K)=1$
along
$\lambda $
that is perpendicular to
${E}_{1}$
, and
${E}_{2}$
. As
$K$
is assumed past-oriented,
$U$
is future-oriented. In the rest system of the observer field
$U$
, the Sachs basis spans the 2-space perpendicular to the ray. It is helpful to interpret this 2-space as a “screen”; correspondingly, linear combinations of
${E}_{1}$
and
${E}_{2}$
are often referred to as “screen vectors”.
Jacobi matrix.
With respect to a Sachs basis, the basis vector fields
${Y}_{1}$
and
${Y}_{2}$
of an infinitesimally thin bundle can be represented as
$$\begin{array}{c}{Y}_{A}={D}_{A}^{B}{E}_{B}+{y}_{A}K.\end{array}$$ 
(15)

The Jacobi matrix
$\mathit{D}=\left({D}_{A}^{B}\right)$
relates the shape of the cross-section of the infinitesimally thin bundle to the Sachs basis (see Figure 3 ). Equation ( 9 ) implies that
$\mathit{D}$
satisfies the matrix Jacobi equation
$$\begin{array}{c}\ddot{\mathit{D}}=\mathit{D}\mathit{R},\end{array}$$ 
(16)

where an overdot means derivative with respect to the affine parameter
$s$
, and
$$\begin{array}{c}\mathit{R}=\left(\begin{array}{cc}{\Phi}_{00}& 0\\ 0& {\Phi}_{00}\end{array}\right)+\left(\begin{array}{cc}-\mathrm{Re}\left({\psi}_{0}\right)& \mathrm{Im}\left({\psi}_{0}\right)\\ \mathrm{Im}\left({\psi}_{0}\right)& \mathrm{Re}\left({\psi}_{0}\right)\end{array}\right)\end{array}$$ 
(17)

is the optical tidal matrix, with
$$\begin{array}{c}{\Phi}_{00}=-\frac{1}{2}Ric(K,K),\qquad {\psi}_{0}=-\frac{1}{2}C\left({E}_{1}-i{E}_{2},K,{E}_{1}-i{E}_{2},K\right).\end{array}$$ 
(18)

Here
$Ric$
denotes the Ricci tensor, defined by
$Ric(X,Y)=tr\left(R(\cdot ,X,Y)\right)$
, and
$C$
denotes the conformal curvature tensor (= Weyl tensor). The notation in Equation ( 18 ) is chosen in agreement with the Newman–Penrose formalism (cf., e.g., [
54]
). As
${Y}_{1}$
,
${Y}_{2}$
, and
$K$
are not everywhere linearly dependent,
$det\left(\mathit{D}\right)$
does not vanish identically. Linearity of the matrix Jacobi equation implies that
$det\left(\mathit{D}\right)$
has only isolated zeros. These are the “caustic points” of the bundle (see below).
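As a numerical illustration, the following minimal Python sketch integrates the matrix Jacobi equation $\ddot{\mathit{D}}=\mathit{D}\mathit{R}$ from a vertex at $s=0$ (that is, $\mathit{D}=0$ and $\dot{\mathit{D}}$ equal to the unit matrix there) with a classical fourth-order Runge-Kutta scheme. The function name `integrate_jacobi`, the callable `tidal` that returns the optical tidal matrix as a 2×2 NumPy array, and the step count are assumptions made only for this illustration.

```python
import numpy as np

def integrate_jacobi(tidal, s_end, steps=2000):
    """Integrate D'' = D R(s) from a vertex at s = 0 (D = 0, D' = identity).

    tidal: hypothetical callable s -> symmetric 2x2 array (optical tidal matrix).
    Returns the pair (D, dD/ds) at s = s_end.
    """
    h = s_end / steps
    D = np.zeros((2, 2))
    V = np.eye(2)  # V stands for dD/ds
    s = 0.0
    for _ in range(steps):
        # classical RK4 for the first-order system (D, V)' = (V, D R(s))
        k1D, k1V = V, D @ tidal(s)
        k2D, k2V = V + 0.5 * h * k1V, (D + 0.5 * h * k1D) @ tidal(s + 0.5 * h)
        k3D, k3V = V + 0.5 * h * k2V, (D + 0.5 * h * k2D) @ tidal(s + 0.5 * h)
        k4D, k4V = V + h * k3V, (D + h * k3D) @ tidal(s + h)
        D = D + h / 6.0 * (k1D + 2 * k2D + 2 * k3D + k4D)
        V = V + h / 6.0 * (k1V + 2 * k2V + 2 * k3V + k4V)
        s += h
    return D, V

# Flat case: a vanishing tidal matrix gives D(s) = s * identity.
D_flat, _ = integrate_jacobi(lambda s: np.zeros((2, 2)), 3.0)

# Pure Ricci focusing with constant strength k**2: D(s) = sin(k s)/k * identity.
k = 2.0
D_focus, _ = integrate_jacobi(lambda s: -k**2 * np.eye(2), 1.0)
```

In the second case $det\left(\mathit{D}\right)$ vanishes only at the isolated parameter values $s=n\pi /k$, which illustrates the statement about isolated zeros.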
Shape parameters.
The Jacobi matrix
$\mathit{D}$
can be parametrized according to
$$\begin{array}{c}\mathit{D}=\left(\begin{array}{cc}\cos\psi & -\sin\psi \\ \sin\psi & \cos\psi \end{array}\right)\left(\begin{array}{cc}{D}_{+}& 0\\ 0& {D}_{-}\end{array}\right)\left(\begin{array}{cc}\cos\chi & \sin\chi \\ -\sin\chi & \cos\chi \end{array}\right).\end{array}$$ 
(19)

Here we made use of the fact that any matrix can be written as the product of an orthogonal and a symmetric matrix, and that any symmetric matrix can be diagonalized. Note that, by our definition of infinitesimally thin bundles,
${D}_{+}$
and
${D}_{-}$
are nonzero almost everywhere. Equation ( 19 ) determines
${D}_{+}$
and
${D}_{-}$
up to sign. The most interesting case for us is that of an infinitesimally thin bundle that issues from a vertex at an observation event
${p}_{O}$
into the past. For such bundles we require
${D}_{+}$
and
${D}_{-}$
to be positive near the vertex and differentiable everywhere; this uniquely determines
${D}_{+}$
and
${D}_{-}$
everywhere. With
${D}_{+}$
and
${D}_{-}$
fixed, the angles
$\chi $
and
$\psi $
are unique at all points where the bundle is noncircular; in other words, requiring them to be continuous determines these angles uniquely along every infinitesimally thin bundle that is noncircular almost everywhere.
In the representation of Equation (
19 ), the extremal points of the bundle's elliptical cross-section are given by the position vectors
$$\begin{array}{ccc}{Y}_{+}& =& \cos\psi \,{Y}_{1}+\sin\psi \,{Y}_{2}\simeq {D}_{+}\left(\cos\chi \,{E}_{1}+\sin\chi \,{E}_{2}\right),\end{array}$$ 
(20)

$$\begin{array}{ccc}{Y}_{-}& =& -\sin\psi \,{Y}_{1}+\cos\psi \,{Y}_{2}\simeq {D}_{-}\left(-\sin\chi \,{E}_{1}+\cos\chi \,{E}_{2}\right),\end{array}$$ 
(21)

where
$\simeq $
means equality up to multiples of
$K$
. Hence,
$\left|{D}_{+}\right|$
and
$\left|{D}_{-}\right|$
give the semi-axes of the elliptical cross-section and
$\chi $
gives the angle by which the ellipse is rotated with respect to the Sachs basis (see Figure 3 ). We call
${D}_{+}$
,
${D}_{-}$
, and
$\chi $
the shape parameters of the bundle, following Frittelli, Kling, and Newman [
120,
119]
. Instead of
${D}_{+}$
and
${D}_{-}$
one may also use
${D}_{+}{D}_{-}$
and
${D}_{+}/{D}_{-}$
. For the case that the infinitesimally thin bundle can be embedded in a wave front, the shape parameters
${D}_{+}$
and
${D}_{-}$
have the following interesting property (see Kantowski et al. [
172,
84]
).
${\dot{D}}_{+}/{D}_{+}$
and
${\dot{D}}_{-}/{D}_{-}$
give the principal curvatures of the wave front in the rest system of the observer field
$U$
which is perpendicular to the Sachs basis. The notation
${D}_{+}$
and
${D}_{-}$
, which is taken from [
84]
, is convenient because it often allows one to write two equations in the form of one equation with a
$\pm $
sign (see, e.g., Equation ( 27 ) or Equation ( 93 ) below). The angle
$\chi $
can be directly linked with observations if a light source emits linearly polarized light (see Section 2.5 ). If the Sachs basis is transformed according to Equations ( 13 , 14 ) and
${Y}_{1}$
and
${Y}_{2}$
are kept fixed, the Jacobi matrix changes according to
${\stackrel{~}{D}}_{\pm}={D}_{\pm}$
,
$\stackrel{~}{\chi}=\chi +\alpha $
,
$\stackrel{~}{\psi}=\psi $
. This demonstrates the important fact that the shape and the size of the cross-section of an infinitesimally thin bundle have an invariant meaning [
290]
.
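The parametrization of the Jacobi matrix by shape parameters is, in numerical terms, a singular value decomposition. The following Python sketch recovers the semi-axes and the orientation angle of the cross-section from a given 2×2 matrix. It assumes the convention that rows of $\mathit{D}$ belong to ${Y}_{A}$ and columns to ${E}_{B}$, so that $\cos\psi \,{Y}_{1}+\sin\psi \,{Y}_{2}\simeq {D}_{+}(\cos\chi \,{E}_{1}+\sin\chi \,{E}_{2})$. The helper names `rot` and `shape_parameters` are hypothetical. Because a singular value decomposition returns non-negative values in descending order, the sketch yields only the absolute values of the shape parameters, larger one first, and the angle $\chi$ modulo $\pi$; fixing the signs requires the continuity argument described in the text.

```python
import numpy as np

def rot(a):
    """Rotation matrix by angle a (counterclockwise)."""
    return np.array([[np.cos(a), -np.sin(a)],
                     [np.sin(a),  np.cos(a)]])

def shape_parameters(D):
    """Return (larger semi-axis, smaller semi-axis, chi mod pi) of a 2x2 Jacobi matrix.

    Assumes D = rot(psi) @ diag(D_plus, D_minus) @ rot(chi).T with distinct |D_plus|, |D_minus|.
    """
    U, sv, Vt = np.linalg.svd(D)
    # First row of Vt is +-(cos chi, sin chi) for the larger semi-axis.
    chi = np.arctan2(Vt[0, 1], Vt[0, 0]) % np.pi
    return sv[0], sv[1], chi

# Example: build D from chosen parameters and recover them.
D_example = rot(0.3) @ np.diag([2.0, 0.5]) @ rot(0.7).T
semi_major, semi_minor, chi = shape_parameters(D_example)
```

The angle $\psi$ does not enter the returned quantities, in line with the remark that it does not show in the outline of the cross-section.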
Figure 3
: Cross-section of an infinitesimally thin bundle. The Jacobi matrix ( 19 ) relates the Jacobi fields
${Y}_{1}$
and
${Y}_{2}$
that span the bundle to the Sachs basis vectors
${E}_{1}$
and
${E}_{2}$
. The shape parameters
${D}_{+}$
,
${D}_{-}$
, and
$\chi $
determine the outline of the cross-section; the angle
$\psi $
that appears in Equation ( 19 ) does not show in the outline. The picture shows the projection into the 2-space (“screen”) spanned by
${E}_{1}$
and
${E}_{2}$
; note that, in general,
${Y}_{1}$
and
${Y}_{2}$
have components perpendicular to the screen.
Optical scalars.
Along each infinitesimally thin bundle one defines the deformation matrix
$\mathit{S}$
by
$$\begin{array}{c}\dot{\mathit{D}}=\mathit{D}\mathit{S}.\end{array}$$ 
(22)

This reduces the second-order linear differential equation ( 16 ) for
$\mathit{D}$
to a first-order nonlinear differential equation for
$\mathit{S}$
,
$$\begin{array}{c}\dot{\mathit{S}}+\mathit{S}\mathit{S}=\mathit{R}.\end{array}$$ 
(23)

It is usual to decompose
$\mathit{S}$
into antisymmetric, symmetrictracefree, and trace parts,
$$\begin{array}{c}\mathit{S}=\left(\begin{array}{cc}0& \omega \\ -\omega & 0\end{array}\right)+\left(\begin{array}{cc}{\sigma}_{1}& {\sigma}_{2}\\ {\sigma}_{2}& -{\sigma}_{1}\end{array}\right)+\left(\begin{array}{cc}\theta & 0\\ 0& \theta \end{array}\right).\end{array}$$ 
(24)

This defines the optical scalars
$\omega $
(twist),
$\theta $
(expansion), and
$({\sigma}_{1},{\sigma}_{2})$
(shear). One usually combines them into two complex scalars
$\varrho =\theta +i\omega $
and
$\sigma ={\sigma}_{1}+i{\sigma}_{2}$
. A change ( 13 , 14 ) of the Sachs basis affects the optical scalars according to
$\stackrel{~}{\varrho}=\varrho $
and
$\stackrel{~}{\sigma}={e}^{2i\alpha}\sigma $
. Thus,
$\varrho $
and
$\left|\sigma \right|$
are invariant. If rewritten in terms of the optical scalars, Equation ( 23 ) gives the Sachs equations
$$\begin{array}{ccc}\dot{\varrho}& =& -{\varrho}^{2}-{\left|\sigma \right|}^{2}+{\Phi}_{00},\end{array}$$ 
(25)

$$\begin{array}{ccc}\dot{\sigma}& =& -\sigma \left(\varrho +\overline{\varrho}\right)+{\psi}_{0}.\end{array}$$ 
(26)

One sees that the Ricci curvature term
${\Phi}_{00}$
directly produces expansion (focusing) and that the conformal curvature term
${\psi}_{0}$
directly produces shear. However, as the shear appears in Equation ( 25 ), conformal curvature indirectly influences focusing (cf. Penrose [
259]
). With
$\mathit{D}$
written in terms of the shape parameters and
$\mathit{S}$
written in terms of the optical scalars, Equation ( 22 ) results in
$$\begin{array}{c}{\dot{D}}_{\pm}-i\dot{\chi}{D}_{\pm}+i\dot{\psi}{D}_{\mp}=\left(\varrho \pm {e}^{2i\chi}\sigma \right){D}_{\pm}.\end{array}$$ 
(27)

Along
$\lambda $
, Equations ( 25 , 26 ) give a system of 4 real first-order differential equations for the 4 real variables
$\varrho $
and
$\sigma $
; if
$\varrho $
and
$\sigma $
are known, Equation ( 27 ) gives a system of 4 real first-order differential equations for the 4 real variables
${D}_{\pm}$
,
$\chi $
, and
$\psi $
. The twist-free solutions (
$\varrho $
real) to Equations ( 25 , 26 ) constitute a 3-dimensional linear subspace of the 4-dimensional space of all solutions. This subspace carries a natural metric of Lorentzian signature, unique up to a conformal factor, and was nicknamed Minikowski space in [
20]
.
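To see the Sachs equations at work, the following Python sketch integrates them for a twist-free bundle ($\varrho =\theta$ real), written out as $\dot{\theta}=-{\theta}^{2}-{\left|\sigma \right|}^{2}+{\Phi}_{00}$ and $\dot{\sigma}=-2\theta \sigma +{\psi}_{0}$. Constant values of ${\Phi}_{00}$ and ${\psi}_{0}$ along the ray, the function names, and the step count are assumptions made for this illustration only. Since $\theta$ diverges at a vertex, the integration starts at a small positive parameter value with initial data supplied by the caller.

```python
import math

def sachs_rhs(theta, sigma, phi00, psi0):
    """Right-hand sides of the Sachs equations for a twist-free bundle."""
    dtheta = -theta**2 - abs(sigma)**2 + phi00
    dsigma = -2.0 * theta * sigma + psi0
    return dtheta, dsigma

def integrate_sachs(theta0, sigma0, s0, s1, phi00=0.0, psi0=0.0, steps=20000):
    """Integrate (theta, sigma) from s0 to s1 with classical RK4 (constant phi00, psi0 assumed)."""
    h = (s1 - s0) / steps
    theta, sigma = float(theta0), complex(sigma0)
    for _ in range(steps):
        k1 = sachs_rhs(theta, sigma, phi00, psi0)
        k2 = sachs_rhs(theta + 0.5 * h * k1[0], sigma + 0.5 * h * k1[1], phi00, psi0)
        k3 = sachs_rhs(theta + 0.5 * h * k2[0], sigma + 0.5 * h * k2[1], phi00, psi0)
        k4 = sachs_rhs(theta + h * k3[0], sigma + h * k3[1], phi00, psi0)
        theta += h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        sigma += h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return theta, sigma

# Flat spacetime, bundle with vertex at s = 0: theta(s) = 1/s, sigma = 0.
theta_flat, sigma_flat = integrate_sachs(1.0 / 0.1, 0.0, 0.1, 2.0)

# Constant Ricci focusing phi00 = -1 without shear: theta(s) = cot(s).
theta_ricci, _ = integrate_sachs(1.0 / math.tan(0.1), 0.0, 0.1, 1.0, phi00=-1.0)
```

In the second example the expansion decreases faster than in flat spacetime, which is the focusing effect of the Ricci term; with ${\psi}_{0}\ne 0$ the shear would grow and add to the focusing through the ${\left|\sigma \right|}^{2}$ term.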
Conservation law.
As the optical tidal matrix
$\mathit{R}$
is symmetric, for any two solutions
${\mathit{D}}_{1}$
and
${\mathit{D}}_{2}$
of the matrix Jacobi equation ( 16 ) we have
$$\begin{array}{c}{\dot{\mathit{D}}}_{1}{\mathit{D}}_{2}^{T}-{\mathit{D}}_{1}{\dot{\mathit{D}}}_{2}^{T}=\text{constant},\end{array}$$ 
(28)

where
$(\,\cdot \,)^{T}$
means transposition. Evaluating the case
${\mathit{D}}_{1}={\mathit{D}}_{2}$
shows that for every infinitesimally thin bundle
$$\begin{array}{c}\omega {D}_{+}{D}_{-}=\text{constant}.\end{array}$$ 
(29)

Thus, there are two types of infinitesimally thin bundles: those for which this constant is nonzero and those for which it is zero. In the first case the bundle is twisting (
$\omega \ne 0$
everywhere) and its cross-section nowhere collapses to a line or to a point (
${D}_{+}\ne 0$
and
${D}_{-}\ne 0$
everywhere).
In the second case the bundle must be nontwisting (
$\omega =0$
everywhere), because our definition of infinitesimally thin bundles implies that
${D}_{+}\ne 0$
and
${D}_{-}\ne 0$
almost everywhere. A quick calculation shows that
$\omega =0$
is exactly the integrability condition that makes sure that the infinitesimally thin bundle can be embedded in a wave front. (For the definition of wave fronts see Section 2.2 .) In other words, for an infinitesimally thin bundle we can find a wave front such that
$\lambda $
is one of the generators, and
${Y}_{1}$
and
${Y}_{2}$
connect
$\lambda $
with infinitesimally neighboring generators if and only if the bundle is twist-free. For a (necessarily twist-free) infinitesimally thin bundle, points where one of the two shape parameters
${D}_{+}$
and
${D}_{-}$
vanishes are called caustic points of multiplicity one, and points where both shape parameters
${D}_{+}$
and
${D}_{-}$
vanish are called caustic points of multiplicity two. This notion coincides exactly with the notion of caustic points, or conjugate points, of wave fronts as introduced in Section 2.2 . The behavior of the optical scalars near caustic points can be deduced from Equation ( 27 ) with Equations ( 25 , 26 ). For a caustic point of multiplicity one at
$s={s}_{0}$
one finds
$$\begin{array}{ccc}\theta \left(s\right)& =& \frac{1}{2(s-{s}_{0})}\left(1+\mathcal{O}(s-{s}_{0})\right),\end{array}$$ 
(30)

$$\begin{array}{ccc}\left|\sigma \left(s\right)\right|& =& \frac{1}{2(s-{s}_{0})}\left(1+\mathcal{O}(s-{s}_{0})\right).\end{array}$$ 
(31)

By contrast, for a caustic point of multiplicity two at
$s={s}_{0}$
the equations read (cf. [
301]
)
$$\begin{array}{ccc}\theta \left(s\right)& =& \frac{1}{s-{s}_{0}}+\mathcal{O}(s-{s}_{0}),\end{array}$$ 
(32)

$$\begin{array}{ccc}\sigma \left(s\right)& =& \frac{1}{3}{\psi}_{0}\left({s}_{0}\right)(s-{s}_{0})+\mathcal{O}\left((s-{s}_{0}{)}^{2}\right).\end{array}$$ 
(33)

Infinitesimally thin bundles with vertex.
We say that an infinitesimally thin bundle has a vertex at
$s={s}_{0}$
if the Jacobi matrix satisfies
$$\begin{array}{c}\mathit{D}\left({s}_{0}\right)=0,\dot{\mathit{D}}\left({s}_{0}\right)=1.\end{array}$$ 
(34)

A vertex is, in particular, a caustic point of multiplicity two. An infinitesimally thin bundle with a vertex must be nontwisting. While any nontwisting infinitesimally thin bundle can be embedded in a wave front, an infinitesimally thin bundle with a vertex can be embedded in a light cone. Near the vertex, it has a circular cross-section. If
${\mathit{D}}_{1}$
has a vertex at
${s}_{1}$
and
${\mathit{D}}_{2}$
has a vertex at
${s}_{2}$
, the conservation law ( 28 ) implies
$$\begin{array}{c}{\mathit{D}}_{2}^{T}\left({s}_{1}\right)=-{\mathit{D}}_{1}\left({s}_{2}\right).\end{array}$$ 
(35)

This is Etherington's [
103]
reciprocity law. The method by which this law was proven here follows Ellis [
97]
(cf. Schneider, Ehlers, and Falco [
297]
). Etherington's reciprocity law is of relevance, in particular in view of cosmology, because it relates the luminosity distance to the area distance (see Equation ( 47 )). It was independently rediscovered in the 1960s by Sachs and Penrose (see [
259,
189]
).
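The reciprocity law can be checked numerically. The following Python sketch integrates the matrix Jacobi equation $\ddot{\mathit{D}}=\mathit{D}\mathit{R}$ once forward from a vertex at ${s}_{1}$ and once backward from a vertex at ${s}_{2}$, and compares ${\mathit{D}}_{2}^{T}({s}_{1})$ with $-{\mathit{D}}_{1}({s}_{2})$. The symmetric tidal matrix in `tidal` is a made-up example without physical significance, and the function names and step count are likewise assumptions of this sketch.

```python
import numpy as np

def tidal(s):
    """Hypothetical symmetric optical tidal matrix, chosen only for illustration."""
    return np.array([[-0.3 - 0.2 * np.sin(s), 0.1 * s],
                     [0.1 * s, -0.3 + 0.2 * np.sin(s)]])

def jacobi_from_vertex(s_vertex, s_target, steps=4000):
    """Integrate D'' = D R(s) with D = 0, D' = identity at s_vertex; return D(s_target).

    The step h is negative when s_target < s_vertex, which integrates backward.
    """
    h = (s_target - s_vertex) / steps
    D, V, s = np.zeros((2, 2)), np.eye(2), s_vertex
    for _ in range(steps):
        # classical RK4 for the system (D, V)' = (V, D R(s))
        k1D, k1V = V, D @ tidal(s)
        k2D, k2V = V + 0.5 * h * k1V, (D + 0.5 * h * k1D) @ tidal(s + 0.5 * h)
        k3D, k3V = V + 0.5 * h * k2V, (D + 0.5 * h * k2D) @ tidal(s + 0.5 * h)
        k4D, k4V = V + h * k3V, (D + h * k3D) @ tidal(s + h)
        D = D + h / 6.0 * (k1D + 2 * k2D + 2 * k3D + k4D)
        V = V + h / 6.0 * (k1V + 2 * k2V + 2 * k3V + k4V)
        s += h
    return D

s1, s2 = 0.0, 2.0
D1_at_s2 = jacobi_from_vertex(s1, s2)  # bundle with vertex at s1, evaluated at s2
D2_at_s1 = jacobi_from_vertex(s2, s1)  # bundle with vertex at s2, evaluated at s1
reciprocity_defect = np.max(np.abs(D2_at_s1.T + D1_at_s2))
```

The quantity `reciprocity_defect` stays at the level of the integration error as long as the tidal matrix is symmetric; with a non-symmetric matrix in `tidal` the conservation law, and with it the reciprocity relation, would fail.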
The results of this section are the basis for Sections
2.4 , 2.5 , and 2.6 .
2.4 Distance measures
In this section we summarize various distance measures that are defined in an arbitrary spacetime.
Some of them are directly related to observable quantities with relevance for lensing. The material of this section makes use of the results on infinitesimally thin bundles which are summarized in Section
2.3 . All of the distance measures to be discussed refer to a past-oriented lightlike geodesic
$\lambda $
from an observation event
${p}_{O}$
to an emission event
${p}_{S}$
(see Figure 4 ). Some of them depend on the 4-velocity
${U}_{O}$
of the observer at
${p}_{O}$
and/or on the 4-velocity
${U}_{S}$
of the light source at
${p}_{S}$
. If a vector field
$U$
with
$g(U,U)=-1$
is distinguished on
$\mathcal{\mathcal{M}}$
, we can choose for the observer an integral curve of
$U$
and for the light sources all other integral curves of
$U$
. Then each of the distance measures becomes a function of the observational coordinates
$(s,\Psi ,\Theta ,\tau )$
(recall Section 2.1 ).
Figure 4
: Past-oriented lightlike geodesic
$\lambda $
from an observation event
${p}_{O}$
to an emission event
${p}_{S}$
.
${\gamma}_{O}$
is the worldline of the observer,
${\gamma}_{S}$
is the worldline of the light source.
${U}_{O}$
is the 4-velocity of the observer at
${p}_{O}$
and
${U}_{S}$
is the 4-velocity of the light source at
${p}_{S}$
.
Affine distance.
There is a unique affine parametrization
$s\mapsto \lambda \left(s\right)$
for each lightlike geodesic through the observation event
${p}_{O}$
such that
$\lambda \left(0\right)={p}_{O}$
and
$g\left(\dot{\lambda}\left(0\right),{U}_{O}\right)=1$
. Then the affine parameter
$s$
itself can be viewed as a distance measure. This affine distance has the desirable features that it increases monotonically along each ray and that it coincides in an infinitesimal neighborhood of
${p}_{O}$
with Euclidean distance in the rest system of
${U}_{O}$
. The affine distance depends on the 4-velocity
${U}_{O}$
of the observer but not on the 4-velocity
${U}_{S}$
of the light source. It is a mathematically very convenient notion, but it is not an observable. (It can be operationally realized in terms of an observer field whose 4velocities are parallel along the ray. Then the affine distance results by integration if each observer measures the length of an infinitesimally short part of the ray in his rest system. However, in view of astronomical situations this is a purely theoretical construction.) The notion of affine distance was introduced by Kermack, McCrea, and Whittaker [
179]
.
Travel time.
As an alternative distance measure one can use the travel time. This requires the choice of a time function, i.e., of a function
$t$
that slices the spacetime into spacelike hypersurfaces
$t=\text{constant}$
.
(Such a time function globally exists if and only if the spacetime is stably causal; see, e.g., [
153]
, p. 198.) The travel time is equal to
$t\left({p}_{O}\right)-t\left({p}_{S}\right)$
, for each
${p}_{S}$
on the past light cone of
${p}_{O}$
. In other words, the intersection of the light cone with a hypersurface
$t=\text{constant}$
determines events of equal travel time; we call these intersections “instantaneous wave fronts” (recall Section 2.2 ).
Examples of instantaneous wave fronts are shown in Figures
13 , 18 , 19 , 27 , and 28 . The travel time increases monotonically along each ray. Clearly, it depends neither on the 4-velocity
${U}_{O}$
of the observer nor on the 4-velocity
${U}_{S}$
of the light source. Note that the travel time has a unique value at each point of
${p}_{O}$
's past light cone, even at events that can be reached by two different rays from
${p}_{O}$
. Near
${p}_{O}$
the travel time coincides with Euclidean distance in the observer's rest system only if
${U}_{O}$
is perpendicular to the hypersurface
$t=\text{constant}$
with
$dt\left({U}_{O}\right)=1$
. (The latter equation is true if along the observer's worldline the time function
$t$
coincides with proper time.) The travel time is not directly observable. However, travel time differences are observable in multiple-imaging situations if the intrinsic luminosity of the light source is time-dependent. To illustrate this, think of a light source that flashes at a particular instant. If the flash reaches the observer's worldline along two different rays, the proper time difference
$\Delta {\tau}_{O}$
of the two arrival events is directly measurable. For a time function
$t$
that along the observer's worldline coincides with proper time, this observed time delay
$\Delta {\tau}_{O}$
gives the difference in travel time for the two rays. In view of applications, the measurement of time delays is of great relevance for quasar lensing.
For the double quasar 0957+561 the observed time delay
$\Delta {\tau}_{O}$
is about 417 days (see, e.g., [
274]
, p. 149).
Redshift. In cosmology it is common to use the redshift as a distance measure. For assigning a redshift to a lightlike geodesic
$\lambda $
that connects the observation event
${p}_{O}$
on the worldline
${\gamma}_{O}$
of the observer with the emission event
${p}_{S}$
on the worldline
${\gamma}_{S}$
of the light source, one considers a neighboring lightlike geodesic that meets
${\gamma}_{O}$
at a proper time interval
$\Delta {\tau}_{O}$
from
${p}_{O}$
and
${\gamma}_{S}$
at a proper time interval
$\Delta {\tau}_{S}$
from
${p}_{S}$
. The redshift
$z$
is defined as
$$\begin{array}{c}z=\lim_{\Delta {\tau}_{S}\to 0}\frac{\Delta {\tau}_{O}-\Delta {\tau}_{S}}{\Delta {\tau}_{S}}.\end{array}$$ 
(36)

If
$\lambda $
is affinely parametrized with
$\lambda \left(0\right)={p}_{O}$
and
$\lambda \left(s\right)={p}_{S}$
, one finds that
$z$
is given by
$$\begin{array}{c}1+z=\frac{g\left(\dot{\lambda}\left(s\right),{U}_{S}\right)}{g\left(\dot{\lambda}\left(0\right),{U}_{O}\right)}.\end{array}$$ 
(37)

This general redshift formula is due to Kermack, McCrea, and Whittaker [
179]
. Their proof is based on the fact that
$g(\dot{\lambda},Y)$
is a constant for all Jacobi fields
$Y$
that connect
$\lambda $
with an infinitesimally neighboring lightlike geodesic. The same proof can be found, in a more elegant form, in [
41]
and in [
310]
, p. 109. An alternative proof, based on variational methods, was given by Schrödinger [
298]
.
Equation (
37 ) is in agreement with the Hamilton formalism for photons. Clearly, the redshift depends on the 4-velocity
${U}_{O}$
of the observer and on the 4-velocity
${U}_{S}$
of the light source. If a vector field
$U$
with
$g(U,U)=-1$
has been distinguished on
$\mathcal{\mathcal{M}}$
, we may choose one integral curve of
$U$
as the observer and all other integral curves of
$U$
as the light sources. Then the redshift becomes a function of the observational coordinates
$(s,\Psi ,\Theta ,\tau )$
. For
$s\to 0$
, the redshift goes to 0,
$$\begin{array}{c}z(s,\Psi ,\Theta ,\tau )=h(\Psi ,\Theta ,\tau )s+\mathcal{O}\left({s}^{2}\right),\end{array}$$ 
(38)

with a (generalized) Hubble parameter
$h(\Psi ,\Theta ,\tau )$
that depends on spatial direction and on time.
For criteria ensuring that
$h$
and the higher-order coefficients are independent of
$\Psi $
and
$\Theta $
, see [
151]
. If the redshift is known for one observer field
$U$
, it can be calculated for any other
$U$
, according to Equation ( 37 ), just by adding the usual specialrelativistic Doppler factors. Note that if
${U}_{O}$
is given, the redshift can be made zero along any one ray
$\lambda $
from
${p}_{O}$
by choosing the 4-velocities
${U}_{\lambda \left(s\right)}$
appropriately. This shows that
$z$
is a reasonable distance measure only for special situations, e.g., in cosmological models with
$U$
denoting the mean flow of luminous matter (“Hubble flow”).
In any case, the redshift is directly observable if the light source emits identifiable spectral lines.
For the calculation of Sagnac-like effects, the redshift formula (
37 ) can be evaluated piecewise along broken lightlike geodesics [
23]
.
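A simple consistency check of the general redshift formula is the special-relativistic Doppler effect. Since $g(\dot{\lambda},Y)$ is constant along the ray and the connecting Jacobi field equals $\Delta {\tau}_{O}{U}_{O}$ at the observer and $\Delta {\tau}_{S}{U}_{S}$ at the source, one has $1+z=\Delta {\tau}_{O}/\Delta {\tau}_{S}=g(\dot{\lambda}(s),{U}_{S})/g(\dot{\lambda}(0),{U}_{O})$. The following Python sketch evaluates this ratio in Minkowski spacetime, where the tangent vector of a lightlike geodesic is constant. The metric signature $(-,+,+,+)$, units with $c=1$, and the helper names `g` and `redshift` are assumptions of this sketch.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # Minkowski metric, signature (-,+,+,+), c = 1

def g(a, b):
    """Minkowski inner product of two 4-vectors."""
    return a @ ETA @ b

def redshift(K, U_obs, U_src):
    """Redshift z from 1 + z = g(K, U_src) / g(K, U_obs) for a constant tangent K."""
    return g(K, U_src) / g(K, U_obs) - 1.0

# Past-oriented lightlike tangent pointing from the observer towards the source (+x).
K = np.array([-1.0, 1.0, 0.0, 0.0])
U_obs = np.array([1.0, 0.0, 0.0, 0.0])           # observer at rest, g(K, U_obs) = 1
v = 0.6                                           # source receding along +x
gamma = 1.0 / np.sqrt(1.0 - v**2)
U_src = gamma * np.array([1.0, v, 0.0, 0.0])

z = redshift(K, U_obs, U_src)                     # approximately 1.0
```

The result agrees with the longitudinal Doppler formula $1+z=\sqrt{(1+v)/(1-v)}$, which gives $1+z=2$ for $v=0.6$.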
Angular diameter distances.
The notion of angular diameter distance is based on the intuitive idea that the farther an object is away the smaller it looks, according to the rule
$$\begin{array}{c}\text{object diameter}=\text{angle}\times \text{distance}.\end{array}$$ 
(39)

The formal definition needs the results of Section 2.3 on infinitesimally thin bundles. One considers a past-oriented lightlike geodesic
$s\mapsto \lambda \left(s\right)$
parametrized by affine distance, i.e.,
$\lambda \left(0\right)={p}_{O}$
and
$g\left(\dot{\lambda}\left(0\right),{U}_{O}\right)=1$
, and along
$\lambda $
an infinitesimally thin bundle with vertex at the observer, i.e., at
$s=0$
. Then the shape parameters
${D}_{+}\left(s\right)$
and
${D}_{-}\left(s\right)$
(recall Figure 3 ) satisfy the initial conditions
${D}_{\pm}\left(0\right)=0$
and
${\dot{D}}_{\pm}\left(0\right)=1$
. They have the following physical meaning. If the observer sees a circular image of (small) angular diameter
$\alpha $
on his or her sky, the (small but extended) light source at affine distance
$s$
actually has an elliptical cross-section with extremal diameters
$\alpha \left|{D}_{\pm}\left(s\right)\right|$
.
It is therefore reasonable to call
${D}_{+}$
and
${D}_{-}$
the extremal angular diameter distances. Near the vertex,
${D}_{+}$
and
${D}_{-}$
are monotonically increasing functions of the affine distance,
${D}_{\pm}\left(s\right)=s+\mathcal{O}\left({s}^{2}\right)$
.
Farther away from the vertex, however, they may become decreasing, so the functions
$s\mapsto {D}_{+}\left(s\right)$
and
$s\mapsto {D}_{-}\left(s\right)$
need not be invertible. At a caustic point of multiplicity one, one of the two functions
${D}_{+}$
and
${D}_{-}$
changes sign; at a caustic point of multiplicity two, both change sign (recall Section 2.3 ).
The image of a light source at affine distance
$s$
is said to have even parity if
${D}_{+}\left(s\right){D}_{-}\left(s\right)>0$
and odd parity if
${D}_{+}\left(s\right){D}_{-}\left(s\right)<0$
. Images with odd parity show the neighborhood of the light source side-inverted in comparison to images with even parity. Clearly,
${D}_{+}$
and
${D}_{-}$
are reasonable distance measures only in a neighborhood of the vertex where they are monotonically increasing.
However, the physical relevance of
${D}_{+}$
and
${D}_{-}$
lies in the fact that they relate cross-sectional diameters at the source to angular diameters at the observer, and this is always true, even beyond caustic points.
${D}_{+}$
and
${D}_{-}$
depend on the 4-velocity
${U}_{O}$
of the observer but not on the 4-velocity
${U}_{S}$
of the source. This reflects the fact that the angular diameter of an image on the observer's sky is subject to aberration whereas the cross-sectional diameter of an infinitesimally thin bundle has an invariant meaning (recall Section 2.3 ). Hence, if the observer's worldline
${\gamma}_{O}$
has been specified,
${D}_{+}$
and
${D}_{-}$
are well-defined functions of the observational coordinates
$(s,\Psi ,\Theta ,\tau )$
.
Area distance.
The area distance
${D}_{\text{area}}$
is defined according to the idea
$$\begin{array}{c}\text{object area}=\text{solid angle}\times {\text{distance}}^{2}.\end{array}$$ 
(40)

As a formal definition for
${D}_{\text{area}}$
, in terms of the extremal angular diameter distances
${D}_{+}$
and
${D}_{-}$
as functions of affine distance
$s$
, we use the equation
$$\begin{array}{c}{D}_{\text{area}}\left(s\right)=\sqrt{\left|{D}_{+}\left(s\right){D}_{-}\left(s\right)\right|}.\end{array}$$ 
(41)

${D}_{\text{area}}(s{)}^{2}$
indeed relates, for a bundle with vertex at the observer, the cross-sectional area at the source to the opening solid angle at the observer. Such a bundle has a caustic point exactly at those points where
${D}_{\text{area}}\left(s\right)=0$
. The area distance is often called “angular diameter distance” although, as indicated by Equation ( 41 ), the name “averaged angular diameter distance” would be more appropriate. Just as
${D}_{+}$
and
${D}_{-}$
, the area distance depends on the 4-velocity
${U}_{O}$
of the observer but not on the 4-velocity
${U}_{S}$
of the light source. The area distance is observable for a light source whose true size is known (or can be reasonably estimated). It is sometimes convenient to introduce the magnification or amplification factor
$$\begin{array}{c}\mu \left(s\right)=\frac{{s}^{2}}{{D}_{+}\left(s\right){D}_{-}\left(s\right)}.\end{array}$$ 
(42)

The absolute value of
$\mu $
determines the area distance, and the sign of
$\mu $
determines the parity.
In Minkowski spacetime,
${D}_{\pm}\left(s\right)=s$
and, thus,
$\mu \left(s\right)=1$
. Hence,
$\left|\mu \left(s\right)\right|>1$
means that a (small but extended) light source at affine distance
$s$
subtends a larger solid angle on the observer's sky than a light source of the same size at the same affine distance in Minkowski spacetime. Note that in a multiple-imaging situation the individual images may have different affine distances. Thus, the relative magnification factor of two images is not directly observable. This is an important difference to the magnification factor that is used in the quasi-Newtonian approximation formalism of lensing. The latter is defined by comparison with an “unlensed image” (see, e.g., [
297]
), a notion that makes sense only if the metric is viewed as a perturbation of some “background” metric.
One can derive a differential equation for the area distance (or, equivalently, for the magnification
factor) as a function of affine distance in the following way. On every parameter interval where
${D}_{+}{D}_{-}$
has no zeros, the real part of Equation ( 27 ) shows that the area distance is related to the expansion by
$$\begin{array}{c}{\dot{D}}_{\text{area}}=\theta {D}_{\text{area}}.\end{array}$$ 
(43)

Insertion into the Sachs equation ( 25 ) for
$\theta =\varrho $
gives the focusing equation
$$\begin{array}{c}{\ddot{D}}_{\text{area}}=-\left({\left|\sigma \right|}^{2}+\frac{1}{2}Ric(\dot{\lambda},\dot{\lambda})\right){D}_{\text{area}}.\end{array}$$ 
(44)
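
The intermediate step can be spelled out as follows. This is a sketch which assumes that the Sachs equation ( 25 ), not reproduced in this section, takes the standard form $\dot{\theta}=-{\theta}^{2}-{\left|\sigma \right|}^{2}-\frac{1}{2}\text{Ric}(\dot{\lambda},\dot{\lambda})$ for a bundle with real $\varrho =\theta $. Differentiating ${\dot{D}}_{\text{area}}=\theta {D}_{\text{area}}$ once more gives
$$\begin{array}{c}{\ddot{D}}_{\text{area}}=\dot{\theta}{D}_{\text{area}}+\theta {\dot{D}}_{\text{area}}=\left(\dot{\theta}+{\theta}^{2}\right){D}_{\text{area}}=-\left({\left|\sigma \right|}^{2}+\frac{1}{2}\text{Ric}(\dot{\lambda},\dot{\lambda})\right){D}_{\text{area}},\end{array}$$
so that the ${\theta}^{2}$ terms cancel and only the shear and Ricci terms remain.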

Between the vertex at
$s=0$
and the first conjugate point (caustic point),
${D}_{\text{area}}$
is determined by Equation ( 44 ) and the initial conditions
$$\begin{array}{c}{D}_{\text{area}}\left(0\right)=0,{\dot{D}}_{\text{area}}\left(0\right)=1.\end{array}$$ 
(45)

The Ricci term in Equation ( 44 ) is nonnegative if Einstein's field equation holds and if the energy density is nonnegative for all observers (“weak energy condition”). Then Equations ( 44 , 45 ) imply that
$$\begin{array}{c}{D}_{\text{area}}\left(s\right)\le s,\end{array}$$ 
(46)

i.e.,
$1\le \mu \left(s\right)$
, for all
$s$
between the vertex at
$s=0$
and the first conjugate point. In Minkowski spacetime, Equation ( 46 ) holds with equality. Hence, Equation ( 46 ) says that the gravitational field has a focusing, as opposed to a defocusing, effect. This is sometimes called the focusing theorem.
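The focusing theorem can be checked numerically with a minimal sketch. The Python code below integrates $\ddot{D}=-q\left(s\right)D$ with the initial conditions ( 45 ), where $q$ stands for the combination ${\left|\sigma \right|}^{2}+\frac{1}{2}\text{Ric}(\dot{\lambda},\dot{\lambda})$. The function name, the Runge–Kutta scheme and the constant source term $q=1$ are illustrative assumptions, not taken from the text. For constant $q=1$ the exact solution is $\mathrm{sin}\,s$, with the first conjugate point at $s=\pi $.

```python
import math

def integrate_area_distance(source, s_max, n_steps=2000):
    """Integrate D'' = -source(s) * D with D(0) = 0, D'(0) = 1 (classical RK4).

    `source` is a hypothetical callable returning |sigma|^2 + Ric/2 at s.
    Returns a list of (s, D) pairs.
    """
    h = s_max / n_steps
    s, d, v = 0.0, 0.0, 1.0          # v is the first derivative of D
    samples = [(s, d)]

    def acc(s_, d_):
        return -source(s_) * d_

    for _ in range(n_steps):
        k1d, k1v = v, acc(s, d)
        k2d, k2v = v + 0.5 * h * k1v, acc(s + 0.5 * h, d + 0.5 * h * k1d)
        k3d, k3v = v + 0.5 * h * k2v, acc(s + 0.5 * h, d + 0.5 * h * k2d)
        k4d, k4v = v + h * k3v, acc(s + h, d + h * k3d)
        d += h * (k1d + 2 * k2d + 2 * k3d + k4d) / 6.0
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        s += h
        samples.append((s, d))
    return samples

# Illustrative non-negative source term; s_max = 3 stays before the conjugate point at pi.
samples = integrate_area_distance(lambda s: 1.0, s_max=3.0)
focusing_holds = all(d <= s + 1e-12 for s, d in samples)
```

With a non-negative source term the computed ${D}_{\text{area}}$ stays below $s$ on the whole interval, in agreement with Equation ( 46 ). With $q=0$ the integration reproduces the Minkowski result ${D}_{\text{area}}=s$.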
Corrected luminosity distance.
The idea of defining distance measures in terms of bundle cross-sections dates back to Tolman [
321]
and Whittaker [
351]
. Originally, this idea was applied not to bundles with vertex at the observer but rather to bundles with vertex at the light source. The resulting analogue of the area distance is the so-called corrected luminosity distance
${D}_{\text{lum}}^{\prime}$
. It relates, for a bundle with vertex at the light source, the cross-sectional area at the observer to the opening solid angle at the light source.
Owing to Etherington's reciprocity law (
35 ), area distance and corrected luminosity distance are related by
$$\begin{array}{c}{D}_{lum}^{\prime}=(1+z){D}_{\text{area}}.\end{array}$$ 
(47)

The redshift factor has its origin in the fact that the definition of
${D}_{lum}^{\prime}$
refers to an affine parametrization adapted to
${U}_{S}$
, and the definition of
${D}_{\text{area}}$
refers to an affine parametrization adapted to
${U}_{O}$
. While
${D}_{\text{area}}$
depends on
${U}_{O}$
but not on
${U}_{S}$
,
${D}_{lum}^{\prime}$
depends on
${U}_{S}$
but not on
${U}_{O}$
.
Luminosity distance.
The physical meaning of the corrected luminosity distance is most easily understood in the photon picture. For photons isotropically emitted from a light source, the percentage that hit a prescribed area at the observer is proportional to
$1/({D}_{lum}^{\prime}{)}^{2}$
. As the energy of each photon undergoes a redshift, the energy flux at the observer is proportional to
$1/({D}_{lum}{)}^{2}$
, where
$$\begin{array}{c}{D}_{lum}=(1+z){D}_{lum}^{\prime}=(1+z{)}^{2}{D}_{\text{area}}.\end{array}$$ 
(48)

Thus,
${D}_{lum}$
is the relevant quantity for calculating the luminosity (apparent brightness) of pointlike light sources (see Equation ( 52 )). For this reason
${D}_{lum}$
is called the (uncorrected) luminosity distance. The observation that the purely geometric quantity
${D}_{lum}^{\prime}$
must be modified by an additional redshift factor to give the energy flux is due to Walker [
342]
.
${D}_{lum}$
depends on the 4-velocity
${U}_{O}$
of the observer and on the 4-velocity
${U}_{S}$
of the light source.
${D}_{lum}$
and
${D}_{lum}^{\prime}$
can be viewed as functions of the observational coordinates
$(s,\Psi ,\Theta ,\tau )$
if a vector field
$U$
with
$g(U,U)=-1$
has been distinguished, one integral curve of
$U$
is chosen as the observer, and the other integral curves of
$U$
are chosen as the light sources. In that case Equation ( 38 ) implies that not only
${D}_{\text{area}}\left(s\right)$
but also
${D}_{lum}\left(s\right)$
and
${D}_{lum}^{\prime}\left(s\right)$
are of the form
$s+\mathcal{O}\left({s}^{2}\right)$
. Thus, near the observer all three distance measures coincide with Euclidean distance in the observer's rest space.
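The relations ( 47 ) and ( 48 ) between the three distance measures are simple enough to be written as a short Python sketch. The function names and the sample numbers are hypothetical. The area distance and the redshift are assumed to be given.

```python
def corrected_luminosity_distance(d_area, z):
    """D'_lum = (1 + z) * D_area, Equation (47)."""
    if z <= -1.0:
        raise ValueError("redshift must satisfy z > -1")
    return (1.0 + z) * d_area

def luminosity_distance(d_area, z):
    """D_lum = (1 + z) * D'_lum = (1 + z)^2 * D_area, Equation (48)."""
    return (1.0 + z) * corrected_luminosity_distance(d_area, z)

# Hypothetical sample values in arbitrary length units.
d_area, z = 100.0, 1.0
d_lum_corrected = corrected_luminosity_distance(d_area, z)
d_lum = luminosity_distance(d_area, z)
```

For $z=0$ all three distances coincide, and for small $s$ the redshift between integral curves of $U$ tends to zero, which is consistent with the common leading term $s$ noted above.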
Parallax distance.
In an arbitrary spacetime, we fix an observation event
${p}_{O}$
and the observer's 4-velocity
${U}_{O}$
. We consider a past-oriented lightlike geodesic
$\lambda $
parametrized by affine distance,
$\lambda \left(0\right)={p}_{O}$
and
$g\left(\dot{\lambda}\left(0\right),{U}_{O}\right)=1$
. To a light source passing through the event
$\lambda \left(s\right)$
we assign the (averaged) parallax distance
${D}_{par}\left(s\right)=-\theta (0{)}^{-1}$
, where
$\theta $
is the expansion of an infinitesimally thin bundle with vertex at
$\lambda \left(s\right)$
. This definition follows [
171]
. Its relevance in view of cosmology was discussed in detail by Rosquist [
287]
.
${D}_{par}$
can be measured by performing the standard trigonometric parallax method of elementary Euclidean geometry, with the observer at
${p}_{O}$
and an assistant observer at the perimeter of the bundle, and then averaging over all possible positions of the assistant. Note that the method refers to a bundle with vertex at the light source, i.e., to light rays that leave the light source simultaneously. (Averaging is not necessary if this bundle is circular.)
${D}_{par}$
depends on the 4-velocity of the observer but not on the 4-velocity of the light source. To within first-order approximation near the observer it coincides with affine distance (recall Equation ( 32 )). For the potential observational relevance of
${D}_{par}$
see [
287]
, and [
297]
, p. 509.
In view of lensing,
${D}_{+}$
,
${D}_{-}$
, and
${D}_{lum}$
are the most important distance measures because they are related to image distortion (see Section 2.5 ) and to the brightness of images (see Section 2.6 ).
In spacetimes with many symmetries, these quantities can be explicitly calculated (see Section
4.1 for conformally flat spacetimes, and Section 4.3 for spherically symmetric static spacetimes). This is impossible in a spacetime without symmetries, in particular in a realistic cosmological model with inhomogeneities (“clumpy universe”). Following Kristian and Sachs [
189]
, one often uses series expansions with respect to
$s$
. For statistical considerations one may work with the focusing equation in a Friedmann–Robertson–Walker spacetime with average density (see Section 4.1 ), or with a heuristically modified focusing equation taking clumps into account. The latter leads to the so-called Dyer–Roeder distance [
86,
87]
which is discussed in several textbooks (see, e.g., [
297]
). (For pre-Dyer–Roeder papers on optics in cosmological models with inhomogeneities, see the historical notes in [
173]
.) As overdensities have a focusing and underdensities have a defocusing effect, it is widely believed (following [
344]
) that after averaging over sufficiently large angular scales the Friedmann–Robertson–Walker calculation gives the correct distance-redshift relation.
However, it was argued by Ellis, Bassett, and Dunsby [
99]
that caustics produced by the lensing effect of overdensities lead to a systematic bias towards smaller angular sizes (“shrinking”). For a spherically symmetric inhomogeneity, the effect on the distance-redshift relation can be calculated analytically [
230]
. For thorough discussions of light propagation in a clumpy universe also see Pyne and Birkinshaw [
283]
, and Holz and Wald [
160]
.
2.5 Image distortion
In special relativity, a spherical object always shows a circular outline on the observer's sky, independent of its state of motion [
256,
319]
. In general relativity, this is no longer true; a small sphere usually shows an elliptic outline on the observer's sky. This distortion is caused by the shearing effect of the spacetime geometry on light bundles. For the calculation of image distortion we need the material of Sections 2.3 and 2.4 . For an observer with 4-velocity
${U}_{O}$
at an event
${p}_{O}$
, there is a unique affine parametrization
$s\mapsto \lambda \left(s\right)$
for each lightlike geodesic through
${p}_{O}$
such that
$\lambda \left(0\right)={p}_{O}$
and
$g\left(\dot{\lambda}\left(0\right),{U}_{O}\right)=1$
. Around each of these
$\lambda $
we can consider an infinitesimally thin bundle with vertex at
$s=0$
. The elliptical cross-section of this bundle can be characterized by the shape parameters
${D}_{+}\left(s\right)$
,
${D}_{-}\left(s\right)$
and
$\chi \left(s\right)$
(recall Figure 3 ). In the terminology of Section 2.4 ,
$s$
is the affine distance, and
${D}_{+}\left(s\right)$
and
${D}_{-}\left(s\right)$
are the extremal angular diameter distances. The complex quantity
$$\begin{array}{c}\epsilon \left(s\right)=\left(\frac{{D}_{+}\left(s\right)}{{D}_{-}\left(s\right)}-\frac{{D}_{-}\left(s\right)}{{D}_{+}\left(s\right)}\right){e}^{2i\chi \left(s\right)}\end{array}$$ 
(49)

is called the ellipticity of the bundle. The phase of
$\epsilon $
determines the position angle of the elliptical cross-section of the bundle with respect to the Sachs basis. The absolute value of
$\epsilon \left(s\right)$
determines the eccentricity of this cross-section;
$\epsilon \left(s\right)=0$
indicates a circular cross-section and
$\left|\epsilon \left(s\right)\right|=\infty $
indicates a caustic point of multiplicity one. (It is also common to use other measures for the eccentricity, e.g.,
$({D}_{+}-{D}_{-})/({D}_{+}+{D}_{-})$
.) From Equation ( 27 ) with
$\varrho =\theta $
we get the derivative of
$\epsilon $
with respect to the affine distance
$s$
,
$$\begin{array}{c}\dot{\epsilon}=2\sigma \sqrt{{\left|\epsilon \right|}^{2}+4}.\end{array}$$ 
(50)

The initial conditions
${D}_{\pm}\left(0\right)=0$
,
${\dot{D}}_{\pm}\left(0\right)=1$
imply
$$\begin{array}{c}\epsilon \left(0\right)=0.\end{array}$$ 
(51)

Equation ( 50 ) and Equation ( 51 ) determine
$\epsilon $
if the shear
$\sigma $
is known. The shear, in turn, is determined by the Sachs equations ( 25 , 26 ) and the initial conditions ( 32 , 33 ) with
${s}_{0}=0$
for
$\theta (=\varrho )$
and
$\sigma $
.
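A minimal numerical sketch of this initial value problem is given below. It integrates Equation ( 50 ), read here as $\dot{\epsilon}=2\sigma \sqrt{{\left|\epsilon \right|}^{2}+4}$, from $\epsilon \left(0\right)=0$ for a prescribed complex shear $\sigma \left(s\right)$. The function name, the Runge–Kutta scheme and the constant shear used in the sample call are assumptions made only for illustration. A constant shear is not a realistic profile for a bundle with vertex at $s=0$, but it admits the closed-form solution $\epsilon \left(s\right)=2\,\mathrm{sinh}\left(2\left|\sigma \right|s\right){e}^{i\,\mathrm{arg}\,\sigma}$ against which the integration can be checked.

```python
import cmath
import math

def integrate_ellipticity(shear, s_max, n_steps=2000):
    """Integrate eps' = 2 * shear(s) * sqrt(|eps|^2 + 4), eps(0) = 0 (classical RK4).

    `shear` is a hypothetical callable returning the complex shear sigma(s).
    Returns the complex ellipticity eps(s_max).
    """
    def rhs(s, eps):
        return 2.0 * shear(s) * math.sqrt(abs(eps) ** 2 + 4.0)

    h = s_max / n_steps
    s, eps = 0.0, 0j
    for _ in range(n_steps):
        k1 = rhs(s, eps)
        k2 = rhs(s + 0.5 * h, eps + 0.5 * h * k1)
        k3 = rhs(s + 0.5 * h, eps + 0.5 * h * k2)
        k4 = rhs(s + h, eps + h * k3)
        eps += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        s += h
    return eps

# Illustrative constant shear with modulus 0.3 and phase 0.5 (not a physical profile).
sigma0 = 0.3 * cmath.exp(0.5j)
eps_numeric = integrate_ellipticity(lambda s: sigma0, s_max=1.0)
eps_exact = 2.0 * math.sinh(2.0 * 0.3 * 1.0) * cmath.exp(0.5j)
```

For vanishing shear the integration returns $\epsilon =0$ along the whole ray, which is the situation discussed in the next paragraphs.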
It is advisable to change from the
$\epsilon $
determined this way to
$\varepsilon =-\overline{\epsilon}$
. This transformation corresponds to replacing the Jacobi matrix
$\mathit{D}$
by its inverse. The original quantity
$\epsilon \left(s\right)$
gives the true shape of objects at affine distance
$s$
that show a circular image on the observer's sky. The new quantity
$\varepsilon \left(s\right)$
gives the observed shape for objects at affine distance
$s$
that actually have a circular cross-section. In other words, if a (small) spherical body at affine distance
$s$
is observed, the ellipticity of its image on the observer's sky is given by
$\varepsilon \left(s\right)$
.
By Equations (
50 , 51 ),
$\epsilon $
vanishes along the entire ray if and only if the shear
$\sigma $
vanishes along the entire ray. By Equations ( 26 , 33 ), the shear vanishes along the entire ray if and only if the conformal curvature term
${\psi}_{0}$
vanishes along the entire ray. The latter condition means that
$K=\dot{\lambda}$
is tangent to a principal null direction of the conformal curvature tensor (see, e.g., Chandrasekhar [
54]
). At a point where the conformal curvature tensor is not zero, there are at most four different principal null directions. Hence, the distortion effect vanishes along all light rays if and only if the conformal curvature vanishes everywhere, i.e., if and only if the spacetime is conformally flat. This result is due to Sachs [
290]
. An alternative proof, based on expressions for image distortions in terms of the exponential map, was given by Hasse [
148]
.
For any observer, the distortion measure
$\varepsilon =-\overline{\epsilon}$
is defined along every light ray from every point of the observer's worldline. This gives
$\varepsilon $
as a function of the observational coordinates
$(s,\Psi ,\Theta ,\tau )$
(recall Section 2.1 , in particular Equation ( 4 )). If we fix
$\tau $
and
$s$
,
$\varepsilon $
is a function on the observer's sky. (Instead of
$s$
, one may choose any of the distance measures discussed in Section 2.4 , provided it is a unique function of
$s$
.) In spacetimes with sufficiently many symmetries, this function can be explicitly determined in terms of integrals over the metric function. This will be worked out for spherically symmetric static spacetimes in Section 4.3 . A general consideration of image distortion and example calculations can also be found in papers by Frittelli, Kling and Newman [
120,
119]
.
Frittelli and Oberst [
126]
calculate image distortion by a “thick gravitational lens” model within a spacetime setting.
In cases where it is not possible to determine
$\varepsilon $
by explicitly integrating the relevant differential equations, one may consider series expansions with respect to the affine parameter
$s$
. This technique, which is of particular relevance in view of cosmology, dates back to Kristian and Sachs [
189]
who introduced image distortion as an observable in cosmology. In lowest nonvanishing order,
$\varepsilon (s,\Psi ,\Theta ,{\tau}_{O})$
is quadratic with respect to
$s$
and completely determined by the conformal curvature tensor at the observation event
${p}_{O}=\gamma \left({\tau}_{O}\right)$
, as can be read from Equations ( 50 , 51 , 33 ).
One can classify all possible distortion patterns on the observer's sky in terms of the Petrov type of the Weyl tensor [
56]
. As outlined in [
56]
, these patterns are closely related to what Penrose and Rindler [
261]
call the fingerprint of the Weyl tensor. At all observation events where the Weyl tensor is nonzero, the following is true. There are at most four points on the observer's sky where the distortion vanishes, corresponding to the four (not necessarily distinct) principal null directions of the Weyl tensor. For type
$N$
, where all four principal null directions coincide, the distortion pattern is shown in Figure 5 .
Figure 5
: Distortion pattern. The picture shows, in a Mercator projection with
$\Phi $
as the horizontal and
$\Theta $
as the vertical coordinate, the celestial sphere of an observer at a spacetime point where the Weyl tensor is of Petrov type
$N$
. The pattern indicates the elliptical images of spherical objects to within lowest nontrivial order with respect to distance. The length of each line segment is a measure for the eccentricity of the elliptical image, the direction of the line segment indicates its major axis. The distortion effect vanishes at the north pole
$\Theta =0$
which corresponds to the fourfold principal null direction. Contrary to the other Petrov types, for type N the pattern is universal up to an overall scaling factor. The picture is taken from [
56]
where the distortion patterns for the other Petrov types are given as well.
The distortion effect has been routinely observed since the mid-1980s in the form of arcs and (radio) rings (see [
297,
274,
343]
for an overview). In these cases a distant galaxy appears strongly elongated in one direction. Such strong elongations occur near a caustic point of multiplicity one where
$\left|\varepsilon \right|\to \infty $
. In the case of rings and (long) arcs, the entire bundle cannot be treated as infinitesimally thin, i.e., a theoretical description of the effect requires an integration. For the idealized case of a point source, images in the form of (1-dimensional) rings on the observer's sky occur in cases of rotational symmetry and are usually called “Einstein rings” (see Section 4.3 ). The rings that are actually observed show extended sources in situations close to rotational symmetry.
For the majority of galaxies that are not distorted into arcs or rings, there is a “weak lensing” effect on the apparent shape that can be investigated statistically. The method is based on the assumption that there is no preferred direction in the universe, i.e., that the axes of (approximately spheroidal) galaxies are randomly distributed. So, without a distortion effect, the axes of galaxy images should make a randomly distributed angle with the
$(\Psi ,\Theta )$
grid on the observer's sky. Any deviation from a random distribution is to be attributed to a distortion effect, produced by the gravitational field of intervening masses. With the help of the quasi-Newtonian approximation, this method has been elaborated into a sophisticated formalism for determining mass distributions, projected onto the plane perpendicular to the line of sight, from observed image distortions. This is one of the most important astrophysical tools for detecting (dark) matter. It has been used to determine the mass distribution in galaxies and galaxy clusters, and more recently observations of image distortions produced by large-scale structure have begun (see [
22]
for a detailed review).
From a methodological point of view, it would be desirable to analyse this important line of
astronomical research within a spacetime setting. This should give prominence to the role of the conformal curvature tensor.
Another interesting way of observing weak image distortions is possible for sources that emit linearly polarized radiation. (This is true for many radio galaxies. Polarization measurements are also relevant for strong-lensing situations; see Schneider, Ehlers, and Falco [
297]
, p. 82 for an example.) The method is based on the geometric optics approximation of Maxwell's theory. In this approximation, the polarization vector is parallel along each ray between source and observer [
88]
(cf., e.g., [
225]
, p. 577). We may, thus, use the polarization vector as a realization of the Sachs basis vector
${E}_{1}$
. If the light source is a spheroidal celestial body (e.g., an elliptic galaxy), it is reasonable to assume that at the light source the polarization direction is aligned with one of the axes, i.e.,
$2\chi \left(s\right)/\pi \in \mathbb{Z}$
. A distortion effect is verified if the observed polarization direction is not aligned with an axis of the image,
$2\chi \left(0\right)/\pi \notin \mathbb{Z}$
. It is to be emphasized that the deviation of the polarization direction from the elongation axis is not the result of a rotation (the bundles under consideration have a vertex and are, thus, twist-free) but rather of successive shearing processes along the ray. Also, the effect has nothing to do with the rotation of an observer field. It is a pure conformal curvature effect. Related misunderstandings have been clarified by Panov and Sbytov [
253,
254]
. The distortion effect on the polarization plane has, so far, not been observed.
(Panov and Sbytov [
253]
have clearly shown that an effect observed by Birch [
31]
, even if real, cannot be attributed to distortion.) Its future detectability is estimated, for distant radio sources, in [
316]
.
2.6 Brightness of images
For calculating the brightness of images we need the definitions and results of Section 2.4 .
In particular we need the luminosity distance
${D}_{lum}$
and its relation to other distance measures.
We begin by considering a point source (worldline) that emits isotropically with (bolometric, i.e., integrated over all frequencies) luminosity
$L$
. By definition of
${D}_{lum}$
, in this case the energy flux at the observer is
$$\begin{array}{c}F=\frac{L}{4\pi {{D}_{lum}}^{2}}.\end{array}$$ 
(52)

$F$
is a measure for the brightness of the image on the observer's sky. The magnitude
$m$
used by astronomers is essentially the negative logarithm of
$F$
,
$$\begin{array}{c}m=2.5\,{\log}_{10}\left({{D}_{lum}}^{2}\right)-2.5\,{\log}_{10}\left(L\right)+{m}_{0},\end{array}$$ 
(53)

with
${m}_{0}$
being a universal constant. In Equation ( 52 ),
${D}_{lum}$
can be expressed in terms of the area distance
${D}_{\text{area}}$
and the redshift
$z$
with the help of the general relation ( 48 ). This demonstrates that the magnification factor
$\mu $
, which is defined by Equation ( 42 ), admits the following reinterpretation.
$\left|\mu \left(s\right)\right|$
relates the flux from a point source at affine distance
$s$
to the flux from a point source with the same luminosity at the same affine distance and at the same redshift in Minkowski spacetime.
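Equations ( 52 ) and ( 53 ) translate directly into a short Python sketch. Equation ( 53 ) is read here as $m=2.5\,{\log}_{10}\left({{D}_{lum}}^{2}\right)-2.5\,{\log}_{10}\left(L\right)+{m}_{0}$. The function names, the default value of ${m}_{0}$ and the sample numbers are hypothetical, and units are left arbitrary.

```python
import math

def energy_flux(luminosity, d_lum):
    """Energy flux F = L / (4 pi D_lum^2) of an isotropic point source, Equation (52)."""
    return luminosity / (4.0 * math.pi * d_lum ** 2)

def magnitude(luminosity, d_lum, m0=0.0):
    """Magnitude m = 2.5 log10(D_lum^2) - 2.5 log10(L) + m0, Equation (53).

    m0 is the universal constant of the text; its value here is a placeholder.
    """
    return 2.5 * math.log10(d_lum ** 2) - 2.5 * math.log10(luminosity) + m0

# Doubling the luminosity distance at fixed luminosity (arbitrary units).
flux_ratio = energy_flux(1.0, 2.0) / energy_flux(1.0, 1.0)
delta_m = magnitude(1.0, 2.0) - magnitude(1.0, 1.0)
```

Doubling ${D}_{lum}$ reduces the flux by a factor of four and raises the magnitude by $5\,{\log}_{10}2$. Fainter images thus have larger magnitudes, in line with $m$ being essentially the negative logarithm of $F$.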
${D}_{lum}$
can be explicitly calculated in spacetimes where the Jacobi fields along lightlike geodesics can be explicitly determined. This is true, e.g., in spherically symmetric and static spacetimes where the extremal angular diameter distances
${D}_{+}$
and
${D}_{-}$
can be calculated in terms of integrals over the metric coefficients. The resulting formulas are given in Section 4.3 below. Knowledge of
${D}_{+}$
and
${D}_{-}$
immediately gives the area distance
${D}_{\text{area}}$
via Equation ( 41 ).
${D}_{\text{area}}$
together with the redshift determines
${D}_{lum}$
via Equation ( 48 ). Such an explicit calculation is, of course, possible only for spacetimes with many symmetries.
By Equation (
48 ), the zeros of
${D}_{lum}$
coincide with the zeros of
${D}_{\text{area}}$
, i.e., with the caustic points.
Hence, in the ray-optical treatment a point source is infinitely bright (magnitude
$m=-\infty $
) if it passes through the caustic of the observer's past light cone. A wave-optical treatment shows that the energy flux at the observer is actually bounded by diffraction. In the quasi-Newtonian approximation formalism, this was demonstrated by an explicit calculation for light rays deflected by a spheroidal mass by Ohanian [
244]
(cf. [
297]
, p. 220). Quite generally, the ray-optical calculation of the energy flux gives incorrect results if, for two different light paths from the source worldline to the observation event, the time delay is smaller than or approximately equal to the coherence time. Then interference effects give rise to frequency-dependent corrections to the energy flux that have to be calculated with the help of wave optics. In multiple-imaging situations, the time delay decreases with decreasing mass of the deflector. If the deflector is a cluster of galaxies, a galaxy, or a star, interference effects can be ignored. Gould [
144]
suggested that they could be observable if a deflector of about
${10}^{-15}$
Solar masses happens to be close to the line of sight to a gamma-ray burster. In this case, the angular separation between the (unresolvable) images would be of the order
${10}^{-15}$
arcseconds (“femtolensing”). Interference effects could make a frequency-dependent imprint on the total intensity. Ulmer and Goodman [
326]
discussed related effects for deflectors of up to
${10}^{-11}$
Solar masses. Femtolensing has not been observed so far. However, it is an interesting future perspective for lensing effects where wave optics has to be taken into account.
This would give practical relevance to the theoretical work of Herlt and Stephani [
155,
156]
who calculated gravitational lensing on the basis of wave optics in the Schwarzschild spacetime.
We now turn to the case of an extended source, whose surface makes up a 3-dimensional timelike submanifold