Global Regularity for the Yang–Mills Equations on High Dimensional Minkowski Space

Joachim Krieger

Jacob Sterbenz

1 Introduction

In this work we investigate the global in time regularity properties of the Yang-Mills equations on high dimensional Minkowski space with compact semi-simple gauge group $G$. Specifically, we show that if a certain gauge covariant Sobolev norm is small, the so-called critical regularity $\dot{H}_A^{\frac{n-4}{2}}$, and the dimension satisfies $6 \leq n$, then a global solution exists and remains regular for all times given that the initial data is regular. This is in the same spirit as the recent result [8] for the Maxwell-Klein-Gordon system, as well as earlier results for high dimensional wave-maps (see [11], [6], [9], and [7]). Our approach shares many similarities with those works, whose underlying philosophy is basically the same: to introduce Coulomb type gauges in order to treat a specific potential term as a quadratic error. In our setup, we use a non-abelian variant of the remarkable parametrix construction contained in [8], in conjunction with a version of the Uhlenbeck lemma [13] on the existence of global Coulomb gauges. This latter result has been used for high dimensional wave-maps to globally “renormalize” the equation so that the existence theory can be treated directly through Strichartz estimates applied to multi-linear expressions.
In the present situation, as was the case with the Maxwell-Klein-Gordon system, the corresponding renormalization procedure is necessarily more involved because it needs to be done separately for each distinct direction in phase space. That is, we provide a renormalization of the Yang-Mills equations through the construction of a Fourier integral operator with $G$-valued phase. The construction and estimation of such an object relies heavily on elliptic-Coulomb theory, primarily because the $G$-valued phase function cannot be localized within a neighborhood of any given point on the group, owing to the critical nature of the problem (if you like, there is a logarithmic “twisting” of the group element as one moves around in physical space; fortunately the group is compact so this doesn't ruin things).
To get things started, we now give a simple gauge covariant description of the equations we are considering. The (hyperbolic) Yang-Mills equations arise as the evolution equations for a connection on the bundle $V = \mathcal{M}^n \times \mathfrak{g}$, where $\mathcal{M}^n$ is some $n$ (spatial) dimensional Minkowski space, with metric $g := (-1, 1, \ldots, 1)$ in inertial coordinates $(x^0, x^i)$, and $\mathfrak{g}$ is the Lie algebra of some compact semi-simple Lie group $G$. Here we are considering $V$ with the $Ad(G)$ gauge structure. If $\phi$ is any section of $V$ over $\mathcal{M}^n$, then a connection assigns to every vector-field $X$ on the base $\mathcal{M}^n$ a derivative which we denote as $D_X$, such that the following Leibniz rule is satisfied for every scalar field $f$: $D_X(f\phi) = X(f)\phi + f D_X\phi$. In this setup, we assume that $V$ is equipped with an $Ad(G)$ invariant metric $\langle \cdot , \cdot \rangle$ which respects the action of $D$. That is, one has the formula:
\[ d\langle \phi , \psi \rangle = \langle D\phi , \psi \rangle + \langle \phi , D\psi \rangle . \tag{1} \]
In the present situation we will take $\langle \cdot , \cdot \rangle$ to be the Killing form on $\mathfrak{g}$. The curvature associated to $D$ is the $\mathfrak{g}$ valued two-form $F$ which arises from the commutation of covariant derivatives and is defined via the formula: $D_X D_Y \phi - D_Y D_X \phi - D_{[X,Y]}\phi = [F(X,Y), \phi]$. We say that the connection $D$ satisfies the Yang-Mills equations if its curvature is a (formal) local minimum of the following Maxwell type functional:
\[ \mathcal{L}[F] = -\frac{1}{4}\int_{\mathcal{M}^n} \langle F_{\alpha\beta} , F^{\alpha\beta} \rangle \, DV_{\mathcal{M}^n} . \tag{2} \]
The Euler-Lagrange equations of (2) read:
\[ D^\beta F_{\alpha\beta} = 0 . \tag{3} \]
Also, from the fact that $F$ arises as the curvature of some connection, we have that the following identity known as “Bianchi” is satisfied:
\[ D_{[\alpha} F_{\beta\gamma]} = 0 . \tag{4} \]
From now on we will refer to the system (3)–(4) as the first order Yang–Mills equations (FYM).
As we have already mentioned, our aim is to study the regularity properties of the Cauchy problem for the (FYM) system. To describe this in a geometrically invariant way, we make use of the following splitting of the connection-curvature pair $(F, D)$: Foliating $\mathcal{M}^n$ into the standard Cauchy hypersurfaces $t = const.$, we decompose: $(F, D) = (\underline{F}, \underline{D}) \oplus (E, D_0)$, where $(\underline{F}, \underline{D})$ denotes the portion of $(F, D)$ which is tangent to the surfaces $t = const.$ (i.e. the induced connection), and $(E, D_0)$ denotes respectively the interior product of $F$ with the foliation generator $T = \partial_t$, and the normal portion of $D$. In inertial coordinates we have: $E_i = F_{0i}$. On the initial Cauchy hypersurface $t = 0$ we call a set $(\underline{F}(0), \underline{D}(0), E(0))$ admissible Cauchy data$^1$ if it satisfies the following compatibility condition:
\[ \underline{D}^i E_i(0) = 0 . \tag{5} \]
We define the Cauchy problem for the Yang-Mills equation to be the task of constructing a connection $(F, D)$ which solves (3), and has Cauchy data equal to $(\underline{F}(0), \underline{D}(0), E(0))$.
In order to understand what the appropriate condition on the initial data should be (and what we would like it to be!), it is necessary to consider the following two basic mathematical features of the system (3)–(4). The first is conservation. From the Lagrangian nature of the field equations (3)–(4), we have the tensorial conservation law:
\[ Q_{\alpha\beta}[F] = \langle F_{\alpha\gamma} , F_\beta^{\ \gamma} \rangle - \frac{1}{4} g_{\alpha\beta} \langle F_{\gamma\delta} , F^{\gamma\delta} \rangle , \]
\[ \nabla^\alpha Q_{\alpha\beta}[F] = 0 , \]
where $\nabla$ is the covariant derivative on $\mathcal{M}^n$. In particular, contracting $Q$ with the vector-field $T = \partial_t$, we arrive at the following constant of motion for the system (3)–(4):
\[ \int_{\mathbb{R}^n} Q_{00} \, dx = \frac{1}{2} \int_{\mathbb{R}^n} \big( |E|^2 + |\underline{F}|^2 \big) \, dx . \tag{6} \]
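The second main aspect of the system (3)–(4) is that of scaling. If we perform the transformation:
\[ (x^0, x^i) \rightsquigarrow (\lambda x^0, \lambda x^i) , \tag{7} \]
on $\mathcal{M}^n$, then an easy calculation shows that:
\[ D \rightsquigarrow \lambda D , \qquad F \rightsquigarrow \lambda^2 F . \tag{8} \]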
If we now define the gauge covariant (integer) Sobolev spaces:
\[ \| F \|^2_{\dot{H}^s_A} := \sum_{|I| = s} \| \underline{D}^I F \|^2_{L^2(\mathbb{R}^n)} , \tag{9} \]
where for each multiindex $I = (i_1, \ldots, i_s)$ we have that $D^I = D_{i_1} \cdots D_{i_s}$ is the repeated covariant differentiation with respect to the translation invariant spatial vector-fields $\{\partial_1, \ldots, \partial_n\}$, then for even$^2$ spatial dimensions, the norm $\dot{H}_A^{\frac{n-4}{2}}$ is invariant with respect to the scaling transformation (8). In particular, the conserved quantity (6) is invariant when $n = 4$, and this is called the critical dimension.
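The scale invariance claimed here can be checked in a few lines. The following is a minimal sketch, assuming (this normalization is not spelled out in the text) that on a fixed time slice the rescaling (7)–(8) acts by $F_\lambda(x) := \lambda^2 F(\lambda x)$, and that each covariant derivative of the rescaled connection contributes one further factor of $\lambda$:

```latex
% Assumption (not stated in the text): F_\lambda(x) = \lambda^2 F(\lambda x),
% and each covariant derivative of the rescaled connection adds a factor \lambda.
\begin{align*}
  \| F_\lambda \|_{\dot{H}^s_A}^2
    &= \sum_{|I| = s} \int_{\mathbb{R}^n}
       \big| \lambda^{s+2}\, (\underline{D}^I F)(\lambda x) \big|^2 \, dx \\
    &= \lambda^{2(s+2) - n} \sum_{|I| = s} \int_{\mathbb{R}^n}
       \big| (\underline{D}^I F)(y) \big|^2 \, dy
       \qquad (y = \lambda x) \\
    &= \lambda^{2s + 4 - n}\, \| F \|_{\dot{H}^s_A}^2 .
\end{align*}
% The exponent 2s + 4 - n vanishes exactly when s = (n-4)/2.
```

Setting the exponent to zero gives $s = \frac{n-4}{2}$. The case $s = 0$ corresponds to the energy (6), which is why $n = 4$ is singled out.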
Now, based on numerical evidence as well as analytical arguments, it is suspected that in general the Cauchy problem for (3)–(4) with smooth initial data will not be well behaved without size control of the critical regularity $s_c = \frac{n-4}{2}$ in high dimensions. What we will take this statement to mean here is simply that if $4 \leq n$ and the $\dot{H}_A^{s_c}$ norm of the initial data is not sufficiently small, then one can expect the existence of regular (i.e. $C_A^\infty$) sets $(\underline{F}(0), \underline{D}(0), E(0))$ such that the corresponding solution to (3)–(4) will develop a singularity in finite time. By singularity development, we mean that some higher norm of the type (9) will fail to be bounded at a later time, given that it was initially; or even more specifically, that the $L^\infty$ norm of the curvature $F$ will blow up in finite time for some regular initial data sets. Since these norms are gauge covariant, this type of singularity development would correspond to an intrinsic geometric breakdown of the equations, and could not be an artifact of poorly chosen local coordinates (gauge) on $V$. This has been rigorously demonstrated in the equivariant category for the supercritical dimensions $5 \leq n$ (see [3]). In the critical dimension things are much less clear, although there is numerical evidence that one still has blowup for large initial data (see [2]). This is thought to be connected with the existence of large static solutions (instantons).
One possible conjecture is that there is global regularity when the norm (6) is below the ground state energy.
Going in the other direction, it is expected that if the critical norm $\dot{H}_A^{\frac{n-4}{2}}$ is sufficiently small, then regular initial data will remain regular for all times. This can be seen as an easier preliminary step toward understanding in detail the issue of large data for dimension $n = 4$, and is furthermore an interesting problem in its own right. A central difficulty in the demonstration of this conjecture is to construct a stable set of coordinates on the bundle $V$ such that the Christoffel symbols of $D$ are well behaved in the sense that they obey the natural range of estimates one expects for this type of problem. This is precisely what we shall do in dimensions $6 \leq n$ through the well known procedure of using (spatial) Coulomb gauges. Unfortunately, this preliminary gauge construction is far from sufficient to close the regularity argument, and it will in fact be necessary for us to go much further and control infinitely many Coulomb gauges, each of which corresponds to a distinct polarized plane wave solution to the usual (flat) wave equation $\Box = \partial^\alpha \partial_\alpha$.
However, this does not affect the statement of our main result, which is in fact quite simple:
Theorem 1.1 (Critical regularity for high dimensional Yang-Mills). Let the number of spatial dimensions be even and such that $6 \leq n$. Then there exist fixed constants $0 < \epsilon_0, C$ such that if $(\underline{F}(0), \underline{D}(0), E(0))$ is an admissible data set which satisfies the smallness condition:
\[ \| (\underline{F}(0), E(0)) \|_{\dot{H}_A^{\frac{n-4}{2}}} \leq \epsilon_0 , \tag{10} \]
and there exist constants $M_k < \infty$, $\frac{n-4}{2} < k \in \mathbb{N}$, such that:
\[ \| (\underline{F}(0), E(0)) \|_{\dot{H}_A^{k}} = M_k , \tag{11} \]
then there exists a unique global solution to the field equations (3)–(4) with this initial data, and furthermore one has that the following inductive norm bounds hold:
\[ \| F \|_{\dot{H}_A^{\frac{n-4}{2}}} \leq C \epsilon_0 , \]
\[ \| F \|_{\dot{H}_A^{k}} \leq C\big( M_{\frac{n-4}{2}}, \ldots, M_{k-1} \big)\, M_k . \]
In particular, in this case $F$ remains smooth (in the gauge covariant sense) and bounded for all times.
Remark 1.2. As alluded to above, we will more specifically prove the existence of a global (in space and time) spatial Coulomb gauge such that the coefficient functions of the curvature $F$, as well as the Christoffel symbols (gauge potentials) of the connection $D$, are in the classical Sobolev spaces $\dot{H}^s$, and such that they satisfy appropriate angularly and spatially microlocalized Strichartz estimates. We have elected to omit a discussion of this from the statement of the main theorem in favor of the simpler geometric language, so that the reader can at a first glance gain an idea of the content of our result without being confronted with too many technical details.

1 Of course, this set is overdetermined as the curvature $\underline{F}$ depends completely on the connection $\underline{D}$. Also, it is perhaps not completely obvious at first that the set $(\underline{F}(0), \underline{D}(0), E(0))$ determines uniquely a solution $(F, D)$ to (3)–(4). For example, the initial normal derivative $D_0(0)$ does not need to be specified. We will show this is the case in the sequel (in particular see Proposition 5.3).

2 For odd spatial dimensions, the above discussion needs to be modified somewhat because we will not make an attempt here to define fractional powers of the spaces $\dot{H}_A^s$. Instead, what one should do is to simply put things in a Coulomb gauge and then use the usual fractional Sobolev spaces. This latter approach is what we will take in the sequel, although for the sake of concreteness we will only discuss the case of even dimensions. We have opted for the covariant approach in the introduction because it makes stating our main result a bit easier, and has an appealing simplicity. Also, since we shall need many specifics on how Coulomb gauges are constructed in order to create and control our parametrix, we will explain how the Coulomb gauge relates to the Cauchy problem in detail in the following two sections.

Acknowledgements

First and foremost, we would like to thank our advisors Sergiu Klainerman and Matei Machedon for their continuing support and encouragement. This subject matter as well as our own point of view owes much to them. We would also like to thank Igor Rodnianski, Terry Tao, and Daniel Tataru for many interesting and helpful conversations. This work began at the Institute for Advanced Study during the Fall 2003 semester when both authors were in attendance. The second author would like to thank Harvard University for its hospitality during the Spring of 2004 and Winter 2005. The first author was partially supported under NSF grant DMS-0401177. The second author was supported in part by an NSF postdoctoral fellowship.

2 Some Basic Notation

We list here some of the basic conventions used in this work, as well as some constants which will be needed in the sequel. We use the usual notation $a \lesssim b$ to denote that $a \leq C b$ for some (possibly large) constant $C$ which may change from line to line. Likewise we write $a \ll b$ to mean $a \leq C^{-1} b$ for some large constant $C$. In general, $C$ will denote a large constant, but at times we will also call $C$ a connection. The difference should be clear from context. Overall, we will have use for a family of small constants, which satisfy the hierarchy:
\[ 0 < \epsilon_0 \ll \varepsilon_0 \ll \widetilde{\varepsilon_0} \ll \gamma \ll \delta \ll 1 . \tag{12} \]

3 Some gauge-theoretic preliminaries

In this paper, we are working with a compact semi-simple Lie group $G$. However, all of our calculations will be carried out in a somewhat larger context. Firstly, we will assume that $G$ is embedded as a subgroup of matrices of some (possibly) larger orthogonal group $O(m)$. In particular, we can identify the Lie algebra $\mathfrak{g}$ with an appropriate sub-algebra of $\mathfrak{o}(m)$. This allows us to perform all of our calculations on a specific collection of matrices. Since our main computation involves complex valued integral operators, we will further need to work in the complexified algebra $\mathbb{C} \otimes \mathfrak{o}(m)$. The Killing form $\langle \cdot , \cdot \rangle$ on $\mathfrak{g}$ extends easily to this context to yield the bilinear form:
\[ \langle A , B \rangle = \mathrm{trace}(A B^*) . \tag{13} \]
Notice that this is a positive definite form when restricted to the real vector space $\mathfrak{o}(m)$, and is a sesquilinear form on the corresponding complexified algebra $\mathbb{C} \otimes \mathfrak{o}(m)$. More importantly, $\langle \cdot , \cdot \rangle$ is $Ad(O(m))$ invariant, and in fact the more general identity holds:
\[ \langle g_1^{-1} A h_1 , g_2^{-1} B h_2 \rangle = \langle g_2 g_1^{-1} A h_1 h_2^{-1} , B \rangle , \tag{14} \]
for $A, B \in \mathbb{C} \otimes \mathfrak{o}(m)$ and $g_i, h_i \in O(m)$. In fact, it is not difficult to see that the form (13) extends to a sesquilinear form on all complex matrices in $M(m \times m)$, and that it can be identified with the usual matrix inner product:
\[ \langle A , B \rangle = \sum_{i,j} a_{ij} \overline{b}_{ij} , \tag{15} \]
which comes from considering these matrices as vectors in $\mathbb{C}^{m^2}$. Furthermore, it is easy to see that the general adjoint formula (14) continues to hold in this context.
This will be of fundamental importance in the sequel. In general, we will use the notation $|A|^2 = \langle A , A \rangle$ to denote the action of this norm on any matrix. Also, notice that directly from (14) one has the isometric identity:
\[ |g A| = |A| , \qquad g \in O(m) . \tag{16} \]
These are all very simple algebraic identities, but our method is incredibly sensitive to them and would collapse entirely if they did not hold.
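For the reader who wants to see why (14) holds, here is a short verification. It uses only the cyclicity of the trace and the fact that elements of $O(m)$ are real orthogonal matrices, so that $g^* = g^{-1}$; the step-by-step layout is ours and is not taken from the text:

```latex
% Uses: (XY)^* = Y^* X^*, g^* = g^{-1} for g in O(m), and trace(XY) = trace(YX).
\begin{align*}
  \langle g_1^{-1} A h_1 ,\, g_2^{-1} B h_2 \rangle
    &= \mathrm{trace}\big( g_1^{-1} A h_1 \,(g_2^{-1} B h_2)^* \big) \\
    &= \mathrm{trace}\big( g_1^{-1} A h_1 \, h_2^{*} B^{*} (g_2^{-1})^{*} \big) \\
    &= \mathrm{trace}\big( g_1^{-1} A h_1 h_2^{-1} B^{*} g_2 \big) \\
    &= \mathrm{trace}\big( g_2\, g_1^{-1} A h_1 h_2^{-1} B^{*} \big) \\
    &= \langle g_2 g_1^{-1} A h_1 h_2^{-1} ,\, B \rangle .
\end{align*}
```

Taking $B = A$, $h_1 = h_2 = I$, and $g_1^{-1} = g_2^{-1} = g$ in this identity gives $\langle gA , gA \rangle = \langle A , A \rangle$, which is the isometric identity (16).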
In the context of matrices, we may compute the action of the connection $D$ on sections $F$ of $V$ as follows: $D_X F = X^\alpha \big( \partial_\alpha (F) + [A_\alpha , F] \big)$. Here, the gauge potentials $\{A_\alpha\}$ are $\mathfrak{g}$-valued, and are defined via the equation:
$D_\alpha 1_V = [A_\alpha , 1_V]$, where $1_V$ denotes some chosen orthonormal frame in $V$, and we are abusively writing $F = F\, 1_V$. In shorthand notation, we write: $D = d + A$, where $d$ is the usual exterior derivative on matrix valued functions. Likewise, in this notation we have the well known identity for the curvature of $D$: $F^A = dA + [A, A]$. In this last formula, we use the superscript notation to emphasize the fact that the curvature is not gauge invariant, but transforms according to the $Ad(G)$ action:
$F \rightsquigarrow g F g^{-1}$, whenever one performs the change of frame $1_V \rightsquigarrow g\, 1_V\, g^{-1}$. As is well known, the potentials $\{A_\alpha\}$ themselves do not transform according to $Ad(G)$, but instead take on an affine group of transformations:
\[ B = g A g^{-1} + g\, d g^{-1} , \tag{17} \]
where $\{B_\alpha\}$ represents the connection $D$ in the frame $g\, 1_V\, g^{-1}$. In particular, the difference of two connections obeys the $Ad(G)$ structure, a fact we will have use for in a moment. For instance, any connection $\{C_\alpha\}$ with $F^C = 0$ obeys $Ad(G)$.
Furthermore, as is the basic fact of gauge theory, such connections always lead to a globally$^3$ integrable ODE: $dg = g C$, where the solution $g$ belongs to $G$. Thus, we may identify flat connections $C$ with infinitesimal gauge transformations, and it is easy to see that every gauge transformation (17) leads to a flat connection which we may define as $C = g^{-1} dg$.
This completes our discussion of elementary gauge theory.
It will also be necessary for us to make use of the basic facts from (non-gauge-covariant) Hodge theory. Even though the connections we work with in this paper are on the full space-time $\mathcal{M}^n$, our use of Hodge theory will always be restricted to time slices $\{t\} \times \mathbb{R}^n$. In particular we use the general notation $d, d^*$ for the exterior derivative and its adjoint acting on $\mathfrak{g}$ (and more generally $M(m \times m)$) valued differential forms on $\mathbb{R}^n$. To emphasize this restriction, we will use Latin indices when computing these operators. For example:
\[ (dA)_{ij} = \partial_{\{i} A_{j\}} , \qquad (dF)_{ijk} = \partial_{[i} F_{jk]} , \]
where $\{\ \}$ and $[\ ]$ denote anti-symmetric and symmetric cyclic summing respectively. Also, the adjoint here is taken with respect to the Killing form (13). In particular, we have the Hodge Laplacean:
\[ \Delta = -( d d^* + d^* d ) , \tag{18} \]
which in our context is simply the usual scalar Laplacean acting component-wise on matrices. Finally, we have the Hodge decomposition which we write as $A = A^{df} + A^{cf}$ where:
\[ A^{df} = -\, d^* d\, \Delta^{-1} A , \]
\[ A^{cf} = -\, d\, d^* \Delta^{-1} A . \]
This decomposition is bounded on $L^p$ spaces for $1 < p < \infty$ as the operators involved are SIO's. Also, since these operators are all real, this decomposition respects the Lie algebra structure of $\mathfrak{g}$ inside of $\mathbb{C} \otimes \mathfrak{o}(m)$.
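To make the roles of the two pieces explicit, the decomposition can be verified in one line. This is a sketch under the sign convention $\Delta = -(d d^* + d^* d)$ of (18), and under the assumption that $\Delta^{-1}$ commutes with $d$ and $d^*$ (all of them being constant-coefficient Fourier multipliers):

```latex
% Assumptions: \Delta = -(d d^* + d^* d); \Delta^{-1} commutes with d and d^*;
% d^2 = 0 and (d^*)^2 = 0.
\begin{align*}
  A &= \Delta \Delta^{-1} A
     = -\,(d d^* + d^* d)\, \Delta^{-1} A
     = \underbrace{-\, d^* d\, \Delta^{-1} A}_{A^{df}}
       \;+\; \underbrace{-\, d\, d^* \Delta^{-1} A}_{A^{cf}} , \\
  d^* A^{df} &= -\,(d^*)^2\, d\, \Delta^{-1} A = 0 ,
  \qquad
  d A^{cf} = -\, d^2\, d^* \Delta^{-1} A = 0 .
\end{align*}
```

So $A^{df}$ is the divergence free part and $A^{cf}$ is the curl free part, which is the sense in which the superscripts are used.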
The last topic we cover here is the basic underpinning of much of analysis in the context of compact gauge groups. This is the remarkable Uhlenbeck lemma, which allows one to “straighten out” a connection as long as its curvature satisfies appropriate bounds. The important thing for us is that these bounds are precisely at the level of the critical regularity $\dot{H}_A^{\frac{n-4}{2}}$. This result is:
Lemma 3.1 (Classical Uhlenbeck lemma). Let $D_A = d + A$ be a connection with compact (matrix) group on $\mathbb{R}^n$. Then there is a pair of constants $\varepsilon_0, C$ which only depend on the dimension $n$ such that if the curvature $F^A$ of $D_A$ satisfies the bound: $\| F^A \|_{L^{\frac{n}{2}}} \leq \varepsilon_0$, then $D_A$ is gauge equivalent to a connection $D_B = d + B$ where the potentials $\{B_i\}$ satisfy the condition: $d^* B = 0$, and such that the following estimate holds:
\[ \| B \|_{L^n} \leq C \varepsilon_0 . \tag{19} \]
In the sequel, it will be useful for us to have a somewhat more refined version of Lemma 3.1 which does not make reference to the size of the curvature, but rather to the size of the connection $\{A_\alpha\}$ itself in a critical norm which does not involve derivatives. This will allow us to prove certain connections exist more directly. Furthermore, since the basic formulas used in the proof of this result will be important in constructing our parametrix, it will set the pace for much of what follows. Finally, we mention here that our proof is a bit different from that of [13] in that it does not rely on any implicit function theorem type arguments, and is instead completely explicit, being based on a simple Picard iteration.
Lemma 3.2 (Uhlenbeck lemma for small $L^n$ perturbations of Coulomb potentials with small $L^{\frac{n}{2}}$ curvature). Let $D_A = d + A$ be a connection on $\mathbb{R}^n \times V$ with compact (matrix) gauge group $G$. Then there exist constants $\varepsilon_0, C$ such that if:
\[ \| F^A \|_{L^{\frac{n}{2}}} \leq \varepsilon_0 , \tag{20} \]
and such that $d + A$ is gauge equivalent to $d + B$ with $d^* B = 0$, where one has the bounds:
\[ \| A \|_{L^n} \leq C \varepsilon_0 , \tag{21} \]
then for every connection $d + \widetilde{A}$ such that:
\[ \| \widetilde{A} - A \|_{L^n} \leq C \varepsilon_0 , \tag{22} \]
there exists a gauge equivalent connection $d + \widetilde{B}$ such that $d^* \widetilde{B} = 0$, and one has the same size control:
\[ \| \widetilde{B} \|_{L^n} \leq C \varepsilon_0 . \tag{23} \]
Remark 3.3. Before continuing with the proof, let us remark here that Lemma 3.2 is in fact more general than the classical Uhlenbeck Lemma. Specifically, Lemma 3.2 easily implies Lemma 3.1 with smallness condition $\varepsilon_0 / 2$ (where $\varepsilon_0$ is determined by Lemma 3.2) through a straightforward induction procedure which we outline now.
First of all, from Lemma 3.2 we see that the set of all connections $d + A$ with curvature such that:
\[ \| F^A \|_{L^{\frac{n}{2}}} \leq \frac{\varepsilon_0}{2} , \tag{24} \]
and such that $d + A$ is equivalent to $d + B$ with $d^* B = 0$, and such that one has the bounds (19), is an open set in the intersection of $L^n$ with the set determined by (24) (in the sense of distributions). Therefore, if the conclusion of Lemma 3.1 were to be violated, it must then be the case that there is a smallest number $r^*$ such that the sphere of radius $r^*$ contains a connection $d + A$ with the property that it cannot be put in the Coulomb gauge (with $L^n$ bounds), even though the bound (24) is valid for this connection. Now, consider the set of connections $d + \lambda A$ where $0 < (1 - \lambda) \ll 1$.
A quick calculation shows that these have curvature: $F^{\lambda A} = \lambda F^A + \lambda (\lambda - 1) [A, A]$. Choose $\lambda$ such that: $(1 - \lambda)(1 + r^*)^2 \leq \frac{\varepsilon_0}{2}$. By the triangle and Hölder inequalities, and the definition of $r^*$, we have that: $\| F^{\lambda A} \|_{L^{\frac{n}{2}}} \leq \varepsilon_0$. Therefore, by the minimality of $r^*$ we have that $d + \lambda A$ can be Coulomb gauged.
Again, by the definition of $\lambda$, we have that: $d + A = d + \lambda A + \widetilde{A}$, where we easily have the bound (we may assume $1 \leq C$): $\| \widetilde{A} \|_{L^n} \leq C \varepsilon_0$. Therefore, by an application of Lemma 3.2 we have that $d + A$ can be put in the Coulomb gauge with the bound (19) holding. This contradicts the minimality of $r^*$, as was to be shown.
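The “quick calculation” for the curvature of $d + \lambda A$, and the estimate that follows from it, can be written out as below. This is only a sketch: $c$ denotes the constant in the Hölder bound $\| [A,A] \|_{L^{n/2}} \leq c\, \| A \|_{L^n}^2$, which is our notation and whose value depends on how the norms are normalized.

```latex
% Assumptions: F^A = dA + [A,A], 0 < \lambda < 1, and
% \|[A,A]\|_{L^{n/2}} \le c\,\|A\|_{L^n}^2 by Hoelder (c is our notation).
\begin{align*}
  F^{\lambda A}
    &= \lambda\, dA + \lambda^2 [A, A]
     = \lambda\,( dA + [A,A] ) + (\lambda^2 - \lambda)[A,A]
     = \lambda F^A + \lambda(\lambda - 1)[A,A] , \\
  \| F^{\lambda A} \|_{L^{n/2}}
    &\leq \lambda\, \| F^A \|_{L^{n/2}}
        + \lambda (1 - \lambda)\, \| [A,A] \|_{L^{n/2}}
     \leq \frac{\varepsilon_0}{2} + c\,(1-\lambda)\, \| A \|_{L^n}^2 .
\end{align*}
```

With $\| A \|_{L^n} = r^*$, the second term is at most $\frac{\varepsilon_0}{2}$ as soon as $c\,(1 - \lambda)(r^*)^2 \leq \frac{\varepsilon_0}{2}$, which is what the choice of $\lambda$ in the remark arranges, up to the constant $c$.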
  • Proof of Lemma 3.2. It suffices to show that $d + \widetilde{A}$ is gauge equivalent to $d + \widetilde{B}$, where $d^* \widetilde{B} = 0$ and with the bound (23), provided that:
    \[ \| \widetilde{A} \|_{L^n} \leq 2 M \varepsilon_0 , \tag{25} \]
    when $\varepsilon_0$ is chosen sufficiently small, and where $M$ is some sufficiently large fixed constant which will be determined in a moment, and which will be chosen to be our $C$ in the estimates (21) and (23) (the reason for the notation switch will become clear in a moment). To see this, notice that the smallness condition (22) is gauge invariant because the difference of two connections transforms according to the $Ad(G)$ action, which fixes the Killing form used to compute $L^n$. Therefore, we may assume from the start that the original connection $A$ is in the Coulomb gauge with size control (21). In particular, the connection $d + A$ satisfies the div-curl system:
    \[ dA = F^A - [A, A] , \qquad d^* A = 0 , \]
    which we can integrate to form the equation:
    \[ A = -\, d^* \Delta^{-1} \big( F^A - [A, A] \big) . \tag{26} \]
    Everything we do now will be based on the Riesz operator bounds:
    \[ \nabla^2 \Delta^{-1} : L^n \to L^n , \tag{27} \]
    \[ \nabla \Delta^{-1} : L^{\frac{n}{2}} \to L^n . \tag{28} \]
    We choose our constant $C$ such that $\frac{M}{8}$ is the constant in (the various vector analogs of) the embeddings (27)–(28). Using these bounds and the integral equation (26) in conjunction with the assumed smallness conditions (20) and (21), and a round of Hölder's inequality, we have the following improved bounds for $d + A$:
    \[ \| A \|_{L^n} \leq M \varepsilon_0 , \tag{29} \]
    as long as $\varepsilon_0$ is chosen sufficiently small. In particular, using (22) and some addition and subtraction we have the bound (25).
    We now construct by hand the gauge transformation:
    \[ dg = g \widetilde{A} - \widetilde{B} g , \tag{30} \]
    with $d^* \widetilde{B} = 0$. This will be done by constructing the infinitesimal gauge transformation $C = g^{-1} dg$. A quick calculation shows that this must satisfy the following div-curl system:
    \[ dC = -[C, C] , \tag{31a} \]
    \[ d^* C = d^* \widetilde{A} + [\widetilde{A}, C] . \tag{31b} \]
    Unfortunately, the system (31) cannot be solved constructively, say through an iteration scheme. This is because implicit in its structure is the compatibility condition $d[C, C] = 0$, which gets destroyed through (at least the usual) Picard iteration. This could be side-stepped by using an implicit function theorem type argument, but since we prefer to do things explicitly we proceed as follows: We first write the system (31) in terms of integral equations:
    \[ C^{df} = d^* \Delta^{-1} [C, C] , \tag{32a} \]
    \[ C^{cf} = -\, d \Delta^{-1} \big( d^* \widetilde{A} + [\widetilde{A}, C] \big) . \tag{32b} \]
    Here $C = C^{df} + C^{cf}$ denotes the Hodge decomposition of the matrix valued one-form $C$. A solution to system (32) can now be constructed from scratch via Picard iteration starting with $C^{(0)} = 0$. The condition (25) and the embeddings (27)–(28) guarantee convergence to a solution. Furthermore, because it is true for each iterate, one has the following bounds on the solution:
    \[ \| C \|_{L^n} \leq 2\, \frac{M}{8}\, \| \widetilde{A} \|_{L^n} \leq M^2 \varepsilon_0 . \tag{33} \]
    Also, since each iterate belongs pointwise to $\mathfrak{g}$, the solution does also, due to the fact that $\mathfrak{g}$ is a linear (and hence closed) subspace of the matrices $M(m \times m)$. We now need to show that this $C$ is indeed a solution to the original system (31). That is, we need to establish the identity:
    \[ d\, d^* \Delta^{-1} [C, C] = -[C, C] . \tag{34} \]
    Notice that this does not follow algebraically from the form of the integral system (32), because it is not a-priori clear that in fact $d[C, C] = 0$. However, this is the case, which is a consequence of the following a-priori estimate for solutions to (32):
    \[ \big\| d\, d^* \Delta^{-1} [C, C] + [C, C] \big\|_{L^{\frac{n}{2}}} \lesssim \| C \|_{L^n} \, \big\| d\, d^* \Delta^{-1} [C, C] + [C, C] \big\|_{L^{\frac{n}{2}}} . \tag{35} \]
    Notice that (33) and (35) taken together immediately imply the identity (34).
    In order to show (35), we first use the Hodge Laplacean (18) to write: $d\, d^* \Delta^{-1} [C, C] + [C, C] = -\, d^* \Delta^{-1} \big( d[C, C] \big)$. Next, we compute that:
    \[ ( d[C, C] )_{ijk} = \partial_{[i} [C_j , C_{k]}] , \]
    \[ = [\partial_{[i} C_j , C_{k]}] - [\partial_{[i} C_k , C_{j]}] , \]
    \[ = -\,[ C_{[i} , (dC)_{jk]} ] . \]
    Therefore, using this last identity in conjunction with fractional integration, and using the identity from line (32a) above, we have that:
    \[ \big\| d\, d^* \Delta^{-1} [C, C] + [C, C] \big\|_{L^{\frac{n}{2}}} = \big\| d^* \Delta^{-1} [C, dC] \big\|_{L^{\frac{n}{2}}} , \]
    \[ \lesssim \| [C, dC] \|_{L^{\frac{n}{3}}} , \]
    \[ \leq \big\| [ C , ( d\, d^* \Delta^{-1} [C, C] + [C, C] ) ] \big\|_{L^{\frac{n}{3}}} + \| [C, [C, C]] \|_{L^{\frac{n}{3}}} , \]
    \[ \leq 2 \| C \|_{L^n} \, \big\| d\, d^* \Delta^{-1} [C, C] + [C, C] \big\|_{L^{\frac{n}{2}}} . \]
    Notice that the last inequality here follows simply from the Jacobi identity $[C, [C, C]] = 0$.
    To wrap things up, we only need to establish the existence of $g$ on line (30) above with $d^* \widetilde{B} = 0$, and such that we have the size control (23) (with constant $M$). Now, by design we have that $F^C = 0$, so we may integrate the equation: $dg = g C$, with initial conditions $g(0) = I$ on all of $\mathbb{R}^n$. Defining now: $\widetilde{B} = g \widetilde{A} g^{-1} + g\, d g^{-1}$, we have that:
    \[ d^* \widetilde{B} = -\, D_i^{\widetilde{B}} \widetilde{B}^i , \]
    \[ = -\, g\, D_i^{\widetilde{A}} \big( g^{-1} \widetilde{B}^i g \big)\, g^{-1} , \]
    \[ = -\, g\, D_i^{\widetilde{A}} \big( \widetilde{A}^i - C^i \big)\, g^{-1} , \]
    \[ = g \big( d^* \widetilde{A} - d^* C + [\widetilde{A}, C] \big)\, g^{-1} , \]
    \[ = 0 , \]
    as was to be shown. Finally, by using the bounds (25) and (33) and the definition of the potentials $\{\widetilde{B}\}$ and $\{C\}$ we have the bound: $\| \widetilde{B} \|_{L^n} \leq \| \widetilde{A} \|_{L^n} + \| C \|_{L^n} \lesssim M \varepsilon_0$. This completes the proof of Lemma 3.2.
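Two of the algebraic facts used in the proof above can be checked in components. The following sketch assumes the conventions $C_i = g^{-1} \partial_i g$, $(dC)_{ij} = \partial_i C_j - \partial_j C_i$, $[C, C]_{ij} = [C_i, C_j]$, and $\widetilde{B} = g \widetilde{A} g^{-1} + g\, d g^{-1}$; the component notation is ours.

```latex
% Assumptions: C_i = g^{-1} \partial_i g, (dC)_{ij} = \partial_i C_j - \partial_j C_i,
% [C,C]_{ij} = [C_i, C_j], and \widetilde{B} = g \widetilde{A} g^{-1} + g\, d g^{-1}.
\begin{align*}
  \partial_i (g^{-1}) &= -\, g^{-1} (\partial_i g)\, g^{-1} , \\
  (dC)_{ij}
    &= \partial_i (g^{-1})\, \partial_j g - \partial_j (g^{-1})\, \partial_i g
     = -\, C_i C_j + C_j C_i
     = -\,[C_i, C_j] , \\
  g^{-1} \widetilde{B}_i\, g
    &= \widetilde{A}_i + (\partial_i g^{-1})\, g
     = \widetilde{A}_i - g^{-1} \partial_i g
     = \widetilde{A}_i - C_i .
\end{align*}
```

The second line is the flatness condition $F^C = dC + [C, C] = 0$ for $C = g^{-1} dg$, and the third line is the substitution used when computing $d^* \widetilde{B}$.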

3 Of course this ODE is non-linear, but in the present context it also satisfies the conservation law $g g^* = I$.

4 Some analytic preliminaries

We record here some useful formulas, mostly from elementary harmonic analysis, which will be used many times in the sequel. Firstly, we define the Fourier transform on $\mathbb{C} \otimes \mathfrak{o}(m)$, which is merely the usual scalar Fourier transform acting component-wise on matrices:
\[ \widehat{A}(\xi) = \int_{\mathbb{R}^n} e^{-2\pi i x \cdot \xi} A(x) \, dx . \tag{36} \]
The Plancherel theorem with respect to the Killing form (13) reads: $\int_{\mathbb{R}^n_x} \langle A , B \rangle \, dx = \int_{\mathbb{R}^n_\xi} \langle \widehat{A} , \widehat{B} \rangle \, d\xi$. This follows simply from the definition of the inner product (15). While the constructions we make in the sequel are almost exclusively based on the spatial transform (36), it will in certain places be convenient for us to work with the space-time Fourier transform: $\widehat{A}(\tau, \xi) = \int_{\mathbb{R}^{n+1}} e^{-2\pi i (t\tau + x \cdot \xi)} A(t, x) \, dt \, dx$. In the sequel, we will have much use for dyadic frequency decompositions with respect to the spatial variable. For the most part, we will use a fairly loose and heuristic notation for this operation. This will help us to avoid having to come up with different symbols for multipliers which are basically the same. First of all, we let $\chi(\xi)$ denote some smooth bump function adapted to the unit frequency annulus $\{ 2^{-a} \leq |\xi| \leq 2^{a} \}$, where $1 \leq a$ is some constant used to define $\chi$ which may change from line to line. For a dyadic number $\mu \in \{ 2^i \mid i \in \mathbb{Z} \}$, we define the rescaled cutoffs: $\chi_\mu(\xi) = \chi(\mu^{-1} \xi)$, and the associated Fourier multipliers $\widehat{P_\mu A} = \chi_\mu \widehat{A}$. The two main facts we will need about these multipliers are the Bernstein inequality:
\[ \| P_\mu A \|_{L^p} \lesssim \mu^{n(\frac{1}{q} - \frac{1}{p})} \| A \|_{L^q} , \tag{37} \]
which holds for all $1 \leq q \leq p \leq \infty$, and the Littlewood-Paley equivalence:
\[ \Big\| \Big( \sum_\mu | P_\mu A |^2 \Big)^{\frac{1}{2}} \Big\|_{L^p} \sim \| A \|_{L^p} , \tag{38} \]
which holds under the restriction $1 < p < \infty$. All of the norms above can be taken with respect to (13).
There are two simple analysis lemmas involving derivatives and multipliers which will prove useful in the sequel. The first of these is the low frequency (operator) commutator estimate:
\[ \| [A, P_1] F \|_{L^p} \lesssim \| \nabla_x A \|_{L^q} \| F \|_{L^r} , \tag{39} \]
where $\frac{1}{p} = \frac{1}{q} + \frac{1}{r}$ (see [8]). The second is the homogeneous paraproduct estimate:
\[ \| \nabla_x^k (A F) \|_{L^p} \lesssim \| \nabla_x^k A \|_{L^{q_1}} \| F \|_{L^{r_1}} + \| A \|_{L^{q_2}} \| \nabla_x^k F \|_{L^{r_2}} , \tag{40} \]
for $1 < p, q_i, r_i < \infty$, $\frac{1}{p} = \frac{1}{q_1} + \frac{1}{r_1}$, and $\frac{1}{p} = \frac{1}{q_2} + \frac{1}{r_2}$ whenever $0 < k$. This estimate is true even for non-integer $0 \leq k$ by a simple Littlewood-Paley argument. We note here that we only use it in the integer case, and there it is only employed as a convenience. For a proof of this, see e.g. Chapter 2 of [12].
We would now like to set up a system to formalize many of the dyadic estimates which will appear in this paper. This is most easily done using the language of Besov spaces. Since we have a specific purpose for these in mind, we introduce the following notation:
\[ \| A \|^2_{\dot{B}_2^{p,(q,s)}} = \sum_\mu \mu^{2s - 2n(\frac{1}{q} - \frac{1}{p})} \| P_\mu A \|^2_{L^p} . \tag{41} \]
This notation may seem a bit mysterious at first, but the thing to keep in mind here is that the first index $p$ in some sense controls the decay, while the second double index $(q, s)$ controls the scaling, which is the same as $\dot{W}^{s,q}$ (homogeneous $L^q$ Sobolev space). In general, the second index will be fixed, so we will strive to have $p$ as low as possible (see Remark 4.2 below). This notation has the following simple significance: $\dot{B}_2^{p,(q,s)}$ is the $\ell^2$ Besov space of Lebesgue index $p$ which contains the standard Besov space $\dot{B}_2^{q,s}$ defined by: $\| A \|^2_{\dot{B}_2^{q,s}} = \sum_\mu \mu^{2s} \| P_\mu A \|^2_{L^q}$. This identification is a direct consequence of the Bernstein embedding (37). In general, one has the inclusions:
\[ \dot{B}_2^{p_1,(q,s)} \subseteq \dot{B}_2^{p_2,(q,s)} , \qquad q \leq p_1 \leq p_2 . \tag{42} \]
Furthermore, a quick application of the Littlewood-Paley identity (38) gives the Lebesgue space inclusion:
\[ \dot{B}_2^{p,(q,\, n(\frac{1}{q} - \frac{1}{p}))} \subseteq L^p , \qquad 2 \leq p < \infty . \tag{43} \]
The reason we prefer to use this more involved notation, instead of the usual Besov norm convention, is that ours allows one to tell at first glance which norms are critical, which is particularly useful in a scale invariant problem like the one of this paper. Specifically, the norms $\dot{B}_2^{p,(2,\frac{n-2}{2})}$ will play a prominent role in what follows.
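For completeness, here is one way the Lebesgue space inclusion (43) can be derived from (38). It is a sketch rather than the authors' argument, and it uses the triangle inequality in $L^{p/2}$, which is where the restriction $2 \leq p$ enters:

```latex
% Assumption: s = n(1/q - 1/p), so the weight \mu^{2s - 2n(1/q - 1/p)} in (41) equals 1.
% Uses Littlewood-Paley (1 < p < \infty) and the triangle inequality in L^{p/2} (p >= 2).
\begin{align*}
  \| A \|_{L^p}
    &\lesssim \Big\| \Big( \sum_\mu | P_\mu A |^2 \Big)^{1/2} \Big\|_{L^p}
     = \Big\| \sum_\mu | P_\mu A |^2 \Big\|_{L^{p/2}}^{1/2} \\
    &\leq \Big( \sum_\mu \big\| | P_\mu A |^2 \big\|_{L^{p/2}} \Big)^{1/2}
     = \Big( \sum_\mu \| P_\mu A \|_{L^p}^2 \Big)^{1/2}
     = \| A \|_{\dot{B}_2^{p,(q,\,n(\frac{1}{q} - \frac{1}{p}))}} .
\end{align*}
```

The same computation fails at $p = \infty$ because the Littlewood-Paley equivalence is no longer available there.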
It will also be necessary for us to employ the $\ell^1$ summing version of the norm (41), which we label by $\dot{B}_1^{p,(q,s)}$. This will essentially be used for one purpose only, and that is that the $L^\infty$ end-point of (43) is true for this space:
\[ \dot{B}_1^{\infty,(q,\frac{n}{q})} \subseteq L^\infty , \qquad 1 \leq q \leq \infty . \tag{44} \]
Besov spaces are particularly well behaved with respect to the action of Riesz operators, which is exactly why we use them. In general, we define the operator | D x | σ   to be the Fourier multiplier with symbol | ξ | σ   . The basic embedding we will use in the sequel is the following:
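Lemma 4.1. One has the following bilinear estimate for Besov spaces for $0 \leq \sigma$:
$$ |D_x|^{-\sigma} : \dot B_2^{p,(2,s_1)} \cdot \dot B_2^{q,(2,s_2)} \subseteq \dot B_1^{r,(2,s_3)}, \tag{45} $$
where the indices $1 \leq p, q, r \leq \infty$ and $\sigma, s_i$ satisfy the following conditions:
$$ s_3 = s_1 + s_2 + \sigma - \frac{n}{2}, \qquad (\text{scaling}), \tag{46} $$
$$ \sigma + \frac{n}{2} - s_3 < n\Big(\frac{1}{p} + \frac{1}{q}\Big), \qquad (\text{High} \times \text{High}), \tag{47} $$
$$ s_1 < \frac{n}{2} + \min\Big\{ n\Big(\frac{1}{q} - \frac{1}{r}\Big), 0 \Big\}, \qquad (\text{Low} \times \text{High}), \tag{48} $$
$$ s_2 < \frac{n}{2} + \min\Big\{ n\Big(\frac{1}{p} - \frac{1}{r}\Big), 0 \Big\}, \qquad (\text{High} \times \text{Low}), \tag{49} $$
$$ \frac{1}{r} \leq \frac{1}{p} + \frac{1}{q}, \qquad (\text{Lebesgue}). \tag{50} $$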
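To see what the conditions (46)-(50) encode, it may help to specialize to $p = q = r = 2$. This is only an illustrative sketch. It reads the operator in (45) as $|D_x|^{-\sigma}$, and it uses the identification of $\dot B_2^{2,(2,s)}$ with the homogeneous Sobolev space $\dot H^s$, which follows from (41) and Plancherel up to constants.

```latex
% Special case p = q = r = 2 of Lemma 4.1.
% (50): 1/2 \leq 1/2 + 1/2 holds trivially.
% (48), (49): \min\{ n(1/2 - 1/2), 0 \} = 0, so these become
\[
  s_1 < \frac{n}{2}, \qquad s_2 < \frac{n}{2} .
\]
% (47): substituting s_3 from (46) gives
\[
  \sigma + \frac{n}{2} - s_3 = n - s_1 - s_2 < n
  \quad\Longleftrightarrow\quad 0 < s_1 + s_2 .
\]
% Conclusion under these three conditions:
\[
  |D_x|^{-\sigma} : \dot H^{s_1} \cdot \dot H^{s_2}
  \subseteq \dot B_1^{2,(2,s_3)} ,
  \qquad s_3 = s_1 + s_2 + \sigma - \frac{n}{2} .
\]
```

With $\sigma = 0$ this is the familiar product rule for homogeneous Sobolev spaces, with the output landing in the $\ell^1$ summing Besov space.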
Remark 4.2. As will become apparent in the proof, it is possible to show frequency localized versions of the embedding (45) such that not all of the conditions (47)-(49) need to be satisfied. Indeed, we will show the following two frequency localized “improvements” are possible:
$$ |D_x|^{-\sigma} : P_{\bullet \ll \lambda}\big(\dot B_2^{p,(2,s_1)}\big) \cdot P_\lambda\big(\dot B_2^{q,(2,s_2)}\big) \subseteq P_\lambda\big(\dot B_1^{r,(2,s_3)}\big), \tag{51} $$
$$ |D_x|^{-\sigma} : P_\lambda\big(\dot B_2^{p,(2,s_1)}\big) \cdot P_\lambda\big(\dot B_2^{q,(2,s_2)}\big) \subseteq \Big(\frac{\mu}{\lambda}\Big)^{\delta} P_\mu\big(\dot B_1^{r,(2,s_3)}\big), \tag{52} $$
where $\delta = n(\frac{1}{p} + \frac{1}{q}) + s_3 - \sigma - \frac{n}{2}$ in estimate (52). Estimate (51) holds whenever (46), (48), and (50) are satisfied. The second estimate (52) is valid whenever we have (46), (47), and (50). In particular, notice that for larger $\sigma$ this estimate requires lower values of $p, q$. This fact will have an immense bearing on the estimates we prove in the sequel, and seems to be one of the most difficult factors in lowering the dimension of the overall argument from $n = 6$ (apart from even more difficult things such as null-form estimates).
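  • Proof of estimate (45). The proof is a simple matter of the standard technique of trichotomy. That is, we start with two test matrices $A$ and $C$, and we run a frequency decomposition on the product: $$ A \cdot C = \sum_{\lambda, \mu_i} P_\lambda( P_{\mu_1} A \cdot P_{\mu_2} C ). $$ Setting now: $$ \gamma = \min\Big\{ \frac{n}{2} - s_1, \ \frac{n}{2} - s_2, \ n\Big(\frac{1}{p} + \frac{1}{q}\Big) + s_3 - \sigma - \frac{n}{2}, \ \frac{n}{2} + n\Big(\frac{1}{q} - \frac{1}{r}\Big) - s_1, \ \frac{n}{2} + n\Big(\frac{1}{p} - \frac{1}{r}\Big) - s_2 \Big\}, $$ we have from the conditions (47)-(49) that $0 < \gamma$. To prove (45) it suffices to show that:
    $$ \sum_{\substack{\mu_1 \, : \, \mu_1 \ll \mu_2 \\ \lambda \sim \mu_2}} \lambda^{s_3 - n(\frac{1}{2} - \frac{1}{r}) - \sigma} \| P_\lambda( P_{\mu_1} A \cdot P_{\mu_2} C ) \|_{L^r}
    \lesssim \sum_{\substack{\mu_1 \, : \, \mu_1 \ll \mu_2 \\ \lambda \sim \mu_2}} \Big(\frac{\mu_1}{\mu_2}\Big)^{\gamma} \| P_{\mu_1} A \|_{\dot B^{p,(2,s_1)}} \| P_{\mu_2} C \|_{\dot B^{q,(2,s_2)}}, $$
    $$ \sum_{\substack{\mu_2 \, : \, \mu_2 \ll \mu_1 \\ \lambda \sim \mu_1}} \lambda^{s_3 - n(\frac{1}{2} - \frac{1}{r}) - \sigma} \| P_\lambda( P_{\mu_1} A \cdot P_{\mu_2} C ) \|_{L^r}
    \lesssim \sum_{\substack{\mu_2 \, : \, \mu_2 \ll \mu_1 \\ \lambda \sim \mu_1}} \Big(\frac{\mu_2}{\mu_1}\Big)^{\gamma} \| P_{\mu_1} A \|_{\dot B^{p,(2,s_1)}} \| P_{\mu_2} C \|_{\dot B^{q,(2,s_2)}}, $$
    $$ \sum_{\substack{\lambda \, : \, \mu_2 \sim \mu_1 \\ \lambda \lesssim \mu_i}} \lambda^{s_3 - n(\frac{1}{2} - \frac{1}{r}) - \sigma} \| P_\lambda( P_{\mu_1} A \cdot P_{\mu_2} C ) \|_{L^r} \lesssim \| P_{\mu_1} A \|_{\dot B^{p,(2,s_1)}} \| P_{\mu_2} C \|_{\dot B^{q,(2,s_2)}}, $$
    That (45) follows from these three estimates is a simple consequence of Young's inequality and Cauchy-Schwarz respectively. These estimates, in turn, are all a consequence of the single fixed frequency bound:
    $$ \lambda^{s_3 - n(\frac{1}{2} - \frac{1}{r}) - \sigma} \| P_\lambda( P_{\mu_1} A \cdot P_{\mu_2} C ) \|_{L^r} \lesssim \Big(\frac{\lambda}{\max\{\mu_i\}}\Big)^{\gamma} \min\Big\{ \Big(\frac{\mu_1}{\mu_2}\Big)^{\gamma}, \Big(\frac{\mu_2}{\mu_1}\Big)^{\gamma} \Big\} \| P_{\mu_1} A \|_{\dot B^{p,(2,s_1)}} \| P_{\mu_2} C \|_{\dot B^{q,(2,s_2)}}. \tag{53} $$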
    The proof of (53) is a simple matter of Hölder's and Bernstein's inequalities, and counting weights. There are three cases corresponding to the three summing estimates above. In the first case, we assume that $\lambda \lesssim \mu_1 \sim \mu_2$. Since (53) is scale invariant, we may assume in this case that both $\mu_i \sim 1$. Using now Hölder's inequality, which is permissible by (50), followed by the Bernstein inequality, we have that: $$ \| P_\lambda( P_{\mu_1} A \cdot P_{\mu_2} C ) \|_{L^r} \lesssim \lambda^{n(\frac{1}{p} + \frac{1}{q} - \frac{1}{r})} \| P_{\mu_1} A \|_{L^p} \| P_{\mu_2} C \|_{L^q}. $$ Multiplying this last estimate by the weight $\lambda^{s_3 - n(\frac{1}{2} - \frac{1}{r}) - \sigma}$ we arrive at the bound:
    $$ (\text{L.H.S.})(53) \lesssim \lambda^{n(\frac{1}{p} + \frac{1}{q}) + s_3 - \sigma - \frac{n}{2}} \| P_{\mu_1} A \|_{L^p} \| P_{\mu_2} C \|_{L^q}. $$ Then (53) follows in this case from the definition of $\gamma$ and the fact that $\mu_i \sim 1$.
    The other two cases, which correspond to $\mu_1 \ll \mu_2$ or vice versa, are similar, so it suffices to consider the first. In this case we rescale to $\mu_2 \sim \lambda \sim 1$. In the case where $r < q$ we set $\frac{1}{\tilde p} = \frac{1}{r} - \frac{1}{q}$, and we again use Hölder and Bernstein to estimate:
    $$ \| P_\lambda( P_{\mu_1} A \cdot P_{\mu_2} C ) \|_{L^r} \lesssim \mu_1^{n(\frac{1}{p} - \frac{1}{\tilde p})} \| P_{\mu_1} A \|_{L^p} \| P_{\mu_2} C \|_{L^q}. $$ If it is the case that $q \leq r$, then we simply estimate: $$ \| P_\lambda( P_{\mu_1} A \cdot P_{\mu_2} C ) \|_{L^r} \lesssim \mu_1^{\frac{n}{p}} \| P_{\mu_1} A \|_{L^p} \| P_{\mu_2} C \|_{L^q}. $$ In either case, the claim (53) follows from the definition of $\gamma$. This completes the proof of (45).
Before continuing on, let us note here a slight refinement of the Besov norms (41) and the embedding (45). This involves taking into account functions which live at frequency $\lesssim 1$. If we let $\langle D_x \rangle$ denote the multiplier with symbol $(1 + |\xi|^2)^{\frac{1}{2}}$, then we form the low frequency spaces:
$$ \|A\|_{\dot B_{2,10n}^{p,(q,s)}} = \| \langle D_x \rangle^{10n} A \|_{\dot B_2^{p,(q,s)}}, \tag{54} $$
with a similar definition for the $\ell^1$ version $\dot B_{1,10n}^{p,(q,s)}$. By a straightforward adaptation of the previous argument, it is easy to see that the embedding (45) is equally valid for these low frequency spaces. We leave the details to the reader.
It will also be necessary for us to perform various dyadic decompositions with respect to the angular frequency variable. For each fixed direction $\omega$ in the frequency plane $\mathbb{R}^n_\xi$, we decompose the unit sphere $S^{n-1}_\xi$ into dyadic conical regions:
( ω , θ ) = { η S ξ n 1 | ( ω , η ) θ } , (55)
where $\theta \in \{ \frac{\pi}{2} 2^{-i} \ | \ i \in \mathbb{Z}, \ i \geq 0 \}$. Here we will not bother to fix the constant in the $\sim$ notation used to define the regions (55), but we will let it change from line to line as we have done for the spatial multipliers above. We also define a smooth partition of unity adapted to these regions, which we label by $b_\theta^\omega$. These can always be chosen (e.g. by defining them on a larger sphere and then rescaling) so that they satisfy the differential bounds:
| ( ω ξ ) ω k p 1 b θ ω | 1 , | ( ω ξ ) k p 1 b θ ω | θ k ,
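where the implicit constants depend on $k$ but are uniform in $\theta$. In particular, if we define the multipliers ${}^\omega\Pi_\theta$ by $\widehat{{}^\omega\Pi_\theta A} = b_\theta^\omega \widehat{A}$, then the operators ${}^\omega\Pi_\theta P_\mu$ are bounded on all $L^p$ spaces uniformly in $\mu$ and $\theta$. In fact, the following refinement of the inequality (37) holds, which we also call Bernstein:
$$ \| {}^\omega\Pi_\theta P_\mu A \|_{L^p} \lesssim \mu^{n(\frac{1}{q} - \frac{1}{p})} \theta^{(n-1)(\frac{1}{q} - \frac{1}{p})} \|A\|_{L^q}. \tag{56} $$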
In all of the above inequalities, we have kept ω   as a fixed directional value.
However, it will also be necessary for us to have an account of how our multipliers depend on this parameter. In particular, we will need to have bounds for the operators $\nabla_\omega \, {}^\omega\Pi_\theta$. This is easily achieved by differentiating the associated multiplier.
In fact, one has the bounds for fixed $\xi$:
$$ | \nabla_\omega^k b_\theta^\omega | \lesssim \theta^{-k}. \tag{57} $$
The way we shall express this bound in calculations is through the following heuristic operator identity:
ω k ω Π θ θ k ω Π θ , (58)
which we shall take to mean that the left hand side satisfies all $L^p$ space bounds as the right hand side. Notice that this relation has a preferred direction (left to right).
In practice, this means that we have the bound (56) for the operator on the left hand side of (58) with the added factor of $\theta^{-k}$.
Finally, let us end this section by making the following conventions. Firstly, it will be convenient for us at times to write $P_\mu A = A_\mu$ for a localized object.
This should not be confused with the $\mu^{th}$ component of $A$ in the case that it is a one-form. This should usually be clear from context. Secondly, it will be necessary for us to ensure that certain of our multipliers have real symbol so that they respect the subalgebra $\mathfrak{g}(m) \subseteq M(m \times m)$. This will be done by taking their real part which simply symmetrizes their (real) symbols. In particular, we will denote this by: $$ \Re( {}^\omega\Pi_\theta ) = {}^\omega\overline{\Pi}_\theta. $$ Thirdly, we use the following bulleted notation for the sum of various cutoffs over a given range:
$$ P_{\bullet < c} = \sum_{\mu < c} P_\mu, \qquad {}^\omega\Pi_{\bullet < c} = \sum_{\theta < c} {}^\omega\Pi_\theta, $$
etc. We will also use the notation $A_{\bullet < c}$ etc. for these operators applied to tensors.
Finally, we will set aside a special notation here for cutting off on angular sectors whose width depends on the frequency:
ω Π ¯ ( σ ) = μ ω Π ¯ μ σ < P μ . (59)
Notice that this multiplier does not satisfy good bounds of the form (57). However, it can be dealt with using the Littlewood-Paley equivalence (38) if there is a little extra room left to sum over fixed angular dyadics. This ends our description of the basic analysis we will use in this paper.

5 Gauge construction for the initial data; Reduction to a second order system and the main a-priori estimate

We now begin our proof of the main Theorem 1.1. As we have already mentioned, one of the central components of the proof is to construct a stable set of “elliptic coordinates” on the bundle $V$. The way we will do this is to construct the desired frame on the $t = 0$ slice $\mathbb{R}^n \times \mathfrak{g}$. We will then show that this frame propagates as the system evolves by solving an auxiliary set of equations for the gauge potentials which respects the chosen frame automatically. The regularity of this system of equations will be provided in the usual translation invariant Sobolev spaces. We then show that our auxiliary solution is in fact a true solution to the system of equations (3)-(4) by employing a bootstrapping procedure which is similar to that used in the proof of Lemma 3.2. The desired gauge covariant regularity, which is contained in the statement of Theorem 1.1, will be provided by a comparison principle. These constructions are all local in time and are more or less standard. We have included them here for the convenience of the reader, the sake of completeness, and the fact that some of the formulas we develop along the way will be central to what we do in later sections.
With the local theory established, the global conclusion of Theorem 1.1 will then be a consequence of a certain a-priori estimate on the (usual Sobolev) energy of solutions to (3)-(4) in the gauge we construct. Our task will then be to show that this a-priori estimate is true for all solutions to yet another system of auxiliary equations, this time for the curvature. This can be considered to be the main estimate of the paper. The proof turns out to be quite involved, and will occupy the rest of the paper. In the next section, we will prove the main a-priori estimate itself with the help of a certain family of microlocalized space-time (Strichartz) estimates for solutions to second order covariant wave equations on bundles with connections satisfying estimates consistent with our bootstrapping assumptions.
The breakdown here is based on the Smith-Tataru parametrix idea (see [10]), which allows one to reduce the needed Strichartz estimates to proving them for a suitable family of approximate frequency localized fundamental solutions. Our rendition of this is essentially equivalent to that contained in the paper [8].
Finally, in the remaining sections of the paper we develop the linear theory. This is by far the most involved portion of the present work, and requires the construction of some fairly sophisticated oscillatory integrals and microlocal function spaces.
This material can be read without reference to the non-linear problem, as long as one is familiar with the algebraic and analytic assumptions we make on the geometry (frequency localized connection). While these come from the non-linear problem, they are of course a bit more general.

5.1 Construction of the initial frame, and the comparison principle

The first thing we do here is to put the initial connection $\underline{D}$ into the Coulomb gauge. Via the Uhlenbeck Lemma 3.1, we simply need to show that: $$ \| \underline{F} \|_{L^{\frac{n}{2}}} \lesssim \epsilon_0, $$ for $\epsilon_0$ the sufficiently small parameter from line (10) (which should not be confused with the small constant from Lemma 3.1 above). This $L^p$ bound follows immediately from the gauge covariant Sobolev embedding (for $n$ even): $$ \dot H_A^{\frac{n-4}{2}} \subseteq L^{\frac{n}{2}}, $$ which in turn follows from repeated application of the usual single derivative Sobolev embeddings and the Kato estimate (which follows immediately from (1) and Cauchy-Schwarz):
$$ \big| d |F| \big| \leq | \underline{D} F |, \tag{60} $$
where F   is any section to × g   and the absolute norm | |   is taken with respect to the Killing inner product  13 .
We may now assume that we are dealing with an initial data set:
$$ ( \underline{F}(0), \underline{D}(0), E(0) ), \tag{61} $$
for the system which is such that the connection $\underline{D}(0) = d + \underline{A}(0)$ satisfies the elliptic div-curl system:
$$ d\underline{A}(0) + [ \underline{A}(0), \underline{A}(0) ] = \underline{F}(0), \qquad d^* \underline{A}(0) = 0, \tag{62} $$
and such that the compatibility condition (5) is satisfied. Furthermore, from (19) we have the bounds: $$ \| \underline{A}(0) \|_{L^n} \lesssim \epsilon_0. $$ We will now use this last bound to show that the initial data set (61) is in fact in the classical Sobolev spaces $\dot H^k$. This is a consequence of the following:
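Lemma 5.1 (Comparison principle for Sobolev norms on $\mathbb{R}^n$). Let $\underline{D} = d + \underline{A}$ be a connection on $\mathbb{R}^n$, with $n$ even, such that one has the potential and curvature bounds:
$$ \| \underline{A} \|_{L^n}, \ \| \underline{F} \|_{\dot H_A^{\frac{n-4}{2}}} \leq \varepsilon_0, \tag{63} $$
$$ \| \underline{F} \|_{\dot H_A^{k}} \leq M_k, \tag{64} $$
for $\frac{n-4}{2} < k$. Suppose also that $\underline{D}$ is in the gauge $d^* \underline{A} = 0$. Then we have the critical classical Sobolev bounds:
$$ \| \underline{F} \|_{\dot H^{\frac{n-4}{2}}} \leq C \varepsilon_0, \tag{65} $$
$$ \| \underline{A} \|_{\dot H^{\frac{n-2}{2}}} \leq C \varepsilon_0. \tag{66} $$
Furthermore, if $G$ is any $\mathfrak{g}$ valued function, then we have the following inductive comparison of norms:
$$ C^{-1}( M_{\frac{n-4}{2}}, \ldots, M_{k-1} ) \| G \|_{H^{[k_*,k]}} \leq \| G \|_{H_A^{[k_*,k]}}, \tag{67} $$
$$ \qquad \leq C( M_{\frac{n-4}{2}}, \ldots, M_{k-1} ) \| G \|_{H^{[k_*,k]}}, \tag{68} $$
where the index $k_*$ is such that $\frac{n-4}{2} \leq k_* < n$, and where we have set:
$$ \| G \|^2_{H_A^{[k_*,k]}} = \sum_{k_* \leq m \leq k} \| \underline{D}^m G \|^2_{L^2}, $$ to be the interval gauge-covariant Sobolev space. We use an analogous definition for the space $H^{[k_*,k]}$. We also have the non-inductive equivalence between $\nabla_x \underline{A}$ and $\underline{F}$:
$$ N_k^{-1} \| \underline{A} \|_{\dot H^k} \leq \| \underline{F} \|_{\dot H^{k-1}} \leq N_k \| \underline{A} \|_{\dot H^k}, \tag{69} $$
where $N_k$, $\frac{n-2}{2} \leq k$, is a set of constants which depends only on the dimension and not on the constant $\varepsilon_0$ once it is sufficiently small. In particular, combining all of this, we have the following classical Sobolev bounds on the pair $(\underline{A}, \underline{F})$:
$$ \| \underline{F} \|_{\dot H^k} \leq C( M_{\frac{n-4}{2}}, \ldots, M_{k-1} ) M_k, \tag{70} $$
$$ \| \underline{A} \|_{\dot H^{k+1}} \leq C( M_{\frac{n-4}{2}}, \ldots, M_{k-1} ) M_k, \tag{71} $$
for $\frac{n-4}{2} < k$.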
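  • Proof of Lemma 5.1. The proof will be accomplished via a series of inductions.
    In what follows, we will assume the estimate (69), whose proof follows from simple analysis of the elliptic system (62) in Besov spaces of the kind $\dot B_2^{p,(2,s)}$. We will perform many reductions like this in the sequel so we leave this one to the reader.
    The first step is to prove the critical classical Sobolev bound (65). Note that the potential bounds (66) follow from this and (69). The inductive hypothesis that we make here is that:
    $$ \| \nabla_x^l \underline{D}^m \underline{F} \|_{L^{\frac{n}{k}}} \lesssim \varepsilon_0, \tag{72} $$
    for $k = l + m + 2 \leq \frac{n}{2}$ whenever $0 \leq l \leq l_0$. Notice that this hypothesis is verified for $l_0 = 0$ on account of the assumption (63) and by applying the Kato estimate (60) in conjunction with integer Sobolev embeddings. Notice also that by applying Riesz operator estimates to the elliptic system (62), and using the product estimate (40) along with Sobolev embeddings we have the bounds:
    \begin{align*}
    \| \nabla_x^{l+1} \underline{A} \|_{L^{\frac{n}{k}}} &\lesssim \| \nabla_x^l \underline{F} \|_{L^{\frac{n}{k}}} + \| \nabla_x^l ( [ \underline{A}, \underline{A} ] ) \|_{L^{\frac{n}{k}}}, \\
    &\lesssim \| \nabla_x^l \underline{F} \|_{L^{\frac{n}{k}}} + \| \nabla_x^l \underline{A} \|_{L^{\frac{n}{k-1}}} \| \underline{A} \|_{L^n}, \\
    &\lesssim \| \nabla_x^l \underline{F} \|_{L^{\frac{n}{k}}} + \varepsilon_0 \| \nabla_x^{l+1} \underline{A} \|_{L^{\frac{n}{k}}}.
    \end{align*}
    Therefore, the inductive hypothesis (72) may be assumed to also contain the estimate:
    $$ \| \nabla_x^{l+1} \underline{A} \|_{L^{\frac{n}{k}}} \lesssim \varepsilon_0, \tag{73} $$
    for $k = l + 2 \leq \frac{n}{2}$ and $l \leq l_0$. To show that (72) holds for all $l \leq l_0 + 1$, we start with $l \leq l_0$ and we compute using (40) and Sobolev embeddings that:
    \begin{align*}
    \| \nabla_x^{l+1} \underline{D}^{m-1} \underline{F} \|_{L^{\frac{n}{k}}}
    &\lesssim \| \nabla_x^l \underline{D}^m \underline{F} \|_{L^{\frac{n}{k}}} + \| \nabla_x^l ( [ \underline{A}, \underline{D}^{m-1} \underline{F} ] ) \|_{L^{\frac{n}{k}}}, \\
    &\lesssim \varepsilon_0 + \| \nabla_x^l \underline{A} \|_{L^{\frac{n}{l+1}}} \| \underline{D}^{m-1} \underline{F} \|_{L^{\frac{n}{k-l-1}}} + \| \underline{A} \|_{L^n} \| \nabla_x^l \underline{D}^{m-1} \underline{F} \|_{L^{\frac{n}{k-1}}}, \\
    &\lesssim \varepsilon_0 + \varepsilon_0 \| \nabla_x^{l+1} \underline{D}^{m-1} \underline{F} \|_{L^{\frac{n}{k}}}.
    \end{align*}
    This inductively establishes (72) and hence proves (65).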
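    As a sanity check on the exponents used in this induction, the following sketch records the Hölder pairings and the absorption step. The shorthand $X$ and the constant $C$ are introduced here for illustration only, and the finiteness of $X$ is assumed from the qualitative regularity of the data.

```latex
% Hölder exponents in the product estimate, with k = l + m + 2:
\[
  \frac{l+1}{n} + \frac{k-l-1}{n} = \frac{k}{n},
  \qquad
  \frac{1}{n} + \frac{k-1}{n} = \frac{k}{n} .
\]
% Absorption: write
%   X = \|\nabla_x^{l+1} \underline{D}^{m-1} \underline{F}\|_{L^{n/k}}
% and suppose X < \infty. Then for some constant C,
\[
  X \leq C \varepsilon_0 + C \varepsilon_0 X
  \quad\Longrightarrow\quad
  X \leq \frac{C \varepsilon_0}{1 - C \varepsilon_0} \leq 2 C \varepsilon_0
  \quad \text{whenever } C \varepsilon_0 \leq \tfrac12 .
\]
```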
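    We now show (68). We first deal with the leftmost inequality. Our inductive hypothesis here is that:
    $$ \| \nabla_x^l \underline{D}^m G \|_{L^2} \leq C( M_{\frac{n-4}{2}}, \ldots, M_{k-1} ) \| G \|_{H_A^{[k_*,k]}}, \tag{74} $$
    where $l + m = k_0$ for $k_0 = k$ or $k_0 = k_*$, and for all $l \leq l_0$. To compute $\nabla_x^{l+1} \underline{D}^{m-1} G$ in terms of this, we need to split into cases depending on whether or not $l + 1 < \frac{n}{2}$.
    In the former case we compute that:
    \begin{align}
    \| \nabla_x^{l+1} \underline{D}^{m-1} G \|_{L^2}
    &\lesssim \| \nabla_x^l \underline{D}^m G \|_{L^2} + \| \nabla_x^l ( [ \underline{A}, \underline{D}^{m-1} G ] ) \|_{L^2}, \notag \\
    &\lesssim C( M_{\frac{n-4}{2}}, \ldots, M_{k-1} ) \| G \|_{H_A^{[k_*,k]}} + \| \nabla_x^l \underline{A} \|_{L^{\frac{n}{l+1}}} \| \underline{D}^{m-1} G \|_{L^{\frac{2n}{n-2l-2}}} \tag{75} \\
    &\qquad + \| \underline{A} \|_{L^n} \| \nabla_x^l \underline{D}^{m-1} G \|_{L^{\frac{2n}{n-2}}}, \notag \\
    &\lesssim C( M_{\frac{n-4}{2}}, \ldots, M_{k-1} ) \| G \|_{H_A^{[k_*,k]}} + \varepsilon_0 \| \nabla_x^{l+1} \underline{D}^{m-1} G \|_{L^2}. \notag
    \end{align}
    In the case where $\frac{n}{2} - 1 \leq l$ we have the inequality:
    \begin{align}
    \| \nabla_x^{l+1} \underline{D}^{m-1} G \|_{L^2}
    &\lesssim C( M_{\frac{n-4}{2}}, \ldots, M_{k-1} ) \| G \|_{H_A^{[k_*,k]}} + \| \nabla_x^l \underline{A} \|_{L^{\frac{2n}{n-2}}} \| \underline{D}^{m-1} G \|_{L^n} \tag{76} \\
    &\qquad + \| \underline{A} \|_{L^n} \| \nabla_x^l \underline{D}^{m-1} G \|_{L^{\frac{2n}{n-2}}}, \notag \\
    &\lesssim C( M_{\frac{n-4}{2}}, \ldots, M_{k-1} ) \| G \|_{H_A^{[k_*,k]}} + \| \nabla_x^{l+1} \underline{A} \|_{L^2} \| \underline{D}^{\frac{n-2}{2} + m - 1} G \|_{L^2} \tag{77} \\
    &\qquad + \varepsilon_0 \| \nabla_x^{l+1} \underline{D}^{m-1} G \|_{L^2}. \notag
    \end{align}
    Notice that this last line above used the $L^2$ to $L^n$ gauge covariant Sobolev embedding.
    To bound the second term on this line, notice that since $\frac{n}{2} - 1 \leq l$ and we must assume that $1 \leq m$ for the induction to make sense, we have the bound $k_* \leq \frac{n-2}{2} + m - 1 \leq k$. This allows us to bound: $$ \| \underline{D}^{\frac{n-2}{2} + m - 1} G \|_{L^2} \leq \| G \|_{H_A^{[k_*,k]}}. $$ Furthermore, by placing all of these calculations within an induction on the value of $k$ itself, and using the bound (69) while noting that $l \leq k - 1$ we may assume the bound: $$ \| \nabla_x^{l+1} \underline{A} \|_{L^2} \lesssim \| \nabla_x^l \underline{F} \|_{L^2} \leq C( M_{\frac{n-4}{2}}, \ldots, M_{k-1} ). $$ This completes our inductive proof of (74) above.
    The proof of the second inequality on line (68) follows from reasoning similar to that used to prove (74) inductively. We leave it to the reader to set up the inductive hypothesis for this case and work out the details. This completes our proof of Lemma 5.1.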
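Using Lemma 5.1 and the assumed bounds (10)-(11), we may assume that our initial data (61) is such that:
$$ \| ( \underline{F}(0), E(0) ) \|_{\dot H^{\frac{n-4}{2}}} \leq \tilde\varepsilon_0, \tag{78} $$
$$ \| \underline{A}(0) \|_{\dot H^{\frac{n-2}{2}}} \leq \tilde\varepsilon_0, \tag{79} $$
$$ \| ( \underline{F}(0), E(0) ) \|_{\dot H^{k}} \leq \widetilde{M}_k, \tag{80} $$
$$ \| \underline{A}(0) \|_{\dot H^{k+1}} \leq \widetilde{M}_k, \tag{81} $$
where $\frac{n-4}{2} < k$, and the $\widetilde{M}_k$ depend on the $M_k$ in some inductive way, and we also have that $\tilde\varepsilon_0 \leq C \epsilon_0$ for some constant $C$ which depends only on the dimension.
Here $M_k$ and $\epsilon_0$ refer to the constants introduced in the statement of Theorem 1.1.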
We now decompose the initial field strength $\{ E_i(0) \}$ in a way that will be consistent with the evolution of the system (3)-(4). This will be convenient for discussing the Cauchy problem. Our first step is to define the following elliptic quantity:
$$ \Delta a_0 = - [ a_i, \nabla^i a_0 ] + [ a^i, E_i ], \tag{82} $$
where for convenience we have labeled $\{ a_i \} = \{ \underline{A}_i(0) \}$. We then define the auxiliary set of quantities:
$$ \dot a_i = E_i + \nabla_i a_0 - [ a_0, a_i ]. \tag{83} $$
Notice that as an immediate consequence of the constraint equation (5), the form of (82), and the Coulomb condition $d^* a = 0$, we have the secondary Coulomb condition:
$$ \nabla^i \dot a_i = 0. $$ This will turn out to be important in a moment. Now, from the definition of the quantities (82) and (83), the already established bounds (78)-(81), and several rounds of Sobolev embeddings, we have the following differential bounds on the quantities $\{ \dot a_i \}$:
$$ \| \dot a \|_{\dot H^{\frac{n-4}{2}}} \leq \tilde\varepsilon_0, \tag{84} $$
$$ \| \dot a \|_{\dot H^{k}} \leq \widetilde{M}_k, \tag{85} $$
for $\frac{n-4}{2} < k$ (after a possible slight redefinition of the constants $\tilde\varepsilon_0, \widetilde{M}_k$ via multiplication by some fixed dimensional constant). We now define a Coulomb admissible initial data set to be a collection $( \underline{F}, \{ a_i \}, \{ \dot a_i \} )$ such that:
$$ da + [ a, a ] = \underline{F}, \qquad d^* a = 0, \qquad d^* \dot a = 0. \tag{86} $$
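The secondary Coulomb condition $d^* \dot a = 0$ appearing in (86) can be checked by a short computation. The sketch below is only a consistency check under stated assumptions: it takes the curvature convention $F_{0i} = \partial_t A_i - \nabla_i A_0 + [A_0, A_i]$, reads (83) as $\dot a_i = E_i + \nabla_i a_0 - [a_0, a_i]$, reads (82) as $\Delta a_0 = -[a_i, \nabla^i a_0] + [a^i, E_i]$, and takes the constraint (5) in the form $\nabla^i E_i + [a^i, E_i] = 0$. These signs are inferred from the surrounding formulas and are not quoted from the earlier sections.

```latex
\begin{align*}
\nabla^i \dot a_i
  &= \nabla^i E_i + \Delta a_0 - [\nabla^i a_0, a_i] - [a_0, \nabla^i a_i] \\
  % Coulomb condition \nabla^i a_i = 0 removes the last term;
  % the constraint replaces \nabla^i E_i by -[a^i, E_i]:
  &= -[a^i, E_i] + \Delta a_0 + [a_i, \nabla^i a_0] \\
  % inserting the assumed form of (82) for \Delta a_0:
  &= -[a^i, E_i] - [a_i, \nabla^i a_0] + [a^i, E_i] + [a_i, \nabla^i a_0]
   = 0 .
\end{align*}
```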
Notice that $\underline{F}$ is uniquely determined by the $\{ a_i \}$, therefore we do not need to include it in the definition of initial data. We define the Coulomb-Cauchy problem to be the task of finding a space-time connection $D = d + A$ such that it satisfies the set of equations:
$$ D^\beta F_{\alpha\beta} = 0, \tag{87a} $$
$$ dA + [ A, A ] = F, \tag{87b} $$
$$ d^* \underline{A} = 0, \tag{87c} $$
and such that at time $t = 0$ we have that:
$$ \underline{A}(0) = a, \qquad \partial_t \underline{A}(0) = \dot a. \tag{88} $$
We remark briefly here that solving the problem (86)-(88) provides a solution to the original Yang-Mills system (3)-(4) with Cauchy data (61) as long as we define the collection $\{ \dot a \}$ according to the equations (82)-(83). All we need to do to prove this assertion is to show that: $$ F_{0i}(0) = E_i. $$ Our proof of this follows the same bootstrapping philosophy used to show the equivalence (34) in the proof of Lemma 3.2. The claim will follow at once from equation (83) if we can first establish that: $$ A_0(0) = a_0, $$ where $a_0$ is defined by (82). Now, from the system of equations (87) we have that the quantity $A_0$ is elliptically determined by the equation:
$$ \Delta_{\underline{A}} A_0 = [ A^i, \partial_t A_i ], \tag{89} $$
where $\Delta_{\underline{A}} = \underline{D}^i \underline{D}_i$ is the gauge covariant Laplacian. Furthermore, by using equation (83) as the definition of $E_i$, and substituting this into equation (82), we have that the quantity $a_0$ is elliptically determined by the equation:
$$ \Delta_a a_0 = [ a^i, \dot a_i ]. \tag{90} $$
By subtracting (90) from (89) at time $t = 0$ we have that: $$ \Delta_a ( A_0(0) - a_0 ) = 0. $$ Uniqueness now comes from the Sobolev type estimate: $$ \| B \|_{L^n} \lesssim \| \Delta_a B \|_{L^{\frac{n}{3}}}, $$ which follows from the smallness condition (79) and the usual Sobolev estimates.
The details of the proof are left to the reader.
Keeping the equivalence we have just established in mind, and the first inequality contained in the comparison estimates (68) and (69), we have reduced the demonstration of Theorem 1.1 to showing the following non-gauge covariant global regularity theorem:
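Theorem 5.2 (Global regularity in the Coulomb gauge). Let the number of spatial dimensions be $6 \leq n$. Then there exists a set of constants $\tilde\varepsilon_0$ and $C, C_k$, $\frac{n-2}{2} \leq k$, such that if $( \underline{F}, \{ a_i \}, \{ \dot a_i \} )$ is a Coulomb admissible initial data set that satisfies the bounds:
$$ \| \underline{F} \|_{\dot H^{\frac{n-4}{2}}} \leq \tilde\varepsilon_0, \qquad \| \underline{F} \|_{\dot H^{k}} \leq \widetilde{M}_k, \tag{91a} $$
$$ \| a \|_{\dot H^{\frac{n-2}{2}}} \leq \tilde\varepsilon_0, \qquad \| \dot a \|_{\dot H^{\frac{n-4}{2}}} \leq \tilde\varepsilon_0, \tag{91b} $$
$$ \| a \|_{\dot H^{k}} \leq \widetilde{M}_{k-1}, \qquad \| \dot a \|_{\dot H^{k-1}} \leq \widetilde{M}_{k-1}, \tag{91c} $$
then if $\tilde\varepsilon_0$ is sufficiently small there exists a unique global solution $\{ A_\alpha \}$ to the system (87) with this initial data. Furthermore, this solution obeys the following differential estimates:
$$ \| A \|_{\dot H^{\frac{n-2}{2}}} \leq C \tilde\varepsilon_0, \qquad \| \partial_t A \|_{\dot H^{\frac{n-4}{2}}} \leq C \tilde\varepsilon_0, \tag{92a} $$
$$ \| A \|_{\dot H^{k}} \leq C_{k-1} \widetilde{M}_{k-1}, \qquad \| \partial_t A \|_{\dot H^{k-1}} \leq C_{k-1} \widetilde{M}_{k-1}. \tag{92b} $$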

5.2 Local existence in the Coulomb gauge

Our goal here is to reduce the proof of Theorem 5.2 to a certain a-priori estimate involving the energies of the field strength $F$. This amounts to proving a local existence theorem for the system (86)-(88). The proof of this will allow us to set up a system of equations for the Coulomb potentials $\{ A_\alpha \}$ which will be of central importance in the sequel. We will show that:
Proposition 5.3 (Local existence in the Coulomb gauge). Let the number of spatial dimensions be $6 \leq n$. Then for every set of constants $C, C_k$, $\frac{n-2}{2} \leq k$, there exists an $\tilde\varepsilon_0$ which only depends on $C$ with the following property: If $( \{ a_i \}, \{ \dot a_i \} )$ is any set of Coulomb admissible initial data such that:
$$ \| a \|_{\dot H^{\frac{n-2}{2}}} \leq C \tilde\varepsilon_0, \qquad \| \dot a \|_{\dot H^{\frac{n-4}{2}}} \leq C \tilde\varepsilon_0, \tag{93} $$
$$ \| a \|_{\dot H^{k}} \leq C_{k-1} \widetilde{M}_{k-1}, \qquad \| \dot a \|_{\dot H^{k-1}} \leq C_{k-1} \widetilde{M}_{k-1}, \tag{94} $$
then for $\tilde\varepsilon_0$ sufficiently small there exists a time $0 < T^*$, which only depends on the quantities $C \tilde\varepsilon_0$, $C_{\frac{n}{2}} \widetilde{M}_{\frac{n}{2}}$, $C_{\frac{n+2}{2}} \widetilde{M}_{\frac{n+2}{2}}$, such that there exists a unique local solution $\{ A_\alpha \}$ to the system (86)-(88) with this set of initial data. Furthermore, on the time interval $[0, T^*]$ one has the following norm bounds on the collection $\{ A_\alpha \}$:
$$ \sup_{0 \leq t \leq T^*} \| A(t) \|_{\dot H^{\frac{n-2}{2}}} \leq 2 C \tilde\varepsilon_0, \tag{95} $$
$$ \sup_{0 \leq t \leq T^*} \| \partial_t A(t) \|_{\dot H^{\frac{n-4}{2}}} \leq 2 C \tilde\varepsilon_0, \tag{96} $$
$$ \sup_{0 \leq t \leq T^*} \| A(t) \|_{\dot H^{k}} \leq 2 C_{k-1} \widetilde{M}_{k-1}, \tag{97} $$
$$ \sup_{0 \leq t \leq T^*} \| \partial_t A(t) \|_{\dot H^{k-1}} \leq 2 C_{k-1} \widetilde{M}_{k-1}. \tag{98} $$
  • Proof of Proposition 5.3. The proof will be reduced to the standard procedure of energy estimates and Sobolev embeddings. Since we are assuming that the initial data has enough smoothness to cover $L^\infty$, this is more or less trivial. We start by plugging (87b) directly into (87a). After an application of the gauge condition $d^* \underline{A} = 0$ this yields a general second order system of equations which we write as:
    $$ \Box A_\beta = - \nabla_\beta \partial_t A_0 + [ \partial_t A_0, A_\beta ] - [ A^\alpha, \nabla_\alpha A_\beta ] - [ A^\alpha, F_{\alpha\beta} ]. \tag{99} $$
    To split this into a hyperbolic-elliptic system, we decompose the set of equations (99) into its spatial and temporal parts, and apply the Leray projection: $$ \mathcal{P} = \frac{d^* d}{\Delta} = \Big( I - \frac{\nabla_x (\mathrm{div})}{\Delta} \Big), $$ to the resulting spatial equation. After some rearrangement of the elliptic equation this yields the coupled system:
    $$ \Box A_i = \mathcal{P} \big( [ \partial_t A_0, A_i ] - [ A^\alpha, \nabla_\alpha A_i ] - [ A^\alpha, F_{\alpha i} ] \big), \tag{100a} $$
    $$ \Delta A_0 = - [ A_i, \nabla^i A_0 ] + [ A^i, F_{0i} ]. \tag{100b} $$
    The above system of equations can be solved locally in time with the bounds (95)-(98) through a Picard iteration scheme. We leave this as an exercise for the reader. Notice that the projection $\mathcal{P}$ can be removed in energy estimates because it is an order zero operator. Notice also that even though the smallness of the time interval $[0, T^*]$ will not make up for estimates involving the elliptic equation (100b), the critical smallness assumption (93) allows one to obtain the bootstrapping estimates (95)-(98) if one uses Littlewood-Paley decompositions and paraproducts to make sure at least one factor in the non-linearity on the right hand side of (100b) goes in a critical space. This same comment goes for bounding terms on the right hand side of (100a) in energy estimates when one is bootstrapping the higher norm constants $C_k \widetilde{M}_k$ for $\frac{n+2}{2} < k$. Again, the smallness in time makes up for the size of the first few constants $C \tilde\varepsilon_0$, $C_{\frac{n}{2}} \widetilde{M}_{\frac{n}{2}}$, $C_{\frac{n+2}{2}} \widetilde{M}_{\frac{n+2}{2}}$.
    Having now produced a local solution to the system (100) with the desired properties, we have shown the conclusion of Proposition 5.3 once we show that the spatial potentials which solve (100a) are in fact solutions to the spatial portion of the original second order equation (99). This will be shown through our general strategy of coming up with a quantity which yields a critical elliptic bootstrapping estimate which will force it to be zero. This time, the desired quantity turns out to be related to the conservation of electric charge for the Yang-Mills equations. We first write the spatial portion of the non-linearity on the right hand side of (99) as a vector:
    $$ N_i = - \nabla_i \partial_t A_0 + [ \partial_t A_0, A_i ] - [ A^\alpha, \nabla_\alpha A_i ] - [ A^\alpha, F_{\alpha i} ]. \tag{101} $$
    We would like to show that the equations (100) force $(I - \mathcal{P}) N = 0$. We compute that: $$ (I - \mathcal{P}) N = \nabla_x \Delta^{-1} \big( - \partial_t \Delta A_0 - \nabla^i \nabla^\alpha [ A_\alpha, A_i ] - \nabla^i [ A^\alpha, F_{\alpha i} ] \big). $$ Now, using the equation (100) to compute $\partial_t \Delta A_0$, this last line becomes:
    \begin{align*}
    (I - \mathcal{P}) N &= \nabla_x \Delta^{-1} \big( - \nabla^\beta \nabla^\alpha [ A_\alpha, A_\beta ] - \nabla^\beta [ A^\alpha, F_{\alpha\beta} ] \big), \\
    &= - \nabla_x \Delta^{-1} \nabla^\beta [ A^\alpha, F_{\alpha\beta} ],
    \end{align*}
    where the equality of the second line follows on account of skew symmetry. We now isolate the interesting portion of the term on the right hand side of the last line above and use the Jacobi identity to compute that:
    \begin{align*}
    \nabla^\beta [ A^\alpha, F_{\alpha\beta} ] &= - \frac{1}{2} [ (dA)^{\alpha\beta}, F_{\alpha\beta} ] + [ A^\alpha, \nabla^\beta F_{\alpha\beta} ], \\
    &= \frac{1}{2} [ [ A^\alpha, A^\beta ], F_{\alpha\beta} ] - [ A^\alpha, [ A^\beta, F_{\alpha\beta} ] ] + [ A^\alpha, D^\beta F_{\alpha\beta} ], \\
    &= [ A^\alpha, D^\beta F_{\alpha\beta} ].
    \end{align*}
    Now, again using equation (100b) we have that $D^\beta F_{0\beta} = 0$. Furthermore, from equation (100a) we also have the identity: $$ D^\beta F_{i\beta} = (I - \mathcal{P})_i N. $$ Combining all of this, we have the following equality:
    $$ (I - \mathcal{P}) N = - \nabla_x \Delta^{-1} [ A^i, (I - \mathcal{P})_i N ]. \tag{102} $$
    Finally, from the form of (101) and the already established estimates (95)-(98) as well as the boundedness properties of the operator $(I - \mathcal{P})$ we have that: $$ \| (I - \mathcal{P}) N(t) \|_{L^{\frac{n}{3}}} < \infty, $$ for all times $t \in [0, T^*]$. However, from the smallness bound (95), the identity (102), and a Sobolev embedding we also have the fixed time bound:
    \begin{align*}
    \| (I - \mathcal{P}) N \|_{L^{\frac{n}{3}}} &\lesssim \| [ A^i, (I - \mathcal{P})_i N ] \|_{L^{\frac{n}{4}}}, \\
    &\lesssim \| A \|_{L^n} \| (I - \mathcal{P}) N \|_{L^{\frac{n}{3}}}, \\
    &\lesssim \tilde\varepsilon_0 \| (I - \mathcal{P}) N \|_{L^{\frac{n}{3}}}.
    \end{align*}
    Therefore, for $\tilde\varepsilon_0$ sufficiently small we see that we must have $(I - \mathcal{P}) N = 0$ as was to be shown. This completes the proof that the solution to (100) is a solution to the general system (99), and therefore ends our proof of Proposition 5.3.
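The final step of the proof above is an absorption argument. The following sketch makes the exponents explicit. The constant $C$ and the shorthand $X$ are introduced here for illustration only, and the mapping property of $\nabla_x \Delta^{-1}$ is taken to be the standard Hardy-Littlewood-Sobolev estimate for an operator of order $-1$.

```latex
% \nabla_x \Delta^{-1} has order -1, so it maps L^{n/4} to L^{n/3};
% this needs 1 < n/4, which holds since 6 \leq n:
\[
  \frac{4}{n} - \frac{1}{n} = \frac{3}{n} ,
  \qquad
  \text{and by H\"older:} \quad \frac{1}{n} + \frac{3}{n} = \frac{4}{n} .
\]
% With X = \|(I - \mathcal{P}) N\|_{L^{n/3}} < \infty:
\[
  X \leq C \tilde\varepsilon_0 X
  \quad\Longrightarrow\quad
  (1 - C \tilde\varepsilon_0) X \leq 0
  \quad\Longrightarrow\quad
  X = 0 \quad \text{if } C \tilde\varepsilon_0 < 1 .
\]
```

The finiteness of $X$ is essential here: without it the inequality $X \leq C \tilde\varepsilon_0 X$ carries no information.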

5.3 The second order curvature equation and the main a-priori estimate

Through a repeated application of the local existence Proposition 5.3, we may reduce the proof of the global existence Theorem 5.2 to showing a-priori that any solution to the Coulomb system (86)-(88) which exists on a time interval $[0, T^*]$ (possibly large!), and such that it obeys both the initial data bounds (91a)-(91c), as well as the evolution bounds (95)-(98), in fact obeys the improved evolution bounds (92a)-(92b).
Now, it turns out that the system of equations (100) is by itself not so well adapted to the proof of such an a-priori estimate. This stems from the fact that these equations are not covariant. This manifests itself in the projection operator $\mathcal{P}$. If one were to try to write the hyperbolic system of equations (100a) in terms of the covariant wave operator $\Box_A$ and a source term, the projection operator, which is non-local, would end up causing problems in various commutator terms. The way around this is to not only consider the system (100), but to also work directly with the curvature in the equations (87a)-(87b). This is possible because we are not attempting to set up an iteration scheme, but are instead merely trying to prove an a-priori estimate, so we may safely assume that the quantities we work with satisfy any equation which results from the system (87). We will in fact use several such elliptic and hyperbolic equations. As a very rough description of this kind of philosophy, the reader may find it useful to keep in mind the following schematic:
\begin{align*}
\text{Weak control of the connection} \ &\Longrightarrow \ \text{Improved control of the curvature}, \\
&\Longrightarrow \ \text{Improved control of the connection}, \\
&\Longrightarrow \ \text{Weak control of the connection for longer times}.
\end{align*}
To provide the improved control on the curvature, we will employ a second order equation for it. To derive this, we write the Bianchi identities (87b) in the form (4) and then contract this expression with the covariant derivative $D$. This yields the equations:
\begin{align}
0 &= D^\gamma ( D_\alpha F_{\beta\gamma} + D_\gamma F_{\alpha\beta} + D_\beta F_{\gamma\alpha} ), \notag \\
&= \Box_A F_{\alpha\beta} + [ F^{\gamma}{}_{\alpha}, F_{\beta\gamma} ] + [ F^{\gamma}{}_{\beta}, F_{\gamma\alpha} ], \notag \\
&= \Box_A F_{\alpha\beta} - 2 [ F_{\alpha}{}^{\gamma}, F_{\beta\gamma} ]. \tag{103}
\end{align}
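The passage between the lines of this computation can be sketched as follows. The sketch assumes the conventions $D_\alpha G = \nabla_\alpha G + [A_\alpha, G]$ and $F_{\alpha\beta} = \nabla_\alpha A_\beta - \nabla_\beta A_\alpha + [A_\alpha, A_\beta]$, so that $[D_\gamma, D_\alpha] G = [F_{\gamma\alpha}, G]$ by the Jacobi identity. The explicit index placement is a choice made here for definiteness.

```latex
% Commute derivatives and use the field equation D^\gamma F_{\beta\gamma} = 0:
\begin{align*}
D^\gamma D_\alpha F_{\beta\gamma}
  &= D_\alpha D^\gamma F_{\beta\gamma} + [F^{\gamma}{}_{\alpha}, F_{\beta\gamma}]
   = [F^{\gamma}{}_{\alpha}, F_{\beta\gamma}] , \\
D^\gamma D_\beta F_{\gamma\alpha}
  &= D_\beta D^\gamma F_{\gamma\alpha} + [F^{\gamma}{}_{\beta}, F_{\gamma\alpha}]
   = [F^{\gamma}{}_{\beta}, F_{\gamma\alpha}] .
\end{align*}
% The two commutator terms agree, by skew symmetry of F and of the bracket:
\begin{align*}
[F^{\gamma}{}_{\beta}, F_{\gamma\alpha}]
  = [F_{\gamma\beta}, F^{\gamma}{}_{\alpha}]
  = -[F^{\gamma}{}_{\alpha}, F_{\gamma\beta}]
  = [F^{\gamma}{}_{\alpha}, F_{\beta\gamma}] .
\end{align*}
% Hence, using F^{\gamma}{}_{\alpha} = -F_{\alpha}{}^{\gamma}:
\begin{align*}
0 = \Box_A F_{\alpha\beta} + 2 [F^{\gamma}{}_{\alpha}, F_{\beta\gamma}]
  = \Box_A F_{\alpha\beta} - 2 [F_{\alpha}{}^{\gamma}, F_{\beta\gamma}] .
\end{align*}
```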
In addition to (103) and the system (100), it will also be useful for us to employ a secondary elliptic equation. This will be for the quantity $\partial_t A_0$:
$$ \partial_t A_0 = \Delta^{-1} \nabla^i \big( - [ A_i, \partial_t A_0 ] + [ A_0, \partial_t A_i ] + [ A^\alpha, F_{i\alpha} ] \big). \tag{104} $$
This equation follows immediately from differentiating the equation (100b) with respect to time, and then applying the conservation law $\nabla^\alpha [ A^\beta, F_{\alpha\beta} ] = 0$ to the resulting expression. We are now ready to state our main a-priori estimate:
Theorem 5.4 (Main a-priori estimate for the curvature of the Coulomb system (86)-(88)). Let the space-time connection $D = d + A$ on $\mathbb{R}^{(n+1)}$, where $6 \leq n$, be given such that it satisfies the following system of equations on some finite time interval $[0, T^*]$:
$$ \Box_A F_{\alpha\beta} = 2 [ F_{\alpha}{}^{\gamma}, F_{\beta\gamma} ], \tag{105a} $$
$$ dA + [ A, A ] = F, \tag{105b} $$
$$ d^* \underline{A} = 0, \tag{105c} $$
$$ \Box A_i = \mathcal{P} \big( [ \partial_t A_0, A_i ] - [ A^\alpha, \nabla_\alpha A_i ] - [ A^\alpha, F_{\alpha i} ] \big), \tag{105d} $$
$$ \Delta A_0 = \nabla^i [ A_0, A_i ] + [ A^i, F_{0i} ], \tag{105e} $$
$$ \Delta ( \partial_t A_0 ) = \nabla^i \big( - [ A_i, ( \partial_t A_0 ) ] + [ A_0, \partial_t A_i ] + [ A^\alpha, F_{i\alpha} ] \big). \tag{105f} $$
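Here we have split $\{ A_\alpha \} = ( A_0, \{ \underline{A}_i \} )$. Let there also be given a set of fixed constants $L, N, L_k, N_k$ for the indices $\frac{n-2}{2} \leq k$, such that at time $t = 0$ we have the initial bounds:
$$ \| F(0) \|_{\dot H^{\frac{n-4}{2}}} \leq \tilde\varepsilon_0, \qquad \| \partial_t F(0) \|_{\dot H^{\frac{n-6}{2}}} \leq L \tilde\varepsilon_0, \tag{106} $$
$$ \| F(0) \|_{\dot H^{k}} \leq \widetilde{M}_k, \qquad \| \partial_t F(0) \|_{\dot H^{k-1}} \leq L_k \widetilde{M}_k. \tag{107} $$
Then if $\tilde\varepsilon_0$ is chosen to be sufficiently small on line (106) above, there exists a collection of constants $C, C_k$, which only depend on the dimension and the collection $L, N, L_k, N_k$ but not on $\tilde\varepsilon_0$ (once it is small enough) or the collection $\widetilde{M}_k$, such that if at later times we have the bounds:
$$ \sup_{0 \leq t \leq T^*} \| \underline{A}(t) \|_{\dot H^{\frac{n-2}{2}}} \leq 2 N C \tilde\varepsilon_0, \qquad \sup_{0 \leq t \leq T^*} \| \partial_t \underline{A}(t) \|_{\dot H^{\frac{n-4}{2}}} \leq 2 N C \tilde\varepsilon_0, \tag{108} $$
$$ \sup_{0 \leq t \leq T^*} \| F(t) \|_{\dot H^{\frac{n-4}{2}}} \leq 2 N C \tilde\varepsilon_0, \qquad \sup_{0 \leq t \leq T^*} \| \partial_t F(t) \|_{\dot H^{\frac{n-6}{2}}} \leq 2 N C \tilde\varepsilon_0, \tag{109} $$
$$ \sup_{0 \leq t \leq T^*} \| \underline{A}(t) \|_{\dot H^{k}} < \infty, \qquad \sup_{0 \leq t \leq T^*} \| \partial_t \underline{A}(t) \|_{\dot H^{k-1}} < \infty, \tag{110} $$
$$ \sup_{0 \leq t \leq T^*} \| F(t) \|_{\dot H^{k}} < \infty, \qquad \sup_{0 \leq t \leq T^*} \| \partial_t F(t) \|_{\dot H^{k-1}} < \infty, \tag{111} $$
the following set of stronger bounds holds:
$$ \sup_{0 \leq t \leq T^*} \| F(t) \|_{\dot H^{\frac{n-4}{2}}} \leq N^{-1} C \tilde\varepsilon_0, \qquad \sup_{0 \leq t \leq T^*} \| \partial_t F(t) \|_{\dot H^{\frac{n-6}{2}}} \leq N^{-1} C \tilde\varepsilon_0, \tag{112} $$
$$ \sup_{0 \leq t \leq T^*} \| F(t) \|_{\dot H^{k}} \leq N_k^{-1} C_k \widetilde{M}_k, \qquad \sup_{0 \leq t \leq T^*} \| \partial_t F(t) \|_{\dot H^{k-1}} \leq N_k^{-1} C_k \widetilde{M}_k. \tag{113} $$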
Remark 5.5. The bounds involving (111) and (113) express the fact that the control we provide here is at the critical level. That is, bounds on the higher norms are completely irrelevant in the bootstrapping procedure, except for the fact that they are finite. The only place where we need higher norms to accomplish anything here is in the local existence theorem 5.3. The way we will prove Theorem 5.4 is by first establishing control at the critical level through a bootstrapping argument. The control of the higher norms will then be provided through an a-priori estimate whose proof is essentially identical to that of the critical bootstrapping bound, and will therefore be left to the reader.
Remark 5.6. The reader may find it useful to have a brief description of the various constants appearing in Proposition 5.3 and Theorem 5.4. The constants $L , L_k , N , N_k$ are input into the a-priori machine, and these are meant to cover the transition to and from estimates involving the connection and curvature. The set $L , L_k$ is only needed to deal with the initial data. This is necessary because we must have an account of bounds involving the quantities $\partial_t F$. The other constants $N , N_k$ govern comparison type estimates similar to (69). The constants $C , C_k$ are byproducts of the proof of the a-priori estimate itself. These will very much depend on $L , L_k , N , N_k$, but are independent of $\widetilde{\varepsilon}_0$ when it is small enough. Finally, the main adjusting parameter $\widetilde{\varepsilon}_0$ has two important roles. First and foremost, it is needed to prove the a-priori estimate itself. However, it has a second purpose which is also crucial, and that is to keep the dependence of $C , C_k$ on $L , L_k , N , N_k$ from creating a feedback loop. Specifically, we need our various comparison estimates to have constants which do not depend on the large constants $C , C_k$. Since the critical energy of the curvature can grow by a factor of $C$, we will need the extra influence of $\widetilde{\varepsilon}_0$ to make sure this does not cycle back to $L , L_k , N , N_k$.
  • Proof that Theorem 5.4 and Proposition 5.3 together imply Theorem 5.2. The proof here is more or less straightforward and will be largely left to the reader. Everything relies on two sets of estimates. The first has to do with showing that the initial data bounds (91a)–(91c) imply the initial control assumed in (106)–(107). This is just a matter of bounding the time derivatives $\partial_t F$, and is why we have included the set of auxiliary constants $L , L_k$. Using now the field equations (3)–(4) (we have not included them in the system (105), but we may assume they hold), we have the general schematic identity at time $t = 0$:
    \[ \partial_t F(0) = \nabla_x F(0) + [ a , F(0) ] , \tag{114} \]
    where we have generically set $a = ( a_0 , \{ a_i \} )$. Therefore, to establish the control (106)–(107), we only need to prove the estimates:
    \[ \| [ a , F(0) ] \|_{\dot{H}^{\frac{n-6}{2}}} \lesssim \widetilde{\varepsilon}_0 , \qquad \| [ a , F(0) ] \|_{\dot{H}^{k-1}} \lesssim \widetilde{M}_k , \tag{115} \]
    assuming that the bounds (91a)–(91c) hold. Notice that while these initial norms do not contain estimates on the quantities $E_i = F_{0i}(0)$, we originally had bounds on this from lines (78)–(81) above. Also, any estimates on $a_0$ which are needed in this process can be provided, for instance, through the equation (82). Since the proof of estimate (115) is a straightforward paraproduct type bound, similar to what was done in the proof of Lemma 5.1 above, we leave it to the interested reader (see below for more details).
    The second set of estimates we need to prove here has to do with the relationship between the later time norms (108)–(113) and the ones (95)–(98) contained in the proof of the local existence proposition. Since our global regularity proof is by iteration of this latter result, we need to first show that the weak control (95)–(98) implies the bootstrapping assumption (108)–(111). This assertion is trivial for norms involving the potentials (108) and (110), as well as the larger norms (111) just by applying the definition of curvature. Therefore, we only need to see that (95)–(96) implies the bounds (109). We first establish the desired bounds for the undifferentiated term $F$.
    For the spatial curvature and potentials $( \underline{F} , \underline{A} )$, this is just the comparison principle from line (69), and we can assume that the constants $N , N_k$ are large enough to cover that case. To deal with potentials involving time derivatives of $\underline{A}$ or the temporal potential $A_0$ we have the following general calculation:
    \begin{align*}
    \| F \|_{\dot{H}^{\frac{n-4}{2}}} &\lesssim \| d A \|_{\dot{H}^{\frac{n-4}{2}}} + \| [ A , A ] \|_{\dot{H}^{\frac{n-4}{2}}} , \\
    &\lesssim \| A \|_{\dot{H}^{\frac{n-2}{2}}} + \| A \|^2_{\dot{H}^{\frac{n-2}{2}}} ,
    \end{align*}
    where the quadratic term follows from paraproduct decompositions, Hölder's inequality, and Sobolev embeddings as in the proof of (69). The desired result now follows from the smallness of $\widetilde{\varepsilon}_0$ and the fact that we may assume the constant $C$ in line (95) does not depend on it. To establish the estimate for the quantity $\partial_t F$, we use the later time version of the identity (114), as well as the estimate which is responsible for the first estimate on line (115) above, which is: $\| [ A , F ] \|_{\dot{H}^{\frac{n-6}{2}}} \lesssim \| A \|_{\dot{H}^{\frac{n-2}{2}}} \| F \|_{\dot{H}^{\frac{n-4}{2}}}$. By again assuming that the constant $\widetilde{\varepsilon}_0$ is sufficiently small with respect to $C$ we have the desired bound.
    The final thing we need to do here is to show that the improved bounds (112)–(113) imply the assumed estimates of the local existence theorem (93)–(94). This is again a comparison estimate either identical or similar to (69). Note that we only need to bound the spatial portion of the potentials $\{ A_\alpha \}$ and their time derivatives. The undifferentiated terms can be bounded directly by (69) because we may assume that the constant $\widetilde{\varepsilon}_0$ on line (112) is small enough that the critical estimate (63) holds.
    To deal with the time differentiated potentials $\partial_t \underline{A}$, one can simply differentiate the Hodge system (105b)–(105c) with respect to time and then apply essentially the same proof as was used to produce (69). The details of this are left to the ambitious reader.

4 Strictly speaking, this is not entirely true. This can be seen from the fact that if one looks at the localized commutator [ A , P ] P λ   , where the connection { A α }   is assumed to be of much lower frequency than λ   , then this is essentially a “derivative falls on low” interaction which can be handled with the available Strichartz estimates in 5 n   dimensions. We have elected instead to follow a formulation of the YM system which is based on the curvature because of its conceptual appeal. However, in lower dimensions, it may be best to work directly with the connection { A α }   , in part to help mitigate bad H i g h × H i g h L o w   frequency interactions which come from the quadratic term on the right hand side of  103 .

6 Proof of the Main Bootstrapping Estimate

We are now ready to begin our proof of the (improved) main critical a-priori estimate (112). In order to do this, we will need to bootstrap in a function space which is much stronger than the energy type spaces of Theorem 5.4. This will cost us another bootstrapping procedure, but this will be easy to set up because it will be clear the extra norms we create have good bounds on some very small initial time interval due to the fact that we are assuming the higher energy boundedness (111) and that these norms involve integration in time. All of the norms we construct here will be of Strichartz type, with an $\ell^2$ Besov structure in the spatial variable.
It will also be necessary for us to include an angular square sum structure in many of the estimates we prove. This may seem a bit odd at first because we will not need such bounds directly in our proof of Theorem 5.4. These extra bounds will instead be used to give the fine control which is needed to handle the linear part of the problem. At each fixed frequency, we form the square-sum norms:
\[ \| P_\lambda A \|_{SL^p} = \sup_{\theta \lesssim 1} \Big( \sum_{\varphi \, : \, \omega_0 \in \Gamma_\varphi} \| {}^{\omega_0}\Pi_\theta P_\lambda A \|^2_{L^p} \Big)^{\frac{1}{2}} , \tag{116} \]
where $\Gamma_\varphi$ is taken to be a (uniformly) finitely overlapping set of spherical caps such that $S^{n-1} = \bigcup_\varphi \Gamma_\varphi$, each of which has size $\sim \theta$ and is constructed in such a way that one has the bounds: $\big( \sum_{\varphi \, : \, \omega_0 \in \Gamma_\varphi} \| {}^{\omega_0}\Pi_\theta P_\lambda A \|^2_{L^2} \big)^{\frac{1}{2}} \lesssim \| P_\lambda A \|_{L^2}$, independent of the size of $\theta$. Here we take the condition $\omega_0 \in \Gamma_\varphi$ to mean that the variable $\omega_0$ is essentially in the center of that spherical cap $\Gamma_\varphi$. The exact placement is not essential. Notice that by construction, these norms are contained in the usual $L^p$ spaces because we can assume that one set of angular sectors we are summing over contains the whole sphere.
Next, using the same prescription that defined the Besov spaces (41), we define the angular square sum Besov spaces to be:
\[ \| A \|_{S\dot{B}_2^{p,(q,s)}} = \Big( \sum_\lambda \lambda^{2s - 2n ( \frac{1}{q} - \frac{1}{p} )} \| P_\lambda A \|^2_{SL^p} \Big)^{\frac{1}{2}} . \tag{117} \]
We now define the main dispersive component of the function spaces we will be working with. These are $L^2_t$ based Strichartz spaces, built on the norms (117) and (41). These are all defined on a finite time interval $[0, T^*]$, which will for the most part be left implicit:
\begin{align*}
\| A \|_{\dot{Z}^s} &= \| A \|_{L^2_t ( \dot{B}_2^{\frac{2(n-1)}{n-3} , ( 2 , s + \frac{1}{2} )} ) [0, T^*]} , \tag{118} \\
\| A \|_{S\dot{Z}^s} &= \| A \|_{L^2_t ( S\dot{B}_2^{\frac{2(n-1)}{n-3} , ( 2 , s + \frac{1}{2} )} ) [0, T^*]} . \tag{119}
\end{align*}
To gain some intuition about these spaces, notice that they all scale like $L^\infty ( \dot{H}^s )$ under the change of variables (7). Therefore, they all scale like solutions to the wave equations with $\dot{H}^s$ initial data. Indeed, these spaces are consistent with the available range of Strichartz estimates for the usual scalar wave equation, and it will be our goal to show that one has bounds on the norm (119) for solutions of the covariant wave operator on the left hand side of (105).
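To make the consistency with Strichartz estimates concrete, here is a sketch of the exponent count at a fixed dyadic frequency $\lambda$. It assumes the standard endpoint Strichartz estimate for the free wave equation in dimensions $n \geq 4$, and that the weights in the Besov norms of line (41) have the same form as those on line (117); the function $u$ below is a hypothetical free wave used only for illustration. Set $p = \frac{2(n-1)}{n-3}$, so that $\frac{1}{2} - \frac{1}{p} = \frac{1}{n-1}$. The pair $(2, p)$ is wave admissible and the associated Sobolev index is $\sigma = \frac{n}{2} - \frac{n}{p} - \frac{1}{2} = \frac{n}{n-1} - \frac{1}{2}$, so that:
\begin{align*}
\| P_\lambda u \|_{L^2_t ( L^p )} &\lesssim \lambda^{\sigma} \big( \| P_\lambda u(0) \|_{L^2} + \lambda^{-1} \| P_\lambda \partial_t u(0) \|_{L^2} \big) , \\
\lambda^{( s + \frac{1}{2} ) - n ( \frac{1}{2} - \frac{1}{p} )} \| P_\lambda u \|_{L^2_t ( L^p )} &= \lambda^{s - \sigma} \| P_\lambda u \|_{L^2_t ( L^p )} \lesssim \lambda^{s} \| P_\lambda u(0) \|_{L^2} + \lambda^{s-1} \| P_\lambda \partial_t u(0) \|_{L^2} .
\end{align*}
Square summing over $\lambda$ bounds the norm of $u$ in $\dot{Z}^s$ by the $\dot{H}^s \times \dot{H}^{s-1}$ norm of the data, which is the sense in which the extra $\frac{1}{2}$ in the Besov index matches the scaling of $L^\infty ( \dot{H}^s )$.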
To form the overall spaces we will bootstrap in, we add the above space-time norms to the energy type norms used in the statement of the main a-priori estimate 5.4:
\begin{align*}
\dot{X}^s &= L^\infty [0, T^*] ( \dot{H}^s ) \cap S\dot{Z}^s , \tag{120} \\
\dot{Y}^s &= L^\infty [0, T^*] ( \dot{H}^s ) \cap \dot{Z}^s . \tag{121}
\end{align*}
It will also be necessary for us to estimate time derivatives in the above spaces. Since differentiation will decrease the scaling by one unit, we use the norms: $\| A \|_{\dot{X}^s \times \partial_t^{-1} ( \dot{X}^{s-1} )} = \| A \|_{\dot{X}^s} + \| \partial_t A \|_{\dot{X}^{s-1}}$, with an analogous definition for $\dot{Y}^s \times \partial_t^{-1} ( \dot{Y}^{s-1} )$.

6.1 Proof of the Critical Bootstrapping Estimate

We are now ready to prove the critical component of Theorem 5.4 (we will now change notation from $\widetilde{\varepsilon}_0$ back to $\varepsilon_0$):
Proposition 6.1 (Critical bootstrapping estimate in the $\dot{X}^s$ spaces). Let the dimension be $6 \leq n$. Let the collection $( F , A )$ be a space-time connection curvature pair which obeys the general smoothness conditions (110)–(111), and which satisfies the system of equations (105). Let $L , N$ be given constants such that one has the initial bounds:
\[ \| F(0) \|_{\dot{H}^{\frac{n-4}{2}}} + \| \partial_t F(0) \|_{\dot{H}^{\frac{n-6}{2}}} \leq L \varepsilon_0 . \tag{122} \]
Then there exists a constant $C$ which depends only on $L , N$ and the dimension such that if one has the bootstrapping bounds on a time interval $[0, T^*]$:
\begin{align*}
\sup_{0 \leq t \leq T^*} \| ( \underline{A} , \partial_t \underline{A} )(t) \|_{\dot{H}^{\frac{n-2}{2}} \times \dot{H}^{\frac{n-4}{2}}} &\leq 2 N C \varepsilon_0 , \tag{123} \\
\| F \|_{\dot{X}^{\frac{n-4}{2}} \times \partial_t^{-1} ( \dot{X}^{\frac{n-6}{2}} )} &\leq 2 N C \varepsilon_0 , \tag{124}
\end{align*}
then for $\varepsilon_0$ sufficiently small, we have the following improved bounds on the same time interval $[0, T^*]$:
\[ \| F \|_{\dot{X}^{\frac{n-4}{2}} \times \partial_t^{-1} ( \dot{X}^{\frac{n-6}{2}} )} \leq N^{-1} C \varepsilon_0 . \tag{125} \]
The proof of Proposition  6.1 will be accomplished through the standard use of Littlewood-Paley paraproduct decompositions, and the application of space-time estimates. All of the linear bounds we will need are provided by the following, which is the main technical result of this work:
Theorem 6.2 (Gauge covariant angular square-sum Strichartz estimates for Yang-Mills connections). Let the number of dimensions be such that $6 \leq n$, and let $d + \widetilde{\underline{A}}$ be a space-time connection defined on all of Minkowski space $\mathbb{R}^{n+1}$ such that it satisfies the conditions:
\begin{align*}
\widetilde{\underline{A}}_0 &= 0 && \text{(Temporal Gauge)} , \tag{126a} \\
d^* \widetilde{\underline{A}} &= 0 && \text{(Coulomb Gauge)} , \tag{126b} \\
P_{|\xi| \ll |\tau|} ( \widetilde{\underline{A}} ) &= 0 && \text{(Space-time frequency localization)} , \tag{126c} \\
\| \widetilde{\underline{A}} \|_{\dot{X}^{\frac{n-2}{2}}} &\leq \mathcal{E} && \text{(Space-time estimate)} , \tag{126d} \\
\Box \widetilde{\underline{A}} &= \widetilde{P} ( [ B , H ] ) && \text{(Structure equation)} , \tag{126e} \\
\| ( B , H ) \|_{\dot{Y}^{\frac{n-2}{2}} \times \dot{Y}^{\frac{n-4}{2}}} &\leq \mathcal{E} && \text{(Structure estimates)} , \tag{126f}
\end{align*}
where $( B , H )$ is an auxiliary set of $\mathfrak{g}$ valued functions defined on all of $\mathbb{R}^{n+1}$. The symbol $\widetilde{P}$ denotes a composition of the Leray projection $P$ with some frequency cutoff function which is bounded on all mixed Lebesgue-Besov spaces of the type $L^p ( \dot{B}_2^{p,(2,s)} )$. We assume also that the connection $d + \widetilde{\underline{A}}$ satisfies the general smoothness bounds:
\[ \sup_{-T^* \leq t \leq T^*} \| \widetilde{\underline{A}}(t) \|_{\dot{H}^k} < \infty , \qquad \frac{n-2}{2} < k , \tag{127} \]
for each fixed time $T^*$. Let now $F$ be any other $\mathfrak{g}$ valued function which satisfies the inhomogeneous equation:
\[ \Box_{\widetilde{\underline{A}}} F = G , \tag{128} \]
with Cauchy data:
\[ F(0) = f , \qquad \partial_t F(0) = \dot{f} . \tag{129} \]
Then if the constant $\mathcal{E}$ in lines (126d) and (126f) above is sufficiently small, one has the following family of space-time estimates:
\[ \| F \|_{\dot{X}^{\frac{n-4}{2}} \times \partial_t^{-1} ( \dot{X}^{\frac{n-6}{2}} )} \lesssim \| ( f , \dot{f} ) \|_{\dot{H}^{\frac{n-4}{2}} \times \dot{H}^{\frac{n-6}{2}}} + \| G \|_{L^1 ( \dot{H}^{\frac{n-6}{2}} )} . \tag{130} \]
Remark 6.3. In the above Theorem, the Strichartz estimates have a preferred scaling. This is consistent with the application we have in mind. In general, it is not possible to prove estimates of the type (130) for higher Sobolev indices without assuming that the connection $\widetilde{\underline{A}}$ itself has more regularity. In the case where $\widetilde{\underline{A}}$ does have better regularity, a proof similar to that given after Proposition 7.1 below can be used to show estimates for those higher norms.
  • Proof of Proposition 6.1. The proof requires another bootstrapping argument. This will be done on subintervals $[0, T^{**}] \subseteq [0, T^*]$. Using the initial bounds (122) and the general smoothness assumption (111) we may assume that for $T^{**} \ll 1$ we have the estimate (124). Therefore, it suffices to prove that (124) implies (125) on all subintervals $[0, T^{**}]$. But this is just the same as proving Proposition 6.1 itself since $T^*$ is arbitrary.
    The proof will be accomplished in a series of steps. Our first goal will be to derive $\dot{X}^s$ and $\dot{Z}^s$ type bounds for the connection $d + A$. We will then split this connection into a sum of two pieces $d + \widetilde{A} + \widetilde{\widetilde{A}}$, where the potentials $\widetilde{A}$ satisfy the criteria of Theorem 6.2 and the remainder term $\widetilde{\widetilde{A}}$ obeys the better $L^1 ( L^\infty )$ space-time estimate. This is enough to be able to write the equation (105a) schematically as:
    \[ \Box_{\widetilde{A}} F = [ \nabla \widetilde{\widetilde{A}} , F ] + [ \widetilde{\widetilde{A}} , \nabla F ] + [ \widetilde{A} , [ \widetilde{\widetilde{A}} , F ] ] + [ \widetilde{\widetilde{A}} , [ \widetilde{\widetilde{A}} , F ] ] + [ F , F ] . \tag{131} \]
    One is then in a position where Theorem 6.2 can be applied directly, and we only need to choose our constant $C$ depending on $L , N$ and the constant which appears on line (130). The key thing is that the dangerous term $[ \widetilde{\widetilde{A}} , \nabla F ]$ can safely be put in $L^1 ( \dot{H}^{\frac{n-6}{2}} )$ using the improved space-time estimate for $\widetilde{\widetilde{A}}$ and the energy estimate for $F$. Throughout the proof we will use the usual splitting $\{ A_\alpha \} = ( A_0 , \underline{A} )$ of $d + A$ into its temporal and spatial components.
  • $\dot{X}^{\frac{n-2}{2}}$ estimates for $\{ \underline{A}_i \}$. Here we write $\underline{F}$ for the spatial components of the field strength and use the Hodge system (105b)–(105c) to write schematically:
    \[ \underline{A} = \nabla_x \Delta^{-1} ( \underline{F} + [ \underline{A} , \underline{A} ] ) . \tag{132} \]
    As a preliminary first step, we will show that the potentials $\{ \underline{A}_i \}$ can be estimated in $\dot{Y}^{\frac{n-2}{2}}$ with bounds comparable to $N C \varepsilon_0$. Now, it is not too difficult to see directly from the definition that: $\nabla_x \Delta^{-1} : \dot{Y}^{\frac{n-4}{2}} \hookrightarrow \dot{Y}^{\frac{n-2}{2}}$. Next, notice that we have the bilinear estimate:
    \[ \nabla_x \Delta^{-1} : L^\infty ( \dot{H}^{\frac{n-2}{2}} ) \cdot \dot{Y}^{\frac{n-2}{2}} \hookrightarrow \dot{Y}^{\frac{n-2}{2}} , \tag{133} \]
    which follows from integrating the bound (45). Note that in this case, the range restrictions (46)–(50) are easily satisfied. Therefore, using the critical bounds (123) as well as the general smoothness criteria (110) (so that in particular we may assume the $\dot{Y}^{\frac{n-2}{2}}$ norm of $\{ \underline{A}_i \}$ is finite) we see we may absorb the quadratic term on the right hand side of (132) onto the left in the desired estimates.
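    Our task is now to show the more restrictive $\dot{X}^{\frac{n-2}{2}}$ estimates for the potentials $\{ \underline{A}_i \}$. Again from the definition, it is not hard to see that we have the embedding: $\nabla_x \Delta^{-1} : \dot{X}^{\frac{n-4}{2}} \hookrightarrow \dot{X}^{\frac{n-2}{2}}$. Therefore, keeping in mind the $\dot{Y}^{\frac{n-2}{2}}$ bounds just proved, we see that it suffices to be able to show the bilinear estimate:
    \[ \nabla_x \Delta^{-1} : \dot{Y}^{\frac{n-2}{2}} \cdot \dot{Y}^{\frac{n-2}{2}} \hookrightarrow \dot{X}^{\frac{n-2}{2}} . \tag{134} \]
    The main issue here is, of course, to be able to include the angular square sum structure. This turns out to be very simple. Notice first that by orthogonality and the general nesting (42) we have the inclusion (on any finite time interval $[0, T^*]$): $L^\infty ( \dot{H}^{\frac{n-2}{2}} ) \cap L^2 ( \dot{H}^{\frac{n-1}{2}} ) \subseteq \dot{X}^{\frac{n-2}{2}}$. Therefore, to conclude (134) we see that it suffices to be able to show the set of bilinear estimates:
    \begin{align*}
    \nabla_x \Delta^{-1} : \dot{Y}^{\frac{n-2}{2}} \cdot \dot{Y}^{\frac{n-2}{2}} &\hookrightarrow L^\infty ( \dot{H}^{\frac{n-2}{2}} ) , \tag{135} \\
    \nabla_x \Delta^{-1} : \dot{Y}^{\frac{n-2}{2}} \cdot \dot{Y}^{\frac{n-2}{2}} &\hookrightarrow L^2 ( \dot{H}^{\frac{n-1}{2}} ) . \tag{136}
    \end{align*}
    The first of these embeddings follows easily from: $\nabla_x \Delta^{-1} : L^\infty ( \dot{H}^{\frac{n-2}{2}} ) \cdot L^\infty ( \dot{H}^{\frac{n-2}{2}} ) \hookrightarrow L^\infty ( \dot{H}^{\frac{n-2}{2}} )$, which in turn follows directly from (45). The second estimate (136) above is more bilinear in nature. It follows from applying a trichotomy and then summing the following two fixed frequency bilinear inclusions:
    \begin{align*}
    \nabla_x \Delta^{-1} : P_\lambda \big( L^2 ( \dot{B}_2^{\frac{2(n-1)}{n-3} , ( 2 , \frac{n-1}{2} )} ) \big) \cdot P_\lambda \big( L^\infty ( \dot{H}^{\frac{n-2}{2}} ) \big) &\hookrightarrow P_\lambda \big( L^2 ( \dot{H}^{\frac{n-1}{2}} ) \big) , \tag{137} \\
    \nabla_x \Delta^{-1} : P_\lambda \big( L^2 ( \dot{B}_2^{\frac{2(n-1)}{n-3} , ( 2 , \frac{n-1}{2} )} ) \big) \cdot P_\lambda \big( L^\infty ( \dot{H}^{\frac{n-2}{2}} ) \big) &\hookrightarrow \Big( \frac{\mu}{\lambda} \Big)^{\delta} P_\mu \big( L^2 ( \dot{H}^{\frac{n-1}{2}} ) \big) , \tag{138}
    \end{align*}
    where we have set $\delta = n \big( \frac{n-2}{n-1} \big) - \frac{3}{2}$ to be the “gap” constant. The estimates (137)–(138) follow directly from the frequency localized bounds (51)–(52). Note that in this case, the various positivity conditions are satisfied.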
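    The value of the gap constant can be checked by a heuristic exponent count. The sketch below assumes only Hölder's and Bernstein's inequalities and that the Besov weights have the form shown on line (117); it is not the argument behind (51)–(52), and $u$, $v$ are hypothetical frequency $\lambda$ inputs of unit norm in the two spaces on the left of (138). Put $p = \frac{2(n-1)}{n-3}$ and $\frac{1}{r} = \frac{1}{p} + \frac{1}{2} = \frac{n-2}{n-1}$. Then:
    \begin{align*}
    \| P_\lambda u \|_{L^2 ( L^p )} &\leq \lambda^{- \frac{n-1}{2} + \frac{n}{n-1}} , \qquad \| P_\lambda v \|_{L^\infty ( L^2 )} \leq \lambda^{- \frac{n-2}{2}} , \\
    \mu^{\frac{n-1}{2}} \| \nabla_x \Delta^{-1} P_\mu ( P_\lambda u \cdot P_\lambda v ) \|_{L^2 ( L^2 )} &\lesssim \mu^{\frac{n-1}{2} - 1 + n ( \frac{1}{r} - \frac{1}{2} )} \| P_\lambda u \|_{L^2 ( L^p )} \| P_\lambda v \|_{L^\infty ( L^2 )} \\
    &\leq \mu^{n ( \frac{n-2}{n-1} ) - \frac{3}{2}} \, \lambda^{- ( n - \frac{3}{2} - \frac{n}{n-1} )} = \Big( \frac{\mu}{\lambda} \Big)^{\delta} ,
    \end{align*}
    where the last step uses $n - \frac{n}{n-1} = n \big( \frac{n-2}{n-1} \big)$. For $n = 6$ this gives $\delta = \frac{24}{5} - \frac{3}{2} = \frac{33}{10} > 0$, which is the positivity needed to sum over $\mu \lesssim \lambda$.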
  • $\dot{Y}^{\frac{n-2}{2}} \times \dot{Y}^{\frac{n-4}{2}}$ bounds for the pair $( A_0 , \partial_t A_0 )$. Our first step here is to deal with the variable $A_0$. We integrate equation (105e) and write it schematically as:
    \[ A_0 = \Delta^{-1} ( \nabla_x [ A_0 , \underline{A} ] + [ \underline{A} , F ] ) . \tag{139} \]
    The desired estimate now follows by constructing $A_0$ from scratch by iteration, using the already established estimates and bilinear embedding (133) and the following:
    \[ \Delta^{-1} : \dot{Y}^{\frac{n-2}{2}} \cdot \dot{Y}^{\frac{n-4}{2}} \hookrightarrow \dot{Y}^{\frac{n-2}{2}} . \tag{140} \]
    This last embedding follows in turn from the pair of estimates:
    \begin{align*}
    \Delta^{-1} : L^\infty ( \dot{H}^{\frac{n-2}{2}} ) \cdot L^\infty ( \dot{H}^{\frac{n-4}{2}} ) &\hookrightarrow L^\infty ( \dot{H}^{\frac{n-2}{2}} ) , \\
    \Delta^{-1} : L^2 ( \dot{B}_2^{\frac{2(n-1)}{n-3} , ( 2 , \frac{n-1}{2} )} ) \cdot L^\infty ( \dot{H}^{\frac{n-4}{2}} ) &\hookrightarrow L^2 ( \dot{B}_2^{\frac{2(n-1)}{n-3} , ( 2 , \frac{n-1}{2} )} ) .
    \end{align*}
    Both of these are easy consequences of (45) and we leave the numerology to the reader.
    To establish the $\dot{Y}^{\frac{n-4}{2}}$ bound for $\partial_t A_0$, we can use the equation (105f) to treat it as a separate variable. In that equation we have quantities of the form $\partial_t \underline{A}$. We can use the curvature equation (105b) to swap this for spatial derivatives as follows:
    \[ \partial_t \underline{A} = \nabla_x A_0 - [ A_0 , \underline{A} ] + F . \tag{141} \]
    This allows us to write schematically:
    \[ ( \partial_t A_0 ) = \nabla_x \Delta^{-1} ( [ A , ( \partial_t A_0 ) ] + [ A , \nabla_x A ] + [ A , [ A , A ] ] + [ A , F ] ) , \tag{142} \]
    where $A$ now denotes any of the full set of potentials $\{ A_\alpha \}$ which we have estimated in the space $\dot{Y}^{\frac{n-2}{2}}$. We may now iterate the equation (142) in the space $\dot{Y}^{\frac{n-4}{2}}$ to constructively obtain the desired bounds using the bilinear embedding: $\nabla_x \Delta^{-1} : \dot{Y}^{\frac{n-2}{2}} \cdot \dot{Y}^{\frac{n-4}{2}} \hookrightarrow \dot{Y}^{\frac{n-4}{2}}$, which follows from differentiating (140) above. Notice that the needed inclusion $[ A , A ] \in \dot{Y}^{\frac{n-4}{2}}$ follows, for instance, from differentiating the embedding (133).
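    The passage from (105f) to the schematic (142) can be spelled out in components. This sketch assumes that (105b) is normalized so that $F_{\alpha\beta} = \partial_\alpha A_\beta - \partial_\beta A_\alpha + [ A_\alpha , A_\beta ]$; with a different normalization of the bracket the constants change but not the structure. Taking $\alpha = 0$ and $\beta = i$ gives (141) in the form:
    \begin{align*}
    F_{0i} &= \partial_t A_i - \partial_i A_0 + [ A_0 , A_i ] \quad \Longrightarrow \quad \partial_t A_i = \partial_i A_0 - [ A_0 , A_i ] + F_{0i} , \\
    [ A_0 , \partial_t A_i ] &= [ A_0 , \partial_i A_0 ] - [ A_0 , [ A_0 , A_i ] ] + [ A_0 , F_{0i} ] .
    \end{align*}
    Substituting the second line into the middle term on the right hand side of (105f) produces exactly the three types of terms $[ A , \nabla_x A ]$, $[ A , [ A , A ] ]$ and $[ A , F ]$ which appear in (142).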
  • Splitting the spatial potentials. Our next goal is to split the spatial potentials $\{ \underline{A}_i \}$ into a sum of two pieces which are each more easily managed. This will be done using the “structure” equation (105d). Using the formula (141) to get rid of terms of the form $\partial_t \underline{A}$ on the right hand side of this equation, and using the various $\dot{Y}^s$ space embeddings we have just shown (on the time interval $[0, T^*]$), we may write this equation in the schematic form:
    \[ \Box \underline{A} = P ( [ B , H ] ) , \tag{143} \]
    where the quantities $( B , H )$ obey the estimate: $\| ( B , H ) \|_{\dot{Y}^{\frac{n-2}{2}} \times \dot{Y}^{\frac{n-4}{2}}} \lesssim N C \varepsilon_0$, where the implicit constant in the above inequality comes from the estimates just shown. Using Duhamel's principle and (sharp) time cutoffs, we now extend (143) to all possible times. This is done simply by writing:
    \[ \underline{A}(t) = \underline{A}^{(0)}(t) + \int_0^t \frac{\sin ( ( t - s ) \sqrt{-\Delta} )}{\sqrt{-\Delta}} P ( [ B , H ] )(s) \, \chi_{[0, T^*]}(s) \, ds , \tag{144} \]
    where $\underline{A}^{(0)}$ denotes the propagation of $( \underline{A}(0) , \partial_t \underline{A}(0) )$ as a solution to the free scalar wave equation. Also, here $\chi_{[0, T^*]}$ denotes the indicator function of the time interval $[0, T^*]$. This implies that we have the condition:
    \[ \Box \underline{A}(t) = 0 , \qquad t < 0 , \quad T^* < t . \]
    Now, from the bootstrapping assumption (123) we have the pair of bounds:
    \begin{align*}
    \| ( \underline{A}(0) , \partial_t \underline{A}(0) ) \|_{\dot{H}^{\frac{n-2}{2}} \times \dot{H}^{\frac{n-4}{2}}} &\lesssim N C \varepsilon_0 , \\
    \| ( \underline{A}(T^*) , \partial_t \underline{A}(T^*) ) \|_{\dot{H}^{\frac{n-2}{2}} \times \dot{H}^{\frac{n-4}{2}}} &\lesssim N C \varepsilon_0 .
    \end{align*}
    Therefore, using the bounds we have just shown in conjunction with the usual Strichartz estimates for the wave equation, we have that this extension of the potentials $\{ \underline{A}_i \}$ satisfies the bounds: $\| \underline{A} \|_{\dot{X}^{\frac{n-2}{2}}} \lesssim N C \varepsilon_0$. Notice that the angular square function structure inherent in the $\dot{X}^s$ norms is provided automatically by the fact that the usual wave equation commutes with the angular cutoffs ${}^{\omega}\Pi_\theta$.
    Our next step is to introduce the space–time frequency cutoff $S_{|\tau| \lesssim |\xi|}$, which cuts off smoothly on the region $|\tau| \lesssim |\xi|$. That is, the compound multipliers $P_\lambda S_{|\tau| \lesssim |\xi|}$ all have $L^1$ kernels with uniform bounds. We denote by $S_{|\xi| \ll |\tau|} = I - S_{|\tau| \lesssim |\xi|}$. Our decomposition of $\{ \underline{A}_i \}$ is now given by the formula:
    \[ \widetilde{\underline{A}} = S_{|\tau| \lesssim |\xi|} \underline{A} , \qquad \widetilde{\widetilde{\underline{A}}} = S_{|\xi| \ll |\tau|} \underline{A} . \]
    We now need to show that both the potential sets $\{ \widetilde{\underline{A}}_i \}$ and $\{ \widetilde{\widetilde{\underline{A}}}_i \}$ obey good $\dot{X}^{\frac{n-2}{2}}$ estimates. Since the original collection of extended potentials does, we only need to prove this assertion for one of these sets. This is most easily shown for the collection $\{ \widetilde{\underline{A}}_i \}$. As we have already mentioned, the cutoffs $P_\lambda S_{|\tau| \lesssim |\xi|}$ are bounded on all mixed Lebesgue spaces. Therefore, the entire multiplier $S_{|\tau| \lesssim |\xi|}$ is bounded on any mixed Lebesgue-Besov space of the type $L^q ( \dot{B}_2^{p,(2,s)} )$. This implies that this multiplier is in fact bounded on the $\dot{X}^s$ spaces, which is enough to support our claim.
    Finally, we would like to prove two fixed frequency multiplier estimates which will be useful in the sequel when dealing with the two sets of potentials $\{ \widetilde{\underline{A}}_i \}$ and $\{ \widetilde{\widetilde{\underline{A}}}_i \}$. The first is:
    \[ \| \partial_t P_\lambda S_{|\tau| \lesssim |\xi|} A \|_{L^p} \lesssim \lambda \| A \|_{L^p} , \qquad 1 \leq p \leq \infty . \tag{145} \]
    This is easily demonstrated by rescaling to frequency $\lambda = 1$ and using the $L^1$ bound on the convolution kernel of $\partial_t P_\lambda S_{|\tau| \lesssim |\xi|}$. Combining this with the remarks made above, we see that we have the estimate: $\| \partial_t \widetilde{\underline{A}} \|_{\dot{X}^{\frac{n-4}{2}}} \lesssim N C \varepsilon_0$. In particular, from everything we have shown, the potential set $\{ \widetilde{\underline{A}}_i \}$ satisfies all of the requirements (126) of Theorem 6.2 when $\varepsilon_0$ is sufficiently small.
    The second fixed frequency multiplier bound that will be of use shortly is the space–time estimate:
    \[ \| \Xi^{-1} P_\lambda S_{|\xi| \ll |\tau|} A \|_{L^q ( L^p )} \lesssim \lambda^{-2} \| A \|_{L^q ( L^p )} . \tag{146} \]
    Here $\Xi$ is the multiplier with symbol $\Xi ( \tau , \xi ) = \tau^2 - |\xi|^2$. To prove this, we employ a family of Littlewood-Paley space-time cutoffs which we denote by $S_\mu$. By this we mean that the space-time frequency support of these is contained where $|\tau| + |\xi| \sim \mu$. As usual, these are all chosen so as to have uniform $L^1$ bounds on their convolution kernels. Using the support restrictions of the $S_{|\xi| \ll |\tau|}$ multiplier, we have the formula: $P_\lambda S_{|\xi| \ll |\tau|} A = \sum_{\mu \, : \, \lambda \lesssim \mu} P_\lambda S_\mu S_{|\xi| \ll |\tau|} A$. Therefore, by dyadic summing and the boundedness of the multiplier $P_\lambda$, to prove (146) it suffices to be able to show that: $\| \Xi^{-1} S_{|\xi| \ll |\tau|} S_\mu A \|_{L^q ( L^p )} \lesssim \mu^{-2} \| A \|_{L^q ( L^p )}$. This last bound follows easily from rescaling to frequency $\mu = 1$ and the appropriate differential bounds on the symbol of $\Xi^{-1} S_{|\xi| \ll |\tau|}$ which we leave to the reader.
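    The symbol bound behind the last step can be sketched as follows, assuming that the cutoff $S_{|\xi| \ll |\tau|}$ is supported in a region of the form $|\xi| \leq c |\tau|$ for some fixed constant $c < 1$ (the constant $c$ is a hypothetical name for whatever separation the cutoff is built with). On the intersection of this region with the support of $S_\mu$ one has $|\tau| \sim \mu$, and therefore:
    \begin{align*}
    | \tau^2 - |\xi|^2 | &\geq ( 1 - c^2 ) \tau^2 \gtrsim \mu^2 , \\
    | \partial_\tau^{j} \partial_\xi^{\beta} ( \tau^2 - |\xi|^2 )^{-1} | &\lesssim \mu^{-2 - j - |\beta|} .
    \end{align*}
    After rescaling to $\mu = 1$ the symbol of $\mu^2 \, \Xi^{-1} S_{|\xi| \ll |\tau|} S_\mu$ is smooth and compactly supported with bounds uniform in $\mu$, so its convolution kernel is in $L^1$ of space-time with a uniform bound, and the $\mu^{-2}$ gain follows from Young's inequality.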
  • $L^1 ( L^\infty )$ bounds for the potentials $\{ \widetilde{\widetilde{A}}_\alpha \} = ( A_0 , \{ \widetilde{\widetilde{\underline{A}}} \} )$. Our goal here is to show the $\ell^1$ type Besov estimate:
    \[ \| ( A_0 , \{ \widetilde{\widetilde{\underline{A}}} \} ) \|_{L^1 ( \dot{B}_1^{\infty , ( 2 , \frac{n}{2} )} )} \lesssim N C \varepsilon_0 . \tag{147} \]
    By repeatedly using the estimate (146), we have that the multiplier $\Xi^{-1} \Delta S_{|\xi| \ll |\tau|}$ is bounded on the space $L^1 ( \dot{B}_1^{\infty , ( 2 , \frac{n}{2} )} )$. Furthermore, from all of the estimates we have shown above, and by distributing the derivative in the first term on the right hand side of (139), we see that the right hand sides of the schematics (139) and (143) are equivalent. Therefore, we have the following heuristic schematic for the potentials $\{ \widetilde{\widetilde{A}}_\alpha \}$: $\widetilde{\widetilde{A}} = \Delta^{-1} ( [ B , H ] )$, where the pair $( B , H )$ enjoys the bounds: $\| ( B , H ) \|_{\dot{Y}^{\frac{n-2}{2}} \times \dot{Y}^{\frac{n-4}{2}}} \lesssim N C \varepsilon_0$. The bound (147) now follows from the bilinear estimate: $\Delta^{-1} : \dot{Y}^{\frac{n-2}{2}} \cdot \dot{Y}^{\frac{n-4}{2}} \hookrightarrow L^1 ( \dot{B}_1^{\infty , ( 2 , \frac{n}{2} )} )$. This in turn follows from the product estimate: $\Delta^{-1} : L^2 ( \dot{B}_2^{\frac{2(n-1)}{n-3} , ( 2 , \frac{n-1}{2} )} ) \cdot L^2 ( \dot{B}_2^{\frac{2(n-1)}{n-3} , ( 2 , \frac{n-3}{2} )} ) \hookrightarrow L^1 ( \dot{B}_1^{\infty , ( 2 , \frac{n}{2} )} )$. This last estimate follows at once from (45). The check on the conditions (46)–(50) is left to the reader.
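    To see why an estimate in this Besov space is an $L^1 ( L^\infty )$ bound, one can unwind the weights. The following assumes that the $\ell^1$ Besov norm is defined by the same prescription as on line (117), with the square sum replaced by a plain sum; this is an assumption about the definition (41), which is not reproduced here. With $p = \infty$, $q = 2$ and $s = \frac{n}{2}$ the weight exponent is $s - n ( \frac{1}{q} - \frac{1}{p} ) = \frac{n}{2} - \frac{n}{2} = 0$, so that at each fixed time:
    \[ \| A \|_{L^\infty} \leq \sum_\lambda \| P_\lambda A \|_{L^\infty} = \sum_\lambda \lambda^{\frac{n}{2} - n ( \frac{1}{2} - 0 )} \| P_\lambda A \|_{L^\infty} = \| A \|_{\dot{B}_1^{\infty , ( 2 , \frac{n}{2} )}} . \]
    Integrating in time shows that (147) controls the $L^1 ( L^\infty )$ norm of the potentials, with the additional benefit of absolute summability over dyadic frequencies.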
  • Improving the curvature. This is the final part of the proof of Proposition 6.1. Recalling the schematic (131) and using the Strichartz estimates (130), our goal here is to show the following five bounds:
    \begin{align*}
    \| [ \nabla \widetilde{\widetilde{A}} , F ] \|_{L^1 ( \dot{H}^{\frac{n-6}{2}} )} &\lesssim N^2 C^2 \varepsilon_0^2 , \tag{148} \\
    \| [ \widetilde{\widetilde{A}} , \nabla F ] \|_{L^1 ( \dot{H}^{\frac{n-6}{2}} )} &\lesssim N^2 C^2 \varepsilon_0^2 , \tag{149} \\
    \| [ \widetilde{A} , [ \widetilde{\widetilde{A}} , F ] ] \|_{L^1 ( \dot{H}^{\frac{n-6}{2}} )} &\lesssim N^2 C^2 \varepsilon_0^2 , \tag{150} \\
    \| [ \widetilde{\widetilde{A}} , [ \widetilde{\widetilde{A}} , F ] ] \|_{L^1 ( \dot{H}^{\frac{n-6}{2}} )} &\lesssim N^2 C^2 \varepsilon_0^2 , \tag{151} \\
    \| [ F , F ] \|_{L^1 ( \dot{H}^{\frac{n-6}{2}} )} &\lesssim N^2 C^2 \varepsilon_0^2 . \tag{152}
    \end{align*}
    For $\varepsilon_0$ sufficiently small, this will be enough for us to conclude the improved bootstrapping estimates (125) by choosing $C$ to be such that $\frac{1}{2} ( L N )^{-1} C$ is equal to the constant appearing on the right hand side of estimate (130). This works because the implicit constants which appear in (148)–(152) above have only been manufactured in the estimates of this proof, and can all be chosen to be independent of $N$ and $C$ if $\varepsilon_0$ is chosen small enough.
    To prove these bounds, first notice that the estimates (148) and (150)–(152) are essentially identical. This follows from the equivalence (in terms of $\dot{Y}^s$ spaces) $\nabla \widetilde{\widetilde{A}} \sim F$. We also have the equivalences $[ \widetilde{A} , \widetilde{\widetilde{A}} ] \sim F$ and $[ \widetilde{\widetilde{A}} , \widetilde{\widetilde{A}} ] \sim F$. These are given by the inclusion:
    \[ \dot{Y}^{\frac{n-2}{2}} \cdot \dot{Y}^{\frac{n-2}{2}} \subseteq \dot{Y}^{\frac{n-4}{2}} . \tag{153} \]
    This is easily demonstrated, as we have already mentioned, by differentiating the inclusion (133) and using the boundedness of $\nabla_x^2 \Delta^{-1}$ on the various $\dot{Y}^s$ component spaces. Therefore, to prove (148) and (150)–(152) we only need to know that:
    \[ L^2 ( \dot{B}_2^{\frac{2(n-1)}{n-3} , ( 2 , \frac{n-3}{2} )} ) \cdot L^2 ( \dot{B}_2^{\frac{2(n-1)}{n-3} , ( 2 , \frac{n-3}{2} )} ) \subseteq L^1 ( \dot{H}^{\frac{n-6}{2}} ) . \tag{154} \]
    This is yet again a consequence of our general Besov calculus (45), and we leave the various additions to the reader.
    Our final task here is to prove the estimate (149). This needs to be frequency decomposed using a trichotomy. Specifically, we have the following set of fixed frequency estimates in the three cases (note that in the first two estimates below the square summing needs to be done inside the time integral):
    \begin{align*}
    P_\lambda \big( L^1 ( \dot{B}_1^{\infty , ( 2 , \frac{n}{2} )} ) \big) \cdot P_\lambda \big( L^\infty ( \dot{H}^{\frac{n-6}{2}} ) \big) &\subseteq P_\lambda \big( L^1 ( \dot{H}^{\frac{n-6}{2}} ) \big) , \tag{155} \\
    P_\lambda \big( L^2 ( \dot{B}_2^{\frac{2(n-1)}{n-3} , ( 2 , \frac{n-1}{2} )} ) \big) \cdot P_\lambda \big( L^2 ( \dot{B}_2^{\frac{2(n-1)}{n-3} , ( 2 , \frac{n-3}{2} )} ) \big) &\subseteq P_\lambda \big( L^1 ( \dot{H}^{\frac{n-6}{2}} ) \big) , \tag{156} \\
    P_\lambda \big( L^2 ( \dot{B}_2^{\frac{2(n-1)}{n-3} , ( 2 , \frac{n-1}{2} )} ) \big) \cdot P_\lambda \big( L^2 ( \dot{B}_2^{\frac{2(n-1)}{n-3} , ( 2 , \frac{n-3}{2} )} ) \big) &\subseteq \Big( \frac{\mu}{\lambda} \Big)^{\delta} P_\mu \big( L^1 ( \dot{H}^{\frac{n-6}{2}} ) \big) , \tag{157}
    \end{align*}
    where the quantity $\delta$ in the last estimate (157) above can be computed to be $\delta = n \big( \frac{n-3}{n-1} \big) - 3$. The estimate (155) follows from inspection. The latter two estimates (156)–(157) follow from (51)–(52) of Remark 4.2. This completes the proof of Proposition 6.1.
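    The bookkeeping of constants in the final step of the proof can be made explicit. In the sketch below, $K$ is a hypothetical name for the implicit constant on line (130), and $c$ is a hypothetical name for the sum of the implicit constants in the bounds for the quadratic and cubic terms of (131); neither symbol appears in the text. Applying (130) to (131) with the data bound (122), and choosing $C = 2 K L N$, gives:
    \begin{align*}
    \| F \|_{\dot{X}^{\frac{n-4}{2}} \times \partial_t^{-1} ( \dot{X}^{\frac{n-6}{2}} )} &\leq K \big( L \varepsilon_0 + c N^2 C^2 \varepsilon_0^2 \big) \\
    &= \tfrac{1}{2} N^{-1} C \varepsilon_0 + K c N^2 C^2 \varepsilon_0^2 \\
    &\leq N^{-1} C \varepsilon_0 \qquad \text{provided} \quad \varepsilon_0 \leq ( 2 K c N^3 C )^{-1} .
    \end{align*}
    The point is that $K$ and $c$ do not depend on $N$ or $C$, so the smallness condition on $\varepsilon_0$ can be imposed after $C$ has been fixed.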

7 Reduction to Approximate Half-Wave Operators

This is a preliminary technical section where we reduce the proof of the Strichartz estimates (130) to a more easily managed form. This material is more or less standard, and we again follow closely what was done in [8]. Our first step here is to reduce the proof of Theorem 6.2 to the following:
Proposition 7.1 (Existence of a fixed frequency parametrix). Let the number of dimensions be 6 n   , and let d + A ̲ λ   be a connection which satisfies the conditions  126 . In addition assume that we have the frequency localization condition:
P λ ( A ̲ λ ) = 0 , (158)
where P λ   is a frequency cutoff on the region where 2 10 a λ | ξ |   , where 1 a   is some fixed parameter. Then if the constant   on lines  126d and  126f is sufficiently small, there exists a family of approximate propagation operators W A ̲ λ λ ( s )   (or just W s λ   for short) such that if ( f λ , g λ )   is any set of λ   –frequency initial data with Fourier support in the region 2 a λ | ξ | 2 a λ   , the following estimates hold:
W s λ ( f λ , g λ ) X ˙ 0 × t 1 ( X ˙ 1 ) E 1 2 ( f λ , g λ ) , (159a)
W s λ ( f λ , g λ ) ( s ) f λ L 2 1 2 E 1 2 ( f λ , g λ ) , (159b)
t W s λ ( f λ , g λ ) ( s ) g λ L 2 λ 1 2 E 1 2 ( f λ , g λ ) , (159c)
A ̲ λ W s λ ( f λ , g λ ) L 1 ( L 2 ) λ E 1 2 ( f λ , g λ ) . (159d)
Here we have set $E ( f_\lambda , g_\lambda )$ to be the $L^2$ normalized energy: $E ( f_\lambda , g_\lambda ) = \| f_\lambda \|^2_{L^2} + \lambda^{-2} \| g_\lambda \|^2_{L^2}$. Finally, we have that the frequency support of the parametrix is contained in the set $2^{-2a} \lambda \leq |\xi| \leq 2^{2a} \lambda$, where $a$ is as above.
  • Proof that Proposition 7.1 implies Theorem 6.2. The first step here is to reduce the estimate (130) to the case where $G \equiv 0$. This is done in the usual way via Duhamel's principle. We define the true propagation operator $U_s(t)$ via the formulas:
    \[ U_s(s) ( f , g ) = f , \qquad \partial_t U_s(s) ( f , g ) = g , \]
    and: $\Box_{\underline{A}} U_s ( f , g ) = 0$. We then have that:
    \[ F(t) = U_0(t) ( f , \dot{f} ) + \int_0^t U_s(t) ( 0 , G(s) ) \, ds , \tag{160} \]
    solves the problem (128)–(129). In particular, by Minkowski's triangle inequality we easily have that: $\big\| \int_0^t U_s(t) ( 0 , G(s) ) \, ds \big\|_{\dot{X}^{\frac{n-4}{2}} \times \partial_t^{-1} ( \dot{X}^{\frac{n-6}{2}} )} \lesssim \int_0^\infty \| U_s ( 0 , G(s) ) \|_{\dot{X}^{\frac{n-4}{2}} \times \partial_t^{-1} ( \dot{X}^{\frac{n-6}{2}} )} \, ds$. Therefore, we are trying to show:
    \[ \| U_s ( f , g ) \|_{\dot{X}^{\frac{n-4}{2}} \times \partial_t^{-1} ( \dot{X}^{\frac{n-6}{2}} )} \leq C \| ( f , g ) \|_{\dot{H}^{\frac{n-4}{2}} \times \dot{H}^{\frac{n-6}{2}}} , \tag{161} \]
    for any pair of functions $( f , g )$ and any initial time $s$. Since it is easy to see that the conditions (126) are translation invariant, it suffices to show this estimate for $s = 0$.
    The estimate (161) will be shown using a bootstrapping procedure. This will be done inside of the compact intervals $[0, T^*]$. What we will do is to first assume that (161) is true for all $0 \leq s \leq T^*$ on all time intervals of the form $[0, s]$ and $[s, T^*]$, where the constant on the right hand side of (161) is replaced by $2C$. Our goal is then to improve the constant by proving the desired bound (161) on the time subintervals of $[0, T^*]$. Once this is accomplished, we can easily extend the bound (161) to all subintervals of a slightly larger time interval $[0, T^* + \gamma]$, where the constant $0 < \gamma \ll 1$ is determined by the bound (127). This is provided by the usual local existence theory based on energy and $L^\infty$ estimates. Once this is done, the bootstrapping closes. Notice again that, by using the local existence theory and the bound (127), we may begin the argument for some very small time interval $[0, \gamma]$.
    We are now assuming that  161 holds on our time interval [ 0 , T * ]   with constant 2 C   which we will decide on in a moment. We are working with a solution:
    A ̲ F = 0 , (162)
    where the connection d + A ̲   satisfies  126 , and where we have the initial data:
    F ( 0 ) = f , t F ( 0 ) = g . (163)
    We now split this initial data into a sum frequency localized pieces:
    f = λ P λ ( f ) = λ f λ ,
    g = λ P λ ( g ) = λ g λ ,
    and then repeatedly use Proposition  7.1 to construct an approximate solution to  162  163 as follows: F ~ = λ F ~ λ = λ W 0 λ ( f λ , g λ ) .   By summing over the parametrix estimate  159a we automatically have that: F ~ X ˙ n 4 2 × t 1 ( X ˙ n 6 2 ) 1 2 C ( f , g ) H ˙ n 4 2 × H ˙ n 6 2 ,   where C   is some fixed constant. We choose this to be our definition of the constant on the right hand side of  161 . Thus, our goal is to conclude that:
    F F ~ X ˙ n 4 2 × t 1 ( X ˙ n 6 2 ) 1 2 C ( f , g ) H ˙ n 4 2 × H ˙ n 6 2 . (164)
    To do this, we use the Duhamel formula  160 to express everything in terms of the operators U s ( t )   : F ( t ) F ~ ( t ) = U 0 ( t ) ( f F ~ ( 0 ) , g t F ~ ( 0 ) ) 0 t U s ( t ) ( 0 , A ̲ F ~ ( s ) ) d s .   By combining the assumed estimate  161 and the approximation bounds  159b  159c , we have that: U 0 ( f F ~ ( 0 ) , g t F ~ ( 0 ) ) X ˙ n 4 2 × t 1 ( X ˙ n 6 2 ) C 1 2 ( f , g ) H ˙ n 4 2 × H ˙ n 6 2 .   Therefore, by using Minkowski's triangle inequality and again using the bootstrapping assumption  161 , we see that in order to conclude  164 we only need to show the following remainder estimate on the time interval [ 0 , T * ]   :
    A ̲ F ~ L 1 ( H ˙ n 6 2 ) C ( f , g ) H ˙ n 4 2 × H ˙ n 6 2 . (165)
    To show the estimate  165 , we use a family of frequency cutoffs: I = P λ + P λ ,   for each scale λ   such that they all have L 1   kernels with uniform bounds, and such that the cutoff P λ   is consistent with the definition of d + A ̲ λ   in the statement of Proposition  7.1 . This allows us to schematically write:
    (166) A ̲ F ~ = λ ( A ̲ λ F ~ λ + [ x A ̲ λ , F ~ λ ] + [ A ̲ λ , x F ~ λ ] + [ [ A ̲ λ , A ̲ λ ] , F ~ λ ] + [ [ A ̲ λ , A ̲ λ ] , F ~ λ ] ) .
    The bound  165 for the term λ A ̲ λ F ~ λ   is a direct consequence of repeatedly applying the estimate  159d while using the fact that each term in this sum is supported in frequency where | ξ | λ   to gain the orthogonality needed to obtain bounds in terms of the pair ( f , g )   . Therefore, we are reduced to showing the following family of error estimates:
    λ [ x A ̲ λ , F ~ λ ] L 1 ( H ˙ n 6 2 ) ( f , g ) H ˙ n 4 2 × H ˙ n 6 2 , (167)
    λ [ A ̲ λ , x F ~ λ ] L 1 ( H ˙ n 6 2 ) ( f , g ) H ˙ n 4 2 × H ˙ n 6 2 , (168)
    λ [ [ A ̲ λ , A ̲ λ ] , F ~ λ ] L 1 ( H ˙ n 6 2 ) ( f , g ) H ˙ n 4 2 × H ˙ n 6 2 , (169)
    λ [ [ A ̲ λ , A ̲ λ ] , F ~ λ ] L 1 ( H ˙ n 6 2 ) ( f , g ) H ˙ n 4 2 × H ˙ n 6 2 . (170)
    These estimates are all very similar to each other, and to estimates we have already proved in the last section, in particular  148  152 . To prove the first estimate  167  above, we further decompose the left hand side into frequencies and use the triangle inequality to bound:  167  ( L . H . S . ) λ , μ : λ μ [ x P μ ( A ̲ ) , F ~ λ ] L 1 ( H ˙ n 6 2 ) .   Thus, by Young's inequality, it suffices to show the following family of fixed frequency estimates: [ x P μ ( A ̲ ) , F ~ λ ] L 1 ( H ˙ n 6 2 ) ( λ μ ) δ P μ ( A ̲ ) Z ˙ n 2 2 ( f λ , g λ ) H ˙ n 4 2 × H ˙ n 6 2 ,   where we have set δ = 3 2 n n 1   . Notice that we have used the Z ˙ n 2 2   norm for the { A ̲ i }   on the right hand side. This allows us to reconstruct norms through square-summing. For λ μ   this estimate is nothing but a fixed frequency version of the estimate  154 above, so it suffices to consider case λ μ   . Using the simple inclusion x X ˙ n 2 2 X ˙ n 4 2   , this is a consequence of the fixed frequency embedding:
    P μ ( L 2 ( B ˙ 2 2 ( n 1 ) n 3 , ( 2 , n 3 2 ) ) ) P λ ( L 2 ( B ˙ 2 2 ( n 1 ) n 3 , ( 2 , n 3 2 ) ) ) ( λ μ ) δ L 1 ( H ˙ n 6 2 ) , (171)
    which follows at once from the fixed frequency estimate  53 which helps to generate the general estimate  45 . Notice that the proof of the second estimate  168 above is very similar to what we have just done. In fact, there is more room because the derivative is on the low frequency term. We leave the details to the reader.
    It remains to prove the two estimates 169-170. Since these follow from essentially identical reasoning, we concentrate on proving the second of these estimates. This one in fact requires a bit more work than the first because it has more frequency overlap. Applying a trichotomy to the product, we see that it suffices to be able to show the following three estimates:
    0 T * ( λ ( μ : μ λ [ P μ ( [ A ̲ λ , A ̲ λ ] ) , F ~ λ ] ( s ) H ˙ n 6 2 ) 2 ) 1 2 d s (172)
    A ̲ 2 X ˙ n 2 2 ( f , g ) H ˙ n 4 2 × H ˙ n 6 2 , (173)
    0 T * ( μ ( λ : λ μ [ P μ ( [ A ̲ λ , A ̲ λ ] ) , F ~ λ ] ( s ) L 1 ( H ˙ n 6 2 ) ) 2 ) 1 2 d s (174)
    A ̲ 2 X ˙ n 2 2 ( f , g ) H ˙ n 4 2 × H ˙ n 6 2 , (175)
    λ , μ : λ μ [ P μ ( [ A ̲ λ , A ̲ λ ] ) , F ~ λ ] L 1 ( H ˙ n 6 2 ) (176)
    A ̲ 2 X ˙ n 2 2 ( f , g ) H ˙ n 4 2 × H ˙ n 6 2 . (177)
    The first two estimates  173  175 follow from first fixing time and then proving the fixed frequency estimate:
    [ P μ ( [ A ̲ λ , A ̲ λ ] ) , F ~ λ ] ( s ) H ˙ n 6 2 min ± ( λ μ ) ± δ P μ ( [ A ̲ λ , A ̲ λ ] ) ( s ) B ˙ 2 ( n 1 ) n 3 , ( 2 , n 3 2 ) F ~ λ ( s ) B ˙ 2 ( n 1 ) n 3 , ( 2 , n 3 2 ) ,  
    where δ is the same constant from estimate 171. Indeed, this last line follows from the non-time integrated version of that estimate. Applying Young's inequality to this, integrating in time and applying Cauchy-Schwarz, using the parametrix bound 159a, the product embedding 153, and the fact that for each fixed value of λ the multipliers P λ and P λ are bounded on the X ˙ s spaces, we arrive at the desired pair of estimates.
    It remains for us to prove the last estimate 177 above. After another application of the embedding 154 and Cauchy-Schwarz, followed by the parametrix estimate 159a, we are left with showing the bound: ( λ , μ : λ μ [ P μ ( [ A ̲ λ , A ̲ λ ] ) 2 L 2 ( B ˙ 2 ( n 1 ) n 3 , ( 2 , n 3 2 ) ) ) 1 2 A ̲ 2 X ˙ n 2 2 .   This last estimate follows from applying a further trichotomy, and then using Young's inequality after reduction to the various fixed frequency versions of the product estimate 153 which are provided by the general fixed frequency estimates  51  51 . We leave the details to the diligent reader. This completes the proof of our reduction of Theorem 6.2 to Proposition 7.1.
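The passage from fixed frequency bounds with an off-diagonal gain of the form $(\lambda/\mu)^{\pm\delta}$ to square-summed bounds, which is invoked several times in the proof above under the name of Young's inequality, can be sketched as follows. This is only an illustration under our own assumptions: $\lambda = 2^j$ and $\mu = 2^k$ range over dyadic numbers, $\delta > 0$ is fixed, and $a_\mu, b_\lambda \geqslant 0$ are placeholder sequences (not notation from the paper) standing for the fixed frequency norms on the right hand sides.

```latex
\begin{align*}
\sum_{\lambda,\mu} \min\Big\{\frac{\lambda}{\mu},\frac{\mu}{\lambda}\Big\}^{\delta} a_\mu\, b_\lambda
  &= \sum_{j,k\in\mathbb{Z}} 2^{-\delta|j-k|}\, a_{2^k}\, b_{2^j}
   = \sum_{m\in\mathbb{Z}} 2^{-\delta|m|} \sum_{j\in\mathbb{Z}} a_{2^{j-m}}\, b_{2^j} \\
  &\leqslant \Big(\sum_{m\in\mathbb{Z}} 2^{-\delta|m|}\Big)\, \|a\|_{\ell^2}\, \|b\|_{\ell^2}
   = \frac{1+2^{-\delta}}{1-2^{-\delta}}\; \|a\|_{\ell^2}\, \|b\|_{\ell^2} .
\end{align*}
```

The inner sum over $j$ is handled by Cauchy-Schwarz for each fixed shift $m$, and the constant depends only on $\delta$, which is why any fixed positive gain in the frequency ratio suffices to reconstruct the square-summed norms.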
The final thing we will do in this section is to make one further reduction of the Strichartz estimates  130 . This involves the following proposition:
Proposition 7.2 (Existence of approximate half-wave parametrices). Let the number of dimensions be $6 \leqslant n$, and let $d + \underline{A}_1$ be a connection which satisfies the conditions 126 as well as the frequency localization condition 158 for $\lambda = 1$. Then there exists a pair of evolution operators $\Phi^\pm(\hat f)(t)$ from $L^2(\mathbb{R}^n_\xi)$ to $L^2(\mathbb{R}^n_x)$ such that the fixed time adjoints $(\Phi^\pm(t))^*$ are always supported in the region $2^{-a} \leqslant |\xi| \leqslant 2^{a}$ for some fixed $1 \leqslant a$, and such that they obey the following estimates:
( P 1 Φ ± ( f ^ ) , Φ ± ( f ^ ) ) X ˙ 0 × L x 2 f ^ L ξ 2 , (178a)
x Φ ± ( f ^ ) L t 2 ( L x 2 ( n 1 ) n 3 ) f ^ L ξ 2 , (178b)
t P 1 Φ ± ( f ^ ) P 1 Φ ± ( 2 π i | ξ | f ^ ) X ˙ 0 f ^ L ξ 2 , (178c)
Φ ± ( 0 ) ( ( 2 π | ξ | ) α ( Φ ± ( 0 ) ) * ) g ( Δ ) α 2 P 1 ( g ) L x 2 1 2 g L x 2 , (178d)
A ̲ 1 Φ ± ( f ^ ) L t 1 ( L x 2 ) f ^ L ξ 2 . (178e)
  • Proof that Proposition 7.2 implies Proposition 7.1. This is a simple matter, and we explain it briefly. Notice first that it suffices to prove Proposition 7.1 on the scale $\lambda = 1$ because everything in sight is scale invariant. We now let $(f_1,g_1)$ be any pair of unit frequency initial data, and we define the approximate unit frequency wave propagator:
    $$W_0^1(f_1,g_1)(t) = P_1\Big(\frac12\Phi^+(t)(\Phi^+(0))^* f_1 + \frac12\Phi^-(t)(\Phi^-(0))^* f_1 + \Phi^+(t)\Big(\frac{1}{4\pi i|\xi|}(\Phi^+(0))^*\Big) g_1 - \Phi^-(t)\Big(\frac{1}{4\pi i|\xi|}(\Phi^-(0))^*\Big) g_1\Big).$$ (179)
    Here P 1   is defined to be the cutoff on line  178d which is also chosen large enough such that P 1 ( f 1 , g 1 ) = ( f 1 , g 1 )   . From the boundedness of the P 1   multiplier, the estimates  178a and  178c , the frequency support of the adjoints, and the dualized L x 2 L ξ 2   estimate contained in  178a , we easily have that the operator  179 obeys the estimate  159a . Next, notice that by applying  178d with α = 0   and α = 1   , and using the unit frequency condition which implies the boundedness of ( Δ ) 1 2   , we have the estimate  159b . Furthermore, by using estimate  178c in conjunction with  178d , where this time we use the indices α = 0   and α = 1   , and using the boundedness of ( Δ ) 1 2   at unit frequency, we have the second accuracy estimate  159c . Therefore, it remains to show that we have the error estimate  159d . By the estimate  178e and by again making use of the dual L x 2 L ξ 2   adjoint bound, we are reduced to proving (operator) commutator bounds of the type: [ A ̲ 1 , P 1 ] Φ ± ( h ^ ) L t 1 ( L x 2 ) h ^ L ξ 2 .   Using the commutator estimate  39 in conjunction with the parametrix bounds  178a  178b (this is where the extra bound on the gradient comes in), this reduces to showing the two bounds:
    x A ̲ 1 L t 2 ( L x n 1 ) A ̲ 1 X ˙ n 2 2 , (180)
    x [ A ̲ 1 , A ̲ 1 ] L t 1 ( L x ) A ̲ 1 2 X ˙ n 2 2 . (181)
    The first estimate follows easily from integrating the following Besov and low frequency Besov nestings: P 1 ( B ˙ 2 2 ( n 1 ) n 3 , ( 2 , n 3 2 ) ) B ˙ 2 n 1 , ( 2 , n ( n 3 2 ( n 1 ) ) ) L n 1 .   The second estimate follows as easily from first distributing the derivative and then integrating the two low frequency nestings: P 1 ( B ˙ 2 2 ( n 1 ) n 3 , ( 2 , n 3 2 ) ) , P 1 ( B ˙ 2 2 ( n 1 ) n 3 , ( 2 , n 1 2 ) ) B 1 , ( 2 , n 2 ) L .   This completes the proof that Proposition  7.2 implies Proposition  7.1 .
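To see why the combination 179 is the natural one, it may help to check heuristically that it reproduces the data at $t = 0$. The following sketch drops all error terms and rests on assumptions that are our reading of the statement rather than something the text spells out: 178d is used in the form $\Phi^\pm(0)\big((2\pi|\xi|)^\alpha(\Phi^\pm(0))^*\big) g \approx (-\Delta)^{\frac{\alpha}{2}}P_1(g)$ with $\alpha \in \{-1,0,1\}$, 178c is read as $\partial_t\Phi^\pm(\hat f) \approx \pm\Phi^\pm(2\pi i|\xi|\hat f)$, and $P_1(f_1,g_1) = (f_1,g_1)$.

```latex
\begin{align*}
W_0^1(f_1,g_1)(0)
  &= P_1\Big(\tfrac12\,\Phi^+(0)(\Phi^+(0))^* f_1 + \tfrac12\,\Phi^-(0)(\Phi^-(0))^* f_1\Big) \\
  &\quad + \tfrac{1}{2i}\,P_1\Big(\Phi^+(0)\big((2\pi|\xi|)^{-1}(\Phi^+(0))^*\big) g_1
        - \Phi^-(0)\big((2\pi|\xi|)^{-1}(\Phi^-(0))^*\big) g_1\Big) \\
  &\approx P_1\Big(\tfrac12 P_1 f_1 + \tfrac12 P_1 f_1\Big)
     + \tfrac{1}{2i}\,P_1\Big((-\Delta)^{-\frac12}P_1 g_1 - (-\Delta)^{-\frac12}P_1 g_1\Big)
   = f_1 , \\
\partial_t W_0^1(f_1,g_1)(0)
  &\approx P_1\Big(\tfrac12\,\Phi^+(0)\big(2\pi i|\xi|(\Phi^+(0))^*\big) f_1
        - \tfrac12\,\Phi^-(0)\big(2\pi i|\xi|(\Phi^-(0))^*\big) f_1\Big) \\
  &\quad + P_1\Big(\tfrac12\,\Phi^+(0)(\Phi^+(0))^* g_1 + \tfrac12\,\Phi^-(0)(\Phi^-(0))^* g_1\Big) \\
  &\approx \tfrac{i}{2}\,P_1\Big((-\Delta)^{\frac12}P_1 f_1 - (-\Delta)^{\frac12}P_1 f_1\Big) + P_1 P_1 g_1
   = g_1 .
\end{align*}
```

The relative minus sign between the two $g_1$ terms in 179 is what makes the half-wave contributions cancel in the position and add in the velocity.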

8 Construction of the half wave operators

We now begin construction of our approximate solutions $\Phi^\pm$ to the reduced covariant wave equation $\Box_{\underline{A}_1}$. This will be accomplished by integrating over a collection of gauge transformations designed to eliminate the highest order effect of the troublesome term $\underline{A}_1^\alpha\nabla_\alpha$. In order to understand what such a gauge transformation should be, we begin with a simple calculation. We consider the covariant wave equation $\Box_{{}^\omega\! A}$, where the connection ${}^\omega\! D = d + {}^\omega\! A$ will be determined in a moment, acting on a vector valued plane wave $e^{2\pi i\lambda\,{}^\omega u^\pm}\hat f$. Here $\hat f$ is a constant complex valued matrix in $\mathbb{C}\otimes o(m)$, and the ${}^\omega u^\pm$ are the standard plane wave optical functions:
$${}^\omega u^+ = t + \omega\cdot x, \qquad {}^\omega u^- = -t + \omega\cdot x.$$
In particular, $\nabla_\alpha({}^\omega u^\pm) = ({}^\omega\! L^\mp)_\alpha$, where the ${}^\omega\! L^\pm$ are the associated null hyper-surface generators:
$${}^\omega\! L^+ = \partial_t + \omega\cdot\nabla_x, \qquad {}^\omega\! L^- = -\partial_t + \omega\cdot\nabla_x.$$
With these identifications, we easily have the calculation:
$$\Box_{{}^\omega\! A}\big(e^{2\pi i\lambda\,{}^\omega u^\pm}\hat f\big) = e^{2\pi i\lambda\,{}^\omega u^\pm}\Big(4\pi i\lambda\,[{}^\omega\! A({}^\omega\! L^\mp),\hat f] + D^\alpha_{{}^\omega\! A}[{}^\omega\! A_\alpha,\hat f]\Big).$$ (182)
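For readers who want to see where the two terms in 182 come from, here is a short sketch of the computation. It assumes the conventions $D_\alpha\varphi = \nabla_\alpha\varphi + [A_\alpha,\varphi]$ and $\Box_A = D^\alpha D_\alpha$ with signature $(-,+,\ldots,+)$, and it abbreviates $A = {}^\omega\! A$, $\psi = 2\pi\lambda\,{}^\omega u^\pm$, so that $\nabla^\alpha\psi\,\nabla_\alpha\psi = 0$ and $\Box\psi = 0$ for $|\omega| = 1$; these abbreviations are ours.

```latex
\begin{align*}
D_\alpha\big(e^{i\psi}\hat f\big)
  &= e^{i\psi}\Big(i\,\nabla_\alpha\psi\;\hat f + [A_\alpha,\hat f]\Big), \\
D^\alpha D_\alpha\big(e^{i\psi}\hat f\big)
  &= e^{i\psi}\Big(-\nabla^\alpha\psi\,\nabla_\alpha\psi\;\hat f + i\,\Box\psi\;\hat f
     + 2i\,\nabla^\alpha\psi\,[A_\alpha,\hat f]
     + \nabla^\alpha[A_\alpha,\hat f] + [A^\alpha,[A_\alpha,\hat f]]\Big) \\
  &= e^{i\psi}\Big(4\pi i\lambda\,[A(L^\mp),\hat f] + D^\alpha[A_\alpha,\hat f]\Big) .
\end{align*}
```

The last step uses $\nabla^\alpha\psi\,A_\alpha = 2\pi\lambda\,A(L^\mp)$, which is the identification of the gradient of the optical function with the null generator stated just before 182.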
Using the heuristic5 that terms of the form $\nabla({}^\omega\! A)$ and $[{}^\omega\! A,{}^\omega\! A]$ are lower order, and splitting the potentials $\{{}^\omega\! A_\alpha\}$ into the sets $\{{}^\omega\! A_\alpha^\pm\}$ associated with the optical functions ${}^\omega u^\pm$ (resp.), we see that in order to eliminate the highest order term on the right hand side of 182 we would need to assume this connection is in the backward (resp. forward) $\omega$-null-gauge:
$${}^\omega\! A^+({}^\omega\! L^-) = 0, \qquad {}^\omega\! A^-({}^\omega\! L^+) = 0.$$ (183)
Of course, it is not possible to assume that a given fixed connection will simultaneously be in the null-gauge for every direction $\omega$. However, it is more or less clear that since these gauges are of Cronström type, it is always possible to transform a given connection so that it is in the null-gauge for a fixed direction. This motivates the following form of an approximate solution to $\Box_{\underline{A}_1}$:
$$\Phi^\pm(\hat f) = \int_{\mathbb{R}^n} e^{2\pi i\lambda\,{}^\omega u^\pm}\;{}^\omega\! g_\pm^{-1}\,\hat f(\lambda\omega)\,{}^\omega\! g_\pm\;\chi_{(\frac12,2)}(\lambda)\,\lambda^{n-1}\,d\lambda\,d\omega,$$ (184)
where $\chi_{(\frac12,2)}$ is a smooth bump function such that $\chi_{(\frac12,2)} \equiv 1$ on the interval $[2^{-1},2]$ and such that $\chi_{(\frac12,2)} \equiv 0$ outside of $[4^{-1},4]$ (the variable width assumption of Proposition 7.2 can be achieved with similar bump functions). Here, the gauge transformation:
$${}^\omega\! B^\pm = {}^\omega\! g_\pm\,\underline{A}_1\,({}^\omega\! g_\pm^{-1}) + {}^\omega\! g_\pm\,d({}^\omega\! g_\pm^{-1}),$$ (185)
will be chosen so that ${}^\omega\! B^\pm$ approximately satisfies 183. It seems that there are in fact many choices of how to do this, although the naive choice of letting ${}^\omega\! B^\pm$ satisfy 183 directly by solving the appropriate transport equations6 leads to group elements with poor regularity properties. Therefore, the procedure for arriving at the correct choice deserves some motivation.
The heart of the matter is two-fold. First and foremost, we need to come up with a construction that gives us explicit formulas so that we may perform certain standard calculations on the integral  184 . In particular, we will need to perform integration by parts with respect to the variable ω   . Since G   is assumed to be non-abelian, and since we will not be able to localize things to a neighborhood of any fixed point on the group7 , this is actually a non-trivial matter. For example, it is not possible to do this directly through a use of the exponential map because we would run into trouble with conjugate points.
Secondly, we will need to replace the transport equation which defines the naive pure null-gauge transformation, with something that has more “elliptic” features.
That such a choice is possible is, strangely enough, determined by the fact that the connection $\{\underline{A}_1\}$ is not arbitrary, but instead evolves according to a hyperbolic equation. This is taken into account by condition 126e. This kind of structure seems to be ubiquitous in geometric wave equations, both semi and quasi-linear, and the observation that it makes the crucial difference goes back to work of Klainerman-Rodnianski on quasi-linear wave equations [5]. The particular form in which we will use it here is almost identical to that of [8], but since everything we do is non-abelian, the derivation will seem a bit different at first.
The first observation we use is that just like the Cronström gauge, the null-gauge allows one to recover the potentials directly from the curvature. However, since we aim to derive a (sub)-elliptic equation, we do not do this by simply integrating along null directions. Instead, we write:
$${}^\omega\! L^\mp\;{}^\omega\! B^\pm_\alpha = F^{\,{}^\omega\! B^\pm}({}^\omega\! L^\mp,\partial_\alpha).$$ (186)
Making now the approximate assumption that the $\{{}^\omega\! B^\pm\}$ are simply a solution to the scalar wave equation $\Box = \nabla^\alpha\nabla_\alpha$, which we write as:
$$\Box = {}^\omega\! L^\pm\;{}^\omega\! L^\mp + \Delta_{\omega^\perp},$$ (187)
the identity 186 can be written in the integral form:
$${}^\omega\! B^\pm_\alpha = -\,{}^\omega\! L^\pm\,\Delta_{\omega^\perp}^{-1}\,F^{\,{}^\omega\! B^\pm}({}^\omega\! L^\mp,\partial_\alpha).$$ (188)
Here $\Delta_{\omega^\perp} = \Delta - \nabla_\omega^2$ is the Laplacean on the plane perpendicular to the $\omega$ direction in $\mathbb{R}^n$. We would now like to make 188 our “choice” for the gauge transformed connection on the right hand side of 185. For example, even though it was based on the approximate assumption that the $\{{}^\omega\! B^\pm\}$ satisfy the scalar wave equation, it still respects the null-gauge 183 simply by the skew-symmetry property of the curvature. Unfortunately, 188 has several undesirable features. Firstly, we would like an expression which involves the curvature of $\{\underline{A}_1\}$, not the curvature $F^{\,{}^\omega\! B^\pm}$.
Secondly, the sub-Laplacean on the right hand side of this expression needs to be smoothed out in some way so that its dependence on the angular variable ω   is not so rough.
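The step from 186 and 187 to the integral form 188 can be sketched as follows. We suppress the $\omega$ and $\pm$ decorations on $B$ and write $L^\pm$ for the null generators; we assume the convention $F(X,Y) = X^\alpha Y^\beta(\nabla_\alpha B_\beta - \nabla_\beta B_\alpha + [B_\alpha,B_\beta])$ and signature $(-,+,\ldots,+)$, so that $L^+L^- = -\partial_t^2 + (\omega\cdot\nabla_x)^2$. These conventions are our assumptions for the illustration.

```latex
\begin{align*}
F(L^\mp,\partial_\alpha)
  &= L^\mp B_\alpha - \nabla_\alpha\big(B(L^\mp)\big) + [B(L^\mp),B_\alpha]
   = L^\mp B_\alpha
  && \text{since } B(L^\mp) = 0, \\
L^\pm F(L^\mp,\partial_\alpha)
  &= L^\pm L^\mp B_\alpha = \big(\Box - \Delta_{\omega^\perp}\big) B_\alpha
   \approx -\,\Delta_{\omega^\perp} B_\alpha
  && \text{if } \Box B_\alpha \approx 0, \\
B_\alpha
  &\approx -\,L^\pm\,\Delta_{\omega^\perp}^{-1}\,F(L^\mp,\partial_\alpha) .
\end{align*}
```

The null-gauge condition survives this formula because contracting the free index with $L^\mp$ produces $F(L^\mp,L^\mp) = 0$.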
To get around the first of these problems, we simply pretend that the various differential operators on the right hand side of 188 are gauge covariant. Assuming this and then conjugating both sides of that expression by ${}^\omega\! g_\pm$, moving these group elements past the differential operators on the right, and throwing away quadratic terms from the curvature while assuming that the reduced connection $\underline{A}_1$ satisfies the usual homogeneous wave equation, we are left with the approximate identities:
$${}^\omega\! g_\pm^{-1}\;{}^\omega\! B^\pm_\alpha\;{}^\omega\! g_\pm \approx -\,{}^\omega\! L^\pm\,\Delta_{\omega^\perp}^{-1}\,F^{\,\underline{A}_1}({}^\omega\! L^\mp,\partial_\alpha),$$
$$\approx (\underline{A}_1)_\alpha + \nabla_\alpha\;{}^\omega\! L^\pm\,\Delta_{\omega^\perp}^{-1}\,\underline{A}_1({}^\omega\! L^\mp).$$
To get around the second problem, we mollify the angular variable of the second term on the right hand side of this last expression. Doing this and looking back on the definition 185, we see that we would like our group elements to be such that:
$${}^\omega\! g_\pm^{-1}\,d({}^\omega\! g_\pm) \approx -\,{}^\omega\overline{\Pi}^{(\frac12-\delta)}\,\nabla_x\;{}^\omega\! L^\pm\,\Delta_{\omega^\perp}^{-1}\,\underline{A}_1(\omega).$$ (189)
Here we have set:
$$0 < \gamma \ll \delta \ll 1,$$ (190)
where $\gamma$ is our small all purpose constant from line 12 above. Now the problem is, of course, that the right hand side of the above formula does not in general represent a flat connection. However, as one can see immediately, its curvature is small in some sense because it is a quadratic expression. At this point, the problem now looks essentially like what happens for wave-maps8 (see e.g. [11] and [9]). In particular, it is clear that the right way to define the group elements ${}^\omega\! g_\pm$ so that the approximate formula 189 holds is to flatten out the right hand side of that expression as much as possible by using the potential version 3.2 of the Uhlenbeck lemma. Therefore, what we need to do is to show the fixed time estimate:
ω Π ¯ ( 1 2 δ ) x ω L ± Δ ω 1 A ̲ 1 ( ω ) L n , (191)
and then assume that   is chosen small enough so that we may use it as the constant in 22. Because of its utility in the sequel, we will in fact prove the more general estimate:
ω Π ¯ ( 1 2 δ ) x ω L ± Δ ω 1 A ̲ 1 ( ω ) B ˙ 2 , 10 n p γ , ( 2 , n 2 2 ) , (192)
where $p_\gamma$ is a dimension dependent Lebesgue index which we set to:
$$p_\gamma = \frac{2(n-1)}{n-3-2\gamma}.$$ (193)
Here $0 < \gamma \ll 1$ is again the all-purpose constant which we have fixed in section 2 to be small enough so that it is compatible with its use here. Notice that 192 implies the estimate 191 thanks to the embedding 43 and the fact that for $\gamma$ sufficiently small there is plenty of room in the inequality $p_\gamma < n$.
Now, because the norm $\dot B_{2,10n}^{p_\gamma,(2,\frac{n-2}{2})}$ is $\ell^2$ based, by orthogonality and the $L^\infty(\dot H^{\frac{n-2}{2}})$ estimate contained in the bootstrapping assumption 126d, we see that in order to conclude 192 it is enough to show the fixed frequency estimate (note that there are no high frequencies here): $$\big\|\nabla_x\;{}^\omega\! L^\pm\,\Delta_{\omega^\perp}^{-1}(\underline{A}_1)_\mu(\omega)\big\|_{L^{p_\gamma}} \lesssim \mu^{n(\frac12-\frac{1}{p_\gamma})}\,\|(\underline{A}_1)_\mu\|_{L^2}.$$ Decomposing the spatial frequency variable into fixed dyadic angular sectors spread from the direction $\omega$: $P_\mu = \sum_\theta {}^\omega\Pi_\theta P_\mu$, this estimate further reduces (after dyadic summing) to being able to prove that:
$$\big\|{}^\omega\Pi_\theta\,\nabla_x\;{}^\omega\! L^\pm\,\Delta_{\omega^\perp}^{-1}(\underline{A}_1)_\mu(\omega)\big\|_{L^{p_\gamma}} \lesssim \theta^{\gamma}\,\mu^{n(\frac12-\frac{1}{p_\gamma})}\,\|(\underline{A}_1)_\mu\|_{L^2}.$$ (194)
We are now almost at the point where we can apply the angular Bernstein inequality  56 directly, because in the current localized setting we have the symbol bounds:
ω Π θ x ω L ± Δ ω 1 S | τ | | ξ | P μ θ 2 P μ , (195)
where we are enforcing the heuristic notation introduced on line 58. However, since Bernstein only nets us a savings of: $$\theta^{(n-1)(\frac12-\frac{1}{p_\gamma})} = \theta^{1+\gamma},$$ in this context, we need to be a bit more careful in order to gain an extra power of $\theta$. This is provided by the fact that the potentials $\{\underline{A}_1\}$ are in the Coulomb gauge. Notice that if, say, $\frac{1}{10} < \theta$ there is nothing to worry about and we have estimate 194 without any problem. On the other hand, if it is the case that $\theta < \frac{1}{10}$, then we can use the fact that ${}^\omega\Pi_\theta\nabla_\omega^{-1}$ is elliptic (in terms of symbol bounds) in conjunction with the gauge condition $d^*\underline{A}_1 = 0$ to write:
ω Π θ A ̲ 1 ( ω ) = ω 1 ω Π θ / d * / A ̲ 1 θ ω Π θ / A ̲ 1 . (196)
Here $\{\slashed{\underline{A}}_1\}$ is the induced connection (angular portion) on the hyperplane $\omega^\perp$ perpendicular to $\omega$, and $\slashed{d}^*$ is the associated divergence. We note here that this identity will turn out to be very useful and will be used many times throughout the sequel. With these extra savings in mind, an application of Bernstein now directly yields the desired estimate 194.
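The bookkeeping of the powers of $\theta$ behind 194 can be made explicit. The sketch below reads 193 as $p_\gamma = \frac{2(n-1)}{n-3-2\gamma}$, which is the reading consistent with the Bernstein savings $\theta^{1+\gamma}$ quoted in the text, and takes the three factors at face value from the symbol bound 195, the angular Bernstein inequality 56, and the Coulomb gauge identity 196; the labels under the braces are ours.

```latex
\begin{align*}
\frac{1}{p_\gamma} &= \frac{n-3-2\gamma}{2(n-1)}, &
(n-1)\Big(\frac12 - \frac{1}{p_\gamma}\Big) &= \frac{(n-1) - (n-3-2\gamma)}{2} = 1+\gamma, \\
\underbrace{\theta^{-2}}_{\text{symbol bound}} \cdot
\underbrace{\theta^{1+\gamma}}_{\text{angular Bernstein}} \cdot
\underbrace{\theta}_{\text{Coulomb gauge}} &= \theta^{\gamma} .
\end{align*}
```

Without the extra factor of $\theta$ from the gauge condition the product would be $\theta^{-1+\gamma}$, which is not summable over small dyadic angles.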
We have now constructed the infinitesimal group elements ${}^\omega\! g_\pm$ in equations 185, which are explicitly defined by the formulas 31 in Lemma 3.2 applied to the connection:
$${}^\omega\!\underline{A}^\pm = -\,{}^\omega\overline{\Pi}^{(\frac12-\delta)}\,\nabla_x\;{}^\omega\! L^\pm\,\Delta_{\omega^\perp}^{-1}\,\underline{A}_1(\omega).$$ (197)
This has the pleasant effect that we will never need to explicitly refer to the connection $\{{}^\omega\! B^\pm\}$ in line 185. We can calculate the conjugated right hand side of that expression to be:
$${}^\omega\! g_\pm^{-1}\;{}^\omega\! B^\pm\;{}^\omega\! g_\pm = \underline{A}_1 - {}^\omega C^\pm,$$ (198)
where we have set:
$${}^\omega\! g_\pm^{-1}\,d({}^\omega\! g_\pm) = {}^\omega C^\pm.$$ (199)
Using the formulas  31 , we have the following expressions for the spatial components { ω C ̲ ± }   :
( ω C ̲ ± ) d f = d * Δ 1 [ ω C ̲ ± , ω C ̲ ± ] , (200a)
( ω C ̲ ± ) c f = ω A ̲ ± x Δ 1 [ ω A ̲ ± , ω C ̲ ± ] . (200b)
In order to compute a formula for the temporal potential ω C 0 ±   , we simply use the fact that F ω C ± = 0   and the formula  200b which together imply (by computing d * E ω C ±   ):
ω C 0 ± = ω A 0 ± t Δ 1 [ ω A ̲ ± , ω C ̲ ± ] d * Δ 1 [ ω C 0 ± , ω C ̲ ± ] , (201)
where we have: ω A 0 ± = ω Π ¯ ( 1 2 δ ) t ω L ± Δ ω 1 A ̲ 1 ( ω ) .   We remark here that the importance of the system of equations  200a  201 is that they give the following decomposition of the infinitesimal gauge transformation { ω C ± }   :
$${}^\omega C^\pm = -\,\nabla_{t,x}\;{}^\omega\overline{\Pi}^{(\frac12-\delta)}\;{}^\omega\! L^\pm\,\Delta_{\omega^\perp}^{-1}\,\underline{A}_1(\omega) + \{\text{Quadratic Error}\}.$$ (202)
The linear term in the above expression is enough to kill off the worst error term when differentiating the parametrix 184. It should be noted that this linear term is precisely what one gets more directly in the abelian case studied in [8]. We should also point out here that the quadratic error on the right hand side of 202 above is much more delicate than the quadratic error resulting from the cancellation involving the linear term in this expression. In order to control this, we will need the full force of the orthogonality properties of our parametrix, which are contained in the bootstrapping assumption 126d, as well as some rather technical function spaces and multilinear estimates which we will develop in Section 11.
To close out this section, we apply the truncated covariant wave operator $\Box_{\underline{A}_1}$ to the parametrix 184 and record the various error terms which result. We gather this together in the following proposition:
Proposition 8.1 (Error terms for the differentiated parametrix). Consider the parametrix $\Phi^\pm(\hat f)$ defined by the formula 184, with infinitesimal gauge transformations given by equations 200a-201. Then one has the identity:
$$\Box_{\underline{A}_1}\Phi^\pm(\hat f)$$ (203)
$$= 4\pi i\int_{\mathbb{R}^n} e^{2\pi i\lambda\,{}^\omega u^\pm}\Big[\underline{A}_1({}^\omega\! L^\mp) - {}^\omega C^\pm({}^\omega\! L^\mp),\ {}^\omega\! g_\pm^{-1}\hat f(\lambda\omega)\,{}^\omega\! g_\pm\Big]\chi_{(2^{-1},2)}(\lambda)\,\lambda^{n}\,d\lambda\,d\omega$$
$$-\int_{\mathbb{R}^n} e^{2\pi i\lambda\,{}^\omega u^\pm}\Big[D^\alpha_{\underline{A}_1}({}^\omega C^\pm)_\alpha,\ {}^\omega\! g_\pm^{-1}\hat f(\lambda\omega)\,{}^\omega\! g_\pm\Big]\chi_{(2^{-1},2)}(\lambda)\,\lambda^{n-1}\,d\lambda\,d\omega$$
$$+\int_{\mathbb{R}^n} e^{2\pi i\lambda\,{}^\omega u^\pm}\Big[\underline{A}_1^\alpha - ({}^\omega C^\pm)^\alpha,\ \big[(\underline{A}_1)_\alpha - {}^\omega C^\pm_\alpha,\ {}^\omega\! g_\pm^{-1}\hat f(\lambda\omega)\,{}^\omega\! g_\pm\big]\Big]\chi_{(2^{-1},2)}(\lambda)\,\lambda^{n-1}\,d\lambda\,d\omega.$$
Remark 8.2. The worst error term in the expression 203 is of course the “derivative fall on high” term which is the first on the right hand side. However, using the structure equation 126e, this takes the form:
$$\underline{A}_1({}^\omega\! L^\mp) - {}^\omega C^\pm({}^\omega\! L^\mp),$$ (204)
$$= \underline{A}_1(\omega) + {}^\omega\overline{\Pi}^{(\frac12-\delta)}\;{}^\omega\! L^\mp\,{}^\omega\! L^\pm\,\Delta_{\omega^\perp}^{-1}\,\underline{A}_1(\omega) + \{\text{Quadratic Error}\},$$
$$= \big(I - {}^\omega\overline{\Pi}^{(\frac12-\delta)}\big)\underline{A}_1(\omega) + \{\text{Quadratic Error}\}.$$
The key observation now is that since the operator $\big(I - {}^\omega\overline{\Pi}^{(\frac12-\delta)}\big)$ cuts off on such a small angular sector with respect to the spatial frequency, an application of Bernstein's inequality gains enough extra spatial derivatives to put this term in the mixed Lebesgue space $L^2(L^{n-1})$. Furthermore, the quadratic error term which is left over involves enough bilinear interactions to go in $L^1(L^\infty)$. So in this sense, as we have mentioned before, the problem reduces to something which is reminiscent of wave-maps. Of course, there is a somewhat heavy price to pay for this “renormalization”, which is that it must take place under an integral sign. Finally, it is worth pointing out that this top order cancellation is completely analogous to what happens in the abelian case [8].
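  • Proof of the error identity 203. The proof is a simple consequence of using gauge transformations in conjunction with the identity 182. Applying the truncated covariant wave operator, and differentiating under the integral sign, we see that:
    $$\Box_{\underline{A}_1}\Phi^\pm(\hat f),$$
    $$= \int_{\mathbb{R}^n} \Box_{\underline{A}_1}\Big(e^{2\pi i\lambda\,{}^\omega u^\pm}\;{}^\omega\! g_\pm^{-1}\hat f(\lambda\omega)\,{}^\omega\! g_\pm\Big)\chi_{(2^{-1},2)}(\lambda)\,\lambda^{n-1}\,d\lambda\,d\omega,$$
    $$= \int_{\mathbb{R}^n} {}^\omega\! g_\pm^{-1}\,\Box_{{}^\omega\! B^\pm}\Big(e^{2\pi i\lambda\,{}^\omega u^\pm}\hat f(\lambda\omega)\Big)\,{}^\omega\! g_\pm\,\chi_{(2^{-1},2)}(\lambda)\,\lambda^{n-1}\,d\lambda\,d\omega,$$
    $$= \int_{\mathbb{R}^n} e^{2\pi i\lambda\,{}^\omega u^\pm}\;{}^\omega\! g_\pm^{-1}\Big(4\pi i\lambda\,[{}^\omega\! B^\pm({}^\omega\! L^\mp),\hat f] + D^\alpha_{{}^\omega\! B^\pm}[{}^\omega\! B^\pm_\alpha,\hat f]\Big)\,{}^\omega\! g_\pm\,\chi_{(2^{-1},2)}(\lambda)\,\lambda^{n-1}\,d\lambda\,d\omega,$$
    $$= 4\pi i\int_{\mathbb{R}^n} e^{2\pi i\lambda\,{}^\omega u^\pm}\Big[{}^\omega\! g_\pm^{-1}\,{}^\omega\! B^\pm({}^\omega\! L^\mp)\,{}^\omega\! g_\pm,\ {}^\omega\! g_\pm^{-1}\hat f\,{}^\omega\! g_\pm\Big]\lambda^{n}\,\chi_{(2^{-1},2)}(\lambda)\,d\lambda\,d\omega$$ (205)
    $$+\int_{\mathbb{R}^n} e^{2\pi i\lambda\,{}^\omega u^\pm}\,D^\alpha_{\underline{A}_1}\Big[{}^\omega\! g_\pm^{-1}\,{}^\omega\! B^\pm_\alpha\,{}^\omega\! g_\pm,\ {}^\omega\! g_\pm^{-1}\hat f\,{}^\omega\! g_\pm\Big]\lambda^{n-1}\,\chi_{(2^{-1},2)}(\lambda)\,d\lambda\,d\omega,$$
    $$= 4\pi i\int_{\mathbb{R}^n} e^{2\pi i\lambda\,{}^\omega u^\pm}\Big[\underline{A}_1({}^\omega\! L^\mp) - {}^\omega C^\pm({}^\omega\! L^\mp),\ {}^\omega\! g_\pm^{-1}\hat f\,{}^\omega\! g_\pm\Big]\lambda^{n}\,\chi_{(2^{-1},2)}(\lambda)\,d\lambda\,d\omega$$ (206)
    $$-\int_{\mathbb{R}^n} e^{2\pi i\lambda\,{}^\omega u^\pm}\Big[\nabla^\alpha({}^\omega C^\pm)_\alpha,\ {}^\omega\! g_\pm^{-1}\hat f\,{}^\omega\! g_\pm\Big]\lambda^{n-1}\,\chi_{(2^{-1},2)}(\lambda)\,d\lambda\,d\omega$$ (207)
    $$+\int_{\mathbb{R}^n} e^{2\pi i\lambda\,{}^\omega u^\pm}\Big[(\underline{A}_1)^\alpha - ({}^\omega C^\pm)^\alpha,\ \nabla_\alpha\big({}^\omega\! g_\pm^{-1}\hat f\,{}^\omega\! g_\pm\big)\Big]\lambda^{n-1}\,\chi_{(2^{-1},2)}(\lambda)\,d\lambda\,d\omega$$ (208)
    $$+\int_{\mathbb{R}^n} e^{2\pi i\lambda\,{}^\omega u^\pm}\Big[(\underline{A}_1)^\alpha,\ \big[(\underline{A}_1)_\alpha - {}^\omega C^\pm_\alpha,\ {}^\omega\! g_\pm^{-1}\hat f\,{}^\omega\! g_\pm\big]\Big]\lambda^{n-1}\,\chi_{(2^{-1},2)}(\lambda)\,d\lambda\,d\omega$$
    $$= (\text{R.H.S.}) \text{ of 203}.$$
    Notice that the equality on the last line follows from: $$\nabla_\alpha\big({}^\omega\! g_\pm^{-1}\hat f\,{}^\omega\! g_\pm\big) = \big[{}^\omega\! g_\pm^{-1}\hat f\,{}^\omega\! g_\pm,\ {}^\omega C^\pm_\alpha\big],$$ which is a consequence of line 199 above, followed by the Jacobi identity:
    $$\Big[(\underline{A}_1)^\alpha - ({}^\omega C^\pm)^\alpha,\ \big[{}^\omega\! g_\pm^{-1}\hat f\,{}^\omega\! g_\pm,\ {}^\omega C^\pm_\alpha\big]\Big],$$
    $$= -\Big[{}^\omega C^\pm_\alpha,\ \big[(\underline{A}_1)^\alpha - ({}^\omega C^\pm)^\alpha,\ {}^\omega\! g_\pm^{-1}\hat f\,{}^\omega\! g_\pm\big]\Big]$$
    $$\quad - \Big[{}^\omega\! g_\pm^{-1}\hat f\,{}^\omega\! g_\pm,\ \big[{}^\omega C^\pm_\alpha,\ (\underline{A}_1)^\alpha - ({}^\omega C^\pm)^\alpha\big]\Big],$$
    $$= -\Big[{}^\omega C^\pm_\alpha,\ \big[(\underline{A}_1)^\alpha - ({}^\omega C^\pm)^\alpha,\ {}^\omega\! g_\pm^{-1}\hat f\,{}^\omega\! g_\pm\big]\Big]$$
    $$\quad - \Big[\big[(\underline{A}_1)^\alpha,\ {}^\omega C^\pm_\alpha\big],\ {}^\omega\! g_\pm^{-1}\hat f\,{}^\omega\! g_\pm\Big].$$
    This completes the proof of 203.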
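The commutator formula for the derivative of the conjugated amplitude, which drives the last step of the proof above, follows from the product rule alone. In the sketch below we abbreviate $g = {}^\omega\! g_\pm$ and $C_\alpha = g^{-1}\nabla_\alpha g$ (the content of line 199), and we use that $\hat f = \hat f(\lambda\omega)$ does not depend on $(t,x)$; the abbreviations are ours.

```latex
\begin{align*}
\nabla_\alpha\big(g^{-1}\hat f g\big)
  &= \nabla_\alpha(g^{-1})\,\hat f g + g^{-1}\hat f\,\nabla_\alpha g
   = -\,g^{-1}(\nabla_\alpha g)\,g^{-1}\hat f g + g^{-1}\hat f g\; g^{-1}\nabla_\alpha g \\
  &= -\,C_\alpha\,\big(g^{-1}\hat f g\big) + \big(g^{-1}\hat f g\big)\,C_\alpha
   = \big[\,g^{-1}\hat f g,\ C_\alpha\big] , \\
0 &= [X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]]
  \qquad \text{(Jacobi identity, applied with } X = \underline{A}_1^\alpha - C^\alpha,\ Y = g^{-1}\hat f g,\ Z = C_\alpha\text{)} .
\end{align*}
```

Since $[C_\alpha, C^\alpha] = 0$, only the cross term $[(\underline{A}_1)^\alpha, C_\alpha]$ survives in the inner bracket, and this is what combines with 207 to form the covariant divergence $D^\alpha_{\underline{A}_1}(C)_\alpha$ in 203.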

5 For those who are familiar with this kind of problem, this is precisely a reduction to the famous $Low \times High$ frequency interaction $\underline{A}_1^\alpha\nabla_\alpha\Phi_1$.

6 This would end up being the usual frequency based Hadamard parametrix for the operator $\Box_{\underline{A}_1}$.

7 This is an artifact of the critical nature of the problem.

Specifically, the group elements have the heuristic form ω g = e x p ( 1 ω A )   . Since we do not have L   control on 1 ω A   we cannot localize its image.

8 It is very much our philosophy here that this problem is essentially equivalent to wave-maps after a microlocalization.

Of course, as the reader will see, this microlocalization is quite costly and introduces many objects that are not present in the original wave-maps problem.

9 Fixed Time $L^2$ Estimates for the Parametrix

We now begin our proof of the estimates 178 for the integral operator 184 introduced in the last section. Here we cover bounds which are of non-differentiated energy type. Specifically, we will show the undifferentiated $L^\infty(L^2)$ estimate contained in 178a, as well as the multiplier-approximation bound 178d. Both of these will follow from the same set of estimates. At a heuristic level, they are not much more involved than a standard $TT^*$ argument followed by some integration by parts, although the details turn out to be somewhat lengthy. Things will be computed more or less directly by an appeal to the explicit equations 200a-201, taking a little bit of care to use them properly. This will be done by considering them as “path lifting” formulas from Minkowski space $\mathcal{M}^n$ to the compact group $G$.
This allows us to employ an integral form of the intermediate value theorem from elementary calculus which is valid in the context of Lie groups. It turns out that this identity can be differentiated as many times as necessary with respect to the angular frequency variable, although this fact is provided through a surprisingly delicate bootstrapping argument. Here the unitarity of the group is needed in a crucial way to keep everything from collapsing. Once the bootstrapping is complete, the estimates themselves will be proved using a “trace-Bernstein” type inequality that we construct by hand using various multipliers. Once the integration by parts portion of things is taken care of, we will close the $L^2$ estimate by showing that a “non-smooth” remainder kernel has small amplitudes after integration in the angular frequency variable. This involves some fairly technical bilinear estimates because the necessary orthogonality arguments are difficult to pass through Hodge systems. The details of these procedures are as follows.
Throughout this section we will replace the specific cutoff function $\chi_{(\frac12,2)}$ appearing in the definition of the parametrix 184 with an arbitrary smooth scalar bump function $\chi(\xi)$ that we may assume to be supported in the frequency annulus $\{4^{-1} < |\xi| < 4\}$.
At fixed time $t_0$, we define the operator $T(\hat f) = \Phi(\hat f)(t_0)$, where we have suppressed the $\pm$ notation because it will be irrelevant for what we do here. Our first goal is to prove the bound:
$$\|T(\hat f)\|_{L^2} \lesssim \|\hat f\|_{L^2}.$$ (209)
Squaring this, it suffices to show that (here $f$ has no relation to $\hat f$ and simply represents a function of the physical-space variables):
$$\|TT^*(f)\|_{L^2} \lesssim \|f\|_{L^2},$$ (210)
where the adjoint $T^*$ is taken with respect to the Killing form 13. A quick calculation of the kernel of this operator shows that:
$$K^{TT^*}(x,y) = \int_{\mathbb{R}^n} e^{2\pi i(x-y)\cdot\xi}\;{}^\omega\! g^{-1}(x)\,{}^\omega\! g(y)\,[\,\cdot\,]\,{}^\omega\! g^{-1}(y)\,{}^\omega\! g(x)\,\chi(\xi)\,d\xi,$$ (211)
where we use the $[\,\cdot\,]$ notation to emphasize the fact that this operator acts via conjugation. Our task is now to show the estimates:
$$\|K^{TT^*}\|_{L^\infty_y(L^1_x)},\ \|K^{TT^*}\|_{L^\infty_x(L^1_y)} \lesssim 1.$$ (212)
Since $K^{TT^*}$ is essentially symmetric in $(x,y)$, we may concentrate on the first such estimate.
To proceed, we first decompose the product physical space $\mathbb{R}^n\times\mathbb{R}^n$ into the dyadic regions:
$$D_\sigma = \{\,|x-y| \sim \sigma \mid \sigma = 2^i,\ i\in\mathbb{N}\,\}.$$ (213)
We then decompose the $TT^*$ kernel into the dyadic sum: $$K^{TT^*} = \sum_\sigma \chi_{D_\sigma}K^{TT^*} = \sum_\sigma K_\sigma^{TT^*}.$$ By dyadic summing, to show 212 it suffices to be able to show the single estimate:
$$\|K_\sigma^{TT^*}\|_{L^\infty_y(L^1_x)} \lesssim \sigma^{-\gamma},$$ (214)
where $0 < \gamma \ll 1$ now represents a small savings in physical space decay. Now 214 would be easy to show if we had the absolute decay estimate: $$|K_\sigma^{TT^*}(x,y)| \lesssim |x-y|^{-(n+\gamma)},$$ and this is almost true. Unfortunately, there is a regularity problem due to the degeneracy of the sub-Laplacean $\Delta_{\omega^\perp}$ used in the connection 200 which provides the group elements ${}^\omega\! g$. This forces us to write the kernel $K_\sigma^{TT^*}$ as a sum of two terms:
$$K_\sigma^{TT^*} = \widetilde K_\sigma^{TT^*} + \mathcal{R}_\sigma^{TT^*}.$$ (215)
We will then prove that both:
$$|\widetilde K_\sigma^{TT^*}(x,y)| \lesssim |x-y|^{-(n+\gamma)},$$ (216)
$$\|\mathcal{R}_\sigma^{TT^*}\|_{L^\infty_y(L^1_x)} \lesssim \sigma^{-\gamma}.$$ (217)
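To spell out why a pointwise decay bound of the type 216 is enough for the localized estimate 214, and why the resulting bounds can be summed over the dyadic scales in 213, one can argue as in the following sketch. It uses only the volume of a dyadic shell in $\mathbb{R}^n$ and a geometric series; the implicit constants depend on $n$ and $\gamma$.

```latex
\begin{align*}
\sup_{y}\int_{|x-y|\sim\sigma} |x-y|^{-(n+\gamma)}\,dx
  &\lesssim \sigma^{-(n+\gamma)}\,\big|\{x : |x-y|\sim\sigma\}\big|
   \lesssim \sigma^{-(n+\gamma)}\,\sigma^{n} = \sigma^{-\gamma}, \\
\sum_{\sigma = 2^i,\ i\in\mathbb{N}} \sigma^{-\gamma}
  &\leqslant \sum_{i\geqslant 0} 2^{-\gamma i} = \frac{1}{1-2^{-\gamma}} < \infty .
\end{align*}
```

This is the sense in which $\gamma$ represents a small savings in physical space decay: any positive $\gamma$ makes the dyadic sum converge.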
To define the splitting 215, we factor the group elements ${}^\omega\! g$ into a product of smooth and small parts. This is completely analogous to the procedure used in [8], but since things are non-abelian (and hence non-linear) here, the estimates required are quite a bit more involved. What we will do is construct another gauge transformation ${}^\omega\!\widetilde g$, which is based on a further smoothing of the connection 197.
This will produce a group element which can be treated as a standard symbol. To this end, we define the scale mollified connection:
$${}^\omega\!\widetilde{\underline{A}^{(\sigma)}} = -\,{}^\omega\overline{\Pi}_{\sigma^{-1+\gamma}<\bullet}\;{}^\omega\overline{\Pi}^{(\frac12-\delta)}\,\nabla_x\;{}^\omega\! L\,\Delta_{\omega^\perp}^{-1}\,\underline{A}_1(\omega),$$ (218)
where $\gamma$ is, again, the small dimensional constant from line 190. Again, we have dropped the $\pm$ notation because it is irrelevant. Following the proof of 191, and using the fact that the multipliers ${}^\omega\overline{\Pi}_{\sigma^{-1+\gamma}<\bullet}$ are bounded on frequency localized Lebesgue spaces, we may apply Lemma 3.2 to the connection $\{{}^\omega\!\widetilde{\underline{A}^{(\sigma)}}\}$.
This produces a group element ${}^\omega\widetilde{g}$, which is defined by the infinitesimal generator:
$$ {}^\omega\widetilde{g}^{-1}\, d({}^\omega\widetilde{g}) = {}^\omega\widetilde{\underline{C}} . \tag{219} $$
Furthermore, this generator is itself defined via the Hodge system:
( ω C ̲ ~ ) d f = d * Δ 1 [ ω C ̲ ~ , ω C ̲ ~ ] , (220a)
( ω C ̲ ~ ) c f = ω A ̲ ( σ ) ~ x Δ 1 [ ω A ̲ ( σ ) ~ , ω C ̲ ~ ] . (220b)
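Using this new group element ${}^\omega\widetilde{g}$, we define the remainder group element ${}^\omega h$ via the product:
$$ {}^\omega g = {}^\omega h\,{}^\omega\widetilde{g} . \tag{221} $$
To compute the infinitesimal generator of ${}^\omega h$, we first use the identity:
$$ \begin{aligned} d({}^\omega h) &= d({}^\omega g)\,{}^\omega\widetilde{g}^{-1} + {}^\omega g\, d({}^\omega\widetilde{g}^{-1}) , \\ &= {}^\omega h\,{}^\omega\widetilde{g}\,\big({}^\omega\underline{C} - {}^\omega\widetilde{\underline{C}}\big)\,{}^\omega\widetilde{g}^{-1} . \end{aligned} \tag{222} $$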
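The identity (222) can be checked by a short direct computation. The sketch below suppresses the $\omega$ superscripts and writes $C$, $\widetilde{C}$ for the two connections. It assumes that the original group element satisfies $g^{-1}dg = C$ (the analog of (219) for the unsmoothed system), which is not restated in this section.

```latex
% Assumptions: g = h \tilde g as in (221), \tilde g^{-1} d\tilde g = \tilde C as in (219),
% and g^{-1} dg = C for the unsmoothed gauge transformation.
\begin{align*}
h &= g\,\widetilde{g}^{-1} , \\
d(\widetilde{g}^{-1}) &= -\,\widetilde{g}^{-1}\, d(\widetilde{g})\,\widetilde{g}^{-1}
   = -\,\widetilde{C}\,\widetilde{g}^{-1} , \\
d(h) &= d(g)\,\widetilde{g}^{-1} + g\, d(\widetilde{g}^{-1}) \\
     &= g\,C\,\widetilde{g}^{-1} - g\,\widetilde{C}\,\widetilde{g}^{-1} \\
     &= h\,\widetilde{g}\,\big(C - \widetilde{C}\big)\,\widetilde{g}^{-1} .
\end{align*}
```

In particular, $h^{-1}dh$ is the conjugation of the difference $C - \widetilde{C}$ by $\widetilde{g}$, which is why the difference connection is the natural object to estimate.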
This leads us to define the difference connection:
$$ {}^\omega\widetilde{\widetilde{\underline{C}}} = {}^\omega\underline{C} - {}^\omega\widetilde{\underline{C}} . \tag{223} $$
A quick calculation using the systems  200 and  220 shows that this new connection can be pinned down via the Hodge system:
( ω C ̲ ~ ~ ) d f = d * Δ 1 ( [ ω C ̲ ~ , ω C ̲ ~ ~ ] + [ ω C ̲ ~ ~ , ω C ̲ ~ ] ) , (224a)
( ω C ̲ ~ ~ ) c f = ω A ̲ ~ ω A ̲ ( σ ) ~ x Δ 1 ( [ ω A ̲ ~ ω A ̲ ( σ ) ~ , ω C ̲ ~ ] + [ ω A ̲ ( σ ) ~ , ω C ̲ ~ ~ ] ) , (224b)
where a simple computation shows that:
ω A ̲ ω A ̲ ( σ ) ~ = ω Π ¯ σ 1 + γ ω Π ¯ ( 1 2 δ ) x ω L Δ ω 1 A ̲ 1 ( ω ) , (225)
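We now define the decomposition (215) along the following decompositions of the group element products in the kernel (211):
$$ {}^\omega g^{-1}(x)\,{}^\omega g(y) = {}^\omega\widetilde{g}^{-1}(x)\,{}^\omega\widetilde{g}(y) + {}^\omega\widetilde{g}^{-1}(x)\big({}^\omega h^{-1}(x)\,{}^\omega h(y) - I\big)\,{}^\omega\widetilde{g}(y) , \tag{226} $$
$$ {}^\omega g^{-1}(y)\,{}^\omega g(x) = {}^\omega\widetilde{g}^{-1}(y)\,{}^\omega\widetilde{g}(x) + {}^\omega\widetilde{g}^{-1}(y)\big({}^\omega h^{-1}(y)\,{}^\omega h(x) - I\big)\,{}^\omega\widetilde{g}(x) . \tag{227} $$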
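The decomposition (226) is purely algebraic and follows from the product formula (221). As a short verification, with the $\omega$ superscripts suppressed:

```latex
% Uses only g = h \tilde g from (221), hence g^{-1} = \tilde g^{-1} h^{-1}.
\begin{align*}
g^{-1}(x)\,g(y)
  &= \widetilde{g}^{-1}(x)\,h^{-1}(x)\,h(y)\,\widetilde{g}(y) \\
  &= \widetilde{g}^{-1}(x)\,\Big[\, I + \big(h^{-1}(x)\,h(y) - I\big) \Big]\,\widetilde{g}(y) \\
  &= \widetilde{g}^{-1}(x)\,\widetilde{g}(y)
   + \widetilde{g}^{-1}(x)\,\big(h^{-1}(x)\,h(y) - I\big)\,\widetilde{g}(y) .
\end{align*}
```

The second decomposition (227) follows in the same way after exchanging the roles of $x$ and $y$.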
Accordingly, we define:
K ~ T T * ( x , y ) = R n e 2 π i ( x y ) ξ ω g ~ 1 ( x ) ω g ~ ( y ) [ ] ω g ~ 1 ( y ) ω g ~ ( x ) χ ( ξ ) d ξ , (228)
and then define σ T T *   according to the formula  215 . The idea now is that while one can only perform integration by parts in the kernel  228 above, the group element ω h 1 ( x ) ω h ( y )   and its inverse, which must be contained as at least one factor in the remainder, are so close to the identity matrix that the resulting difference expression can be estimated without use of the oscillations which take place under the integral sign.
We now begin our proof of the estimate (216). To do this, we simply integrate by parts as many times as necessary with respect to the variable $\xi$ in order to pick up the needed point-wise decay. Doing this, we see that in order to draw our conclusion, it suffices to show the following symbol bounds for $1\leq k$:
$$ \big\| \chi_{D_\sigma} \nabla_\xi^k \big( {}^\omega\widetilde{g}^{-1}(x)\,{}^\omega\widetilde{g}(y) \big) \big\| \lesssim \sigma^{k(1-\gamma)} , \tag{229} $$
$$ \big\| \chi_{D_\sigma} \nabla_\xi^k \big( {}^\omega\widetilde{g}^{-1}(y)\,{}^\omega\widetilde{g}(x) \big) \big\| \lesssim \sigma^{k(1-\gamma)} . \tag{230} $$
In fact, we shall prove the following more general bounds, which contain (229) and (230) as a special case, and which will be useful in the sequel:
Proposition 9.1 (Symbol bounds for the smoothed amplitudes ${}^\omega\widetilde{g}^{-1}(t,x)\,{}^\omega\widetilde{g}(s,y)$ and ${}^\omega\widetilde{g}^{-1}(s,y)\,{}^\omega\widetilde{g}(t,x)$). Let the group elements ${}^\omega\widetilde{g}$ be defined infinitesimally by the Hodge system (220), where the parameter $\sigma^{-1+\gamma}$ is replaced by $M^{-1}$, where $M$ lies in the range:
$$ \big( |t-s| + |x-y| \big)^{\frac{1}{2}} \leq M \leq |t-s| + |x-y| . \tag{231} $$
Then for any integer 1 k   , one has the following symbol bounds assuming that the bootstrapping constant   from line  126d is chosen sufficiently small (with respect to each fixed k   ):
$$ \big\| \nabla_\xi^k \big( {}^\omega\widetilde{g}^{-1}(t,x)\,{}^\omega\widetilde{g}(s,y) \big) \big\| \lesssim M^k , \tag{232} $$
$$ \big\| \nabla_\xi^k \big( {}^\omega\widetilde{g}^{-1}(s,y)\,{}^\omega\widetilde{g}(t,x) \big) \big\| \lesssim M^k . \tag{233} $$
Here the $\nabla_\xi^k$ notation is shorthand for all $k$-th order partial derivatives involving the variable $\xi$, and $\|\cdot\|$ is the standard matrix vector-norm from line 15. The implicit constants on the right hand side depend on $k$, but are uniform in the parameter $M$ for each fixed $k$.
  • Proof of the estimates  232  233 . It suffices for us to prove the first bound  232 , as the second follows from virtually identical reasoning. The goal is to reduce this via an ODE bootstrapping type argument to an associated estimate involving the connection { ω C ~ }   . This associated estimate will then be proved by another bootstrapping argument in certain mixed Lebesgue-Besov spaces naturally associated with the ODE problem from the first step. The goal of the second bootstrapping will be to reduce things to proving the Besov estimates for the connection { A ̲ 1 }   which appears as the linear term on the right hand side of the Hodge system  220a .
    Before proceeding, we first make a preliminary reduction on the product ω g ~ 1 ( t , x ) ω g ~ ( s , y )   .
    We would like to set things up so that we only have to handle products which involve the same space or the same time variables. This is easily accomplished via the product decomposition:
    $$ {}^\omega\widetilde{g}^{-1}(t,x)\,{}^\omega\widetilde{g}(s,y) = {}^\omega\widetilde{g}^{-1}(t,x)\,{}^\omega\widetilde{g}(t,y)\cdot{}^\omega\widetilde{g}^{-1}(t,y)\,{}^\omega\widetilde{g}(s,y) . \tag{234} $$
    It is clear that if we can produce the bounds  232 for each of the terms on the right hand side of  234 separately, then by the product rule for derivatives we have the estimate  232 for the full term. Since they require slightly different arguments, we will proceed separately for each of these two factors.
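    The product rule step can be sketched schematically as follows. Here $F$ and $G$ are our own shorthand for the two factors on the right hand side of (234), multi-index bookkeeping is suppressed, and we assume that the matrix norm is submultiplicative and that $\|F\|, \|G\|\lesssim 1$ (which is the case for elements of a compact matrix group).

    ```latex
    % Hypothetical shorthand: F = \tilde g^{-1}(t,x)\tilde g(t,y), G = \tilde g^{-1}(t,y)\tilde g(s,y).
    % Assume \|\nabla_\xi^j F\| \lesssim M^j and \|\nabla_\xi^j G\| \lesssim M^j for 0 <= j <= k.
    \begin{align*}
    \big\| \nabla_\xi^k (F G) \big\|
      &\leq \sum_{j=0}^{k} \binom{k}{j}\, \big\| \nabla_\xi^{j} F \big\|\,\big\| \nabla_\xi^{k-j} G \big\| \\
      &\lesssim \sum_{j=0}^{k} \binom{k}{j}\, M^{j}\, M^{k-j}
       \;=\; 2^k M^k .
    \end{align*}
    ```

    The factor $2^k$ is absorbed into the implicit constant, which is allowed to depend on $k$.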
    Our first task is to prove the bound  232 for the spatial product ω g ~ 1 ( t , x ) ω g ~ ( t , y )   .
    This will be done inductively with respect to the value of k   . Since we will proceed via a bootstrapping type procedure, we first assume that we can prove the desired bounds over small intervals and then try to use this knowledge to extend things to longer intervals. To do this, we differentiate the product ω g ~ 1 ( t , ) ω g ~ ( t , y )   , where [ y , ]   is some shorter line segment inside of [ y , x ]   , with respect to the operators ( M 1 ξ ) k   . This yields the equation:
    (235) ( M 1 ξ ) k ( ω g ~ 1 ( ) ω g ~ ( y ) ) = i = 0 k 1 ( M 1 ξ ) k i ( ω g ~ 1 ( ) ω g ~ ( x 1 ) ) ( M 1 ξ ) i ( ω g ~ 1 ( x 1 ) ω g ~ ( y ) ) + ( ω g ~ 1 ( ) ω g ~ ( x 1 ) ) ( M 1 ξ ) k ( ω g ~ 1 ( x 1 ) ω g ~ ( y ) ) .
    In the above identity, we have dropped the dependence on time as it no longer has any bearing on how we proceed. Also [ x 1 , ]   denotes an even smaller interval embedded in the overall bootstrapping line segment [ y , ]   . We will let this smaller segment go to zero. Before doing this, we collect the last term on the right hand side of  235 onto the left, apply the matrix norm  15 and the reverse triangle inequality, and use the isometric identity  16 to arrive at the bound:
    (236) | ( M 1 ξ ) k ( ω g ~ 1 ( ) ω g ~ ( y ) ) ( M 1 ξ ) k ( ω g ~ 1 ( x 1 ) ω g ~ ( y ) ) | i = 0 k 1 ( M 1 ξ ) k i ( ω g ~ 1 ( ) ω g ~ ( x 1 ) ) ( M 1 ξ ) i ( ω g ~ 1 ( x 1 ) ω g ~ ( y ) ) .
    We now divide both sides of this last expression by the small interval length | x 1 | and let the resulting expression go to the limit x 1 . To compute this, we only need to handle the expressions:
    lim x 1 | x 1 | 1 ( M 1 ξ ) k i ( ω g ~ 1 ( ) ω g ~ ( x 1 ) ) , (237)
    where we have the important restriction 1 k i   . We do this by using the fact that the gauge equation  219 gives us an explicit realization of the product ω g ~ 1 ( ) ω g ~ ( x 1 )   as an integral over the interval [ x 1 , ]   :
    ω g ~ 1 ( ) ω g ~ ( x 1 ) = x 1 ω g ~ 1 ( x 1 ) ω g ~ ( s ) ω C ̲ ~ α ( ) ( s ) d s + I . (238)
    Here the α ( )   index denotes the component of the connection { ω C ̲ }   in the direction of the line segment [ y , x ]   . Plugging this last expression into the limit  237 and using the fundamental theorem of calculus on the resulting identity we arrive at the simple equation:  237 
    = ( M 1 ξ ) k i ( ω C ̲ ~ α ( ) ( ) ) . (239)
    Notice that the identity matrix on line  238 drops out because of the condition 1 k i   , and that all terms where the derivatives fall on the group elements are zero because when x 1 =   these are again just derivatives of the identity matrix I   . Now, substituting  239 into the limiting version of  236 we have the differential inequality:
    (240) | ( M 1 ξ ) k ( ω g ~ 1 ( ) ω g ~ ( y ) ) | i = 0 k 1 ( M 1 ξ ) k i ( ω C ̲ ~ α ( ) ( ) ) ( M 1 ξ ) i ( ω g ~ 1 ( ) ω g ~ ( y ) ) .
    Assuming now that we have proved the inductive bound: sup 0 i k 1 ( M 1 ξ ) i ( ω g ~ 1 ( ) ω g ~ ( y ) ) 1 ,   which is easy when k 1 = 0   on account of the compactness of the group O ( m )   , we see that by integrating the expression ( M 1 ξ ) k ( ω g ~ 1 ( ) ω g ~ ( y ) )   the proof of  229 at the k t h   step boils down to being able to establish the line integral estimate:
    i = 0 k 1 y x ( M 1 ξ ) k i ( ω C ̲ ~ α ( ) ( ) ) d . (241)
    The reason this bound will be possible is that we have taken care to make sure that there is always at least one copy of the operator $(M^{-1}\nabla_\xi)$ in each of the above integrals, and it is the presence of the extra factor $M^{-1}$ in conjunction with the range restriction (231) that will be enough to provide the needed integrability. In fact, using the condition that $M^{-1}\leq |x-y|^{-\frac{1}{2}}$ and the Cauchy-Schwarz inequality, we see that it suffices to be able to prove the bound:
    i = 0 k 1 y x ( M 1 ξ ) i ξ ( ω C ̲ ~ α ( ) ( ) ) 2 d 2 . (242)
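    The passage from (241) to (242) can be sketched as follows. This is only an illustration under stated assumptions: $\ell$ is our own label for the point moving along the segment $[y,x]$, $F_j$ is our shorthand for the integrand with one factor of $M^{-1}$ pulled out, and the lower bound in (231) is read as $M^{-1}\leq |x-y|^{-1/2}$.

    ```latex
    % Hypothetical shorthand: F_j(\ell) := (M^{-1}\nabla_\xi)^{j}\,\nabla_\xi\,\widetilde{C}_{\alpha}(\ell), 0 <= j <= k-1,
    % so that (M^{-1}\nabla_\xi)^{k-i}\,\widetilde{C}_{\alpha} = M^{-1} F_{k-i-1}, which is possible because k-i >= 1.
    \begin{align*}
    \int_y^x \big\| M^{-1} F_j(\ell) \big\| \, d\ell
      &\leq M^{-1} \Big( \int_y^x 1 \, d\ell \Big)^{1/2}
            \Big( \int_y^x \| F_j(\ell) \|^2 \, d\ell \Big)^{1/2} \\
      &= M^{-1}\, |x-y|^{1/2}\, \Big( \int_y^x \| F_j(\ell) \|^2 \, d\ell \Big)^{1/2} \\
      &\leq \Big( \int_y^x \| F_j(\ell) \|^2 \, d\ell \Big)^{1/2} .
    \end{align*}
    ```

    So a bound on the squared integrals implies the corresponding bound on the $L^1$ type line integrals, at the cost of leaving one $\nabla_\xi$ without its smoothing factor $M^{-1}$.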
    This last integral can now be bounded in terms of energy type estimates once one applies the L L 2   trace theorem to it. However, because of the various angular degeneracies involved in the potentials { ξ ω C ̲ ~ }   , it will be necessary for us to use a more refined “trace-Bernstein” type inequality. Furthermore, since the connection { ω C ̲ ~ }   is only defined implicitly via the Hodge system  220 , it will be necessary for us to prove estimate  242 via a bootstrapping argument in mixed Lebesgue spaces.
    What we will do is to show the following somewhat more restrictive estimate which yields  242 as a consequence:
    Lemma 9.2. Let the connection { ω C ̲ ~ }   be defined via the Hodge system  220 :
    ( ω C ̲ ~ ) d f = d * Δ 1 [ ω C ̲ ~ , ω C ̲ ~ ] , (243a)
    ( ω C ̲ ~ ) c f = ω A ̲ ( M ) ~ x Δ 1 [ ω A ̲ ( M ) ~ , ω C ̲ ~ ] . (243b)
    where we have set:
    ω A ̲ ( M ) ~ = x ω Π ¯ M 1 < ω Π ¯ ( 1 2 δ ) ω L Δ ω 1 A ̲ 1 ( ω ) . (244)
    Furthermore, assume that the parameter $M^{-1}$ is such that $M$ lies in the range (231) (although this is not essential). Then the following mixed Lebesgue space estimates of Besov type hold:
    i = 0 k 1 μ ( M 1 ξ ) i ξ P μ ( ω C ̲ ~ ) L ( L 2 ) . (245)
    • Proof of estimate  245 . Things will be a bit easier if we prove the following more restrictive estimate:
      i = 0 k 1 μ μ γ ( 1 + μ ) n ( M 1 ξ ) i ξ P μ ( ω C ̲ ~ ) L 2 ( L ) . (246)
      That (245) is a consequence of (246) is a simple matter of applying the Minkowski inequality for mixed Lebesgue spaces and the fact that the weights in (246) are clearly more restrictive. Now, the proof of this second estimate is essentially no more complicated than using the Bernstein inequality in the hyperplane $\mathbb{R}^{n-1}$ to turn things into the energy estimate contained in the bootstrapping norm (126d).
      To see this, we begin our proof of  246 by first establishing this bound for the reduced Coulomb potentials { ω A ̲ ( M ) ~ }   .
      We are now trying to prove that:
      j = 0 , 1 i = 0 k 1 μ μ γ ( 1 + μ ) n ( M 1 ξ ) i ξ j P μ ( ω A ̲ ( M ) ~ ) L 2 ( L ) . (247)
      For each fixed frequency in the above sum, we decompose things into all frequencies corresponding to the $\mathbb{R}^{n-1}$ plane, as well as all possible dyadic angular sectors spread from the $\omega$ (fixed) direction: $$ P_\mu = \sum_{\theta,\lambda\,:\,\lambda\lesssim\mu} {}^\omega\Pi_\theta\, Q_\lambda\, P_\mu , $$ where $Q_\lambda$ is an $(n-1)$ dimensional fixed frequency multiplier which is defined in analogy with $P_\lambda$. Freezing all frequencies, our goal will be to show the following estimate:
      ( M 1 ξ ) i ξ j ω Π θ Q λ P μ ( ω A ̲ ( M ) ~ ) L 2 ( L ) θ γ ( λ μ ) γ μ 2 γ . (248)
      By adding in the weights μ γ ( 1 + μ ) n , using the fact that the potentials { ω A ̲ ( M ) ~ } are truncated to frequencies μ 1 , and dyadic summing, the fixed frequency estimate (248) implies (247) with room to spare. To deal with all of the ξ derivatives, notice that we have the following heuristic multiplier bounds:
      ( M 1 ξ ) i ξ j ω Π θ Q λ P μ ( ω A ̲ ( M ) ~ ) θ 2 ω Π θ ω Π ¯ ( 1 2 δ ) Q λ P μ ( A ̲ 1 ) , (249)
      where we are enforcing the notation introduced on line  58 . That is, the left hand side of the above identity satisfies all mixed Lebesgue space bounds as the right hand side with the same constants. Notice that this bound uses the extra Coulomb savings introduced on line  196 above to kill off one power of θ 1   from the degenerate Laplacean Δ ω   . The other power of θ 1   on the right hand side of  249 comes from the operator ξ   which has no smoothing factor of M 1   . This is precisely what one pays for passing from the L 1   integral  241 to the more manageable L 2   integral  242 . Finally, it is important to point out that although we have not emphasized it, the multipliers Q λ   depend on ω   , but the fact that λ θ   implies that the multiplier product on the left hand side of  249 is zero prevents the derivatives of Q λ   with respect to ξ   from costing more than derivatives of ω Π θ   (alternatively, we could have applied the Q λ   multipliers on the outside of the ξ k   operators, because differentiation will not change the support of the various multipliers).
      Now, to use the Bernstein inequality on the R n 1   plane, we simply note that one has the multiplier identity:
      ω Π θ Q λ P μ = ω | | B ( μ θ ) ω Π θ Q λ P μ , (250)
      where ω | | B ( μ θ ) is a (smooth symbol) block type cutoff in the $\mathbb{R}^{n-1}$ frequency plane of dimensions $1\times(\mu\theta)\times\cdots\times(\mu\theta)$ which has its long side centered along the projection (footnote 9) of the unit vector $\omega$ onto the $\mathbb{R}^{n-1}$ (frequency) plane. The crucial fact about the geometry of the multiplier (250) is that it has support contained in a box of size $\lambda\times(\mu\theta)\times\cdots\times(\mu\theta)$ in the $\mathbb{R}^{n-1}_\xi$ (frequency) plane. Using now the identities (249) and (250), as well as the $n-1$ dimensional Bernstein inequality, we see that we may estimate:
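      $$ \big\| (M^{-1}\nabla_\xi)^i \nabla_\xi^j\, {}^\omega\Pi_\theta\, Q_\lambda\, P_\mu \big( {}^\omega\widetilde{\underline{A}^{(M)}} \big) \big\|_{L^2(L^\infty)} \lesssim \theta^{-2}\, \lambda^{\frac{1}{2}}\, (\mu\theta)^{\frac{n-2}{2}}\, \| P_\mu(\underline{A}_1) \|_{L^2} . \tag{251} $$
      To deal with the weights on the right hand side, we use the truncation condition that $\mu^{\frac{1}{2}-\delta}\lesssim\theta$, as well as the fact that $\lambda\lesssim\mu$, to conclude the bound: $$ \theta^{-2}\, \lambda^{\frac{1}{2}}\, (\mu\theta)^{\frac{n-2}{2}}\, \mu^{-\frac{n-2}{2}} \lesssim \theta^{\gamma} \Big( \frac{\lambda}{\mu} \Big)^{\gamma} \mu^{2\gamma} . $$ Substituting this into the right hand side of estimate (251) and using the $L^\infty(\dot{H}^{\frac{n-2}{2}})$ bound contained in the bootstrapping estimate (126d), we have achieved the desired result (248).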
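      The exponent bookkeeping behind this weight bound can be sketched as follows. The directions of the inequalities $\theta\lesssim 1$, $\mu\lesssim 1$, $\lambda\lesssim\mu$ and $\mu^{1/2-\delta}\lesssim\theta$ are our reading of the surrounding text, and the explicit smallness condition on $\gamma$ is an assumption made only for this illustration.

      ```latex
      % Assumptions: n >= 6, 0 < \theta \lesssim 1, \mu \lesssim 1, \lambda \lesssim \mu,
      % \mu^{1/2-\delta} \lesssim \theta, and \gamma small enough that \gamma(5/2-\delta) <= 1/2.
      \begin{align*}
      \theta^{-2}\lambda^{1/2}(\mu\theta)^{\frac{n-2}{2}}\mu^{-\frac{n-2}{2}}
        &= \lambda^{1/2}\,\theta^{\frac{n-6}{2}}
         \;\lesssim\; \lambda^{1/2}
         && (n\geq 6,\ \theta\lesssim 1) \\
        &= \Big(\frac{\lambda}{\mu}\Big)^{1/2}\mu^{1/2}
         \;\lesssim\; \Big(\frac{\lambda}{\mu}\Big)^{\gamma}\mu^{1/2}
         && (\lambda\lesssim\mu,\ \gamma\leq 1/2) \\
        &\lesssim \Big(\frac{\lambda}{\mu}\Big)^{\gamma}\mu^{\gamma(\frac{5}{2}-\delta)}
         && (\mu\lesssim 1) \\
        &= \Big(\frac{\lambda}{\mu}\Big)^{\gamma}\mu^{\gamma(\frac{1}{2}-\delta)}\,\mu^{2\gamma}
         \;\lesssim\; \theta^{\gamma}\Big(\frac{\lambda}{\mu}\Big)^{\gamma}\mu^{2\gamma}
         && (\mu^{\frac{1}{2}-\delta}\lesssim\theta) .
      \end{align*}
      ```

      The first line is where the dimensional restriction enters: for $n<6$ the power of $\theta$ would be negative and could not be discarded.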
      It is now our task to use  247 and the Hodge system  243 to pass to the more general estimate  245 . In order to do this, it will be necessary for us to first prove some critical estimates for the potentials { ω C ̲ ~ }   . These will then be used as a reference point in certain bilinear estimates involving the space used to define estimate  245 .
      While we're at it, this will also give us a chance to prove some estimates which will be used many times in the sequel. What we will show is that:
      ( M 1 ξ ) k ω C ̲ B ˙ 2 , 10 n p γ , ( 2 , n 2 2 ) , (252)
      ( M 1 ξ ) k t ω C ̲ B ˙ 2 , 10 n p γ , ( 2 , n 4 2 ) , (253)
      where $p_\gamma$ is the exponent defined on line (193) above. Both of the bounds (252) and (253) will easily follow via our general Besov embedding (45) once we have established them for the linear term on the right hand side of the Hodge system (243). That is, we first establish that:
      ( M 1 ξ ) k ω A ̲ ( M ) ~ B ˙ 2 , 10 n p γ , ( 2 , n 2 2 ) , (254)
      ( M 1 ξ ) k t ω A ̲ ( M ) ~ B ˙ 2 , 10 n p γ , ( 2 , n 4 2 ) , (255)
      These follow immediately from the steps used to prove (192) above, and the following heuristic identities, which follow our convention established on line 58:
      ξ k ( ω Π θ P μ ω A ̲ ( M ) ~ ) θ k ω Π θ ω Π ¯ M 1 < P μ x ω L Δ ω 1 A ̲ 1 ( ω ) , (256)
      ξ k ( ω Π θ P μ t ω A ̲ ( M ) ~ ) μ θ k ω Π θ ω Π ¯ M 1 < P μ x ω L Δ ω 1 A ̲ 1 ( ω ) , (257)
      Notice that the space-time frequency localization  126c allows us to trade the t   with the factor of μ   on the second line above.
      We now prove the estimates  252  253 by proceeding inductively on the value of k   . If k = 0   the first estimate  252 holds because one can solve the system  243 via Picard iteration in the space B ˙ 2 , 10 n p γ , ( 2 , n 2 2 )   thanks to the bilinear embedding  45 which furnishes the embedding:
      x Δ 1 : B ˙ 2 , 10 n p γ , ( 2 , n 2 2 ) B ˙ 2 , 10 n p γ , ( 2 , n 2 2 ) B ˙ 2 , 10 n p γ , ( 2 , n 2 2 ) . (258)
      The key thing to point out here is that for $\gamma$ sufficiently small, and in dimensions $6\leq n$, we have the bound $p_\gamma < n$, which is all that is needed to satisfy the gap condition (47) in this case. The other conditions of Lemma 4.1 are also easily seen to be satisfied for this set of indices.
      To establish  252 for 0 < k   , we simply differentiate the system  243  k   times with respect to the operator ( M 1 ξ )   . Doing this yields the linearized set of equations:
      ( M 1 ξ ) k ( ω C ̲ ~ ) d f = j = 0 k d * Δ 1 [ ( M 1 ξ ) k j ω C ̲ ~ , ( M 1 ξ ) j ω C ̲ ~ ] , (259)
      ( M 1 ξ ) k ( ω C ̲ ~ ) c f = ( M 1 ξ ) k ω A ̲ ( M ) ~ (260)
      j = 0 k x Δ 1 [ ( M 1 ξ ) k j ω A ̲ ( M ) ~ , ( M 1 ξ ) j ω C ̲ ~ ] , (261)
      which can again be solved in the Besov space B ˙ 2 , 10 n p γ , ( 2 , n 2 2 )   by using the already established estimate  254 for the linear term, in conjunction with the (inductive) hypothesis that estimate  252 holds for k 1   , and absorbing the highest derivative (involving ( M 1 ξ )   falling on ω C ̲ ~   ) term to the left hand side. All of this is permissible by referring to the embedding  258 .
      To prove the second estimate  253 above, we first apply the time derivative t   to both sides of the system  259  261 above. The resulting system of equations, which we will not write down, can easily be solved in the derivative critical Besov space B ˙ 2 , 10 n p γ , ( 2 , n 4 2 )   by again using an induction on k   , the already established estimate  255 for the linear term, and the following bilinear Besov estimate which is again a special case of  45 :
      x Δ 1 : B ˙ 2 , 10 n p γ , ( 2 , n 2 2 ) B ˙ 2 , 10 n p γ , ( 2 , n 4 2 ) B ˙ 2 , 10 n p γ , ( 2 , n 4 2 ) . (262)
      Notice that  262 is permissible because for γ   sufficiently small, we have the condition p γ < 2 n 3   in dimensions 6 n   which is necessary to get around the gap condition  47 . The other conditions of  45 are easily satisfied for this choice of indices.
      Armed with estimates  247 and  252 , we now move back to the proof of estimate  245 . We set the norm in that latter bound equal to: A N 1 γ , 2 , = μ μ γ ( 1 + μ ) n P μ ( A ) L 2 ( L ) .   By differentiating the system  243 with respect to the operators ( M 1 ξ ) k ξ   , we see that the claim will now follow once we can prove the bilinear Riesz operator bound:
      x Δ 1 : B ˙ 2 , 10 n p γ , ( 2 , n 2 2 ) N 1 γ , 2 , N 1 γ , 2 , . (263)
      We now let A   and C   be any two elements of the two spaces on the left hand side of  263 . By applying the trichotomy, we see that it suffices to be able to prove the three estimates:
      λ , μ i : μ 1 μ 2 λ μ 2 λ γ ( 1 + λ ) n x Δ 1 P λ ( P μ 1 A P μ 2 C ) L 2 ( L ) (264)
      A B ˙ 2 , 10 n p γ , ( 2 , n 2 2 ) C N 1 γ , 2 , , (265)
      λ , μ i : μ 2 μ 1 λ μ 1 λ γ