A Commoner’s Approach to 1dvar

 

Larry McMillin     March 21, 2005

 

The intent of this presentation is to make the concept of optimal sounding retrievals as intuitive as possible.  To do so, I am starting from simplest model I can think of and building toward the retrieval equation.  As considerations are added to make it close to the retrieval equation, it gets increasingly complicated.  It does not follow all the way the through and at the one point, I have to start with conventional approach to get to the exact equation.  Strictly following this approach only gets you close.  Never the less, it does give useful insight that complements the more conventional approach.

 

Most people understand the concept of averaging to reduce errors.  If the errors are equal, it is easy.  If one is more accurate, then obviously you want to weight the one that is more accurate more.  This is all that both 1dvar retrievals and the data assimilation used in numerical models are.  The question is how much more?  We start with the Equation for weighted average

 

                                   .

 

But we also have to constrain the weights so that

 

 

to keep the units.  Otherwise we would start with inches and end Kellicams or some equally obscure unit.  Combining equations gives

 

 

We want to find the weight that gives the minimum variance.  To do so, we need to calculate the variance by squaring the equation

 

 

Since we are interested in the variance over a sample, we note that since the errors are random, cross products between  and have an expected value of zero and can be dropped out.  A full expansion would still include terms with  but we know they will drop later so we will skip them to keep things simple.  Then taking the derivative of  with respect to  gives

 

 

Solving for and gives

 

                               

 

and manipulation gives an alternate form

 

          

 

It is interesting to note that what the weights really do is scale the quantities so that the scaled variances are equal so that equal variances are added just like the simple case we started with.  This is demonstrated below

 

 

To make things consistent with retrieval literature, we will rename the errors by calling

 

     and       to get

 

 

Up till now, we have been considering like things, as in temperatures with the same units.  But if the units were different, one would have to have multiplied by a scaling factor to convert the units.  The scaling factor also affects the values of the weights.  For soundings, the scaling factor is often a function.  But the common methods linearize the function about the point defined by the current estimate of the truth.  This is usually the background for the first iteration and the current retrieval for later iterations.  In any case, for a given iteration, the function is evaluated and the value is a constant just like any other scaling factor.  We will denote the scaling factor by the letter .  Including the scaling factor gives

 

 

Note that the errors are also scaled so the scaling factor is included in the weight as well.  We will talk more about that later, but first we will subtract a value from both sides.  It could be any value, but we will choose the value given by the background.  Doing so gives

 

 

Since the weights sum to 1.0, the guess can be moved inside the brackets to give

 

 

Converting the guess to the same units as  gives and remembering that we set the guess equal to the value of  gives

 

 

but note that zero still has an error distribution given byand the second term has one given by

 

This equation can also be written as

 

 

Remember that for our scalar case without any units conversion, can be written as

 

 

But since we are now using  this becomes

 

 

When the units are included it becomes

 

 

Which results in

 

 

which can be simplified to

 

 

 

There is yet another consideration.  That is the definition of .  Up till now, it has been the error in the measurements.  When we subtracted a guess, remained unchanged.  However, when  becomes  the process involves the calculation of radiances from the guess state (i.e. temperature profile).  Even though we know guess perfectly (it is just a set of numbers) the process of calculating radiances from a known state is subject to errors due the calculation process.  The definition of is now changed to include errors in the both the measurement and the forward calculation.  One of the ramifications of this is that if one of these is known much more accurately than the other, then the largest error becomes the limiting factor.  In other words, the resources spent on the instrument and the forward calculation have to be kept in balance.  Getting one very accurate while ignoring the other provides no benefit. 

 

So far we have been considering a simple scalar problem and we could have used brightness temperatures.  But we want to assimilate radiances.  This presents some additional considerations.

 

One is that radiances are averages of the values we want.  Suppose I tell you I have a surface and its area is x and ask you to draw it.  You know something because you know the area, but you need some information about the shape.  If I slice the object into slabs and give the area of the slabs, you can do better.  If I give you information about the smoothness or some other feature of the shape you can do even better.  The closest example is the problem of having a profile of layer temperatures.  You can never get a unique solution for the level temperatures.

 

For soundings this means that the satellite measurements are never provide the complete information for a profile.  If a solution is attempted without a constraint, a ridiculous solution is obtained.  This comes about for 3 reasons

 

1.      As in the case of the layer temperatures for every profile, there a number of profiles with fine vertical structure that give the same answer.  One needs to know something to select the right one.

 

2.      The measurements are not independent.  This means the temperature for a given level contributes to the radiances for several channels. 

 

3.      The measurements contain errors.  Suppose one channel has an error and the others are all perfect.  If the measurements are assumed to error free, a profile needs to be found that keeps the radiances for all the channels except the noisy one constant and changes the one for the noisy channel to satisfy the erroneous value.  Without the other channels, the noise would correspond to a small temperature change over levels contributing to that channel.  However, since the values for the other channels must be kept constant, the only way to do this is to make a large change for some of the levels.  Then since the other channels also see this level, other levels have to be changed to cancel the effect on the other channels.  The net effect of all this is to produce one of the noisy solutions that are possible because of condition 1.

 

This means that some information in addition to the radiances is needed to produce a solution.  Examples are:

 

1.      a guess profile

2.      a constraint such as smoothness of the profile.

 

The smoothness constraint will be left as an exercise.  But general approach is derived as follows.  Suppose we are retrieving the temperatures for a profile.  One of the values that might be used as an estimate for one layer is the temperature at a nearby layer.  When this is done, the value for a layer enters on one side of the equation for itself and on the other for other layers.  When the algebra is done, the equations for layers become connected.

 

In any case, once a constraint is selected, it has to be applied in the right weights to produce a good solution.  Using it either too strongly or too weakly is not optimal.  This is what the retrieval process does.

 

The second consideration is that the measurements are not temperatures or temperatures scaled to another unit, but rather the product of a sum of weighted temperatures with the weights being determined by the atmospheric transmittances.  Note that these weights are not the ones that we have been discussing earlier, but rather part of the units conversion factor.  The problem is that we have radiance and we want to get temperatures.  There are several ways to do this:

1.   convert radiances to brightness temperatures.

a.                  but the transmittances become functions of temperature

2.   use the radiances in the retrieval and put the conversion in the retrieval system

a.                  The conversion factors are then the elements of a Jacobian matrix

 

 

This model has some implications.  Note that the weights are determined by the error.

 

These equations have some implications.  Suppose we want to let the satellite measurements have  the maximum impact on a forecast model.   Below are some alternatives and their effects.

 

1.      Have a good guess.  This gives a good retrieval.  But if the guess is good, then why not just use it.  In fact a retrieval produced with a good guess will be pretty much independent of the measurements.

2.      Produce a retrieval based on a separate guess from the one used in the model.  Since the weight is determined by the combination of the errors in both measurements and the guess, the error used to determine the weights is increased and the measurements have less influence on the model than it would if only one guess were used.

3.      Produce a retrieval based on the same guess as the one used in the model.  This can be done if the error is counted correctly.  The trap is that exactly the same  guess with exactly the same error enters twice with weights based on the assumption that there are two guess values with different errors that are independent.  Since the errors are absolutely identical, the errors are not weighted correctly if the weights are based on the assumption that they differ.  The errors have to be counted correctly.

4.      Directly assimilate the radiances.  This allows the maximum influence of the radiances on the forecast because only one guess is involved.  Step 3 can produce the same result with more work.  Remember the 3 laws of thermodynamics.  You can’t get something for nothing, you can’t break even, and you can’t even come close.  This is similar.  Doing better than assimilating the radiance based on a single background is like inventing a perpetual motion machine.

 

If the solution is iterated it must de done right.  Some errors that have been made are:

1.      iterate the solution and use the result of the previous solution as the guess for the next iteration.  This has the effect increasing the weight given to the satellite measurements at each step.  When too much weight is given to the measurements, noise in the measurements and the forward calculation are amplified.

2.      Do a retrieval based on one guess and supply the retrieval to a model using a separate guess.  This has the effect of down weighting the effect of the measurements because weight is given to two guesses and part of the additional guess weight is taken from the measurement contribution.

Solutions are iterated because the accuracy of the linearization step depends on the accuracy of the retrieval.  The way to do the iteration is to

1.   use the same guess for the background each time.

2.   Iterate the state (profile) used to calculate the Jacobian      

 

 

The Penalty Function Approach

 

Up till now, we have been working with a scalar model.  Although one can get close to the final solution by analogy, there are a couple of little details that, at least so far, don’t make it to the exact solution.  So for matrices using the full solution, the usual penalty function approach is now given following Eyre 1991 as given in his 2002, ECMWF Meteorological Training Course Lecture Series.  The penalty function can be written as

differentiating with respect to x and denoting it by gives

 

 

where  is the Jacobian matrix containing the derivatives .

 

Integrating gives

 

 

and noting that for the linear case and substituting gives

 

 

An equivalent form which can be more stable and more efficient can be obtained using the matrix identity   

 

 

To get

 

 

 

The difference is that in one case the matrix product that occurs in the inverse is singular by itself and the inverse is stable only because of the matrix that is added to it.  This is because one form has the number of channels as the inner dimension and the number of temperatures, water vapor amounts, etc. being retrieved as the outer dimension.  The second form has the dimensions reversed.  

 

Finally note the similarity between this retrieval and the older versions using brightness temperatures to convert from radiances to temperature

 

 

 

Note that  takes the place of  since the conversion from radiance to temperature is done in the Jacobian.  The newer Jacobian is simply a more accurate way to do the conversion.  In the form using weighting functions as , the weighting functions ignore the dependence of the weighting functions on the state vectors on the grounds that it is small compared to the change in brightness temperature.  While this assumption is true, it slows the convergence and can lead to a slightly different answer.  It was a good approximation in its day, but it is not a good approach to use now.