Monday 10 June 2013

Blog Reboot

Because of a glitch, I will have to reboot this blog in July. You will need to re-follow it then. Sorry! Will keep you posted. Chris

Monday 13 May 2013

Datasets created!

The SST CCI team have completed generation of new data records for sea surface temperature (SST). Together with development of a prototype for climate data record production, these datasets are a major output from the project. This post summarises what we have generated, and what happens next.

Three categories of data have been produced. "Level 2" data are SSTs organised as captured by the satellite. This is the sort of data we have generated mostly from AVHRR sensors as "L2P", a GHRSST level-2 format. "Level 3" data are SSTs that have been spatially averaged, in our case to the SST-CCI base resolution of 0.05 deg in latitude and longitude. This format has been used for most of our ATSR-series data, and is roughly equivalent in spatial resolution to the AVHRR L2P. "Level 4" data have been interpolated/extrapolated in time and space to give complete fields (even where cloud prevented any observations). In our case, the level-4 dataset is daily at 0.05 deg resolution. The interpolation method is the Met Office OSTIA system, improved within the project in resolution and in its interpolation assumptions. All of the above datasets span late 1991 to the end of 2010.
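As a rough illustration of the step from level 2 to level 3 (my own Python sketch, not project code; the function and variable names are hypothetical), level-2 swath SSTs can be binned onto a regular lat-lon grid by simple averaging:

```python
import numpy as np

RES = 0.05  # degrees: the SST-CCI base resolution mentioned above

def l2_to_l3(lats, lons, ssts, res=RES):
    """Average point SSTs into res x res lat-lon cells; NaN where no data."""
    nlat, nlon = int(round(180 / res)), int(round(360 / res))
    # Map each observation to a grid cell index.
    i = np.clip(((lats + 90.0) / res).astype(int), 0, nlat - 1)
    j = np.clip(((lons + 180.0) / res).astype(int), 0, nlon - 1)
    total = np.zeros((nlat, nlon))
    count = np.zeros((nlat, nlon))
    # Accumulate sums and counts per cell (handles repeated indices correctly).
    np.add.at(total, (i, j), ssts)
    np.add.at(count, (i, j), 1)
    return np.where(count > 0, total / np.maximum(count, 1), np.nan)
```

The real processing is of course more involved (quality flags, uncertainty propagation, reference climatologies), but this is the essential regridding idea.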

There are also demo products consistent with the approach outlined in a previous blog.

The team is currently completing a system/data verification activity and soon (assuming no major problems) the SST CCI data will be made publicly available, as part of a Climate Research Data Package. This CRDP will facilitate various climate analyses and comparisons by bringing together not only our new datasets, but a range of other data. The CRDP access will be supported by a Products User Guide (PUG), which will describe the new data on various levels, from a "quick start" guide to detailed contents. Availability will be announced in various ways, including this blog.

Friday 29 March 2013

Initial Evaluations by the Climate Modelling Users Group

ESA rightly consider Climate Modellers as key users of the sort of satellite-based climate datasets that the SST and other CCI projects will produce. A Climate Modelling Users Group (CMUG) is therefore part of the set of projects that makes up CCI, which gives us "observationalists" an excellent chance to interact with that community of scientists.

One CMUG responsibility is to make assessments of ECV quality. This will be done once our new SST CCI datasets are generated and verified, which is about a month away now (computers churning away in the background!). A trial of this was done on precursor data -- in our case the ARC data set blogged about previously.

CMUG asked for comments on their report. My comments are the substance of this post.

Section 2 -- Terminology

"Uncertainty refers to a combination of random and systematic (bias) errors for each variable." It could be better to stick with well established definitions for words like "uncertainty" and "error".

Uncertainty is the doubt that exists about the result of a measurement.

Error is the difference between the result of a measurement and the true value.

These are the official scientific definitions and match the plain non-scientific meaning of the words.

Quantitatively, we don't know the error -- if we did, we would correct for it! But usually we can (and should) estimate the statistical distribution of error by various means. Doing so allows us to quantify the uncertainty we have (usually expressed as the standard deviation of what we estimate the distribution of errors to be).
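A toy simulation makes the distinction concrete (a sketch of my own with made-up numbers, not taken from any project analysis): each individual error is unknowable in practice, but its statistical spread is exactly what a stated uncertainty describes.

```python
import numpy as np

# Simulate a "true" SST and many measurements of it. The individual errors
# are known here only because the truth is simulated; in reality we can only
# estimate their distribution.
rng = np.random.default_rng(42)
true_sst = 290.0  # kelvin, hypothetical true value
measurements = true_sst + rng.normal(0.0, 0.2, size=10_000)

errors = measurements - true_sst   # unknowable outside a simulation
uncertainty = errors.std()         # the standard deviation of the error
                                   # distribution: close to the 0.2 K used above
```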

Unfortunately, scientists have for some reason slipped into using the single word "error" to mean: 'error', 'uncertainty', and 'difference from validation data' ... sometimes using all three meanings in a single sentence without even being aware of what they are doing. This greatly inhibits public communication, which is very important in all areas of science, and perhaps especially in climate change science.

Section 4 -- Specifically the ARC assessments

Much of what CMUG has done comprises comparisons with global drifting buoys, stratified in various ways. These analyses are essential to do, and appear to be well done.

They are, however, very similar to the validation activities the EO team within the SST CCI project undertake. On the one hand, it will be presentationally useful to have such results done independently on SST CCI datasets, to confirm that we are being "honest" in our validation. But on the other hand, since I know my team are honest, I don't feel we learn very much from these additional assessments by CMUG about the faults in our data.

From my point of view, it would be much more interesting to have assessments made that we wouldn't do internally. For example, an interesting question to us is: how can a climate modeller looking at model biases via SST comparisons usefully use our uncertainty information?

Giving proper uncertainty information (measurement by measurement, not just generic values) is relatively rare in satellite climate datasets, but is done in ARC and SST CCI. It would be interesting to see how a climate modeller can get value out of the information. Does it give better confidence in their model testing because of its presence -- i.e., where the SST in observations and model disagree but the observational uncertainty is low, does that make it more convincing that the problem lies in the model?

The CMUG have done an analysis of SST biases in the ARC data as a function of stated uncertainty. This shows that the uncertainty is useful in describing the quality of the data with respect to bias, at least in relative terms. That is good to see.
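The sort of check described can be sketched as follows (synthetic data and hypothetical names, not the CMUG analysis code): bin the satellite-minus-buoy differences by the stated per-measurement uncertainty, and see whether the spread of differences grows accordingly.

```python
import numpy as np

# Synthetic matchups whose differences are, by construction, consistent
# with their stated uncertainties.
rng = np.random.default_rng(0)
stated_unc = rng.uniform(0.1, 0.6, size=50_000)   # K, per measurement
sat_minus_buoy = rng.normal(0.0, stated_unc)      # one draw per matchup

# Bin by stated uncertainty and compute the spread of differences per bin.
bins = np.linspace(0.1, 0.6, 6)
idx = np.digitize(stated_unc, bins) - 1
spread_per_bin = [sat_minus_buoy[idx == k].std() for k in range(5)]
# If the stated uncertainties are meaningful, spread_per_bin should rise
# roughly in line with the bin centres.
```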

In the Feb 2012 GEWEX newsletter, in a useful article about their experiences of assessment activities, there is a diagram summarising dataset assessments.

[Figure: GEWEX diagram of dataset-assessment activities]

I think over the full range of CCI variables, the CMUG has taken a broader view of assessment activities, similar to this GEWEX model, that goes beyond comparison to in situ data. One thing missing from the diagram is demonstration applications as a means of assessment, which would address the sort of question asked above with regard to the usefulness to climate modellers of the uncertainty information in our datasets.

Thursday 18 October 2012

SST climate dataset paper accepted

Merchant C. J., O. Embury, N. A. Rayner, D. I. Berry, G. K. Corlett, K. Lean, K. L. Veal, E. C. Kent, D. T. Llewellyn-Jones, J. J. Remedios and R. Saunders, A twenty-year independent record of sea surface temperature for climate from Along Track Scanning Radiometers, accepted for J Geophys Res in July 2012.

Analysis is based on the ARC SST dataset which acts as the reference point for forward modelling and SST retrieval within SST CCI.

Thursday 9 August 2012

Regridding tool progress

Discussion between Ralf Quast, Stuart MacCallum and me on the status of the regridding tool, since Stu has applications requiring its use as soon as possible.

Ralf explained the use case currently implemented, which doesn't quite match Stuart's needs. It seems worthwhile for Stuart to shuffle priorities and use the tool at intermediate stages of development that will still be fit for his immediate purposes and available in the next few weeks.

The conclusions were:

1. Ralf will advise Stu tomorrow (Fri 10 Aug) on what route for advancing the tool he plans to follow (whether further develop current code or switch to extension of new Beam rebinning module).
2. Assuming choice to base it on the beam module, we agreed a useful order for development would be:
(i) get it working for ARC input data using currently available aggregation -- and let Stu know so he can play with it
(ii) implement CCI method for averaging (using OSTIA climatology as reference) -- and send to Stu so he can use this seriously
(iii) implement the CCI method for uncertainty propagation -- this requires us to finalise the method and LUTs etc, and we need to have a telecon including Nick Rayner soon to do that
(iv) (if possible) allow an option for an alternative to the OSTIA climatology -- again, let Stu know, and he can then use the ARC-based climatology when required
3. Stu will send Ralf some ARC data files as examples
4. Stu will review the tool specs and may also comment on the approaches when the further discussions come
5. Output of regridded time series as CF-compliant NetCDF is fine, it doesn't have to be in full SST CCI L3 format as outputs from this tool are not CCI products
6. Confirm it does need to cope with input L2P for use on AVHRR GAC
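As a sketch of the averaging idea in step 2(ii) above (my own illustration with hypothetical names, not the regridding tool itself), averaging SSTs as anomalies relative to a reference climatology, then adding the climatology back, avoids biasing a coarse-cell mean when observations sample the cell unevenly:

```python
import numpy as np

def average_via_anomalies(ssts, climatology, valid):
    """Average fine-cell SSTs within one coarse cell via climatology anomalies.

    ssts, climatology, valid: arrays over the fine cells inside one coarse cell.
    """
    anomalies = ssts[valid] - climatology[valid]
    # Mean anomaly plus the climatology mean over the *whole* coarse cell,
    # so gaps in coverage don't drag the average toward warm or cold parts.
    return anomalies.mean() + climatology.mean()
```

For example, if only the climatologically cooler half of a cell is observed, a naive mean of the observations underestimates the cell SST, while the anomaly-based average does not.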

Wednesday 11 July 2012

Update on "demo product" strategy

We revised the demo product strategy from that originally proposed because of instrument failures that prevented pursuing the original plans (see here). Getting closer to this work, Alison McLaren pointed out that the notes don't discuss the depths intended for the analysis products for the demo periods.

So, this note simply clarifies this for the record. It is straightforward, since the choice arises from the purpose for each of the revised demo periods.

The "4i" option adds passive microwave sensors to the long-term ECV product, in effect, and it will be of interest to see their impact. Clearly, this has to include an analysis product of the same type as the long-term record -- i.e., SST-depth.

The "4ii" option covers Metop AVHRR 0.05deg, AATSR L2P and SEVIRI, all of which deliver true skin SSTs. The main purpose of this demo is simply to have experience of handling these datasets (which are larger volume than the long-term AVHRR GAC and ATSR 0.05deg data). The outputs will be compared with operational OSTIA, and therefore the analysis should be a foundation analysis, obtained by applying the same methods to skin SSTs as currently used in OSTIA operationally.

Tuesday 3 July 2012

How to estimate SST in SST CCI?

The "Algorithm Selection" process has been discussed many times on this blog (here, here and here, for example). Now I can present the conclusions.

First, some general comments. It was great to get some external participation, and to be able to make very clean comparisons between different methods of estimating SST from space on common data, where we had controlled the procedure for comparison against validation data that had not been used in algorithm development by any party.

It turned out that there was tough competition between the algorithms in terms of the quantitative metrics. Just to recap: we generated statistics and maps of "bias" (taken as the mean SST-depth minus drifting buoy difference), "precision" (standard deviations of the same), and "SST sensitivity", and also various measures of stability (with respect to trend, seasons and day-night).
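For concreteness, the first two metrics can be sketched on synthetic matchups (my own illustration; the numbers and names are hypothetical, not the project's validation code):

```python
import numpy as np

# Synthetic satellite/drifting-buoy matchups with a built-in 0.1 K offset
# and 0.3 K of retrieval noise.
rng = np.random.default_rng(1)
buoy = 285.0 + rng.normal(0.0, 2.0, size=20_000)
satellite = buoy + 0.1 + rng.normal(0.0, 0.3, size=20_000)

diff = satellite - buoy
bias = diff.mean()       # mean satellite-minus-buoy difference
precision = diff.std()   # standard deviation of the same
# "SST sensitivity" (change in retrieved SST per unit change in true SST)
# needs retrieval simulations rather than matchups, so it is not
# reproduced in this toy example.
```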

Generally, all the considered algorithms were good. Sometimes one would perform better on a particular metric for a given sensor, but on a different sensor, the ranking would be reversed.

For the ATSRs, the choice came down to using either the ARC-project coefficients, or a version of optimal estimation tuned to the ARC-project coefficients. Both were therefore independent of in situ measurements. The optimal estimation had a slight edge on precision, but otherwise neither had a clear, consistent advantage over the other across the full range of application.

For the AVHRRs, the choice came down to the same optimal estimation or Boris Petrenko's incremental regression approach. Incremental regression gave better precision but poorer sensitivity, so there was an element of trade-off there. For night-time cases, both were very similar on the quantitative metrics overall. For daytime cases, optimal estimation was a little better on bias and sensitivity but, as mentioned, not as good on precision.

So, in the end, the only very clear deciding factor was that incremental regression is an empirical approach, tuned to in situ measurements. Optimal estimation, being tuned to independent ARC SSTs, retains independence from in situ measurements -- this being an advantage for a significant minority of climate users of SST (according to our earlier survey).

Having opted for optimal estimation for the AVHRRs, it then seemed preferable to make the same choice for the ATSRs, to maximise the consistency of approach across the sensors. (The only exception will be ATSR-1, for which a version of optimal estimation is yet to be developed.)