Friday 29 March 2013

Initial Evaluations by the Climate Modelling Users Group

ESA rightly consider Climate Modellers as key users of the sort of satellite-based climate datasets that the SST and other CCI projects will produce. A Climate Modelling Users Group (CMUG) is therefore part of the set of projects that makes up CCI, which gives us "observationalists" an excellent chance to interact with that community of scientists.

One CMUG responsibility is to make assessments of ECV quality. This will be done once our new SST CCI datasets are generated and verified, which is about a month away now (computers churning away in the background!). A trial of this assessment was done on precursor data -- in our case, the ARC dataset blogged about previously.

CMUG asked for comments on their report. My comments are the substance of this post.

Section 2 -- Terminology

"Uncertainty refers to a combination of random and systematic (bias) errors for each variable." It could be better to stick with well established definitions for words like "uncertainty" and "error".

Uncertainty is the doubt that exists about the result of a measurement.

Error is the difference between the result of a measurement and the true value.

These are the official scientific definitions and match the plain non-scientific meaning of the words.

Quantitatively, we don't know the error -- if we did, we would correct for it! But usually we can (and should) estimate the statistical distribution of error by various means. Doing so allows us to quantify the uncertainty we have (usually expressed as the standard deviation of what we estimate the distribution of errors to be).
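As a minimal sketch of the distinction (in Python, with entirely hypothetical numbers): the true value has to be simulated here, which is exactly what cannot be done in reality, but doing so makes concrete how an estimated error distribution yields a bias estimate and a standard uncertainty.

```python
import numpy as np

# Illustrative sketch only: the "true" SST is simulated, which is what
# can never be done in reality -- but it makes the error/uncertainty
# distinction concrete. All numbers are hypothetical.
rng = np.random.default_rng(42)

true_sst = 288.15                                      # K, assumed true value
errors = rng.normal(loc=0.1, scale=0.25, size=10_000)  # systematic + random
measurements = true_sst + errors                       # what we observe

# Given an estimate of the error distribution (from physics, simulation,
# or matchups), the mean estimates the systematic error (bias) and the
# standard deviation is the standard uncertainty.
bias_estimate = errors.mean()
standard_uncertainty = errors.std(ddof=1)
print(f"bias ~ {bias_estimate:.2f} K, "
      f"standard uncertainty ~ {standard_uncertainty:.2f} K")
```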

Unfortunately, scientists have for some reason slipped into using the single word "error" to mean 'error', 'uncertainty', and 'difference from validation data', sometimes with all three meanings in a single sentence, without being aware of what they are doing. This greatly inhibits public communication, which is very important in all areas of science, and perhaps especially in climate change science.

Section 4 -- Specifically the ARC assessments

Much of what CMUG has done comprises comparisons with global drifting buoys, stratified in various ways. These analyses are essential to do, and appear to be well done.
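For concreteness, here is a sketch of the kind of stratified comparison meant, assuming hypothetical matchup arrays (sat_sst, buoy_sst, lat) of collocated satellite and drifting-buoy SSTs; the names and the choice of a robust spread estimate are illustrative, not CMUG's actual code.

```python
import numpy as np

def stratified_stats(sat_sst, buoy_sst, lat, band_edges):
    """Median and robust spread of satellite-minus-buoy SST per latitude band.

    Inputs are hypothetical matchup arrays. The robust spread
    (1.4826 * MAD) keeps occasional bad buoys from dominating.
    """
    diff = np.asarray(sat_sst) - np.asarray(buoy_sst)
    lat = np.asarray(lat)
    stats = {}
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        sel = (lat >= lo) & (lat < hi)
        if sel.any():
            d = diff[sel]
            med = np.median(d)
            robust_sd = 1.4826 * np.median(np.abs(d - med))
            stats[(lo, hi)] = (med, robust_sd, int(sel.sum()))
    return stats

# e.g. 20-degree latitude bands:
# stats = stratified_stats(sat_sst, buoy_sst, lat, np.arange(-60, 61, 20))
```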

They are, however, very similar to the validation activities the EO team within the SST CCI project undertake. On the one hand, it will be presentationally useful to have such results done independently on SST CCI datasets, to confirm that we are being "honest" in our validation. But on the other hand, since I know my team are honest, I don't feel we learn very much from these additional assessments by CMUG about the faults in our data.

From my point of view, it would be much more interesting to have assessments made that we wouldn't do internally. For example, an interesting question to us is: how can a climate modeller looking at model biases via SST comparisons make good use of our uncertainty information?

Giving proper uncertainty information (measurement by measurement, not just generic values) is relatively rare in satellite climate datasets, but is done in ARC and SST CCI. It would be interesting to see how a climate modeller can get value out of this information. Does its presence give them greater confidence in their model testing -- i.e., where the observed and modelled SSTs disagree but the observational uncertainty is low, is it more convincing that the problem lies in the model?
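One way a modeller might exploit per-measurement uncertainty is to normalise the obs-minus-model differences by the stated uncertainty, so that large normalised discrepancies cannot easily be blamed on observational error. A sketch under that assumption (hypothetical inputs; not a procedure prescribed by the project):

```python
import numpy as np

def normalised_discrepancy(obs_sst, model_sst, obs_uncert):
    """Obs-minus-model SST differences scaled by the stated observational
    uncertainty (all inputs are hypothetical matched arrays).

    Where |z| is large and obs_uncert is small, observational error alone
    is an unlikely explanation, strengthening the case that the problem
    lies in the model (or in unrepresented sampling effects).
    """
    return (np.asarray(obs_sst) - np.asarray(model_sst)) / np.asarray(obs_uncert)

# e.g. flag points where the model is implicated at roughly the 3-sigma level:
# suspect = np.abs(normalised_discrepancy(obs, mod, sigma)) > 3
```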

The CMUG have done an analysis of SST biases in the ARC data as a function of stated uncertainty. This shows that the uncertainty is useful in describing the quality of the data with respect to bias, at least in relative terms. That is good to see.
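The flavour of such an analysis can be sketched as follows: bin the satellite-minus-buoy differences by the stated per-measurement uncertainty and check that the spread (and any bias) of the differences tracks the stated values. This is an illustrative reconstruction, not CMUG's actual procedure; all names are hypothetical.

```python
import numpy as np

def differences_by_stated_uncertainty(diff, stated_uncert, bin_edges):
    """Mean and SD of satellite-minus-buoy differences, binned by the
    stated per-measurement uncertainty (hypothetical matchup arrays).

    If the stated uncertainties are informative, the SD of the
    differences should grow with the stated value.
    """
    diff = np.asarray(diff)
    stated_uncert = np.asarray(stated_uncert)
    rows = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        sel = (stated_uncert >= lo) & (stated_uncert < hi)
        if sel.sum() > 10:  # skip sparsely populated bins
            rows.append(((lo + hi) / 2, diff[sel].mean(), diff[sel].std(ddof=1)))
    return rows  # (bin centre, mean bias, SD of differences)
```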

In the Feb 2012 GEWEX newsletter, in a useful article about their experiences of assessment activities, there is a diagram about dataset assessments.

[Figure: GEWEX diagram of dataset assessment activities, not reproduced here.]

I think that, over the full range of CCI variables, the CMUG has taken a broader view of assessment activities, similar to this GEWEX model, that goes beyond comparison to in situ data. One thing missing from the above diagram is demonstration applications as a means of assessment, which would address the sort of question asked above about the usefulness to climate modellers of the uncertainty information in our datasets.