Online climate change projections report

3.2.8 Structural model errors
What is discrepancy, and why is it important?
The discrepancy term, introduced in Section 3.2.7, is a measure of how informative the climate model is about the real world. Formally, it represents the mismatch we would find between the model and the real world if we could locate precisely the combination of model parameter settings giving the best overall simulation of climate that the model is capable of providing. Discrepancy applies to simulations of both historical and future climate.

It is also a prior input to the Bayesian framework, and should therefore be specified using a method as independent as possible from the specific observations used to weight the (emulated) climate model projections, in order to avoid double counting the observations. Discrepancy is itself uncertain, and is therefore specified as a distribution (in common with other uncertain inputs to the Bayesian calculation). Values must be specified for all historical and future variables involved in the calculation, including covariances between the variables.

Discrepancy in historical variables focuses the weight on the well modelled variables and prevents small variations in the poorly modelled variables from having an unduly large impact on the weighting. Discrepancy in future variables increases the uncertainty associated with the projections, and mitigates the risk of making overconfident projections. Specifying the discrepancy is an extremely demanding task in principle, given the inherent difficulty of anticipating the effects on particular climate variables of missing or inadequately understood processes, and their complex interactions.
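The role of historical discrepancy in the weighting can be sketched numerically. In this hypothetical Python fragment (all values, variances and model variants are invented for illustration, not taken from UKCP09), a large prior discrepancy variance on a poorly modelled variable prevents its large model-data mismatches from dominating the weights:

```python
import numpy as np

# Two observed variables used to weight three emulated model variants.
# Variable 1 is well modelled (small discrepancy variance); variable 2
# is poorly modelled (large discrepancy variance). Numbers are invented.
obs = np.array([0.0, 0.0])
model_variants = np.array([[0.10, 2.0],
                           [0.20, -1.5],
                           [0.15, 0.5]])

obs_var = np.array([0.05, 0.05])         # observational error variance
discrepancy_var = np.array([0.05, 4.0])  # prior discrepancy variance

# Gaussian log-weight with total variance = observational + discrepancy:
# the large discrepancy variance on variable 2 stops its big mismatches
# from overwhelming the contribution of the well modelled variable 1.
total_var = obs_var + discrepancy_var
log_w = -0.5 * np.sum((model_variants - obs) ** 2 / total_var, axis=1)
weights = np.exp(log_w - log_w.max())
weights /= weights.sum()
```

Without the discrepancy term, variant 3's poor value for variable 2 would be judged only against the small observational variance, and the weights would swing sharply on a variable the model family cannot simulate well.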
Estimation of discrepancy in UKCP09
In practice we estimate discrepancy by using results from our large ensemble of HadSM3 simulations of present day and doubled CO2 climates (see Section 3.2.3) to predict the results of an ensemble of different climate models, whose members consist of coupled atmosphere-mixed layer ocean (slab) models of similar complexity and credibility to HadSM3, but employing different basic assumptions in some of their parameterisations of physical processes. Note that this exercise must be carried out using ensembles of slab model simulations, rather than ensembles of coupled models containing a full dynamical ocean (e.g. Figure 3.2), because our perturbed physics ensembles using HadCM3 are too small to support a direct application of the Bayesian framework to their results.

Nevertheless, our approach confers the benefit of allowing us to provide projections which combine results from perturbed physics and multi-model ensembles, hence adjusting the projections to account for likely biases arising from structural errors in HadCM3. It is based on the judgement that the effects of structural differences between models can be assumed to provide reasonable a priori estimates of possible structural differences between HadSM3 and the real world. We take a given multi-model ensemble member as a proxy for the true climate, and use our emulator of HadSM3 to locate a point in the HadSM3 parameter space which achieves the best multivariate fit between HadSM3 and the multi-model member, based on a set of climate variables described in Section 3.2.9. The fit is determined using an optimisation procedure starting from a randomly-selected initial point in parameter space. The residual difference between this best fit and the multi-model member represents one estimate of discrepancy, under the above judgement. This process is repeated four times for each multi-model member, in order to sample the sensitivity of the optimisation process to the initial point.
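The optimisation step can be illustrated schematically. In this sketch a simple linear toy function stands in for the HadSM3 emulator, a fixed vector stands in for one multi-model member, and the minimisation is repeated from random starting points, mirroring the four restarts described above. All functions and numbers here are illustrative assumptions, not the UKCP09 implementation:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

def emulate(params):
    # Toy stand-in "emulator": maps a 2-parameter setting to two
    # climate variables (the real emulator is far more complex).
    return np.array([1.2 * params[0] + 0.5 * params[1],
                     0.3 * params[0] - 0.8 * params[1]])

# Proxy "truth": the same variables from one multi-model member.
member = np.array([1.0, 0.5])

def misfit(params):
    # Multivariate squared mismatch between emulator and the member.
    return np.sum((emulate(params) - member) ** 2)

# Repeat the optimisation from four randomly selected initial points,
# sampling the sensitivity of the result to the starting point.
fits = [minimize(misfit, rng.uniform(-2, 2, size=2), method="Nelder-Mead")
        for _ in range(4)]

# Each residual (best-fit emulation minus member) is one discrepancy
# estimate under the judgement described in the text.
residuals = [emulate(f.x) - member for f in fits]
```

In this convex toy problem all four restarts converge to the same point; in the real multivariate fit the restarts can land in different local optima, which is precisely why the sensitivity is sampled.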
These difference estimates are then pooled across the multimodel ensemble, giving a sample of four times the number of ensemble members. The mean of these is taken as our estimate of the mean value of discrepancy, and the covariances of the differences about the ensemble mean serve as our estimate of the discrepancy covariance matrix, after allowing for a component due to internal climate variability.
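A minimal sketch of the pooling step, with invented numbers: 48 difference vectors (12 members times 4 restarts) over three notional climate variables are averaged, and an assumed diagonal estimate of the internal-variability covariance is subtracted from the sample covariance:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical pooled sample of discrepancy estimates:
# 12 multi-model members x 4 restarts = 48 difference vectors,
# each over 3 notional climate variables (synthetic data).
diffs = rng.normal(size=(48, 3))

# Pooled mean serves as the estimate of the mean discrepancy.
mean_discrepancy = diffs.mean(axis=0)

# Sample covariance of the differences about the ensemble mean.
raw_cov = np.cov(diffs, rowvar=False)

# Allow for a component due to internal climate variability by
# subtracting an assumed (here diagonal) estimate of its covariance.
internal_variability_cov = 0.1 * np.eye(3)
discrepancy_cov = raw_cov - internal_variability_cov
```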
This approach avoids exclusive reliance on results from the Hadley Centre model. The slab models used in the discrepancy calculation were selected from those contributed to the IPCC AR4 (Randall et al., 2007), and the Cloud Feedback Model Intercomparison Project (CFMIP) (e.g. Webb et al. 2006), using data interpolated to the HadSM3 model grid. Some models could not be used because insufficient data was available, and one model was excluded because the design of its doubled-CO2 simulation omitted the contribution of surface albedo changes from melting sea ice, a process of known importance that is included in the other models.

In the remaining 14 models, data was available for nearly all of the required variables, but with isolated exceptions (mainly the daily data required to calculate the indicators of temperature and precipitation extremes, which was missing from five of the models). Here, values of the missing variables were estimated from inter-variable correlations derived from the multi-model ensemble. In two cases where more than one model was potentially available from a given institute, statistical tests showed that these models could not reasonably be assumed to give quasi-independent estimates of model error, so the model variant thought to be less credible (based on criteria of lower resolution in one case, and published assessments by the relevant modelling centre in the other) was excluded. This left 12 models to be used in the discrepancy calculation (Table 3.1).
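One plausible reading of "estimated from inter-variable correlations" is a regression across the ensemble: the missing variable is predicted for a given model from the variables that model does supply, using the relationship fitted over the models that supply everything. The sketch below uses entirely synthetic data and is an assumption about the form of the procedure, not a description of the UKCP09 code:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic ensemble: 14 models x 3 variables. Variable 2 (say, a
# daily-data extremes index) is correlated with the first two variables.
data = rng.normal(size=(14, 3))
data[:, 2] = (0.8 * data[:, 0] - 0.3 * data[:, 1]
              + 0.1 * rng.normal(size=14))

# Suppose variable 2 is missing from the last model. Fit a linear
# relation across the 13 complete models (intercept + two predictors).
complete = data[:13]
X = np.column_stack([np.ones(13), complete[:, :2]])
coef, *_ = np.linalg.lstsq(X, complete[:, 2], rcond=None)

# Infill the missing value from the model's available variables.
available = data[13, :2]
estimate = coef[0] + coef[1:] @ available
```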
Assumptions and limitations
Whilst this method of calculating discrepancy provides an appropriate means of quantifying uncertainties in projected future changes consistent with current climate modelling technology, it is important to recognise caveats associated with the approach. Firstly, it assumes that the structural errors in different models can be taken to be independent. Whilst there is evidence for a degree of independence (for example, model errors in multiyear climate averages reduce significantly when ensembles of different models are averaged together (e.g. Lambert and Boer, 2001; Reichler and Kim, 2008)), there is also evidence that some errors are common to all models (see Annex 3), due to shared limitations such as insufficient resolution or the widespread adoption of an imperfect parameterisation scheme. From this perspective, our estimates of discrepancy can be viewed as a likely lower bound to the true level of uncertainty associated with structural model errors.

However, another caveat is that we do not take into account variations in the credibility of different multi-model ensemble members when calculating discrepancy, partly because there is no widely recognised means of quantifying such variations (Randall et al. 2007), and partly because such an exercise would introduce an element of double counting in the use of observations in our Bayesian framework. Nevertheless, the assumption of equal credibility carries the risk that models which simulate climate relatively poorly could yield excessively large estimates of discrepancy, thus overestimating the impact of structural errors.
It is clear, therefore, that the sensitivity of our projections to plausible variations in discrepancy is an important test of their robustness (see Annex 2, and further discussion in Section 3.3).

In the case of the historical component of discrepancy, such tests can be augmented by diagnostic checks, since the magnitude of biases in our model simulations can be calculated a posteriori. We used our emulator to estimate the location in the model parameter space which gives the best simulation of historical climate, and then calculated the squared error found in practice between emulated and observed values, for each of the variables used in our weighting of different model variants (see Section 3.2.9). For each variable, the squared error was then divided by our a priori estimate of its expected value, namely the sum of the variances arising from our prior estimate of discrepancy, observational errors, and emulation errors. The average value of these normalised squared errors was found to be ~0.3, indicating that the structural component of model error may be rather smaller than our a priori estimates derived from other climate models without reference to the observations. This suggests that the potential risk that the presence of common systematic errors in models might lead us to underestimate historical discrepancy is not realised in practice, at least for the set of historical observables considered. Obviously we cannot perform corresponding diagnostic checks on the discrepancy attached to future variables, and there is no guarantee that an overestimate in historical discrepancy would necessarily imply a corresponding overestimate of future values.
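The diagnostic check can be written out explicitly. With hypothetical squared errors and variances (the numbers below are invented and are not chosen to reproduce the reported value of ~0.3), each squared error is normalised by the sum of the discrepancy, observational-error and emulation variances, and the results are averaged:

```python
import numpy as np

# Hypothetical a posteriori squared errors (emulated minus observed)
# for four historical variables, together with the three variance
# components making up the a priori expected squared error.
sq_err = np.array([0.02, 0.10, 0.05, 0.01])
discrepancy_var = np.array([0.10, 0.20, 0.15, 0.05])
obs_var = np.array([0.02, 0.03, 0.02, 0.01])
emulation_var = np.array([0.01, 0.02, 0.01, 0.01])

# Normalised squared error per variable; an average well below 1
# indicates the prior discrepancy is generous relative to the biases
# actually found.
normalised = sq_err / (discrepancy_var + obs_var + emulation_var)
mean_normalised = normalised.mean()
```

A mean near 1 would say the prior variances are about right; the value of ~0.3 reported in the text says the historical mismatches were substantially smaller than the prior allowed for.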