A challenge in the gridding process is removing the effects of the constantly varying pool of stations. This could be overcome by using only stations with a complete record, but the resulting sparse network would introduce much greater uncertainty because of the additional spatial interpolation required. Instead, all stations believed to have a good record in any given month are used, and every effort is made to compensate for missing stations during the gridding process.
The gridding process is accomplished in several stages. Firstly, for most parameters, the monthly average or total values are converted into anomalies: differences from, or percentages of, the 1961–1990 long-period average. The anomaly field is generally smoother than the field of raw observations (termed actuals) and is therefore easier to interpolate. This step assumes that a grid of the 1961–1990 average has already been generated; for most parameters, this has been done on a 1 km x 1 km grid. To generate that grid, gaps in the monthly or annual station data are first filled with estimates derived from relationships with well-correlated neighbouring stations. The resulting station averages are then gridded using a combination of multiple regression and spatial interpolation of the regression residuals.
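The conversion between actuals and anomalies can be sketched as follows. This is a minimal illustration, not the operational code: the function names `to_anomaly` and `to_actual` and the sample values are hypothetical, and it assumes that a percentage anomaly is inverted by multiplying by the long-period average.

```python
import numpy as np

def to_anomaly(actual, baseline, as_percentage=False):
    """Express actual values relative to a long-period average (baseline)."""
    actual = np.asarray(actual, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    if as_percentage:
        # Percentage of the average, e.g. for rainfall totals.
        return 100.0 * actual / baseline
    # Difference from the average, e.g. for temperature.
    return actual - baseline

def to_actual(anomaly, baseline, as_percentage=False):
    """Invert to_anomaly (assumed inverse for the percentage case)."""
    anomaly = np.asarray(anomaly, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    if as_percentage:
        return anomaly * baseline / 100.0
    return anomaly + baseline

# Hypothetical station values and 1961-1990 averages.
temp_anom = to_anomaly([5.2, 3.1], [4.0, 3.5])
rain_anom = to_anomaly([120.0, 45.0], [100.0, 90.0], as_percentage=True)
```

Differences suit parameters such as temperature, while percentages suit totals such as rainfall, where the size of a departure scales with the average itself.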
The regression equation is fitted to the station averages using a range of factors, including latitude, longitude, altitude, terrain shape, and coastal and urban effects. Different combinations of factors are used for different parameters. It is not appropriate to use every geographic factor for every parameter: where there is no plausible reason for a relationship, including the factor risks generating spurious correlations that only add noise to the regression surface. The fit of the regression surface to the station values will not be perfect; the differences are known as regression residuals. Large residuals at a station tend to indicate spurious values, so the residuals are also used to help with quality control checks.
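A least-squares fit of this kind, with residuals used as a quality control flag, might look like the sketch below. Everything specific here is an assumption made for illustration: the synthetic station data, the choice of latitude, longitude and altitude as the only factors, and the three-standard-deviation threshold for flagging a station.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

# Synthetic stations (hypothetical): position and altitude in metres.
lat = rng.uniform(50.0, 59.0, n)
lon = rng.uniform(-7.0, 2.0, n)
alt = rng.uniform(0.0, 800.0, n)

# Synthetic station averages: cooler further north and higher up.
values = 20.0 - 0.5 * (lat - 50.0) - 0.006 * alt + rng.normal(0.0, 0.2, n)
values[7] += 5.0  # inject one spurious value to illustrate the QC check

# Design matrix: intercept plus the chosen geographic factors.
X = np.column_stack([np.ones(n), lat, lon, alt])
coef, *_ = np.linalg.lstsq(X, values, rcond=None)

fitted = X @ coef              # regression surface at the stations
residuals = values - fitted    # regression residuals

# Assumed QC rule: flag residuals larger than three standard deviations.
threshold = 3.0 * np.std(residuals)
suspect = np.flatnonzero(np.abs(residuals) > threshold)
print("stations flagged for checking:", suspect)
```

A flagged station is a prompt for closer inspection, not proof of a bad value; a genuine local climate effect can also produce a large residual.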
The same process is used to generate the monthly and annual gridded datasets. The same range of factors is available for the regression fitting, but because the data being analysed are usually anomalies, most of the factors are already accounted for. Often, only a cross-polynomial of latitude and longitude is required to account for broad spatial patterns in the anomalies.
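The text does not state the order of the cross-polynomial, so the sketch below assumes a second-order form with terms 1, x, y, xy, x² and y², where x and y are longitude and latitude relative to a reference point. The helper name `cross_poly_design`, the reference point and the synthetic anomaly field are all hypothetical.

```python
import numpy as np

def cross_poly_design(lat, lon, lat0, lon0):
    """Design matrix for an assumed second-order cross-polynomial."""
    # Centre on a fixed reference point so the same matrix can be built
    # for stations and grid points, and the columns stay well conditioned.
    y = np.asarray(lat, dtype=float) - lat0
    x = np.asarray(lon, dtype=float) - lon0
    return np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])

rng = np.random.default_rng(1)
lat = rng.uniform(50.0, 59.0, 40)
lon = rng.uniform(-7.0, 2.0, 40)
lat0, lon0 = 54.5, -2.5  # hypothetical reference point

# Synthetic anomaly field with a broad spatial trend and no noise.
anomaly = (0.8 + 0.1 * (lat - lat0) - 0.05 * (lon - lon0)
           + 0.02 * (lat - lat0) * (lon - lon0))

X = cross_poly_design(lat, lon, lat0, lon0)
coef, *_ = np.linalg.lstsq(X, anomaly, rcond=None)
trend = X @ coef             # broad-scale pattern in the anomalies
residuals = anomaly - trend  # what is left for spatial interpolation
```

Because the synthetic field is itself a polynomial of this form, the fit recovers it exactly; with real anomalies the residuals carry the local detail.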
The regression residuals are then interpolated onto a 5 km x 5 km grid using inverse-distance weighting (IDW). This ensures that local variations in the climate are incorporated into the final grid, which is produced by adding the regression surface and the interpolated residual surface together. If anomalies were analysed, the long-term average field is then added back on to produce a field of the original parameter.
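A basic form of IDW with power and radius parameters is sketched below. It is an illustration under stated assumptions, not the operational scheme: the function `idw`, the station coordinates (in km), the residual values and the placeholder regression surface are hypothetical, and grid points with no station inside the radius are simply left as NaN.

```python
import numpy as np

def idw(xy_stations, values, xy_targets, power=2.0, radius=None):
    """Inverse-distance-weighted estimate at each target point."""
    xy_stations = np.asarray(xy_stations, dtype=float)
    values = np.asarray(values, dtype=float)
    xy_targets = np.asarray(xy_targets, dtype=float)

    # Pairwise distances, shape (n_targets, n_stations).
    d = np.linalg.norm(xy_targets[:, None, :] - xy_stations[None, :, :], axis=2)
    hit = d == 0.0                    # target coincides with a station
    d_safe = np.where(hit, 1.0, d)    # avoid division by zero
    w = 1.0 / d_safe**power
    if radius is not None:
        w = np.where(d <= radius, w, 0.0)  # ignore stations beyond the radius

    w_sum = w.sum(axis=1)
    est = np.full(len(xy_targets), np.nan)  # NaN where no station is in range
    ok = w_sum > 0
    est[ok] = (w[ok] @ values) / w_sum[ok]

    # A target that coincides with a station takes that station's value.
    rows, cols = np.nonzero(hit)
    est[rows] = values[cols]
    return est

# Hypothetical stations (x, y in km) and their regression residuals.
stations = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
residuals = np.array([1.0, -1.0, 0.5])

# A small 5 km grid covering the stations.
gx, gy = np.meshgrid(np.arange(0.0, 15.0, 5.0), np.arange(0.0, 15.0, 5.0))
grid_xy = np.column_stack([gx.ravel(), gy.ravel()])

residual_grid = idw(stations, residuals, grid_xy, power=2.0, radius=50.0)
residual_grid = residual_grid.reshape(gx.shape)

# Placeholder for the regression surface evaluated on the same grid.
regression_grid = np.zeros_like(gx)
final_grid = regression_grid + residual_grid
```

A higher power makes the nearest stations dominate, and a smaller radius limits how far a station's influence extends.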
Different regression models, interpolation methods and settings are tested by withholding 10% of the station data. The grid is used to estimate values at the withheld stations, and error statistics comparing these estimates with the actual values are calculated for each configuration. Different IDW settings have been tested, for example by varying the power and radius parameters. Spline surfaces have also been tested but did not give as good a result. The full method is described in Perry and Hollis (2005a).
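The withheld-station test can be sketched as below for the IDW power parameter alone. The synthetic station field, the compact `idw` helper, the candidate powers and the use of root-mean-square error as the error statistic are assumptions made for illustration; the source does not say which statistics are used.

```python
import numpy as np

def idw(xy_obs, v_obs, xy_new, power):
    """Compact IDW estimate (no radius limit), for illustration only."""
    d = np.linalg.norm(xy_new[:, None, :] - xy_obs[None, :, :], axis=2)
    w = 1.0 / np.maximum(d, 1e-9) ** power  # guard against zero distance
    return (w @ v_obs) / w.sum(axis=1)

def rmse(actual, estimate):
    """Root-mean-square error between actual and estimated values."""
    return float(np.sqrt(np.mean((actual - estimate) ** 2)))

rng = np.random.default_rng(42)
n = 200

# Synthetic stations (x, y in km) sampling a smooth hypothetical field.
xy = rng.uniform(0.0, 100.0, size=(n, 2))
values = np.sin(xy[:, 0] / 20.0) + 0.01 * xy[:, 1]

# Withhold a random 10% of the stations for testing.
idx = rng.permutation(n)
n_test = n // 10
test, train = idx[:n_test], idx[n_test:]

# Estimate the withheld stations from the rest for each candidate setting.
scores = {}
for power in (1.0, 2.0, 3.0):
    estimate = idw(xy[train], values[train], xy[test], power)
    scores[power] = rmse(values[test], estimate)

best_power = min(scores, key=scores.get)
print(scores, "-> lowest RMSE at power", best_power)
```

The same loop extends to other settings, such as the radius or the set of regression factors, by scoring each configuration against the same withheld stations.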