WESTERN REGION TECHNICAL ATTACHMENT
NO. 02-01
JANUARY 8, 2002

A PROPOSED GRIDDED QPF VERIFICATION SCHEME
FOR THE GFE IN THE WESTERN REGION

Linda Cheng, Weather Forecast Office, Salt Lake City, UT

Introduction

The proposed implementation of the Interactive Forecast Preparation System (IFPS) at National Weather Service forecast offices (NWSFOs) nationwide will allow forecasters to issue gridded forecast products at relatively high resolutions. The greater spatial detail provided by these gridded forecasts requires that verification be performed over the gridded data set. Current verification procedures typically focus on selected surface stations, but this type of verification is incapable of providing a detailed account of the spatial accuracy of a forecast. A method to verify quantitative precipitation forecasts (QPFs) derived from the Graphical Forecast Editor (GFE) component of the IFPS is described in this Technical Attachment.


Methodology


One of the purposes of verification is to provide forecasters with feedback regarding the skill and accuracy of their forecasts so that improvements can be made in the future. A challenge of forecasting precipitation, especially in Western Region, is that the amount and distribution of precipitation depends heavily on orographic influences. Even within a forecast office's county warning area (CWA), significant local variations in precipitation totals are often observed. Because of this variability, verification statistics derived over a large area are not very meaningful to forecasters. In order to provide detailed and meaningful verification for regions of complex terrain, statistics must be calculated for different regions with similar precipitation climatologies.


The proposed verification scheme allows each office to divide its CWA into a maximum of six climatologically distinct regions. Since precipitation is related to elevation, each office can define three elevation ranges which cover, for example, the areas containing the valleys, the mountain slopes or plateaus, and the mountain crests. Furthermore, each CWA can be divided into two planar regions, e.g., north and south, or east and west. Three elevation ranges within each of two planar regions give the six verification regions.
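The partition described above can be sketched in a few lines of Python. This is only an illustration, not the PERL code used for this study: the function names, the zone labels, and the handling of points that fall exactly on a boundary (assigned here to the higher zone) are assumptions, and the elevation bounds are placeholders that each office would replace with its own.

```python
from bisect import bisect_right

# Hypothetical elevation bounds (ft) separating the three zones; each office
# would choose values suited to its own terrain.
ZONE_BOUNDS_FT = [5000.0, 7000.0]
ZONE_NAMES = ["valley", "slope", "crest"]


def elevation_zone(elevation_ft):
    """Return the zone name for one grid point's elevation.

    A point exactly on a bound is placed in the higher zone (an assumption).
    """
    return ZONE_NAMES[bisect_right(ZONE_BOUNDS_FT, elevation_ft)]


def group_by_region(points):
    """Group precipitation values by (planar area, elevation zone).

    `points` is an iterable of (planar_area, elevation_ft, value) tuples,
    where planar_area is a label such as "north" or "south".
    """
    regions = {}
    for planar_area, elevation_ft, value in points:
        key = (planar_area, elevation_zone(elevation_ft))
        regions.setdefault(key, []).append(value)
    return regions
```

With two planar labels and three zones, `group_by_region` yields at most six lists of values, one per verification region, and each statistic can then be computed on each list separately.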


Verification statistics generated for these regions must be easily interpreted to be useful in the operational sense. It is important for forecasters to know whether their QPFs were over- or underestimates and whether the precipitation was forecast over the correct location. Forecasts that have the correct intensity but incorrect location, and vice versa, must also be accounted for. This requires a suite of skill scores, since no one score can give an accurate assessment of the entire situation. The statistics generated in the proposed verification scheme are described below.

Mean and Max


Mean and maximum values determined from both the forecast and observed precipitation are useful for determining errors in intensity. These values are independent of the locations of the forecast and observed precipitation relative to each other, so forecasts are not punished for slight errors in location within each verification zone. Larger displacements, however, would be punished by this measurement, since verification is performed by region. For example, if the heaviest precipitation was forecast for the mountain crest but observed on the slopes, then the forecast would be punished more severely.


One important point to note is the way the mean is determined in this scheme. A mean is typically calculated using all of the available points. However, because precipitation is not spatially continuous, many grid points may have values of zero. Therefore, the size of the precipitation area influences the mean. For example, a small area of heavy precipitation might have the same mean as a large area of light precipitation. Thus, a mean averaged over an entire region is not useful. However, a mean calculated using only those grid points with nonzero values of precipitation gives a more useful measure of the average precipitation that was forecast or observed.
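A minimal sketch of the nonzero mean follows, assuming the precipitation values for one verification region are held in a flat Python list. The function names are illustrative, and returning None for an entirely dry region is a design choice made here, not something specified by the scheme.

```python
def wet_mean(values):
    """Mean over grid points with nonzero precipitation.

    Returns None when every grid point is dry, since the mean is undefined.
    """
    wet = [v for v in values if v > 0.0]
    return sum(wet) / len(wet) if wet else None


def region_max(values):
    """Maximum value in the region, or None for an empty region."""
    return max(values) if values else None


# A small area of heavy precipitation and a large area of light precipitation.
small_heavy = [0.0, 0.0, 0.0, 1.0]
large_light = [0.25, 0.25, 0.25, 0.25]
```

Averaged over all four points, both lists have the same mean (0.25 in.), while `wet_mean` separates them into 1.0 in. and 0.25 in., which is the distinction the paragraph above describes.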

Mean Error, Mean Absolute Error, and Root-Mean-Squared Error


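The mean error (ME), mean absolute error (MAE), and root-mean-squared error (RMSE) are standard statistical measurements widely used for verification (Wilks 1995). These scores are defined as:

\[
\mathrm{ME} = \frac{1}{N}\sum_{i=1}^{N} (f_i - o_i), \qquad
\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} \left| f_i - o_i \right|, \qquad
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (f_i - o_i)^2}
\]

where f_i and o_i are the forecast and observed values at each grid point, respectively, and N is the total number of grid points.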


All three of these scores are different ways of measuring the error between the forecast and observed precipitation totals at each grid point. Because the errors are computed point by point before being averaged over all grid points, these scores are location-dependent: precipitation forecast over the incorrect location is punished severely.
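The three error measures can be sketched as follows. The sign convention (forecast minus observed, so that a negative ME indicates underforecasting) is the usual one in Wilks (1995) and is assumed here; the function names are illustrative, and the forecast and observed lists are assumed to be the same length and in the same grid-point order.

```python
import math


def mean_error(forecast, observed):
    """ME: average of (forecast - observed); negative means underforecast."""
    return sum(f - o for f, o in zip(forecast, observed)) / len(forecast)


def mean_absolute_error(forecast, observed):
    """MAE: average magnitude of the point-by-point error."""
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(forecast)


def root_mean_squared_error(forecast, observed):
    """RMSE: square root of the average squared point-by-point error."""
    squared = sum((f - o) ** 2 for f, o in zip(forecast, observed))
    return math.sqrt(squared / len(forecast))
```

The location dependence is easy to see with two grid points: a forecast of [1.0, 0.0] against an observation of [0.0, 1.0] has the correct amount in the wrong place, giving an ME of zero but an MAE and RMSE of 1.0.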

Bias and Threat Score


Bias and threat scores are important for determining the spatial accuracy of a forecast (Wilks 1995). These statistics are generated for different thresholds of precipitation. For example, statistics calculated for a threshold of 0.01 in. show the spatial accuracy of the entire area of measurable precipitation, and the higher thresholds show how well the locations of precipitation maxima were forecast.


The bias measures how well the size of the area of forecast precipitation matches that of the observed. Two types of bias scores are calculated by the verification scheme. One takes the difference between the number of grid points forecast to have precipitation and the number of grid points having observed precipitation, while the other takes the ratio of forecast grid points to observed.
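Both forms of the areal bias can be computed from two counts, as in the sketch below. The function names are illustrative, a grid point is counted when its value meets or exceeds the threshold, and returning None for the ratio when no observed point reaches the threshold is an assumption made here to avoid dividing by zero.

```python
def count_at_or_above(values, threshold):
    """Number of grid points whose value meets or exceeds the threshold."""
    return sum(1 for v in values if v >= threshold)


def areal_bias(forecast, observed, threshold):
    """Return (difference, ratio) of forecast to observed grid-point counts.

    difference = F - O and ratio = F / O, where F and O are the numbers of
    forecast and observed points at or above the threshold. The ratio is
    None when O is zero.
    """
    n_forecast = count_at_or_above(forecast, threshold)
    n_observed = count_at_or_above(observed, threshold)
    ratio = n_forecast / n_observed if n_observed else None
    return n_forecast - n_observed, ratio
```

A positive difference, or a ratio greater than 1, indicates that the area at or above the threshold was overforecast; a negative difference, or a ratio below 1, indicates that it was underforecast.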


The threat score (TS) measures the overlap of forecast and observed areas of precipitation and is given by:

\[
\mathrm{TS} = \frac{H}{F + O - H}
\]

where H is the number of "hits," or the number of grid points where both observed and forecast precipitation met or exceeded the threshold amount, F is the number of grid points that were forecast to meet or exceed the threshold amount, and O is the number of grid points that had observed totals meeting or exceeding the threshold amount. Threat scores range between 0 and 1, with 0 meaning there were no "hits," and 1 meaning that all of the forecast and observed points were "hits."
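A short sketch of the threat score follows, taking the score as H / (F + O - H), the standard form in Wilks (1995), with H, F, and O counted as described above. The function name is illustrative, and returning None when neither field reaches the threshold is an assumption, since the score is undefined in that case.

```python
def threat_score(forecast, observed, threshold):
    """Threat score H / (F + O - H) for one precipitation threshold.

    H: points where both forecast and observed meet or exceed the threshold.
    F: points where the forecast meets or exceeds the threshold.
    O: points where the observation meets or exceeds the threshold.
    Returns None when F + O - H is zero (no precipitation at this threshold).
    """
    hits = n_forecast = n_observed = 0
    for f, o in zip(forecast, observed):
        f_yes = f >= threshold
        o_yes = o >= threshold
        n_forecast += f_yes
        n_observed += o_yes
        hits += f_yes and o_yes
    union = n_forecast + n_observed - hits
    return hits / union if union else None
```

The denominator is the number of grid points where precipitation at or above the threshold was forecast, observed, or both, so the score is the fraction of that combined area on which forecast and observation agree.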


GFE Procedures

Both the forecast and observed precipitation can be generated and output to netCDF files using the GFE. Gridded precipitation analyses from the National Centers for Environmental Prediction (NCEP) are currently available for viewing on the GFE from the D2D directories. The variable, tp (total precipitation), has units of kg m^-2. The data can easily be converted to inches and saved into an IFPS database through Smart Initialization (Forecast Systems Laboratory 2001). The forecast and/or observed precipitation for each GFE grid can be summed into the correct time length for verification. For example, if QPF is issued as 6-h grids and the observed precipitation is only available as 24-h totals, the four 6-h QPF grids can be summed up into a 24-h grid using a simple GFE Smart Tool.
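The arithmetic behind the conversion and the summation is sketched below in plain Python. This is not a GFE Smart Tool or Smart Initialization module, and it does not use the GFE API; the function names and the list-of-lists grid layout are assumptions made for illustration. The conversion relies on the fact that 1 kg m^-2 of liquid water corresponds to a depth of 1 mm, and that 1 in. equals 25.4 mm.

```python
MM_PER_INCH = 25.4


def kg_m2_to_inches(tp):
    """Convert total precipitation from kg m^-2 (equal to mm of water) to inches."""
    return tp / MM_PER_INCH


def sum_grids(grids):
    """Sum several same-shaped 2-D grids point by point.

    `grids` is a list of grids, each a list of rows of numbers. For example,
    four 6-h QPF grids are summed into one 24-h grid.
    """
    return [[sum(cells) for cells in zip(*rows)] for rows in zip(*grids)]


# Four identical 6-h grids (inches) summed into a 24-h total.
six_hour = [[0.25, 0.0], [0.5, 1.0]]
total_24h = sum_grids([six_hour, six_hour, six_hour, six_hour])
```

All grids passed to `sum_grids` must share the same dimensions and cover consecutive periods that together span the verification window.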


Each office can define its own planar verification areas by using GFE "edit areas" (Forecast Systems Laboratory 2001). For example, an office might choose to aggregate all of the northern forecast zones for one verification region and all of the southern zones for another. These regions are saved as named edit areas which can be called up each time the verification is performed.


Forecast and observed precipitation are then output to separate netCDF files using the ifpnetCDF program (Forecast Systems Laboratory 2001) with the -m switch, or mask, on the command line set to one of the verification areas. The -g switch must also be set so that topographical information is included in the file. A PERL script is then run which reads the files and calculates the statistics for each elevation zone in each planar area.


Example Output

An example of the statistics generated by the verification code is provided below. As a test case, the verification was performed over the NWSFO Salt Lake City CWA for the unmodified MesoEta forecast imported into the GFE at 5-km resolution. The forecast was valid for the 24-h period ending at 1200 UTC 23 November 2001. The observed precipitation data were obtained from the 24-h gauge-only Stage IV precipitation analysis produced by the River Forecast Centers (National Centers for Environmental Prediction 2001). Figures 1 and 2 show the forecast and observed precipitation for this period over the western United States. The Salt Lake CWA was divided into the northern and southern areas shown in Fig. 3. Elevation ranges used in the verification are 0-5000 ft, 5000-7000 ft, and 7000-15000 ft.


The statistics for the northern part of the Salt Lake CWA are shown in Tables 1, 2, and 3, and those for the southern part of the CWA are shown in Tables 4, 5, and 6.

In addition to the scores mentioned previously, the numbers of hits, false alarms, and misses, and the total numbers of observed and forecast grid points meeting or exceeding each threshold, are also included in the tables. These additional numbers are helpful in showing the size of the area of forecast or observed precipitation.
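These counts can be tallied in a single pass over the grid points, as in the sketch below. The function name and the dictionary keys are illustrative; a false alarm is taken to be a point forecast, but not observed, to meet or exceed the threshold, and a miss is the reverse, which are the usual definitions and are assumed here.

```python
def contingency_counts(forecast, observed, threshold):
    """Tally hits, false alarms, misses, and totals for one threshold."""
    counts = {"hits": 0, "false_alarms": 0, "misses": 0}
    for f, o in zip(forecast, observed):
        f_yes = f >= threshold
        o_yes = o >= threshold
        if f_yes and o_yes:
            counts["hits"] += 1
        elif f_yes:
            counts["false_alarms"] += 1
        elif o_yes:
            counts["misses"] += 1
    # Totals follow from the three basic counts.
    counts["forecast_points"] = counts["hits"] + counts["false_alarms"]
    counts["observed_points"] = counts["hits"] + counts["misses"]
    return counts
```

Running this once per threshold and per verification region gives the supporting counts from which the bias and threat scores can be recomputed or checked.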


From the statistics generated for the test case, one can deduce from the mean and max values that the intensity of precipitation was underforecast in all six verification regions. However, the extent of the underforecasting appears to be greater in the north than in the south. The values for the ME show that in the north, precipitation was underforecast at each grid point on average, and the values of MAE and RMSE show that the extent of the errors increases with elevation. In the south, however, ME values show that each grid point was, on average, overforecast, even though the mean and max values show that the intensity of the total area of precipitation was underforecast. This apparent contradiction could be due to large errors in the placement or size of the precipitation areas. As in the north, the extent of errors at each grid point also increases with elevation.


The areal bias statistics show a general decrease with increasing threshold values at all elevation ranges in the north. This indicates that the total size of the precipitation areas was overforecast, but the areal coverage of the higher values of precipitation was underforecast. In the south, however, the largest bias ratio was at the 0.10-in. threshold.


Threat scores were generally high in the north for all thresholds except values above 1.00 in. Threat scores were lower in the south. In the north, the best forecasts were for the mid-elevation range, whereas in the south, the forecast was better at the highest-elevation zones.


Discussion

A feasible way of verifying precipitation for gridded GFE forecasts was presented above. However, the success of the verification scheme requires a better gridded precipitation analysis than those currently available. Furthermore, resolution is important in any gridded verification scheme. Western Region forecast offices will likely begin running the GFE operationally at approximately 5-km resolution. Obviously, higher horizontal resolution would allow for better representation of smaller-scale precipitation features such as convection in the verification scheme. Future increases in computing power at the local offices should allow for the needed improvements to grid resolution. The GFE is still undergoing development, so slight changes in the verification process may be necessary in the future. In any case, gridded verification by climatological regimes should prove useful for forecasters when the GFE becomes operational.

Acknowledgments

Thanks to Andy Edman, Steve Vasiloff, and Jason Burks for their comments, and Kirby Cook for his review of this TA and help with the PERL code.


References

Forecast Systems Laboratory, Enhanced Forecaster Tools Branch, cited 2001: GFESuite Information. [Available on-line from http://www-md.fsl.noaa.gov/eft/rpp/doc/onlinehelp_RPP14/GFESuite.html].

National Centers for Environmental Prediction, Environmental Modeling Center, Mesoscale Modeling Branch, cited 2001: National Stage II Analyses ("Stage IV"). [Available on-line from http://www.emc.ncep.noaa.gov/mmb/stage2/].

Wilks, D. S., 1995: Statistical Methods in the Atmospheric Sciences. Academic Press, 467 pp.