Wednesday, August 12, 2020

A Tropical Cyclone Forecast Metric for Operations and Model Development

prospectus for a WAF paper 

  • why metrics matter: "you're only as good as what you measure"
    • .gov & .mil set standards for operational forecast quality or goodness
      • GPRA for NWS - mean 48-h PE/IE
      • PACOM for .mil - mean 24,48,72,120 PE
    • model 'goodness' based on mean PE/IE statistics
  • TC forecast is...
    • surface (10 m) wind field
    • 2-D functional representation using Position (lat/lon), Intensity (Vmax), Radii (R34/50/64), POCI/ROCI (pressure and radius of the outermost closed isobar), and other parameters...
    • NWS (noaa.gov) & PACOM (.mil) warnings/advisories based on onset of 34 kt winds
  • Standard TC metrics:
    • PE - position error, commonly (but misleadingly) called 'track' error
    • IE - intensity error; intensity defined by Vmax not Pmin
    • forecast taus 0,12,24,36,48,60,72,96,120
    • primary statistic is the mean
    • best track uncertainty:
      • P ~ 5-20 nmi depending on I (large I, small P uncertainty)
      • I ~ 10-20 kt; largest for small I and (ironically) very large I
  • Properties of PE (and IE)
    • first and foremost NOT equivalent to NWP metrics like 5-day 500 mb NHEM AnomCorr (NAC)
      • time series of  5DNAC --> mean
        • continuous from a continuous process (the model)
        • # of cases the same for day 5, day 10, day...
      • time series of 24/72/120 h PE -- 2019 LANT for hwrf,avno,tecm5
        • discontinuous 
        • 2 or more PE at a given time (2 or more storms)
      • # of cases at each tau is different, as shown by histograms of R34 at tau 24/72/120
        • varies with basins
        • in the LANT only 1 of every 2 forecasts has a 72-h verifying position
        • show how 24-h PE using only 120-h storms != 24-h PE using all possible cases
    • display means of both NAC and PE as die-off curves
      • the NAC die-off can be differentiated; the PE die-off cannot
      • PE should be displayed for each tau separately!
      • # of storms / mean PE
    • apples v. oranges problem!!
    • the 'population' is season/basin dependent
    • year-to-year variability in season/basin mean implies the population cannot be well defined
    • serial correlation between forecasts reduces number of cases
      • e-folding time ~ 12-18 h, so for forecasts every 6 h, Nind ~ Nall/3
    • for every forecast...
      • number of verifying cases at:
        • tau 24 ~ 80% (short range)
        • tau 72 ~ 50% (medium range)
        • tau 120 ~ 30% (long range)
      • mean PE/IE represent a subset of storms
      • contribution by storm highly variable
  • the most important part of the forecast is track
    • 80% v 20%?
  • How to improve mean PE?
    • separate from IE
    • a model must make a 'good' track forecast 
      • to use the intensity? maybe...but physically intensity does depend on track
        • can be seen in ensembles -- need to make this plot again...

    • improve the process that generates the forecast -- the model -- why ECMWF is the best TC forecast model
    • reduce 'big' errors
      • do big errors happen within a storm or by storm?
  • Forecast Error (FE)
    • is not the same as PE or IE
    • must be related to the wind field (the forecast) and particularly the extent of 34 kt winds represented by the Radius of 34 kt winds (R34)
    • conceptually FE=f(PE,IE,R34)
      • in the early years (60-80s) Charlie Neumann defined FE=PE
      • or FE=a*PE + b*IE ; a=1.0 and b=0.0
  • New FE=f(PE,IE)
    • only require the forecaster (human or model) to predict position and intensity; use best-track (BT) R34
    • define an IKE-based error (integrated kinetic energy, Powell...) as the symmetric difference of two (intersecting) circles of radius R34
      • symmetric R34
      • statistical relationship between Vmax and Rmax for forecast & BT
    • use of this simplified IKE represents a lower limit in FE 
  • How
  • demonstrate how 'large' errors contribute to the season/basin mean
    • for every one 'bad' forecast it takes 5-10 'good' forecasts to compensate
    • one storm can dominate the mean at the medium/long range
  • demonstrate how storms contribute to the mean PE/IE and why a model failure for a single storm can 'ruin' the seasonal mean
  • analyze 'large' model errors for both PE and FE
    • are large PE errors always large FE?
    • how do large FE compare to PE
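
Under the symmetric-R34 assumption above, the simplified-IKE forecast error reduces to the area of the symmetric difference of two circles whose centers are separated by the position error. A minimal numerical sketch (function names are mine, not from any operational code; radii and separation in nmi):

```python
import math

def circle_overlap_area(r1, r2, d):
    """Area of intersection of two circles with radii r1 and r2
    whose centers are separated by distance d (lens-area formula)."""
    if d >= r1 + r2:            # disjoint: no overlap
        return 0.0
    if d <= abs(r1 - r2):       # one circle contained in the other
        r = min(r1, r2)
        return math.pi * r * r
    a1 = r1 * r1 * math.acos((d * d + r1 * r1 - r2 * r2) / (2.0 * d * r1))
    a2 = r2 * r2 * math.acos((d * d + r2 * r2 - r1 * r1) / (2.0 * d * r2))
    a3 = 0.5 * math.sqrt((-d + r1 + r2) * (d + r1 - r2) *
                         (d - r1 + r2) * (d + r1 + r2))
    return a1 + a2 - a3

def symmetric_diff_fe(r34_fcst, r34_bt, pe):
    """Simplified FE: symmetric-difference area (nmi^2) between the
    forecast and best-track 34-kt wind circles, offset by the PE."""
    overlap = circle_overlap_area(r34_fcst, r34_bt, pe)
    return math.pi * (r34_fcst ** 2 + r34_bt ** 2) - 2.0 * overlap
```

A perfect forecast (PE = 0 and matched R34) gives FE = 0; once PE exceeds the sum of the two radii the circles no longer intersect, and FE saturates at the sum of the two circle areas -- consistent with treating this simplified IKE as a lower limit on FE.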

Thursday, January 30, 2020

CMC > NCEP 20200130 Update

CMC pulls ahead of the GFS? We're #4?

Update on 20200130: CMC better in Summer Hemisphere?

Mike Fiorino
30 January 2020

Recap of the November Blog


In my first post, https://wxmapstertc.blogspot.com/2019/11/cmc-ncep.html, I discussed how the CMC GDPS (Global Deterministic Prediction System -- the Canadian model) was outperforming the NCEP GFS (the American model) in a very basic measure of global model forecast quality -- the 5-day Northern Hemisphere (NHEM) height Anomaly Correlation (5DNACC).

I acknowledged that a few points do not make a trend, but at the time hypothesized that the improvements in the Canadian model came from new/better physics...

I asked two questions:
  1. when/what were the apparent changes in the CMC GDPS?
  2. was the gap between the NCEP.GFS and the CMC.GDPS a blip or a trend?
Ron McTaggart-Cowan (ron.mctaggart-cowan@ec.gc.ca) answered #1: 3 July 2019.

And he provided these links describing the changes:
The most significant physics update to me involved the convective parameterization (adding momentum transport). As a long-time tropical cyclone (TC) NWP modeler, I was not surprised by the impact of better convection; it is consistent with related changes at ECMWF in 2008 that led to a dramatic improvement in their TC track forecasts (see: https://www.ecmwf.int/sites/default/files/elibrary/2009/17493-record-setting-performance-ecmwf-ifs-medium-range-tropical-cyclone-track-prediction.pdf)

 

Recent Trends...


Pete Kaplan's EMC Stat Page (the long-running web page used at all global model meetings at EMC during my time there in the 1990s) has been down for 'technical reasons'(?), but here are the latest stats from the EMC VSDB (https://www.emc.ncep.noaa.gov/gmb/STATS_vsdb/):



CMC is ahead of the GFS by 0.4 points (a point in NWP is a percentage point) in the NHEM but 1.2 in the SHEM. A 1.0-point change is a pretty big deal, but again is not necessarily a trend. What's more impressive is how consistent the delta is and how the CMC.GDPS does not 'drop out' as badly as the GFS, e.g., in the SHEM 11 JAN and NHEM 4 JAN dropouts.

The new CMC GDPS model has been running since July 2019, so we have 7 months to compare in the text table below:

               NHEM                   SHEM
         CMC    GFS  CMC-GFS    CMC    GFS  CMC-GFS
201906: .866   .876  -0.010     N/A
201907: .863   .871  -0.008    .883   .889  -0.006
201908: .897   .890  +0.007    .882   .891  -0.009
201909: .876   .874  +0.002    .885   .891  -0.006
201910: .898   .894  +0.004    .908   .899  +0.009
201911: .907   .911  -0.004    .914   .911  +0.003
201912: .904   .911  -0.007    .903   .903  +0.000
202001: .913   .907  +0.006    .887   .875  +0.012
 
What's most interesting is how the Canadian model does better in the summer hemisphere.

We're not quite at the peak of austral TC activity (around February), but it looks like the convection changes are really improving the CMC height scores, which confirms (IMHO) how much the tropics impact the quality of midlatitude forecasts.

 

Some final thoughts...


Cliff Mass' recent blog post on US NWP (https://cliffmass.blogspot.com/2020/01/the-future-of-us-weather-prediction.html) makes a compelling argument that American operational NWP is fundamentally broken, with perhaps the greatest dysfunction (unstated there) in the US Navy 😟.  In the NHEM plot above, the Navy global model is barely competitive with the CFSR!  This is very distressing to me as a retired Naval Oceanography officer who implemented the first operational two-way interactive, moving nested-grid TC model in 1981...

I'm seeing a real trend here, and at the rate we're going, American NWP will be 4th-rate for the foreseeable future...

While Cliff correctly identifies our problem mostly as one of leadership (I agree), the (much?) bigger problem in my mind is the data-handling systems at NCEP NCO.  Using the unix filesystem to 'manage' gridded fields is appallingly primitive and at least 30 years behind the rest of the world!  Although the Navy global model is clearly in last place, the data systems at FNMOC have always been top notch.

Data handling is certainly not a sufficient condition, as demonstrated by the lack of performance by the Navy global model, but it is a necessary one.  I cannot see American NWP at NCEP progressing without a fundamental change in the nitty-gritty of data handling, a fully funded and outside-DC EPIC notwithstanding.

The Usual Caveats


These comments/opinions are wholly mine and do not reflect those of my current employer, the University of Colorado Boulder, or my previous employer, NOAA.