Computing resource allocation for models

Does anyone know why the GFS continues to be run out to 384 hours, while the NAM stays at 84? We would undoubtedly get more useful results from moving computing resources from the useless 10-16 day GFS into extending the NAM a few more days.
 
Your question was interesting, so I spent some time researching it. From what I can tell, NOAA is very interested in long-range forecasts - even if they are not accurate. So they would rather run the GFS four times a day, at half resolution in the extended range, and get a longer 16-day forecast than run both the NAM & GFS at full resolution for a longer duration (say 7-10 days each).
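
To make that resource trade-off a bit more concrete, here is a rough back-of-envelope sketch in Python. The grid spacings and the day-10 split are my own illustrative numbers, not NCEP's actual configuration; the only assumption is that compute cost scales with the number of horizontal grid points times the number of timesteps, and that the stable timestep shrinks in proportion to the grid spacing (CFL):

```python
# Back-of-envelope cost comparison (illustrative numbers only).
# Assumption: cost ~ (horizontal grid points) x (timesteps), and the stable
# timestep shrinks with the grid spacing (CFL), so halving the grid spacing
# costs roughly 2 x 2 x 2 = 8x per forecast day.

def relative_cost(grid_spacing_km, forecast_days, reference_spacing_km=13.0):
    """Cost of a run, relative to one forecast day at the reference spacing."""
    refinement = reference_spacing_km / grid_spacing_km
    return refinement**3 * forecast_days  # 2 horizontal dims + 1 timestep factor

full_res_16_days = relative_cost(13.0, 16)                    # hypothetical: 16 days at full resolution
split_run = relative_cost(13.0, 10) + relative_cost(26.0, 6)  # hypothetical: full res to day 10, half res after

print(f"16 days at full resolution: {full_res_16_days:.1f}")
print(f"10 days full res + 6 days half res: {split_run:.1f}")
```

Under those crude assumptions the extended half-resolution days are nearly free compared to full-resolution days, which is presumably the appeal of that setup.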


In fact, it looks like they are trying to produce forecasts as much as 1 month out.

From http://www.noaa.gov/media-release/noaa-to-develop-new-global-weather-model
Goals for the new model are:
  • a unified system to improve forecast accuracy beyond 8 to 10 days
  • better model forecasts of hurricane track and intensity, and
  • the extension of weather forecasting through 14 days and for extreme events, 3 to 4 weeks in advance.

Looking at the run times of the NAM & GFS, it appears they run them back-to-back. As one is nearing completion, they start the other. And they don't just need their servers for these two models. They use their computing power for other models as well.

http://www.nco.ncep.noaa.gov/pmb/nwprod/prodstat/index.html


My overall thought is that, given their available computing power, this setup allows them to get long-range (16-day) forecasts with the GFS, run all of their other models, and still get results out to the public on time. Eliminating the half-resolution GFS runs would go against their goals of producing long-range forecasts. And they don't have enough computing power to extend the NAM.
 
Ditto rdale. While a single run of the GFS does not exhibit consistently useful skill beyond 8 days, aggregating multiple runs together in a time-lagged ensemble does demonstrate some, albeit limited, skill. This is similar to the way the CFS is used. Have you seen those heat-map charts for the CFS where the y-axis is the CFS cycle, the x-axis is the forecast time, and the coloring of a cell is a model parameter like SCP or something? You can do the same thing with the GFS.
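
If anyone wants to roll their own version of those charts for the GFS, here is a minimal Python/matplotlib sketch. The values are random placeholders; in practice each row would be filled with SCP (or whatever parameter) pulled from one GFS cycle:

```python
# Minimal sketch of a "cycle vs. forecast hour" heat map, in the spirit of the
# CFS charts described above, with made-up values standing in for a real
# parameter such as SCP. Each row is one model cycle; each column is a forecast
# hour; cells valid at the same wall-clock time line up along diagonals, which
# is what makes the time-lagged comparison easy to eyeball.
import numpy as np
import matplotlib.pyplot as plt

cycles = [f"2016-05-0{d}_{h:02d}Z" for d in (1, 2) for h in (0, 6, 12, 18)]
fcst_hours = np.arange(0, 96 + 1, 6)

rng = np.random.default_rng(0)
param = rng.random((len(cycles), len(fcst_hours)))  # placeholder for SCP etc.

fig, ax = plt.subplots(figsize=(8, 4))
im = ax.pcolormesh(fcst_hours, np.arange(len(cycles)), param, shading="auto")
ax.set_yticks(np.arange(len(cycles)))
ax.set_yticklabels(cycles)
ax.set_xlabel("Forecast hour")
ax.set_ylabel("Model cycle")
fig.colorbar(im, ax=ax, label="Parameter value (placeholder)")
plt.tight_layout()
plt.show()
```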

For what it's worth, the DGEX is a variation of the NAM that is run out to 192 hours. It picks up where the NAM leaves off and uses the previous cycle of the GFS to feed it the lateral boundary conditions. It is only available on the 06Z and 18Z cycles, so its boundaries are fed by the 00Z and 12Z cycles of the GFS. Its name is derived from Downscaled GFS by Eta Extension; the Eta was the numerical core of the NAM back in the day. The intent, from what I can gather from internet searches, was for the DGEX to provide more realistic point forecasts because it was run at a higher resolution than the GFS. The GFS forecast grids were too coarse, especially in areas with significant elevation gradients. But because the boundaries were being fed by the GFS, the inner domain was dominated by the GFS synoptic forecast. The DGEX is still run today, though it's not clear if that's for historical reasons or if it continues to serve the original purpose. Its skill is often mocked. It can be found here.
 
TL;DR: We used to have the GFS, NAM, and RUC, which had their own niches to fulfill, and did just that. But over time, as technology has improved, the configurations of these models have become much more similar to the point where they fulfill almost the same roles, resulting in ambiguous or confusing differences between them and opening up possibilities for consolidation of resources.

There are only certain roles to be played in the operational forecasting business. Most weather forecasting centers around the world participate in at least some of the following:

- A global model: Some agencies, like ECMWF, do only this. Global models were the first type of operational model to be run once things really got going back in the 1950s-1960s. Sure, the first experimental NWP models were not global, but the first that served a practical purpose were. Global models are only meant to forecast synoptic-scale processes, and operational ensembles are also generally limited to global types. The big advantage of a global model is that it can be run as far out in time as you wish, since its domain covers the whole globe and therefore has no lateral boundaries that need to be fed by another model. Some agencies attempt to run GCMs or climate models even further out than 16 days, knowing that such models are not meant to forecast day-to-day sensible weather, but rather larger-scale climatic events like ENSO, or just seasonal forecasts. However, most agencies that run global models out that far operate under the assumption that there is no predictability of sensible weather past 16 days. Practically speaking, predictability in a deterministic sense is lost anywhere between 7 and 12 days. One area where ensembles are highly useful is the extension of predictability to longer ranges.

- A mesoscale model: In the US, we have the NAM (you could argue the RAP fits this role to some degree), while MeteoFrance has ARPEGE, for example. The mesoscale model is designed to cover a limited area, but one still large enough to encompass a sizeable region, like a continent. Because these models have limited-area domains, they must be forced with lateral boundary conditions taken from a global model. The controlling influence of the lateral boundary conditions on the forecast evolution places an effective limit on the length of a mesoscale model forecast: after a sufficiently long period of integration, the information coming in from the lateral boundaries will completely occupy the model domain, after which point the mesoscale model is essentially a downscaled version of the driving model. That's why the NAM only runs out to about 84 hours. Imagine a situation in which there is a strong, zonally oriented polar jet stream across the CONUS with wind speeds exceeding 100 kt. It would take only slightly more than 24 hours for an air parcel crossing the Pacific coast to traverse the CONUS and exit over the Atlantic coast (see the quick arithmetic sketch after this list). This means that information passed into such a domain from the lateral boundary conditions would completely overtake the influence of the model's own dynamical core in that time. In other words, in that situation the NAM could basically just be a downscaled (i.e., finer-resolution) version of the GFS in barely 24 hours. Sure, you get weaker wind speeds closer to the surface, and in many cases even modest low-level flow would not cross a CONUS-sized domain within 84 hours, thus allowing NAM forecasts to keep an identity separate from the GFS through that time. However, larger-scale features like Rossby waves exert a strong degree of control over smaller-scale processes like fronts, so you can't really use the latter argument to justify running a mesoscale model that much farther out in time. It would simply be a waste of resources.

- A rapidly-updating or small-scale model: This is where the US weather enterprise does pretty well. We had the RUC for years, and now the RAP. These models are designed to be re-initialized and updated much more frequently than mesoscale and global models. In many cases, they're also run at a finer resolution. These models are useful for very short-term or nowcast-type forecasting of less predictable events like thunderstorms or extreme winter storms. Many other modeling centers have small-scale models run over small domains. The UK Met Office has a model that runs as fine as 1.5 km. The French have a model called AROME that runs over a similarly small domain with a horizontal grid spacing of around 2.5 km. And so on. However, where the US exceeds these other centers is in the scale of the models we use. For one, we don't just run the RAP here. We also have the HRRR, and convection-allowing ensembles are not far away; the US may be the first nation to run an operational convection-allowing ensemble within a few years. We also forecast over a larger domain: the CONUS is much bigger than the British Isles or central Europe.
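
Here is the quick arithmetic behind the boundary-sweep argument in the mesoscale item above, as a tiny Python sketch. The ~4500 km domain width is my own round number for a CONUS-scale domain:

```python
# Quick arithmetic behind the "boundary sweep" argument (rounded, illustrative
# numbers): how long it takes air moving at a given speed to cross a
# CONUS-sized domain, i.e., roughly how long before information from the
# lateral boundaries has swept through the whole domain.
KT_TO_KMH = 1.852

def sweep_time_hours(domain_width_km, wind_speed_kt):
    """Hours for a parcel at the given speed to traverse the domain width."""
    return domain_width_km / (wind_speed_kt * KT_TO_KMH)

print(f"100 kt jet, ~4500 km domain: {sweep_time_hours(4500, 100):.0f} h")   # ~24 h
print(f"30 kt low-level flow, same domain: {sweep_time_hours(4500, 30):.0f} h")  # ~81 h
```

With a 100-kt jet the boundaries sweep the domain in about a day, while modest 30-kt flow takes roughly 80 hours, which lines up with the 84-hour NAM cutoff.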

========== sidebar on ensembles ======================================
I haven't really covered ensembles even though they are more useful, more computationally expensive, and likely the future of operational forecasting. Pretty much no deterministic model can outperform an ensemble run at a similar grid spacing. Anyway, you can run global, regional/mesoscale, and convection-allowing/small-scale ensembles as well. In the US, we have the GEFS as our global ensemble and the SREF as our mesoscale ensemble, and there are a few experimental convection-allowing ensembles that have been going for years, including the SSEF (the storm-scale ensemble forecast system run by CAPS at OU) and the UCAR ensemble; recently NSSL has begun running a 4-km ensemble, and the SPC uses the SSEO (not truly an ensemble, but it uses US resources). Starting this year some of these have been combined into what's being called CLUE (the Community Leveraged Unified Ensemble), which will likely serve as the testbed ensemble for future spring experiments until the HRRRE (HRRR ensemble) becomes operational, likely by 2020. I will leave further discussion of ensembles for another thread. Back to the topic at hand...
===================================================================

Probably due to a combination of unsteady technological advances and user demand, some of the long-running US models have begun to overlap in their configuration and purpose. NOAA/NCEP is aware of this and is making some strides toward reducing the redundancy, but due to the bureaucracy, this is going to take some time. For example, the GFS now runs at ~13 km, which is well within the range of a mesoscale model. Hell, ECMWF runs their HRES model at ~9 km! I'm guessing this is the result of having increased computing power and nowhere else to apply it. I don't know if ECMWF plans on taking their model down to convection-allowing grid spacings at some point, but the computational requirements for that are ridiculous currently (a crude sense of scale is sketched below). It will certainly become plausible in the not-too-distant future, but will still require incredible resources. In the US we decided to skip over that burden by shifting to variable-resolution grids, of which MPAS and FV3 (the new GFS) are examples. These types of models allow selected regions of the globe to have convection-allowing grid spacings while keeping a global domain (so that lateral boundary forcing is not needed) and without excessive computational cost.

Anyway, I'm rambling, but my point is that the GFS and the 12-km NAM now run at essentially the same grid spacing. They have different physics, and the NAM is effectively nested within the GFS through its lateral boundary conditions, which leaves one to wonder why they bother keeping both. Furthermore, the NAM used to have grid spacings in the 20-60 km range, whereas the old RUC and now the RAP have been run at 13 km for some time. The RAP's domain is also being significantly expanded in the newer versions (a lot of interesting information can be found here: http://ruc.noaa.gov/pdf/RAPX_HRRRX_NWS-13sep2016-pub.pdf). This basically means the RAP and the 12-km NAM have nearly the same model domain and grid spacing. They're still different model cores with different physics, initialization frequencies, and forecast lengths, but why bother? There are some changes possibly in the works, including a combined RAP/NAM called the NAMRR (NAM Rapid Refresh), but it currently appears the NAMRR may never become operational for various reasons.
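
To put a rough number on why a uniformly convection-allowing global model is still out of reach, here is a crude column-count sketch in Python. The Earth's surface area and the grid spacings are round numbers of my own choosing; vertical levels and the CFL-limited timestep make the gap even larger:

```python
# Rough sense of scale for a global domain at different grid spacings
# (Earth surface area ~5.1e8 km^2; ignoring vertical levels and the timestep,
# both of which widen the gap further since the timestep also shrinks with dx).
EARTH_AREA_KM2 = 5.1e8

for dx in (13, 9, 3, 1.5):
    columns = EARTH_AREA_KM2 / dx**2
    print(f"{dx:>4} km grid spacing: ~{columns:.2e} horizontal columns")
```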

Another strange thing I've noticed is that the GFS is the only global model run every 6 hours; other modeling centers around the world run their global models every 12 hours. You'd think a lot of resources could be saved by eliminating the 06Z and 18Z GFS runs, but I have been informed that those runs still provide some forecast skill in the short term, which some users still want to have. Therefore, I don't see the 06Z/18Z runs going away anytime soon.
 
Very helpful information, thank you! The NAM does seem to have some skill even at 84 hours, seemingly more so than the GFS does at 180 - that being the case, wouldn't it be reasonable to expect another 24-48 hours of the NAM would be of *some* value, even considering the degradation from the factors Jeff pointed out?
 
wouldn't it be reasonable to expect another 24-48 hours of the NAM would be of *some* value, even considering the degradation from the factors Jeff pointed out?

Not in a systematic fashion, which is what matters. I'm sure you can find an example of a 108-hour NAM forecast that has skill compared to a GFS forecast of the same length, or compared to a 60-hour NAM forecast (from a later initialization) valid at the same time, but that will only happen a small portion of the time, and thus is not a good enough reason to devote that much more in resources. It would be like paying the extra $1 for the Power Play option in Powerball for an extra 1% chance of winning money - the math just doesn't add up.
 