• A friendly and periodic reminder of the rules we use for fostering high SNR and quality conversation and interaction at Stormtrack: Forum rules

    P.S. - Nothing specific happened to prompt this message! No one is in trouble, there are no flame wars in effect, nor any inappropriate conversation ongoing. This is being posted sitewide as a casual refresher.

Discussing CFS dashboard verification/usefulness

This topic came up in another thread and rather than go off on too much of a tangent there, I felt it would be a good idea for us to collectively share thoughts on the CFS dashboard here.

*This is a subjective analysis that is open for debate*


Since the CFS severe weather guidance dashboard is designed to show large areas of SCP/supercell composite parameter (theoretical supercell potential), I decided to look at days that featured substantial large hail/tornado reports in the U.S. I used a threshold of days with at least 10 tornado reports and/or at least 50 hail reports. While not perfect, this should weed out most of the mesoscale accidents or otherwise localized events. After all, the dashboard is meant to show broad potential events. When I talk about active periods, I will assume that most of the days within that period met the 10 tor or 50 hail criteria. Since these criteria are subjective, I understand that others may have better ways of assessing the CFS dashboard. If so, please share them here! I also used "eyeballing" for verification, assuming that plots with solid blue color refer to quiet periods, while warmer colors and/or any colors with Xs refer to active periods. Days that met the threshold of either 10+ tornado reports or 50+ hail reports are highlighted in red in the graphics below.
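The day-selection rule above can be sketched in a few lines. This is just an illustration of the criterion; the report counts below are made-up placeholders, not actual SPC storm-report totals.

```python
# Illustrative sketch of the 10-tornado / 50-hail day-selection criterion.
# The sample data are hypothetical, not real SPC report counts.

def is_broad_severe_day(tor_reports: int, hail_reports: int) -> bool:
    """Flag days likely to reflect broad severe potential rather than
    localized 'mesoscale accidents': 10+ tornado OR 50+ hail reports."""
    return tor_reports >= 10 or hail_reports >= 50

sample_days = [
    ("2016-05-21", 14, 120),  # active: meets both thresholds
    ("2016-05-02", 1, 12),    # quiet: meets neither
    ("2016-05-24", 4, 65),    # active on the hail count alone
]

flagged = [day for day, tor, hail in sample_days
           if is_broad_severe_day(tor, hail)]
print(flagged)  # ['2016-05-21', '2016-05-24']
```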

It should also be noted that false alarms from the guidance become more likely from mid-May through June, as stronger boundary layer heating means only modest shear is needed to push SCP values over 1.
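To see why modest shear is enough once instability is large, here is the commonly cited SPC formulation of the supercell composite parameter. Note this is the published SPC version; the dashboard's internal implementation may differ.

```python
def scp(mucape: float, esrh: float, ebwd: float) -> float:
    """Supercell composite parameter, per the commonly cited SPC formulation.

    mucape : most-unstable CAPE (J/kg)
    esrh   : effective storm-relative helicity (m^2/s^2)
    ebwd   : effective bulk wind difference (m/s)
    """
    # Shear term is zeroed below 10 m/s and capped at 1 above 20 m/s.
    if ebwd < 10.0:
        shear_term = 0.0
    elif ebwd > 20.0:
        shear_term = 1.0
    else:
        shear_term = ebwd / 20.0
    return (mucape / 1000.0) * (esrh / 50.0) * shear_term

# A hot boundary layer (large CAPE) with only modest shear and helicity
# still clears SCP = 1 -- the warm-season false-alarm mode described above:
print(scp(mucape=3000.0, esrh=50.0, ebwd=12.0))  # 1.8
```

Since MUCAPE is the only term divided by 1000, summer values of 3000-4000 J/kg can carry a marginal shear profile over the SCP = 1 line on their own.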

The first thing I noticed is that the dashboard seems to have better skill at identifying higher-end early season events (April). This is probably because there are fewer false-alarm signals: a system that is going to produce an outbreak early in the season needs instability well above seasonal averages to perform well above climatology for a large-scale event.

I have three saved runs of the dashboard (5/13/2016, 4/29/2017, 5/12/2017). The latest 4/18/2018 run gives some backward visibility into how the suite is performing this year, and even the saved runs give some hints as to how prior runs leading up to those days performed.

Before I get into specific events and periods of interest, right off the bat I noticed that overall there is virtually no skill beyond day 21 and only occasional skill beyond day 14, as a climatology forecast will probably verify better than the CFS dashboard in the long range. For this reason, I will not focus much on events that were two to three or more weeks out.



The period from 5/21 to 5/29 met the 10 tor/50 hail criteria on 7 of 9 days, and chasers know that every day in that period featured chaseworthy setups. Looking at the CFS, it began to show a signal for the period (using 5/25 as the midpoint) by May 1st, 24 days in advance. There weren't really any "bad" runs that lost the idea of a busy stretch. The caveat here is that late May is very close to the seasonal peak for severe weather, so while 24 days of lead time appears impressive, take it with a grain of salt.

There were mixed results for the quiet period preceding this stretch, 5/18 to 5/20. The CFS waffled back and forth from 1-2 weeks out, which brings me back to the climatology argument: the dashboard may not be as effective during peak season as it is earlier in the year.

For the 5/1 to 5/9 dry spell, the dashboard locked in by 4/29, roughly a week's worth of lead time. As early as 4/25, the dashboard showed that the early part of that period would perform below climatology, so again, around or slightly more than a week of lead time.

There was an active stretch in late April 2016 that didn't consistently meet my criteria, but just eyeballing the dashboard, it had about a week's worth of lead time and not much beyond that.

Finally, there was a seasonally noteworthy downturn in severe activity from 5/30 to 6/12, but the day-17 run showed no skill whatsoever in highlighting this period. Since this is more than two weeks out and near the peak of severe season, it's not really a surprise that the CFS performed poorly here.




The period from 4/28 to 4/30 met the 10 tor/50 hail criteria and had about 14 days of lead time with the dashboard, as the signal was fairly clear from 4/14 on.

On the other hand, 5/10 featured a severe weather event over a broad portion of the Plains (not high-end though) and the CFS showed no skill pinpointing this event from 12 days out.

The active period from 5/16 to 5/20 saw about 19 days of lead time, considering that two runs of the dashboard were "bad runs". 17 out of 19 runs nailing down the period still seems impressive, although mid-May is close to the seasonal peak, so one could argue the 19-day lead time is not as significant as it may seem.

Another active period from 5/23 to 5/27 also saw 19 days of lead time, with just one "bad run" from the CFS.

The quiet period from 5/12 to 5/15 is interesting, since the dashboard first started showing this with the 4/23 run and only had one bad run that lost the dry spell. That shows approximately 20 days of lead time and seems significant since mid-May climatology would argue against an extended stretch of limited severe activity.



For the 4/13 event, the CFS was locked in 12 days in advance, showing a strong signal starting on 4/1, while the suite started hinting at it as early as 3/27 (17 days out).

Starting on 4/8, the dashboard showed that the current stretch we are in would be relatively quiet, so there's about 7-10 days of lead time there.

Key take-aways from this subjective analysis of the CFS dashboard:

  • The CFS has virtually no skill beyond week 3 (21+ days out).
  • CFS skill from weeks 2-3 (14-21 days out) is variable, noting that climatology skews toward increased severe activity in mid to late spring, which calls into question the validity or importance of CFS "verification" this far out.
  • While the CFS may occasionally zero in on events as far as ~3 weeks out, it is not uncommon for the model to flip-flop a few times before getting fully locked in.
  • The CFS is generally very good a week out (7 days) and has shown some skill 1-2 weeks (7-14 days) out, especially early in the season.

I did not give much analysis to winter events, since the thresholds for the dashboard are also subjective. At some point in the late winter/early spring (it does not seem consistent over the past few years), the threshold for CAPE in the algorithm increases from 500 J/kg to 1000 J/kg. This causes a large shift in the suite's output, which further complicates assessing verification. I also lack saved dashboard runs from the winter, so I felt it would be best not to focus on them. (Plus I think most chasing prospects are more focused on April/May/June than January/February/March.)
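The effect of that seasonal threshold change can be sketched simply. The 500 and 1000 J/kg values are from the post; the dashboard's actual switching logic is not public, so the `early_season` flag here is a stand-in for whatever date rule the suite uses.

```python
# Illustrative only: a marginal environment flips from "flagged" to
# "not flagged" when the CAPE threshold rises from 500 to 1000 J/kg,
# which is the output shift described above.

def meets_cape_threshold(cape: float, early_season: bool) -> bool:
    threshold = 500.0 if early_season else 1000.0  # J/kg, per the post
    return cape >= threshold

cape_value = 750.0  # J/kg, hypothetical marginal environment
print(meets_cape_threshold(cape_value, early_season=True))   # True
print(meets_cape_threshold(cape_value, early_season=False))  # False
```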

Live link for the latest run of the CFS dashboard:
Did a research project using the CFS dashboard back in 2013 and arrived at similar conclusions. There's really no point in using the CFS more than about 10-15 days out. Model climatology overwhelms signal beyond that point.
I thought about doing some objective verification on this a couple weeks ago. Unfortunately, I don't think you can reconstruct the chiclets based on what's stored in the CFS archive. You'd have to have saved them in real-time. So that idea was scuttled. Unless someone knows of an archive of CFS runs that contains SCP or the necessary variables to compute SCP.

@Jeff Duda, what were the methods for your project?
Tim, it was part of Harold's FEDA (verification) course. We used the day-1 forecast as the verifying analysis and looked at things like tendencies for hot/cold runs, consistency in forecasting, noise in the signal, etc. I think we also looked at the sensitivity to the SCP threshold. It was certainly not very rigorous. As far as I know, no publications came out of it.
Was recently asked about the utility the CFS dashboard offers, so I'm glad to see more detailed analysis. Overall, my impressions seem to match the general take-home you guys are coming to... it has occasional usefulness in the long range, but varies a fair bit. Long-term patterns in the results often seem key to focus on.

And as with SPC forecasts, for chasers the results may not always line up well with your interests. Large squall lines may show up brightly on the CFS, whereas an isolated great chase target day may not. Size isn't the ultimate indicator that a great chase day lies ahead (though it can certainly help). When planning chase vacation options, it's probably helpful input, though you often need to inspect the days more carefully. And I'd suggest the X can be as important as the color at all time periods in terms of chasing.

Overall glad we have it, it's a really neat visualization tool.
I believe the CFS is most useful if one looks at all the products. I start with the weekly charts for the upper-level pattern and temperature departures. The Dashboard is nice for a rough take on surface features. When they diverge, I favor the weekly charts over the Dashboard. Why? The weekly charts have a better chance at the broad pattern than the Dashboard does at surface details.