Forecasting Accuracy 2024 - Good or Bad?

For sure I would still chase if the batting average went up to 1.000. I would chase more. Busts are a serious drain on resources, which I did not appreciate until the economy began to go south in 2021-2022. In 2021 the idea of a two-day panhandle chase with an overnight stay was, while non-trivial, definitely doable. Not so much in 2022...and afterwards. Lots more same-day "there and back again" excursions. That's why I spend so much time trying to figure out on which days to chase, and when things go wrong, why they went wrong. To optimize the consumption of resources.

Interesting. We probably have very different views on that. For sure I sometimes find myself frustrated and wondering if chasing is worth the time and money. But ultimately I have come to appreciate that the failures and “.300 average” are what make the successes so very satisfying. And I just enjoy the process. I suspect you do, also, more than you are letting on. To go as deep as you go in trying to learn, you’d have to find intrinsic satisfaction in it. Bottom line is you are still finding obvious enjoyment in the challenge of improving, and if it were made easier (guaranteeing the 1.000 average) all that would go away. Interesting topic in its own right, probably better for a DM conversation as I didn’t mean to take us OT from the original 2024 forecast accuracy thread!
 
I'm not sure I am qualified to comment on forecast accuracy so far this year, but it seems to me that the forecasts have not been all that bad.

However, I do temper any perception of accuracy with my understanding that the SPC forecasts "severe weather events" and I am more interested in the subset of "severe weather events" that are "chasable storms".

For instance, here is how the 1630Z Day One outlook "verified":

1713374586463.png
I mean--if you grade based on "point-in-polygon" and weight by the categorical outlook, this forecast probably gets a very high score. Hard for me to argue this was bad. Chasable storms, though? I am waiting to see the reports come in.
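For what it's worth, here is a rough sketch of what a "point-in-polygon" grade could look like: each report is assigned to the highest categorical polygon that contains it, and the reports are tallied per category. The polygon coordinates, the category ranking dictionary, and the function names are all made up for illustration. This is not how the SPC actually verifies outlooks, and how to weight the tallies is left open.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (x, y), e.g. (lon, lat)

# Assumed ordering of the categorical risks, lowest to highest.
CATEGORY_RANK = {"MRGL": 1, "SLGT": 2, "ENH": 3, "MDT": 4, "HIGH": 5}


def point_in_polygon(pt: Point, poly: List[Point]) -> bool:
    """Ray-casting test: count edge crossings of a ray going right from pt."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can cross.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def highest_category(pt: Point, outlook: Dict[str, List[Point]]) -> Optional[str]:
    """Return the highest-ranked category whose polygon contains pt, or None."""
    best = None
    for cat, poly in outlook.items():
        if point_in_polygon(pt, poly):
            if best is None or CATEGORY_RANK[cat] > CATEGORY_RANK[best]:
                best = cat
    return best


def tally_reports(reports: List[Point], outlook: Dict[str, List[Point]]) -> Dict[str, int]:
    """Count reports per highest containing category; 'NONE' means outside all."""
    counts: Dict[str, int] = {}
    for pt in reports:
        cat = highest_category(pt, outlook) or "NONE"
        counts[cat] = counts.get(cat, 0) + 1
    return counts


# Hypothetical outlook: a SLGT square nested inside a MRGL square.
outlook = {
    "MRGL": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "SLGT": [(2, 2), (8, 2), (8, 8), (2, 8)],
}
reports = [(5, 5), (1, 1), (20, 20)]
tally = tally_reports(reports, outlook)
```

A grade like this only says where reports fell relative to the polygons. It says nothing about whether the storms were chasable.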
 
I might respectfully disagree as to the accuracy of the outlooks for the 15th.

The hatched hail forecast (right) was awful. There was only 1 report of ≥ 2" hail in spite of thousands of square miles of hatched areas. Few large hail events occurred in the red (30%) area.

As to tornadoes, there was one EF-2 "strong" tornado but that was at the edge of the 5% area (Greenwood County, KS). No strong tornadoes in the hatched area.

I'm not saying I did any better w/r/t tornadoes (except that I had fewer square miles of false alarm; see: [4:30pm Update] Central U.S. Tornado and Severe Thunderstorm Forecast).

Given the number and intensity of tornadoes that occurred yesterday morning (most after 7am, so they were after the period of the forecast shown), I think we dodged a bullet with the upper low being 6-8 hours later than forecast. Had that 125 kt jet stream been able to work on the 3000+ J/kg of CAPE, things might have been really dangerous.
 

Attachments

  • Screenshot 2024-04-17 at 1.32.49 PM.png
  • Screenshot 2024-04-17 at 1.33.01 PM.png
Specifically with regard to chasing and only chasing, I actually hope forecasts, the related tools, and the observation network don’t get much better! The uncertainty is part of the appeal. I’m not a poker player, but I imagine it’s a similar inclination and personality type that enjoys weighing the blend of knowns, unknowns, probabilities, etc. and decision making in an environment of uncertainties. I don’t think chasing would be as satisfying if it was 100% successful all the time, instead of a more baseball-like .300 average. It’s the challenge that makes chasing fulfilling. Thinking of the recent eclipse, suppose the exact day, minute and path of an EF5 tornado were known in advance, and you could just show up to watch it (along with hundreds of others who had to do little more than look up the path and schedule online) - would you even want to chase anymore???
Definitely agree with this. While it is certainly frustrating driving for literally days on end spending precious time and money only to come up empty handed, it makes the successes that much sweeter.

To me, the very act/process of chasing is fun and enough to get me out there. There is just something inherently exciting about forecasting and planning with so many uncertainties. I love road trips, ghost towns, camping, photography, history, exploration/spontaneity, and severe weather, all of which are typically available in great abundance on any given chase. Not knowing where you will be or what you will see is intoxicating to me, because it's one of the few things you can actively pursue with true adventure every time. I love that cowboy feeling every chase day of making decisions on the fly which can make or break your whole day, and constantly learning from each chase to get better every time. To me, all of those factors are enough that I still have fun each time I go out, and enjoy the overall experience. Yes, catching "the big one" for that day is the ultimate goal, but it doesn't mean the trip sucked if you didn't capture it. At that point it was ultimately a road trip, and those are always a great time in my book.

My brother/chase partner takes bust days a lot harder than I do. He comes out so desperately wanting and hoping to catch a photogenic tornado, and gets really bummed when it doesn't happen. I like to remind him that it's called storm chasing and not tornado catching, but he still gets pretty bummed out.

We can all sit here and bash the SPC (and probably all have at some point), but ultimately they are working with the same data we are, and we can't really blame them when a forecast doesn't exactly verify. They are an easy scapegoat to us, just like the local weather girl is to the general public, but they are just that, scapegoats. Without more detailed, frequent, and granular observation data for the models and for us in the field, we all can only work within these broad uncertainties and have to accept the realities of that.

Gotta do it for the love of the game, not the love of victory.
 
I contemplated this post as an Event or Pseudo-Event for 4/23/2024. As much as we have recently seen events with huge categorical risk polygons thinly populated with storm reports, I'd like to see some analysis of an event where it seems like the SPC did a surprisingly good job: what did they see in the conditions and model forecasts that prompted the tiny SLGT risk polygon in NW TX on 4/23/2024? Here is the graphical display of performance; no hail or tornado events--just wind:


1713982321154.png


Sometimes I wonder to what extent categorical risk polygons are set by the convex hulls* (or concave hulls) of the CAM model outputs, as in the following image:

Model Convex Hull.jpg
Composite image of all Reflectivity Ensemble Paintball plots for 4/23/2024 19Z to 4/24/2024 06Z from the SPC HREF Ensemble Viewer. The model runs reflected in the ensemble paintballs were initialized on 4/23/2024 00Z.

I'm not suggesting that such a simplistic approach (convex hull of CAMs) was taken here, or anywhere for that matter. Just wanted to point out the "convex hull" observation on the way into the discussion.

What I'm really interested in is: what made such a focused forecast successful in this case?


* The convex hull is the smallest convex set enclosing the data of interest, in this case the smallest convex polygon that encloses the paintball ensembles.
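For anyone who wants to play with the idea, here is a minimal sketch of computing a convex hull from a set of points using Andrew's monotone chain algorithm. The sample points are invented stand-ins for paintball locations, and nothing here implies the SPC draws polygons this way.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    # Build the lower hull left to right, dropping non-left turns.
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    # Build the upper hull right to left the same way.
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each half is the first point of the other half.
    return lower[:-1] + upper[:-1]


# Hypothetical "paintball" points: four corners plus two interior points.
paintballs = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2)]
hull = convex_hull(paintballs)
```

The two interior points drop out and only the four corners remain, which is the sense in which the hull "encloses" everything with the smallest convex polygon.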
 
There are so many ways I think you could attack this topic of accuracy/verification, and what I write here is not meant to solve anything. It is always an interesting topic, and I have been part of these discussions before in my own right.
1714138182575.png

- spatial coverage given the size of the polygon? i.e. if a marginal area covers 25,000 sq miles and you get 1 report, is that a bust or a hit? Does it depend on who you ask?
- does defining the risk trump any accuracy? (how much should accuracy matter in terms of defining risk to the public?)
- how do they define ISOLD / SCT / Numerous / Widespread? (coverage area in sq miles, or resultant number of storm reports inside the polygon?)
- are storm reports part of the SPC verification process of the previous model forecasts?
- say you cover the entire plains in Enhanced and you get 6 tornado reports: 2 EF0, 3 EF2, 1 EF4. Anyone located in the path of the EF4 might be thankful for the Enhanced area. Take a step back and people might say that was a bust for such a large area of coverage that only received 6 confirmed reports.
- how much does the public really rely on SPC outlooks vs. local NWS announcements, watches, or warnings? (I really don't know.) I assume that maybe some study was done to ask this kind of question? I would assume services/business/EOC/aviation/chasers may rely on SPC more than the general public. Total guess, but it feels accurate/logical in my head anyway, lol.

- How many severe storm reports on lower end forecasts does it take to bust a lower end risk?
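To put a rough number on the "25,000 sq miles and 1 report" question, here is a back-of-the-envelope sketch. It leans on the fact that SPC probabilities are defined as the chance of severe weather within 25 miles of a point, and it computes (a) reports per 10,000 sq miles and (b) an upper bound on the fraction of the polygon that lies within 25 miles of at least one report. The function names are made up, and the bound ignores overlapping circles and circles spilling outside the polygon, so treat it as an illustration only.

```python
import math

SPC_RADIUS_MILES = 25.0  # SPC probabilities refer to "within 25 miles of a point"


def report_density(n_reports: int, area_sq_mi: float) -> float:
    """Reports per 10,000 square miles of outlook area."""
    return n_reports / area_sq_mi * 10_000


def max_covered_fraction(n_reports: int, area_sq_mi: float) -> float:
    """Upper bound on the fraction of the area within 25 miles of a report.

    Assumes no overlap between report circles and that every circle lies
    fully inside the polygon, so the true value can only be smaller.
    """
    circle_area = math.pi * SPC_RADIUS_MILES ** 2
    return min(1.0, n_reports * circle_area / area_sq_mi)


# The example from the list above: 1 report in a 25,000 sq mile marginal area.
density = report_density(1, 25_000)
covered = max_covered_fraction(1, 25_000)
print(f"{density:.2f} reports per 10,000 sq mi, at most {covered:.1%} of area covered")
```

By this crude measure, a single report "covers" at most about 8% of a 25,000 sq mile polygon, which is not obviously out of line with the low probabilities behind a marginal risk. Whether that counts as a hit still depends on who you ask.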

I could honestly keep going and ask more questions, but at the end of the day, I know the SPC is the best at characterizing the risk for the day. Do busts happen? Sure. Are there meteorological reasons that delay or cap convection that the models missed? Absolutely. As seasoned chasers, especially ones with meteorological backgrounds, it's our job to see through the baseline and investigate the micro to place ourselves in the location with the best potential.

And we ALL bust at one time or another.
 
Hopefully there is help on the way...TorNet:

Open source
 
I've already seen people calling it a bust and questioning the SPC outlook (no doubt assuming the risk means sig tors and not spatial coverage), so I thought I'd do a crude overlay to see how the current reports compare to the final 1943Z outlook (and yeah, I know it's not exactly lined up, but I'm at work and can't mess about too much!).

I don't think it's too far off to be honest.

SPC 6 May.jpg
 
I guess some of that will come down to how many sig tors there were yesterday. I would say 1 for sure, perhaps 2? But I stopped watching after the Barnsdall storm, so I don't know if anything formed after that. If it ends up being just one report of a sig tor, is that considered enough to call it a hit? I mean, their reasoning for upgrading to High was valid. But I wonder, when they updated it last evening to trim the area, should they have just gotten rid of the High and left MDT? But here I am, hindsighting lol.

One thing they don't do is discriminate with SLIGHT, MDT, or HIGH on whether it's tornadic or not. Their categories are irrespective of tornadoes: just severe, and longevity. So based on those conditions, they probably hit what they forecast.
 
I haven't seen a good, purely objective categorization of what constitutes a "bust" per se. For the high risk, it may come down to how the wind damage reports are quantified (see below). If we don't see "numerous intense and long-tracked tornadoes" once all of the cards are on the table, then the specific, tornado-driven high risk upgrade wouldn't verify. I have no idea how widespread and intense the wind damage is in eastern OK though. I wouldn't personally call the event a "bust," since there were a plethora of severe weather reports, although it's possible that the final report verifications will show that the event only met a lower categorical threshold for verification, e.g., an enhanced or moderate. I think it's important to message that this doesn't mean a complete bust even though a specific severe weather outlook categorization threshold may not have been met, although I understand the social science optics with the general public are far more complex.

  • 5-HIGH (magenta) - High risk - An area where a severe weather outbreak is expected from either numerous intense and long-tracked tornadoes or a long-lived derecho-producing thunderstorm complex that produces hurricane-force wind gusts and widespread damage. This risk is reserved for when high confidence exists in widespread coverage of severe weather with embedded instances of extreme severe (i.e., violent tornadoes or very damaging convective wind events).
 
I don't think it was a complete bust at all. But when it comes to whether someone has to calculate a HIT/MISS/FAR, it would be interesting to see how they determine it, separate from the social aspect of course. As you said, that's too complex, and I would agree with you. Better there to just take a poll before and after storm outbreaks, maybe with "did you feel you were adequately warned?" type questions. lol
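On the HIT/MISS/FAR side, the standard textbook scores from a forecast contingency table are easy to compute once someone decides what counts as a hit, a miss, and a false alarm. That definition is exactly the open question here, so the counts below are invented, and this is a generic sketch rather than the SPC's actual verification procedure.

```python
import math
from typing import Dict


def contingency_scores(hits: int, misses: int, false_alarms: int) -> Dict[str, float]:
    """Standard scores from a 2x2 forecast contingency table.

    hits:         event forecast and observed
    misses:       event observed but not forecast
    false_alarms: event forecast but not observed
    """
    def ratio(num: int, den: int) -> float:
        # Undefined scores (zero denominator) are returned as NaN.
        return num / den if den else math.nan

    return {
        # Probability of detection: fraction of observed events that were forecast.
        "POD": ratio(hits, hits + misses),
        # False alarm ratio: fraction of forecast events that did not occur.
        "FAR": ratio(false_alarms, hits + false_alarms),
        # Critical success index: hits relative to everything forecast or observed.
        "CSI": ratio(hits, hits + misses + false_alarms),
    }


# Hypothetical counts, e.g. grid boxes inside/outside an outlook polygon.
scores = contingency_scores(hits=6, misses=2, false_alarms=4)
print(scores)
```

The hard part is not the arithmetic but deciding what the "events" are (grid boxes, counties, individual reports) before counting anything.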
 