Accuracy of watch probabilities

I'm all for looking at the chance (ha) of using probabilities - but are these verified in any way that can be publicly seen?
 
Mike,

I assume you realize the watch probabilities are simply experimental at this time? Or, are we to expect someone to make decisions based on a product that is unfamiliar?

You don't like probabilities (they're "superfluous"), and that's your apparent preference. However, your preference does not change the fact that uncertainty is inherent in weather forecasts, and probabilities are the direct language of uncertainty. Why don't you pose your same question w.r.t. PoPs?

On a related note, how would you propose that forecasters decide on categorical thresholds? Probabilities provide a direct and verifiable means of assessing uncertainty. As you approach perfection, probability values will trend toward zero and 100%. Are you against the expression of uncertainty?

Rich,

Greg contends people want more probabilities for more types of products and says
As well as this study that shows that some people want probabilistic information:

Communicating Uncertainty in Weather Forecasts: A Survey of the U.S. Public.

Greg has made several other contentions along these lines in favor of probabilistic tornado warnings in previous threads. So, if the demand truly exists for more probabilities, StormTrack readers (meteorologists, EMs, spotters, chasers, etc.) would be the ones that would be aware of how they are used, if indeed they are. Since I just posted my "challenge" yesterday evening, we may indeed learn the watch probabilities are being used. I suspect they are not. The fact they are "experimental" doesn't matter.

I don't think, when it comes to tornado watches, we need "categorical thresholds." Stay with the regular and PDS watches.

Especially in these economic times, resources are not infinite. I believe the NWS should put its resources where they will do the most good for the greatest number of people. Improving accuracy is important. I suspect my challenge will indicate people are not using probabilities, which would indicate they are of either secondary or little importance.

The current watch/warning system works remarkably well. We have cut the tornado death rate by 97% from its peak to the rate averaged over the last three years (2006-09). Yes, it can be improved with better accuracy. But, remaking the system runs the risk of screwing up something that works very, very well.
 
I'm not sure probability information would be useful for public forecasts. Probability is one of those concepts that, at first blush, seems fairly easy to understand. I think most middle-schoolers understand that 60% means a 6 out of 10 chance. But, to really understand probability, I think you have to have a decent grounding in concepts like standard deviation, normal distributions, confidence factors, etc. Even many college graduates are never exposed to these concepts, let alone the general public.

Sure, we have POPs, and the public has been used to using those forecasts for decades. The reality is, every person generalizes what these POPs mean to them, based on their own experience. For example, my wife comments that "it never rains here when they say it's a 40% chance or less." That's her working definition, and the demarcation line of whether she takes action to prepare for precipitation or not. When it comes to severe storms and tornadoes, though, the potential consequences of applying such internalized rules of thumb could be personally catastrophic.

I think using probabilities internally at the SPC, etc. is just fine, as it is an objective means of expressing the forecast uncertainty. But, I would shudder if tornado watches began being communicated to the public in terms of probabilities.

I agree with Mike Smith. If there is some demonstrable, practical use for emergency managers, etc., then let's cull it out, study it, and institute best practices. Otherwise, I just hope the SPC is cautious and thoughtful before introducing probability forecasts in its public products.
 
Greg has made several other contentions along these lines in favor of probabilistic tornado warnings in previous threads. So, if the demand truly exists for more probabilities, StormTrack readers (meteorologists, EMs, spotters, chasers, etc.) would be the ones that would be aware of how they are used, if indeed they are. Since I just posted my "challenge" yesterday evening, we may indeed learn the watch probabilities are being used. I suspect they are not. The fact they are "experimental" doesn't matter.
Actually, we were originally discussing winter weather probabilities in that other thread.

We've talked with some emergency managers at various workshops who have wanted more information about the range of possibilities for weather hazards, including best-case and worst-case scenarios. Those are just different ways of framing uncertainty in forecasts. But you can't extract that information from a deterministic forecast.

I don't think, when it comes to tornado watches, we need "categorical thresholds." Stay with the regular and PDS watches.
Aren't categorical thresholds just a different expression of uncertainty? Then the question becomes, "what do all these *words* mean?" How much worse is a SLGT versus a MDT versus a HIGH risk, and how much worse is that compared to the normal conditions we expect around here on any given storm day? What should we expect people to do given those different thresholds?

Especially in these economic times, resources are not infinite. I believe the NWS should put its resources where they will do the most good for the greatest number of people. Improving accuracy is important. I suspect my challenge will indicate people are not using probabilities, which would indicate they are of either secondary or little importance.
And there are others (not necessarily me) that argue that we could be spending less money on new technology and instead finding better ways to communicate hazard information.
 
People will always say they want more of whatever.

The fact a NRC or NAS report says "more" is a "dog bites man" story -- when do they ever issue a report that says more science/technology/research isn't needed?

So, I have challenged us to find some entity who is actually using the probabilistic product to make concrete decisions. So far, nothing (but it is early).
 
Mike,

The fact that the watch probabilities are experimental is relevant - they haven't been around that long and most users have to hunt around on a web page to find them! I challenge you to find an experimental product of any kind that was in widespread use and driving decisions before it became "official".

I don't understand the link between the hard economic times and the probabilities. There haven't been any new forecasters hired to produce these numbers - all we're doing is expressing the uncertainty that's inherently built into the categorical products. It doesn't cost us much of anything, other than some working hours to set up the system. The payoff for the forecasters is the ability to quantify our forecasts and their errors, and identify areas of strength and weakness. The payoff to the customers is steadily improving forecast accuracy as a result of more robust verification and forecaster feedback. I would think that a private company would jump at the opportunity to decode these numbers and provide their own value-added products to their customers, but maybe I'm wrong.

The watch probabilities won't ruin the watches any more than the outlook probabilities ruined the outlooks. We'll still provide categorical (yes/no) type products that are based on various probability thresholds.
 
So, I have challenged us to find some entity who is actually using the probabilistic product to make concrete decisions. So far, nothing (but it is early).
I'm fairly certain there are some specific protocols and decisions being made by the State of Alabama regarding school closings that are based on variable threat levels for severe weather (DY1 Outlooks and Tornado Watches), but I don't have the details. Rich T. might want to ask around SPC about this, because I think they are more familiar with what is going on there.

But as for other weather hazards like winter weather, you've already answered that challenge. I'm still waiting to read your answers that I posted on the other thread (here - second paragraph; here - my last comment) regarding this.

BTW - I admittedly misread your post...

Mike Smith said:
I don't think, when it comes to tornado watches, we need "categorical thresholds." Stay with the regular and PDS watches.

...and responded with info about SPC Convective Outlook categorical thresholds. Now that I re-read this, I should point out that "regular" and "PDS" are categories of watch products and are used to express two different levels of certainty. And (not to dig up a dead horse), if you favor enhanced wording in watch products, why not in warning products?

Rich Thompson said:
I would think that a private company would jump at the opportunity to decode these numbers and provide their own value-added products to their customers, but maybe I'm wrong.
I know of at least one representative of a well-known private weather company that has expressed great interest in NWS weather hazard uncertainty information in order to create new value-added products. Mike, care to take a guess?

Rich Thompson said:
The watch probabilities won't ruin the watches any more than the outlook probabilities ruined the outlooks. We'll still provide categorical (yes/no) type products that are based on various probability thresholds.
Exactly. In my probabilistic hazard information presentations, I've clearly stated that the end users do not necessarily have to see any of the probability numbers in their products. They can all be aggregated to simpler and simpler formats to address different levels of user sophistication and requirements. The aggregation can come in the form of color-coded threat levels, or categorical verbiage, or simple yes/no decisions made at thresholds specific to the situation (i.e., more tornado warnings at longer lead times for highly vulnerable populations like rural manufactured housing residents.)
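As a rough sketch of the aggregation idea described above: one underlying probability can be collapsed into a categorical label, or into a yes/no decision at a threshold chosen for a particular user. The function names, labels, and cutoff values below are purely hypothetical and are not taken from any NWS or SPC product.

```python
# Hypothetical cutoffs, highest first. Illustrative only, not SPC criteria.
DEFAULT_LEVELS = ((0.60, "HIGH"), (0.30, "ELEVATED"), (0.10, "LOW"))


def threat_level(prob, levels=DEFAULT_LEVELS):
    """Collapse a probability (0 to 1) into a categorical label."""
    if not 0.0 <= prob <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    for cutoff, label in levels:
        if prob >= cutoff:
            return label
    return "NONE"


def should_act(prob, user_threshold):
    """Yes/no decision at a threshold specific to the user's situation."""
    return prob >= user_threshold


prob = 0.30  # the same forecast probability for every user
print(threat_level(prob))      # categorical wording for the general user
print(should_act(prob, 0.20))  # a highly vulnerable user acts early
print(should_act(prob, 0.50))  # a less vulnerable user waits
```

The point of the sketch is that the number never has to be shown: the same 30% produces a single label for one audience and different yes/no answers for users with different tolerances for risk.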

I wonder if we should merge these two threads?
 
Mike,

The fact that the watch probabilities are experimental is relevant - they haven't been around that long and most users have to hunt around on a web page to find them! I challenge you to find an experimental product of any kind that was in widespread use and driving decisions before it became "official".

The relevance is that, if the alleged demand exists, people would have started using them.

I see people using the 24-hour observed precipitation (experimental, www.hpc.ncep.noaa.gov/qpf/obsmaps/obsprecip.php) and the combined radar and warning product (experimental, http://radar.srh.noaa.gov/ ). So, yes, products can go into wide use before they are "official." I don't see a similar rush to adopt the watch probabilities.

I don't understand the link between the hard economic times and the probabilities. There haven't been any new forecasters hired to produce these numbers - all we're doing is expressing the uncertainty that's inherently built into the categorical products. It doesn't cost us much of anything, other than some working hours to set up the system...

But, is that the optimum use of those "working hours"? That is the link.

I would think that a private company would jump at the opportunity to decode these numbers and provide their own value-added products to their customers, but maybe I'm wrong.
I know of at least one representative of a well-known private weather company that has expressed great interest in NWS weather hazard uncertainty information in order to create new value-added products. Mike, care to take a guess?

I'm sure you are alluding to AccuWeather. So? I work for a progressive company that wants to know what is going on in the field. That doesn't mean that, if asked, it would be our number one priority for NWS R&D dollars. Note: I am expressing my opinion here, not AW's or WDSI's.

But as for other weather hazards like winter weather, you've already answered that challenge. I'm still waiting to read your answers that I posted on the other thread (here - second paragraph; here - my last comment) regarding this.

Greg, I posted links to several social science studies that indicate a lack of understanding of PoPs. Did you miss it?

With regard to how clients are able to understand/use them, it is because we work intensely with them to understand them and adapt them into their business processes. They are motivated to understand; the public is not.

Mike
 
The SPC is not burning through R&D money and all waking hours to implement the watch probabilities - this stuff comes with pretty low overhead.

I realize that most people just want to know if they're going to be hit by a tornado or hailstorm, but we can't reliably tell them that in a deterministic (yes/no) sense right now. Hence, we use probabilities to express the uncertainty in our forecasts. The same thing can be accomplished with words, just not as cleanly.

Another misconception is that SPC products go straight to the public. Do your non-meteorologist family/friends frequent the SPC web site? The vast majority of SPC information reaches the public through various forms of the media. The TV/radio meteorologists translate products into terms they (the mets) believe will be more readily digestible for the public. I'm guessing that the direct "market" for SPC watch products is in the thousands or tens of thousands. However, those folks (meteorologists, EMs, etc.) are more weather savvy and occasionally want more information than an unqualified "yes/no" forecast. Instead of simply knowing there's a tornado watch in effect, you can compare the probabilities to other watches you've seen and make a more informed judgement regarding particular courses of action.

As an SPC forecaster, I can't really make decisions for you because I don't know your needs/circumstances by the minute. Instead, my goal is to help you make decisions based on sound meteorological information. That includes the uncertainty in the forecast, which can be expressed in words or numbers. You choose the level of complexity you need in your forecast information, or you let someone else choose for you.
 
Instead of simply knowing there's a tornado watch in effect, you can compare the probabilities to other watches you've seen and make a more informed judgement regarding particular courses of action.

To get to that level - we need consistent and/or verifiable numbers though... A big issue popped up in the southeast this fall with a string of late morning / early afternoon TOR watches with VERY high F2+ probabilities and yet no tornadoes of any strength occurred. In the late afternoon a new forecaster was on shift, the odds decreased, but tornadoes actually occurred.

When I asked, it was explained that the first forecaster was really worried about a morning tornado event so the numbers were raised. Does that add value? How do you compare numbers when other forecasters (most?) looked at that event and gave it low odds for morning tornadoes?

That's the part that gets confusing. Since the odds seem pretty much "this is how I feel," whereas at least PoPs have some form of science behind them, I'm not sure they are ready for prime-time.

Again maybe the outliers are the only ones that catch our attention, but unless the verification numbers start showing up it's hard to go on anything else.

- Rob
 
Instead of simply knowing there's a tornado watch in effect, you can compare the probabilities to other watches you've seen and make a more informed judgement regarding particular courses of action.

So, what do I do differently if there is a 50% chance of 1 or more F2 or greater tornadoes versus a 30% chance?

A hypothetical: It's a Friday evening. The Wichita TV guys have been saying all day that the chances of major severe weather that evening are pretty high. So, as a Kansas emergency manager, do I keep people on duty (on overtime, at the start of a weekend) when the watch comes in and says there is a 30% chance of 1 or more F2 or greater tornadoes (2007's WW227 [Greensburg] watch probabilities)? Had I been an EM that evening, I might have been inclined not to hold people on overtime on a Friday evening given what seems like a low number, especially when compared to what the TV guys were saying on the early newscasts (the watch was issued as the 6pm weathercasts were on the air).

I still believe that 1) the demand for these products is not anywhere near as high as you and Greg believe, and 2) they have the potential to do more harm than good. It is the latter point that concerns me the most.
 
To get to that level - we need consistent and/or verifiable numbers though... A big issue popped up in the southeast this fall with a string of late morning / early afternoon TOR watches with VERY high F2+ probabilities and yet no tornadoes of any strength occurred. In the late afternoon a new forecaster was on shift, the odds decreased, but tornadoes actually occurred.

When I asked, it was explained that the first forecaster was really worried about a morning tornado event so the numbers were raised. Does that add value? How do you compare numbers when other forecasters (most?) looked at that event and gave it low odds for morning tornadoes?

That's the part that gets confusing. Since the odds seem pretty much "this is how I feel," whereas at least PoPs have some form of science behind them, I'm not sure they are ready for prime-time.

Again maybe the outliers are the only ones that catch our attention, but unless the verification numbers start showing up it's hard to go on anything else.

- Rob

Rob,

I need to know the specific date you're writing about for me to comment on the trends in the watch numbers for that day.

I've dealt with the watch probabilities for a few years, and the tornado numbers are reliable. That means when we say 40%, it happens pretty close to 40% of the time. We also display skill - higher probabilities correspond to a greater rate of occurrence of tornadoes, and lower probabilities with a lesser rate of occurrence. The wind damage probabilities were less reliable 1-2 years ago (haven't seen anything from 2009 yet), with a tendency for the values to be over-estimated by 10-20%.
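For anyone who wants to see what "reliable" means in practice, here is a minimal sketch: group past forecasts by the probability that was stated, then compare each group with how often the event actually occurred. The function name `reliability_table` and all of the numbers are made up for illustration; this is not SPC verification data, nor necessarily how SPC computes its verification.

```python
from collections import defaultdict


def reliability_table(forecasts, outcomes):
    """Return {stated_prob: (n_forecasts, observed_frequency)}."""
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    bins = defaultdict(list)
    for prob, happened in zip(forecasts, outcomes):
        bins[prob].append(1 if happened else 0)
    return {
        prob: (len(hits), sum(hits) / len(hits))
        for prob, hits in sorted(bins.items())
    }


# Hypothetical sample: 10 watches at each of three stated probabilities.
forecasts = [0.1] * 10 + [0.4] * 10 + [0.7] * 10
# 1 = event occurred in the watch, 0 = it did not (invented outcomes).
outcomes = [1] + [0] * 9 + [1] * 4 + [0] * 6 + [1] * 7 + [0] * 3

for prob, (n, freq) in reliability_table(forecasts, outcomes).items():
    print(f"stated {prob:.0%}: {n} watches, event observed {freq:.0%} of the time")
```

A reliable set of forecasts has observed frequencies close to the stated values in every group. A consistent gap in one direction, such as the 10-20% over-estimate mentioned for wind damage, would show up as observed frequencies sitting below the stated probabilities.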
 
Another misconception is that SPC products go straight to the public. Do your non-meteorologist family/friends frequent the SPC web site? The vast majority of SPC information reaches the public through various forms of the media. The TV/radio meteorologists translate products into terms they (the mets) believe will be more readily digestible for the public. I'm guessing that the direct "market" for SPC watch products is in the thousands or tens of thousands. However, those folks (meteorologists, EMs, etc.) are more weather savvy and occasionally want more information than an unqualified "yes/no" forecast.

Like many other ST members, I've worked as the guy who is on the fringe between EMD, Skywarn, ARES/RACES, etc....I think Rich touches on multiple levels, all of which can be impacted by these probabilities in one way or another whether they know it or not.

Indeed, SPC outlooks are rarely seen by Joe Q. Public, are they not needed? I cannot tell you the number of times I've mentioned to JQP a PDS Watch is in effect..."What's the difference?"....why have two different designations? Tornado Emergency vs Tornado Warning...why?

BECAUSE there are those who are savvy, those who work to protect the public and/or disseminate information they can understand, those who NEED more than a "yes/no" forecast to do THEIR job. You can ask over and over for a magical threshold at which every EMD, Skywarn, FD/PD, or EMS organizations execute a different level of staging, response, or recovery to justify the product.....good luck. Ask them, however, if it influences their level of awareness or focus on a potential event.......I think you'll find a very favorable reason why the probabilities can be important.

I'm certainly not suggesting more hurdles, more of an ignorant question really....For these experimental products, are programs such as WAS*IS (or an EMD equivalent program) included in discussions regarding what products are chosen?
 
So, what do I do differently if there is a 50% chance of 1 or more F2 or greater tornadoes versus a 30% chance?

A hypothetical: It's a Friday evening. The Wichita TV guys have been saying all day that the chances of major severe weather that evening are pretty high. So, as a Kansas emergency manager, do I keep people on duty (on overtime, at the start of a weekend) when the watch comes in and says there is a 30% chance of 1 or more F2 or greater tornadoes (2007's WW227 [Greensburg] watch probabilities)? Had I been an EM that evening, I might have been inclined not to hold people on overtime on a Friday evening given what seems like a low number, especially when compared to what the TV guys were saying on the early newscasts (the watch was issued as the 6pm weathercasts were on the air).

I still believe that 1) the demand for these products is not anywhere near as high as you and Greg believe, and 2) they have the potential to do more harm than good. It is the latter point that concerns me the most.

The most important line in your response was "had I been an EM that evening", which you weren't. What about WFO forecasters or private meteorologists? I respect that *you* don't see any practical difference between 30% and 50%, but what about 20% versus 60%, etc.? We'll eventually find a difference in values that matters to you, and that's the beauty of the probabilities. Otherwise, you're left with the only distinction being a few hundred "regular" tornado watches compared to a handful of PDS watches each year.

Your hypothetical example is also a bit unusual - the EM will somehow know about the tornado probs in the watch, yet he/she got all of their outlook information from another source? If this hypothetical EM also sees the tornado outlook probabilities, what would he/she have thought of the 15% and 10% SIG probabilities (boy, those seem awfully low)? The reality is that people adjust to the range of values they're used to seeing. 15% tornado probs are low in an absolute sense, but they seem to get plenty of attention on this discussion board because chasers quickly figure out those are not normal numbers. You can't turn around and say that they got excited solely because it was a MDT risk - you can get a MDT for hail/wind, and those outlooks don't generate quite the same buzz amongst the weather enthusiasts.

I may have no hope of convincing you that probabilities are useful, but that doesn't matter. You're free to put all of your emphasis on the public watch products as they have existed for 40+ years (and will continue into the foreseeable future). The probs are nothing more than additional information that you can choose to consider or ignore.
 
Like many other ST members, I've worked as the guy who is on the fringe between EMD, Skywarn, ARES/RACES, etc....I think Rich touches on multiple levels, all of which can be impacted by these probabilities in one way or another whether they know it or not.

Indeed, SPC outlooks are rarely seen by Joe Q. Public, are they not needed? I cannot tell you the number of times I've mentioned to JQP a PDS Watch is in effect..."What's the difference?"....why have two different designations? Tornado Emergency vs Tornado Warning...why?

BECAUSE there are those who are savvy, those who work to protect the public and/or disseminate information they can understand, those who NEED more than a "yes/no" forecast to do THEIR job. You can ask over and over for a magical threshold at which every EMD, Skywarn, FD/PD, or EMS organizations execute a different level of staging, response, or recovery to justify the product.....good luck. Ask them, however, if it influences their level of awareness or focus on a potential event.......I think you'll find a very favorable reason why the probabilities can be important.

I'm certainly not suggesting more hurdles, more of an ignorant question really....For these experimental products, are programs such as WAS*IS (or an EMD equivalent program) included in discussions regarding what products are chosen?

Tim,

Your response is exactly why we provide the probability info - it might help you make better decisions! Everyone faces different circumstances, thus it seems wise to have products that contain sufficient information to serve a variety of needs.

Sadly, we're just starting to get some interaction with some of the social scientists (WAS*IS, etc.). It's likely we can come up with better product designs and ways of conveying information, but the watch probabilities have been driven largely by the meteorologists and the watch criteria.
 