Monitor everything.
Every storm in the country — tracked. Radar scans land every two minutes across the national network; each storm cell is matched, tagged, and followed from birth to dissipation.
Purpose-built from scratch — 83 engineered features, continental scale, 2-minute cadence. Trained and validated against 24 months of live data and 8.8 million tracked storms.
Every storm in the country is watched and scored live. The moment a hail signature emerges, the storm is segmented to the pixel while monitoring continues uninterrupted.
Anvil assigns a calibrated hail probability to every tracked storm, refreshed every scan. Physics-trained and continuously validated against millions of verified ground-truth reports.
The moment a storm crosses threshold it forks into deep analysis while monitoring continues uninterrupted. The storm volume is sliced into a pixel grid and every cell scored independently — probability, size, severity, trajectory — resolved to the meter. 47 columns per pixel, refreshed every scan.
Per-address precision. Push, SMS, email, and webhook — dispatched the second a pixel lights up near you.
Every pixel saved forever. Queryable history for any location — events, damage paths, climatology.
Real-time feeds, historical exports, and webhook events delivered to the systems your business runs on.
Hail outlooks, property-specific threat scores, and long-range risk — informed by pixel-level history.
Continental-scale multi-sensor fusion refreshed every two minutes — the raw observational surface the model is trained and scored on.
Track-level verification on the same 24-month window. One row per metric. Trophy marks where Anvil wins outright.
| Metric | Anvil Hail Sentinel | ProbSevere v3 (NOAA / CIMSS) | MESH (NOAA physics baseline) | Verdict |
|---|---|---|---|---|
| Real hail caught | 15,104 of 28,070 | 9,262 (33%) | 12,195 (43%) | 1.6× more than ProbSevere |
| Storms the model even scores | 99% | 34% | 100% | ProbSevere refuses to score 66% |
| Giant hail caught (≥ 2″) | 82% | 49% | 67% | 82 of every 100 caught |
| Hail only Anvil saw | 6,082 storms | — | — | 47% of ProbSevere's blind spots |
Fire rate in three regimes where hail is physically impossible — no CAPE, no growth zone, no cold-cloud ice. A fire here is a false positive regardless of whether anyone filed a report.
Fire rate per million tracks, measured only in environments where hail is meteorologically impossible. No growth zone, no updraft, no instability. A model firing here is firing on nothing — no underreporting caveat applies.
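As a minimal sketch of how a fire rate per million tracks can be computed, the snippet below normalises a raw fire count by the number of tracks observed in a regime. The function name, regime labels, and all counts are hypothetical placeholders, not Anvil's published figures.

```python
def fire_rate_per_million(fires: int, tracks: int) -> float:
    """Fires per one million tracks within a single regime."""
    if tracks <= 0:
        raise ValueError("tracks must be positive")
    # Multiply before dividing so round counts stay exact.
    return fires * 1_000_000 / tracks


# Hypothetical (fires, tracks) counts per hail-impossible regime.
regimes = {
    "no_instability": (12, 3_000_000),
    "no_growth_zone": (5, 2_000_000),
    "no_cold_cloud_ice": (0, 1_500_000),
}

rates = {name: fire_rate_per_million(f, t) for name, (f, t) in regimes.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} fires per million tracks")
```

Because every track in these regimes is a true negative by construction, the rate can be read directly as a false-positive rate.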
Report databases miss rural, overnight, and mountain hail. Grading recall against the radar's own polarimetric signature instead of filed reports shows how wide the reporting gap really is.
The three buckets where legacy models fail — MESH’s winter graupel confusion, the marginal bucket where most claims originate, and giant hail where a miss is a loss event. Anvil leads in all three.
Wrong fires per real catch, Nov–Mar. Where MESH confuses winter graupel for hail.
Sub-severe hail that still damages vinyl siding, vehicle paint, and roofing granules.
Baseball+ hail. Total-loss roof events. Recall is everything — miss one and the claim is a surprise.
Each bucket shows two honest metrics side by side. POD (the bar) rewards catching real hail; CSI (the chip) penalises false alarms too — so a detector that fires on every pixel can’t game it. MESH edges Anvil on raw POD for marginal hail because it fires indiscriminately, but its CSI lags in every bucket. Anvil wins the composite metric at every size.
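The difference between the two metrics can be sketched with the standard contingency-table definitions: POD = hits / (hits + misses) and CSI = hits / (hits + misses + false alarms). The class name and the counts below are hypothetical and chosen only to show how an indiscriminate detector can win on POD while losing on CSI; they are not measured results for any model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    hits: int          # hail occurred and the detector fired
    misses: int        # hail occurred and the detector stayed silent
    false_alarms: int  # detector fired and no hail occurred

    def pod(self) -> float:
        """Probability of detection: ignores false alarms entirely."""
        return self.hits / (self.hits + self.misses)

    def csi(self) -> float:
        """Critical success index: false alarms enlarge the denominator."""
        return self.hits / (self.hits + self.misses + self.false_alarms)


# Hypothetical detectors scored on the same 100 hail events.
selective = Contingency(hits=80, misses=20, false_alarms=20)
indiscriminate = Contingency(hits=90, misses=10, false_alarms=200)

for name, c in (("selective", selective), ("indiscriminate", indiscriminate)):
    print(f"{name}: POD={c.pod():.2f} CSI={c.csi():.2f}")
```

In this toy case the indiscriminate detector has the higher POD (0.90 against 0.80) but the lower CSI (0.30 against 0.67), which is why the two metrics are shown together.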
On tracks where both models fire in advance, which one makes the call sooner? Apples-to-apples, with no coverage bias.
For every storm track that produced a verified hail report, we measured when each model first crossed the alerting threshold and compared it to the moment hail reached the ground. Comparing only on tracks where both models warned in advance removes the selection bias from per-model averages: ProbSevere v3's raw lead average is inflated because it mostly fires on long-lived, organised storms.
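A minimal sketch of the paired comparison described above, assuming lead time is stored per track as minutes between the first threshold crossing and hail reaching the ground, with positive values meaning the warning came first. The function name, track IDs, and lead values are hypothetical illustrations, not measured results.

```python
from statistics import mean


def paired_lead(leads_a: dict[str, float], leads_b: dict[str, float]):
    """Mean lead time of each model over tracks where BOTH warned in advance.

    Returns (mean_a, mean_b, number_of_common_tracks).
    """
    common = [
        track for track in leads_a
        if track in leads_b and leads_a[track] > 0 and leads_b[track] > 0
    ]
    if not common:
        raise ValueError("no tracks where both models warned in advance")
    mean_a = mean(leads_a[t] for t in common)
    mean_b = mean(leads_b[t] for t in common)
    return mean_a, mean_b, len(common)


# Hypothetical lead times in minutes, keyed by track ID.
model_a = {"t1": 20.0, "t2": 12.0, "t3": 8.0, "t4": 15.0}
model_b = {"t1": 18.0, "t2": 10.0, "t3": -2.0, "t5": 30.0}

# Raw per-model averages over each model's own advance warnings.
raw_a = mean(v for v in model_a.values() if v > 0)
raw_b = mean(v for v in model_b.values() if v > 0)

paired = paired_lead(model_a, model_b)
print(f"raw:    a={raw_a:.2f} b={raw_b:.2f}")
print(f"paired: a={paired[0]:.2f} b={paired[1]:.2f} on {paired[2]} tracks")
```

In this toy data the raw averages favour model B, because its one long lead comes from a track model A never scored, while the paired view restricted to shared tracks favours model A. That reversal is the selection bias the paired comparison is meant to remove.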
83 engineered features across six domains — more than three times the 25-feature surface of ProbSevere v3. Each domain captures a different physical signature of a hail-producing storm.