How It Works 3 STAGES · CONCURRENT · SUB-2 MIN

From continent to pixel, concurrently.

Every storm in the country is watched and scored live. The moment a hail signature emerges, the storm is segmented to the pixel while monitoring continues uninterrupted.

STAGE 01

Monitor everything.

Every storm in the country — tracked. Radar scans land every two minutes across the national network; each storm cell is matched, tagged, and followed from birth to dissipation.

continent-wide · 2-min cadence · always-on
[Graphic: MONITOR · continuous coverage, with example storm labels "hail 12%", "hail 41%", "hail 86%"]
STAGE 02

Score every storm, live.

Anvil assigns a calibrated hail probability to every tracked storm, refreshed every scan. Physics-trained and continuously validated against millions of verified ground-truth reports.

physics-trained · 83 features · calibrated
STAGE 03

Segment. Analyze every pixel.

The moment a storm crosses threshold it forks into deep analysis while monitoring continues uninterrupted. The storm volume is sliced into a pixel grid and every cell scored independently — probability, size, severity, trajectory — resolved to the meter. 47 columns per pixel, refreshed every scan.

per-pixel probability · per-pixel size · per-pixel trajectory
[Graphic: storm object · 48 pixels · 47 cols each]
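The fork described in Stage 03 can be pictured as a monitoring loop that spawns a separate analysis task the first time a storm crosses threshold, then carries on with the next scan. The sketch below is only an illustration of that concurrency pattern using Python's asyncio. The threshold value, the function names, and the toy scan data are all assumptions; the page does not describe how Anvil actually implements this.

```python
import asyncio

THRESHOLD = 0.5  # hypothetical alerting threshold; the page does not state one


async def segment_storm(storm_id: str, results: dict) -> None:
    """Stand-in for the per-pixel deep analysis of one storm."""
    await asyncio.sleep(0)  # yield control so the monitor loop keeps running
    results[storm_id] = "segmented"


async def monitor(scans: list) -> dict:
    """Walk through scans and fork one analysis task per storm that crosses THRESHOLD."""
    results = {}
    tasks = {}
    for scan in scans:
        for storm_id, prob in scan.items():
            if prob >= THRESHOLD and storm_id not in tasks:
                # Fork: analysis runs as its own task, monitoring is not blocked.
                tasks[storm_id] = asyncio.create_task(segment_storm(storm_id, results))
        await asyncio.sleep(0)  # stands in for waiting on the next radar scan
    await asyncio.gather(*tasks.values())
    return results


# Toy data: each dict is one scan mapping storm id to hail probability.
scans = [{"A": 0.12, "B": 0.41}, {"A": 0.41, "B": 0.86}, {"A": 0.45, "B": 0.90}]
analyzed = asyncio.run(monitor(scans))
```

In this toy run only storm "B" crosses the assumed threshold, so it is the only one handed to the analysis task, while storm "A" stays in monitoring.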
DELIVERED

Hyperlocal Alerts

Per-address precision. Push, SMS, email, and webhook — dispatched the second a pixel lights up near you.

Permanent Archive

Every pixel saved forever. Queryable history for any location — events, damage paths, climatology.

Structured Data

Real-time feeds, historical exports, and webhook events delivered to the systems your business runs on.

Forward-Looking Risk

Hail outlooks, property-specific threat scores, and long-range risk — informed by pixel-level history.

Observing Network 5 SOURCES · CONTINENTAL · 2-MIN CADENCE

What Anvil sees.

Continental-scale multi-sensor fusion refreshed every two minutes — the raw observational surface the model is trained and scored on.

Radar: NEXRAD · MRMS (national network · 1 km mosaic)
Numerical models: HRRR (80+ variables · 3 km grid)
Satellite · lightning: GOES · GLM (storm-top · electrical)
Surface obs: ASOS · mesonet (ground-level context)
Ground truth: LSR · SPC · CoCoRaHS (continuous training loop)
Benchmark QA VALIDATED · 28,070 TRACKS · 8.8M STORMS

Anvil leads on every test that matters.

Head-to-head against NOAA/CIMSS ProbSevere v3 and NOAA's MESH radar baseline. 24 months of live data, 8.8 million tracked storms, 28,070 verified hail tracks.

1.6× · More hail caught · VS PROBSEVERE V3
87% · Verified when we fire · MULTI-EVIDENCE PRECISION
4.3× · Cleaner in winter · MESH FP-PER-CATCH VS ANVIL
82% · Giant hail detected · REPORTS ≥ 50 MM
Head-to-head ANVIL · PROBSEVERE · MESH

The receipts.

Track-level verification on the same 24-month window. One row per metric. Trophy marks where Anvil wins outright.

| Metric | Anvil (Hail Sentinel) | ProbSevere v3 (NOAA / CIMSS) | MESH (NOAA physics baseline) | Verdict |
|---|---|---|---|---|
| Real hail caught | 15,104 of 28,070 | 9,262 (33%) | 12,195 (43%) | 1.6× more than ProbSevere |
| Storms the model even scores | 99% | 34% | 100% | ProbSevere refuses to score 66% |
| Giant hail caught (≥ 2″) | 82% | 49% | 67% | 82 of every 100 caught |
| Hail only Anvil saw | 6,082 storms | | | 47% of ProbSevere's blind spots |
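The headline figures in the "Real hail caught" row follow from the track counts by simple division. The short check below reproduces them. The variable and function names are illustrative, and the Anvil percentage it derives is not printed on the page; it is computed here from the stated counts.

```python
TOTAL_TRACKS = 28_070  # verified hail tracks in the 24-month window

# Verified hail tracks caught by each model, as stated in the comparison.
caught = {"Anvil": 15_104, "ProbSevere v3": 9_262, "MESH": 12_195}


def catch_rate(hits: int, total: int = TOTAL_TRACKS) -> float:
    """Fraction of verified hail tracks the model caught."""
    return hits / total


# Whole-percent catch rates per model.
rates_pct = {model: round(100 * catch_rate(hits)) for model, hits in caught.items()}

# How many times more verified hail Anvil caught than ProbSevere v3.
ratio_vs_probsevere = round(caught["Anvil"] / caught["ProbSevere v3"], 1)
```

The ProbSevere and MESH percentages match the 33% and 43% shown, and the ratio matches the 1.6× headline.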
Physically impossible CLEAN FALSE-POSITIVE TEST · 3 REGIMES

The false-positive test that can’t be gamed.

Fire rate in three regimes where hail is physically impossible — no CAPE, no growth zone, no cold-cloud ice. A fire here is a false positive regardless of whether anyone filed a report.

Fire rate per million tracks, measured only in environments where hail is meteorologically impossible. No growth zone, no updraft, no instability. A model firing here is firing on nothing — no underreporting caveat applies.

No instability
MUCAPE < 100 J/kg · weak mid-level lapse
1,083,382 tracks in regime
Anvil: 171 /M
ProbSevere v3: 193 /M
MESH: 3,686 /M
MESH fires 21.6× more often than Anvil here

Tropical warm rain
Freezing level > 4.5 km · rain-dominant
1,192,799 tracks in regime
Anvil: 286 /M
ProbSevere v3: 1,076 /M
MESH: 3,633 /M
MESH fires 12.7× more often than Anvil here

Winter graupel
Surface < 5°C · WBZ < 2 km · no CAPE
71,692 tracks in regime
Anvil: 70 /M
ProbSevere v3: 0 /M
MESH: 6,193 /M
MESH fires 88.8× more often than Anvil here
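A fire rate per million tracks and the MESH-to-Anvil ratios above are straightforward to recompute. The sketch below does so from the rounded per-million rates shown for each regime; the function and variable names are illustrative. The raw fire counts are not published on the page, so the winter-graupel ratio comes out slightly different from the 88.8× shown, which was presumably computed from unrounded rates.

```python
def fires_per_million(fires: int, tracks: int) -> float:
    """Normalise a raw fire count to fires per one million tracks."""
    return fires / tracks * 1_000_000


# Rounded per-million fire rates as displayed for each regime.
regimes = {
    "No instability": {"Anvil": 171, "ProbSevere v3": 193, "MESH": 3686},
    "Tropical warm rain": {"Anvil": 286, "ProbSevere v3": 1076, "MESH": 3633},
    "Winter graupel": {"Anvil": 70, "ProbSevere v3": 0, "MESH": 6193},
}

# How many times more often MESH fires than Anvil in each regime.
mesh_vs_anvil = {
    name: round(rates["MESH"] / rates["Anvil"], 1) for name, rates in regimes.items()
}
```

From the rounded rates this gives 21.6×, 12.7×, and 88.5× for the three regimes.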
The underreporting gap RADAR SAW IT · NO ONE REPORTED IT

Hail nobody files.

Report databases miss rural, overnight, and mountain hail. Grading recall against the radar's own polarimetric signature instead of filed reports shows how wide the reporting gap really is.

96% of the 135,826 storm tracks that showed the textbook polarimetric hail signature never had a report filed.
That is the gap every raw “false alarm” number ignores. Rural hail, overnight hail, hail that melted before anyone drove out to see it — none of it makes the verification record. When Anvil fires on a storm and no one filed a report, the odds it was real hail are still enormous.
28,070
Filed reports over 24 months
LSR + SPC + CoCoRaHS, deduped, severe only (≥ 19 mm). The radar saw 4.8 times as many tracks with a textbook hail signature (135,826) as there were reported hail tracks.
Lower bound: 20% · strict report match · treats every unreported hailstorm as a false fire
Upper bound: 87% · any independent evidence · match, ensemble, sustained MESH, or radar signature
| Model | Fires | Strict (report within 15 km · 10 min) | Expanded (report within 50 km · 60 min) | Corroborated (any independent evidence) |
|---|---|---|---|---|
| Anvil (Hail Sentinel) | 77,469 | 20% | 68% | 87% |
| ProbSevere v3 (NOAA / CIMSS) | 60,605 | 15% | 43% | 78% |
| MESH (NOAA radar baseline) | 97,709 | 12% | 34% | 76% |
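The strict and expanded rules differ only in how far, in space and time, a filed report may sit from a fire and still count as verification. A minimal sketch of such a matching rule is shown below. It treats each fire and report as a single point and uses great-circle distance, which is a simplification; the `Event` class, function names, and sample coordinates are hypothetical and not taken from Anvil.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass
class Event:
    lat: float
    lon: float
    time: datetime


def haversine_km(a: Event, b: Event) -> float:
    """Great-circle distance between two events in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a.lat, a.lon, b.lat, b.lon))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def is_verified(fire: Event, reports: list, max_km: float, max_minutes: int) -> bool:
    """True if any report falls inside both the distance and the time window."""
    window = timedelta(minutes=max_minutes)
    return any(
        haversine_km(fire, r) <= max_km and abs(fire.time - r.time) <= window
        for r in reports
    )


STRICT = {"max_km": 15.0, "max_minutes": 10}    # report within 15 km, 10 min
EXPANDED = {"max_km": 50.0, "max_minutes": 60}  # report within 50 km, 60 min

# Hypothetical fire and one report about 22 km north, filed 30 minutes later.
fire = Event(35.0, -97.0, datetime(2024, 5, 1, 21, 0))
reports = [Event(35.2, -97.0, datetime(2024, 5, 1, 21, 30))]
strict_hit = is_verified(fire, reports, **STRICT)
expanded_hit = is_verified(fire, reports, **EXPANDED)
```

The sample report misses the strict window on both distance and time but falls inside the expanded one, which is the kind of case that separates the two columns.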
What backs an Anvil fire
Share of Anvil's 77,469 firing tracks supported by each evidence type. Categories overlap — one fire can have multiple sources of backing.
20% · Strict report match · 15 km / ±10 min
68% · Wide neighborhood · 50 km / ±60 min
50% · Ensemble agreement · ≥ 2 of 3 models fired
13% · Sustained MESH · ≥ 5 scans ≥ 25.4 mm
25% · Polarimetric hail sig · Z↑ ZDR↓ RhoHV↓
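Because the evidence categories overlap, the per-category shares do not add up to the corroborated share; a fire counts once toward "any independent evidence" no matter how many sources back it. The toy example below illustrates that bookkeeping. The four fires and their evidence sets are invented for illustration and do not come from Anvil's data.

```python
EVIDENCE_TYPES = (
    "strict_report", "wide_report", "ensemble", "sustained_mesh", "polarimetric",
)

# Hypothetical fires, each with the set of evidence types that back it.
toy_fires = [
    {"strict_report", "wide_report", "ensemble"},
    {"wide_report"},
    {"polarimetric", "sustained_mesh"},
    set(),  # a fire with no backing at all
]


def share_with(kind: str, fires: list) -> float:
    """Fraction of fires backed by one specific evidence type."""
    return sum(kind in f for f in fires) / len(fires)


def corroborated_share(fires: list) -> float:
    """Fraction of fires backed by at least one evidence type (the union)."""
    return sum(bool(f) for f in fires) / len(fires)


per_type = {kind: share_with(kind, toy_fires) for kind in EVIDENCE_TYPES}
any_evidence = corroborated_share(toy_fires)
```

Here the per-type shares sum to 1.5 while the corroborated share is 0.75, the same pattern as the overlapping percentages above.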
The scenarios that break legacy systems COLD · MARGINAL · GIANT

Winter, small hail, catastrophic hail.

The three buckets where legacy models fail — MESH’s winter graupel confusion, the marginal bucket where most claims originate, and giant hail where a miss is a loss event. Anvil leads in all three.

Winter false alarms
Cold-season cost

Wrong fires per real catch, Nov–Mar. This is where MESH mistakes winter graupel for hail.

Anvil: 3.8× (lowest)
ProbSevere v3: 5.8×
MESH: 16.3×
Anvil fires 4.3× cleaner than MESH in winter
Marginal hail
19–25 mm catch rate

Sub-severe hail that still damages vinyl siding, vehicle paint, and roofing granules.

Anvil: 30% (highest)
ProbSevere v3: 20%
MESH: 24%
Anvil catches 6 pts more marginal hail than next best
Giant hail
≥ 50 mm catch rate

Baseball+ hail. Total-loss roof events. Recall is everything — miss one and the claim is a surprise.

Anvil: 82% (highest)
ProbSevere v3: 49%
MESH: 67%
82 of every 100 giant events detected
Marginal reports · 19–25 mm · “marble to quarter” · 2,421 truth events
Anvil: POD 30.0% · CSI 0.300
ProbSevere: POD 19.7% · CSI 0.197
MESH: POD 23.8% · CSI 0.238
Anvil’s discrimination quality is +26% vs the next-best model on this bucket

Severe reports · 25–50 mm · “quarter to golf ball” · 22,070 truth events
Anvil: POD 51.8% · CSI 0.518
ProbSevere: POD 31.8% · CSI 0.318
MESH: POD 41.7% · CSI 0.417
Anvil’s discrimination quality is +24% vs the next-best model on this bucket

Giant reports · ≥ 50 mm · “baseball+” · 3,579 truth events
Anvil: POD 82.2% · CSI 0.822
ProbSevere: POD 49.2% · CSI 0.492
MESH: POD 67.3% · CSI 0.673
Anvil’s discrimination quality is +22% vs the next-best model on this bucket

Each bucket shows two metrics side by side. POD (the bar) rewards catching real hail; CSI (the chip) also penalises false alarms, so a detector that fires on every pixel can’t game it. On the figures shown, Anvil leads on both POD and CSI in every size bucket.
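For reference, POD and CSI have standard contingency-table definitions: POD = hits / (hits + misses), and CSI = hits / (hits + misses + false alarms). The sketch below applies them to two invented detectors to show why firing on everything inflates POD but not CSI, then recomputes the "+26% / +24% / +22%" gains as relative CSI differences against the next-best model. The detector counts are hypothetical; only the CSI values in the last step are taken from the buckets above.

```python
def pod(hits: int, misses: int) -> float:
    """Probability of detection: share of real events that were caught."""
    return hits / (hits + misses)


def csi(hits: int, misses: int, false_alarms: int) -> float:
    """Critical success index: hits over hits, misses, and false alarms."""
    return hits / (hits + misses + false_alarms)


def relative_gain_pct(best: float, next_best: float) -> int:
    """Relative improvement of one score over another, in whole percent."""
    return round((best / next_best - 1) * 100)


# Hypothetical detector that fires on everything: perfect POD, poor CSI.
fire_always = {"pod": pod(100, 0), "csi": csi(100, 0, 900)}

# Hypothetical selective detector: lower POD, much better CSI.
selective = {"pod": pod(80, 20), "csi": csi(80, 20, 20)}

# CSI of Anvil vs the next-best model (MESH) in each size bucket.
gains = {
    "marginal": relative_gain_pct(0.300, 0.238),
    "severe": relative_gain_pct(0.518, 0.417),
    "giant": relative_gain_pct(0.822, 0.673),
}
```

The fire-always detector scores a POD of 1.0 but a CSI of only 0.1, while the relative CSI gains come out at 26%, 24%, and 22%.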

Lead time ANVIL FIRES FIRST · SHARED INTERSECTION

Anvil fires first.

On tracks where both models fire in advance, which one makes the call sooner? An apples-to-apples comparison with no coverage bias.

Anvil vs ProbSevere v3
83% of tracks, Anvil fires first or at the same scan
Anvil fires first: 31%
Tied on same scan: 51%
PSv3 fires first: 18%
Across 3,634 storm tracks where both warned in advance, Anvil is at least as fast on 83% of them, with a +0.8 min average time advantage.
Anvil vs MESH
83% of tracks, Anvil fires first or at the same scan
Anvil fires first: 21%
Tied on same scan: 62%
MESH fires first: 17%
Across 4,700 storm tracks where both warned in advance — Anvil matches MESH's radar-speed detection on 83% of tracks, while firing ~50% fewer false alarms in the same window (see the false-alarm cost panel above).
Lead time on the shared set — 2,981 tracks where all three models warned in advance
Anvil: 15.7 min avg
ProbSevere: 15.2 min avg
MESH: 14.8 min avg
7,247 tracks Anvil warned in advance
4,818 tracks PSv3 warned in advance
6,451 tracks MESH warned in advance
28,070 total tracks producing hail

For every storm track that produced a verified hail report, we measured when each model first crossed the alerting threshold and compared it to the moment hail reached the ground. Comparing only on tracks where both models warned in advance removes the selection bias from per-model averages — PSv3's raw lead average is inflated by the fact that it mostly fires on long-lived, organised storms.
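The shared-intersection comparison described above can be sketched as follows: keep only tracks where both models crossed threshold before hail reached the ground, then count who fired first and average the difference. The function name, the dictionary layout (track id to minutes since an arbitrary start), and the toy timings are assumptions made for illustration.

```python
from statistics import mean


def head_to_head(first_a: dict, first_b: dict, hail_time: dict) -> dict:
    """Compare first-fire times on tracks where both models warned in advance."""
    shared = [
        t for t in hail_time
        if t in first_a and t in first_b
        and first_a[t] < hail_time[t] and first_b[t] < hail_time[t]
    ]
    if not shared:
        raise ValueError("no tracks where both models warned in advance")
    n = len(shared)
    return {
        "tracks": n,
        "a_first": sum(first_a[t] < first_b[t] for t in shared) / n,
        "tied": sum(first_a[t] == first_b[t] for t in shared) / n,
        "b_first": sum(first_a[t] > first_b[t] for t in shared) / n,
        # Positive means model A fired earlier on average, in minutes.
        "a_advantage_min": mean(first_b[t] - first_a[t] for t in shared),
    }


# Toy timings in minutes. T4: model A fires after hail. T5: model B never fires.
hail_time = {"T1": 30, "T2": 40, "T3": 20, "T4": 50, "T5": 60, "T6": 60}
model_a = {"T1": 10, "T2": 30, "T3": 18, "T4": 55, "T5": 40, "T6": 40}
model_b = {"T1": 14, "T2": 30, "T3": 16, "T4": 40, "T6": 44}

result = head_to_head(model_a, model_b, hail_time)
```

Tracks T4 and T5 drop out of the comparison, which is the point of the shared set: neither model is credited or penalised for storms the other did not warn on.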

Feature Architecture 83 FEATURES · 6 DOMAINS · ~3× PROBSEVERE V3

How Anvil sees.

83 engineered features across six domains — more than three times the 25-feature surface of ProbSevere v3. Each domain captures a different physical signature of a hail-producing storm.

Feature domains
I — VI · 83 TOTAL
I · Radar foundations · raw observables · polarimetric + reflectivity + motion
II · Microphysical discrimination · hail vs rain vs graupel · learned signal combinations
III · 3D convective architecture · updraft structure · vertical extent + organization
IV · Environmental context · HRRR state · instability · shear · moisture
V · Spatiotemporal tracking · motion + lifecycle · intensification derivatives
VI · Ground-truth learning · closed feedback loop · LSR · SPC · CoCoRaHS · damage
Research paper in preparation — 2026