The Poisson Model Explained | PropBetEdge Learn

Why We Built Our Own Model

Every other sports prop platform works the same way: they scrape DraftKings, store the odds, and show them to you with a prettier interface. You're not getting an edge — you're getting the book's number with a markup.

We built a different kind of tool. PropBetEdge runs a Poisson probability model on owned Statcast data and generates our own fair-value odds. The model doesn't know what DraftKings thinks. It only knows what the data says.

When our model says Aaron Judge has a 30% chance of hitting a HR today and DraftKings is offering +310 (implying 24.4%), that gap is the edge.

⚡

The core claim: When our Poisson model probability is higher than the odds-implied probability, you have a positive expected value bet. We surface that disagreement. You decide whether to act.
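The arithmetic behind that claim fits in a few lines. The Python sketch below converts American odds to an implied probability and computes expected value per $1 staked. The function names are illustrative, not part of any PropBetEdge API, and the implied probability is the raw break-even number with the book's vig left in.

```python
def implied_prob(american_odds: int) -> float:
    """Break-even win probability implied by American odds (vig not removed)."""
    if american_odds > 0:
        return 100 / (american_odds + 100)
    return -american_odds / (-american_odds + 100)


def expected_value(model_prob: float, american_odds: int) -> float:
    """Expected profit per $1 staked, assuming model_prob is the true win probability."""
    if american_odds > 0:
        payout = american_odds / 100      # profit on a $1 winning bet
    else:
        payout = 100 / -american_odds
    return model_prob * payout - (1 - model_prob)


print(round(implied_prob(310), 4))          # 0.2439
print(round(expected_value(0.30, 310), 4))  # 0.23
```

At the book's own implied probability the expected value is zero. Any model probability above it gives a positive number, but only to the extent the model's probability is accurate.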

What Is a Poisson Distribution?

The Poisson distribution is a mathematical model for predicting the probability of a given number of discrete events occurring in a fixed interval, given a known average rate.

In simpler terms: if we know a player averages 0.35 home runs per game, the Poisson distribution tells us the probability of him hitting exactly 0, 1, 2, or more HRs in any single game.

// Poisson probability formula
P(k events) = (λ^k × e^(-λ)) / k!

// Where:
λ = expected rate (e.g. 0.35 HRs per game)
k = number of events we're testing (0, 1, 2...)
e = Euler's number (2.71828...)

// Example: P(Judge hits exactly 1 HR)
λ = 0.35, k = 1
P(1) = (0.35^1 × e^(-0.35)) / 1!
P(1) = (0.35 × 0.7047) / 1
P(1) = 0.2466 = 24.7%

// P(Judge hits 1 or more HRs) = 1 - P(0)
P(0) = e^(-0.35) = 0.7047
P(≥1) = 1 - 0.7047 = 0.2953 = 29.5%

That 29.5% is our model's probability. The book offers +310 (24.4%). That's a +EV bet — our model sees more probability than the market is pricing.
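The same numbers can be reproduced with Python's standard library. This is a minimal sketch of the textbook Poisson formula, not the production model code, and the function names are chosen here for illustration.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(exactly k events) for a Poisson distribution with rate lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def prob_at_least_one(lam: float) -> float:
    """P(1 or more events) = 1 - P(0)."""
    return 1 - math.exp(-lam)


lam = 0.35  # expected HRs per game, from the example above
print(round(poisson_pmf(0, lam), 4))        # 0.7047
print(round(poisson_pmf(1, lam), 4))        # 0.2466
print(round(poisson_pmf(2, lam), 4))        # 0.0432
print(round(prob_at_least_one(lam), 4))     # 0.2953
```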

The Data Pipeline

Poisson is only as good as the λ (lambda) input — the expected rate. This is where most models fail. They use simple counting stats (career HR rate, last 30 days) that don't capture the real underlying skill.

We use owned Statcast data — 287 active MLB batters and 200+ pitchers scraped nightly from Baseball Savant into our own Supabase database. No third-party data bill. We own the pipeline.

287 MLB Batters · 34 Columns/Batter · 200+ Pitchers · 6× Daily Model Runs

Batter Inputs (34 columns)

Metric	What It Measures	Weight in Model
xSLG	Expected slugging % based on contact quality	High — primary HR predictor
Barrel %	% of batted balls with an ideal combination of exit velocity and launch angle	High — best contact quality metric
Exit Velocity	Average speed off the bat (mph)	High — raw power signal
Max EV	Peak exit velocity in sample	Medium — ceiling indicator
xBA	Expected batting average by contact quality	Medium — hits model
xwOBA	Expected weighted on-base average	Medium — total offensive value
Hard Hit %	% of balls hit 95+ mph	Medium — consistency metric
L7 Splits	Stats over last 7 days	High — recent form
vs RHP/LHP	Platoon splits by pitcher handedness	High — matchup adjustment
HR Streak	Recent consecutive games with HRs	Low — hot hand signal
EV Spike	L7 EV vs season avg — positive = trending hot	Medium — form signal

Pitcher Inputs

Metric	What It Measures	Used For
EV Allowed	Average exit velocity against the pitcher	HR model — how hard is he getting hit?
Barrel Rate Allowed	% of balls barreled against	HR/power model — giving up barrels?
SwStr%	Swinging strike rate (whiff rate)	K model — primary strikeout predictor
CSW%	Called strikes + whiffs / total pitches	K model — overall stuff quality
Avg Velocity	Average fastball velocity	Stuff quality signal
Velo Drop	Recent vs season avg velocity	Fatigue/form signal
HR Rate	HRs per 9 innings	HR model — park-adjusted rate
Pitch Mix	% fastball, breaking ball, offspeed	K model — repertoire complexity

Park Factor Adjustment

The same batter hitting at Coors Field (park factor 120) vs. Petco Park (park factor 82) has a very different expected HR rate. Every lambda calculation is adjusted for the day's park factor — including real-time weather (wind speed, wind direction, temperature) fetched at each model run.

// Lambda calculation (simplified)
λ_hr = base_hr_rate × xSLG_factor × barrel_factor × pitcher_ev_allowed_factor × park_factor × platoon_factor × weather_factor

// Then Poisson gives P(≥1 HR)
P(HR) = 1 - e^(-λ_hr)

// Convert to American odds
// If P = 0.295 (29.5%):
Fair odds = (1/P - 1) × 100 = +239
// If DK is offering +310, that's value
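As a rough Python sketch of the simplified calculation above: the factor values below are made up for illustration, and treating a park factor of 120 as a 1.20 multiplier is an assumption rather than something the model documentation states. The odds conversion also covers probabilities of 50% or more, which the simplified formula does not.

```python
import math


def hr_lambda(base_hr_rate: float, factors: dict[str, float]) -> float:
    """Multiply the base HR rate by each adjustment factor."""
    lam = base_hr_rate
    for value in factors.values():
        lam *= value
    return lam


def fair_american_odds(p: float) -> int:
    """Convert a win probability to no-vig American odds."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    if p < 0.5:
        return round((1 / p - 1) * 100)    # underdog price, e.g. +239
    return -round(p / (1 - p) * 100)       # favorite price, e.g. -150


# Hypothetical multipliers, chosen only to show the mechanics.
factors = {
    "xslg": 1.10,
    "barrel": 1.05,
    "pitcher_ev_allowed": 1.02,
    "park": 1.20,       # assumes park factor 120 maps to 1.20
    "platoon": 0.95,
    "weather": 1.00,
}
lam_hr = hr_lambda(0.25, factors)
p_hr = 1 - math.exp(-lam_hr)
print(round(lam_hr, 3), round(p_hr, 3), fair_american_odds(p_hr))

print(fair_american_odds(0.295))  # 239
```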

The K-Score

For strikeout props, we surface a simplified K-Score (0–100) that combines everything the model knows about a pitcher's K upside into one number.

K-Score inputs:
· SwStr% (swinging strike rate) — 30% weight
· CSW% (called + whiff rate) — 20% weight
· Stuff+ or velocity tier — 15% weight
· Opposing lineup K rate — 20% weight
· Umpire K tendency (above/below avg) — 10% weight
· Recent form (L5 starts K/9) — 5% weight

K-Score 80+ → Elite K prop candidate
K-Score 70–79 → Strong consideration
K-Score below 60 → Lean UNDER
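One way to read those weights is as a weighted average of sub-scores. The Python sketch below assumes each input has already been normalized to a 0–100 sub-score. How PropBetEdge actually normalizes the raw metrics is not described here, so the key names and sample values are hypothetical.

```python
K_SCORE_WEIGHTS = {
    "swstr": 0.30,        # swinging strike rate
    "csw": 0.20,          # called strikes + whiffs
    "stuff": 0.15,        # Stuff+ or velocity tier
    "opp_k_rate": 0.20,   # opposing lineup K rate
    "umpire": 0.10,       # umpire K tendency
    "recent_form": 0.05,  # last 5 starts K/9
}


def k_score(sub_scores: dict[str, float]) -> float:
    """Weighted sum of 0-100 sub-scores (the normalization is an assumption)."""
    missing = K_SCORE_WEIGHTS.keys() - sub_scores.keys()
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    return sum(w * sub_scores[name] for name, w in K_SCORE_WEIGHTS.items())


def k_tier(score: float) -> str:
    """Map a K-Score to the tiers listed above."""
    if score >= 80:
        return "Elite K prop candidate"
    if score >= 70:
        return "Strong consideration"
    if score < 60:
        return "Lean UNDER"
    return "No tier stated in the source"  # 60-69 is not defined above


# Hypothetical sub-scores for one pitcher.
sample = {"swstr": 90, "csw": 85, "stuff": 70,
          "opp_k_rate": 80, "umpire": 60, "recent_form": 75}
score = k_score(sample)
print(round(score, 2), k_tier(score))
```

Because the weights sum to 100%, a pitcher who scores 80 on every input lands at exactly 80 overall.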

When the Model Runs

The Poisson model runs as a Cloudflare Worker cron job 6 times daily:

8AM: Morning Open
10AM: Pre-lineup
12PM: Midday
2PM: Post-lineup
5PM: Pre-game
8PM: Evening

Each run fetches updated lineups, weather, and any late pitcher changes — then reruns every lambda calculation and publishes fresh odds to the /mlb/odds/model API endpoint.

Model Odds vs. Book Odds

PropBetEdge shows both our model odds and DraftKings/FanDuel book odds side by side. When they diverge significantly, that's the signal.

🎯

Reading the edge:
Model odds: +280 (our fair value)
Book odds: +340 (DK is offering more than fair value)
→ Book is mispriced. This is a +EV bet.

Model odds: +280 (our fair value)
Book odds: +190 (DK is offering less than fair value)
→ Book has the edge. Skip this one.
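The two cases above can be checked by converting both prices to probabilities and taking the difference. This Python sketch is illustrative only: the function names are assumptions, and the book's implied probability still includes its vig.

```python
def implied_prob(american_odds: int) -> float:
    """Win probability implied by American odds."""
    if american_odds > 0:
        return 100 / (american_odds + 100)
    return -american_odds / (-american_odds + 100)


def compare(model_odds: int, book_odds: int) -> tuple[float, str]:
    """Return (edge in probability points, verdict) for one play."""
    edge = (implied_prob(model_odds) - implied_prob(book_odds)) * 100
    verdict = "+EV" if edge > 0 else "skip"
    return edge, verdict


for model, book in [(280, 340), (280, 190)]:
    edge, verdict = compare(model, book)
    print(f"model +{model} vs book +{book}: {edge:+.1f} pts, {verdict}")
# model +280 vs book +340: +3.6 pts, +EV
# model +280 vs book +190: -8.2 pts, skip
```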

What the Model Doesn't Know

Honest disclosure — every model has blind spots:

Injury information not yet public — if a player is dealing with a nagging injury that isn't in the lineup data, the model doesn't adjust for it
In-game decisions — pitcher pull decisions, bullpen usage, intentional walks
Small sample noise — early season Statcast data has high variance until 100+ PA
Extreme weather events — heavy rain delay impacts beyond our weather API data

The model is a probability engine, not a crystal ball. It surfaces edge over large samples. Individual game variance is real and expected.

Using the Model in Practice

1. Check /mlb/odds/model/top
This endpoint returns the day's top OVER value plays ranked by expected stat. Start here for the highest-confidence model picks.
2. Compare to Book Odds
Look for plays where our model odds are shorter than the current DK/FD line (for example, model +280 against book +340). That gap is the stated edge.
3. Check the Inputs
For any pick, you can see the underlying Statcast data: barrel%, xSLG, EV. If the inputs make sense intuitively, the model call is stronger.
4. Size Flat
Don't bet more on a high-K-Score day because it "feels more certain." Flat betting (same amount per play) is the correct approach with a Poisson model. The edge compounds over volume, not individual bet size.
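To put rough numbers on "volume, not bet size", the Python sketch below computes the expected profit and standard deviation of n flat one-unit bets. It assumes every bet is independent and that the model probability is exactly right, both of which are simplifications, and it reuses the 29.5% at +310 example from earlier.

```python
import math


def flat_bet_outlook(p: float, american_odds: int, n_bets: int) -> tuple[float, float]:
    """Mean and standard deviation of total profit (in units) for n flat 1-unit bets."""
    payout = american_odds / 100 if american_odds > 0 else 100 / -american_odds
    mean_per_bet = p * payout - (1 - p)
    # One bet returns +payout or -1, so its spread is (payout + 1) * sqrt(p * (1 - p)).
    sd_per_bet = (payout + 1) * math.sqrt(p * (1 - p))
    return n_bets * mean_per_bet, math.sqrt(n_bets) * sd_per_bet


for n in (1, 100, 1000):
    mean, sd = flat_bet_outlook(0.295, 310, n)
    print(f"{n:>5} bets: expected {mean:8.2f} units, std dev {sd:7.2f} units")
```

With these inputs the expected profit is about 0.21 units per bet against a per-bet standard deviation of about 1.87 units. The mean grows with n while the spread grows only with the square root of n, so the expected profit does not exceed one standard deviation until roughly 80 bets.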

SEE IT IN ACTION

The model runs live at mlb.propbetedge.ai. Access is $29/mo.

Open the Live Model →
