Why We Built Our Own Model

Most sports prop platforms work the same way: they scrape DraftKings, store the odds, and show them to you with a prettier interface. You're not getting an edge; you're getting the book's number with a markup.

We built a different kind of tool. PropBetEdge runs a Poisson probability model on Statcast data we collect and store ourselves, and generates its own fair-value odds. The model doesn't know what DraftKings thinks. It only knows what the data says.

When our model says Aaron Judge has a 30% chance of hitting a HR today and DraftKings is offering +310 (implying 24.4%), that gap is the edge.

The core claim: when our Poisson model probability is higher than the odds-implied probability, the bet has positive expected value, provided the model's probability is accurate. We surface that disagreement. You decide whether to act.

What Is a Poisson Distribution?

The Poisson distribution is a mathematical model for predicting the probability of a given number of discrete events occurring in a fixed interval, given a known average rate.

In simpler terms: if we know a player averages 0.35 home runs per game, the Poisson distribution tells us the probability of him hitting exactly 0, 1, 2, or more HRs in any single game.

// Poisson probability formula
P(k events) = (λ^k × e^(-λ)) / k!

// Where:
λ = expected rate (e.g. 0.35 HRs per game)
k = number of events we're testing (0, 1, 2...)
e = Euler's number (2.71828...)

// Example: P(Judge hits exactly 1 HR)
λ = 0.35, k = 1
P(1) = (0.35^1 × e^(-0.35)) / 1!
P(1) = (0.35 × 0.7047) / 1
P(1) = 0.2466 = 24.7%

// P(Judge hits 1 or more HRs) = 1 - P(0)
P(0) = e^(-0.35) = 0.7047
P(≥1) = 1 - 0.7047 = 0.2953 = 29.5%
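
For readers who want to check the arithmetic themselves, here is a minimal Python sketch of the same formula. The function names are ours for illustration and are not part of the PropBetEdge codebase; the only input is the example rate of 0.35 HRs per game used above.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when the expected rate is lam."""
    return (lam ** k) * math.exp(-lam) / math.factorial(k)


def prob_at_least_one(lam: float) -> float:
    """P(k >= 1) = 1 - P(0) = 1 - e^(-lam)."""
    return 1.0 - math.exp(-lam)


lam = 0.35  # example rate from the text: 0.35 HRs per game
p_exactly_one = poisson_pmf(1, lam)
p_one_or_more = prob_at_least_one(lam)

print(f"P(exactly 1 HR)  = {p_exactly_one:.4f}")
print(f"P(1 or more HRs) = {p_one_or_more:.4f}")
```

Run as written, this reproduces the hand calculation: 0.2466 for exactly one HR and 0.2953 (29.5%) for one or more.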

That 29.5% is our model's probability. The book offers +310 (24.4%). That's a +EV bet — our model sees more probability than the market is pricing.
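
To make "+EV" concrete, the sketch below converts American odds into the break-even probability they imply and then computes expected profit per unit staked. It is an illustration under two assumptions: the helper names are hypothetical, and the model probability is treated as the true probability. No vig is removed from the book's line.

```python
def american_to_implied_prob(odds: int) -> float:
    """Break-even probability implied by American odds (vig not removed)."""
    if odds > 0:
        return 100.0 / (odds + 100.0)
    return -odds / (-odds + 100.0)


def expected_value_per_unit(model_prob: float, odds: int) -> float:
    """Expected profit per 1 unit staked, assuming model_prob is the true probability."""
    # Profit on a winning 1-unit bet: +310 pays 3.10 units, -150 pays 0.667 units.
    payout = odds / 100.0 if odds > 0 else 100.0 / -odds
    return model_prob * payout - (1.0 - model_prob)


implied = american_to_implied_prob(310)
ev = expected_value_per_unit(0.2953, 310)

print(f"Book-implied probability: {implied:.1%}")
print(f"Expected profit per 1 unit staked: {ev:+.3f}")
```

With the numbers from the example, the book's +310 implies 24.4%, and a 29.5% model probability works out to roughly +0.21 units of expected profit per unit staked.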

The Data Pipeline

Poisson is only as good as the λ (lambda) input — the expected rate. This is where most models fail. They use simple counting stats (career HR rate, last 30 days) that don't capture the real underlying skill.

We use our own copy of Statcast data: 287 active MLB batters and 200+ pitchers, scraped nightly from Baseball Savant into our own Supabase database. No third-party data bill. We own the pipeline.

287 MLB batters · 34 columns per batter · 200+ pitchers · daily model runs

Batter Inputs (34 columns)

Metric | What It Measures | Weight in Model
xSLG | Expected slugging % based on contact quality | High: primary HR predictor
Barrel % | % of batted balls with max exit velo + launch angle | High: best contact quality metric
Exit Velocity | Average speed off the bat (mph) | High: raw power signal
Max EV | Peak exit velocity in sample | Medium: ceiling indicator
xBA | Expected batting average by contact quality | Medium: hits model
xwOBA | Expected weighted on-base average | Medium: total offensive value
Hard Hit % | % of balls hit 95+ mph | Medium: consistency metric
L7 Splits | Stats over last 7 days | High: recent form
vs RHP/LHP | Platoon splits by pitcher handedness | High: matchup adjustment
HR Streak | Recent consecutive games with HRs | Low: hot hand signal
EV Spike | L7 EV vs season avg (positive = trending hot) | Medium: form signal

Pitcher Inputs

Metric | What It Measures | Used For
EV Allowed | Average exit velocity against the pitcher | HR model: how hard is he getting hit?
Barrel Rate Allowed | % of balls barreled against | HR/power model: giving up barrels?
SwStr% | Swinging strike rate (whiff rate) | K model: primary strikeout predictor
CSW% | (Called strikes + whiffs) / total pitches | K model: overall stuff quality
Avg Velocity | Average fastball velocity | Stuff quality signal
Velo Drop | Recent vs season avg velocity | Fatigue/form signal
HR Rate | HRs per 9 innings | HR model: park-adjusted rate
Pitch Mix | % fastball, breaking ball, offspeed | K model: repertoire complexity

Park Factor Adjustment

The same batter hitting at Coors Field (park factor 120) vs. Petco Park (park factor 82) has a very different expected HR rate. Every lambda calculation is adjusted for the day's park factor — including real-time weather (wind speed, wind direction, temperature) fetched at each model run.

// Lambda calculation (simplified)
λ_hr = base_hr_rate
     × xSLG_factor
     × barrel_factor
     × pitcher_ev_allowed_factor
     × park_factor
     × platoon_factor
     × weather_factor

// Then Poisson gives P(≥1 HR)
P(HR) = 1 - e^(-λ_hr)

// Convert to American odds
// If P = 0.295 (29.5%):
Fair odds = (1/P - 1) × 100 = +239
// If DK is offering +310, that's value
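
The simplified calculation above can be sketched in Python as follows. This is an illustration, not the production model: every factor value in the example is made up, the function names are hypothetical, and a park factor quoted on a 100-based scale (such as 120) is assumed to enter as a multiplier (1.20). The odds conversion also handles favorites (P of 0.5 or more), which the simplified formula above does not cover.

```python
import math


def hr_lambda(base_hr_rate: float, factors: dict) -> float:
    """Multiply the base HR rate by every adjustment factor."""
    lam = base_hr_rate
    for value in factors.values():
        lam *= value
    return lam


def prob_to_american(p: float) -> int:
    """Fair American odds for probability p (no vig added)."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    if p < 0.5:
        return round((1.0 / p - 1.0) * 100.0)   # underdog: positive odds
    return -round(p / (1.0 - p) * 100.0)        # favorite: negative odds


# Hypothetical inputs, for illustration only.
example_factors = {
    "xslg": 1.10,
    "barrel": 1.05,
    "pitcher_ev_allowed": 1.00,
    "park": 1.20,      # assumed: park factor 120 on a 100-based scale
    "platoon": 0.95,
    "weather": 1.00,
}
lam_hr = hr_lambda(0.30, example_factors)
p_hr = 1.0 - math.exp(-lam_hr)

print(f"lambda = {lam_hr:.4f}, P(HR) = {p_hr:.3f}, fair odds = {prob_to_american(p_hr):+d}")
print(f"Check against the text: P = 0.295 gives {prob_to_american(0.295):+d}")
```

The last line reproduces the +239 fair price from the example above.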

The K-Score

For strikeout props, we surface a simplified K-Score (0–100) that combines everything the model knows about a pitcher's K upside into one number.

K-Score inputs:
· SwStr% (swinging strike rate): 30% weight
· CSW% (called + whiff rate): 20% weight
· Stuff+ or velocity tier: 15% weight
· Opposing lineup K rate: 20% weight
· Umpire K tendency (above/below avg): 10% weight
· Recent form (L5 starts K/9): 5% weight

K-Score 80+ → Elite K prop candidate
K-Score 70–79 → Strong consideration
K-Score below 60 → Lean UNDER
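
One way to read those weights is as a plain weighted average. The sketch below assumes each input has already been rescaled to a 0–100 component score; that normalization step, the key names, and the example values are our assumptions, since the text does not describe how raw stats are scaled. The text also gives no label for scores from 60 to 69, so the sketch leaves that band unlabeled.

```python
# Weights as listed in the text (they sum to 1.0).
K_SCORE_WEIGHTS = {
    "swstr": 0.30,        # swinging strike rate
    "csw": 0.20,          # called strikes + whiffs
    "stuff": 0.15,        # Stuff+ or velocity tier
    "opp_k_rate": 0.20,   # opposing lineup K rate
    "umpire": 0.10,       # umpire K tendency
    "recent_form": 0.05,  # last 5 starts K/9
}


def k_score(components: dict) -> float:
    """Weighted average of component scores, each assumed pre-scaled to 0-100."""
    missing = K_SCORE_WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(K_SCORE_WEIGHTS[name] * components[name] for name in K_SCORE_WEIGHTS)


def k_score_tier(score: float) -> str:
    """Map a K-Score to the tiers named in the text."""
    if score >= 80:
        return "Elite K prop candidate"
    if score >= 70:
        return "Strong consideration"
    if score < 60:
        return "Lean UNDER"
    return "No label given in the text (60-69)"


# Hypothetical component scores, for illustration only.
example_components = {
    "swstr": 90, "csw": 85, "stuff": 80,
    "opp_k_rate": 75, "umpire": 60, "recent_form": 70,
}
example_score = k_score(example_components)
print(f"K-Score {example_score:.1f}: {k_score_tier(example_score)}")
```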

When the Model Runs

The Poisson model runs as a Cloudflare Worker cron job 6 times daily:

8AM: Morning Open
10AM: Pre-lineup
12PM: Midday
2PM: Post-lineup
5PM: Pre-game
8PM: Evening

Each run fetches updated lineups, weather, and any late pitcher changes — then reruns every lambda calculation and publishes fresh odds to the /mlb/odds/model API endpoint.

Model Odds vs. Book Odds

PropBetEdge shows both our model odds and DraftKings/FanDuel book odds side by side. When they diverge significantly, that's the signal.

🎯 Reading the edge:
Model odds: +280 (our fair value)
Book odds: +340 (DK is offering more than fair value)
Book is mispriced. This is a +EV bet.

Model odds: +280 (our fair value)
Book odds: +190 (DK is offering less than fair value)
Book has the edge. Skip this one.
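
The two readings above can be checked with a few lines of Python. This sketch converts both prices to implied probabilities and reports the gap in percentage points; the function names and the simple "any positive gap" rule are assumptions for illustration, and the site may apply its own threshold for what counts as a significant divergence.

```python
def implied_prob(odds: int) -> float:
    """Probability implied by American odds (vig not removed)."""
    if odds > 0:
        return 100.0 / (odds + 100.0)
    return -odds / (-odds + 100.0)


def read_edge(model_odds: int, book_odds: int):
    """Return (edge in percentage points, verdict) for one prop."""
    edge_pts = (implied_prob(model_odds) - implied_prob(book_odds)) * 100.0
    verdict = "+EV: book pays more than fair value" if edge_pts > 0 else "Skip: book has the edge"
    return edge_pts, verdict


for model_odds, book_odds in [(280, 340), (280, 190)]:
    edge_pts, verdict = read_edge(model_odds, book_odds)
    print(f"model {model_odds:+d} vs book {book_odds:+d}: {edge_pts:+.1f} pts, {verdict}")
```

For the first pair the gap is about +3.6 points in the bettor's favor; for the second it is about -8.2 points, which matches the "skip this one" reading.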

What the Model Doesn't Know

Honest disclosure — every model has blind spots:

The model is a probability engine, not a crystal ball. It surfaces edge over large samples. Individual game variance is real and expected.
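
To put numbers on "edge over large samples," the sketch below computes the expected profit and the standard deviation of profit for a run of identical one-unit bets. It leans on strong simplifying assumptions that we want to state plainly: every bet has the same true probability and the same price, the bets are independent, and the model probability is correct. The function name is hypothetical.

```python
import math


def flat_bet_outlook(p: float, odds: int, n_bets: int):
    """Expected profit and standard deviation (in units) over n one-unit bets."""
    payout = odds / 100.0 if odds > 0 else 100.0 / -odds
    mean = p * payout - (1.0 - p)                      # expected profit per bet
    variance = p * payout ** 2 + (1.0 - p) - mean ** 2  # win pays `payout`, loss costs 1
    return n_bets * mean, math.sqrt(n_bets * variance)


# Example from the text: 29.53% model probability at +310.
for n in (10, 100, 1000):
    expected, std_dev = flat_bet_outlook(0.2953, 310, n)
    print(f"{n:>5} bets: expected {expected:+7.1f} units, std dev {std_dev:5.1f} units")
```

Over 10 bets the expected profit (about +2.1 units) is smaller than the standard deviation (about 5.9 units), so losing stretches are normal. Over 1,000 bets the expected profit (about +211 units) is several times the standard deviation (about 59 units).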

Using the Model in Practice

  1. Check /mlb/odds/model/top

    This endpoint returns the day's top OVER value plays ranked by expected stat. Start here for the highest-confidence model picks.

  2. Compare to Book Odds

    Look for plays where our model's fair odds are lower than the current DK/FD line (for example, model +280 vs. book +340). The book is then paying more than fair value, and that gap is the stated edge.

  3. Check the Inputs

    For any pick, you can see the underlying Statcast data: barrel%, xSLG, EV. If the inputs make sense intuitively, the model call is stronger.

  4. Size Flat

    Don't bet more on a high-K-Score day because it "feels more certain." Flat betting (same amount per play) is the correct approach with a Poisson model. The edge compounds over volume, not individual bet size.

SEE IT IN ACTION

The model runs live at mlb.propbetedge.ai. Access is $29/mo.
