How we build data you can trust—and why it matters for your models.
Most sports analytics platforms have a dirty secret: they don't tell you when their data was actually available. That metric showing team X had a 60% success rate in Week 5? It might include play-by-play that wasn't published until Tuesday. That "predictive" model? It might be using game outcomes to calculate the inputs.
This is called data leakage, and it's the silent killer of betting models. You build something that looks amazing in backtests, deploy it, and watch it fail in production. The problem wasn't your strategy—it was that your backtest was lying to you.
Tendency is built from the ground up to make data lineage explicit. Every metric, every feature, every data point carries metadata that tells you exactly where it came from and when it became available.
Every data point has an available_at timestamp that records when the information became knowable. For play-by-play metrics, this is typically 3–4 hours after the game ends. For odds data, it is the time the line was published.
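To make the idea concrete, here is a minimal Python sketch of a data point that carries its own available_at timestamp, plus an "as of" filter. It is an illustration only, not Tendency's actual schema: the `DataPoint` class, the `known_as_of` helper, the fixed 4-hour lag, and the sample values are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumed publication lag for play-by-play data (the text says "typically 3-4 hours").
PLAY_BY_PLAY_LAG = timedelta(hours=4)


@dataclass(frozen=True)
class DataPoint:
    """Hypothetical record: a value plus the moment it became knowable."""
    name: str
    value: float
    available_at: datetime


def play_by_play_point(name: str, value: float, game_end: datetime) -> DataPoint:
    """Stamp a play-by-play metric with game end plus the assumed lag."""
    return DataPoint(name, value, game_end + PLAY_BY_PLAY_LAG)


def known_as_of(points: list[DataPoint], as_of: datetime) -> list[DataPoint]:
    """Keep only the points that were already knowable at `as_of`."""
    return [p for p in points if p.available_at <= as_of]


# Made-up sample data for illustration.
game_end = datetime(2024, 10, 6, 20, 0, tzinfo=timezone.utc)
points = [
    play_by_play_point("success_rate", 0.60, game_end),
    DataPoint("closing_spread", -3.5, datetime(2024, 10, 6, 16, 55, tzinfo=timezone.utc)),
]

# One hour after the final whistle, the play-by-play metric is not yet available.
one_hour_after = known_as_of(points, game_end + timedelta(hours=1))
print([p.name for p in one_hour_after])  # prints ['closing_spread']
```

Filtering on available_at rather than on game date is what keeps a late-published metric out of an earlier decision.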
Features are versioned and immutable. When we improve a metric's calculation, we create a new version—we don't silently update the old one. This means your backtests will always reproduce exactly, even years later.
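One way to picture "versioned and immutable" is an append-only registry that refuses to overwrite a version once it has been published. The sketch below is a hypothetical illustration in Python. `FeatureRegistry`, `FeatureVersion`, and the description strings are invented for the example and do not describe how Tendency stores features.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a published version cannot be mutated in place
class FeatureVersion:
    metric: str
    version: str       # e.g. "1.0"
    description: str   # placeholder text for illustration


class FeatureRegistry:
    """Append-only store: a published (metric, version) pair is never replaced."""

    def __init__(self) -> None:
        self._versions: dict[tuple[str, str], FeatureVersion] = {}

    def publish(self, feature: FeatureVersion) -> None:
        key = (feature.metric, feature.version)
        if key in self._versions:
            raise ValueError(f"{feature.metric} v{feature.version} is already published")
        self._versions[key] = feature

    def get(self, metric: str, version: str) -> FeatureVersion:
        return self._versions[(metric, version)]


registry = FeatureRegistry()
registry.publish(FeatureVersion("qb_va", "1.0", "original calculation"))
# An improved calculation gets a new version instead of replacing the old one.
registry.publish(FeatureVersion("qb_va", "1.1", "improved calculation"))

try:
    registry.publish(FeatureVersion("qb_va", "1.0", "silent overwrite"))
    overwrite_rejected = False
except ValueError:
    overwrite_rejected = True

print(overwrite_rejected)  # prints True
```

Because old versions stay readable under their original key, a backtest pinned to version 1.0 keeps seeing the same calculation after 1.1 ships.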
Every metric includes a plain-English definition, source citation, and calculation methodology. No black boxes. If you want to know exactly how QB-VA is calculated, you can see the full specification.
When you run a backtest, we automatically verify that every feature used has an available_at timestamp earlier than the simulated bet time it feeds. If we detect potential leakage, we flag it explicitly.
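A check of this kind can be sketched in a few lines of Python. This is an assumption-laden illustration, not the platform's real verifier: the `Feature` class, the `find_leaks` function, and the sample timestamps are hypothetical, and the sketch reads "before" strictly, so a feature published at exactly the bet time counts as a leak.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Feature:
    """Hypothetical feature record with its availability timestamp."""
    name: str
    available_at: datetime


def find_leaks(features: list[Feature], bet_time: datetime) -> list[str]:
    """Return the names of features that were not yet knowable before bet_time."""
    return [f.name for f in features if f.available_at >= bet_time]


# Simulated bet placed at a Week 6 kickoff (made-up timestamps).
bet_time = datetime(2024, 10, 13, 17, 0, tzinfo=timezone.utc)
features = [
    Feature("week5_success_rate", datetime(2024, 10, 7, 0, 0, tzinfo=timezone.utc)),
    # Derived from the game being bet on, so it only exists after the bet.
    Feature("week6_success_rate", datetime(2024, 10, 14, 0, 0, tzinfo=timezone.utc)),
]

leaks = find_leaks(features, bet_time)
print(leaks)  # prints ['week6_success_rate']
```

Running the check once per simulated bet, rather than once per backtest, matters because the same feature can be safe for one bet and leaky for an earlier one.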
We believe in transparency about where our data comes from. Here are our primary sources:
| Source | Data Type | Timing |
|---|---|---|
| nflfastR | Play-by-play, EPA, success metrics | ~4 hours post-game |
| ESPN API | Schedules, scores, rosters | Real-time |
| The Odds API | Betting lines, market prices | Real-time (user key) |
| Pro Football Reference | Historical statistics | ~24 hours post-game |
Throughout the Tendency interface, you'll see "Trust Panels"—expandable sections that reveal the full lineage of any metric. Here's what each field means:
Definition
Plain-English explanation of what this metric measures and how to interpret it.
Source
The raw data source(s) used to calculate this metric.
Version
The specific version of the calculation methodology. Format: metric_vX.Y
available_at Rule
When this metric becomes available relative to game time. Critical for avoiding leakage.
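For readers who think in code, the four Trust Panel fields map naturally onto a small record, and the metric_vX.Y format can be parsed with a regular expression. The Python sketch below is illustrative only. The `TrustPanel` class, the `parse_version` helper, the assumption that metric names are lowercase with underscores, and the example field values are not taken from Tendency's API.

```python
import re
from dataclasses import dataclass

# Assumes metric names are lowercase letters, digits, and underscores.
VERSION_PATTERN = re.compile(r"^(?P<metric>[a-z0-9_]+)_v(?P<major>\d+)\.(?P<minor>\d+)$")


def parse_version(tag: str) -> tuple[str, int, int]:
    """Split a tag such as 'qb_va_v1.2' into (metric, major, minor)."""
    match = VERSION_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not in metric_vX.Y format: {tag!r}")
    return match["metric"], int(match["major"]), int(match["minor"])


@dataclass(frozen=True)
class TrustPanel:
    """Hypothetical container mirroring the four fields described above."""
    definition: str
    source: str
    version: str
    available_at_rule: str


# Example values are placeholders, not the real QB-VA specification.
panel = TrustPanel(
    definition="Placeholder definition for illustration.",
    source="nflfastR",
    version="qb_va_v1.2",
    available_at_rule="~4 hours post-game",
)

print(parse_version(panel.version))  # prints ('qb_va', 1, 2)
```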
No picks or predictions
We provide data and tools, not betting advice. Your edge comes from your analysis.
No guaranteed returns
Past performance doesn't guarantee future results. Even well-built models can lose.
No black-box models
Every metric is documented. If you can't understand it, you shouldn't bet on it.
Request access to start exploring research-grade football data.