Methodology

How we build data you can trust—and why it matters for your models.

The Problem with Sports Data

Most sports analytics platforms have a dirty secret: they don't tell you when their data actually became available. That metric showing team X had a 60% success rate in Week 5? It might include play-by-play that wasn't published until Tuesday. That "predictive" model? It might be using game outcomes to calculate its own inputs.

This is called data leakage, and it's the silent killer of betting models. You build something that looks amazing in backtests, deploy it, and watch it fail in production. The problem wasn't your strategy—it was that your backtest was lying to you.

Our Approach: Trust by Design

Tendency is built from the ground up to make data lineage explicit. Every metric, every feature, every data point carries metadata that tells you exactly where it came from and when it became available.

available_at Timestamps

Every data point has an available_at timestamp that specifies exactly when this information became knowable. For play-by-play metrics, this is typically 3-4 hours after game completion. For odds data, this is the timestamp when the line was published.
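
As a minimal sketch of how an available_at timestamp can be used, the snippet below filters a set of data points down to what was knowable at a given moment. The DataPoint class, the known_as_of helper, and the sample values are hypothetical and only illustrate the idea; they are not the Tendency API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DataPoint:
    """Hypothetical record shape; only `available_at` comes from the text above."""
    metric: str
    value: float
    available_at: datetime  # when this value became knowable


def known_as_of(points, as_of):
    """Return only the points that were already knowable at `as_of`."""
    return [p for p in points if p.available_at <= as_of]


# Made-up sample values for illustration.
points = [
    DataPoint("success_rate", 0.60, datetime(2024, 10, 7, 1, 30, tzinfo=timezone.utc)),
    DataPoint("closing_spread", -3.5, datetime(2024, 10, 6, 16, 55, tzinfo=timezone.utc)),
]

# At 17:00 UTC on game day, only the odds line has been published.
visible = known_as_of(points, datetime(2024, 10, 6, 17, 0, tzinfo=timezone.utc))
print([p.metric for p in visible])
```

Using timezone-aware datetimes throughout avoids ambiguous comparisons between sources that report in different local times.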

Versioned Feature Sets

Features are versioned and immutable. When we improve a metric's calculation, we create a new version—we don't silently update the old one. This means your backtests will always reproduce exactly, even years later.
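
One way to picture "versioned and immutable" is an append-only registry that refuses to overwrite a published version. This is a sketch under assumptions: the FeatureRegistry class, the metric name, and the window_games parameter are hypothetical, not the actual storage layer.

```python
from types import MappingProxyType


class FeatureRegistry:
    """Hypothetical append-only store keyed by (metric name, version)."""

    def __init__(self):
        self._specs = {}

    def register(self, name, version, spec):
        key = (name, version)
        if key in self._specs:
            # A published version is never overwritten; changes need a new version.
            raise ValueError(f"{name}_v{version} already exists; publish a new version")
        # Store a read-only view of a private copy so callers cannot mutate it later.
        self._specs[key] = MappingProxyType(dict(spec))

    def get(self, name, version):
        return self._specs[(name, version)]


registry = FeatureRegistry()
registry.register("success_rate", "1.0", {"window_games": 4})
registry.register("success_rate", "1.1", {"window_games": 6})  # improvement -> new version

print(registry.get("success_rate", "1.0")["window_games"])
```

A backtest pinned to version 1.0 keeps reading the same specification after 1.1 is published, which is what makes its results reproducible.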

Documented Definitions

Every metric includes a plain-English definition, source citation, and calculation methodology. No black boxes. If you want to know exactly how QB-VA is calculated, you can see the full specification.

Leakage Checks

When you run a backtest, we automatically verify that every feature used has an available_at timestamp earlier than the simulated bet time it feeds. If any feature fails that check, we flag it explicitly as potential leakage.
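
The check described above can be sketched in a few lines. The find_leaks function, the feature names, and the dates are hypothetical; the sketch only assumes that each feature exposes an available_at timestamp and that a feature must be available strictly before the bet time.

```python
from datetime import datetime, timezone


def find_leaks(features, bet_time):
    """Return the names of features that were not yet available at `bet_time`.

    `features` maps a feature name to its available_at timestamp.
    """
    return sorted(
        name for name, available_at in features.items() if available_at >= bet_time
    )


# Simulated bet placed shortly before a Week 6 kickoff (made-up dates).
bet_time = datetime(2024, 10, 13, 16, 0, tzinfo=timezone.utc)
features = {
    "success_rate_wk5": datetime(2024, 10, 7, 1, 30, tzinfo=timezone.utc),
    "success_rate_wk6": datetime(2024, 10, 14, 1, 30, tzinfo=timezone.utc),
}

leaks = find_leaks(features, bet_time)
print(leaks)
```

Here the Week 6 metric is flagged because it only became available after the bet it would have informed.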

Data Sources

We believe in transparency about where our data comes from. Here are our primary sources:

Source	Data Type	Timing
nflfastR	Play-by-play, EPA, success metrics	~4 hours post-game
ESPN API	Schedules, scores, rosters	Real-time
The Odds API	Betting lines, market prices	Real-time (requires your own API key)
Pro Football Reference	Historical statistics	~24 hours post-game

The Trust UI

Throughout the Tendency interface, you'll see "Trust Panels"—expandable sections that reveal the full lineage of any metric. Here's what each field means:

Definition

Plain-English explanation of what this metric measures and how to interpret it.

Source

The raw data source(s) used to calculate this metric.

Version

The specific version of the calculation methodology. Format: metric_vX.Y
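
If you store version identifiers in your own code, a small parser keeps them consistent. This sketch assumes X and Y are non-negative integers and that metric names use letters, digits, underscores, or hyphens; the identifier qb_va_v2.1 is a made-up example, not a published version.

```python
import re

# Assumed grammar for the metric_vX.Y format: <metric>_v<major>.<minor>
VERSION_RE = re.compile(r"^(?P<metric>[A-Za-z0-9_\-]+)_v(?P<major>\d+)\.(?P<minor>\d+)$")


def parse_version(identifier):
    """Split an identifier such as 'qb_va_v2.1' into (metric, major, minor)."""
    match = VERSION_RE.match(identifier)
    if match is None:
        raise ValueError(f"not in metric_vX.Y format: {identifier!r}")
    return match["metric"], int(match["major"]), int(match["minor"])


print(parse_version("qb_va_v2.1"))
```

Parsing the numbers as integers means versions compare correctly, so 1.10 sorts after 1.9.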

available_at Rule

The rule that determines when this metric becomes available relative to the game, such as a set number of hours after completion. Critical for avoiding leakage.
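
An available_at rule can be thought of as a publication delay added to the end of the game. The delays below are illustrative, loosely based on the approximate timings in the Data Sources table; the PUBLICATION_DELAY mapping and the available_at function are hypothetical, and real rules may differ per metric.

```python
from datetime import datetime, timedelta, timezone

# Illustrative delays only; the sources table gives approximate timings.
PUBLICATION_DELAY = {
    "nflfastR": timedelta(hours=4),
    "Pro Football Reference": timedelta(hours=24),
}


def available_at(source, game_end):
    """Apply a source's assumed publication delay to the game's end time."""
    return game_end + PUBLICATION_DELAY[source]


game_end = datetime(2024, 10, 6, 20, 15, tzinfo=timezone.utc)  # made-up end time
print(available_at("nflfastR", game_end).isoformat())
```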

What We Don't Do

No picks or predictions

We provide data and tools, not betting advice. Your edge comes from your analysis.

No guaranteed returns

Past performance doesn't guarantee future results. Even well-built models can lose.

No black-box models

Every metric is documented. If you can't understand it, you shouldn't bet on it.

Ready to build models you can trust?

Request access to start exploring research-grade football data.
