NFL Big Data Bowl 2026 Prediction Competition
Project Overview
I competed in the NFL Big Data Bowl 2026 Prediction Competition, where the challenge was to forecast how key players (mainly the targeted receiver and nearby defenders) move after the quarterback releases the ball. Predictions were evaluated using root mean squared error (RMSE) in yards across short sequences of 5–30 frames at 10 Hz.
My solution was a complete end-to-end pipeline built in Python and PyTorch, from data cleaning to final visualizations.
What I Built
- Data Preprocessing
Cleaned and enriched the raw tracking data with football-specific features:
- Direction normalization (flipping plays for consistent field view)
- Relative distances and angles to the ball landing spot
- Velocity and acceleration components (vx, vy, ax, ay)
- Time-to-throw countdown and frame progress
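The preprocessing features above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names are hypothetical, and it assumes the usual NFL tracking conventions (a 120 x 53.3 yard field including end zones, a play direction label of "left" or "right", and a direction angle in degrees where 0 points along +y and angles grow clockwise).

```python
import math

FIELD_LENGTH = 120.0  # yards, including both end zones (assumed convention)
FIELD_WIDTH = 53.3    # yards (assumed convention)


def normalize_direction(x, y, dir_deg, play_direction):
    """Flip left-moving plays so every offense drives toward increasing x."""
    if play_direction == "left":
        # Rotate the play 180 degrees about the center of the field.
        return FIELD_LENGTH - x, FIELD_WIDTH - y, (dir_deg + 180.0) % 360.0
    return x, y, dir_deg % 360.0


def velocity_components(speed, dir_deg):
    """Split a scalar speed into (vx, vy).

    Assumes 0 degrees points along +y and angles increase clockwise,
    so the x component uses sin and the y component uses cos.
    """
    rad = math.radians(dir_deg)
    return speed * math.sin(rad), speed * math.cos(rad)


def relative_to_landing(x, y, ball_x, ball_y):
    """Distance (yards) and angle (radians) from a player to the ball landing spot."""
    dx, dy = ball_x - x, ball_y - y
    return math.hypot(dx, dy), math.atan2(dy, dx)


def frame_progress(frame_idx, n_frames):
    """Fraction of the pre-throw window elapsed, in [0, 1]."""
    return frame_idx / max(n_frames - 1, 1)
```

Acceleration components (ax, ay) could be derived the same way, for example by differencing vx and vy between consecutive frames and dividing by the 0.1 s frame interval.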
- Model Architecture
A custom encoder-decoder transformer:
- Encoder: Processes each player’s pre-throw history independently (with positional encoding and masking)
- Cross-player attention: Lets players “read” each other at the moment of release
- Decoder: Predicts future positions conditioned on the player’s last state and the ball landing spot, emitting small residual steps that are summed cumulatively for smooth, stable paths
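A heavily simplified PyTorch sketch of the three stages described above might look like this. It is not the repository's model: the class name `TrajectorySketch`, the layer sizes, the learned positional embedding, the mean pooling over valid frames, and the single linear layer standing in for the decoder are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class TrajectorySketch(nn.Module):
    """Hypothetical miniature of the encoder / cross-player attention / decoder idea."""

    def __init__(self, n_feat=8, d_model=32, horizon=30, max_len=64):
        super().__init__()
        self.horizon = horizon
        self.embed = nn.Linear(n_feat, d_model)
        # Learned positional encoding (an assumption; sinusoidal would also work).
        self.pos = nn.Parameter(torch.zeros(max_len, d_model))
        layer = nn.TransformerEncoderLayer(
            d_model, nhead=4, dim_feedforward=64, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(
            layer, num_layers=2, enable_nested_tensor=False
        )
        self.cross = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        # Linear head in place of a full decoder: d_model + last (x, y) + ball offset.
        self.head = nn.Linear(d_model + 4, horizon * 2)

    def forward(self, hist, pad_mask, last_xy, ball_xy):
        # hist: (B, P, T, F) pre-throw features; pad_mask: (B, P, T), True = padded frame
        # last_xy: (B, P, 2) position at release; ball_xy: (B, 2) landing spot
        B, P, T, _ = hist.shape
        h = self.embed(hist.reshape(B * P, T, -1)) + self.pos[:T]
        # Each player's history is encoded independently of the other players.
        h = self.encoder(h, src_key_padding_mask=pad_mask.reshape(B * P, T))
        # Pool over valid frames to get one summary vector per player.
        valid = (~pad_mask).reshape(B * P, T, 1).float()
        summary = (h * valid).sum(1) / valid.sum(1).clamp(min=1.0)
        summary = summary.reshape(B, P, -1)
        # Cross-player attention: players attend to each other at release.
        mixed, _ = self.cross(summary, summary, summary)
        # Condition on the last position and the offset to the landing spot.
        cond = torch.cat([mixed, last_xy, ball_xy.unsqueeze(1) - last_xy], dim=-1)
        steps = self.head(cond).reshape(B, P, self.horizon, 2)
        # Residual steps are accumulated from the last known position.
        return last_xy.unsqueeze(2) + steps.cumsum(dim=2)
```

Predicting per-frame displacements and accumulating them with `cumsum` means the first predicted frame starts near the last observed position, which is one plausible reason the residual formulation gives smooth paths. This sketch assumes every player has at least one unpadded frame.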
- Training & Evaluation
Trained with a masked multi-component loss: position accuracy + velocity smoothness + extra emphasis on final position.
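One possible reading of that loss, as a hedged sketch: the function name, the weights `w_vel` and `w_final`, and the exact form of each term are assumptions, and the velocity term here matches frame-to-frame displacements rather than penalizing acceleration directly. It also assumes the valid frames of each sequence come first, followed by padding.

```python
import torch


def masked_trajectory_loss(pred, target, mask, w_vel=0.5, w_final=1.0):
    """pred/target: (N, H, 2) in yards; mask: (N, H) bool, True = real future frame."""
    # Padded target frames may hold NaN; zero them so they cannot poison the sums.
    target = torch.nan_to_num(target)
    m = mask.float()

    # Position term: squared error averaged over valid frames only.
    sq = ((pred - target) ** 2).sum(-1)
    pos = (sq * m).sum() / m.sum().clamp(min=1.0)

    # Velocity term: compare frame-to-frame displacements where both frames are valid.
    dv = (pred[:, 1:] - pred[:, :-1]) - (target[:, 1:] - target[:, :-1])
    mv = m[:, 1:] * m[:, :-1]
    vel = ((dv ** 2).sum(-1) * mv).sum() / mv.sum().clamp(min=1.0)

    # Final-position term: squared error at each sequence's last valid frame.
    n_valid = m.sum(1)
    last = (n_valid.long() - 1).clamp(min=0)
    idx = torch.arange(pred.shape[0], device=pred.device)
    has_any = (n_valid > 0).float()
    fin_sq = ((pred[idx, last] - target[idx, last]) ** 2).sum(-1)
    fin = (fin_sq * has_any).sum() / has_any.sum().clamp(min=1.0)

    return pos + w_vel * vel + w_final * fin
```

Because every term is divided by its own count of valid entries, short and long sequences in the same batch contribute on a comparable scale.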
Evaluated using yard-based metrics: RMSE, ADE (average displacement error), FDE (final displacement error).
Added field visualizations (trajectory plots, error heatmaps) to show realistic movement and coverage insights.
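The three yard-based metrics named above can be computed along these lines. This NumPy sketch uses common definitions and a hypothetical function name; the competition's official scoring code may differ in detail, and the sketch assumes valid frames precede padding in each sequence.

```python
import numpy as np


def trajectory_metrics(pred, target, mask):
    """pred/target: (N, H, 2) in yards; mask: (N, H) bool, True = real frame."""
    valid = mask.astype(bool)
    err = pred - target

    # RMSE over every valid x and y coordinate.
    rmse = float(np.sqrt(np.mean(err[valid] ** 2)))

    # Euclidean error per frame, shape (N, H).
    dist = np.linalg.norm(err, axis=-1)

    # ADE: mean Euclidean error over all valid frames.
    ade = float(dist[valid].mean())

    # FDE: Euclidean error at each sequence's last valid frame.
    last = valid.sum(axis=1) - 1
    has_any = last >= 0
    rows = np.arange(dist.shape[0])[has_any]
    fde = float(dist[rows, last[has_any]].mean())

    return {"rmse": rmse, "ade": ade, "fde": fde}
```

For example, a prediction that is off by 3 yards in x and 4 yards in y on every frame has an ADE and FDE of 5 yards, while the per-coordinate RMSE is the square root of 12.5, about 3.54 yards.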
Key Highlights & Impact
The model handles noisy, variable-length data reliably with careful masking, NaN fixes, and forced unmasking.
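To illustrate why these safeguards matter, here is a small sketch. It rests on an assumption about what "forced unmasking" means: if every frame in a sequence is padded, a masked softmax has nothing to attend to and returns NaN, so one position is deliberately unmasked. The helper names are hypothetical and the project's actual handling may differ.

```python
import torch


def clean_features(features):
    """Replace NaN/inf in raw features with finite values before they reach the model."""
    return torch.nan_to_num(features, nan=0.0, posinf=0.0, neginf=0.0)


def force_unmask(pad_mask):
    """pad_mask: (N, T) bool, True = padded. Unmask frame 0 of any fully padded row."""
    fixed = pad_mask.clone()
    fully_padded = fixed.all(dim=1)
    fixed[fully_padded, 0] = False
    return fixed


def masked_attention_weights(scores, pad_mask):
    """Softmax over keys with padded keys set to -inf. scores: (N, T)."""
    return scores.masked_fill(pad_mask, float("-inf")).softmax(dim=-1)
```

With a row whose mask is entirely True, `masked_attention_weights` yields NaN for that row; after `force_unmask`, the same row puts all of its weight on frame 0 and the output stays finite.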
It captures real football behaviors, such as receivers adjusting to the ball and defenders closing gaps, producing predictions that are not only accurate but also interpretable for scouting, film study, and analytics.
The result is competitive performance on validation data and actionable insight into how players react in the critical moments after the throw.
Links:
Kaggle submission: no longer available
GitHub repository: GitHub Repo Link (full code, preprocessing pipeline, model architecture, training loop, evaluation, and visualizations)