Project Overview
This project builds a fully automated data pipeline that connects to the Strava API using OAuth 2.0,
fetches all of my activity data (runs, rides, hikes, and more), processes it with Python, and
visualizes the results with interactive Chart.js charts. A GitHub Actions cron job runs daily to
keep the data fresh, committing updated JSON directly to the repository so GitHub Pages serves
the latest stats automatically, with no backend server required.
Objectives
- Demonstrate API integration skills with OAuth 2.0 authentication and paginated data retrieval.
- Build a Python data pipeline that transforms raw API responses into analysis-ready JSON.
- Automate the entire workflow with GitHub Actions for hands-free daily updates.
- Create interactive browser-based visualizations revealing activity patterns and trends.
Key Results
- Automated daily sync via GitHub Actions fetches and processes all Strava activities with zero manual effort.
- Python script handles OAuth token refresh, pagination, unit conversion, and data enrichment using only the standard library.
- 8 interactive Chart.js visualizations reveal patterns across activity types, distances, elevation, and timing.
- Because the sync runs once a day, every new Strava activity appears on the dashboard within 24 hours.
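The unit conversions listed above (meters to miles and feet, m/s to pace) can be sketched in a few lines of standard-library Python. This is an illustration rather than the project's actual code: the function names and the "M:SS per mile" pace format are assumptions.

```python
from typing import Optional

METERS_PER_MILE = 1609.344   # exact by definition of the international mile
METERS_PER_FOOT = 0.3048     # exact by definition of the international foot


def meters_to_miles(meters: float) -> float:
    """Convert a distance in meters to miles."""
    return meters / METERS_PER_MILE


def meters_to_feet(meters: float) -> float:
    """Convert an elevation in meters to feet."""
    return meters / METERS_PER_FOOT


def speed_to_pace(speed_mps: float) -> Optional[str]:
    """Convert a speed in m/s to a pace string in minutes per mile ("M:SS").

    Returns None for zero or negative speeds, where pace is undefined.
    """
    if speed_mps <= 0:
        return None
    # Round to whole seconds first so the result can never read "7:60".
    seconds_per_mile = round(METERS_PER_MILE / speed_mps)
    minutes, seconds = divmod(seconds_per_mile, 60)
    return f"{minutes}:{seconds:02d}"
```

For example, `speed_to_pace(3.0)` yields `"8:56"`, since one mile at 3 m/s takes about 536 seconds.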
[Interactive dashboard, not reproduced here: Lifetime Stats; Best Activity; Activity Type Breakdown (Activities by Type, Type Distribution); Distance Over Time (Monthly Distance, Miles); Elevation & Activity Patterns (Monthly Elevation Gain, Feet; Activity Type by Month); Weekly & Pace Patterns (Day of Week Distribution, Running Pace Trend).]
Methodology
- Data Source: Strava API v3 with OAuth 2.0 authentication. Activities are fetched via paginated GET requests (200 per page) using a refresh token flow.
- Automation: A GitHub Actions cron job triggers daily at 6 AM UTC. The workflow refreshes the OAuth token, fetches new data, and commits the updated JSON file.
- Data Processing: A Python script (standard library only, no pip dependencies) handles token refresh, pagination, unit conversion (meters to miles/feet, m/s to pace), and JSON output.
- Token Management: Strava rotates refresh tokens on each use. The script automatically updates the GitHub Actions secret with the new token using the gh CLI.
- Visualization: Chart.js renders 8 interactive charts directly from the pre-processed JSON in the browser with no build tools or frameworks.
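The token refresh, pagination, and secret rotation described above might look roughly like the following, using only the standard library. Treat it as a sketch under assumptions: the two URLs are Strava's token and athlete-activities endpoints, but the function names, the secret name `STRAVA_REFRESH_TOKEN`, and the injectable `get_page` and `run` parameters (there so the logic can be exercised without network access) are hypothetical and not taken from the project.

```python
import json
import subprocess
import urllib.parse
import urllib.request

TOKEN_URL = "https://www.strava.com/oauth/token"
ACTIVITIES_URL = "https://www.strava.com/api/v3/athlete/activities"
PER_PAGE = 200  # page size stated in the methodology


def refresh_access_token(client_id, client_secret, refresh_token):
    """Exchange the stored refresh token for a fresh token set.

    The JSON response carries "access_token" and a "refresh_token"
    that should replace the stored one.
    """
    body = urllib.parse.urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
    }).encode()
    # Supplying data= makes urllib send a POST request.
    request = urllib.request.Request(TOKEN_URL, data=body)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_page(access_token, page):
    """Fetch one page of activities as a list of dicts."""
    query = urllib.parse.urlencode({"page": page, "per_page": PER_PAGE})
    request = urllib.request.Request(
        f"{ACTIVITIES_URL}?{query}",
        headers={"Authorization": f"Bearer {access_token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_all_activities(access_token, get_page=fetch_page):
    """Request pages 1, 2, 3, ... until an empty page comes back."""
    activities = []
    page = 1
    while True:
        batch = get_page(access_token, page)
        if not batch:
            break
        activities.extend(batch)
        page += 1
    return activities


def store_refresh_token(new_token, secret_name="STRAVA_REFRESH_TOKEN",
                        run=subprocess.run):
    """Persist the rotated refresh token as a repository secret via gh.

    The secret name is an assumption; use whatever the workflow reads.
    """
    run(["gh", "secret", "set", secret_name, "--body", new_token], check=True)
```

Stopping on the first empty page costs one extra request but avoids guessing the total activity count in advance. For `store_refresh_token` to work inside a workflow, the credential that `gh` authenticates with needs permission to write repository secrets.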
Skills Demonstrated
- API Integration: Working with OAuth 2.0 flows, token refresh, and paginated REST endpoints.
- Data Pipeline: Building a Python ETL script that transforms raw API responses into analysis-ready datasets.
- CI/CD Automation: Designing a GitHub Actions workflow for scheduled, unattended data synchronization.
- Data Visualization: Creating interactive browser-based charts with Chart.js and vanilla JavaScript.
- Web Development: Building a responsive, data-driven project page following modern HTML/CSS patterns.
Technologies Used
Python
JavaScript
Chart.js
Strava API
GitHub Actions
Data Viz
How It Works
- A GitHub Actions cron job triggers daily at 6 AM UTC (or on manual dispatch).
- The Python script refreshes the Strava OAuth access token using the stored refresh token.
- The script paginates through all activities from the Strava API, processing each record.
- Raw data is enriched with unit conversions, derived fields (pace, display times), and aggregate statistics.
- The processed JSON is written to data/strava_activities.json and committed to the repository.
- GitHub Pages automatically serves the updated file, and the dashboard reflects the latest data.
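The enrichment and output steps above can be illustrated with a short standard-library sketch. The input keys (`type`, `start_date_local`, `distance`, `total_elevation_gain`) are fields Strava returns for an activity. The output layout, the function names, and the choice of per-month totals as the aggregate are assumptions made for illustration, not the project's actual schema.

```python
import json
from collections import defaultdict
from pathlib import Path

METERS_PER_MILE = 1609.344
METERS_PER_FOOT = 0.3048


def enrich(activity):
    """Reduce one raw activity to the display fields a chart might need."""
    return {
        "type": activity["type"],
        # start_date_local is ISO 8601, so the first 7 chars are "YYYY-MM".
        "month": activity["start_date_local"][:7],
        "distance_miles": round(activity["distance"] / METERS_PER_MILE, 2),
        "elevation_feet": round(activity["total_elevation_gain"] / METERS_PER_FOOT),
    }


def monthly_totals(records):
    """Sum distance, elevation, and activity count per month, oldest first."""
    totals = defaultdict(
        lambda: {"distance_miles": 0.0, "elevation_feet": 0, "count": 0}
    )
    for record in records:
        bucket = totals[record["month"]]
        bucket["distance_miles"] += record["distance_miles"]
        bucket["elevation_feet"] += record["elevation_feet"]
        bucket["count"] += 1
    return {
        month: {**bucket, "distance_miles": round(bucket["distance_miles"], 2)}
        for month, bucket in sorted(totals.items())
    }


def build_payload(raw_activities):
    """Combine enriched records and aggregates into one JSON-ready dict."""
    records = [enrich(activity) for activity in raw_activities]
    return {"activities": records, "monthly": monthly_totals(records)}


def write_payload(payload, path="data/strava_activities.json"):
    """Write the payload to disk, creating the parent directory if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(payload, indent=2))
```

Doing the aggregation in Python rather than in the browser keeps the page's JavaScript limited to reading the JSON and handing it to the charts.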