  • Spiro.ai
  • Corpus Christi, Texas
  • LinkedIn in/ifrg
ivanrivasgr/README.md

👋 Hi, I'm Ivan Gruber

Data Engineer · Analytics Pipelines · Sports Data · BI & Visualization
Remote · Corpus Christi, TX



🚀 About Me

Data Engineer with 5+ years designing and operating analytics pipelines on GCP and AWS, with a focus on sports data infrastructure, real-time event processing, and cloud-based ETL systems.

Built production pipelines handling live MLB game feeds at Sportradar/Synergy Sports. At Vikua, delivered GCP analytical models that cut time-to-insight by 45%, maintained 99.7% pipeline uptime, and reduced cloud compute costs by 18% across 6 client environments.

Currently completing an MIT MicroMasters in Statistical Modeling & Computation. Fluent in English and Spanish.

💡 My focus: turning technical execution into measurable business impact.


⚙️ Tech Stack

Languages: Python, SQL, Ruby
Cloud Platforms: GCP (BigQuery, Cloud Composer, Cloud Storage), AWS (S3, Redshift), Azure SQL
ETL & Orchestration: Airflow, Tray.io, Zapier, REST APIs, Pandas, Terraform
BI & Visualization: Power BI, Plotly, Streamlit, Zoho Analytics
Sports Data: Sportradar Platform, Statcast, pybaseball, Pitch-by-pitch Tracking
Other: DuckDB, Parquet, GitHub Actions, Pytest


📊 Featured Projects

⚾ MLB Baseball Data Warehouse

A multi-season MLB analytics platform built on a Bronze/Silver/Gold medallion architecture. Ingests real data from FanGraphs and Baseball Savant, transforms it through DuckDB, and serves it via an interactive Streamlit dashboard.

Live Demo →
🔗 github.com/ivanrivasgr/baseball-data-warehouse
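
The Bronze/Silver/Gold flow described above can be illustrated with a minimal, standard-library-only sketch. It is not the repository's code: the project uses DuckDB for transformations, and the field names (`player`, `hits`, `at_bats`) and validation rules here are hypothetical, chosen only to show how each layer narrows and enriches the one before it.

```python
from collections import defaultdict


def to_silver(bronze_rows):
    """Bronze -> Silver: cast raw string fields to integers and drop invalid rows."""
    silver = []
    for row in bronze_rows:
        try:
            hits = int(row["hits"])
            at_bats = int(row["at_bats"])
        except (KeyError, TypeError, ValueError):
            continue  # unparseable or missing numeric fields stay in Bronze only
        player = str(row.get("player", "")).strip()
        # Hypothetical validation rules: a named player and plausible counts.
        if not player or at_bats <= 0 or hits < 0 or hits > at_bats:
            continue
        silver.append({"player": player, "hits": hits, "at_bats": at_bats})
    return silver


def to_gold(silver_rows):
    """Silver -> Gold: aggregate to one batting average per player."""
    totals = defaultdict(lambda: [0, 0])  # player -> [hits, at_bats]
    for row in silver_rows:
        totals[row["player"]][0] += row["hits"]
        totals[row["player"]][1] += row["at_bats"]
    return {player: round(h / ab, 3) for player, (h, ab) in totals.items()}


# Hypothetical raw rows, as they might arrive from an upstream export.
bronze = [
    {"player": "A", "hits": "2", "at_bats": "4"},
    {"player": "A", "hits": "1", "at_bats": "4"},
    {"player": "B", "hits": "x", "at_bats": "3"},  # dropped: non-numeric hits
]
gold = to_gold(to_silver(bronze))
```

Keeping the raw rows untouched in Bronze means a stricter or looser validation rule can be replayed later without re-ingesting the source data.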


⚽ Soccer Data Platform

Production-style sports data platform on a full Bronze/Silver/Gold architecture: raw GPS tracking ingestion, validation, Parquet transformation, and a player analytics layer. Includes CI/CD via GitHub Actions, an Apache Airflow DAG, and Terraform provisioning for a 3-layer AWS S3 data lake.

Live Demo →
🔗 github.com/ivanrivasgr/soccer-data-platform-demo


☁️ GCP Data Architecture with PII Anonymization

Fully automated PII-safe data pipeline on GCP that integrates multiple heterogeneous sources into a unified Master User Model using Bronze/Silver/Gold layering in BigQuery. Sensitive fields are protected with SHA-256 hashing and boolean masking, and the pipeline is orchestrated with Cloud Composer/Airflow.

🔗 github.com/ivanrivasgr/gcp_data_architecture_project
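
As a rough illustration of the anonymization step, the sketch below hashes identifier fields with SHA-256 and replaces other sensitive fields with a presence flag. It is an assumption-laden example rather than the project's implementation: the function names, the `has_` prefix, the normalization step, and the reading of "boolean masking" as "keep only whether a value existed" are all hypothetical.

```python
import hashlib


def hash_pii(value: str, salt: str = "") -> str:
    """Return the SHA-256 hex digest of a trimmed, lowercased value.

    The optional salt is an assumption; an empty salt gives a plain SHA-256.
    """
    normalized = value.strip().lower()
    return hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()


def mask_to_flag(value) -> bool:
    """Boolean masking (assumed meaning): record only whether a value was present."""
    return value is not None and str(value).strip() != ""


def anonymize(record: dict, hash_fields: set, mask_fields: set) -> dict:
    """Return a copy of the record with sensitive fields hashed or masked."""
    out = {}
    for key, value in record.items():
        if key in hash_fields and value is not None:
            out[key] = hash_pii(str(value))  # stable join key, no raw value
        elif key in mask_fields:
            out[f"has_{key}"] = mask_to_flag(value)  # raw value is discarded
        else:
            out[key] = value
    return out


# Hypothetical source row.
row = anonymize(
    {"email": " User@Example.com ", "phone": "555-0100", "plan": "pro"},
    hash_fields={"email"},
    mask_fields={"phone"},
)
```

Normalizing before hashing lets the same email from different sources map to the same digest, which is what makes a hashed field usable as a join key in a unified user model. Unsalted hashes of low-entropy values such as phone numbers can be brute-forced, so a secret salt or keyed hash is worth considering in practice.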


🤖 Ruby Dropbox File Automation

Automated pipeline in Ruby that detects CSV files uploaded to Dropbox, cleans and transforms the data, and routes the files to destination folders. Includes file mapping logic, CSV validation, and scheduled execution.

🔗 github.com/ivanrivasgr/ruby_dropbox_file_automation-


🧠 Interests

  • Sports data infrastructure & real-time event pipelines
  • Cloud data architecture & orchestration (Airflow, Terraform)
  • BI automation & dashboard design
  • Statistical modeling & predictive analytics

📬 Contact

📧 ivanfgruber@gmail.com
🌐 linkedin.com/in/ifrg


"Architecture is not about storing data — it's about how data flows to create value."

Pinned

  1. ruby_dropbox_file_automation- Public

    Automated workflow that reads files from Dropbox, transforms CSVs (cleaning and formatting data), and sends them to a data pipeline — fully serverless and powered by Ruby + Cron scheduling.

    Ruby 1

  2. financial_analytics_construction_projects Public

    End-to-end financial analytics project for construction companies. Built SQL models and interactive Metabase dashboards to track income, expenses, and profitability across multiple projects.