saitejasrivilli/README.md

👋 Hi, I'm Sai Teja Srivillibhutturu

ML & Deep Learning Engineer | LLM Specialist | Cloud Architect

LinkedIn GitHub Google Scholar Portfolio


🎯 Professional Summary

ML & Deep Learning Engineer with expertise in GPU optimization, LLM inference, and cloud-native AI solutions. Achieved 12.3× throughput improvement and 4× memory reduction in production ML systems. Passionate about building scalable AI infrastructure and deploying cutting-edge models to production.


📊 GitHub Activity Timeline

GitHub Contribution Timeline

Consistent contributions across ML/AI projects from 2024 to 2026


🛠️ Technical Skills

🤖 ML/DL & LLMs

  • PyTorch, TensorFlow, JAX
  • Transformers, LangChain
  • RAG, Vector Databases
  • LoRA, QLoRA Fine-tuning
  • Inference Optimization
  • vLLM, Speculative Decoding

☁️ Cloud & Infrastructure

  • AWS (Certified Data Engineer)
  • Oracle Cloud (GenAI Certified)
  • Microsoft Fabric
  • Docker, Kubernetes
  • MLOps, CI/CD Pipelines
  • Distributed Systems

💻 Software Engineering

  • Python, Go, C++
  • FastAPI, Flask
  • PostgreSQL, Neo4j
  • Redis, Kafka
  • System Design
  • Data Structures & Algorithms

🚀 Featured Projects

⭐ NEW: Advanced AI Agent System

Live Demo GitHub Strategies

Multi-Strategy AI Reasoning System implementing cutting-edge techniques from recent AI research papers (Chain-of-Thought, Tree-of-Thoughts, ReAct, Multi-Agent). Built with Groq LLM, Tavily Search, and ChromaDB.

| Feature | Description |
| --- | --- |
| 🔗 Chain-of-Thought | Step-by-step reasoning with self-consistency voting |
| 🌳 Tree-of-Thoughts | Multi-path exploration with beam search |
| ⚡ ReAct Agent | Reasoning + Acting with real web search |
| 👥 Multi-Agent | Planner → Worker → Critic collaboration |
| 🧠 LLM Auto-Classifier | Intelligent strategy routing based on task type |

Research Papers Implemented: Chain-of-Thought (Wei et al.) • Self-Consistency (Wang et al.) • Tree of Thoughts (Yao et al.) • ReAct (Yao et al.)
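
As a rough illustration of the self-consistency voting idea listed above, here is a minimal sketch. It is not taken from the project's code: `self_consistency_vote` and the `sample_answer` callable are hypothetical names, and the LLM call is replaced by a stub that replays canned answers. The idea is to sample several final answers for the same question and keep the one that appears most often.

```python
from collections import Counter
from typing import Callable, List, Tuple


def self_consistency_vote(
    sample_answer: Callable[[str], str],  # hypothetical: returns one sampled final answer
    question: str,
    n_samples: int = 5,
) -> Tuple[str, float]:
    """Sample several answers and return the most common one with its vote share."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    # Normalize so trivially different strings ("42" vs "42 ") count as one answer.
    answers: List[str] = [
        sample_answer(question).strip().lower() for _ in range(n_samples)
    ]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / n_samples


if __name__ == "__main__":
    # Stand-in for an LLM sampled at non-zero temperature (assumption for the demo).
    canned = iter(["42", "41", "42 ", "42", "40"])
    answer, share = self_consistency_vote(
        lambda q: next(canned), "What is 6 * 7?", n_samples=5
    )
    print(answer, share)
```

In a real agent, `sample_answer` would wrap a model call that produces a full reasoning chain and extracts only the final answer before voting. The vote share can double as a crude confidence signal.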


🔥 LLM & GPU Optimization

| Project | Description | Impact |
| --- | --- | --- |
| vllm-throughput-benchmark | Comprehensive benchmarking suite for vLLM inference optimization | 12.3× throughput, 4× memory reduction |
| gpu-optimization-mistral | GPU memory optimization for Mistral model deployment | Production-ready optimization |
| quantization-speculative-decoding-benchmark | Speculative decoding implementation for faster inference | Significant latency reduction |
| attention-optimization | Custom attention mechanisms for efficient transformers | Memory-efficient attention |
| LORA-implementation | Low-Rank Adaptation for efficient fine-tuning | Parameter-efficient training |

🤖 AI Agents & Multi-Agent Systems

| Project | Description | Tech Stack |
| --- | --- | --- |
| ai-agent-system | Multi-strategy AI reasoning with CoT, ToT, ReAct, Multi-Agent | Groq, Tavily, ChromaDB |
| AdvancedLLMAgent | Sophisticated LLM agent with tool use capabilities | LangChain, RAG |
| Multi_Agent_Workflow_Automator | Multi-agent orchestration system | Agent frameworks |
| offline-rag-assistant | Privacy-focused RAG system for offline deployment | Vector DB, Embeddings |

🔬 ML Systems & Production

| Project | Description | Tech Stack |
| --- | --- | --- |
| ai-video-analysis-system | End-to-end video analysis with CV models | PyTorch, CV |
| ComputerVision | Computer vision algorithms and implementations | OpenCV, Deep Learning |
| TeluguGPT | Language model for the Telugu language | Transformers, NLP |
| TelecomGPT | Domain-specific LLM for the telecom industry | Fine-tuning, Domain Adaptation |

📊 Data Engineering & ML Pipelines

| Project | Description | Tech Stack |
| --- | --- | --- |
| DistributedKVStore | Distributed key-value store implementation | Go, Distributed Systems |
| end-to-end-data-engineering-project | Complete data pipeline from ingestion to analytics | ETL, Cloud |
| Collaborative_filtering_recommender_system | Scalable recommendation engine | Spark, ML |
| TelecomChurnPredictor | Customer churn prediction system | PySpark, ML |

🛡️ AI Safety & Evaluation

| Project | Description | Focus |
| --- | --- | --- |
| Red-Teaming-Failure-Analysis-Mitigation | LLM red teaming and safety evaluation | AI Safety |
| Generative-Model-Safety-Evaluation-LLMs-Diffusion-Models | Safety benchmarks for generative models | Evaluation |
| llm-long-context-stress-test | Long-context capability testing | Benchmarking |
| simulation-planning-evaluation | Planning capabilities evaluation | Agent Evaluation |

🏆 Certifications


Dec 2024 - Dec 2027

Aug 2025 - Aug 2026

Jun 2024 - Jun 2026

Feb 2025

🎓 Specialized Training

| Certification | Issuer | Date | Key Skills |
| --- | --- | --- | --- |
| Advanced Large Language Model Agents | UC Berkeley EECS | Jul 2025 | Inference-Time Reasoning, DPO, RAG, Multi-agent Systems, Neural-Symbolic AI |
| AI Evals for Everyone | Aishwarya Naresh Reganti & Kiriti Badam | Dec 2025 | LLM Evaluation, Benchmarking |
| Agentforce Specialist | Salesforce | Jun 2025 | Prompt Engineering, AI Agent Development |
| CodePath Technical Interview Prep | CodePath | May 2025 | DSA, Competitive Programming |
| Neo4j Certified Professional | Neo4j | Jul 2024 | Graph Databases, Cypher |
| Certified Data Scientist | 365 Data Science | Nov 2024 | SQL, Deep Learning |
| Machine Learning in Production | edX | Jun 2024 | MLOps, Production ML |

📈 Key Achievements

┌────────────────────────────────────────────────────────────────────────┐
│  🚀 12.3× Throughput Improvement    │  💾 4× Memory Reduction          │
│  in LLM inference optimization      │  through GPU optimization        │
├─────────────────────────────────────┼──────────────────────────────────┤
│  🏅 6+ Cloud Certifications         │  📦 40+ Public Repositories      │
│  AWS, Oracle, Microsoft, Salesforce │  ML, LLM, Systems, Data Eng      │
├─────────────────────────────────────┼──────────────────────────────────┤
│  🎓 UC Berkeley LLM Agents Course   │  🔬 Research in AI Safety        │
│  Completed in Mastery Tier          │  Red-teaming & Evaluation        │
└─────────────────────────────────────┴──────────────────────────────────┘

🎯 What I'm Working On

  • 🔭 Currently: Optimizing LLM inference pipelines for production deployment
  • 🌱 Learning: Advanced techniques in speculative decoding and KV-cache optimization
  • 👯 Collaborating: Open-source ML infrastructure and AI safety projects
  • 💬 Ask me about: GPU optimization, LLM deployment, RAG systems, or scaling ML pipelines

📫 Let's Connect!

I'm actively seeking opportunities in ML Engineering, Deep Learning, LLM/GenAI, and Cloud Architecture roles. Let's discuss how I can contribute to your team!

Email LinkedIn Google Scholar Portfolio


⭐ If you find my projects useful, consider giving them a star!

Pinned

  1. Collaborative_filtering_recommender_system (Public)

    A hybrid product recommendation system leveraging user-based, item-based, and SVD filtering, deployed with Streamlit for interactive UI.

    Python