ML & Deep Learning Engineer with expertise in GPU optimization, LLM inference, and cloud-native AI solutions. Achieved 12.3× throughput improvement and 4× memory reduction in production ML systems. Passionate about building scalable AI infrastructure and deploying cutting-edge models to production.
Consistent contributions across ML/AI projects from 2024-2026
Multi-Strategy AI Reasoning System implementing cutting-edge techniques from recent AI research papers (Chain-of-Thought, Tree-of-Thoughts, ReAct, Multi-Agent). Built with Groq LLM, Tavily Search, and ChromaDB.
| Feature | Description |
|---|---|
| Chain-of-Thought | Step-by-step reasoning with self-consistency voting |
| Tree-of-Thoughts | Multi-path exploration with beam search |
| ReAct Agent | Reasoning + Acting with real web search |
| Multi-Agent | Planner → Worker → Critic collaboration |
| LLM Auto-Classifier | Intelligent strategy routing based on task type |
Research Papers Implemented: Chain-of-Thought (Wei et al.) • Self-Consistency (Wang et al.) • Tree of Thoughts (Yao et al.) • ReAct (Yao et al.)
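
As a rough illustration of the self-consistency voting idea listed above, the sketch below samples several reasoning paths and keeps the most frequent final answer. It is a minimal sketch, not the project's actual code: `self_consistency_vote`, `make_stub_sampler`, and the `sample_answer` callable are hypothetical names, and the stub sampler stands in for a real LLM call.

```python
from collections import Counter
from typing import Callable, Iterable, List


def self_consistency_vote(
    sample_answer: Callable[[str], str],
    question: str,
    n_samples: int = 5,
) -> str:
    """Sample several reasoning paths and return the most common final answer.

    `sample_answer` is a hypothetical callable that runs one chain-of-thought
    pass and returns only the final answer as a string.
    """
    # Normalize answers so trivial formatting differences do not split votes.
    answers: List[str] = [
        sample_answer(question).strip().lower() for _ in range(n_samples)
    ]
    # Majority vote; on a tie, Counter keeps the answer encountered first.
    return Counter(answers).most_common(1)[0][0]


def make_stub_sampler(canned: Iterable[str]) -> Callable[[str], str]:
    """Return a fake sampler that replays canned answers (stand-in for an LLM)."""
    it = iter(canned)
    return lambda question: next(it)


if __name__ == "__main__":
    sampler = make_stub_sampler(["42", " 42 ", "41", "42", "40"])
    print(self_consistency_vote(sampler, "What is 6 * 7?", n_samples=5))
```

In practice the sampler would call the model with a non-zero temperature so that the reasoning paths differ; the voting step itself stays the same.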
| Project | Description | Impact |
|---|---|---|
| vllm-throughput-benchmark | Comprehensive benchmarking suite for vLLM inference optimization | 12.3× throughput, 4× memory reduction |
| gpu-optimization-mistral | GPU memory optimization for Mistral model deployment | Production-ready optimization |
| quantization-speculative-decoding-benchmark | Speculative decoding implementation for faster inference | Significant latency reduction |
| attention-optimization | Custom attention mechanisms for efficient transformers | Memory-efficient attention |
| LORA-implementation | Low-Rank Adaptation for efficient fine-tuning | Parameter-efficient training |
| Project | Description | Tech Stack |
|---|---|---|
| ai-agent-system | Multi-strategy AI reasoning with CoT, ToT, ReAct, Multi-Agent | Groq, Tavily, ChromaDB |
| AdvancedLLMAgent | Sophisticated LLM agent with tool use capabilities | LangChain, RAG |
| Multi_Agent_Workflow_Automator | Multi-agent orchestration system | Agent frameworks |
| offline-rag-assistant | Privacy-focused RAG system for offline deployment | Vector DB, Embeddings |
| Project | Description | Tech Stack |
|---|---|---|
| ai-video-analysis-system | End-to-end video analysis with CV models | PyTorch, CV |
| ComputerVision | Computer vision algorithms and implementations | OpenCV, Deep Learning |
| TeluguGPT | Language model for Telugu language | Transformers, NLP |
| TelecomGPT | Domain-specific LLM for telecom industry | Fine-tuning, Domain Adaptation |
| Project | Description | Tech Stack |
|---|---|---|
| DistributedKVStore | Distributed key-value store implementation | Go, Distributed Systems |
| end-to-end-data-engineering-project | Complete data pipeline from ingestion to analytics | ETL, Cloud |
| Collaborative_filtering_recommender_system | Scalable recommendation engine | Spark, ML |
| TelecomChurnPredictor | Customer churn prediction system | PySpark, ML |
| Project | Description | Focus |
|---|---|---|
| Red-Teaming-Failure-Analysis-Mitigation | LLM red teaming and safety evaluation | AI Safety |
| Generative-Model-Safety-Evaluation-LLMs-Diffusion-Models | Safety benchmarks for generative models | Evaluation |
| llm-long-context-stress-test | Long-context capability testing | Benchmarking |
| simulation-planning-evaluation | Planning capabilities evaluation | Agent Evaluation |
| Dec 2024 - Dec 2027 | Aug 2025 - Aug 2026 | Jun 2024 - Jun 2026 | Feb 2025 |
| Certification | Issuer | Date | Key Skills |
|---|---|---|---|
| Advanced Large Language Model Agents | UC Berkeley EECS | Jul 2025 | Inference-Time Reasoning, DPO, RAG, Multi-agent Systems, Neural-Symbolic AI |
| AI Evals for Everyone | Aishwarya Naresh Reganti & Kiriti Badam | Dec 2025 | LLM Evaluation, Benchmarking |
| Agentforce Specialist | Salesforce | Jun 2025 | Prompt Engineering, AI Agent Development |
| CodePath Technical Interview Prep | CodePath | May 2025 | DSA, Competitive Programming |
| Neo4j Certified Professional | Neo4j | Jul 2024 | Graph Databases, Cypher |
| Certified Data Scientist | 365 Data Science | Nov 2024 | SQL, Deep Learning |
| Machine Learning in Production | edX | Jun 2024 | MLOps, Production ML |
| Highlight | Detail |
|---|---|
| 12.3× Throughput Improvement | in LLM inference optimization |
| 4× Memory Reduction | through GPU optimization |
| 6+ Cloud Certifications | AWS, Oracle, Microsoft, Salesforce |
| 40+ Public Repositories | ML, LLM, Systems, Data Eng |
| UC Berkeley LLM Agents Course | Completed in Mastery Tier |
| Research in AI Safety | Red-teaming & Evaluation |
- Currently: Optimizing LLM inference pipelines for production deployment
- Learning: Advanced techniques in speculative decoding and KV-cache optimization
- Collaborating: Open-source ML infrastructure and AI safety projects
- Ask me about: GPU optimization, LLM deployment, RAG systems, or scaling ML pipelines
I'm actively seeking opportunities in ML Engineering, Deep Learning, LLM/GenAI, and Cloud Architecture roles. Let's discuss how I can contribute to your team!
