I am an AI/ML Engineering Leader and Architect specializing in local-first AI systems, LLM inference, RAG pipelines, and agentic search frameworks. Over the past decade, I’ve built production-grade AI platforms, led global engineering teams, and delivered high-impact systems across finance, enterprise software, and AI product development.
As Lead Architect of AI4All, I designed a modular, privacy-first AI ecosystem enabling offline LLM inference, ASR/TTS, semantic search, and multi-agent orchestration. My background includes building real-time streaming analytics platforms, architecting distributed systems, and leading engineering teams across New York, London, and India. I thrive at the intersection of AI architecture, systems engineering, and product design, with a strong focus on deterministic inference, developer experience, and scalable, maintainable AI pipelines. I’m passionate about advancing privacy-first AI, building modular agent frameworks, and creating tools that empower developers, researchers, and creators to work with AI more naturally and efficiently.
My core expertise spans AI architecture, LLM optimization, RAG systems, agentic workflows, distributed systems, cross-platform engineering, technical leadership, and the delivery of production-ready AI products.
I work in C++, Python, Java, TypeScript, and Dart/Flutter, along with modern inference, data, and search tooling including llama.cpp, CTranslate2, DuckDB, sherpa-onnx, and SearXNG.



