An AI-powered FAQ assistant that answers questions from your own documents using Retrieval-Augmented Generation (RAG). Ask anything in plain English — Stella searches your knowledge base, retrieves the most relevant context, and generates a grounded answer with cited sources.
User asks a question
↓
Stella converts it to a semantic embedding
↓
ChromaDB searches for the most relevant document chunks
↓
Retrieved context is passed to the LLM (via Ollama)
↓
LLM generates a grounded answer with sources
↓
Answer displayed in Streamlit UI with typewriter effect
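The steps above can be sketched in miniature. This is a hypothetical illustration, not Stella's actual code: `embed()` stands in for FastEmbedEmbeddings, the similarity search stands in for ChromaDB, and `answer()` only builds the prompt that a real implementation would send to Ollama.

```python
import math

def embed(text: str) -> list[float]:
    # Toy bag-of-letters embedding; the real app uses a semantic model.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    # Cosine-similarity search over chunk embeddings, as ChromaDB does.
    q = embed(question)
    scored = sorted(
        chunks,
        key=lambda c: -sum(a * b for a, b in zip(q, embed(c))),
    )
    return scored[:k]

def answer(question: str, chunks: list[str]) -> str:
    # Ground the question in retrieved context; a real implementation
    # would pass this prompt to the LLM via Ollama.
    context = "\n".join(retrieve(question, chunks))
    return f"Context:\n{context}\n\nQuestion: {question}"

docs = [
    "Stella answers FAQs from your documents.",
    "Ollama serves the local LLM.",
    "Streamlit renders the chat UI.",
]
print(answer("Which component serves the LLM?", docs))
```

The point of the sketch is the shape of the loop: embed, rank by similarity, stuff the top-k chunks into the prompt.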
| Layer | Technology |
|---|---|
| Frontend | Streamlit |
| Backend | FastAPI + Python |
| LLM | Ollama (llama3.2:1b by default; any Ollama model) |
| Embeddings | LangChain FastEmbedEmbeddings |
| Vector DB | ChromaDB |
| Orchestration | Docker Compose |
CPU-only machines work, but responses will be slow. If you don't have a GPU, remove the `deploy.resources` GPU block from `compose.yaml`.
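For reference, a Compose GPU reservation block typically looks like the following; the exact contents of this repository's `compose.yaml` may differ, so treat this as an illustration of what to remove.

```yaml
services:
  ollama:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```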
```bash
git clone https://github.com/yrarjun59/FAQ-Assistant.git
cd FAQ-Assistant
cp .env.example .env
# edit .env to change models or API URLs if needed
docker compose up --build
```

This will automatically:
- Build backend and streamlit images
- Start Ollama server
- Pull the `llama3.2:1b` model
- Start the API and UI
Once everything is up, open the UI at http://localhost:8501.
Copy .env.example to .env and edit as needed:
```env
# Model selection
LLM_MODEL=llama3.2:1b

# API URLs (change only if running outside Docker)
OLLAMA_BASE_URL=http://ollama:11434
STELLA_API_URL=http://backend:8000
```

Place your FAQ or documentation files inside `backend/knowledge/`:
```
backend/
└── knowledge/
    ├── doc1
    └── doc2
```
If your documents are not being ingested, delete the `vector_db` folder and restart the app. Documents must be in JSON format, matching the sample documents in the repository; other formats will not be ingested.
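As an illustration only, a knowledge document might look like the example below. The field names here are a guess, not the repository's actual schema; check the sample documents in `backend/knowledge/` for the exact format expected by the ingester.

```json
[
  {
    "question": "What is Stella?",
    "answer": "An AI-powered FAQ assistant that answers questions from your own documents."
  }
]
```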
To update to the latest version:

```bash
git pull
docker compose up --build
```