
NL2SQL Engine logo

NL2SQL Engine

License: MIT Python 3.12+ React 18 SQLGlot FastAPI PostgreSQL Docker CI

Yes, you could hand-write SQL. But your business analysts can't, and they have questions.

NL2SQL Engine is an open-source pipeline that turns those questions into executable SQL against your databases. It pulls live schema metadata (not a stale export), retrieves similar past examples, asks an LLM for dialect-correct SQL with a short chain-of-thought, then validates with sqlglot, executes under timeouts and row caps, and retries when something breaks. A React 18 console ships beside the API so you can explore, run, explain, and manage connections without leaving the browser.

Repository: anmolg1997/NL2SQL-Engine · Extended docs: the GitHub Wiki and wiki/ in this repo

Tip

Start with Up and running in 5 minutes if you just want Chinook singing and rows on screen. Come back for architecture when you are wiring Snowflake or curating few-shots.


Why this exists

Most “text to SQL” demos stop at a green box in a slide deck. Production needs real catalogs, multiple dialects, guardrails, and a way to recover when the model invents a column named customer_id_maybe. This project optimizes for that boring middle: introspection + retrieval + validation + execution + correction, with Redis holding cache, examples, dynamic connections, and history.

Shipped connectors target PostgreSQL, Snowflake, and SQLite. The shape is deliberately plugin-friendly: implement DatabaseConnector, call register_connector_type, and your engine shows up in config.yaml and the Connections UI like everything else.
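The plugin surface is small enough to sketch. Below is a hypothetical MySQL connector skeleton whose method set mirrors the DatabaseConnector protocol as described in this README (connect, introspect, execute, sample, explain, disconnect); real signatures live in backend/app/connectors/base.py, and everything here except the register_connector_type name is illustrative:

```python
# Hypothetical connector plugin. The method set mirrors the DatabaseConnector
# protocol described in this README; parameter names and bodies are
# illustrative placeholders, not the project's real signatures.
from typing import Any


class MySQLConnector:
    dialect = "mysql"

    def __init__(self, **params: Any) -> None:
        self.params = params          # host, port, user, password, database, ...
        self._conn = None

    def connect(self) -> None:
        ...                           # open a connection using self.params

    def introspect(self) -> dict:
        ...                           # return tables, columns, PK/FK, indexes

    def execute(self, sql: str, max_rows: int = 1000) -> list[dict]:
        ...                           # run sql, return at most max_rows rows

    def sample(self, table: str, n: int = 5) -> list[dict]:
        ...

    def explain(self, sql: str) -> str:
        ...

    def disconnect(self) -> None:
        ...


# In the real codebase you would then register the type so it resolves from
# config.yaml entries with `type: mysql`:
# register_connector_type("mysql", MySQLConnector)
```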


What you get (in one breath)

Multi-dialect SQL — First-class PostgreSQL, Snowflake, and SQLite; the dialect string flows into sqlglot parsing and LLM prompts. MySQL (and friends) follow the same DatabaseConnector protocol: add a plugin, register it, document params.

Self-correction — Failed validation or execution becomes an error string fed back into the LLM for another attempt, up to generation.max_retries.

Schema understanding — Auto-introspection (tables, columns, PK/FK, indexes, estimates, comments) plus optional column annotations (descriptions, tags) merged before the prompt is built.

Few-shot learning — Redis-stored NL/SQL pairs; embedding similarity (LiteLLM) with Jaccard fallback when embeddings are unavailable.

Query explanation — After a successful NL query, the UI calls /query/explain for markdown and step cards grounded in the same schema context as generation.

Safety — sqlglot AST checks, allowed/blocked operation lists, and text heuristics for risky patterns.

Ops-friendly — Docker Compose (API + web + Postgres + Redis), health endpoints, Redis-backed schema cache and dynamic connections.

Note

“Safety” here means syntax, statement class, and heuristics — not a substitute for database permissions, network policy, or your own review of generated SQL on sensitive data.


From question to grid

The happy path is a straight line; correction is the loop you hope you rarely see.

flowchart TD
  NL[Natural language question]:::primary
  CTX[Schema context builder]:::secondary
  FS[Few-shot retrieval]:::accent
  GEN[SQL generator LLM]:::primary
  VAL[SQL validator sqlglot + safety]:::secondary
  CORR[Correction loop]:::store
  EX[Sandboxed executor]:::primary
  RES[Rows + metadata]:::accent
  FMT[API / UI formatting]:::secondary

  NL --> CTX
  CTX --> FS
  FS --> GEN
  GEN --> VAL
  VAL -->|invalid| CORR
  CORR --> GEN
  VAL -->|valid| EX
  EX --> RES
  RES --> FMT

  classDef primary fill:#8b5cf6,stroke:#7c3aed,color:#fff
  classDef secondary fill:#64748b,stroke:#475569,color:#fff
  classDef accent fill:#f5f3ff,stroke:#8b5cf6,color:#4c1d95
  classDef store fill:#0f172a,stroke:#8b5cf6,color:#c4b5fd

At runtime, schema comes from SchemaDiscovery (connector introspection plus optional Redis cache), annotations from ColumnAnnotator, and examples from FewShotManager.
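As a mental model, the whole flow compresses into a few lines. The callables below stand in for the real SchemaDiscovery, FewShotManager, generator, validator, and executor; max_retries plays the role of generation.max_retries. This is an illustrative skeleton, not the project's code:

```python
# Illustrative skeleton of the flowchart above: context -> few-shots ->
# generate -> validate -> execute, with failures looping back as an error
# string for the next generation attempt.
def run_query(question, build_context, retrieve_examples, generate_sql,
              validate, execute, max_retries=2):
    ctx = build_context()                    # schema + annotations
    shots = retrieve_examples(question)      # ranked few-shot pairs
    error = None
    sql = None
    for _ in range(max_retries + 1):
        sql = generate_sql(question, ctx, shots, error)  # error feeds the retry prompt
        ok, error = validate(sql)
        if not ok:
            continue                         # invalid -> correction loop
        try:
            return sql, execute(sql)         # sandboxed execution in the real system
        except Exception as exc:
            error = str(exc)                 # execution failure also retries
    return sql, None                         # retries exhausted
```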


Up and running in 5 minutes

These steps assume you want the Chinook sample and a LiteLLM-compatible key (e.g. OpenAI).

1. Clone and copy environment

git clone https://github.com/anmolg1997/NL2SQL-Engine.git
cd NL2SQL-Engine
cp env.example .env

Edit .env: set LLM_API_KEY, DATABASE_URL, and REDIS_URL (the defaults in env.example point at a local Postgres/Redis pair set up for the Chinook sample).

2. Install everything the Makefile expects

make setup

You should see a Python venv under backend/.venv and npm install completing in frontend/ without errors.

3. Run the API

make dev-api

Terminal output should show Uvicorn listening on http://0.0.0.0:8000 (or your HOST/PORT). Open http://localhost:8000/docs for interactive OpenAPI.

4. Run the web console (second terminal)

make dev-web

Vite typically prints Local: http://localhost:5173/ — the dev server proxies API routes to port 8000.

Tip

One shell, two processes: make dev runs API and Vite together if you prefer not to juggle terminals.

5. Prove it with curl

curl -s -X POST http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -d '{"connection":"chinook","question":"List all albums by the artist AC/DC"}' | jq .

Expect JSON with sql, validation (ok/errors/warnings), result (columns + rows when execution succeeds), and attempts (each try’s SQL, validation flags, and execution errors).
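If you script against that response, a tiny helper makes the shape concrete. The field names below follow this README; the helper itself is hypothetical, not part of the project:

```python
# Hypothetical helper around the /query response shape described above
# (sql, validation, result, attempts).
def summarize(resp: dict) -> str:
    attempts = resp.get("attempts") or []
    result = resp.get("result")
    if result is not None:
        return (f"ok: {len(result.get('rows', []))} row(s) x "
                f"{len(result.get('columns', []))} column(s) "
                f"in {len(attempts)} attempt(s)")
    errors = (resp.get("validation") or {}).get("errors") or []
    reason = errors[0] if errors else "see attempts[].execution_error"
    return f"failed after {len(attempts)} attempt(s): {reason}"
```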

Docker shortcut

export LLM_API_KEY=sk-...
docker compose up --build

Default config.yaml defines a connection named chinook pointing at ${DATABASE_URL}. With Docker Postgres init scripts (docker/postgres/init/), the Chinook schema loads automatically when using Compose.

Important

At least one connector must connect at startup. If /health/ready returns degraded with no active connections, fix DSNs in config.yaml and watch logs for connector_startup_failed.


How the code is laid out

Python lives under backend/app/ (uvicorn uses PYTHONPATH=backend). The diagram is dense on purpose: it is the map when you are debugging “who called introspection?”

flowchart TB
  subgraph api [app/api]
    R1[routes/query]:::accent
    R2[routes/schema]:::accent
    R3[routes/connections]:::accent
    R4[routes/examples]:::accent
    R5[routes/history]:::accent
    R6[routes/health]:::accent
  end
  subgraph core [app/core]
    CFG[config.py Settings]:::secondary
    LLM[llm.py LiteLLMProvider]:::primary
    REG[registry.py]:::secondary
    EXC[exceptions.py]:::secondary
  end
  subgraph conn [app/connectors]
    BASE[base.py DatabaseConnector]:::primary
    PG[postgres.py]:::secondary
    SF[snowflake.py]:::secondary
    LQ[sqlite.py]:::secondary
    POOL[pool.py ConnectorPool]:::store
  end
  subgraph schema [app/schema]
    DISC[discovery.py]:::primary
    CACHE[cache.py SchemaCache]:::secondary
    MOD[models.py]:::secondary
    ANN[annotator.py ColumnAnnotator]:::accent
  end
  subgraph gen [app/generation]
    ENG[engine.py GenerationEngine]:::primary
    CTXB[context_builder.py]:::secondary
    SQLG[sql_generator.py]:::secondary
    EXP[explainer.py QueryExplainer]:::accent
  end
  subgraph val [app/validation]
    VSQL[validator.py SQLValidator]:::primary
    SAFE[safety.py SafetyChecker]:::secondary
  end
  subgraph exec [app/execution]
    SAN[executor.py SandboxedExecutor]:::primary
  end
  subgraph corr [app/correction]
    LOOP[CorrectionLoop]:::store
  end
  subgraph fs [app/fewshot]
    MGR[manager.py FewShotManager]:::primary
  end
  subgraph svc [app/services]
    CST[connection_store]:::secondary
    HST[history_store]:::secondary
  end
  api --> ENG
  ENG --> DISC
  ENG --> MGR
  ENG --> CTXB
  ENG --> SQLG
  ENG --> VSQL
  ENG --> SAN
  ENG --> LOOP
  SQLG --> LLM
  MGR --> LLM
  EXP --> LLM
  DISC --> POOL
  DISC --> CACHE
  SAN --> POOL
  R3 --> CST
  R5 --> HST
  PG & SF & LQ --> BASE

  classDef primary fill:#8b5cf6,stroke:#7c3aed,color:#fff
  classDef secondary fill:#64748b,stroke:#475569,color:#fff
  classDef accent fill:#f5f3ff,stroke:#8b5cf6,color:#4c1d95
  classDef store fill:#0f172a,stroke:#8b5cf6,color:#c4b5fd

Frontend (frontend/src/components/): QueryInput, SchemaExplorer, SQLPreview, ResultTable, QueryExplainer, ConnectionManager, QueryHistory — wired from App.tsx with Zustand state and Axios calls to the API.


Design choices (the short story)

Connectors as plugins. Databases disagree on protocols, catalogs, and error shapes. A single DatabaseConnector protocol (connect, introspect, execute, sample, explain, disconnect) keeps dialect code in postgres.py / snowflake.py / sqlite.py while ConnectorPool and SchemaDiscovery stay dumb and reusable. New engines register with register_connector_type and resolve through build_connector(type).
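The registry half of that scheme is conceptually tiny. Here is one way register_connector_type / build_connector could resolve types; the project's actual implementation in backend/app/core/registry.py may differ in details:

```python
# Sketch of registry-based plugin resolution. Internals are illustrative;
# only the function names come from this README.
_CONNECTOR_TYPES: dict[str, type] = {}

def register_connector_type(name: str, cls: type) -> None:
    _CONNECTOR_TYPES[name] = cls

def build_connector(type_name: str, **params):
    cls = _CONNECTOR_TYPES.get(type_name)
    if cls is None:
        raise ValueError(f"unknown connector type: {type_name!r}")
    return cls(**params)   # params come from the config.yaml entry
```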

sqlglot before the database. Parsing with the target dialect catches syntax errors locally and yields a normalized SQL string from the AST before execution. That stabilizes what actually hits the wire.

Correction without a second model. When validation or execution fails, the same prompt template gets a Validation: … or Execution: … appendix. That is cheaper than bespoke repair parsers and works across dialects because database errors are often explicit.

Few-shots that sound like your users. Redis stores examples; at query time, cosine similarity on embeddings ranks them. If embeddings fail, Jaccard token overlap still produces an ordering — no hard dependency on the embedding API for the system to function.
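The Jaccard fallback is simple enough to show in full; the tokenization and tie-breaking here are illustrative, not the project's exact code:

```python
# Jaccard token overlap as a ranking fallback when embeddings are
# unavailable: intersection over union of lowercase token sets.
def jaccard(a: str, b: str) -> float:
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def rank_examples(question: str, examples: list[dict], k: int = 3) -> list[dict]:
    # Sort stored NL/SQL pairs by overlap with the incoming question.
    return sorted(examples,
                  key=lambda ex: jaccard(question, ex["natural_language"]),
                  reverse=True)[:k]
```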

Redis as the shared backstage. Schema snapshots (nl2sql:schema:{connection}), annotations, few-shot payloads and vectors, dynamic connections, and history all share one operational dependency. TTL on schema cache avoids hammering large warehouses; annotate and refresh paths invalidate or bypass when freshness matters.
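The TTL behavior is worth internalizing. Here is an in-process sketch of the schema-cache idea (Redis with key expiry in the real system; the class and method names are illustrative, only the key pattern comes from this README):

```python
import json
import time

# In-process stand-in for the Redis-backed schema cache: entries under
# nl2sql:schema:{connection} expire after ttl_seconds, and explicit
# invalidation covers the annotate/refresh paths.
class SchemaCacheSketch:
    def __init__(self, ttl_seconds: float = 3600):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, str]] = {}

    def get(self, connection: str):
        item = self._store.get(f"nl2sql:schema:{connection}")
        if item is None:
            return None
        expires, payload = item
        if time.monotonic() > expires:
            return None                      # TTL elapsed: force re-introspection
        return json.loads(payload)

    def set(self, connection: str, schema: dict) -> None:
        self._store[f"nl2sql:schema:{connection}"] = (
            time.monotonic() + self.ttl, json.dumps(schema))

    def invalidate(self, connection: str) -> None:
        self._store.pop(f"nl2sql:schema:{connection}", None)
```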


Repository tree

NL2SQL-Engine/
├── backend/
│   ├── app/
│   │   ├── main.py              # FastAPI app, lifespan, routers
│   │   ├── state.py             # AppState container
│   │   ├── core/                # Settings, LLM, registry, exceptions
│   │   ├── connectors/          # postgres, sqlite, snowflake, pool
│   │   ├── schema/              # models, discovery, cache, annotator
│   │   ├── generation/          # engine, context, sql_generator, explainer
│   │   ├── validation/          # sqlglot validator + safety heuristics
│   │   ├── execution/           # SandboxedExecutor (executor.py)
│   │   ├── correction/          # CorrectionLoop
│   │   ├── fewshot/             # FewShotManager + Redis storage
│   │   ├── api/                 # routes + dependencies
│   │   └── services/            # Redis: connections, history
│   ├── tests/
│   ├── Dockerfile
│   └── requirements.txt
├── frontend/                    # Vite + React 18 + Tailwind
├── wiki/                        # Markdown wiki (mirror or supplement GitHub Wiki)
├── docs/assets/                 # Logo and future static assets
├── config.yaml                  # llm, connectors, generation, cache, execution
├── docker-compose.yml
├── Makefile
└── env.example

Configuration (where the knobs live)

Configuration merges config.yaml with environment variables (pydantic-settings). YAML strings may use ${VAR} or ${VAR:-default} interpolation after load.

| Area | Source | Notes |
|---|---|---|
| LLM | llm.* in YAML; LLM_* env | Model, temperature, max tokens, embedding model, optional api_base |
| Connectors | connectors: list | Each entry: name, type (postgres / sqlite / snowflake), params |
| Generation | generation.few_shot_k, max_retries, safety | Safety lists drive SafetyChecker token scans |
| Schema cache | schema_cache.ttl_seconds | Redis TTL for serialized SchemaInfo |
| Execution | execution.timeout_seconds, max_rows, allow_mutations | Executor defaults; /query can override mutations per request |

Full reference: wiki/Configuration.md.

Caution

execution.allow_mutations defaults to false. Even when you enable mutations, DDL stays blocked at the AST level; only DML passes when policy allows.


HTTP API (cheat sheet)

| Method | Path | Description |
|---|---|---|
| GET | / | Service metadata + link to /docs |
| GET | /health/live | Liveness |
| GET | /health/ready | Redis ping; degraded if no connectors registered |
| POST | /query | NL → SQL → validate → execute; body: connection, question, optional allow_mutations |
| POST | /query/explain | Natural-language explanation + steps for given SQL |
| GET | /schema/{connection} | Full SchemaInfo; ?refresh=true bypasses cache |
| GET | /schema/{connection}/tables | Table list |
| POST | /schema/{connection}/annotate | Set column description/tags; invalidates schema cache |
| GET | /connections | List connections (secrets masked) |
| POST | /connections | Add connection (persisted in Redis) |
| PUT | /connections/{name} | Update |
| DELETE | /connections/{name} | Remove (blocked for names only in config.yaml) |
| POST | /connections/test | Test type + params without saving |
| GET | /examples | List few-shot examples |
| POST | /examples | Create |
| GET | /examples/{id} | Get one |
| PUT | /examples/{id} | Update |
| DELETE | /examples/{id} | Delete |
| GET | /history | Query history; ?q= search, ?limit= |

Request/response examples: wiki/API-Reference.md. Interactive docs: http://localhost:8000/docs.


The React console

| Component | Role |
|---|---|
| QueryInput | Natural language question and run action |
| SchemaExplorer | Browse tables/columns from /schema/... |
| SQLPreview | Generated SQL with editor styling |
| ResultTable | Tabular results from /query |
| QueryExplainer | Markdown + step cards from /query/explain |
| ConnectionManager | CRUD + test for /connections |
| QueryHistory | Local + server history patterns (store syncs with /history where used) |

Set VITE_API_URL for production builds that call the API directly; leave empty in dev to use the Vite proxy.


Local development

make test      # pytest
make lint      # ruff check
make format    # ruff format

Documentation map


Contributing

Contributions are welcome: bug reports, connector plugins, docs, and tests. See wiki/Contributing.md for guidelines (Python style, PR checklist, where to add tests).


Part of the AI Engineering Portfolio

This repo is one piece of a larger production AI toolkit. Each project stands alone, but they're designed to work together.

| Project | What it does |
|---|---|
| Multi-Agent-AI-Framework | Multi-agent orchestration with Google ADK — coordinator, planner, coder, reviewer |
| Enterprise-RAG-System | Hybrid search RAG with guardrails, reranking, and Langfuse observability |
| SLM-From-Scratch | Build language models from scratch — tokenizer, transformer, training, alignment |
| LLM-Finetuning-Toolkit | LoRA/QLoRA fine-tuning with pluggable backends, YAML recipes, MLflow |
| Multi-LoRA-Serve | Multi-adapter inference gateway — one base model, many LoRA adapters per request |
| LoRA-Factory | Adapter lifecycle — train, evaluate, merge (TIES/DARE), version, and publish LoRAs |
| Domain-Adaptive-LLM | Domain specialization for medical, legal, finance, code with safety guardrails |

License

MIT License © 2025 Anmol Jaiswal


Author

Anmol Jaiswal · GitHub: @anmolg1997

Focus: production AI systems — multi-agent orchestration, RAG, knowledge graphs, LLM observability, efficient inference, fine-tuning.


Redis keys (mental model)

| Key / pattern | Purpose |
|---|---|
| nl2sql:schema:{connection} | Cached SchemaInfo JSON (TTL from schema_cache.ttl_seconds) |
| nl2sql:annotate:{connection}:{table}:{column} | Column description/tags JSON |
| nl2sql:fewshot:data | Redis hash: example id → FewShotExample JSON |
| nl2sql:fewshot:emb | Redis hash: example id → embedding vector JSON |
| nl2sql:connections | JSON array of dynamic ConnectorConfig (merged at startup with YAML) |
| nl2sql:history | List (LPUSH) of query history entries, trimmed to 500 |

Example: add a few-shot pair

After the API is up and Redis is reachable:

curl -s -X POST http://localhost:8000/examples \
  -H "Content-Type: application/json" \
  -d '{
    "natural_language": "How many customers are in Brazil?",
    "sql": "SELECT COUNT(*) AS cnt FROM \"Customer\" WHERE \"Country\" = '\''Brazil'\'';",
    "dialect": "postgres",
    "tags": ["chinook", "count"]
  }' | jq .

Use the same dialect string your connection reports in SchemaInfo (e.g. postgres for PostgreSQL). The next /query will prefer examples whose dialect matches.


Example: dynamic SQLite connection

curl -s -X POST http://localhost:8000/connections \
  -H "Content-Type: application/json" \
  -d '{
    "name": "local_sqlite",
    "type": "sqlite",
    "params": { "path": "/absolute/path/to/app.db" }
  }' | jq .

Then: POST /query with "connection": "local_sqlite". Connections defined only in config.yaml cannot be deleted via API — edit the file instead.


Snowflake environment variables

When using the Snowflake connector, set (see env.example):

  • SNOWFLAKE_ACCOUNT, SNOWFLAKE_USER, SNOWFLAKE_PASSWORD
  • SNOWFLAKE_WAREHOUSE, SNOWFLAKE_DATABASE
  • SNOWFLAKE_SCHEMA (default PUBLIC)

Declare a connector in config.yaml with type: snowflake and params that match your deployment (see wiki/Connectors.md).


Reading attempts in /query responses

Each correction round appends an object shaped like:

{
  "index": 0,
  "sql": "SELECT ...",
  "validation_ok": false,
  "validation_errors": ["Parse error: ..."],
  "execution_error": null
}

On success, the last entry typically has validation_ok: true, execution_error: null, and the executor returns rows in result. If all retries fail, result may be null while sql holds the last generated statement — inspect validation_errors and execution_error for the failure chain.
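When debugging, a small helper can pull the failure chain out of that array. The field names follow the shape documented above; the helper itself is hypothetical:

```python
# Walk the attempts array and report why each failed attempt failed,
# using the index / validation_ok / validation_errors / execution_error
# fields described above.
def failure_chain(attempts: list[dict]) -> list[str]:
    chain = []
    for a in attempts:
        if not a.get("validation_ok", False):
            chain.append(f"attempt {a['index']}: validation: "
                         + "; ".join(a.get("validation_errors") or []))
        elif a.get("execution_error"):
            chain.append(f"attempt {a['index']}: execution: {a['execution_error']}")
    return chain
```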


Security and operations

  • Secrets: /connections masks password, private_key, and token in list responses; still protect Redis and .env like production credentials.
  • Mutations: Default execution.allow_mutations is false. Even when allow_mutations is true, validation must permit INSERT/UPDATE/DELETE; DDL remains blocked at the AST level.
  • Safety warnings: SafetyChecker emits warnings for suspicious text patterns; severe issues are enforced via AST classification and blocked-operation tokens.
  • Row limits: max_rows truncates result sets and sets truncated on QueryResult when applicable (connector-dependent).
  • CORS: Configure CORS_ORIGINS for your deployment; default includes local Vite and Docker web ports.

Troubleshooting

| Symptom | Check |
|---|---|
| ready = degraded, no connections | At least one connector must connect at startup; verify config.yaml DSNs and logs for connector_startup_failed |
| Empty few-shot retrieval | GET /examples — dialect filter must match schema dialect |
| Stale schema after DDL | GET /schema/{connection}?refresh=true or restart after cache TTL |
| LLM returns prose, not SQL | Prompt expects a ```sql fence or a line starting with SELECT/WITH; see LiteLLMProvider._extract_sql_block |
| Embedding errors | Few-shot falls back to Jaccard; check LLM_API_KEY and embedding model name |
| Frontend cannot reach API | VITE_API_URL for standalone builds; dev uses Vite proxy to port 8000 |

Roadmap ideas (not commitments)

Community PRs could add: MySQL connector, HTTP API for EXPLAIN using DatabaseConnector.explain, OAuth for Snowflake, streaming LLM responses, and RBAC per connection. The current codebase is intentionally small and composable to make those additions localized.


Where NL2SQL shines (use cases)

  • Business intelligence NL interface — Analysts ask “top 10 artists by track count last year” against a warehouse or reporting replica.
  • Database exploration for non-technical users — Schema explorer + explanations lower the barrier before exporting to CSV or a notebook.
  • Automated report generation — Fixed templates become parameterized NL prompts; validation and row caps reduce runaway queries.
  • Data warehouse querying — Snowflake connector + large-schema truncation in SchemaContextBuilder keep prompts within a practical character budget.
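The character-budget idea in the last point can be sketched simply, assuming a drop-whole-tables-once-over-budget policy; SchemaContextBuilder's actual strategy may differ:

```python
# Fit per-table DDL snippets into a prompt character budget: keep whole
# tables until the next one would overflow, then mark the truncation.
def fit_schema(tables: list[str], budget: int) -> str:
    out, used = [], 0
    for t in tables:
        if used + len(t) + 1 > budget:
            out.append("-- (schema truncated)")
            break
        out.append(t)
        used += len(t) + 1    # +1 for the joining newline
    return "\n".join(out)
```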

More scenarios: wiki/Use-Cases.md.
