sqlglot, in C++. 45 dialects, 126× faster on benchmark SQL, 252× on the kind your ORM generates when nobody's looking. The performance gap scales with query complexity; see Benchmarks.
Supports stored procedures (PL/pgSQL, T-SQL, MySQL, PL/SQL): where sqlglot falls back to passthrough, libsqlglot parses them into the AST.
Anywhere SQL parsing meets the hot path: proxies, sidecars, migration, linters, Git pre-commit hooks, CI/CD pipelines. Also a replacement for every regex that's pretending to be a parser.
Because life is too short to parse SQL in Python.
Inspired by the original sqlglot, which did the decade-long work of mapping 31+ SQL dialects into an elegant, universal AST. libsqlglot does the comparatively trivial work of compiling it: the algorithm was already O(n), the runtime wasn't.
- Functionality
- Quickstart
- Differences from original sqlglot
- Building
- Architecture
- Benchmarks
- Examples
- Supported SQL dialects
- Contributing
- Licence
Transpiles SQL dialects via sqlglot AST. Full stored procedure support. Fail-fast errors with exact line and column. Python bindings available at 95-98% of C++ speed (import libsqlglot as sqlglot and go).
Handles the full SQL surface: SELECT, INSERT, UPDATE, DELETE, CREATE TABLE, ALTER TABLE, DROP TABLE, TRUNCATE, MERGE, plus stored procedures (CALL, RETURN, DECLARE, IF/ELSEIF/ELSE, WHILE, FOR loops). Also handles CTEs, window functions, subqueries, and various JOIN types.
Compatible with 45 dialects (see Supported SQL dialects for the full list).
#include <libsqlglot/transpiler.h>
using namespace libsqlglot;
// Transpile between dialects (MySQL → PostgreSQL)
std::string output = Transpiler::transpile(
"SELECT `id`, `name` FROM `users` LIMIT 10",
Dialect::MySQL,
Dialect::PostgreSQL
);
// Returns: SELECT "users"."id", "users"."name" FROM "users" LIMIT 10
// Parse SQL into AST
Arena arena;
auto stmt = Transpiler::parse(arena, "SELECT name FROM users WHERE age > 18");
// Returns: AST with SelectStmt node
// Optimise AST (column qualification, predicate pushdown, constant folding)
Transpiler::optimize(arena, stmt);
// Modifies AST: name → users.name, age → users.age
// Generate SQL from AST
std::string sql = Transpiler::generate(stmt, Dialect::PostgreSQL);
// Returns: SELECT users.name FROM users WHERE users.age > 18

See Supported SQL dialects for all available Dialect:: values.
Available on PyPI: pip install libsqlglot
System requirements: Linux with glibc 2.35+ (Ubuntu 22.04+, Debian 12+, RHEL 9+)
import libsqlglot as sqlglot
# Transpile
sql = "SELECT `id`, `name` FROM `users` LIMIT 10"
result = sqlglot.transpile(sql, read="mysql", write="postgres")
result = sqlglot.transpile(sql, "mysql", "postgres") # Both styles work
# Parse
stmt = sqlglot.parse_one(sql) # single statement
stmts = sqlglot.parse("SELECT 1; SELECT 2") # multiple statements
# Generate
sql = stmt.sql() # default ANSI
sql = stmt.sql(dialect="postgres", pretty=True) # pretty-print
sql = stmt.sql(dialect="sqlserver") # TRUE → 1
# AST traverse
columns = stmt.find_all(sqlglot.ExprType.COLUMN)
tables = stmt.find_all(sqlglot.ExprType.TABLE_REF)
stmt.walk(lambda n: print(n.type))
# Optimise
optimized = sqlglot.optimize(stmt)
# Builder
stmt = (sqlglot.select(["id", "name"])
.from_("users")
.where("active = TRUE")
.order_by("name")
.limit(10))
# Diff
diff = sqlglot.diff("SELECT id FROM users", "SELECT user_id FROM users")

libsqlglot can be used in Unix pipelines via the Python CLI:
# Basic transpilation
echo "SELECT \`id\` FROM \`users\`" | python3 -m libsqlglot -r mysql -w postgres
# Output: SELECT "id" FROM "users"
# Pipeline with grep
cat queries.sql | python3 -m libsqlglot -r mysql -w bigquery | grep "SELECT"
# Process multiple files
cat *.sql | python3 -m libsqlglot -r sqlserver -w postgres > output.sql
# Combine with other tools
find . -name "*.sql" -exec cat {} \; | python3 -m libsqlglot -r mysql -w postgres | wc -l

The CLI reads SQL from stdin and writes transpiled SQL to stdout, making it composable with standard Unix tools.
See Supported SQL dialects for all available dialect names.
Python API: parse(), parse_one(), generate(), transpile(), optimize(), diff(), .sql(), .find_all(), .walk(), select() builder.
Performance: 95-98% of C++ speed. Overhead is ~125ns per call (10ns function call + 100ns string marshal + 15ns misc). On typical 2.5μs parse, this is 5% overhead. On complex 25μs parse, 0.5% overhead.
Memory: Thread-local arenas, no runtime dependencies.
| libsqlglot | Python sqlglot | |
|---|---|---|
| Performance | 32-242× faster (126× avg) | Baseline |
| Stored procedures | Support for PL/pgSQL, T-SQL, MySQL, PL/SQL | Limited (exp.Command passthrough) |
| Error handling | Fail-fast with precise errors (line, column, context) | Error recovery (IDE-friendly, slower) |
| Memory | Arena allocation (O(1) cleanup) | Garbage collection |
| Optimiser | Column qualification, predicate pushdown, constant folding, subquery elimination | Same + additional passes + full execution engine |
| Codebase | Header-only C++26 library | 50,000+ lines Python |
| Keywords | C++26 reflection: auto-generated from enum (300+ keywords, zero maintenance) | Manually maintained dictionaries |
| Binary | C++ library 14KB, Python wheel 1.5MB | N/A |
| Dialects | 45 SQL dialects (14 unique to libsqlglot, including ANSI) | 32 dialects (including PRQL, which libsqlglot doesn't support) |
| SQL coverage | Same as Python sqlglot | Full SQL support |
| Dependencies | None (no runtime deps) | None |
Requires C++26 (GCC 14+ with -freflection) and CMake 3.21+.
C++26 features used:
- Keyword reflection (`std::meta`): auto-generates 300+ keyword mappings from the `TokenType` enum at compile time. Zero maintenance, impossible to desync.
- Dialect reflection: build-time code generation parses the `Dialect` enum and generates compile-time mappings (CMake → Python script → generated header). When GCC fixes the reflection bug, this will switch to pure C++26 reflection.
- ExprType bindings: Python bindings are auto-generated from the C++ `ExprType` enum (124 values). Run `python3 scripts/generate_expr_type_bindings.py > src/python/expr_type_bindings_generated.h` after modifying `expression.h`.
- Advanced constexpr: perfect hash tables, compile-time string processing.
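The compile-time keyword table can't be shown with today's release compilers (C++26 reflection is still GCC-trunk-only), but the shape of the generated data and the constexpr lookup over it can. This is an illustrative sketch with a hand-written four-entry table standing in for the 300+ entries the reflection step emits; `KeywordEntry` and the enum values here are hypothetical, not libsqlglot's actual names:

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string_view>

// Hypothetical subset of the TokenType enum; the real one has 300+ entries.
enum class TokenType : std::uint16_t { Select, From, Where, Limit };

struct KeywordEntry { std::string_view text; TokenType type; };

// With C++26 reflection this table is generated from the enum itself; here it
// is written out by hand purely to show the shape of the generated data.
inline constexpr std::array<KeywordEntry, 4> kKeywords{{
    {"SELECT", TokenType::Select},
    {"FROM",   TokenType::From},
    {"WHERE",  TokenType::Where},
    {"LIMIT",  TokenType::Limit},
}};

// Constexpr lookup: fully evaluated at compile time for constant arguments.
constexpr bool is_keyword(std::string_view s) {
    for (const auto& e : kKeywords)
        if (e.text == s) return true;
    return false;
}

static_assert(is_keyword("SELECT"));
static_assert(!is_keyword("USERS"));
```

The point of generating the table from the enum is that adding a token type can never silently desync the keyword map: there is only one source of truth.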
The easiest way to build with GCC trunk + reflection support:
# Build the project (first build takes 30-45 min to compile GCC trunk)
docker compose -f docker/docker-compose.yml run --rm build
# Run tests
docker compose -f docker/docker-compose.yml run --rm test
# Build Python wheel
docker compose -f docker/docker-compose.yml run --rm wheel
# Development shell
docker compose -f docker/docker-compose.yml run --rm dev

See docker/README.md for full documentation.
Requires GCC trunk built from source with -freflection support:
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j$(nproc)
sudo cmake --install build

# Python bindings (editable install)
pip install -e .

# Build with tests and benchmarks
cmake -B build -DCMAKE_BUILD_TYPE=Release \
    -DLIBSQLGLOT_BUILD_TESTS=ON \
    -DLIBSQLGLOT_BUILD_BENCHMARKS=ON
cmake --build build -j$(nproc)
ctest --test-dir build

Compiled sizes (stripped, -O3): C++ library 14KB, Python extension 2.0MB.
Code quality: Compiles with -Wall -Wextra -Wpedantic -Werror. No runtime dependencies. No RTTI. Passes 27,040 assertions across 361 test cases. Fuzz-tested with libFuzzer + AddressSanitizer.
Compiler optimisations (Release builds): Link-Time Optimisation (LTO), aggressive inlining, constant folding, constant merging, symbol visibility optimisation. All enabled by default with -DCMAKE_BUILD_TYPE=Release.
Architecture-specific builds: For maximum performance on your specific CPU, build with native architecture optimisations:
cmake -B build -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-march=native"
cmake --build build

This enables CPU-specific instructions (AVX2, AVX-512, etc.) for your exact processor, typically 5-15% faster than generic builds.
Benchmarking: Comprehensive benchmark suite available. Build with -DLIBSQLGLOT_BUILD_BENCHMARKS=ON to measure performance on your workload.
Header-only design: you only pay for what you use. 19 header files, no .cpp. See include/libsqlglot/ for the full layout. Core files: parser.h (4203 lines), generator.h (2149), expression.h (1385, 115 expression types). Entry point is transpiler.h (86 lines).
Arena allocation: all AST nodes allocated in contiguous chunks, freed together in O(1) time. String interning deduplicates identifiers. Tokenisation is zero-copy via string_view. Everything uses RAII, no manual delete calls.
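The arena idea above can be sketched in a few lines of standard C++. This is illustrative only, not libsqlglot's actual `Arena` class: nodes are bump-allocated from large chunks, and releasing the arena (here, letting it go out of scope) frees everything at once. The sketch is deliberately restricted to trivially destructible node types:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Minimal bump-pointer arena sketch: allocate from 64KB chunks, never free
// individual nodes, release everything together when the arena is destroyed.
class Arena {
public:
    template <typename T, typename... Args>
    T* create(Args&&... args) {
        static_assert(std::is_trivially_destructible_v<T>,
                      "sketch only handles trivially destructible nodes");
        void* p = allocate(sizeof(T), alignof(T));
        return ::new (p) T(std::forward<Args>(args)...);
    }

private:
    void* allocate(std::size_t size, std::size_t align) {
        // Round the bump pointer up to the required alignment.
        std::size_t offset = (used_ + align - 1) & ~(align - 1);
        if (chunks_.empty() || offset + size > kChunkSize) {
            chunks_.push_back(std::make_unique<std::byte[]>(kChunkSize));
            offset = 0;
        }
        used_ = offset + size;
        return chunks_.back().get() + offset;
    }

    static constexpr std::size_t kChunkSize = 64 * 1024;
    std::vector<std::unique_ptr<std::byte[]>> chunks_;
    std::size_t used_ = 0;
};
```

Because every node lives in one of a handful of contiguous chunks, cleanup is O(1) in the number of nodes and traversal gets the spatial locality the Architecture section describes.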
Grammar definitions from multiple sources (ANTLR and normalised JSON specs) are unified into a canonical IR, then compiled into a cache-efficient LUT used by the runtime parser.
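The "cache-efficient LUT" pattern is easiest to see at the character level: classify every input byte with a single indexed load instead of a chain of comparisons. The real table is generated from the grammar IR at build time; this hand-filled sketch (names hypothetical) shows the technique:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// One byte of class information per possible input byte.
enum class CharClass : std::uint8_t { Other, Space, Digit, IdentStart, Quote };

// Built once at compile time; the tokenizer then does kCharLut[byte].
constexpr std::array<CharClass, 256> make_char_lut() {
    std::array<CharClass, 256> lut{};
    for (int c = '0'; c <= '9'; ++c) lut[c] = CharClass::Digit;
    for (int c = 'A'; c <= 'Z'; ++c) lut[c] = CharClass::IdentStart;
    for (int c = 'a'; c <= 'z'; ++c) lut[c] = CharClass::IdentStart;
    lut['_']  = CharClass::IdentStart;
    lut[' ']  = lut['\t'] = lut['\n'] = lut['\r'] = CharClass::Space;
    lut['\''] = lut['"']  = lut['`']  = CharClass::Quote;
    return lut;
}

inline constexpr auto kCharLut = make_char_lut();

constexpr CharClass classify(unsigned char c) { return kCharLut[c]; }

static_assert(classify('a') == CharClass::IdentStart);
static_assert(classify('7') == CharClass::Digit);
```

A 256-entry table of one-byte classes fits in four cache lines, so the hot tokenizer loop rarely misses.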
Statements: SELECT, INSERT, UPDATE, DELETE, MERGE, CREATE TABLE, ALTER TABLE, DROP TABLE, DROP INDEX, TRUNCATE, CREATE VIEW, DROP VIEW, CREATE SCHEMA, DROP SCHEMA
Stored procedures: CALL, RETURN, DECLARE, IF/ELSEIF/ELSE/END IF, WHILE/DO/END WHILE, FOR/IN/LOOP/END LOOP
Clauses: WHERE, GROUP BY, HAVING, ORDER BY, LIMIT, OFFSET, QUALIFY
JOINs: INNER, LEFT, RIGHT, FULL, CROSS, NATURAL, ASOF
Advanced: CTEs (WITH RECURSIVE), window functions (PARTITION BY, ORDER BY, ROWS/RANGE, named windows), subqueries, PIVOT/UNPIVOT
Operators: Arithmetic, comparison, logical, LIKE, ILIKE, IS NULL, BETWEEN, IN, EXISTS
Expressions: CASE WHEN, arrays, array indexing, set operations (UNION, INTERSECT, EXCEPT)
Transactions: BEGIN, COMMIT, ROLLBACK, SAVEPOINT
Utilities: SET, DESCRIBE, SHOW, EXPLAIN
361 test cases, 27,040 assertions, all passing.
cd build
ctest --output-on-failure
# Run specific test suites
./tests/libsqlglot_tests "[parser]"
./tests/libsqlglot_tests "[security]"
./tests/libsqlglot_tests "[stored_procedures]"
./tests/libsqlglot_tests "[performance]"
# Run benchmarks
./benchmarks/bench_tokenizer
./benchmarks/bench_parser
./benchmarks/bench_transpiler

27,040 assertions covering SQL injection, buffer overflow, stack overflow (recursion depth limited to 256, adjustable via Parser::kMaxRecursionDepth in parser.h), memory corruption (the arena prevents use-after-free and double-free), integer overflow, and encoding attacks (UTF-8 identifiers rejected, UTF-8 string literals accepted). All pass.
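The recursion-depth limit mentioned above is a standard recursive-descent defence: every recursive parse step bumps a counter through an RAII guard and throws once the limit is hit, turning a potential stack overflow into an ordinary parse error. A sketch of the pattern (names hypothetical, not libsqlglot's internals), demonstrated on a toy nested-parentheses grammar:

```cpp
#include <cassert>
#include <stdexcept>

// RAII guard: increments on entry, decrements on exit (including unwinding).
struct DepthGuard {
    DepthGuard(int& depth, int max) : depth_(depth) {
        if (++depth_ > max)
            throw std::runtime_error("expression too deeply nested");
    }
    ~DepthGuard() { --depth_; }
    int& depth_;
};

// Toy recursive-descent rule: parse a run of balanced parentheses.
bool parse_parens(const char*& p, int& depth, int max_depth) {
    DepthGuard guard(depth, max_depth);
    if (*p != '(') return true;                       // base case
    ++p;
    if (!parse_parens(p, depth, max_depth)) return false;
    if (*p != ')') return false;                      // unbalanced input
    ++p;
    return true;
}
```

Malicious input like ten thousand nested parentheses then fails fast with a catchable exception instead of crashing the process.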
There is a libFuzzer target in fuzzing/fuzz_parser.cpp:
cd fuzzing
clang++ -fsanitize=fuzzer,address -std=c++26 -freflection -I../include fuzz_parser.cpp -o fuzz_parser
./fuzz_parser -max_len=10000 -timeout=10

Test compatibility with Python sqlglot's test fixtures:
cd compat
python sqlglot_compat.py --sqlglot-path /path/to/sqlglot --limit 1000

libsqlglot provides precise, actionable error messages with exact locations:
Arena arena;
Parser parser(arena, "SELECT FROM users"); // Missing column list
// Throws: Line 1, column 8: Expected column list after SELECT (found: 'FROM')

All parse errors include line number, column number, the actual token that caused the error, and a human-readable explanation of what was expected. Fail-fast, no cascading secondary errors.
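Deriving "line 1, column 8" from a byte offset is a common technique worth seeing once (this is a generic sketch, not libsqlglot's code): scan the source up to the offending token's offset, counting newlines, with both counters 1-based as in the error message above:

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <utility>

// Map a byte offset into {line, column}, both 1-based.
std::pair<std::size_t, std::size_t> line_col(std::string_view src,
                                             std::size_t offset) {
    std::size_t line = 1, col = 1;
    for (std::size_t i = 0; i < offset && i < src.size(); ++i) {
        if (src[i] == '\n') { ++line; col = 1; }
        else                { ++col; }
    }
    return {line, col};
}
```

For `"SELECT FROM users"`, the offending `FROM` token starts at byte offset 7, which this maps to line 1, column 8, matching the error message shown above.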
// MySQL -> PostgreSQL
std::string mysql_sql = "SELECT `user_id`, `name` FROM `users` LIMIT 10";
std::string pg_sql = Transpiler::transpile(
mysql_sql, Dialect::MySQL, Dialect::PostgreSQL
);
// Output: SELECT "users"."user_id", "users"."name" FROM "users" LIMIT 10

Arena arena;
auto stmt = arena.create<SelectStmt>();
// SELECT *
stmt->columns.push_back(arena.create<Star>());
// FROM users
stmt->from = arena.create<TableRef>("users");
// WHERE age > 18
auto age_col = arena.create<Column>("age");
auto threshold = arena.create<Literal>("18");
stmt->where = arena.create<BinaryOp>(ExprType::GT, age_col, threshold);
std::string sql = Generator::generate(stmt);
// "SELECT * FROM users WHERE age > 18"

Arena arena;
auto stmt = Transpiler::parse(arena,
"SELECT name FROM users WHERE age > 18");
// Apply optimisation passes
Transpiler::optimize(arena, stmt);
// Optimisations include:
// - Column qualification (name -> users.name)
// - Predicate pushdown
// - Constant folding
// - Subquery elimination
std::string sql = Transpiler::generate(stmt);
// "SELECT users.name FROM users WHERE users.age > 18"

Arena arena;
std::string plpgsql = R"(
FOR i IN 1..100 LOOP
IF i > 50 THEN
RETURN i;
END IF;
END LOOP;
)";
auto stmt = Transpiler::parse(arena, plpgsql);
std::string output = Transpiler::generate(stmt, Dialect::PostgreSQL);

// Real-world analytics query: CTEs, window functions, multiple JOINs
std::string mysql_query = R"(
WITH regional_sales AS (
SELECT
region,
product_id,
SUM(amount) as total_sales,
COUNT(order_id) as order_count,
ROW_NUMBER() OVER (PARTITION BY region ORDER BY SUM(amount) DESC) as sales_rank
FROM orders
WHERE order_date >= '2024-01-01'
GROUP BY region, product_id
HAVING SUM(amount) > 10000
),
top_regions AS (
SELECT region
FROM regional_sales
WHERE sales_rank <= 5
GROUP BY region
HAVING COUNT(*) > 3
)
SELECT
r.region,
r.product_id,
p.product_name,
r.total_sales,
r.order_count,
RANK() OVER (ORDER BY r.total_sales DESC) as overall_rank,
LAG(r.total_sales) OVER (ORDER BY r.total_sales DESC) as prev_sales
FROM regional_sales r
INNER JOIN products p ON r.product_id = p.id
INNER JOIN top_regions t ON r.region = t.region
WHERE r.sales_rank <= 10
ORDER BY r.total_sales DESC
LIMIT 100
)";
// MySQL → BigQuery
std::string bigquery = Transpiler::transpile(
mysql_query, Dialect::MySQL, Dialect::BigQuery
);
// Full semantic preservation: CTEs, window functions, JOINs all intact
// Round-trip: MySQL → BigQuery → PostgreSQL → Snowflake → MySQL
std::string step1 = Transpiler::transpile(mysql_query, Dialect::MySQL, Dialect::BigQuery);
std::string step2 = Transpiler::transpile(step1, Dialect::BigQuery, Dialect::PostgreSQL);
std::string step3 = Transpiler::transpile(step2, Dialect::PostgreSQL, Dialect::Snowflake);
std::string back_to_mysql = Transpiler::transpile(step3, Dialect::Snowflake, Dialect::MySQL);
// Query semantics preserved across 4 dialect conversions
// Simple transformations: PostgreSQL → SQL Server
std::string pg_query = "SELECT * FROM users WHERE active = TRUE LIMIT 10";
Arena arena;
Parser parser(arena, pg_query);
auto stmt = parser.parse_select();
std::string sql_server = Generator::generate(stmt, Dialect::SQLServer);
// Result: SELECT TOP 10 * FROM users WHERE active = 1
// LIMIT → TOP, TRUE → 1
// Multi-dialect pipeline: Parse once, generate for multiple targets
Arena shared_arena;
auto ast = Transpiler::parse(shared_arena, "SELECT name FROM users WHERE age > 18");
std::string postgres_sql = Generator::generate(ast, Dialect::PostgreSQL);
std::string mysql_sql = Generator::generate(ast, Dialect::MySQL);
std::string duckdb_sql = Generator::generate(ast, Dialect::DuckDB);
std::string snowflake_sql = Generator::generate(ast, Dialect::Snowflake);
// Single parse, multiple outputs: efficient for multi-target scenarios

Benchmarks run on x86-64 Linux with -O3 optimisation. libsqlglot is compared against pure Python sqlglot 30.0.1. Python bindings add ~125ns overhead (95-98% of C++ performance).
What we measure: Full parse + generate round-trip (SQL → AST → SQL). No optimisation applied in either implementation. Both produce identical output, proving identical work done. Apples-to-apples comparison of parser and generator performance.
Measurement: std::chrono::high_resolution_clock with 1000 iterations per query, averaged.
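The measurement loop described above is straightforward to reproduce. A sketch of the harness (illustrative; `work` stands in for the parse+generate round-trip on one benchmark query):

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Run `work` `iterations` times and return the mean cost in microseconds,
// timed with high_resolution_clock as described above.
template <typename F>
double mean_microseconds(F&& work, std::size_t iterations = 1000) {
    using clock = std::chrono::high_resolution_clock;
    auto start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i) work();
    auto elapsed = clock::now() - start;
    return std::chrono::duration<double, std::micro>(elapsed).count()
           / static_cast<double>(iterations);
}
```

Timing the whole batch and dividing, rather than timing each iteration, keeps the clock-read overhead out of the per-query numbers.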
The 16 standard queries are sqlglot's own benchmark set. The 8 stress tests are ours and are excluded from the headline average; they're here to show the scaling doesn't stop: 126× average on the standard set, up to 252× on these. What happens past 252× is left as an exercise for the reader.
| Query | sqlglot (μs) | libsqlglot (μs) | Speedup |
|---|---|---|---|
| many_ctes | 1,097.84 | 4.53 | 242.4× |
| many_joins | 1,038.03 | 5.81 | 178.7× |
| nested_functions | 892.91 | 6.27 | 142.4× |
| nested_subqueries | 495.97 | 2.79 | 177.8× |
| many_unions | 2,585.04 | 17.30 | 149.4× |
| tpch | 1,110.16 | 7.74 | 143.4× |
| complex_where | 575.69 | 4.27 | 134.8× |
| many_windows | 933.16 | 13.00 | 71.8× |
| deep_arithmetic | 388.10 | 2.86 | 135.7× |
| many_columns | 1,643.41 | 14.81 | 110.9× |
| values | 15,951.28 | 162.90 | 97.9× |
| large_case | 4,480.21 | 35.30 | 126.9× |
| large_in | 11,756.63 | 116.38 | 101.0× |
| many_numbers | 6,463.57 | 67.71 | 95.5× |
| short | 118.61 | 1.54 | 77.0× |
| large_strings | 146.92 | 4.54 | 32.4× |
Average: 126.1× faster (range: 32.4× to 242.4×). A million queries: 29 seconds vs 52 minutes.
libsqlglot achieves this through:
- Memory: arena allocation (O(1) cleanup), string interning (pointer equality), zero-copy tokenisation (`string_view`)
- Algorithms: perfect hash keyword lookup (O(1)), branchless uppercase conversion (no branch misprediction)
- Compiler: LTO (whole-program optimisation), aggressive inlining, constant folding, C++23 `constexpr`
- Cache: contiguous memory layout, spatial locality, no per-node `malloc`/`new` fragmentation
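Branchless uppercase conversion, as mentioned above, is a one-liner once you notice that bit 0x20 is the only difference between ASCII `'a'..'z'` and `'A'..'Z'`. A sketch (the technique is standard; this is not libsqlglot's exact code):

```cpp
#include <cassert>

// Clear bit 0x20 iff the byte is a lowercase ASCII letter. The condition is
// computed arithmetically, so there is no data-dependent branch to mispredict.
constexpr char to_upper_branchless(char c) {
    unsigned char u = static_cast<unsigned char>(c);
    unsigned char is_lower =
        static_cast<unsigned char>((u >= 'a') & (u <= 'z'));
    return static_cast<char>(u - (is_lower << 5));  // 'a' - 32 == 'A'
}

static_assert(to_upper_branchless('a') == 'A');
static_assert(to_upper_branchless('Z') == 'Z');
static_assert(to_upper_branchless('1') == '1');
```

Keyword matching uppercases every identifier byte it sees, so on mixed-case SQL a predictable arithmetic sequence beats a 50/50-taken branch.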
Designed to break parsers: 15-level nested CTEs, 35-level CASE expressions, 100+ WHERE conditions, 20-way joins. 1,186 lines of SQL, 51KB total.
| Query | sqlglot (μs) | libsqlglot (μs) | Speedup |
|---|---|---|---|
| Correlated subqueries | 14,400.86 | 60.60 | 237.6× |
| Complex subqueries | 18,414.37 | 73.18 | 251.6× |
| Complex string functions | 15,609.81 | 71.42 | 218.6× |
| Deep CTE nesting | 17,214.85 | 82.36 | 209.0× |
| Union chains | 24,075.57 | 115.77 | 208.0× |
| Complex WHERE clause | 14,499.12 | 81.99 | 176.8× |
| Deep CASE nesting | 14,651.62 | 103.10 | 142.1× |
| Multi-table joins | 13,862.49 | 141.16 | 98.2× |
Average: 192.7× faster (range: 98.2× to 251.6×). A million queries: 86 seconds vs 4.6 hours.
# Python sqlglot: 3,917 μs
parsed = sqlglot.parse_one("SELECT * FROM users WHERE age > 18")
output = parsed.sql() # "SELECT * FROM users WHERE age > 18"

// libsqlglot: 45 μs (87× faster)
Arena arena;
auto ast = Transpiler::parse(arena, "SELECT * FROM users WHERE age > 18");
auto output = Transpiler::generate(ast); // "SELECT * FROM users WHERE age > 18"

See benchmarks/bench_complete_comparison.py to reproduce.
45 dialects with full parse and generation support. Use Dialect::Name in C++ or "name" strings in Python (e.g., read="mysql", write="postgres").
Note: Dialect names are case-sensitive (e.g., Dialect::PostgreSQL, not Dialect::postgresql).
Each dialect includes proper identifier quoting, keyword handling, function name translation, and syntax transformations (e.g. LIMIT vs TOP vs FETCH FIRST).
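The row-limit example (LIMIT vs TOP vs FETCH FIRST) is easy to see in miniature. This sketch is illustrative only: the real generator rewrites the AST rather than concatenating strings, and the three-value `Dialect` enum here is a stand-in for the library's 45-value one:

```cpp
#include <cassert>
#include <string>

// Hypothetical three-dialect subset, for illustration.
enum class Dialect { PostgreSQL, SQLServer, DB2 };

// Render a row limit in each dialect's native syntax.
std::string limit_clause(Dialect d, int n) {
    switch (d) {
        case Dialect::SQLServer:  // appears after SELECT, not at the end
            return "TOP " + std::to_string(n);
        case Dialect::DB2:
            return "FETCH FIRST " + std::to_string(n) + " ROWS ONLY";
        default:
            return "LIMIT " + std::to_string(n);
    }
}
```

The same shape of dispatch covers identifier quoting (backticks vs double quotes vs brackets) and function-name translation.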
If somehow a dialect is missing from the map, get_features() returns ANSI SQL defaults.
| Database | Dialect | Supported features |
|---|---|---|
| ANSI SQL | ANSI | Standard SQL compliance: ANSI joins, standard aggregations, CTEs |
| BigQuery | BigQuery | STRUCT types, ARRAY literals [1,2,3], SAFE_CAST, INT64/STRING types, nested field access |
| Calcite | Calcite | TABLESAMPLE BERNOULLI, Apache Calcite optimiser hints |
| ClickHouse | ClickHouse | Column-oriented syntax, MergeTree engines, SAMPLE BY, ARRAY JOIN |
| DB2 | DB2 | FETCH FIRST n ROWS ONLY, DB2 stored procedures, OLAP functions |
| Drill | Drill | Schema-free JSON queries, nested data access, FLATTEN |
| DuckDB | DuckDB | QUALIFY clause, ASOF joins, PIVOT/UNPIVOT, macro functions, LIST type |
| Hive | Hive | PARTITIONED BY, CLUSTERED BY, SerDe formats, Hive UDFs |
| MySQL | MySQL | Backtick identifiers, MySQL-specific functions, storage engines |
| Oracle | Oracle | CONNECT BY hierarchical queries, PRIOR, START WITH, DUAL table, PL/SQL blocks |
| Phoenix | Phoenix | HBase integration: SALT_BUCKETS, ARRAY_APPEND, UPSERT VALUES |
| Pinot | Pinot | Real-time OLAP: segment pruning, star-tree indexes, broker queries |
| PostgreSQL | PostgreSQL | RETURNING, ON CONFLICT, LATERAL joins, window functions, array types, JSONB |
| Presto | Presto | APPROX_DISTINCT, UNNEST, ROW types, lambda functions |
| Snowflake | Snowflake | FLATTEN for JSON, VARIANT type, TIME_TRAVEL, CLUSTER BY, RESULT_SCAN |
| Solr | Solr | score() relevance function, faceted search, Lucene query syntax |
| Spark | Spark | NULL-SAFE equality <=>, Hive metastore, broadcast hints, cache table |
| SQL Server | SQLServer | T-SQL syntax: TOP, IDENTITY, OUTPUT clause, EXEC, GO batches, #temp tables |
| SQLite | SQLite | Minimal SQL: no RIGHT JOIN, PRAGMA commands, autoincrement |
| Teradata | Teradata | MULTISET tables, BTEQ syntax, FastLoad/MultiLoad hints |
These dialects inherit features from a compatible base dialect and add specific extensions.
| Database | Dialect | Inherits from | Supported features |
|---|---|---|---|
| Athena | Athena | Presto | AWS Athena Presto syntax, S3 partitioning |
| CockroachDB | CockroachDB | PostgreSQL | UPSERT statement, distributed transactions, RETURNING clause |
| Databricks | Databricks | Spark | OPTIMIZE tables, ZORDER BY clustering, Delta Lake operations |
| Doris | Doris | MySQL | DUPLICATE KEY model, BUCKETS distribution, DISTRIBUTED BY HASH |
| Dremio | Dremio | Presto | CREATE REFLECTION for materialisation, data lakehouse queries |
| Druid | Druid | MySQL | TIME_FLOOR for time bucketing, approximate aggregations, roll-up |
| Dune | Dune | PostgreSQL | Analytics: bytearray_to_uint256, ETH address functions |
| Exasol | Exasol | PostgreSQL | DISTRIBUTE BY for parallel execution, Lua scripting UDFs |
| Fabric | Fabric | SQL Server | Three-part lakehouse.schema.table naming, OneLake integration |
| Greenplum | Greenplum | PostgreSQL | DISTRIBUTED BY/RANDOMLY, column/append-optimised tables, GPORCA optimiser |
| Impala | Impala | Hive | COMPUTE STATS, Kudu integration, CACHED IN pools |
| MariaDB | MariaDB | MySQL | MySQL-compatible with RETURNING, window functions, JSON functions |
| Materialize | Materialize | PostgreSQL | TAIL for streaming results, materialised views, temporal filters |
| Netezza | Netezza | PostgreSQL | DISTRIBUTE ON distribution keys, zone maps, statistics |
| Redshift | Redshift | PostgreSQL | DISTKEY distribution, SORTKEY ordering, SUPER type (JSON) |
| RisingWave | RisingWave | PostgreSQL | EMIT CHANGES for streaming, temporal joins, watermarks |
| SingleStore | SingleStore | MySQL | VECTOR type for embeddings, DOT_PRODUCT, columnstore/rowstore |
| Spark2 | Spark2 | Spark | Legacy Spark 2.x: CACHE TABLE, broadcast joins, RDD compatibility |
| StarRocks | StarRocks | MySQL | Vectorised execution, primary key model, materialised views, bitmap indexes |
| Tableau | Tableau | PostgreSQL | ZN() null-to-zero, Tableau calculation functions, RAWSQL passthrough |
| TiDB | TiDB | MySQL | AUTO_RANDOM for distributed primary keys, MySQL compatibility, TiKV storage |
| TimescaleDB | TimescaleDB | PostgreSQL | time_bucket() for time-series, hypertables, continuous aggregates |
| Trino | Trino | Presto | Presto-compatible: UNNEST, lambda expressions, ROW types |
| Vertica | Vertica | PostgreSQL | CREATE PROJECTION for physical design, SEGMENTED BY HASH, columnar storage |
| YugabyteDB | YugabyteDB | PostgreSQL | SPLIT INTO n TABLETS, distributed SQL, PostgreSQL compatibility |
libsqlglot is currently a solo project. Bug reports, test cases, and dialect edge cases are welcome via GitHub issues. If you have a dialect you wish to see added, please open an issue or PR.
If a query parses incorrectly, or a dialect transformation produces wrong output, please open an issue with the input SQL, the source and target dialects, the expected output, and any other pertinent details.
Pull requests are considered but there is no guarantee of merge. The codebase is intentionally small and opinionated.
No.
Apache 2.0