1.187 Embedded Databases#

Survey of embedded databases — in-process databases without a separate server. Covers SQL embedded (SQLite, DuckDB), key-value stores (RocksDB, LevelDB, LMDB), and language-specific options (BadgerDB, bbolt for Go, TinyDB for Python).


Explainer

Embedded Databases: A Database You Can Carry in Your Pocket#

The Problem With “Normal” Databases#

When most developers think of a database, they picture something like PostgreSQL or MySQL. To use one of these databases, you follow a process something like this: install the database server software on a machine, start a daemon process that listens on a network port, create a user account, configure access permissions, and then write connection code in your application that opens a network socket to the server and sends queries across that connection.

This arrangement has a name: the client-server model. Your application is the client. The database is the server. They are separate processes, often on separate machines, communicating over a network.

The client-server model is powerful. It allows many applications to share the same database. It allows the database to scale independently of the applications that use it. It allows the database administrator to tune, monitor, and back up the database separately from the application deployment. For large web applications where hundreds of application instances all need to read and write the same data, the client-server model is the right architecture.

But it comes with overhead. The network connection introduces latency. The separate process requires infrastructure to run. The configuration involves credentials, connection strings, and firewall rules. Starting up a test suite that needs a database means either spinning up a real server (slow, fragile, hard to parallelize) or mocking the database (brittle, incomplete).

Not every application needs all of this.


A Different Architecture: In-Process Databases#

An embedded database takes a completely different approach. Instead of running as a separate server process, an embedded database is a library — a piece of code that runs inside your application’s process.

Your application imports the library, opens a file, and the database is running. No daemon. No port. No connection string. No network. When your application exits, the database stops. The data persists in the file you opened, and the next time your application starts, it opens that file and the data is there.
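In Python, for instance, the standard library's sqlite3 module shows how little ceremony is involved. A minimal sketch (the filename app.db is hypothetical):

```python
import sqlite3

# Open (or create) the database file. No server, no daemon, no connection string.
con = sqlite3.connect("app.db")

con.execute("CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)")
con.execute("INSERT INTO notes (body) VALUES (?)", ("hello",))
con.commit()

rows = con.execute("SELECT body FROM notes").fetchall()
con.close()
```

Deleting app.db deletes the database; there is nothing else to clean up.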

Think of it as the difference between a server and a tool. PostgreSQL is a server you deploy and connect to. SQLite — the most widely deployed embedded database — is a tool you use, the way you use a compression library or a JSON parser.

This is why people sometimes say “SQLite is just a file.” It is not literally true (SQLite is a sophisticated database engine), but it captures something real: the entire database lives in a single file on your disk. You can copy that file to a USB drive, email it to someone, open it with a dozen different tools, and the data is there. There is no server to set up on the other end.


When Embedded Makes Sense#

The embedded database model is the right choice in a surprising number of contexts.

Mobile Applications#

Your phone has no database server. When an app on your iPhone or Android phone stores your notes, your contacts, your cached messages, your offline maps — it almost certainly uses SQLite, which is bundled into iOS and Android as a system library. There is no server to connect to, because the app runs offline, on a device, and needs to store data locally. Because it ships with both platforms, SQLite is the default choice for any app that needs structured local storage.

Desktop Software#

Desktop applications — email clients, password managers, music players, note-taking apps, browsers — all need to store structured data locally without requiring the user to install or manage a database server. Firefox uses SQLite to store your browsing history, bookmarks, and cookies. The history database is a file in your Firefox profile directory. You can open it with any SQLite tool. There is no server involved.

Unit Tests and CI#

Testing code that uses a database is painful when you need a real server. Embedded databases solve this cleanly: each test can open a fresh in-memory database, run the test, and throw the database away when done. The test is completely isolated, starts instantly, and leaves no state behind. This is one reason embedded databases are disproportionately popular in testing infrastructure even for applications that use PostgreSQL in production.
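A minimal sketch of the pattern using Python's built-in sqlite3 module: each test opens a fresh `:memory:` database, and it vanishes when the connection closes.

```python
import sqlite3
import unittest

def create_schema(con):
    con.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

class UserStoreTest(unittest.TestCase):
    def setUp(self):
        # Each test gets a brand-new database that lives only in RAM.
        self.con = sqlite3.connect(":memory:")
        create_schema(self.con)

    def test_insert_and_read(self):
        self.con.execute("INSERT INTO users (name) VALUES (?)", ("ada",))
        name = self.con.execute("SELECT name FROM users").fetchone()[0]
        self.assertEqual(name, "ada")

    def tearDown(self):
        self.con.close()  # the database vanishes; nothing on disk to clean up
```

No fixtures to tear down, no shared server state, and tests parallelize trivially.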

Analytics Scripts and Notebooks#

Data scientists frequently need to run complex SQL queries against datasets that are too large for comfortable pandas manipulation but too small to justify spinning up a data warehouse. An embedded analytical database allows you to write SQL against a CSV or Parquet file directly in a notebook, without standing up any infrastructure. The query runs in your process, results come back in seconds, and you move on.

Edge Computing and IoT#

Devices at the edge — industrial controllers, point-of-sale terminals, network appliances, scientific instruments — often need to store structured data locally but cannot connect to a central database server. The device may be offline intermittently or permanently. Embedded databases are the natural answer: the data lives on the device, managed by the application, without any network dependency.

Single-User Applications#

Not every application needs to share data between multiple users or processes. A command-line tool that maintains a local cache, a personal project that tracks expenses, a script that remembers which files it has already processed — these are single-user applications where the overhead of a client-server database is unjustified. An embedded database lets you get SQL semantics and persistent storage without any infrastructure.


What You Give Up#

Embedded databases are not the right choice for everything. Understanding the trade-offs prevents misuse.

No Network Access#

An embedded database lives in one process on one machine. Other processes on the same machine (or other machines entirely) cannot connect to it over a network. If you have ten application servers that all need to read and write the same data, an embedded database on each server means ten separate databases — they do not share data automatically.

This is the fundamental constraint of the embedded model. It is not a flaw — it is the trade-off that makes everything else possible. Removing the network is precisely what makes embedded databases simple, fast, and self-contained.

Single Process (Usually)#

Most embedded databases are designed for a single process (and sometimes a single connection). Multiple processes can technically open the same SQLite file simultaneously, but they will contend for write access and may block each other. Embedded databases are not general-purpose multi-process, multi-client servers — they are not designed to be.

Limited Concurrency for Writes#

Client-server databases like PostgreSQL handle hundreds of concurrent connections, each potentially writing simultaneously, using sophisticated locking and multi-version concurrency control. Embedded databases typically allow only one writer at a time. For most applications this is fine — writes are fast and the single-writer constraint is rarely the bottleneck. But for applications expecting high concurrent write throughput from multiple concurrent processes, client-server databases are the better fit.

No Built-In Replication or High Availability#

PostgreSQL has streaming replication, logical replication, and various high-availability configurations. MySQL has primary-replica replication and Galera Cluster. These features allow you to maintain multiple copies of your data for redundancy and to serve reads from multiple machines.

Embedded databases generally do not have built-in replication. Your data lives in one file. If that file is lost (disk failure, corruption), your data is gone unless you have made backups separately. Some projects build replication on top of embedded databases (Litestream adds streaming replication to SQLite, for example), but it is not inherent to the model.

Administration Tooling#

Client-server databases come with rich administration tooling: monitoring dashboards, query explain plans, connection pooling, slow query logs, online backup tools. Embedded databases are simpler, which means less administration overhead, but also means you have fewer built-in knobs to turn when something goes wrong.


The Spectrum Within Embedded Databases#

Embedded databases are not a monolithic category. They vary significantly in their data model, performance characteristics, and intended use.

Some embedded databases speak SQL — they give you tables, rows, joins, indexes, and the full relational model. These are the most familiar to most developers and handle the widest range of use cases.

Others are key-value stores — they store opaque values indexed by a key, with no SQL and no schema. You put a value in, you get it out by key, you iterate over keys in sorted order. Key-value stores are simpler and often faster for their specific use case, but they require you to do more work in your application code to manage data structure.

Some embedded databases are column-oriented, optimized for analytical queries that read a few columns across millions of rows. These are the opposite of traditional row-oriented databases in their internal layout, which gives them dramatically better performance for aggregation queries at the cost of slower individual row lookups.

The right kind of embedded database depends on your workload. Storing records and looking them up by ID calls for a different database than aggregating millions of log entries to compute statistics.


The File Is the Database#

One of the most practically useful things about embedded databases is that the database is a file. This has implications that developers often underestimate.

Backup is trivial. Copying the file (with appropriate care about consistency — most embedded databases provide a way to take a consistent backup) is the backup. No dump, no restore, no pg_dump/pg_restore ceremony.

Version control (for small databases) is possible. Development tools and configuration stored in an embedded database can be committed to git like any other file.

Portability is straightforward. Moving the database from one machine to another means moving the file. There is no server migration, no data export/import pipeline.

Inspectability is direct. Many embedded database formats can be read by general-purpose tools. A SQLite file can be opened by any SQLite client — the DB Browser for SQLite application, the sqlite3 command line tool, DBeaver, TablePlus. The data is accessible without the application that created it.

Testing with known data is easy. You can ship a pre-populated test database as a file in your test suite. Each test starts from the same known state.


A Brief Mental Model#

The simplest way to think about embedded databases is by analogy to other file-based tools your application already uses.

Your application probably reads configuration from a YAML or JSON file. That file is parsed by a library, loaded into memory, and used by your code. The configuration library is embedded — it runs in your process, it reads a file, it gives you data. A separate configuration server would be absurd overkill for this purpose.

An embedded database is the same idea, applied to structured data that needs querying, indexing, and concurrent access. Instead of a JSON file (which you parse entirely into memory, with no indexing), you get a database file (which you query with SQL or key lookups, with B-tree or columnar indexes, with transactions).

The embedded database is the right tool when your data needs more structure and capability than a flat file, but your application does not need the scalability, multi-user access, or network accessibility of a client-server database.

For an enormous range of software — mobile apps, desktop tools, development environments, data science workflows, testing infrastructure, edge devices — embedded databases hit exactly the right point in the trade-off space.


The Bottom Line#

An embedded database is a library, not a service. You import it, you open a file, you have a database. No server, no daemon, no network, no configuration management. The database shuts down when your application exits and starts again when your application starts. The data lives in a file that you can copy, backup, and inspect like any other file.

This simplicity has a cost — no network sharing, limited concurrency, no built-in replication — but for a vast number of applications, those costs are irrelevant. The application only needs to store its own data, on one machine, for one set of users. An embedded database is the tool built exactly for that purpose.

If you have a smartphone in your pocket, you are carrying several embedded databases right now.


S1: Rapid Discovery — Embedded Databases#

Quick Answer#

SQLite for SQL/relational embedded databases. DuckDB for in-process analytics (OLAP). RocksDB for high-performance key-value storage at scale. LMDB for read-heavy workloads needing maximum throughput. BadgerDB or bbolt for Go services with embedded state.


What Are Embedded Databases?#

An embedded database is one that runs inside your application process rather than as a separate server. You import it as a library, open a file (or work in memory), and query it directly. There is no daemon to start, no network socket to configure, no connection string pointing to another host. The database is your application.

This is the defining characteristic: in-process execution. PostgreSQL and MySQL are client-server — your app talks to them over a socket. SQLite, DuckDB, and RocksDB are embedded — they live inside your process and access storage directly.


The Landscape at a Glance#

The embedded database space splits into two broad families: SQL databases and key-value stores. DuckDB adds a third niche: analytical databases optimized for OLAP queries over columnar data.

SQL embedded databases give you tables, joins, and indexes behind a full SQL interface. SQLite dominates this category so thoroughly it is one of the most widely deployed pieces of software in the world — present in every Android phone, every iOS device, every browser, every Python installation.

Key-value stores give you a sorted map: you store and retrieve opaque values by key. They are faster for write-heavy workloads and simpler to operate, but you give up SQL expressiveness. RocksDB and LevelDB (LSM-tree based, write-optimized) and LMDB (B-tree, read-optimized) are the dominant options.

Analytical engines like DuckDB sit in a separate niche: they speak SQL, but their storage and execution model is column-oriented and optimized for aggregate queries over large datasets rather than transactional row lookups.


SQLite#

SQLite is roughly 25 years old, written in C, and in the public domain. It is among the most rigorously tested software in existence: the project maintains hundreds of times more test code than library code. Every device you own almost certainly runs SQLite.

SQLite uses a B-tree storage engine with ACID guarantees. It supports most of standard SQL (with some deliberate omissions), transactions, triggers, views, and full-text search via FTS5. WAL (Write-Ahead Logging) mode, introduced in 2010, dramatically improved concurrency by allowing readers and the single writer to proceed simultaneously.

The “single writer” constraint is SQLite’s most important limitation. Only one write transaction can run at a time. For applications with high concurrent write throughput, this becomes a bottleneck. For the vast majority of applications — mobile apps, desktop software, configuration stores, unit tests, and scripts — it is completely irrelevant.

Community consensus: use SQLite by default for embedded SQL. The only reason to reach for something else is if you have a specific need SQLite cannot meet (high write concurrency, columnar analytics, or a language ecosystem without good bindings).

Stars on GitHub: ~14,000 (mirror — SQLite is not primarily GitHub-distributed)
Language: C, with first-class bindings in Python, Go, Rust, Java, Swift, Kotlin, JavaScript (via WebAssembly), and nearly everything else.


DuckDB#

DuckDB launched in 2018 and has grown extraordinarily fast. It occupies a niche that was poorly served: OLAP analytics you can run inside a Python script, an R session, or a Node.js process without spinning up Spark, ClickHouse, or BigQuery.

Where SQLite is row-oriented (optimized for transactional lookups of individual records), DuckDB is column-oriented (optimized for scanning millions of rows and computing aggregates). The two databases solve different problems and are more complementary than competing.

DuckDB reads Parquet, CSV, and Arrow natively without importing data first. You can query SELECT * FROM 'data/*.parquet' WHERE year > 2020 directly. This makes it exceptional for data science workflows. It also has a full in-memory mode, making it popular for testing analytical pipelines.

DuckDB Labs (the commercial entity behind DuckDB) raised significant VC funding. The project is under active development with frequent releases. The Python and R integrations are first-class.

Community consensus: DuckDB is the obvious choice for local analytics. If you are aggregating, grouping, joining large tables, or reading Parquet files in a script or notebook, DuckDB is dramatically faster than SQLite for those workloads.

Stars on GitHub: ~26,000
Language: C++, with excellent Python, R, Node.js, Go, Java, and Rust bindings.


RocksDB#

RocksDB is Meta’s (Facebook’s) production key-value store, a fork of Google’s LevelDB that has been extensively hardened for high-throughput SSD workloads. It is not typically used as an application database — it is used as a storage engine underneath other databases.

TiKV (TiDB’s storage layer), CockroachDB (originally on RocksDB, later on Pebble, a Go reimplementation), YugabyteDB, and Instagram’s Rocksandra fork of Cassandra all use RocksDB or derive from it. Kafka Streams uses RocksDB for local state stores.

RocksDB uses an LSM-tree (Log-Structured Merge-tree). Writes go to an in-memory buffer (MemTable), are flushed to immutable SST files, and are merged in background compaction processes. This makes writes extremely fast at the cost of read amplification and space amplification. It is tunable: you can configure compaction strategies, bloom filters, block cache sizes, and column families extensively.

For application developers who just need a fast KV store, RocksDB is often overkill. The configuration surface is enormous and the operational complexity is real. But if you are building infrastructure — a distributed database, a stream processor, a search index — RocksDB is the industry standard.

Stars on GitHub: ~29,000
Language: C++, with bindings in Java (first-class), Go, Python, Rust.


LevelDB#

LevelDB is Google’s original LSM-tree key-value store, the ancestor of RocksDB. It is simpler, smaller, and easier to embed than RocksDB, but also less capable. It lacks column families, does not support transactions (only atomic batch writes), and runs compaction on a single background thread.

Chrome uses LevelDB internally (as the backing store for IndexedDB). Bitcoin Core uses LevelDB for the UTXO set. It was historically influential, but for most new projects, RocksDB is the better choice if you need an LSM-tree KV store with serious performance requirements. LevelDB’s simplicity can be an advantage for smaller projects or language ports (goleveldb, py-leveldb).

Community consensus: prefer RocksDB over LevelDB for new projects needing LSM-tree semantics. LevelDB is stable but not actively developed to the same degree.

Stars on GitHub: ~36,000 (including mirrors)
Language: C++, with many community ports.


LMDB#

LMDB (Lightning Memory-Mapped Database) takes a completely different architectural approach from the LSM-tree stores. It uses a copy-on-write B-tree with memory-mapped files. The entire database file is mmap’d, and the OS page cache becomes the buffer pool.

The result is exceptional read performance — reads are essentially pointer dereferences with no buffer management overhead — and a unique concurrency model where readers never block writers and writers never block readers. A single writer runs at a time (like SQLite), but readers scale to arbitrary numbers with zero locking.

LMDB is used in OpenLDAP (where it originated), Monero, and various other systems where read throughput matters most. Its write performance is good but not exceptional — the copy-on-write approach means writes touch more pages than an LSM-tree. The database file can grow but never shrinks automatically, which requires occasional compaction.

Community consensus: LMDB is the fastest embedded KV for read-heavy workloads. If your access pattern is mostly reads with occasional writes, LMDB’s zero-copy memory-mapped approach is very hard to beat.

Stars on GitHub: ~3,500 (mirror — LMDB is primarily distributed via OpenLDAP)
Language: C, with bindings in Python (lmdb), Node.js, Go, Rust, Java.


TinyDB#

TinyDB is a pure Python document store backed by a JSON file. It requires no C extensions, no compilation, and no dependencies. You can install it anywhere Python runs.

The trade-offs are significant: it is slow (JSON parsing on every access), does not support complex queries, and is not suitable for large datasets. Its purpose is convenience in scripts, small tools, and prototypes where performance is irrelevant and you want something simpler than even SQLite.

Community consensus: TinyDB is for scripts and toys, not production. For anything beyond a few thousand records or casual use, SQLite with Python’s built-in sqlite3 module is a better default.

Stars on GitHub: ~6,800
Language: Pure Python.


BadgerDB#

BadgerDB is a Go-native key-value store written entirely in Go (no CGO). It implements an LSM-tree similar to RocksDB but without the complexity of the RocksDB C++ codebase and the FFI overhead of calling into C from Go. It was developed by the Dgraph team and is used internally by Dgraph.

For Go developers who want a high-performance embedded KV store without CGO, BadgerDB is the standard choice. The pure-Go implementation means simpler cross-compilation, no linker complexity, and idiomatic Go APIs. Its performance is competitive with RocksDB for typical workloads, though it does not match RocksDB’s tuning ceiling.

Stars on GitHub: ~14,000
Language: Pure Go.


BoltDB / bbolt#

BoltDB (now maintained as bbolt by the etcd team) is a pure-Go embedded KV store with a B-tree engine and ACID transactions. It is to Go what SQLite is to other ecosystems: simple, reliable, transactional, and without CGO.

etcd — the distributed key-value store used by Kubernetes — uses bbolt as its storage backend. This gives bbolt exceptional credibility and a large de-facto user base. Its concurrency model is similar to SQLite: single writer, multiple readers.

The main trade-off versus BadgerDB is write performance. BoltDB’s B-tree is read-optimized (similar to LMDB’s design philosophy) and writes are slower than BadgerDB’s LSM-tree for write-heavy workloads.

Community consensus: bbolt for transactional embedded state in Go; BadgerDB for write-heavy Go workloads.

Stars on GitHub: ~8,500 (bbolt)
Language: Pure Go.


Decision Guide#

| Need | Reach For |
| --- | --- |
| SQL in any language | SQLite |
| Local analytics, Parquet, aggregate queries | DuckDB |
| Storage engine for a database you’re building | RocksDB |
| Read-heavy KV with maximum throughput | LMDB |
| Write-heavy KV in a C++ application | RocksDB or LevelDB |
| Embedded KV in Go (write-heavy) | BadgerDB |
| Embedded KV in Go (transactional) | bbolt |
| Tiny script, Python, simplicity over performance | TinyDB |
| Testing, in-memory analytics | DuckDB (in-memory mode) |

Community Consensus Summary#

The embedded database space has a clear hierarchy. SQLite is the universal default for SQL — it is effectively irreplaceable for its niche. DuckDB has rapidly emerged as the default for analytical workloads in the same way. RocksDB is the default infrastructure-grade KV store for systems programmers. LMDB, BadgerDB, and bbolt serve specific niches where their particular architectural trade-offs align with the workload. LevelDB and TinyDB are legacy or niche options with narrower modern use cases.

The most important distinction to internalize is OLTP vs. OLAP: SQLite is transactional (many small reads/writes of individual rows), DuckDB is analytical (few large scans aggregating many rows). Using the wrong one for your workload will be painful.


S2: Comprehensive Discovery — Embedded Databases#

Architecture Deep Dive#

This document provides a technical examination of the major embedded databases, focusing on their storage architectures, concurrency models, performance characteristics, and the scenarios where each excels.


SQLite#

Storage Architecture: B-Tree Pages#

SQLite stores all data in a single file using a page-based B-tree structure. Every database file is divided into fixed-size pages (default 4096 bytes, configurable from 512 to 65536). Tables and indexes are each stored as separate B-tree structures within the file.

The B-tree design means SQLite is optimized for random access by primary key. Finding a row by its rowid is O(log n). This is the defining characteristic of an OLTP store: efficient point lookups and small-range scans.
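This is observable from Python. The exact plan text varies by SQLite version, but a rowid lookup reports a SEARCH (a B-tree descent) rather than a full-table SCAN:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
con.executemany("INSERT INTO items VALUES (?, ?)",
                [(i, f"item-{i}") for i in range(1000)])

# Ask SQLite how it would execute a primary-key lookup.
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM items WHERE id = 500").fetchall()
detail = plan[0][-1]
# e.g. "SEARCH items USING INTEGER PRIMARY KEY (rowid=?)"
con.close()
```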

The file format is explicitly designed for long-term stability. SQLite version 3 files written in 2004 are readable by current SQLite, and the developers have pledged to keep the format supported through at least 2050.

WAL Mode (Write-Ahead Logging)#

SQLite’s default journal mode is DELETE (also called rollback journal mode). In this mode, the original database pages are preserved in a journal file before modification, allowing rollback. Writers hold an exclusive lock on the database file while writing.

WAL mode changes this fundamentally. In WAL mode:

  1. Writes append to a separate WAL file rather than modifying the database in place.
  2. Readers read from the database file and check the WAL for newer versions of pages.
  3. A checkpoint process periodically moves completed WAL entries back into the database file.

The concurrency consequence is significant: readers never block writers and writers never block readers. Multiple readers can proceed simultaneously with a single writer. This resolves the most common SQLite concurrency complaint for read-heavy workloads.

PRAGMA journal_mode=WAL;
PRAGMA synchronous=NORMAL;  -- Safe with WAL, faster than FULL
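From application code the same switch looks like this, sketched with Python's sqlite3 module (the filename is hypothetical; once set, the WAL setting persists in the database file):

```python
import sqlite3

con = sqlite3.connect("wal.db")
# The pragma returns the resulting mode; "wal" on success.
mode = con.execute("PRAGMA journal_mode=WAL").fetchone()[0]
con.execute("PRAGMA synchronous=NORMAL")
con.close()
```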

WAL mode does introduce one new constraint: all connections must be processes on the same machine, because the WAL index lives in shared memory. Network filesystems (NFS, CIFS) are not supported with WAL; the database file cannot be on a network drive.

ACID Guarantees#

SQLite provides full ACID semantics:

  • Atomicity: Transactions either complete fully or not at all. Crash recovery via the journal restores consistency.
  • Consistency: Foreign key constraints, CHECK constraints, and NOT NULL are enforced (foreign keys require PRAGMA foreign_keys=ON).
  • Isolation: Serializable isolation — each transaction sees a consistent snapshot. In WAL mode, readers see the state at the start of their transaction.
  • Durability: fsync() is called after each committed transaction (configurable via synchronous pragma). synchronous=FULL is the safest setting; synchronous=NORMAL with WAL is safe against process crashes but not OS crashes; synchronous=OFF sacrifices durability for speed.

Single-Writer Limitation#

SQLite allows exactly one write transaction at a time. Concurrent write attempts serialize — the second writer waits (or returns SQLITE_BUSY) until the first completes. For applications with high concurrent write throughput, this serialization becomes the bottleneck.

In practice, SQLite write transactions complete in microseconds to milliseconds. An application doing 1000 writes per second, each taking 1ms, is writing for 1 second of every second — that is the ceiling. Most applications never approach this. The single-writer model is a real constraint for write-heavy multi-process workloads (multiple processes sharing a SQLite database file) but is rarely the bottleneck within a single process.
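When contention does occur, the usual mitigation is a busy timeout, which makes a blocked writer retry for a while instead of failing immediately with SQLITE_BUSY. A sketch in Python (hypothetical filename):

```python
import sqlite3

# timeout=5.0 makes sqlite3 retry a locked database for up to 5 seconds
# before raising sqlite3.OperationalError ("database is locked").
con = sqlite3.connect("shared.db", timeout=5.0)

# The same knob is also exposed as a pragma, in milliseconds;
# the pragma returns the value it was set to.
bt = con.execute("PRAGMA busy_timeout = 5000").fetchone()[0]
con.close()
```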

Virtual Tables#

SQLite has a virtual table mechanism that allows external data sources to be queried via SQL. FTS5 (full-text search) is implemented as a virtual table. The JSON support includes table-valued functions (json_each, json_tree) built on the same mechanism. The CSV virtual table queries CSV files via SQL. This extensibility makes SQLite adaptable far beyond plain relational data.
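A short FTS5 sketch in Python, assuming your SQLite build includes the FTS5 extension (most modern builds, including CPython's bundled SQLite, do):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# fts5 is a virtual table module: the table is backed by a full-text index.
con.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
con.executemany("INSERT INTO docs VALUES (?, ?)", [
    ("greeting", "hello embedded world"),
    ("farewell", "goodbye client server"),
])
hits = con.execute(
    "SELECT title FROM docs WHERE docs MATCH 'embedded'").fetchall()
con.close()
```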

Type System#

SQLite uses dynamic typing with type affinity rather than strict types. A column declared as INTEGER can store text. This is a deliberate design decision enabling schema flexibility, though it can surprise developers expecting PostgreSQL-style strict typing. Values are stored in one of five storage classes: NULL, INTEGER, REAL, TEXT, BLOB. Since version 3.37, opt-in STRICT tables enforce declared column types for developers who prefer rigidity.
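The affinity behavior is easy to demonstrate with Python's sqlite3 module:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (n INTEGER)")

# '123' is coerced to an integer by the column's INTEGER affinity...
con.execute("INSERT INTO t VALUES ('123')")
# ...but a value that cannot be coerced is stored as-is, as text.
con.execute("INSERT INTO t VALUES ('abc')")

rows = con.execute("SELECT n, typeof(n) FROM t").fetchall()
# rows == [(123, "integer"), ("abc", "text")]
con.close()
```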


DuckDB#

Columnar Storage#

DuckDB stores data column by column rather than row by row. In a table with 100 columns and 10 million rows, a row-oriented store like SQLite stores all 100 values for row 1, then all 100 values for row 2. A columnar store like DuckDB stores all 10 million values for column 1, then all 10 million values for column 2.

This layout transforms analytical query performance. A query computing SELECT AVG(revenue) FROM sales only needs to read the revenue column, skipping the other 99. For analytical workloads that aggregate a few columns across millions of rows, columnar storage reduces I/O by orders of magnitude.

Columnar storage also enables much better compression. Values in the same column tend to have similar magnitude, similar prefix patterns, or the same value entirely. Dictionary encoding, run-length encoding, and bitpacking are all highly effective on columnar data. A 1GB CSV file might compress to 100MB in DuckDB’s columnar format.

Vectorized Execution Engine#

DuckDB’s query engine processes data in vectors (batches of up to 2048 values by default) rather than row at a time. Each operator (scan, filter, aggregate, join) processes an entire vector before passing results to the next operator. This vectorized model:

  • Keeps data in CPU cache between operators
  • Enables SIMD (Single Instruction Multiple Data) instructions that process 4, 8, or 16 values simultaneously
  • Reduces per-row function call overhead
  • Allows late materialization: columns not needed by a filter are not read

The result is that DuckDB analytical queries frequently run 10-100x faster than equivalent SQLite queries on the same data.

Parquet, CSV, and Arrow Integration#

DuckDB’s file format support is a key differentiator. You can query external files directly without importing them:

-- Query Parquet files directly (no import step)
SELECT year, SUM(revenue)
FROM 'sales/2024/*.parquet'
GROUP BY year;

-- Query CSV with schema inference
SELECT * FROM read_csv_auto('data.csv') LIMIT 10;

-- Arrow integration (zero-copy with Python pandas/polars)
SELECT * FROM arrow_table;

This makes DuckDB exceptional for data engineering pipelines where data arrives as files. The ability to JOIN a Parquet file against a CSV against an in-memory table in a single query, without any import step, is uniquely powerful for ad-hoc analysis.

OLAP-Optimized Query Planning#

DuckDB’s query planner implements features specifically important for analytical workloads:

  • Predicate pushdown into scans: Filters are evaluated as early as possible, skipping columns and row groups that cannot match.
  • Adaptive aggregation: Large aggregations use hash tables sized to fit in memory.
  • Multi-threaded execution: DuckDB automatically parallelizes queries across all available CPU cores. A single SELECT statement uses all cores.
  • Join ordering optimization: The query planner estimates cardinalities and chooses join order to minimize intermediate result sizes.

In-Memory and File-Backed Modes#

DuckDB operates in two modes. In-memory mode (:memory: or no filename) keeps all data in RAM and is discarded when the connection closes. This is useful for testing and for temporary analytical processing. File-backed mode persists to a .duckdb file with the same columnar storage format.

DuckDB files have historically not been cross-version compatible in the way SQLite files are; the storage format stabilized only in recent releases. Until that guarantee matures, upgrading DuckDB may require re-creating databases, which limits its use as a long-term storage format compared to SQLite.

Concurrency Model#

DuckDB uses a multi-version concurrency control (MVCC) scheme, supporting serializable isolation with multiple concurrent readers. In file-backed mode, it allows multiple readers but a single writer. In-memory mode is designed for single-connection use.


RocksDB#

LSM-Tree Architecture#

RocksDB implements a Log-Structured Merge-tree (LSM-tree). Understanding this architecture is essential to understanding RocksDB’s performance characteristics.

The write path: All writes go first to an in-memory buffer called the MemTable, which is also written to a Write-Ahead Log (WAL) for crash recovery. When the MemTable fills, it is frozen and becomes an immutable MemTable, then flushed to disk as an SST (Sorted String Table) file at Level 0 (L0).

Compaction: SST files accumulate at L0. When L0 has enough files, they are compacted (merged and sorted) into Level 1 (L1). When L1 exceeds its size target, files are compacted into L2, and so on. Each level is 10x larger than the previous. The result is a tiered structure of sorted, immutable files.

The read path: To read a key, RocksDB must check the MemTable, then the immutable MemTables, then L0 (each file independently, since L0 files can overlap), then L1 through Lmax (each level has non-overlapping key ranges, so only one file per level needs checking). Bloom filters on each SST file avoid most unnecessary reads.

This architecture makes writes extremely fast (sequential I/O to the MemTable and WAL), at the cost of read amplification (potentially reading from many SST files) and write amplification (data is written multiple times due to compaction).
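The write and read paths described above can be condensed into a toy model. This Python sketch (all names invented) keeps "SST files" as in-memory sorted runs and omits the WAL, bloom filters, and compaction:

```python
class ToyLSM:
    """Toy LSM-tree: a mutable memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.sst_files = []              # newest first; each is a sorted run
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value       # writes touch only the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Freeze the memtable into an immutable sorted run ("SST file").
        run = dict(sorted(self.memtable.items()))
        self.sst_files.insert(0, run)
        self.memtable = {}

    def get(self, key):
        # Read path: memtable first, then runs from newest to oldest,
        # so the most recent version of a key always wins.
        if key in self.memtable:
            return self.memtable[key]
        for run in self.sst_files:
            if key in run:
                return run[key]
        return None

db = ToyLSM()
db.put(b"a", 1)
db.put(b"b", 2)      # hits the limit and triggers a flush
db.put(b"a", 3)      # newer version shadows the flushed one
assert db.get(b"a") == 3 and db.get(b"b") == 2
```

The newest-to-oldest search in `get` is exactly the read amplification the text describes; bloom filters exist to short-circuit that loop.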

Compaction Strategies#

RocksDB supports multiple compaction strategies:

  • Leveled compaction (default): Each level has a size limit; compaction is triggered when a level exceeds its limit. Good for read performance, higher write amplification.
  • Universal compaction: All files are at L0-equivalent; merging is based on file count or size ratios. Higher space amplification, lower write amplification. Good for write-heavy workloads.
  • FIFO compaction: Files are deleted when total size exceeds a limit. Only appropriate for time-series data where old data expires.

Column Families#

RocksDB allows a single database to contain multiple column families — logically separate key namespaces with independent MemTables, SST files, and compaction settings. This enables:

  • Different compaction strategies for different data types in one database
  • Per-column-family bloom filter and compression settings
  • Atomic writes across column families (cross-family transactions)

Column families are how RocksDB enables applications that store heterogeneous data (e.g., a user table and a session table) in a single database file with per-table tuning.

Transactions#

RocksDB supports optimistic and pessimistic transactions via the TransactionDB API — a significant capability that LevelDB, which offers only atomic batch writes, lacks. Transactions support ACID semantics with serializable or snapshot isolation.

Block Cache and Bloom Filters#

Read performance in RocksDB is heavily influenced by two caching mechanisms:

  • Block cache: An in-memory cache (LRU by default) of uncompressed SST data blocks. Properly sized, the block cache eliminates most disk reads for hot data.
  • Bloom filters: Per-SST probabilistic data structure that quickly determines whether a key is absent from a file, avoiding unnecessary disk reads. False positive rate is tunable.
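A bloom filter itself is small enough to sketch. The following stdlib-only Python toy (parameters chosen arbitrarily; not RocksDB's implementation) shows the core guarantee: no false negatives, with a tunable false-positive rate:

```python
import hashlib

class ToyBloomFilter:
    """Toy bloom filter: k hash probes into a fixed-size bit array.

    A production filter (as in RocksDB) tunes bits-per-key and the
    number of probes for a target false-positive rate."""

    def __init__(self, num_bits=1024, num_probes=4):
        self.num_bits = num_bits
        self.num_probes = num_probes
        self.bits = 0                       # Python int as a bit array

    def _positions(self, key):
        # Derive k independent probe positions from the key.
        for i in range(self.num_probes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # False means definitely absent; True means "possibly present".
        return all(self.bits & (1 << pos) for pos in self._positions(key))

bf = ToyBloomFilter()
bf.add(b"user:42")
assert bf.might_contain(b"user:42")         # no false negatives, ever
```

A "definitely absent" answer lets RocksDB skip an SST file without any disk I/O, which is why each file carries its own filter.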

Use as a Storage Engine#

RocksDB’s most common deployment is not as an application database but as a storage engine underneath other databases:

  • TiKV: The distributed key-value layer of TiDB (a MySQL-compatible HTAP database)
  • CockroachDB: Originally used RocksDB before replacing it with Pebble (a Go reimplementation of RocksDB’s design)
  • Apache Cassandra: Experimental RocksDB-backed storage engine (e.g., Instagram’s Rocksandra project)
  • Kafka Streams: Local state stores for stateful stream processing
  • MyRocks: RocksDB as MySQL’s storage engine (Meta uses this in production)

In these deployments, the application database handles SQL parsing, query planning, replication, and distributed coordination while RocksDB handles the actual storage I/O.


LevelDB#

Simplified LSM-Tree#

LevelDB implements the same LSM-tree design that RocksDB later extended. Its architecture is conceptually simpler:

  • In-memory MemTable (implemented as a skiplist)
  • Immutable MemTable flushing to L0
  • Leveled compaction (L0 through L6, with 10x size ratio)
  • No column families
  • No transactions (only WriteBatch for atomic multi-key writes)
  • Single-threaded compaction

LevelDB’s simplicity makes it easy to understand, port, and embed in smaller projects. Chrome’s IndexedDB implementation and Bitcoin Core’s UTXO set both adopted LevelDB precisely because it is a self-contained, well-understood component.

Limitations vs. RocksDB#

LevelDB lacks several features that RocksDB adds for production use:

  • No column families (all keys share one namespace)
  • No transactions (WriteBatch only)
  • Single background thread (slower compaction for large databases)
  • No built-in compression choice (uses Snappy only)
  • Less tunable performance characteristics

For new projects needing an LSM-tree KV store, RocksDB is generally preferred unless the project requires a minimal C++ dependency or a well-maintained language port (goleveldb in Go, for example).


LMDB#

Memory-Mapped B-Tree#

LMDB takes an architecturally distinct approach. Instead of managing its own buffer pool, LMDB maps the entire database file into the process’s virtual address space using mmap(). The OS kernel manages the actual physical memory — pages are loaded from disk on demand and evicted when memory pressure requires it.

This design has profound consequences:

  • Zero-copy reads: Reading a value means reading bytes from the mmap’d address range. No copying from a buffer pool, no buffer management overhead.
  • OS-managed caching: The OS page cache does the work of the buffer pool. On Linux, this means the database’s hot pages are kept in RAM by the kernel, shared across all processes, and evicted only under memory pressure.
  • Read performance: On read-heavy workloads, LMDB’s performance is frequently best-in-class because reads are essentially direct memory accesses.

Copy-on-Write B-Tree#

LMDB uses a copy-on-write B-tree with no WAL. When a write transaction modifies a page, it writes the modified page to a free page rather than modifying the original page in place. The transaction is committed by atomically updating a pointer to the new root page. Old pages are freed after all read transactions that could see them have completed.

This design provides:

  • MVCC without logging: Old versions of pages persist until no reader needs them.
  • No WAL file: The database file itself is always consistent; there is no separate journal to apply on recovery.
  • No background threads: LMDB has no background compaction, no background flushing, no background anything. All I/O is synchronous and explicit.
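The commit-by-pointer-swap idea can be shown in miniature. In this stdlib-only Python sketch (a toy, not LMDB's B-tree), a plain dict stands in for the page tree and commit is a single reference assignment:

```python
class ToyCOWStore:
    """Toy copy-on-write store: commit = atomically swapping the root.

    Readers pin whatever root was current when they began; the writer
    never modifies anything a reader can see. LMDB does this with
    B-tree pages; here a whole dict stands in for the tree."""

    def __init__(self):
        self.root = {}               # the current committed version

    def begin_read(self):
        return self.root             # a snapshot is just a reference

    def commit_write(self, updates):
        new_root = dict(self.root)   # copy-on-write: build a new version
        new_root.update(updates)
        self.root = new_root         # atomic pointer swap = commit

store = ToyCOWStore()
store.commit_write({"a": 1})
snapshot = store.begin_read()        # a reader starts here
store.commit_write({"a": 2})         # the writer commits a new version
assert snapshot["a"] == 1            # the reader still sees its snapshot
assert store.begin_read()["a"] == 2  # new readers see the new root
```

The old version (`snapshot` here) stays valid for as long as anyone holds it — the same mechanism by which LMDB keeps old pages alive until the last reader finishes.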

Concurrency: MVCC with Minimal Locking#

LMDB’s concurrency model is elegant. Each read transaction gets a snapshot of the database at the point it begins. Multiple read transactions can run simultaneously without any locking. A single write transaction runs at a time, protected by a mutex. Read transactions are not blocked by the writer and do not block the writer.

The “one writer at a time” limit is fundamental to LMDB’s design. Unlike RocksDB’s concurrent memtable writes, LMDB serializes all write transactions. For write-heavy workloads, this is a real constraint. For read-heavy workloads (LDAP directories, configuration stores, caches), this is irrelevant.

Database File Growth#

LMDB requires the database file’s maximum size to be declared at open time (the map_size parameter). The file is created at that size (as a sparse file on supporting filesystems). If the database grows beyond the declared size, writes fail. This requires capacity planning upfront.

LMDB databases also do not shrink automatically. Freed pages are reused by future writes but the file size does not decrease. Compaction requires creating a new database and copying all records.


TinyDB#

TinyDB’s implementation is intentionally simple: it reads the entire JSON file on open, keeps all data in memory as a Python dictionary, and writes the entire file on each modification. This makes it correct and easy to understand, but limits it to small datasets (thousands of records at most) and rules out concurrent access.

There is no indexing beyond a linear scan. Queries iterate over all documents. For its intended use case (small scripts, prototypes, configuration storage), these limitations are acceptable. For anything expecting database performance, TinyDB is the wrong tool.
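The whole-file model is easy to reproduce with the standard library alone. This sketch (invented names; not TinyDB's actual code) shows why every write rewrites the file and every query is a linear scan:

```python
import json
import os
import tempfile

class ToyDocStore:
    """Stdlib sketch of a TinyDB-style store: whole-file JSON persistence."""

    def __init__(self, path):
        self.path = path
        if os.path.exists(path):
            with open(path) as f:
                self.docs = json.load(f)     # read the entire file on open
        else:
            self.docs = []

    def insert(self, doc):
        self.docs.append(doc)
        with open(self.path, "w") as f:
            json.dump(self.docs, f)          # rewrite the entire file

    def search(self, predicate):
        # No indexes: every query scans every document.
        return [d for d in self.docs if predicate(d)]

path = os.path.join(tempfile.mkdtemp(), "db.json")
db = ToyDocStore(path)
db.insert({"name": "ada", "lang": "python"})
db.insert({"name": "sam", "lang": "go"})
assert ToyDocStore(path).search(lambda d: d["lang"] == "go") == [
    {"name": "sam", "lang": "go"}
]
```

Both costs grow linearly with the dataset, which is exactly why this design is fine for thousands of records and hopeless for millions.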


BadgerDB#

Go-Native LSM-Tree#

BadgerDB implements an LSM-tree in pure Go, drawing heavily on RocksDB’s design but adapting it for Go’s runtime. The key design decisions:

  • Value log: BadgerDB separates keys from values. Keys and their metadata are stored in an LSM-tree (for efficient sorted access). Large values are stored in a separate append-only value log. This reduces write amplification significantly for large values because compaction only rewrites keys, not the full values.
  • MVCC: Full multi-version concurrency control with serializable transactions.
  • Concurrent writers: BadgerDB supports multiple concurrent write transactions. RocksDB, by contrast, defaults to a single writer at a time (it can enable concurrent memtable writes, at the cost of extra complexity).
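The key/value separation can be sketched with the standard library. In this toy (names invented; BadgerDB's real value log is an on-disk file with garbage collection), a dict stands in for the LSM-tree of keys and a byte string for the append-only log:

```python
class ToyValueLogKV:
    """Toy WiscKey-style key/value separation.

    The 'LSM' side (a dict here) stores only key -> (offset, length),
    so rewriting it during compaction never touches the large values."""

    def __init__(self):
        self.index = {}        # stands in for the LSM-tree of keys
        self.value_log = b""   # append-only log of raw values

    def put(self, key, value):
        offset, length = len(self.value_log), len(value)
        self.value_log += value              # value is written exactly once
        self.index[key] = (offset, length)   # key side stores only a pointer

    def get(self, key):
        offset, length = self.index[key]
        return self.value_log[offset:offset + length]

kv = ToyValueLogKV()
kv.put(b"k1", b"a large value" * 3)
kv.put(b"k2", b"another value")
assert kv.get(b"k2") == b"another value"
```

Because compaction in this design only shuffles the small (key, pointer) entries, write amplification stops scaling with value size — the property the bullet above describes.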

Pure Go Advantage#

The absence of CGO matters for Go developers. CGO introduces:

  • Longer compile times
  • Cross-compilation complexity (need the C toolchain for the target architecture)
  • Larger binary sizes
  • Slightly higher call overhead per FFI crossing

BadgerDB eliminates all of these. go build produces a static binary with embedded BadgerDB for any supported GOARCH without any extra tooling.

Performance Characteristics#

BadgerDB’s value log design makes it particularly fast for large value workloads. For small values (< 1KB), the indirection through the value log adds overhead relative to a conventional LSM-tree that stores values inline with keys. For large values (> 1KB), BadgerDB’s write performance is excellent because compaction skips the value data.


bbolt (BoltDB)#

B-Tree with ACID Transactions#

bbolt uses a copy-on-write B-tree similar in spirit to LMDB. All pages are 4096 bytes (OS page size). The entire database is a single mmap’d file. Transactions read from consistent snapshots; the writer commits by updating a root page atomically.

bbolt organizes data into buckets — named B-trees within the database. Nested buckets are supported. Keys and values are byte slices; each bucket also maintains an auto-incrementing sequence for generating integer keys. This structure maps naturally to application data models without the SQL overhead.

etcd and Kubernetes#

bbolt gained its most important endorsement when etcd adopted it as the storage backend. etcd is the distributed key-value store that Kubernetes uses for all cluster state, so virtually every Kubernetes cluster in the world runs bbolt through etcd. This gives bbolt an exceptionally large and demanding production footprint.

The etcd team took over maintenance of bbolt (from the original BoltDB project, which was abandoned) and renamed it bbolt. The project receives maintenance updates and bugfixes but is not under active feature development — it is intentionally stable.

Write Performance Trade-offs#

bbolt’s B-tree design means writes must write a modified page and all its ancestors up to the root. For deep trees with frequent small writes, this results in significant write amplification compared to an LSM-tree. etcd mitigates this by batching writes and keeping the dataset small enough to fit comfortably in the page cache.

For Go workloads that are write-heavy (thousands of small writes per second), BadgerDB’s LSM-tree is the better choice. For workloads that are read-heavy or transactional (many reads with occasional writes), bbolt’s simpler model and lower read amplification are advantages.


Performance Comparison#

| Database | Random Read | Sequential Write | Random Write | Best Use |
| --- | --- | --- | --- | --- |
| SQLite (WAL) | Good | Good | Good | OLTP, SQL |
| DuckDB | Excellent (columnar) | Excellent | Moderate | OLAP, analytics |
| RocksDB | Good (with bloom) | Excellent | Excellent | Write-heavy KV |
| LevelDB | Good | Good | Good | Simpler KV |
| LMDB | Excellent | Good | Moderate | Read-heavy KV |
| BadgerDB | Good | Excellent (large values) | Excellent | Go KV, large values |
| bbolt | Good | Moderate | Moderate | Go transactional KV |

“Excellent/Good/Moderate” are relative characterizations — actual performance depends heavily on workload, key size, value size, dataset size relative to RAM, and hardware.


Feature Comparison#

| Feature | SQLite | DuckDB | RocksDB | LevelDB | LMDB | BadgerDB | bbolt |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Query Language | SQL | SQL | None | None | None | None | None |
| Transactions | ACID | ACID | ACID | Batch | ACID | ACID | ACID |
| Concurrency | SW, MR | SW, MR | MW, MR | SW | SW, MR | MW, MR | SW, MR |
| Indexes | B-tree | Columnar | Bloom | Bloom | B-tree | Bloom | B-tree |
| Max DB size | 281TB | RAM/disk | Petabytes | Large | mmap limit | Large | mmap limit |
| Pure library | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| CGO free | No | No | No | No | No | Yes | Yes |

SW = single writer, MR = multiple readers, MW = multiple writers


Choosing Between the Options#

SQL or not? If you need SQL (joins, GROUP BY, subqueries, schema enforcement), the choice is between SQLite and DuckDB. If you only need key-value storage, the remaining options are appropriate.

OLTP or OLAP? SQLite is designed for transactional workloads (many small reads and writes, individual row access). DuckDB is designed for analytical workloads (few large aggregating scans). Using DuckDB for a transactional application or SQLite for large aggregations will be painful.

Read-heavy or write-heavy? LMDB and bbolt are read-optimized (B-tree, mmap). RocksDB, BadgerDB, and LevelDB are write-optimized (LSM-tree). Most application workloads are read-heavy, making LMDB and bbolt underappreciated choices.

Language constraints? Go developers should consider BadgerDB or bbolt to avoid CGO. Python developers have excellent bindings for SQLite (built-in), DuckDB (first-class), and LMDB (fast C bindings). For very small Python scripts, TinyDB is acceptably simple.

Building infrastructure? If you are building a database, a stream processor, or a storage engine, RocksDB is the industry standard. Its tuning options, column families, and production battle-testing at Meta’s scale make it the right foundation.


S3: Need-Driven Discovery — Embedded Databases#

Who Uses Embedded Databases, and Why?#

Embedded databases appear across an enormous range of applications — from mobile apps to data science notebooks to distributed infrastructure. The right choice depends on your workload, your language, and your constraints. This section explores the most common personas and what each one actually needs.


Persona 1: The Mobile and Desktop Application Developer#

The Situation#

You are building a native mobile app (iOS/Android) or a desktop application (macOS, Windows, Linux). Your application needs to persist structured data locally: user settings, cached API responses, a list of items the user has created, search history, or offline-capable data that syncs with a server when connectivity is available.

You cannot require your users to install a database server. The database must be bundled with your app, silent, and self-managing. Startup must be instant. The database footprint should be small.

Why SQLite is the Answer#

SQLite is the only realistic answer for mobile and most desktop application development, and for good reason. It is already on every mobile platform:

  • iOS and macOS ship SQLite as part of the OS. Apple’s Core Data framework uses SQLite as its default backing store.
  • Android ships SQLite and exposes it through the Room persistence library, which is now Google’s recommended abstraction layer.
  • Flutter uses SQLite via the sqflite package.
  • React Native developers use SQLite via react-native-sqlite-storage.

On desktop, frameworks like Electron, Qt, and wxWidgets all have established SQLite integrations. Games built with Unity or Unreal commonly store save data and configuration in SQLite.

What the Mobile Developer Actually Needs#

The mobile developer’s workload is archetypal OLTP: read one user’s data, write a small update, read a list of items (usually < 10,000 for a typical app), occasionally do a simple join. SQLite handles all of this efficiently and with full ACID guarantees.

WAL mode is often recommended for mobile apps because it allows reads during writes — important for apps that run background sync while the UI is reading the database.

The schema migration story matters for mobile developers. Unlike web apps where the server database can be migrated by a deployment script, mobile apps must handle schema migration on the user’s device, often across many old versions simultaneously. SQLite’s PRAGMA user_version provides a simple mechanism for tracking schema version and running migration scripts.
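A minimal migration runner built on PRAGMA user_version might look like the following, using only the standard library sqlite3 module (the example schema and migration list are invented):

```python
import sqlite3

# Each entry bumps the schema by one version; index i runs when
# user_version == i. The statements here are purely illustrative.
MIGRATIONS = [
    "CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)",   # -> version 1
    "ALTER TABLE notes ADD COLUMN created_at TEXT",             # -> version 2
]

def migrate(conn):
    """Apply any pending migrations and return the final schema version."""
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    for i, statement in enumerate(MIGRATIONS[version:], start=version + 1):
        with conn:                       # each migration in its own transaction
            conn.execute(statement)
            # PRAGMA does not accept ? placeholders, hence the f-string.
            conn.execute(f"PRAGMA user_version = {i}")
    return conn.execute("PRAGMA user_version").fetchone()[0]

conn = sqlite3.connect(":memory:")
assert migrate(conn) == 2
assert migrate(conn) == 2                # idempotent: already up to date
```

Because the version check happens at runtime on the device, a user jumping from a two-year-old app version simply replays every migration they missed, in order.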

When SQLite Becomes Insufficient#

SQLite’s single-writer model can become a bottleneck if the app does aggressive background sync — for example, a notes app that bulk-imports thousands of records from the server while the user is actively reading and writing. In practice, most mobile apps never hit this ceiling.

A harder constraint is process isolation: if you have a main app process and an extension (iOS app extensions, Android widgets), both accessing the same SQLite database, WAL mode becomes essential and you must be careful about timeout settings to avoid SQLITE_BUSY errors.

Desktop Data Synchronization Apps#

Desktop apps that synchronize large datasets (email clients like Thunderbird, note-taking apps like Notion’s desktop client, password managers like 1Password) commonly use SQLite as their local store. SQLite’s stable file format means backup, restore, and “export your data” are straightforward — the database file is just a file.


Persona 2: The Data Scientist Doing Local Analytics#

The Situation#

You are a data scientist or data analyst working in Python or R. You have CSV files, Parquet files, or data pulled from a data warehouse. You need to explore, transform, and aggregate this data — often datasets of 1M to 1B rows that fit on your laptop but are slow to process with pandas or R dataframes.

You do not want to spin up a ClickHouse cluster for a one-off analysis. You do not want to upload your data to BigQuery for an exploratory query. You want to run SQL against a local file and get an answer in seconds.

Why DuckDB Transformed This Workflow#

Before DuckDB, the local analytics story was uncomfortable. Pandas loads the entire dataset into memory, materializes a full intermediate result for each operation, and runs single-threaded by default. For a CSV with 50 million rows, a simple GROUP BY could take minutes. SQLite does not support analytical functions well, and its row-oriented storage makes column scans slow.

DuckDB resolved this by bringing in-process columnar analytics to scripting environments. The core insight is that the data scientist’s workflow maps almost perfectly to DuckDB’s strengths:

  • Data arrives as files (CSV, Parquet, JSON, Arrow) — DuckDB reads these natively
  • Queries are large aggregations (GROUP BY, window functions, complex joins) — DuckDB’s columnar execution excels here
  • The dataset fits on one machine — embedded avoids distributed overhead
  • Results are typically small (a table of 100 rows after aggregation) — memory is not a concern for the output

The Notebook Workflow#

In a Jupyter notebook, DuckDB’s Python integration is nearly frictionless:

import duckdb
result = duckdb.sql("SELECT category, SUM(revenue) FROM 'sales.parquet' GROUP BY 1")
df = result.df()  # Return as pandas DataFrame

This single import, no-configuration workflow is why DuckDB adoption among data scientists has been explosive. There is no connection string, no server, no schema to define. Point it at a file and query it.

DuckDB also integrates bidirectionally with pandas and Polars DataFrames. You can query an existing DataFrame as if it were a table, and you can return results as a DataFrame. This makes DuckDB fit into existing workflows without requiring the scientist to abandon their familiar data manipulation libraries.

Performance Reality#

On a MacBook Pro with an M-series chip, DuckDB can aggregate 100 million rows in 1-2 seconds. The same operation in pandas takes 30-60 seconds. The same operation in SQLite takes minutes or fails with memory issues. This 10-100x speedup is transformative for exploratory data analysis where you are iterating quickly on queries.

DuckDB achieves this through parallel execution (using all CPU cores), vectorized SIMD operations, and columnar storage that reads only the required columns. A query touching 3 columns of a 100-column Parquet file only reads 3% of the file’s data.

When DuckDB Is Not the Right Tool#

DuckDB’s OLTP story is weak. If you need to insert thousands of individual rows, update single rows by primary key, or run transactional workflows, SQLite is faster and more appropriate. DuckDB is optimized for batch analytics, not transactional record management.

DuckDB also lacks the file format stability of SQLite. For archival storage of important data, SQLite’s 25-year file format guarantee is more reliable. DuckDB is best for computation, not long-term storage.


Persona 3: The Systems Programmer Building Infrastructure#

The Situation#

You are building a database, a distributed system, a search engine, or a stream processor. You need a storage engine that handles the actual persistence layer — a sorted, durable, high-performance key-value store that your application logic sits above. You need control over compaction, tuning, and the I/O stack. You will likely be writing C++ or Rust, or integrating a C/C++ library.

RocksDB as the Industry Standard#

RocksDB is the dominant answer for this persona. Its adoption as the storage engine for TiKV, MyRocks, Kafka Streams, and many other systems reflects a community consensus that RocksDB’s combination of performance, tunability, and production battle-testing makes it the right foundation for serious storage infrastructure.

The systems programmer’s needs are different from the application developer’s:

  • Compaction control: You need to tune the balance between write amplification, read amplification, and space amplification depending on your workload.
  • Column families: You need logical separation between different kinds of data (user data, metadata, indices) with potentially different compaction and compression settings.
  • Custom comparators: Your keys may have internal structure (composite keys, encoded prefixes) that requires a custom sorting function.
  • Bloom filters and block cache: You need to tune the bloom filter false positive rate and block cache size based on your working set.
  • Backups: You need online backup (snapshot) without stopping writes.

RocksDB exposes all of this. Its Options struct has hundreds of parameters. This is not user-hostile complexity — it is the control surface that infrastructure engineers need.

The LevelDB vs. RocksDB Decision#

For simpler infrastructure projects — an embedded index for a custom search engine, a small key-value layer for a server that needs persistent state — LevelDB’s simplicity can be appealing. LevelDB’s codebase is smaller and easier to understand than RocksDB. If you need to port the storage engine to another language or embedding environment, LevelDB’s design is easier to reimplement (as evidenced by goleveldb, the Go port).

However, for any system that expects significant production load, LevelDB’s single-threaded compaction is a bottleneck. RocksDB added multi-threaded compaction, direct I/O support, rate limiting, and dozens of other production-necessary features that LevelDB lacks.

When to Consider LMDB for Systems Work#

LMDB is competitive for read-heavy infrastructure with bounded dataset sizes. OpenLDAP’s directory service is the canonical example: millions of directory lookups per second, infrequent modifications. LMDB’s memory-mapped zero-copy read path handles this load efficiently with minimal CPU overhead.

For systems where the dataset fits in RAM and reads dominate writes (caches, configuration stores, lookup tables), LMDB’s architecture is often superior to RocksDB’s LSM-tree. The absence of background compaction means LMDB’s latency tail is very short — there are no compaction stalls.

The constraint is write scalability. If your system needs to sustain high write throughput over a growing dataset, RocksDB’s LSM-tree with background compaction is the right architecture. LMDB’s single-writer model limits write throughput in a way that compounds with the overhead of copy-on-write B-tree modifications.


Persona 4: The Go Developer Building a Stateful Service#

The Situation#

You are building a Go service that needs embedded state — a cache that must survive restarts, a queue of pending work, a lookup table, or a small local database for a microservice. You want to avoid the complexity of an external database. You want a pure Go solution with no CGO to keep your build simple and your binary self-contained.

Your constraints:

  • Must be pure Go (no CGO) for simple cross-compilation
  • Must handle concurrent reads from multiple goroutines
  • Must be ACID transactional (you cannot afford data corruption)
  • Must survive process restarts
  • Moderate throughput (thousands to tens of thousands of operations per second)

BadgerDB vs. bbolt: The Go Embedded KV Decision#

Both BadgerDB and bbolt are pure Go, both are ACID transactional, and both are widely used in production Go services. The choice comes down to your access pattern.

Choose bbolt if:

  • Your dataset is read-heavy (many more reads than writes)
  • You need transactional access patterns similar to a SQL database (long-running read transactions that see a consistent snapshot)
  • Your writes are infrequent or can be batched
  • You want the simplest possible API
  • You need the proven reliability of etcd/Kubernetes usage

bbolt’s B-tree design means reads are fast (O(log n) pointer traversal through mmap’d pages) and writes are correct but not fast. If your service reads its embedded state frequently and modifies it occasionally, bbolt is an excellent match.

Choose BadgerDB if:

  • Your dataset is write-heavy (frequent small writes or large batch writes)
  • You store large values (> 1KB) — BadgerDB’s value log specifically optimizes this case
  • You need higher write throughput than bbolt can sustain
  • You are comfortable with a more complex operational model (value log garbage collection, etc.)

BadgerDB’s LSM-tree amortizes write cost through background compaction, similar to RocksDB. For services that receive a stream of events and persist each one, BadgerDB sustains much higher write throughput than bbolt.

Practical Go Service Examples#

Configuration and metadata store: A service that stores its own configuration, cached computed state, or metadata about external resources. This is typically read-heavy (configuration is read on every request, written rarely). bbolt is ideal — it is simple, its transactions map naturally to “read configuration at startup,” and its reliability is proven.

Job queue with persistence: A background worker service that maintains a queue of pending jobs, marks them in-progress, and marks them complete. Each job transition is a write. With hundreds of concurrent workers completing jobs, write throughput matters. BadgerDB’s concurrent transaction support and LSM-tree write performance make it a better fit.

Event store: A service that appends incoming events and occasionally queries them. Append-heavy workloads favor BadgerDB. If events need to be queried by multiple dimensions (not just appended), SQLite via a CGO binding might be worth the trade-off.

Distributed system local state: If you are building a distributed system (similar to how etcd uses bbolt), the requirement is usually transactional correctness over raw performance. bbolt’s proven correctness in etcd makes it the safe choice.

SQLite as an Alternative for Go Developers#

It is worth noting that SQLite bindings for Go are widely used and well-maintained: mattn/go-sqlite3 (CGO-based) and modernc.org/sqlite (pure Go). The modernc.org/sqlite package avoids CGO entirely by translating the C source to Go.

If your Go service needs SQL semantics — complex queries, schema enforcement, joins — SQLite via modernc.org/sqlite is a viable pure-Go option. The performance is somewhat lower than the C SQLite, but for moderate workloads, the SQL expressiveness may be worth it.


Persona 5: The Script Writer and Rapid Prototyper#

The Situation#

You are writing a Python script — an automation tool, a data transformation utility, a personal project, or a prototype. You need to store some data between runs: cached API responses, a set of processed item IDs, a mapping of URLs to metadata. You want something simpler than setting up a database and faster to code than managing JSON files manually.

The Trade-offs at This Level#

At the scripting level, three options are realistic:

Python’s built-in sqlite3 is the right default for most scripts. It requires zero installation (sqlite3 is in the standard library), supports full SQL, handles transactions correctly, and is fast enough for any script-level workload. The main friction is writing SQL strings, which some find verbose for simple use cases.
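For example, a script that tracks which items it has already processed needs only the standard library (the table and item names here are invented):

```python
import sqlite3

# Tiny script-level state store: remember which item IDs were processed.
conn = sqlite3.connect(":memory:")   # a real script would pass a file path
conn.execute("CREATE TABLE IF NOT EXISTS processed (item_id TEXT PRIMARY KEY)")

def mark_processed(item_id):
    with conn:                       # one transaction per write
        conn.execute("INSERT OR IGNORE INTO processed VALUES (?)", (item_id,))

def is_processed(item_id):
    row = conn.execute(
        "SELECT 1 FROM processed WHERE item_id = ?", (item_id,)
    ).fetchone()
    return row is not None

mark_processed("item-1")
assert is_processed("item-1") and not is_processed("item-2")
```

Swapping ":memory:" for a filename makes the state survive between runs with no other changes — the usual upgrade path as a script matures.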

TinyDB is the right choice when you want a document store API (insert a Python dict, query dicts, no schema) and your dataset is small (< 10,000 documents). Its pure Python implementation and JSON backing make it easy to inspect the data file directly. The API is friendly for programmers who think in Python dicts rather than SQL tables. The cost is performance: TinyDB reads and writes the entire JSON file on each access.

shelve (Python standard library) provides a dict-like interface backed by dbm (which is often GDBM or ndbm under the hood). It is even simpler than TinyDB for pure key-value use cases, but the binary format is not portable and the API is more limited.
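A shelve usage sketch, standard library only (the file path and keys are invented for illustration):

```python
import os
import shelve
import tempfile

path = os.path.join(tempfile.mkdtemp(), "cache")

# Writes: any picklable Python value can be stored under a string key.
with shelve.open(path) as db:
    db["last_run"] = {"items": 42, "ok": True}

# Reads after reopening: the data survived the process "restart".
with shelve.open(path) as db:
    assert db["last_run"]["items"] == 42
```

The convenience is real, but so are the caveats in the paragraph above: keys must be strings, values are pickled (so the file is Python-only), and the underlying dbm format varies by platform.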

When the Script Outgrows TinyDB#

Scripts have a way of becoming applications. The automation tool that started as 100 lines becomes a service with 10,000 lines. At this point, TinyDB’s limitations become apparent:

  • Performance degrades linearly with dataset size (full file parse on every query)
  • No concurrent access (file locking issues)
  • No indexing (queries are O(n) scans)
  • JSON serialization is lossy for some Python types (datetime, bytes, custom objects)

The right upgrade path is SQLite. The sqlite3 module’s API is more verbose but the performance difference on a 100,000-record dataset is 100-1000x. For analytical scripts that grew from a prototype, DuckDB may be the right upgrade path instead.


Summary: Matching Persona to Tool#

| Persona | Primary Need | Recommended Tool | Why |
| --- | --- | --- | --- |
| Mobile/desktop developer | SQL, bundled, offline-first | SQLite | Platform-native, ACID, stable format |
| Desktop app (heavy analytics) | Fast local queries | DuckDB + SQLite | DuckDB for analysis, SQLite for records |
| Data scientist (Python/R) | OLAP on files, fast aggregates | DuckDB | Columnar, Parquet native, in-process |
| Systems programmer (C++) | Storage engine foundation | RocksDB | Tunability, compaction, production-tested |
| Systems programmer (read-heavy) | Fast KV reads, bounded dataset | LMDB | Zero-copy mmap, no background threads |
| Go developer (transactional state) | Pure Go, reliable, read-heavy | bbolt | Proven in etcd, simple API |
| Go developer (write-heavy) | Pure Go, high write throughput | BadgerDB | LSM-tree, concurrent writers |
| Script writer (SQL comfort) | Zero-dependency SQL | sqlite3 (stdlib) | Built-in, no install, full SQL |
| Script writer (dict-first) | Document API, tiny dataset | TinyDB | Simplest possible API |
| Infrastructure builder (Go) | Distributed KV, correctness | bbolt or BadgerDB | Depends on access pattern |

S4: Strategic Discovery — Embedded Databases#

Ecosystem Health and Long-Term Viability#

Choosing an embedded database is a long-term commitment. Unlike a web API that can be swapped, a database engine touches your file format, your backup strategy, your performance tuning, and potentially your language bindings. The strategic question is not just “which is fastest today” but “which will still be maintained, well-supported, and improving five years from now.”


SQLite: The Immortal Standard#

Organizational Structure#

SQLite is maintained by a small team — effectively Dr. Richard Hipp (the original author) and a few contributors — at Hwaci, a small consulting company. It is funded through a combination of consulting contracts and donations. There is no VC investment, no commercial pressure, and no risk of acquisition.

SQLite is released into the public domain. There is no license. There are no restrictions. This has a profound long-term implication: SQLite cannot be “closed” or relicensed. Even if the Hwaci team disbanded tomorrow, SQLite would continue to exist and be forkable by anyone.

Maintenance Commitment#

The SQLite developers have published an explicit commitment to maintain backward compatibility through at least the year 2050. This is not marketing — the SQLite team treats it as an engineering constraint. No change that breaks existing databases or existing code is acceptable. SQLite 3.x files written in 2004 are still readable by the current version, and the team intends to maintain this.

This makes SQLite uniquely safe for long-term archival storage and embedded applications that cannot easily upgrade their database format.

Ecosystem Health Signals#

  • Present in every major OS (iOS, Android, macOS, Windows 10+, Linux distributions)
  • Bundled in the standard distributions of Python, PHP, and Ruby
  • Default ORM target for Django, SQLAlchemy, and most ORMs
  • Used by Firefox, Chrome, Safari, Adobe, Skype, WhatsApp, and thousands of other applications
  • Test suite includes 100x more test code than implementation code

The risk of SQLite becoming unmaintained or abandoned is essentially zero. It is one of the most mission-critical pieces of software in existence.

Current Development Direction#

SQLite continues to receive meaningful improvements. Recent and ongoing work includes:

  • WebAssembly build (SQLite WASM): Official, first-party SQLite compiled to WASM for browser use, replacing sql.js
  • Litestream and Turso: Third-party projects extending SQLite with streaming replication (Litestream) and a distributed multi-tenant layer (Turso/libSQL), showing active ecosystem innovation
  • JSONB: Native binary JSON storage format added in 2023

The SQLite ecosystem is not stagnant — it is the base layer for a growing set of extensions and derivatives.


DuckDB: Rapidly Ascending, Commercially Backed#

DuckDB Labs#

DuckDB is developed by DuckDB Labs, a company founded by the original researchers (Hannes Mühleisen and Mark Raasveldt at CWI, the Dutch research institute). DuckDB Labs raised $18 million in a Series A round in 2023 from Andreessen Horowitz (a16z) and others. This gives DuckDB:

  • A funded, full-time development team
  • Commercial incentives to grow adoption and maintain quality
  • Investor pressure for sustainability (the company must generate revenue or raise further rounds)

The commercial model appears to be cloud offerings (MotherDuck, a serverless DuckDB service) with the core library remaining open source under the MIT license. This mirrors the successful open-core models of PostgreSQL-based companies.

Growth Trajectory#

DuckDB’s adoption has been unusually rapid for a database. The Python package has been downloaded hundreds of millions of times. The GitHub star count grew from near-zero to 26,000+ in roughly 5 years. Talks about DuckDB at data engineering conferences (dbt Coalesce, PyData, VLDB) have drawn significant interest.

The community is generating extensions at a high rate — DuckDB’s extension system allows third-party additions, and extensions for spatial data (SPATIAL), JSON processing, full-text search, and various file formats have appeared.

Risks#

DuckDB carries risks that SQLite does not:

  • VC-backed commercial entity: If DuckDB Labs fails commercially, development could slow or stop. The MIT license protects users from lockout, but an unmaintained library is a real concern.
  • File format stability: DuckDB does not make SQLite’s long-term format stability guarantee. Upgrading DuckDB may require re-exporting data. This limits DuckDB’s suitability as a long-term archival format.
  • Relative youth: At ~7 years old versus SQLite’s 25 years, DuckDB’s edge cases and failure modes are less thoroughly understood in production.

These risks are manageable — DuckDB’s momentum is strong and the MIT license protects against worst-case scenarios — but they differentiate DuckDB from SQLite’s essentially-zero risk profile.


RocksDB: Critical Infrastructure Under Meta Stewardship#

Meta’s Investment#

RocksDB is maintained by a team at Meta (Facebook). Meta uses RocksDB extensively in production — MyRocks (MySQL + RocksDB) stores significant portions of Facebook’s social graph data. The engineering team maintaining RocksDB is motivated by their own production use, which aligns their incentives with correctness and performance rather than feature marketing.

RocksDB is dual-licensed under the Apache 2.0 and GPLv2 licenses, making it permissively licensed and forkable.

Ecosystem Dependency#

RocksDB’s strategic importance is difficult to overstate. It is the storage engine for:

  • TiKV (TiDB’s storage layer) — broadly used in China and growing globally
  • CockroachDB — which ran on RocksDB for years before migrating to Pebble, a Go reimplementation of the RocksDB design — major cloud-native SQL database
  • Kafka Streams — the state store for stream processing at scale
  • Cassandra (optional RocksDB backend)
  • Numerous internal systems at large technology companies

If Meta were to abandon RocksDB, the ecosystem impact would be severe enough that a fork would certainly emerge and be well-funded. The project is unlikely to be abandoned precisely because so many organizations depend on it and have the capability to maintain it.

Current Direction#

RocksDB’s recent development focuses on:

  • Remote compaction: Offloading compaction to dedicated compaction workers (for disaggregated storage architectures)
  • Tiered storage: Supporting object storage (S3-compatible) as a lower tier
  • Improved observability: Better statistics and tracing for production operations

These features reflect the infrastructure-engineering use case dominating RocksDB’s development: the engine is increasingly targeted at large-scale distributed systems rather than embedded application use.


LMDB, LevelDB, BadgerDB, bbolt: Stability vs. Growth#

LMDB#

LMDB is maintained as part of the OpenLDAP project. It is mature, stable, and not under active feature development. This is both its strength and its limitation. LMDB will not change in ways that break existing databases or applications. It will also not gain new capabilities.

The risk profile is low: LMDB is proven technology used in production for over a decade. The OpenLDAP project is not going away. But LMDB’s innovation rate is near zero, which means it will not gain features like remote replication, cloud integration, or new data types.

LevelDB#

LevelDB is maintained by Google but not under active development. Google’s internal projects have largely moved to different storage systems, and LevelDB receives only occasional bugfixes. The project is stable but effectively in maintenance mode.

For new projects, RocksDB is the better choice. LevelDB’s main strategic advantage is its simplicity and the number of well-maintained ports in other languages (goleveldb, etc.) that may be more actively maintained than the C++ original.

BadgerDB#

BadgerDB is maintained by the Dgraph team. Dgraph (the graph database company) uses BadgerDB internally and has a financial interest in keeping it functional. However, Dgraph has had a turbulent corporate history — the company changed strategy multiple times, pivoted to a managed service, and faced questions about its commercial viability.

The risk profile for BadgerDB is higher than for the other options. If the Dgraph team deprioritizes BadgerDB, development could slow significantly. The Apache 2.0 license and pure-Go implementation mean it is forkable and maintainable by the community, but community forks require community interest.

For critical infrastructure, this risk warrants consideration. For green-field Go projects where the pure-Go constraint matters, BadgerDB remains a reasonable choice with eyes open to the risk.

bbolt#

bbolt’s strategic position is unusual. The original BoltDB project was abandoned by its author, but the etcd team adopted and renamed it bbolt precisely because etcd depends on it. This creates a maintenance floor: as long as Kubernetes and etcd are deployed at scale (for the foreseeable future), bbolt will receive maintenance.

However, bbolt is not under active feature development. It is intentionally stable and simple. The etcd team is not adding new capabilities to bbolt — they maintain it to keep it correct and bug-free. This is appropriate for a storage engine component underneath a distributed system, but it means bbolt will not evolve significantly.


Decision Matrix#

Dimension 1: SQL vs. Key-Value#

| Need | Options |
| --- | --- |
| SQL queries (SELECT, JOIN, WHERE, GROUP BY) | SQLite, DuckDB |
| Schema enforcement and foreign keys | SQLite |
| Simple key-value access (put/get/delete/scan) | RocksDB, LevelDB, LMDB, BadgerDB, bbolt |
| Document store (schema-less records) | TinyDB (small scale) |

Dimension 2: OLTP vs. OLAP#

| Need | Options |
| --- | --- |
| Transactional record management | SQLite (row-oriented, ACID) |
| Analytical aggregation over large datasets | DuckDB (column-oriented, vectorized) |
| Both in one database (HTAP) | Use SQLite + DuckDB together, or accept trade-offs |

Dimension 3: Read-Heavy vs. Write-Heavy#

| Access Pattern | Options |
| --- | --- |
| Read-heavy (many reads, few writes) | LMDB, bbolt, SQLite in WAL mode |
| Write-heavy (sustained high write throughput) | RocksDB, BadgerDB, LevelDB |
| Mixed (balanced reads and writes) | SQLite WAL, RocksDB with tuning |
| Append-only (time-series, event logs) | RocksDB with FIFO compaction, BadgerDB |
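Since SQLite's WAL mode comes up repeatedly in these trade-offs, it is worth noting how little it takes to enable: WAL is a property of the database file, switched on with a single PRAGMA. A minimal Python sketch (the file path is illustrative):

```python
import os
import sqlite3
import tempfile

# WAL mode lets readers proceed concurrently with a single writer,
# which is why it appears in the read-heavy and mixed rows above.
path = os.path.join(tempfile.mkdtemp(), "app.db")

conn = sqlite3.connect(path)
# The PRAGMA returns the journal mode now in effect.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # wal
conn.close()
```

The setting persists in the file: subsequent connections to the same database open in WAL mode without repeating the PRAGMA. (In-memory databases report 'memory' and cannot use WAL.)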

Dimension 4: Language Ecosystem#

| Language | Recommended Options |
| --- | --- |
| Python | SQLite (stdlib), DuckDB (first-class) |
| Go | bbolt, BadgerDB, modernc.org/sqlite (pure-Go) |
| Rust | RocksDB (rust-rocksdb), SQLite (rusqlite), LMDB (lmdb-rs) |
| Java/JVM | RocksDB (first-class Java API), SQLite (sqlite-jdbc) |
| JavaScript/Node.js | better-sqlite3, DuckDB-node |
| Swift/Kotlin (mobile) | SQLite via platform APIs |
| C/C++ | All options have native APIs |

Dimension 5: Dataset Size and Longevity#

| Scenario | Recommendation |
| --- | --- |
| Small (< 1M rows, < 1GB) | Any option works; SQLite or TinyDB |
| Medium (1M-100M rows, 1-100GB) | SQLite, DuckDB, RocksDB, LMDB |
| Large (> 100M rows, > 100GB) | RocksDB, DuckDB; SQLite with WAL |
| Long-term archival (years, format stability) | SQLite (public domain, 2050 commitment) |
| Temporary/ephemeral analytics | DuckDB in-memory mode |

Dimension 6: Deployment Context#

| Context | Recommendation |
| --- | --- |
| Mobile app (iOS/Android) | SQLite |
| Desktop application | SQLite |
| Browser (WebAssembly) | SQLite WASM (official), DuckDB-WASM |
| Data science notebook | DuckDB |
| Microservice embedded state | bbolt or BadgerDB (Go), SQLite (others) |
| Storage engine for a database | RocksDB |
| Edge computing / IoT | SQLite |
| Unit tests / CI | DuckDB in-memory, SQLite in-memory |
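The unit-test row deserves a concrete sketch: an in-memory SQLite database gives each test a fresh, isolated store with no server, no files, and no cleanup. The repository function under test here is hypothetical.

```python
import sqlite3
import unittest

# Hypothetical code under test: a tiny repository over a users table.
def add_user(conn, name):
    with conn:
        conn.execute("INSERT INTO users (name) VALUES (?)", (name,))

class UserRepoTest(unittest.TestCase):
    def setUp(self):
        # Each ":memory:" connection is a private, empty database,
        # so tests are isolated and parallelize trivially.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

    def tearDown(self):
        self.conn.close()  # the database vanishes with the connection

    def test_add_user(self):
        add_user(self.conn, "Ada")
        count = self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
        self.assertEqual(count, 1)
```

One caveat: an in-memory database is not byte-for-byte identical to production behavior (no WAL, no fsync path), so a small number of integration tests against a real file are still worthwhile.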

Strategic Recommendation Summary#

For application developers: Start with SQLite. It handles 90% of embedded database needs across all application types. Only deviate when you have a specific need SQLite cannot meet.

For analytical workloads: DuckDB has earned its place as the default. Its rapid adoption in the data science community and its technical superiority for OLAP workloads make it the clear choice. The commercial risks are manageable given the MIT license and strong community.

For infrastructure engineers: RocksDB is the industry standard. If you are building something that will process serious write throughput as a storage layer, RocksDB’s configuration space and production credibility are necessary.

For Go developers: bbolt for transactional, read-heavy state with zero-risk maintenance guarantees. BadgerDB for write-heavy workloads with awareness of the higher organizational risk. Both are better than CGO-based bindings for projects where build simplicity matters.

For the long term: SQLite is the only embedded database with a credible 30-year horizon of maintenance, format stability, and zero organizational risk. If you need to store data that must be readable in 2050, SQLite is the answer.


Emerging Directions#

SQLite as a platform: Projects like Turso (libSQL, a SQLite fork with network extensions), Litestream (streaming replication), and SQLite WASM are extending SQLite into distributed and browser use cases that were previously off-limits. SQLite is becoming the foundation for a class of edge database infrastructure.

DuckDB in the cloud: MotherDuck (the DuckDB-as-a-service offering) extends DuckDB to serverless cloud analytics. The ability to query local DuckDB files and cloud-hosted data in the same query is architecturally interesting for hybrid local-cloud analytics.

Embedded databases for AI/ML: Vector similarity search is increasingly needed as AI applications require embedding-based lookup. SQLite has vector extensions (sqlite-vec, sqlite-vss). DuckDB has vector support. This trend may reshape which embedded databases gain adoption in AI application development.

RocksDB for disaggregated storage: Cloud infrastructure is moving toward separating compute from storage (S3-backed databases). RocksDB’s remote compaction and tiered storage features are positioning it for this architecture, where the “embedded” label becomes more complicated — the engine runs in-process but its storage tier is remote.

Published: 2026-03-04 Updated: 2026-03-04