Phase 5: document optional Postgres backend + migration runbook

Add a Database Backend section to the README (config vars, compose profile, schema-per-instance) and an "Optional PostgreSQL Backend" section to docs/upgrading.md covering enablement, search_path isolation, role/db provisioning, and the SQLite -> Postgres data-migration runbook.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 22:11:17 +02:00 · 2026-06-14 08:37:26 +01:00
parent da012afd51
commit afda05403f
2 changed files with 96 additions and 1 deletions
@@ -244,7 +244,7 @@ Each worker is an independent process sharing one listening socket, so the kerne

 Pick a worker count around the number of CPU cores available to the container; start with `2`–`4` and measure under realistic load.

-**SQLite caveat:** all workers share the same SQLite file on the same host. WAL mode (enabled automatically) allows concurrent readers alongside the single writer (the collector), so reads scale — but **writes do not**, and this does not extend across multiple hosts (a network filesystem breaks SQLite locking). To scale the API across hosts, switch `DATABASE_URL` to PostgreSQL; the API requires no code changes for this.
+**SQLite caveat:** all workers share the same SQLite file on the same host. WAL mode (enabled automatically) allows concurrent readers alongside the single writer (the collector), so reads scale — but **writes do not**, and this does not extend across multiple hosts (a network filesystem breaks SQLite locking). To scale the API across hosts, switch to PostgreSQL (`DATABASE_BACKEND=postgres`); the API requires no code changes for this. See [Database Backend](#database-backend).

 > Prefer `API_WORKERS` over running multiple `api` containers (`--scale api=N`): the `api` service uses a fixed `container_name`, and one process-managed container per stack keeps logs, health checks, and monitoring simple.

@@ -346,6 +346,32 @@ All components are configured via environment variables. Create a `.env` file or

 > **Note:** `MQTT_PREFIX` also accepts the legacy alias `MQTT_TOPIC_PREFIX` for backward compatibility.

+### Database Backend
+
+MeshCore Hub defaults to **SQLite** (zero-config, single host). Set `DATABASE_BACKEND=postgres` to switch to **PostgreSQL** for write scaling and multi-host deployments. Postgres is opt-in: leave `DATABASE_BACKEND` unset to keep using SQLite.
+
+| Variable            | Default       | Description                                                                              |
+| ------------------- | ------------- | --------------------------------------------------------------------------------------- |
+| `DATABASE_BACKEND`  | `sqlite`      | `sqlite` or `postgres`. Explicit switch — Postgres is never selected implicitly.         |
+| `DATABASE_HOST`     | `postgres`    | Postgres hostname (`postgres` = bundled container service name)                          |
+| `DATABASE_PORT`     | `5432`        | Postgres port                                                                            |
+| `DATABASE_NAME`     | `meshcorehub` | Database name                                                                            |
+| `DATABASE_SCHEMA`   | `meshcorehub` | Schema (search_path). Set a distinct value per instance on a shared cluster             |
+| `DATABASE_USER`     | `meshcorehub` | Role name                                                                                |
+| `DATABASE_PASSWORD` | _(none)_      | **Required** for Postgres                                                                |
+| `DATABASE_URL`      | _(none)_      | Advanced: full SQLAlchemy URL; overrides all of the above                                |
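+
+As a rough sketch, a minimal `.env` for the bundled Postgres container could look like the following. The password is a placeholder, and the commented-out lines only restate the defaults from the table above.
+
+```bash
+# .env (illustrative values; replace the placeholder password)
+DATABASE_BACKEND=postgres
+# Required for Postgres; there is no default
+DATABASE_PASSWORD=change-me
+# Defaults, shown for reference; uncomment only to override
+# DATABASE_HOST=postgres
+# DATABASE_PORT=5432
+# DATABASE_NAME=meshcorehub
+# DATABASE_SCHEMA=meshcorehub
+# DATABASE_USER=meshcorehub
+```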
+
+**Docker:** Postgres is bundled behind the `postgres` profile. The container's credentials and database name are derived from the `DATABASE_*` values (single source of truth).
+
+```bash
+docker compose --profile postgres --profile core up    # Start on Postgres
+docker compose --profile core up                        # Start on SQLite (default)
+```
+
+**Schema-per-instance:** several instances (e.g. `prod`, `stg`) can share one Postgres cluster, each isolated to its own schema via `search_path` — give each a distinct `DATABASE_SCHEMA`. The schema is created automatically on `db upgrade`.
+
+See [docs/upgrading.md](docs/upgrading.md#optional-postgresql-backend) for the setup reference and the SQLite → Postgres data-migration runbook.
+
 ### Collector Settings

 | Variable                           | Default | Description                                              |
@@ -4,6 +4,75 @@ This guide covers upgrading from a previous MeshCore Hub release to the current

 ## v0.13.0

+### Optional PostgreSQL Backend
+
+MeshCore Hub can now run on **PostgreSQL** as an alternative to the default SQLite database. SQLite remains the zero-config default — Postgres is entirely opt-in and **no action is required** to keep using SQLite. Switch to Postgres to scale writes and run the stack across multiple hosts (SQLite's file locking does not work over network filesystems and caps you at a single host). Existing operators can migrate their live SQLite data into Postgres with a single command (downtime required while writers are stopped).
+
+#### Enabling Postgres
+
+Set `DATABASE_BACKEND=postgres` and the `DATABASE_*` connection variables, then activate the compose `postgres` profile:
+
+| Variable            | Default       | Description                                                                                  |
+| ------------------- | ------------- | -------------------------------------------------------------------------------------------- |
+| `DATABASE_BACKEND`  | `sqlite`      | `sqlite` (default) or `postgres`. Explicit switch — Postgres is never used implicitly.        |
+| `DATABASE_HOST`     | `postgres`    | Postgres hostname (`postgres` is the bundled container's service name).                       |
+| `DATABASE_PORT`     | `5432`        | Postgres port.                                                                                |
+| `DATABASE_NAME`     | `meshcorehub` | Database name. The bundled container is initialised with this name.                           |
+| `DATABASE_SCHEMA`   | `meshcorehub` | Postgres schema (search_path). **Set a distinct value per instance** on a shared cluster.     |
+| `DATABASE_USER`     | `meshcorehub` | Role name. The bundled container is initialised with this user.                              |
+| `DATABASE_PASSWORD` | _(none)_      | **Required** for Postgres. Generate one, e.g. `openssl rand -base64 32`.                      |
+
+```bash
+# Start the stack on Postgres (bundled container)
+docker compose -f docker-compose.yml -f docker-compose.dev.yml \
+  --profile postgres --profile core up -d
+```
+
+The bundled `postgres` container derives its `POSTGRES_USER` / `POSTGRES_PASSWORD` / `POSTGRES_DB` from the same `DATABASE_USER` / `DATABASE_PASSWORD` / `DATABASE_NAME` values — one source of truth. For a **managed/external** Postgres, point `DATABASE_HOST` at it (and skip the `postgres` profile). Advanced users can instead set a full `DATABASE_URL` (e.g. `postgresql+psycopg2://user:pass@host:5432/db`), which takes precedence over the component variables.
+
+#### Schema-per-instance (`search_path`)
+
+Each Hub instance is isolated to its own Postgres **schema** via the connection's `search_path`, not its own database. This lets several instances (e.g. `prod`, `stg`) share **one** Postgres cluster without colliding — each gets its own tables and its own `alembic_version`. Give every instance a distinct `DATABASE_SCHEMA` (e.g. `meshcorehub_prod`, `meshcorehub_stg`). The schema is created automatically on `db upgrade` if it does not exist.
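+
+One way to inspect the isolation described above is to list the schemas and read one instance's Alembic revision. This is only a sketch: it assumes the bundled `postgres` container (whose image ships `psql`), the default role and database names, and the example schema name `meshcorehub_prod`.
+
+```bash
+# List the schemas in the shared database
+docker compose -f docker-compose.yml -f docker-compose.dev.yml --profile postgres \
+  exec postgres psql -U meshcorehub -d meshcorehub -c '\dn'
+
+# Show the Alembic revision recorded inside one instance's schema
+docker compose -f docker-compose.yml -f docker-compose.dev.yml --profile postgres \
+  exec postgres psql -U meshcorehub -d meshcorehub \
+  -c 'SELECT version_num FROM meshcorehub_prod.alembic_version;'
+```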
+
+#### Provisioning the role and database
+
+The bundled container provisions the role and database for you on first start. For a managed/external Postgres, create them once before pointing Hub at it:
+
+```sql
+CREATE ROLE meshcorehub LOGIN PASSWORD 'your-password';
+CREATE DATABASE meshcorehub OWNER meshcorehub;
+-- The schema is created by `db upgrade`; the role just needs CREATE on the database.
+```
+
+No admin/bootstrap credentials are needed at runtime — Hub only ever connects as `DATABASE_USER`.
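+
+If the database already exists and is owned by a different role (common on managed services), the Hub role needs `CREATE` on it before `db upgrade` can create the schema. A hedged sketch, where the host `db.example.internal` and the admin user `postgres` are placeholders for your own environment:
+
+```bash
+# Run once as an admin role on the external server (placeholder host and admin user)
+psql -h db.example.internal -U postgres -d postgres \
+  -c 'GRANT CONNECT, CREATE ON DATABASE meshcorehub TO meshcorehub;'
+
+# Confirm that the Hub role can log in
+psql -h db.example.internal -U meshcorehub -d meshcorehub -c 'SELECT current_user;'
+```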
+
+#### Migrating an existing SQLite database to Postgres
+
+Expect downtime: the collector and api stay stopped from step 2 until the stack restarts on Postgres in step 5. The source SQLite file is never modified.
+
+1. **Back up first.** Copy your `meshcore.db` (or back up the `hub_data` volume — see [Backup & Restore](../README.md#backup--restore)).
+2. **Stop the writers** (collector and api):
+   ```bash
+   docker compose -f docker-compose.yml -f docker-compose.dev.yml stop collector api
+   ```
+3. **Bring up Postgres** and create the schema:
+   ```bash
+   docker compose -f docker-compose.yml -f docker-compose.dev.yml --profile postgres up -d postgres
+   docker compose -f docker-compose.yml -f docker-compose.dev.yml --profile postgres run --rm migrate
+   ```
+   `migrate` runs `db upgrade` against Postgres, which creates the schema and all tables (with correct native types: `boolean`, `json`, `timestamptz`) and stamps `alembic_version`.
+4. **Copy the data** with the built-in command:
+   ```bash
+   docker compose -f docker-compose.yml -f docker-compose.dev.yml --profile postgres \
+     run --rm migrate meshcore-hub db migrate-to-postgres
+   ```
+   The source defaults to `sqlite:///{DATA_HOME}/collector/meshcore.db` and the target to your configured `DATABASE_*` connection. The command copies every table in foreign-key order through the ORM, so SQLite's dynamically typed values are converted correctly (`0/1` → `boolean`, JSON text → `json`, naive datetimes → UTC `timestamptz`). It then prints a per-table source-vs-target row-count reconciliation and fails on any mismatch. Use `--dry-run` to preview counts first, and `--truncate` to overwrite a non-empty target.
+5. **Start the stack on Postgres** with `DATABASE_BACKEND=postgres` set (see *Enabling Postgres* above).
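+
+The preview-then-copy flow from step 4 might look like the sketch below. The `--dry-run` and `--truncate` flags are the ones described in that step; appending them after the subcommand is an assumption about the CLI's argument order.
+
+```bash
+# Preview the per-table row counts first
+docker compose -f docker-compose.yml -f docker-compose.dev.yml --profile postgres \
+  run --rm migrate meshcore-hub db migrate-to-postgres --dry-run
+
+# Repeat the copy over a target that already holds rows (overwrites the target)
+docker compose -f docker-compose.yml -f docker-compose.dev.yml --profile postgres \
+  run --rm migrate meshcore-hub db migrate-to-postgres --truncate
+```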
+
+> **Why not pgloader?** pgloader infers the target schema from SQLite's *dynamic* typing and produces wrong Postgres types (e.g. `is_observer` as `bigint` not `boolean`, JSON columns as `text`, no `timestamptz`). It also leaves no `alembic_version` consistent with the migration history. The built-in command reuses the ORM models, so types convert correctly and the schema is created by `db upgrade`.
+
+> **Managed Postgres / non-superuser roles:** the migration disables foreign-key triggers during the copy via `session_replication_role = replica`, which requires a superuser. When the target role is not a superuser (typical for managed Postgres), the command automatically falls back to copying in parent-first order instead. Pass `--no-replication-role` to force the fallback explicitly.
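+
+To see which path applies before migrating, you can check whether the target role is a superuser and, if preferred, force the fallback explicitly. This is a sketch: the host `db.example.internal` is a placeholder, and placing `--no-replication-role` after the subcommand is an assumption about the CLI's argument order.
+
+```bash
+# Prints "t" if the connecting role is a superuser, "f" otherwise
+psql -h db.example.internal -U meshcorehub -d meshcorehub -At \
+  -c 'SELECT rolsuper FROM pg_roles WHERE rolname = current_user;'
+
+# Force the parent-first fallback instead of relying on auto-detection
+docker compose -f docker-compose.yml -f docker-compose.dev.yml \
+  run --rm migrate meshcore-hub db migrate-to-postgres --no-replication-role
+```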
+
 ### Raw Packets (capture, browse, and search wire packets)

 A new **Raw Packets** feature captures every inbound MeshCore packet exactly as it arrives over the LetsMesh `packets` feed into a dedicated `raw_packets` table, independent of how the collector later classifies it. A new `/packets` API and a SPA **Packets** page (table on desktop, cards on mobile) let operators browse, filter, and search the raw traffic.