Phase 5: document optional Postgres backend + migration runbook

Add a Database Backend section to the README (config vars, compose profile,
schema-per-instance) and an "Optional PostgreSQL Backend" section to
docs/upgrading.md covering enablement, search_path isolation, role/db
provisioning, and the SQLite -> Postgres data-migration runbook.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Louis King
2026-06-14 08:37:26 +01:00
parent da012afd51
commit afda05403f
2 changed files with 96 additions and 1 deletions
@@ -244,7 +244,7 @@ Each worker is an independent process sharing one listening socket, so the kerne
Pick a worker count around the number of CPU cores available to the container; start with `2`–`4` and measure under realistic load.
**SQLite caveat:** all workers share the same SQLite file on the same host. WAL mode (enabled automatically) allows concurrent readers alongside the single writer (the collector), so reads scale — but **writes do not**, and this does not extend across multiple hosts (a network filesystem breaks SQLite locking). To scale the API across hosts, switch `DATABASE_URL` to PostgreSQL; the API requires no code changes for this.
**SQLite caveat:** all workers share the same SQLite file on the same host. WAL mode (enabled automatically) allows concurrent readers alongside the single writer (the collector), so reads scale — but **writes do not**, and this does not extend across multiple hosts (a network filesystem breaks SQLite locking). To scale the API across hosts, switch to PostgreSQL (`DATABASE_BACKEND=postgres`); the API requires no code changes for this. See [Database Backend](#database-backend).
> Prefer `API_WORKERS` over running multiple `api` containers (`--scale api=N`): the `api` service uses a fixed `container_name`, and one process-managed container per stack keeps logs, health checks, and monitoring simple.
@@ -346,6 +346,32 @@ All components are configured via environment variables. Create a `.env` file or
> **Note:** `MQTT_PREFIX` also accepts the legacy alias `MQTT_TOPIC_PREFIX` for backward compatibility.
### Database Backend
MeshCore Hub defaults to **SQLite** (zero-config, single host). Set `DATABASE_BACKEND=postgres` to switch to **PostgreSQL** for write scaling and multi-host deployments. Postgres is opt-in — leave these unset to keep using SQLite.
| Variable | Default | Description |
| ------------------- | ------------- | --------------------------------------------------------------------------------------- |
| `DATABASE_BACKEND` | `sqlite` | `sqlite` or `postgres`. Explicit switch — Postgres is never selected implicitly. |
| `DATABASE_HOST` | `postgres` | Postgres hostname (`postgres` = bundled container service name) |
| `DATABASE_PORT` | `5432` | Postgres port |
| `DATABASE_NAME` | `meshcorehub` | Database name |
| `DATABASE_SCHEMA` | `meshcorehub` | Schema (search_path). Set a distinct value per instance on a shared cluster |
| `DATABASE_USER` | `meshcorehub` | Role name |
| `DATABASE_PASSWORD` | _(none)_ | **Required** for Postgres |
| `DATABASE_URL` | _(none)_ | Advanced: full SQLAlchemy URL; overrides all of the above |
**Docker:** Postgres is bundled behind the `postgres` profile. The container's credentials/name are derived from the `DATABASE_*` values (single source of truth).
```bash
docker compose --profile postgres --profile core up # Start on Postgres
docker compose --profile core up # Start on SQLite (default)
```
**Schema-per-instance:** several instances (e.g. `prod`, `stg`) can share one Postgres cluster, each isolated to its own schema via `search_path` — give each a distinct `DATABASE_SCHEMA`. The schema is created automatically on `db upgrade`.
See [docs/upgrading.md](docs/upgrading.md#optional-postgresql-backend) for the setup reference and the SQLite → Postgres data-migration runbook.
### Collector Settings
| Variable | Default | Description |
@@ -4,6 +4,75 @@ This guide covers upgrading from a previous MeshCore Hub release to the current
## v0.13.0
### Optional PostgreSQL Backend
MeshCore Hub can now run on **PostgreSQL** as an alternative to the default SQLite database. SQLite remains the zero-config default — Postgres is entirely opt-in and **no action is required** to keep using SQLite. Switch to Postgres to scale writes and run the stack across multiple hosts (SQLite's file locking does not work over network filesystems and caps you at a single host). Existing operators can migrate their live SQLite data into Postgres with a single command (downtime required while writers are stopped).
#### Enabling Postgres
Set `DATABASE_BACKEND=postgres` and the `DATABASE_*` connection variables, then activate the compose `postgres` profile:
| Variable | Default | Description |
| ------------------- | ------------- | -------------------------------------------------------------------------------------------- |
| `DATABASE_BACKEND` | `sqlite` | `sqlite` (default) or `postgres`. Explicit switch — Postgres is never used implicitly. |
| `DATABASE_HOST` | `postgres` | Postgres hostname (`postgres` is the bundled container's service name). |
| `DATABASE_PORT` | `5432` | Postgres port. |
| `DATABASE_NAME` | `meshcorehub` | Database name. The bundled container is initialised with this name. |
| `DATABASE_SCHEMA` | `meshcorehub` | Postgres schema (search_path). **Set a distinct value per instance** on a shared cluster. |
| `DATABASE_USER` | `meshcorehub` | Role name. The bundled container is initialised with this user. |
| `DATABASE_PASSWORD` | _(none)_ | **Required** for Postgres. Generate one, e.g. `openssl rand -base64 32`. |
```bash
# Start the stack on Postgres (bundled container)
docker compose -f docker-compose.yml -f docker-compose.dev.yml \
--profile postgres --profile core up -d
```
The bundled `postgres` container derives its `POSTGRES_USER` / `POSTGRES_PASSWORD` / `POSTGRES_DB` from the same `DATABASE_USER` / `DATABASE_PASSWORD` / `DATABASE_NAME` values — one source of truth. For a **managed/external** Postgres, point `DATABASE_HOST` at it (and skip the `postgres` profile). Advanced users can instead set a full `DATABASE_URL` (e.g. `postgresql+psycopg2://user:pass@host:5432/db`), which takes precedence over the component variables.
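The precedence described above can be sketched in a few lines of Python. This is an illustration only, not Hub's actual settings code: the function name `build_database_url` and the `./data` fallback for `DATA_HOME` are assumptions made for the example, while the variable names, defaults, and the `postgresql+psycopg2://` form come from the table and text above.

```python
from urllib.parse import quote_plus


def build_database_url(env):
    """Resolve a SQLAlchemy-style URL from DATABASE_* settings (hypothetical helper)."""
    # A full DATABASE_URL, when set, takes precedence over the component variables.
    if env.get("DATABASE_URL"):
        return env["DATABASE_URL"]

    backend = env.get("DATABASE_BACKEND", "sqlite")
    if backend == "sqlite":
        # Assumption: "./data" stands in for whatever DATA_HOME defaults to.
        data_home = env.get("DATA_HOME", "./data")
        return f"sqlite:///{data_home}/collector/meshcore.db"
    if backend != "postgres":
        raise ValueError(f"unsupported DATABASE_BACKEND: {backend!r}")

    password = env.get("DATABASE_PASSWORD")
    if not password:
        raise ValueError("DATABASE_PASSWORD is required for Postgres")

    # Percent-encode credentials so characters such as '@' or '/' cannot break the URL.
    user = quote_plus(env.get("DATABASE_USER", "meshcorehub"))
    host = env.get("DATABASE_HOST", "postgres")
    port = env.get("DATABASE_PORT", "5432")
    name = env.get("DATABASE_NAME", "meshcorehub")
    return f"postgresql+psycopg2://{user}:{quote_plus(password)}@{host}:{port}/{name}"
```

Passing `os.environ` as `env` would resolve the URL for the current process. The password is encoded because a generated secret (for example from `openssl rand -base64 32`) can contain `/` or `+`.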
#### Schema-per-instance (`search_path`)
Each Hub instance is isolated to its own Postgres **schema** via the connection's `search_path`, not its own database. This lets several instances (e.g. `prod`, `stg`) share **one** Postgres cluster without colliding — each gets its own tables and its own `alembic_version`. Give every instance a distinct `DATABASE_SCHEMA` (e.g. `meshcorehub_prod`, `meshcorehub_stg`). The schema is created automatically on `db upgrade` if it does not exist.
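One common way to pin a connection to a schema is libpq's `options` connection parameter, which passes `-c` settings to the server at connect time. The sketch below shows that technique under stated assumptions: the helper name `search_path_connect_args` is hypothetical, and this guide does not say which mechanism Hub itself uses to set `search_path`.

```python
import re

# Conservative pattern for an unquoted Postgres identifier.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def search_path_connect_args(schema):
    """Build driver connect arguments that set search_path (hypothetical helper)."""
    # Reject anything that is not a plain identifier before it reaches the server.
    if not _IDENTIFIER.fullmatch(schema):
        raise ValueError(f"unsafe schema name: {schema!r}")
    # libpq forwards `-c name=value` options as session settings.
    return {"options": f"-csearch_path={schema}"}
```

With SQLAlchemy and psycopg2, the returned dict could be supplied as `create_engine(url, connect_args=search_path_connect_args("meshcorehub_prod"))`, so unqualified table names and `alembic_version` resolve inside that instance's schema.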
#### Provisioning the role and database
The bundled container provisions the role and database for you on first start. For a managed/external Postgres, create them once before pointing Hub at it:
```sql
CREATE ROLE meshcorehub LOGIN PASSWORD 'your-password';
CREATE DATABASE meshcorehub OWNER meshcorehub;
-- The schema is created by `db upgrade`; the role just needs CREATE on the database.
```
No admin/bootstrap credentials are needed at runtime — Hub only ever connects as `DATABASE_USER`.
#### Migrating an existing SQLite database to Postgres
Downtime is required while writers are stopped; the source SQLite file is never modified.
1. **Back up first.** Copy your `meshcore.db` (or back up the `hub_data` volume — see [Backup & Restore](../README.md#backup--restore)).
2. **Stop the writers** (collector and api):
```bash
docker compose -f docker-compose.yml -f docker-compose.dev.yml stop collector api
```
3. **Bring up Postgres** and create the schema:
```bash
docker compose -f docker-compose.yml -f docker-compose.dev.yml --profile postgres up -d postgres
docker compose -f docker-compose.yml -f docker-compose.dev.yml --profile postgres run --rm migrate
```
`migrate` runs `db upgrade` against Postgres, creating the schema, all tables (with correct native types — `boolean`, `json`, `timestamptz`), and stamping `alembic_version`.
4. **Copy the data** with the built-in command:
```bash
docker compose -f docker-compose.yml -f docker-compose.dev.yml --profile postgres \
run --rm migrate meshcore-hub db migrate-to-postgres
```
It defaults the source to `sqlite:///{DATA_HOME}/collector/meshcore.db` and the target to your configured `DATABASE_*` connection. It copies every table in foreign-key order through the ORM (so SQLite's dynamically typed values are converted correctly — `0/1` → `boolean`, JSON text → `json`, naive datetimes → UTC `timestamptz`), then prints a per-table source-vs-target row-count reconciliation and fails on any mismatch. Use `--dry-run` to preview counts first, and `--truncate` to overwrite a non-empty target.
5. **Start the stack on Postgres** with `DATABASE_BACKEND=postgres` set (see *Enabling Postgres* above).
> **Why not pgloader?** pgloader infers the target schema from SQLite's *dynamic* typing and produces wrong Postgres types (e.g. `is_observer` as `bigint` not `boolean`, JSON columns as `text`, no `timestamptz`). It also leaves no `alembic_version` row consistent with the migration history. The built-in command reuses the ORM models, so types convert correctly and the schema is created by `db upgrade`.
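To make the type conversions concrete, here is a minimal standalone sketch of the three coercions named in the runbook (`0/1` to `boolean`, JSON text to `json`, naive datetimes to UTC `timestamptz`). The function names are hypothetical and this is not the built-in command's code, which performs the conversion through the ORM models; the sketch also assumes naive timestamps in the SQLite file are already in UTC.

```python
import json
from datetime import datetime, timezone


def to_bool(value):
    """SQLite stores booleans as 0/1 integers; keep NULL as None."""
    return None if value is None else bool(value)


def to_json(value):
    """SQLite stores JSON as text; decode strings and pass other values through."""
    return json.loads(value) if isinstance(value, str) else value


def to_utc(value):
    """Attach UTC to naive datetimes (assumption: they were written as UTC)."""
    if value is None:
        return None
    if isinstance(value, str):
        value = datetime.fromisoformat(value)
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    # Already timezone-aware: normalise to UTC.
    return value.astimezone(timezone.utc)
```

A schema inferred from the SQLite file alone cannot know that an integer column is a flag or that a text column holds JSON, which is the information the ORM models supply.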
> **Managed Postgres / non-superuser roles:** the migration disables foreign-key triggers during the copy via `session_replication_role = replica`, which requires a superuser. When the target role is not a superuser (typical for managed Postgres), the command automatically falls back to copying in parent-first order instead. Pass `--no-replication-role` to force the fallback explicitly.
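Parent-first order means every referenced table is copied before the tables that point at it, so foreign-key checks pass without disabling triggers. A minimal sketch using the standard library's `graphlib` (Python 3.9+) follows; the function name and the example table names other than `raw_packets` are hypothetical, and the real command's ordering logic may differ.

```python
from graphlib import TopologicalSorter


def parent_first_order(foreign_keys):
    """Order tables so that referenced (parent) tables come first.

    `foreign_keys` maps each table name to the set of tables it references.
    """
    # TopologicalSorter expects node -> predecessors and yields predecessors first.
    return list(TopologicalSorter(foreign_keys).static_order())


# Hypothetical dependency map: `messages` and `tags` both reference `nodes`.
example = {
    "nodes": set(),
    "messages": {"nodes"},
    "tags": {"nodes"},
    "raw_packets": set(),
}
order = parent_first_order(example)
```

`graphlib` raises `CycleError` on circular references, so tables with cyclic foreign keys would need separate handling. SQLAlchemy exposes a comparable dependency ordering as `MetaData.sorted_tables`; whether the built-in command relies on it is not stated in this guide.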
### Raw Packets (capture, browse, and search wire packets)
A new **Raw Packets** feature captures every inbound MeshCore packet exactly as it arrives over the LetsMesh `packets` feed into a dedicated `raw_packets` table, independent of how the collector later classifies it. A new `/packets` API and a SPA **Packets** page (table on desktop, cards on mobile) let operators browse, filter, and search the raw traffic.