mc-webui/docs/architecture.md
MarekWo 53ef2759d5 docs: cover analyzer settings, vacuum/optimize, path apply, watchdog soft patterns
User-guide: new Settings > Analyzer tab (custom analyzer services with default/disabled
toggles and {packetHash} placeholder), apply-path upload button in DM Path Management,
Backup modal's Optimize button + live size label, console change_path now accepts
arrow/whitespace separators with consistent multi-byte chunk length and "path" output
shows hop count and byte size.

Architecture: new /api/analyzers CRUD + default endpoints, /api/db/size and the split
/api/db/vacuum kickoff + /api/db/vacuum/status polling (worker-thread VACUUM to survive
proxy idle timeouts), /api/contacts/<key>/paths/<id>/apply, /health and /health/strict
top-level routes, analyzers table and direct_messages.delivery_path_hash_size column,
recombined path_len byte storage. DeviceManager: per-send channel-secret refresh,
liveness telemetry (_last_rx_at + _consecutive_stats_failures), TCP self-heal via
_liveness_watcher_loop + in-place reconnect. Retention scheduler: on-by-default
90/90/60/30, post-cleanup VACUUM at >=1000 deletions, app-context wrapping, archiver
emoji-name fallback. Socket.IO clients forced to polling transport.

Watchdog: documented hard- vs soft-pattern detection (5 hits in 2 min for sluggish
get_stats / get_battery failures), pointer to /health/strict, and the systemd-restart
deploy note for scripts/watchdog/ changes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-08 11:53:41 +02:00


mc-webui Architecture

Technical documentation for mc-webui, covering system architecture, project structure, and internal APIs.


Tech Stack

  • Backend: Python 3.11+, Flask, Flask-SocketIO (gevent), SQLite
  • Frontend: HTML5, Bootstrap 5, vanilla JavaScript, Socket.IO client
  • Deployment: Docker / Docker Compose (Single-container architecture)
  • Communication: Direct hardware access (USB, BLE, or TCP) via meshcore library
  • Data source: SQLite Database (./data/meshcore/<pubkey_prefix>.db)

Container Architecture

mc-webui uses a single-container architecture for simplified deployment and direct hardware communication:

┌─────────────────────────────────────────────────────────────┐
│                     Docker Network                           │
│                                                              │
│  ┌───────────────────────────────────────────────────────┐   │
│  │                       mc-webui                        │   │
│  │                                                       │   │
│  │  - Flask web app (Port 5000)                          │   │
│  │  - DeviceManager (Direct USB/BLE/TCP access)          │   │
│  │  - Database (SQLite)                                  │   │
│  │                                                       │   │
│  └─────────┬─────────────────────────────────────────────┘   │
│            │                                                 │
└────────────┼─────────────────────────────────────────────────┘
             │
             ▼
      ┌──────────────┐
      │ USB/BLE/TCP  │
      │    Device    │
      └──────────────┘

Three transport options are supported with the following priority: BLE > TCP > Serial (USB). Set the MC_BLE_ADDRESS or MC_TCP_HOST environment variable to activate BLE or TCP transport respectively; otherwise, USB serial is used by default.
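The priority rule above can be sketched as a small selection function. This is a minimal illustration, not the project's code: the function name `select_transport` is hypothetical, and treating an empty variable as unset is an assumption.

```python
import os
from typing import Mapping, Optional


def select_transport(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "ble", "tcp", or "serial" following the BLE > TCP > Serial priority."""
    env = os.environ if env is None else env
    if env.get("MC_BLE_ADDRESS"):   # highest priority
        return "ble"
    if env.get("MC_TCP_HOST"):      # used only when no BLE address is set
        return "tcp"
    return "serial"                 # USB serial is the default
```

Passing a plain dict instead of `os.environ` keeps the rule easy to exercise in isolation, for example `select_transport({"MC_TCP_HOST": "10.0.0.5"})`.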

This v2 architecture eliminates the need for a separate bridge container and relies on the native meshcore Python library for direct communication, ensuring lower latency and greater stability.

Docker Entrypoint (BLE cleanup)

scripts/docker-entrypoint.sh runs before the Flask app starts. When MC_BLE_ADDRESS is set, it uses D-Bus to check if BlueZ has an active session to the device and disconnects it. BlueZ auto-reconnects trusted devices, which leaves stale GATT notification handles that block bleak from establishing a new session. A clean disconnect at startup ensures the app starts with a fresh BLE state.

Multi-architecture Images

Official images are built via GitHub Actions for linux/amd64, linux/arm64, and linux/arm/v7 (covering Raspberry Pi 2/3/4/5). Build dependencies (gcc, python3-dev, libjpeg-dev, zlib1g-dev) are installed so that Pillow and pycryptodome can be compiled from source when wheels are unavailable (notably on arm/v7), then purged to keep the final image small. The GHA layer cache (cache-from / cache-to) speeds up subsequent rebuilds. Images are pushed to both Docker Hub (mawoj/mc-webui) and GitHub Container Registry (ghcr.io/marekwo/mc-webui), with the latest tag built from main and the dev tag built from the dev branch.


DeviceManager Architecture

The DeviceManager handles the connection to the MeshCore device via a direct session:

  • Single persistent session - One long-lived connection utilizing the meshcore library
  • Event-driven - Subscribes to device events (e.g., incoming messages, advert receptions, ACKs) and triggers appropriate handlers
  • Direct Database integration - Seamlessly syncs contacts, messages, and device settings to the SQLite database
  • Real-time messages - Instant message processing via callback events without polling
  • Thread-safe queue - Commands are serialized to prevent device lockups
  • Auto-restart watchdog - Monitors connection health and restarts the session on crash
  • BLE keepalive & reconnect - When using Bluetooth transport, a 60s keepalive loop detects "zombie" connections (reads still succeed but writes silently fail). On disconnect or keepalive failure, the manager marks the session as permanently failed and the /health endpoint returns 503, letting the Docker healthcheck trigger a fast container restart (~5s) to get a clean BLE state rather than attempting unreliable in-process reconnects
  • Echo correlation - Sent channel messages pre-compute their expected pkt_payload using the channel secret and send timestamp (±3s for clock drift), so incoming echoes are matched exactly instead of only by 1-byte channel hash (prevents misattribution when two messages go out simultaneously on the same channel)
  • Per-channel region scope - Before each channel send, the channel's mapped region scope key (16 bytes) is pushed to the firmware via CMD_SET_FLOOD_SCOPE_KEY (54). The scope-set + send pair is serialised under a _send_lock so concurrent sends on different channels can't swap each other's scope. Channels without a mapping get an all-zero key so a previously-set scope doesn't leak across channels
  • Per-send channel-secret refresh - Channel indices on the device compact down after a deletion, so the boot-time _load_channel_secrets() cache can drift. send_channel_message calls _refresh_channel_secret(idx) first (one extra get_channel(idx) round-trip) to fetch the current secret straight from firmware, update the in-memory cache and DB if they had drifted, and use it for the pkt_payload echo correlation
  • Liveness telemetry - Tracks _last_rx_at (bumped on every RX_LOG_DATA event) and _consecutive_stats_failures (incremented on get_stats_* / get_bat exceptions, cleared on success). Surfaced via /health/strict for the external watchdog
  • TCP self-heal - A _liveness_watcher_loop task on the DM event loop calls force_reconnect() when no RX event has arrived for HEALTH_STRICT_MAX_RX_STALE_SEC (5 min). send_channel_message also detects empty-string concurrent.futures.TimeoutError from set_flood_scope_key (the symptom of a degraded long-lived TCP) and runs an in-place reconnect + one retry before failing. A 30 s cooldown and _reconnect_lock prevent churn; _intentional_disconnect keeps the DISCONNECTED handler from racing the reconnect
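The staleness check and cooldown described in the TCP self-heal bullet could look roughly like the sketch below. The class `ReconnectGate` and the constant `RECONNECT_COOLDOWN_SEC` are hypothetical names; the real watcher is an asyncio task guarded by `_reconnect_lock`, which this sketch leaves out so that only the decision logic is shown.

```python
import time
from typing import Optional

HEALTH_STRICT_MAX_RX_STALE_SEC = 300   # 5 min, as stated above
RECONNECT_COOLDOWN_SEC = 30            # 30 s cooldown, as stated above (name assumed)


class ReconnectGate:
    """Decides when a stale link should trigger a reconnect (hypothetical helper)."""

    def __init__(self) -> None:
        self.last_reconnect_at: Optional[float] = None

    def should_reconnect(self, last_rx_at: float, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Stale: no RX event for longer than the configured window.
        stale = (now - last_rx_at) > HEALTH_STRICT_MAX_RX_STALE_SEC
        # Cooling: a reconnect was attempted too recently to try again.
        cooling = (
            self.last_reconnect_at is not None
            and (now - self.last_reconnect_at) < RECONNECT_COOLDOWN_SEC
        )
        if stale and not cooling:
            self.last_reconnect_at = now   # record the attempt to start the cooldown
            return True
        return False
```

Injecting `now` keeps the decision deterministic; a watcher loop would call `should_reconnect(self._last_rx_at)` periodically and run `force_reconnect()` when it returns True.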

Project Structure

mc-webui/
├── Dockerfile                      # Main app Docker image
├── docker-compose.yml              # Single-container orchestration
├── app/
│   ├── __init__.py
│   ├── main.py                     # Flask entry point + Socket.IO handlers
│   ├── config.py                   # Configuration from env vars
│   ├── database.py                 # SQLite database models and CRUD operations
│   ├── device_manager.py           # Core logic for meshcore communication
│   ├── contacts_cache.py           # Persistent contacts cache (DB-backed)
│   ├── read_status.py              # Server-side read status manager (DB-backed)
│   ├── version.py                  # Git-based version management
│   ├── migrate_v1.py               # Migration script from v1 flat files to v2 SQLite
│   ├── meshcore/
│   │   ├── __init__.py
│   │   ├── cli.py                  # Meshcore library wrapper interface
│   │   └── parser.py               # Data parsers
│   ├── archiver/
│   │   └── manager.py              # Archive scheduler and management
│   ├── routes/
│   │   ├── __init__.py
│   │   ├── api.py                  # REST API endpoints
│   │   └── views.py                # HTML views
│   ├── static/                     # Frontend assets (CSS, JS, images, vendors)
│   │   └── js/fab-utils.js         # Floating-button drag/collapse/sizing helpers
│   └── templates/                  # HTML templates
├── docs/                           # Documentation
├── scripts/
│   ├── update.sh                   # Automated update script
│   ├── docker-entrypoint.sh        # Container startup (BLE cleanup)
│   ├── updater/                    # Remote update webhook service
│   └── watchdog/                   # Container health monitor
└── README.md

Database Architecture

mc-webui v2 uses a SQLite database with WAL (Write-Ahead Logging) enabled.

Location: ./data/meshcore/<pubkey_prefix>.db

Key tables:

  • messages - All channel and direct messages (with FTS5 index for full-text search)
  • contacts - Contact list with sync status, types, block/ignore flags, no_auto_flood flag
  • channels - Channel configuration and keys
  • echoes - Sent message tracking and repeater paths, hash_size for path_hash_mode
  • direct_messages - DM messages with delivery tracking (delivery_status, delivery_attempt, delivery_max_attempts, delivery_path)
  • acks - DM delivery status
  • settings - Application settings (migrated from .webui_settings.json)
  • regions - User-curated MeshCore flood scopes (name, key_hex, is_default)
  • channel_scopes - Per-channel region mapping (channel_idx → region_id, CASCADE on region delete; absent row = no override → firmware default applies)
  • read_status - Per-channel read counters and favorites (is_favorite column; used to pin channels in the sidebar/dropdown sort order)
  • analyzers - User-configured MeshCore Analyzer services (name, url_template with {packetHash} placeholder, is_default, is_disabled; partial unique index enforces a single default)
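The single-default rule on the analyzers table can be illustrated with a partial unique index. The schema below is a simplified guess based on the column names listed above; the index name, the helper functions, and the example URLs are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE analyzers (
        id           INTEGER PRIMARY KEY,
        name         TEXT NOT NULL,
        url_template TEXT NOT NULL,
        is_default   INTEGER NOT NULL DEFAULT 0,
        is_disabled  INTEGER NOT NULL DEFAULT 0
    );
    -- Only rows with is_default = 1 are indexed, so at most one may exist.
    CREATE UNIQUE INDEX idx_analyzers_single_default
        ON analyzers (is_default) WHERE is_default = 1;
    """
)


def add_analyzer(name: str, url_template: str, is_default: bool = False) -> int:
    """Insert one analyzer row and return its id."""
    cur = conn.execute(
        "INSERT INTO analyzers (name, url_template, is_default) VALUES (?, ?, ?)",
        (name, url_template, int(is_default)),
    )
    conn.commit()
    return cur.lastrowid


def second_default_rejected() -> bool:
    """Return True if SQLite refuses a second default row."""
    try:
        add_analyzer("second", "https://example.invalid/b/{packetHash}", is_default=True)
    except sqlite3.IntegrityError:
        conn.rollback()
        return True
    return False
```

Because non-default rows are outside the index, any number of them can coexist; only a second row with is_default = 1 violates the constraint.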

direct_messages gained a delivery_path_hash_size column (auto-migrated, defaults to 1) so reloaded DM bubbles render multi-byte routes correctly. The path_len column on channel_messages, direct_messages, and paths now stores the raw firmware byte (masked hop count plus path_hash_mode in the upper bits), recombined at write time via pack_path_len(); the API endpoints decode it back into path_hash_size on read.
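As a rough illustration of the recombined byte, the sketch below packs a hop count and path_hash_mode into one value and decodes it again. The bit layout (hop count in the low 6 bits, mode in the upper 2 bits) and both function signatures are assumptions made for this example; the source only states that the mode lives in the upper bits.

```python
HOP_COUNT_MASK = 0x3F   # assumed: low 6 bits carry the hop count
MODE_SHIFT = 6          # assumed: upper 2 bits carry path_hash_mode


def pack_path_len(hop_count: int, path_hash_mode: int) -> int:
    """Combine hop count and path_hash_mode (0=1B, 1=2B, 2=3B) into one byte."""
    if not 0 <= hop_count <= HOP_COUNT_MASK:
        raise ValueError("hop_count out of range")
    if not 0 <= path_hash_mode <= 2:
        raise ValueError("path_hash_mode must be 0, 1 or 2")
    return (path_hash_mode << MODE_SHIFT) | hop_count


def unpack_path_len(raw: int) -> tuple[int, int]:
    """Return (hop_count, path_hash_size_in_bytes) for a stored path_len byte."""
    hop_count = raw & HOP_COUNT_MASK
    path_hash_size = (raw >> MODE_SHIFT) + 1   # mode 0/1/2 -> 1/2/3 bytes
    return hop_count, path_hash_size
```

Under these assumptions, three hops with 2-byte hashes are stored as 0x43 and decode back to (3, 2).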

The use of SQLite allows for fast queries, reliable data storage, full-text search, and complex filtering (such as contact ignoring/blocking) without the risk of file corruption inherent to flat JSON files.

Retention scheduler

Retention is enabled by default with 90 / 90 / 60 / 30 days for channel_messages / direct_messages / advertisements / diagnostics. The job runs daily at 03:30 local (TZ from .env) and cleanup_old_messages() also deletes from echoes, paths, and acks (the diagnostic tables — historically the bulk of DB size). When at least 1 000 rows are removed in a pass, the scheduler immediately runs VACUUM to reclaim file space (a SQLite DELETE only marks pages free).
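The delete-then-VACUUM rule can be sketched as follows. This is a simplified, single-table illustration: the function name, the `timestamp` column, and the per-table threshold check are assumptions, whereas the real scheduler counts deletions across a whole cleanup pass.

```python
import sqlite3
import time

VACUUM_THRESHOLD = 1000  # rows removed in one pass, as described above


def cleanup_and_maybe_vacuum(conn, table, max_age_days, now=None):
    """Delete rows older than max_age_days; VACUUM if enough rows were removed.

    Returns (deleted_rows, vacuumed).
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    # The table name is interpolated, so it must come from a trusted constant.
    cur = conn.execute(f"DELETE FROM {table} WHERE timestamp < ?", (cutoff,))
    deleted = cur.rowcount
    conn.commit()               # VACUUM cannot run inside an open transaction
    vacuumed = deleted >= VACUUM_THRESHOLD
    if vacuumed:
        conn.execute("VACUUM")  # rewrites the file and releases the freed pages
    return deleted, vacuumed
```

Skipping VACUUM for small passes avoids rewriting the whole database file when only a few pages would be reclaimed.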

The retention/cleanup scheduler runs APScheduler jobs in worker threads, so each job is decorated with @_with_app_context and the Flask app is passed in via set_flask_app(); the init_*_schedule() callers also wrap themselves in app.app_context() so the boot-time read of current_app.db doesn't blow up with "Working outside of application context".
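A decorator of this kind might be written as below. The names `_with_app_context` and `set_flask_app` come from the description above, but their bodies here are a guess; `StubApp` is a stand-in for `flask.Flask` so the sketch runs without Flask installed.

```python
import functools
from contextlib import contextmanager

_flask_app = None  # set once at startup via set_flask_app()


def set_flask_app(app) -> None:
    """Remember the app so worker-thread jobs can enter its context."""
    global _flask_app
    _flask_app = app


def _with_app_context(func):
    """Run a scheduler job inside the stored app's application context."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if _flask_app is None:
            raise RuntimeError("set_flask_app() must be called before jobs run")
        with _flask_app.app_context():
            return func(*args, **kwargs)
    return wrapper


class StubApp:
    """Stand-in exposing app_context() like flask.Flask (illustration only)."""

    def __init__(self) -> None:
        self.context_depth = 0

    @contextmanager
    def app_context(self):
        self.context_depth += 1
        try:
            yield self
        finally:
            self.context_depth -= 1


@_with_app_context
def cleanup_job() -> int:
    # In the real app this is where current_app.db would be read.
    return _flask_app.context_depth
```

With a real Flask app, the same wrapper makes `current_app` resolvable inside APScheduler worker threads, which have no request or application context of their own.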

The archiver builds the .msgs path from device_name, but the meshcore library strips non-ASCII when writing the file (so a device renamed to include an emoji breaks the strict path match). The archiver now falls back to globbing the data directory for a single non-archive .msgs file when the expected path is missing — mirroring migrate_v1.
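The fallback could be sketched like this. The function name and the archive naming pattern (a date infix such as `name.2026-06-08.msgs`) are assumptions for illustration; the source only says that archive files are excluded and that exactly one candidate must remain.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: archive files carry a date infix such as "name.2026-06-08.msgs".
_ARCHIVE_RE = re.compile(r"\.\d{4}-\d{2}-\d{2}\.msgs$")


def find_msgs_file(data_dir: Path, device_name: str) -> Optional[Path]:
    """Return the live .msgs file, falling back to a glob when the name differs."""
    expected = data_dir / f"{device_name}.msgs"
    if expected.exists():
        return expected
    # Fallback: accept the live file only if exactly one candidate exists.
    candidates = [
        p for p in data_dir.glob("*.msgs") if not _ARCHIVE_RE.search(p.name)
    ]
    return candidates[0] if len(candidates) == 1 else None
```

Returning None when several candidates exist avoids archiving the wrong device's file in a shared data directory.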

The channels API reads from the channels DB table rather than iterating device slots. _load_channel_secrets() syncs the table on every device connect (and prunes stale rows), set_channel() / remove_channel() update it synchronously with the device, and _refresh_channel_secret() refreshes individual rows before each send. This makes /api/channels a single sub-millisecond SELECT that is unaffected by device responsiveness, which fixes the original symptom (only "Public" showing up after a refresh while the device briefly stalled).


API Reference

Messages

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/messages | List messages (?archive_date, ?days, ?channel_idx) |
| POST | /api/messages | Send message ({text, channel_idx, reply_to?}) |
| GET | /api/messages/updates | Check for new messages (smart refresh) |
| GET | /api/messages/<id>/meta | Get message metadata (echoes, paths) |
| GET | /api/messages/search | Full-text search (?q=, ?channel_idx=, ?limit=) |

Contacts

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/contacts | List contacts |
| GET | /api/contacts/detailed | Full contact data (includes protection, ignore, block flags) |
| GET | /api/contacts/cached | Get cached contacts (superset of device contacts) |
| POST | /api/contacts/delete | Soft-delete contact ({selector}) |
| POST | /api/contacts/cached/delete | Delete cached contact |
| GET | /api/contacts/protected | List protected public keys |
| POST | /api/contacts/<key>/protect | Toggle contact protection |
| POST | /api/contacts/<key>/ignore | Toggle contact ignore |
| POST | /api/contacts/<key>/block | Toggle contact block |
| GET | /api/contacts/blocked-names | Get blocked names count |
| POST | /api/contacts/block-name | Block a name pattern |
| GET | /api/contacts/blocked-names-list | List blocked name patterns |
| POST | /api/contacts/preview-cleanup | Preview cleanup criteria |
| POST | /api/contacts/cleanup | Remove contacts by filter |
| GET | /api/contacts/cleanup-settings | Get auto-cleanup settings |
| POST | /api/contacts/cleanup-settings | Update auto-cleanup settings |
| GET | /api/contacts/pending | Pending contacts (?types=1&types=2) |
| POST | /api/contacts/pending/approve | Approve pending contact |
| POST | /api/contacts/pending/reject | Reject pending contact |
| POST | /api/contacts/pending/clear | Clear all pending contacts |
| POST | /api/contacts/manual-add | Add contact from URI or params |
| POST | /api/contacts/<key>/push-to-device | Push cached contact to device |
| POST | /api/contacts/<key>/move-to-cache | Move device contact to cache |
| GET | /api/contacts/repeaters | List repeater contacts (for path picker) |
| GET | /api/contacts/<key>/paths | Get contact paths |
| POST | /api/contacts/<key>/paths | Add path to contact |
| PUT | /api/contacts/<key>/paths/<id> | Update path (star, label) |
| DELETE | /api/contacts/<key>/paths/<id> | Delete path |
| POST | /api/contacts/<key>/paths/reorder | Reorder paths |
| POST | /api/contacts/<key>/paths/<id>/apply | Push a configured path to the firmware as the active route (mirrors change_path); invalidates the contacts cache |
| POST | /api/contacts/<key>/paths/reset_flood | Reset to FLOOD routing |
| POST | /api/contacts/<key>/paths/clear | Clear all paths |
| GET | /api/contacts/<key>/no_auto_flood | Get "Keep path" flag |
| PUT | /api/contacts/<key>/no_auto_flood | Set "Keep path" flag |

Channels

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/channels | List all channels |
| POST | /api/channels | Create new channel (idempotent: returns existing slot if name already used) |
| POST | /api/channels/join | Join existing channel (idempotent unless explicit index overrides) |
| DELETE | /api/channels/<index> | Remove channel |
| GET | /api/channels/<index>/qr | QR code (?format=json\|png) |
| GET | /api/channels/muted | Get muted channels |
| POST | /api/channels/<index>/mute | Toggle channel mute |
| GET | /api/channels/scopes | Bulk per-channel region mapping for UI |
| PUT | /api/channels/<index>/scope | Assign/clear region scope ({region_id: int\|null}) |
| GET | /api/channels/favorites | List favorite channel indices |
| POST | /api/channels/<index>/favorite | Set favorite state ({favorite: bool}) |

Regions (MeshCore flood scopes)

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/regions | List the device's region registry |
| POST | /api/regions | Create region ({name}); key derived as SHA256('#'+name)[:16] |
| DELETE | /api/regions/<id> | Delete region; CASCADE clears channel mappings; if it was the firmware default, clears it on device |
| POST | /api/regions/<id>/default | Mark default in DB AND push to firmware (CMD_SET_DEFAULT_FLOOD_SCOPE = 63, requires firmware v1.15+) |
| DELETE | /api/regions/default | Clear default region in DB and on firmware |

The PUT /api/channels/<index>/scope endpoint accepts any index in [0, device_manager._max_channels) (40 on current firmwares; falls back to 8 if the DM is unreachable).
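The key derivation listed for POST /api/regions, SHA256('#'+name)[:16], can be reproduced in a few lines. The function name is hypothetical and UTF-8 encoding of the name is an assumption; the all-zero constant mirrors the key pushed for channels without a mapping.

```python
import hashlib

ZERO_SCOPE_KEY = bytes(16)  # all-zero key used for channels without a mapping


def derive_region_key(name: str) -> bytes:
    """First 16 bytes of SHA-256 over '#' + name (UTF-8 encoding is assumed)."""
    return hashlib.sha256(("#" + name).encode("utf-8")).digest()[:16]
```

Calling `derive_region_key("pl").hex()` yields a 32-character hex string in the shape the regions table's key_hex column suggests.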

Analyzers

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/analyzers | List configured analyzer services |
| POST | /api/analyzers | Create analyzer ({name, url_template}); template must contain {packetHash} |
| PUT | /api/analyzers/<id> | Update analyzer (name / url / is_disabled) |
| DELETE | /api/analyzers/<id> | Delete analyzer |
| POST | /api/analyzers/<id>/default | Mark as default (enforced single-default via partial unique index) |
| DELETE | /api/analyzers/default | Clear the default analyzer |

The backend no longer ships a pre-built analyzer_url per message — channel-message payloads include packet_hash instead, and the frontend substitutes {packetHash} in the chosen URL template at click time.
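The template rule and the click-time substitution can be sketched together. In mc-webui the substitution happens in the frontend JavaScript; the Python below only mirrors the logic, and both function names are hypothetical.

```python
PLACEHOLDER = "{packetHash}"


def validate_url_template(url_template: str) -> None:
    """Reject templates that lack the {packetHash} placeholder."""
    if PLACEHOLDER not in url_template:
        raise ValueError(f"url_template must contain {PLACEHOLDER}")


def build_analyzer_url(url_template: str, packet_hash: str) -> str:
    """Substitute the packet hash into a validated template."""
    validate_url_template(url_template)
    # str.replace is used instead of str.format, which would fail on any
    # other literal braces that happen to appear in the URL.
    return url_template.replace(PLACEHOLDER, packet_hash)
```

Keeping only packet_hash in the message payload means a changed or newly added analyzer applies to old messages too, since the URL is assembled on demand.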

Direct Messages

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/dm/conversations | List DM conversations |
| GET | /api/dm/messages | Get messages (?conversation_id=, ?limit=) |
| POST | /api/dm/messages | Send DM ({recipient, text}) |
| GET | /api/dm/updates | Check for new DMs |
| GET | /api/dm/auto_retry | Get DM retry configuration |
| POST | /api/dm/auto_retry | Update DM retry configuration |

Device & Settings

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/status | Connection status (device name, transport type, serial port / BLE address) |
| GET | /api/device/info | Device information |
| GET | /api/device/stats | Device statistics |
| GET | /api/device/settings | Get device settings |
| POST | /api/device/settings | Update device settings |
| GET | /api/device/config | Get device configuration (name, coords, advert_loc_policy, path_hash_mode, radio params, tx_power) |
| POST | /api/device/config | Update device configuration from Settings > Device tab. Subset of fields incl. path_hash_mode (0=1B, 1=2B, 2=3B) |
| POST | /api/device/command | Execute command (advert, floodadv) |
| GET | /api/device/commands | List available special commands |
| GET | /api/chat/settings | Get chat settings (quote length, route popup timeout/no-autoclose) |
| POST | /api/chat/settings | Update chat settings |
| GET | /api/ui/settings | Get UI settings (toast timeout, no-autoclose, position) |
| POST | /api/ui/settings | Update UI settings |
| GET | /api/retention-settings | Get message retention settings |
| POST | /api/retention-settings | Update retention settings |

Archives & Backup

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/archives | List archives |
| POST | /api/archive/trigger | Manual archive |
| GET | /api/backup/list | List database backups |
| POST | /api/backup/create | Create database backup |
| GET | /api/backup/download | Download backup file |
| GET | /api/db/size | Current DB file size (bytes) |
| POST | /api/db/vacuum | Kick off SQLite VACUUM in a worker thread. Returns 202 immediately; 409 if already running. Kickoff is deliberately separate from status polling so that reverse proxies with ~30 s idle timeouts can't kill the VACUUM mid-rewrite |
| GET | /api/db/vacuum/status | Poll vacuum progress: {running, elapsed_seconds, size_before, size_after} |
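The kickoff/poll split can be sketched without Flask as a small job object whose `start()` and `status()` methods map onto the two endpoints. The class name and internal fields are hypothetical; only the 202/409 codes and the status keys come from the table above.

```python
import os
import sqlite3
import threading
import time


class VacuumJob:
    """Runs VACUUM in a worker thread and exposes pollable state."""

    def __init__(self, db_path: str) -> None:
        self.db_path = db_path
        self._lock = threading.Lock()
        self._thread = None
        self._started_at = 0.0
        self._state = {"running": False, "elapsed_seconds": 0.0,
                       "size_before": None, "size_after": None}

    def start(self) -> int:
        """Return 202 if a run was started, 409 if one is already in progress."""
        with self._lock:
            if self._state["running"]:
                return 409
            self._started_at = time.monotonic()
            self._state = {"running": True, "elapsed_seconds": 0.0,
                           "size_before": os.path.getsize(self.db_path),
                           "size_after": None}
            self._thread = threading.Thread(target=self._run, daemon=True)
            self._thread.start()
            return 202

    def _run(self) -> None:
        conn = sqlite3.connect(self.db_path)  # own connection for this thread
        try:
            conn.execute("VACUUM")
        finally:
            conn.close()
            with self._lock:
                self._state["elapsed_seconds"] = time.monotonic() - self._started_at
                self._state["size_after"] = os.path.getsize(self.db_path)
                self._state["running"] = False

    def status(self) -> dict:
        """Snapshot of the current state, with live elapsed time while running."""
        with self._lock:
            state = dict(self._state)
            if state["running"]:
                state["elapsed_seconds"] = time.monotonic() - self._started_at
            return state
```

Because `start()` returns as soon as the thread is launched, the HTTP request finishes well inside any proxy idle timeout, and the client learns the outcome by polling `status()`.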

Health endpoints

These are top-level routes (not under /api/), consumed by Docker's healthcheck and the host-level watchdog.

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /health | Lenient liveness check. Returns 503 only when BLE reconnection has permanently failed (so Docker triggers a container restart to clear BLE state). Returns 200 otherwise |
| GET | /health/strict | Strict device-health check for the external watchdog. JSON response. Returns 503 when (a) BLE permanently failed, (b) _consecutive_stats_failures ≥ 5, or (c) transport is serial/usb/tcp and no RX event for > HEALTH_STRICT_MAX_RX_STALE_SEC (5 min). Returns 200 with the same counters when healthy |
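The three /health/strict conditions can be expressed as one pure function. This is a sketch under assumptions: the function name, the response field names, and the choice to treat a missing `last_rx_at` as not stale are all hypothetical.

```python
import time

HEALTH_STRICT_MAX_RX_STALE_SEC = 300          # 5 min, as stated above
MAX_CONSECUTIVE_STATS_FAILURES = 5            # threshold from condition (b)
RX_CHECKED_TRANSPORTS = {"serial", "usb", "tcp"}


def strict_health(transport, ble_failed, consecutive_stats_failures,
                  last_rx_at, now=None):
    """Return (http_status, body) following the three strict conditions."""
    now = time.time() if now is None else now
    rx_age = None if last_rx_at is None else now - last_rx_at
    reasons = []
    if ble_failed:                                              # condition (a)
        reasons.append("ble_permanently_failed")
    if consecutive_stats_failures >= MAX_CONSECUTIVE_STATS_FAILURES:  # (b)
        reasons.append("stats_failures")
    if (transport in RX_CHECKED_TRANSPORTS and rx_age is not None
            and rx_age > HEALTH_STRICT_MAX_RX_STALE_SEC):       # condition (c)
        reasons.append("rx_stale")
    body = {
        "healthy": not reasons,
        "reasons": reasons,
        "transport": transport,
        "consecutive_stats_failures": consecutive_stats_failures,
        "rx_age_seconds": rx_age,
    }
    return (503 if reasons else 200), body
```

Returning the counters in both the 200 and 503 cases lets the watchdog log the trend before a threshold is actually crossed.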

Other

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | /api/read_status | Get server-side read status |
| POST | /api/read_status/mark_read | Mark messages as read |
| POST | /api/read_status/mark_all_read | Mark all messages as read |
| GET | /api/version | Get app version |
| GET | /api/check-update | Check for available updates |
| GET | /api/updater/status | Get updater service status |
| POST | /api/updater/trigger | Trigger remote update |
| GET | /api/advertisements | Get recent advertisements |
| GET | /api/console/history | Get console command history |
| POST | /api/console/history | Save console command |
| DELETE | /api/console/history | Clear console history |
| GET | /api/console/output | Get persisted console output transcript (capped at 500 entries) |
| POST | /api/console/output | Append entry to transcript |
| DELETE | /api/console/output | Clear transcript |
| GET | /api/logs | Get application logs |

WebSocket API

All Socket.IO clients (/chat, /console, /logs) are configured with transports: ['polling']. The Werkzeug dev server can't upgrade WebSockets, so every io() upgrade attempt previously returned HTTP 500 and clients fell into a polling/upgrade reconnect loop, visible as 10-15 s freezes on app load. Long-polling keeps real-time pushes working with ~1-2 s latency.

Console Namespace (/console)

Interactive console via Socket.IO (long-polling transport, as noted above).

Client → Server:

  • send_command - Execute command ({command: "infos"})

Server → Client:

  • console_status - Connection status
  • command_response - Command result ({success, command, output})

Chat Namespace (/chat)

Real-time message delivery via Socket.IO.

Server → Client:

  • new_channel_message - New channel message received
  • new_dm_message - New DM received
  • message_echo - Echo/ACK update for sent message (includes hash_size)
  • dm_ack - DM delivery confirmation
  • dm_retry_status - Real-time retry progress (dm_id, attempt, max_attempts)
  • dm_retry_failed - All retry attempts exhausted (dm_id)
  • dm_delivered_info - Delivery details after ACK (dm_id, attempt, max_attempts, path, hash_size)
  • path_changed - Contact path discovered/updated (public_key)

Logs Namespace (/logs)

Real-time log streaming via Socket.IO.

Server → Client:

  • log_line - New log line

Offline Support

The application works completely offline, without an internet connection. Vendor libraries (Bootstrap, Bootstrap Icons, Socket.IO, Emoji Picker) are bundled locally. A Service Worker provides hybrid caching to ensure functionality without connectivity.