Long-lived TCP against the meshcore-proxy can degrade in a way the socket
can't see: some commands (set_flood_scope_key with all-zero key) start
timing out while RX events and other commands keep working. The 5 s
execute() timeout fires with concurrent.futures.TimeoutError() — whose
str() is empty — so the UI showed "Could not set region scope (none):"
with no error text, and only channels with a mapped region could send
because their non-zero scope_key happened to keep working.
Two recovery paths:
- send_channel_message now detects the timeout case (set_flood_scope_key
surfaces timed_out=True) and runs force_reconnect() + one retry before
failing. The user sees a brief delay instead of a cryptic error and
having to restart the container.
- A new _liveness_watcher_loop task runs on the DM event loop and forces
a reconnect when no RX event has arrived for HEALTH_STRICT_MAX_RX_STALE_SEC
(5 min). /health/strict now also reports rx_stale for TCP (previously
serial/USB only), so an external watchdog could act on it too.
force_reconnect() runs on the DM loop via run_coroutine_threadsafe with
a 20 s cap, a 30 s cooldown to avoid churn under fire, and a
_reconnect_lock to prevent concurrent attempts. mc.disconnect() fires
DISCONNECTED — _intentional_disconnect tells _on_disconnected to skip
its own reconnect loop so the two don't race.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The container watchdog only restarted on three legacy "device clearly dead"
log lines, so today's failure mode (firmware briefly stalls and get_stats_*
/ get_battery commands time out with an empty error while passive RX
keeps working) never tripped it — leaving the user with 10-15 s freezes
several times a day and no automatic recovery.
DeviceManager now tracks two liveness signals:
- _last_rx_at, bumped on every RX_LOG_DATA event
- _consecutive_stats_failures, incremented on get_stats_* / get_bat
exceptions and cleared on success
New /health/strict endpoint exposes these to the watchdog. It returns 503
when the device is connected but has 5+ consecutive stats failures, or
when no RX event has been seen for over 5 minutes on a serial transport.
The cheap /health endpoint keeps its lenient behavior so Docker's
healthcheck doesn't suddenly start tripping.
The watchdog's check_device_unresponsive() gains a "soft" pattern class
with a count threshold of 5 in the last 2 minutes — matching against
get_stats_core/radio/packets failed:, Failed to get battery:, and
Failed to get channel. Hard patterns still trigger on a single hit.
Deploy note: the watchdog runs as a host-level systemd service and is
NOT restarted by mcupdate, so after deploy run:
sudo systemctl restart mc-webui-watchdog.service
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
BLE connections can enter a "zombie" state where notifications (reads) still
arrive but writes silently fail. This went undetected until the user tried
to send a message, at which point the connection was already dead.
Additionally, after an abnormal BLE disconnect, BlueZ retains stale GATT
notification handles, causing reconnection to fail with
"[org.bluez.Error.NotPermitted] Notify acquired".
Changes:
- Add BLE keepalive loop (60s interval) that sends get_bat() to detect
zombie connections proactively and trigger reconnection automatically
- Add adapter power-cycle (hci0 off/on via D-Bus) during BLE reconnection
to clear stale GATT notification state
- Dedicated _ble_reconnect() with 5 attempts + adapter reset between each
- Health endpoint returns 503 when BLE permanently fails, triggering
Docker container restart via healthcheck
- Guard against concurrent reconnection attempts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Stage 2 of manual contact add feature:
- POST /api/contacts/manual-add endpoint (URI or raw params)
- New /contacts/add page with 3 input tabs (URI, QR code, Manual)
- QR scanning via html5-qrcode (camera + image upload fallback)
- Client-side URI parsing with preview before submission
- Nav card in Contact Management above Pending Contacts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
In-memory ring buffer (2000 entries) captures all Python log records.
New /logs page streams entries via WebSocket in real-time with:
- Level filter (DEBUG/INFO/WARNING/ERROR)
- Module filter (auto-populated from seen loggers)
- Text search with highlighting
- Auto-scroll with pause/resume
- Dark theme matching Console style
Menu entry added under Configuration section.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Bridge now detects device name from meshcli prompt ("DeviceName|*")
and exposes it via /health endpoint. mc-webui fetches this at startup
and uses RuntimeConfig for dynamic device name throughout the app.
Fallback chain: prompt detection → .infos command → MC_DEVICE_NAME env var
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Browser blocks mixed content (HTTPS page -> HTTP WebSocket on port 5001).
Solution: Route WebSocket through main Flask app which goes through
existing HTTPS reverse proxy.
- Add Flask-SocketIO to main mc-webui app
- WebSocket handler proxies commands to bridge via HTTP
- Remove port 5001 external exposure (no longer needed)
- Remove duplicate title from console header
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add Flask-SocketIO backend with gevent for real-time communication
- Create chat-style console UI showing only user's command responses
- WebSocket commands tracked separately from HTTP API (ws_ prefix)
- Console accessible from main menu as fullscreen modal
- Command history navigation with arrow keys
- Auto-reconnection on disconnect
- Update service worker cache for offline support
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Replace DM modal with full-page view at /dm route for better mobile UX
- Add workaround for meshcore-cli bug where SENT_MSG contains sender's
name instead of recipient - now saving sent DMs to separate log file
- Fix DM button styling to match Reply button (btn-outline-secondary)
- Add dm.js for DM page functionality
- Add dm.html template with green navbar for visual distinction
- Update menu link to navigate to /dm instead of opening modal
- Remove unused DM modal functions from app.js
- Update documentation with new DM workflow
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implemented core backend functionality:
- Flask application structure with blueprints
- Configuration module loading from environment variables
- MeshCore CLI wrapper with subprocess execution and timeout handling
- Message parser for .msgs JSON Lines file format
- REST API endpoints (messages, status, sync, contacts cleanup)
- HTML views with Bootstrap 5 responsive design
- Frontend JavaScript with auto-refresh and live updates
- Custom CSS styling for chat interface
API Endpoints:
- GET /api/messages - List messages with pagination
- POST /api/messages - Send message with optional reply-to
- GET /api/status - Device connection status
- POST /api/sync - Trigger message sync
- POST /api/contacts/cleanup - Remove inactive contacts
- GET /api/device/info - Device information
Features:
- Auto-refresh every 60s (configurable)
- Reply to messages with @[UserName] format
- Toast notifications for feedback
- Settings modal for contact management
- Responsive design (mobile-friendly)
- Message bubbles with sender, timestamp, SNR, hop count
Ready for testing on production server (192.168.131.80:5000)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>