mc-webui

mirror of https://github.com/MarekWo/mc-webui.git synced 2026-06-11 01:04:56 +02:00

Author	SHA1	Message	Date
MarekWo	201bc137e5	fix(channels): also pass path_hash_size on the echo-driven raw_packet rebuild Earlier path_hash_mode fix updated the send-time build but the matching edit to _refresh_raw_packet_if_drifted didn't make it into commit `10df846`. For channels where the secret isn't available at send time, guess_pkt_payload stays None and raw_packet is created for the first time in this fallback path (triggered when echo correlation matches via the channel-hash branch). Without the path_hash_size argument the build defaulted to 1-byte hashes, producing the same mixed-size badge the prior fix was meant to eliminate. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-09 14:54:42 +02:00
MarekWo	10df8464b7	fix(channels): honor device path_hash_mode when building raw_packet Resends were building raw_packet with the default 1-byte path-hash size, ignoring the device's actual path_hash_mode. When path_hash_mode=1 (2-byte hashes) the original send produced 2-byte path entries in repeater echoes, but the resend's path_len byte said "1-byte" — so post-resend echoes appended 1-byte hashes, mixing into the badge as inconsistent tokens (e.g. "44D8, D103, E7" — the trailing E7 was a 1-byte fragment). Cache path_hash_mode from DEVICE_INFO at connect (fw_ver_code >= 10) and expose path_hash_size = max(1, mode+1). Pass it through to _build_grp_txt_raw_packet in send_channel_message and the clock-drift refresh path. Keep cache in sync with set_param('path_hash_mode', N). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-09 14:40:21 +02:00
MarekWo	d23e865f35	feat(channels): merge post-resend echoes into existing repeater badge PR #4 of 5. After a successful resend, re-arm _pending_echo with the original msg_id and known pkt_payload so echoes from previously-unreached repeaters that pick up the rebroadcast are classified as 'sent' and carry msg_id in the SocketIO emit. The frontend echo handler now collects forced msg_ids and passes them to refreshMessagesMeta(forceIds), which bypasses the "already has route info, skip" guard for those ids. End result: clicking resend extends the repeater list on the existing message's badge in place — no duplicate row, no stale count. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-09 14:32:44 +02:00
MarekWo	4729055900	feat(channels): firmware version gate for raw resend (requires ≥1.16) CMD_SEND_RAW_PACKET (0x41) was introduced in companion-v1.16.0 (FIRMWARE_VER_CODE bump 11 → 13). Older firmware returns ERR_CODE_UNSUPPORTED_CMD with no useful context for the user. Capture fw_ver_code from the DEVICE_INFO event at connect (re-using the existing send_device_query call) and expose a supports_raw_resend property. The resend endpoint now refuses early with a clear message ("Firmware too old for raw resend, need ≥1.16, device reports fw_ver_code=N") and /api/status surfaces both fw_ver_code and the supports_raw_resend flag so the UI can hide or disable the button on older firmware. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-09 13:03:48 +02:00
MarekWo	9c48518771	fix(channels): surface meshcore lib's error_code/code_string on resend failure The lib's reader.py wraps device ERROR frames as {error_code, code_string}, not {reason, error}. The previous extraction collapsed every device error to "unknown error", hiding the actual ERR_CODE_* the firmware sent back. Check code_string/reason/error in order, then fall back to a raw error_code, then "unknown error". Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-09 12:47:44 +02:00
MarekWo	67c59cc341	feat(channels): backend resend endpoint via CMD_SEND_RAW_PACKET PR #3 of 5. Adds POST /api/messages/<msg_id>/resend, which re-broadcasts an own channel message verbatim using the raw_packet bytes captured at send time. Pushes the wire bytes directly through companion command 0x41 (CMD_SEND_RAW_PACKET), bypassing the higher-level send paths so repeaters dedupe by packet hash via Mesh::hasSeen — only previously-unreached nodes will pick up the resend. Returns 404 for unknown msg_id, 400 for not-own / missing snapshot / disconnected device, 500 for unexpected device errors. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-09 12:39:34 +02:00
MarekWo	fa0b1c9109	feat(channels): capture raw_packet at send time for raw resend PR #2 of 5. Builds the full GRP_TXT wire bytes (header + transport_codes if scoped + path_len + encrypted payload) from the ts+0 pkt_payload guess and stores it in channel_messages.raw_packet right after the send. When echo correlation later identifies the actual pkt_payload (potentially using a different ±dt candidate due to host/firmware clock drift), the raw_packet is rebuilt from the actual one so a future resend matches the original packet hash and dedupes at the repeaters. Transport-scope codes are computed in Python via HMAC-SHA256(scope_key, payload_type||payload)[:2], mirroring TransportKey::calcTransportCode in MeshCore Core (including the 0x0000/0xFFFF reservations). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-09 12:27:17 +02:00
MarekWo	fef6845c03	fix(connection): self-heal degraded long-lived TCP via in-place reconnect Long-lived TCP against the meshcore-proxy can degrade in a way the socket can't see: some commands (set_flood_scope_key with all-zero key) start timing out while RX events and other commands keep working. The 5 s execute() timeout fires with concurrent.futures.TimeoutError() — whose str() is empty — so the UI showed "Could not set region scope (none):" with no error text, and only channels with a mapped region could send because their non-zero scope_key happened to keep working. Two recovery paths: - send_channel_message now detects the timeout case (set_flood_scope_key surfaces timed_out=True) and runs force_reconnect() + one retry before failing. The user sees a brief delay instead of a cryptic error and having to restart the container. - A new _liveness_watcher_loop task runs on the DM event loop and forces a reconnect when no RX event has arrived for HEALTH_STRICT_MAX_RX_STALE_SEC (5 min). /health/strict now also reports rx_stale for TCP (previously serial/USB only), so an external watchdog could act on it too. force_reconnect() runs on the DM loop via run_coroutine_threadsafe with a 20 s cap, a 30 s cooldown to avoid churn under fire, and a _reconnect_lock to prevent concurrent attempts. mc.disconnect() fires DISCONNECTED — _intentional_disconnect tells _on_disconnected to skip its own reconnect loop so the two don't race. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-07 21:10:03 +02:00
MarekWo	13a650bb6c	fix(channels): read channels from DB instead of iterating device slots The TimeoutError-based fallback added in `1d47c9c` only fires when mc.commands.get_channel() actually raises — but on a sluggish device the call returns an empty/falsy event without raising, so the loop walks all dm._max_channels slots (40 on the firmware in production), each empty result returns None, and the API yields just Public (or whatever slot 0 happened to succeed on). The DB fallback never triggered and the user kept seeing just Public after refresh. The channels table in the DB is already the authoritative cache: - _load_channel_secrets() syncs it on every device connect and prunes stale rows, - set_channel()/remove_channel() update it synchronously with the device, - _refresh_channel_secret() refreshes individual rows on per-send refresh. Drop the device-slot iteration in cli.get_channels() and read from the DB. /api/channels response time becomes a single SELECT (<1 ms) and is unaffected by device responsiveness — exactly what we wanted from the fallback in the first place. Also revert the TimeoutError re-raise in get_channel_info(): the console `channels` and `add_channel` commands iterate slots and would crash on the first slow one. Logging + None on failure is the right behavior for slot iteration. The 3 s default timeout stays since it still keeps individual slot probes cheap. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-07 12:21:34 +02:00
MarekWo	422e7a3b34	feat(watchdog): catch sluggish-device failures via soft-pattern counting The container watchdog only restarted on three legacy "device clearly dead" log lines, so today's failure mode (firmware briefly stalls and get_stats_* / get_battery commands time out with an empty error while passive RX keeps working) never tripped it — leaving the user with 10-15 s freezes several times a day and no automatic recovery. DeviceManager now tracks two liveness signals: - _last_rx_at, bumped on every RX_LOG_DATA event - _consecutive_stats_failures, incremented on get_stats_* / get_bat exceptions and cleared on success New /health/strict endpoint exposes these to the watchdog. It returns 503 when the device is connected but has 5+ consecutive stats failures, or when no RX event has been seen for over 5 minutes on a serial transport. The cheap /health endpoint keeps its lenient behavior so Docker's healthcheck doesn't suddenly start tripping. The watchdog's check_device_unresponsive() gains a "soft" pattern class with a count threshold of 5 in the last 2 minutes — matching against get_stats_core/radio/packets failed:, Failed to get battery:, and Failed to get channel. Hard patterns still trigger on a single hit. Deploy note: the watchdog runs as a host-level systemd service and is NOT restarted by mcupdate, so after deploy run: sudo systemctl restart mc-webui-watchdog.service Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-07 09:43:43 +02:00
MarekWo	1d47c9c0e8	fix(perf): polling-only Socket.IO + channels DB fallback on USB timeout Werkzeug dev server can't upgrade WebSockets, so every io() upgrade attempt returned HTTP 500 and clients fell into a polling/upgrade reconnect loop — visible as 10-15s freezes on app load. Force transports: ['polling'] on /chat, /console and /logs clients; long-poll keeps real-time pushes working with ~1-2s latency. When the MeshCore device briefly stalls, get_channel_info() used to block on the default 30s timeout per slot, so iterating max_channels slots could take minutes; in practice only Public answered and the rest timed out, leaving the UI with just one channel. Drop per-call timeout to 3s, raise TimeoutError to the caller, and have cli.get_channels() break on first timeout and merge the remaining slots from the channels table in the DB (which already mirrors device state via upsert_channel). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-07 07:31:47 +02:00
MarekWo	10792b8566	feat(analyzer): add configurable analyzer services in Settings Add a Settings > Analyzer tab letting users CRUD custom MeshCore Analyzer services with a star-toggle default and inline disabled switch. The chart icon under each group-chat message now resolves at click time: built-in Letsmesh when no enabled customs, the default when set, or a chooser modal otherwise. Backend stops shipping the prebuilt analyzer_url and emits packet_hash instead — the frontend substitutes {packetHash} in the chosen URL template. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-05 15:34:45 +02:00
MarekWo	843d59a2d6	fix(channels): refresh per-send channel secret to keep echo correlation working Channel indices on the device can shift after the user deletes a channel — subsequent slots compact down by one — but mc-webui only ran _load_channel_secrets() once at startup, so the in-memory cache mapped channel_idx to whichever secret was there at boot. Once the indices moved, expected_payloads for sent channel messages were encrypted with the wrong key, so legitimate repeater echoes always fell into the 'doesn't match expected candidates' branch and never got linked to the originating send. send_channel_message now calls _refresh_channel_secret(idx) before building the candidate list: one extra get_channel(idx) round-trip that fetches the current secret straight from the firmware, updates the in-memory cache + DB if they had drifted, and is used for the pkt_payload computation. If the slot is empty, the stale cache entry is dropped. Also bump the set_param timeout for path_hash_mode and custom_var to 20s — the meshcore lib has a 15s internal timeout, so the previous 5s outer wrapper raised a bare concurrent.futures.TimeoutError with empty str(e) before the device's ERROR event could surface. The exception handler now logs the exception type as well so future empty-string errors are still diagnosable, and stores the event.payload (not the never-defined event.data) when capturing the sent message's pkt_payload field.	2026-06-05 10:42:07 +02:00
MarekWo	c39037214c	fix(messages): persist raw path_len byte so incoming path_hash_size is correct meshcore lib 2.x splits the wire path_len byte into payload['path_len'] (masked hop count) + payload['path_hash_mode'] (hash-size mode). We were storing only the masked half in channel_messages / direct_messages / paths, so the downstream decode_path_len() in the API endpoints always returned hash_size=1 — fine for the Hops counter but wrong for any UI that renders the incoming hex path (e.g. echo-fallback rendering). Added pack_path_len() that recombines the two fields back into the firmware byte and routed all three insertion sites through it. The channel-message socket emit now uses the recombined byte too, so realtime path_hash_size matches the value the API will return on reload. No schema migration needed — the column still holds an INTEGER. Old rows continue to decode as hash_size=1 (their original behavior); only newly received messages benefit from the fix.	2026-06-05 09:54:56 +02:00
MarekWo	bcaa550809	fix(dm): persist delivery_path_hash_size so reloaded bubbles render multi-byte routes Live dm_delivered_info already carried the correct hash_size, but the DB row only kept delivery_path. After a reload the API filled in path_hash_size from the incoming path_len column (NULL for outgoing DMs → default 1), so 2-byte routes were re-rendered as single-byte hops. Added a delivery_path_hash_size column (auto-migrated, defaults to 1) that update_dm_delivery_info now stores alongside the delivery path, populated from the same hash_size already known by each delivery path (retry ctx, PATH event, delayed contact backfill). /api/dm/messages returns the new field; dm.js prefers it over path_hash_size when rendering the Route line, falling back to the old field for legacy rows.	2026-06-05 09:22:51 +02:00
MarekWo	4effa47fe1	fix(ui): multi-byte path rendering across contact list, DM modal, retry Same root cause as the previous console fix: meshcore lib 2.x stores out_path_len as the masked hop count and out_path_hash_mode separately. Several UI surfaces and the DM retry logic were still decoding the hash-size mode from the upper bits of out_path_len, which always yields 1 for in-memory contact data and silently truncates multi-byte paths. Fixed sites: - /api/contacts/detailed: path_or_mode and outgoing payload now use out_path_hash_mode; the field is included in /api/contacts too. - dm.js: Contact Info modal computes hashSize for the import button from out_path_hash_mode. - console "contacts" command: same correction as "path". - device_manager._paths_match / _extract_path_hex: accept hash mode as a parameter; callers (_dm_retry_task, _delayed_path_backfill, Phase 2 rotation dedup) pass contact.out_path_hash_mode. - PATH event handlers: derive hash_size from path_hash_mode instead of decoding it from an already-masked path_len.	2026-06-05 08:54:29 +02:00
MarekWo	fecf8cdccb	fix(console): multi-byte hops in change_path parser and path display The console treated 2-/3-byte hops as 1-byte: - change_path "<name>" d103 5e34 (space-separated) was joined into continuous hex with hash_size=1, producing four 1-byte hops instead of two 2-byte ones. - path <name> always rendered 1-byte hops because it decoded the hash-size mode from upper bits of out_path_len. In meshcore 2.x the library already masks out_path_len to the hop count and exposes the mode separately in out_path_hash_mode. Parser now splits on commas, whitespace, or arrow separators and requires consistent hop length. Display reads out_path_hash_mode and also shows the byte size, e.g. "D103,5E34 (2 hops, 2B)".	2026-06-05 08:29:33 +02:00
MarekWo	3ef1eac0be	feat(console): rename to mc-webui, fix change_path, persist transcript - Rename "meshcli Console" to "mc-webui Console" (modal title + docs). - Drop redundant "Connected to..." messages; replace intro with a one-line "Type 'help' for available commands." hint. - Use a teal device-name style so the header label is readable on the dark background. - Display contact paths with commas (D1,90,05,54) instead of arrows in `contacts` and `path`, matching the standard MeshCore client. - Fix `change_path`: previously read only args[2] after shlex split, silently writing a 1-byte path. Now joins remaining args, accepts comma/space/continuous-hex, validates hex, auto-deduces hash_size from comma-chunk length (1/2/3-byte hops), and routes through _change_path_async so path_hash_mode is set and the contacts cache is invalidated. - Update `help` line and add a usage hint for the no-args form. - Add capped persistent output transcript: GET/POST/DELETE /api/console/output (cap 500 entries). Console restores prior entries (faded) above a divider on open and exposes a trash button to clear it. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-05 22:18:46 +02:00
MarekWo	e293de2a76	fix(regions): rename tab to Regions and soften v1.14 firmware error Two small follow-ups after initial deployment. - Rename the Settings tab 'Channels' -> 'Regions' (id now tabSettingsRegions). The tab manages the region registry, not channels; the old label was confusing. The per-channel picker still lives under Manage Channels as before. - Graceful handling of firmware rejection: CMD_SET_DEFAULT_FLOOD_SCOPE (63) and CMD_GET_DEFAULT_FLOOD_SCOPE (64) were introduced in firmware v1.15.0; on v1.14.x the device replies with a generic ERR frame and our toast showed the unhelpful 'Firmware error: unknown'. Now the device_manager translates the empty/timeout reason into a concrete message naming the v1.15 requirement, and the api handler appends 'Your choice is saved locally' so the user knows the local state still persists. Same treatment for the delete-default-region clear path.	2026-04-24 11:54:01 +02:00
MarekWo	afe0c7cf17	feat(regions): per-channel scope picker + send-flow integration Fourth slice: the feature is now functional end-to-end from UI to radio. - Manage Channels modal: each row now has a pin-map button between Mute and Share that opens a region picker for that channel; rows show an inline badge with the assigned region name. - Region picker modal (new #regionPickerModal): radio list of regions with a "(None) — use firmware default" option at the top. Empty-state shows a "Manage Regions" CTA that deep-links to Settings > Channels. - api.py: two new routes: - GET /api/channels/scopes → bulk map for UI rendering - PUT /api/channels/<idx>/scope → {region_id: int | null} set/clear - device_manager.send_channel_message: looks up the channel's scope, then, under _send_lock, pushes the 16-byte key via CMD 54 before the actual send_chan_msg. Channels without a mapping get an all-zero key so a previously-set scope doesn't leak across channels (firmware's send_scope is sticky until overwritten, not one-shot).	2026-04-24 07:27:33 +02:00
MarekWo	0e38e0ce8c	feat(regions): DeviceManager wrappers for flood-scope commands Second slice of the per-channel region-scope feature — firmware plumbing. No UI, routes, or send-flow integration yet; those land in PR #3 / #4. - _send_lock: threading.Lock added to __init__ (consumed in PR #4 to serialize the set-scope + send-channel-message pair across Flask threads; introduced here to keep the init diff small). - set_flood_scope_key(key_hex): thin wrapper over the existing meshcore-py `set_flood_scope(bytes)` path (CMD 54). None/empty clears the volatile scope. Used on the channel-send hot path in PR #4. - set_default_flood_scope(name, key_hex): hand-rolled CMD 63 frame (opcode + 31-byte NUL-padded name + 16-byte key = 48 bytes) via the lib's generic send() with [OK, ERROR] wait. Installed meshcore-py (<=2.2.15) has no wrapper for this opcode; frame format matches MyMesh.cpp lines 1893-1909. - Deliberately NOT implementing CMD 64 (GET_DEFAULT_FLOOD_SCOPE): the library's reader drops RESP_CODE 28 as "unhandled" (reader.py:919-921), so there is no Event we can wait for. Until upstream adds support, mc-webui treats its own regions.is_default row as the source of truth and pushes one-way via CMD 63. Comment in code documents the reason.	2026-04-24 07:20:30 +02:00
MarekWo	57a0ca018d	fix: treat slots with empty name as empty regardless of secret bytes Some firmwares return SHA256("")[:16] (e3b0c442...) for an empty channel slot's secret instead of all zeros. The load path checked only for the all-zero sentinel, so those slots passed the "valid" branch and got persisted to the DB with a synthetic 'Channel N' name plus the bogus secret. The stale rows then leaked into db.get_channels() and would have supplied wrong keys for pkt_payload computation. Anchor the decision on name presence: a slot is used iff firmware returned a non-empty name. Drop the 'Channel {idx}' fallback so we never invent names for empty slots. The existing end-of-loop cleanup then removes any phantom rows already in the DB on next connect. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-22 20:52:35 +02:00
MarekWo	3dd1c52687	feat: contacts settings tab with suppress + auto-ignore options Move Manual approval toggle into a new Contacts tab in the global Settings modal and clean up the Contact Management panel (drop the duplicated Settings/Manage Contacts headers, shorten the Existing Contacts blurb). Add two new persisted options gated on Manual approval being ON: Suppress new advert notifications (frontend hides FAB badge + browser notification while the Pending list itself stays populated) and Automatically add new contacts to "Ignored" (advert handler marks the new contact ignored before emitting pending_contact, so the user is silenced end-to-end while contacts remain in the cache for promotion via "To Device"). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-18 10:01:58 +02:00
MarekWo	77c3ffa5c2	fix: prevent echo mis-correlation for sent channel messages Pre-compute expected pkt_payloads at send time using channel secret + timestamp (±3s for clock drift), then match echoes exactly instead of only checking the 1-byte channel hash. Fixes race condition where an incoming message's echo on the same channel could be incorrectly attributed to a just-sent message (wrong Analyzer URL). Falls back to channel-hash matching when channel secret is unavailable. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-14 21:47:07 +02:00
MarekWo	8f8bd30747	fix: refresh mc.contacts from device on dirty flag to update stale names Contact names stayed stale indefinitely because mc.contacts (in-memory dict) was only populated at startup. When a remote node renamed itself, the device firmware updated its contact list but the app never re-read it. Now ensure_contacts(follow=True) is called when contacts_dirty is set: - In _on_advertisement(): refresh before name lookup (incremental via lastmod) - In get_contacts_with_last_seen(): refresh + DB sync before serving API data Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-09 12:29:45 +02:00
MarekWo	bbfca38d34	fix: use adv_lat/adv_lon keys for device coordinates Device info from meshcore uses adv_lat/adv_lon, not lat/lon. Fixed in get_param, set_param (lat/lon individually), and the new /api/device/config endpoint. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-06 19:26:42 +02:00
MarekWo	bc1da9e45e	fix: get_device_info checked for 'data' attr instead of 'payload' Event objects use 'payload', not 'data'. This bug was latent because the cache was always populated during connect — only exposed after the cache invalidation fix. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-06 14:43:26 +02:00
MarekWo	1e6f8caf03	fix: invalidate self_info cache after set_param get_device_info() cached SELF_INFO payload in _self_info and never refreshed it after set operations, so get always returned stale values. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-06 14:40:14 +02:00
MarekWo	c3f61ce3f7	fix: get radio returns actual values, implement set radio command get radio used wrong key names (freq/bw/sf/cr instead of radio_freq/radio_bw/radio_sf/radio_cr from SELF_INFO payload). set radio was missing entirely — would silently fall through to custom variable handler. Now parses freq,bw,sf,cr and calls mc.commands.set_radio(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-06 14:33:13 +02:00
MarekWo	1a194d5050	fix: implement get advert_loc_policy console command The set command was implemented but get was missing, causing "Unknown param" error. Reads adv_loc_policy from device info. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-06 14:10:20 +02:00
MarekWo	6c02220719	fix: skip empty channel slots during sync, clean up stale DB channels Empty device channel slots have all-zero secrets (32 hex chars) which passed the length check and got persisted to DB as "Channel N". This caused ghost channels (e.g. Channel 14) to appear in unread counts while the sidebar correctly showed only real channels. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-06 11:08:38 +02:00
MarekWo	c36d7b5fbf	fix(ble): simplify reconnection — rely on container restart for clean state In-container BLE reconnection is unreliable because bleak leaves stale GATT notification handles after abnormal disconnect, and adapter power- cycling from within Docker corrupts bleak's internal BlueZ manager state. New approach: - On BLE disconnect or keepalive failure, immediately mark as permanently failed (no in-container reconnect attempts) - Health endpoint returns 503, Docker healthcheck triggers container restart - Docker entrypoint script disconnects stale BLE connections before app starts, ensuring clean GATT state for bleak This is reliable because: - MeshCore.create_ble(address=...) works on fresh container starts - The BlueZ daemon on the host maintains adapter state correctly - Container restart is fast (~5s) and gives a truly clean BLE state Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-05 16:39:03 +02:00
MarekWo	53063f199a	fix(ble): connect via BlueZ D-Bus instead of bleak direct connect bleak inside Docker cannot initiate new BLE connections — it can only take over connections already established by BlueZ. Replace the force-disconnect approach with a connect-via-BlueZ approach: 1. _ble_ensure_connected() connects the device via BlueZ D-Bus (Device1.Connect) before bleak tries to take over 2. BleakScanner.find_device_by_address() provides the BLEDevice object that bleak 3.x needs (raw MAC address doesn't work) 3. MeshCore.create_ble(device=...) takes over the BlueZ connection On reconnect after disconnect: 1. Power-cycle adapter clears stale GATT notification handles 2. BlueZ re-connects the trusted device automatically 3. bleak takes over the re-established connection Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-05 14:13:06 +02:00
MarekWo	9c692fac8b	fix(ble): use BleakScanner to find device before connecting In bleak 3.x, BleakClient(address_string) can't find paired BLE devices that aren't actively advertising. This caused BleakDeviceNotFoundError or 30-second connection timeouts. Fix: pre-scan via BleakScanner.find_device_by_address() which queries BlueZ's D-Bus object tree directly, then pass the BLEDevice object to MeshCore.create_ble(device=...) instead of the raw MAC address. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-05 14:10:47 +02:00
MarekWo	a92b505975	fix(ble): untrust device during connect to prevent BlueZ auto-reconnect BlueZ auto-reconnects trusted BLE devices, which races with bleak's connect and causes 'failed to discover services' or 'Notify acquired'. Now we temporarily untrust the device before connecting (to prevent BlueZ from auto-reconnecting during the handoff), then re-trust it after bleak has established its GATT session. Also adds _ble_retrust() helper to re-trust the device in a finally block, ensuring the bond is maintained even on connection failure. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-05 14:06:18 +02:00
MarekWo	1de98433d4	fix(ble): add adapter power-cycle to startup retry loop On startup, _connect_with_retry also needs adapter power-cycling every 3rd failed attempt to clear stale GATT state from previous sessions. Without this, the container can fail all 10 startup retries when BlueZ holds stale notification handles. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-05 13:40:53 +02:00
MarekWo	f352ccd968	fix(ble): add keepalive and robust reconnection for BLE zombie connections BLE connections can enter a "zombie" state where notifications (reads) still arrive but writes silently fail. This went undetected until the user tried to send a message, at which point the connection was already dead. Additionally, after an abnormal BLE disconnect, BlueZ retains stale GATT notification handles, causing reconnection to fail with "[org.bluez.Error.NotPermitted] Notify acquired". Changes: - Add BLE keepalive loop (60s interval) that sends get_bat() to detect zombie connections proactively and trigger reconnection automatically - Add adapter power-cycle (hci0 off/on via D-Bus) during BLE reconnection to clear stale GATT notification state - Dedicated _ble_reconnect() with 5 attempts + adapter reset between each - Health endpoint returns 503 when BLE permanently fails, triggering Docker container restart via healthcheck - Guard against concurrent reconnection attempts Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-05 13:37:33 +02:00
MarekWo	f6c9c65a51	fix(channels): refresh channel secret cache after join/create After set_channel(), read back the actual secret from the device and update both _channel_secrets in-memory cache and the DB. This fixes newly-joined # channels (where firmware auto-generates the key) having no repeater info, missing Analyzer URLs, and incorrect route data until container restart. Also clean up _channel_secrets on channel removal. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-31 21:00:34 +02:00
MarekWo	29e5e6982d	fix(chat): prevent poll-triggered reload after send by using server timestamp The 60s checkForUpdates poll was detecting has_updates due to clock skew between client and server timestamps. Now the send API returns the server timestamp, and the frontend uses it for markChannelAsRead — ensuring the poll sees no updates for own sent messages. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-31 10:28:54 +02:00
MarekWo	695321c0c9	fix(dm): show delivery info immediately on ACK/failure without reopen _confirm_delivery() now saves retry context (attempt, max_attempts, path) and emits dm_delivered_info so the frontend shows delivery details instantly. Similarly, dm_retry_failed now includes attempt count so the failure state shows how many attempts were made. Previously this info was only available after reloading messages from DB (closing and reopening the conversation). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-31 09:58:41 +02:00
MarekWo	2368ec656e	feat(path_hash_mode): fix DM route display and delivery path segmentation Stage 4 of path_hash_mode support. DM delivery paths now carry hash_size through the entire pipeline: retry context → ACK handler → SocketIO emission → frontend rendering. All hardcoded 2-char hex segmentation removed from dm.js. Backend changes (device_manager.py): - Track path_hash_size alongside path_desc in DM retry context - Update path_hash_size on path rotation and flood fallback - Add hash_size to all 4 dm_delivered_info SocketIO emissions - Derive hash_size from PATH event path_len for discovered paths Frontend changes (dm.js): - Add segmentHexPath() utility (shared by all 3 route functions) - formatDmRoute(), buildDmRouteHtml(), showDmRoutePopup() accept hashSize - All call sites pass hash_size from event data or message context Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 13:11:00 +02:00
MarekWo	e8f271f4ef	feat(path_hash_mode): add hop_count and path_hash_size to API responses Stage 2 of path_hash_mode support. All API endpoints and SocketIO emissions now include decoded hop_count and path_hash_size fields alongside the raw path_len, so the frontend can display and segment paths correctly for any hash mode. Changes: - Import decode_path_len in api.py - GET /api/messages: add hop_count, path_hash_size, echo_hash_sizes - GET /api/messages/<id>/meta: add hop_count, path_hash_size, echo_hash_sizes - GET /api/dm/messages: add hop_count, path_hash_size - SocketIO new_message emission: add hop_count, path_hash_size Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 10:00:03 +02:00
MarekWo	719e11e868	feat(path_hash_mode): add decode_path_len and fix RX_LOG_DATA parsing Stage 1 of path_hash_mode support. The critical bug in _on_rx_log_data treated the raw path_len byte as a direct byte count, which breaks with mode>0 (e.g. mode=1, 0 hops → path_len=0x40=64, reading 64 bytes of non-existent path data). Now properly decodes the encoded path_len byte into hop_count, hash_size, and path_byte_len. Changes: - Add decode_path_len() utility for MeshCore v1.14+ path_len encoding - Fix _on_rx_log_data binary parsing to use decoded path length - Pass hash_size through _process_echo → DB insert → SocketIO emission - Add hash_size column to echoes table (schema + migration) - Update insert_echo() to store hash_size (default 1 for backward compat) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-30 09:47:20 +02:00
MarekWo	10c232fc7d	fix(ble): force-disconnect stale BlueZ connection before connecting BlueZ auto-reconnects trusted BLE devices after container restart, blocking bleak from establishing a new GATT session. Clear the stale connection via D-Bus before each connect attempt. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-29 19:42:34 +02:00
MarekWo	9f335794e4	fix(ble): update runtime device name on every connect BLE connections with retries can take >60s, exceeding the startup wait timeout. Move runtime_config.set_device_name() into _connect() so the navbar shows the correct name regardless of connection delay. Also fixes name update on reconnections. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-29 19:24:45 +02:00
MarekWo	147a12c8f5	fix(dm): persist delivery_status='delivered' on ACK receipt DM delivery status was lost when switching conversations because _confirm_delivery() only stored the ACK record and emitted a socket event, but never set delivery_status='delivered' in direct_messages. During retries, each attempt generates a new ACK code. The DM record stores the initial expected_ack, but the actual ACK may arrive for a later retry's code. The ACK lookup by expected_ack then fails to match. Now _confirm_delivery() also sets delivery_status='delivered', and message loading checks this DB field first (like it already did for 'failed'), so delivery persists across page navigations. Also fixed 213 existing DMs on server via data migration. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-29 14:49:37 +02:00
MarekWo	b18c0145dd	docs(ble): add pairing guide, remove unused MC_BLE_PIN config MC_BLE_PIN was non-functional — bleak in Docker cannot perform interactive pairing (no BlueZ agent). Pairing must be done on the host before starting mc-webui. Added comprehensive pairing guide at docs/meshcore_bluetooth_pairing.md. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-29 13:56:45 +02:00
MarekWo	710f69c350	feat: add BLE transport support for companion devices Integrate meshcore library's BLE connection (via bleak) as a third transport option alongside serial and TCP. Priority: BLE > TCP > Serial. Config: MC_BLE_ADDRESS and MC_BLE_PIN environment variables. Docker: bluez/dbus packages, NET_ADMIN cap, D-Bus socket mount. UI: transport type badge in navbar, transport_type in /api/status. Watchdog: skip USB reset for BLE connections (same as TCP). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-29 10:03:45 +02:00
MarekWo	701f6f1197	fix(dm): refresh mc.contacts from device on PATH_UPDATE event The Contact Info dialog showed stale path data (e.g. "Flood" instead of the discovered route) because auto_update_contacts is OFF and PATH_UPDATE only sets _contacts_dirty=True without refreshing mc.contacts. The API then served stale in-memory data even after cache invalidation. Now ensure_contacts(follow=True) is called on PATH_UPDATE to read fresh contact data from the device before invalidating cache and emitting the socket event. PATH_UPDATE events are rare (only on path discovery), so the serial I/O cost is acceptable unlike advertisements. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-28 17:52:41 +01:00
MarekWo	0b3bd1da60	fix(dm): delayed path backfill for FLOOD-delivered messages When FLOOD delivery is confirmed, the PATH_UPDATE event payload often has empty path data because firmware updates the contact's out_path asynchronously. After 3s delay, read the contact's updated path from the meshcore library's in-memory contacts dict and backfill the DB. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-28 15:23:35 +01:00
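
Note on the path_hash_mode commits above (719e11e868, 2368ec656e): the log gives one worked case for the encoded path_len byte (mode=1, 0 hops → path_len=0x40) and mentions a segmentHexPath() helper in dm.js. The sketch below illustrates both ideas in Python. It is not the repository's code. The bit layout (top two bits hold the mode, low six bits hold the hop count) is an assumption inferred from that single example, and the name segment_hex_path and both function signatures are hypothetical.

```python
from typing import List, Tuple


def decode_path_len(path_len: int) -> Tuple[int, int, int]:
    """Split an encoded path_len byte into (hop_count, hash_size, path_byte_len).

    Assumed layout (inferred from the 0x40 example, not confirmed by the log):
    top 2 bits = path_hash_mode, low 6 bits = hop count, hash_size = mode + 1.
    """
    if not 0 <= path_len <= 0xFF:
        raise ValueError("path_len must fit in one byte")
    mode = path_len >> 6           # assumed: mode lives in the top two bits
    hop_count = path_len & 0x3F    # assumed: hop count lives in the low six bits
    hash_size = mode + 1           # bytes per path hash
    return hop_count, hash_size, hop_count * hash_size


def segment_hex_path(hex_path: str, hash_size: int = 1) -> List[str]:
    """Cut a hex path string into per-hop tokens of hash_size bytes each."""
    if hash_size < 1:
        raise ValueError("hash_size must be at least 1")
    step = hash_size * 2           # two hex characters per byte
    return [hex_path[i:i + step] for i in range(0, len(hex_path), step)]
```

Under these assumptions, decode_path_len(0x40) yields zero hops with 2-byte hashes and a path length of 0 bytes, rather than the 64 bytes read by the pre-fix parser. segment_hex_path("44D8D103", 2) splits into the tokens "44D8" and "D103".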

131 Commits