79 Commits

Author SHA1 Message Date
Lloyd
b949bdeab8 Merge pull request #190 from tjdownes/fix/tx-serialization
fix: serialise radio TX and close duty-cycle TOCTOU race
2026-04-24 08:59:56 +01:00
Lloyd
1626b3f307 feat: add max flood hops configuration to repeater settings 2026-04-22 13:52:40 +01:00
TJ Downes
179158e68b fix(engine): release _tx_lock during local-TX retry backoff; add lock tests
Reviewer concern (PR 190):
  The 1-second backoff sleep for local_transmission retry happened inside
  `async with self._tx_lock`, blocking all other queued TX tasks for the
  full second — hurting latency and throughput under load.

Fix — tighten lock scope to one attempt per acquisition:
  Before:  acquire lock → [attempt 0 → sleep(1) → attempt 1] → release
  After:   for each attempt:
             [sleep(1) if retry]          ← OUTSIDE the lock
             acquire lock
             re-check can_transmit        ← fresh check every acquisition
             attempt single send
             record_tx on success
             release lock

The duty-cycle gate now runs on every lock acquisition (not just the first),
which is correct: airtime state may change during the backoff sleep.
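
The per-attempt locking described above can be sketched as follows. This is a minimal illustration, not the project's code: the `TxSketch` class, its integer `budget`, the `fail_first` switch, and `_send_once` are hypothetical stand-ins, and the backoff is shortened so the example runs quickly.

```python
import asyncio


class TxSketch:
    """Hypothetical stand-in for the handler; not the project's class."""

    def __init__(self, budget, fail_first=False):
        self._tx_lock = asyncio.Lock()
        self.budget = budget          # assumed: remaining sends allowed by duty cycle
        self.fail_first = fail_first  # assumed: simulate one radio failure
        self.sent = []

    def can_transmit(self):
        return self.budget > 0

    def record_tx(self):
        self.budget -= 1

    async def _send_once(self, name):
        # Stand-in for the radio send; fails once if fail_first is set.
        if self.fail_first:
            self.fail_first = False
            return False
        self.sent.append(name)
        return True

    async def send_with_retry(self, name, attempts=2, backoff=0.01):
        for attempt in range(attempts):
            if attempt > 0:
                await asyncio.sleep(backoff)  # backoff runs outside the lock
            async with self._tx_lock:
                if not self.can_transmit():   # fresh check on every acquisition
                    return False
                if await self._send_once(name):
                    self.record_tx()          # airtime recorded only on success
                    return True
        return False
```

Because the lock is released between attempts, a second task can transmit while the first one is sleeping through its backoff.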

Tests added (tests/test_tx_lock.py):
  1. test_concurrent_sends_do_not_interleave — two tasks racing to the same
     delay timer must never overlap inside send_packet.
  2. test_duty_cycle_toctou_is_fixed — second packet is dropped when the
     first consumes the budget inside the lock.
  3. test_local_retry_releases_lock_during_backoff — a concurrent relayed
     packet fires at ~0.1s while local retry sleeps 1s; confirms it is not
     blocked by the backoff.
  4. test_non_local_failure_propagates — relayed send failure raises
     immediately with exactly one attempt.
  5. test_duty_cycle_rechecked_on_retry — if the budget is exhausted during
     backoff, the retry is dropped by the in-lock gate (not sent).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 05:29:47 -07:00
Lloyd
af79eaf63f Merge pull request #192 from tjdownes/perf/hash-once
perf: compute packet hash once per packet in the forwarding hot path
2026-04-22 10:38:04 +01:00
Lloyd
40ec2ba293 Merge pull request #194 from tjdownes/perf/debug-log-guards
Merged this now. It’s a safe change with no behavioural impact, and it removes unnecessary work in the hot paths when DEBUG logging is off. Happy to revisit if we want to standardise on lazy formatting later, but this gives us an immediate win.
2026-04-22 09:45:55 +01:00
Rigear
d318334288 Merge remote-tracking branch 'origin/fix-perform-speed' into feat/mqtt_merge 2026-04-21 20:59:42 -07:00
TJ Downes
c0163ce897 perf: guard hot-path debug log f-strings with isEnabledFor(DEBUG)
Python evaluates f-string arguments before calling logger.debug(), so in
production (INFO level) every debug log call in the hot path still paid the
cost of string formatting even though the output was discarded.

The most expensive sites are in __call__ (runs on every received packet):
  - "RX packet: header=0x{...}, payload_len=..., path_len=..., rssi=..., snr=..."
  - "Packet header=0x{...}, type=..., route=..."

And in _calculate_tx_delay (runs on every forwarded packet):
  - "Route=FLOOD/DIRECT, len=...B, airtime=...ms, delay=...s"
  - "Congestion detected, score=..., delay multiplier=..."

Plus transport code and local-TX debug logs (less frequent but same issue).

Fix: wrap each f-string logger.debug() call with
  if logger.isEnabledFor(logging.DEBUG):
so the f-string is never constructed when debug logging is disabled.

logger.isEnabledFor() is a cheap in-memory level check (CPython caches the
result per level), so it is essentially free at runtime.  In production at INFO
level this eliminates string formatting and attribute lookups (packet.header,
len(packet.payload), etc.) on every forwarded packet.

Eight call sites guarded; no logic changes.
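
A minimal sketch of the guard pattern described above. The logger name and the `CountingPayload` helper are hypothetical; the helper only exists to make the formatting cost observable.

```python
import logging

logger = logging.getLogger("repeater.sketch")  # hypothetical logger name


class CountingPayload:
    """Counts how often it is formatted, to make the cost visible."""

    def __init__(self):
        self.format_calls = 0

    def __format__(self, spec):
        self.format_calls += 1
        return "payload"


def log_unguarded(pkt):
    # The f-string is built before logger.debug() decides whether to emit it.
    logger.debug(f"RX packet: {pkt}")


def log_guarded(pkt):
    # The f-string is only built when DEBUG is actually enabled.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug(f"RX packet: {pkt}")


logger.setLevel(logging.INFO)  # production-like level
```

At INFO level, `log_unguarded` still formats the payload once per call, while `log_guarded` never touches it.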

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 19:46:11 -07:00
TJ Downes
4e16fd040d perf: compute packet hash once per packet in the forwarding hot path
Before this change, calculate_packet_hash() (SHA-256 + hex + upper) was called
3 times per forwarded packet and 4 times per dropped packet:
  __call__              → pkt_hash_full = packet.calculate_packet_hash()   #1
  → flood/direct_forward → is_duplicate → calculate_packet_hash()          #2
  → flood/direct_forward → mark_seen    → calculate_packet_hash()          #3
  (drop) → _get_drop_reason → is_duplicate → calculate_packet_hash()       #4

pkt_hash_full was computed in __call__ but never threaded down into
process_packet, flood_forward, direct_forward, is_duplicate, or _get_drop_reason.
Each method recomputed it independently.

Fix: add optional packet_hash: Optional[str] = None to is_duplicate,
_get_drop_reason, flood_forward, direct_forward, and process_packet.  Pass
pkt_hash_full from __call__ through the chain.  Each method uses the provided
hash or falls back to computing it — preserving backward compatibility for
external callers (TraceHelper, etc.) that have no pre-computed hash.

Result: 1 SHA-256 computation per packet in the hot path regardless of whether
the packet is forwarded or dropped.
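
The threading of a pre-computed hash can be sketched roughly as below. The `Packet` class, the module-level `seen` set, the `handle` function, and the `packet_hash` parameter on `mark_seen` are assumptions made for illustration; only the optional `packet_hash` parameter with a compute-on-demand fallback comes from the description above.

```python
import hashlib
from typing import Optional


class Packet:
    """Hypothetical minimal packet; the real class lives in the project."""

    def __init__(self, payload: bytes) -> None:
        self.payload = payload
        self.hash_calls = 0  # instrumentation for the sketch only

    def calculate_packet_hash(self) -> str:
        self.hash_calls += 1
        return hashlib.sha256(self.payload).hexdigest().upper()


seen = set()  # assumed stand-in for the duplicate cache


def is_duplicate(packet: Packet, packet_hash: Optional[str] = None) -> bool:
    # Use the caller's hash when given; otherwise fall back to computing it.
    if packet_hash is None:
        packet_hash = packet.calculate_packet_hash()
    return packet_hash in seen


def mark_seen(packet: Packet, packet_hash: Optional[str] = None) -> None:
    if packet_hash is None:
        packet_hash = packet.calculate_packet_hash()
    seen.add(packet_hash)


def handle(packet: Packet) -> bool:
    pkt_hash_full = packet.calculate_packet_hash()  # the only computation
    if is_duplicate(packet, pkt_hash_full):
        return False  # dropped as a duplicate
    mark_seen(packet, pkt_hash_full)
    return True  # would be forwarded
```

Callers without a pre-computed hash can still call `is_duplicate(packet)` unchanged, which is what keeps the change backward compatible.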

Also adds explicit INVARIANT docstrings to flood_forward, direct_forward, and
is_duplicate documenting that these methods must remain synchronous (no await).
The is_duplicate + mark_seen pair is atomic within the asyncio event loop; adding
an await between them would allow two concurrent tasks to both pass the duplicate
check for the same packet — forwarding it twice.
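
To illustrate why the invariant matters, here is a small sketch under assumed names (`forward_sync`, `forward_with_await`, and the plain sets and lists are hypothetical). The synchronous version forwards once; the version with an `await` between the check and the mark lets both tasks through.

```python
import asyncio

seen_hashes = set()
forwarded = []


def forward_sync(pkt_hash, who):
    # No await between the check and the mark: atomic within the event loop.
    if pkt_hash in seen_hashes:
        return
    seen_hashes.add(pkt_hash)
    forwarded.append(who)


async def forward_with_await(pkt_hash, who, seen_set, out):
    if pkt_hash in seen_set:
        return
    await asyncio.sleep(0)  # yields control: another task can pass the check
    seen_set.add(pkt_hash)
    out.append(who)


async def run_sync():
    async def task(who):
        forward_sync("ABC", who)

    await asyncio.gather(task("a"), task("b"))


async def run_racy():
    seen_set, out = set(), []
    await asyncio.gather(
        forward_with_await("ABC", "a", seen_set, out),
        forward_with_await("ABC", "b", seen_set, out),
    )
    return out
```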

Docs: docs/pr_hash_once.md — problem analysis, call-chain diagram, per-method
diffs, quantification (~3-8 µs saved per packet), test plan (including hash-count
assertion), and proof that passing the original's hash to the deep-copied packet
is correct.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 19:28:45 -07:00
TJ Downes
fdbc85c926 fix: serialise radio TX and close duty-cycle TOCTOU race
Add self._tx_lock (asyncio.Lock) to RepeaterHandler and acquire it inside
delayed_send after the per-packet sleep completes.

Problem 1 — radio interleave: concurrent delayed_send coroutines (one per
queued packet) could both exit their sleep at nearly the same moment and call
dispatcher.send_packet simultaneously, interleaving SPI/serial register writes
to the half-duplex LoRa radio.

Problem 2 — TOCTOU gap: the upfront can_transmit() check in __call__ and the
record_tx() call in delayed_send are separated by the entire TX delay (up to
several seconds).  Under burst conditions two tasks both pass the check before
either has recorded its airtime, causing both to transmit and the duty-cycle
budget to be exceeded.

Fix: acquire _tx_lock after the sleep so delay timers still run concurrently
(matching firmware behaviour), then immediately re-check can_transmit() inside
the lock before sending.  Because only one task holds the lock at a time,
airtime state is stable; check and record_tx() are effectively atomic — no
TOCTOU window.  Airtime is recorded only on a successful send, so a failed
send never consumes duty-cycle budget.
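
A rough sketch of the lock-after-sleep pattern, assuming a simplified millisecond budget. The `Handler` class and the bodies of `can_transmit` and `record_tx` are hypothetical; only the ordering (sleep, acquire lock, re-check, send, record) follows the description above.

```python
import asyncio


class Handler:
    """Hypothetical sketch; names mirror the commit message, bodies are assumed."""

    def __init__(self, budget_ms):
        self._tx_lock = asyncio.Lock()
        self.budget_ms = budget_ms
        self.log = []

    def can_transmit(self, airtime_ms):
        return airtime_ms <= self.budget_ms

    def record_tx(self, airtime_ms):
        self.budget_ms -= airtime_ms

    async def delayed_send(self, name, delay, airtime_ms):
        await asyncio.sleep(delay)      # delay timers still run concurrently
        async with self._tx_lock:       # one task on the radio at a time
            if not self.can_transmit(airtime_ms):  # re-check inside the lock
                self.log.append((name, "dropped"))
                return False
            await asyncio.sleep(0.03)   # stand-in for the radio send
            self.record_tx(airtime_ms)  # recorded only after the send succeeded
            self.log.append((name, "sent"))
            return True
```

With a budget that covers only one packet, the second task waits for the lock, sees the updated airtime state, and is dropped instead of transmitting.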

Also move `import random` from inside _calculate_tx_delay to module level
(stdlib imports belong at the top; the lazy-import pattern is unnecessary here).

Docs: docs/pr_tx_serialization.md — problem statement, root-cause analysis,
alternative approaches considered, invariant table, full unit + field test plan,
and proof of correctness for the asyncio.Lock approach.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 18:37:56 -07:00
Lloyd
c5fd41f28a feat: enhance task management in handlers with tracking and error logging 2026-04-21 09:38:03 +01:00
Rigear
27fa2381ea feat:
* Added retain status message bool
* Added back old templates
* Added migration path from old mqtt and letsmesh configs to new mqtt_broker config
2026-04-15 21:20:11 -07:00
Rigear
64530a623e refactor: Updated letsmesh references 2026-04-11 21:44:26 -07:00
Rigear
acf8079761 feat: Merge mqtt handler and letsmesh handlers 2026-04-11 16:09:14 -07:00
Joshua Mesilane
38e1fbe3f9 Change 'Global' flood to 'Unscoped' flood: '*' means unscoped, not wildcard. Region keys should still only be forwarded if they are whitelisted. UI changes pending 2026-04-06 22:32:05 +10:00
agessaman
744826199e feat: implement duplicate packet recording for UI visibility in RepeaterHandler
- Added record_duplicate method to RepeaterHandler to log known duplicate packets without forwarding.
- Enhanced RepeaterDaemon to subscribe to raw packets for deduplication logging, ensuring all path variants are visible in the UI.
- Updated recent_packets management to group duplicates under their original packets for better tracking.
2026-03-23 17:02:29 -07:00
Lloyd
369b420ae3 feat: enhance RepeaterHandler with duplicate packet limit and cache cleanup, add graceful shutdown handling in RepeaterDaemon, and increase PacketRouter queue size 2026-03-23 14:30:01 +00:00
Lloyd
d11d957318 Merge pull request #158 from agessaman/feat/companion-traces
Fix multibyte trace handling for companion/repeater and exclude trace packets from _record_for_ui
2026-03-22 22:44:10 +00:00
agessaman
c5c94fe60a feat: exclude TRACE packets from logging in RepeaterHandler and PacketRouter
- Updated record_packet_only method to skip logging for TRACE packets, as TraceHelper manages trace paths.
- Enhanced documentation to clarify the handling of TRACE packets in the web UI.
- Added tests to ensure TRACE packets are not recorded, maintaining data integrity.
2026-03-22 15:26:28 -07:00
Lloyd
55fe9feddd feat: add useSignalQuality utility for signal strength evaluation 2026-03-22 22:26:18 +00:00
agessaman
7558c5604c feat: enhance repeater TX mode functionality so companion tenants can TX while in monitor mode
- TX modes are now forward, monitor, and the new no_tx, allowing flexible packet handling.
- Updated configuration and API endpoints to support the new modes.
- Adjusted logic in RepeaterHandler to manage packet processing based on the selected mode.
- Enhanced CLI commands to reflect the new mode settings.
- Added tests for each TX mode to ensure correct behavior.
2026-03-15 13:03:18 -07:00
Lloyd
596c96d1f4 Extend tests to cover:
- Serialization/deserialization with multi-byte paths
- Functionality of Packet.apply_path_hash_mode and get_path_hashes
- Engine flood_forward and direct_forward with real multi-byte encoded packets
- PacketBuilder.create_trace payload structure and TraceHandler parsing
- Enforcement of max-hop boundaries per hash size
2026-03-11 14:23:29 +00:00
Lloyd
632f1d2d1a Merge pull request #132 from agessaman/dev-companion-v2-cleanup
Update OTA repeater stats to return correct uptime, airtime, packet counts, etc.
2026-03-10 09:32:43 +00:00
agessaman
25c2a14a81 Update OTA repeater stats to return correct uptime, airtime, packet counts, etc.
- Introduced `total_rx_airtime_ms` in `AirtimeManager` to track received packet airtime.
- Added `record_rx` method to log received airtime in `AirtimeManager`.
- Updated `RepeaterHandler` to count received packets and log RX airtime using the new method.
- Enhanced statistics reporting in `get_stats` to include total received airtime.
- Updated `ProtocolRequestHelper` to include total RX airtime in the RepeaterStats structure for better monitoring.
2026-03-09 17:27:51 -07:00
Lloyd
95537cd158 Merge pull request #130 from agessaman/dev-companion-v2-cleanup
Enhance packet recording and refactor handling in RepeaterHandler and PacketRouter
2026-03-09 08:54:13 +00:00
agessaman
f2a5eab726 Refactor packet handling in PacketRouter and RepeaterHandler
- Removed redundant original_path assignment in `RepeaterHandler` to streamline packet processing.
- Introduced `_is_direct_final_hop` helper method in `PacketRouter` to determine if a packet is the final destination for direct routes with an empty path.
- Updated comments in `PacketRouter` to clarify the handling of packets during routing, especially for direct forwarding scenarios.
- Adjusted logic to ensure packets are correctly processed or delivered based on their routing status, enhancing overall packet management.
2026-03-08 19:39:49 -07:00
agessaman
1002ba3194 Refactor packet handling in RepeaterHandler and PacketRouter
- Introduced helper methods `_path_hash_display` and `_packet_record_src_dst` in `RepeaterHandler` to streamline path hash and source/destination hash extraction.
- Updated `record_packet` method to utilize a new `_build_packet_record` method for improved readability and maintainability.
- Enhanced `PacketRouter` comments for clarity on handling remote destinations and packet processing, ensuring better understanding of the routing logic.
2026-03-08 18:23:16 -07:00
agessaman
4490c9bb8c Add packet recording for injection-only types in RepeaterHandler
- Introduced `record_packet_only` method in `RepeaterHandler` to log packets for UI/storage without forwarding or duplicate checks, specifically for injection-only types like ANON_REQ and ACK.
- Updated `PacketRouter` to call `_record_for_ui` method, ensuring that relevant metadata is recorded for packets processed by various handlers.
- Enhanced handling of packet metadata, including RSSI and SNR values, to improve the visibility of packet information in the web UI.
2026-03-08 17:23:20 -07:00
Lloyd
884de123b9 Merge branch 'pr-128' into feat/companion
# Conflicts:
#	repeater/engine.py
#	repeater/web/api_endpoints.py
2026-03-08 07:09:06 +00:00
agessaman
3725d6eb21 Add path hash mode configuration and management
- Introduced `path_hash_mode` setting in `config.yaml.example` to specify the hash size for flood packets.
- Updated `ConfigManager` to re-apply the path hash mode when the mesh section changes, with validation for acceptable values (0, 1, 2).
- Enhanced `RepeaterDaemon` to set the default path hash mode during initialization, ensuring consistent handling of flood packets.
2026-03-07 13:57:46 -08:00
Lloyd
e5c91f382a Add mesh configuration for loop detection and update API endpoint handling 2026-03-06 21:21:30 +00:00
agessaman
0217a49ed2 Refactor RepeaterHandler path management and enhance packet validation
- Removed redundant call to mark_seen() for duplicate packets.
- Added validation to ensure hop count does not exceed the maximum limit before appending to the packet path.
- Updated logic to check for path size constraints when appending hash bytes, improving packet processing efficiency.
2026-03-05 16:52:43 -08:00
agessaman
c150b9a9bf Merge upstream/feat/newRadios into dev-companion-v2-cleanup
- Keep our Vite-built assets and index.html script (index-DyUIpN7m.js)
- Remove upstream-only asset chunks and RoomServers-BxQ-0q-x.js
- README: keep two-backend intro, add upstream CAUTION/compatibility table
- manage.sh: keep dialog/gauge UX and .[hardware]; add CH341 udev, sudoers, libusb, polkit, silent upgrade
- sqlite_handler: add crc_errors table, index, and cleanup from upstream
- engine: add validate_packet and mark_seen in direct_forward; keep our path hash_size/hop_count logic
- advert: keep comment, use current_time = now
- api_endpoints: use restart_service() from service_utils
- config merge: strip user config comments before yq merge (upstream)

Made-with: Cursor
2026-03-05 16:43:14 -08:00
agessaman
b6757a0ca0 Refactor path handling in RepeaterHandler to utilize hash-based representations
- Replaced list-based path storage with hash-based methods for original and forwarded paths, improving efficiency and consistency.
- Updated display logic to format path hashes correctly, ensuring compatibility with new hash size management.
- Adjusted local transmission handling to align with the new hash representation, enhancing clarity in packet processing.
2026-03-05 14:06:43 -08:00
agessaman
0271aa9455 Add support for multi-byte hashes via local_hash_bytes
- Updated the RepeaterHandler constructor to accept local_hash_bytes, improving path handling.
- Implemented checks to ensure packet paths do not exceed MAX_PATH_SIZE when appending hash bytes.
- Refactored direct_forward method to utilize local_hash_bytes for next hop validation and path manipulation.
- Adjusted path length encoding to accommodate changes in path management logic.
2026-03-05 09:39:11 -08:00
Lloyd
136af19178 add loop detection configuration and tests for flood routing 2026-03-05 14:06:28 +00:00
Lloyd
7dccb7457f Add advertisement configuration options for rate limiting and adaptive settings 2026-03-05 11:12:28 +00:00
Lloyd
4a05e20172 Add CRC error tracking and API endpoints for error count and history
- Create a new table for storing CRC errors in SQLite.
- Implement methods to store and retrieve CRC error counts and history.
- Update StorageCollector to record CRC errors and expose relevant methods.
- Enhance RepeaterHandler to track and record CRC error deltas from the radio hardware.
- Add API endpoints to fetch CRC error count and history.
2026-03-02 12:36:08 +00:00
Lloyd
c2f57c3d0f Add tests and more packet validation; remove the CRC setting from config because it is hardcoded. 2026-03-02 11:35:50 +00:00
agessaman
e14fd3feea Refactor noise floor retrieval in RepeaterHandler to use asyncio executor
- Updated the noise floor retrieval method to run in an executor, preventing blocking of the event loop during the KISS modem's command execution.
- This change enhances responsiveness by allowing the process to handle other tasks while waiting for the noise floor measurement.
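
A minimal sketch of moving a blocking call off the event loop. `read_noise_floor_blocking` and its return value are made up for illustration; the real modem command is not shown in this log.

```python
import asyncio
import time


def read_noise_floor_blocking():
    """Hypothetical blocking modem call; the value is made up for the sketch."""
    time.sleep(0.05)
    return -110.0


async def get_noise_floor():
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default thread pool executor.
    return await loop.run_in_executor(None, read_noise_floor_blocking)


async def demo():
    ticks = 0

    async def ticker():
        # Keeps running while the blocking call sits in a worker thread.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.005)

    task = asyncio.create_task(ticker())
    value = await get_noise_floor()
    task.cancel()
    return value, ticks
```

The ticker keeps advancing during the measurement, which is the responsiveness gain the commit describes.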
2026-02-21 15:38:04 -08:00
agessaman
65164fffb7 Improve retransmission logic and duty cycle handling in RepeaterHandler
- Improved local transmission handling by deferring local TX when duty cycle limits are exceeded, instead of dropping packets.
- Added LBT metadata extraction and logging for better monitoring of transmission attempts and delays.
- Refactored `schedule_retransmit` to support retrying local transmissions on failure, enhancing reliability.
- Introduced a lock in PacketRouter to serialize local TX operations, preventing race conditions during packet processing.
2026-02-21 15:38:01 -08:00
agessaman
c2f8a2e3cd refactor: companion FrameServer and related (substantive only, no Black)
Reapply refactor from ce8381a (replace monolithic FrameServer with thin
pymc_core subclass, re-export constants, SQLite persistence hooks) while
preserving pre-refactor whitespace where patch applied cleanly. Remaining
files match refactor commit exactly. Diff vs ce8381a is whitespace-only.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-21 15:35:47 -08:00
Lloyd
371b9fdbb8 fix: update airtime calculations to use config values in AirtimeManager 2026-01-19 11:52:23 +00:00
Lloyd
d671c58184 suppress duplicate packets in processing logic 2026-01-12 16:23:11 +00:00
Lloyd
1999af8bdd add Packet Hash Cache to UI 2026-01-10 22:59:43 +00:00
Lloyd
effad3378f Increase default cache TTL from 60 seconds to 1 hour 2026-01-10 22:27:15 +00:00
Paul Picazo
942d4dfe28 Enhance advert storage logic: prioritize direct routes and handle zero-hop measurements
Refactor packet processing: use processed_packet for forwarding and drop reason checks

Fix: Update zero-hop determination logic in AdvertHelper

Fix: Clone packet in process_packet to prevent modification of the original

Fix: Import copy module for deep copying of packets in process_packet
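
The cloning fix can be sketched as below, assuming a simplified `Packet` with a list-based `path`; the class and the `local_hash` parameter are hypothetical.

```python
import copy


class Packet:
    """Hypothetical minimal packet with a mutable path."""

    def __init__(self, path):
        self.path = path


def process_packet(packet, local_hash):
    # Work on a deep copy so the caller's packet (and its path list) is untouched.
    processed = copy.deepcopy(packet)
    processed.path.append(local_hash)
    return processed
```

A shallow copy would share the same `path` list with the original, so the append would still leak back to the caller.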
2026-01-03 14:35:30 -08:00
Lloyd
8b8edb9929 add pymc console endpoints and ui 2026-01-02 16:35:18 +00:00
Lloyd
98b425f444 added CLI 2025-12-29 14:37:54 +00:00
dmduran12
e74e062fe5 feat(api): Expose additional config values in /api/stats
Adds the following fields to the stats API response for MeshCore CLI parity:

- config.repeater.max_flood_hops: Max flood hops setting (for 'get flood.max')
- config.repeater.advert_interval_minutes: Local advert interval (for 'get advert.interval')
- config.delays.rx_delay_base: RX delay base setting (for 'get rxdelay')

These fields are already present in config.yaml but were not exposed via the
stats API, making them inaccessible to web dashboards and CLI tools that
communicate over HTTP.

This enables pyMC Console's Terminal to display these values without
requiring direct config file access.

Co-Authored-By: Warp <agent@warp.dev>
2025-12-27 19:46:29 -08:00
Lloyd
24866707f4 Update retransmission handling to await completion and extract LBT metadata 2025-12-21 21:31:36 +00:00