Reviewer concern (PR 190):
The 1-second backoff sleep for local_transmission retry happened inside
`async with self._tx_lock`, blocking all other queued TX tasks for the
full second — hurting latency and throughput under load.
Fix — tighten lock scope to one attempt per acquisition:
Before: acquire lock → [attempt 0 → sleep(1) → attempt 1] → release
After:  for each attempt:
          [sleep(1) if retry]      ← OUTSIDE the lock
          acquire lock
          re-check can_transmit    ← fresh check every acquisition
          attempt single send
          record_tx on success
          release lock
The duty-cycle gate now runs on every lock acquisition (not just the first),
which is correct: airtime state may change during the backoff sleep.
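A minimal sketch of the tightened lock scope. The class `TxSketch`, the method `send_with_retry`, the counter-based `budget`, and the `RuntimeError` failure are hypothetical stand-ins for illustration; only `_tx_lock`, `can_transmit`, `record_tx`, and `send_packet` are names taken from the description above.

```python
import asyncio


class TxSketch:
    """Hypothetical stand-in for the handler; only the lock scoping matters."""

    def __init__(self, budget, fail_first=False):
        self._tx_lock = asyncio.Lock()
        self.budget = budget          # sends still allowed by the duty-cycle gate
        self.fail_first = fail_first  # simulate a single radio failure
        self.sent = []

    def can_transmit(self):
        return self.budget > 0

    def record_tx(self):
        self.budget -= 1

    async def send_packet(self, name):
        if self.fail_first:
            self.fail_first = False
            raise RuntimeError("simulated radio failure")
        self.sent.append(name)

    async def send_with_retry(self, name, attempts=2, backoff=1.0):
        for attempt in range(attempts):
            if attempt > 0:
                # Backoff runs OUTSIDE the lock, so other TX tasks can proceed.
                await asyncio.sleep(backoff)
            async with self._tx_lock:
                # Fresh duty-cycle check on every acquisition.
                if not self.can_transmit():
                    return False
                try:
                    await self.send_packet(name)
                except RuntimeError:
                    continue  # leaves the lock before the next backoff sleep
                self.record_tx()  # airtime recorded only on success
                return True
        return False
```

With one simulated failure on the first task, a second task can acquire the lock and send while the first one is still sleeping in its backoff.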
Tests added (tests/test_tx_lock.py):
1. test_concurrent_sends_do_not_interleave: two tasks racing to the same
   delay timer must never overlap inside send_packet.
2. test_duty_cycle_toctou_is_fixed: second packet is dropped when the
   first consumes the budget inside the lock.
3. test_local_retry_releases_lock_during_backoff: a concurrent relayed
   packet fires at ~0.1s while local retry sleeps 1s; confirms it is not
   blocked by the backoff.
4. test_non_local_failure_propagates: relayed send failure raises
   immediately with exactly one attempt.
5. test_duty_cycle_rechecked_on_retry: if the budget is exhausted during
   backoff, the retry is dropped by the in-lock gate (not sent).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Merged this now. It’s a safe change with no behavioural impact, and it removes unnecessary work in the hot paths when DEBUG logging is off. Happy to revisit if we want to standardise on lazy formatting later, but this gives us an immediate win.
Python evaluates f-string arguments before calling logger.debug(), so in
production (INFO level) every debug log call in the hot path still paid the
cost of string formatting even though the output was discarded.
The most expensive sites are in __call__ (runs on every received packet):
- "RX packet: header=0x{...}, payload_len=..., path_len=..., rssi=..., snr=..."
- "Packet header=0x{...}, type=..., route=..."
And in _calculate_tx_delay (runs on every forwarded packet):
- "Route=FLOOD/DIRECT, len=...B, airtime=...ms, delay=...s"
- "Congestion detected, score=..., delay multiplier=..."
Plus transport code and local-TX debug logs (less frequent but same issue).
Fix: wrap each f-string logger.debug() call with
    if logger.isEnabledFor(logging.DEBUG):
so the f-string is never constructed when debug logging is disabled.
logger.isEnabledFor() is a cheap in-memory level check (its result is cached
per level on Python 3.7+), so it is essentially free at runtime. In production
at INFO level this eliminates f-string formatting and the attribute lookups
that feed it (packet.header, len(packet.payload), etc.) on every received and
forwarded packet.
Eight call sites guarded; no logic changes.
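A small sketch of the guard pattern. The logger name, the `log_rx` function, and the `CountingHeader` helper are hypothetical; the helper exists only to make the formatting cost observable.

```python
import logging

logger = logging.getLogger("repeater.sketch")  # hypothetical logger name


class CountingHeader:
    """Counts how often it is formatted, to make the hidden cost visible."""

    def __init__(self):
        self.format_calls = 0

    def __format__(self, spec):
        self.format_calls += 1
        return format(0x15, spec)


def log_rx(header, payload_len):
    # The f-string is only built when DEBUG is actually enabled.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug(f"RX packet: header=0x{header:02x}, payload_len={payload_len}")
```

At INFO level `format_calls` stays at zero after calling `log_rx`; without the guard, the f-string would be evaluated before `logger.debug()` could discard it.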
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Before this change, calculate_packet_hash() (SHA-256 + hex + upper) was called
3 times per forwarded packet and 4 times per dropped packet:
  __call__ → pkt_hash_full = packet.calculate_packet_hash()               #1
    → flood/direct_forward → is_duplicate → calculate_packet_hash()       #2
    → flood/direct_forward → mark_seen → calculate_packet_hash()          #3
    (drop) → _get_drop_reason → is_duplicate → calculate_packet_hash()    #4
pkt_hash_full was computed in __call__ but never threaded down into
process_packet, flood_forward, direct_forward, is_duplicate, or _get_drop_reason.
Each method recomputed it independently.
Fix: add optional packet_hash: Optional[str] = None to is_duplicate,
_get_drop_reason, flood_forward, direct_forward, and process_packet. Pass
pkt_hash_full from __call__ through the chain. Each method uses the provided
hash or falls back to computing it — preserving backward compatibility for
external callers (TraceHelper, etc.) that have no pre-computed hash.
Result: 1 SHA-256 computation per packet in the hot path regardless of whether
the packet is forwarded or dropped.
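The threading of the hash can be sketched as follows. `Packet`, `Engine`, `handle` (standing in for `__call__`), and the `seen` set are simplified, hypothetical stand-ins; the real methods take more parameters and do more work.

```python
import hashlib
from typing import Optional


class Packet:
    """Hypothetical minimal packet that counts hash computations."""

    def __init__(self, payload: bytes):
        self.payload = payload
        self.hash_calls = 0

    def calculate_packet_hash(self) -> str:
        self.hash_calls += 1
        return hashlib.sha256(self.payload).hexdigest().upper()


class Engine:
    def __init__(self):
        self.seen = set()

    def is_duplicate(self, packet: Packet, packet_hash: Optional[str] = None) -> bool:
        # Use the caller's hash if provided, otherwise fall back to computing it.
        pkt_hash = packet_hash or packet.calculate_packet_hash()
        return pkt_hash in self.seen

    def flood_forward(self, packet: Packet, packet_hash: Optional[str] = None) -> bool:
        pkt_hash = packet_hash or packet.calculate_packet_hash()
        if self.is_duplicate(packet, pkt_hash):
            return False
        self.seen.add(pkt_hash)  # stands in for mark_seen
        return True

    def handle(self, packet: Packet) -> bool:
        # Stands in for __call__: hash once, then pass it down the chain.
        pkt_hash_full = packet.calculate_packet_hash()
        return self.flood_forward(packet, pkt_hash_full)
```

Callers that have no pre-computed hash can still call `flood_forward(packet)` directly and pay for exactly one computation.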
Also adds explicit INVARIANT docstrings to flood_forward, direct_forward, and
is_duplicate documenting that these methods must remain synchronous (no await).
The is_duplicate + mark_seen pair is atomic within the asyncio event loop; adding
an await between them would allow two concurrent tasks to both pass the duplicate
check for the same packet — forwarding it twice.
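To illustrate why the invariant matters, here is a sketch with a hypothetical `Dedup` class. `forward_sync` keeps check and mark together; `forward_with_await` deliberately inserts the kind of await the docstrings forbid.

```python
import asyncio


class Dedup:
    """Hypothetical reduction of the duplicate check plus mark_seen pair."""

    def __init__(self):
        self.seen = set()
        self.forwarded = []

    def forward_sync(self, pkt_hash: str) -> bool:
        # No await between check and mark: atomic within the event loop.
        if pkt_hash in self.seen:
            return False
        self.seen.add(pkt_hash)
        self.forwarded.append(pkt_hash)
        return True

    async def forward_with_await(self, pkt_hash: str) -> bool:
        if pkt_hash in self.seen:
            return False
        await asyncio.sleep(0)  # yields to the loop: another task can pass the check
        self.seen.add(pkt_hash)
        self.forwarded.append(pkt_hash)
        return True
```

Running two `forward_with_await("A")` coroutines under `asyncio.gather` forwards the packet twice, while two tasks calling `forward_sync("A")` forward it once.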
Docs: docs/pr_hash_once.md — problem analysis, call-chain diagram, per-method
diffs, quantification (~3-8 µs saved per packet), test plan (including hash-count
assertion), and proof that passing the original's hash to the deep-copied packet
is correct.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add self._tx_lock (asyncio.Lock) to RepeaterHandler and acquire it inside
delayed_send after the per-packet sleep completes.
Problem 1 — radio interleave: concurrent delayed_send coroutines (one per
queued packet) could both exit their sleep at nearly the same moment and call
dispatcher.send_packet simultaneously, interleaving SPI/serial register writes
to the half-duplex LoRa radio.
Problem 2 — TOCTOU gap: the upfront can_transmit() check in __call__ and the
record_tx() call in delayed_send are separated by the entire TX delay (up to
several seconds). Under burst conditions two tasks both pass the check before
either has recorded its airtime, causing both to transmit and the duty-cycle
budget to be exceeded.
Fix: acquire _tx_lock after the sleep so delay timers still run concurrently
(matching firmware behaviour), then immediately re-check can_transmit() inside
the lock before sending. Because only one task holds the lock at a time,
airtime state is stable; check and record_tx() are effectively atomic — no
TOCTOU window. Airtime is recorded only on a successful send, so a radio
failure never inflates the budget.
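A sketch of the in-lock check, assuming a hypothetical `Repeater` class with a simple millisecond budget; the real `can_transmit()` and `record_tx()` signatures may differ.

```python
import asyncio


class Repeater:
    """Hypothetical reduction of the handler to the TX serialization logic."""

    def __init__(self, budget_ms: float):
        self._tx_lock = asyncio.Lock()
        self.budget_ms = budget_ms
        self.sent = []
        self.dropped = []

    def can_transmit(self, airtime_ms: float) -> bool:
        return airtime_ms <= self.budget_ms

    def record_tx(self, airtime_ms: float) -> None:
        self.budget_ms -= airtime_ms

    async def send_packet(self, name: str) -> None:
        await asyncio.sleep(0)  # yield, as real radio I/O would
        self.sent.append(name)

    async def delayed_send(self, name: str, delay: float, airtime_ms: float) -> None:
        # Delay timers run concurrently; no lock is held while sleeping.
        await asyncio.sleep(delay)
        async with self._tx_lock:
            # Check and record happen under the same lock: no TOCTOU window.
            if not self.can_transmit(airtime_ms):
                self.dropped.append(name)
                return
            await self.send_packet(name)
            self.record_tx(airtime_ms)  # only reached on a successful send
```

With a 150 ms budget and two 100 ms packets leaving their sleep together, one is sent and the other is dropped by the in-lock gate.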
Also move `import random` from inside _calculate_tx_delay to module level
(stdlib imports belong at the top; the lazy-import pattern is unnecessary here).
Docs: docs/pr_tx_serialization.md — problem statement, root-cause analysis,
alternative approaches considered, invariant table, full unit + field test plan,
and proof of correctness for the asyncio.Lock approach.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Added record_duplicate method to RepeaterHandler to log known duplicate packets without forwarding.
- Enhanced RepeaterDaemon to subscribe to raw packets for deduplication logging, ensuring all path variants are visible in the UI.
- Updated recent_packets management to group duplicates under their original packets for better tracking.
- Updated record_packet_only method to skip logging for TRACE packets, as TraceHelper manages trace paths.
- Enhanced documentation to clarify the handling of TRACE packets in the web UI.
- Added tests to ensure TRACE packets are not recorded, maintaining data integrity.
- Modified the forward and monitor TX modes and added a no_tx mode, allowing for flexible packet handling.
- Updated configuration and API endpoints to support the new modes.
- Adjusted logic in RepeaterHandler to manage packet processing based on the selected mode.
- Enhanced CLI commands to reflect the new mode settings.
- Added tests for each TX mode to ensure correct behavior.
- Functionality of Packet.apply_path_hash_mode and get_path_hashes
- Engine flood_forward and direct_forward with real multi-byte encoded packets
- PacketBuilder.create_trace payload structure and TraceHandler parsing
- Enforcement of max-hop boundaries per hash size
- Introduced `total_rx_airtime_ms` in `AirtimeManager` to track received packet airtime.
- Added `record_rx` method to log received airtime in `AirtimeManager`.
- Updated `RepeaterHandler` to count received packets and log RX airtime using the new method.
- Enhanced statistics reporting in `get_stats` to include total received airtime.
- Updated `ProtocolRequestHelper` to include total RX airtime in the RepeaterStats structure for better monitoring.
- Removed redundant original_path assignment in `RepeaterHandler` to streamline packet processing.
- Introduced `_is_direct_final_hop` helper method in `PacketRouter` to determine whether a direct-route packet with an empty path has reached its final destination.
- Updated comments in `PacketRouter` to clarify the handling of packets during routing, especially for direct forwarding scenarios.
- Adjusted logic to ensure packets are correctly processed or delivered based on their routing status, enhancing overall packet management.
- Introduced helper methods `_path_hash_display` and `_packet_record_src_dst` in `RepeaterHandler` to streamline path hash and source/destination hash extraction.
- Updated `record_packet` method to utilize a new `_build_packet_record` method for improved readability and maintainability.
- Enhanced `PacketRouter` comments for clarity on handling remote destinations and packet processing, ensuring better understanding of the routing logic.
- Introduced `record_packet_only` method in `RepeaterHandler` to log packets for UI/storage without forwarding or duplicate checks, specifically for injection-only types like ANON_REQ and ACK.
- Updated `PacketRouter` to call `_record_for_ui` method, ensuring that relevant metadata is recorded for packets processed by various handlers.
- Enhanced handling of packet metadata, including RSSI and SNR values, to improve the visibility of packet information in the web UI.
- Introduced `path_hash_mode` setting in `config.yaml.example` to specify the hash size for flood packets.
- Updated `ConfigManager` to re-apply the path hash mode when the mesh section changes, with validation for acceptable values (0, 1, 2).
- Enhanced `RepeaterDaemon` to set the default path hash mode during initialization, ensuring consistent handling of flood packets.
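A minimal sketch of the validation step. The function name `parse_path_hash_mode`, the default of 0, and the choice to raise `ValueError` (rather than fall back silently) are assumptions; only the setting name and the accepted values 0, 1, 2 come from the notes above.

```python
VALID_PATH_HASH_MODES = (0, 1, 2)


def parse_path_hash_mode(mesh_cfg: dict, default: int = 0) -> int:
    """Return a validated path_hash_mode from the mesh config section."""
    raw = mesh_cfg.get("path_hash_mode", default)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(raw, bool) or not isinstance(raw, int):
        raise ValueError(f"path_hash_mode must be an integer, got {raw!r}")
    if raw not in VALID_PATH_HASH_MODES:
        raise ValueError(
            f"path_hash_mode must be one of {VALID_PATH_HASH_MODES}, got {raw}"
        )
    return raw
```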
- Removed redundant call to mark_seen() for duplicate packets.
- Added validation to ensure hop count does not exceed the maximum limit before appending to the packet path.
- Updated logic to check for path size constraints when appending hash bytes, improving packet processing efficiency.
- Replaced list-based path storage with hash-based methods for original and forwarded paths, improving efficiency and consistency.
- Updated display logic to format path hashes correctly, ensuring compatibility with new hash size management.
- Adjusted local transmission handling to align with the new hash representation, enhancing clarity in packet processing.
- Updated the RepeaterHandler constructor to accept local_hash_bytes, improving path handling.
- Implemented checks to ensure packet paths do not exceed MAX_PATH_SIZE when appending hash bytes.
- Refactored direct_forward method to utilize local_hash_bytes for next hop validation and path manipulation.
- Adjusted path length encoding to accommodate changes in path management logic.
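The bounds checks can be sketched like this. The value 64 for `MAX_PATH_SIZE` and the helper names `max_hops` and `append_local_hash` are assumptions for illustration; the real constant and path handling live in the codebase.

```python
MAX_PATH_SIZE = 64  # assumed limit in bytes, for illustration only


def max_hops(hash_size: int) -> int:
    """Largest hop count that fits in the path for a given hash size."""
    if hash_size <= 0:
        raise ValueError("hash_size must be positive")
    return MAX_PATH_SIZE // hash_size


def append_local_hash(path: bytearray, local_hash_bytes: bytes) -> bool:
    """Append this node's hash to the path, or refuse if a limit would be exceeded."""
    hash_size = len(local_hash_bytes)
    hop_count = len(path) // hash_size if hash_size else 0
    if hop_count >= max_hops(hash_size):
        return False  # hop limit reached for this hash size
    if len(path) + hash_size > MAX_PATH_SIZE:
        return False  # appending would overflow the path buffer
    path.extend(local_hash_bytes)
    return True
```

Under these assumptions a 2-byte hash allows 32 hops; the second check also rejects a malformed path whose length is not a multiple of the hash size.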
- Create a new table for storing CRC errors in SQLite.
- Implement methods to store and retrieve CRC error counts and history.
- Update StorageCollector to record CRC errors and expose relevant methods.
- Enhance RepeaterHandler to track and record CRC error deltas from the radio hardware.
- Add API endpoints to fetch CRC error count and history.
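A rough sketch of the storage side using the standard `sqlite3` module. The table name `crc_errors`, its columns, and all function names are hypothetical; the actual schema is not described here.

```python
import sqlite3
import time


def init_crc_table(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS crc_errors ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "timestamp REAL NOT NULL, "
        "count INTEGER NOT NULL)"
    )


def store_crc_errors(conn: sqlite3.Connection, delta: int, timestamp=None) -> None:
    """Store the number of new CRC errors since the last radio poll."""
    if delta <= 0:
        return  # nothing new to record
    ts = time.time() if timestamp is None else timestamp
    conn.execute("INSERT INTO crc_errors (timestamp, count) VALUES (?, ?)", (ts, delta))
    conn.commit()


def get_crc_error_count(conn: sqlite3.Connection, since: float = 0.0) -> int:
    row = conn.execute(
        "SELECT COALESCE(SUM(count), 0) FROM crc_errors WHERE timestamp >= ?",
        (since,),
    ).fetchone()
    return row[0]


def get_crc_error_history(conn: sqlite3.Connection, limit: int = 100) -> list:
    """Most recent entries first, as (timestamp, count) tuples."""
    return conn.execute(
        "SELECT timestamp, count FROM crc_errors ORDER BY timestamp DESC LIMIT ?",
        (limit,),
    ).fetchall()
```

Storing deltas rather than the radio's running total keeps each row meaningful on its own and lets the count endpoint sum over any time window.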
- Updated the noise floor retrieval method to run in an executor, preventing blocking of the event loop during the KISS modem's command execution.
- This change enhances responsiveness by allowing the process to handle other tasks while waiting for the noise floor measurement.
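A sketch of the executor pattern, assuming a hypothetical blocking function `read_noise_floor_blocking` in place of the real KISS modem command; the sleep and the returned value are placeholders.

```python
import asyncio
import time


def read_noise_floor_blocking() -> float:
    """Hypothetical stand-in for the blocking modem command."""
    time.sleep(0.05)  # simulates waiting on the serial link
    return -112.0     # placeholder value in dBm


async def get_noise_floor() -> float:
    loop = asyncio.get_running_loop()
    # None selects the loop's default ThreadPoolExecutor; the event loop
    # keeps servicing other tasks while the worker thread blocks.
    return await loop.run_in_executor(None, read_noise_floor_blocking)
```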
- Improved local transmission handling by deferring local TX when duty cycle limits are exceeded, instead of dropping packets.
- Added LBT metadata extraction and logging for better monitoring of transmission attempts and delays.
- Refactored `schedule_retransmit` to support retrying local transmissions on failure, enhancing reliability.
- Introduced a lock in PacketRouter to serialize local TX operations, preventing race conditions during packet processing.
Refactor packet processing: use processed_packet for forwarding and drop reason checks
Fix: Update zero-hop determination logic in AdvertHelper
Fix: Clone packet in process_packet to prevent modification of the original
Fix: Import copy module for deep copying of packets in process_packet
Adds the following fields to the stats API response for MeshCore CLI parity:
- config.repeater.max_flood_hops: Max flood hops setting (for 'get flood.max')
- config.repeater.advert_interval_minutes: Local advert interval (for 'get advert.interval')
- config.delays.rx_delay_base: RX delay base setting (for 'get rxdelay')
These fields are already present in config.yaml but were not exposed via the
stats API, making them inaccessible to web dashboards and CLI tools that
communicate over HTTP.
This enables pyMC Console's Terminal to display these values without
requiring direct config file access.
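As an illustration of the added response fields, a sketch that assumes the loaded config is a nested dict mirroring the API paths above. The function name `config_fields_for_stats` is hypothetical, and missing keys are returned as None here purely for simplicity.

```python
def config_fields_for_stats(config: dict) -> dict:
    """Pick the three settings out of the loaded config for the stats response."""
    repeater = config.get("repeater", {})
    delays = config.get("delays", {})
    return {
        "config": {
            "repeater": {
                "max_flood_hops": repeater.get("max_flood_hops"),
                "advert_interval_minutes": repeater.get("advert_interval_minutes"),
            },
            "delays": {
                "rx_delay_base": delays.get("rx_delay_base"),
            },
        }
    }
```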
Co-Authored-By: Warp <agent@warp.dev>