Commit Graph

1 Commits

Author SHA1 Message Date
TJ Downes
fdbc85c926 fix: serialise radio TX and close duty-cycle TOCTOU race
Add self._tx_lock (asyncio.Lock) to RepeaterHandler and acquire it inside
delayed_send after the per-packet sleep completes.

Problem 1 — radio interleave: concurrent delayed_send coroutines (one per
queued packet) could both exit their sleep at nearly the same moment and call
dispatcher.send_packet simultaneously, interleaving SPI/serial register writes
to the half-duplex LoRa radio.

Problem 2 — TOCTOU gap: the upfront can_transmit() check in __call__ and the
record_tx() call in delayed_send are separated by the entire TX delay (up to
several seconds).  Under burst conditions two tasks both pass the check before
either has recorded its airtime, causing both to transmit and the duty-cycle
budget to be exceeded.

Fix: acquire _tx_lock after the sleep so delay timers still run concurrently
(matching firmware behaviour), then immediately re-check can_transmit() inside
the lock before sending.  Because only one task holds the lock at a time,
airtime state is stable; check and record_tx() are effectively atomic — no
TOCTOU window.  Airtime is recorded only on a successful send, so a radio
failure never inflates the budget.

Also move `import random` from inside _calculate_tx_delay to module level
(stdlib imports belong at the top; the lazy-import pattern is unnecessary here).

Docs: docs/pr_tx_serialization.md — problem statement, root-cause analysis,
alternative approaches considered, invariant table, full unit + field test plan,
and proof of correctness for the asyncio.Lock approach.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 18:37:56 -07:00