Introduces a 'set format and forget' workflow for MQTT brokers. Users
reference a bundled preset by name inside the existing brokers: list,
and the package supplies the endpoints, audiences, and TLS settings.
Endpoint changes ship via 'pip install -U' instead of manual edits.
What changes
- New repeater/presets/ package with a tiny lazy YAML loader and two
bundled presets: waev (mqtt-{a,b}.waev.app) and letsmesh (EU + US).
- New format-family constant MC2MQTT_FORMATS = ('meshcoretomqtt',
'letsmesh', 'waev') replaces the inline tuple in topic resolution.
The legacy 'mqtt' format keeps its custom-topic semantics unchanged.
- Two-pass broker assembly in mqtt_handler.py: pass 1 expands every
{preset: <name>} entry inline; pass 2 collapses duplicates by name
with later-wins semantics. Place override entries AFTER preset
entries.
- Hard-coded LETSMESH_BROKERS constant deleted; its data now lives in
repeater/presets/letsmesh.yaml.
- convert_letsmesh_to_broker_config() collapsed from ~70 to ~25 lines
by emitting {preset: letsmesh} plus disable overrides for unwanted
brokers. Honors broker_index in (-1, 0, 1), additional_brokers, and
enabled flag exactly as before.
- update_mqtt_config API endpoint accepts {preset: <name>} entries and
passes them through unchanged so the web UI can author them when the
frontend is updated.
- config.yaml.example documents the preset entry shape, the override
rule, and the format family hierarchy.
- pyproject.toml ships presets/*.yaml as package data.
How to use
mqtt_brokers:
iata_code: "LAX"
brokers:
- preset: waev
# Override a single preset broker:
brokers:
- preset: waev
- name: waev-b
enabled: false
Tests
- tests/test_presets.py: 9 tests covering loader, expand/merge,
MC2MQTT topic-family parity, and parametrized legacy migration.
Co-Authored-By: Oz <oz-agent@warp.dev>
- Introduced options for using GPS coordinates for repeater location fields in config.
- Implemented precision control for GPS coordinates.
- Added a new API endpoint for a Server-Sent Events stream of GPS diagnostics.
- Updated GPSService to handle new configuration options and fallback logic.
- Enhanced unit tests for GPS location handling.
- Introduced `en_pin` and `en_pins` parameters in radio configuration.
- Updated `get_radio_for_board` to handle new configuration options.
- Added unit tests to verify correct handling of `en_pins`.
Addresses PR 191 reviewer feedback:
1. Shutdown drain
stop() now waits up to 5 s for in-flight _route_packet tasks to finish,
then cancels any that remain. Previously only the queue-consumer loop was
cancelled; created tasks were abandoned with no guarantee they completed.
Mechanism: _route_tasks set tracks live tasks (added on create, discarded
in the done-callback). stop() takes a snapshot and calls asyncio.wait()
with timeout=5.0, then cancels the still-pending subset.
2. Drop counter
_cap_drop_count increments each time a packet is dropped at the cap.
The running total is included in every WARNING log line and also printed
at shutdown so operators can tell at a glance whether the safety valve is
actually firing in production.
3. Tests (tests/test_packet_router.py)
test_cap_drops_packets_when_full — cap=3, send 8 → 5 drops, 3 in-flight
test_cap_drop_count_increments — count increments by 1 per drop
test_cap_drop_count_zero_... — count stays 0 when cap never reached
test_stop_waits_for_in_flight_tasks — slow task (0.2 s) completes, not cancelled
test_stop_cancels_tasks_...timeout — hanging task cancelled after timeout
test_route_tasks_set_cleaned_up — set empty after all tasks finish
test_counter_matches_set_size — _in_flight == len(_route_tasks) at cap
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Reviewer concern (PR 190):
The 1-second backoff sleep for local_transmission retry happened inside
`async with self._tx_lock`, blocking all other queued TX tasks for the
full second — hurting latency and throughput under load.
Fix — tighten lock scope to one attempt per acquisition:
Before: acquire lock → [attempt 0 → sleep(1) → attempt 1] → release
After: for each attempt:
[sleep(1) if retry] ← OUTSIDE the lock
acquire lock
re-check can_transmit ← fresh check every acquisition
attempt single send
record_tx on success
release lock
The duty-cycle gate now runs on every lock acquisition (not just the first),
which is correct: airtime state may change during the backoff sleep.
Tests added (tests/test_tx_lock.py):
1. test_concurrent_sends_do_not_interleave — two tasks racing to the same
delay timer must never overlap inside send_packet.
2. test_duty_cycle_toctou_is_fixed — second packet is dropped when the
first consumes the budget inside the lock.
3. test_local_retry_releases_lock_during_backoff — a concurrent relayed
packet fires at ~0.1s while local retry sleeps 1s; confirms it is not
blocked by the backoff.
4. test_non_local_failure_propagates — relayed send failure raises
immediately with exactly one attempt.
5. test_duty_cycle_rechecked_on_retry — if the budget is exhausted during
backoff, the retry is dropped by the in-lock gate (not sent).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Merged this now. It’s a safe change with no behavioural impact, and it removes unnecessary work in the hot paths when DEBUG logging is off. Happy to revisit if we want to standardise on lazy formatting later, but this gives us an immediate win.
Problem
-------
update_packet_metrics() called rrdtool.info() (cached for 5 s) to get the
RRD's last_update timestamp. rrdtool.info() returns a massive Python dict:
17 data sources × 5 RRAs × ~8 fields each = ~700+ dict entries per call.
tracemalloc showed +10696 new allocations / +251 KB at this exact line,
flagged as "Investigate" in the memory diagnostics dashboard.
The rrdtool.info() approach was also unnecessarily complex: it required a
5-second secondary cache, a _pending_rrd_update buffer, and two extra
instance attributes — all to answer one question ("did we already write
this period?") that we can answer ourselves with a single integer.
Fix
---
Replace _last_rrd_info_cache / _last_rrd_info_time / _pending_rrd_update
with a single self._last_rrd_update: int = 0 that stores the timestamp of
the last successful rrdtool.update() call. The throttle check becomes:
if timestamp <= self._last_rrd_update:
return
On success: self._last_rrd_update = timestamp
Zero dict allocations per call. The only downside vs rrdtool.info() is
that _last_rrd_update resets to 0 on process restart, meaning the first
packet after a restart always triggers a write — correct behaviour.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Problem
-------
rrdtool.fetch() is a blocking C library call that reads 24 hours of RRD
data from disk. The dashboard can call get_data() on every page refresh.
On an SD card each fetch can cost several milliseconds of I/O, and because
the RRD step is 60 seconds the data cannot change more often than that —
any fetch within the same 60-second window returns identical data.
The combined-optimizations branch had a 60-second read cache; rightup's
batching refactor inadvertently removed it. This PR restores it.
Solution
--------
* Add self._get_data_cache: tuple = (0.0, None) to __init__
* In get_data(): set use_cache = (start_time is None and end_time is None)
- if use_cache and cache is < 60 s old: return cached result immediately
- after a successful live fetch with use_cache: store (now, result)
* Explicit start_time / end_time callers always bypass the cache so
fine-grained or historical queries are never stale
Why 60 s TTL?
The RRD step is 60 s, so the database cannot hold a newer sample until
the next step boundary. A 60-second cache is tight enough that the
dashboard always shows data ≤ one step stale, and loose enough that
a burst of refreshes costs one disk read instead of N.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>