Commit Graph

579 Commits

Author SHA1 Message Date
Lloyd
9eae6ed872 Merge pull request #193 from tjdownes/perf/sqlite-wal-threadlocal
perf: thread-local SQLite connections, synchronous=NORMAL, dedup indexes
2026-04-22 11:00:50 +01:00
Lloyd
986e22de1f Merge pull request #195 from tjdownes/perf/advert-deque
This is a solid, well-analyzed optimization, thank you
2026-04-22 10:45:46 +01:00
Lloyd
af79eaf63f Merge pull request #192 from tjdownes/perf/hash-once
perf: compute packet hash once per packet in the forwarding hot path
2026-04-22 10:38:04 +01:00
Lloyd
5c947e6c2e feat: enhance WebSocket handling and add throttling for stats broadcasting 2026-04-22 09:48:27 +01:00
Lloyd
40ec2ba293 Merge pull request #194 from tjdownes/perf/debug-log-guards
Merged this now. It’s a safe change with no behavioural impact, and it removes unnecessary work in the hot paths when DEBUG logging is off. Happy to revisit if we want to standardise on lazy formatting later, but this gives us an immediate win.
2026-04-22 09:45:55 +01:00
Lloyd
96b3daf6e8 Merge pull request #196 from tjdownes/perf/rrdtool-batch
perf(rrdtool): cache get_data() result for 60 s to avoid repeated disk reads
2026-04-22 09:35:31 +01:00
Lloyd
0a77fe67ce feat: reapply ui changes from PR 2026-04-22 08:39:15 +01:00
Lloyd
db41080dea Merge pull request #187 from Rigear/feat/mqtt_merge
Feat/mqtt merge
2026-04-22 08:37:22 +01:00
Rigear
f50919858d fix: Force merged web assets from fix-perform-speed branch to fix bad merge of the files 2026-04-21 21:22:02 -07:00
Rigear
c7b2b02316 fix: Fixed extra topic publishing to letsmesh 2026-04-21 21:21:13 -07:00
Rigear
d318334288 Merge remote-tracking branch 'origin/fix-perform-speed' into feat/mqtt_merge 2026-04-21 20:59:42 -07:00
TJ Downes
d592af6e19 fix(rrdtool): replace rrdtool.info() with self-tracked timestamp to eliminate allocation storm
Problem
-------
update_packet_metrics() called rrdtool.info() (cached for 5 s) to get the
RRD's last_update timestamp.  rrdtool.info() returns a massive Python dict:
17 data sources × 5 RRAs × ~8 fields each = ~700+ dict entries per call.
tracemalloc showed +10696 new allocations / +251 KB at this exact line,
flagged as "Investigate" in the memory diagnostics dashboard.

The rrdtool.info() approach was also unnecessarily complex: it required a
5-second secondary cache, a _pending_rrd_update buffer, and two extra
instance attributes — all to answer one question ("did we already write
this period?") that we can answer ourselves with a single integer.

Fix
---
Replace _last_rrd_info_cache / _last_rrd_info_time / _pending_rrd_update
with a single self._last_rrd_update: int = 0 that stores the timestamp of
the last successful rrdtool.update() call.  The throttle check becomes:

    if timestamp <= self._last_rrd_update:
        return

On success: self._last_rrd_update = timestamp

Zero dict allocations per call.  The only downside vs rrdtool.info() is
that _last_rrd_update resets to 0 on process restart, meaning the first
packet after a restart always triggers a write — correct behaviour.
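
A minimal sketch of the self-tracked throttle described above, not the project's actual code. The `MetricsWriter` class and the injected `update_fn` (standing in for `rrdtool.update()`) are hypothetical, chosen so the example runs without rrdtool installed.

```python
from typing import Callable


class MetricsWriter:
    """Hypothetical stand-in for the class that owns update_packet_metrics()."""

    def __init__(self, update_fn: Callable[[str], None]) -> None:
        # update_fn is an assumption: it plays the role of rrdtool.update().
        self._update_fn = update_fn
        # Resets to 0 on restart, so the first packet always triggers a write.
        self._last_rrd_update: int = 0

    def update_packet_metrics(self, timestamp: int, values: str) -> bool:
        # Throttle check: skip if this period was already written.
        if timestamp <= self._last_rrd_update:
            return False
        self._update_fn(f"{timestamp}:{values}")
        # Only reached when the update did not raise.
        self._last_rrd_update = timestamp
        return True
```

Because the timestamp is recorded only after the update call returns, a failed write leaves the throttle untouched and the next packet retries.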

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 20:50:27 -07:00
TJ Downes
fdd788212d perf(rrdtool): cache get_data() result for 60 s to avoid repeated disk reads
Problem
-------
rrdtool.fetch() is a blocking C library call that reads 24 hours of RRD
data from disk.  The dashboard can call get_data() on every page refresh.
On an SD card each fetch can cost several milliseconds of I/O, and because
the RRD step is 60 seconds the data cannot change more often than that —
any fetch within the same 60-second window returns identical data.

The combined-optimizations branch had a 60-second read cache; Rightup's
batching refactor inadvertently removed it.  This PR restores it.

Solution
--------
* Add self._get_data_cache: tuple = (0.0, None) to __init__
* In get_data(): set use_cache = (start_time is None and end_time is None)
  - if use_cache and cache is < 60 s old: return cached result immediately
  - after a successful live fetch with use_cache: store (now, result)
* Explicit start_time / end_time callers always bypass the cache so
  fine-grained or historical queries are never stale

Why 60 s TTL?
The RRD step is 60 s, so the database cannot hold a newer sample until
the next step boundary.  A 60-second cache is tight enough that the
dashboard always shows data ≤ one step stale, and loose enough that
a burst of refreshes costs one disk read instead of N.
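
The caching rule above could look roughly like this. It is a sketch under assumptions: `CachedReader`, the injected `fetch_fn` (standing in for the `rrdtool.fetch()` path) and the injectable `clock` are hypothetical names, and the real code may use a different time source.

```python
import time
from typing import Any, Callable, Optional, Tuple


class CachedReader:
    CACHE_TTL = 60.0  # seconds; matches the 60 s RRD step

    def __init__(self, fetch_fn: Callable[[Any, Any], Any],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch_fn = fetch_fn  # assumption: wraps the blocking fetch
        self._clock = clock        # injectable so the TTL can be tested
        self._get_data_cache: Tuple[float, Optional[Any]] = (0.0, None)

    def get_data(self, start_time=None, end_time=None):
        # Only the default "last 24 h" query is cacheable.
        use_cache = start_time is None and end_time is None
        now = self._clock()
        if use_cache:
            cached_at, cached = self._get_data_cache
            if cached is not None and now - cached_at < self.CACHE_TTL:
                return cached
        result = self._fetch_fn(start_time, end_time)
        # Store only successful default-range fetches.
        if use_cache and result is not None:
            self._get_data_cache = (now, result)
        return result
```

Callers that pass an explicit `start_time` or `end_time` neither read nor refresh the cache, so historical queries are never served stale data.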

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 19:55:38 -07:00
TJ Downes
c52ae53cc6 perf(advert): replace list with deque for _recent_drops; use islice for _known_neighbors cap
Problem 1 — _recent_drops: the list was evicted with pop(0), which is an
O(n) memmove every time a drop is recorded.  With maxlen=20 this is
negligible today, but pop(0) on a list is always O(n) and the pattern is
worth eliminating.

Problem 2 — _known_neighbors cap: the eviction path did
  set(list(self._known_neighbors)[500:])
which first materialises the entire set as a list (O(n) allocation) before
slicing.  itertools.islice works directly on the set iterator and only
allocates the 500 kept items, halving peak memory pressure during cleanup.

Changes:
* Import itertools (previously absent from this file)
* Import deque from collections alongside OrderedDict
* self._recent_drops initialised as deque(maxlen=20); self._max_recent_drops
  removed (maxlen is the single source of truth)
* Drop-recording block: rebuild deque from generator (preserves pubkey dedup
  filter) then append — automatic eviction replaces the explicit pop(0) guard
* Known-neighbors cap: itertools.islice(self._known_neighbors, 500) replaces
  list(self._known_neighbors)[500:]
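
A small sketch of the two patterns listed above. The helper names `record_drop` and `cap_neighbors` and the dict shape of a drop record are assumptions made for illustration; the real code works on instance attributes.

```python
from collections import deque
from itertools import islice


def record_drop(recent_drops: deque, drop: dict) -> deque:
    """Rebuild the deque without older entries for the same pubkey, then append."""
    filtered = deque(
        (d for d in recent_drops if d["pubkey"] != drop["pubkey"]),
        maxlen=recent_drops.maxlen,
    )
    # With maxlen set, append() evicts the oldest entry automatically.
    filtered.append(drop)
    return filtered


def cap_neighbors(known: set, limit: int = 500) -> set:
    """Keep at most `limit` items without materialising the whole set as a list."""
    if len(known) <= limit:
        return known
    # islice consumes only `limit` items from the set iterator.
    return set(islice(known, limit))
```

Set iteration order is arbitrary, so which 500 neighbours survive the cap is not defined by this sketch.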

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 19:52:44 -07:00
TJ Downes
c0163ce897 perf: guard hot-path debug log f-strings with isEnabledFor(DEBUG)
Python evaluates f-string arguments before calling logger.debug(), so in
production (INFO level) every debug log call in the hot path still paid the
cost of string formatting even though the output was discarded.

The most expensive sites are in __call__ (runs on every received packet):
  - "RX packet: header=0x{...}, payload_len=..., path_len=..., rssi=..., snr=..."
  - "Packet header=0x{...}, type=..., route=..."

And in _calculate_tx_delay (runs on every forwarded packet):
  - "Route=FLOOD/DIRECT, len=...B, airtime=...ms, delay=...s"
  - "Congestion detected, score=..., delay multiplier=..."

Plus transport code and local-TX debug logs (less frequent but same issue).

Fix: wrap each f-string logger.debug() call with
  if logger.isEnabledFor(logging.DEBUG):
so the f-string is never constructed when debug logging is disabled.

logger.isEnabledFor() is a pure in-memory integer comparison — essentially
free at runtime.  In production at INFO level this eliminates string
concatenation, attribute lookups (packet.header, len(packet.payload), etc.),
and format operations on every forwarded packet.

Eight call sites guarded; no logic changes.
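
For illustration, a guarded call site might look like the sketch below. The logger name, the `Packet` class and the `log_rx` helper are hypothetical; the boolean return value exists only to make the behaviour observable.

```python
import logging

logger = logging.getLogger("repeater.sketch")  # hypothetical logger name


class Packet:
    """Minimal stand-in for the real packet type."""

    def __init__(self, header: int, payload: bytes) -> None:
        self.header = header
        self.payload = payload


def log_rx(packet: Packet, rssi: int, snr: float) -> bool:
    # The f-string below is built only when DEBUG is enabled.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug(
            f"RX packet: header=0x{packet.header:02x}, "
            f"payload_len={len(packet.payload)}, rssi={rssi}, snr={snr}"
        )
        return True
    return False
```

The alternative mentioned in the merge comment, lazy `%`-style formatting via `logger.debug("...%s", arg)`, defers the string formatting but still evaluates the arguments themselves.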

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 19:46:11 -07:00
TJ Downes
3397d972ce perf: thread-local SQLite connections, synchronous=NORMAL, dedup indexes
Five targeted changes to sqlite_handler.py, all in the same file.

1. Thread-local persistent connections
   _connect() previously opened a new sqlite3.connect() on every DB call and
   ran journal_mode + busy_timeout PRAGMAs each time.  On SD-card storage each
   connection open involves file-system operations; each PRAGMA is a round-trip.
   threading.local() now caches one connection per thread (write executor thread
   + event-loop/HTTP threads), eliminating per-call setup overhead.

2. PRAGMA synchronous=NORMAL
   Default synchronous=FULL syncs the WAL to disk after every transaction.
   In WAL mode, NORMAL syncs only at checkpoints.  The database stays consistent
   on power failure, but transactions committed since the last checkpoint may be
   rolled back, which is acceptable for this workload.  It is significantly
   faster on SD cards, which have slow fsync (5-20ms per flush).

3. Migration 8: UNIQUE index on companion_messages(companion_hash, packet_hash)
   companion_push_message previously deduped via SELECT + INSERT (two statements,
   two SD-card reads per message).  The new UNIQUE index enables INSERT OR IGNORE,
   replacing the round-trip with a single atomic statement.

4. Migration 9: UNIQUE index on adverts(pubkey)
   Without this index store_advert's ON CONFLICT clause cannot fire and each
   advert inserts a new row instead of updating the existing one — unbounded
   table growth on busy meshes.  The migration deduplicates existing rows
   (keeping the most-recently-seen per pubkey) before adding the index.

5. Remove duplicate get_unsynced_count definition
   The method was defined twice with the same signature.  Python silently uses
   the last definition; the first was dead code with reversed SQL parameter
   binding order.  Removed the first; added a note to the surviving definition.
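
Changes 1 to 4 can be sketched together as follows. This is an illustration under assumptions, not the contents of sqlite_handler.py: the function names, the reduced table columns (`body`, `last_seen`), the index names and the 5000 ms busy timeout are all hypothetical.

```python
import sqlite3
import threading

_local = threading.local()  # one connection cache per thread


def get_connection(db_path: str) -> sqlite3.Connection:
    """Return this thread's cached connection, opening it on first use."""
    conns = getattr(_local, "conns", None)
    if conns is None:
        conns = _local.conns = {}
    conn = conns.get(db_path)
    if conn is None:
        conn = sqlite3.connect(db_path)
        # PRAGMAs run once per connection instead of once per call.
        conn.execute("PRAGMA journal_mode=WAL")
        conn.execute("PRAGMA synchronous=NORMAL")
        conn.execute("PRAGMA busy_timeout=5000")  # assumed value
        conns[db_path] = conn
    return conn


def create_schema(conn: sqlite3.Connection) -> None:
    # Simplified tables; the UNIQUE indexes mirror migrations 8 and 9.
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS companion_messages (
            companion_hash TEXT, packet_hash TEXT, body TEXT);
        CREATE UNIQUE INDEX IF NOT EXISTS idx_companion_dedup
            ON companion_messages(companion_hash, packet_hash);
        CREATE TABLE IF NOT EXISTS adverts (pubkey TEXT, last_seen INTEGER);
        CREATE UNIQUE INDEX IF NOT EXISTS idx_adverts_pubkey
            ON adverts(pubkey);
    """)


def push_message(conn, companion_hash: str, packet_hash: str, body: str) -> bool:
    """Insert once; a duplicate (companion_hash, packet_hash) is ignored."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO companion_messages "
        "(companion_hash, packet_hash, body) VALUES (?, ?, ?)",
        (companion_hash, packet_hash, body),
    )
    conn.commit()
    return cur.rowcount == 1


def store_advert(conn, pubkey: str, last_seen: int) -> None:
    """Upsert: the ON CONFLICT target needs the UNIQUE index on pubkey."""
    conn.execute(
        "INSERT INTO adverts (pubkey, last_seen) VALUES (?, ?) "
        "ON CONFLICT(pubkey) DO UPDATE SET last_seen = excluded.last_seen",
        (pubkey, last_seen),
    )
    conn.commit()
```

The `ON CONFLICT ... DO UPDATE` form requires SQLite 3.24 or newer.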

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 19:41:50 -07:00
TJ Downes
4e16fd040d perf: compute packet hash once per packet in the forwarding hot path
Before this change, calculate_packet_hash() (SHA-256 + hex + upper) was called
3 times per forwarded packet and 4 times per dropped packet:
  __call__              → pkt_hash_full = packet.calculate_packet_hash()   #1
  → flood/direct_forward → is_duplicate → calculate_packet_hash()          #2
  → flood/direct_forward → mark_seen    → calculate_packet_hash()          #3
  (drop) → _get_drop_reason → is_duplicate → calculate_packet_hash()       #4

pkt_hash_full was computed in __call__ but never threaded down into
process_packet, flood_forward, direct_forward, is_duplicate, or _get_drop_reason.
Each method recomputed it independently.

Fix: add optional packet_hash: Optional[str] = None to is_duplicate,
_get_drop_reason, flood_forward, direct_forward, and process_packet.  Pass
pkt_hash_full from __call__ through the chain.  Each method uses the provided
hash or falls back to computing it — preserving backward compatibility for
external callers (TraceHelper, etc.) that have no pre-computed hash.

Result: 1 SHA-256 computation per packet in the hot path regardless of whether
the packet is forwarded or dropped.

Also adds explicit INVARIANT docstrings to flood_forward, direct_forward, and
is_duplicate documenting that these methods must remain synchronous (no await).
The is_duplicate + mark_seen pair is atomic within the asyncio event loop; adding
an await between them would allow two concurrent tasks to both pass the duplicate
check for the same packet — forwarding it twice.

Docs: docs/pr_hash_once.md — problem analysis, call-chain diagram, per-method
diffs, quantification (~3-8 µs saved per packet), test plan (including hash-count
assertion), and proof that passing the original's hash to the deep-copied packet
is correct.
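
The hash-threading pattern can be sketched as below. The `Packet` and `Forwarder` classes are simplified stand-ins, the `hash_calls` counter exists only to make the saving visible, and giving `mark_seen` the hash string directly is an assumption of this sketch.

```python
import hashlib
from typing import Optional


class Packet:
    def __init__(self, payload: bytes) -> None:
        self.payload = payload
        self.hash_calls = 0  # instrumentation for this sketch only

    def calculate_packet_hash(self) -> str:
        self.hash_calls += 1
        return hashlib.sha256(self.payload).hexdigest().upper()


class Forwarder:
    def __init__(self) -> None:
        self._seen: set = set()

    def is_duplicate(self, packet: Packet, packet_hash: Optional[str] = None) -> bool:
        """INVARIANT: must stay synchronous (no await)."""
        # Fall back to computing the hash for callers that did not supply one.
        h = packet_hash or packet.calculate_packet_hash()
        return h in self._seen

    def mark_seen(self, packet_hash: str) -> None:
        self._seen.add(packet_hash)

    def flood_forward(self, packet: Packet, packet_hash: Optional[str] = None) -> bool:
        """INVARIANT: must stay synchronous (no await)."""
        h = packet_hash or packet.calculate_packet_hash()
        # Check and mark run back to back, with no await between them.
        if self.is_duplicate(packet, h):
            return False
        self.mark_seen(h)
        return True

    def __call__(self, packet: Packet) -> bool:
        pkt_hash_full = packet.calculate_packet_hash()  # the only hash per packet
        return self.flood_forward(packet, pkt_hash_full)
```

External callers that have no pre-computed hash can still call `is_duplicate(packet)` and pay for one computation, which is the backward-compatible path the commit describes.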

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 19:28:45 -07:00
Lloyd
c82f0cfce6 feat: add UI WebSocket teardown. 2026-04-21 14:47:18 +01:00
Lloyd
be56e919fd feat: add server-side airtime bucket aggregation for optimized chart rendering 2026-04-21 14:46:30 +01:00
Lloyd
81a3b70415 feat: implement graceful shutdown handling and version cache optimizations 2026-04-21 12:07:08 +01:00
Lloyd
9797e08421 feat: implement background scheduling for deferred network publishing tasks, tidy shutdown process 2026-04-21 10:07:15 +01:00
Lloyd
3df4b03fd9 feat: implement deferred network publishing for packets, adverts, and noise floor records 2026-04-21 09:49:12 +01:00
Lloyd
c5fd41f28a feat: enhance task management in handlers with tracking and error logging 2026-04-21 09:38:03 +01:00
Lloyd
1883bc47be refactor: centralize database connection handling with WAL mode and busy timeout 2026-04-20 16:17:34 +01:00
Lloyd
b26ebeb807 fix: optimize memory tracing by reducing overhead and filtering snapshots 2026-04-20 16:04:19 +01:00
Lloyd
68a461b965 feat: add memory debug to gui 2026-04-20 15:41:27 +01:00
Lloyd
5eb1fc47ca feat: add memory_debug endpoint for memory leak diagnostics and improve SSL context handling for GitHub requests 2026-04-20 14:51:48 +01:00
Rigear
096c5a8f07 fix: Do not connect a disabled broker 2026-04-19 22:18:35 -07:00
Rigear
11f749e0e9 fix: Initialize tls_verified and properly handle when mqtt_broker is None 2026-04-19 18:04:36 -07:00
Rightup
799a85ddf9 fix: remove --no-index from R2 pip install so pyyaml resolves from PyPI 2026-04-19 19:34:26 +01:00
Rigear
093ebc2873 feat: Web assets build after rebasing from dev 2026-04-18 20:53:39 -07:00
Rigear
2e1d19ab80 Merge remote-tracking branch 'origin/dev' into feat/mqtt_merge
# Conflicts:
#	config.yaml.example
#	repeater/data_acquisition/__init__.py
#	repeater/data_acquisition/storage_collector.py
#	repeater/web/html/assets/CADCalibration-319vQEzv.js
#	repeater/web/html/assets/CADCalibration-Cwr0Kq49.js
#	repeater/web/html/assets/CADCalibration-DWusgblB.js
#	repeater/web/html/assets/Companions-DU19yZyB.js
#	repeater/web/html/assets/Companions-cufpceKN.js
#	repeater/web/html/assets/Companions-zmTexa6a.js
#	repeater/web/html/assets/Configuration-BmDpq7bV.js
#	repeater/web/html/assets/ConfirmDialog-BafURQpE.js
#	repeater/web/html/assets/ConfirmDialog-C9Yf394V.js
#	repeater/web/html/assets/ConfirmDialog-h2bJ_WKJ.js
#	repeater/web/html/assets/Dashboard-CnQfG826.js
#	repeater/web/html/assets/Login-BDsVY-me.js
#	repeater/web/html/assets/Logs-BpG7T8_d.js
#	repeater/web/html/assets/Logs-CVZ1ZqH8.js
#	repeater/web/html/assets/Logs-sxcWuUjs.js
#	repeater/web/html/assets/MessageDialog-B-qWtO0z.js
#	repeater/web/html/assets/MessageDialog-Cp4W1enq.js
#	repeater/web/html/assets/MessageDialog-D2OlpbZ7.js
#	repeater/web/html/assets/Neighbors-BAwKrJdF.js
#	repeater/web/html/assets/Neighbors-BamkiPcU.js
#	repeater/web/html/assets/Neighbors-WHAK_7hU.js
#	repeater/web/html/assets/RoomServers-DbCgmJ6x.js
#	repeater/web/html/assets/RoomServers-i32N0iwv.js
#	repeater/web/html/assets/RoomServers-o3kDed-S.js
#	repeater/web/html/assets/Sessions-B8ZVRIGt.js
#	repeater/web/html/assets/Sessions-B9uqWGaO.js
#	repeater/web/html/assets/Sessions-O3vBapMM.js
#	repeater/web/html/assets/Setup-DyJMgh0L.js
#	repeater/web/html/assets/Statistics-BbiQtXdu.js
#	repeater/web/html/assets/Statistics-CeTg6NYy.js
#	repeater/web/html/assets/Statistics-QSH8GjMX.js
#	repeater/web/html/assets/SystemStats-B7qxcRYp.js
#	repeater/web/html/assets/SystemStats-BmXJQonl.js
#	repeater/web/html/assets/SystemStats-DVaA1ybj.js
#	repeater/web/html/assets/Terminal-CUqcF84y.js
#	repeater/web/html/assets/Terminal-D1kRkrmc.js
#	repeater/web/html/assets/Terminal-Dq6FyjMj.js
#	repeater/web/html/assets/api-CiSov_eM.js
#	repeater/web/html/assets/api-DegLD39Y.js
#	repeater/web/html/assets/api-DjLVJkR1.js
#	repeater/web/html/assets/index-cutq4vvY.js
#	repeater/web/html/assets/packets-Bg0pkGLO.js
#	repeater/web/html/assets/packets-CPLd89q8.js
#	repeater/web/html/assets/packets-DmoWuBlc.js
#	repeater/web/html/assets/system-Bocs8bSU.js
#	repeater/web/html/assets/system-CsY7_jKa.js
#	repeater/web/html/assets/system-qCwV23PE.js
#	repeater/web/html/assets/useSignalQuality-DQTATYAm.js
#	repeater/web/html/assets/useSignalQuality-DlXA7j0p.js
#	repeater/web/html/assets/useSignalQuality-u0_rDpC6.js
#	repeater/web/html/index.html
2026-04-18 20:25:30 -07:00
Rigear
92f9fe77ae fix: user/pass not loading from config 2026-04-18 20:15:58 -07:00
Rightup
dfe9ba20f3 Fix R2 wheels installation path for improved dependency resolution 2026-04-18 23:15:34 +01:00
Rightup
d336c72625 Enhance installation process with R2 wheels support for ARM devices 2026-04-18 23:15:13 +01:00
Lloyd
083ad2bc7a Merge pull request #184 from zindello/feat/luckfoxInstallSupport 2026-04-18 13:00:09 +01:00
Joshua Mesilane
a9590fac01 Fix the headless install option 2026-04-18 21:09:01 +10:00
Lloyd
8f2888f2d5 Merge pull request #183 from zindello/feat/luckfoxInstallSupport
Fix for polkit version detection
2026-04-18 09:05:12 +01:00
Joshua Mesilane
7ba26b72cb Fix for polkit version detection 2026-04-18 17:39:35 +10:00
Lloyd
56e5a93699 Merge pull request #182 from zindello/feat/luckfoxInstallSupport 2026-04-18 08:32:30 +01:00
Joshua Mesilane
8ebcb09eff Headless install fix 2026-04-18 17:12:06 +10:00
Joshua Mesilane
62d6627fab Fix readme 2026-04-18 17:09:26 +10:00
Joshua Mesilane
4e3b2bbc9a Updates to support installs on the LuckFox platform 2026-04-18 16:50:44 +10:00
Rigear
d6681ab407 feat: Update UI files from fc223397df8e5681e886752b279bc25ed34938b8 hash in Rigear/pyMC-RepeaterUI 2026-04-17 21:12:15 -07:00
Rigear
3f09e910d9 fix(QOL): reordered mqtt yaml config so names are first. 2026-04-17 21:09:23 -07:00
Rigear
79d40afc71 fix: Force TLS when loading in existing Letsmesh configs from yaml 2026-04-17 21:08:43 -07:00
Rightup
9442c51225 feat: update logo in ui 2026-04-17 23:51:43 +01:00
Rightup
ffaaa76ea0 feat: add glass to repeater. 2026-04-17 23:51:04 +01:00
Rigear
f641761b05 feat: UI updated from 4a24b6d2c7 2026-04-16 14:54:30 -07:00
Rigear
6d133efdbe fix: If we're using websockets, default to tls enabled = true if we're using port 443 2026-04-16 13:23:46 -07:00