perf(rrdtool): cache get_data() result for 60 s to avoid repeated disk reads

Problem
-------
rrdtool.fetch() is a blocking C library call that reads 24 hours of RRD
data from disk.  The dashboard can call get_data() on every page refresh.
On an SD card each fetch can cost several milliseconds of I/O, and because
the RRD step is 60 seconds the data cannot change more often than that —
any fetch within the same 60-second window returns identical data.
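
That step-alignment argument can be sanity-checked in plain Python. This is a sketch; last_step_boundary is an illustrative helper, not part of the handler:

```python
# RRD consolidates samples on fixed step boundaries. Any two fetch times
# that land inside the same 60-second step see the same newest complete
# row, which is why a one-step cache can never serve wrong data.
STEP = 60  # seconds, matching the RRD step

def last_step_boundary(ts: int) -> int:
    # Newest step boundary at or before ts: the freshest data RRD can hold.
    return (ts // STEP) * STEP

# Two refreshes inside the same step window resolve to identical data:
assert last_step_boundary(1_000_020) == last_step_boundary(1_000_079)
# Only crossing a step boundary exposes a newer sample:
assert last_step_boundary(1_000_080) > last_step_boundary(1_000_079)
```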

The combined-optimizations branch had a 60-second read cache; rightup's
batching refactor inadvertently removed it.  This PR restores it.

Solution
--------
* Add self._get_data_cache: tuple = (0.0, None) to __init__
* In get_data(): set use_cache = (start_time is None and end_time is None)
  - if use_cache and cache is < 60 s old: return cached result immediately
  - after a successful live fetch with use_cache: store (now, result)
* Explicit start_time / end_time callers always bypass the cache so
  fine-grained or historical queries are never stale
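
The control flow in those bullets can be sketched in isolation. Names like RRDReader, _fetch_from_disk, and the injectable clock are illustrative stand-ins; the real handler wraps rrdtool.fetch():

```python
import time

class RRDReader:
    """Sketch of the read-side cache described above. _fetch_from_disk
    stands in for the blocking rrdtool.fetch() call so the control flow
    (cache hit, cache store, explicit-bounds bypass) is observable."""

    TTL = 60.0  # seconds, matching the RRD step

    def __init__(self, clock=time.time):
        self._clock = clock                  # injectable for tests
        self._get_data_cache = (0.0, None)   # (fetched_at, result)
        self.disk_reads = 0                  # instrumentation for the demo

    def _fetch_from_disk(self, start_time, end_time):
        self.disk_reads += 1
        return {"start": start_time, "end": end_time}

    def get_data(self, start_time=None, end_time=None):
        now = self._clock()
        # Only the default full-window call is cacheable; explicit bounds
        # always hit the disk so historical queries are never stale.
        use_cache = start_time is None and end_time is None
        if use_cache:
            fetched_at, result = self._get_data_cache
            if now - fetched_at < self.TTL and result is not None:
                return result
        if end_time is None:
            end_time = int(now)
        if start_time is None:
            start_time = end_time - 24 * 3600
        result = self._fetch_from_disk(start_time, end_time)
        if use_cache:
            self._get_data_cache = (now, result)
        return result
```

With this shape, a burst of default get_data() calls costs one disk read, while any call with explicit bounds reads unconditionally and leaves the cache untouched.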

Why 60 s TTL?
-------------
The RRD step is 60 s, so the database cannot hold a newer sample until
the next step boundary.  A 60-second cache is tight enough that the
dashboard always shows data ≤ one step stale, and loose enough that
a burst of refreshes costs one disk read instead of N.
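
Under the assumption that a cached result is served for at most TTL = step = 60 s, the "at most one step stale" claim can be checked exhaustively over every possible fetch offset within a step:

```python
STEP = TTL = 60  # seconds; cache TTL equals the RRD step

def newest_sample(ts: int) -> int:
    # Newest step boundary at or before ts (what a live fetch would see).
    return (ts // STEP) * STEP

# For each fetch time t, compare what a live fetch would see at the last
# instant the cache is still valid (t + TTL - 1) against the cached view
# from time t. The cached result trails a live fetch by at most one step.
worst_lag = max(
    (newest_sample(t + TTL - 1) - newest_sample(t)) // STEP
    for t in range(1_000_000, 1_000_000 + STEP)
)
assert worst_lag == 1  # never more than one 60 s step behind a live fetch
```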

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Author: TJ Downes
Date:   2026-04-21 19:55:38 -07:00
parent: c82f0cfce6
commit: fdd788212d

@@ -23,6 +23,10 @@ class RRDToolHandler:
         self._pending_rrd_update = None
         self._last_rrd_info_time = 0
         self._last_rrd_info_cache = None
+        # Read-side cache: rrdtool.fetch() returns 24 h of data and is a
+        # blocking disk read. Cache the result for 60 s — matching the RRD
+        # step size — so repeated dashboard refreshes don't hammer the SD card.
+        self._get_data_cache: tuple = (0.0, None)  # (fetched_at, result)

     def _init_rrd(self):
         if not self.available:
@@ -162,9 +166,20 @@ class RRDToolHandler:
             )
             return None

+        # Serve from cache if result is still fresh. RRD step is 60 s, so
+        # anything newer than that is guaranteed to be identical to a live fetch.
+        # Only the default (full 24-hour, no explicit bounds) call is cached —
+        # explicit start/end requests always bypass the cache.
+        now = time.time()
+        use_cache = start_time is None and end_time is None
+        if use_cache:
+            cache_fetched_at, cache_result = self._get_data_cache
+            if now - cache_fetched_at < 60.0 and cache_result is not None:
+                return cache_result
+
         try:
             if end_time is None:
-                end_time = int(time.time())
+                end_time = int(now)
             if start_time is None:
                 start_time = end_time - (24 * 3600)
@@ -220,6 +235,10 @@ class RRDToolHandler:
             result["timestamps"] = timestamps

+            # Populate read cache for default (unconstrained) calls only.
+            if use_cache:
+                self._get_data_cache = (now, result)
+
             return result
         except Exception as e: