From fdd788212d5ded6c515351c32e81c19f2d59a0a8 Mon Sep 17 00:00:00 2001 From: TJ Downes <273720+tjdownes@users.noreply.github.com> Date: Tue, 21 Apr 2026 19:55:38 -0700 Subject: [PATCH] perf(rrdtool): cache get_data() result for 60 s to avoid repeated disk reads MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Problem ------- rrdtool.fetch() is a blocking C library call that reads 24 hours of RRD data from disk. The dashboard can call get_data() on every page refresh. On an SD card each fetch can cost several milliseconds of I/O, and because the RRD step is 60 seconds the data cannot change more often than that — any fetch within the same 60-second window returns identical data. The combined-optimizations branch had a 60-second read cache; rightup's batching refactor inadvertently removed it. This PR restores it. Solution -------- * Add self._get_data_cache: tuple = (0.0, None) to __init__ * In get_data(): set use_cache = (start_time is None and end_time is None) - if use_cache and cache is < 60 s old: return cached result immediately - after a successful live fetch with use_cache: store (now, result) * Explicit start_time / end_time callers always bypass the cache so fine-grained or historical queries are never stale Why 60 s TTL? The RRD step is 60 s, so the database cannot hold a newer sample until the next step boundary. A 60-second cache is tight enough that the dashboard shows data at most one step plus one cache TTL stale (≈ 120 s worst case — a result cached just before a step boundary can be served for up to another 60 s after the new sample lands), and loose enough that a burst of refreshes costs one disk read instead of N. 
Co-Authored-By: Claude Sonnet 4.6 --- repeater/data_acquisition/rrdtool_handler.py | 21 +++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/repeater/data_acquisition/rrdtool_handler.py b/repeater/data_acquisition/rrdtool_handler.py index a1e2397..3a66f87 100644 --- a/repeater/data_acquisition/rrdtool_handler.py +++ b/repeater/data_acquisition/rrdtool_handler.py @@ -23,6 +23,10 @@ class RRDToolHandler: self._pending_rrd_update = None self._last_rrd_info_time = 0 self._last_rrd_info_cache = None + # Read-side cache: rrdtool.fetch() returns 24 h of data and is a + # blocking disk read. Cache the result for 60 s — matching the RRD + # step size — so repeated dashboard refreshes don't hammer the SD card. + self._get_data_cache: tuple = (0.0, None) # (fetched_at, result) def _init_rrd(self): if not self.available: @@ -162,9 +166,20 @@ class RRDToolHandler: ) return None + # Serve from cache if result is still fresh. RRD step is 60 s, so + # anything newer than that is guaranteed to be identical to a live fetch. + # Only the default (full 24-hour, no explicit bounds) call is cached — + # explicit start/end requests always bypass the cache. + now = time.time() + use_cache = start_time is None and end_time is None + if use_cache: + cache_fetched_at, cache_result = self._get_data_cache + if now - cache_fetched_at < 60.0 and cache_result is not None: + return cache_result + try: if end_time is None: - end_time = int(time.time()) + end_time = int(now) if start_time is None: start_time = end_time - (24 * 3600) @@ -220,6 +235,10 @@ class RRDToolHandler: result["timestamps"] = timestamps + # Populate read cache for default (unconstrained) calls only. + if use_cache: + self._get_data_cache = (now, result) + return result except Exception as e: