Files
meshview/docs/TIMESTAMP_MIGRATION.md
2025-11-03 14:47:29 -08:00

5.7 KiB

High-Resolution Timestamp Migration

This document describes the implementation of GitHub issue #55: storing high-resolution timestamps as integers in the database for improved performance and query efficiency.

Overview

The meshview database now stores timestamps in two formats:

  1. TEXT format (import_time): Human-readable ISO8601 format with microseconds (e.g., 2025-03-12 04:15:56.058038)
  2. INTEGER format (import_time_us): Microseconds since Unix epoch (1970-01-01 00:00:00 UTC)

The dual format approach provides:

  • Backward compatibility: Existing import_time TEXT columns remain unchanged
  • Performance: Fast integer comparisons and math operations
  • Precision: Microsecond resolution for accurate timing
  • Efficiency: Compact storage and fast indexed lookups

Database Changes

New Columns Added

Three tables have new import_time_us columns:

  1. packet.import_time_us (INTEGER)

    • Stores when the packet was imported into the database
    • Indexed for fast queries
  2. packet_seen.import_time_us (INTEGER)

    • Stores when the packet_seen record was imported
    • Indexed for performance
  3. traceroute.import_time_us (INTEGER)

    • Stores when the traceroute was imported
    • Indexed for fast lookups

New Indexes

The following indexes were created for optimal query performance:

CREATE INDEX idx_packet_import_time_us ON packet(import_time_us DESC);
CREATE INDEX idx_packet_from_node_time_us ON packet(from_node_id, import_time_us DESC);
CREATE INDEX idx_packet_seen_import_time_us ON packet_seen(import_time_us);
CREATE INDEX idx_traceroute_import_time_us ON traceroute(import_time_us);

Migration Process

For Existing Databases

Run the migration script to add the new columns and populate them from existing data:

python migrate_add_timestamp_us.py [database_path]

If no path is provided, it defaults to packets.db in the current directory.

The migration script:

  1. Checks if migration is needed (idempotent)
  2. Adds import_time_us columns to the three tables
  3. Populates the new columns from existing import_time values
  4. Creates indexes for optimal performance
  5. Verifies the migration completed successfully

For New Databases

New databases created with the updated schema will automatically include the import_time_us columns. The MQTT store module populates both columns when inserting new records.

Code Changes

Models (meshview/models.py)

The ORM models now include the new import_time_us fields:

class Packet(Base):
    import_time: Mapped[datetime] = mapped_column(nullable=True)
    import_time_us: Mapped[int] = mapped_column(BigInteger, nullable=True)

MQTT Store (meshview/mqtt_store.py)

The data ingestion logic now populates both timestamp columns using UTC time:

now = datetime.datetime.now(datetime.timezone.utc)
now_us = int(now.timestamp() * 1_000_000)

# Both columns are populated
import_time=now,
import_time_us=now_us,

Important: All new timestamps use UTC (Coordinated Universal Time) for consistency across time zones.

Using the New Timestamps

Example Queries

Query packets from the last 7 days:

-- Old way (slower)
SELECT * FROM packet 
WHERE import_time >= datetime('now', '-7 days');

-- New way (faster)
SELECT * FROM packet
WHERE import_time_us >= (strftime('%s', 'now', '-7 days') * 1000000);

Query packets in a specific time range:

SELECT * FROM packet
WHERE import_time_us BETWEEN 1759254380000000 AND 1759254390000000;

Calculate time differences (in microseconds):

SELECT 
    id,
    (import_time_us - LAG(import_time_us) OVER (ORDER BY import_time_us)) / 1000000.0 as seconds_since_last
FROM packet
LIMIT 10;

Converting Timestamps

From datetime to microseconds (UTC):

import datetime
now = datetime.datetime.now(datetime.timezone.utc)
now_us = int(now.timestamp() * 1_000_000)

From microseconds to datetime:

import datetime
timestamp_us = 1759254380813451
dt = datetime.datetime.fromtimestamp(timestamp_us / 1_000_000)

In SQL queries:

-- Datetime to microseconds
SELECT CAST((strftime('%s', import_time) || substr(import_time, 21, 6)) AS INTEGER);

-- Microseconds to datetime (approximate)
SELECT datetime(import_time_us / 1000000, 'unixepoch');

Performance Benefits

The integer timestamp columns provide significant performance improvements:

  1. Faster comparisons: Integer comparisons are much faster than string/datetime comparisons
  2. Smaller index size: Integer indexes are more compact than datetime indexes
  3. Range queries: BETWEEN operations on integers are highly optimized
  4. Math operations: Easy to calculate time differences, averages, etc.
  5. Sorting: Integer sorting is faster than datetime sorting

Backward Compatibility

The original import_time TEXT columns remain unchanged:

  • Existing code continues to work
  • Human-readable timestamps still available
  • Gradual migration to new columns possible
  • No breaking changes for existing queries

Future Work

Future improvements could include:

  • Migrating queries to use import_time_us columns
  • Deprecating the TEXT import_time columns (after transition period)
  • Adding helper functions for timestamp conversion
  • Creating views that expose both formats

Testing

The migration was tested on a production database with:

  • 132,466 packet records
  • 1,385,659 packet_seen records
  • 28,414 traceroute records

All records were successfully migrated with microsecond precision preserved.

References