mirror of
https://github.com/l5yth/potato-mesh.git
synced 2026-03-28 17:42:48 +01:00
docs: document prometheus metrics (#399)
This commit is contained in:
100
PROMETHEUS.md
Normal file
100
PROMETHEUS.md
Normal file
@@ -0,0 +1,100 @@
|
||||
# Prometheus Monitoring for PotatoMesh
|
||||
|
||||
PotatoMesh exposes runtime telemetry through a dedicated Prometheus endpoint so you can
|
||||
observe message flow, node health, and geospatial metadata alongside the rest of your
|
||||
infrastructure. This guide explains how the exporter is wired into the web
|
||||
application, which metrics are available, and how to integrate the endpoint with a
|
||||
Prometheus server.
|
||||
|
||||
## Runtime integration
|
||||
|
||||
The Sinatra application automatically loads the `prometheus-client` gem and mounts the
|
||||
collector and exporter middlewares during boot. No additional configuration is
|
||||
required to enable the `/metrics` endpoint—running the web application is enough to
|
||||
serve Prometheus data on the same port as the dashboard. The middleware pair both
|
||||
collects default Rack statistics and publishes PotatoMesh-specific gauges and
|
||||
counters that are updated whenever the ingestors process new node records.
|
||||
|
||||
A background refresh is triggered during start-up via
|
||||
`update_all_prometheus_metrics_from_nodes`, which seeds the gauges based on the latest
|
||||
state in the database. Subsequent POST requests to the ingest APIs update each metric
|
||||
in near real time.
|
||||
|
||||
## Selecting which nodes are exported
|
||||
|
||||
To avoid creating high-cardinality time series, PotatoMesh does not export per-node
|
||||
metrics unless you opt in by providing node identifiers. Control this behaviour with
|
||||
the `PROM_REPORT_IDS` environment variable:
|
||||
|
||||
- Leave the variable unset or blank to only export aggregate gauges such as the total
|
||||
node count.
|
||||
- Set `PROM_REPORT_IDS=*` to export metrics for every node in the database.
|
||||
- Provide a comma-separated list (for example `PROM_REPORT_IDS=ABCD1234,EFGH5678`) to
|
||||
expose metrics for specific nodes.
|
||||
|
||||
The selection applies to both the initial refresh and the incremental updates handled
|
||||
by the ingest pipeline.
|
||||
|
||||
## Available metrics
|
||||
|
||||
| Metric name | Type | Labels | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| `meshtastic_messages_total` | Counter | _none_ | Increments each time the ingest pipeline accepts a new message payload. |
|
||||
| `meshtastic_nodes` | Gauge | _none_ | Tracks the number of nodes currently stored in the database. |
|
||||
| `meshtastic_node` | Gauge | `node`, `short_name`, `long_name`, `hw_model`, `role` | Reports a node as present (value `1`) along with identity metadata. |
|
||||
| `meshtastic_node_battery_level` | Gauge | `node` | Most recent battery percentage reported by the node. |
|
||||
| `meshtastic_node_voltage` | Gauge | `node` | Most recent battery voltage reading. |
|
||||
| `meshtastic_node_uptime_seconds` | Gauge | `node` | Uptime reported by the device in seconds. |
|
||||
| `meshtastic_node_channel_utilization` | Gauge | `node` | Latest channel utilisation ratio supplied by the node. |
|
||||
| `meshtastic_node_transmit_air_utilization` | Gauge | `node` | Proportion of on-air time spent transmitting. |
|
||||
| `meshtastic_node_latitude` | Gauge | `node` | Latitude component of the last known position. |
|
||||
| `meshtastic_node_longitude` | Gauge | `node` | Longitude component of the last known position. |
|
||||
| `meshtastic_node_altitude` | Gauge | `node` | Altitude (in metres) of the last known position. |
|
||||
|
||||
All per-node gauges are only emitted for identifiers included in `PROM_REPORT_IDS`.
|
||||
Some values require telemetry packets to be present—for example, devices must provide
|
||||
metrics or positional updates before the related gauges appear.
|
||||
|
||||
## Accessing the `/metrics` endpoint
|
||||
|
||||
Once the application is running, query the exporter directly:
|
||||
|
||||
```bash
|
||||
curl http://localhost:41447/metrics
|
||||
```
|
||||
|
||||
Use any HTTP client capable of plain-text requests. Prometheus scrapers should target
|
||||
the same URL. The endpoint returns data in the standard exposition format produced by
|
||||
`prometheus-client`.
|
||||
|
||||
## Prometheus scrape configuration
|
||||
|
||||
Add a job to your Prometheus server configuration that points to the PotatoMesh
|
||||
instance. This example polls an instance running locally on the default port every 15
|
||||
seconds:
|
||||
|
||||
```yaml
|
||||
scrape_configs:
|
||||
- job_name: potatomesh
|
||||
scrape_interval: 15s
|
||||
static_configs:
|
||||
- targets:
|
||||
- localhost:41447
|
||||
```
|
||||
|
||||
If your deployment requires authentication or runs behind a reverse proxy, configure
|
||||
Prometheus to match your network topology (for example by adding basic authentication
|
||||
credentials, custom headers, or TLS settings).
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
- **No per-node metrics appear.** Ensure that `PROM_REPORT_IDS` is set and that the
|
||||
specified nodes exist in the database. Set the value to `*` if you want to export
|
||||
every node during initial validation.
|
||||
- **Metrics look stale after a restart.** Confirm that the ingestor is still posting
|
||||
telemetry. The exporter only reflects data stored in the PotatoMesh database.
|
||||
- **Scrapes time out.** Verify that the Prometheus server can reach the PotatoMesh
|
||||
HTTP port and that no reverse proxy is blocking the `/metrics` path.
|
||||
|
||||
With the endpoint configured, you can build Grafana dashboards or alerting rules to
|
||||
keep track of community mesh health in real time.
|
||||
@@ -133,6 +133,12 @@ The web app contains an API:
|
||||
|
||||
The `API_TOKEN` environment variable must be set to a non-empty value and match the token supplied in the `Authorization` header for `POST` requests.
|
||||
|
||||
### Observability
|
||||
|
||||
PotatoMesh ships with a Prometheus exporter mounted at `/metrics`. Consult
|
||||
[`PROMETHEUS.md`](./PROMETHEUS.md) for deployment guidance, metric details, and
|
||||
scrape configuration examples.
|
||||
|
||||
## Python Ingestor
|
||||
|
||||
The web app is not meant to be run locally connected to a Meshtastic node but rather
|
||||
|
||||
Reference in New Issue
Block a user