Make FEATURE_SPAM_DETECTION on by default with opt-out, mirroring
FEATURE_PACKETS: flip the web feature flag's Python default to true and the
Compose substitutions (collector/api/web) to :-true, so the shipped stack
scores and hides likely-spam without configuration. Opt out with
FEATURE_SPAM_DETECTION=false.
Update .env.example, docs/configuration.md and the v0.15 upgrade notes to
describe the feature as enabled-by-default with opt-out.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adjust the default spam-scoring knobs across Python settings, SpamConfig,
docker-compose, .env.example and docs to reduce false positives on chatty
legitimate users:
SPAM_MIN_PATH_HOPS 5 -> 3
SPAM_PATH_THRESHOLD 5 -> 6
SPAM_NAME_THRESHOLD 5 -> 10
SPAM_WEIGHT_PATH 0.7 -> 0.75
SPAM_WEIGHT_NAME 0.3 -> 0.25
SPAM_SCORE_THRESHOLD 0.6 -> 0.65
Also document the spam-detection feature and the pull_policy change in a
new v0.15.0 section of docs/upgrading.md.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add an optional, off-by-default spam-detection feature that scores each
message's spam likelihood at ingest, stores the score on the row, and lets
the display layer hide likely-spam by default behind a "show potential spam"
toggle. Nothing is ever dropped at ingest, so the threshold can be retuned
without reprocessing.
Scoring (collector/spam.py): windowed COUNT(*) over new
(path_prefix, received_at) and (sender_normalized, received_at) indexes —
joint path+sender signal plus a sender-name signal (trailing-digit suffix
stripped so bob1/bob2 collapse to bob). When the path is short/zero-hop or
absent, the name signal stands alone at full weight so local spam is still
flaggable. A background sweep re-scores recent rows with hindsight to catch
the leading edge of bursts. The collector logs each score (WARNING at/above
the threshold).
Display: the messages API gains include_spam and a master-switch-aware
hide-filter; the SPA shows the toggle + a badge only when the feature is on.
Config: FEATURE_SPAM_DETECTION is the single operator switch, bridged in
Compose to the backend SPAM_DETECTION_ENABLED for collector + api (mirrors
the FEATURE_PACKETS / RAW_PACKET_CAPTURE_ENABLED pattern). Both default off.
Works on SQLite and Postgres: DB-agnostic queries, an Alembic batch migration
for the three new columns + two indexes, and backend-aware collector test
fixtures (lifted db_backend/db_url into the shared conftest).
Also: move the meshcore-hub image pull_policy out of the base compose file.
It lived in docker-compose.yml as pull_policy: daily and made `make up` pull
the published image over a freshly built local one. Base is now policy-neutral
(default missing); dev sets pull_policy: build on the hub services so it only
ever uses local builds. Prod refreshes images via a manual `docker compose
... pull`.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move scattered configuration tables and operational sections out of the
README into dedicated reference documents:
- docs/configuration.md: single source of truth for all environment
variables, grouped into 12 sections (Common, Database, Caching,
Collector, Webhooks, Auth, Data Retention, API, Web Dashboard,
Feature Flags, Traefik, Prometheus & Alertmanager)
- docs/deployment.md: production setup, reverse proxy, multi-instance,
API scaling, Redis caching
- docs/observer.md: remote observers plus PACKETCAPTURE_* and
SERIAL_PORT reference
- docs/maintenance.md: backup and restore
README is reduced from 712 to 385 lines; the ARM32/Raspberry Pi note
is dropped. database.md, auth.md, webhooks.md, and content.md have
their env-var tables removed and link back to configuration.md. Stale
cross-references in database.md, upgrading.md, and .env.example are
updated to point at the new locations.