4 Commits

Author SHA1 Message Date
Louis King 461dbc5008 feat(spam): ship spam detection enabled by default
Make FEATURE_SPAM_DETECTION on by default with opt-out, mirroring
FEATURE_PACKETS: flip the web feature flag's Python default to true and the
Compose substitutions (collector/api/web) to :-true, so the shipped stack
scores and hides likely-spam without configuration. Opt out with
FEATURE_SPAM_DETECTION=false.
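
A minimal sketch of the enabled-by-default, opt-out behaviour described above, assuming the Python side reads the flag from the environment. The helper name `feature_enabled` and the set of accepted "false" spellings are hypothetical and not taken from the repository. Treating an empty value like an unset one mirrors how Compose's `${VAR:-true}` substitution falls back to the default.

```python
import os

# Hypothetical spellings accepted as "off"; the real parser may differ.
_FALSE_VALUES = {"0", "false", "no", "off"}


def feature_enabled(name, default=True, environ=None):
    """Return the flag value, falling back to `default` when unset or empty."""
    env = os.environ if environ is None else environ
    raw = env.get(name, "").strip()
    if raw == "":
        # Same fallback rule as Compose's ${VAR:-default}: unset or empty.
        return default
    return raw.lower() not in _FALSE_VALUES
```

With this shape, an operator who sets nothing gets spam detection, and `FEATURE_SPAM_DETECTION=false` is the only change needed to opt out.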

Update .env.example, docs/configuration.md and the v0.15 upgrade notes to
describe the feature as enabled-by-default with opt-out.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 09:00:16 +01:00
Louis King caaecfb3c2 feat(spam): retune default scoring config and add v0.15 upgrade notes
Adjust the default spam-scoring knobs across Python settings, SpamConfig,
docker-compose, .env.example and docs to reduce false positives on chatty
legitimate users:

  SPAM_MIN_PATH_HOPS   5  -> 3
  SPAM_PATH_THRESHOLD  5  -> 6
  SPAM_NAME_THRESHOLD  5  -> 10
  SPAM_WEIGHT_PATH     0.7 -> 0.75
  SPAM_WEIGHT_NAME     0.3 -> 0.25
  SPAM_SCORE_THRESHOLD 0.6 -> 0.65
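
To illustrate how the retuned weights and threshold interact, here is a rough sketch. It assumes the final score is a weighted sum of two signals, each already normalised to the range 0 to 1; the commit does not spell out the combination rule, and the field names on this `SpamConfig` are illustrative rather than the real class's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpamConfig:
    # Illustrative field names; values are the new defaults listed above.
    min_path_hops: int = 3
    path_threshold: int = 6
    name_threshold: int = 10
    weight_path: float = 0.75
    weight_name: float = 0.25
    score_threshold: float = 0.65


# The previous defaults, for comparison.
OLD = SpamConfig(5, 5, 5, 0.7, 0.3, 0.6)
NEW = SpamConfig()


def combined_score(path_signal, name_signal, cfg):
    """Weighted sum of two signals in [0, 1] (assumed combination rule)."""
    return cfg.weight_path * path_signal + cfg.weight_name * name_signal


def is_likely_spam(score, cfg):
    """Flag at or above the threshold."""
    return score >= cfg.score_threshold
```

Under these assumptions, a message with a half-strength path signal and a saturated name signal is flagged by the old weights and threshold but not by the new ones. The lower name weight and higher score threshold are what pull it below the line.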

Also document the spam-detection feature and the pull_policy change in a
new v0.15.0 section of docs/upgrading.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 08:40:24 +01:00
Louis King c48db03afb feat(spam): score messages at ingest and hide likely spam
Add an optional, off-by-default spam-detection feature that scores each
message's spam likelihood at ingest, stores the score on the row, and lets
the display layer hide likely-spam by default behind a "show potential spam"
toggle. Nothing is ever dropped at ingest, so the threshold can be retuned
without reprocessing.

Scoring (collector/spam.py): windowed COUNT(*) queries over the new
(path_prefix, received_at) and (sender_normalized, received_at) indexes
produce a joint path+sender signal plus a sender-name signal (trailing-digit
suffix stripped so bob1/bob2 collapse to bob). When the path is short/zero-hop or
absent, the name signal stands alone at full weight so local spam is still
flaggable. A background sweep re-scores recent rows with hindsight to catch
the leading edge of bursts. The collector logs each score (WARNING at/above
the threshold).
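
The scoring flow described above could look roughly like the following sketch. It is not the code in collector/spam.py. The table layout, the 600-second window, the "count divided by threshold, capped at 1" normalisation and the `hops < MIN_PATH_HOPS` test for a short path are all assumptions made for illustration; only the index columns, the trailing-digit stripping and the name-only fallback come from the commit message.

```python
import re
import sqlite3

WINDOW_SECONDS = 600   # hypothetical window length
MIN_PATH_HOPS = 3
PATH_THRESHOLD = 6
NAME_THRESHOLD = 10
WEIGHT_PATH = 0.75
WEIGHT_NAME = 0.25


def make_db():
    """In-memory stand-in for the messages table and its two new indexes."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages ("
        "path_prefix TEXT, sender_normalized TEXT, received_at INTEGER)"
    )
    conn.execute("CREATE INDEX ix_path ON messages (path_prefix, received_at)")
    conn.execute("CREATE INDEX ix_name ON messages (sender_normalized, received_at)")
    return conn


def normalize_sender(name):
    """Strip a trailing run of digits so bob1 and bob2 collapse to bob."""
    return re.sub(r"\d+$", "", name.strip())


def score(conn, path_prefix, hops, sender, now):
    """Return an assumed spam score in [0, 1] for one incoming message."""
    name = normalize_sender(sender)
    since = now - WINDOW_SECONDS
    (name_count,) = conn.execute(
        "SELECT COUNT(*) FROM messages "
        "WHERE sender_normalized = ? AND received_at > ?",
        (name, since),
    ).fetchone()
    name_signal = min(name_count / NAME_THRESHOLD, 1.0)

    # Short, zero-hop or absent path: the name signal stands alone.
    if path_prefix is None or hops < MIN_PATH_HOPS:
        return name_signal

    (joint_count,) = conn.execute(
        "SELECT COUNT(*) FROM messages "
        "WHERE path_prefix = ? AND sender_normalized = ? AND received_at > ?",
        (path_prefix, name, since),
    ).fetchone()
    path_signal = min(joint_count / PATH_THRESHOLD, 1.0)
    return WEIGHT_PATH * path_signal + WEIGHT_NAME * name_signal
```

Because the queries are plain parameterised COUNT(*) statements, the same shape should carry over to Postgres. A name consisting only of digits normalises to an empty string here, which a real implementation would probably want to handle explicitly.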

Display: the messages API gains include_spam and a master-switch-aware
hide-filter; the SPA shows the toggle + a badge only when the feature is on.
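
As a sketch of what a master-switch-aware hide-filter might do, assuming messages are dicts carrying a hypothetical `spam_score` key and that unscored rows are always shown (neither detail is confirmed by the commit):

```python
def visible_messages(messages, *, feature_enabled, include_spam, threshold=0.65):
    """Return the messages to display.

    Nothing is hidden when the feature is off or the caller asked to
    include spam; otherwise rows at or above the threshold are filtered.
    """
    if not feature_enabled or include_spam:
        return list(messages)
    return [
        m for m in messages
        # A missing or None score is treated as "not spam" (assumption).
        if (m.get("spam_score") or 0.0) < threshold
    ]
```

Filtering at read time rather than at ingest is what allows the threshold to be retuned later without reprocessing stored rows.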

Config: FEATURE_SPAM_DETECTION is the single operator switch, bridged in
Compose to the backend SPAM_DETECTION_ENABLED for collector + api (mirrors
the FEATURE_PACKETS / RAW_PACKET_CAPTURE_ENABLED pattern). Both default off.

Works on SQLite and Postgres: DB-agnostic queries, an Alembic batch migration
for the three new columns + two indexes, and backend-aware collector test
fixtures (lifted db_backend/db_url into the shared conftest).

Also: move the meshcore-hub image pull_policy out of the base compose file.
It lived in docker-compose.yml as pull_policy: daily, which made `make up` pull
the published image over a freshly built local one. The base file now sets no
policy (so Compose's default, missing, applies); dev sets pull_policy: build on
the hub services so it only ever uses local builds. Prod refreshes images via a
manual `docker compose ... pull`.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 00:11:39 +01:00
Louis King 973bf23fe8 docs: centralise env vars and split deployment/observer/maintenance docs
Move scattered configuration tables and operational sections out of the
README into dedicated reference documents:

- docs/configuration.md: single source of truth for all environment
  variables, grouped into 12 sections (Common, Database, Caching,
  Collector, Webhooks, Auth, Data Retention, API, Web Dashboard,
  Feature Flags, Traefik, Prometheus & Alertmanager)
- docs/deployment.md: production setup, reverse proxy, multi-instance,
  API scaling, Redis caching
- docs/observer.md: remote observers plus PACKETCAPTURE_* and
  SERIAL_PORT reference
- docs/maintenance.md: backup and restore

README is reduced from 712 to 385 lines; the ARM32/Raspberry Pi note
is dropped. database.md, auth.md, webhooks.md, and content.md have
their env-var tables removed and link back to configuration.md. Stale
cross-references in database.md, upgrading.md, and .env.example are
updated to point at the new locations.
2026-06-18 12:20:49 +01:00