feat(sensing-server): adaptive person count — RollingP95 + dedup_factor runtime API#491
Conversation
…or runtime API
RollingP95 adaptive normalizer (ADR-044 §5.2):
- Streaming P95 estimator (600-sample / ~30 s window) replaces fixed-scale
denominators (variance/300, motion/250, spectral/500) that saturated against
live ESP32 values, collapsing dynamic range to zero.
- Cold-start (<60 samples) falls back to legacy denominators — day-0 behaviour
is preserved.
- Three new fields on AppStateInner: p95_variance, p95_motion_band_power,
p95_spectral_power (all RollingP95::new(600, 60)).
- compute_person_score() refactored to accept &AppStateInner; all three call
sites (wifi, wifi-fallback, simulated) updated.
- 5 unit tests in rolling_p95_tests module.
dedup_factor runtime API (ADR-044 §5.3):
- New field dedup_factor: f64 (default 3.0) on AppStateInner.
- fuse_or_fallback() gains dedup_factor param; fallback switches from max() to
sum/dedup_factor (ceiling), matching the fork's sum-based aggregation.
- RuntimeConfig struct + load/save_runtime_config() for data/config.json
persistence across restarts.
- Three new REST endpoints:
GET /api/v1/config/dedup-factor
POST /api/v1/config/dedup-factor
POST /api/v1/config/ground-truth (auto-tune from known person count)
Explicitly NOT included:
- lambda=5.0 (upstream keeps its 0.1 default — deployment-specific tuning)
- CC intensity threshold 0.3 and min-cluster-size 4 hardcodes
- max_cc_size filter removal
|
Hi @schwarztim — current
The PRs were all evaluated thoughtful and well-described. Once you rebase against |
…count Merge #491: feat(sensing-server): adaptive person count — RollingP95 + dedup_factor (integration on schwarztim's behalf)
#654) The previous table mixed status badges (✅ /⚠️ / 🔬) and verbose "pending wiring / not yet released" caveat columns. Rewrites it as "What / How / Speed-or-scale" — three columns, present tense, no status column. Captures what actually shipped this week: * Presence detection now points at the trained head shipped on HF (100% validation accuracy), with the phase-variance fallback reframed as a no-model option rather than a "loader pending" caveat. * 17-keypoint pose is its own row now — cog-pose-estimation v0.0.1 binaries on GCS, 8.4 ms cold-start on Pi 5, train-your-own in 2.1 s on RTX 5080. References ADR-101 + the benchmark log. * Multi-person counting drops the "Heuristic, not learned" framing. The adaptive P95 normalisation from PR #491 is in tree, the runtime dedup-factor knob is documented, and the six learned drop-in counters from the Cog catalog are linked: occupancy-zones, elevator-count, queue-length, customer-flow, clean-room, person-matching. * Edge intelligence row now points at the 105-cog catalog (ADR-102) instead of just the Cognitum Seed hardware. * Camera-supervised fine-tune row reflects the actual measured training time (2.1 s on RTX 5080 for 400 epochs) instead of the laptop estimate. * Drops the status-legend footer (no more ✅/⚠️ /🔬 column to legend). Replaces it with a pointer down to the Edge Module Catalog. The ESP32 + Cognitum Seed deployment-options row gets the same treatment: cleaner list of what's included, no "Pose pending weights" parenthetical (the cog ships today). Net effect: same information, present tense, positive voice. Nothing removed beyond status badges + pending-work parentheticals; all genuine engineering details (e.g. "needs ~30 s ambient calibration" for the fallback) are preserved inline.
Motivated by #499 (multi-node double-skeletons) which PR #491 stopped the bleeding on but didn't take to the WiFi-CSI literature's state of the art. Designs a learned counter that replaces today's slot heuristic + dedup_factor knob, reusing the primitives we've already shipped this week: * Candle / RTX 5080 training pipeline (proven yesterday, 2.1 s for 400 epochs on pose_v1.safetensors) * HF presence encoder as initialization (architectures compatible, unlike the pose head case) * ruvector-mincut (Stoer-Wagner) for multi-node fusion upper-bound * Cog packaging spec (ADR-100) + edge module registry (ADR-102) * Paired-data pipeline (PR #641 streaming-safe align-ground-truth.js) — `n_persons` labels come for free; no new data collection campaign required to bootstrap. Architecture: per-node CSI [56×20] -> frozen HF encoder -> 128-dim embedding \ > count head (softmax {0..7}) > confidence head (sigmoid) N nodes' distributions -> confidence-weighted log-sum -> Stoer-Wagner min-cut upper-bound clip -> { count, confidence, count_p95_low, count_p95_high, per_node_breakdown } Compares the proposal explicitly against WiCount / DeepCount / CrossCount / HeadCount published numbers and is honest about the hardware gap (their 3x3 MIMO research NICs vs our 1x1 SISO ESP32-S3). v0.1.0 acceptance gates target >=80% within-+/-1 same-room and >=60% cross-room — modest on purpose; bounded by the same paired- data scarcity #645 documents for pose. The framework is the deliverable; the accuracy follows the data. Includes: * Architecture diagram in ascii * Comparison table vs published WiFi-CSI counting SOTA * Per-failure-mode mapping from #499 symptoms to how the learned counter addresses each * v0.1.0 + v0.2.0 acceptance gates with measurable thresholds * Repo layout for the new `v2/crates/cog-person-count/` crate * Five-step migration plan from this ADR -> first GCS release Status: Proposed. Implementation follows in the same incremental pattern ADR-101 used: scaffold-cog PR -> train+publish PR -> server-wiring PR.
… (ADR-103) (#694) First implementation PR for ADR-103. Same incremental shape that ADR-101 used: scaffold the cog crate, ship a stub-backend release that satisfies the runtime contract + 15 tests + measured cold-start, then follow up with the trained count_v1.safetensors in a separate PR. What ships: * v2/crates/cog-person-count/ — new workspace member. - Cargo.toml: candle-core/candle-nn 0.9 (cpu default, cuda feature opt-in), safetensors, ureq, sha2 — same dep shape as the pose cog but minus wifi-densepose-train (this cog has no training-side consumer, so the dep tree is materially smaller → 2.36 MB binary vs the pose cog's 4.5 MB). - src/inference.rs: CountNet (Conv1d 56→64→128→128 encoder + count head Linear(128→64→8)+softmax + confidence head Linear(128→32→1)+sigmoid). Stub backend returns `{1-person, 0-confidence}` honestly when no safetensors present. - src/fusion.rs: fuse_confidence_weighted() — Bayesian product of per-node distributions with confidence-weighted log-sum, plus fuse_with_mincut_clip() hook for the v0.2.0 Stoer-Wagner upper-bound (`ruvector-mincut` dep lands when min-cut graph builder is ready). Confidences floored at 1e-3 and probs floored at 1e-9 before logs — no NaN propagation. - src/publisher.rs: emits {count, confidence, count_p95_low, count_p95_high, n_nodes, probs} per ADR-103 §"Output". - src/main.rs: full ADR-100 four-verb CLI (version|manifest|health |run). The `run` subcommand explicitly returns "wiring pending v0.0.1" so the in-process library API is the v0.0.1-clean integration path. - tests/smoke.rs (8 tests) + fusion::tests (7 tests, in-lib) — 15 total, all green. Cover stub-backend behaviour, wrong-shape rejection, fusion math (empty / single / agreement / high-conf override / normalisation), p95-range correctness, and min-cut clip semantics. - cog/{manifest.template.json, config.schema.json, README.md} + cog/artifacts/ placeholder dir. * v2/Cargo.toml: registers the new workspace member. Verified locally: cargo check -p cog-person-count --no-default-features → clean cargo test -p cog-person-count --no-default-features → 8/8 pass cargo test -p cog-person-count --lib → 7/7 pass cargo build -p cog-person-count --release → 2.36 MB binary ./cog-person-count version → "person-count 0.3.0" ./cog-person-count manifest → JSON skeleton ./cog-person-count health → backend:stub, count:1, conf:0, p95:[1,1] Cold-start: 30 sequential `health` invocations → 53.3 ms/invocation (vs cog-pose-estimation's 76.2 ms — smaller dep tree) cog/README.md adds: * Security section — six-row threat table covering safetensor mmap trust, non-finite outputs, sensing fetch failures, fusion divide-by-zero / log-of-zero, min-cut degenerate cases, and stdout spoofing. * Performance / optimization section — binary size, release profile (already opt-level=3 / lto=fat / codegen-units=1 / strip=true at workspace level), cold-start comparison table, projected warm-path latency budget. Still pending (separate PRs, ADR-103 §"Migration"): * Train count_v1.safetensors on the existing 1,077 paired samples with `n_persons` labels (Candle on RTX 5080, same script that produced pose_v1.safetensors yesterday). * `run` subcommand wiring (long-running polling loop, same shape as cog-pose-estimation::runtime). * Cross-compile + sign + GCS upload (mirror of cog-pose-estimation release pipeline). * Server-side `csi.rs::score_to_person_count` call-site rewire to consume this cog when installed; falls back to PR #491's heuristic when not.
…al (#697) Phase 4 of ADR-103. Adds the long-running polling loop so the cog's fourth verb (`run`) does real work, completing the ADR-100 runtime contract end-to-end: cog-person-count version → "person-count 0.3.0" cog-person-count manifest → JSON skeleton cog-person-count health → loads weights + 1-shot infer + emit cog-person-count run --config → long-running per-frame emit ← THIS What ships: * src/runtime.rs (new) — `run_loop` polls sensing_url every poll_ms, slides a [56, 20] CSI window, runs InferenceEngine::infer, emits publisher::person_count events. Same shape as cog-pose-estimation::runtime — fetch_frame extracts amplitudes from `snapshot.nodes[0].amplitude[]`, fails open on connect errors with a WARN log rather than crashing. * src/lib.rs — registers the runtime module. * src/main.rs — cmd_run now loads RunConfig from a JSON file, builds the InferenceEngine (with weights if cfg.model_path is set, otherwise auto-discover), emits a run.started event, and hands off to the Tokio multi-thread runtime's block_on(run_loop). Single-node fusion is a no-op for N=1 today; v0.2.0 will append predictions from sibling nodes and call fusion::fuse_confidence_weighted before emit. Verified locally: cargo check -p cog-person-count --no-default-features → clean cargo test -p cog-person-count → 15/15 pass (no regressions) cargo build -p cog-person-count --release → 2.36 MB unchanged ./cog-person-count run --config bad-config.json: line 1: {"event":"run.started","fields":{"cog":"person-count", "sensing_url":"http://127.0.0.1:9999/...",poll_ms:100, "model_path":"(auto-discover)"}} line 2: WARN sensing-server fetch failed error=Connection Failed: Connect error: actively refused (loop alive — exits cleanly on SIGTERM, no crash, no NaN) Also adds a "Relationship to the in-process score_to_person_count heuristic" section to cog/README.md explaining the dual-emitter design (sensing-server keeps emitting the PR #491 slot heuristic; the cog runs out-of-process and emits person.count events from the learned model). Operators choose by installing the cog or not — no sensing-server rebuild required. ADR-103 §"Migration" status: 1. Land ADR + scaffold ........... done (#693, #694) 2. Train count_v1 ................ done (#695) 3. Cross-compile + sign + GCS .... done (#696) 4. Server-side wiring ............ done — out-of-process design means no rewire needed; this cog is the wiring. 5. v0.2.0 multi-room + LoRA ...... data-bound (#645)
|
@ruvnet hey, sorry for the delay on this. I'm literally just seeing this now. I was checking it periodically after I submitted the PR, but there is a lot of really good stuff that I was working on there. Mainly one of the things that I recall, it's been some time, is the ability to remotely update the ESP32s. That's one of the issues that I kind of ran into and I created a container where they can check into, some user experience changes. But yeah, I actually spent a lot of time working on this. So I hope you liked it and sorry again for the delay. |
Motivation
Person counting in
v2/uses fixed-scale feature normalization which works well in calibrated environments but degrades when signal characteristics drift across rooms, interference levels, or hardware. Two improvements here are deployment-neutral:1.
RollingP95adaptive normalizercompute_person_score()previously normalized features with hard-coded denominators (variance/300,motion_band_power/250,spectral_power/500). When live ESP32 values exceed those limits the normalized inputs clamp to 1.0 and dynamic range collapses.RollingP95is a streaming P95 estimator (600-sample / ~30 s sliding window) that self-calibrates to whatever feature distribution the deployment produces. Cold-start (< 60 samples) falls back to the legacy denominators so day-0 behaviour is fully preserved.2.
dedup_factorruntime APIExposes the multi-node cluster deduplication divisor via REST so deployments can tune to their environment without rebuilding. Includes an auto-tune endpoint that derives the optimal
dedup_factorfrom a known person count (calibration mode). Config persists across restarts indata/config.json.Explicitly NOT included
This fork also has additional ISTA
lambdatuning (specificallylambda=5.0) for its local 8×8×4 babycube grid. Those values are deployment-specific and intentionally not included in this PR — they would degrade person-count quality on different room geometries. This PR keeps upstream's existinglambda: 0.1default.Changes
v2/crates/wifi-densepose-sensing-server/src/main.rsRollingP95struct +impl(ADR-044 §5.2)RuntimeConfigstruct +load_runtime_config/save_runtime_config(ADR-044 §5.3)AppStateInner: addedp95_variance,p95_motion_band_power,p95_spectral_power,dedup_factor,data_dirfieldscompute_person_score()signature:&FeatureInfo→&AppStateInner + &FeatureInfo(adaptive denominators)config_get_dedup_factor,config_set_dedup_factor,config_set_ground_truthGET/POST /api/v1/config/dedup-factor,POST /api/v1/config/ground-truthrolling_p95_testsmodulev2/crates/wifi-densepose-sensing-server/src/multistatic_bridge.rsfuse_or_fallback()gainsdedup_factor: f64parameter; fallback switches frommax()toceil(sum / dedup_factor)Test results
cargo test --workspace --no-default-features: 1636 passed, 0 failed (includes 5 newRollingP95unit tests)python archive/v1/data/proof/verify.py: VERDICT: FAIL — pre-existing onorigin/main(numpy/scipy version drift); not caused by this PRNotes
dedup_factorto match observed cluster count. Useful for calibration during install.RollingP95is a generic primitive — could be reused for other adaptive thresholds in future.lambda: 0.1intomography.rsis untouched.