test: numerical regression harness (frozen golden vs ggml/VAD/CIF/CTC output) by LauraGPT · Pull Request #3003 · modelscope/FunASR

LauraGPT · 2026-06-20T07:29:00Z

C2 (roadmap P3) — a numerical regression harness so future changes can't silently break the runtime.

What

tests/ runs each built tool on a fixed ~6 s clip and diffs the output against frozen golden output, catching regressions in the ggml graphs, the FSMN-VAD state machine, the CIF predictor and CTC decode.

- tests/run_regression.sh: auto-detects which tools are built. The tiny VAD model (1.7 MB) is auto-fetched; ASR GGUFs are tested when present locally or with RUN_FULL=1 (downloads from FunAudioLLM/*-GGUF). Exits non-zero on any mismatch. BIN_DIR and MODELS_DIR are overridable, so the script drops straight into a CI step.
- tests/sample.wav (~6 s, 192 KB) + tests/golden/*.txt: golden captured on Linux x86-64 with the published f16 GGUFs.

Verified (Linux)

All present tools PASS (vad, sensevoice, paraformer, nano — 4/4).
Default mode (no local models) fetches VAD, PASS, and cleanly SKIPs absent ASR models (exit 0).

Additive — runtime/llama.cpp/tests/ only. Golden is exact-match on the reference platform; update only on a deliberate, reviewed output change.
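
The exact-match and exit-code behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the bodies of check, skipper and summary are assumptions (only the names check and skipper appear in the PR), and the temporary golden file stands in for tests/golden/*.txt.

```bash
# Hypothetical sketch of an exact-match golden check; not the PR's implementation.
pass=0; fail=0; skip=0

# check NAME GOLDEN_FILE ACTUAL: PASS only when ACTUAL equals the golden file's text.
check() {
  local name="$1" gold="$2" actual="$3"
  if [ "$actual" = "$(cat "$gold")" ]; then
    echo "PASS $name"; pass=$((pass + 1))
  else
    echo "FAIL $name"; fail=$((fail + 1))
    printf '%s\n' "$actual" | diff - "$gold" || true   # show what changed
  fi
}

# skipper NAME REASON: record a skip; skips never fail the run.
skipper() { echo "SKIP $1 ($2)"; skip=$((skip + 1)); }

# summary: non-zero status only when at least one check failed.
summary() {
  echo "pass=$pass fail=$fail skip=$skip"
  [ "$fail" -eq 0 ]
}

# Demo with a throwaway golden file (assumed content, for illustration only).
tmp=$(mktemp -d)
printf 'hello world\n' > "$tmp/golden.txt"
check demo-ok  "$tmp/golden.txt" "hello world"
check demo-bad "$tmp/golden.txt" "hello wurld"
skipper demo-skip "model missing"
summary; echo "exit status: $?"
rm -rf "$tmp"
```

Keeping skips out of the failure count is what lets the default mode exit 0 when ASR models are absent, while any real mismatch still fails a CI step.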

… output)

Adds tests/: runs each runtime tool on a fixed 6 s clip and diffs against frozen golden output, catching regressions in the ggml graphs, the FSMN-VAD state machine, the CIF predictor and CTC decode.

- tests/run_regression.sh: auto-detects which tools are built; VAD model is auto-fetched (1.7 MB), ASR GGUFs tested when present or with RUN_FULL=1 (downloads from HF). Non-zero exit on any mismatch. BIN_DIR/MODELS_DIR overridable.
- tests/sample.wav (~6 s) + tests/golden/*.txt: golden captured on Linux x86-64 with the f16 GGUFs from FunAudioLLM/*-GGUF.

Verified locally: all present tools PASS; default mode fetches VAD + skips absent models.

gemini-code-assist

Code Review

This pull request introduces numerical regression tests for the FunASR llama.cpp runtime, including a test runner script, documentation, and frozen golden outputs for various models. Feedback on the test script suggests simplifying the tool execution logic by removing redundant binary lookups and fragile chaining, letting the runner function handle the binary path resolution directly.

gemini-code-assist · 2026-06-20T07:30:05Z

+run_tool(){ # name binary golden key  models... -- run...
+  local name="$1" b key gold; b=$(bin "$2"); gold="$DIR/golden/$3"; key="$4"; shift 4
+  local models=(); while [ "$1" != "--" ]; do models+=("$1"); shift; done; shift
+  [ -n "$b" ] || { skipper "$name" "no binary"; return; }
+  [ -f "$gold" ] || { skipper "$name" "no golden"; return; }
+  ensure_models "$key" "${models[@]}" || { skipper "$name" "model missing (set RUN_FULL=1)"; return; }
+  check "$name" "$gold" "$("$@" 2>/dev/null)"
+}
+
+echo "== FunASR llama.cpp regression (sample.wav) =="
+B=$(bin llama-funasr-vad)        && run_tool vad        llama-funasr-vad        vad.txt        fsmn-vad   fsmn-vad.gguf -- "$B" -m "$MODELS/fsmn-vad.gguf" -a "$SAMPLE"
+B=$(bin llama-funasr-sensevoice) && run_tool sensevoice llama-funasr-sensevoice sensevoice.txt sensevoice sensevoice-small-f16.gguf -- "$B" -m "$MODELS/sensevoice-small-f16.gguf" -a "$SAMPLE"
+B=$(bin llama-funasr-paraformer) && run_tool paraformer llama-funasr-paraformer paraformer.txt paraformer paraformer-f16.gguf -- "$B" -m "$MODELS/paraformer-f16.gguf" -a "$SAMPLE"
+B=$(bin llama-funasr-cli)        && run_tool nano       llama-funasr-cli        nano.txt       nano       funasr-encoder-f16.gguf qwen3-0.6b-q8_0.gguf -- "$B" --enc "$MODELS/funasr-encoder-f16.gguf" -m "$MODELS/qwen3-0.6b-q8_0.gguf" -a "$SAMPLE"


The current design of prefixing each tool run with B=$(bin ...) and chaining with && is redundant and fragile.

Redundancy: The binary path is looked up twice—once via B=$(bin ...) in the caller and once via b=$(bin "$2") inside run_tool.

Fragility: The bin function currently returns 0 even when a binary is not found (due to bash's default behavior when no if/elif branch matches). If bin is ever refactored to return 1 on failure (which is standard for lookup functions), the && chain will short-circuit, preventing run_tool from executing and silently skipping the skip reporting (skipper won't be called).
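
The failure mode described here can be reproduced in isolation. In the sketch below, bin_current and bin_strict are hypothetical stand-ins for the script's bin function (the real lookup logic is not shown in this review), and run_tool is reduced to a stub that only reports the skip.

```bash
# Hypothetical stand-ins for the script's bin lookup; the path is made up.
bin_current() {   # an if with no matching branch leaves status 0
  if [ -x "/nonexistent/$1" ]; then echo "/nonexistent/$1"; fi
}
bin_strict() {    # a refactor that returns 1 when nothing is found
  if [ -x "/nonexistent/$1" ]; then echo "/nonexistent/$1"; else return 1; fi
}
run_tool() { echo "SKIP $1 (no binary)"; }   # stub: only reports the skip

# Status 0 from the lookup: run_tool runs and the skip is reported.
B=$(bin_current missing-tool) && run_tool current

# Status 1 from the lookup: the chain short-circuits and nothing is printed.
B=$(bin_strict missing-tool) && run_tool strict
echo "status after strict chain: $?"
```

The only skip line printed is the one for current; the strict variant drops the report silently, which is the regression the reviewer is warning about.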

We can simplify this by letting run_tool prepend the resolved binary $b to the command arguments, removing the need for the B variable and the && chain entirely.

Suggested change

-run_tool(){ # name binary golden key models... -- run...
-  local name="$1" b key gold; b=$(bin "$2"); gold="$DIR/golden/$3"; key="$4"; shift 4
-  local models=(); while [ "$1" != "--" ]; do models+=("$1"); shift; done; shift
-  [ -n "$b" ] || { skipper "$name" "no binary"; return; }
-  [ -f "$gold" ] || { skipper "$name" "no golden"; return; }
-  ensure_models "$key" "${models[@]}" || { skipper "$name" "model missing (set RUN_FULL=1)"; return; }
-  check "$name" "$gold" "$("$@" 2>/dev/null)"
-}
-
-echo "== FunASR llama.cpp regression (sample.wav) =="
-B=$(bin llama-funasr-vad) && run_tool vad llama-funasr-vad vad.txt fsmn-vad fsmn-vad.gguf -- "$B" -m "$MODELS/fsmn-vad.gguf" -a "$SAMPLE"
-B=$(bin llama-funasr-sensevoice) && run_tool sensevoice llama-funasr-sensevoice sensevoice.txt sensevoice sensevoice-small-f16.gguf -- "$B" -m "$MODELS/sensevoice-small-f16.gguf" -a "$SAMPLE"
-B=$(bin llama-funasr-paraformer) && run_tool paraformer llama-funasr-paraformer paraformer.txt paraformer paraformer-f16.gguf -- "$B" -m "$MODELS/paraformer-f16.gguf" -a "$SAMPLE"
-B=$(bin llama-funasr-cli) && run_tool nano llama-funasr-cli nano.txt nano funasr-encoder-f16.gguf qwen3-0.6b-q8_0.gguf -- "$B" --enc "$MODELS/funasr-encoder-f16.gguf" -m "$MODELS/qwen3-0.6b-q8_0.gguf" -a "$SAMPLE"
+run_tool(){ # name binary golden key models... -- run...
+  local name="$1" b key gold; b=$(bin "$2"); gold="$DIR/golden/$3"; key="$4"; shift 4
+  local models=(); while [ "$1" != "--" ]; do models+=("$1"); shift; done; shift
+  [ -n "$b" ] || { skipper "$name" "no binary"; return; }
+  [ -f "$gold" ] || { skipper "$name" "no golden"; return; }
+  ensure_models "$key" "${models[@]}" || { skipper "$name" "model missing (set RUN_FULL=1)"; return; }
+  check "$name" "$gold" "$("$b" "$@" 2>/dev/null)"
+}
+
+echo "== FunASR llama.cpp regression (sample.wav) =="
+run_tool vad llama-funasr-vad vad.txt fsmn-vad fsmn-vad.gguf -- -m "$MODELS/fsmn-vad.gguf" -a "$SAMPLE"
+run_tool sensevoice llama-funasr-sensevoice sensevoice.txt sensevoice sensevoice-small-f16.gguf -- -m "$MODELS/sensevoice-small-f16.gguf" -a "$SAMPLE"
+run_tool paraformer llama-funasr-paraformer paraformer.txt paraformer paraformer-f16.gguf -- -m "$MODELS/paraformer-f16.gguf" -a "$SAMPLE"
+run_tool nano llama-funasr-cli nano.txt nano funasr-encoder-f16.gguf qwen3-0.6b-q8_0.gguf -- --enc "$MODELS/funasr-encoder-f16.gguf" -m "$MODELS/qwen3-0.6b-q8_0.gguf" -a "$SAMPLE"
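
Both versions of run_tool rely on a literal "--" to separate the model file names from the command arguments, and the loop `while [ "$1" != "--" ]` keeps running if a caller forgets the separator. As a possible hardening (not part of this PR or the suggestion), the split could be factored into a helper that checks the remaining argument count. The name split_args and the globals MODELS_LIST and CMD_ARGS are hypothetical.

```bash
#!/usr/bin/env bash
# split_args: hypothetical helper, not in the PR. Words before "--" go to
# MODELS_LIST, words after it go to CMD_ARGS. Returns 1 if "--" is absent
# instead of looping forever.
split_args() {
  MODELS_LIST=(); CMD_ARGS=()
  while [ "$#" -gt 0 ] && [ "$1" != "--" ]; do
    MODELS_LIST+=("$1"); shift
  done
  [ "$#" -gt 0 ] || return 1   # ran out of arguments: no "--" separator
  shift                        # drop the "--" itself
  CMD_ARGS=("$@")
}

# With a separator: one model name, four command arguments.
split_args fsmn-vad.gguf -- -m models/fsmn-vad.gguf -a sample.wav \
  && echo "models=${#MODELS_LIST[@]} args=${#CMD_ARGS[@]}"

# Without a separator: reported as an error rather than hanging.
split_args fsmn-vad.gguf -m models/fsmn-vad.gguf || echo "missing -- separator"
```

With the suggested change applied, the resolved binary would then be prepended at the call site, along the lines of `"$b" "${CMD_ARGS[@]}"`.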

gemini-code-assist Bot reviewed Jun 20, 2026

LauraGPT added 2 commits June 20, 2026 08:11

test: golden as text (binaries now detok in-process; merge after in-binary-detok)

8e22cf4

test: fix sensevoice golden (correct transcription text)

9860954

LauraGPT merged commit 3610b75 into main Jun 20, 2026

LauraGPT merged 3 commits into main from feat/c2-regression-tests
