fix(cli): valid SRT timestamps + clear json duration fields by LauraGPT · Pull Request #2982 · modelscope/FunASR

LauraGPT · 2026-06-17T14:16:13Z

Found by smoke-testing the new funasr CLI as a first-time user.

SRT bug: funasr audio.wav -f srt produced an invalid cue when the model returns no per-sentence timestamps — a bogus 00:00:00,000 --> 99:59:59,999 end. Now spans the real audio duration.

json bug: duration_s was actually the processing time — a 60s file showed "duration_s": 2.3, misleading users into thinking the audio is 2.3s. Split into audio_duration_s (real length) + processing_s (elapsed). Safe rename (the new CLI is unreleased).

Tested:

-f srt  -> 00:00:00,000 --> 00:00:59,520  (real duration)
-f json -> "audio_duration_s": 59.52, "processing_s": 2.165
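The SRT fallback described above can be sketched as follows. This is a minimal illustration, not the code in funasr/cli.py: `srt_timestamp` and `fallback_srt` are hypothetical helper names, and the sketch assumes the audio duration in seconds is already known.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = max(0, round(seconds * 1000))  # clamp negatives to zero
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def fallback_srt(text: str, audio_duration_s: float) -> str:
    """Build a single cue spanning the whole file (hypothetical helper).

    Intended for the case where the model returns no per-sentence
    timestamps, so the cue ends at the real audio duration instead of
    a placeholder such as 99:59:59,999.
    """
    start = srt_timestamp(0.0)
    end = srt_timestamp(audio_duration_s)
    return f"1\n{start} --> {end}\n{text}\n"


if __name__ == "__main__":
    print(fallback_srt("hello world", 59.52))
```

Working in integer milliseconds avoids accumulating float error across the hour, minute, and second fields.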

- SRT: when the model returns no per-sentence timestamps, the fallback used a bogus 99:59:59,999 end time. Now span the real audio duration so the SRT is valid.
- json: duration_s was actually the processing time (a 60s file showed duration_s: 2.3). Split into audio_duration_s (real audio length) and processing_s (elapsed). Safe rename - the new CLI is unreleased.

Found by smoke-testing the funasr CLI as a new user.
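The field split in the JSON output could look roughly like the sketch below. Only the `audio_duration_s` and `processing_s` keys come from this PR; the `build_json_result` helper, its parameters, and the `text` key are assumptions made for illustration.

```python
import json


def build_json_result(text: str, audio_duration_s: float,
                      started_at: float, finished_at: float) -> dict:
    """Assemble a JSON-ready result (hypothetical helper).

    audio_duration_s is the real length of the audio, while
    processing_s is the elapsed wall-clock time, so the two values
    can no longer be confused.
    """
    return {
        "text": text,  # assumed key, not confirmed by the PR
        "audio_duration_s": round(audio_duration_s, 3),
        "processing_s": round(finished_at - started_at, 3),
    }


if __name__ == "__main__":
    result = build_json_result("hello world", 59.52, 10.0, 12.165)
    print(json.dumps(result, ensure_ascii=False))
```

In practice `started_at` and `finished_at` would typically come from `time.perf_counter()` calls placed around the inference step.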

gemini-code-assist

Code Review

This pull request updates the output formatting in funasr/cli.py to dynamically retrieve and use the actual audio duration (via soundfile) for JSON and SRT outputs when per-sentence timestamps are missing, rather than using hardcoded placeholder values. The reviewer suggested applying a similar fix to the TSV output format to ensure consistency and correct metadata.
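A duration lookup of the kind the review summary describes might be sketched like this. It relies on `soundfile.info(...).duration` from the third-party soundfile package; the `audio_duration_seconds` name and the choice to return `None` on failure are assumptions, not the behaviour of funasr/cli.py.

```python
from typing import Optional


def audio_duration_seconds(audio_path: str) -> Optional[float]:
    """Return the audio length in seconds, or None if it cannot be read.

    Hypothetical helper: soundfile is imported lazily so a missing
    package or an unreadable file degrades to None instead of raising.
    """
    try:
        import soundfile as sf  # third-party dependency
        return round(sf.info(audio_path).duration, 3)
    except Exception:
        return None


if __name__ == "__main__":
    print(audio_duration_seconds("audio.wav"))
```

Returning `None` rather than a placeholder number leaves each output format free to choose its own fallback.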


gemini-code-assist · 2026-06-17T14:17:47Z

    elif fmt == "tsv":
        return format_tsv(segments) if segments else f"start\tend\ttext\n0.000\t0.000\t{text}"


The TSV output format has the same issue as the SRT format: when there are no per-sentence timestamps (segments is empty), it returns a fallback row with a hardcoded 0.000 end time (0.000\t0.000\t{text}). For consistency with the SRT fix, and so the metadata is correct, the TSV fallback should retrieve the real audio duration and use it as the end timestamp.

    elif fmt == "tsv":
        if segments:
            return format_tsv(segments)
        try:
            import soundfile as sf
            dur_s = round(sf.info(audio_path).duration, 3)
        except Exception:
            dur_s = 0.0
        return f"start\tend\ttext\n0.000\t{dur_s:.3f}\t{text}"

User-friendly funasr CLI (funasr audio.wav -f json/srt/text), clearer errors, and accumulated fixes since 1.3.9:

- New funasr CLI replacing the Hydra entrypoint (old one -> funasr-hydra)
- fix: clear FileNotFoundError for missing audio paths (#2981)
- fix: valid SRT timestamps + clear json duration fields (#2982, #2983)
- fix: correct fun-asr-nano CLI model id -2512 (#2984)
- feat: batched VAD-segment decoding for Fun-ASR-Nano vLLM (#2979)
- fix: warn on vLLM dtype=fp16 degraded output (#2980)
- fix: bf16/fp16 inference (#2978), repetition_penalty CUDA crash (#2974)

LauraGPT merged commit 6f8d161 into modelscope:main Jun 17, 2026

LauraGPT merged 1 commit into modelscope:main from LauraGPT:fix/cli-srt-duration
