fix(cli): valid SRT timestamps + clear json duration fields #2982

Merged
LauraGPT merged 1 commit into modelscope:main from LauraGPT:fix/cli-srt-duration on Jun 17, 2026

Conversation

@LauraGPT

Collaborator

Found by smoke-testing the new funasr CLI as a first-time user.

SRT bug: funasr audio.wav -f srt produced an invalid cue when the model returns no per-sentence timestamps — a bogus 00:00:00,000 --> 99:59:59,999 end. Now spans the real audio duration.
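
As a rough illustration of the SRT fallback described above (not the actual code in funasr/cli.py), the sketch below formats a duration in seconds as an SRT timestamp and emits a single cue spanning the whole file. The names `srt_timestamp` and `fallback_srt` are hypothetical, and the audio duration is assumed to be already known to the caller.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a non-negative duration in seconds as HH:MM:SS,mmm (SRT style)."""
    total_ms = max(0, round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def fallback_srt(text: str, audio_duration_s: float) -> str:
    """Hypothetical helper: one cue covering the whole file.

    Intended for the case where the model returns no per-sentence
    timestamps, so the cue ends at the real audio duration instead of
    a placeholder such as 99:59:59,999.
    """
    start = srt_timestamp(0.0)
    end = srt_timestamp(audio_duration_s)
    return f"1\n{start} --> {end}\n{text}\n"


print(fallback_srt("hello world", 59.52))
```

For a 59.52 s file this yields the cue line `00:00:00,000 --> 00:00:59,520`, matching the tested output reported in this PR.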

json bug: duration_s was actually the processing time — a 60s file showed "duration_s": 2.3, misleading users into thinking the audio is 2.3s. Split into audio_duration_s (real length) + processing_s (elapsed). Safe rename (the new CLI is unreleased).
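
The split into two fields could look roughly like the sketch below. It is an assumption-laden illustration, not the CLI's implementation: `build_json_result` and the `transcribe` callable are hypothetical, and the audio duration is passed in rather than read from the file.

```python
import json
import time
from typing import Callable


def build_json_result(audio_duration_s: float, transcribe: Callable[[], str]) -> str:
    """Hypothetical helper: report audio length and elapsed time separately."""
    started = time.perf_counter()
    text = transcribe()  # stand-in for the real model call
    processing_s = time.perf_counter() - started
    payload = {
        "text": text,
        "audio_duration_s": round(audio_duration_s, 3),  # real length of the audio
        "processing_s": round(processing_s, 3),          # wall-clock time spent decoding
    }
    return json.dumps(payload, ensure_ascii=False)


result = build_json_result(59.52, lambda: "hello world")
print(result)
```

Keeping the two values under distinct names means a reader of the JSON cannot mistake the elapsed time for the length of the audio.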

Tested:

-f srt  -> 00:00:00,000 --> 00:00:59,520  (real duration)
-f json -> "audio_duration_s": 59.52, "processing_s": 2.165
- SRT: when the model returns no per-sentence timestamps, the fallback used a
  bogus 99:59:59,999 end time. Now span the real audio duration so the SRT is
  valid.
- json: duration_s was actually the processing time (a 60s file showed
  duration_s: 2.3). Split into audio_duration_s (real audio length) and
  processing_s (elapsed). Safe rename - the new CLI is unreleased.

Found by smoke-testing the funasr CLI as a new user.
LauraGPT merged commit 6f8d161 into modelscope:main on Jun 17, 2026

gemini-code-assist (Bot) left a comment

Contributor

Code Review

This pull request updates the output formatting in funasr/cli.py to dynamically retrieve and use the actual audio duration (via soundfile) for JSON and SRT outputs when per-sentence timestamps are missing, rather than using hardcoded placeholder values. The reviewer suggested applying a similar fix to the TSV output format to ensure consistency and correct metadata.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread funasr/cli.py
Comment on lines 69 to 70
    elif fmt == "tsv":
        return format_tsv(segments) if segments else f"start\tend\ttext\n0.000\t0.000\t{text}"

Contributor

medium

The TSV output format has the same issue as the SRT format: when there are no per-sentence timestamps (segments is empty), it returns a fallback row with a hardcoded 0.000 end time (0.000\t0.000\t{text}). To stay consistent with the SRT fix and report correct metadata, the TSV fallback should retrieve the real audio duration and use it as the end timestamp.

    elif fmt == "tsv":
        if segments:
            return format_tsv(segments)
        try:
            import soundfile as sf
            dur_s = round(sf.info(audio_path).duration, 3)
        except Exception:
            dur_s = 0.0
        return f"start\tend\ttext\n0.000\t{dur_s:.3f}\t{text}"

LauraGPT added a commit that referenced this pull request Jun 17, 2026
User-friendly funasr CLI (funasr audio.wav -f json/srt/text), clearer
errors, and accumulated fixes since 1.3.9:
- New funasr CLI replacing the Hydra entrypoint (old one -> funasr-hydra)
- fix: clear FileNotFoundError for missing audio paths (#2981)
- fix: valid SRT timestamps + clear json duration fields (#2982, #2983)
- fix: correct fun-asr-nano CLI model id -2512 (#2984)
- feat: batched VAD-segment decoding for Fun-ASR-Nano vLLM (#2979)
- fix: warn on vLLM dtype=fp16 degraded output (#2980)
- fix: bf16/fp16 inference (#2978), repetition_penalty CUDA crash (#2974)