fix(load): clear FileNotFoundError for missing audio path#2981

Merged
LauraGPT merged 1 commit into
modelscope:mainfrom
LauraGPT:fix/clear-file-not-found
Jun 17, 2026

@LauraGPT

Collaborator

Problem (new-user UX)

Following the README quickstart with the placeholder path input="meeting.wav" (a file new users don't have) does not give a clear error. Instead it crashes deep inside the VAD with a cryptic:

TypeError: expected Tensor as element 1 in argument 0, but got str
  ... fsmn_vad_streaming/model.py: torch.cat((cache["prev_samples"], audio_sample_list[0]))

Root cause: in load_audio_text_image_video, audio is only loaded if os.path.exists(path). A non-existent path string falls through the gate and is passed downstream as a raw str.
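The fall-through can be illustrated with a minimal sketch. This is not the FunASR source: `gated_load` and its `loader` argument are hypothetical stand-ins, assuming only what the root cause above states, namely that loading happens solely when `os.path.exists(path)` is true.

```python
import os


def gated_load(data_or_path, loader):
    """Hypothetical stand-in for the load gate: load only if the path exists."""
    if isinstance(data_or_path, str) and os.path.exists(data_or_path):
        return loader(data_or_path)
    # A missing path is neither loaded nor rejected here,
    # so the caller receives the raw string unchanged.
    return data_or_path


# Assumes no file with this name exists in the working directory.
result = gated_load("definitely_missing_meeting.wav", loader=lambda p: [0.0])
print(type(result).__name__)
```

Any downstream code that expects a tensor, such as the `torch.cat` call in the traceback, then receives a `str` and fails far from the real cause.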

Fix

Raise a clear FileNotFoundError naming the missing path, before the load gate. Scoped to data_type="sound" so text/PUNC inputs are unaffected.

Tested (fresh venv, as a new user)

CASE 1 missing file -> FileNotFoundError: Audio file not found: 'meeting.wav'. Pass a valid local file path, URL, numpy array, torch.Tensor, or bytes.
CASE 2 valid file   -> unchanged, transcribes normally

Found by walking through the README quickstart as a first-time user.

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces a fail-fast check in 'load_audio_text_image_video' to raise a clear 'FileNotFoundError' when a local audio file path does not exist, preventing cryptic downstream errors. The review feedback correctly points out that this check should also apply when 'data_type' is None (which defaults to 'sound'), as the current implementation only triggers when 'data_type' is explicitly set to 'sound'.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +103 to +108
    if (
        isinstance(data_or_path_or_list, str)
        and data_type == "sound"
        and not data_or_path_or_list.startswith(("http://", "https://"))
        and not os.path.exists(data_or_path_or_list)
    ):

Contributor

high

The check currently only validates when data_type == "sound". However, on line 114, data_type can also be None (which defaults to "sound" behavior). If data_type is None and the file does not exist, the check will be bypassed, and the function will still fail downstream with a cryptic error.

We should update the condition to also cover data_type is None.

    if (
        isinstance(data_or_path_or_list, str)
        and (data_type == "sound" or data_type is None)
        and not data_or_path_or_list.startswith(("http://", "https://"))
        and not os.path.exists(data_or_path_or_list)
    ):

When an audio file path did not exist, load_audio_text_image_video silently
passed the raw string downstream, crashing later with a cryptic
"expected Tensor as element 1 in argument 0, but got str" deep inside the VAD.
New users following the README quickstart (placeholder meeting.wav) hit this
confusing internal error.

Now raise a clear FileNotFoundError naming the missing path. Scoped to
data_type=sound so text inputs are unaffected.

Tested: missing path -> FileNotFoundError with helpful message; valid path ->
unchanged.
@LauraGPT LauraGPT force-pushed the fix/clear-file-not-found branch from 32882d5 to 3bdf6f9 Compare June 17, 2026 13:38
@LauraGPT

Collaborator Author

Thanks! Addressed: the check now also covers data_type is None (which defaults to sound) by testing data_type in (None, "sound").
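For readers who want to see the resulting guard in isolation, here is a minimal sketch. It assumes the condition from the review thread with the `data_type in (None, "sound")` change applied, and the error text shown under "Tested" in the description. The wrapper name `ensure_audio_path_exists` is hypothetical; in the PR the check sits inline in `load_audio_text_image_video`, and the merged code may differ in detail.

```python
import os


def ensure_audio_path_exists(data_or_path_or_list, data_type=None):
    """Hypothetical wrapper: fail fast when a local audio path is missing."""
    if (
        isinstance(data_or_path_or_list, str)
        and data_type in (None, "sound")  # None is treated as sound
        and not data_or_path_or_list.startswith(("http://", "https://"))
        and not os.path.exists(data_or_path_or_list)
    ):
        raise FileNotFoundError(
            f"Audio file not found: {data_or_path_or_list!r}. "
            "Pass a valid local file path, URL, numpy array, torch.Tensor, or bytes."
        )


# URLs and non-sound inputs pass through without a filesystem check.
ensure_audio_path_exists("https://example.com/audio.wav")
ensure_audio_path_exists("some plain text", data_type="text")
```

Because the guard only fires for `str` inputs, arrays, tensors, and bytes are left untouched, and text inputs are skipped as long as the caller passes a non-sound `data_type`.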

@LauraGPT LauraGPT merged commit e6ab4c3 into modelscope:main Jun 17, 2026
LauraGPT added a commit that referenced this pull request Jun 17, 2026
User-friendly funasr CLI (funasr audio.wav -f json/srt/text), clearer
errors, and accumulated fixes since 1.3.9:
- New funasr CLI replacing the Hydra entrypoint (old one -> funasr-hydra)
- fix: clear FileNotFoundError for missing audio paths (#2981)
- fix: valid SRT timestamps + clear json duration fields (#2982, #2983)
- fix: correct fun-asr-nano CLI model id -2512 (#2984)
- feat: batched VAD-segment decoding for Fun-ASR-Nano vLLM (#2979)
- fix: warn on vLLM dtype=fp16 degraded output (#2980)
- fix: bf16/fp16 inference (#2978), repetition_penalty CUDA crash (#2974)