docs: fix README streaming example (runnable + actually streams) by LauraGPT · Pull Request #2987 · modelscope/FunASR

LauraGPT · 2026-06-18T10:47:28Z

Problem

The Usage section's streaming example was broken and misleading:

# Streaming real-time
model = AutoModel(model="paraformer-zh-streaming", device="cuda")
result = model.generate(input="chunk.wav", cache={}, chunk_size=[0, 10, 5])

Not runnable: chunk.wav is a placeholder file that doesn't exist.
Doesn't actually stream: it is a single one-shot generate() call with no is_final flag, no encoder_chunk_look_back/decoder_chunk_look_back, and no chunk loop, so a user can't learn streaming from it.

Fix

Replace it with the real chunk-by-chunk loop, matching the repo's own example (examples/industrial_data_pretraining/paraformer_streaming/demo.py): read the audio, iterate fixed-stride chunks, pass cache + is_final + look-back, and print partial text per chunk. Applied to both README.md and README_zh.md.
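
For readers who want the shape of that loop without opening the repo, here is a minimal sketch of the steps listed above. It is not the exact snippet merged into the README. The helper names `iter_chunks` and `stream_transcribe` are hypothetical, and the constants (960 samples per 60 ms frame at 16 kHz, look-back values of 4 and 1) and the `res[0]["text"]` result shape are assumptions based on the repo demo. The model call is passed in as a plain callable so the loop can be read and exercised on its own.

```python
from typing import Any, Callable, Dict, Iterator, List, Sequence, Tuple

# Assumed values for 16 kHz audio, following the repo demo.
CHUNK_SIZE = [0, 10, 5]                 # 10 frames per chunk, 5 frames look-ahead
SAMPLES_PER_FRAME = 960                 # 60 ms at 16 kHz
CHUNK_STRIDE = CHUNK_SIZE[1] * SAMPLES_PER_FRAME  # 9600 samples = 600 ms


def iter_chunks(audio: Sequence[float], stride: int) -> Iterator[Tuple[Sequence[float], bool]]:
    """Yield (chunk, is_final) pairs; the last chunk may be shorter than stride."""
    total = max(1, -(-len(audio) // stride))  # ceiling division, at least one chunk
    for i in range(total):
        yield audio[i * stride:(i + 1) * stride], i == total - 1


def stream_transcribe(
    generate: Callable[..., List[Dict[str, Any]]],
    audio: Sequence[float],
    stride: int = CHUNK_STRIDE,
) -> str:
    """Feed audio chunk by chunk, print each partial text, return the joined text."""
    cache: Dict[str, Any] = {}  # one dict shared by every call carries the streaming state
    pieces: List[str] = []
    for chunk, is_final in iter_chunks(audio, stride):
        res = generate(
            input=chunk,
            cache=cache,
            is_final=is_final,           # True only on the last chunk
            chunk_size=CHUNK_SIZE,
            encoder_chunk_look_back=4,   # assumed value, as in the repo demo
            decoder_chunk_look_back=1,   # assumed value, as in the repo demo
        )
        text = res[0]["text"] if res else ""  # assumes a list of {"text": ...} dicts
        if text:
            print(text)
            pieces.append(text)
    return "".join(pieces)
```

With FunASR and soundfile installed, this would be driven roughly as `stream_transcribe(model.generate, audio)`, where `model` is the `AutoModel` for paraformer-zh-streaming and `audio` is the array returned by `sf.read("speech.wav", dtype="float32")`.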

Verification

Ran the new snippet on GPU with paraformer-zh-streaming — it emits incremental text chunk by chunk and reconstructs the full sentence:

欢迎大 / 家来 / 体验达 / 摩院推 / 出的语 / 音识 / 别模型
-> 欢迎大家来体验达摩院推出的语音识别模型

Only the streaming code example changes — no header/structure edits.

The Usage streaming snippet used input="chunk.wav" (a file that does not exist) and a single one-shot generate() call missing is_final / encoder_chunk_look_back / the chunk loop, so it neither ran nor demonstrated streaming. Replace it with the real chunk-by-chunk loop (matching the repo example examples/industrial_data_pretraining/paraformer_streaming/demo.py): read audio, iterate fixed-stride chunks, pass cache + is_final + look-back, print partial text per chunk. Verified on GPU (paraformer-zh-streaming): emits incremental text per chunk and reconstructs the full sentence.

gemini-code-assist

Code Review

This pull request updates both the English and Chinese README files to provide a complete, realistic example of streaming real-time audio chunk-by-chunk using the soundfile library. The feedback suggests converting the loaded audio to mono if it has multiple channels to prevent potential shape mismatch errors during model inference.


gemini-code-assist · 2026-06-18T10:48:49Z

+import soundfile as sf
 model = AutoModel(model="paraformer-zh-streaming", device="cuda")
-result = model.generate(input="chunk.wav", cache={}, chunk_size=[0, 10, 5])
+audio, sr = sf.read("speech.wav", dtype="float32")   # 16 kHz mono


If the user's speech.wav is a stereo audio file, sf.read will return a 2D array, which can cause shape mismatch errors during feature extraction or model inference. Converting the audio to mono if it has multiple channels makes the example more robust.

Suggested change

-audio, sr = sf.read("speech.wav", dtype="float32")   # 16 kHz mono
+audio, sr = sf.read("speech.wav", dtype="float32")
+if audio.ndim > 1:
+    audio = audio[:, 0]  # Convert to mono if stereo
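
The suggestion above keeps only the first channel. As a small illustration of the shapes involved, the sketch below builds a synthetic array in the (frames, channels) layout that sf.read returns for multi-channel files and reduces it to one dimension. The helper name `to_mono` and its `average` flag are hypothetical and not part of soundfile or FunASR; averaging the channels is shown only as an alternative to the reviewer's first-channel choice.

```python
import numpy as np


def to_mono(audio: np.ndarray, average: bool = False) -> np.ndarray:
    """Return a 1-D signal from audio shaped (frames,) or (frames, channels)."""
    if audio.ndim == 1:
        return audio  # already mono, nothing to do
    if average:
        # Alternative to the reviewer's suggestion: mix all channels equally.
        return audio.mean(axis=1).astype(audio.dtype)
    return audio[:, 0]  # reviewer's suggestion: keep the first channel


# Synthetic stereo signal: left channel all ones, right channel all zeros.
stereo = np.stack(
    [np.ones(4, dtype="float32"), np.zeros(4, dtype="float32")], axis=1
)
mono_first = to_mono(stereo)               # shape (4,), values from the left channel
mono_avg = to_mono(stereo, average=True)   # shape (4,), mean of both channels
```

Taking the first channel is the cheaper option and matches the review comment; averaging may be preferable when the speech is not centered in the recording.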

gemini-code-assist · 2026-06-18T10:48:49Z

+import soundfile as sf
 model = AutoModel(model="paraformer-zh-streaming", device="cuda")
-result = model.generate(input="chunk.wav", cache={}, chunk_size=[0, 10, 5])
+audio, sr = sf.read("speech.wav", dtype="float32")   # 16 kHz 单声道


If the user's speech.wav is a stereo audio file, sf.read will return a 2D array, which can cause shape mismatch errors during feature extraction or model inference. Converting the audio to mono if it has multiple channels makes the example more robust.

Suggested change

-audio, sr = sf.read("speech.wav", dtype="float32")   # 16 kHz 单声道
+audio, sr = sf.read("speech.wav", dtype="float32")
+if audio.ndim > 1:
+    audio = audio[:, 0]  # Convert to mono if stereo

LauraGPT merged commit 7e01e94 into main Jun 18, 2026

LauraGPT deleted the fix/readme-streaming-example branch June 18, 2026 10:47

gemini-code-assist Bot reviewed Jun 18, 2026
