comfy-aimdo: 0.4.8 (CORE-271) #14244

Merged: comfyanonymous merged 1 commit into Comfy-Org:master from rattus128:prs/aimdo-0.4.8 on Jun 3, 2026

@rattus128

Contributor

Aimdo 0.4.8 fixes a crash in multi-GPU runs caused by contention on the singleton bounce buffer.
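To illustrate the failure class (not the actual aimdo code, which is not shown in this PR): when several device workers stage data through one shared buffer, one worker can overwrite bytes another has staged but not yet drained. The sketch below is a minimal model with hypothetical names (`BounceBuffer`, `run_workers`). It shows two common ways to avoid the problem: serializing access to the singleton, or giving each worker its own buffer. Which approach 0.4.8 takes is an assumption left open here.

```python
import threading


class BounceBuffer:
    """Hypothetical host staging buffer; not the comfy-aimdo implementation."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)
        self.lock = threading.Lock()

    def transfer(self, payload: bytes) -> bytes:
        # Hold the lock across both the stage and the drain step so that
        # no other worker can overwrite the staged bytes in between.
        with self.lock:
            n = len(payload)
            self.data[:n] = payload        # stage: source -> bounce buffer
            return bytes(self.data[:n])    # drain: bounce buffer -> destination


def run_workers(buffers, n_iters: int = 200) -> int:
    """Worker i pushes its own byte pattern through buffers[i].

    Returns the number of transfers whose output did not match the input.
    """
    mismatches = []

    def worker(idx: int) -> None:
        payload = bytes([idx + 1]) * 1024
        for _ in range(n_iters):
            if buffers[idx].transfer(payload) != payload:
                mismatches.append(idx)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(len(buffers))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(mismatches)


# Option A: keep the singleton and serialize access to it.
shared = BounceBuffer(1024)
shared_mismatches = run_workers([shared, shared])

# Option B: give each device worker its own buffer.
per_device_mismatches = run_workers([BounceBuffer(1024), BounceBuffer(1024)])
```

Option A keeps memory use flat but makes workers wait on each other. Option B avoids the wait at the cost of one buffer per device.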

Example test conditions:

RTX 4090 x2
qwen 2512

Before:

  File "/home/rattus/ComfyUI/comfy/samplers.py", line 560, in _calc_cond_batch_multigpu
    raise error
  File "/home/rattus/ComfyUI/comfy/multigpu.py", line 55, in _worker_loop
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/home/rattus/ComfyUI/comfy/samplers.py", line 540, in _handle_batch_pooled
    _handle_batch(device, batch_tuple, worker_results)
  File "/home/rattus/ComfyUI/comfy/samplers.py", line 525, in _handle_batch
    output = model_current.apply_model(input_x, timestep_, **c).to(output_device).chunk(batch_chunks)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rattus/ComfyUI/comfy/model_base.py", line 186, in apply_model
    return comfy.patcher_extension.WrapperExecutor.new_class_executor(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rattus/ComfyUI/comfy/patcher_extension.py", line 113, in execute
    return self.original(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rattus/ComfyUI/comfy/model_base.py", line 230, in _apply_model
    model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rattus/ComfyUI/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1778, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rattus/ComfyUI/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1789, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rattus/ComfyUI/comfy/ldm/qwen_image/model.py", line 418, in forward
    return comfy.patcher_extension.WrapperExecutor.new_class_executor(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rattus/ComfyUI/comfy/patcher_extension.py", line 113, in execute
    return self.original(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rattus/ComfyUI/comfy/ldm/qwen_image/model.py", line 490, in _forward
    hidden_states = self.img_in(hidden_states)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rattus/ComfyUI/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1778, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rattus/ComfyUI/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1789, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rattus/ComfyUI/comfy/ops.py", line 501, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rattus/ComfyUI/comfy/ops.py", line 493, in forward_comfy_cast_weights
    weight, bias, offload_stream = cast_bias_weight(self, input, offloadable=True)
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rattus/ComfyUI/comfy/ops.py", line 321, in cast_bias_weight
    offload_stream = cast_modules_with_vbar([s], dtype, device, bias_dtype, non_blocking)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rattus/ComfyUI/comfy/ops.py", line 188, in cast_modules_with_vbar
    handle_pin(s, pin, xfer_source, xfer_dest, size=dest_size)
  File "/home/rattus/ComfyUI/comfy/ops.py", line 186, in handle_pin
    cast_maybe_lowvram_patch(source, pin, offload_stream, xfer_dest2=dest)
  File "/home/rattus/ComfyUI/comfy/ops.py", line 177, in cast_maybe_lowvram_patch
    comfy.model_management.cast_to_gathered(xfer_source, xfer_dest, non_blocking=non_blocking, stream=stream, r2=xfer_dest2)
  File "/home/rattus/ComfyUI/comfy/model_management.py", line 1439, in cast_to_gathered
    if comfy.memory_management.read_tensor_file_slice_into(tensor, dest_view, stream=stream, destination2=dest2_view):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rattus/ComfyUI/comfy/memory_management.py", line 59, in read_tensor_file_slice_into
    comfy_aimdo.host_buffer.read_file_to_device(file_obj, info.offset, info.size,
  File "/home/rattus/ComfyUI/.venv/lib/python3.12/site-packages/comfy_aimdo/host_buffer.py", line 71, in read_file_to_device
    raise RuntimeError("hostbuf_file_reader_read failed")
RuntimeError: hostbuf_file_reader_read failed

After:

[INFO] got prompt
[INFO] Using pytorch attention in VAE
[INFO] Using pytorch attention in VAE
[INFO] VAE load device: cuda:0, offload device: cpu, dtype: torch.bfloat16
[INFO] Found quantization metadata version 1
[INFO] Using MixedPrecisionOps for text encoder
[INFO] CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
[INFO] Requested to load QwenImageTEModel_
[INFO] Model QwenImageTEModel_ prepared for dynamic VRAM loading. 7910MB Staged. 0 patches attached. Force pre-loaded 122 weights: 561 KB.
[INFO] Model QwenImageTEModel_ prepared for dynamic VRAM loading. 7910MB Staged. 0 patches attached. Force pre-loaded 122 weights: 561 KB.
[INFO] model weight dtype torch.float8_e4m3fn, manual cast: torch.bfloat16
[INFO] model_type FLUX
[INFO] Creating deepclone of QwenImage for cuda:1.
[INFO] model weight dtype torch.float8_e4m3fn, manual cast: torch.bfloat16
[INFO] model_type FLUX
[INFO] Requested to load QwenImage
[INFO] Requested to load QwenImage
[INFO] Model QwenImage prepared for dynamic VRAM loading. 19483MB Staged. 0 patches attached. Force pre-loaded 241 weights: 37 KB.
[INFO] Model QwenImage prepared for dynamic VRAM loading. 19483MB Staged. 0 patches attached. Force pre-loaded 241 weights: 37 KB.
100%|██████████| 20/20 [00:20<00:00,  1.00s/it]
[INFO] Requested to load WanVAE
[INFO] 0 models unloaded.
[INFO] Model WanVAE prepared for dynamic VRAM loading. 241MB Staged. 0 patches attached. Force pre-loaded 60 weights: 61 KB.
[INFO] Prompt executed in 27.03 seconds

The step rate is still close to 2x the single-GPU rate (1.00 s/it above vs 2.02 s/it on one GPU):

[INFO] got prompt
[INFO] Requested to load QwenImage
[INFO] Model QwenImage prepared for dynamic VRAM loading. 19483MB Staged. 0 patches attached. Force pre-loaded 241 weights: 37 KB.
100%|██████████| 20/20 [00:40<00:00,  2.02s/it]
[INFO] 0 models unloaded.
[INFO] Model WanVAE prepared for dynamic VRAM loading. 241MB Staged. 0 patches attached. Force pre-loaded 60 weights: 61 KB.
[INFO] Prompt executed in 40.99 seconds
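For reference, the ratios implied by the two logs can be worked out directly. This is a small sketch using only the numbers printed above; `speedup` is a helper name introduced here for illustration.

```python
def speedup(baseline: float, candidate: float) -> float:
    """Return how many times faster `candidate` is than `baseline`."""
    return baseline / candidate


# Sampler step time, seconds per iteration (from the tqdm lines).
step_speedup = speedup(baseline=2.02, candidate=1.00)

# Whole-prompt wall time, seconds (from the "Prompt executed" lines).
wall_speedup = speedup(baseline=40.99, candidate=27.03)

print(f"step rate: {step_speedup:.2f}x, wall time: {wall_speedup:.2f}x")
```

The wall-time ratio is lower than the step-rate ratio. The two runs are not strictly comparable on wall time: the multi-GPU log includes loading the text encoder and VAE, while the single-GPU log shows no text encoder load.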

Regression tests:

Same system, run again ✅
Same system, bypass the multi-GPU node and rerun ✅
Same system, restart and run on GPU1 ✅
Same system, restart and run on GPU2 ✅
Windows, RTX 5090, LTX2.3 ✅
Windows, RTX 5090, WAN2.2 FP16 ✅

@socket-security


Review the following changes in direct dependencies.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| --- | --- | --- | --- | --- | --- | --- |
| Updated | comfy-aimdo@0.4.7 ⏵ 0.4.8 | 99 | 100 | 100 | 100 | 70 |

@coderabbitai

coderabbitai Bot commented Jun 2, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 54c364a7-025f-4028-8dd9-158099ca526e

📥 Commits

Reviewing files that changed from the base of the PR and between dc10c01 and 0dfef43.

📒 Files selected for processing (1)
  • requirements.txt

📝 Walkthrough

This PR updates the comfy-aimdo dependency from version 0.4.7 to 0.4.8 in requirements.txt. All other dependencies remain unchanged. This is a manifest-only update with no code changes.
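Since the change is only a version bump in requirements.txt, an existing environment picks up the fix only after the dependency is reinstalled. A minimal sketch for checking that, assuming the distribution is installed under the name `comfy-aimdo` and uses plain dotted numeric versions (the helper names are illustrative, not part of ComfyUI):

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text: str) -> tuple:
    """Turn a dotted numeric version such as '0.4.8' into (0, 4, 8)."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed: str, minimum: str = "0.4.8") -> bool:
    """True if `installed` is at least `minimum` (numeric comparison)."""
    return parse_version(installed) >= parse_version(minimum)


def installed_aimdo_version():
    """Return the installed comfy-aimdo version string, or None if absent."""
    try:
        return version("comfy-aimdo")
    except PackageNotFoundError:
        return None


found = installed_aimdo_version()
if found is None:
    print("comfy-aimdo is not installed in this environment")
else:
    print(f"comfy-aimdo {found}: includes fix = {meets_minimum(found)}")
```

Comparing integer tuples rather than strings matters once a component reaches two digits: as strings, "0.4.10" sorts before "0.4.8".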

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'comfy-aimdo: 0.4.8' clearly and specifically identifies the main change: updating the comfy-aimdo dependency to version 0.4.8, which matches the requirements.txt modification. |
| Description check | ✅ Passed | The description is directly related to the changeset, explaining the bug fix (multi-GPU crash due to singleton bounce buffer contention) that motivated the dependency upgrade, with detailed before/after logs and regression test results. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@rattus128 rattus128 marked this pull request as draft June 2, 2026 21:38
@rattus128 rattus128 marked this pull request as ready for review June 2, 2026 22:40
@comfyanonymous comfyanonymous merged commit bd7da05 into Comfy-Org:master Jun 3, 2026
14 checks passed
@rattus128 rattus128 changed the title comfy-aimdo: 0.4.8 Jun 5, 2026
@rattus128 rattus128 changed the title comfy-aimdo: 0.4.8 (CORE-270) Jun 5, 2026
zhangp365 pushed a commit to zhangp365/ComfyUI that referenced this pull request Jun 29, 2026
Aimdo 0.4.8 fixes a crash in multi-gpu due to contention on the
singleton bounce buffer.