fix: Some models only support streaming output (e.g. Qwen3 open-source edition), so all chunks must be consumed and concatenated into a single `LLMResult` for compatibility.

This commit is contained in:
FFXN
2026-03-01 11:37:33 +08:00
parent e900811910
commit c2ea3d5b44


@@ -91,7 +91,8 @@ def _build_llm_result_from_chunks(
     """
     Build a single `LLMResult` by accumulating all returned chunks.
-    Some models only support streaming output (e.g. Qwen3 open-source edition),
+    Some models only support streaming output (e.g. Qwen3 open-source edition)
+    and the plugin side may still implement the response via a chunked stream,
     so all chunks must be consumed and concatenated into a single ``LLMResult``.
     The ``usage`` is taken from the last chunk that carries it, which is the
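The accumulation described by this commit can be sketched roughly as follows. This is a hypothetical minimal illustration, not the actual implementation: the real `LLMResult` and chunk types live in the plugin SDK and carry more fields; the `LLMResultChunk` shape and the `dict` usage payload here are assumptions.

```python
# Hypothetical sketch of building one LLMResult from a chunk stream.
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class LLMResultChunk:
    text: str                     # text delta carried by this chunk
    usage: Optional[dict] = None  # token usage; usually only on the final chunk


@dataclass
class LLMResult:
    text: str
    usage: Optional[dict]


def build_llm_result_from_chunks(chunks: Iterable[LLMResultChunk]) -> LLMResult:
    """Consume every chunk and concatenate the deltas into a single LLMResult.

    ``usage`` is taken from the last chunk that carries it, since streaming
    backends typically report token totals only at the end of the stream.
    """
    parts: list[str] = []
    usage: Optional[dict] = None
    for chunk in chunks:          # fully drain the stream
        parts.append(chunk.text)
        if chunk.usage is not None:
            usage = chunk.usage   # keep the most recent usage payload
    return LLMResult(text="".join(parts), usage=usage)
```

Draining the iterator completely matters even for callers that only want the text: stream-only models (such as the Qwen3 open-source edition mentioned above) deliver the usage payload on the final chunk, so stopping early would silently drop it.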