fix: Some models only support streaming output (e.g. Qwen3 open-source edition), so all chunks must be consumed and concatenated into a single `LLMResult` for compatibility.

This commit is contained in:
FFXN
2026-03-01 11:37:33 +08:00
parent e900811910
commit c2ea3d5b44


@@ -91,7 +91,8 @@ def _build_llm_result_from_chunks(
     """
     Build a single `LLMResult` by accumulating all returned chunks.
-    Some models only support streaming output (e.g. Qwen3 open-source edition),
+    Some models only support streaming output (e.g. Qwen3 open-source edition)
+    and the plugin side may still implement the response via a chunked stream,
     so all chunks must be consumed and concatenated into a single ``LLMResult``.
     The ``usage`` is taken from the last chunk that carries it, which is the
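The accumulation described by this commit can be sketched roughly as follows. This is a hypothetical minimal illustration, not the actual implementation: the real `LLMResult` and chunk types live in the plugin SDK and carry more fields; the `LLMResultChunk` shape and the `dict` usage payload here are assumptions.

```python
# Hypothetical sketch of building one LLMResult from a chunk stream.
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class LLMResultChunk:
    text: str                     # text delta carried by this chunk
    usage: Optional[dict] = None  # token usage; usually only on the final chunk


@dataclass
class LLMResult:
    text: str
    usage: Optional[dict]


def build_llm_result_from_chunks(chunks: Iterable[LLMResultChunk]) -> LLMResult:
    """Consume every chunk and concatenate the deltas into a single LLMResult.

    ``usage`` is taken from the last chunk that carries it, since streaming
    backends typically report token totals only at the end of the stream.
    """
    parts: list[str] = []
    usage: Optional[dict] = None
    for chunk in chunks:          # fully drain the stream
        parts.append(chunk.text)
        if chunk.usage is not None:
            usage = chunk.usage   # keep the most recent usage payload
    return LLMResult(text="".join(parts), usage=usage)
```

Draining the iterator completely matters even for callers that only want the text: stream-only models (such as the Qwen3 open-source edition mentioned above) deliver the usage payload on the final chunk, so stopping early would silently drop it.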