Model Configuration and the Model Factory

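Chapter goals:
Explain how a models[] entry instantiates any LangChain-compatible chat model through reflection
Break down how the create_chat_model factory applies overrides under switches such as thinking_enabled and supports_vision
Provide a minimal template for adding a custom model provider, plus a constraint checklist derived from the validation code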

TL;DR

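DeerFlow's model layer is not tied to any fixed SDK: each models[] entry declares a module:Class path in its use field, and the factory imports and instantiates that class at runtime through the reflection helper resolve_class. create_chat_model turns the config entry into constructor arguments and, depending on thinking_enabled, merges overrides from when_thinking_enabled, when_thinking_disabled, and thinking, adapting to three ways of expressing thinking: Anthropic native, OpenAI-compatible gateways, and vLLM. VllmChatModel is a subclass tailored to vLLM 0.19.0 that preserves the non-standard reasoning field across turns. When a provider package is missing, the reflection layer returns a ready-to-copy uv add install hint.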

Overview

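Why a model factory instead of calling ChatOpenAI(...) directly?

DeerFlow is a super agent harness that has to support heterogeneous backends such as OpenAI, Anthropic, Gemini, Doubao, DeepSeek, Ollama, vLLM, and MindIE at the same time. These providers are mutually incompatible in three respects:

Different constructor classes: langchain_openai:ChatOpenAI, langchain_anthropic:ChatAnthropic, deerflow.models.vllm_provider:VllmChatModel, and so on. Hard-coded imports would turn every SDK into a hard dependency and leave no room for extension.
Different thinking / reasoning switches: Anthropic uses the constructor argument thinking={"type": ...}; OpenAI-compatible gateways use extra_body.thinking; vLLM uses extra_body.chat_template_kwargs.enable_thinking.
Different streaming and multi-turn semantics: vLLM's reasoning field is non-standard, so LangChain's default adapter drops it, and an interleaved thinking/tool-call flow loses its earlier context on the next turn.

The factory pattern funnels "read config, pick class, adjust parameters, instantiate" into a single entry point, create_chat_model. The config layer only declares the use class path and its fields; the reflection layer imports on demand and returns an install hint when a package is missing.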

Architecture

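The model layer itself consists of three files under models/; the config schema and the reflection helpers it depends on live in two neighbouring packages. Responsibilities are clearly separated:

Source list

backend/packages/harness/deerflow/models/factory.py: the create_chat_model factory, responsible for merging overrides and instantiating the model
backend/packages/harness/deerflow/models/__init__.py: exports create_chat_model only
backend/packages/harness/deerflow/models/vllm_provider.py: the custom VllmChatModel provider
backend/packages/harness/deerflow/config/model_config.py: the ModelConfig Pydantic schema
backend/packages/harness/deerflow/reflection/resolvers.py: reflection loading via resolve_class / resolve_variable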

Components / Subsystems

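ModelConfig: a declarative model schema

ModelConfig is a Pydantic model declared with extra="allow", so besides the explicitly declared fields it also collects any provider-specific parameter (for example api_base, temperature, or num_predict). Among the explicit fields, a group of "meta fields" is never passed to the constructor: use, name, display_name, description, supports_thinking, supports_reasoning_effort, when_thinking_enabled, when_thinking_disabled, thinking, and supports_vision. The factory excludes them via model_dump(exclude=...) (backend/packages/harness/deerflow/models/factory.py:66).

The thinking field is shorthand for when_thinking_enabled: the factory merges it into the thinking key of that dict, so when both are provided they are combined (backend/packages/harness/deerflow/config/model_config.py:35).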
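The split between meta fields and constructor arguments can be illustrated without Pydantic. The sketch below is a simplified stand-in that works on a plain dict; META_FIELDS repeats the list from this section, while constructor_kwargs and the sample entry are hypothetical names introduced only for illustration.

```python
# Meta fields named in this section: they steer the factory and never reach the constructor.
META_FIELDS = {
    "use", "name", "display_name", "description",
    "supports_thinking", "supports_reasoning_effort",
    "when_thinking_enabled", "when_thinking_disabled",
    "thinking", "supports_vision",
}


def constructor_kwargs(entry: dict) -> dict:
    """Drop meta fields and None values, roughly what
    model_dump(exclude_none=True, exclude={...}) does for a flat entry."""
    return {k: v for k, v in entry.items() if k not in META_FIELDS and v is not None}


# Hypothetical config entry; temperature is an "extra" provider-specific field.
entry = {
    "name": "my-model",
    "use": "langchain_openai:ChatOpenAI",
    "model": "my-model-id",
    "temperature": 0.2,
    "supports_vision": False,
    "description": None,
}

print(constructor_kwargs(entry))  # {'model': 'my-model-id', 'temperature': 0.2}
```

Only model and the extra temperature field survive, which is why a provider class has to accept every non-meta key that appears in its config entry.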

resolve_class / resolve_variable: reflection loading and install hints

The use field has the form module.path:ClassName. resolve_variable splits it with module_path, variable_name = variable_path.rsplit(":", 1), calls import_module, and then fetches the class with getattr (backend/packages/harness/deerflow/reflection/resolvers.py:44-62). resolve_class adds two checks on top: the resolved object must be a type, and it must be a subclass of base_class (the factory passes BaseChatModel) (backend/packages/harness/deerflow/reflection/resolvers.py:87-93).
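The resolution steps described above can be sketched with the standard library alone. This is a minimal approximation, not the code in resolvers.py: the function names carry a _sketch suffix to mark them as hypothetical, and the error messages are invented.

```python
from importlib import import_module


def resolve_variable_sketch(variable_path: str):
    """Resolve 'module.path:Name' to the object it names."""
    if ":" not in variable_path:
        raise ImportError(f"{variable_path!r} is not a valid path; expected 'module.path:Name'")
    module_path, variable_name = variable_path.rsplit(":", 1)
    module = import_module(module_path)
    try:
        return getattr(module, variable_name)
    except AttributeError as exc:
        raise ImportError(f"module {module_path!r} has no attribute {variable_name!r}") from exc


def resolve_class_sketch(class_path: str, base_class: type) -> type:
    """Resolve a class path and enforce the two checks: is a type, is a subclass."""
    obj = resolve_variable_sketch(class_path)
    if not isinstance(obj, type):
        raise ValueError(f"{class_path!r} does not point to a class")
    if not issubclass(obj, base_class):
        raise ValueError(f"{class_path!r} is not a subclass of {base_class.__name__}")
    return obj


# Standard-library classes stand in for BaseChatModel and a provider class.
cls = resolve_class_sketch("collections:OrderedDict", dict)
print(cls.__name__)  # OrderedDict
```

In the factory the base class is BaseChatModel, so a use path that points at a helper function or at an unrelated class is rejected before any constructor runs.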

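When a provider package is not installed, _build_missing_dependency_hint maps known provider modules to the correct pip/uv package name (for example langchain_google_genai → langchain-google-genai) and returns a one-line hint that can be copied and run as is (backend/packages/harness/deerflow/reflection/resolvers.py:11-22).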
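A hint builder of this kind might look like the sketch below. Only the langchain_google_genai mapping comes from the text above; the function name, the underscore-to-hyphen fallback, and the message wording are assumptions made for illustration.

```python
# Only the first mapping is taken from this chapter; a real table would list more modules.
MODULE_TO_PACKAGE_HINTS = {
    "langchain_google_genai": "langchain-google-genai",
}


def build_missing_dependency_hint_sketch(module_path: str) -> str:
    """Return a copy-pasteable install hint for a module that failed to import."""
    top_level = module_path.split(".", 1)[0]
    # Assumption: unknown modules fall back to the usual underscore-to-hyphen convention.
    package = MODULE_TO_PACKAGE_HINTS.get(top_level, top_level.replace("_", "-"))
    return f"Missing dependency '{top_level}'. Install it with: uv add {package}"


print(build_missing_dependency_hint_sketch("langchain_google_genai.chat_models"))
```

Keeping the mapping explicit matters because module names and distribution names do not always follow the underscore-to-hyphen convention.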

create_chat_model: the core of the factory

The factory's entry-point signature is create_chat_model(name=None, thinking_enabled=False, *, app_config=None, **kwargs) (backend/packages/harness/deerflow/models/factory.py:50). When name is None, it falls back to config.models[0].name as the default model (backend/packages/harness/deerflow/models/factory.py:60-61). It also handles a few providers specially: for Codex Responses API models it maps thinking to reasoning_effort and removes max_tokens (backend/packages/harness/deerflow/models/factory.py:120-133); for MindIEChatModel it defaults max_retries to 1 to prevent timeout cascades (backend/packages/harness/deerflow/models/factory.py:137-139).

VllmChatModel: a vLLM provider that preserves reasoning

VllmChatModel subclasses langchain_openai:ChatOpenAI and addresses one specific problem with vLLM 0.19.0: LangChain's default OpenAI adapter drops the non-standard reasoning field, so an interleaved thinking/tool-call flow loses its earlier reasoning context in later turns (backend/packages/harness/deerflow/models/vllm_provider.py:1-13). It preserves reasoning in three places:

_get_request_payload: re-injects the reasoning of the previous assistant message into the outgoing request (backend/packages/harness/deerflow/models/vllm_provider.py:168-191)
_create_chat_result: keeps reasoning / reasoning_content on non-streaming responses (backend/packages/harness/deerflow/models/vllm_provider.py:193-212)
_convert_chunk_to_generation_chunk: keeps reasoning on streaming deltas (backend/packages/harness/deerflow/models/vllm_provider.py:214-258)

In addition, _normalize_vllm_chat_template_kwargs rewrites the chat_template_kwargs.thinking alias used in early DeerFlow documentation to enable_thinking, the key that the vLLM 0.19.0 Qwen reasoning parser actually reads, so older configs keep working (backend/packages/harness/deerflow/models/vllm_provider.py:39-62).
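The alias normalization can be pictured as a small pure function over extra_body. The sketch below is an approximation under stated assumptions: the function name is hypothetical, and the rule that an explicit enable_thinking wins when both keys are present is an assumption rather than confirmed behaviour.

```python
def normalize_chat_template_kwargs_sketch(extra_body: dict) -> dict:
    """Rewrite the legacy chat_template_kwargs.thinking alias to enable_thinking."""
    ctk = extra_body.get("chat_template_kwargs")
    if not isinstance(ctk, dict) or "thinking" not in ctk:
        return extra_body  # nothing to normalize
    normalized = dict(ctk)  # copy so the caller's config is not mutated
    legacy_value = normalized.pop("thinking")
    # Assumption: if enable_thinking is already set, it takes precedence over the alias.
    normalized.setdefault("enable_thinking", legacy_value)
    return {**extra_body, "chat_template_kwargs": normalized}


old_style = {"chat_template_kwargs": {"thinking": True}}
print(normalize_chat_template_kwargs_sketch(old_style))
# {'chat_template_kwargs': {'enable_thinking': True}}
```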

Data Flow

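This section traces the complete call chain of one create_chat_model("doubao-seed-1.8", thinking_enabled=True) call.

Key steps, in order:

1. The caller (for example lead_agent) passes the model name and thinking_enabled (backend/packages/harness/deerflow/agents/lead_agent/agent.py:435).
2. The factory fetches that model's ModelConfig with get_model_config(name) (backend/packages/harness/deerflow/models/factory.py:62).
3. resolve_class imports the class named by the use path through reflection and checks that it is a BaseChatModel subclass (backend/packages/harness/deerflow/models/factory.py:65).
4. When both thinking_enabled and supports_thinking are true, effective_wte is merged into the constructor arguments (backend/packages/harness/deerflow/models/factory.py:88-92).
5. The model is instantiated from the resolved class and the merged arguments (backend/packages/harness/deerflow/models/factory.py:150).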

Implementation Details

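The thinking override logic is the most intricate part of the factory. It first merges the effective "enabled" settings into effective_wte, then handles each scenario separately: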

python

# Merge the thinking shortcut field into when_thinking_enabled
has_thinking_settings = (model_config.when_thinking_enabled is not None) or (model_config.thinking is not None)
effective_wte: dict = dict(model_config.when_thinking_enabled) if model_config.when_thinking_enabled else {}
if model_config.thinking is not None:
    merged_thinking = {**(effective_wte.get("thinking") or {}), **model_config.thinking}
    effective_wte = {**effective_wte, "thinking": merged_thinking}
if thinking_enabled and has_thinking_settings:
    if not model_config.supports_thinking:
        raise ValueError(f"Model {name} does not support thinking. ...")
    if effective_wte:
        model_settings_from_config.update(effective_wte)

backend/packages/harness/deerflow/models/factory.py:83-92

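Interpretation: the thinking field takes precedence over when_thinking_enabled.thinking (in {**old, **new}, the later entries override the earlier ones). If thinking is requested for a model that has thinking settings but does not declare supports_thinking, the factory raises immediately instead of silently ignoring the request.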
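A worked example makes the precedence concrete. The helper below repeats the merge expressions from the excerpt as a standalone function; the function name and the sample values (an Anthropic-style budget_tokens and a temperature) are illustrative assumptions.

```python
def merge_enabled_settings(when_thinking_enabled, thinking):
    """Fold the thinking shortcut into when_thinking_enabled, shortcut keys winning."""
    effective = dict(when_thinking_enabled) if when_thinking_enabled else {}
    if thinking is not None:
        merged_thinking = {**(effective.get("thinking") or {}), **thinking}
        effective = {**effective, "thinking": merged_thinking}
    return effective


# Illustrative values only.
wte = {"thinking": {"type": "enabled", "budget_tokens": 1024}, "temperature": 1.0}
shortcut = {"budget_tokens": 4096}

print(merge_enabled_settings(wte, shortcut))
# {'thinking': {'type': 'enabled', 'budget_tokens': 4096}, 'temperature': 1.0}
```

The shortcut replaces budget_tokens but leaves type and the sibling temperature key untouched, because the merge is per key inside thinking and other top-level keys are copied through.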

When thinking is off (not thinking_enabled), the branches are matched in this order: explicit user-provided disable settings, then OpenAI-compatible gateway, then vLLM, then Anthropic native:

python

if not thinking_enabled:
    if model_config.when_thinking_disabled is not None:
        model_settings_from_config.update(model_config.when_thinking_disabled)  # explicit user settings win
    elif has_thinking_settings and effective_wte.get("extra_body", {}).get("thinking", {}).get("type"):
        model_settings_from_config["extra_body"] = _deep_merge_dicts(..., {"thinking": {"type": "disabled"}})
        model_settings_from_config["reasoning_effort"] = "minimal"
    elif has_thinking_settings and (disable_ctk := _vllm_disable_chat_template_kwargs(...)):
        model_settings_from_config["extra_body"] = _deep_merge_dicts(..., {"chat_template_kwargs": disable_ctk})
    elif has_thinking_settings and effective_wte.get("thinking", {}).get("type"):
        model_settings_from_config["thinking"] = {"type": "disabled"}  # Anthropic native

backend/packages/harness/deerflow/models/factory.py:93-112

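Interpretation: the factory infers the provider type from the shape of when_thinking_enabled. The presence of extra_body.thinking.type indicates an OpenAI-compatible gateway, extra_body.chat_template_kwargs.{thinking,enable_thinking} indicates vLLM, and a top-level thinking.type indicates Anthropic native. This lets the factory emit the correct "off" expression automatically, without introducing a new config field.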
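The shape-based inference can be condensed into one function that returns only the override to apply. This is a sketch under assumptions: the function name is hypothetical, and setting each detected vLLM key to False is a guess at what _vllm_disable_chat_template_kwargs returns. The real factory also merges the result into any existing extra_body rather than returning it bare.

```python
def infer_disable_override(effective_wte: dict) -> dict:
    """Infer the 'thinking off' override from the shape of the enabled settings."""
    extra_body = effective_wte.get("extra_body") or {}

    # OpenAI-compatible gateway: extra_body.thinking.type is present.
    if (extra_body.get("thinking") or {}).get("type"):
        return {"extra_body": {"thinking": {"type": "disabled"}}, "reasoning_effort": "minimal"}

    # vLLM: chat_template_kwargs carries thinking or enable_thinking.
    ctk = extra_body.get("chat_template_kwargs") or {}
    vllm_keys = [key for key in ("thinking", "enable_thinking") if key in ctk]
    if vllm_keys:
        # Assumption: disabling means setting each detected key to False.
        return {"extra_body": {"chat_template_kwargs": {key: False for key in vllm_keys}}}

    # Anthropic native: top-level thinking.type is present.
    if (effective_wte.get("thinking") or {}).get("type"):
        return {"thinking": {"type": "disabled"}}

    return {}


print(infer_disable_override({"thinking": {"type": "enabled"}}))
# {'thinking': {'type': 'disabled'}}
```

Checking the gateway and vLLM shapes before the top-level thinking key mirrors the branch order in the excerpt.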

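When supports_reasoning_effort is false, reasoning_effort is popped from both kwargs and the config-derived settings so that it is never passed to a model that does not support it (backend/packages/harness/deerflow/models/factory.py:113-115).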
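As a minimal sketch of that filtering step (the function name and the sample dicts are hypothetical, and the real factory pops the key in place rather than returning copies):

```python
def drop_unsupported_reasoning_effort(supports_reasoning_effort, config_settings, call_kwargs):
    """Return copies of both dicts, without reasoning_effort when it is unsupported."""
    config_settings = dict(config_settings)
    call_kwargs = dict(call_kwargs)
    if not supports_reasoning_effort:
        config_settings.pop("reasoning_effort", None)
        call_kwargs.pop("reasoning_effort", None)
    return config_settings, call_kwargs


settings, kwargs = drop_unsupported_reasoning_effort(
    False,
    {"model": "my-model-id", "reasoning_effort": "minimal"},
    {"reasoning_effort": "high", "temperature": 0.0},
)
print(settings, kwargs)  # {'model': 'my-model-id'} {'temperature': 0.0}
```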

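supports_vision does not affect construction in the factory (it is excluded); it is read where the tools are assembled: when true, view_image_tool is added to the agent's tool set (backend/packages/harness/deerflow/tools/tools.py:108-111).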

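Extension guide: adding a custom model provider

This applies when you need to override LangChain's default behaviour (for example, preserving a backend's non-standard field or rewriting stream parsing). A minimal template modelled on VllmChatModel: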

python

# backend/packages/harness/deerflow/models/my_provider.py
from typing import Any

from langchain_core.outputs import ChatResult
from langchain_openai import ChatOpenAI  # or any other BaseChatModel subclass


class MyChatModel(ChatOpenAI):
    """Custom provider: incremental tweaks on top of the standard OpenAI-compatible protocol."""

    model_config = {"arbitrary_types_allowed": True}

    @property
    def _llm_type(self) -> str:
        return "my-openai-compatible"

    def _create_chat_result(self, response: Any, generation_info: dict | None = None) -> ChatResult:
        result = super()._create_chat_result(response, generation_info=generation_info)
        # Preserve your non-standard fields here ...
        return result

The corresponding config.yaml entry:

yaml

models:
  - name: my-model
    display_name: My Model
    use: deerflow.models.my_provider:MyChatModel   # module:Class
    model: my-model-id
    api_key: $MY_API_KEY
    base_url: http://localhost:8000/v1
    supports_thinking: false
    supports_vision: false

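Constraint checklist (every item is read off the validation code in the factory and the reflection layer; these are hard requirements, not suggestions):

use must have the form module.path:ClassName; if the : is missing, resolve_variable treats the path as invalid and raises ImportError (backend/packages/harness/deerflow/reflection/resolvers.py:44-46).
The resolved object must be a type and a subclass of BaseChatModel; otherwise resolve_class raises ValueError (backend/packages/harness/deerflow/reflection/resolvers.py:89-93).
name is required and should be unique; use and model are required (declared as Field(...) in ModelConfig) (backend/packages/harness/deerflow/config/model_config.py:7-14).
The constructor receives every config item except the meta fields (model_dump(exclude_none=True, exclude={...})), so the provider class must accept these kwargs (or inherit from a base class that accepts extras) (backend/packages/harness/deerflow/models/factory.py:66-80).
To support thinking, supports_thinking: true must be set; otherwise the factory raises when thinking_enabled=True (backend/packages/harness/deerflow/models/factory.py:88-90).
If the class's model_fields contains stream_usage, the factory sets it to True by default so that middleware can collect token usage (backend/packages/harness/deerflow/models/factory.py:146-148).
Known provider module names should be added to MODULE_TO_PACKAGE_HINTS so that the missing-package hint names the correct package (backend/packages/harness/deerflow/reflection/resolvers.py:3-8).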
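Several of these constraints can be checked on a raw config entry before the factory ever runs. The pre-flight check below is a hypothetical helper, not part of DeerFlow; it covers only the rules that can be verified without importing the provider class.

```python
def check_model_entry(entry: dict) -> list[str]:
    """Return a list of human-readable problems found in one models[] entry."""
    problems = []
    for field in ("name", "use", "model"):
        if not entry.get(field):
            problems.append(f"missing required field: {field}")

    use = entry.get("use") or ""
    if use and ":" not in use:
        problems.append(f"use must be 'module.path:ClassName', got {use!r}")

    has_thinking_settings = (
        entry.get("when_thinking_enabled") is not None or entry.get("thinking") is not None
    )
    if has_thinking_settings and not entry.get("supports_thinking", False):
        problems.append("thinking settings are present but supports_thinking is not true")
    return problems


bad_entry = {
    "name": "my-model",
    "use": "langchain_openai.ChatOpenAI",  # dot instead of colon
    "model": "my-model-id",
    "thinking": {"type": "enabled"},
}
for problem in check_model_entry(bad_entry):
    print(problem)
```

The subclass check and the kwargs-acceptance rule still need the real class, so they remain the job of resolve_class and the constructor call.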

Configuration

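Each models[] entry corresponds to one ModelConfig. All fields in the table below come from the schema and the example file:

Field	Meaning	Required	Source
`name`	Unique model name; models are selected by it at runtime	Yes	model_config.py:7
`display_name`	Name shown in the frontend	No	model_config.py:8
`description`	Model description	No	model_config.py:9
`use`	Provider class path, `module:Class`	Yes	model_config.py:10
`model`	Backend model ID	Yes	model_config.py:14
`use_responses_api`	Whether to use OpenAI `/v1/responses`	No	model_config.py:16
`output_version`	Structured output version (for example `responses/v1`)	No	model_config.py:20
`supports_thinking`	Whether thinking is supported; enabling thinking while false raises	No (default false)	model_config.py:24
`supports_reasoning_effort`	Whether `reasoning_effort` is supported; the parameter is stripped while false	No (default false)	model_config.py:25
`when_thinking_enabled`	Extra settings merged in when thinking is on	No	model_config.py:26
`when_thinking_disabled`	Extra settings merged in when thinking is off (explicit user settings win)	No	model_config.py:30
`supports_vision`	When true, `view_image_tool` is added to the agent's tool set	No (default false)	model_config.py:34
`thinking`	Shorthand for `when_thinking_enabled`; the two are merged	No	model_config.py:35
`api_key` / `base_url` / others	Passed through to the provider constructor via `extra="allow"`	Depends on the provider	model_config.py:15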

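$ENV and security note: any config value that starts with $ is resolved by resolve_env_variables via os.getenv, and a missing environment variable raises ValueError immediately (backend/packages/harness/deerflow/config/app_config.py:280-285). Always use references such as api_key: $OPENAI_API_KEY and never write plaintext keys into config.yaml, because that file is often checked into version control.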
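The resolution rule can be sketched as a recursive walk over the loaded config. The function name below is hypothetical, and the recursion into nested dicts and lists is an assumption about how the real resolve_env_variables traverses the config; the $ prefix rule, os.getenv, and the ValueError on a missing variable come from the paragraph above.

```python
import os


def resolve_env_sketch(value):
    """Replace '$NAME' strings with os.getenv('NAME'), failing loudly when unset."""
    if isinstance(value, str) and value.startswith("$"):
        env_name = value[1:]
        resolved = os.getenv(env_name)
        if resolved is None:
            raise ValueError(f"environment variable {env_name} is not set")
        return resolved
    if isinstance(value, dict):  # assumption: nested mappings are walked recursively
        return {key: resolve_env_sketch(item) for key, item in value.items()}
    if isinstance(value, list):  # assumption: lists such as models[] are walked too
        return [resolve_env_sketch(item) for item in value]
    return value


os.environ["DEMO_API_KEY"] = "secret-for-demo"  # demo value, set only for this example
config = {"models": [{"name": "my-model", "api_key": "$DEMO_API_KEY"}]}
print(resolve_env_sketch(config)["models"][0]["api_key"])  # secret-for-demo
```

Failing at load time rather than at the first request keeps a missing key from surfacing later as an opaque authentication error.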

Example configuration for vLLM served behind an OpenAI-compatible endpoint (note that enable_thinking sits under extra_body.chat_template_kwargs):

yaml

- name: qwen3-32b-vllm
  use: deerflow.models.vllm_provider:VllmChatModel
  model: Qwen/Qwen3-32B
  api_key: $VLLM_API_KEY
  base_url: http://localhost:8000/v1
  supports_thinking: true
  when_thinking_enabled:
    extra_body:
      chat_template_kwargs:
        enable_thinking: true

config.example.yaml:316-330
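When thinking is switched off for an entry like this, the factory merges a disabling override into extra_body instead of replacing it. The sketch below shows one plausible recursive merge; deep_merge is a hypothetical stand-in for _deep_merge_dicts, whose exact semantics are not shown in this chapter, and top_k is an illustrative extra key.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into a copy of base; override wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# extra_body as configured for the enabled case, plus an unrelated sampling key.
enabled_extra_body = {"chat_template_kwargs": {"enable_thinking": True}, "top_k": 20}
disable_override = {"chat_template_kwargs": {"enable_thinking": False}}

print(deep_merge(enabled_extra_body, disable_override))
# {'chat_template_kwargs': {'enable_thinking': False}, 'top_k': 20}
```

A shallow dict.update would give the same result here, but it would discard sibling keys inside chat_template_kwargs as soon as the override carried only one of them.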

Common Pitfalls / Tips

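Writing use with a dot fails: it must be module:Class (with a colon); langchain_openai.ChatOpenAI is rejected by resolve_variable as an invalid path (backend/packages/harness/deerflow/reflection/resolvers.py:44-46). The colon form langchain_openai:ChatOpenAI used in the example comments of config.yaml is the correct one.
Enabling thinking without supports_thinking: true: the factory raises ValueError instead of silently degrading. This is by design, so that nobody assumes thinking is active when it is not (backend/packages/harness/deerflow/models/factory.py:88-90). Note, however, that lead_agent performs a soft downgrade first: when thinking_enabled is true but the model does not support it, it logs a warning and resets the flag to False, so the exception is not triggered on the agent path (backend/packages/harness/deerflow/agents/lead_agent/agent.py:379-381).
Empty token usage behind third-party OpenAI-compatible gateways: LangChain enables stream_usage by default only when no custom base_url is set. The factory automatically adds stream_usage=True for a ChatOpenAI that has base_url set; without it, TokenUsageMiddleware receives no data (backend/packages/harness/deerflow/models/factory.py:34-47).
Using the old thinking alias with vLLM: it still works, because VllmChatModel rewrites chat_template_kwargs.thinking to enable_thinking before sending; new configs should nevertheless use enable_thinking directly (backend/packages/harness/deerflow/models/vllm_provider.py:56-62).
Vision does not affect model construction: supports_vision does not change the provider's instantiation arguments; it only decides whether view_image_tool enters the tool set and whether middleware injects images (backend/packages/harness/deerflow/tools/tools.py:108-111).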
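The stream_usage pitfall above comes down to a small defaulting rule. The sketch below models it with plain data: model_field_names stands in for the keys of a class's model_fields, the function name is hypothetical, and leaving an explicitly configured value untouched is an assumption.

```python
def with_stream_usage_default(model_field_names, settings: dict) -> dict:
    """Default stream_usage to True when the class declares the field and config is silent."""
    if "stream_usage" in model_field_names and settings.get("stream_usage") is None:
        return {**settings, "stream_usage": True}
    return dict(settings)


# A class that declares stream_usage, configured with a custom base_url.
gateway_settings = {"model": "my-model-id", "base_url": "http://localhost:8000/v1"}
print(with_stream_usage_default({"model", "base_url", "stream_usage"}, gateway_settings))
# {'model': 'my-model-id', 'base_url': 'http://localhost:8000/v1', 'stream_usage': True}
```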

References

backend/packages/harness/deerflow/models/factory.py: the full create_chat_model factory flow
backend/packages/harness/deerflow/models/vllm_provider.py: the reasoning-preserving VllmChatModel implementation
backend/packages/harness/deerflow/config/model_config.py: the ModelConfig schema definition
backend/packages/harness/deerflow/reflection/resolvers.py: reflection loading and missing-package install hints
backend/packages/harness/deerflow/config/app_config.py: $ENV resolution and get_model_config
backend/packages/harness/deerflow/agents/lead_agent/agent.py: the factory's main caller and the thinking soft downgrade
config.example.yaml: the complete models[] configuration example

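Related Pages

Page	Relation
./04-配置系统与AppConfig.md	Upstream: how `AppConfig` loads and caches `config.yaml`, and where `get_model_config` and `$ENV` resolution live
./34-反射与动态加载机制.md	Parallel: the general reflection mechanism behind `resolve_class` / `resolve_variable`; this chapter is its application in the model layer
./10-LeadAgent与Agent工厂.md	Downstream: how the agent factory calls `create_chat_model` and resolves the model name across three levels (request, agent, default)
./35-可观测性-Tracing与Token用量.md	Downstream: how automatic `stream_usage` enabling and attached tracing callbacks support token usage accounting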
