Autonomous Tool Creation for Agents: From Tool Discovery to Code Generation and Self-Validation

📅 Published: 2026/7/5 8:59:59

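Mainstream Agent frameworks today rely on a predefined tool set: developers write tools for search, calculation, database queries, and so on ahead of time, and the Agent picks among them at run time. When a user request falls outside what those tools cover, the Agent is stuck with no tool to use. Autonomous Tool Creation is the next leap in Agent capability: the Agent writes, tests, and uses the tools it needs by itself.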
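To make the "no tool available" situation concrete, here is a minimal sketch of a fixed tool registry. The tool names, the `dispatch` function, and the `NoToolAvailable` exception are all hypothetical and chosen only for illustration; real frameworks differ in how they register and select tools.

```python
from typing import Callable, Dict

# Hypothetical predefined tool set; names and behavior are illustrative only.
PREDEFINED_TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"results for {query}",
    "calculate": lambda expr: str(sum(float(x) for x in expr.split("+"))),
}


class NoToolAvailable(LookupError):
    """Raised when no predefined tool covers the request."""


def dispatch(tool_name: str, argument: str) -> str:
    """Call a predefined tool by name, or fail if none is registered."""
    tool = PREDEFINED_TOOLS.get(tool_name)
    if tool is None:
        # This is the point where autonomous tool creation would take over.
        raise NoToolAvailable(tool_name)
    return tool(argument)
```

A request such as `dispatch("pdf_to_csv", "report.pdf")` raises `NoToolAvailable`, which is the gap the rest of the article addresses.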

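## 1. Three Paradigms of Tool Creation

### Paradigm 1: Code as a Tool

The Agent generates a Python function directly from the task requirements, executes it in a sandbox, and uses the return value as the tool-call result:

```python
import re


class CodeToolCreator:
    def __init__(self, sandbox, code_llm):
        self.sandbox = sandbox
        self.llm = code_llm

    @staticmethod
    def _extract_code(reply):
        """Strip the Markdown code fence that the prompt asks the model to emit."""
        match = re.search(r"`{3}(?:python)?\s*\n(.*?)`{3}", reply, re.DOTALL)
        return match.group(1) if match else reply

    def create_and_run(self, task_description, context):
        # Step 1: generate the tool code
        tool_code = self._extract_code(self.llm.generate(
            task=f"""Write a Python function that solves the task below.
The function should be self-contained and must not depend on libraries that are not installed.
Include type annotations for the input parameters and a docstring.
If data is needed, take it from the context.

Task: {task_description}
Available context data: {context}

Output only the function code, wrapped in a python code fence."""
        ))

        # Step 2: execute in the sandbox
        result = self.sandbox.execute(tool_code, timeout=30)

        if result.error:
            # Step 3: self-verify and repair (a single retry)
            fixed_code = self._extract_code(self.llm.generate(
                task=f"Fix the error in the following code:\n{tool_code}\n\nError: {result.error}"
            ))
            result = self.sandbox.execute(fixed_code, timeout=30)

        # The output may be empty if the repaired code still fails
        return result.output
```

### Paradigm 2: API as a Tool

The Agent discovers available APIs through an OpenAPI specification or documentation and wraps them automatically as callable tools:

```python
import requests

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


class APIToolDiscovery:
    def __init__(self, base_url, headers=None):
        self.base_url = base_url
        self.headers = headers or {}

    def discover_from_openapi(self, openapi_spec_url):
        """Extract the available tools from an OpenAPI specification (JSON)."""
        spec = requests.get(openapi_spec_url, timeout=10).json()
        tools = []
        for path, methods in spec["paths"].items():
            for method, details in methods.items():
                if method not in HTTP_METHODS:
                    continue  # skip path-level keys such as "parameters"
                tool = {
                    "name": details.get("operationId", f"{method}_{path}"),
                    "description": details.get("summary", ""),
                    "parameters": self._extract_params(details),
                    "endpoint": path,
                    "method": method.upper(),
                }
                tools.append(tool)
        return tools

    def _extract_params(self, details):
        """Minimal version: collect the names of the declared parameters."""
        return [p.get("name") for p in details.get("parameters", [])]

    def create_tool_wrapper(self, api_tool):
        """Build an executable wrapper for an API tool."""
        def wrapper(**kwargs):
            # Query string for GET, JSON body for everything else
            payload = {"params": kwargs} if api_tool["method"] == "GET" else {"json": kwargs}
            response = requests.request(
                method=api_tool["method"],
                url=self.base_url + api_tool["endpoint"],
                headers=self.headers,
                timeout=30,
                **payload,
            )
            response.raise_for_status()
            return response.json()
        return wrapper
```

The engineering value of API discovery is that it lets the Agent extend its capability boundary dynamically. When a user raises a new requirement that the predefined tools cannot meet, the Agent can search for relevant API documentation, interpret the interface specification, generate an executable wrapper, and put it to use after validating it in a sandbox. The process can run without manual intervention, forming a closed loop from the user's description of the need to the Agent creating and using the tool.

In practice, the API discovery pattern works especially well in data analysis and automated testing. For example, when a user wants usage statistics from a SaaS platform's API, the Agent can locate the platform's OpenAPI document, extract the relevant endpoints, wrap them as callable tools, run the query, and return the result. This moves the Agent from using predefined tools to creating tools on demand, which greatly enlarges the space of problems it can handle.

### Paradigm 3: Composite Tool Creation

The Agent combines several existing tools into a new composite tool:

```python
import json


class CompositeToolCreator:
    def __init__(self, llm):
        self.llm = llm

    def create(self, goal, available_tools):
        prompt = f"""Goal: {goal}

Available tools:
{self._format_tools(available_tools)}

Design a tool-composition plan and describe the execution flow in JSON:
{{
  "new_tool_name": "...",
  "description": "...",
  "pipeline": [
    {{"step": 1, "tool": "search", "input_from": "user_query"}},
    {{"step": 2, "tool": "summarize", "input_from": "step_1_output"}},
    {{"step": 3, "tool": "translate", "input_from": "step_2_output"}}
  ]
}}"""
        plan = self.llm.generate(prompt, response_format="json")
        return self._compile_pipeline(plan, available_tools)

    def _format_tools(self, tools):
        # Assumes available_tools maps tool names to callables
        return "\n".join(f"- {name}" for name in tools)

    def _compile_pipeline(self, plan, tools):
        if isinstance(plan, str):
            plan = json.loads(plan)
        steps = sorted(plan["pipeline"], key=lambda s: s["step"])

        def composite(user_query):
            outputs = {"user_query": user_query}
            result = user_query
            for step in steps:
                result = tools[step["tool"]](outputs[step["input_from"]])
                outputs[f"step_{step['step']}_output"] = result
            return result
        return composite
```

## 2. Self-Validation Mechanisms

Tools created by an Agent cannot be trusted blindly: the code may contain bugs, and an API may return abnormal data. Self-validation is the key to using self-created tools safely.

### Automatic Test-Case Generation

```python
import json
from dataclasses import dataclass, field


@dataclass
class ValidationResult:
    pass_rate: float
    details: list = field(default_factory=list)


class ToolValidator:
    def __init__(self, sandbox, llm):
        self.sandbox = sandbox
        self.llm = llm

    def validate(self, tool_code, tool_signature):
        # Generate test cases
        test_cases = self._generate_test_cases(tool_signature)
        results = []
        for case in test_cases:
            try:
                result = self.sandbox.execute(tool_code, input_data=case["input"])
                if result.error:  # same result object as in CodeToolCreator
                    raise RuntimeError(result.error)
                output = result.output
                # Check the output type and format
                if not self._check_output(output, case["expected_type"]):
                    results.append({"case": case, "status": "type_error"})
                    continue
                # Check semantic correctness (judged by the LLM)
                if not self._semantic_check(case["input"], output, case["description"]):
                    results.append({"case": case, "status": "semantic_error"})
                else:
                    results.append({"case": case, "status": "passed"})
            except Exception as e:
                results.append({"case": case, "status": "runtime_error", "error": str(e)})

        passed = sum(1 for r in results if r["status"] == "passed")
        pass_rate = passed / len(results) if results else 0.0
        return ValidationResult(pass_rate=pass_rate, details=results)

    def _generate_test_cases(self, signature):
        prompt = f"""Generate 5 test cases for the following function (covering normal, boundary, and error scenarios).
Function signature: {signature}
Output a JSON array in which each element contains input, expected_type, and description."""
        return json.loads(self.llm.generate(prompt))

    # Minimal placeholder checks; replace them with checks suited to your tools.
    def _check_output(self, output, expected_type):
        return type(output).__name__ == expected_type

    def _semantic_check(self, case_input, output, description):
        verdict = self.llm.generate(
            f"Case: {description}\nInput: {case_input}\nOutput: {output}\n"
            "Is the output correct? Answer yes or no."
        )
        return verdict.strip().lower().startswith("yes")
```

### Runtime Guardrails

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToolResult:
    success: bool
    output: Any = None
    error: Optional[str] = None


class SafeToolExecutor:
    def __init__(self, sandbox, max_calls=5, max_runtime=30):
        self.sandbox = sandbox
        self.max_calls = max_calls
        self.max_runtime = max_runtime
        self.calls_made = 0

    def _check_call_budget(self):
        if self.calls_made >= self.max_calls:
            raise RuntimeError("call_budget_exceeded")
        self.calls_made += 1

    def _sanitize_input(self, args):
        return args  # hook: replace with schema validation

    def _validate_output(self, result, context):
        return result is not None  # hook: replace with task-specific checks

    def execute_tool(self, tool_func, args, context):
        # Resource limit
        self._check_call_budget()

        # Input validation
        sanitized = self._sanitize_input(args)

        # Execution
        try:
            result = self.sandbox.run(
                tool_func,
                sanitized,
                timeout=self.max_runtime,
                network_restricted=True,  # no network access
                fs_restricted=True,       # restricted file-system access
            )
        except TimeoutError:
            return ToolResult(success=False, error="execution_timeout")
        except Exception as e:
            return ToolResult(success=False, error=str(e))

        # Output validation
        if not self._validate_output(result, context):
            return ToolResult(success=False, error="output_validation_failed")
        return ToolResult(success=True, output=result)
```

## 3. Tool Caching and Reuse

Self-created tools should not be disposable. A tool cache lets the Agent reuse existing tools in later, similar tasks:

```python
from datetime import datetime


class ToolCache:
    def __init__(self, embedding_model, store):
        self.embedder = embedding_model
        self.store = store  # vector database

    def find_similar_tool(self, task_description):
        """Look up a cached tool by task description."""
        embedding = self.embedder.encode(task_description)
        hits = self.store.search(embedding, top_k=3, min_similarity=0.88)
        for hit in hits:
            tool = self.store.get(hit.id)
            if tool.is_valid and tool.compatibility_verified:
                return tool
        return None

    def cache_tool(self, task_description, tool_code, test_results):
        # Only cache tools that passed at least 80% of their test cases
        if test_results.pass_rate >= 0.8:
            embedding = self.embedder.encode(task_description)
            self.store.insert(
                embedding=embedding,
                tool_code=tool_code,
                task_description=task_description,
                created_at=datetime.now(),
                usage_count=0,
                success_rate=test_results.pass_rate,
            )
```

## 4. Security Boundary Design

Autonomous tool creation introduces new security risks: code generated by the Agent may contain malicious operations, or a bug may corrupt data.

### Tiered Permission Model

```python
TOOL_PERMISSIONS = {
    "readonly_fs": {  # read-only file system
        "allowed_ops": ["read", "list"],
        "forbidden_ops": ["write", "delete", "execute"],
        "sandbox": True,
    },
    "network_restricted": {  # restricted network
        "allowed_domains": ["api.weather.com", "api.finance.com"],
        "forbidden_domains": ["*"],  # deny everything not explicitly allowed
    },
    "full_access": {  # full access (requires human approval)
        "requires_approval": True,
        "audit_log": True,
    },
}


def assign_permission(tool_code):
    """Assign a permission level automatically based on the code content.

    Substring checks are only a first-pass heuristic: they miss equivalent
    calls such as shutil.rmtree or requests.post.
    """
    if "import os" in tool_code and "os.remove" in tool_code:
        return "full_access"  # involves file deletion, needs approval
    if "requests.get" in tool_code:
        return "network_restricted"
    return "readonly_fs"
```

### Human-in-the-Loop Security Strategy

In real deployments of autonomous tool creation, a fully automated setup is too risky. A pragmatic strategy is to add a human-in-the-loop step: a newly created tool enters the cache only after human review, while tool patterns that have already been approved may be reused directly by the Agent in similar scenarios, with a complete audit log.

The review does not focus on whether the code logic is correct, which the self-validation mechanism is meant to cover, but on whether the tool's behavioral intent is safe. For example, a "send email" tool may be functionally correct, yet if it was created in the context of "send user data to an external server", a human has to decide whether to allow it. Security judgments at this semantic level cannot currently be fully automated and must rely on human review.

Another important security strategy is tool lifecycle management. Self-created tools should expire: if a tool has not been reused within 30 days, it is removed from the cache automatically. This limits cache bloat and keeps outdated tools from being misused. In addition, whenever the underlying data schema or API changes, the affected tools should be flagged automatically as "needs re-validation" so that they stay valid over time.

## Summary

Autonomous tool creation turns the Agent from a tool user into a tool creator. The three paradigms (code as a tool, API as a tool, and composite tools) cover scenarios ranging from writing from scratch to reuse and integration; self-validation safeguards tool quality; and the tool cache accumulates knowledge. Security boundary design is what makes this capability safe to deploy: sandboxed execution, tiered permissions, and runtime guardrails together protect the whole path from code generation to execution. Once an Agent can create and validate tools on its own, its capability boundary expands from "whatever the predefined tools cover" to "whatever problems can be solved programmatically".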
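As an illustration of the lifecycle rules from the security section (expiry after 30 days without reuse, and re-validation when an underlying schema or API changes), the following is a minimal sketch. The names `CachedTool`, `ToolLifecycleManager`, and `depends_on` are assumptions made for this example, and the in-memory dictionary only stands in for whatever cache store a real system would use.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List, Set

EXPIRY = timedelta(days=30)  # expiry window named in the text


@dataclass
class CachedTool:
    name: str
    last_used: datetime
    depends_on: Set[str] = field(default_factory=set)  # schemas/APIs the tool relies on
    needs_revalidation: bool = False


class ToolLifecycleManager:
    """Hypothetical in-memory stand-in for the tool cache store."""

    def __init__(self) -> None:
        self.tools: Dict[str, CachedTool] = {}

    def add(self, tool: CachedTool) -> None:
        self.tools[tool.name] = tool

    def evict_expired(self, now: datetime) -> List[str]:
        """Remove tools that have not been reused within the expiry window."""
        expired = [n for n, t in self.tools.items() if now - t.last_used > EXPIRY]
        for name in expired:
            del self.tools[name]
        return expired

    def mark_dependency_changed(self, dependency: str) -> List[str]:
        """Flag every tool that relies on a changed schema or API."""
        flagged = []
        for tool in self.tools.values():
            if dependency in tool.depends_on:
                tool.needs_revalidation = True
                flagged.append(tool.name)
        return flagged


# Usage with made-up tools and dates
now = datetime(2026, 7, 5)
manager = ToolLifecycleManager()
manager.add(CachedTool("weather_report", now - timedelta(days=45), {"weather_api_v1"}))
manager.add(CachedTool("sales_summary", now - timedelta(days=3), {"orders_table"}))

evicted = manager.evict_expired(now)                       # tool idle for 45 days is removed
flagged = manager.mark_dependency_changed("orders_table")  # remaining tool is flagged
print(evicted, flagged)
```

Tracking dependencies as plain strings keeps the sketch short; a production system would more likely key them to schema versions or API specification hashes so that changes can be detected automatically.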