How the $500-a-Month Devin Works
Using Devin
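Over the past couple of days I had a chance to try Devin, and it left me mildly stunned.
I had already used Cursor and Windsurf. They run on the same models, so in theory the tasks they can complete and their level of intelligence should be about the same. Yet Devin feels different, as if AGI had already arrived.
The core difference between Cursor and Devin is the interaction paradigm. Cursor is a copilot: it assists you in real time while you write code. Devin is an employee: you hand it a complex task, leave it alone, and it takes the initiative to get the job done. Many of the tasks the two can complete today overlap, but the experience differs.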
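One of the tasks I tried was to give it the GitHub URL of the open-source project JSPatch and tell it to find an issue and fix it. It broke the task down and executed it step by step:
- Visited GitHub, browsed the issue list, and opened a few issues at random to read the details
- Picked a block-related issue, browsed the project files to find the code associated with the issue, and drew up a fix plan
- Wrote test cases → modified the project code to try to fix the issue → ran the tests to verify (they did not run, because there was no iOS environment) → committed and opened a PR
The process is automatic and asynchronous. Devin runs in a virtual machine, so you do not need to provide an environment or watch over it. It researches on its own how to complete the task and tells you when it is done. (With Slack the experience is even smoother: you @-mention it to assign a task, and it replies in Slack when the task is finished.) This closely matches the experience of assigning a task to an employee and reviewing the result once they finish.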
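Outlook
Devin's problem-solving ability is certainly still limited. In real use it stumbles a lot, and many tasks do not get completed well. With today's model capabilities, a copilot form like Cursor is more practical. But the ideal form in the future is surely the "employee" form like Devin, because it frees up your attention and lets you scale, almost without limit, the number of things you can do at the same time.
There are reasons to be optimistic about how fast this form will improve:
- The reasoning and planning abilities of base models have not converged yet. The progression from Sonnet and o1 to o3 shows they are still improving rapidly.
- Even if progress in base models slows, there is still plenty of room for domain-specific tuning.
- Integrating more tools will also bring stronger capabilities and a better experience.
I also expect model costs to fall faster than Moore's law. Devin will keep using the best models and the most expensive approach, but the results that cost $500 today may cost only $5 a year from now.
The concept behind Devin was proposed by AutoGPT back in early 2023, but model capabilities were not there yet. It only became usable once models with the reasoning ability of Claude Sonnet, GPT-4o, and o1 appeared. Devin is a first usable prototype of this concept, and it shows that the direction is basically ready; what remains is to keep optimizing and expanding along it. Devin really can be called the beginning of digital employees. I expect designer agents, trader agents, data analyst agents, and e-commerce agents to appear one after another.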
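How It Works
How is Devin implemented?
There is an open-source project, OpenHands (formerly OpenDevin), that tries to build something similar to Devin through the open-source community. Its capability and results do not fully match Devin's, but it gives a view of the basic shape. Related paper: https://arxiv.org/abs/2407.16741
The architecture diagram in the paper describes fairly well how OpenHands works: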
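It has three main parts:
- Event Stream: records every instruction and its execution result. An Action is an instruction, and an Observation is the result of the tool executing that instruction. Both Actions and Observations are ultimately recorded as plain text.
- Runtime: the program runs in an isolated Docker container and exposes tools for the Agent to call. These currently include Python execution, a terminal command line, and a browser.
- Agent: takes everything in the Event Stream as context and feeds it to the LLM to infer the next Action.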
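To make the Event Stream idea concrete, here is a minimal sketch in Python. The class names `Action`, `Observation`, and `EventStream`, their fields, and the `create_file('app.py')` payload are all simplifications invented for illustration; they are not the actual OpenHands classes.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Action:
    """An instruction emitted by the agent (hypothetical shape)."""
    tool: str      # illustrative tool name, e.g. "execute_ipython_cell"
    content: str   # plain-text payload, e.g. the code or command to run


@dataclass
class Observation:
    """The plain-text result a tool produced for a preceding Action."""
    tool: str
    content: str


Event = Union[Action, Observation]


@dataclass
class EventStream:
    """Append-only log of everything that has happened in a session."""
    events: List[Event] = field(default_factory=list)

    def add(self, event: Event) -> None:
        self.events.append(event)

    def as_text(self) -> str:
        # Flatten every event to text, since text is what the LLM consumes.
        lines = []
        for e in self.events:
            kind = "ACTION" if isinstance(e, Action) else "OBSERVATION"
            lines.append(f"{kind} [{e.tool}]: {e.content}")
        return "\n".join(lines)


stream = EventStream()
stream.add(Action("execute_ipython_cell", "create_file('app.py')"))
stream.add(Observation("execute_ipython_cell", "app.py created"))
print(stream.as_text())
```

The point of the sketch is only that the whole history reduces to an ordered list of text records, which is what makes it easy to hand to an LLM as context.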
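One key point is not drawn in the diagram: why does feeding the entire Event Stream into the LLM make it produce the next Action in the required form? Because the LLM input contains not only the Event Stream context but also the Agent's own prompt. This prompt describes the handling principles, the current environment, the available tools, each tool's parameters, and the expected output format, and it comes with a worked example that guides the model to respond as required. The prompt itself is included at the end of this post.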
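A rough sketch of how the fixed prompt and the event history could be combined into one LLM request is shown here. `SYSTEM_PROMPT` is a heavily shortened stand-in for the real prompt, and `build_messages` and the `(role, text)` event shape are assumptions made for illustration, not OpenHands code.

```python
from typing import Dict, List, Tuple

# Hypothetical, heavily shortened stand-in for the real agent prompt.
SYSTEM_PROMPT = (
    "You are an agent that can interact with a computer to solve tasks.\n"
    "You have access to the following functions: execute_bash, finish.\n"
    "Reply with exactly one function call per message."
)

# Each event is (role, text): "user" for tasks and tool results,
# "assistant" for actions the model emitted earlier.
Event = Tuple[str, str]


def build_messages(events: List[Event]) -> List[Dict[str, str]]:
    """Prepend the fixed prompt to the full event history."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    for role, text in events:
        messages.append({"role": role, "content": text})
    return messages


events = [
    ("user", "Create a list of numbers from 1 to 10."),
    ("assistant", "<function=execute_bash>\n<parameter=command>pwd && ls</parameter>\n</function>"),
    ("user", "EXECUTION RESULT of [execute_bash]:\n/workspace"),
]
messages = build_messages(events)
print(len(messages))        # 4
print(messages[0]["role"])  # system
```

Because the prompt is re-sent on every turn, the model sees the tool descriptions and the output format each time it has to choose the next Action.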
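Let's follow the Event Stream example in the diagram and walk through the flow:
- The user enters a task, "Can you create a list of numbers from 1 to 10, and create a web page to display them at port 5000?", which is added to the Event Stream.
- At this point the Event Stream contains only this one message. It is passed to the Agent, which sends it to the LLM together with the preset prompt described above. The LLM infers that the next step is to call Python to create the file app.py, and the resulting Action is written to the Event Stream.
- The IPython tool picks up this Action and executes it in the Runtime, and the result, "app.py created", is recorded in the Event Stream as an Observation.
- Once the Event Stream receives the new Observation, the Agent automatically sends the whole Event Stream plus the preset prompt to the LLM again to infer the next Action. This time the Action is to write a large block of code into the newly created file through a Python call.
- The IPython tool receives this Action, executes it in the Runtime, and writes the result to the Event Stream.
From here the loop simply repeats: an Action triggers a tool → the result is written to the Event Stream as an Observation → the Agent sends all prior content to the LLM to get the next Action → the Action triggers a tool again…
Steps 6-9 in the diagram use the command-line and browser tools, and the flow is the same. When does the loop end? There is a special Action called finish. When the LLM considers a task complete, it emits a call to finish; the program exits the loop when it receives it and waits for the user's next input.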
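The loop described above can be sketched in a few lines of Python. Everything here is a toy under stated assumptions: `scripted_llm` replaces a real model call with a fixed script, the `TOOLS` table returns canned strings instead of executing anything, and the `tool: argument` action syntax is invented for brevity.

```python
from typing import Callable, Dict, List

# Hypothetical tools: each takes the action argument and returns plain text.
TOOLS: Dict[str, Callable[[str], str]] = {
    "ipython": lambda code: "app.py created",
    "bash": lambda cmd: "[1] 121",
}


def scripted_llm(history: List[str]) -> str:
    """Stand-in for a real LLM call: picks the next action from a script.

    A real agent would send the preset prompt plus `history` to a model.
    """
    script = ["ipython: create_file('app.py')", "bash: python3 app.py &", "finish:"]
    actions_so_far = sum(1 for line in history if line.startswith("ACTION"))
    return script[actions_so_far]


def run_agent(task: str, llm: Callable[[List[str]], str], max_steps: int = 10) -> List[str]:
    """Alternate between asking the LLM for an action and running a tool."""
    history = [f"USER: {task}"]
    for _ in range(max_steps):
        action = llm(history)                 # LLM infers the next Action
        history.append(f"ACTION: {action}")
        tool, _, arg = action.partition(":")
        if tool == "finish":                  # special Action that ends the loop
            break
        result = TOOLS[tool](arg.strip())     # tool runs in the runtime
        history.append(f"OBSERVATION: {result}")
    return history


for line in run_agent("Create a web page at port 5000", scripted_llm):
    print(line)
```

The `max_steps` cap is an addition of this sketch; it guards against a model that never emits finish.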
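Overall, it is an automatic loop: the LLM predicts the next action → the Agent program calls a tool to execute it. A few additional points:
- The project involves no model training. It is a pure engineering solution on top of general-purpose LLMs, and it can be configured to use Claude, GPT, DeepSeek, and others. A foreseeable evolution, though, is to tune models on user usage data to get better results in the domain.
- It does not implement what Devin does, which is to plan the steps of a task before emitting Actions and then execute them one by one. Adding such planning to the system described above should not be difficult.
- The project uses BrowserGym to interact with the browser. How an Agent operates and perceives a browser is a big topic in its own right, with dedicated benchmarks and multiple approaches; I have yet to look into it.
- As the chain of events in the Event Stream grows, the context gets longer and longer. Past a certain point OpenHands handles this in two ways. One is compression: the earlier context is sent to the LLM to be condensed into a summary, and the shorter summary is used as context from then on. The other is to have the LLM rank past content by importance and keep only the few records that matter most for predicting the next step. The logic is in condenser.py. Longer context is stored in the vector database ChromaDB.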
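The first strategy, replacing older context with a summary, might look roughly like the sketch below. This is not the logic in condenser.py: the function names, the `max_events` and `keep_recent` thresholds, and the choice to always keep the first event (the user's task) are assumptions for illustration, and `naive_summarize` stands in for the LLM call that would write a real summary.

```python
from typing import Callable, List


def condense(
    events: List[str],
    summarize: Callable[[List[str]], str],
    max_events: int = 6,
    keep_recent: int = 3,
) -> List[str]:
    """Replace older events with one summary once the history gets long."""
    if len(events) <= max_events:
        return events
    # Keep the first event (the user's task) and the most recent ones verbatim.
    head = events[:1]
    old = events[1:-keep_recent]
    recent = events[-keep_recent:]
    return head + [f"SUMMARY: {summarize(old)}"] + recent


def naive_summarize(old: List[str]) -> str:
    # Stand-in for an LLM call that would produce a real summary.
    return f"{len(old)} earlier steps omitted"


history = [f"event {i}" for i in range(10)]
short = condense(history, naive_summarize)
print(short)
```

The importance-ranking strategy would have the same shape, except that the middle section is filtered by a score from the LLM instead of being collapsed into one summary line.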
Appendix: the preset prompt
You are OpenHands agent, a helpful AI assistant that can interact with a computer to solve tasks.
<IMPORTANT>
//
* If user provides a path, you should NOT assume it's relative to the current working directory. Instead, you should explore the file system to find the file before working on it.
* When configuring git credentials, use "openhands" as the user.name and "openhands@all-hands.dev" as the user.email by default, unless explicitly instructed otherwise.
* The assistant MUST NOT include comments in the code unless they are necessary to describe non-obvious behavior.
RuntimeInfo(available_hosts={'http://localhost:54090': 54090, 'http://localhost:55602': 55602})
</IMPORTANT>
<RUNTIME_INFORMATION>
The user has access to the following hosts for accessing a web application,
each of which has a corresponding port:
* http://localhost:54090 (port 54090)
* http://localhost:55602 (port 55602)
When starting a web server, use the corresponding ports. You should also
set any options to allow iframes and CORS requests.
</RUNTIME_INFORMATION>
You have access to the following functions:
---- BEGIN FUNCTION #1: execute_bash ----
Description: Execute a bash command in the terminal.
* Long running commands: For commands that may run indefinitely, it should be run in the background and the output should be redirected to a file, e.g. command = `python3 app.py > server.log 2>&1 &`.
* Interactive: If a bash command returns exit code `-1`, this means the process is not yet finished. The assistant must then send a second call to terminal with an empty `command` (which will retrieve any additional logs), or it can send additional text (set `command` to the text) to STDIN of the running process, or it can send command like `C-c` (Ctrl+C) to interrupt the process.
Parameters:
(1) command (string, required): The bash command to execute. Can be empty string to view additional logs when previous exit code is `-1`. Can be `C-c` (Ctrl+C) to interrupt the currently running process.
---- END FUNCTION #1 ----
---- BEGIN FUNCTION #2: finish ----
Description: Finish the interaction when the task is complete OR if the assistant cannot proceed further with the task.
No parameters are required for this function.
---- END FUNCTION #2 ----
---- BEGIN FUNCTION #3: web_read ----
Description: Read (convert to markdown) content from a webpage. You should prefer using the `web_read` tool over the `browser` tool, but do use the `browser` tool if you need to interact with a webpage (e.g., click a button, fill out a form, etc.).
You may use the `web_read` tool to read content from a webpage, and even search the webpage content using a Google search query (e.g., url=`https://www.google.com/search?q=YOUR_QUERY`).
Parameters:
(1) url (string, required): The URL of the webpage to read. You can also use a Google search query here (e.g., `https://www.google.com/search?q=YOUR_QUERY`).
---- END FUNCTION #3 ----
---- BEGIN FUNCTION #4: browser ----
Description: Interact with the browser using Python code. Use it ONLY when you need to interact with a webpage.
See the description of "code" parameter for more details.
Multiple actions can be provided at once, but will be executed sequentially without any feedback from the page.
More than 2-3 actions usually leads to failure or unexpected behavior. Example:
fill('a12', 'example with "quotes"')
click('a51')
click('48', button='middle', modifiers=['Shift'])
Parameters:
(1) code (string, required): The Python code that interacts with the browser.
The following 15 functions are available. Nothing else is supported.
goto(url: str)
Description: Navigate to a url.
Examples:
goto('http://www.example.com')
go_back()
Description: Navigate to the previous page in history.
Examples:
go_back()
go_forward()
Description: Navigate to the next page in history.
Examples:
go_forward()
noop(wait_ms: float = 1000)
Description: Do nothing, and optionally wait for the given time (in milliseconds).
You can use this to get the current page content and/or wait for the page to load.
Examples:
noop()
noop(500)
scroll(delta_x: float, delta_y: float)
Description: Scroll horizontally and vertically. Amounts in pixels, positive for right or down scrolling, negative for left or up scrolling. Dispatches a wheel event.
Examples:
scroll(0, 200)
scroll(-50.2, -100.5)
fill(bid: str, value: str)
Description: Fill out a form field. It focuses the element and triggers an input event with the entered text. It works for <input>, <textarea> and [contenteditable] elements.
Examples:
fill('237', 'example value')
fill('45', 'multi-line
example')
fill('a12', 'example with "quotes"')
select_option(bid: str, options: str | list[str])
Description: Select one or multiple options in a <select> element. You can specify option value or label to select. Multiple options can be selected.
Examples:
select_option('a48', 'blue')
select_option('c48', ['red', 'green', 'blue'])
click(bid: str, button: Literal['left', 'middle', 'right'] = 'left', modifiers: list[typing.Literal['Alt', 'Control', 'ControlOrMeta', 'Meta', 'Shift']] = [])
Description: Click an element.
Examples:
click('a51')
click('b22', button='right')
click('48', button='middle', modifiers=['Shift'])
dblclick(bid: str, button: Literal['left', 'middle', 'right'] = 'left', modifiers: list[typing.Literal['Alt', 'Control', 'ControlOrMeta', 'Meta', 'Shift']] = [])
Description: Double click an element.
Examples:
dblclick('12')
dblclick('ca42', button='right')
dblclick('178', button='middle', modifiers=['Shift'])
hover(bid: str)
Description: Hover over an element.
Examples:
hover('b8')
press(bid: str, key_comb: str)
Description: Focus the matching element and press a combination of keys. It accepts the logical key names that are emitted in the keyboardEvent.key property of the keyboard events: Backquote, Minus, Equal, Backslash, Backspace, Tab, Delete, Escape, ArrowDown, End, Enter, Home, Insert, PageDown, PageUp, ArrowRight, ArrowUp, F1 - F12, Digit0 - Digit9, KeyA - KeyZ, etc. You can alternatively specify a single character you'd like to produce such as "a" or "#". Following modification shortcuts are also supported: Shift, Control, Alt, Meta, ShiftLeft, ControlOrMeta. ControlOrMeta resolves to Control on Windows and Linux and to Meta on macOS.
Examples:
press('88', 'Backspace')
press('a26', 'ControlOrMeta+a')
press('a61', 'Meta+Shift+t')
focus(bid: str)
Description: Focus the matching element.
Examples:
focus('b455')
clear(bid: str)
Description: Clear the input field.
Examples:
clear('996')
drag_and_drop(from_bid: str, to_bid: str)
Description: Perform a drag & drop. Hover the element that will be dragged. Press left mouse button. Move mouse to the element that will receive the drop. Release left mouse button.
Examples:
drag_and_drop('56', '498')
upload_file(bid: str, file: str | list[str])
Description: Click an element and wait for a "filechooser" event, then select one or multiple input files for upload. Relative file paths are resolved relative to the current working directory. An empty list clears the selected files.
Examples:
upload_file('572', '/home/user/my_receipt.pdf')
upload_file('63', ['/home/bob/Documents/image.jpg', '/home/bob/Documents/file.zip'])
---- END FUNCTION #4 ----
---- BEGIN FUNCTION #5: execute_ipython_cell ----
Description: Run a cell of Python code in an IPython environment.
* The assistant should define variables and import packages before using them.
* The variable defined in the IPython environment will not be available outside the IPython environment (e.g., in terminal).
Parameters:
(1) code (string, required): The Python code to execute. Supports magic commands like %pip.
---- END FUNCTION #5 ----
---- BEGIN FUNCTION #6: str_replace_editor ----
Description: Custom editing tool for viewing, creating and editing files
* State is persistent across command calls and discussions with the user
* If `path` is a file, `view` displays the result of applying `cat -n`. If `path` is a directory, `view` lists non-hidden files and directories up to 2 levels deep
* The `create` command cannot be used if the specified `path` already exists as a file
* If a `command` generates a long output, it will be truncated and marked with `<response clipped>`
* The `undo_edit` command will revert the last edit made to the file at `path`
Notes for using the `str_replace` command:
* The `old_str` parameter should match EXACTLY one or more consecutive lines from the original file. Be mindful of whitespaces!
* If the `old_str` parameter is not unique in the file, the replacement will not be performed. Make sure to include enough context in `old_str` to make it unique
* The `new_str` parameter should contain the edited lines that should replace the `old_str`
Parameters:
(1) command (string, required): The commands to run. Allowed options are: `view`, `create`, `str_replace`, `insert`, `undo_edit`.
Allowed values: [`view`, `create`, `str_replace`, `insert`, `undo_edit`]
(2) path (string, required): Absolute path to file or directory, e.g. `/workspace/file.py` or `/workspace`.
(3) file_text (string, optional): Required parameter of `create` command, with the content of the file to be created.
(4) old_str (string, optional): Required parameter of `str_replace` command containing the string in `path` to replace.
(5) new_str (string, optional): Optional parameter of `str_replace` command containing the new string (if not given, no string will be added). Required parameter of `insert` command containing the string to insert.
(6) insert_line (integer, optional): Required parameter of `insert` command. The `new_str` will be inserted AFTER the line `insert_line` of `path`.
(7) view_range (array, optional): Optional parameter of `view` command when `path` points to a file. If none is given, the full file is shown. If provided, the file will be shown in the indicated line number range, e.g. [11, 12] will show lines 11 and 12. Indexing at 1 to start. Setting `[start_line, -1]` shows all lines from `start_line` to the end of the file.
---- END FUNCTION #6 ----
If you choose to call a function ONLY reply in the following format with NO suffix:
<function=example_function_name>
<parameter=example_parameter_1>value_1</parameter>
<parameter=example_parameter_2>
This is the value for the second parameter
that can span
multiple lines
</parameter>
</function>
<IMPORTANT>
Reminder:
- Function calls MUST follow the specified format, start with <function= and end with </function>
- Required parameters MUST be specified
- Only call one function at a time
- You may provide optional reasoning for your function call in natural language BEFORE the function call, but NOT after.
- If there is no function call available, answer the question like normal with your current knowledge and do not tell the user about function calls
----------
Here's a running example of how to perform a task with the provided tools.
--------------------- START OF EXAMPLE ---------------------
USER: Create a list of numbers from 1 to 10, and display them in a web page at port 5000.
ASSISTANT: Sure! Let me first check the current directory:
<function=execute_bash>
<parameter=command>
pwd && ls
</parameter>
</function>
USER: EXECUTION RESULT of [execute_bash]:
/workspace
openhands@runtime:~/workspace$
ASSISTANT: There is no `app.py` file in the current directory. Let me create a Python file `app.py`:
<function=str_replace_editor>
<parameter=command>create</parameter>
<parameter=path>/workspace/app.py</parameter>
<parameter=file_text>
from flask import Flask
app = Flask(__name__)

@app.route('/')
def index():
    numbers = list(range(1, 11))
    return str(numbers)

if __name__ == '__main__':
    app.run(port=5000)
</parameter>
</function>
USER: EXECUTION RESULT of [str_replace_editor]:
File created successfully at: /workspace/app.py
ASSISTANT: I have created a Python file `app.py` that will display a list of numbers from 1 to 10 when you run it. Let me run the Python file for you:
<function=execute_bash>
<parameter=command>
python3 app.py > server.log 2>&1 &
</parameter>
</function>
USER: EXECUTION RESULT of [execute_bash]:
[1] 121
[1]+ Exit 1 python3 app.py > server.log 2>&1
ASSISTANT: Looks like the server is running with PID 121 then crashed. Let me check the server log:
<function=execute_bash>
<parameter=command>
cat server.log
</parameter>
</function>
USER: EXECUTION RESULT of [execute_bash]:
Traceback (most recent call last):
File "/workspace/app.py", line 2, in <module>
from flask import Flask
ModuleNotFoundError: No module named 'flask'
ASSISTANT: Looks like the server crashed because the `flask` module is not installed. Let me install the `flask` module for you:
<function=execute_bash>
<parameter=command>
pip3 install flask
</parameter>
</function>
USER: EXECUTION RESULT of [execute_bash]:
Defaulting to user installation because normal site-packages is not writeable
Collecting flask
Using cached flask-3.0.3-py3-none-any.whl (101 kB)
Collecting blinker>=1.6.2
Using cached blinker-1.7.0-py3-none-any.whl (13 kB)
Collecting Werkzeug>=3.0.0
Using cached werkzeug-3.0.2-py3-none-any.whl (226 kB)
Collecting click>=8.1.3
Using cached click-8.1.7-py3-none-any.whl (97 kB)
Collecting itsdangerous>=2.1.2
Using cached itsdangerous-2.2.0-py3-none-any.whl (16 kB)
Requirement already satisfied: Jinja2>=3.1.2 in /home/openhands/.local/lib/python3.10/site-packages (from flask) (3.1.3)
Requirement already satisfied: MarkupSafe>=2.0 in /home/openhands/.local/lib/python3.10/site-packages (from Jinja2>=3.1.2->flask) (2.1.5)
Installing collected packages: Werkzeug, itsdangerous, click, blinker, flask
Successfully installed Werkzeug-3.0.2 blinker-1.7.0 click-8.1.7 flask-3.0.3 itsdangerous-2.2.0
ASSISTANT: The `flask` module is installed successfully. Let me run the Python file again:
<function=execute_bash>
<parameter=command>
python3 app.py > server.log 2>&1 &
</parameter>
</function>
USER: EXECUTION RESULT of [execute_bash]:
[1] 124
ASSISTANT:
Let me check the server log again:
<function=execute_bash>
<parameter=command>
cat server.log
</parameter>
</function>
USER: EXECUTION RESULT of [execute_bash]:
* Serving Flask app 'app'
* Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
* Running on http://127.0.0.1:5000
Press CTRL+C to quit
ASSISTANT:
The server is running on port 5000 with PID 124. You can access the list of numbers by visiting http://127.0.0.1:5000. If you have any further questions, feel free to ask!
USER: Now kill the server, make it display the numbers in a table format.
ASSISTANT:
Sure! Let me stop the server first:
<function=execute_bash>
<parameter=command>
kill 124
</parameter>
</function>
USER: EXECUTION RESULT of [execute_bash]:
[1]+ Terminated python3 app.py > server.log 2>&1
ASSISTANT:
Now let me display the numbers in a table format:
<function=str_replace_editor>
<parameter=command>str_replace</parameter>
<parameter=path>/workspace/app.py</parameter>
<parameter=old_str>return str(numbers)</parameter>
<parameter=new_str>return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'</parameter>
</function>
USER: EXECUTION RESULT of [str_replace_editor]:
The file /workspace/app.py has been edited. Here's the result of running `cat -n` on a snippet of /workspace/app.py:
     3
     4  @app.route('/')
     5  def index():
     6      numbers = list(range(1, 11))
     7      return '<table>' + ''.join([f'<tr><td>{i}</td></tr>' for i in numbers]) + '</table>'
     8
     9  if __name__ == '__main__':
    10      app.run(port=5000)
Review the changes and make sure they are as expected. Edit the file again if necessary.
ASSISTANT:
Running the updated file:
<function=execute_bash>
<parameter=command>
python3 app.py > server.log 2>&1 &
</parameter>
</function>
USER: EXECUTION RESULT of [execute_bash]:
[1] 126
ASSISTANT:
The server is running on port 5000 with PID 126. You can access the list of numbers in a table format by visiting http://127.0.0.1:5000. Let me know if you have any further requests!
<function=finish>
</function>
--------------------- END OF EXAMPLE ---------------------
Do NOT assume the environment is the same as in the example above.
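On the program side, the agent has to turn a reply written in the `<function=...>` format defined in the prompt above back into a tool name and arguments. A minimal regex-based sketch follows; `parse_function_call` and the two patterns are hypothetical and are not taken from the OpenHands source.

```python
import re
from typing import Dict, Optional, Tuple

FUNCTION_RE = re.compile(r"<function=([^>]+)>(.*?)</function>", re.DOTALL)
PARAM_RE = re.compile(r"<parameter=([^>]+)>(.*?)</parameter>", re.DOTALL)


def parse_function_call(reply: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (function_name, params) for the first call in `reply`, or None."""
    match = FUNCTION_RE.search(reply)
    if match is None:
        return None
    name, body = match.group(1), match.group(2)
    # Strip the newlines that surround multi-line parameter values.
    params = {key: value.strip("\n") for key, value in PARAM_RE.findall(body)}
    return name, params


reply = """Let me check the directory:
<function=execute_bash>
<parameter=command>
pwd && ls
</parameter>
</function>"""
print(parse_function_call(reply))
```

A reply with no function call yields `None`, which the surrounding loop could treat as a plain message to the user.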
Appendix: example task prompt
Below is the prompt content that was appended, loop by loop, and sent to the LLM while actually executing a task:
--------------------- NEW TASK DESCRIPTION ---------------------
Create a program that can call the DeepSeek model, and run a test to make sure it works
--------------------- END OF NEW TASK DESCRIPTION ---------------------
PLEASE follow the format strictly! PLEASE EMIT ONE AND ONLY ONE FUNCTION CALL PER MESSAGE.
----------
Let me start by checking the current directory and listing its contents:
<function=execute_bash>
<parameter=command>pwd && ls</parameter>
</function>
----------
EXECUTION RESULT of [execute_bash]:
/workspace
[Python Interpreter: /openhands/poetry/openhands-ai-5O4_aCHf-py3.12/bin/python]
[Command finished with exit code 0]
----------
I see we are in the `/workspace` directory. Let's create a Python file that can call the DeepSeek model. First, I'll create a new file called `deepseek_app.py`:
<function=str_replace_editor>
<parameter=command>create</parameter>
<parameter=path>/workspace/deepseek_app.py</parameter>
<parameter=file_text>
from deepseek import DeepSeekModel

def call_deepseek_model(input_text):
    model = DeepSeekModel()
    response = model.generate(input_text)
    return response

if __name__ == "__main__":
    test_input = "Hello, DeepSeek!"
    result = call_deepseek_model(test_input)
    print("Model Response:", result)
</parameter>
</function>
----------
EXECUTION RESULT of [str_replace_editor]:
File created successfully at: /workspace/deepseek_app.py
----------
Now, let's check if the `deepseek` module is installed. If not, we'll need to install it:
<function=execute_bash>
<parameter=command>pip3 show deepseek</parameter>
</function>
----------
EXECUTION RESULT of [execute_bash]:
WARNING: Package(s) not found: deepseek
[Python Interpreter: /openhands/poetry/openhands-ai-5O4_aCHf-py3.12/bin/python]
[Command finished with exit code 1]
----------
The `deepseek` module is not installed. Let me install it using pip:
<function=execute_bash>
<parameter=command>pip3 install deepseek</parameter>
</function>
----------
EXECUTION RESULT of [execute_bash]:
Collecting deepseek
Downloading deepseek-1.0.0-py3-none-any.whl.metadata (3.9 kB)
Requirement already satisfied: requests in /openhands/poetry/openhands-ai-5O4_aCHf-py3.12/lib/python3.12/site-packages (from deepseek) (2.32.3)
Requirement already satisfied: charset-normalizer<4,>=2 in /openhands/poetry/openhands-ai-5O4_aCHf-py3.12/lib/python3.12/site-packages (from requests->deepseek) (3.4.1)
Requirement already satisfied: idna<4,>=2.5 in /openhands/poetry/openhands-ai-5O4_aCHf-py3.12/lib/python3.12/site-packages (from requests->deepseek) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in /openhands/poetry/openhands-ai-5O4_aCHf-py3.12/lib/python3.12/site-packages (from requests->deepseek) (2.3.0)
Requirement already satisfied: certifi>=2017.4.17 in /openhands/poetry/openhands-ai-5O4_aCHf-py3.12/lib/python3.12/site-packages (from requests->deepseek) (2024.12.14)
Downloading deepseek-1.0.0-py3-none-any.whl (4.5 kB)
Installing collected packages: deepseek
Successfully installed deepseek-1.0.0
[Python Interpreter: /openhands/poetry/openhands-ai-5O4_aCHf-py3.12/bin/python]
[Command finished with exit code 0]
----------
The `deepseek` module is now installed. Let's run the `deepseek_app.py` program to test if it works:
<function=execute_bash>
<parameter=command>python3 deepseek_app.py</parameter>
</function>
----------
EXECUTION RESULT of [execute_bash]:
Traceback (most recent call last):
File "/workspace/deepseek_app.py", line 1, in <module>
from deepseek import DeepSeekModel
ImportError: cannot import name 'DeepSeekModel' from 'deepseek' (/openhands/poetry/openhands-ai-5O4_aCHf-py3.12/lib/python3.12/site-packages/deepseek/__init__.py)
[Python Interpreter: /openhands/poetry/openhands-ai-5O4_aCHf-py3.12/bin/python]
[Command finished with exit code 1]
----------
It seems the `DeepSeekModel` class is not available in the `deepseek` module. Let me check the documentation or available classes in the `deepseek` module:
<function=execute_bash>
<parameter=command>python3 -c "import deepseek; print(dir(deepseek))"</parameter>
</function>
----------
EXECUTION RESULT of [execute_bash]:
['DeepSeekAPI', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', '__version__', 'api', 'const']
[Python Interpreter: /openhands/poetry/openhands-ai-5O4_aCHf-py3.12/bin/python]
[Command finished with exit code 0]
----------
The `deepseek` module provides a `DeepSeekAPI` class instead of `DeepSeekModel`. Let me update the `deepseek_app.py` file to use the correct class:
<function=str_replace_editor>
<parameter=command>str_replace</parameter>
<parameter=path>/workspace/deepseek_app.py</parameter>
<parameter=old_str>
from deepseek import DeepSeekModel

def call_deepseek_model(input_text):
    model = DeepSeekModel()
    response = model.generate(input_text)
    return response
</parameter>
<parameter=new_str>
from deepseek import DeepSeekAPI

def call_deepseek_model(input_text):
    api = DeepSeekAPI()
    response = api.generate(input_text)
    return response
</parameter>
</function>
----------
EXECUTION RESULT of [str_replace_editor]:
The file /workspace/deepseek_app.py has been edited. Here's the result of running `cat -n` on a snippet of /workspace/deepseek_app.py:
     1  from deepseek import DeepSeekAPI
     2
     3  def call_deepseek_model(input_text):
     4      api = DeepSeekAPI()
     5      response = api.generate(input_text)
     6      return response
     7
     8  if __name__ == "__main__":
     9      test_input = "Hello, DeepSeek!"
    10      result = call_deepseek_model(test_input)
Review the changes and make sure they are as expected. Edit the file again if necessary.
----------
Now that the file is updated, let's run the program again to test if it works:
<function=execute_bash>
<parameter=command>python3 deepseek_app.py</parameter>
</function>
----------
EXECUTION RESULT of [execute_bash]:
Traceback (most recent call last):
File "/workspace/deepseek_app.py", line 10, in <module>
result = call_deepseek_model(test_input)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/deepseek_app.py", line 4, in call_deepseek_model
api = DeepSeekAPI()
^^^^^^^^^^^^^
File "/openhands/poetry/openhands-ai-5O4_aCHf-py3.12/lib/python3.12/site-packages/deepseek/api.py", line 9, in __init__
raise ValueError("DEEPSEEK_API_KEY is missing")
ValueError: DEEPSEEK_API_KEY is missing
[Python Interpreter: /openhands/poetry/openhands-ai-5O4_aCHf-py3.12/bin/python]
[Command finished with exit code 1]
----------
The program requires an API key to use the `DeepSeekAPI`. Let me update the program to include the API key as an environment variable. First, I'll modify the `deepseek_app.py` file:
<function=str_replace_editor>
<parameter=command>str_replace</parameter>
<parameter=path>/workspace/deepseek_app.py</parameter>
<parameter=old_str>
from deepseek import DeepSeekAPI

def call_deepseek_model(input_text):
    api = DeepSeekAPI()
    response = api.generate(input_text)
    return response
</parameter>
<parameter=new_str>
import os
from deepseek import DeepSeekAPI

def call_deepseek_model(input_text):
    api_key = os.getenv("DEEPSEEK_API_KEY")
    if not api_key:
        raise ValueError("Please set the DEEPSEEK_API_KEY environment variable.")
    api = DeepSeekAPI(api_key=api_key)
    response = api.generate(input_text)
    return response
</parameter>
</function>
----------
EXECUTION RESULT of [str_replace_editor]:
The file /workspace/deepseek_app.py has been edited. Here's the result of running `cat -n` on a snippet of /workspace/deepseek_app.py:
     1  import os
     2  from deepseek import DeepSeekAPI
     3
     4  def call_deepseek_model(input_text):
     5      api_key = os.getenv("DEEPSEEK_API_KEY")
     6      if not api_key:
     7          raise ValueError("Please set the DEEPSEEK_API_KEY environment variable.")
     8      api = DeepSeekAPI(api_key=api_key)
     9      response = api.generate(input_text)
    10      return response
    11
    12  if __name__ == "__main__":
    13      test_input = "Hello, DeepSeek!"
    14      result = call_deepseek_model(test_input)
Review the changes and make sure they are as expected. Edit the file again if necessary.
----------
Now, let's set the `DEEPSEEK_API_KEY` environment variable and run the program again. Please provide the API key for DeepSeek, and I will set it up for you. If you don't have one, you may need to sign up for an API key from DeepSeek's official website.
<function=execute_bash>
<parameter=command>export DEEPSEEK_API_KEY=your_api_key_here && python3 deepseek_app.py</parameter>
</function>
----------
EXECUTION RESULT of [execute_bash]:
Traceback (most recent call last):
File "/workspace/deepseek_app.py", line 14, in <module>
result = call_deepseek_model(test_input)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/deepseek_app.py", line 9, in call_deepseek_model
response = api.generate(input_text)
^^^^^^^^^^^^
AttributeError: 'DeepSeekAPI' object has no attribute 'generate'
[Python Interpreter: /openhands/poetry/openhands-ai-5O4_aCHf-py3.12/bin/python]
[Command finished with exit code 1]
----------
It seems the `DeepSeekAPI` class does not have a `generate` method. Let me check the available methods in the `DeepSeekAPI` class:
<function=execute_bash>
<parameter=command>python3 -c "from deepseek import DeepSeekAPI; print(dir(DeepSeekAPI))"</parameter>
</function>
----------
EXECUTION RESULT of [execute_bash]:
['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_post_request', 'chat_completion', 'completion_impl', 'fim_completion', 'get_models', 'user_balance']
[Python Interpreter: /openhands/poetry/openhands-ai-5O4_aCHf-py3.12/bin/python]
[Command finished with exit code 0]
----------
The `DeepSeekAPI` class provides methods like `chat_completion`, `fim_completion`, and `get_models`. Let's update the `deepseek_app.py` file to use the `chat_completion` method instead of `generate`:
<function=str_replace_editor>
<parameter=command>str_replace</parameter>
<parameter=path>/workspace/deepseek_app.py</parameter>
<parameter=old_str>
    api = DeepSeekAPI(api_key=api_key)
    response = api.generate(input_text)
    return response
</parameter>
<parameter=new_str>
    api = DeepSeekAPI(api_key=api_key)
    response = api.chat_completion(input_text)
    return response
</parameter>
</function>
----------
EXECUTION RESULT of [str_replace_editor]:
The file /workspace/deepseek_app.py has been edited. Here's the result of running `cat -n` on a snippet of /workspace/deepseek_app.py:
     4  def call_deepseek_model(input_text):
     5      api_key = os.getenv("DEEPSEEK_API_KEY")
     6      if not api_key:
     7          raise ValueError("Please set the DEEPSEEK_API_KEY environment variable.")
     8      api = DeepSeekAPI(api_key=api_key)
     9      response = api.chat_completion(input_text)
    10      return response
    11
    12  if __name__ == "__main__":
    13      test_input = "Hello, DeepSeek!"
    14      result = call_deepseek_model(test_input)
Review the changes and make sure they are as expected. Edit the file again if necessary.
----------
Now, let's run the program again with the updated method. Please provide the `DEEPSEEK_API_KEY` environment variable:
<function=execute_bash>
<parameter=command>export DEEPSEEK_API_KEY=your_api_key_here && python3 deepseek_app.py</parameter>
</function>
----------
EXECUTION RESULT of [execute_bash]:
Traceback (most recent call last):
File "/workspace/deepseek_app.py", line 14, in <module>
result = call_deepseek_model(test_input)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/deepseek_app.py", line 9, in call_deepseek_model
response = api.chat_completion(input_text)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/openhands/poetry/openhands-ai-5O4_aCHf-py3.12/lib/python3.12/site-packages/deepseek/api.py", line 62, in chat_completion
response = self._post_request(API_CHAT_COM, payload, stream)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/openhands/poetry/openhands-ai-5O4_aCHf-py3.12/lib/python3.12/site-packages/deepseek/api.py", line 25, in _post_request
raise Exception(f"HTTP Error {response.status_code}: {response.text}")
Exception: HTTP Error 401: {"error":{"message":"Authentication Fails (no such user)","type":"authentication_error","param":null,"code":"invalid_request_error"}}
[Python Interpreter: /openhands/poetry/openhands-ai-5O4_aCHf-py3.12/bin/python]
[Command finished with exit code 1]
----------
The error indicates that the provided API key is invalid or does not correspond to a valid user. To proceed, you will need to:
1. Obtain a valid API key from DeepSeek's official platform.
2. Ensure the API key is correctly set in the environment.
Once you have a valid API key, you can set it and run the program again. Let me know if you need further assistance!
----------
key: *** (the API key was entered here; redacted)
----------
Thank you for providing the API key. Let's set the environment variable and run the program again:
<function=execute_bash>
<parameter=command>export DEEPSEEK_API_KEY=*** && python3 deepseek_app.py</parameter>
</function>
----------
EXECUTION RESULT of [execute_bash]:
Model Response: Hello! How can I assist you today? 😊
[Python Interpreter: /openhands/poetry/openhands-ai-5O4_aCHf-py3.12/bin/python]
[Command finished with exit code 0]