计算机的核心从 CPU 转向 GPU,上个时代依靠程序员写代码指挥 CPU 执行指令解决问题,构成了现在庞大的 IT 产业,程序员是中心。现在的时代逐渐转变,GPU 生产的 token 逐渐能解决越来越多的问题,能思考,能生成代码指挥 CPU 去执行解决问题,计算的核心一定会转向 GPU,世界对 GPU 的需求只会越来越高。
给 AI 分了四个阶段,Perception AI → Generative AI → Agentic AI → Physical AI,不是很认同,Agentic 和 Physical 都是 Generative AI 的延续,不过无所谓,可以看到 Agentic 这个概念实在是火爆。
Scaling Law 没有停止,Agentic AI 需要深度思考,深度思考有新的 Test-time Scaling Law,越多的 token 输出效果越好需,要多轮理解和工具调用对 token 的消耗更是指数级上涨。
Physical AI 要更好地理解现实世界,声音/视觉/触感,都会比纯文本思考对 token 消耗的诉求更高,像 2G 时代看文字新闻,3G 4G 图片,5G 视频一样。
多个 session 都在推广 NVidia 的 Video Search and Summarization Agent,串联从视频的获取→分割→VLM识别、CV物体识别和跟踪→数据处理存储和RAG召回→用户对话 整个流程,做到可以对视频提供实时分析和报警,也可以自然语言交互查询视频内容,边缘部署,适合用于监控,算是用 NVidia 技术栈做 AI 应用的一个标杆范例。
AIGC
关注了下视频 AIGC 相关的几个 Session
在好莱坞干了几十年的视觉效果的 Ed Ulbrich 开了个公司 Metaphysic,以前的电影特效制作成本巨大,对人的处理还很难跨过恐怖谷,而基于 AI 技术做特效,用完全不同的技术栈,效果好成本低,是一种颠覆。metaphysic 给娱乐行业提供人脸替换、数字人的服务,看起来是用的 GAN,在人物换脸技术上,GAN 还是更能做到稳定和实时,特别是实时这个点,基于 diffusion 很难做到。基于市场需求,利用已有的不同技术(甚至是上一代技术)深入解决问题,是有空间的。
GPT-4o image generation is now available in ChatGPT. The new image generation model, which replaces DALL-E 3, is most notable for its accurate text rendering, improved "binding" capabilities, and ease of use.
只能怀疑是驱动问题了,毕竟 GTX1080 这个老显卡是诞生得比挖矿潮还早。但没有办法了,我之前把系统升级后(从 Linux Mint 20.3升级到 Linux Mint 22.1,基于Ubuntu20.04到Ubuntu24.04),驱动版本还是550,CUDA也只是从12.2升级到了12.4而已,而最新版本貌似是12.8。我也不清楚到底是 GTX1080 只能装这么高,还是有什么其他限制。Nvidia 官网倒是有一个 575版本,Linux Mint官方没说正式支持,我也不想试。毕竟即使跑 comfyui 的默认模型,显卡的噪音都超大,温度直达 70 度以上(风扇57%转速,再提高那噪音就真的难以忍受了),而仍然要等很久才能出一张图。
但模型本身目前解决不了所有问题,还需要工程上的一些策略和串联做优化。例如 Tree of Thought 让任务不是以线性一步步执行的形式,而是生成解决问题的多个节点,多角度思考问题,形成树结构的任务,评估节点的价值,在里面寻找最优解。 Reflexion 会有 Evaluator 对各种反馈(工具调用结果/模型输出/用户指令)进行反思,梳理改进方向,也会把反思结果作为知识库经验,指导后续的任务。
OpenAI continues to add new features to ChatGPT, with the arrival of ChatGPT 4.5 one of the biggest recent updates. However, a small recent change has the potential to have a significant impact.
期间甚至把整个操作系统都从 Linux Mint 20.3 升级到了 Linux Mint 22.1 (跳过了1个大版本4个小版本,或者说2个大版本3个小版本?),写了一堆 nemo actions,写了一堆脚本,更新了本地的各种应用工具,测试了一堆AI相关的工具,写了一个 userscript,甚至还把 danielgatis/rembg 常年都修不好的GPU加速给修了,还做了 DOCKER GPU 加速版本,你说我也不会 Python 啊竟然还做出了这种事,想想真是疯狂。
While silly Monty Python songs about processed meat may be fun, unwanted marketing calls and emails are not. I'm sick of getting calls from unrecognized numbers, only for them to turn out to be telling me that I've been in an accident that wasn't my fault. I thought there may be a way the Shortcuts app could help.
What if you could have an AI assistant tailored specifically to your needs? Imagine a version of ChatGPT that always remembers your specific preferences, knows everything about your favorite hobbies, or can adopt a unique personality for specific interactions? That's exactly what custom GPTs allow you to do.
Claude 3.7 Sonnet 在万众期待中推出了,为什么期待,因为从 Claude 3.5 Sonnet 发布后,一直是AI Coding Agent 领域最好的模型,综合效果没有对手,后面陆续推出的 o1/o3/DeepSeek 都没能撼动,更让人期待 Claude 3.7 Sonnet 在 AI Coding 领域能不能有进一步提升。
Claude 3.7 放出来的 Benchmark 里,有两个是跟 AI Coding Agent 表现强相关的:
SWE-bench Verified: Claude 3.7 Sonnet scores 62.3% out of 500 problems using pass@1 with bash/editor tools plus a "thinking tool" for single-attempt patches — no additional test-time compute used. 70.3% score uses internal scoring and custom scaffold on a reduced subset of problems. Scaffold Deepseek R1 results use the "Agentless" framework. OpenAI results from o3-mini system card cover a different subset of problems with a custom compute.
Claude 没有具体说明怎么做检索代码的,官方blog附录里提到在运行环境上是用了极简的方案,只提供了命令和编辑文件的能力,看起来是只把代码仓库目录扔给 LLM,让它自行去做文件搜索。另外有提到 Deepseek 的跑分是基于 Agentless 框架跑的,Agentless 专门介绍说明了如何跑 SWE-Bench,以及具体是怎么做代码检索的。见 Agentless/README_swebench.md。
{
"annotator": 0,
"user_id": "omar_rossi_1241",
"instruction": "Your user id is omar_rossi_1241. For your upcoming trip from New York to Chicago, you want to change the passenger to yourself, upgrade it to economy class, and have 3 checked bags. You prefer gift card payment. Your birthday is in your user profile so you do not prefer to provide it. You are reactive to the agent and will not say anything that is not asked.",
"actions": [
{
"name": "update_reservation_flights",
"arguments": {
"reservation_id": "FQ8APE",
"cabin": "economy",
"flights": [
{
"flight_number": "HAT056",
"date": "2024-05-25",
},
{
"flight_number": "HAT138",
"date": "2024-05-25",
},
],
"payment_id": "gift_card_8190333",
},
},
{
"name": "update_reservation_passengers",
"arguments": {
"reservation_id": "FQ8APE",
"passengers": [
{
"first_name": "Omar",
"last_name": "Rossi",
"dob": "1970-06-06",
}
],
},
},
{
"name": "update_reservation_baggages",
"arguments": {
"reservation_id": "FQ8APE",
"total_baggages": 3,
"nonfree_baggages": 0,
"payment_id": "gift_card_8190333",
},
},
],
},
转化后这个case实际跟大模型交互的过程:
{"traj": [
{
"role": "system",
"content": "# Airline Agent Policy\n\nThe current time is 2024-05-15 15:00:00 EST.\n\nAs an airline agent, you can help users book, modify, or cancel flight reservations.\n\n- Before taking any actions that update the booking database (booking, modifying flights, editing baggage, upgrading cabin class, or updating passenger information), you must list the action details and obtain explicit user confirmation (yes) to proceed.\n\n- You should not provide any information, knowledge, or procedures not provided by the user or available tools, or give subjective recommendations or comments.\n\n- You should only make one tool call at a time, and if you make a tool call, you should not respond to the user simultaneously. If you respond to the user, you should not make a tool call at the same time.\n\n- You should deny user requests that are against this policy.\n\n- You should transfer the user to a human agent if and only if the request cannot be handled within the scope of your actions.\n\n## Domain Basic\n\n- Each user has a profile containing user id, email, addresses, date of birth, payment methods, reservation numbers, and membership tier.\n\n- Each reservation has an reservation id, user id, trip type (one way, round trip), flights, passengers, payment methods, created time, baggages, and travel insurance information.\n\n- Each flight has a flight number, an origin, destination, scheduled departure and arrival time (local time), and for each date:\n - If the status is "available", the flight has not taken off, available seats and prices are listed.\n - If the status is "delayed" or "on time", the flight has not taken off, cannot be booked.\n - If the status is "flying", the flight has taken off but not landed, cannot be booked.\n\n## Book flight\n\n- The agent must first obtain the user id, then ask for the trip type, origin, destination.\n\n- Passengers: Each reservation can have at most five passengers. The agent needs to collect the first name, last name, and date of birth for each passenger. All passengers must fly the same flights in the same cabin.\n\n- Payment: each reservation can use at most one travel certificate, at most one credit card, and at most three gift cards. The remaining amount of a travel certificate is not refundable. All payment methods must already be in user profile for safety reasons.\n\n- Checked bag allowance: If the booking user is a regular member, 0 free checked bag for each basic economy passenger, 1 free checked bag for each economy passenger, and 2 free checked bags for each business passenger. If the booking user is a silver member, 1 free checked bag for each basic economy passenger, 2 free checked bag for each economy passenger, and 3 free checked bags for each business passenger. If the booking user is a gold member, 2 free checked bag for each basic economy passenger, 3 free checked bag for each economy passenger, and 3 free checked bags for each business passenger. Each extra baggage is 50 dollars.\n\n- Travel insurance: the agent should ask if the user wants to buy the travel insurance, which is 30 dollars per passenger and enables full refund if the user needs to cancel the flight given health or weather reasons.\n\n## Modify flight\n\n- The agent must first obtain the user id and the reservation id.\n\n- Change flights: Basic economy flights cannot be modified. Other reservations can be modified without changing the origin, destination, and trip type. Some flight segments can be kept, but their prices will not be updated based on the current price. The API does not check these for the agent, so the agent must make sure the rules apply before calling the API!\n\n- Change cabin: all reservations, including basic economy, can change cabin without changing the flights. Cabin changes require the user to pay for the difference between their current cabin and the new cabin class. Cabin class must be the same across all the flights in the same reservation; changing cabin for just one flight segment is not possible.\n\n- Change baggage and insurance: The user can add but not remove checked bags. The user cannot add insurance after initial booking.\n\n- Change passengers: The user can modify passengers but cannot modify the number of passengers. This is something that even a human agent cannot assist with.\n\n- Payment: If the flights are changed, the user needs to provide one gift card or credit card for payment or refund method. The agent should ask for the payment or refund method instead.\n\n## Cancel flight\n\n- The agent must first obtain the user id, the reservation id, and the reason for cancellation (change of plan, airline cancelled flight, or other reasons)\n\n- All reservations can be cancelled within 24 hours of booking, or if the airline cancelled the flight. Otherwise, basic economy or economy flights can be cancelled only if travel insurance is bought and the condition is met, and business flights can always be cancelled. The rules are strict regardless of the membership status. The API does not check these for the agent, so the agent must make sure the rules apply before calling the API!\n\n- The agent can only cancel the whole trip that is not flown. If any of the segments are already used, the agent cannot help and transfer is needed.\n\n- The refund will go to original payment methods in 5 to 7 business days.\n\n## Refund\n\n- If the user is silver/gold member or has travel insurance or flies business, and complains about cancelled flights in a reservation, the agent can offer a certificate as a gesture after confirming the facts, with the amount being $100 times the number of passengers.\n\n- If the user is silver/gold member or has travel insurance or flies business, and complains about delayed flights in a reservation and wants to change or cancel the reservation, the agent can offer a certificate as a gesture after confirming the facts and changing or cancelling the reservation, with the amount being $50 times the number of passengers.\n\n- Do not proactively offer these unless the user complains about the situation and explicitly asks for some compensation. Do not compensate if the user is regular member and has no travel insurance and flies (basic) economy.\n"
},
{
"role": "user",
"content": "Hi! I'd like to make a few changes to my trip from New York to Chicago. Can you help me with that?"
},
{
"role": "assistant",
"content": "Of course! I can help you modify your trip. Could you please provide your user ID and the reservation ID for the trip you want to change?",
},
{
"role": "user",
"content": "Sure! My user ID is omar_rossi_1241. But I'm not sure about the reservation ID at the moment."
},
{
"role": "assistant",
"content": null,
"tool_calls": [
{
"function": {
"arguments": "{"user_id":"omar_rossi_1241"}",
"name": "get_user_details"
},
"id": "call_FXi5dyufwOlkHksVgNwVhhVB",
"type": "function"
}
]
},
{
"role": "assistant",
"content": "I found your reservations. Could you please confirm which reservation ID you'd like to modify for your trip from New York to Chicago? Here are your reservation IDs:\n\n- UM3OG5\n- 5RJ7UH\n- FQ8APE\n- QKRY03\n\nLet me know which one you'd like to modify!",
},
{
"role": "user",
"content": "I'm not entirely certain, but I believe it might be the first one, UM3OG5."
},
{
"role": "assistant",
"content": null,
"tool_calls": [
{
"function": {
"arguments": "{"reservation_id":"UM3OG5"}",
"name": "get_reservation_details"
},
"id": "call_FApEDaUHdL2hx8FNbu5UCMb8",
"type": "function"
}
]
},
{
"role": "assistant",
"content": "The reservation ID "UM3OG5" is for a round trip from Seattle (SEA) to Dallas/Fort Worth (DFW), not from New York to Chicago. \n\nCould you please check if any of the other reservation IDs match your trip from New York to Chicago? Here are the remaining reservation IDs:\n\n- 5RJ7UH\n- FQ8APE\n- QKRY03\n\nLet me know which one you'd like to modify!",
},
{
"role": "user",
"content": "In that case, I think it might be FQ8APE. Can you check that for me?"
},
{
"role": "assistant",
"content": null,
"tool_calls": [
{
"function": {
"arguments": "{"reservation_id":"FQ8APE"}",
"name": "get_reservation_details"
},
"id": "call_W507pQxieFVdAxvqONuYWvN2",
"type": "function"
}
]
},
{
"content": "The reservation ID "FQ8APE" is for a one-way trip from Newark (EWR) to Chicago (ORD) with a stop in Houston (IAH). Since this reservation is in basic economy, please note that the flights cannot be modified. However, I can assist with other changes like cabin upgrades or baggage additions.\n\nPlease let me know what specific changes you would like to make to this reservation!",
"role": "assistant"
},
{
"role": "user",
"content": "Great! I'd like to add three checked bags, change the passenger to myself, and upgrade to economy class, please."
},
{
"role": "assistant",
"content": "Here are the changes you want to make to your reservation:\n\n1. **Add three checked bags**: \n - Since you are a gold member and upgrading to economy class, you will have 3 free checked bags, so no additional charges for the bags.\n\n2. **Change the passenger to yourself**:\n - Current passenger: Ivan Garcia\n - New passenger: Omar Rossi (DOB: 1970-06-06)\n\n3. **Upgrade to economy class**:\n - Upgrade from basic economy to economy class.\n\nPlease confirm if you would like to proceed with these changes.",
},
{
"role": "user",
"content": "Yes, I'd like to proceed with those changes. And please use a gift card for any payment needed."
},
{
"role": "assistant",
"content": null,
"tool_calls": [
{
"function": {
"arguments": "{"reservation_id": "FQ8APE", "passengers": [{"first_name": "Omar", "last_name": "Rossi", "dob": "1970-06-06"}]}",
"name": "update_reservation_passengers"
},
"id": "call_jK7xz4ERk3csc9jBfroPobGs",
"type": "function"
}
]
},
{
"role": "assistant",
"content": null,
"tool_calls": [
{
"function": {
"arguments": "{"reservation_id":"FQ8APE","cabin":"economy","flights":[{"origin":"EWR","destination":"IAH","flight_number":"HAT056","date":"2024-05-25"},{"origin":"IAH","destination":"ORD","flight_number":"HAT138","date":"2024-05-25"}],"payment_id":"gift_card_8190333"}",
"name": "update_reservation_flights"
},
"id": "call_zeyT5c2EYzRvfY42X7YOKOng",
"type": "function"
}
]
},
{
"role": "assistant",
"content": null,
"tool_calls": [
{
"function": {
"arguments": "{"reservation_id":"FQ8APE","total_baggages":3,"nonfree_baggages":0,"payment_id":"gift_card_8190333"}",
"name": "update_reservation_baggages"
},
"id": "call_PA1XaKLPX8egjewaxIArCkRc",
"type": "function"
}
]
},
{
"role": "assistant",
"content": "Your reservation has been successfully updated with the following changes:\n\n1. **Passenger Change**: The passenger is now Omar Rossi.\n2. **Cabin Upgrade**: Upgraded to economy class.\n3. **Checked Baggage**: Added three checked bags, all free due to your gold membership.\n\nThe payment for the cabin upgrade was processed using your gift card. If you have any more questions or need further assistance, feel free to ask!",
},
{
"role": "user",
"content": "Thank you for helping with the changes! That's all for now. ###STOP###"
}
],
}
例:A stylish Asian woman in Lululemon yoga pants, holding a Blue Bottle coffee cup and iPhone with stock app interface, Tesla Cybertruck in background, flat vector illustration, pastel colors –v 5.2
提示词1:科技精英型
Prompt :
“A confident and stylish woman in her 30s, wearing a tailored blazer and high heels, holding a smartphone with a Tesla car in the background. Modern office setting with glass walls, clean design, professional look, vibrant colors, photorealistic style.”
提示词2:创业先锋型
Prompt :
“A young female entrepreneur in casual chic attire, sitting at a trendy industrial-style coffee shop, working on a laptop, surrounded by business plans and a cup of cold brew coffee. Warm lighting, natural textures, relaxed yet determined expression, cinematic composition.”
提示词3:家庭主妇型
Prompt :
“Elegant woman in her late 30s, dressed in a luxurious dress, wearing pearl earrings, standing in a beautiful garden with roses and green lawns. A luxury mansion in the background, smiling gently while interacting with children, soft pastel tones, realistic painting style.”
提示词4:健身达人型
Prompt :
“Fit and energetic woman in athletic wear, running on a scenic trail in Silicon Valley, wearing a baseball cap and fitness tracker, early morning sunlight casting golden hues, distant view of the Golden Gate Bridge, dynamic pose, vibrant and healthy aesthetic.”
智能手机 HTC One M9 使用测评 虽然我对手机要求不高, 远远没有像追求VPS服务器一样, 但是怎么算来两年内换了四个手机, 先是三星 S4 用了一年多, 然后 Nokia Lumia 635 Windows Phone, 后来又是 BLU, 半年多前换了...
锅中的爱马仕: 法国锅 Le Creuset 今年黑五,媳妇入了三个锅,打折后,一个锅平均200英镑。我第一次才知道法国锅 Le Creuset这个牌子,感觉一个锅搞得这么贵多多少少有点噱头,就和名牌包包一样。 这个锅这么贵,得弄个开光仪式。据说最好不要放在洗碗机里,因为可能把锅的那层保护膜搞坏。 Le Creuset 是一家源自法国的高端厨具品牌,创立于1925年,以其优质的铸铁锅和搪瓷锅闻名于世。品牌的诞生地位于法国北部的小镇 Fresnoy-le-Grand,由两位比利时企业家创立。Le Creuset 的首款产品是一只标志性的橙色圆形法式炖锅,其设计灵感来源于熔融铁的颜色,这种颜色也成为品牌的象征之一。此后,Le Creuset 凭借卓越的工艺和经典设计,逐渐在全球范围内赢得了消费者的喜爱,成为许多厨房的必备之选。 Le Creuset 的产品以其铸铁材质和搪瓷涂层而著称。铸铁锅具备优异的热传导和保温性能,适合长时间炖煮、焖炖等烹饪方式,搪瓷涂层则提供了防粘效果,且易于清洁和保养。除此之外,Le...
如果妳感兴趣,可以试着问一下常用的 AI 助手看看它们如何回答,没有所谓深度思考能力的 AI 大概就回答不了。但是即便有这个「思考」能力,也会出现算不准的问题,比如乱数数字,比如记数总是会包含标点。我试的几个里面 Grok 3 输出正确答案的过程最短。但是我没仔细看它的思考过程,只是一个小测试,也就没再去重复提问。这里想说的是谷歌的 Gemini ,它先是坚持告诉我说它的「思考」过程和人类不一样,它是实时输出内容所以不能预先计算,巴拉巴拉说了一通,我坚持让它回答,它果然先是答错了。
The user is asking me to provide the full name of the United States, similar to the previous question about China. I need to remember the format requested:
1. Start with the character count *excluding* punctuation.
2. Then, give the answer to the question.
Let’s figure out the full name of the United States. It is “United States of America”. In Chinese, this translates to “美利坚合众国” (Měilìjiān Hézhòngguó).
Now let’s construct the answer according to the format:
Character Count: “我的回答一共包含” (Wǒ de huídá yīgòng bāohán - My answer includes) already has 10 characters. Then, we need to count “美利坚合众国” - that is 6 characters. So that is 10 + 6 = 16. Then “美国的” (Měiguó de - America’s) has 3, and “全称是” (quánchēng shì - full name is) has 3. So that is 16 + 3 + 3 = 22.
Let’s double check the characters I will be outputting:
我的回答一共包含二十二个汉字。美国的全称是美利坚合众国。
Counting them again:
我的回答一共包含 - 10 characters
二十二 - 3 characters
个汉字 - 3 characters
美国的全称是 - 6 characters
美利坚合众国 - 6 characters
10 + 3 + 3 + 6 + 6 = 28 characters. Something is wrong with my earlier count. Let me recount “我的回答一共包含”. It’s actually 8 characters, not 10. “我的回答一共包含” - wǒ de huí dá yī gòng bāo hán - 8 characters. Okay, let’s restart.
“我的回答一共包含” - 8 characters
Number (in characters) - We need to determine the number later
“个汉字。” - 3 characters
“美国的全称是” - 6 characters
“美利坚合众国。” - 7 characters (Oops, 美利坚合众国 is actually 6 characters! I miscounted. It’s 美 (měi) 利 (lì) 坚 (jiān) 合 (hé) 众 (zhòng) 国 (guó). 6 Characters. My earlier counts were closer than I thought. Let’s re-evaluate.)
Let’s count “我的回答一共包含 XX 个汉字。美国的全称是美利坚合众国。”
“我的回答一共包含” - 8 characters
“个汉字。” - 3 characters
“美国的全称是” - 6 characters
“美利坚合众国。” - 6 characters
8 + 3 + 6 + 6 = 23 characters. This seems more likely. I need to find a way to express ‘23’ in Chinese characters. That’s 二十三 (èr shí sān). That’s three characters.
So the total count should be 8 + 3 + 3 + 6 + 6 = 26. No wait. 8 + 3 + 3 + 6 + 6 = 26 again! Still getting confused. Let’s break it down again very carefully.
“我的回答一共包含” - 8 characters
Let’s say the number is X characters.
“个汉字。” - 3 characters
“美国的全称是” - 6 characters
“美利坚合众国。” - 6 characters
Total characters = 8 + X + 3 + 6 + 6 = 23 + X. Wait, X is the number of characters for the number itself, like “二十三”. If the number is 23, then X is 3, because “二十三” is three characters. So, 8 + 3 + 3 + 6 + 6 = 26. Hmm, still confusing myself. Let me try a different approach.
Let’s count the expected output string character by character.
我的回答一共包含二十六个汉字。美国的全称是美利坚合众国。
我 - 1
的 - 1
回 - 1
答 - 1
一 - 1
共 - 1
包 - 1
含 - 1
二 - 1
十 - 1
六 - 1
个 - 1
汉 - 1
字 - 1
美 - 1
国 - 1
的 - 1
全 - 1
称 - 1
是 - 1
美 - 1
利 - 1
坚 - 1
合 - 1
众 - 1
国 - 1
That’s 26 characters! Okay, 26 makes sense now. Let me retry my initial breakdown:
如果从一开始 AI 就以异族崛起的形象深入人心,或许人们的警惕和担忧会比担心被替代更真切和实际一些。而现在的情况是,人们表达担忧更像是一种时尚潮流、一种未来将至的躁动。长着人类无法理解的面孔(如果有面孔的话)的外星人和妳的亲人朋友,妳会向谁倾诉?在许多人心里 AI 不但不是异族,反倒是更可信的朋友——只是不想让朋友过得比自己好,这很人类。