AI Daily: Project Deal Experiment, GPT-5.5 Prompting Guide, and NotebookLM Auto-Categorization

April 27, 2026

It may sound incredible, but artificial intelligence has long since moved beyond simple text-based conversation and begun executing specific, complex tasks in the real world. Today's AI agents can not only help with coding and organizing tedious documents, but can even negotiate on behalf of humans in an office setting, and all of this is being woven into daily workflows with remarkable fluidity. Below, we take a close look at several high-profile technical advances to see how these systems are reshaping digital experiences and business interactions.

When AI Starts Negotiating in the Office: What Surprises Does Anthropic’s Project Deal Bring?

To be honest, letting AI handle financial transactions on behalf of humans sounds a bit like science fiction. Yet Anthropic recently released an internal experimental study called Project Deal showing that it is entirely feasible, and that the agents perform quite well.

The experiment ran in an internal Slack-based office marketplace. Rather than participating personally, 69 employees fully authorized Claude models to buy and sell on their behalf, with real money at stake. These Claude agents facilitated 186 transactions among more than 500 physical items listed, with a total turnover exceeding $4,000. From snowboards to a bag of ping pong balls, the AI had to assess item value, make offers, and negotiate intensely with other AIs. For full data and experimental design details, refer to the detailed official PDF report.

One might ask: if AI negotiates on its own, are humans actually satisfied with the results? The experimental data provides a thought-provoking answer. When the models representing employees were the more capable Opus version, they generally achieved better transaction terms than the lightweight Haiku version: Opus agents not only sold more items but also secured higher prices for the same items. Interestingly, employees represented by the weaker model did not perceive their disadvantage at all in post-experiment satisfaction surveys. This raises an issue worth watching: as AI increasingly represents humans in business interactions, gaps in model intelligence may quietly create a new kind of economic class difference. It is a direction that must be approached with caution.

No Need for Specialized Models Anymore: How GPT-5.5 Swallowed Codex and Dominated Computer Operations

Beyond buying and selling in the office, the evolution of AI agents in software engineering is equally remarkable. OpenAI's Romain Huet announced a major architectural change on the social platform X on April 25: the well-known dedicated programming model Codex has officially been retired as an independent branch.

You might wonder why OpenAI discontinued such a popular standalone model. The reason is straightforward. Starting with GPT-5.4, Codex's core capabilities were seamlessly folded into the main model, and the newly released GPT-5.5 pushes this integration further still. The new model shows strong performance gains in agentic programming, computer operations, and a range of terminal tasks.

Developers no longer need to switch between different specialized models for different tasks. A unified system can handle various complex computer commands and development work. This not only significantly reduces the burden of system maintenance but also allows development teams to focus more on the product logic itself, enjoying a seamless and smooth experience.

Too Many Sources to Handle? NotebookLM’s Auto-Categorization Might Be the Perfect Remedy

While powerful AI agents are busy coding or trading, humans still need to read and absorb vast amounts of new knowledge. For anyone facing a tangle of reference materials, Google's NotebookLM now targets the pain point of organizing information: the official account recently showcased a highly practical new feature in a social post.

Feeling overwhelmed by piles of documents and web links? Things are different now. NotebookLM has introduced a source auto-categorization feature: whenever a user imports more than 5 reference materials, the system automatically adds labels and sorts them into intelligent categories. This saves the time spent aimlessly scrolling and preserves precious brainpower for thinking and learning. Users can also freely rename and reorganize these categories, and even add their own emojis, bringing a bit of personalized fun to an otherwise rigid organizing chore and illustrating how thoughtfully designed tools can lighten the daily load of knowledge workers.
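To make the trigger-and-label behavior concrete, here is a minimal Python sketch. The threshold semantics, the keyword rules, and the function names are all assumptions for illustration; NotebookLM's actual categorization logic is not public.

```python
from collections import defaultdict

# Hypothetical sketch of the auto-categorization behavior: once the number
# of imported sources crosses a threshold, each source gets a label and the
# sources are grouped. The threshold value, keyword rules, and matching
# heuristic are assumptions, not NotebookLM's real implementation.
AUTO_TAG_THRESHOLD = 5  # assumed: auto-tagging kicks in above this count

def auto_categorize(sources, rules):
    """Group source titles into labeled categories.

    sources: list of title strings; rules: {label: [keywords]}.
    Returns {} when there are too few sources to trigger auto-tagging.
    """
    if len(sources) <= AUTO_TAG_THRESHOLD:
        return {}
    categories = defaultdict(list)
    for title in sources:
        lowered = title.lower()
        label = next(
            (tag for tag, words in rules.items()
             if any(word in lowered for word in words)),
            "Uncategorized",
        )
        categories[label].append(title)
    return dict(categories)

rules = {"🤖 Agents": ["agent", "negotiation"], "📝 Prompting": ["prompt"]}
docs = ["Project Deal agent study", "Prompting guide notes",
        "Negotiation transcripts", "Prompt pattern survey",
        "Agent evaluation memo", "Quarterly budget"]
print(auto_categorize(docs, rules))
```

In this sketch, the renaming and emoji personalization the feature offers would simply correspond to editing the label keys of the returned dictionary.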

Stop Micromanaging: GPT-5.5 Prompting Guide Teaches You How to Truly Let Go

Since the new generation of models has become so capable, the way humans communicate with them must evolve as well. OpenAI's latest GPT-5.5 Prompting Guide highlights a key mindset shift: it suggests abandoning the long, step-by-step, "foolproof" prompting structures of the past.

One might wonder: What is the fundamental difference between GPT-5.5 prompting and older versions? Simply put, it’s learning to let go.

Concise, result-oriented instructions now work better. With a system possessing high-level reasoning capabilities, it is enough to clearly describe what success looks like, the relevant constraints, the available evidence, and what the final answer should include. Over-specifying execution details can actually limit the model's search flexibility and add noise, producing output that feels mechanical.
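As a sketch of what "describe success, not steps" can look like in practice, the helper below assembles a prompt from exactly the four ingredients just mentioned. The section names and template are my own illustration, not wording from the guide.

```python
def result_oriented_prompt(success, constraints, evidence, answer_format):
    """Assemble a concise, goal-oriented prompt: state what success looks
    like, the constraints, the evidence on hand, and the expected shape of
    the final answer -- and leave the execution path to the model."""
    sections = [
        ("Success criteria", success),
        ("Constraints", constraints),
        ("Available evidence", evidence),
        ("Final answer must include", answer_format),
    ]
    return "\n\n".join(f"{title}:\n{body}" for title, body in sections)

prompt = result_oriented_prompt(
    success="The bug described in the ticket no longer reproduces.",
    constraints="Do not change the public API; keep the diff minimal.",
    evidence="The failing test log and the two source files attached.",
    answer_format="A unified diff plus a one-paragraph summary.",
)
print(prompt)
```

Note what the template deliberately omits: there is no numbered list of steps, no prescribed order of operations, only the goal and its boundaries.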

The guide also reveals several effective strategies for controlling high-level AI agents:

  • Clearly Define Personality and Collaboration Style: For customer-facing applications, simply giving a task is not enough. Clearly setting the AI’s tone, sense of humor, empathy, and when to proactively ask questions ensures the system presents a stable image consistent with the product positioning.
  • Use Prefaces to Reduce Perceived Waiting: For long tasks that require many tool calls, guide the model to output a short "preface" first, confirming the task and explaining the first step. This significantly improves the user's perception of system responsiveness.
  • Set Strict Retrieval Budgets: This is equivalent to telling the model when to stop searching blindly. If preliminary results already contain enough evidence to answer the core question, provide the answer directly. Do not conduct pointless secondary searches just to polish phrasing, which is crucial for saving computing resources.
  • Validation Control for Visual and Frontend Output: When AI is responsible for generating interfaces or code, provide tools for them to check their own output. Requiring the model to perform tests or rendering checks before giving a final answer can significantly reduce error rates.
  • Use Phase Parameters: For complex tool-dependent workflows, maintaining the correct transmission of auxiliary phase values helps the system clearly distinguish between transitional reports and the final answer to be presented to the user.
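Of these strategies, the retrieval budget translates most directly into agent-loop code. The following sketch is an illustration under assumed interfaces (the search callable, the sufficiency check, and the budget value are all invented for the example), not the guide's implementation:

```python
def answer_with_budget(question, search, enough, max_searches=3):
    """Collect evidence under a hard retrieval budget: stop as soon as the
    gathered evidence can answer the core question, and never run extra
    searches just to polish phrasing.

    search(question, attempt) -> list of evidence snippets (assumed API)
    enough(evidence) -> True once the core question is answerable
    """
    evidence, used = [], 0
    for attempt in range(max_searches):
        evidence.extend(search(question, attempt))
        used = attempt + 1
        if enough(evidence):
            break  # budget discipline: no pointless secondary searches
    return {"evidence": evidence, "searches_used": used}

# Toy usage: the first search already yields two relevant snippets,
# so the loop stops after a single call despite a budget of three.
fake_index = [["snippet A", "snippet B"], ["snippet C"], ["snippet D"]]
result = answer_with_budget(
    "What did Project Deal measure?",
    search=lambda q, attempt: fake_index[attempt],
    enough=lambda ev: len(ev) >= 2,
)
print(result)
```

The same early-exit shape generalizes to the other strategies: each one adds a cheap check around the model's expensive work rather than scripting the work itself.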

Technical evolution often lands closer to daily life than expected. From virtual office assistants that know how to negotiate, to a single model with strong autonomous coding capabilities, to smart notebooks that organize material automatically, these applications are reshaping everyday working standards in a very natural way. Mastering how to issue clear, goal-oriented instructions to such high-level systems will be a key skill for everyone.

Q&A

Q1: In Anthropic's Project Deal experiment, what were the specific differences between smarter and weaker AI models when negotiating on behalf of humans, and did humans notice? A1: Experimental data showed that the stronger model (Opus) performed objectively better, completing more transactions than the lightweight model (Haiku) and, on average, selling the same items at higher prices. Thought-provokingly, however, post-experiment satisfaction surveys indicated that employees represented by the weaker model did not subjectively perceive their disadvantage and even felt their transactions were quite fair. This suggests that in a future AI agent economy, gaps in information or capability could create invisible hierarchies.

Q2: Why did OpenAI decide to cancel the independent Codex branch specialized for coding? A2: According to Romain Huet’s announcement on April 25, 2026, starting from GPT-5.4, Codex’s programming capabilities have been unified and integrated into the main model, so there is no longer a need to maintain an independent code branch. The latest GPT-5.5 further strengthens the performance of agent programming and computer operation tasks. Developers can now smoothly handle various complex development jobs through a single unified system.

Q3: Faced with complex literature, what new features does NotebookLM provide to help knowledge workers? A3: NotebookLM has introduced a powerful “Auto-tagging and Categorization” feature. When a user imports 5 or more reference sources, the system automatically performs intelligent categorization and labeling. Additionally, users can freely rename and reorganize these category directories, and even add exclusive emojis, making the document organization process both time-saving and personally enjoyable.

Q4: According to OpenAI’s latest GPT-5.5 Prompting Guide, what fundamental change should we make when writing instructions? A4: The core change is being “result-oriented” and learning to let go of the model. The guide suggests abandoning the long, overly prescriptive prompting structures of the past. When facing GPT-5.5 with high-level reasoning capabilities, you only need to clearly define what success looks like, constraints, and what the final answer should include, allowing the model to choose its own path to achieve the goal. Over-prescribing execution details will increase noise, limit the model’s search flexibility, and result in output that is too mechanical.

© 2026 Communeify. All rights reserved.