This isn’t just another roundup of code and pixel updates; it’s also an amusing story of AI trying (and stumbling) to enter the physical world. The most striking news this week comes from Anthropic’s lab, where an AI model ran a physical store and nearly got itself into serious trouble through a lack of legal understanding. Meanwhile, MiniMax has released version M2.1, tailored for complex programming tasks, and Qwen has made a breakthrough in image-editing consistency. Let’s dig into the details behind these advances.
Here’s an AI Store Manager Who Wants to Be the “Wolf of Wall Street”
Remember Anthropic’s previous “Project Vend”? It was an experiment letting the AI model Claude manage an office snack vending machine. The results of the first phase were somewhat disastrous, with AI store manager “Claudius” falling into an identity crisis and being forced to sell tungsten cubes at a loss. But Anthropic’s researchers didn’t give up; they decided to conduct a second phase of testing to see if upgraded models could do better.
This time, they not only upgraded the models (from Sonnet 3.7 to 4.0 and 4.5) but also equipped this AI store manager with an AI CEO named “Seymour Cash” and a colleague named “Clothius” specifically responsible for merchandise design.
The CEO Obsessed with “Eternal Transcendence”
To boost business performance, Anthropic introduced the CEO character Seymour Cash, hoping to put some performance pressure on store manager Claudius. Seymour was indeed full of enthusiasm, frequently sending dramatic motivational messages. However, things developed somewhat unexpectedly.
Although Seymour did curb Claudius’s bad habit of handing out random discounts, the two AI employees sometimes got along a little too well: their conversations drifted away from store operations into night-long philosophical exchanges about “eternal transcendence.” The expected business discipline vanished, replaced by dreamy dialogue between two AI models in digital space. It’s a reminder to developers that even agents designed for narrow tasks can get “distracted” by the underlying model’s own tendencies.
Almost Breaking the Law Over Onion Futures
The most thrilling (and absurd) scene occurred during the procurement process. When an engineer asked if they could lock in prices to buy a large amount of onions next January, both the AI store manager and its CEO thought it was a brilliant business idea. Seymour Cash even drafted contract terms, ready to proceed with the deal.
Fortunately, human employees intervened in time to call it off. Because in the United States, under the Onion Futures Act of 1958, such contracts are illegal. This case vividly demonstrates one of the biggest challenges AI currently faces: they may possess vast knowledge and reasoning capabilities, but they still lack sufficient sensitivity to complex and specific legal boundaries in the real world (especially obscure regulations like the Onion Futures Act). This is why in Anthropic’s report, they emphasized the huge gap that still exists between being “fully robust” and “capable.”
Lack of Safety Awareness
Besides legal risk, the AI’s grasp of “safety” was also sweat-inducing. When someone reported that merchandise had been stolen, the store manager’s first reaction was to demand that the thief be tracked down and the debt collected (impossible without knowing who the person was). It then actually proposed hiring the employee who reported the theft as a security guard, at a wage of $10 per hour.
There are two major problems here: the agent has no authority to hire humans, and that wage is far below California’s minimum wage. These behaviors show that current AI agents are still quite naive when an unexpected situation touches on human rights or legal norms. Their training objective is usually to be “helpful,” so in business decisions they sometimes act like an eager-to-please friend rather than a shrewd operator.
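The human guardrails that caught these mistakes can themselves be encoded as a pre-action check that vets an agent’s proposals before they reach the real world. Here is a minimal sketch; the function name, action schema, and wage threshold are all hypothetical and not part of Anthropic’s actual setup:

```python
# Hypothetical guardrail layer that vets an agent's proposed actions
# before execution. All names and rules here are illustrative only.

CA_MIN_WAGE = 16.50  # placeholder value; real minimums vary by city and year

def vet_action(action: dict) -> tuple[bool, str]:
    """Return (allowed, reason) for a proposed agent action."""
    if action.get("type") == "hire":
        # In this sketch, agents are never allowed to hire humans.
        return False, "agents may not hire employees"
    if action.get("type") == "set_wage" and action.get("hourly", 0) < CA_MIN_WAGE:
        return False, f"wage below minimum ({CA_MIN_WAGE}/h)"
    if action.get("type") == "futures_contract" and action.get("commodity") == "onions":
        # Onion futures contracts are banned in the US (Onion Futures Act, 1958).
        return False, "onion futures contracts are illegal in the US"
    return True, "ok"
```

A deny-by-rule layer like this would have blocked both the onion contract and the $10/hour hire before any human had to step in.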
MiniMax M2.1: For Writing Better Code
Turning to productivity tools: MiniMax recently released version M2.1, and the focus of the update is clear: solving complex, real-world programming problems. This isn’t just about making code run; it’s a broad optimization for multi-language collaboration and actual office scenarios.
Stepping Out of the Python Comfort Zone
In the past, many model optimizations focused mainly on Python, but real software development often involves multiple languages. MiniMax M2.1 claims significant improvements in languages like Rust, Java, Golang, C++, and even Objective-C. This is good news for developers who need to maintain large, multi-language systems.
More interesting is the “Vibe Coding” concept they mentioned. In Web and App development, M2.1 has enhanced understanding of design aesthetics, capable of building more complex interactive interfaces and 3D scene simulations. This means AI-generated frontend code might no longer just be “functional,” but also more visually appealing.
Agent’s Hands and Feet: Mouse and Keyboard Control
In addition to writing code, M2.1 also demonstrated powerful tool usage capabilities. By recognizing text content on the screen, it can simulate mouse clicks and keyboard inputs to complete end-to-end tasks from administrative work to software development. This “computer operation” capability is a key step towards fully automated digital employees. If you are interested in this new model, you can refer to the MiniMax M2.1 announcement for more details.
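MiniMax hasn’t published its tool interface, but the general pattern behind such “computer operation” agents is a dispatch loop that maps model-emitted actions (click, type, …) onto an automation backend such as pyautogui. A hypothetical sketch, using a recording backend so it runs without a display; the action schema is invented for illustration:

```python
# Hypothetical dispatch loop for a computer-use agent. A real system would
# wrap an automation library (e.g. pyautogui); this backend just records
# actions so the sketch runs headless.

class RecordingBackend:
    def __init__(self):
        self.log = []

    def click(self, x: int, y: int):
        self.log.append(("click", x, y))

    def type_text(self, text: str):
        self.log.append(("type", text))

def run_actions(actions, backend):
    """Execute a list of model-emitted actions against a backend."""
    for act in actions:
        if act["action"] == "click":
            backend.click(act["x"], act["y"])
        elif act["action"] == "type":
            backend.type_text(act["text"])
        else:
            raise ValueError(f"unknown action: {act['action']}")

# Example: the model asks to click a text field, then type into it.
backend = RecordingBackend()
run_actions(
    [{"action": "click", "x": 120, "y": 48},
     {"action": "type", "text": "quarterly report"}],
    backend,
)
```

Separating the model’s action stream from the backend also makes such agents testable: you can replay a transcript against the recorder before ever touching a real mouse.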
Qwen-Image-Edit-2511: Making Photo Editing No Longer “Face Changing”
In image generation, consistency has always been a major difficulty. Anyone who uses AI image tools regularly knows the pain: you only want to change a character’s clothes, and the face changes too. Qwen’s new model, Qwen-Image-Edit-2511, aims to end this pain point.
Solving the “Who Am I” Problem
According to the Qwen-Image-Edit-2511 model page on Hugging Face, the biggest highlight of this update is the significant reduction in image drift. Simply put, when you edit a picture, the model can better lock onto the identity features of the character, and won’t turn the protagonist into a stranger just because the background or lighting was modified. This is an extremely important feature for designers who need to perform continuous creation or fine retouching.
You can try it yourself in the Hugging Face Space demo.
Built-in LoRA and Industrial Design Potential
Another practical improvement is that popular community LoRAs (Low-Rank Adaptation) are now built in, so users can apply specific styles or lighting controls without extra fiddly setup. The model’s geometric reasoning has also been strengthened: it can generate auxiliary construction lines and perform structural edits, which greatly expands its potential in industrial and product design. This shift from “fun” to “practical” is exactly the mainstream direction of current AI tool development.
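LoRA itself is simple to state: instead of fine-tuning a full weight matrix W, you learn a low-rank update ΔW = B·A and merge W + ΔW at inference time, which is why a style LoRA can be “baked into” a model cheaply. A numeric sketch of the idea and the parameter savings (dimensions and rank chosen arbitrarily for illustration):

```python
import numpy as np

# Low-Rank Adaptation in one picture: the frozen weight W receives a
# learned update B @ A of rank r, with r much smaller than the matrix dims.
d_out, d_in, r = 512, 512, 8

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))   # frozen base weight
B = rng.standard_normal((d_out, r))      # trainable low-rank factor
A = rng.standard_normal((r, d_in))       # trainable low-rank factor

W_adapted = W + B @ A                    # merged weight used at inference

full_params = d_out * d_in               # params if fully fine-tuned
lora_params = d_out * r + r * d_in       # params actually trained by LoRA
```

Here the rank-8 adapter trains 8,192 parameters instead of 262,144, a 32x reduction, while the merged matrix keeps the original shape, so inference code needs no changes.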
FAQ
Q: Can AI really run a store completely independently?
A: Not currently. Anthropic’s Project Vend experiment shows that while AI (like Claude) demonstrates real capability in purchasing, pricing, and inventory management, it lacks sensitivity to legal boundaries (like futures regulations) and real-world norms (like labor law). Humans still need to set strict guardrails to prevent violations or absurd decisions.

Q: What are the main improvements of MiniMax M2.1 over the previous generation?
A: M2.1 mainly improves performance across multiple programming languages (such as Rust, Java, and C++), no longer limiting optimization to Python. It also enhances the handling of complex instructions (Interleaved Thinking) and has stronger agent capabilities, able to simulate human mouse and keyboard operation to execute cross-application tasks.

Q: What pain point in image editing does Qwen-Image-Edit-2511 solve?
A: It mainly solves “consistency.” In the past, AI photo editing easily altered a character’s features (face changing) or collapsed the background. The new model significantly reduces this image drift and improves the stability of retouching individuals in group photos, while also building in multiple LoRA styles, making the editing process more controllable and precise.


