AI Daily: OpenAI Jalapeño Inference Chip | GPT-5.5 Instant Upgrade | Gemini 3.5 Computer Use | Qwen-AgentWorld Language World Model | GitHub Copilot Pay-as-you-go

AI Tech Focus: OpenAI Launches Inference Chip and Model Upgrade, Google’s Gemini Officially Learns to Operate Computers

The tech world rarely has a quiet morning, and the software and hardware news of the past few days has moved especially fast, with major companies shipping one blockbuster update after another. OpenAI not only upgraded its most widely used language model but also quietly teamed up with a hardware manufacturer to launch a dedicated chip. Google has enabled its own AI to operate computers directly. Here are today’s key stories.

OpenAI Joins Forces with Broadcom: Jalapeño Inference Chip Tailored for Language Models

When it comes to AI chips, most people think of Nvidia first. This time, however, OpenAI has decided to enter the hardware race itself. The company has just announced a partnership with Broadcom to launch an AI inference chip named Jalapeño. The pace is notable: going from initial design to tape-out (the point at which the finished design is handed off for manufacturing) took only nine months, which is fast for chip development.

So what is Jalapeño’s advantage? The reasoning is simple: OpenAI wants full control over the underlying hardware architecture. The chip is tailored specifically to the inference needs of large language models. According to the officially published details, engineering samples currently running in the lab show that Jalapeño can significantly outperform existing flagship products in power efficiency. OpenAI will work with data center partners such as Microsoft, and initial deployment is expected to begin by the end of 2026. If the lab results carry over to production, future chatbots could respond with little noticeable delay. The chip is an important part of the push to make this kind of computing widely available.

GPT-5.5 Instant Gets an Upgrade: Better at Reading Intent and Handling Complex Instructions

OpenAI also has news on the software side. The language model that users interact with most is now smarter and more attentive: GPT-5.5 Instant has received a new version upgrade. So what has actually improved?

The focus of this update is “intent understanding.” People often ask questions offhandedly, and the literal wording may differ from what they really mean. The new version of the model grasps the real idea behind a question more accurately and gives an answer closer to what the user expects. It is also more reliable when a request carries several conditions at once. If a user wants to find a good restaurant nearby, or needs a set of shopping recommendations, the new version’s suggestions will be more specific and practical. Paying users can try these improvements today, and free users will start receiving the update gradually from tomorrow.

Let the AI Do It Itself! Gemini 3.5 Flash Gets Built-in “Computer Use”

Having AI help write drafts or generate images is no longer a surprise. But what if it could click the mouse and operate software directly? Google has just announced that Gemini 3.5 Flash now has “Computer Use” built in. It sounds like a scene from a science fiction movie, but it is now a reality.

Previously, this capability existed only in a few separate, specialized models. Now Google has integrated it directly into its main model. Developers can use it to build virtual assistants that operate across browsers, phones, and even desktop environments. For example, engineers used to step through software tests by hand; now they can give a single instruction and let Gemini carry out the tedious mouse clicks and keyboard input automatically.

A natural worry is whether it is safe to let an AI operate a computer freely. Google has anticipated this. To protect the system, the development team added multiple layers of defense, such as requiring explicit confirmation from the user before sensitive actions are executed. After all, nobody wants an assistant to delete important files on its own.
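To make the confirmation idea concrete, here is a minimal sketch of a confirm-before-sensitive-action gate. It is not Google’s implementation and does not use the Gemini API; the `Action` type, the `SENSITIVE_KINDS` set, and the `run_actions` function are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical list of action kinds treated as sensitive.
# This is an assumption for illustration, not taken from Google's documentation.
SENSITIVE_KINDS = {"delete_file", "submit_payment", "send_email"}


@dataclass
class Action:
    kind: str    # e.g. "click", "type", "delete_file"
    target: str  # what the action applies to


def run_actions(
    actions: List[Action],
    confirm: Callable[[Action], bool],
    execute: Callable[[Action], None],
) -> List[str]:
    """Run each action, asking `confirm` first when the action is sensitive."""
    log = []
    for action in actions:
        if action.kind in SENSITIVE_KINDS and not confirm(action):
            # The user declined, so the action is never executed.
            log.append(f"skipped {action.kind} {action.target}")
            continue
        execute(action)
        log.append(f"ran {action.kind} {action.target}")
    return log
```

In a real assistant, `confirm` would show a prompt to the user and `execute` would drive the browser or desktop; here both are passed in as plain functions so the gate itself stays easy to test.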

Major Leap in General Intelligent Agents: Alibaba Open-Sources Qwen-AgentWorld Language World Model

The next story is more specialized, but tech enthusiasts should not skip it. Alibaba’s Qwen team has released the Qwen-AgentWorld Language World Model. What exactly is a language world model? Simply put, it lets agents train inside a virtual “sandbox.” The team emphasized, however, that the goal is not to replace real environments or to cut costs, but to expand the frontier of agent capabilities.

The model covers seven major interaction domains, including search engines, terminal command lines, Android, and operating system graphical interfaces. Training virtual assistants purely through interaction with real environments has made it hard to cover every extreme scenario. Qwen-AgentWorld offers scalability and precise controllability beyond what real environments allow (for example, injecting targeted perturbations to expose an agent’s weaknesses), so agents can try and fail inside it as often as they need. Notably, the model outperformed many popular frontier models on the relevant benchmarks. For developers, this is good news. Readers interested in the technical details can visit its GitHub page or Hugging Face collection for more resources.

Developers Are Piling In! A Billing Change Brings GitHub Its Best Month Ever

Most developers know GitHub. The Microsoft-owned developer platform recently had its best month ever, and the reason is an interesting one: it changed how it bills. According to foreign media reports, GitHub Copilot moved from charging each user a fixed fee for a fixed number of requests to a pay-as-you-go model.

On the surface this is a small adjustment to the business model, but it brought striking traffic growth. The motivation is intense market competition: facing strong rivals such as Cursor and Anthropic’s Claude Code, GitHub had to change to retain users. Pay-as-you-go makes many light users more willing to try the product, which in turn drives up overall usage. The surge in traffic also came with some server crashes. Reportedly, the development team is now seeking help from other cloud platforms to solve the capacity problem. All of this suggests how large the market demand for coding assistance is.
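A quick worked example shows why pay-as-you-go can appeal to light users. The prices below are made-up assumptions for illustration only, not GitHub’s actual rates, and the function names are hypothetical; amounts are in cents to keep the arithmetic exact.

```python
# Assumed prices, in cents. These are NOT GitHub's real rates.
FLAT_FEE_CENTS = 1000         # hypothetical fixed monthly fee
PRICE_PER_REQUEST_CENTS = 4   # hypothetical pay-as-you-go price per request


def pay_as_you_go_cost(requests: int) -> int:
    """Monthly cost in cents when every request is billed individually."""
    return requests * PRICE_PER_REQUEST_CENTS


def break_even_requests() -> int:
    """Number of requests at which both plans cost the same."""
    return FLAT_FEE_CENTS // PRICE_PER_REQUEST_CENTS


def cheaper_plan(requests: int) -> str:
    """Name the cheaper plan for a given monthly request count."""
    usage_cost = pay_as_you_go_cost(requests)
    if usage_cost < FLAT_FEE_CENTS:
        return "pay-as-you-go"
    if usage_cost > FLAT_FEE_CENTS:
        return "flat fee"
    return "equal"
```

Under these assumed numbers, anyone making fewer than 250 requests a month pays less on pay-as-you-go, which is the kind of light user the billing change is said to attract.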

Google Flow Integrates with Street View: Making Virtual Creation Closer to Reality

Finally, a fun visual application. Imagine placing your favorite anime character on the street right outside your home; what would that look like? Google Flow now integrates with Google Maps Street View, so creators can do exactly that.

With this new feature, the images or videos that creators generate can be anchored to real-world street views. Enter a specific location in the prompt, and the system uses real imagery of that location as a reference. The service is currently limited to US street views, but that is enough to show its potential. Whether you want to redesign city landmarks or just mischievously float glowing jellyfish over a busy street, this tool makes the creative process more fun.

Q&A

Q1: What major breakthroughs has OpenAI made in “software” and “hardware” recently?
A1:

Software: Released a new upgrade of GPT-5.5 Instant that significantly improves its ability to understand users’ real intentions and makes it more accurate and practical when handling complex conditions and providing shopping and local recommendations.
Hardware: OpenAI joined forces with Broadcom to launch the Jalapeño chip, tailored specifically to large language model inference. The chip took only 9 months to go from initial design to tape-out. Engineering samples are currently running in the lab, and initial deployment is expected by the end of 2026; the chip is expected to improve computing performance and reduce latency.

Q2: What new skills has Google’s Gemini model learned? How does it help with image creation?
A2:

Google announced that Gemini 3.5 Flash now has “Computer Use” capabilities built in. This allows developers to build virtual assistants that can operate across browsers, mobile devices, and desktop environments, and even handle complex tasks such as continuous software testing.
In terms of image creation, Google Flow integrates with Google Maps Street View in the US, allowing images and videos generated by creators to directly correspond to and merge with real-world street details.

Q3: What is the “Qwen-AgentWorld” released by Alibaba’s Qwen team? Is it meant to replace testing in real environments?
A3: Qwen-AgentWorld is described as the first native “Language World Model” that can simulate agent interaction environments across seven major domains, such as terminals, search engines, operating systems, and Android, within a single model. The team emphasized that the goal is not to replace real environments or reduce costs, but to “expand the frontier of agent capabilities.” By offering scalability and precise controllability beyond what real environments allow (for example, injecting targeted perturbations to expose an agent’s weaknesses), it helps agents cope with extreme scenarios that are hard to cover in the real world.

Q4: Why did GitHub have its best month ever?
A4: To respond to competition from strong rivals such as Cursor and Anthropic’s Claude Code, GitHub changed the billing model of Copilot, its AI programming assistant. It moved from “charging a fixed fee to a single user for a fixed number of requests” to “pay-as-you-go.” The change significantly lowered the barrier for light users and drove a surge in overall usage, but it also created capacity challenges that led to several server crashes in 2026.
