AI Daily | ChatGPT PPT Generation in One Click! CapCut Partners with Gemini for Effortless Editing, Tencent Open-Sources Powerful Translation Model

AI Tech Trends: ChatGPT Tests PowerPoint Generation, CapCut Partners with Gemini for Video Editing Upgrade

Innovation in the tech world never stops. New technologies launch every day, changing how people work and reshaping how they live. Over the past few days, major companies have released a series of practical new tools. Here is how these developments may affect your work and daily life.

ChatGPT Officially Supports PowerPoint Presentation Creation

Making presentations often takes a lot of effort. ChatGPT has now launched a beta feature for PowerPoint that brings generative language models directly into Microsoft’s presentation software. Users enter everyday conversational prompts to generate slides, update existing presentations, or turn messy notes into well-structured text and graphics.

The feature is currently available for testing worldwide, covering Enterprise, Education, and regular free users. It saves time and takes much of the manual work out of layout design.

A common question about the new feature: will corporate or personal data be used to train the model? According to the official description, data from Enterprise and Education users is not used by default to improve future language models, so enterprise users can adopt the feature with peace of mind. Layout work that used to take hours can now be done in a few clicks.

CapCut and Gemini Collaboration: A New Experience of Editing via Conversation

Good news for video creators: the well-known editing software CapCut has announced a partnership with Gemini. In the future, users will be able to call CapCut’s advanced editing features directly from within the Gemini app. Editing that previously required complex timeline operations is set to become an intuitive “conversational” experience.

Users will be able to adjust image and video details precisely through text dialogue, which makes the creative workflow more coherent. The development team believes content creation is moving toward conversational, intelligently integrated tools. This is just the beginning, and more application scenarios are expected to follow, making editing as simple as chatting.

Tencent Open-Sources Hy-MT2 Translation Model: Breakthrough in Lightweight and Multilingual Capabilities

Language barriers have always been a major challenge in international communication. Tencent’s latest release, the Hy-MT2 multilingual translation model, marks notable progress. The series comes in several sizes, including 1.8B, 7B, and 30B-A3B, the last built on a Mixture-of-Experts (MoE) architecture, and supports translation among up to 33 languages.

Notably, for edge device deployment, the team used AngelSlim 1.25-bit extreme quantization. This shrinks the 1.8B lightweight model to just 440 MB of storage while speeding up inference by 1.5 times. Despite its small size, its overall performance is reported to surpass many mainstream commercial APIs, such as those from Microsoft and Doubao.
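To put the storage figures in perspective, here is a rough back-of-the-envelope sketch. It assumes exactly 1.8 billion parameters, 1 MB = 10^6 bytes, and a 16-bit baseline; none of these are stated in the article, and the function names are purely illustrative. Under those assumptions, a pure 1.25-bit encoding of every parameter would take about 281 MB, while the reported 440 MB works out to roughly 1.96 bits per parameter on average. One plausible reading (an assumption, not something the release confirms) is that some tensors or metadata are stored at higher precision.

```python
# Back-of-the-envelope storage estimate for a quantized model.
# Assumptions (not from the article): exactly 1.8e9 parameters,
# 1 MB = 10**6 bytes, and a 16-bit baseline for comparison.

def weight_storage_mb(num_params: float, bits_per_param: float) -> float:
    """Storage in MB if every parameter used exactly `bits_per_param` bits."""
    return num_params * bits_per_param / 8 / 1e6


def effective_bits_per_param(size_mb: float, num_params: float) -> float:
    """Average bits per parameter implied by a total file size in MB."""
    return size_mb * 1e6 * 8 / num_params


NUM_PARAMS = 1.8e9        # assumed exact parameter count for the 1.8B model
REPORTED_SIZE_MB = 440    # figure quoted in the article

fp16_mb = weight_storage_mb(NUM_PARAMS, 16)        # hypothetical 16-bit baseline
ideal_mb = weight_storage_mb(NUM_PARAMS, 1.25)     # every weight at 1.25 bits
implied_bits = effective_bits_per_param(REPORTED_SIZE_MB, NUM_PARAMS)
compression = fp16_mb / REPORTED_SIZE_MB           # shrink factor vs. the baseline

print(f"16-bit baseline:       {fp16_mb:.0f} MB")
print(f"Pure 1.25-bit weights: {ideal_mb:.2f} MB")
print(f"Implied bits/param:    {implied_bits:.2f}")
print(f"Compression vs 16-bit: {compression:.1f}x")
```

Against the assumed 16-bit baseline of 3,600 MB, the reported size is a reduction of a little over eight times.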

To promote community development, the team also open-sourced the IFMTBench benchmark, which evaluates how well a model follows translation instructions. Developers can access these resources on Hugging Face or ModelScope, as well as on the dedicated 7B model page. For integrating the model into translation tasks, the official “Hy-MT2-Translator Skill” is available for download on ClawHub and SkillHub. Tencent is also officially collaborating with WMT26 to host the “Video Subtitle Translation Task” and the “General Machine Translation Task,” inviting technical experts worldwide to participate.

Meituan Releases LongCat-Video-Avatar 1.5: Creating High-Stability Digital Humans

Digital human technology is gradually entering everyday commercial use. Meituan has open-sourced the latest LongCat-Video-Avatar 1.5 framework, which focuses on high-stability, audio-driven digital humans. The upgraded version replaces the older Wav2Vec2 audio encoder with Whisper-Large, a model known for high speech recognition accuracy.

The change brings noticeable results. The generated digital humans have more natural lip-syncing and significantly more stable whole-body movement. The model also adapts well across styles, handling realistic humans, anime characters, and even cute animals.

In terms of inference efficiency, step distillation allows high-quality images to be produced in just 8 steps, balancing visual fidelity against server computing costs. The framework is a powerful tool for creating virtual anchors and video content. Readers interested in the technical details can consult the official technical report, the model files on Hugging Face, and the source code on the GitHub project page.

Claude Becomes a Capable Assistant for Enterprise Security and Compliance

As enterprises face increasingly diverse network threats, defenses must keep pace. Anthropic is helping many partners apply Claude’s Opus model to cybersecurity, and several real-world cases have shown impressive results.

For example, cybersecurity firm Wiz uses the Opus model to run continuous attack simulation tests on over 150,000 production assets every week, identifying thousands of high-risk vulnerabilities. Palo Alto Networks used the technology to shorten penetration testing work that would normally take a year to just three weeks. Accenture also integrated Opus, cutting scan analysis time from three to five days down to under an hour.

Meanwhile, Anthropic officially announced that Claude now supports more security and compliance tools, so that enterprises can meet strict audit standards while adopting AI. Turning a top-tier language model into a 24/7 security expert gives corporate defenses a new kind of capability.

Q&A

Q: Will ChatGPT for PowerPoint use my presentation data to train future AI models? A: By default, no. For Business, Enterprise, Education, and Teacher plan users, data shared with ChatGPT is not used to improve future language models.

Q: How will the integration of CapCut and Gemini change the video editing workflow? A: In the future, users will be able to use CapCut’s advanced editing features directly within the Gemini app, which will turn the previously complex editing process into an intuitive “conversational” and intelligently integrated experience.

Q: What breakthroughs does Tencent’s Hy-MT2 translation model have in edge device deployment? A: The Hy-MT2 series supports translation among up to 33 languages. To address edge device deployment constraints, the team adopted AngelSlim 1.25-bit extreme quantization, reducing the storage footprint of the 1.8B lightweight model to only 440 MB while also speeding up inference by 1.5 times.

Q: What key technical upgrades did Meituan’s LongCat-Video-Avatar 1.5 make to improve digital human stability and generation efficiency? A: In terms of stability, the model upgraded the audio encoder to Whisper-Large (replacing the old Wav2Vec2), significantly improving lip-sync naturalness and whole-body movement stability. In terms of inference efficiency, advanced step distillation technology was used, and now only 8 inference steps (8 NFE) are needed to generate high-quality images that balance visual fidelity and server costs.

Q: What specific achievements has Claude Opus made in helping enterprises improve cybersecurity defenses? A: Claude Opus has brought significant efficiency improvements in automated security defense. For example, Wiz runs continuous attack simulation tests on over 150,000 production assets every week; Palo Alto Networks shortened a year’s worth of penetration testing work to three weeks; and Accenture cut scan analysis time from three to five days down to under one hour.
