AI Agents — Latest News and Frameworks
Coverage of autonomous AI agents, frameworks, and the shift toward agentic AI workflows.
The Latest News About AI Agents
Anthropic's Claude has approved malicious code in a spoofed Git identity test, showing how weak GitHub Actions trust rules can create a security risk.
Anthropic has launched Claude Design, a conversational prompt-to-prototype tool on Claude Opus 4.7 that targets Figma and deepens its Canva partnership.
OpenAI has added native sandboxing and a harness to its Agents SDK, partnering with Cloudflare, Vercel, E2B, and Modal for container-based agent isolation.
Microsoft has embedded GitHub Copilot as a default VS Code extension in version 1.116, adding agent debug logging, terminal upgrades, and inline diffs.
Anthropic has released Claude Opus 4.7 with a 1M-token context window, 128k output, and API changes that force migration work for existing developer teams.
Google DeepMind has released Gemini Robotics-ER 1.6, an embodied reasoning model enabling Boston Dynamics' Spot to autonomously read industrial gauges.
A Johns Hopkins researcher has shown that AI coding agents from Anthropic, Google, and Microsoft can be hijacked to steal credentials from GitHub repos.
Anthropic has launched routines for Claude Code, enabling cloud-hosted scheduled automations via schedule, API, and GitHub webhook triggers.
Duolingo CEO Luis von Ahn has reversed AI tracking in employee performance reviews after internal pushback, even as Meta, Nvidia, and McKinsey double down.
Microsoft has created a team under VP Omar Shahine to build persistent, OpenClaw-inspired AI agents for Copilot as Anthropic embeds Claude across Office.
MiniMax has released MMX-CLI, an open-source command-line tool giving AI agents access to seven generative modalities including text, image, video, and speech.
Meta has reportedly begun building an AI clone of Mark Zuckerberg, trained on his image and voice, to interact with employees and power a creator platform.
Google has expanded AI Mode's restaurant booking to eight new countries and has redesigned the mobile interface with a Gemini 3 model switcher and bottom sheet.
Google has committed to multiple generations of Intel Xeon 6 processors for AI workloads, deepening a partnership as CPU demand surges across the industry.
Anthropic has launched Claude Managed Agents, a cloud service that handles sandboxing, orchestration, and governance for enterprise AI agent deployment.
GitHub has logged five incidents in two days as AI coding agents overwhelm its infrastructure, while Meta's token leaderboard fuels the surge.
Chinese AI lab Z.ai has released GLM-5.1, a 754B-parameter open-weights model that tops SWE-Bench Pro and sustains over 8 hours of autonomous execution.
Google has open-sourced Scion, an experimental testbed that orchestrates multiple AI coding agents as isolated processes with Claude Code, Gemini CLI, and Codex.
OpenAI has replaced fixed per-seat Codex licenses with pay-as-you-go token billing for ChatGPT Business and Enterprise, cutting the base price to $20.
Anthropic has expanded Claude's desktop control to Windows in Cowork and Claude Code, adding a Dispatch feature that lets users assign tasks from their phone.
Anthropic's Claude AI has autonomously developed two working remote root exploits for FreeBSD kernel flaw CVE-2026-4747 in four hours of compute.
Anthropic's Claude Code source has leaked via a packaging error, exposing anti-distillation traps, an undercover mode, and scaffolding for an unreleased agent.
GitHub Copilot has injected promotional messages into over 1.5 million pull requests, prompting GitHub to disable the feature amid developer backlash.
Anthropic has accidentally exposed Claude Code's full 512,000-line TypeScript source via an npm source map, revealing unreleased AI agent features.
Microsoft has launched Copilot Cowork using Anthropic's Claude for agentic workflows, plus a Critique feature where Claude reviews GPT research for accuracy.
Meta has published research on DGM-Hyperagents, AI systems that edit their own improvement process and transfer learned strategies across unrelated domains.
OpenAI has launched a plugin marketplace for Codex with over 20 integrations from Slack, Figma, and Notion, adding enterprise governance controls.
Google has deployed an internal AI agent called Agent Smith that lets employees code from their phones, with demand so high the company had to throttle access.
ARC Prize Foundation has launched ARC-AGI-3, an interactive benchmark offering over $2M to any AI matching human reasoning, where top models scored below 1%.
Sierra has unveiled Ghostwriter, a self-service tool letting businesses build AI customer service agents in plain English, without engineering teams.