OpenAI Hardens ChatGPT Atlas Against Prompt Injection With Automated Red Teaming
As AI agents take on more real-world tasks, OpenAI is racing to secure ChatGPT Atlas against prompt injection attacks that can quietly steer agents off course without a user ever realizing it.
Google Explains How Chrome Secures Agentic AI Features With Human Oversight and Guardrails
As browsers begin taking actions on users’ behalf, Google is outlining how Chrome’s agentic AI features are being designed to prioritize security, transparency, and human control.
Amazon Ring Launches Facial Recognition for Doorbells, Raising Security and Privacy Questions
Amazon is bringing facial recognition directly to the front door, as Ring rolls out its Familiar Faces feature to identify who’s approaching a home — and reigniting debates around privacy, security, and biometric data in everyday spaces.
How BrowseSafe Detects Prompt Injection Threats in AI Browser Agents
AI browser agents now navigate the same cluttered, unpredictable webpages users do—making prompt-injection detection essential for protecting real online actions.
OpenAI Drops Mixpanel After Security Incident Exposes Limited User Metadata
OpenAI is notifying API customers about a security incident inside Mixpanel’s systems that exposed limited account metadata—but did not compromise any chat content, API keys, credentials, or payment information.
OpenAI Unveils Aardvark, a GPT-5 Agent for Proactive Code Security
OpenAI has launched Aardvark, a GPT-5-powered autonomous security agent designed to detect and remediate software vulnerabilities across modern codebases.
Meta Expands Parental Controls for Teen AI Use Across Its Platforms
Meta is expanding its commitment to AI safety for teens, introducing new parental supervision tools that allow families to monitor, manage, and guide how young users engage with AI characters across the company’s platforms.
Google Adds AI-Powered Ransomware Protection to Drive for Desktop
Ransomware accounted for 21% of intrusions last year, with the average incident costing more than $5 million — a risk Google now aims to counter with AI-powered defenses in Drive for desktop.
DeepMind Expands Frontier Safety Framework With New AI Risk Domains
DeepMind has released the third iteration of its Frontier Safety Framework, adding new domains such as harmful manipulation and expanding misalignment protocols to strengthen governance of advanced AI models.
OpenAI Balances Teen Safety, Privacy, and Age Prediction in AI
OpenAI is prioritizing teen safety by introducing age prediction systems and parental controls, while reaffirming commitments to privacy and user freedom.
OpenAI Adds Parental Controls and Expert Guidance to ChatGPT
OpenAI is introducing parental controls for ChatGPT, alongside new safeguards for sensitive conversations and expanded expert guidance on mental health.
OpenAI and Anthropic Share Models for Joint AI Safety Testing
OpenAI and Anthropic briefly shared access to their AI models for joint safety testing — a rare collaboration to expose blind spots and set new safety standards.
Claude Models Can Now End Conversations in Extreme Cases, Says Anthropic
Anthropic has introduced a new safeguard in its Claude Opus 4 and 4.1 models, allowing them to terminate conversations under rare and extreme conditions, marking a shift in how AI handles harmful dialogue.