AI Security | 2026-03-08 | 12 min

Partnering with Mozilla to improve Firefox's security

Claude Opus 4.6 discovered 22 vulnerabilities in Firefox over two weeks, with 14 classified as high-severity — almost a fifth of all high-severity Firefox vulnerabilities remediated in 2025.

DNA Research Team

Research Team, DNA Cyber Security

AI models can now independently identify high-severity vulnerabilities in complex software. As Anthropic recently documented, Claude found more than 500 zero-day vulnerabilities in well-tested open-source software.

In its post, Anthropic shares details of a collaboration with researchers at Mozilla in which Claude Opus 4.6 discovered 22 vulnerabilities over the course of two weeks. Mozilla classified 14 of these as high-severity, almost a fifth of all high-severity Firefox vulnerabilities remediated in 2025. As part of this collaboration, Mozilla shipped fixes to hundreds of millions of users in Firefox 148.0.

From model evaluations to a security partnership

In late 2025, Anthropic noticed that Opus 4.5 was close to solving all tasks in CyberGym, a benchmark that tests whether LLMs can reproduce known security vulnerabilities. The team needed harder, more realistic evaluations.

Firefox was chosen because it's both a complex codebase and one of the most well-tested and secure open-source projects in the world. Browser vulnerabilities are particularly dangerous because users routinely encounter untrusted content.

The initial approach involved using Claude to locate previously identified CVEs in older Firefox versions. Opus 4.6 was able to reproduce a substantial percentage of historical CVEs, bugs that originally required significant human effort to discover. However, because the model may have seen these historical CVEs during training, concerns about training data contamination led the team to the real test: finding novel, unreported vulnerabilities in the current Firefox release.

A Use After Free in 20 minutes

The team focused initial efforts on Firefox's JavaScript engine — a critical component that processes untrusted external code during web browsing.

After just twenty minutes of exploration, Claude Opus 4.6 reported that it had identified a Use After Free — a type of memory vulnerability that could allow attackers to overwrite data with arbitrary malicious content — in the JavaScript engine. The researchers validated the finding independently against the latest Firefox release, then filed a Bugzilla report with a vulnerability description and Claude-generated patches.

Figure: Firefox security vulnerabilities reported from all sources by month, with Claude Opus 4.6 finding 22 vulnerabilities in February 2026.

In the time it took to validate and submit this first vulnerability, Claude had already discovered fifty more unique crashing inputs. By the end of this effort, the team had scanned nearly 6,000 C++ files and submitted a total of 112 unique reports. Most issues received fixes in Firefox 148, with remaining fixes scheduled for upcoming releases.

From identifying vulnerabilities to writing primitive exploits

To measure the upper limits of Claude's cybersecurity abilities, Anthropic developed a new evaluation to determine whether Claude was able to exploit any of the bugs it discovered — specifically, whether it could create functional exploits for malicious code execution.

Claude was given access to the submitted vulnerabilities and attempted to create targeted exploits. Success required demonstrating a real attack: reading and writing local files on the target system.

The team ran this test several hundred times with different starting points, spending approximately $4,000 in API credits. Even so, Opus 4.6 turned a vulnerability into a working exploit in only two cases. Two key conclusions follow:

Claude is much better at finding these bugs than it is at exploiting them
Identifying vulnerabilities is an order of magnitude cheaper than creating exploits for them

However, the fact that Claude can automatically develop crude browser exploits — even rarely — is noteworthy.

The exploits Claude wrote worked only in a test environment that intentionally omitted some of the security features found in modern browsers, most importantly the sandbox, whose purpose is to reduce the impact of these types of vulnerabilities.

What's next for AI-enabled cybersecurity

The Firefox team highlighted three critical components for trustworthy AI-generated vulnerability reports:

Minimal test cases — Reproducible proof that the bug exists
Detailed proofs-of-concept — Clear explanation of how the vulnerability can be triggered
Candidate patches — Suggested fixes that have been validated against the test suite
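As a rough illustration of the three components above, a triage pipeline could represent each report as a small record and check that nothing is missing before a human reviews it. The sketch below is hypothetical: the class name, field names, and the `missing_components` helper are assumptions made for this example and do not come from Mozilla's or Anthropic's tooling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VulnerabilityReport:
    """Hypothetical report record; all field names are illustrative."""
    title: str
    minimal_test_case: Optional[str] = None   # input that reproduces the bug
    proof_of_concept: Optional[str] = None    # explanation of how it is triggered
    candidate_patch: Optional[str] = None     # suggested fix, e.g. a diff
    patch_passes_test_suite: bool = False     # has the patch been validated?


def missing_components(report: VulnerabilityReport) -> List[str]:
    """Return the names of the components the report still lacks."""
    missing = []
    if not report.minimal_test_case:
        missing.append("minimal test case")
    if not report.proof_of_concept:
        missing.append("proof-of-concept")
    if not report.candidate_patch:
        missing.append("candidate patch")
    elif not report.patch_passes_test_suite:
        # A patch exists but has not been run against the test suite.
        missing.append("patch validation against the test suite")
    return missing


if __name__ == "__main__":
    draft = VulnerabilityReport(title="Example crash", minimal_test_case="input.js")
    print(missing_components(draft))
```

A report for which `missing_components` returns an empty list carries all three items the Firefox team asked for; anything else can be sent back for more work before it reaches a maintainer.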

Anthropic also introduced the concept of "task verifiers" — a trusted method of confirming whether an AI agent's output actually achieves its goal. For vulnerability patching, this means verifying two properties: the vulnerability is actually removed, and program functionality is preserved.
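The two properties can be made concrete with a toy verifier. This is a minimal sketch under stated assumptions, not Anthropic's implementation: it models the program as a plain Python function, treats any raised exception as the "crash", and uses hypothetical names throughout (`verify_patch`, `VerifierResult`, and the three toy functions).

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass
class VerifierResult:
    vulnerability_removed: bool
    functionality_preserved: bool

    @property
    def accepted(self) -> bool:
        # A patch is accepted only if both properties hold.
        return self.vulnerability_removed and self.functionality_preserved


def verify_patch(
    program: Callable[[str], str],
    crashing_inputs: Iterable[str],
    regression_cases: Iterable[Tuple[str, str]],
) -> VerifierResult:
    """Check a patched program against crash inputs and regression cases."""
    # Property 1: inputs that used to crash must no longer raise.
    removed = True
    for data in crashing_inputs:
        try:
            program(data)
        except Exception:
            removed = False
            break

    # Property 2: known-good inputs must still give the expected output.
    preserved = True
    for data, expected in regression_cases:
        try:
            if program(data) != expected:
                preserved = False
                break
        except Exception:
            preserved = False
            break

    return VerifierResult(removed, preserved)


# Toy stand-ins for a real target (assumptions for illustration only).
def buggy(s: str) -> str:
    return s[0].upper()        # raises IndexError on the empty string


def patched(s: str) -> str:
    return s[:1].upper()       # handles the empty string, same result otherwise


def overzealous(s: str) -> str:
    return ""                  # never crashes, but breaks normal behaviour


if __name__ == "__main__":
    for candidate in (buggy, patched, overzealous):
        result = verify_patch(candidate, [""], [("abc", "A")])
        print(candidate.__name__, result, result.accepted)
```

The `overzealous` case shows why both checks matter: a patch that makes the crash disappear by removing the behaviour altogether passes the first property and fails the second.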

The urgency of the moment

Frontier language models are now world-class vulnerability researchers. On top of the 22 CVEs identified in Firefox, Claude Opus 4.6 has been used to discover vulnerabilities in other important software projects like the Linux kernel.

Opus 4.6 is currently far better at identifying and fixing vulnerabilities than at exploiting them. This gives defenders the advantage. But looking at the rate of progress, it is unlikely that the gap between frontier models' vulnerability discovery and exploitation abilities will last very long.

Anthropic urges developers to take advantage of this window to redouble their efforts to make their software more secure.

Source: Anthropic Blog
