Key Takeaways
Assume-breach in 2026 is more important than ever. It's the math: Anthropic's Claude Mythos Preview produced a 90x jump in successful exploit generation, while internal breach detection barely improved according to the latest M-Trends report. Assuming attackers are already inside your network means you need to emphasize detection in two ways: AI SOC Analyst clears the alert queue, and AI Threat Hunter runs hypothesis-driven hunts on the alerts that never fire.
- Expertise is now scalable. Anthropic's Claude Mythos Preview produced 181 working Firefox JS shell exploits on the same benchmark where Opus 4.6 produced just 2, a roughly 90x jump in a single model generation. Initial access is now a matter of how much compute budget attackers have.
- Detection has stalled. Mandiant's 2026 M-Trends report shows internal detection ticked up from 43% in 2024 to 52% in 2025, while the global median dwell time climbed from 11 days to 14, and dwell on cyber-espionage cases ran 122 days.
- Reactive and proactive detection. AI SOC Analyst clears the alert queue you already have, while AI Threat Hunter and AI TI Analyst run hypothesis-driven, federated hunts to find the threats that never fired an alert.
Introduction
Anthropic's Frontier Red Team published a chart earlier this year that should be every CISO's lock screen. On the same Firefox JavaScript-engine benchmark, Opus 4.6 produced just 2 working exploits across several hundred attempts. Mythos Preview produced 181, plus 29 more reaching register control: a roughly 90x jump in success. The expertise bottleneck on initial access is gone as attackers will be increasingly limited by their compute budget, not by how many skilled operators they have. The post-compromise math still favors defenders if they can detect malicious actors before they achieve their goals.
Attackers Got Faster. Defenders Didn't.

Why Exploiting Zero Days Will Become More Common
For two decades, the attacker economics in your environment ran on social engineering. Someone had to click on a phishing email or hand over a credential, and everything downstream waited on a human making a mistake. The 2020 Verizon DBIR put social engineering at 22% of breaches, almost all of it phishing-led and gated by a human click.
Phishing will still be a threat going forward, but in the Mythos era, attackers will be able to more easily find and exploit zero-day vulnerabilities. That means they have more means of gaining initial access.
Anthropic's Frontier Red Team put it bluntly in their February 2026 zero-days post: "language models are already capable of identifying novel vulnerabilities, and may soon exceed the speed and scale of even expert human researchers." Two months later, the Mythos Preview benchmark backed the warning.
On the same Firefox 147 JavaScript-engine target, Opus 4.6 produced just 2 working exploits across several hundred attempts; Mythos Preview produced 181, plus 29 more reaching register control. Anthropic's engineers describe asking Mythos Preview to find remote code execution vulnerabilities overnight and waking up to working exploits the following morning, work that expert penetration testers said would have taken weeks.
If you've been telling your CISO this wave is still hypothetical, Anthropic's November 2025 disclosure is the one that ends that conversation. Anthropic disrupted a Chinese state-sponsored campaign that used Claude Code for reconnaissance, exploitation, lateral movement, and exfiltration. The campaign profile was unmistakably machine-scale:
- 80–90% of tactical operations executed by Claude
- ~30 target entities across the campaign
- 4–6 critical decision points where human operators stepped in
Starting in late December 2025, attackers running a Claude instance whose guardrails they had bypassed compromised ten Mexican government bodies and a financial institution and exfiltrated roughly 150 GB of data covering an estimated 195 million identities. Initial access has flipped from a probability problem into a tooling problem, and the tooling now operates at machine speed against named targets.
Why Has Internal Detection Stalled?
Half of the time, someone else has to tell you that you’ve been breached. Per Mandiant's M-Trends 2026 report:
- Internal detection ticked up from 43% to 52% (2024 to 2025), the single best year-over-year improvement Mandiant has logged in years, but …
- External parties still account for nearly half of all 2025 breach notifications (48%). This means that half of the time, victims are notified by someone else.
- Global median dwell climbed from 11 to 14 days
Let’s see how the calculus has changed: Attacker capability moves visibly with every frontier model release, meaning that they have more opportunities to breach your network. But internal detection rates move by a few points per year, sometimes in the wrong direction.
The threats that never trip an alert are part of the job that has been quietly underfunded for a decade. The concept of “assume breach” has been around for more than a decade, but it’s more important than ever.
To understand why, let’s look at the intruder’s dilemma.
One Initial Access. Eleven Chances to Detect.

Why Is Initial Access Now the Easy Part?
The familiar "defenders have to be right 100% of the time, attackers have to be right once" framing is true, but that doesn’t mean everything is easy for attackers once they’re in. The intruder’s dilemma is that in order for them to achieve their goals, they have a lot of chances to set off alarms.
Throughout an entire post-compromise sequence, defenders need only detect one of the attacker's actions to respond and prevent real damage. The problem is, as we covered with the M-Trends report above, land-and-pivot works for attackers because perimeters are generally hard while interiors are soft. Many organizations need to drastically improve their internal detection capabilities to prevent damage once attackers are able to enter the network more easily. That asymmetry only widens once initial access gets cheap with more powerful AI models.
To improve your detection and response, the MITRE ATT&CK helps and turns into a checklist you can actually defend against. Past initial access, the framework lays out eleven tactic categories where defenders have a detection chance:
- Execution
- Persistence
- Privilege Escalation
- Defense Evasion
- Credential Access
- Discovery
- Lateral Movement
- Collection
- Command and Control
- Exfiltration
- Impact
That's eleven chances for the attacker to fall over a tripwire on the way to whatever they came for.
What Do You Do When a Threat Doesn’t Trigger an Alert?
The threats most likely to hurt you in 2026 aren't necessarily the ones screaming in your queue. Lateral movement off legitimate credentials and identity-provider drift survives by not looking like activity in the first place. If they can, attackers will use common tools and protocols to hide their activity. They never trigger an alert because they never break a rule. That’s what threat hunting is for: finding the threats that slipped past your detections.
An operational assume-breach posture has to cover both surfaces: the queue you have, and the queue you don't.
What Does Operational Assume-Breach Actually Look Like?
Hunt the Queue You Have
Tier 1 triage and investigation is where reactive surface-clearing earns its keep, and where Dropzone's AI SOC Analyst spends its day: every alert in your queue receives a consistent investigation that takes about 7 minutes on average.
ECS, a top-five MSSP in North America (ranked #4 on MSSP Alert's Top 250 for 2025), handles roughly 30,000 alerts per month in its SOC thanks to Dropzone AI. No human team clears that queue without something going uninvestigated, which is exactly where land-and-pivot activity usually hides. Clearing it at machine speed and with consistent investigation depth frees up the time budget for the proactive work that comes next. Read the ECS case study.
That extra SOC capacity unlocks a closer look: alerts your team would normally triage out under pressure get the same full investigation as Urgent ones. When the Axios npm package was compromised in March 2026, the resulting activity appeared in customer environments only as a medium-priority Microsoft Defender alert.
AI SOC Analyst pulled that medium-priority alert through a full investigation, traced endpoint telemetry to a stealthy PowerShell execution beaconing to infrastructure already flagged as malicious by multiple vendors, and escalated it to a Malicious Urgent verdict before human analysts could triage.
The medium-severity label was hiding a real initial-access foothold from Sapphire Sleet, a North Korean state actor that ordinary triage probably wouldn’t have chased, and the same pattern lives in every low- or medium-priority ticket your team triaged out under headcount pressure.
Hunt the Queue You Don't Have
The threats that never tripped a rule are the ones you actually need to find. AI Threat Hunter runs hypothesis-driven, federated hunts across SIEM, EDR, identity, and network data, compressing up to 40 hours of manual hunting into roughly an hour.
Behind the hunts sits the Hunt Pack Catalog, organized into five categories:
- Emerging threats
- Threat actors
- Vulnerabilities
- ATT&CK techniques
- Operational anomalies

To respond quickly to emerging threats, the AI TI Analyst feeds the catalog by monitoring 2,500+ OSINT sources, from CVE disclosures and vendor advisories to security blogs and dark web forums, extracting TTPs and IOCs, and packaging the results into hunt packs that the Threat Hunter can actually execute.
Let’s look at an example: A February 2026 Five Eyes Cisco SD-WAN advisory (CISA Emergency Directive ED-26-03) shows the cadence with Dropzone AI:
- At 00:00, a third-party advisory drops
- By 04:00, a behavioral hunt pack is built from the extracted TTPs
- By 05:30, a federated hunt report lands in your team's inbox
Even hunts that come back clean return value:
- Policy violations surfaced
- Visibility gaps identified
- Misconfigurations flagged
- Detection-engineering opportunities you can action the next morning
Conclusion
Frontier model releases land every few months now, and each one is going to get a bit better at finding and exploiting vulnerabilities. You still need to focus on speeding up your patching cycle, but the reality is that there is too much operational friction to keep up. By 2027, you will absolutely need your detection and response function to pick up the slack. Just assume that attackers have already breached your network. You have chances to catch the intruder, but only if something is hunting on every one of them. See for yourself how Dropzone does this in our self-guided demo.




