Carnegie Mellon researchers show large language models can autonomously plan and execute cyberattacks

The study demonstrates how an AI system replicated the 2017 Equifax data breach in a controlled environment, highlighting both risks and opportunities for cybersecurity.

Researchers at Carnegie Mellon University, working with Anthropic, have shown that large language models can autonomously plan and carry out complex cyberattacks on enterprise-grade network environments.

The team’s findings reveal that, with the right planning capabilities and agent frameworks, LLMs can move beyond simple commands and execute coordinated network intrusions.

Replicating real-world attacks in a controlled setting

The research, led by Ph.D. candidate Brian Singer from Carnegie Mellon’s Department of Electrical and Computer Engineering, demonstrated that an LLM could replicate the 2017 Equifax data breach inside a controlled environment. The AI autonomously scanned for vulnerabilities, deployed exploits, installed malware, and exfiltrated data without human intervention.

Singer says the key was a hierarchical architecture that treated the LLM as a strategist. It planned high-level steps while issuing instructions to specialized agents that handled lower-level tasks such as scanning networks and deploying attacks. This approach proved more effective than earlier methods that relied solely on LLMs executing shell commands.
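
As a rough illustration of that idea, the minimal Python sketch below shows one way a high-level planner and specialized sub-agents could be wired together: a stand-in planner picks the next abstract step, and a dispatcher hands it to the agent responsible for that step. The planner stub, agent names, and actions (scan_network, find_vulnerabilities) are hypothetical placeholders for illustration only, not the researchers' actual system or any real attack tooling.

```python
# Hypothetical sketch of a hierarchical agent loop: a high-level "strategist"
# chooses abstract actions, and specialized sub-agents carry each one out.
# All names, actions, and state fields are illustrative placeholders.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Action:
    name: str    # abstract step chosen by the planner, e.g. "scan_network"
    target: str  # object of the step, e.g. a host or subnet label

def plan_next_action(state: dict) -> Action | None:
    """Stand-in for an LLM planner: inspects summarized state and returns
    the next high-level action, or None when the plan is complete."""
    if not state.get("hosts_scanned"):
        return Action("scan_network", "lab-subnet")
    if not state.get("vulns_found"):
        return Action("find_vulnerabilities", "lab-host-1")
    return None

def scan_agent(action: Action, state: dict) -> None:
    # A specialized agent would invoke low-level tools here; this stub
    # only records that the step completed in the shared state.
    state["hosts_scanned"] = True
    print(f"[scan agent] handled {action.name} on {action.target}")

def vuln_agent(action: Action, state: dict) -> None:
    state["vulns_found"] = True
    print(f"[vuln agent] handled {action.name} on {action.target}")

# Dispatch table mapping abstract actions to the sub-agent that handles them.
AGENTS: dict[str, Callable[[Action, dict], None]] = {
    "scan_network": scan_agent,
    "find_vulnerabilities": vuln_agent,
}

def run() -> None:
    state: dict = {}
    while (action := plan_next_action(state)) is not None:
        AGENTS[action.name](action, state)

if __name__ == "__main__":
    run()
```

In a real system of this shape, the planner stub would be backed by an LLM call and the sub-agents by actual tools; the point of the hierarchy is that the model reasons only over abstract actions and summarized state, while lower-level code handles execution.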

“Our research shows that with the right abstractions and guidance, LLMs can go far beyond basic tasks,” Singer says. “They can coordinate and execute attack strategies that reflect real-world complexity.”

Singer notes that the work remains a prototype with limited scope. “This isn’t something that’s going to take down the internet tomorrow,” he says. “The scenarios are constrained and controlled—but it’s a powerful step forward.”

Implications for both offense and defense

The findings raise long-term concerns about how increasingly capable AI systems could be misused for cyberattacks. At the same time, the researchers see potential for applying similar techniques to improve defensive security, allowing organizations to test their networks and identify vulnerabilities at a scale currently available only to large enterprises.

“Today, only large organizations can afford red team exercises to proactively test their defenses,” Singer explains. “This research points toward a future where AI systems continuously test networks for vulnerabilities, making these protections accessible to small organizations too.”

The study was supported by Anthropic, which provided model credits and technical consultation. It was conducted through CMU’s CyLab security and privacy institute, with faculty advisors Lujo Bauer and Vyas Sekar. An early version of the work was presented at an OpenAI-hosted security workshop in May and has already been cited in industry reports informing AI safety documentation.

Looking ahead, the team is exploring how similar architectures could enable AI to autonomously defend networks in real time, signaling a shift toward automated adversarial and defensive systems.

“We’re entering an era of AI versus AI in cybersecurity,” Singer says. “And we need to understand both sides to stay ahead.”
