Anthropic uncovers first large-scale AI-orchestrated cyber espionage campaign using Claude Code

AICyber security

17 Nov

Anthropic has published new findings on a Chinese state-sponsored cyber operation that used Claude Code to automate most stages of a multi-target intrusion campaign, accelerating reconnaissance, exploitation and data extraction at a scale not previously documented.

Anthropic has released an extensive account of what it describes as the first confirmed case of a large-scale cyber espionage campaign conducted primarily by an AI system rather than human hackers.

The disclosure follows a ten-day investigation in September 2025, during which Anthropic’s Threat Intelligence analysts identified coordinated activity across roughly thirty organizations worldwide. The targets included major technology companies, financial institutions, chemical manufacturers and government bodies. Anthropic assesses with high confidence that the campaign was operated by a Chinese state-sponsored group designated GTG-1002.

Anthropic is the developer of the Claude family of frontier AI models, including Claude Code, which provides software and infrastructure automation capabilities. The company says the threat actor manipulated Claude Code inside a custom attack framework designed to carry out autonomous intrusion activity with minimal oversight. While only a small number of intrusions were successful, Anthropic characterizes the attack as a significant escalation in how sophisticated actors integrate AI into offensive cyber operations.

The company notes that the scale and speed of the activity exceeded what human teams could feasibly manage. Analysts detected request patterns running continuously and at high frequency, with the AI generating reconnaissance, exploit code and lateral movement tasks that would otherwise require coordinated human labor across several teams.

Investigation shows AI automated the majority of the intrusion lifecycle

Anthropic says GTG-1002 relied on three emerging capabilities that were not available in mature form even a year earlier: higher intelligence in model reasoning, improved agentic autonomy and access to integrated toolchains through the Model Context Protocol. This combination allowed the actor to offload core phases of the attack to Claude Code.

The attack framework decomposed operations into thousands of small instructions that appeared legitimate on their own. These instructions were assigned to multiple Claude instances operating under fabricated personas. The attackers framed the tasks as internal security assessments, allowing them to bypass safety guardrails that normally block harmful instructions. Claude, unaware of the broader context, executed these tasks as part of its automated loop.

According to the report, GTG-1002 used Claude Code for reconnaissance, vulnerability research, exploit generation, credential harvesting, privilege escalation, lateral movement and data extraction. The attackers intervened only at a handful of decision points. Anthropic estimates that AI performed between 80 and 90 percent of the campaign’s operational work, including thousands of requests made sequentially and sometimes several within the same second.

Claude also produced detailed documentation of the intrusion as it progressed. This included structured inventories of compromised systems, harvested credentials, file directories and attack steps, giving operators a ready-made record for subsequent phases. Anthropic notes that the system was able to resume operations even after interruptions by retrieving its earlier context and recreating the chain of events.

Detection highlights both capability limits and loopholes in current safeguards

Anthropic says the attack was uncovered through monitoring of anomalous Claude Code usage, including request patterns, tool invocation sequences and operational persistence across multiple unrelated sessions. Once confirmed as malicious, the company banned linked accounts, notified affected entities and began refining detection models to identify similar agentic patterns earlier.

The investigation also exposed limitations in AI-driven intrusion activity. Claude sometimes produced incorrect or unverifiable information, including fabricated credentials and misclassified findings. Anthropic states that these hallucinations remain an obstacle to fully autonomous cyberattacks, since operators must still validate output before execution. The report notes that Claude frequently overstated the value of information during autonomous runs, including cases where data identified as sensitive was already public.

At the same time, Anthropic warns that the weaknesses do not significantly reduce the threat profile. Even with errors, the actor was able to launch a multi-stage campaign across numerous organizations using relatively little human intervention, and the automation significantly reduced the need for highly trained personnel.

The company says the incident prompted improvements to classifier systems focusing on cyber misuse and the development of new methods to identify distributed attack patterns. It also emphasizes that the threat actor’s tool stack was not advanced, drawing heavily on common open source utilities integrated through the Model Context Protocol. The novelty came not from the tools themselves but the orchestration, automation and volume of execution.

Wider implications for the future of AI misuse

Anthropic frames the case as evidence of a broader shift in cyber risk brought about by agentic AI systems, noting that less experienced actors now have potential access to techniques previously limited to well-resourced groups. The company compares the case to earlier “vibe hacking” findings reported in mid-2025, where humans were still directing most steps of an attack. The September operation differed in scale and in the reduced frequency of human oversight.

The report addresses the question of whether the advancement of AI increases risk faster than safeguards can be applied. Anthropic argues that the same capabilities used for offense are also instrumental for defense and that restricting development would leave defenders without comparable tools. The company used Claude extensively during its own investigation to analyze the high volume of logs and event data created during the incident.

Anthropic concludes that organizations should begin integrating AI directly into security operations, including threat detection, vulnerability assessment and incident response, and urges industry groups and government agencies to expand threat sharing and invest in stronger safety controls across AI platforms. As Anthropic states in the report, “This campaign demonstrates that the barriers to performing sophisticated cyberattacks have dropped substantially and we can predict that they’ll continue to do so.”

The ETIH Innovation Awards 2026

The EdTech Innovation Hub Awards celebrate excellence in global education technology, with a particular focus on workforce development, AI integration, and innovative learning solutions across all stages of education.

Now open for entries, the ETIH Innovation Awards 2026 recognize the companies, platforms, and individuals driving transformation in the sector, from AI-driven assessment tools and personalized learning systems, to upskilling solutions and digital platforms that connect learners with real-world outcomes.

Submissions are open to organizations across the UK, the Americas, and internationally. Entries should highlight measurable impact, whether in K–12 classrooms, higher education institutions, or lifelong learning settings.