Microsoft uses agentic AI to find Windows security flaws before attackers do

The company says its MDASH system helped researchers identify 16 Windows vulnerabilities, showing how AI could change cybersecurity, software development, and technical skills training.


Microsoft says a new agentic AI security system has helped its researchers find 16 vulnerabilities across Windows networking and authentication systems, including four Critical flaws that could allow remote code execution.

The system, codenamed MDASH, was built by Microsoft’s Autonomous Code Security team and is now being used by Microsoft security engineering teams. It is also being tested by a small group of customers through a limited private preview.

The findings were included in Microsoft’s 12 May 2026 Patch Tuesday update, the company’s regular monthly release of security fixes. For developers, cybersecurity teams, and computer science educators, the announcement gives a clearer view of how AI is being used not only to write code, but to inspect it, challenge it, and find weaknesses before attackers can exploit them.

How Microsoft’s AI security system works

MDASH is what Microsoft calls a multi-model agentic scanning harness. In simpler terms, it is a system that uses many AI agents to examine code from different angles, rather than relying on one model to find problems on its own.

Microsoft says MDASH uses more than 100 specialized AI agents across several types of AI models. Some agents look for possible bugs. Others test whether those findings are real. Others compare similar code patterns, remove duplicate findings, or try to prove that a vulnerability can actually be triggered.

That matters because cybersecurity teams do not only need a list of possible problems. They need to know which issues are real, how serious they are, and whether they can be reproduced. A tool that produces too many weak findings can create more work for engineers rather than reducing risk.

Microsoft describes MDASH as a pipeline that takes a codebase, identifies areas that could be attacked, scans those areas, checks whether the findings hold up, removes duplicate reports, and then attempts to prove the bug where possible.
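The stages Microsoft describes can be pictured as a chain of small functions. The sketch below is a toy illustration only, not Microsoft's implementation: every name in it (Finding, CODEBASE, HOLDS_UP, REPRODUCIBLE and the stage functions) is hypothetical, and hard-coded lookup tables stand in for the verdicts that AI agents would reach.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Finding:
    function: str
    kind: str
    proven: bool = False


# Hypothetical toy input: each entry stands in for one function in a codebase.
CODEBASE = [
    {"name": "parse_packet", "untrusted_input": True,
     "candidates": ["use-after-free", "use-after-free", "integer-overflow"]},
    {"name": "format_log", "untrusted_input": False,
     "candidates": ["double-free"]},
]

# Stand-ins for the verdicts that validation and proving agents would reach.
HOLDS_UP = {("parse_packet", "use-after-free")}
REPRODUCIBLE = {("parse_packet", "use-after-free")}


def identify_attack_surface(codebase):
    """Keep only the functions that handle untrusted input."""
    return [fn for fn in codebase if fn["untrusted_input"]]


def scan(surface):
    """Emit one candidate finding per suspicious pattern, duplicates included."""
    return [Finding(fn["name"], kind) for fn in surface for kind in fn["candidates"]]


def validate(findings):
    """Drop candidates that do not hold up under a second look."""
    return [f for f in findings if (f.function, f.kind) in HOLDS_UP]


def deduplicate(findings):
    """Remove repeated reports while preserving their original order."""
    return list(dict.fromkeys(findings))


def prove(findings):
    """Mark a finding as proven when it can be reproduced."""
    return [replace(f, proven=(f.function, f.kind) in REPRODUCIBLE) for f in findings]


def run_pipeline(codebase):
    surface = identify_attack_surface(codebase)
    candidates = scan(surface)
    confirmed = validate(candidates)
    unique = deduplicate(confirmed)
    return prove(unique)


if __name__ == "__main__":
    for finding in run_pipeline(CODEBASE):
        print(finding)
```

In this toy run, three candidate reports shrink to a single proven finding, which is the kind of narrowing the paragraph above describes.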

The company’s central argument is that the surrounding system is as important as the AI model itself. In other words, MDASH is not just asking one model to “find bugs.” It is using a structured process that mirrors parts of how human security researchers investigate complex software.

Windows flaws added to Patch Tuesday

Microsoft says MDASH helped researchers identify 16 vulnerabilities across Windows networking and authentication components. These included issues in tcpip.sys, which is part of the Windows TCP/IP networking stack, and IKEEXT, a Windows service used for internet key exchange and IPsec connections.

Four of the flaws were rated Critical because they could allow remote code execution. This means that, in some circumstances, an attacker could run code on a target system over the network, without physical access to the machine.

Most of the vulnerabilities were reachable from a network position without credentials, according to Microsoft. That makes them more serious than issues that require an attacker to already have access to a machine or account.

Microsoft highlighted two examples. One flaw, tracked as CVE-2026-33827, involved tcpip.sys and could be triggered through specially crafted IPv4 packets. The issue was a use-after-free bug, in which software keeps using a block of memory after it has been released. Bugs like this can sometimes be exploited to crash systems, expose information, or run code.

The second, CVE-2026-33824, affected IKEEXT. Microsoft says the bug could be triggered through two UDP packets in certain IKEv2 responder configurations. The issue involved a double-free, another memory management error where the same piece of memory is released twice, potentially creating a route to code execution.
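Both bug classes come down to touching memory that has already been handed back. The sketch below models that idea with a toy allocator written for illustration; ToyHeap and the two demo functions are hypothetical and have no connection to the Windows code involved. Unlike a real C allocator, which may silently corrupt memory, this model raises an error the moment the rules are broken.

```python
class ToyHeap:
    """Hypothetical allocator model that reports misuse instead of corrupting memory."""

    def __init__(self):
        self._live = {}   # handle -> stored value, for blocks still allocated
        self._next = 0    # next handle to hand out

    def alloc(self, value):
        handle = self._next
        self._next += 1
        self._live[handle] = value
        return handle

    def free(self, handle):
        if handle not in self._live:
            raise RuntimeError(f"double-free or invalid free of handle {handle}")
        del self._live[handle]

    def read(self, handle):
        if handle not in self._live:
            raise RuntimeError(f"use-after-free: handle {handle} was already released")
        return self._live[handle]


def demo_use_after_free():
    heap = ToyHeap()
    block = heap.alloc("packet state")
    heap.free(block)
    try:
        heap.read(block)      # the block is read after it was released
    except RuntimeError as err:
        return str(err)


def demo_double_free():
    heap = ToyHeap()
    block = heap.alloc("key exchange state")
    heap.free(block)
    try:
        heap.free(block)      # the same block is released a second time
    except RuntimeError as err:
        return str(err)


if __name__ == "__main__":
    print(demo_use_after_free())
    print(demo_double_free())
```

In real systems code the two calls that conflict are often in different files or on different code paths, which is part of what makes these bugs hard to spot.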

The important point is not only that the system found bugs. Microsoft says these flaws required reasoning across multiple files, code paths, and ownership patterns. That is the sort of work that traditional scanners and single-model AI systems can struggle with.

Benchmarks show where AI security is heading

Microsoft says MDASH found all 21 planted vulnerabilities in a private test driver with zero false positives. It also reports 96 percent recall against five years of confirmed Microsoft Security Response Center cases in clfs.sys, the Common Log File System driver, and 100 percent recall in tcpip.sys.

Recall here is the share of known vulnerabilities that the system rediscovers. Microsoft ran MDASH against code containing older, already confirmed vulnerabilities to see whether it would have found them, and says the results show it would have identified a high proportion of those previous flaws.
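For readers who want the arithmetic, recall and false positives can be computed from two sets: the flaws known to exist and the flaws a tool reports. The sketch below is illustrative only. The identifiers are placeholders, and the count of 25 historical cases is an assumption chosen to make the numbers easy to follow, since Microsoft's case counts are not given here. Only the 21 planted bugs and zero false positives come from Microsoft's description.

```python
def recall(found, known):
    """Share of known vulnerabilities that appear among the tool's findings."""
    known = set(known)
    if not known:
        raise ValueError("recall is undefined when there are no known vulnerabilities")
    return len(known & set(found)) / len(known)


def false_positives(found, known):
    """Findings that do not correspond to any known vulnerability."""
    return set(found) - set(known)


# Planted-bug test as described: 21 planted, all found, nothing extra reported.
planted = {f"planted-{i}" for i in range(21)}
reported = set(planted)

# Hypothetical historical set: 25 confirmed cases, 24 of them rediscovered.
historical = {f"case-{i}" for i in range(25)}
rediscovered = {f"case-{i}" for i in range(24)}

if __name__ == "__main__":
    print(recall(reported, planted), len(false_positives(reported, planted)))
    print(recall(rediscovered, historical))
```

With these made-up inputs the historical run gives 24 out of 25, or 96 percent. High recall on its own says nothing about how many spurious reports a tool produces, which is why the zero false positives figure is reported separately.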

On CyberGym, a public benchmark made up of 1,507 real-world vulnerability reproduction tasks, Microsoft says MDASH reached an 88.45 percent success rate. The company says this was the highest published leaderboard score at the time of writing and around five percentage points above the next entry.
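As a back-of-envelope check, the percentage can be turned into an approximate task count. The figure this produces is an inference from the two published numbers, not a count Microsoft has stated, and the function name is hypothetical.

```python
def implied_successes(total_tasks, success_rate_percent):
    """Nearest whole number of tasks consistent with a reported success rate."""
    return round(total_tasks * success_rate_percent / 100)


if __name__ == "__main__":
    solved = implied_successes(1507, 88.45)
    # Recompute the rate from the whole number to confirm it rounds back to 88.45.
    print(solved, round(100 * solved / 1507, 2))
```

On that reading, 88.45 percent of 1,507 tasks corresponds to roughly 1,333 successful reproductions.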

Microsoft is not claiming that the same performance will automatically apply to every future codebase. The company says the historical tests show how the system would have performed on known past vulnerabilities, while the Patch Tuesday group shows how it is being used in current Microsoft security work.

The Autonomous Code Security team includes members who came from Team Atlanta, which won the DARPA AI Cyber Challenge by building an autonomous cyber-reasoning system that found and patched bugs in open-source projects. Microsoft says lessons from that work helped shape MDASH.

With MDASH already in internal use and in private preview with customers, the next question for software teams, universities, and cybersecurity training providers is how quickly AI-assisted vulnerability discovery becomes part of standard developer tooling, security operations, and technical education.
