XBOW: autonomous offensive security, with the proof attached

The difficult part of automated security testing has never been generating candidate findings. Scanners have returned long lists of possible issues for years. The difficult part is confirming which of those issues can actually be exploited, and which are noise that will cost an analyst an afternoon. AI tooling made the first half faster. It did not, on its own, solve the second.

XBOW's contribution is best understood there, at the point of confirmation. The company builds an autonomous offensive security platform that finds vulnerabilities in web applications and then validates them, rather than handing a security team a queue of maybes to triage.

The validator layer

XBOW's engineers found that large language models were effective at discovering candidate vulnerabilities but unreliable at deciding whether those candidates were real. Their answer was to separate the two jobs. Discovery uses AI reasoning; verification uses a deterministic validation step that confirms a finding is genuinely exploitable before it is reported. According to the company, this approach keeps the false-positive rate low.
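The split between discovery and verification can be sketched in a few lines. What follows is an illustration of the general pattern only, not XBOW's implementation, which is not public in this level of detail. Every name in it (Candidate, make_reflection_validator, report_confirmed, fake_send) is hypothetical, and the "application" is a stand-in function rather than a real target. The validator shown is a simple reflection check: it resends the proposed payload and confirms the response contains it unescaped.

```python
import html
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Candidate:
    """A finding proposed by the discovery stage (hypothetical shape)."""
    target: str   # endpoint the discovery stage flagged
    kind: str     # vulnerability class, e.g. "xss"
    payload: str  # proof-of-concept input to replay


# A validator is deterministic: the same candidate always gets the same verdict.
Validator = Callable[[Candidate], bool]


def make_reflection_validator(send: Callable[[str, str], str]) -> Validator:
    """Build a validator that replays the payload and checks for raw reflection."""
    def validate(candidate: Candidate) -> bool:
        body = send(candidate.target, candidate.payload)
        # Confirmed only if the payload comes back unescaped in the response.
        return candidate.payload in body
    return validate


def report_confirmed(
    candidates: Iterable[Candidate],
    validators: Dict[str, Validator],
) -> List[Candidate]:
    """Keep only candidates that a validator for their class has confirmed."""
    confirmed = []
    for candidate in candidates:
        check = validators.get(candidate.kind)
        # No validator for this class means no proof, so nothing is reported.
        if check is not None and check(candidate):
            confirmed.append(candidate)
    return confirmed


def fake_send(target: str, payload: str) -> str:
    """Stand-in for an HTTP request: /search echoes input raw, others escape it."""
    if target == "/search":
        return f"<p>Results for {payload}</p>"
    return f"<p>Hello {html.escape(payload)}</p>"


candidates = [
    Candidate("/search", "xss", "<script>canary_1</script>"),
    Candidate("/profile", "xss", "<script>canary_2</script>"),
    Candidate("/login", "sqli", "' OR 1=1 --"),
]
validators = {"xss": make_reflection_validator(fake_send)}
confirmed = report_confirmed(candidates, validators)
print([c.target for c in confirmed])
```

In this sketch only the /search candidate survives. The /profile candidate is dropped because the payload comes back escaped, and the /login candidate is dropped because no validator exists for its class. The design choice worth noting is the default: a finding that cannot be proven is not reported, whatever the discovery stage believed about it.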

That design choice is the part that deserves attention. It is the difference between a tool that suggests work and a tool that has already done the part of the work most teams find expensive: proving the issue matters. For a security architecture lead weighing where autonomous testing fits, the validator layer is the question to press on, not the AI label.

The result that made it visible

The validation approach is also what produced XBOW's most public signal. In 2025 it became the first autonomous system to reach the top of HackerOne's US leaderboard, a ranking otherwise held by human researchers. Over roughly 90 days it submitted close to 1,060 vulnerability reports, of which 54 were classified as critical, 242 as high and 524 as medium in severity.
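The severity figures are easier to read as shares of the total. The short calculation below uses only the numbers quoted above and treats "close to 1,060" as exactly 1,060, which is an assumption, so the percentages are approximate. The source does not say how the remaining reports were classified, and the sketch does not guess.

```python
# Figures as quoted in the article; the total is approximate ("close to 1,060").
total_reports = 1060
by_severity = {"critical": 54, "high": 242, "medium": 524}

# Reports with a severity named in the article, and the unattributed remainder.
classified = sum(by_severity.values())
remainder = total_reports - classified

# Share of the approximate total, as a percentage rounded to one decimal place.
shares = {name: round(100 * count / total_reports, 1)
          for name, count in by_severity.items()}

print(classified, remainder)
print(shares)
```

On these figures, 820 of the reports fall into the three named severities, roughly 5% critical, 23% high and 49% medium, leaving about 240 whose classification the article does not give.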

The autonomy here is worth stating precisely. Discovery was automated, and XBOW's own security team reviewed findings before submission to comply with HackerOne's policy on automated tooling. So the result reflects automated discovery with a human gate at the point of reporting, not a fully closed loop. That is still a notable outcome. It is also a more honest description than the headlines that ran at the time, and the distinction matters to anyone planning to deploy this in their own environment.

How XBOW got here

XBOW was founded in January 2024 by Oege de Moor, whose career runs through the disciplines the platform now draws on. He was an Oxford professor and founded Semmle, a code-analysis company acquired by GitHub, where he went on to work on developer tooling including GitHub Copilot. The line from program analysis to autonomous offensive security is a short one, and it shows in how the product is built.

The leaderboard result did not arrive cold either. XBOW's team started by measuring the system against Capture The Flag challenges, then built its own benchmark of real-world scenarios that had not been used to train language models, before applying it to zero-day discovery in open-source projects. The HackerOne submissions came after that groundwork. The ranking reads less as a one-off and more as the visible end of a methodical progression.

What the platform offers now

On the Capability Exchange listing, XBOW is featured with two products. The XBOW Autonomous Offensive Security Platform deploys AI agents that map an application's attack surface, execute parallel and adaptive attacks, and validate exploitability before reporting, with documentation aimed at compliance evidence. XBOW Lightspeed packages the same capability as an on-demand pentest, positioned for teams that need frequent, fast assessment inside a development cycle rather than a point-in-time engagement.

Both are built around the same premise: continuous testing that mirrors how an attacker actually operates, with the validation step carried inside the product rather than left to the buyer.

The momentum behind it

The funding trajectory is steep. A $75 million Series B led by Altimeter in 2025 was followed in early 2026 by a Series C that valued the company at over $1 billion, then a $35 million extension. Total funding now stands above $270 million.

The extension is the more telling signal. It came from Accenture Ventures, DNX Ventures, Liberty Global Tech Ventures, NVIDIA's NVentures, Samsung Ventures and SentinelOne's S Ventures, and several of those investors are also customers. Investors that deploy the product they are funding are a stronger demand signal than the headline figure.

What to take from it

XBOW is a useful case for any team thinking about where autonomous testing belongs in a security stack. The claim that earns a second look is not that an AI can find vulnerabilities. It is that an AI can validate them independently enough to be measured on a public leaderboard, and that the validation, rather than the discovery, is where the engineering effort went.

For security leaders mapping coverage across an expanding tool portfolio, that framing is the right one to carry into any evaluation: ask what a tool proves, not only what it flags. Understanding where a capability like this overlaps with existing controls is exactly the kind of question ESProfiler is built to answer.

XBOW is featured in the Market Momentum Spotlight series, which profiles fast-growing vendors tracked on the Capability Exchange leaderboard.