
The AI Cyber Challenge Semifinal Competition Summary

The Current State of the Competition

In just under six months, AIxCC is heading back to DEF CON to highlight its Final Competition, during which seven teams will test their Cyber Reasoning Systems (CRS) to see which team can discover and patch the most complex code vulnerabilities and win a $4 million grand prize.

It’s hard to believe we’ve reached the halfway point to Finals, so let’s take a moment to see how we got here.

AIxCC kicked off in 2023 as a two-year competition led by the Defense Advanced Research Projects Agency (DARPA) in collaboration with the Advanced Research Projects Agency for Health (ARPA-H). The competition brings together leading experts and small businesses in artificial intelligence (AI) and cybersecurity to defend the software underpinning the critical infrastructure essential to Americans' everyday lives: power plants, utilities, hospitals, highways, bridges, and more.

The competing teams are developing AI-driven, fully autonomous CRSs that discover and patch code vulnerabilities using industry-leading Large Language Models (LLMs).

During the Semifinal Competition, 42 teams competed for $14 million in prizes by testing their CRS against challenge projects in the AIxCC gauntlet to determine which systems could find and patch the most vulnerabilities. The top seven scoring teams punched their ticket to the Final Competition.

 

Semifinal Competition Gameplay

For each challenge project, each team's CRS was given one or more test harnesses for interacting with the code base. These harnesses were written by challenge authors specifically for competitors to use during the competition.

To score points for discovering a vulnerability, a CRS needed to specify an input, a target harness, and the sanitizer (such as AddressSanitizer) that triggers when the harness processes that input. The CRS also had to identify which commit in the git history introduced the vulnerability.
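As a purely illustrative sketch, here is what a discovery submission conceptually bundles together; the class and field names below are assumptions made for this write-up, not the official AIxCC submission format.

    from dataclasses import dataclass

    # Hypothetical illustration only: these names are NOT the official AIxCC
    # submission schema, just the conceptual contents of a discovery.
    @dataclass
    class VulnerabilityDiscovery:
        triggering_input: bytes   # input that causes the crash when fed to the harness
        harness_name: str         # challenge-provided harness that processes the input
        sanitizer: str            # sanitizer report expected to fire when the crash occurs
        introducing_commit: str   # git commit in the project history that introduced the bug

    # Placeholder values, for illustration only.
    discovery = VulnerabilityDiscovery(
        triggering_input=b"\x41" * 128,
        harness_name="example_harness",
        sanitizer="AddressSanitizer: heap-buffer-overflow",
        introducing_commit="0123abcd",
    )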

Once a vulnerability was discovered, the CRS could then submit a patch; to score, the patch had to both remediate the vulnerability and retain program functionality. Teams were informed that submitting multiple invalid or incomplete patches could negatively impact their score.
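A minimal sketch of that acceptance logic, assuming invented stand-in helpers rather than the real AIxCC evaluation infrastructure, might look like the following.

    from typing import Optional

    # The three helpers below are stand-ins invented for illustration; the real
    # AIxCC machinery for building, re-running, and testing patches is not shown here.
    def apply_patch_and_build(patch: str) -> Optional[str]:
        """Stand-in: return a build identifier, or None if the patch fails to apply or compile."""
        return "patched-build" if patch.strip() else None

    def harness_still_crashes(build: str, triggering_input: bytes) -> bool:
        """Stand-in: re-run the target harness on the original crashing input."""
        return False

    def functionality_tests_pass(build: str) -> bool:
        """Stand-in: run the challenge project's functionality tests."""
        return True

    def patch_scores(patch: str, triggering_input: bytes) -> bool:
        """A patch scores only if it builds, removes the crash, and preserves functionality."""
        build = apply_patch_and_build(patch)
        if build is None:
            return False
        return not harness_still_crashes(build, triggering_input) and functionality_tests_pass(build)

    print(patch_scores("--- a/file.c\n+++ b/file.c\n", b"\x41" * 128))  # True in this toy example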

A diversity multiplier was also included to encourage teams to broaden their approach and increase their vulnerability coverage. The multiplier rewarded teams that found and patched different kinds of vulnerabilities across multiple languages and vulnerability categories.

Team scores were calculated by an algorithm that primarily focused on the number of successful vulnerability discoveries and generated patches, with the top seven teams moving on to the AIxCC Final Competition.
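The exact algorithm is not spelled out in this summary, but a toy sketch, with weights and a multiplier that are entirely assumed, shows how discovery and patch counts might combine with a diversity bonus.

    # Toy scoring sketch -- NOT the official AIxCC algorithm. The weights and the
    # form of the diversity multiplier are assumptions made purely for illustration.
    def toy_score(discoveries: int, patches: int,
                  languages_covered: int, categories_covered: int) -> float:
        base = discoveries + 2 * patches                                  # assumed weighting
        diversity = 1.0 + 0.1 * (languages_covered + categories_covered)  # assumed bonus
        return base * diversity

    # Identical discovery/patch totals, but broader coverage earns a higher score.
    print(f"{toy_score(5, 3, languages_covered=1, categories_covered=2):.1f}")  # 14.3
    print(f"{toy_score(5, 3, languages_covered=2, categories_covered=4):.1f}")  # 17.6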

Semifinal Competition Results

During the 20 hours of competition run time, teams’ Cyber Reasoning Systems discovered 22 unique synthetic vulnerabilities and one real-world zero-day vulnerability, which was responsibly disclosed.

The following teams placed as Finalists in the Semifinal Competition and each received a $2 million prize:

  • 42-b3yond-6ug
  • all_you_need_is_a_fuzzing_brain
  • Lacrosse
  • Shellphish
  • Team Atlanta
  • Theori
  • Trail of Bits
 

Game Visuals

The Linux Overview video (above) displays a call graph for the subset of the Linux Kernel that competitors were tasked with scanning. It is a directed graph in which each node represents a function and each edge indicates that one function can call another.

Thirteen synthetic vulnerabilities were introduced for teams to discover and patch in the Linux Kernel challenge project. Yellow-shaded nodes and edges represent functions reachable from a harness entry point. Red-shaded nodes and edges represent the functions involved in any of the reference paths (backtraces) to a challenge vulnerability.
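To make the reachability idea behind the yellow shading concrete, the sketch below builds a tiny, invented call graph (the function names are placeholders, not actual Linux Kernel functions) and collects every function reachable from a harness entry point with a breadth-first search.

    from collections import deque

    # Toy call graph: each function maps to the functions it can call.
    # The graph and the function names are invented for illustration.
    call_graph = {
        "harness_entry": ["parse_input", "dispatch"],
        "parse_input": ["validate_header"],
        "dispatch": ["handle_request", "log_event"],
        "handle_request": ["copy_buffer"],
        "validate_header": [],
        "log_event": [],
        "copy_buffer": [],
    }

    def reachable_functions(graph: dict, entry: str) -> set:
        """Breadth-first search: collect every function reachable from the harness
        entry point, i.e., the 'yellow' nodes in the Linux Overview visualization."""
        seen = {entry}
        queue = deque([entry])
        while queue:
            fn = queue.popleft()
            for callee in graph.get(fn, []):
                if callee not in seen:
                    seen.add(callee)
                    queue.append(callee)
        return seen

    print(sorted(reachable_functions(call_graph, "harness_entry")))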

 

CRS Activity

The CRS Activity video shows a conceptual visualization of game rounds during the AIxCC Semifinal Competition, in which CRSs discovered and patched vulnerabilities with the aid of LLM resources. The visual highlights LLM requests and responses as well as successful vulnerability discovery and patch events.

The red octahedrons represent each team's CRS. The green globe at the center represents the current round's challenge project. The matrix cube overhead represents the LLM resources that teams queried. Each blue arc represents an LLM request from a CRS; the arc detaching and returning to the CRS indicates the arrival of the LLM's response.

When a CRS submits a discovery or a patch, a yellow callout highlights the scoring team, accompanied by an animation on the challenge project globe.

 

DEF CON 32: AIxCC Experience

To highlight the competition and drive home the urgency of securing our critical infrastructure, the AIxCC Experience at DEF CON 32 in August 2024 included a fictional city called Northbridge that received more than 14,000 visitors. The Experience featured a variety of speakers, collaborators, and hands-on activities for DEF CON attendees to explore, learn from, and engage with.

Representatives and experts from Anthropic, Google, Microsoft, and OpenAI presented emerging tech within the space, answering questions from visitors during each day of the event. AIxCC staff also engaged with visitors, answering questions regarding the competition and the city experience.

Visitors experienced notional scenarios based on real-world cyber attacks guided by a digital narrator named Kiti. A character known as The_Rat posed as a malicious actor who exploited vulnerabilities in the city’s healthcare and water systems, leading to hospital outages, data breaches, and flooding.

These scenarios modeled a future in which technologies like CRSs could respond to vulnerabilities in real time to regain control of critical infrastructure such as healthcare or water supply systems.

Up Next: AIxCC Final Competition

The AIxCC Final Competition (AFC) will unfold at DEF CON 33 in August 2025. The competition will introduce new challenge repositories and vulnerabilities, encouraging teams to improve their current CRS implementations and advance their systems' robustness, scale, and real-world impact. The Final Competition has the potential to seed the next generation of software companies able to address the growing need for remediating software security issues at scale.

AIxCC Final Competition Prizes

  • 1st Place: $4 million
  • 2nd Place: $3 million
  • 3rd Place: $1.5 million