
HTTP/2 Bomb (CVE-2026-49975): when an AI agent chained two decade-old primitives nobody had composed

Codex composed HPACK amplification and Slowloris stalling into a 5,700:1 DoS chain hitting nginx, Apache, IIS, Envoy and Pingora — 880,000 servers exposed. The defensive lesson is symmetric.

Zero Hunt Research · 7 min read

On 2026-06-02 the offensive-security firm Calif published a disclosure that is worth reading twice. They credit OpenAI's Codex agent with finding a remote denial-of-service chain that hits nginx, Apache httpd, Microsoft IIS, Envoy and Cloudflare Pingora — five of the most-deployed HTTP/2 implementations on the public internet. Shodan counts roughly 880,000 servers exposed. The chain combines two primitives that have been publicly documented for nearly a decade. No human composed them in that decade. Codex did.

The Apache piece carries CVE-2026-49975; the other implementations got vendor advisories of their own. At disclosure, IIS and Pingora were still unpatched.

What the HTTP/2 Bomb actually does

The attack chains two known weaknesses in HTTP/2's HPACK header-compression layer.

The first primitive is HPACK table amplification. HPACK lets a client populate the server's dynamic header table with a single large entry, for example a 4 KB cookie value. After that, every subsequent request can reference that entry by index using a single byte on the wire. The server, however, has to reconstruct the full header each time it processes the stream. One byte in, several kilobytes of allocated memory out. The amplification ratio depends on how tightly the server caps per-header size and on whether cookies are counted against that cap.
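
To make the one-byte claim concrete, here is a minimal sketch of the integer encoding from RFC 7541 (Section 5.1) as used by the Indexed Header Field representation (Section 6.1). The function names are illustrative, and the 4,096-byte entry is simply the "4 KB cookie value" from the example above taken at face value. Index 62 is the first dynamic-table slot because the static table occupies indices 1 to 61.

```python
def encode_hpack_int(value: int, prefix_bits: int, flags: int = 0) -> bytes:
    """Encode an integer with an N-bit prefix (RFC 7541, Section 5.1)."""
    limit = (1 << prefix_bits) - 1
    if value < limit:
        # Small values fit entirely inside the prefix: one byte on the wire.
        return bytes([flags | value])
    out = [flags | limit]
    value -= limit
    while value >= 128:
        out.append((value % 128) + 128)  # continuation bit set
        value //= 128
    out.append(value)
    return bytes(out)


def encode_indexed_header(index: int) -> bytes:
    """Indexed Header Field (RFC 7541, Section 6.1): top bit set, 7-bit index."""
    if index < 1:
        raise ValueError("HPACK indices start at 1")
    return encode_hpack_int(index, prefix_bits=7, flags=0x80)


FIRST_DYNAMIC_INDEX = 62   # static table holds indices 1..61
ENTRY_VALUE_BYTES = 4096   # assumption: the "4 KB" example entry, exactly

wire = encode_indexed_header(FIRST_DYNAMIC_INDEX)
ratio = ENTRY_VALUE_BYTES / len(wire)
print(f"wire: 0x{wire.hex()} ({len(wire)} byte) -> {ENTRY_VALUE_BYTES} decoded bytes")
print(f"per-reference expansion: {ratio:.0f}:1")
```

This per-reference figure is an upper-bound illustration, not the end-to-end ratio Calif reports: frame headers, server-side limits and allocator behaviour all change what a real server ends up holding.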

The second primitive is Slowloris-style flow-control stalling. HTTP/2 supports per-stream flow control with WINDOW_UPDATE frames. A client that opens streams, fires the amplification payload, and then refuses to advance the receive window forces the server to hold the allocated buffers in memory indefinitely. Each stream becomes a tax on the server's heap.
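
A back-of-envelope model shows why stalled streams add up so quickly. This is a sketch under stated assumptions, not a measurement: 100 concurrent streams per connection is the floor RFC 9113 recommends for SETTINGS_MAX_CONCURRENT_STREAMS, while the 64 KiB pinned per stream and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StallScenario:
    connections: int              # attacker connections (hypothetical)
    streams_per_connection: int   # concurrent streams the server allows
    bytes_pinned_per_stream: int  # memory held while the stream is stalled

    def held_bytes(self) -> int:
        """Total server memory pinned while no window is ever advanced."""
        return (self.connections
                * self.streams_per_connection
                * self.bytes_pinned_per_stream)


def connections_to_exhaust(target_bytes: int, streams: int, pinned: int) -> int:
    """Smallest connection count whose stalled streams pin target_bytes."""
    per_connection = streams * pinned
    return -(-target_bytes // per_connection)  # ceiling division


TARGET = 32 * 10**9   # 32 GB, read as decimal gigabytes (assumption)
STREAMS = 100         # RFC 9113 recommended minimum for concurrent streams
PINNED = 64 * 1024    # hypothetical: 64 KiB of decoded headers per stream

needed = connections_to_exhaust(TARGET, STREAMS, PINNED)
scenario = StallScenario(needed, STREAMS, PINNED)
print(f"{needed} connections pin {scenario.held_bytes() / 1e9:.2f} GB")
```

The point of the model is that nothing in it depends on the attacker sending more data: once the buffers are allocated, holding them costs the client nothing.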

Composed, the chain is brutally efficient. The researchers at Calif report the following amplification ratios:

Server        | Bandwidth amplification | Time-to-32-GB exhaustion
Envoy 1.37.2  | ~5,700:1                | ~10 seconds
Apache 2.4.67 | ~4,000:1                | seconds
nginx 1.29.7  | ~70:1                   | sub-minute
IIS (WS 2025) | ~68:1                   | sub-minute

A residential 100 Mbit/s uplink is sufficient to take an Envoy instance offline in roughly the time it takes to refresh a browser tab.
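
The uplink claim can be sanity-checked with simple arithmetic. The sketch below assumes the reported ratio means "bytes of server memory consumed per attacker byte sent" and reads 32 GB as decimal gigabytes; both are interpretations of the figures above, and protocol overhead is ignored.

```python
TARGET_BYTES = 32e9  # "32 GB", read as decimal gigabytes (assumption)

# Amplification ratios as reported by Calif.
RATIOS = {
    "Envoy 1.37.2": 5700,
    "Apache 2.4.67": 4000,
    "nginx 1.29.7": 70,
    "IIS (WS 2025)": 68,
}


def attacker_bytes(target_bytes: float, amplification: float) -> float:
    """Bytes the attacker sends if each one costs the server `amplification` bytes."""
    return target_bytes / amplification


def seconds_at_uplink(target_bytes: float, amplification: float,
                      uplink_mbit: float) -> float:
    """Time to push that traffic through an uplink of `uplink_mbit` Mbit/s."""
    return attacker_bytes(target_bytes, amplification) * 8 / (uplink_mbit * 1e6)


for server, ratio in RATIOS.items():
    sent_mb = attacker_bytes(TARGET_BYTES, ratio) / 1e6
    secs = seconds_at_uplink(TARGET_BYTES, ratio, uplink_mbit=100)
    print(f"{server:15s} ~{sent_mb:,.1f} MB sent, ~{secs:.1f} s at 100 Mbit/s")
```

Under these assumptions the Envoy case needs only about 5.6 MB of attacker traffic, well under a second of a 100 Mbit/s uplink, so the reported ~10 seconds is presumably bounded by something other than bandwidth. The nginx and IIS ratios work out to roughly 37 to 38 seconds, which is consistent with "sub-minute".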

Both halves were public for a decade

This is the part that should make defenders uncomfortable. HPACK was standardised in RFC 7541 in 2015. The amplification risk was discussed in academic and practitioner literature within months of the spec landing. Slowloris was Robert Hansen's 2009 disclosure — over fifteen years old. The two primitives have lived next to each other in every conference talk on HTTP/2 hardening since 2016.

Nobody composed them in production-impacting form. Attention in offensive research is allocated by humans: a researcher picks one bug, drives it to a working PoC, writes it up, moves on. Composability across teams and across years is hard. You have to read two codebases at once, recognise that the failure modes interact, and write the glue. That's exactly the kind of cross-file, cross-paper synthesis that an LLM, given an offensive prompt, will attempt without the cognitive cost a human pays.

Read the Calif post carefully and you see the honest framing: the AI did not invent either primitive. It read the codebases, recognised that the two primitives compose, built the glue, and let the humans verify. Quang Luong and the rest of the Calif team did the validation and the disclosure work. The chain is novel; neither primitive is.

"The attack was discovered by Codex, which chained two techniques known to humans for a decade." — Calif disclosure, 2026-06-02

That single sentence is the whole story.

Where the patch chain stands

The patch state at disclosure is uneven, which is the second uncomfortable part.

  • nginx released 1.29.8 with a new max_headers directive that caps the total compressed-header size per request. Effective. Requires explicit configuration; the default is generous.
  • Apache httpd shipped mod_http2 2.0.41 (standalone module and trunk; not yet in a stable 2.4.x point release). The fix makes Cookie headers count against LimitRequestFields — the bypass Calif used. CVE-2026-49975 covers this fix, committed same-day by Stefan Eissing on 2026-05-27 after coordinated disclosure.
  • Envoy shipped patches the day after Calif's public post.
  • Microsoft IIS: unpatched at the time of disclosure. No advisory window communicated.
  • Cloudflare Pingora: unpatched at the time of disclosure. The closed-source nature of Pingora's deployment at Cloudflare's edge means impact on Cloudflare-fronted properties is bounded by whatever mitigation Cloudflare deploys internally and is not auditable from outside.

The temporary mitigation for unpatched stacks is to disable HTTP/2 (Protocols http/1.1 on Apache, the equivalent setting on the others) or to lower the per-stream header-size limit drastically. Both have collateral cost: HTTP/2 is the throughput floor for modern web traffic, and the lower limit will break legitimate clients that send long compound Cookie headers.

The validation gap nobody talks about

The interesting question is not "how did Codex find this." It is "why didn't your last pentest find this."

A traditional pentest engagement runs against a target. The tester picks an attack surface, runs scanners, tries credentialled and uncredentialled approaches, writes a report. The depth of the engagement is bounded by the tester's mental model and by the time on the contract. Composition-across-primitives is rarely tested because the search space explodes: any two of N known bugs give you on the order of N² candidate chains, any three on the order of N³, and most are duds. A human cannot rank-order that space.
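
To put numbers on that explosion, here is a small sketch that counts candidate chains. The catalogue size of 1,000 primitives is a hypothetical figure chosen for illustration, not something from the disclosure, and `candidate_chains` is an illustrative helper name.

```python
import math


def candidate_chains(n_primitives: int, chain_length: int,
                     ordered: bool = True) -> int:
    """Count chains of distinct primitives, with or without regard to order."""
    if ordered:
        return math.perm(n_primitives, chain_length)  # n! / (n - k)!
    return math.comb(n_primitives, chain_length)      # n! / (k! (n - k)!)


N = 1000  # hypothetical catalogue of known primitives

for k in (2, 3):
    ordered = candidate_chains(N, k)
    unordered = candidate_chains(N, k, ordered=False)
    print(f"k={k}: ordered={ordered:,} unordered={unordered:,} N^k={N ** k:,}")
```

Whether order matters changes the constant, not the growth: the ordered count stays just under N^k, and even the unordered count for three primitives is already in the hundreds of millions.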

An offensive AI agent doing composition search at scale can. That is not a marketing claim — it is the structural reason this CVE was found by Codex and not by a human, in a piece of software that is on the perimeter of nearly every major web property on earth. The attack surface is the same. The primitives are the same. What changed is who is doing the search.

For defenders, the implication is symmetric. If the offensive side has access to LLM-driven composition search, the defensive side has to run an equivalent search continuously against its own perimeter — not once a year, not after change tickets, but every day. The cost of doing that with humans is prohibitive. The cost of doing that with an AI swarm built for the purpose is bounded by the GPU it runs on.

This is the part of the CISA KEV catalogue and ENISA Threat Landscape that the dashboards do not surface: the gap between "CVE published" and "exploitation observed" is collapsing, and the gap between "primitive published" and "novel composition exploited" is collapsing harder. HTTP/2 Bomb is the canonical example. It will not be the only one published this quarter.

What CISOs should ask their teams this week

Three concrete questions, all answerable in an afternoon:

  • Which of our internet-facing properties terminate HTTP/2 on IIS or Pingora? Inventory the unpatched stacks first. Disable HTTP/2 on those endpoints until vendor patches land.
  • What are the current values of max_headers (nginx) and of LimitRequestFields and LimitRequestFieldSize (Apache) on every public-facing reverse proxy? Defaults are not safe. The vendor patch is necessary but not sufficient: the protective directive has to be configured.
  • When was the last time we ran an exploit chain validation, not a CVE scan? A scanner sees individual CVEs. For four of the five affected stacks, the HTTP/2 Bomb chain has no entry in any CVE database; it is a composition. A scanner that only reports KEV-listed IDs will miss it.
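
The first two questions lend themselves to a quick script. The following is a minimal sketch of a config audit, assuming the configuration files are available as text; the helper names and the sample fragment are hypothetical, the nginx directive name max_headers is taken from this article rather than verified against nginx documentation, and the parser is deliberately naive (no include resolution, no quoting, one directive per line).

```python
from typing import Iterable, NamedTuple


class Hit(NamedTuple):
    line_no: int
    directive: str
    value: str


def find_directives(config_text: str, names: Iterable[str]) -> list[Hit]:
    """Return every occurrence of the named directives with its raw value."""
    wanted = {name.lower(): name for name in names}
    hits = []
    for line_no, raw in enumerate(config_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # naive comment stripping
        if not line:
            continue
        parts = line.rstrip(";").split(None, 1)  # nginx ends lines with ';'
        key = parts[0].lower()                   # matched case-insensitively
        if key in wanted:
            value = parts[1].strip() if len(parts) > 1 else ""
            hits.append(Hit(line_no, wanted[key], value))
    return hits


def unset_directives(hits: list[Hit], names: Iterable[str]) -> list[str]:
    """Directives that never appear, meaning the server default applies."""
    seen = {hit.directive for hit in hits}
    return [name for name in names if name not in seen]


WATCHED = ["Protocols", "LimitRequestFields", "LimitRequestFieldSize",
           "max_headers"]  # last name as given in the article (unverified)

SAMPLE = """\
# hypothetical Apache fragment
Protocols h2 http/1.1
LimitRequestFieldSize 8190
"""

found = find_directives(SAMPLE, WATCHED)
for hit in found:
    print(f"line {hit.line_no}: {hit.directive} = {hit.value}")
print("left at default:", unset_directives(found, WATCHED))
```

A directive reported as left at its default is the interesting output here, since the article's point is that the protective settings only help once they are explicitly configured.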

How Zero Hunt addresses this scenario

The HTTP/2 Bomb story is the cleanest possible illustration of why Zero Hunt is built the way it is. The platform's 10-agent generative-pentest swarm — Recon, Exploit, Web, Credential, Post-Exploit, Pivot, Tactic, Report, plus the AI Controller — does exactly the composition search the Calif team's Codex agent did, but continuously, against the customer's own perimeter, with the appliance GPU as the cost ceiling. New exploits are written per-target by the local LLM; they are not pulled from ExploitDB. Composition of known primitives into a novel chain is precisely the class of behaviour the swarm exists to surface before an attacker does.

The AI Gym, with its 142+ self-evolving security skills backtested against Vulhub (316 of 317 exercises across 16 classes), NYU CTF Bench, Cybench and Vulhub-Bench's 314 CVE-based black-box tasks, is where new chains earn the right to run in a real environment. A skill that composes HPACK amplification with Slowloris stalling is validated against the corpus before it ever touches production traffic. The AI Gym Knowledge RAG — pgvector semantic search over every past execution, finding and remediation — is how the engine remembers that the two primitives compose, so it doesn't have to rediscover the chain on every customer.

On the wire side, the 4-head deep-learning traffic-analysis model is the second line: HTTP/2 Bomb traffic has a distinctive signature (sustained one-byte HPACK references against a populated dynamic table, with stalled WINDOW_UPDATE behaviour) that does not require a CVE-specific signature to be flagged as anomalous. The model runs on the appliance GPU at 2.7+ Gbit/s baseline, locally, with no cloud callback. It catches the chain executing in real time, not in tomorrow's SIEM digest.

For regulated operators — NIS2 essential and important entities, DORA financial entities, the public-administration scope ACN is currently inventorying — the compliance layer maps the finding against the 32 frameworks at write time and ECDSA-signs the evidence chain. When the auditor asks "did you have continuous validation against composition attacks on your perimeter," there is a defensible answer with a verifiable signature on it.

The HTTP/2 Bomb is the kind of finding that, before this week, only one offensive team in the world had. After this week, every offensive team has the playbook. The defensive side has to respond at the same tempo. Talk to us if you want to see what continuous AI-on-AI validation looks like against your own perimeter.