
The Day ChatGPT Went Dark: Anatomy of a Layer 7 DDoS

SEV: MEDIUM
Nov 2023
STATUS: RESOLVED
ID: LOG-0104

Incident Report

"Internal Server Error" - The New Normal?

If you were trying to code with Copilot or chat with GPT-4 in November 2023, you likely stared at a spinning wheel or a generic "Capacity Reached" error. OpenAI confirmed they were dealing with "abnormal traffic patterns reflective of a DDoS attack."

[Image of DDoS attack types diagram]

This wasn't your average script kiddie attack.
Traditional volumetric attacks (Layer 3/4) flood the pipes with garbage UDP packets. This was a Layer 7 (Application Layer) attack. The difference is crucial:

  • Volumetric: Tries to clog the network cable.
  • Layer 7: Tries to exhaust the server's CPU/RAM.
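The bullet points above can be made concrete. From the network's perspective, a Layer 7 attack request is just a small, well-formed HTTP POST — there is nothing for a Layer 3/4 filter to flag. A sketch (the endpoint and payload are hypothetical):

```javascript
// A Layer 7 attack request is byte-for-byte valid HTTP.
// (api.example.com and the payload are illustrative, not OpenAI's API.)
const body = JSON.stringify({ prompt: "Write a novel about dragons", max_tokens: 4000 });
const rawRequest = [
  "POST /v1/completions HTTP/1.1",
  "Host: api.example.com",
  "Content-Type: application/json",
  `Content-Length: ${Buffer.byteLength(body)}`,
  "",
  body,
].join("\r\n");

// A few hundred bytes on the wire -- an ordinary TCP connection
// carrying ordinary HTTP, as far as a volumetric filter can tell.
console.log(`${Buffer.byteLength(rawRequest)} bytes of perfectly valid HTTP`);
```

The only way to tell this apart from a real customer is at the application layer, which is exactly why these attacks slip past network-level defenses.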

The "Asymmetric" Cost of AI

The attackers (a group claiming to be Anonymous Sudan) exploited a weakness specific to LLM serving: Asymmetry.

  1. Cheap to Request: Sending a request like POST /completions { prompt: "Write a novel..." } costs the attacker fractions of a cent and minimal bandwidth.
  2. Expensive to Process: The server must spin up massive H100 GPU clusters, load gigabytes of weights into VRAM, and spend seconds generating tokens.

This creates a resource imbalance. You don't need a massive botnet to take down an AI service; you just need enough requests to fill the inference queue.
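To make the imbalance concrete, here is a back-of-the-envelope sketch. Every number is an illustrative assumption, not a measured value from OpenAI:

```javascript
// Illustrative cost model of the request/response asymmetry.
// All figures are assumptions for the sake of the sketch.
const attackerCostPerRequest = {
  bandwidthBytes: 500, // a small JSON POST
  cpuMillis: 1,        // building and sending the request
};
const serverCostPerRequest = {
  gpuSeconds: 10,      // generating thousands of tokens
  vramGigabytes: 80,   // weights resident on an H100-class GPU
};

// ~1 ms of attacker CPU triggers ~10 s of server GPU time:
const amplification =
  (serverCostPerRequest.gpuSeconds * 1000) / attackerCostPerRequest.cpuMillis;
console.log(`Amplification factor: ~${amplification}x`);
```

Under these (made-up) numbers, each request is amplified by four orders of magnitude — which is why a modest botnet suffices where a volumetric attack would need terabits of bandwidth.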

Technical Mitigation: The "Proof of Work" Defense

How do you stop this? Rate limiting based on IP alone isn't enough, because attackers rotate IPs. One common mitigation is to make clients pay up front with a Client-Side Proof of Work (PoW) challenge.

Before the API accepts your request, it forces your browser/client to solve a math puzzle.

// Conceptual Example of a Challenge-Response.
// sha256 is an assumed helper returning a hex digest string;
// one way to implement it in the browser is the Web Crypto API:
async function sha256(text) {
  const data = new TextEncoder().encode(text);
  const buf = await crypto.subtle.digest("SHA-256", data);
  return [...new Uint8Array(buf)]
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
}

async function solveChallenge(challengeString) {
  let nonce = 0;
  while (true) {
    const hash = await sha256(challengeString + nonce);
    if (hash.startsWith("0000")) {
      return nonce; // Found the "Golden Nonce"
    }
    nonce++;
  }
}

Why this works: It shifts the cost back to the attacker. If their botnet has to spend 100% CPU to solve puzzles for every request, the attack becomes too expensive to sustain.
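The server side of the scheme is what makes the economics flip: verifying a submitted nonce costs a single hash, no matter how many hashes the client burned finding it. A minimal Node.js sketch (the helper and `difficulty` parameter are assumptions for illustration, not any provider's actual implementation):

```javascript
import { createHash } from "node:crypto";

// Assumed helper mirroring the client's sha256(text) -> hex string.
const sha256 = async (text) =>
  createHash("sha256").update(text).digest("hex");

// One hash to verify, thousands (on average) to solve.
// difficulty = required number of leading zero hex digits.
async function verifyChallenge(challengeString, nonce, difficulty = 4) {
  const hash = await sha256(challengeString + nonce);
  return hash.startsWith("0".repeat(difficulty));
}
```

Raising `difficulty` by one hex digit multiplies the client's expected work by 16 while the server's verification cost stays constant — the knob a defender turns up when traffic looks hostile.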

Lessons for API Developers

This outage taught us that 429 Too Many Requests is not a failure; it's a survival mechanism. If you are building GenAI wrappers, you need aggressive caching and circuit breakers. Never assume the upstream provider is infinite.
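A circuit breaker for an upstream GenAI call can be sketched in a few dozen lines. The thresholds below are illustrative assumptions; production code would add a half-open probe state and retry jitter:

```javascript
// Minimal circuit-breaker sketch for calls to an upstream AI provider.
// failureThreshold and cooldownMs are illustrative, not recommended values.
class CircuitBreaker {
  constructor({ failureThreshold = 5, cooldownMs = 30_000 } = {}) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.failures = 0;
    this.openedAt = null;
  }

  get isOpen() {
    if (this.openedAt === null) return false;
    if (Date.now() - this.openedAt >= this.cooldownMs) {
      this.openedAt = null; // cooldown elapsed: allow traffic again
      this.failures = 0;
      return false;
    }
    return true;
  }

  async call(fn) {
    if (this.isOpen) throw new Error("circuit open: upstream assumed down");
    try {
      const result = await fn();
      this.failures = 0; // any success resets the count
      return result;
    } catch (err) {
      if (++this.failures >= this.failureThreshold) this.openedAt = Date.now();
      throw err; // surface the 429/503 instead of hammering upstream
    }
  }
}
```

When the provider starts returning 429s, the breaker trips and your wrapper fails fast from cache (or degrades gracefully) instead of piling more load onto an already-drowning inference queue.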

SYSTEM NOTES

This log entry has been verified and archived. Access restricted to authorized personnel only.