
The Security Audit

February 13, 2026. Finding four critical vulnerabilities in a system we thought was production-ready, and the fix-and-verify cycle that followed.

Context: The audit took place two days after the major feature expansion (Feb 11), with a dashboard deployed, CI/CD running, and the endpoint live for GUVI evaluation. We had been shipping features for eight straight days. The audit was triggered by a review of audit_codex.md, a file containing a checklist of security concerns flagged during earlier development.


The Discovery

The audit started not with a tool or a scan, but with a file:

User: "Let's go through the audit_codex and check every item."

The audit codex was a document we had been maintaining: a running list of "we should check this later" notes accumulated during rapid development. It held items like "verify API key validation," "check callback authentication," and "confirm secret wiring in all functions." Each note was debt logged at a moment when we were too busy shipping to stop and verify.

Going through it item by item, Claude Code pulled up the relevant source files and analyzed them against the checklist. Within thirty minutes, four findings had crystallized.


The Four Findings

P0: API Key Bypass via Timing Attack

# BEFORE (vulnerable)
import os

def validate_api_key(request):
    expected_key = os.environ.get("SCAMSHIELD_API_KEY")
    provided_key = request.headers.get("x-api-key", "")
    return provided_key == expected_key

Python's == operator on strings can return as soon as it finds a mismatch (and immediately when the lengths differ), so its running time depends on how much of the input matches. An attacker who can measure response times precisely enough can, in principle, learn how many leading bytes of a guessed API key are correct and reconstruct the key character by character over many requests.

Claude Code: "This comparison is vulnerable to timing attacks. The == operator short-circuits on first mismatch. Replace with hmac.compare_digest() which takes constant time regardless of where strings differ."

The fix was one line:

# AFTER (constant-time comparison)
import hmac
return hmac.compare_digest(provided_key, expected_key)
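One detail worth guarding against when applying this fix: hmac.compare_digest raises TypeError if either argument is None, or if a str argument contains non-ASCII characters. The following is a minimal sketch of a validator that fails closed in both cases. It is not the project's actual code: it assumes the headers are available as a plain dict-like mapping rather than the framework's request object.

```python
import hmac
import os


def validate_api_key(headers: dict) -> bool:
    """Return True only when the x-api-key header matches the configured key.

    `headers` is a hypothetical stand-in for request.headers.
    """
    expected_key = os.environ.get("SCAMSHIELD_API_KEY")
    if not expected_key:
        # Fail closed: an unset key must never validate, and passing None
        # to compare_digest would raise TypeError.
        return False

    provided_key = headers.get("x-api-key", "")
    # Compare as bytes so non-ASCII input is rejected instead of raising.
    return hmac.compare_digest(
        provided_key.encode("utf-8"),
        expected_key.encode("utf-8"),
    )
```

With this shape, a missing environment variable results in every request being rejected rather than in an unhandled exception.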

Why This Matters

API key validation is the front door of the entire system. Every request from the GUVI evaluator, and from any attacker who discovers the endpoint, passes through this function. A timing attack against a cloud-hosted function is noisy (it takes many requests to average out network jitter) but feasible in principle. The constant-time fix has negligible performance cost and removes the dependence of comparison time on where the strings differ.

P0: Missing OIDC Verification on Callback Endpoint

The send_delayed_callback Cloud Function was accessible via HTTP POST. Any request with a valid body would be processed: there was no verification that the request came from Cloud Tasks, the only legitimate caller.

User: "Wait --- anyone can call this endpoint?"

Claude Code: "Correct. Cloud Tasks sends an OIDC Bearer token with each request that can be verified against Google's public keys. Without verification, the endpoint accepts requests from any source."

The fix required a new module:

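# functions/utils/oidc.py
import os

from google.auth import exceptions as google_exceptions
from google.auth.transport import requests as google_requests
from google.oauth2 import id_token


def verify_cloud_tasks_token(request):
    """Return (ok, error_message) for a request that should come from Cloud Tasks."""
    if not os.environ.get("K_SERVICE"):
        return True, ""  # Skip in local dev

    auth_header = request.headers.get("Authorization", "")
    if not auth_header.startswith("Bearer "):
        return False, "Unauthorized"

    token = auth_header[len("Bearer "):]
    try:
        # Raises on an invalid or expired token. No audience is passed here,
        # so the audience claim is not checked.
        claims = id_token.verify_oauth2_token(token, google_requests.Request())
    except (ValueError, google_exceptions.GoogleAuthError):
        return False, "Unauthorized"

    # project_id is not defined in this excerpt; it is assumed to be resolved
    # elsewhere in the module.
    expected_sa = f"{project_id}@appspot.gserviceaccount.com"
    if claims.get("email") != expected_sa:
        return False, "Unauthorized"

    return True, ""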

The Extraction Decision

We deliberately extracted OIDC verification into functions/utils/oidc.py rather than putting it inline in main.py. The reason was pragmatic: importing main.py triggers Firebase Admin SDK initialization, which requires Google Cloud credentials, so test files that import from main.py fail in CI environments without credentials. Moving the utility into a separate module broke that import chain and made it testable in isolation.

This is the kind of architectural decision that comes from debugging experience, not from AI suggestion. Claude Code generated the OIDC verification code. The human decided where to put it based on lessons learned from previous test failures.

P1: Secret Not Wired to Callback Function

SCAMSHIELD_API_KEY was listed in the secrets parameter of the main guvi_honeypot Cloud Function but not in the secrets parameter of send_delayed_callback. In production, the callback function could not read the API key: os.environ.get("SCAMSHIELD_API_KEY") returned None, and callbacks silently failed authentication against the GUVI API.

User: "So callbacks have been failing in production this whole time?"

The fix was adding one line to the function decorator:

from firebase_functions import https_fn

@https_fn.on_request(
    timeout_sec=60,
    memory=256,
    region="asia-south1",
    secrets=["GEMINI_API_KEY", "SCAMSHIELD_API_KEY"],  # Added SCAMSHIELD_API_KEY
)
def send_delayed_callback(request):
    ...

The deployment workflow also needed updating to pass the secret to Cloud Run for the dashboard.
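The failure stayed hidden because a missing secret simply read as None. One way to make this class of misconfiguration loud is to read required secrets through a helper that raises when the value is absent. This is a sketch only, not something the project is stated to have adopted; require_secret and MissingSecretError are hypothetical names.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is absent from the environment."""


def require_secret(name: str) -> str:
    """Return the value of a required environment variable, or raise.

    Hypothetical helper: an unset or empty value is treated as a
    deployment error instead of being passed along as None.
    """
    value = os.environ.get(name)
    if not value:
        raise MissingSecretError(
            f"{name} is not set; check that it is listed in this function's secrets"
        )
    return value
```

Calling require_secret("SCAMSHIELD_API_KEY") at the top of the callback handler would then surface the wiring mistake as an explicit error in the logs on the first invocation.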

P1: Callback Payload Field Name Mismatch

The callback payload used snake_case field names (session_id, scam_detected, confidence_level). The GUVI API expected camelCase (sessionId, scamDetected, confidenceLevel). The POST request returned 200 because GUVI's API was lenient, but the unrecognized fields were silently dropped, so our intelligence reports were arriving empty.

This one was diagnosed by comparing our Pydantic model with the GUVI API specification line by line. Claude Code generated the corrected model with proper alias configurations.
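The shape of the fix can be illustrated without Pydantic. The sketch below keeps snake_case names inside Python and converts them to camelCase only at the serialization boundary. It uses a plain dataclass in place of the project's Pydantic model, covers only the three fields named above, and the field types are assumptions. In Pydantic, the equivalent would typically be field aliases combined with serializing by alias.

```python
from dataclasses import asdict, dataclass


def to_camel(name: str) -> str:
    """Convert a snake_case name to camelCase (session_id -> sessionId)."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


@dataclass
class CallbackPayload:
    # Only the fields named in the text; the real payload may carry more.
    session_id: str
    scam_detected: bool
    confidence_level: float  # type is an assumption


def to_wire(payload: CallbackPayload) -> dict:
    """Serialize the payload with the camelCase keys the receiving API expects."""
    return {to_camel(key): value for key, value in asdict(payload).items()}
```

Keeping the conversion in one function means a test can assert on the exact wire keys, which is the check that would have caught the empty reports.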


The Fix-and-Verify Cycle

All four fixes landed in a single commit:

ba3bc09  Feb 13 21:15  Fix priority 0 audit findings: auth, OIDC, secret wiring,
                       payload shape

The verification process:

  1. Local tests: Run the full test suite to confirm fixes did not break existing functionality
  2. Deploy: Push to Firebase and Cloud Run
  3. Live verification: Send test requests and check that API key validation worked, callbacks succeeded, and payload fields matched
  4. Log review: Confirm no silent failures in Cloud Logging

The entire audit (discovery, analysis, fix, deploy, verify) took one evening. The bugs had been live for days. The audit_codex.md file that prompted the review had been written five days earlier. If we had done the audit immediately instead of logging it for later, we would have caught these before the evaluator started scoring.

The Cost of 'We'll Check It Later'

Every item in the audit codex was a known concern that we deferred under deadline pressure. The timing attack was noted on Feb 8. The callback authentication was flagged on Feb 6. We shipped five days of features on top of unverified security assumptions. In a hackathon, this might be acceptable. In production, it would be a breach waiting to happen.


These excerpts are representative of the actual development conversations. See the README for how to interpret them.