
How We Built ScamShield AI with Claude Code

80 sessions. 87 commits. 276 tests. 17 days. One AI coding partner.


What This Is

This is not a polished tech blog. It is the real, messy story of building a production AI system under hackathon pressure --- told through the git log, the session transcripts, and the debugging war stories that did not make it into the README.

ScamShield AI was built for the GUVI India AI Impact Buildathon using Claude Code as an AI coding assistant. Every architecture decision, every late-night bug fix, every moment of "why is this returning a 404" happened in a conversation between a developer and an AI pair programmer.

We had roughly 80 Claude Code sessions across the build. The raw conversation transcripts weigh in at around 119 MB of text --- back-and-forth about architecture choices, debugging Gemini API responses, arguing over regex patterns for Aadhaar numbers, and occasionally panicking about deadlines. What follows is the story those transcripts tell.

What Makes This Story Worth Reading

Most AI-assisted development stories show the happy path: prompt in, working code out. This one shows the full picture --- the model name that took three commits to get right, the security audit that found four critical vulnerabilities in a system we thought was production-ready, the Cloudflare proxy setting that silently broke WebSocket connections, and the all-night scoring blitz on the final day before evaluation.

The Timeline

Date        Phase               Sessions  Commits  What Happened
Feb 1       Blueprinting        2         3        Meta-architecture and AI agent team design
Feb 5       The Blitz           30+       30       Zero to working system in one day
Feb 6       Callbacks           3         3        Cloud Tasks for delayed intelligence reporting
Feb 11      Feature Expansion   17        17       Multilingual, dashboard, CI/CD, optimization branches
Feb 12      Dashboard + Tests   10        4        UX polish, security hardening, 1,045 new lines of tests
Feb 13      Security Audit      16        2        4 critical findings found and fixed at 9 PM
Feb 14      Auth + Gemini 3     8         10       Cookie auth, model upgrade, CookieManager bug
Feb 16      Extractors          3         3        IFSC, Aadhaar, PAN regex; test suite hits 276
Feb 18--19  Bug Fixes           3         2        Remote debugging, phone number edge cases
Feb 21      Score Maximization  12        4        Per-turn callbacks, scam detection from turn 1
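The Feb 16 extractor entry refers to standard, publicly documented Indian identifier formats. As a minimal sketch of what such patterns look like (these follow the published formats and are illustrative --- not necessarily the exact regexes ScamShield shipped):

```python
import re

# Published formats (illustrative; the project's shipped patterns may differ):
#   IFSC:    4 bank letters + literal '0' + 6 alphanumerics, e.g. SBIN0001234
#   PAN:     5 letters + 4 digits + 1 letter,                e.g. ABCDE1234F
#   Aadhaar: 12 digits, first digit 2-9, optionally grouped 4-4-4
IFSC_RE = re.compile(r"\b[A-Z]{4}0[A-Z0-9]{6}\b")
PAN_RE = re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")
AADHAAR_RE = re.compile(r"\b[2-9]\d{3}[ -]?\d{4}[ -]?\d{4}\b")
```

Note that a regex can only validate shape; a production extractor would typically add checksum validation on top (Aadhaar numbers, for instance, carry a Verhoeff check digit).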

The Chapters

Chapter 1: The Blitz --- Zero to Working System

Feb 5 and Feb 11. Thirty commits in a single day. The Gemini model name that took three tries. Three personas built in one sitting. Seventeen sessions of feature expansion. The plug-and-play strategy that let us merge optimization branches on demand.

Chapter 2: Hardening Under Pressure --- Dashboard, Security, and Deployment

Feb 12--14. A nine-page Streamlit dashboard deployed to Cloud Run. A security audit at 9 PM that found API key bypass, missing OIDC verification, and payload mismatches --- all fixed by midnight. The Cloudflare proxy setting that broke everything. The CookieManager bug that took two render cycles to understand.

Chapter 3: The Presentation

Feb 13--14. Building the buildathon presentation while still shipping features. Two slide deck versions. The constraint of presenting a technical system to a mixed audience.

Chapter 4: Scouting the Competition

Feb 16. Analyzing other submissions. What they did differently. What we learned from their approaches.

Chapter 5: The Rework --- Five Milestones in One Night

Feb 18--21. Post-submission panic, the UPI regex that matched emails, the phone number format that lost points. Then at 4 AM: the resubmission opportunity. A systematic five-milestone plan, the scamDetected duplication bug, 633 lines of dead Cloud Tasks code removed, and the uncomfortable question of optimizing for an evaluator vs. optimizing the system.
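The "UPI regex that matched emails" hints at a real ambiguity: UPI virtual payment addresses (user@handle) share their shape with email addresses. One way to disambiguate, sketched here as an illustration --- the handle names and the dot heuristic are assumptions, not the project's shipped logic:

```python
import re

# Any "user@provider" token; on its own this also catches plain emails.
CANDIDATE = re.compile(r"\b([\w.\-]{2,})@([\w\-]+(?:\.[\w\-]+)*)\b")

def extract_upi_ids(text: str) -> list[str]:
    """Return tokens that look like UPI VPAs, dropping email addresses.

    Heuristic: UPI handles ('ybl', 'oksbi', ...) contain no dot, while
    email domains almost always do ('gmail.com'), so any dotted
    provider part is treated as an email and skipped.
    """
    return [
        m.group(0)
        for m in CANDIDATE.finditer(text)
        if "." not in m.group(2)  # dotted domain => email, not a UPI VPA
    ]
```

A naive pattern along the lines of `\S+@\S+`, by contrast, would report alice@gmail.com as payment evidence --- exactly the false-positive class this chapter describes.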

Chapter 6: Reflections --- What We Learned About Human-AI Collaboration

The capstone. What humans bring (vision, domain knowledge, judgment). What AI brings (speed, consistency, tireless test generation). Where it worked brilliantly (persona engineering, regex patterns, refactoring). Where it broke down (silent failures, security gaps, stale model names). The multiplier effect and advice for others.

Conversation Highlights

Sanitized excerpts from actual Claude Code sessions showing the real back-and-forth: Building the Personas, The Security Audit, and Score Optimization.


By the Numbers

Total Claude Code sessions:   ~80
Conversation transcript size: ~119 MB
Total git commits:            87
Total test cases:             276
Lines of Python:              ~4,500
Personas:                     3
Evidence extractor types:     11
Scam categories:              12
Dashboard pages:              9
Days from first commit
  to final submission:        17

Reading Order

The chapters are designed to be read in sequence, but each stands alone. If you are here for the security story, skip to Chapter 2. If you want the technical sprint, start with Chapter 1.

How This Relates to the Rest of the Documentation

The Building ScamShield series is the technical walkthrough --- it explains how each component works, with code samples and architecture diagrams. It is organized by topic: foundations, personas, evidence extraction, security, deployment.

This dev diary is the narrative --- it explains what happened, in what order, and why. It is organized by time. The same security audit appears in both: in Building ScamShield, it is a chapter on hardening techniques with code samples. Here, it is the story of a 9 PM session that found four critical vulnerabilities in a system we were about to present.

If you want to understand the system, read Building ScamShield. If you want to understand the process of building it with an AI coding partner under hackathon pressure, read this.


This dev diary is part of the ScamShield AI documentation.