How We Built ScamShield AI with Claude Code¶
80 sessions. 87 commits. 276 tests. 17 days. One AI coding partner.
What This Is¶
This is not a polished tech blog. It is the real, messy, high-pressure story of building a production AI system under hackathon deadlines --- told through the git log, the session transcripts, and the debugging war stories that did not make it into the README.
ScamShield AI was built for the GUVI India AI Impact Buildathon using Claude Code as an AI coding assistant. Every architecture decision, every late-night bug fix, every moment of "why is this returning a 404" happened in a conversation between a developer and an AI pair programmer.
We had roughly 80 Claude Code sessions across the build. The raw conversation transcripts weigh in at around 119 MB of text --- back-and-forth about architecture choices, debugging Gemini API responses, arguing over regex patterns for Aadhaar numbers, and occasionally panicking about deadlines. What follows is the story those transcripts tell.
What makes this story worth reading¶
Most AI-assisted development stories show the happy path: prompt in, working code out. This one shows the full picture --- the model name that took three commits to get right, the security audit that found four critical vulnerabilities in a system we thought was production-ready, the Cloudflare proxy setting that silently broke WebSocket connections, and the all-night scoring blitz on the final day before evaluation.
The Timeline¶
| Date | Phase | Sessions | Commits | What Happened |
|---|---|---|---|---|
| Feb 1 | Blueprinting | 2 | 3 | Meta-architecture and AI agent team design |
| Feb 5 | The Blitz | 30+ | 30 | Zero to working system in one day |
| Feb 6 | Callbacks | 3 | 3 | Cloud Tasks for delayed intelligence reporting |
| Feb 11 | Feature Expansion | 17 | 17 | Multilingual, dashboard, CI/CD, optimization branches |
| Feb 12 | Dashboard + Tests | 10 | 4 | UX polish, security hardening, 1,045 new lines of tests |
| Feb 13 | Security Audit | 16 | 2 | Four critical vulnerabilities found and fixed at 9 PM |
| Feb 14 | Auth + Gemini 3 | 8 | 10 | Cookie auth, model upgrade, CookieManager bug |
| Feb 16 | Extractors | 3 | 3 | IFSC, Aadhaar, PAN regex; test suite hits 276 |
| Feb 18--19 | Bug Fixes | 3 | 2 | Remote debugging, phone number edge cases |
| Feb 21 | Score Maximization | 12 | 4 | Per-turn callbacks, scam detection from turn 1 |
The Chapters¶
Chapter 1: The Blitz --- Zero to Working System¶
Feb 5 and Feb 11. Thirty commits in a single day. The Gemini model name that took three tries. Three personas built in one sitting. Seventeen sessions of feature expansion. The plug-and-play strategy that let us merge optimization branches on demand.
Chapter 2: Hardening Under Pressure --- Dashboard, Security, and Deployment¶
Feb 12--14. A nine-page Streamlit dashboard deployed to Cloud Run. A security audit at 9 PM that found API key bypass, missing OIDC verification, and payload mismatches --- all fixed by midnight. The Cloudflare proxy setting that broke everything. The CookieManager bug that took two render cycles to understand.
Chapter 3: The Presentation¶
Feb 13--14. Building the buildathon presentation while still shipping features. Two slide deck versions. The constraint of presenting a technical system to a mixed audience.
Chapter 4: Scouting the Competition¶
Feb 16. Analyzing other submissions. What they did differently. What we learned from their approaches.
Chapter 5: The Rework --- Five Milestones in One Night¶
Feb 18--21. Post-submission panic, the UPI regex that matched emails, the phone number format that lost points. Then at 4 AM: the resubmission opportunity. A systematic five-milestone plan, the scamDetected duplication bug, 633 lines of dead Cloud Tasks code removed, and the uncomfortable question of optimizing for an evaluator vs. optimizing the system.
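The "UPI regex that matched emails" bug is easy to reproduce: a UPI virtual payment address (VPA) like `victim123@okicici` has the same `handle@provider` shape as an email address, so a naive pattern flags every email as payment evidence. Below is a minimal illustrative sketch of the problem and one way to tighten the pattern --- the regexes here are assumptions for demonstration, not the actual production extractors. It relies on the observation that a VPA's provider part is a bare word, while an email domain carries a dot-separated TLD.

```python
import re

# Naive pattern: any handle@provider, so plain email addresses
# like alice@example.com are false positives.
NAIVE_UPI = re.compile(r"[\w.\-]+@\w+")

# Stricter sketch: require a bare provider word and reject matches
# followed by a ".tld" suffix, which signals an email domain.
UPI_VPA = re.compile(r"\b[\w.\-]{2,}@[A-Za-z]{2,}\b(?!\.[A-Za-z])")

print(bool(UPI_VPA.search("send money to victim123@okicici")))   # VPA: True
print(bool(UPI_VPA.search("email me at alice@example.com")))     # email: False
```

A lookahead like this is a heuristic, not a guarantee --- but it captures the kind of one-line fix that cost (and then recovered) evaluation points.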
Chapter 6: Reflections --- What We Learned About Human-AI Collaboration¶
The capstone. What humans bring (vision, domain knowledge, judgment). What AI brings (speed, consistency, tireless test generation). Where it worked brilliantly (persona engineering, regex patterns, refactoring). Where it broke down (silent failures, security gaps, stale model names). The multiplier effect and advice for others.
Conversation Highlights¶
Sanitized excerpts from actual Claude Code sessions showing the real back-and-forth: Building the Personas, The Security Audit, and Score Optimization.
By the Numbers¶
- Total Claude Code sessions: ~80
- Conversation transcript size: ~119 MB
- Total git commits: 87
- Total test cases: 276
- Lines of Python: ~4,500
- Personas: 3
- Evidence extractor types: 11
- Scam categories: 12
- Dashboard pages: 9
- Days from first commit to final submission: 17
Reading Order¶
The chapters are designed to be read in sequence, but each stands alone. If you are here for the security story, skip to Chapter 2. If you want the technical sprint, start with Chapter 1.
How This Relates to the Rest of the Documentation¶
The Building ScamShield series is the technical walkthrough --- it explains how each component works, with code samples and architecture diagrams. It is organized by topic: foundations, personas, evidence extraction, security, deployment.
This dev diary is the narrative --- it explains what happened, in what order, and why. It is organized by time. The same security audit appears in both: in Building ScamShield, it is a chapter on hardening techniques with code samples. Here, it is the story of a 9 PM session that found four critical vulnerabilities in a system we were about to present.
If you want to understand the system, read Building ScamShield. If you want to understand the process of building it with an AI coding partner under hackathon pressure, read this.
This dev diary is part of the ScamShield AI documentation.