Changelog

Development history of ScamShield AI, organized by phase.

Phase 1: Core System (Feb 11, 2026)

The initial build — from zero to working honeypot in a single day.

  • Architecture design — Firebase Cloud Functions + Gemini Flash + Firestore
  • GUVI webhook handler — POST endpoint with API key authentication
  • Pydantic models — Request/response models matching GUVI spec (camelCase aliases)
  • Scam classification — Gemini-powered classifier with 6 initial scam types
  • 3 personas — Sharma Uncle, Lakshmi Aunty, Vikram Professional
  • Evidence extraction — Regex patterns for UPI IDs, bank accounts, phone numbers
  • Keyword detection — 11 categories with weighted scoring
  • Session management — In-memory store (later migrated to Firestore)
  • CI/CD pipeline — GitHub Actions with Firebase CLI deploy
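
As an illustration of the regex-based evidence extraction listed above, here is a minimal sketch covering UPI IDs and Indian mobile numbers. The patterns and the `extract_evidence` helper are assumptions made for this example, not the project's actual extractors, and bank account matching is left out.

```python
import re

# Hypothetical patterns for illustration only.
# UPI handle: local part, "@", alphabetic provider, not followed by ".tld" (which would be an email).
UPI_RE = re.compile(r"\b[\w.-]{2,}@[A-Za-z]{2,}\b(?!\.[A-Za-z])")
# Indian mobile number: optional +91 prefix, then 10 digits starting with 6-9.
PHONE_RE = re.compile(r"(?<!\d)(?:\+91[\s-]?)?[6-9]\d{9}(?!\d)")


def extract_evidence(message: str) -> dict:
    """Return UPI IDs and phone numbers found in a scammer message."""
    return {
        "upi_ids": UPI_RE.findall(message),
        "phone_numbers": PHONE_RE.findall(message),
    }


result = extract_evidence("Pay to scammer@okaxis or call +91 9876543210")
print(result)
```

The negative lookahead in `UPI_RE` is one simple way to avoid reporting ordinary email addresses as UPI IDs; a production extractor would likely need a list of known UPI providers as well.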

Phase 2: Dashboard & Security (Feb 12-13, 2026)

Production hardening and operational visibility.

  • Streamlit dashboard — 9 pages on Cloud Run
  • PIN authentication — HMAC-signed cookies with Firestore lockout
  • Security audit — 4 critical/high findings identified and fixed:
      • API key bypass → constant-time comparison
      • Missing OIDC verification → Cloud Tasks token validation
      • Secret wiring gap → proper Secret Manager integration
      • Callback field mismatch → aligned with GUVI spec
  • Cloudflare proxy — Worker with redirect: "manual" (critical gotcha)
  • Workload Identity Federation — Keyless CI/CD deploys
  • Input sanitizer — Prompt injection pattern detection and neutralization
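
Two of the items above, the constant-time API key comparison and the HMAC-signed cookies, can be sketched with Python's standard `hmac` module. The function names, the `value.signature` cookie layout, and the choice of SHA-256 are assumptions for this example; the project's real cookie format is not described here.

```python
import hashlib
import hmac
from typing import Optional


def api_key_matches(presented: str, expected: str) -> bool:
    """Compare API keys in time that does not depend on where they first differ."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))


def sign_cookie(value: str, secret: bytes) -> str:
    """Append an HMAC-SHA256 signature to the cookie value (assumed layout: value.signature)."""
    mac = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{value}.{mac}"


def verify_cookie(cookie: str, secret: bytes) -> Optional[str]:
    """Return the original value if the signature checks out, otherwise None."""
    value, sep, mac = cookie.rpartition(".")
    if not sep:
        return None  # no signature part present
    expected = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    if hmac.compare_digest(mac.encode("utf-8"), expected.encode("utf-8")):
        return value
    return None
```

A plain `==` comparison can return as soon as it finds a mismatching character, which may leak timing information about the secret; `hmac.compare_digest` is designed to avoid that.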

Phase 3: Firestore Migration (Feb 14-18, 2026)

Moving from in-memory to persistent storage.

  • Firestore sessions — honeypot_sessions collection with batch updates
  • Evidence index — Cross-session evidence linking (evidence_index collection)
  • Rate limiter — Per-session rate limiting with Firestore counters
  • Cloud Tasks — Delayed callback scheduling (10s inactivity)

Phase 4: Classification & Extraction Expansion (Feb 19-20, 2026)

Broader scam coverage and better evidence capture.

  • 6 new scam types — Investment, Insurance, Romance, Loan, Custom Duty, Crypto
  • IFSC code extraction — Bank branch identification
  • Aadhaar detection — 12-digit with Verhoeff checksum validation
  • PAN detection — ABCDE1234F format
  • Amount extraction — ₹/Rs patterns with Indian numbering
  • Phone number improvements — +91 prefix preservation, helpline filtering
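
For readers unfamiliar with the Verhoeff checksum mentioned above, the sketch below shows how a 12-digit Aadhaar candidate could be validated, along with a pattern for the ABCDE1234F PAN layout. The tables are the standard Verhoeff tables; the function names and the decision to strip spaces and hyphens first are assumptions for this example rather than the project's code.

```python
import re

# Standard Verhoeff tables: multiplication in the dihedral group D5,
# position-dependent permutation, and inverse.
_D = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [1, 2, 3, 4, 0, 6, 7, 8, 9, 5],
    [2, 3, 4, 0, 1, 7, 8, 9, 5, 6],
    [3, 4, 0, 1, 2, 8, 9, 5, 6, 7],
    [4, 0, 1, 2, 3, 9, 5, 6, 7, 8],
    [5, 9, 8, 7, 6, 0, 4, 3, 2, 1],
    [6, 5, 9, 8, 7, 1, 0, 4, 3, 2],
    [7, 6, 5, 9, 8, 2, 1, 0, 4, 3],
    [8, 7, 6, 5, 9, 3, 2, 1, 0, 4],
    [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
]
_P = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [1, 5, 7, 6, 2, 8, 3, 0, 9, 4],
    [5, 8, 0, 3, 7, 9, 6, 1, 4, 2],
    [8, 9, 1, 6, 0, 4, 3, 5, 2, 7],
    [9, 4, 5, 3, 1, 2, 6, 8, 7, 0],
    [4, 2, 8, 6, 5, 7, 3, 9, 0, 1],
    [2, 7, 9, 3, 8, 0, 6, 4, 1, 5],
    [7, 0, 4, 6, 9, 1, 3, 2, 5, 8],
]
_INV = [0, 4, 3, 2, 1, 5, 6, 7, 8, 9]


def verhoeff_check_digit(payload: str) -> int:
    """Compute the check digit to append to a string of decimal digits."""
    c = 0
    for i, ch in enumerate(reversed(payload)):
        c = _D[c][_P[(i + 1) % 8][int(ch)]]
    return _INV[c]


def verhoeff_valid(number: str) -> bool:
    """True if the digit string (check digit last) passes the Verhoeff check."""
    c = 0
    for i, ch in enumerate(reversed(number)):
        c = _D[c][_P[i % 8][int(ch)]]
    return c == 0


def is_valid_aadhaar(text: str) -> bool:
    """Assumed rule for this sketch: exactly 12 digits after removing spaces/hyphens, valid checksum."""
    digits = re.sub(r"[\s-]", "", text)
    return bool(re.fullmatch(r"[0-9]{12}", digits)) and verhoeff_valid(digits)


# PAN layout from the changelog entry: five letters, four digits, one letter.
PAN_RE = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")

print(verhoeff_valid("2363"))  # True: 3 is the Verhoeff check digit for 236
```

The checksum step filters out most random 12-digit strings, which reduces false positives compared with a bare digit-count pattern.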

Phase 5: GUVI Evaluation Optimization (Feb 21, 2026)

Maximizing evaluation scores through systematic improvements.

  • Per-turn callbacks — Send intelligence on every response, not just at conversation end
  • Scam detection from turn 1 — Always report scamDetected: true with initial classification
  • Response format enrichment — extractedIntelligence, engagementMetrics, agentNotes on every response
  • Strategy state machine — BUILDING_TRUST → EXTRACTING → DIRECT_PROBE → PIVOTING
  • Pipeline context — Dynamic prompt assembly with language detection and edge-case analysis
  • Bug fix — scamDetected duplication in callback payloads
  • Regex improvements — Higher extraction accuracy for UPI and phone patterns
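
The strategy state machine named above could be modeled roughly as follows. Only the four state names and their order come from the changelog; the strictly linear transitions, the `should_escalate` flag, and the `advance` helper are assumptions for this sketch, and the real system may branch or loop between states.

```python
from enum import Enum


class Strategy(Enum):
    BUILDING_TRUST = "BUILDING_TRUST"
    EXTRACTING = "EXTRACTING"
    DIRECT_PROBE = "DIRECT_PROBE"
    PIVOTING = "PIVOTING"


# Assumed linear order, read off the arrow notation in the changelog entry.
_NEXT = {
    Strategy.BUILDING_TRUST: Strategy.EXTRACTING,
    Strategy.EXTRACTING: Strategy.DIRECT_PROBE,
    Strategy.DIRECT_PROBE: Strategy.PIVOTING,
}


def advance(state: Strategy, should_escalate: bool) -> Strategy:
    """Move to the next strategy when the (hypothetical) escalation signal fires."""
    if not should_escalate:
        return state
    # PIVOTING has no successor in this sketch, so it stays put.
    return _NEXT.get(state, state)


state = Strategy.BUILDING_TRUST
for signal in (False, True, True, True, True):
    state = advance(state, signal)
print(state.value)  # PIVOTING
```

Keeping the transitions in a lookup table makes it straightforward to add or reorder states without touching the `advance` logic.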

Open Source Release (Mar 2026)

Transforming a private hackathon project into a public GitHub showcase.

  • PII sanitization — All personal info and infrastructure references replaced with placeholders
  • MkDocs documentation site — Architecture reference, educational chapters, deployment guides
  • "Building ScamShield" series — 10-chapter educational walkthrough
  • Contributing guides — Tutorials for adding personas and extractors
  • MIT License