
Chapter 6: Self-Correction --- The Strategy State Machine

What We Built

ScamShield AI does not generate the same kind of response on turn 1 as it does on turn 8. Early in a conversation, the system builds trust and plays the naive victim. Mid-conversation, it pivots to extracting payment details. Late in the conversation --- if direct extraction has stalled --- it tries different angles or backs off to rebuild rapport.

This adaptive behavior is powered by a strategy state machine with four states: BUILDING_TRUST, EXTRACTING, DIRECT_PROBE, and PIVOTING. The transitions are driven by concrete signals: confidence thresholds, evidence gathered, messages since last evidence, and scammer engagement level. The state machine provides structure; the LLM provides creativity within that structure.
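The four states and their driving signals can be sketched as data. This is a minimal sketch for orientation, not the production code --- the real transition logic lives in determine_strategy_adjustment(), shown later in this chapter; the Strategy enum here is a hypothetical restatement of the state names.

```python
# Sketch of the four strategy states described above. The enum is a
# hypothetical restatement; the orchestrator stores the state as a plain
# string on the session.
from enum import Enum

class Strategy(str, Enum):
    BUILDING_TRUST = "BUILDING_TRUST"  # play the naive victim, build rapport
    EXTRACTING = "EXTRACTING"          # pivot to asking for payment details
    DIRECT_PROBE = "DIRECT_PROBE"      # extraction stalled: be more direct
    PIVOTING = "PIVOTING"              # payment captured: extract identity

# Signals that drive transitions (tracked per session across turns):
#   message count, classification confidence, messages since last evidence,
#   high-value evidence gathered (UPI/bank), scammer engagement level.
```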

This chapter explains why static responses fail, how the state machine works, and how it integrates with the orchestrator's dynamic prompt assembly.

Why This Approach

The Problem with Static Responses

Early versions of ScamShield AI used a simple pattern: classify the scam, select a persona, and let the LLM improvise. This worked surprisingly well for 3-4 turns. Then it broke down.

The failure modes were predictable:

  • The LLM got stuck in loops. Without explicit guidance to shift tactics, Gemini would keep asking the same kinds of questions. "Please tell me your name sir" on turn 2, 3, 4, 5...
  • Extraction timing was wrong. The LLM would sometimes ask for UPI details on turn 1 (too aggressive, scammer gets suspicious) or never ask at all (too passive, no intelligence extracted).
  • No adaptation to scammer behavior. When a scammer started giving short, suspicious responses, the LLM did not know to back off and rebuild trust. When a scammer was highly engaged, the LLM did not know to push harder.

We needed a mechanism that could observe conversation dynamics and adjust the LLM's approach without replacing the LLM's natural language capabilities.

Why Not Pure LLM Control?

We considered letting the LLM decide its own strategy entirely. The problem: LLMs are not reliable state trackers. They can summarize what happened, but they struggle to consistently implement multi-turn strategies that depend on counting messages, tracking what evidence has been gathered, and making threshold-based decisions.

We also considered a pure rule-based system with templated responses. That was too rigid --- scam conversations are inherently unpredictable, and templates sound robotic.

The hybrid approach gives us the best of both worlds: deterministic strategy transitions (rules) with creative response generation (LLM).

The Code

The State Machine

The strategy state machine lives in functions/engine/orchestrator.py in the determine_strategy_adjustment() function. Here is the full state diagram:

stateDiagram-v2
    [*] --> BUILDING_TRUST

    BUILDING_TRUST --> EXTRACTING : msg_count >= 3 AND confidence > 0.6
    BUILDING_TRUST --> BUILDING_TRUST : NOT_SCAM classification

    EXTRACTING --> DIRECT_PROBE : msgs_since_evidence >= 4 AND no UPI/bank
    EXTRACTING --> PIVOTING : UPI or bank account extracted

    DIRECT_PROBE --> BUILDING_TRUST : Scammer disengaging (short msgs) + 2 stalled turns
    DIRECT_PROBE --> DIRECT_PROBE : msgs_since_evidence >= 3 (vary tactics)

    PIVOTING --> PIVOTING : Continue extracting scammer identity

    note right of BUILDING_TRUST
        Play naive victim.
        Ask verification questions.
        Build rapport.
    end note

    note right of EXTRACTING
        Ask for payment details.
        Express willingness to comply.
        Request UPI/bank info.
    end note

    note right of DIRECT_PROBE
        Be more direct.
        "Please send your UPI ID."
        Try alternate angles.
    end note

    note right of PIVOTING
        Payment captured.
        Now extract scammer identity:
        name, employee ID, address.
    end note

Transition Logic

Each state has explicit transition conditions. The function returns a dict with the new strategy, the reason for the adjustment, and specific tactics the LLM should use:

def determine_strategy_adjustment(
    session: SessionState,
    evidence: ExtractedIntelligence,
    scammer_message: str,
    scam_type: str = "UNKNOWN",
) -> dict:
    # NOT_SCAM: stay in trust-building mode
    if scam_type == "NOT_SCAM":
        return {
            "new_strategy": "BUILDING_TRUST",
            "adjustment_reason": None,
            "tactics_suggestion": "This conversation does not appear to be a scam. "
                "Stay in character and have a natural conversation.",
        }

    current_strategy = session.strategy_state
    msg_count = session.message_count
    msgs_since_evidence = session.messages_since_evidence
    has_high_value = bool(evidence.upiIds) or bool(evidence.bankAccounts)
    scammer_engaged = len(scammer_message) > 50  # Short responses = suspicious

    result = {"new_strategy": current_strategy, "adjustment_reason": None, "tactics_suggestion": None}

    if current_strategy == "BUILDING_TRUST":
        if msg_count >= 3 and session.confidence > 0.6:
            result["new_strategy"] = "EXTRACTING"
            result["adjustment_reason"] = "Trust established, begin extraction"
            result["tactics_suggestion"] = (
                "Start asking about payment methods. Express willingness to "
                "pay/comply but ask for their payment details first for 'verification'."
            )

    elif current_strategy == "EXTRACTING":
        if msgs_since_evidence >= 4 and not has_high_value:
            result["new_strategy"] = "DIRECT_PROBE"
            result["adjustment_reason"] = f"No evidence after {msgs_since_evidence} messages"
            result["tactics_suggestion"] = (
                "Be more direct. Ask specifically: 'Please send your UPI ID "
                "so I can pay' or 'What bank account should I transfer to?'"
            )
        elif has_high_value:
            result["new_strategy"] = "PIVOTING"
            result["adjustment_reason"] = "Payment details obtained, extract scammer identity"
            result["tactics_suggestion"] = (
                "Payment details captured. Now extract THEIR details: "
                "employee ID, supervisor name, office address, official email."
            )

    elif current_strategy == "DIRECT_PROBE":
        if not scammer_engaged and msgs_since_evidence >= 2:
            result["new_strategy"] = "BUILDING_TRUST"
            result["adjustment_reason"] = "Scammer seems suspicious, rebuilding trust"
            result["tactics_suggestion"] = (
                "Back off slightly. Express confusion and fear. Show you're "
                "a naive victim who wants to comply but doesn't understand."
            )
        elif msgs_since_evidence >= 3:
            result["adjustment_reason"] = "Direct approach not working, vary tactics"
            result["tactics_suggestion"] = (
                "Try a different angle: mention you don't have a smartphone/UPI, "
                "ask if they can call you, or ask for their WhatsApp number."
            )

    elif current_strategy == "PIVOTING":
        result["tactics_suggestion"] = (
            "Continue extracting scammer identity. Ask for: their name, "
            "employee ID, department, office address, supervisor's name."
        )

    return result

The scammer_engaged Heuristic

We use message length as a proxy for scammer engagement. Messages over 50 characters suggest the scammer is still invested. Short, terse responses suggest they are getting suspicious or losing interest. This is a rough heuristic, but it works surprisingly well in practice --- engaged scammers tend to write longer messages with instructions and threats, while suspicious scammers give one-word answers.
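The heuristic reduces to a single comparison. A minimal demonstration, using made-up sample messages:

```python
# Minimal demonstration of the length-based engagement heuristic.
# The 50-character threshold matches determine_strategy_adjustment();
# the sample messages are invented for illustration.
ENGAGEMENT_THRESHOLD = 50

def scammer_engaged(message: str) -> bool:
    return len(message) > ENGAGEMENT_THRESHOLD

engaged_msg = (
    "Sir your account will be blocked in 30 minutes unless you verify "
    "your details and complete the payment immediately."
)
terse_msg = "ok"

print(scammer_engaged(engaged_msg))  # True: long, instruction-heavy message
print(scammer_engaged(terse_msg))    # False: terse reply suggests suspicion
```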

Tracking State Across Turns

The state machine needs persistent counters to function. These are stored on the SessionState model and saved to Firestore on every turn:

class SessionState(BaseModel):
    # ... standard fields ...

    # Self-correction tracking
    strategy_state: str = "BUILDING_TRUST"
    messages_since_evidence: int = 0   # Counter for stalled conversations
    high_value_extracted: bool = False  # True if UPI/bank already obtained
    extraction_attempts: int = 0       # How many times we've tried to extract

The handler updates messages_since_evidence on every turn:

# If new high-value evidence found this turn, reset the counter
previous_high_value = session.high_value_extracted
new_evidence_this_turn = has_high_value and not previous_high_value

if new_evidence_this_turn:
    messages_since_evidence = 0
    logger.info("Self-correction: High-value evidence found, resetting counter")
else:
    messages_since_evidence = session.messages_since_evidence + 1

This counter is what drives the EXTRACTING -> DIRECT_PROBE transition: if four messages pass without new high-value evidence, the strategy escalates.
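To see the escalation trigger concretely, here is a sketch simulating four stalled turns. The reset-or-increment rule mirrors the handler snippet above; the simulated turn sequence is invented for illustration.

```python
# Simulate the stalled-extraction counter across turns. The update rule
# mirrors the handler logic above; the turn sequence is hypothetical.
def update_counter(messages_since_evidence: int, new_high_value: bool) -> int:
    # New high-value evidence resets the counter; otherwise it increments.
    return 0 if new_high_value else messages_since_evidence + 1

counter = 0
for found_evidence in [False, False, False, False]:  # four turns, no evidence
    counter = update_counter(counter, found_evidence)

# After four stalled turns, the EXTRACTING -> DIRECT_PROBE condition
# (msgs_since_evidence >= 4) fires.
print(counter >= 4)  # True
```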

Dynamic Prompt Assembly with PipelineContext

The strategy adjustment does not directly modify the LLM prompt. Instead, it flows through the PipelineContext, which collects dynamic prompt sections from multiple pipeline stages and assembles them in priority order:

@dataclass
class PipelineContext:
    detected_language: str = "English"
    edge_case_context: Optional[str] = None
    conversation_stage: Optional[str] = None
    _prompt_sections: List[Tuple[int, str, str]] = field(default_factory=list)

    def add_prompt_section(self, priority: int, label: str, content: str) -> None:
        self._prompt_sections.append((priority, label, content))

    def get_prompt_sections(self) -> str:
        sorted_sections = sorted(self._prompt_sections, key=lambda x: x[0])
        return "\n".join(content for _, _, content in sorted_sections)

Priority conventions:

  Priority   Source                 Example
  10         Edge-case context      "Scammer appears to be a bot --- use shorter sentences"
  20         Language instruction   "Respond in Hindi-English mix (Hinglish)"
  25         Quality directives     "TURN 4 OF 10 --- ask 2 investigative questions"
  30         Engagement tactics     "CURRENT STRATEGY: EXTRACTING --- ask for payment details"
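A usage sketch makes the priority ordering concrete. The class is restated from the definition above so the example is self-contained; the section contents are abbreviated stand-ins:

```python
# Usage sketch of PipelineContext priority ordering. The class restates the
# dataclass shown above; section strings are abbreviated for illustration.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class PipelineContext:
    detected_language: str = "English"
    _prompt_sections: List[Tuple[int, str, str]] = field(default_factory=list)

    def add_prompt_section(self, priority: int, label: str, content: str) -> None:
        self._prompt_sections.append((priority, label, content))

    def get_prompt_sections(self) -> str:
        sorted_sections = sorted(self._prompt_sections, key=lambda x: x[0])
        return "\n".join(content for _, _, content in sorted_sections)

ctx = PipelineContext()
# Sections can be added in any order by any pipeline stage...
ctx.add_prompt_section(30, "engagement", "CURRENT STRATEGY: EXTRACTING")
ctx.add_prompt_section(10, "edge_case", "Scammer appears to be a bot")
ctx.add_prompt_section(20, "language", "Respond in Hinglish")

# ...but they always assemble in priority order:
print(ctx.get_prompt_sections())
# Scammer appears to be a bot
# Respond in Hinglish
# CURRENT STRATEGY: EXTRACTING
```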

The orchestrator injects turn-aware quality directives that evolve across the conversation:

def _inject_quality_directives(self, context, turn_number):
    if turn_number <= 3:
        directive = (
            "TURN {turn_number} OF 10 -- EARLY ENGAGEMENT\n"
            "  (1) Ask 2 identity verification questions\n"
            "  (2) Express confusion about at least 1 red flag\n"
            "  (3) Show willingness to cooperate\n"
            "  (4) Ask for their phone number 'so I can call back'"
        )
    elif turn_number <= 6:
        directive = (
            "TURN {turn_number} OF 10 -- INVESTIGATION PHASE\n"
            "  (1) Ask 2 investigative questions\n"
            "  (2) Call out 2 red flags explicitly\n"
            "  (3) Attempt to elicit phone, email, or WhatsApp\n"
            "  (4) Ask for written proof"
        )
    else:
        directive = (
            "TURN {turn_number} OF 10 -- EXTRACTION PHASE\n"
            "  (1) Demand proof of identity\n"
            "  (2) Call out at least 2 red flags\n"
            "  (3) Final attempt to get all contact details\n"
            "  (4) Ask for full name and employee ID"
        )
    context.add_prompt_section(priority=25, label="quality", content=directive)

The strategy context and quality directives are concatenated into the final prompt alongside the persona instructions, cross-session intelligence, and conversation history. The LLM sees a structured set of instructions but has full creative freedom in how it follows them.

The Strategy in the Prompt

When the orchestrator generates a response, it builds a strategy context block from the adjustment result:

strategy_context = f"""
CURRENT STRATEGY: {strategy_adjustment.get('new_strategy', 'EXTRACTING')}
{f"Reason: {strategy_adjustment['adjustment_reason']}" if ... else ""}

TACTICS TO USE NOW:
{strategy_adjustment['tactics_suggestion']}
"""

This context is passed to Gemini alongside the persona prompt. The LLM knows what to do (extract payment details) and why (trust established after 3 messages) but decides how to do it (the exact words, the tone, the cultural nuances).
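Rendering the block from a sample adjustment dict shows what the LLM actually sees. The field names match the return value of determine_strategy_adjustment(); the sample values are invented for illustration:

```python
# Render the strategy context block from a sample adjustment dict. Field
# names match determine_strategy_adjustment(); the values are hypothetical.
strategy_adjustment = {
    "new_strategy": "EXTRACTING",
    "adjustment_reason": "Trust established, begin extraction",
    "tactics_suggestion": "Start asking about payment methods.",
}

reason = strategy_adjustment.get("adjustment_reason")
strategy_context = f"""
CURRENT STRATEGY: {strategy_adjustment.get('new_strategy', 'EXTRACTING')}
{f"Reason: {reason}" if reason else ""}

TACTICS TO USE NOW:
{strategy_adjustment['tactics_suggestion']}
"""

print(strategy_context)
```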

Key Architectural Decision

State machine vs. LLM-only control: we chose a hybrid.

The state machine handles what LLMs are bad at: counting, tracking, and threshold-based decisions. The LLM handles what rule systems are bad at: generating natural, contextually appropriate language.

The state machine answers: "What phase of the conversation are we in, and what kind of action should we take?"

The LLM answers: "Given this phase and these tactics, what exactly should I say to this particular scammer in this particular context?"

The Key Insight

State machines and LLMs are complementary, not competing. The state machine is the conductor; the LLM is the orchestra. Neither works well alone. Together, they produce conversations that are both strategically coherent and linguistically natural.

We considered three alternatives:

  1. Pure LLM control. Add instructions like "After 3 messages, start asking for payment details." The problem: the LLM loses count and does not reliably implement multi-turn plans.

  2. Pure rule-based with templates. Each state maps to a set of response templates. The problem: conversations sound robotic and fail when scammers deviate from expected patterns.

  3. Reinforcement learning. Train a policy network to decide transitions. The problem: insufficient data (we had hundreds of conversations, not millions), and the reward signal (evidence extracted) is sparse and delayed.

The hybrid approach required the least training data (zero --- it is all rule-based transitions), was the most debuggable (log the state transition and reason), and produced the most natural conversations.

What We Learned

  1. Message length is a useful engagement signal. We were skeptical that len(scammer_message) > 50 would work as an engagement heuristic, but in practice it discriminates well between engaged scammers (who write paragraphs of threats and instructions) and suspicious ones (who give one-word answers).

  2. The "back off" transition is critical. The DIRECT_PROBE -> BUILDING_TRUST transition --- triggered when the scammer is disengaging --- prevents a common failure mode where aggressive extraction pushes the scammer away. Sometimes you have to retreat to advance.

  3. State machines need persistent counters. The messages_since_evidence counter is the most important signal for triggering escalation. Without it, the system has no way to detect that extraction is stalled. Storing it in Firestore means it survives cold starts.

  4. Turn-aware quality directives pay off. Injecting specific instructions for each conversation phase (early, mid, late) produces measurably better responses than generic "be engaging" instructions. The LLM responds well to numbered checklists.

  5. Log the adjustment reason. Every strategy transition is logged with a human-readable reason ("No evidence after 4 messages, switching to direct approach"). This made debugging and tuning the transition thresholds dramatically easier than trying to infer what happened from conversation transcripts alone.