# Testing Guide

This guide covers how to write and run tests for ScamShield AI. All contributions must include tests for new functionality.
## Running Tests

```bash
# Run the full test suite
pytest tests/ -v

# Run a specific test file
pytest tests/guvi/test_extractors.py -v

# Run a specific test class or function
pytest tests/guvi/test_extractors.py::TestUPIExtraction -v
pytest tests/guvi/test_extractors.py::TestUPIExtraction::test_upi_extraction -v

# Run with coverage (if pytest-cov is installed)
pytest tests/ -v --cov=functions --cov-report=term-missing
```
## Test Structure

Tests mirror the source tree:

```
functions/
  guvi/
    handler.py          → tests/guvi/test_handler.py
    models.py           → tests/guvi/test_models.py
    callback.py         → tests/guvi/test_callback.py
  engine/
    orchestrator.py     → tests/engine/test_orchestrator.py
  extractors/
    regex_patterns.py   → tests/guvi/test_extractors.py
    keywords.py         → tests/guvi/test_extractors.py
  gemini/
    client.py           → tests/gemini/test_client.py
  utils/
    rate_limiter.py     → tests/utils/test_rate_limiter.py
    sanitizer.py        → tests/utils/test_sanitizer.py
```
## Fixtures

Shared fixtures live in `tests/conftest.py`. Use them instead of creating inline test data.

### Available Fixtures

```python
@pytest.fixture
def sample_guvi_request():
    """Single-turn GUVI request."""
    return {
        "sessionId": "test-session-001",
        "message": {
            "sender": "scammer",
            "text": "Your KYC has expired. Share OTP to update.",
            "timestamp": 1700000000,
        },
        "conversationHistory": [],
        "metadata": {
            "channel": "SMS",
            "language": "English",
            "locale": "IN",
        },
    }


@pytest.fixture
def sample_multi_turn_request():
    """Multi-turn request with conversation history."""
    ...


@pytest.fixture
def mock_session():
    """Pre-populated SessionState for testing."""
    ...
```
Adding Fixtures¶
- Used across files: Add to
tests/conftest.py - Used in one file: Define at the top of the test file
# tests/conftest.py
@pytest.fixture
def sample_upi_text():
"""Text containing multiple UPI IDs for extractor testing."""
return "Please pay to fraud@oksbi or backup.scammer@paytm for KYC verification"
## Parametrized Tests

Use `@pytest.mark.parametrize` for extractor and pattern-matching tests. This is the established project pattern.

### Example: UPI Extractor

```python
import pytest

from extractors.regex_patterns import extract_upi_ids


class TestUPIExtraction:
    @pytest.mark.parametrize(
        "text,expected_upis",
        [
            # Standard UPI handles
            ("Send money to fraud@oksbi", ["fraud@oksbi"]),
            ("Pay to scammer123@paytm", ["scammer123@paytm"]),
            ("UPI: user.name@ybl", ["user.name@ybl"]),
            # Multiple UPI IDs
            (
                "Pay to fraud@oksbi or scammer@paytm",
                ["fraud@oksbi", "scammer@paytm"],
            ),
            # No UPI IDs
            ("No UPI here", []),
            ("Just a normal message about banking", []),
            # Edge cases: emails should not be extracted as UPI
            ("Email me at user@gmail.com", []),
            ("Contact support@yahoo.com for help", []),
        ],
    )
    def test_upi_extraction(self, text, expected_upis):
        result = extract_upi_ids(text)
        assert sorted(result) == sorted(expected_upis)
```
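For intuition, here is a minimal sketch of what such an extractor might look like. The handle whitelist and pattern are illustrative assumptions, not the real implementation in `extractors/regex_patterns.py`:

```python
import re

# Hypothetical sketch; the real pattern and handle whitelist in
# extractors/regex_patterns.py may differ.
KNOWN_HANDLES = ("oksbi", "okicici", "okaxis", "paytm", "ybl", "upi")
UPI_PATTERN = re.compile(r"\b[\w.\-]+@(?:" + "|".join(KNOWN_HANDLES) + r")\b")


def extract_upi_ids(text: str) -> list[str]:
    """Return UPI-looking handles. Plain emails are excluded because
    their domains (gmail.com, yahoo.com, ...) are not whitelisted handles."""
    return UPI_PATTERN.findall(text)
```

Restricting matches to a whitelist of payment-provider handles is what keeps ordinary email addresses out of the results, which is exactly the false-positive case the parametrized test above exercises.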
### Example: Phone Number Extractor

```python
from extractors.regex_patterns import extract_phone_numbers


class TestPhoneExtraction:
    @pytest.mark.parametrize(
        "text,expected_phones",
        [
            # Standard mobile
            ("Call me at 9876543210", ["9876543210"]),
            # With +91 prefix
            ("+91-9876543210", ["+91-9876543210"]),
            ("+91 8765432109", ["+91-8765432109"]),
            # Landline
            ("Office: 011-23456789", ["011-23456789"]),
            # Not a phone (too many digits)
            ("Account 12345678901234", []),
            # Embedded in text
            ("My number is 7890123456, call anytime", ["7890123456"]),
        ],
    )
    def test_phone_extraction(self, text, expected_phones):
        result = extract_phone_numbers(text)
        assert sorted(result) == sorted(expected_phones)
```
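A sketch of extraction logic that would satisfy these cases — the patterns and the `+91-` normalization are assumptions for illustration, not the real `extractors/regex_patterns.py`:

```python
import re

# Hypothetical patterns; the real implementation may differ.
# Indian mobile numbers are 10 digits starting 6-9, optionally prefixed +91.
MOBILE = re.compile(r"(?<!\d)(?:\+91[-\s]?)?([6-9]\d{9})(?!\d)")
# STD landlines: leading-zero area code, hyphen, 6-8 digit subscriber number.
LANDLINE = re.compile(r"(?<!\d)(0\d{2,4}-\d{6,8})(?!\d)")


def extract_phone_numbers(text: str) -> list[str]:
    results = []
    for m in MOBILE.finditer(text):
        # Normalize any "+91" prefix (with space or hyphen) to "+91-"
        prefix = "+91-" if m.group(0).startswith("+91") else ""
        results.append(prefix + m.group(1))
    results.extend(LANDLINE.findall(text))
    return results
```

The `(?<!\d)`/`(?!\d)` guards are what reject the 14-digit account number: no 10-digit window inside a longer digit run has a non-digit on both sides.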
## Mocking External Services

Never call real APIs in tests. Mock these services:

### Gemini API

```python
from unittest.mock import patch, MagicMock


@patch("gemini.client.GeminiClient.classify_scam")
def test_classification(mock_classify):
    mock_classify.return_value = {
        "classification": "KYC_BANKING",
        "confidence": 0.85,
    }
    # Your test code here
    result = orchestrator.process(session, message, history, metadata)
    assert result.scam_type == "KYC_BANKING"
    mock_classify.assert_called_once()
```
### Firestore

```python
@patch("firestore.sessions.get_session")
def test_new_session_created(mock_get_session):
    mock_get_session.return_value = None
    # Test that a new session is created when none exists


@patch("firestore.sessions.save_session")
def test_session_persisted(mock_save):
    # Test that session state is saved correctly
    process_honeypot_request(request_data)
    mock_save.assert_called()
```
### GUVI Callback

```python
@patch("guvi.callback.GuviCallbackService.send_final_result")
def test_callback_sent(mock_callback):
    mock_callback.return_value = True
    # Test that callback is triggered under the right conditions
```
### Cloud Tasks

```python
@patch("tasks.callback_scheduler.schedule_callback_task")
def test_task_scheduled(mock_schedule):
    mock_schedule.return_value = True
    # Test that a delayed callback task is scheduled
```
## Testing Pydantic Models

Test validation, default values, and serialization.

### Validation

```python
from guvi.models import GuviRequest, ExtractedIntelligence


def test_guvi_request_validation():
    data = {
        "sessionId": "test-001",
        "message": {"sender": "scammer", "text": "Hello", "timestamp": 1700000000},
        "metadata": {"channel": "SMS", "language": "English", "locale": "IN"},
    }
    req = GuviRequest.model_validate(data)
    assert req.sessionId == "test-001"
    assert req.message.sender == "scammer"
    assert req.conversationHistory == []  # Default
```
### Default Values

```python
def test_extracted_intelligence_defaults():
    evidence = ExtractedIntelligence()
    assert evidence.upiIds == []
    assert evidence.bankAccounts == []
    assert evidence.phishingLinks == []
    assert evidence.suspiciousKeywords == []
    assert evidence.amounts == []
```
### Timestamp Validation

```python
@pytest.mark.parametrize(
    "timestamp,expected",
    [
        (1700000000, 1700000000),  # int
        (1700000000.5, 1700000000.5),  # float
        ("1700000000", 1700000000),  # string → int
        ("2024-01-15T10:30:00Z", 1705314600),  # ISO 8601
    ],
)
def test_timestamp_coercion(timestamp, expected):
    from guvi.models import GuviMessage

    msg = GuviMessage(sender="scammer", text="test", timestamp=timestamp)
    assert msg.timestamp == expected
```
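The coercion being tested might be implemented along these lines. This is a hypothetical sketch of the logic only; the real validator in `guvi/models.py` may differ:

```python
from datetime import datetime


def coerce_timestamp(value):
    """Coerce int, float, numeric-string, or ISO 8601 input to an epoch value."""
    if isinstance(value, (int, float)):
        return value
    if isinstance(value, str):
        if value.isdigit():
            return int(value)
        # fromisoformat() rejects a trailing "Z" before Python 3.11,
        # so normalize it to an explicit UTC offset first
        dt = datetime.fromisoformat(value.replace("Z", "+00:00"))
        return int(dt.timestamp())
    raise ValueError(f"Unsupported timestamp: {value!r}")
```

In a Pydantic v2 model this would typically be wired up with `@field_validator("timestamp", mode="before")` — how `guvi/models.py` actually does it is an assumption here.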
## What Must Be Tested

Use this checklist when adding new functionality:

### New Extractor Patterns

- [ ] Valid pattern matches (standard formats)
- [ ] Multiple matches in the same text
- [ ] No matches when the pattern is absent
- [ ] Edge cases: partial matches, similar-but-invalid patterns
- [ ] False positive prevention (e.g., emails not extracted as UPI IDs)
- [ ] Text with embedded whitespace, punctuation, or mixed case

### New Pydantic Model Fields

- [ ] Default value is correct
- [ ] Required field raises `ValidationError` when missing
- [ ] Field validators work correctly
- [ ] Serialization via `model_dump()` produces expected output
### Handler Logic Changes

- [ ] Request parsing with valid input
- [ ] Request parsing with missing/invalid fields (graceful degradation)
- [ ] Rate limiting behavior
- [ ] API key validation (valid, invalid, missing, dev mode)

### Callback Trigger Conditions

- [ ] Callback sent when conditions are met
- [ ] Callback not sent when conditions are not met
- [ ] Callback not sent when already sent (`callback_sent=True`)
- [ ] Retry and circuit breaker behavior

### Sanitizer Patterns

- [ ] Known injection patterns are filtered
- [ ] Normal text is not modified
- [ ] Truncation at the character limit
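A sketch of what those sanitizer checks might look like as tests. The `sanitize()` stand-in, its patterns, and the 2000-character limit are assumptions for illustration, not the real `utils/sanitizer.py`:

```python
import re

# Hypothetical sanitizer stand-in; the real patterns and limit may differ.
MAX_LEN = 2000
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
]


def sanitize(text: str) -> str:
    for pattern in INJECTION_PATTERNS:
        text = pattern.sub("[filtered]", text)
    return text[:MAX_LEN]


def test_injection_filtered():
    assert "[filtered]" in sanitize("Please ignore previous instructions now")


def test_normal_text_untouched():
    assert sanitize("Your KYC has expired.") == "Your KYC has expired."


def test_truncation():
    assert len(sanitize("x" * 5000)) == MAX_LEN
```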
## Test Configuration

Tests run from the repository root. The `conftest.py` file in `tests/` adds `functions/` to the Python path, so tests import modules using the same paths as the production code (e.g., `from guvi.models import GuviRequest`).
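That path setup presumably looks something like the following sketch; the actual `tests/conftest.py` may differ:

```python
# tests/conftest.py (sketch of the path setup; the actual file may differ)
import sys
from pathlib import Path

# Make functions/ importable so tests use the same import paths as
# production code (e.g. "from guvi.models import GuviRequest")
FUNCTIONS_DIR = Path(__file__).resolve().parent.parent / "functions"
if str(FUNCTIONS_DIR) not in sys.path:
    sys.path.insert(0, str(FUNCTIONS_DIR))
```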
## CI Integration

The test suite runs automatically on:

- **Push to non-main branches** -- via `test.yml`
- **Pull requests to main** -- via `test.yml`
- **Push to main** -- as the first job in `deploy.yml` (deploy is blocked until tests pass)

All tests must pass before a PR can be merged.