struckdown

Temporal Extraction Test Suite

Comprehensive test suite for validating temporal type extractions (dates, datetimes, times, durations) in Struckdown.

Running the Tests

Basic Usage

uv run python examples/temporal_test_cases.py

With Verbose Output

See detailed results for each test case:

uv run python examples/temporal_test_cases.py --verbose
# or
uv run python examples/temporal_test_cases.py -v

Stop on First Error

Useful for debugging:

uv run python examples/temporal_test_cases.py --stop-on-error
# or
uv run python examples/temporal_test_cases.py -x

Combined Options

uv run python examples/temporal_test_cases.py --verbose --stop-on-error

Test Coverage

The test suite includes 38 test cases across 12 categories:

1. Single Date Extractions (4 tests)

2. Recurring Patterns - Single Value (2 tests)

3. Recurring Patterns - Lists (4 tests)

4. Multiple Explicit Dates (2 tests)

5. Quantifiers (4 tests)

6. DateTime Extractions (4 tests)

7. Time Extractions (4 tests)

8. Duration Extractions (5 tests)

9. Mixed Temporal Types (2 tests)

10. Edge Cases (3 tests)

11. Year Inference (2 tests)

12. Complex Patterns (2 tests)

Test Results Format

================================================================================
TEMPORAL EXTRACTION TEST SUITE
================================================================================

Total test cases: 38
Categories: 12

================================================================================
CATEGORY: Single Dates - Explicit
================================================================================

[1/2] Extract explicit date (Jan 15, 2024)... ✓ PASSED
[2/2] Extract ISO format date... ✓ PASSED

...

================================================================================
TEST SUMMARY
================================================================================

✓ Passed: 38/38
✗ Failed: 0/38
⚠ Errors: 0/38

================================================================================
SUCCESS RATE: 100.0%
================================================================================

Key Features Tested

1. Pattern Detection & RRULE Expansion

Tests verify that natural language patterns like “first 2 Tuesdays in September” are detected and expanded into concrete dates via RRULE.
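
To make the expected expansion concrete, the sketch below computes the dates that "first 2 Tuesdays in September" should resolve to for a given year. It uses only the standard library rather than an RRULE engine, and the helper name `first_n_weekdays` is hypothetical, not part of Struckdown. It shows the target output of such an expansion, not how Struckdown performs it internally.

```python
from datetime import date, timedelta

TUESDAY = 1  # date.weekday(): Monday is 0, so Tuesday is 1


def first_n_weekdays(year: int, month: int, weekday: int, n: int) -> list[date]:
    """Return the first n dates in the given month that fall on `weekday`.

    Hypothetical helper for illustration; not a Struckdown API.
    """
    first = date(year, month, 1)
    # Days to step forward from the 1st to reach the first matching weekday.
    offset = (weekday - first.weekday()) % 7
    start = first + timedelta(days=offset)
    candidates = (start + timedelta(weeks=i) for i in range(n))
    # Drop anything that spilled into the next month (e.g. when n > 5).
    return [d for d in candidates if d.month == month]


print(first_n_weekdays(2024, 9, TUESDAY, 2))
# [datetime.date(2024, 9, 3), datetime.date(2024, 9, 10)]
```

A test for this kind of pattern can then compare the extracted list against dates computed this way, instead of hard-coding them per year.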

2. Year Context

Tests verify that:

3. Single vs List Extraction

Tests verify:

4. Temporal Context Injection

Tests verify that date_rule extractions receive temporal context hints, ensuring proper year inference.

Adding New Tests

To add a new test case, append to the TEST_CASES list in temporal_test_cases.py:

# Assumes date is available via: from datetime import date
{
    "category": "Your Category",
    "description": "Brief description of what's being tested",
    "prompt": "Your prompt with [[date:var]] or other temporal slots",
    "expected_type": date,  # or datetime, time, timedelta, list, etc.
    "validate": lambda r: (
        # Your validation logic
        isinstance(r["var"], date)
        and r["var"].month == 9
    ),
},
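
Before running the full suite, it can help to check a new `validate` callable against hand-built result dictionaries. The sketch below does that for an entry shaped like the one above. The category name, prompt text, and result dictionaries are illustrative stand-ins, not real Struckdown output.

```python
from datetime import date

# An entry in the same shape as the template above (values are illustrative).
test_case = {
    "category": "Single Date Extractions",
    "description": "Extracted date falls in September",
    "prompt": "The meeting is on September 3, 2024. [[date:var]]",
    "expected_type": date,
    "validate": lambda r: isinstance(r["var"], date) and r["var"].month == 9,
}

# Hand-built stand-ins for extraction results.
good_result = {"var": date(2024, 9, 3)}
wrong_month = {"var": date(2024, 10, 3)}
wrong_type = {"var": "2024-09-03"}  # a string, not a date object

print(test_case["validate"](good_result))  # True
print(test_case["validate"](wrong_month))  # False
print(test_case["validate"](wrong_type))   # False
```

Putting the `isinstance` check first means a wrongly typed value fails cleanly instead of raising `AttributeError` on `.month`. Note that `datetime` is a subclass of `date`, so an `isinstance(..., date)` check also accepts `datetime` values.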

Validation Tips

Cache Management

The test suite clears the Struckdown cache before each run. If you encounter issues:

rm -rf ~/.struckdown/cache

Exit Codes

Perfect for CI/CD integration!
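
As a rough sketch of CI usage, the wrapper below runs a command and reports whether it succeeded. It assumes the script follows the usual convention of exiting with status 0 when every test passes and a nonzero status otherwise. The specific codes are not confirmed here, and `run_suite` and `SUITE_CMD` are hypothetical names.

```python
import subprocess

# Command taken from the "Basic Usage" section.
SUITE_CMD = ["uv", "run", "python", "examples/temporal_test_cases.py"]


def run_suite(cmd: list[str]) -> bool:
    """Run `cmd` and return True if it exited with status 0.

    Assumes exit status 0 means all tests passed (an assumption,
    not a documented guarantee of the test script).
    """
    completed = subprocess.run(cmd)
    return completed.returncode == 0
```

A CI step could call `run_suite(SUITE_CMD)` and fail the job when it returns `False`.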

Performance

Running all 38 tests typically takes 2-5 minutes depending on:

Use --stop-on-error for faster debugging iterations.