verifies
Declare embedded test cases for a machine. The verifies section contains test blocks that validate the machine’s behavior with specific inputs, expected outputs, and optional step mocks. Tests run during mashin test and are the primary mechanism for ensuring a machine works correctly without calling real APIs, LLMs, or external services.
Every machine should have at least two test cases: a happy path and an edge case.
When to use
Use verifies for:
- Validating that the machine produces correct output for known inputs
- Testing edge cases and error conditions
- Mocking LLM and effect step responses for deterministic, fast, free tests
- Regression testing after changes
Use achieves > for example for specification-by-example (happy path documentation). Use verifies for thorough test coverage including edge cases, error paths, and mocked dependencies.
Syntax
    verifies
      test "<description>"
        given { <input_field>: <value>, ... }
        expect { <output_field>: <value>, ... }
        assuming <step_name> { <field>: <mock_value>, ... }

Components
| Component | Required | Description |
|---|---|---|
| test "<description>" | Yes | Human-readable test name. Should describe what is being tested. |
| given { ... } | Yes | Input values for the test. Must match the accepts contract. |
| expect { ... } | Yes | Expected output values. Matched against the machine’s actual output. |
| assuming <step> { ... } | No | Mock return values for specific steps. Replaces real execution. |
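
The expect row above says expected values are matched against the machine's actual output. Several tests on this page list only some of the output fields in expect, which suggests the comparison is made field by field and that unlisted fields are ignored. The Python sketch below models that reading. It is a conceptual illustration under that assumption, not mashin's implementation, and the function name unmet_expectations is hypothetical.

```python
from typing import Any, Dict


def unmet_expectations(expected: Dict[str, Any], actual: Dict[str, Any]) -> Dict[str, Any]:
    """Return the expected fields whose values the actual output does not match."""
    missing = object()  # sentinel, so an absent field differs from a None value
    return {
        field: want
        for field, want in expected.items()
        if actual.get(field, missing) != want
    }


# Output of the mocked "negative review" run from the examples on this page.
actual = {"sentiment": "negative", "confidence": 0.88}

# Only the listed field is compared, so the extra confidence field is ignored.
assert unmet_expectations({"sentiment": "negative"}, actual) == {}

# A mismatched field is reported together with the value that was expected.
assert unmet_expectations({"sentiment": "positive"}, actual) == {"sentiment": "positive"}
```

Under this model, a test passes when the returned dictionary is empty.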
assuming blocks
The assuming keyword provides mock data for governed steps (ask, action, remember, recall). When a test runs, any step with a matching assuming block returns the mock values instead of executing:
    assuming classify {category: "billing", confidence: 0.95}

This makes tests:
- Deterministic: no LLM variability
- Fast: no API calls, no network latency
- Free: no token costs
- Offline: no API keys required
You can have multiple assuming blocks per test to mock different steps.
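
The substitution described in this section can be pictured with a small Python model: each step is looked up by name, and a step that has a mock returns the mock values without running. This is a sketch of the idea only, not how mashin is implemented. The names run_steps, call_llm, and the normalize step are hypothetical.

```python
from typing import Any, Callable, Dict

Fields = Dict[str, Any]
Step = Callable[[Fields], Fields]


def run_steps(steps: Dict[str, Step], mocks: Dict[str, Fields], given: Fields) -> Dict[str, Fields]:
    """Run steps in order; a step named in `mocks` returns its mock instead of executing."""
    results: Dict[str, Fields] = {}
    for name, real_step in steps.items():
        results[name] = mocks[name] if name in mocks else real_step(given)
    return results


def call_llm(given: Fields) -> Fields:
    # Stands in for a real LLM call; a mocked test should never reach it.
    raise RuntimeError("real LLM call attempted during a test")


steps = {
    "normalize": lambda given: {"text": given["text"].strip()},  # no mock: runs for real
    "classify": call_llm,                                         # mocked below
}
mocks = {"classify": {"sentiment": "positive", "confidence": 0.95}}

results = run_steps(steps, mocks, {"text": "  This product is amazing!  "})
assert results["normalize"] == {"text": "This product is amazing!"}
assert results["classify"] == {"sentiment": "positive", "confidence": 0.95}
```

Because call_llm raises if it is ever invoked, the passing assertions show that the mocked step was skipped rather than executed.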
Examples
Basic test with mocked LLM
    machine sentiment_classifier
      accepts
        text as text, is required

      responds with
        sentiment as text
        confidence as number

      implements
        ask classify, using: "anthropic:claude-haiku-4"
          with task "Classify sentiment.\n\nText: ${input.text}"
          returns
            sentiment as text
            confidence as number

      verifies
        test "positive review"
          assuming classify {sentiment: "positive", confidence: 0.95}
          given {text: "This product is amazing!"}
          expect {sentiment: "positive", confidence: 0.95}

        test "negative review"
          assuming classify {sentiment: "negative", confidence: 0.88}
          given {text: "Terrible experience, waste of money."}
          expect {sentiment: "negative"}

        test "neutral text"
          assuming classify {sentiment: "neutral", confidence: 0.72}
          given {text: "The meeting is at 3pm."}
          expect {sentiment: "neutral"}

Testing with multiple step mocks
    machine research_assistant
      accepts
        question as text, is required

      responds with
        answer as text
        sources as list
        cached as boolean

      implements
        recall check_cache
          query: input.question
          collection: "research"
          limit: 3

        decide use_cache
          when steps.check_cache.count > 0
            compute cached {answer: steps.check_cache.results[0].content, sources: [], cached: true}
          otherwise
            ask research, using: "anthropic:claude-sonnet-4-6"
              with task "Research: ${input.question}"
              returns
                answer as text
                sources as list
              assuming
                answer: "Research answer"
                sources: ["source1.com"]

      verifies
        test "returns cached result when available"
          assuming check_cache {count: 1, results: [{content: "Cached answer", metadata: {}, score: 0.9}]}
          given {question: "What is Rice's theorem?"}
          expect {cached: true, answer: "Cached answer"}

        test "researches when cache is empty"
          assuming check_cache {count: 0, results: []}
          assuming research {answer: "Fresh research", sources: ["paper.pdf"]}
          given {question: "Obscure question"}
          expect {cached: false, answer: "Fresh research"}

Testing error handling
    machine resilient_fetcher
      accepts
        url as text, is required

      responds with
        data as map
        source as text

      implements
        action fetch
          http
            url: input.url
            timeout: 5000
        compute result {data: fetch.body, source: "live"}
        on failure
          compute fallback {data: {}, source: "fallback"}

      verifies
        test "returns live data on success"
          assuming fetch {body: {temperature: 72}, status: 200}
          given {url: "https://api.example.com/data"}
          expect {source: "live"}

        test "returns fallback on failure"
          assuming fetch {error: "timeout"}
          given {url: "https://unreachable.example.com"}
          expect {source: "fallback", data: {}}

Testing branching logic
    verifies
      test "routes high priority to urgent queue"
        assuming classify {priority: "high", risk: "elevated"}
        given {amount: 15000, department: "engineering"}
        expect {queue: "urgent", needs_approval: true}

      test "auto-approves small amounts"
        assuming classify {priority: "low", risk: "minimal"}
        given {amount: 50, department: "office"}
        expect {queue: "standard", needs_approval: false}

Relationship to achieves
| | verifies > test | achieves > for example |
|---|---|---|
| Mocking | Full assuming support | No assuming support |
| Scope | Edge cases, errors, integration | Happy path specification |
| Detail | Can test intermediate steps | Input/output only |
| Required | Yes (minimum 2 tests) | Recommended |
Both are executed by mashin test. Use for example for specification; use test for thorough coverage.
Running tests
    # Run all tests for a machine
    mashin test my_machine.mashin

    # Run all tests in a project
    mashin test

    # Run a specific test by name
    mashin test my_machine.mashin --filter "positive review"

Governance
Tests run in test mode, which means:
- assuming mocks replace governed step execution (no real LLM calls, no real effects)
- No governance decisions are recorded in the behavioral ledger during test runs
- No API keys or external services are required
- No costs are incurred
The verifies section itself is a compile-time declaration. It adds no runtime overhead to production execution. Tests are only executed when explicitly requested.
Translations
| Language | Keyword |
|---|---|
| English | verifies |
| Spanish | verifica |
| French | verifie |
| German | überprüft |
| Japanese | 検証 |
| Chinese | 验证 |
| Korean | 검증 |
Sub-keywords
| English | Spanish | French | German | Japanese | Chinese | Korean |
|---|---|---|---|---|---|---|
| test | prueba | test | Test | テスト | 测试 | 테스트 |
| given | dado | donne | gegeben | 入力 | 给定 | 주어진 |
| expect | espera | attend | erwartet | 期待 | 期望 | 기대 |
| assuming | asumiendo | supposant | angenommen | 仮定 | 假设 | 가정 |
See also
- achieves - Goal specification with example pairs
- accepts - Input fields used in given blocks
- responds with - Output fields used in expect blocks
- ask … using - Step-level assuming for inline mocking