verifies

Declare embedded test cases for a machine. The `verifies` section contains `test` blocks that validate the machine’s behavior with specific inputs, expected outputs, and optional step mocks. Tests run during `mashin test` and are the primary mechanism for ensuring a machine works correctly without calling real APIs, LLMs, or external services.

Every machine should have at least two test cases: a happy path and an edge case.

When to use

Use `verifies` for:

Validating that the machine produces correct output for known inputs
Testing edge cases and error conditions
Mocking LLM and effect step responses for deterministic, fast, free tests
Regression testing after changes

Use `achieves > for example` for specification-by-example (happy path documentation). Use `verifies` for thorough test coverage, including edge cases, error paths, and mocked dependencies.

Syntax

verifies
  test "<description>"
    given { <input_field>: <value>, ... }
    expect { <output_field>: <value>, ... }
    assuming <step_name> { <field>: <mock_value>, ... }

Components

Component	Required	Description
`test "<description>"`	Yes	Human-readable test name. Should describe what is being tested.
`given { ... }`	Yes	Input values for the test. Must match the `accepts` contract.
`expect { ... }`	Yes	Expected output values. Matched against the machine’s actual output.
`assuming <step> { ... }`	No	Mock return values for specific steps. Replaces real execution.
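
As a minimal sketch of the three required components on their own, the hypothetical machine below (`echo`, not part of the original reference) has no governed steps, so its tests need no `assuming` blocks. It assumes that a `compute` map may reference `input` fields directly, which the examples on this page suggest but do not state.

machine echo
  accepts
    text as text, is required

  responds with
    echoed as text

  implements
    compute result
      {echoed: input.text}

  verifies
    test "echoes a short word"
      given {text: "hello"}
      expect {echoed: "hello"}

    test "echoes text with punctuation"
      given {text: "Hello, world!"}
      expect {echoed: "Hello, world!"}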

`assuming` blocks

The `assuming` keyword provides mock data for governed steps (`ask`, `action`, `remember`, `recall`). When a test runs, any step with a matching `assuming` block returns the mock values instead of executing:

assuming classify {category: "billing", confidence: 0.95}

This makes tests:

Deterministic: no LLM variability
Fast: no API calls, no network latency
Free: no token costs
Offline: no API keys required

A test can have multiple `assuming` blocks, one for each step it mocks.

Examples

Basic test with mocked LLM

machine sentiment_classifier
  accepts
    text as text, is required

  responds with
    sentiment as text
    confidence as number

  implements
    ask classify, using: "anthropic:claude-haiku-4"
      with task "Classify sentiment.\n\nText: ${input.text}"
      returns
        sentiment as text
        confidence as number

  verifies
    test "positive review"
      assuming classify {sentiment: "positive", confidence: 0.95}
      given {text: "This product is amazing!"}
      expect {sentiment: "positive", confidence: 0.95}

    test "negative review"
      assuming classify {sentiment: "negative", confidence: 0.88}
      given {text: "Terrible experience, waste of money."}
      expect {sentiment: "negative"}

    test "neutral text"
      assuming classify {sentiment: "neutral", confidence: 0.72}
      given {text: "The meeting is at 3pm."}
      expect {sentiment: "neutral"}

Testing with multiple step mocks

machine research_assistant
  accepts
    question as text, is required

  responds with
    answer as text
    sources as list
    cached as boolean

  implements
    recall check_cache
      query: input.question
      collection: "research"
      limit: 3

    decide use_cache
      when steps.check_cache.count > 0
        compute cached
          {answer: steps.check_cache.results[0].content, sources: [], cached: true}
      otherwise
        ask research, using: "anthropic:claude-sonnet-4-6"
          with task "Research: ${input.question}"
          returns
            answer as text
            sources as list
          assuming
            answer: "Research answer"
            sources: ["source1.com"]

  verifies
    test "returns cached result when available"
      assuming check_cache {count: 1, results: [{content: "Cached answer", metadata: {}, score: 0.9}]}
      given {question: "What is Rice's theorem?"}
      expect {cached: true, answer: "Cached answer"}

    test "researches when cache is empty"
      assuming check_cache {count: 0, results: []}
      assuming research {answer: "Fresh research", sources: ["paper.pdf"]}
      given {question: "Obscure question"}
      expect {cached: false, answer: "Fresh research"}

Testing error handling

machine resilient_fetcher
  accepts
    url as text, is required

  responds with
    data as map
    source as text

  implements
    action fetch http
      url: input.url
      timeout: 5000
    compute result
      {data: steps.fetch.body, source: "live"}
    on failure
      compute fallback
        {data: {}, source: "fallback"}

  verifies
    test "returns live data on success"
      assuming fetch {body: {temperature: 72}, status: 200}
      given {url: "https://api.example.com/data"}
      expect {source: "live"}

    test "returns fallback on failure"
      assuming fetch {error: "timeout"}
      given {url: "https://unreachable.example.com"}
      expect {source: "fallback", data: {}}

Testing branching logic

verifies
  test "routes high priority to urgent queue"
    assuming classify {priority: "high", risk: "elevated"}
    given {amount: 15000, department: "engineering"}
    expect {queue: "urgent", needs_approval: true}

  test "auto-approves small amounts"
    assuming classify {priority: "low", risk: "minimal"}
    given {amount: 50, department: "office"}
    expect {queue: "standard", needs_approval: false}
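
The tests above are shown without their machine. A hypothetical machine they could exercise is sketched below. The name `expense_router`, the prompt text, the step names `route`, `urgent`, and `standard`, and the `==` comparison are assumptions made for illustration (only `>` appears in a condition elsewhere on this page).

machine expense_router
  accepts
    amount as number, is required
    department as text, is required

  responds with
    queue as text
    needs_approval as boolean

  implements
    ask classify, using: "anthropic:claude-haiku-4"
      with task "Classify this expense.\n\nAmount: ${input.amount}\nDepartment: ${input.department}"
      returns
        priority as text
        risk as text

    decide route
      when steps.classify.priority == "high"
        compute urgent
          {queue: "urgent", needs_approval: true}
      otherwise
        compute standard
          {queue: "standard", needs_approval: false}

Because `classify` is mocked in both tests, the branch taken depends only on the mock's `priority` value, not on the `given` inputs.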

Relationship to achieves

	`verifies > test`	`achieves > for example`
Mocking	Full `assuming` support	No `assuming` support
Scope	Edge cases, errors, integration	Happy path specification
Detail	Can test intermediate steps	Input/output only
Required	Yes (minimum 2 tests)	Recommended

Both are executed by `mashin test`. Use `for example` for specification; use `test` for thorough coverage.

Running tests

# Run all tests for a machine
mashin test my_machine.mashin

# Run all tests in a project
mashin test

# Run a specific test by name
mashin test my_machine.mashin --filter "positive review"

Governance

Tests run in test mode, which means:

`assuming` mocks replace governed step execution (no real LLM calls, no real effects)
No governance decisions are recorded in the behavioral ledger during test runs
No API keys or external services are required
No costs are incurred

The `verifies` section itself is a compile-time declaration. It adds no runtime overhead to production execution. Tests are only executed when explicitly requested.

Translations

Language	Keyword
English	verifies
Spanish	verifica
French	verifie
German	überprüft
Japanese	検証
Chinese	验证
Korean	검증

Sub-keywords

English	Spanish	French	German	Japanese	Chinese	Korean
test	prueba	test	Test	テスト	测试	테스트
given	dado	donne	gegeben	入力	给定	주어진
expect	espera	attend	erwartet	期待	期望	기대
assuming	asumiendo	supposant	angenommen	仮定	假设	가정
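
Assuming the translated keywords are drop-in replacements for their English counterparts (the tables above list them but do not show them in use), the first test of the `sentiment_classifier` example might read as follows in Spanish. Step names and field names are identifiers and stay unchanged; the test description is a free-form string.

verifica
  prueba "reseña positiva"
    asumiendo classify {sentiment: "positive", confidence: 0.95}
    dado {text: "This product is amazing!"}
    espera {sentiment: "positive", confidence: 0.95}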

See also

`assuming` blocks