
Auto-verification on save

Auto-verification runs an API scenario as soon as it is saved. This gives you fast feedback while the scenario is still being authored, before it becomes part of a scheduled suite.

How it works

When scenario_save runs, Qodex does not just persist the scenario. For API scenarios, it also runs the scenario’s steps against the default staging environment, captures pass/fail per step, and writes last_run_status back to the scenario row. The verdict shows up in the scenarios list as a pass/fail badge before you even close the create dialog.
This is deliberate. A generated scenario that fails its first run is often broken in a fixable way: a stale schema, a missing capture, or a misnamed environment variable. Surfacing the failure during authoring lets you or the agent correct it before the scenario becomes active.
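The save path described above can be sketched roughly as follows. This is an illustration only, not Qodex's implementation: save_and_verify, the store dictionary, the kind field, and the run_against_staging callback are all hypothetical names. Only scenario_save and last_run_status come from this page.

```python
from typing import Callable, Dict, Optional

Scenario = Dict[str, object]


def save_and_verify(
    scenario: Scenario,
    store: Dict[str, Scenario],
    run_against_staging: Optional[Callable[[Scenario], str]],
) -> Scenario:
    """Persist a scenario, then run it once if it is an API scenario.

    Hypothetical stand-in for what happens around scenario_save.
    run_against_staging returns "pass", "fail", or "error"; it is None
    when no default staging environment exists.
    """
    # Step 1: persist the row first, with no verdict yet.
    row: Scenario = dict(scenario, last_run_status=None)
    store[str(row["id"])] = row

    # Step 2: only API scenarios are verified on save (assumed "kind" field).
    if row.get("kind") == "api" and run_against_staging is not None:
        # Step 3: write the verdict back onto the same row.
        row["last_run_status"] = run_against_staging(row)

    return row
```

Because the row is persisted before the run, a scenario that cannot be verified still saves; it simply keeps a null last_run_status.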

What auto-verify checks

For each step:
  • HTTP status codes: the actual response status against expectedStatus.
  • Response shape: each declared expectations entry evaluates against the live response. JSONPath checks, body-shape assertions, and header checks all run.
  • Captures: any captures on a step are evaluated against the live response. If a capture fails (the JSONPath doesn’t resolve), the step fails.
For the whole scenario:
  • Pass: every step’s expectations evaluate true, every capture resolves, and no step errors out.
  • Fail: at least one step’s assertions evaluate false, or one of its captures does not resolve.
  • Error: a step couldn’t run at all (network failure, malformed request, env not found).
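The per-step checks and the scenario-level roll-up above can be sketched as below. The assumptions are deliberate and worth noting: the Step class and its field names are hypothetical, a dotted-path resolver stands in for real JSONPath evaluation, and letting "error" take precedence over "fail" is a guess, since this page does not say which wins when both occur.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    expected_status: int
    # Each expectation is a predicate over the parsed response body.
    expectations: List[Callable[[Any], bool]] = field(default_factory=list)
    # Capture name -> dotted path into the body (stand-in for JSONPath).
    captures: Dict[str, str] = field(default_factory=dict)


def resolve(body: Any, path: str) -> Any:
    """Follow a dotted path such as 'data.id'; raise KeyError if it does not resolve."""
    current = body
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            raise KeyError(path)
        current = current[key]
    return current


def step_passes(step: Step, status: int, body: Any) -> bool:
    """Apply the three per-step checks: status, expectations, captures."""
    if status != step.expected_status:
        return False
    if not all(check(body) for check in step.expectations):
        return False
    for path in step.captures.values():
        try:
            resolve(body, path)
        except KeyError:
            return False  # an unresolved capture fails the step
    return True


def scenario_verdict(step_outcomes: List[str]) -> str:
    """Roll per-step outcomes ('pass' | 'fail' | 'error') into one verdict."""
    if "error" in step_outcomes:  # assumed precedence: error over fail
        return "error"
    if "fail" in step_outcomes:
        return "fail"
    return "pass"
```

A step that returns the expected status but is missing a captured field still fails, which is how a missing capture shows up at authoring time rather than in a later dependent step.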

What auto-verify does not check

  • Semantic correctness. Auto-verify only enforces the assertions you or the agent wrote into the scenario. If an assertion is weak (for example, status == 200 with no body check), auto-verify will pass any response that returns 200, even one carrying bad data.
  • Business invariants. “The new user actually shows up in the admin list” is a multi-step check you have to explicitly write. Auto-verify won’t infer it.
  • Side effects on other endpoints. Auto-verify isolates per scenario. If creating a user breaks the auth flow for everyone else, you’ll catch that in the next full-suite run, not in this save.
Auto-verify is a tight feedback loop, not a coverage guarantee.
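To make the weak-assertion case concrete, here is a small sketch. The response dictionary, the field names id and email, and the all_hold helper are invented for illustration and are not Qodex's assertion format.

```python
from typing import Any, Callable, Dict, List

Response = Dict[str, Any]
Check = Callable[[Response], bool]


def all_hold(checks: List[Check], response: Response) -> bool:
    """Return True only if every assertion holds for the response."""
    return all(check(response) for check in checks)


# A response that returns 200 but carries unusable data.
bad_response: Response = {"status": 200, "body": {"id": None, "email": ""}}

# Weak: status only. This is all that gets enforced.
weak: List[Check] = [lambda r: r["status"] == 200]

# Stronger: status plus body checks on the fields that matter.
strong: List[Check] = weak + [
    lambda r: isinstance(r["body"].get("id"), str) and r["body"]["id"] != "",
    lambda r: "@" in (r["body"].get("email") or ""),
]

weak_result = all_hold(weak, bad_response)      # the bad response slips through
strong_result = all_hold(strong, bad_response)  # the body checks catch it
```

The verdict is only as strong as the assertions behind it, so a PASS on a status-only scenario says little about the data returned.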

What you see in the UI

The scenario list row shows a verdict pill the moment the save completes:
scn_a1b2  POST /users returns 201 with id   [PASS]   Edited 3s ago
scn_c3d4  GET /orders/:id IDOR returns 403  [FAIL]   Edited 12s ago
Click into a FAIL row to see which step failed, the actual HTTP status, the response body, and the assertion that did not match. From there, you usually do one of the following:
  • Edit the assertion if your expectation was wrong.
  • Edit the request if you sent the wrong payload.
  • Demote to draft and ask the agent to investigate.

When it doesn’t run

  • No staging environment. Auto-verify needs a default staging environment to run against. If you haven’t created one, the scenario still saves, with last_run_status: null.
  • UI scenarios. UI scenarios go through scenario_record, not scenario_save. They have their own post-author verify path (see the UI testing docs).
  • Scenarios in a group with assumed captures. Group members that depend on an earlier member’s captures get auto-verified at the group level once you wire the group up.
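The three skip conditions above amount to a small decision function. The sketch below is hypothetical: ScenarioInfo, its fields, the reason strings, and the order in which the conditions are checked are assumptions, not documented behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScenarioInfo:
    kind: str                               # "api" or "ui" (assumed field)
    depends_on_group_captures: bool = False  # relies on an earlier group member


def auto_verify_skip_reason(
    scenario: ScenarioInfo, default_staging_env: Optional[str]
) -> Optional[str]:
    """Return why auto-verify would not run on save, or None if it would."""
    if scenario.kind == "ui":
        # UI scenarios go through scenario_record and their own verify path.
        return "ui scenario"
    if default_staging_env is None:
        # Nothing to run against; the row keeps last_run_status: null.
        return "no default staging environment"
    if scenario.depends_on_group_captures:
        # Deferred until the group is wired up.
        return "verified at group level"
    return None
```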

On the roadmap

Self-critique on save will run a second LLM pass in addition to auto-verify: it reviews the generated scenario against the stated goal and returns a structured verdict (approve / revise / reject) with an issue list. Auto-verify checks “does it run”; self-critique checks “does it test what it claims to test.” The two will run independently, and both verdicts will attach to the scenario row.

Scenarios

The shape and lifecycle.

Test rules in plain English

How assertions get into a scenario.

Failure classification

What happens when an active scenario fails later.

Run tests

Triggering scenarios on demand, on a schedule, or via webhook.