Test Baseline Review

Review AI-generated tests before they run unattended. Approve, edit, or reject; strict mode for compliance.

"Baseline" is overloaded in AegisRunner. There are two distinct concepts:

Concept | What it is
Test review (this page) | The workflow for sanity-checking AI-generated tests before you trust them in CI. Approve / request edits / reject each test.
Baseline replay (different) | Saving a scan as your project's deterministic replay reference. See Baseline Replays.

This guide is about the first one — reviewing the tests AI wrote before letting them run unattended in CI.

Why review at all

AegisRunner's test generator is conservative — it only writes assertions on selectors it actually saw in your DOM, and it tries to follow your custom prompts. But it can still:

  • Pick the wrong assertion target on an ambiguous page.
  • Generate happy-path tests for flows where you wanted edge-case coverage.
  • Misinterpret a custom prompt.
  • Write a test that's correct but not high-priority for you.

Reviewing a newly generated test takes about 30 seconds and saves you from chasing low-value failures later.

Where to review

Two entry points:

  • From a scan — the Test Suites tab on the scan result page. Each suite has a Review button.
  • From the Test Suites page — filter by Pending Review. Lists every suite with new tests waiting for approval.

Review states

Each test case has a review state:

State | What it means
Pending | AI generated it; nobody has reviewed yet. Default for new tests.
Approved | You've sanity-checked it.
Rejected | Skipped. Won't run unless re-approved.

Reviewing a test

Open a test in review mode. You'll see:

  1. The test name and description at the top.
  2. Each step, with a screenshot of the page state right before that step ran during generation. Lets you visualize the flow without running it.
  3. Locator for each step — what element it'll click or assert on.
  4. Assertion — what's expected to be true after the step.
  5. Action buttons — Approve or Reject.

Editing a test instead of rejecting

If a test is mostly right but has the wrong locator, expected text, or step order, edit it inline. Open the test, click into a step, change the locator/value/assertion, save. The next run uses the updated test.
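
To make the editable parts concrete, here is a minimal sketch of a test as a list of steps and of the kind of change an inline edit makes. The `Step` record and its field names are assumptions for illustration only; they are not AegisRunner's storage format or API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Step:
    """Hypothetical step record; field names are assumptions."""
    action: str         # what the step does, e.g. "click" or "assert_text"
    locator: str        # element the step clicks or asserts on
    expected: str = ""  # value expected to hold after the step, if any


steps = [
    Step("click", "button.submit"),
    Step("assert_text", "h1", expected="Welcome"),
]

# Fix a wrong locator and a wrong expected text, leaving the rest untouched.
steps[0] = replace(steps[0], locator="button[type=submit]")
steps[1] = replace(steps[1], expected="Welcome back")

for step in steps:
    print(step)
```

An edit touches one field of one step, which is why it is usually cheaper than rejecting a test that is otherwise correct.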

Re-generating a test

If a test is fundamentally off (wrong scenario, wrong page), it's often faster to regenerate than to edit. From the page detail or the suite, click Generate AI tests again — this produces fresh test cases for the page.

Audit trail (Pro+)

Every review action is logged: who reviewed, when, what state they set, any comments. Useful for SOC 2 audits where you need to demonstrate that someone actually looked at every test before it shipped.

View the log under Settings → Audit Log and filter by test.review.
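
If you work with audit entries outside the UI, the same filter is easy to express in code. The sketch below assumes you already have the entries as records. The `AuditEntry` shape, its field names, and the sample rows are hypothetical, and only the `test.review` event name comes from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditEntry:
    """Hypothetical audit record; field names are assumptions."""
    event: str        # event type, e.g. "test.review"
    reviewer: str     # who reviewed
    reviewed_at: str  # when, as an ISO 8601 timestamp
    state: str        # review state that was set
    comment: str = ""


def review_entries(entries: list[AuditEntry]) -> list[AuditEntry]:
    """Keep only review actions, mirroring the test.review filter."""
    return [e for e in entries if e.event == "test.review"]


# Made-up sample data for illustration.
log = [
    AuditEntry("test.review", "dana", "2024-05-01T09:00:00Z", "approved"),
    AuditEntry("project.update", "lee", "2024-05-01T09:05:00Z", ""),
    AuditEntry("test.review", "lee", "2024-05-01T09:10:00Z", "rejected", "noisy"),
]

reviews = review_entries(log)
for entry in reviews:
    print(entry.reviewed_at, entry.reviewer, entry.state, entry.comment)
```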

Common patterns

Daily review queue

QA reviews new tests once a day. Pull up Test Suites → Pending Review. Approve, edit, or reject. Done in 15 minutes for most projects.

Self-review for solo devs

You're shipping alone. Just glance at new tests after each scan, reject obvious junk, edit the rest if needed.

For compliance

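The Pro+ audit log records who approved each test and when. Pending tests also run in suites and schedules, so clear the Pending Review queue before relying on the log to show that every test that ran in CI was approved by a named reviewer.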

Common questions

What happens to old approvals if a test is regenerated?

The replacement test starts as Pending; the old approval does not carry over. The AI changed the test, so review it again.

Does review state affect runs?

Both approved and pending tests run in suites and schedules. Rejected tests are skipped. Use rejection to keep noisy or low-priority tests out of your runs without deleting them.
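
The two answers above can be sketched as a small model. This is a sketch of the rules as described on this page, not AegisRunner's implementation. `GeneratedTest`, `runs_in_suite`, and `regenerate` are hypothetical names.

```python
from dataclasses import dataclass, replace
from enum import Enum


class ReviewState(Enum):
    PENDING = "pending"    # default for newly generated tests
    APPROVED = "approved"  # a reviewer has sanity-checked the test
    REJECTED = "rejected"  # skipped until re-approved


@dataclass(frozen=True)
class GeneratedTest:
    """Hypothetical test record; not an AegisRunner type."""
    name: str
    state: ReviewState = ReviewState.PENDING


def runs_in_suite(test: GeneratedTest) -> bool:
    """Approved and pending tests run; rejected tests are skipped."""
    return test.state is not ReviewState.REJECTED


def regenerate(test: GeneratedTest) -> GeneratedTest:
    """A regenerated test starts as Pending, whatever its old state was."""
    return replace(test, state=ReviewState.PENDING)


suite = [
    GeneratedTest("login happy path", ReviewState.APPROVED),
    GeneratedTest("checkout totals"),  # pending by default
    GeneratedTest("footer links", ReviewState.REJECTED),
]

runnable = [t.name for t in suite if runs_in_suite(t)]
print(runnable)  # ['login happy path', 'checkout totals']
```

Only rejection removes a test from runs, so a pending test that nobody has looked at still executes.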
