Test Baseline Review
Review AI-generated tests before they run unattended. Approve, edit, or reject; strict mode for compliance.
"Baseline" is overloaded in AegisRunner. There are two distinct concepts:
| Concept | What it is |
|---|---|
| Test review (this page) | The workflow for sanity-checking AI-generated tests before you trust them in CI. Approve, edit, or reject each test. |
| Baseline replay (different) | Saving a scan as your project's deterministic replay reference. See Baseline Replays. |
This guide is about the first one — reviewing the tests AI wrote before letting them run unattended in CI.
Why review at all
AegisRunner's test generator is conservative — it only writes assertions on selectors it actually saw in your DOM, and it tries to follow your custom prompts. But it can still:
- Pick the wrong assertion target on an ambiguous page.
- Generate happy-path tests for flows where you wanted edge-case coverage.
- Misinterpret a custom prompt.
- Write a test that's correct but not high-priority for you.
Reviewing each newly generated test takes about 30 seconds and saves you from chasing low-value failures later.
Where to review
Two entry points:
- From a scan — the Test Suites tab on the scan result page. Each suite has a Review button.
- From the Test Suites page — filter by Pending Review. Lists every suite with new tests waiting for approval.
Review states
Each test case has a review state:
| State | What it means |
|---|---|
| Pending | AI generated it; nobody has reviewed yet. Default for new tests. |
| Approved | You've sanity-checked it. |
| Rejected | Skipped. Won't run unless re-approved. |
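The three states in the table can be pictured as a small enumeration. This is a minimal sketch for illustration only: `ReviewState`, `GeneratedTest`, and `set_review_state` are hypothetical names, not part of any AegisRunner API. It assumes only what the table says: new tests start as Pending, and a rejected test can be re-approved.

```python
from dataclasses import dataclass
from enum import Enum


class ReviewState(Enum):
    """Hypothetical model of the three review states in the table above."""
    PENDING = "pending"    # AI generated it; nobody has reviewed yet
    APPROVED = "approved"  # a reviewer sanity-checked it
    REJECTED = "rejected"  # skipped until re-approved


@dataclass
class GeneratedTest:
    name: str
    # New tests default to Pending.
    state: ReviewState = ReviewState.PENDING


def set_review_state(test: GeneratedTest, new_state: ReviewState) -> GeneratedTest:
    """Apply a reviewer decision (Approve or Reject) to a test."""
    if new_state is ReviewState.PENDING:
        # Pending is where new tests start, not something a reviewer chooses.
        raise ValueError("Pending is the default for new tests, not a reviewer action")
    test.state = new_state
    return test


test = GeneratedTest(name="login flow")
set_review_state(test, ReviewState.REJECTED)   # skipped from now on
set_review_state(test, ReviewState.APPROVED)   # re-approved later
```

The guard against setting Pending by hand mirrors the table: Pending describes a test nobody has looked at yet, so in this sketch it is never the result of a review.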
Reviewing a test
Open a test in review mode. You'll see:
- The test name and description at the top.
- Each step, with a screenshot of the page state right before that step ran during generation, so you can follow the flow without running it.
- Locator for each step — what element it'll click or assert on.
- Assertion — what's expected to be true after the step.
- Action buttons — Approve or Reject.
Editing a test instead of rejecting
If a test is mostly right but has the wrong locator, expected text, or step order, edit it inline. Open the test, click into a step, change the locator, value, or assertion, and save. The next run uses the updated test.
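To make the idea of a targeted edit concrete, here is a rough sketch of a test as a list of steps, where fixing one locator leaves everything else alone. The `Step` fields and the `edit_step` helper are assumptions made up for this example; the real editing happens in the review UI, and AegisRunner's internal test format may look different.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Step:
    """Hypothetical shape of one test step."""
    action: str    # e.g. "click" or "assert_text"
    locator: str   # element the step clicks or asserts on
    expected: str  # what should be true after the step ("" if nothing is asserted)


def edit_step(steps: list[Step], index: int, **changes: str) -> list[Step]:
    """Return a new step list with one step updated; the other steps are untouched."""
    updated = list(steps)
    updated[index] = replace(steps[index], **changes)
    return updated


steps = [
    Step("click", "#login", ""),
    Step("assert_text", ".banner", "Welcome"),
]
# Fix only the wrong locator on the second step; keep the rest of the test.
fixed = edit_step(steps, 1, locator="h1.welcome")
```

Editing one field this way keeps the parts of the generated test that were already correct, which is the reason to edit rather than reject.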
Re-generating a test
If a test is fundamentally off (wrong scenario, wrong page), it's often faster to regenerate than to edit. From the page detail or the suite, click Generate AI tests again — this produces fresh test cases for the page.
Audit trail (Pro+)
Every review action is logged: who reviewed, when, what state they set, any comments. Useful for SOC 2 audits where you need to demonstrate that someone actually looked at every test before it shipped.
View under Settings → Audit Log, filter by test.review.
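If you keep a copy of the log outside the UI, the same filter is easy to reproduce. The sketch below is illustrative: the `AuditEntry` fields are assumed from the description above (who, when, what state, comments), the `project.update` event and the sample rows are made up, and only the `test.review` event name comes from this page.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    """Hypothetical audit record; the field names are illustrative."""
    event: str         # event type, e.g. "test.review"
    reviewer: str      # who reviewed
    at: datetime       # when
    test_name: str
    state: str         # what state they set
    comment: str = ""  # any comments


def review_entries(log: list[AuditEntry]) -> list[AuditEntry]:
    """Keep only review actions, oldest first."""
    return sorted((e for e in log if e.event == "test.review"), key=lambda e: e.at)


# Made-up sample data for the sketch.
log = [
    AuditEntry("test.review", "dana", datetime(2024, 5, 2, tzinfo=timezone.utc), "checkout", "approved"),
    AuditEntry("project.update", "sam", datetime(2024, 5, 1, 12, tzinfo=timezone.utc), "", ""),
    AuditEntry("test.review", "dana", datetime(2024, 5, 1, tzinfo=timezone.utc), "login", "rejected", "duplicate"),
]
for entry in review_entries(log):
    print(entry.at.date(), entry.reviewer, entry.test_name, entry.state)
```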
Common patterns
Daily review queue
QA reviews new tests once a day. Pull up Test Suites → Pending Review. Approve, edit, or reject. Done in 15 minutes for most projects.
Self-review for solo devs
You're shipping alone. Just glance at new tests after each scan, reject obvious junk, edit the rest if needed.
For compliance
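The Pro+ audit log records a named reviewer for every approval. Because pending tests also run by default (see "Does review state affect runs?"), clear the Pending Review queue before a CI run if you need to show that every test that ran had been approved.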
Common questions
What happens to old approvals if a test is regenerated?
The replacement test starts as Pending. The earlier approval applied to the old version, so review the new one.
Does review state affect runs?
Both approved and pending tests run in suites and schedules. Rejected tests are skipped. Use rejection to keep noisy or low-priority tests out of your runs without deleting them.
Related
- AI Test Generation — what we generate.
- Baseline Replays — the other "baseline" concept.
- Running Tests