Smart Test Selection

Run only the tests that matter on this PR. Strategies: changed files, flake history, recency, tags, last-run failures, manual selection.

Once your test suite is large, running everything on every commit is slow and expensive. Smart Test Selection lets you choose which tests to run based on changed files, flake history, recency, tags, last-run failures, or a manual selection.

Availability: Pro plan and above. Available in CI triggers and from the Run modal.

Strategies

Strategy          Picks tests by…
changed           Test files mapped to source files in your PR's diff. Skips tests that don't touch what changed.
flaky             Tests with a recent history of intermittent failure. Useful for stability sweeps.
recent            Tests created or edited in the last N days. Useful for sanity-checking new tests.
tags              Tests with specific tags (e.g. smoke, critical, checkout).
failed-last-run   Only tests that failed in the most recent run for this project + browser. Quick re-verification.
manual            Explicit list of test or suite IDs.

Using a strategy from CI

POST /api/v1/ci/trigger
Authorization: Bearer aegis_<token>
Content-Type: application/json

{
  "selectionStrategy": "changed",
  "selectionParams": {
    "diffFiles": ["src/cart/CartPage.tsx", "src/api/orders.ts"]
  },
  "wait": true
}

Pass the changed files via selectionParams.diffFiles. AegisRunner maps each file to tests that:

  • Visit a page that uses that file.
  • Test a flow that transitions through that file's component.
  • Were generated from a scan that captured that file's URL.

The mapping is approximate: file → URL → page → suite. It catches the most relevant tests, but it is less precise than unit-test coverage mapping.
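Outside GitHub Actions, the same request can be assembled from a local git diff. The following is a minimal Python sketch, not an official client: the helper names and the origin/main base ref are assumptions, while the endpoint URL, header format, and payload fields are the ones shown on this page. It builds the request without sending it.

```python
import json
import subprocess
import urllib.request

# Endpoint as used in the curl example on this page.
TRIGGER_URL = "https://aegisrunner.com/api/v1/ci/trigger"


def changed_files(base_ref: str = "origin/main") -> list:
    """Return paths changed between base_ref and HEAD (assumes a git checkout)."""
    out = subprocess.run(
        ["git", "diff", "--name-only", f"{base_ref}...HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def build_trigger_request(token: str, diff_files: list) -> urllib.request.Request:
    """Build (but do not send) the POST request for the "changed" strategy."""
    payload = {
        "selectionStrategy": "changed",
        "selectionParams": {"diffFiles": diff_files},
        "wait": True,
    }
    return urllib.request.Request(
        TRIGGER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer aegis_{token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the result to urllib.request.urlopen, for example urlopen(build_trigger_request(token, changed_files())).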

GitHub Actions example

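- name: Get changed files
  id: changed-files
  uses: tj-actions/changed-files@v44

- name: Run AegisRunner on changed files
  env:
    AEGIS_CI_TOKEN: ${{ secrets.AEGIS_CI_TOKEN }}
    # all_changed_files is a space-separated string by default.
    CHANGED_FILES: ${{ steps.changed-files.outputs.all_changed_files }}
  run: |
    # Split the list into a JSON array with jq (assumes no spaces in file paths).
    payload=$(jq -n --arg files "$CHANGED_FILES" '{
      selectionStrategy: "changed",
      selectionParams: { diffFiles: ($files | split(" ")) },
      wait: true
    }')
    curl -fsSL -X POST https://aegisrunner.com/api/v1/ci/trigger \
      -H "Authorization: Bearer aegis_${AEGIS_CI_TOKEN}" \
      -H "Content-Type: application/json" \
      -d "$payload"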

flaky strategy

Run only the tests with a flake-pattern label from triage. Useful patterns:

  • Once a week, run each flaky test 5 times to gather data on whether it is truly flaky or exposes a real intermittent bug.
  • After fixing suspected flakiness, run the flaky tests repeatedly to confirm the fix held.

{
  "selectionStrategy": "flaky",
  "selectionParams": { "minFlakeRate": 0.05 }
}

minFlakeRate: 0.05 selects tests with a flake rate of at least 5% over the last 30 days.
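To make the threshold concrete, here is a small Python sketch of how such a filter could behave. It assumes that flake rate means flaky runs divided by total runs in the 30-day window; this page does not define the formula, and the History type and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class History:
    test_id: str
    runs: int        # total runs in the 30-day window (hypothetical field)
    flaky_runs: int  # runs judged intermittent (hypothetical field)


def flake_rate(h: History) -> float:
    """Assumed definition: share of runs in the window that were flaky."""
    return h.flaky_runs / h.runs if h.runs else 0.0


def select_flaky(history: list, min_flake_rate: float = 0.05) -> list:
    """Keep the IDs of tests whose flake rate meets the threshold."""
    return [h.test_id for h in history if flake_rate(h) >= min_flake_rate]
```

Under this assumption, a test with 5 flaky runs out of 100 is selected at minFlakeRate 0.05, and one with 4 out of 100 is not.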

recent strategy

Run only tests created or edited in the last N days:

{
  "selectionStrategy": "recent",
  "selectionParams": { "withinDays": 7 }
}

Useful when:

  • You want to verify newly-generated tests work before approving them.
  • You just edited a batch of tests and want to confirm they all pass.
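As an illustration of the withinDays window, the check below is a Python sketch under assumptions: it treats the boundary as inclusive and compares against the test's last-modified timestamp. Neither detail is specified on this page, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_recent(last_modified: datetime, within_days: int,
              now: Optional[datetime] = None) -> bool:
    """True if last_modified falls inside the last within_days days (inclusive)."""
    now = now or datetime.now(timezone.utc)
    return now - last_modified <= timedelta(days=within_days)
```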

tags strategy

Tests can be tagged manually or by AI. Run only tests with specific tags:

{
  "selectionStrategy": "tags",
  "selectionParams": { "tags": ["smoke", "critical"] }
}

Common tag conventions:

  • smoke — minimal coverage of every page (page-loads).
  • critical — checkout, login, signup. Block-on-fail.
  • flaky — quarantined tests; see Debugging Failed Tests.
  • negative — tests of unauthenticated/error states. Disables auto-login fallback.
  • slow — tests that take long; skip in fast PR cycles.

failed-last-run strategy

Quickly re-verify whatever broke in the last run:

{
  "selectionStrategy": "failed-last-run",
  "selectionParams": { "browserProfile": "chromium" }
}

Most useful as a CI-level retry pass when you're not sure if a failure was a flake or real.
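A retry pass of this kind could be wired up roughly as follows. This is a Python sketch, not a prescribed workflow: trigger is a hypothetical callable that sends a payload and returns True when the run passed, since the response format is not described on this page.

```python
from typing import Any, Callable, Dict

Payload = Dict[str, Any]


def run_with_retry_pass(trigger: Callable[[Payload], bool],
                        first_payload: Payload,
                        browser_profile: str = "chromium") -> str:
    """Run once; on failure, re-run only the failures with failed-last-run."""
    if trigger(first_payload):
        return "passed"
    retry_payload = {
        "selectionStrategy": "failed-last-run",
        "selectionParams": {"browserProfile": browser_profile},
        "wait": True,
    }
    # A pass on the second attempt suggests a flake; a repeat failure looks real.
    return "passed-on-retry" if trigger(retry_payload) else "failed"
```

Reporting "passed-on-retry" separately from "passed" keeps suspected flakes visible instead of hiding them behind a green build.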

manual strategy

Explicit selection — you list IDs:

{
  "selectionStrategy": "manual",
  "selectionParams": {
    "suiteIds": ["019d..."],
    "caseIds": ["019e...", "019f..."]
  }
}

Alternatively, include suiteIds/caseIds at the top level of the trigger payload without a strategy. The effect is the same.

Combining strategies

Currently you pick one strategy per trigger. Common workaround: run two pipelines in parallel, one per strategy. Future versions may support combinations.
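If separate pipelines are inconvenient, the same workaround can be approximated from one script by sending one trigger per strategy concurrently. The Python sketch below assumes a hypothetical send callable that posts a single payload and returns its result; the payloads reuse fields shown on this page.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

Payload = Dict[str, Any]

PAYLOADS: List[Payload] = [
    {"selectionStrategy": "changed",
     "selectionParams": {"diffFiles": ["src/cart/CartPage.tsx"]}, "wait": True},
    {"selectionStrategy": "tags",
     "selectionParams": {"tags": ["smoke", "critical"]}, "wait": True},
]


def trigger_all(send: Callable[[Payload], Any], payloads: List[Payload]) -> List[Any]:
    """Send one trigger per payload in parallel; results keep the input order."""
    if not payloads:
        return []
    with ThreadPoolExecutor(max_workers=len(payloads)) as pool:
        return list(pool.map(send, payloads))
```

Each trigger still produces its own run, so the results have to be checked individually.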

Reading the run page

When you use smart selection, the run page header shows:

  • Strategy used — e.g. "changed (12 files)".
  • Tests selected — count and percentage of total project tests.
  • Tests skipped — count, with a "show skipped" toggle.

This makes it obvious what was and wasn't run, so you don't accidentally trust a green run that only ran 5 tests when you thought it ran 200.

Common questions

How accurate is "changed"?

Approximate. We map files → URLs → tests; if your repo structure doesn't map cleanly to your URL structure (e.g. SSR + dynamic routes), the mapping may miss relevant tests. Run the full suite once a day as a backstop.

Is there a "minimum coverage" floor?

Yes — even with strict selection, we always run the suite tagged critical in addition to whatever was selected. Configurable in Project Config.

Smart selection picked 0 tests. What now?

Either no tests map to your changes, or your tag query matched nothing. The trigger response will say "0 tests selected". Handle this in your CI by triggering a fallback suite.
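One way to detect this case in CI is sketched below in Python. It assumes the phrase "0 tests selected" appears verbatim somewhere in the response body, which is the only detail this page gives about the response; the function names and the choice of a smoke-tag fallback are illustrative.

```python
import re
from typing import Any, Dict

# Match "0 tests selected" but not "10 tests selected".
_ZERO_SELECTED = re.compile(r"(?<!\d)0 tests selected")


def selected_nothing(response_text: str) -> bool:
    """True if the response reports an empty selection (assumed wording)."""
    return bool(_ZERO_SELECTED.search(response_text))


def fallback_payload() -> Dict[str, Any]:
    """Fallback run: the smoke-tagged tests, using the tags strategy."""
    return {
        "selectionStrategy": "tags",
        "selectionParams": {"tags": ["smoke"]},
        "wait": True,
    }
```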
