Chaos Runs
Autonomous AI-powered exploratory testing.
Chaos runs launch autonomous AI agents that explore your application, discover features, and test them from multiple angles. Unlike scripted test runs, chaos runs need no predefined workflows: the agents decide what to test on their own.
Starting a Chaos Run
- Go to Runs in the sidebar
- Click New Run → Chaos Run
- Configure the run:
  - Select a Property (required)
  - Optionally select Credentials for authenticated testing
  - Optionally add Guidance to focus the agents
  - Configure advanced options if needed
- Click Start Chaos Run
Testing Approaches
Chaos agents test your application using six different approaches:
| Approach | What It Tests |
|---|---|
| Positive | Happy path flows - expected user journeys that should succeed |
| Negative | Error handling - invalid inputs, missing data, permission errors |
| Edge Case | Boundary conditions - empty states, maximum values, special characters |
| Accessibility | Screen reader compatibility, keyboard navigation, ARIA labels |
| Security | XSS vectors, injection attempts, authentication bypasses |
| Creative | Unusual user behaviors - rapid clicking, navigation interrupts, multi-tab scenarios |
Each area of your application is tested from several of these approaches, so a feature is checked for more than its happy path.
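As a rough illustration of how areas and approaches combine, the sketch below pairs every discovered area with every approach from the table. The `Approach` enum and the `plan_assignments` helper are hypothetical names made up for this example, not part of the product, and the real scheduler may assign only a subset of approaches to a given area.

```python
from enum import Enum
from itertools import product


class Approach(Enum):
    """The six testing approaches listed in the table above."""
    POSITIVE = "Positive"
    NEGATIVE = "Negative"
    EDGE_CASE = "Edge Case"
    ACCESSIBILITY = "Accessibility"
    SECURITY = "Security"
    CREATIVE = "Creative"


def plan_assignments(areas, approaches=tuple(Approach)):
    """Pair every area with every requested approach (full cross product).

    Hypothetical helper: an assumption about how work could be split,
    not a description of the actual scheduler.
    """
    return list(product(areas, approaches))


if __name__ == "__main__":
    for area, approach in plan_assignments(["Login", "Checkout"]):
        print(f"{area}: {approach.value}")
```

Two areas and six approaches give twelve area/approach pairs, which is why the agent budget described under Configuration Options matters on larger applications.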
How Agents Work
Hierarchical Exploration
Chaos runs use a parent-child agent structure:
- Root Agent - Starts at your application's entry point, discovers main feature areas
- Child Agents - Spawn to explore specific features discovered by their parent
- Grandchild Agents - Go deeper into sub-features as needed
This tree structure allows thorough exploration while staying organized.
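The parent-child structure can be pictured as a bounded tree walk. The sketch below is a minimal model under stated assumptions: it spawns agents breadth-first, counts the root agent as depth 1, and stops at an agent budget or a depth limit. The `Agent` class, the `explore` function, and the `site_map` input are all hypothetical; the actual spawning order and depth convention are not documented here.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Agent:
    area: str
    depth: int                       # assumption: the root agent is depth 1
    children: list = field(default_factory=list)


def explore(site_map, root_area, max_agents=20, max_depth=5):
    """Spawn agents breadth-first over a map of area -> sub-areas.

    Returns the root agent and the total number of agents spawned.
    """
    root = Agent(root_area, depth=1)
    spawned = 1
    queue = deque([root])
    while queue:
        parent = queue.popleft()
        if parent.depth >= max_depth:
            continue                  # depth limit reached: no children
        for sub_area in site_map.get(parent.area, []):
            if spawned >= max_agents:
                return root, spawned  # agent budget exhausted
            child = Agent(sub_area, depth=parent.depth + 1)
            parent.children.append(child)
            queue.append(child)
            spawned += 1
    return root, spawned


# Made-up feature map for illustration only.
site_map = {
    "Home": ["Login", "Checkout"],
    "Checkout": ["Payment", "Shipping"],
}
root, count = explore(site_map, "Home")
```

With this map the walk spawns a root agent for Home, child agents for Login and Checkout, and grandchild agents for Payment and Shipping. Lowering `max_agents` or `max_depth` cuts the tree off earlier.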
Test Checklists
Each agent creates a checklist of specific things to test in its assigned area. For example, an agent testing a login form might check:
- Valid credentials log in successfully
- Invalid password shows error message
- Empty fields show validation errors
- "Forgot password" link works
- Login rate limiting is enforced
Each check's result is recorded as healthy (passed) or unhealthy (an issue was found).
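To make the healthy/unhealthy model concrete, here is a small sketch of the login checklist as data. The `Check` class and `summarize` function are hypothetical, and the pass/fail values are invented for illustration; they are not results from a real run.

```python
from dataclasses import dataclass


@dataclass
class Check:
    description: str
    healthy: bool        # True = passed, False = an issue was found


def summarize(checks):
    """Count healthy and unhealthy checks in a checklist."""
    healthy = sum(1 for check in checks if check.healthy)
    return {"healthy": healthy, "unhealthy": len(checks) - healthy}


# Illustrative outcomes only.
login_checklist = [
    Check("Valid credentials log in successfully", healthy=True),
    Check("Invalid password shows error message", healthy=True),
    Check("Empty fields show validation errors", healthy=True),
    Check('"Forgot password" link works', healthy=True),
    Check("Login rate limiting is enforced", healthy=False),
]

print(summarize(login_checklist))
```

In this made-up outcome, four checks are healthy and one is unhealthy, and the unhealthy check is the one that would be filed as an issue.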
Configuration Options
| Option | Default | Range | Description |
|---|---|---|---|
| Credentials | None | - | Login credentials for authenticated testing |
| Guidance | None | - | Natural language instructions to focus agents |
| Max Agents | 20 | 1-100 | Maximum total agents to spawn |
| Max Depth | 5 | 1-10 | Levels of parent-child exploration |
| Max Duration | 60 min | 5-180 min | Run stops after this time |
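The defaults and ranges in the table can be captured in a small validated structure. This is a sketch only: `ChaosRunConfig` and its field names are hypothetical and do not correspond to a documented API. Only the default values and allowed ranges come from the table above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChaosRunConfig:
    property_name: str                  # required
    credentials: Optional[str] = None   # default: None
    guidance: Optional[str] = None      # default: None
    max_agents: int = 20                # allowed range: 1-100
    max_depth: int = 5                  # allowed range: 1-10
    max_duration_min: int = 60          # allowed range: 5-180 minutes

    def __post_init__(self):
        # Reject values outside the ranges given in the options table.
        limits = {
            "max_agents": (1, 100),
            "max_depth": (1, 10),
            "max_duration_min": (5, 180),
        }
        for name, (low, high) in limits.items():
            value = getattr(self, name)
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside {low}-{high}")


config = ChaosRunConfig(
    "my-app",
    guidance="Focus on checkout flows and payment processing",
    max_duration_min=30,
)
```

Validating at construction time means an out-of-range value such as `max_depth=11` fails immediately rather than after a run has been queued.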
Guidance Examples
Help agents focus on what matters:
- "Focus on checkout flows and payment processing"
- "Test edge cases with invalid inputs"
- "Explore admin features and user management"
- "Look for security issues in the API"
Chaos Run Statuses
| Status | Meaning |
|---|---|
| Pending | Run is queued but hasn't started |
| Running | Agents are actively exploring and testing |
| Completed | All agents finished successfully |
| Partial | Run finished but some agents failed or errored |
| Canceled | Manually stopped by user |
| Timeout | Hit max duration limit |
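If you track runs from a script, it can help to separate statuses that describe an unfinished run from those that describe a finished one. The sketch below encodes the table as an enum. `RunStatus` and `is_finished` are hypothetical names, and treating Pending and Running as the only unfinished statuses is an inference from the meanings in the table.

```python
from enum import Enum


class RunStatus(Enum):
    PENDING = "Pending"
    RUNNING = "Running"
    COMPLETED = "Completed"
    PARTIAL = "Partial"
    CANCELED = "Canceled"
    TIMEOUT = "Timeout"


# Assumption: only Pending and Running describe a run that has not ended.
ACTIVE = {RunStatus.PENDING, RunStatus.RUNNING}


def is_finished(status: RunStatus) -> bool:
    """Return True once a run has reached a final status."""
    return status not in ACTIVE
```

A polling loop could keep waiting while `is_finished` returns `False`, then inspect whether the final status was Completed, Partial, Canceled, or Timeout.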
Cutoff Reasons
Every run that ends records a cutoff reason explaining why it stopped:
| Reason | Meaning |
|---|---|
| Max Agents | Spawned the maximum number of agents |
| Max Depth | Reached maximum exploration depth |
| Timeout | Hit the time limit |
| User Canceled | Manually stopped |
| No Work | No more areas to explore |
| All Areas Covered | Successfully tested all discovered areas |
Understanding Results
The Agent Tree
The chaos run detail page shows all agents in a tree structure. Each agent shows:
- Status - Running, Completed, Failed, or Skipped
- Assigned Area - What feature/approach it was testing
- Test Results - How many checks passed vs failed
- Issues Found - Bugs discovered during testing
Tested Areas
The Areas tab shows a flattened view of all tested feature areas:
- Feature name and testing approach
- Status (Pending, Testing, Tested, Skipped)
- Number of issues found in that area
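The relationship between the agent tree and the flattened Areas view can be sketched as a depth-first walk that emits one row per agent. `AgentNode`, `flatten_areas`, and the sample tree are hypothetical and use invented data; how the product builds this view internally is not documented here.

```python
from dataclasses import dataclass, field


@dataclass
class AgentNode:
    feature: str
    approach: str
    issues: int = 0
    children: list = field(default_factory=list)


def flatten_areas(node):
    """Walk the tree depth-first, yielding (feature, approach, issues)."""
    yield (node.feature, node.approach, node.issues)
    for child in node.children:
        yield from flatten_areas(child)


# Invented tree for illustration only.
tree = AgentNode("Home", "Positive", children=[
    AgentNode("Login", "Negative", issues=1),
    AgentNode("Checkout", "Edge Case", children=[
        AgentNode("Payment", "Security", issues=2),
    ]),
])

rows = list(flatten_areas(tree))
total_issues = sum(issues for _, _, issues in rows)
```

Each row carries the feature name, the testing approach, and the issue count, which mirrors the columns the Areas tab lists.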
Issues Found
Issues discovered during chaos runs are automatically filed with:
- Screenshots and evidence
- Steps to reproduce
- Severity classification
- Tags (if you configured them)
Navigate to Issues to triage and track these findings.
When to Use Chaos Runs vs Test Runs
| Use Case | Recommended Run Type |
|---|---|
| Regression testing before deploy | Test Run |
| CI/CD quality gate | Test Run |
| Exploring a new feature | Chaos Run |
| Finding edge case bugs | Chaos Run |
| Security assessment | Chaos Run |
| Periodic comprehensive testing | Chaos Run |
Test Runs are fast and deterministic - use them for known workflows. Chaos Runs are exploratory - use them to find unknown problems.
Best Practices
- Start with credentials - Authenticated testing reaches features behind the login, so it typically finds more issues
- Use guidance sparingly - Let agents explore freely at first
- Review the agent tree - Understand what was tested and what wasn't
- Run regularly - Chaos runs are exploratory, so each run may surface different issues
- Triage issues quickly - False positives teach you what to ignore