In the previous post, we talked about why QA is becoming the bottleneck in the AI agent era. Now let’s get concrete. In this post, we’ll walk through a complete QA workflow using aqua — step by step, with real commands and plan structures — using a simple Todo app as our example.
The Scenario
You’re building a Todo app. Your AI coding agent just implemented a new feature: due dates for tasks. The PR adds:
- A due_date field to the task creation API
- A date picker in the browser UI
- An “Overdue” badge that appears when a task is past due
Before merging, you need to verify all of this actually works. Let’s use aqua to do that.
Step 1: Set Up Your Environment
First, create an environment file that tells aqua where your app is running and how to authenticate. Environment files live in .aqua/environments/.
// .aqua/environments/local.json
{
"notes": "Local dev server on port 3000 (API) and 5173 (UI)",
"variables": {
"api_base_url": "http://localhost:3000",
"web_base_url": "http://localhost:5173"
},
"secrets": {
"test_user_email": {
"type": "literal",
"value": "qa@example.com"
},
"test_user_password": {
"type": "op",
"value": "op://Dev/todo-app-qa/password"
}
}
}
Notice that the password comes from 1Password via the op type — it’s resolved locally on your machine and never sent to the AI agent. aqua also supports AWS Secrets Manager, GCP Secret Manager, and HashiCorp Vault.
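To make "resolved locally" concrete, here is a rough sketch of how a CLI could resolve these secret entries on your machine. This is an illustration, not aqua's actual implementation; it assumes the 1Password CLI (`op read`) is installed for the op type.

```python
import subprocess

def resolve_secret(secret: dict) -> str:
    """Resolve one secret entry from an environment file, locally."""
    if secret["type"] == "literal":
        # Literal values are used as-is.
        return secret["value"]
    if secret["type"] == "op":
        # "op read" resolves a 1Password reference on this machine,
        # so the plaintext never travels to the AI agent.
        result = subprocess.run(
            ["op", "read", secret["value"]],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()
    raise ValueError(f"unsupported secret type: {secret['type']}")

email = resolve_secret({"type": "literal", "value": "qa@example.com"})
```

The important property is where the resolution happens: on your machine, after which only masked placeholders appear in anything the agent sees.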
Step 2: Create a QA Plan
With your environment ready, ask your AI coding agent to create a QA plan. In Claude Code, you might say:
Create a QA plan to test the new due date feature. Check the API, the browser UI, and the overdue badge behavior.
The agent reads your codebase and PR diff, then calls aqua’s create_qa_plan and update_qa_plan tools to build a structured plan. Here’s what the resulting plan looks like:
{
"name": "Due Date Feature - Task Management",
"description": "Verify due date field in API, date picker in UI, and overdue badge display",
"variables": {
"api_base_url": "http://localhost:3000",
"web_base_url": "http://localhost:5173"
},
"scenarios": [
{
"name": "API: Create Task with Due Date",
"steps": [...]
},
{
"name": "Browser: Date Picker and Task Creation",
"steps": [...]
},
{
"name": "Browser: Overdue Badge Display",
"steps": [...]
}
]
}
Let’s look at each scenario in detail.
Scenario 1: API — Create Task with Due Date
The first scenario tests the API directly. It creates a task with a due date and verifies the response.
{
"name": "API: Create Task with Due Date",
"steps": [
{
"step_key": "login",
"action": "http_request",
"config": {
"method": "POST",
"url": "{{api_base_url}}/api/auth/login",
"headers": { "Content-Type": "application/json" },
"body": {
"email": "{{test_user_email}}",
"password": "{{test_user_password}}"
}
},
"assertions": [
{ "type": "status_code", "expected": 200 }
],
"extract": {
"auth_token": "$.token"
}
},
{
"step_key": "create_task_with_due_date",
"action": "http_request",
"depends_on": ["login"],
"config": {
"method": "POST",
"url": "{{api_base_url}}/api/tasks",
"headers": {
"Content-Type": "application/json",
"Authorization": "Bearer {{auth_token}}"
},
"body": {
"title": "Review PR #42",
"due_date": "2026-04-01T00:00:00Z"
}
},
"assertions": [
{
"type": "status_code",
"expected": 201,
"description": "Task created successfully"
},
{
"type": "json_path",
"path": "$.task.due_date",
"expected": "2026-04-01T00:00:00Z",
"description": "Due date is stored correctly"
},
{
"type": "json_path",
"path": "$.task.id",
"condition": "exists"
}
],
"extract": {
"task_id": "$.task.id"
}
},
{
"step_key": "get_task_verify_due_date",
"action": "http_request",
"depends_on": ["create_task_with_due_date"],
"config": {
"method": "GET",
"url": "{{api_base_url}}/api/tasks/{{task_id}}",
"headers": {
"Authorization": "Bearer {{auth_token}}"
}
},
"assertions": [
{ "type": "status_code", "expected": 200 },
{
"type": "json_path",
"path": "$.task.due_date",
"expected": "2026-04-01T00:00:00Z",
"description": "Due date persisted correctly on re-fetch"
}
]
}
]
}
A few things to note:
- extract pulls values from responses for use in later steps. The auth_token from login is reused in subsequent API calls via {{auth_token}}.
- depends_on ensures steps run in the right order, and a failure in one step skips its dependents.
- Secrets like {{test_user_password}} are resolved from the environment file and automatically masked in logs and results. The actual password never reaches aqua's servers.
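If you're curious how extraction and templating fit together, the mechanism can be sketched in a few lines. This is a simplified illustration, not aqua's implementation: dotted paths stand in for full JSONPath support.

```python
import re

def extract(response: dict, rules: dict, variables: dict) -> None:
    # Pull values out of a JSON response using simplified "$.a.b" paths
    # and store them as variables for later steps.
    for var_name, path in rules.items():
        value = response
        for key in path.lstrip("$.").split("."):
            value = value[key]
        variables[var_name] = value

def render(template: str, variables: dict) -> str:
    # Substitute {{name}} placeholders with previously extracted values.
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(variables[m.group(1)]), template)

variables = {}
login_response = {"token": "abc123"}
extract(login_response, {"auth_token": "$.token"}, variables)
header = render("Bearer {{auth_token}}", variables)
# header == "Bearer abc123"
```

Each step's extract rules feed the shared variable pool, which is why a value captured in an HTTP step can show up later in a browser step's config.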
Scenario 2: Browser — Date Picker and Task Creation
Next, we verify the UI works. This scenario opens a browser, logs in, and creates a task using the date picker.
{
"name": "Browser: Date Picker and Task Creation",
"requires": ["web_base_url", "test_user_email", "test_user_password"],
"steps": [
{
"step_key": "create_task_via_ui",
"action": "browser",
"config": {
"viewport": "pc",
"steps": [
{ "goto": "{{web_base_url}}/login" },
{ "type": { "selector": "input[name='email']", "text": "{{test_user_email}}" } },
{ "type": { "selector": "input[name='password']", "text": "{{test_user_password}}" } },
{ "click": "button[type='submit']" },
{ "wait_for_url": "/tasks" },
{ "click": "[data-testid='new-task-button']" },
{ "type": { "selector": "input[name='title']", "text": "Ship v2.0" } },
{ "click": "input[name='due_date']" },
{ "wait_for_selector": ".date-picker-calendar" },
{ "click": "[data-date='2026-04-15']" },
{ "click": "[data-testid='save-task-button']" },
{ "wait_for_selector": "[data-testid='task-list'] >> text=Ship v2.0" },
{ "screenshot": "task_created_with_due_date" }
]
},
"assertions": [
{
"type": "element_visible",
"selector": "[data-testid='task-list'] >> text=Ship v2.0",
"description": "New task appears in the list"
},
{
"type": "element_text",
"selector": "[data-testid='task-due-date']",
"contains": "Apr 15",
"description": "Due date is displayed on the task card"
},
{
"type": "screenshot",
"name": "task_with_due_date",
"description": "Task list showing the new task with due date"
}
]
}
]
}
The requires field ensures this scenario is skipped (not failed) if the required variables aren’t available — useful when running the same plan against an API-only environment.
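The skip semantics are easy to express. Roughly, and as a sketch of the behavior described above rather than aqua's code:

```python
def check_requires(scenario: dict, available: set) -> str:
    # Scenarios with missing required variables are skipped, not failed,
    # so the rest of the plan still runs.
    missing = [v for v in scenario.get("requires", []) if v not in available]
    return "skipped" if missing else "runnable"

status = check_requires(
    {"requires": ["web_base_url", "test_user_email", "test_user_password"]},
    {"api_base_url"},  # an API-only environment
)
# status == "skipped"
```

A skipped scenario shows up as such in the results, which is a different signal from a red failure: the feature wasn't broken, it just wasn't testable in that environment.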
Scenario 3: Browser — Overdue Badge
The final scenario checks the overdue badge. It creates a task with a past due date via the API, then verifies the badge appears in the browser.
{
"name": "Browser: Overdue Badge Display",
"steps": [
{
"step_key": "create_overdue_task",
"action": "http_request",
"depends_on": ["login"],
"config": {
"method": "POST",
"url": "{{api_base_url}}/api/tasks",
"headers": {
"Content-Type": "application/json",
"Authorization": "Bearer {{auth_token}}"
},
"body": {
"title": "Overdue task for QA",
"due_date": "2025-01-01T00:00:00Z"
}
},
"assertions": [
{ "type": "status_code", "expected": 201 }
]
},
{
"step_key": "verify_overdue_badge",
"action": "browser",
"depends_on": ["create_overdue_task"],
"config": {
"steps": [
{ "goto": "{{web_base_url}}/tasks" },
{ "wait_for_selector": "text=Overdue task for QA" },
{ "screenshot": "overdue_badge" }
]
},
"assertions": [
{
"type": "element_visible",
"selector": "[data-testid='overdue-badge']",
"description": "Overdue badge is visible on the task"
},
{
"type": "screenshot",
"name": "overdue_badge_visible",
"description": "Task list showing the overdue badge"
}
]
}
]
}
Notice how this scenario mixes HTTP and browser steps — it uses the API to set up the data (creating a task with a past due date), then switches to the browser to verify the visual result. Dependencies work across action types seamlessly.
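How might a runner enforce this? A minimal sketch of ordered execution with skip propagation, illustrating the semantics rather than aqua's actual executor:

```python
def run_plan(steps, run_step):
    """Run steps in order; skip any step whose dependency didn't pass.

    Dependency handling is identical whether a step is an
    http_request or a browser action.
    """
    results = {}
    for step in steps:
        deps = step.get("depends_on", [])
        if any(results.get(d) != "passed" for d in deps):
            results[step["step_key"]] = "skipped"
        else:
            results[step["step_key"]] = "passed" if run_step(step) else "failed"
    return results

steps = [
    {"step_key": "create_overdue_task", "action": "http_request"},
    {"step_key": "verify_overdue_badge", "action": "browser",
     "depends_on": ["create_overdue_task"]},
]
results = run_plan(steps, lambda step: False)  # simulate the setup call failing
# results["verify_overdue_badge"] == "skipped"
```

The payoff in this scenario: if the API setup call fails, the browser step is skipped instead of failing with a confusing "element not found" error, so the report points at the real problem.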
Step 3: Execute the Plan
With the plan created, run it against your local environment:
npx @aquaqa/cli execute plan-abc123 --env local
Or let your AI agent handle it — it calls the execute_qa_plan MCP tool and gets a structured result:
# Execution Result: Due Date Feature - Task Management
Execution ID: exec-789xyz
Status: completed
Steps: 6 passed, 0 failed
## [PASS] API: Create Task with Due Date / login
HTTP 200 (52ms)
## [PASS] API: Create Task with Due Date / create_task_with_due_date
HTTP 201 (134ms)
✓ Task created successfully
✓ Due date is stored correctly
## [PASS] API: Create Task with Due Date / get_task_verify_due_date
HTTP 200 (41ms)
✓ Due date persisted correctly on re-fetch
## [PASS] Browser: Date Picker and Task Creation / create_task_via_ui
✓ New task appears in the list
✓ Due date is displayed on the task card
📸 task_with_due_date
## [PASS] Browser: Overdue Badge Display / create_overdue_task
HTTP 201 (98ms)
## [PASS] Browser: Overdue Badge Display / verify_overdue_badge
✓ Overdue badge is visible on the task
📸 overdue_badge_visible
URL: https://app.aquaqa.com/executions/exec-789xyz
All steps passed. Screenshots, HTTP request/response pairs, and DOM snapshots are captured automatically and viewable in the web dashboard.
Step 4: Run the Same Plan Against Staging
Your local tests pass. Now run the exact same plan against staging before merging the PR.
Create a staging environment:
// .aqua/environments/staging.json
{
"notes": "Staging environment - deployed on every push to main",
"variables": {
"api_base_url": "https://staging-api.example.com",
"web_base_url": "https://staging.example.com"
},
"secrets": {
"test_user_email": {
"type": "op",
"value": "op://Staging/todo-app-qa/email"
},
"test_user_password": {
"type": "op",
"value": "op://Staging/todo-app-qa/password"
}
}
}
Then execute:
npx @aquaqa/cli execute plan-abc123 --env staging
Same plan, different environment. No regeneration needed. The results are recorded separately, so you can compare local vs. staging runs side by side in the dashboard.
Step 5: Post-Deploy Verification
After the PR is merged and deployed, run the plan one more time against production:
npx @aquaqa/cli execute plan-abc123 --env production
This is the same verification that passed in local and staging, now confirming the feature works in production with real infrastructure. Each execution is recorded with its environment, so you have a clear audit trail: this plan passed in local, staging, and production.
What Makes This Different
If you’ve worked with QA before, much of this might feel familiar on the surface. But consider what aqua gives you that traditional approaches don’t:
- AI-generated plans — Your coding agent analyzed the PR diff and built the test plan. You didn’t have to write it from scratch or figure out what to test.
- Structured and reusable — The plan is versioned and immutable. Run it today against local, tomorrow against staging, next week against production. It’s the same plan every time.
- Secrets stay local — Passwords and API keys are resolved from your secret provider and masked in results. The AI agent never sees them.
- Automatic recording — Every execution captures HTTP details, screenshots, and DOM snapshots in a consistent format. Your team can review results in the dashboard without asking “what did you test?”
- Mixed API and browser testing — A single plan can combine HTTP requests for fast data setup with browser steps for UI verification. Dependencies flow across both.
What’s Next
This walkthrough covered the core workflow, but there’s much more to explore — writing effective QA plans, managing secrets at scale, integrating aqua into CI/CD pipelines, and leveraging project memory to make your AI agent smarter over time. We’ll cover these topics in upcoming posts.
Ready to try it yourself? Check out the quickstart guide to get started.