WebTest AI Config and API Reference

This document is the single reference for WebTest AI configuration, CLI commands, Markdown spec fields, step language, reports, artifacts, and programmatic APIs.

The short rule:

Specs describe user intent and expected behavior.
Config describes environment, runtime policy, driver selection, reporting, healing, memory, models, and auth defaults.
CLI flags select a run and override operational choices for one invocation.

Configuration Loading

WebTest AI loads JSON config from webtest-ai.config.json by default. Pass --config <path> to use another file. A missing config file is not an error; the built-in defaults apply.

Config is deep-merged over defaults. Arrays replace default arrays instead of merging element-by-element.
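
The merge rule above can be sketched as follows. This is a minimal illustration, not the code in src/config/loadConfig.js; deepMerge and isPlainObject are hypothetical helper names.

```javascript
// Hypothetical helpers: the real loader may be implemented differently.
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

function deepMerge(defaults, overrides) {
  const result = { ...defaults };
  for (const [key, value] of Object.entries(overrides)) {
    if (isPlainObject(value) && isPlainObject(defaults[key])) {
      // Nested objects merge key by key.
      result[key] = deepMerge(defaults[key], value);
    } else {
      // Scalars, null, and arrays replace the default value wholesale.
      result[key] = value;
    }
  }
  return result;
}

const defaults = {
  reporting: { redact: { enabled: true, headers: ["authorization", "cookie"] } },
};
const userConfig = { reporting: { redact: { headers: ["x-internal-token"] } } };

const merged = deepMerge(defaults, userConfig);
console.log(merged.reporting.redact);
```

In this sketch enabled keeps its default while headers contains only x-internal-token. A config that overrides an array such as reporting.redact.headers therefore has to repeat any default entries it still wants.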

Implemented loader:

src/config/loadConfig.js
public helpers: loadConfig(configPath), getDefaultConfig()

Default shape:

{
  "execution": {
    "artifacts": {
      "trace": "retain-on-failure",
      "screenshot": "only-on-failure"
    },
    "journey": {
      "enabled": false,
      "capture": "navigation",
      "maxSnapshots": 20
    },
    "quality": {
      "a11y": {
        "mode": "warn",
        "failOnSerious": true,
        "maxSerious": 0
      },
      "vitals": {
        "mode": "warn",
        "lcpMaxMs": 2500,
        "clsMax": 0.1
      }
    }
  },
  "reporting": {
    "redact": {
      "enabled": true,
      "headers": ["authorization", "cookie", "set-cookie", "x-api-key"],
      "queryParams": ["token", "session", "email", "password"],
      "patterns": [
        {
          "name": "email",
          "regex": "[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,}",
          "flags": "gi",
          "replacement": "[REDACTED_EMAIL]"
        },
        {
          "name": "bearer",
          "regex": "Bearer\\s+[A-Za-z0-9._-]+",
          "flags": "g",
          "replacement": "Bearer [REDACTED_TOKEN]"
        }
      ]
    },
    "network": {
      "excludeUrls": []
    }
  },
  "driver": {
    "name": "playwright",
    "package": null,
    "path": null,
    "require": [],
    "options": {}
  },
  "healing": {
    "enabled": false,
    "mode": "bounded",
    "confidenceThreshold": 0.8,
    "maxAttemptsPerStep": 1,
    "snapshot": {
      "maxCandidates": 20,
      "maxTextLength": 120,
      "maxNearbyTextLength": 280
    }
  },
  "memory": {
    "enabled": true,
    "path": ".webtest-ai/ui-inventory.json",
    "mode": "propose",
    "maxCandidatesPerIntent": 5,
    "staleAfterDays": 60
  },
  "models": {
    "activeProfile": null,
    "profiles": {
      "local-reasoner": {
        "provider": "ollama",
        "model": "qwen3:14b",
        "endpoint": "http://127.0.0.1:11434",
        "apiKeyEnv": null,
        "capabilities": {
          "structuredJson": true,
          "reasoning": true,
          "toolCalling": false,
          "streaming": false,
          "vision": false
        },
        "limits": {
          "timeoutMs": 15000,
          "retries": 1,
          "maxInputBytes": 120000,
          "maxOutputTokens": 4096,
          "maxSessionTurns": 8
        }
      }
    },
    "writePolicy": {
      "roots": ["specs", "artifacts", ".webtest-ai"],
      "extensions": [".md", ".json", ".js"]
    }
  }
}

Reporting Config

reporting.redact.enabled: Enables redaction in WebTest AI-owned JSON reports, HTML reports, network summaries, console errors, step output, healing traces, healing snapshots, and generated memory proposals. Defaults to true.

reporting.redact.headers: Header names whose values are replaced with [REDACTED] when header-like blocks are rendered.

reporting.redact.queryParams: Query parameter names whose values are replaced with [REDACTED] in URLs.

reporting.redact.patterns: Regex replacements applied to text fields. Each entry supports name, regex, flags, and replacement.

reporting.network.excludeUrls: URL substrings omitted from WebTest AI-owned network summaries and HTML trace preview tables.
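
As an illustration of how the redact fields could be applied, here is a minimal sketch. redactText and redactUrl are hypothetical names, query parameter names are matched exactly (an assumption), and the actual redactor may differ.

```javascript
// Default patterns copied from the config shape above.
const patterns = [
  {
    name: "email",
    regex: "[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,}",
    flags: "gi",
    replacement: "[REDACTED_EMAIL]",
  },
  {
    name: "bearer",
    regex: "Bearer\\s+[A-Za-z0-9._-]+",
    flags: "g",
    replacement: "Bearer [REDACTED_TOKEN]",
  },
];

// Hypothetical helper: apply each configured pattern in order.
function redactText(text, patternList) {
  return patternList.reduce(
    (current, { regex, flags, replacement }) =>
      current.replace(new RegExp(regex, flags), replacement),
    text
  );
}

// Hypothetical helper: replace values of listed query parameters.
// Works on the raw string so the [REDACTED] marker is not percent-encoded.
function redactUrl(rawUrl, queryParams) {
  const blocked = new Set(queryParams);
  const hashIndex = rawUrl.indexOf("#");
  const withoutHash = hashIndex === -1 ? rawUrl : rawUrl.slice(0, hashIndex);
  const hash = hashIndex === -1 ? "" : rawUrl.slice(hashIndex);
  const queryIndex = withoutHash.indexOf("?");
  if (queryIndex === -1) return rawUrl;
  const query = withoutHash
    .slice(queryIndex + 1)
    .split("&")
    .map((pair) => {
      const [name] = pair.split("=");
      return blocked.has(name) ? `${name}=[REDACTED]` : pair;
    })
    .join("&");
  return `${withoutHash.slice(0, queryIndex)}?${query}${hash}`;
}

console.log(redactText("Authorization: Bearer abc.def-123", patterns));
console.log(redactUrl("https://app.example/cb?token=abc&page=2#top", ["token"]));
```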

Privacy boundary:

Redaction applies to WebTest AI-owned outputs.
Raw Playwright trace archives may still contain original captured data.
If raw artifact privacy is required, control it through trace retention policy (execution.artifacts.trace), because redaction does not rewrite raw trace archives.

Execution Quality Config

execution.quality.a11y.mode: off, warn, or fail. Defaults to warn.

execution.quality.a11y.failOnSerious: When true, serious or critical axe violations create a quality finding. Defaults to true.

execution.quality.a11y.maxSerious: Allowed serious/critical axe violations before a finding is emitted. Defaults to 0.

execution.quality.vitals.mode: off, warn, or fail. Defaults to warn.

execution.quality.vitals.lcpMaxMs: Largest Contentful Paint threshold in milliseconds. Defaults to 2500.

execution.quality.vitals.clsMax: Cumulative Layout Shift threshold. Defaults to 0.1.

Quality findings are collected through driver session capabilities. warn records findings without changing functional status; fail turns an otherwise-passing run into a failed result with qualityFailures.
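
The warn/fail distinction can be sketched like this for vitals. The function names, the finding shape, and the strict greater-than comparison are assumptions made for illustration; only the config fields, mode values, and qualityFailures come from this reference.

```javascript
// Hypothetical helper: compare measured vitals against the configured thresholds.
function collectVitalsFindings(measured, policy) {
  if (policy.mode === "off") return [];
  const findings = [];
  if (measured.lcpMs > policy.lcpMaxMs) {
    findings.push({ metric: "lcp", value: measured.lcpMs, max: policy.lcpMaxMs });
  }
  if (measured.cls > policy.clsMax) {
    findings.push({ metric: "cls", value: measured.cls, max: policy.clsMax });
  }
  return findings;
}

// Hypothetical helper: warn records findings only, fail also changes the status.
function applyQualityPolicy(functionalStatus, findings, mode) {
  const blocking = mode === "fail" && findings.length > 0;
  return {
    status: functionalStatus === "passed" && blocking ? "failed" : functionalStatus,
    findings,
    qualityFailures: blocking ? findings : [],
  };
}

const vitalsPolicy = { mode: "warn", lcpMaxMs: 2500, clsMax: 0.1 };
const vitalsFindings = collectVitalsFindings({ lcpMs: 3100, cls: 0.05 }, vitalsPolicy);
console.log(applyQualityPolicy("passed", vitalsFindings, vitalsPolicy.mode));
```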

Execution Artifact And Profile Config

execution.artifacts.trace: off, always, or retain-on-failure. Defaults to retain-on-failure.

execution.artifacts.screenshot: off, always, or only-on-failure. Defaults to only-on-failure.

execution.journey.enabled: Enables journey snapshots. Defaults to false.

execution.journey.capture: navigation captures after URL-changing passed actions. step captures after every passed action. off disables capture.

execution.journey.maxSnapshots: Maximum journey screenshots per test. Defaults to 20.

Runtime profiles are shortcut overlays:

fast: disables traces, journey, a11y, and vitals; keeps screenshots only on failure.
ci: keeps traces and screenshots only on failure; leaves quality policy from config.
debug: keeps trace and screenshot always, enables step-level journey snapshots, and keeps quality in warn mode.
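
One way to picture the three profiles is as partial execution objects laid over the loaded config. The overlay table and applyProfile below are a hypothetical sketch built from the descriptions above, not the shipped implementation.

```javascript
// Hypothetical overlay table derived from the profile descriptions.
const PROFILE_OVERLAYS = {
  fast: {
    artifacts: { trace: "off", screenshot: "only-on-failure" },
    journey: { enabled: false, capture: "off" },
    quality: { a11y: { mode: "off" }, vitals: { mode: "off" } },
  },
  ci: {
    // No quality keys: the quality policy from config is left untouched.
    artifacts: { trace: "retain-on-failure", screenshot: "only-on-failure" },
  },
  debug: {
    artifacts: { trace: "always", screenshot: "always" },
    journey: { enabled: true, capture: "step" },
    quality: { a11y: { mode: "warn" }, vitals: { mode: "warn" } },
  },
};

// Hypothetical helper: lay a profile over the execution section of the config.
function applyProfile(execution, profileName) {
  const overlay = PROFILE_OVERLAYS[profileName];
  if (!overlay) throw new Error(`Unknown profile: ${profileName}`);
  const quality = overlay.quality || {};
  return {
    ...execution,
    artifacts: { ...execution.artifacts, ...overlay.artifacts },
    journey: { ...execution.journey, ...overlay.journey },
    quality: {
      a11y: { ...execution.quality.a11y, ...quality.a11y },
      vitals: { ...execution.quality.vitals, ...quality.vitals },
    },
  };
}

const baseExecution = {
  artifacts: { trace: "always", screenshot: "always" },
  journey: { enabled: false, capture: "navigation", maxSnapshots: 20 },
  quality: {
    a11y: { mode: "fail", failOnSerious: true, maxSerious: 0 },
    vitals: { mode: "warn", lcpMaxMs: 2500, clsMax: 0.1 },
  },
};
console.log(applyProfile(baseExecution, "ci"));
```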

Driver Config

driver.name: Execution driver name. Defaults to playwright.

driver.path: Optional local module path for a custom driver.

driver.package: Optional package name for a custom driver.

driver.require: Capability names that must be supported by the active driver. A run fails early if the driver does not advertise a required capability.

driver.options: Driver-specific options passed through to custom driver factories.

The built-in Playwright driver supports actions, assertions, screenshots, traces, network summaries, console errors, auth state, basic frames/popups, healing snapshots, axe scans, and vitals collection. Custom driver modules should export a factory, either as the module's function export or as a named createDriver or launchDriver export. The factory returns a driver object with capability flags and a createSession() method.

Example custom-driver config:

{
  "driver": {
    "name": "chrome-cdp",
    "path": "examples/drivers/chrome-cdp-driver.js",
    "require": ["actions", "assertions", "screenshots"],
    "options": {
      "executablePath": "/path/to/chrome"
    }
  }
}

The Chrome CDP example talks directly to the Chrome DevTools Protocol. It may use the Playwright package only to locate the bundled Chromium executable when no system Chrome path is configured; execution still goes through CDP, not Playwright APIs.

Existing browser endpoint route:

If an IDE, agent environment, or manually launched Chrome exposes a Chrome DevTools Protocol WebSocket endpoint, WebTest AI can connect through the CDP example driver instead of launching a new browser:

{
  "driver": {
    "name": "chrome-cdp",
    "path": "examples/drivers/chrome-cdp-driver.js",
    "options": {
      "wsEndpoint": "ws://127.0.0.1:9222/devtools/browser/<browser-id>"
    }
  }
}

Endpoint resolution order is driver.options.wsEndpoint, driver.options.endpoint, WEBTEST_AI_CDP_WS_ENDPOINT, then CURSOR_BROWSER_WS_ENDPOINT. Cursor's public browser-tool docs do not currently document a CDP WebSocket endpoint; WebTest AI can drive the Cursor browser only if such an endpoint is exposed.
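
The resolution order reads naturally as a fallback chain. The sketch below is illustrative; resolveCdpEndpoint is a hypothetical name, and returning null when nothing is set is an assumption.

```javascript
// Hypothetical helper: first non-empty value wins, in the documented order.
function resolveCdpEndpoint(options = {}, env = process.env) {
  return (
    options.wsEndpoint ||
    options.endpoint ||
    env.WEBTEST_AI_CDP_WS_ENDPOINT ||
    env.CURSOR_BROWSER_WS_ENDPOINT ||
    null
  );
}

console.log(
  resolveCdpEndpoint(
    { endpoint: "ws://127.0.0.1:9222/devtools/browser/from-options" },
    { WEBTEST_AI_CDP_WS_ENDPOINT: "ws://127.0.0.1:9333/devtools/browser/from-env" }
  )
);
```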

Capability-aware execution:

actions: open, click, fill, and submit steps.
assertions: text and URL assertions.
screenshots: explicit Capture screenshot steps.
network: Wait for network steps.
frames: frame-targeted steps.
popups: page switching.
authState: tests with auth metadata.
healingSnapshots: opt-in bounded healing for Click action, Click element, Click text, and Click role.
a11ySurface: accessibility-tree action surface used before DOM locator fallback when the driver provides it.
a11y: axe-based accessibility scanning.
vitals: LCP/CLS collection.

When a test needs a capability that the active driver does not advertise, WebTest AI records the test as skipped with a clear reason.
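
The two capability checks, early failure for driver.require and per-test skipping, might look like this. The driver shape (a capabilities object of booleans), the needs list on a test, and all function names are assumptions for the sketch.

```javascript
// Hypothetical helper: list capabilities the driver does not advertise.
function missingCapabilities(advertised, needed) {
  return needed.filter((name) => advertised[name] !== true);
}

// driver.require: the whole run fails early.
function assertRequiredCapabilities(driver, required) {
  const missing = missingCapabilities(driver.capabilities, required);
  if (missing.length > 0) {
    throw new Error(`Driver "${driver.name}" does not advertise: ${missing.join(", ")}`);
  }
}

// Per-test needs: the test is skipped with a reason instead of failing.
function planTest(driver, test) {
  const missing = missingCapabilities(driver.capabilities, test.needs);
  return missing.length > 0
    ? { status: "skipped", reason: `driver lacks: ${missing.join(", ")}` }
    : { status: "ready" };
}

const smallDriver = {
  name: "chrome-cdp",
  capabilities: { actions: true, assertions: true, screenshots: true },
};

assertRequiredCapabilities(smallDriver, ["actions", "assertions"]);
console.log(planTest(smallDriver, { name: "checkout iframe", needs: ["actions", "frames"] }));
```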

Healing Config

healing.enabled: Enables bounded runtime healing for supported actions. Defaults to false.

healing.mode: bounded is the only implemented mode. The field exists so that other modes can be added later.

healing.confidenceThreshold: Minimum candidate confidence required for a healed candidate. Defaults to 0.8.

healing.maxAttemptsPerStep: Reserved policy field. Defaults to 1.

healing.snapshot.maxCandidates: Maximum visible interactive candidates captured in one healing snapshot.

healing.snapshot.maxTextLength: Maximum length for candidate text-like fields.

healing.snapshot.maxNearbyTextLength: Maximum nearby context text length per candidate.

Implemented healing applies to Click action "<intent>" and bounded fallback recovery for Click element, Click text, and Click role after deterministic click failure. When a11ySurface is available, click resolution and healing prefer bounded accessibility-tree candidates before DOM fallback. Click action requires healingSnapshots when healing is enabled; explicit click steps run deterministically on smaller drivers and use healing fallback only when snapshots are available. Healing ranks only candidates collected by WebTest AI or the active driver, refuses unsafe or ambiguous candidates, and emits reviewed memory/spec patch artifacts after verified recovery.
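
A rough sketch of the selection rule follows. Only confidenceThreshold comes from config; the unsafe flag, the ambiguityMargin value, and the result shape are hypothetical stand-ins for whatever the healer actually checks.

```javascript
// Hypothetical helper: pick a healed candidate or refuse with a reason.
function selectHealedCandidate(
  candidates,
  { confidenceThreshold = 0.8, ambiguityMargin = 0.05 } = {}
) {
  const ranked = candidates
    .filter((candidate) => !candidate.unsafe)
    .sort((a, b) => b.confidence - a.confidence);
  const [best, runnerUp] = ranked;

  if (!best || best.confidence < confidenceThreshold) {
    return { healed: false, reason: "below-threshold" };
  }
  // Assumption: two near-equal candidates count as ambiguous and are refused.
  if (runnerUp && best.confidence - runnerUp.confidence < ambiguityMargin) {
    return { healed: false, reason: "ambiguous" };
  }
  return { healed: true, candidate: best };
}

console.log(
  selectHealedCandidate([
    { ref: "ax12", confidence: 0.92 },
    { ref: "ax40", confidence: 0.6 },
  ])
);
```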

Memory Config

memory.enabled: Enables generated UI inventory lookup. Defaults to true.

memory.path: Filesystem JSON inventory path. Defaults to .webtest-ai/ui-inventory.json.

memory.mode: Inventory write policy.

Supported policy values:

propose: write proposal artifacts after verified healing.
auto: write verified healing directly to memory.path.
read: intended read-only mode; current behavior reads memory and still writes proposal artifacts for verified healing.
review-required: intended review workflow using webtest-ai heal queue, pending, approve-pending, and reject-pending.
off: intended off mode; today use memory.enabled: false for equivalent behavior.

memory.maxCandidatesPerIntent: Maximum locator candidates retained per app/page/intent.

memory.staleAfterDays: Ignores memory candidates whose lastSeenAt is older than this many days.
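
The staleness cutoff and per-intent cap can be sketched as a filter over stored candidates. The candidate shape beyond lastSeenAt, the newest-first ordering, and the function name are assumptions.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: drop stale candidates, then keep the newest few.
function usableCandidates(candidates, memoryConfig, now = new Date()) {
  const cutoff = now.getTime() - memoryConfig.staleAfterDays * DAY_MS;
  return candidates
    .filter((candidate) => new Date(candidate.lastSeenAt).getTime() >= cutoff)
    .sort((a, b) => new Date(b.lastSeenAt) - new Date(a.lastSeenAt))
    .slice(0, memoryConfig.maxCandidatesPerIntent);
}

const stored = [
  { locator: "role=button[name=Run]", lastSeenAt: "2025-02-20T00:00:00Z" },
  { locator: "text=Run", lastSeenAt: "2024-11-01T00:00:00Z" },
  { locator: "#run", lastSeenAt: "2025-02-25T00:00:00Z" },
];

console.log(
  usableCandidates(
    stored,
    { staleAfterDays: 60, maxCandidatesPerIntent: 5 },
    new Date("2025-03-01T00:00:00Z")
  )
);
```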

Models Config

models.activeProfile: Name of the single active model profile for optional model workflows. null disables model calls.

models.profiles.<name>.provider: Adapter key. Supported local/open-compatible keys include ollama, openai-compatible, vllm, lmstudio, llama.cpp, and openrouter-compatible. Legacy flat provider values are accepted temporarily for compatibility, but new config should use profiles.

models.profiles.<name>.model: Model name sent to the provider.

models.profiles.<name>.endpoint: Provider endpoint. ollama uses <endpoint>/api/chat; OpenAI-compatible adapters use <endpoint>/v1/chat/completions unless the endpoint already ends with /chat/completions.
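
The endpoint rule is small enough to sketch directly. resolveChatUrl is a hypothetical name, and trimming trailing slashes before joining is an assumption.

```javascript
// Hypothetical helper: derive the chat URL from provider and endpoint.
function resolveChatUrl(provider, endpoint) {
  const base = endpoint.replace(/\/+$/, "");
  if (provider === "ollama") return `${base}/api/chat`;
  // OpenAI-compatible adapters: keep a full chat-completions URL as given.
  if (base.endsWith("/chat/completions")) return base;
  return `${base}/v1/chat/completions`;
}

console.log(resolveChatUrl("ollama", "http://127.0.0.1:11434"));
console.log(resolveChatUrl("openai-compatible", "https://llm.example"));
```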

models.profiles.<name>.apiKeyEnv: Optional environment variable that contains a bearer token.

models.profiles.<name>.capabilities: Explicit booleans for router/session requirements, including structuredJson, reasoning, toolCalling, streaming, and vision.

models.profiles.<name>.limits: Request and session bounds such as timeoutMs, retries, maxInputBytes, maxOutputTokens, and maxSessionTurns.

models.writePolicy: Roots and extensions enforced before guarded autonomous maintenance applies generated writes. The policy resolves paths against the workspace and blocks traversal or writes outside approved test-related surfaces.
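
A write-policy check along these lines would resolve the target against the workspace and then test root containment and extension. isWriteAllowed is a hypothetical name and the sketch ignores symlinks; the centralized check in WebTest AI may do more.

```javascript
const path = require("node:path");

// Hypothetical helper: true only for allowed extensions inside an approved root.
function isWriteAllowed(workspaceRoot, targetPath, policy) {
  const resolved = path.resolve(workspaceRoot, targetPath);
  if (!policy.extensions.includes(path.extname(resolved))) return false;

  return policy.roots.some((root) => {
    const allowedRoot = path.resolve(workspaceRoot, root);
    const relative = path.relative(allowedRoot, resolved);
    // A relative path that climbs out of the root means traversal.
    const escapes =
      relative === ".." ||
      relative.startsWith(`..${path.sep}`) ||
      path.isAbsolute(relative);
    return relative !== "" && !escapes;
  });
}

const writePolicy = {
  roots: ["specs", "artifacts", ".webtest-ai"],
  extensions: [".md", ".json", ".js"],
};

console.log(isWriteAllowed("/repo", "specs/login.md", writePolicy));
console.log(isWriteAllowed("/repo", "specs/../src/index.js", writePolicy));
```

The first call is allowed; the second resolves outside every approved root and is rejected even though its extension is permitted.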

Model-enabled paths record call metadata in run or recording results: purpose, mode, provider, model, profile, timing, message count, prompt byte size, status, and error message when applicable. Prompts, responses, API keys, and auth values are not stored in this metadata.

The sidecar only ranks bounded candidates already produced by WebTest AI. Discovery and maintenance workflows return structured proposals and apply writes only after centralized write-policy checks.

For live-provider smoke testing with Qwen, DeepSeek, Kimi, GLM4, and Yi, see Live Model Testing.

Current Runtime Defaults Not Yet Configurable

These values are implemented as hardcoded runtime defaults today and are good candidates for future config or CLI flags:

artifacts root: artifacts/
reports root: artifacts/reports/
auth storage directory: playwright/.auth/
popup detection timeout: 1500 ms
browser launch executable override for Chromium
report environment: always local

CLI API

Entrypoint:

node ./src/cli/index.js <command> [flags]

Installed package entrypoint:

webtest-ai <command> [flags]

`run`

Runs one Markdown suite.

node ./src/cli/index.js run --suite specs/webtest-ai-demo.md --config webtest-ai.config.json

Flags:

--suite <path>: Markdown suite path. Defaults to specs/webtest-ai-demo.md.
--config <path>: JSON config path. Defaults to webtest-ai.config.json.
--tags <a,b>: include tests that contain all listed tags.
--exclude-tags <a,b>: exclude tests containing any listed tag.
--workers <n>: parallel worker capacity. Defaults to 1.
--headed: launch browser headed.
--debug: enables Playwright Inspector behavior and pauses before test execution.
--refresh-auth: ignore reusable storage state and create fresh UI login state where applicable.
--auth-profile <name>: override the auth profile name used for reusable sessions.
--profile fast|ci|debug: apply a runtime profile overlay.
--trace off|always|retain-on-failure: override trace policy for this run.
--screenshot off|always|only-on-failure: override final screenshot policy for this run.
--quality off: disable a11y and vitals collection for this run.
--journey off|navigation|step: override journey snapshot capture for this run.
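
The --tags and --exclude-tags semantics (all included tags must match, any excluded tag rejects) can be sketched as below. parseTagList and matchesTagFilters are hypothetical names; trimming whitespace around commas is an assumption.

```javascript
// Hypothetical helper: "a, b" -> ["a", "b"].
function parseTagList(value) {
  return value
    ? value.split(",").map((tag) => tag.trim()).filter(Boolean)
    : [];
}

// Hypothetical helper: include requires every tag, exclude rejects on any tag.
function matchesTagFilters(testTags, includeTags, excludeTags) {
  const tags = new Set(testTags);
  return (
    includeTags.every((tag) => tags.has(tag)) &&
    !excludeTags.some((tag) => tags.has(tag))
  );
}

const include = parseTagList("smoke,public");
const exclude = parseTagList("flaky");

console.log(matchesTagFilters(["smoke", "public", "checkout"], include, exclude));
console.log(matchesTagFilters(["smoke"], include, exclude));
```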

Run output prints suite name, counts, worker count, run id, config path, JSON report path, and HTML report path.

`debug`

Shorthand for run --debug.

node ./src/cli/index.js debug --suite specs/webtest-ai-demo.md --tags public

This sets PWDEBUG=1, forces headed behavior through debug mode, and pauses with Playwright Inspector.

`heal`

Reviews, queues, approves, rejects, or publishes generated UI inventory proposals.

node ./src/cli/index.js heal list --proposal artifacts/<runId>/<testId>/ui-inventory.proposed.json

Subcommands:

list / show: print proposed inventory updates.
queue: store proposal updates in the pending review file.
pending: list queued pending updates.
approve: merge proposal updates into configured inventory.
approve-pending: promote queued pending updates into inventory.
reject-pending: remove queued pending updates.
publish-pr: write local PR metadata stub only.

Flags:

--proposal <path>: proposal JSON path. Required except for pending, approve-pending, and reject-pending.
--config <path>: config used to resolve memory.path.
--memory <path>: override memory.path for this command.
--app <name>: fallback app key for older proposal files.
--index <n>: select one pending update.
--all: select all pending updates.
--patch <path>: optional patch markdown path for publish-pr.
--title <title>: optional PR stub title.
--dry-run: preview approve or approve-pending without writing inventory or consuming pending review items.

`discover`

Runs a bounded model discovery workflow and emits candidate test-flow proposals. It uses the active model profile and does not affect deterministic run pass/fail truth.

node ./src/cli/index.js discover --url https://app.example --dry-run
node ./src/cli/index.js discover --url https://app.example --output artifacts/discovery/app.json

Flags:

--url <seed-url> / --seed <seed-url>: seed URL. May be repeated.
--config <path>: JSON config path. Defaults to webtest-ai.config.json.
--max-turns <n>: override the active profile session turn limit.
--output <path>: write the normalized discovery proposal JSON.
--dry-run: print the proposal summary without writing an artifact.

`record`

Records a manual browser flow and generates a Markdown spec plus recording artifacts. Markdown is the source-of-truth contract.

node ./src/cli/index.js record --url https://app.example --output specs/recorded-flow.md

Flags:

--url <url>: URL to open and record. Required.
--output <path>: Markdown spec output path. Defaults to specs/recording-<timestamp>.md.
--emit-starter: also write an optional Playwright starter test for migration/debugging.
--starter <path>: starter-test output path. Passing this flag implies --emit-starter.
--event-log <path>: redacted event-log output path.
--api-manifest <path>: redacted API/network manifest output path.
--inventory-proposal <path>: UI inventory proposal output path.
--name <name>: generated test name.
--tags <a,b>: generated suite/test tags.
--headless: run browser headless for scripted capture. Recorder defaults to headed.
--force: overwrite generated Markdown and optional starter files.
--save-auth: save storageState after recording.
--auth-profile <name>: profile name for saved auth state. Defaults to recorded-user.
--record-auth-flow: include login/MFA-like steps as env placeholders. Without this, recorder trims obvious username/password/MFA segments and prefers saved-session reuse when --save-auth is used.
--capture-network manifest|full: writes a redacted API manifest. full currently adds a privacy warning; request/response bodies are not stored by the MVP.
--smart: optionally clean up goal, tags, steps, and expected assertions with one model call when model config is enabled. Deterministic output remains the fallback.

Recorder outputs:

Markdown spec
optional Playwright starter test when --emit-starter or --starter is used
redacted event log
redacted API manifest with Wait for network suggestions
reviewable ui-inventory.proposed.json selector-memory proposal
optional saved auth state when --save-auth is used

Interactive auth/MFA policy:

The user enters live MFA codes in the headed browser during recording.
Passwords, OTPs, tokens, and similar values are redacted from event logs.
Generated login-flow specs use env placeholders such as WEBTEST_AI_PASSWORD and WEBTEST_AI_MFA_CODE.
Post-login business-flow specs should use saved storageState reuse instead of replaying MFA.

`report`

Exports WebTest AI report data for external integrations. The command is vendor-neutral and does not call Jira, Slack, MCP servers, or other remote services.

node ./src/cli/index.js report export --report artifacts/reports/<runId>.json
node ./src/cli/index.js report summary --report artifacts/reports/<runId>.json --format markdown

Flags:

--report <path>: source WebTest AI JSON report. Required.
--output <path>: export path. Defaults beside the source report.
--format json|ndjson: output format for report export. Defaults to json.
--format markdown|text|json: output format for report summary. Defaults to markdown.
--stdout: write export content to stdout instead of a file.
--force: overwrite an existing export file.

Exported JSON shape:

version
kind: "webtest-ai.integration-report"
source: source report metadata
run: run metadata, filters, scheduling, browser, and driver
summary: status counts and run success boolean
tests: compact per-test records with status, tags, timing, failure reason, quality findings, model-call metadata, counts, and artifact paths

Use this output to build project-specific integrations, CI annotations, dashboards, MCP resources/tools, Jira tickets, Slack messages, or any other workflow outside WebTest AI core.
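
A project-specific integration might start from something like this sketch, which turns an export payload into a short message. The kind value and status names come from this reference; the name field on test records and the function name are assumptions, and nothing here posts to a real service.

```javascript
const ATTENTION_STATUSES = new Set(["failed", "blocked", "healed-passed"]);

// Hypothetical helper: one line per test that needs attention.
function summarizeForChat(exportPayload) {
  if (exportPayload.kind !== "webtest-ai.integration-report") {
    throw new Error(`Unexpected export kind: ${exportPayload.kind}`);
  }
  const attention = exportPayload.tests.filter((test) =>
    ATTENTION_STATUSES.has(test.status)
  );
  if (attention.length === 0) return "No failed, blocked, or healed tests.";
  return attention.map((test) => `${test.status}: ${test.name}`).join("\n");
}

// In practice the payload would be read from the file written by `report export`.
const samplePayload = {
  kind: "webtest-ai.integration-report",
  tests: [
    { name: "Login works", status: "passed" },
    { name: "Checkout total", status: "failed" },
    { name: "Profile save", status: "healed-passed" },
  ],
};

console.log(summarizeForChat(samplePayload));
```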

CI summary output:

kind: "webtest-ai.ci-summary" for JSON summaries
status counts and success boolean
failed, blocked, and healed tests that need attention
first failed step and artifact paths when available

`mcp`

Starts the WebTest AI MCP server over stdio.

node ./src/cli/index.js mcp

Flags:

--config <path>: config used by MCP tools that need WebTest AI runtime policy.

The MCP server exposes WebTest AI data and operations. It does not call vendor services or publish to external systems.

The browser-session MCP tools use Playwright Chromium directly because they are explicit debug and recording helpers. They are separate from deterministic webtest-ai.run.suite execution, which still goes through the configured driver boundary and owns pass/fail truth.

Resources:

webtest-ai://reports/latest
webtest-ai://reports/latest/integration
webtest-ai://reports/{name}
webtest-ai://reports/{name}/integration
webtest-ai://specs
webtest-ai://specs/{path}
webtest-ai://artifacts/{path}
webtest-ai://schemas/integration-report-v1

Tools:

webtest-ai.run.suite
webtest-ai.report.list
webtest-ai.report.summary
webtest-ai.report.export
webtest-ai.report.ci_summary
webtest-ai.spec.parse
webtest-ai.browser.start_session
webtest-ai.browser.click
webtest-ai.browser.fill
webtest-ai.browser.assert_text
webtest-ai.browser.snapshot
webtest-ai.browser.screenshot
webtest-ai.browser.to_spec
webtest-ai.browser.stop_session

Browser MCP sessions are for explicit debug, recording, and agent-assisted exploration. webtest-ai.browser.snapshot returns accessibility-tree candidates with session-local ax* refs when the browser exposes an a11y snapshot. webtest-ai.browser.click can target those refs with ref, or continue to use selector, role/name, or text inputs. Normal CI pass/fail truth still belongs to deterministic WebTest AI runs.

webtest-ai.run.suite runs a Markdown suite through the same deterministic runner used by the CLI and writes normal JSON/HTML reports. Runner progress logs are sent to stderr so MCP stdout remains valid protocol output.

Reserved Commands

No SaaS-specific publisher commands are built into core.

Programmatic recorder APIs exist for recordFlow(), generateSpec(), and normalized recorder output. Report export APIs exist for buildIntegrationExport(), writeIntegrationExport(), and publishResults(). publishResults() returns the same vendor-neutral export payload and does not publish to an external service.

Recommended Future CLI Overrides

These flags are not implemented yet, but they are the natural split between specs and runtime operation:

--base-url <url>
--env <name>
--browser <chromium|firefox|webkit|all>
--timeout <ms>
--navigation-timeout <ms>
--report-dir <path>
--artifacts-dir <path>
--healing on|off
--memory-mode propose|auto|read|off
--run-id <id>

Markdown Suite API

One Markdown file is one suite. YAML front matter defines suite metadata. Each top-level # heading defines one test.

Suite Front Matter

Current supported fields:

suite: Suite name used for run ids and reports. Defaults to unnamed-suite.

app: App key used by generated UI inventory. If omitted, WebTest AI derives an app key from baseUrl or suite name.

baseUrl: Base URL used by relative Open steps and UI login. Currently required for relative paths. This is a strong candidate to move to config/CLI.

tags: Tags inherited by every test in the suite.

defaults: Parsed and copied into each normalized test, but not heavily used by runtime yet.

auth: Suite-level auth policy. Test metadata can override it. This is a strong candidate to move to config auth profiles.

mcpProfile: Reserved integration profile. Defaults to default.

browsers: Browser list. Current runtime uses only the first suite browser.

Test Metadata

Metadata appears between a test heading and the first ## section.

Supported fields:

tags: Test tags. Suite tags and test tags are combined and deduplicated.

priority: Reporting metadata. Defaults to normal.

owner: Reporting metadata. Defaults to null.

auth: Per-test auth override merged over suite auth.

browsers: Parsed into normalized test objects, but current runtime does not launch per-test browsers.

modelMode: Parsed and reported, but actual model behavior is controlled by config.

retryPolicy: Parsed and enforced by runtime. A test-level value overrides defaults.retryPolicy, and both override execution.retries from config.

Sections

Supported section headings:

## Goal
## Before
## BeforeEach
## Steps
## Expected
## AfterEach
## After
## Data
## Notes

Steps, Expected, hooks, and Notes become lists. Goal becomes one text string. Data becomes key/value entries split on :.
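
For the Data section, a key/value split might look like the sketch below. It assumes the split happens at the first colon so that values such as URLs survive intact; parseDataLines is a hypothetical name and the real parser may behave differently.

```javascript
// Hypothetical helper: turn "key: value" lines into an object.
function parseDataLines(lines) {
  const data = {};
  for (const line of lines) {
    // Tolerate an optional list marker in front of the entry.
    const text = line.replace(/^[-*]\s+/, "").trim();
    const index = text.indexOf(":");
    if (index === -1) continue;
    data[text.slice(0, index).trim()] = text.slice(index + 1).trim();
  }
  return data;
}

console.log(parseDataLines(["- startUrl: https://app.example/login", "user: demo"]));
```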

Hook sections are supported for setup and cleanup:

## Before: runs before the test's main steps.
## BeforeEach: usually declared once at suite level as hooks.beforeEach in front matter; it can also appear as a test section when needed.
## AfterEach: usually declared once at suite level as hooks.afterEach in front matter; it can also appear as a test section when needed.
## After: runs after expected assertions.

Hooks use the same step language as ## Steps. Keep hooks short and visible; business behavior should stay in ## Steps and ## Expected.

## Steps and ## Expected may use numbered/bulleted lists or a one-column Markdown table with a Step header:

| Step |
| --- |
| Open "/" |
| Click role "button" named "Run" |
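
Reading the table form back into a step list could be sketched as follows. parseStepTable is a hypothetical name, and the handling of header and separator rows is an assumption about the parser.

```javascript
// Hypothetical helper: extract step strings from a one-column Markdown table.
function parseStepTable(markdown) {
  const cells = markdown
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.startsWith("|"))
    .map((row) => row.replace(/^\|/, "").replace(/\|$/, "").trim());

  const [header, separator, ...body] = cells;
  if (!header || header.toLowerCase() !== "step") {
    throw new Error('Step table must have a single "Step" header');
  }
  if (!/^:?-+:?$/.test(separator)) {
    throw new Error("Step table is missing its separator row");
  }
  return body.filter(Boolean);
}

const stepTable = [
  "| Step |",
  "| --- |",
  '| Open "/" |',
  '| Click role "button" named "Run" |',
].join("\n");

console.log(parseStepTable(stepTable));
```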

## Data may use key/value lines or a Markdown table. CSV files are not supported yet; if they are added, it should be as an explicit data-file expansion feature, not as a replacement for the Markdown contract.

Step Language API

The interpreter is rule-based. Unsupported text fails with Unsupported step.

Supported step forms:

Open "<url-or-path>"
Click action "<intent>"
Click element "<name>"
Click text "<text>"
Click role "<role>" named "<name>"
Click frame role "<role>" named "<name>"
Click frame "<frame-name-or-url-part>" role "<role>" named "<name>"
Fill field "<label>" with "<value>"
Fill field "<label>" with env "<ENV_VAR>"
Fill frame field "<label>" with "<value>"
Fill frame field "<label>" with env "<ENV_VAR>"
Submit form
Submit form with button "<name>"
Assert text "<text>"
Assert frame text "<text>"
Assert frame "<frame-name-or-url-part>" text "<text>"
Assert url "<url>"
Wait for url "<url>"
Wait for network "<url-part>"
Wait for network "<url-part>" status <status>
Capture screenshot
Capture screenshot as "<fileName>"
Capture visual checkpoint "<name>"
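
A rule-based interpreter for a few of these forms can be sketched as an ordered list of patterns. The rule table, the shape of the returned action objects, and the exact error text after "Unsupported step" are assumptions for illustration.

```javascript
// Hypothetical rule table covering a subset of the documented step forms.
const STEP_RULES = [
  { pattern: /^Open "([^"]+)"$/, build: ([target]) => ({ action: "open", target }) },
  {
    pattern: /^Click role "([^"]+)" named "([^"]+)"$/,
    build: ([role, name]) => ({ action: "clickRole", role, name }),
  },
  {
    pattern: /^Fill field "([^"]+)" with env "([^"]+)"$/,
    build: ([label, envVar]) => ({ action: "fill", label, envVar }),
  },
  {
    pattern: /^Fill field "([^"]+)" with "([^"]*)"$/,
    build: ([label, value]) => ({ action: "fill", label, value }),
  },
  {
    pattern: /^Wait for network "([^"]+)"(?: status (\d+))?$/,
    build: ([urlPart, status]) => ({
      action: "waitForNetwork",
      urlPart,
      status: status ? Number(status) : null,
    }),
  },
  { pattern: /^Submit form$/, build: () => ({ action: "submit" }) },
];

// First matching rule wins; anything else is rejected.
function interpretStep(text) {
  const step = text.trim();
  for (const { pattern, build } of STEP_RULES) {
    const match = pattern.exec(step);
    if (match) return build(match.slice(1));
  }
  throw new Error(`Unsupported step: ${step}`);
}

console.log(interpretStep('Wait for network "/api/orders" status 200'));
```

Because matching is exact, a step with free-form wording fails immediately instead of being guessed at.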

Current locator behavior:

Click role resolves through the driver accessibility-tree surface first when available, then falls back to Playwright getByRole(role, { name, exact: false }).
Click text resolves through the driver accessibility-tree surface first when available, then falls back to getByText(text, { exact: false }).first().
Fill field uses getByLabel(label, { exact: false }).
Click element resolves through the driver accessibility-tree surface first when available, then tries button role, link role, label, placeholder, then text.
Submit form calls requestSubmit() on the first form unless a button name is supplied.
Frame steps choose the first non-main frame unless a frame name or URL part is supplied.
Popup-producing clicks are followed by setting the active page when a popup is detected within 1500 ms.
Visual checkpoints capture a full-page screenshot, create a missing baseline, and compare later runs with exact image hash matching.

Visual Checkpoint Config

visual.baselineDir: Directory for visual checkpoint baselines. Defaults to visual-baselines.

visual.updateBaselines: When true, visual checkpoint actions overwrite existing baselines with the current screenshot.

visual.mode: fail or warn. Defaults to fail. In warn mode, changed visual checkpoints write diff metadata without failing the step.
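
The baseline lifecycle (create when missing, compare by exact hash, overwrite on update) can be sketched like this. SHA-256, the status strings, and compareCheckpoint itself are assumptions; only the config fields and the exact-match behavior come from this reference.

```javascript
const crypto = require("node:crypto");
const fs = require("node:fs");
const path = require("node:path");

function sha256(buffer) {
  return crypto.createHash("sha256").update(buffer).digest("hex");
}

// Hypothetical helper: compare a screenshot buffer against its stored baseline.
function compareCheckpoint(
  baselinePath,
  actualBuffer,
  { updateBaselines = false, mode = "fail" } = {}
) {
  if (updateBaselines || !fs.existsSync(baselinePath)) {
    // Missing baseline or explicit update: store the current screenshot.
    fs.mkdirSync(path.dirname(baselinePath), { recursive: true });
    fs.writeFileSync(baselinePath, actualBuffer);
    return { status: "baseline-written", changed: false };
  }
  const changed = sha256(fs.readFileSync(baselinePath)) !== sha256(actualBuffer);
  if (!changed) return { status: "matched", changed };
  // In warn mode a change is reported without failing the step.
  return { status: mode === "warn" ? "warned" : "failed", changed };
}
```

Exact hash matching means any pixel difference counts as a change, so dynamic content such as timestamps will trip the checkpoint.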

Visual checkpoint artifacts:

actual screenshot: artifacts/<runId>/<testId>/visual-<name>.png
diff metadata: artifacts/<runId>/<testId>/visual-<name>.diff.json
baseline: <visual.baselineDir>/<suite>/<testId>/<name>.png

Auth API

Current auth metadata fields:

mode: commonly ui, reuse, or api.
profile: storage-state profile name.
saveStorageState: when true, passing UI auth tests persist storage state.
usernameEnv: username environment variable.
passwordEnv: password environment variable.
mfaCodeEnv: MFA code environment variable.
otp: optional OTP provider config. Supported providers are totp-secret and static-env.
validationPath / validationUrl: reusable-session validation target.
validate: set false to skip reusable-session validation.
apiLoginPath / apiLoginUrl: API endpoint used by mode: api.
method: API login method, defaulting to POST.
bodyFormat / contentType: json or form, defaulting to json.
usernameField: API login username field, defaulting to username.
passwordField: API login password field, defaulting to password.
mfaCodeField: API login MFA field, defaulting to mfaCode.
extraFields / body: extra API login request fields.
headers: extra API login request headers.
successStatus / successStatuses: accepted API login response status codes.
requireCookie: set false to allow API login responses without Set-Cookie.
loginMode: same-host or idp.
provider: idp enables IDP login behavior.
loginPath: login page path.
successPath: URL waited for after UI login.
successText: text waited for after UI login.
usernameLabel: username field label.
passwordLabel: password field label.
submitName: login submit button name.
mfaLabel: MFA input label.
mfaSubmitName: MFA submit button name.
mfaTimeout: how long to look for the MFA field.
idpStartName: IDP start link name, or false to skip that click.

OTP examples:

auth:
  mode: reuse
  profile: ci-user
  usernameEnv: APP_USERNAME
  passwordEnv: APP_PASSWORD
  otp:
    provider: totp-secret
    secretEnv: APP_TOTP_SECRET
    digits: 6
    period: 30

auth:
  otp:
    provider: static-env
    codeEnv: APP_MFA_CODE

mfaCodeEnv remains supported for compatibility and behaves like static-env.
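
For orientation, the totp-secret provider presumably follows the standard TOTP scheme (RFC 6238). The sketch below shows that algorithm with Node's crypto module; it assumes a base32-encoded secret and HMAC-SHA-1, neither of which this reference confirms, and generateTotp is a hypothetical name.

```javascript
const crypto = require("node:crypto");

const BASE32_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

// Decode an RFC 4648 base32 string (padding and whitespace ignored).
function decodeBase32(input) {
  const clean = input.toUpperCase().replace(/[\s=]/g, "");
  const bytes = [];
  let bits = 0;
  let value = 0;
  for (const char of clean) {
    const index = BASE32_ALPHABET.indexOf(char);
    if (index === -1) throw new Error(`Invalid base32 character: ${char}`);
    value = (value << 5) | index;
    bits += 5;
    if (bits >= 8) {
      bits -= 8;
      bytes.push((value >>> bits) & 0xff);
      value &= (1 << bits) - 1; // keep only the bits not yet emitted
    }
  }
  return Buffer.from(bytes);
}

// Standard TOTP: HMAC over the time-step counter, then dynamic truncation.
function generateTotp(
  secretBase32,
  { digits = 6, period = 30, timestampMs = Date.now() } = {}
) {
  const counter = Math.floor(timestampMs / 1000 / period);
  const counterBuffer = Buffer.alloc(8);
  counterBuffer.writeBigUInt64BE(BigInt(counter));

  const hmac = crypto
    .createHmac("sha1", decodeBase32(secretBase32))
    .update(counterBuffer)
    .digest();
  const offset = hmac[hmac.length - 1] & 0x0f;
  const binary = hmac.readUInt32BE(offset) & 0x7fffffff;
  return String(binary % 10 ** digits).padStart(digits, "0");
}

// Usage sketch: APP_TOTP_SECRET matches secretEnv in the example above.
if (process.env.APP_TOTP_SECRET) {
  console.log(generateTotp(process.env.APP_TOTP_SECRET, { digits: 6, period: 30 }));
}
```

The digits and period options correspond to the fields of the same name in the otp block.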

Current storage state path:

playwright/.auth/<profile>.json

Recommended direction:

Keep auth flow definitions in config.
Let specs reference a logical profile or auth requirement.
Use CLI for one-off overrides such as --auth-profile and --refresh-auth.

API login creates the same browser-ready storage state shape as UI login. The default endpoint is /api/login, and cookies returned by Set-Cookie are written to playwright/.auth/<profile>.json. Drivers must advertise authState to consume that state during execution.

Reusable session validation opens the saved storage state in a browser context and treats an OK response from validationPath or validationUrl as a valid session. The endpoint must return a non-OK response for anonymous users; otherwise WebTest AI cannot distinguish a real session from public access.

Report And Artifact API

Per run:

artifacts/reports/<runId>.json
artifacts/reports/<suite>-latest.json
artifacts/reports/<runId>.html
artifacts/reports/<suite>-latest.html

Per test:

artifacts/<runId>/<testId>/final.png
artifacts/<runId>/<testId>/trace.zip
artifacts/<runId>/<testId>/trace/
optional checkpoint screenshots
optional journey screenshots
optional healing snapshots
optional UI inventory proposal and patch files
optional reviewed spec patch file

Result status values:

passed
failed
healed-passed
skipped
blocked

Important report fields:

runId
suite
suitePath
generatedAt
browser
driver
driverCapabilities
workerCount
scheduling
configPath
filters
results

Important per-test result fields:

testId
name
status
browser
browserEngine
browserChannel
browserVersion
driverName
driverCapabilities
environment
tags
priority
owner
workerId
serial
authMode
authProfile
authState
modelMode
healing
journey
durationMs
assertions
stepResults
firstFailedStep
networkSummary
consoleErrors
artifacts
summary

The HTML report renders suite-level driver metadata, capability tags, per-test owner/priority, journey snapshots, artifact availability, healing traces, and skip/block reasons. This makes adapter support and degraded execution visible without opening raw JSON.

Manual Feature Validation Matrix

Use these checks when validating a local platform pass:

Recorder capture: run node ./src/cli/index.js record --url http://127.0.0.1:4010 --name "Manual recorded smoke" --tags recorded,smoke --output specs/manual-recorded-smoke.md --force, interact with the headed browser that opens, then confirm that the Markdown includes clicks, inputs, and page opens and that no .playwright.js file is created.
Optional starter: rerun the recorder with --emit-starter or --starter specs/manual-recorded-smoke.playwright.js and confirm that the Starter Test: line is printed only when one of those flags is passed.
Dashboard journey: run a suite with --journey navigation or --profile debug, open artifacts/reports/<suite>-latest.html, and confirm User Journey, owner, priority, and artifact policy are visible.
Fast profile: run node ./src/cli/index.js run --suite <suite.md> --profile fast; confirm traces are off, quality is skipped, and screenshots are retained only on failure.
Auth reuse validation: use auth.mode: reuse with validationPath: /api/session; confirm a saved valid state is reused and a stale state falls back to login or fails clearly.
Hooks and tables: parse/run a spec with ## Before, ## Expected, and table-form ## Steps; confirm hook phases appear in stepResults.
Healing review: run node ./src/cli/index.js heal approve --proposal examples/healing-review-proposal.json --dry-run --memory /tmp/webtest-ai-ui-inventory.json; confirm it prints Would Apply: 1 and does not write inventory.
Existing CDP endpoint: set WEBTEST_AI_CDP_WS_ENDPOINT=ws://... and use examples/drivers/existing-cdp-driver.config.json. This check requires a browser that already exposes a CDP WebSocket endpoint; if no endpoint exists, this feature cannot drive that browser.

Programmatic API

Package entrypoint:

const {
  parseSuite,
  runSuite,
  runTest,
  createBrowserDriver,
  createFakeDriver,
  recordFlow,
  generateSpec,
  buildIntegrationExport,
  exportResults,
  writeIntegrationExport,
  createMcpServer,
  publishResults
} = require("@asserthive/webtest-ai");

`parseSuite(filePath)`

Reads a Markdown suite and returns a normalized suite object.

const suite = await parseSuite("specs/webtest-ai-demo.md");

Returned shape:

{
  filePath,
  suite,
  app,
  baseUrl,
  tags,
  defaults,
  auth,
  mcpProfile,
  browsers,
  tests
}

`runSuite(parsedSuite, options)`

Runs a parsed suite through the configured browser driver and writes JSON/HTML reports. Playwright is the default driver.

Common options:

{
  headless: true,
  debug: false,
  workers: 1,
  includeTags: [],
  excludeTags: [],
  refreshAuth: false,
  authProfile: null,
  suitePath: "/absolute/path/to/spec.md",
  config: {},
  configPath: "/absolute/path/to/webtest-ai.config.json"
}

Returns:

{
  runId,
  suite,
  workerCount,
  configPath,
  reportPath,
  htmlReportPath,
  results
}
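A minimal sketch of chaining parseSuite() and runSuite() and deriving a process exit code from the returned results. The wrapper name runSuiteFile is hypothetical, the API object is passed in so the wrapper can be exercised with stubs, and treating blocked results as failures is an assumption rather than documented behavior.

```javascript
const path = require("node:path");

// Hypothetical wrapper around the documented parseSuite/runSuite functions.
// `api` is the package entrypoint, e.g. require("@asserthive/webtest-ai").
async function runSuiteFile(api, suitePath, overrides = {}) {
  const absolutePath = path.resolve(suitePath);
  const suite = await api.parseSuite(absolutePath);
  const run = await api.runSuite(suite, {
    headless: true,
    workers: 1,
    suitePath: absolutePath,
    ...overrides // caller-supplied options win over the defaults above
  });
  // Assumption: both failed and blocked results should fail the build.
  const failedIds = run.results
    .filter((result) => result.status === "failed" || result.status === "blocked")
    .map((result) => result.testId);
  return {
    runId: run.runId,
    reportPath: run.reportPath,
    htmlReportPath: run.htmlReportPath,
    failedIds,
    exitCode: failedIds.length > 0 ? 1 : 0
  };
}
```

A caller would typically pass the real package and set process.exitCode from the result, for example runSuiteFile(require("@asserthive/webtest-ai"), "specs/webtest-ai-demo.md").then((out) => { process.exitCode = out.exitCode; }).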

`runTest(test, options)`

Runs one normalized test. This is lower-level than runSuite() and expects either a driver object or a launched Playwright browser plus runtime options.

Important options:

driver
browser
browserName
browserEngine
browserChannel
browserVersion
runId
workerId
sessionCache
suite
includeTags
excludeTags
refreshAuth
authProfile
debug
workerCount
suitePath
config

Driver helpers:

createBrowserDriver(options): resolves and launches the configured driver.
createFakeDriver(options): creates an in-memory driver for tests and adapter-contract checks.
tests/support/driverContract.js: reusable local contract helper for checking that a driver can execute the core open/fill/click/assert/screenshot flow.

Integration Export APIs

buildIntegrationExport(report, options): Converts a WebTest AI JSON report object into a compact vendor-neutral integration payload.

writeIntegrationExport(exportPayload, outputPath, options): Writes the integration payload as JSON or NDJSON.

exportResults(reportOrPath, options): Returns the vendor-neutral integration payload and optionally writes it.

publishResults(reportOrPath, options): Compatibility alias for exportResults(). It does not call external services.
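To illustrate the difference between the JSON and NDJSON output formats mentioned above, the sketch below serializes a payload both ways. The function name serializeExport and the payload shape (header fields plus a results array) are assumptions for illustration; the actual integration payload may be structured differently.

```javascript
// Hypothetical serializer; the payload shape is assumed, not the documented export format.
function serializeExport(payload, format = "json") {
  if (format === "ndjson") {
    const { results = [], ...header } = payload;
    // NDJSON: one complete JSON document per line, each terminated by a newline.
    return [header, ...results].map((record) => JSON.stringify(record)).join("\n") + "\n";
  }
  // Plain JSON: a single pretty-printed document.
  return JSON.stringify(payload, null, 2) + "\n";
}
```

NDJSON suits log pipelines and streaming ingestion because each line can be parsed independently, while a single JSON document is easier to inspect or attach as one artifact.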

createMcpServer(options): Creates the WebTest AI MCP server object for stdio hosts or direct tests.

Boundary Guidance

Keep in Markdown specs:

user-visible behavior
business flows
assertions
scenario-specific data
owner/priority/tags used for reporting and selection

Move or keep in config:

base URLs and environment names
auth profile details and secret env var names
default browser/project
execution driver and required capabilities
timeouts
artifact/report directories
trace and screenshot retention
reporting privacy
healing, memory, and model policy
scheduling policy defaults

Use CLI for:

choosing a suite
selecting tags
changing config path
changing worker count
headed/debug runs
refreshing auth
one-off profile, environment, and browser overrides, as those flags become available