5-Minute Quickstart
Choose a Blackbox quickstart track: add effect coverage to an existing system test, or learn the first proof loop from the showcase.
Choose Your Starting Point
Blackbox has two fast starts. Use ?track=existing when you already have a system or E2E test. Use ?track=new when you want to learn from the showcase before wiring your own service.
Both tracks start with runtime behavior. An effect is observable boundary behavior: a database write, queue message, HTTP call, cache change, emitted event, email intent, or forbidden dependency call.
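To make the definition concrete, here is a minimal sketch of how a recorded effect could be modeled. The `Effect` interface and `describeEffect` helper are hypothetical names used for illustration; the field names simply mirror the `boundary`, `op`, and `key` fields that appear in the catalog entries in this guide, and are not a confirmed Blackbox type.

```typescript
// Hypothetical model of a recorded effect. The fields mirror the catalog
// entries shown in this guide; this is not a confirmed Blackbox type.
interface Effect {
  boundary: string; // e.g. "postgres", "http", "sqs", "redis"
  op: string; // e.g. "INSERT", "POST", "SendMessage"
  key?: string; // e.g. a table name or route; optional, as in the catalog
}

// Render an effect as one readable line for logs or review notes.
function describeEffect(effect: Effect): string {
  const target = effect.key === undefined ? "" : ` ${effect.key}`;
  return `${effect.boundary}: ${effect.op}${target}`;
}

const observed: Effect[] = [
  { boundary: "postgres", op: "INSERT", key: "orders" },
  { boundary: "http", op: "POST", key: "/payments" },
  { boundary: "sqs", op: "SendMessage" },
];

for (const effect of observed) {
  console.log(describeEffect(effect));
}
```

Thinking of each effect as a small record like this makes the later catalog review easier: every line in the catalog is a claim about one boundary, one operation, and optionally one target.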
By the end, you should have one flow run with Blackbox, one effect catalog, one reviewed required or forbidden effect, and one effect coverage signal in the terminal.
Install Blackbox first. Then choose the path that matches your repo: an existing system or E2E test, or the Blackbox showcase.
One run records effects, one catalog entry is reviewed, and the next run reports a satisfied effect coverage entry.
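The record, review, rerun loop described above can be sketched in a few lines. This is an illustration under stated assumptions, not Blackbox's implementation: `matchOrSeed`, `CatalogEntry`, and the in-memory `Map` catalog are hypothetical, and the matching rule (equal boundary and op, with an omitted key matching any key) is an assumed simplification.

```typescript
// Hypothetical shapes for illustration; not the Blackbox API.
interface Effect {
  boundary: string;
  op: string;
  key?: string;
}

interface CatalogEntry {
  requires: Effect[];
  forbids: Effect[];
}

type RunState = "baseline-pending" | "satisfied" | "failed";

// Assumed rule: a pattern matches an observed effect when boundary and op are
// equal and the pattern either omits key or names the same key.
function matches(pattern: Effect, effect: Effect): boolean {
  return (
    pattern.boundary === effect.boundary &&
    pattern.op === effect.op &&
    (pattern.key === undefined || pattern.key === effect.key)
  );
}

// First run: seed an unreviewed baseline from what was observed.
// Later runs: enforce the entry that is now in the catalog.
function matchOrSeed(
  catalog: Map<string, CatalogEntry>,
  flowId: string,
  observed: Effect[],
): RunState {
  const entry = catalog.get(flowId);
  if (entry === undefined) {
    catalog.set(flowId, { requires: [...observed], forbids: [] });
    return "baseline-pending";
  }
  const requiredSeen = entry.requires.every((p) => observed.some((e) => matches(p, e)));
  const forbiddenSeen = entry.forbids.some((p) => observed.some((e) => matches(p, e)));
  return requiredSeen && !forbiddenSeen ? "satisfied" : "failed";
}

const catalog = new Map<string, CatalogEntry>();
const run: Effect[] = [
  { boundary: "postgres", op: "INSERT", key: "orders" },
  { boundary: "http", op: "POST", key: "/payments" },
];

// Run 1 records effects and seeds the baseline.
const firstRun = matchOrSeed(catalog, "checkout-flow", run);

// Review: keep one required effect and add one forbidden effect.
catalog.set("checkout-flow", {
  requires: [{ boundary: "postgres", op: "INSERT", key: "orders" }],
  forbids: [{ boundary: "http", op: "POST", key: "/refunds" }],
});

// Run 2 enforces the reviewed entry.
const secondRun = matchOrSeed(catalog, "checkout-flow", run);
console.log(firstRun, secondRun);
```

The point of the sketch is the ordering: the baseline written on the first run is only observed behavior, and it becomes a contract once a person has edited it.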
Track 1: Add Effect Coverage To An Existing System Test
Use this path when you already have a system or E2E test that exercises a valuable flow. The first win is not a new suite. It is evidence for behavior your existing test already drives.
1. Add the Blackbox wrapper and matcher
Keep the request and response assertions you already trust. Add the Blackbox system boundary and `toMatchCatalog()` at the end of one valuable flow.
import { expect, test } from './testbed.js';

test.system('checkout-flow', 'customer completes checkout', () => {
  test('existing checkout test', async ({ capture, request, system }) => {
    const response = await request.post(`${system.app.hostBaseUrl}/checkout`, {
      data: { cartId: 'cart-123', paymentMethodId: 'pm_card_visa' },
    });

    expect(response.status()).toBe(201);

    await expect(capture).toMatchCatalog();
  });
});

2. Run your normal system-test command
Blackbox does not need to replace your runner. Use the command your project already uses; the Playwright command below is only an example.

<your-system-test-command>
npx playwright test --config ./e2e/playwright.config.ts

You should see a baseline-pending message the first time the catalog entry does not exist.

toMatchCatalog: no catalog entry for "checkout-flow"; baseline pending.
Will be written at fixture teardown.

3. Review the effect catalog
The first useful artifact is the generated catalog. It starts as observed behavior, not a trusted contract. Review it before committing it.
specVersion: "0.1"

checkout-flow:
  requires:
    - { boundary: postgres, op: INSERT, key: orders }
    - { boundary: http, op: POST, key: /payments }
    - { boundary: sqs, op: SendMessage }

4. Tighten the catalog to one meaningful contract
`toMatchCatalog()` is already the matcher, so the test code does not change in this step. Your job here is to edit the catalog: keep one required effect that proves useful behavior and add one `forbids` entry for behavior that must not happen.
specVersion: "0.1"

checkout-flow:
  requires:
    - { boundary: postgres, op: INSERT, key: orders }
  forbids:
    - { boundary: http, op: POST, key: /refunds }

5. Rerun and read effect coverage
On the next run, the terminal should show whether the reviewed catalog entry was satisfied, failed, or uncovered.
effect coverage
metric           value
catalog entries  1
satisfied        1 (100%)
failed           0
uncovered        0

entry            state      runs  why
✓ checkout-flow  satisfied  1/1

written: <coverage-dir>/coverage.json

That is the first quickstart win: the same flow now proves something about the behavior inside the system, not only its response.
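To show how the numbers in a coverage table like the one above relate to each other, here is a small sketch that derives the summary from per-entry run outcomes. Everything in it is an assumption made for illustration: the `summarize` and `entryState` helpers, the `CoverageSummary` shape, and the rules that an entry with no runs is "uncovered" and that any failing run marks the entry "failed". The real `coverage.json` format is not shown in this guide.

```typescript
// Hypothetical result shapes; the real coverage.json format is not shown here.
type EntryState = "satisfied" | "failed" | "uncovered";

interface CoverageSummary {
  entries: number;
  satisfied: number;
  failed: number;
  uncovered: number;
  satisfiedPercent: number;
}

// Assumptions: an entry with no runs is "uncovered", and any failing run
// marks the entry "failed".
function entryState(outcomes: boolean[] | undefined): EntryState {
  if (outcomes === undefined || outcomes.length === 0) return "uncovered";
  return outcomes.every((passed) => passed) ? "satisfied" : "failed";
}

// runs maps a catalog entry name to the pass/fail outcome of each run that
// exercised it.
function summarize(entryNames: string[], runs: Map<string, boolean[]>): CoverageSummary {
  const counts = { satisfied: 0, failed: 0, uncovered: 0 };
  for (const name of entryNames) {
    counts[entryState(runs.get(name))] += 1;
  }
  const entries = entryNames.length;
  const satisfiedPercent =
    entries === 0 ? 0 : Math.round((counts.satisfied / entries) * 100);
  return { entries, ...counts, satisfiedPercent };
}

const summary = summarize(
  ["checkout-flow"],
  new Map<string, boolean[]>([["checkout-flow", [true]]]),
);
console.log(summary);
```

Under these assumptions, a catalog entry that no test exercises lowers the satisfied percentage without counting as a failure, which is why "uncovered" is worth reading separately from "failed".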
Track 2: Showcase First
Use this path when you do not yet have a useful system test. First learn the proof loop in the Blackbox showcase, then write one narrow flow in your own system.
1. Run the showcase
From the Blackbox showcase repo root, run the system-test script. The subscription flow is intentionally small but effect-rich: Redis, Postgres, HTTP, and SQS all participate.
pnpm test:system

2. Inspect the proof trail
A successful showcase run should produce a catalog and coverage artifacts. Start with these three; spans and V8 payloads are diagnostics for later.
e2e/tests/__effects__/00-baseline-subscribe.system.effects.yaml
e2e/.blackbox-coverage/coverage.json
e2e/.blackbox-coverage/omcdc-propagation.md

3. Copy the smallest test shape
The important shape is `test.system(...)` plus `toMatchCatalog()`. The matcher seeds a baseline when no catalog entry exists, then enforces the reviewed catalog on later runs.
import { expect, test } from './testbed.js';

test.system('subscribe-flow', 'subscribing a user to the pro tier', () => {
  test('alice subscribes', async ({ capture, request, system }) => {
    const response = await request.post(`${system.bff.hostBaseUrl}/subscriptions`, {
      data: { userId: 'alice', paymentMethodId: 'pm_card_visa' },
    });

    expect(response.status()).toBe(201);

    await expect(capture).toMatchCatalog();
  });
});

4. Write one narrow flow in your system
Pick a flow where input/output success is not enough: checkout, signup, webhook handling, account deletion, refund prevention, or another side-effect-heavy path.
The goal is not a broad journey. The goal is one controlled system flow whose effects matter.
5. Review the catalog and read coverage
The first custom catalog begins as observed behavior. Keep the effects that define correctness, add forbids for dangerous behavior, then rerun to see effect coverage.
specVersion: "0.1"

subscribe-flow:
  requires:
    - { boundary: redis, op: GET, key: "user:*:tier" }
    - { boundary: postgres, op: INSERT, key: subscriptions }
    - { boundary: sqs, op: SendMessage }
  forbids:
    - { boundary: postgres, op: DELETE }
    - { boundary: http, op: POST, key: /v1/refunds }

effect coverage
metric            value
catalog entries   1
satisfied         1 (100%)
failed            0
uncovered         0

entry             state      runs  why
✓ subscribe-flow  satisfied  1/1

written: <coverage-dir>/coverage.json

After Your First Successful Run
After the first effect loop works, choose the next layer deliberately:
- Continue to Feature Files From Tests if you want feature files, Gherkin output, or feature-file drift checks.
- Read System Effects to understand catalogs, matchers, effect coverage, and OMC/DC.
- Use Troubleshooting First Run if the catalog, runner, or coverage output is missing.