5-Minute Quickstart

Choose a Blackbox quickstart track: add effect coverage to an existing system test, or learn the first proof loop from the showcase.

Choose Your Starting Point

Blackbox has two fast starts. Use ?track=existing when you already have a system or E2E test. Use ?track=new when you want to learn from the showcase before wiring your own service.

Both tracks start with runtime behavior. An effect is observable boundary behavior: a database write, queue message, HTTP call, cache change, emitted event, email intent, or forbidden dependency call.
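
To make the idea of an effect concrete, here is a minimal sketch of how one observed effect could be modeled. The `Effect` interface and the `describeEffect` helper are hypothetical names for illustration, not a confirmed Blackbox type. The field names `boundary`, `op`, and `key` are borrowed from the catalog entries shown later in this guide.

```typescript
// Hypothetical shape for one observed effect. The field names mirror the
// catalog entries in this guide; this is not a confirmed Blackbox type.
interface Effect {
  boundary: string; // where the effect crossed, e.g. "postgres", "http", "sqs"
  op: string;       // what happened there, e.g. "INSERT", "POST", "SendMessage"
  key?: string;     // optional target, e.g. a table name or a route
}

// Example effects a checkout flow might produce at its boundaries.
const observed: Effect[] = [
  { boundary: "postgres", op: "INSERT", key: "orders" },
  { boundary: "http", op: "POST", key: "/payments" },
  { boundary: "sqs", op: "SendMessage" },
];

// Render an effect as a single readable line, omitting the key when absent.
function describeEffect(e: Effect): string {
  return e.key ? `${e.boundary} ${e.op} ${e.key}` : `${e.boundary} ${e.op}`;
}

for (const e of observed) {
  console.log(describeEffect(e));
}
```

Each record describes something that crossed a system boundary, which is what distinguishes an effect from the request and response a test already asserts on.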

By the end, you should have one flow run with Blackbox, one effect catalog, one reviewed required or forbidden effect, and one effect coverage signal in the terminal.

Before you start

Install Blackbox first. Then choose the path that matches your repo: an existing system or E2E test, or the Blackbox showcase.

First success

One run records effects, one catalog entry is reviewed, and the next run reports a satisfied effect coverage entry.

Track 1: Add Effect Coverage To An Existing System Test

Use this path when you already have a system or E2E test that exercises a valuable flow. The first win is not a new suite. It is evidence for behavior your existing test already drives.

[Figure: Track 1, add Blackbox to tests you already run. Steps: 1. existing test (input/output); 2. add wrapper (observe effects); 3. review catalog (requires/forbids); 4. rerun (coverage signal); 5. advanced, after the first loop works: inspect OMC/DC or generate behavior specs. The first value is one reviewed catalog entry and one coverage signal.]
Existing tests already create useful traffic. Blackbox turns that traffic into observed effects, a reviewed effect catalog, and effect coverage on the next run.

1. Add the Blackbox wrapper and matcher

Keep the request and response assertions you already trust. Add the Blackbox system boundary and `toMatchCatalog()` at the end of one valuable flow.

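import { expect, test } from './testbed.js';

// 'checkout-flow' is the catalog entry name this flow is checked against.
test.system('checkout-flow', 'customer completes checkout', () => {
  test('existing checkout test', async ({ capture, request, system }) => {
    const response = await request.post(`${system.app.hostBaseUrl}/checkout`, {
      data: { cartId: 'cart-123', paymentMethodId: 'pm_card_visa' },
    });

    // Keep the request and response assertions you already trust.
    expect(response.status()).toBe(201);

    // Check the effects captured during this flow against the effect catalog.
    await expect(capture).toMatchCatalog();
  });
});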

2. Run your normal system-test command

Blackbox does not need to replace your runner. Use the command your project already uses; the Playwright command below is only an example.

<your-system-test-command>

For example, with Playwright:

npx playwright test --config ./e2e/playwright.config.ts

On the first run, no catalog entry exists yet, so you should see a baseline-pending message.

toMatchCatalog: no catalog entry for "checkout-flow"; baseline pending.
Will be written at fixture teardown.

3. Review the effect catalog

The first useful artifact is the generated catalog. It starts as observed behavior, not a trusted contract. Review it before committing it.

specVersion: "0.1"
checkout-flow:
  requires:
    - { boundary: postgres, op: INSERT, key: orders }
    - { boundary: http, op: POST, key: /payments }
    - { boundary: sqs, op: SendMessage }

4. Tighten the catalog to one meaningful contract

`toMatchCatalog()` is already the matcher. Your job here is to edit the catalog: keep one required effect that proves useful behavior and add one forbid for behavior that must not happen.

specVersion: "0.1"
checkout-flow:
  requires:
    - { boundary: postgres, op: INSERT, key: orders }
  forbids:
    - { boundary: http, op: POST, key: /refunds }
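
The following is a rough sketch of what a required/forbidden check means, not Blackbox's actual matcher. The names `EffectPattern`, `CatalogEntry`, `matches`, and `evaluate` are hypothetical. It also assumes two matching rules the guide does not state: a pattern without a `key` matches any key, and observed effects not listed in the catalog are tolerated.

```typescript
// Hypothetical types; field names follow the catalog entry above.
interface EffectPattern {
  boundary: string;
  op: string;
  key?: string; // assumption: when omitted, the pattern matches any key
}

interface CatalogEntry {
  requires: EffectPattern[];
  forbids: EffectPattern[];
}

// A pattern matches an effect when boundary and op are equal and,
// if the pattern names a key, the keys are equal too.
function matches(pattern: EffectPattern, effect: EffectPattern): boolean {
  return (
    pattern.boundary === effect.boundary &&
    pattern.op === effect.op &&
    (pattern.key === undefined || pattern.key === effect.key)
  );
}

// An entry is satisfied when every required pattern was observed
// and no forbidden pattern was observed.
function evaluate(entry: CatalogEntry, observed: EffectPattern[]) {
  const missing = entry.requires.filter((p) => !observed.some((e) => matches(p, e)));
  const violations = entry.forbids.filter((p) => observed.some((e) => matches(p, e)));
  return { satisfied: missing.length === 0 && violations.length === 0, missing, violations };
}

const checkoutFlow: CatalogEntry = {
  requires: [{ boundary: "postgres", op: "INSERT", key: "orders" }],
  forbids: [{ boundary: "http", op: "POST", key: "/refunds" }],
};

const observedRun: EffectPattern[] = [
  { boundary: "postgres", op: "INSERT", key: "orders" },
  { boundary: "http", op: "POST", key: "/payments" },
  { boundary: "sqs", op: "SendMessage" },
];

const result = evaluate(checkoutFlow, observedRun);
console.log(result.satisfied, result.missing, result.violations);
```

Under these assumptions, the tightened entry is satisfied by the example run: the order insert was observed and no refund call was made. The payment call and queue message are simply not part of the contract.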

5. Rerun and read effect coverage

On the next run, the terminal should show whether the reviewed catalog entry was satisfied, failed, or uncovered.

effect coverage

metric           value
catalog entries  1
satisfied        1 (100%)
failed           0
uncovered        0

entry            state      runs  why
✓ checkout-flow  satisfied  1/1

written: <coverage-dir>/coverage.json
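
As a sketch of how the summary numbers relate to per-entry states, the snippet below tallies entries by state and derives the satisfied percentage. It does not read `coverage.json`, whose format is not shown in this guide. The `EntryState` type and `summarize` function are hypothetical names for illustration.

```typescript
// The three entry states named in this guide.
type EntryState = "satisfied" | "failed" | "uncovered";

// Tally entries by state and compute the satisfied percentage.
// The input is a hypothetical map from catalog entry name to its state.
function summarize(states: Record<string, EntryState>) {
  const names = Object.keys(states);
  const count = (s: EntryState) => names.filter((n) => states[n] === s).length;
  const total = names.length;
  const satisfied = count("satisfied");
  return {
    catalogEntries: total,
    satisfied,
    // Guard against division by zero when the catalog is empty.
    satisfiedPct: total === 0 ? 0 : Math.round((satisfied / total) * 100),
    failed: count("failed"),
    uncovered: count("uncovered"),
  };
}

const summary = summarize({ "checkout-flow": "satisfied" });
console.log(`catalog entries ${summary.catalogEntries}`);
console.log(`satisfied ${summary.satisfied} (${summary.satisfiedPct}%)`);
console.log(`failed ${summary.failed}`);
console.log(`uncovered ${summary.uncovered}`);
```

With a single satisfied entry, this reproduces the figures in the terminal summary: one catalog entry, one satisfied (100%), none failed or uncovered.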

That is the first quickstart win: the same flow now proves something about the behavior inside the system, not only its response.

After Your First Successful Run

After the first effect loop works, choose the next layer deliberately: