Run with the Testbed

Run a controlled Playwright and OpenTelemetry testbed so Blackbox can collect system-test evidence.

Abstract

The Blackbox testbed is the controlled path for running a real system test and collecting runtime evidence. It stands up the system topology, runs the workflow through Playwright or a compatible runner, and routes OpenTelemetry evidence into Blackbox artifacts.

Use this page when you want a repeatable first topology before adapting Blackbox to your own Docker Compose or Testcontainers setup.

Audience

Developers wiring Blackbox into a local system-test environment, CI job, or sample project.

What The Testbed Proves

The testbed is not only a demo harness. It teaches the shape Blackbox expects from a useful system test:

A running service or service topology.
Managed dependencies such as databases, queues, caches, local services, or containers.
Runtime instrumentation that emits OpenTelemetry spans during the run.
A scenario that exercises a real boundary.
Generated artifacts that can be reviewed after the run.

That is why the testbed is the best first step before trying to wire Blackbox into a larger system.

Playwright And OpenTelemetry

Playwright is the runner that drives the workflow. OpenTelemetry is the evidence channel that shows what happened across boundaries. Blackbox uses the run and the spans together:

Playwright performs the action.
The application emits spans and other observable facts.
Blackbox maps those facts into effects.
Reports and feature artifacts make the result reviewable.

This is the practical meaning of a Playwright OpenTelemetry workflow in Blackbox: the test does more than pass or fail. It leaves behind evidence about the system behavior it caused.
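As a rough illustration of "facts into effects", the sketch below classifies a captured span by the boundary it crossed. The SpanRecord and Effect shapes and the spanToEffect function are hypothetical and are not Blackbox's actual data model. The attribute keys follow OpenTelemetry semantic-convention naming, and which keys appear depends on the instrumentation version in use.

```typescript
// Hypothetical shapes for illustration only; Blackbox's real span and
// effect models are not described on this page.
interface SpanRecord {
  name: string;
  attributes: Record<string, string | number | boolean>;
}

interface Effect {
  boundary: "database" | "messaging" | "http" | "unknown";
  description: string;
}

// Classify a span by the first boundary-specific attribute it carries.
function spanToEffect(span: SpanRecord): Effect {
  const attrs = span.attributes;
  if ("db.system" in attrs) {
    return { boundary: "database", description: `${attrs["db.system"]}: ${span.name}` };
  }
  if ("messaging.system" in attrs) {
    return { boundary: "messaging", description: `${attrs["messaging.system"]}: ${span.name}` };
  }
  if ("http.request.method" in attrs) {
    return { boundary: "http", description: `${attrs["http.request.method"]} ${span.name}` };
  }
  // Spans without a recognized boundary attribute are kept, not dropped.
  return { boundary: "unknown", description: span.name };
}

function collectEffects(spans: SpanRecord[]): Effect[] {
  return spans.map(spanToEffect);
}
```

The point of the sketch is the direction of the data flow: the scenario causes spans, and the spans are reduced to a smaller set of reviewable statements about what the system did.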

What Happens During A Run

The important user-facing point is that Blackbox does not ask you to rewrite your service images for observability. It runs your production-shaped containers with a generated test overlay.

1. You run the Blackbox system-test command, usually through the project script that invokes Playwright.
2. Your system-test setup builds or selects the OpenTelemetry bootstrap image.
3. Playwright global setup reads the testbed config and your Docker Compose topology.
4. Blackbox generates temporary compose overlays for instrumentation, debug ports, and worker isolation.
5. An init container copies /blackbox-otel from the bootstrap image into a Docker volume.
6. Each system-under-test (SUT) container mounts that volume and shadows its Node binary with bin/node-wrap.
7. The SUT still starts with its normal command, such as node src/server.js.
8. node-wrap prepends --require=/blackbox-otel/bootstrap.cjs to NODE_OPTIONS and then execs the real Node binary from the bootstrap volume.
9. The bootstrap loads before app code and registers OpenTelemetry instrumentations.
10. The test drives a workflow, spans are captured, and Blackbox writes reports after the run.

The user-owned compose file is not modified on disk. The overlays are generated for the test run and then treated as testbed plumbing.
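To make the overlay idea concrete, here is a minimal sketch of how an instrumentation overlay could be assembled in memory and later passed to Compose as an additional file (for example with a second -f argument), leaving the original compose file untouched. Everything named here is an assumption for illustration: the buildOtelOverlay function, the volume and init-service names, and the cp command are not taken from Blackbox. The step that places bin/node-wrap over the service's node binary is deliberately left out because this page does not describe how that mount is expressed.

```typescript
// Hypothetical overlay builder; names and commands are illustrative only.
interface ServiceOverlay {
  image?: string;
  command?: string[];
  volumes: string[];
  depends_on?: Record<string, { condition: string }>;
}

interface ComposeOverlay {
  services: Record<string, ServiceOverlay>;
  volumes: Record<string, Record<string, never>>;
}

function buildOtelOverlay(sutServices: string[], bootstrapImage: string): ComposeOverlay {
  const volumeName = "blackbox-otel";     // assumed volume name
  const initName = "blackbox-otel-init";  // assumed init service name

  const services: Record<string, ServiceOverlay> = {
    // Init container: copy the bootstrap files from the image into the shared volume.
    [initName]: {
      image: bootstrapImage,
      command: ["sh", "-c", "cp -a /blackbox-otel/. /target/"],
      volumes: [`${volumeName}:/target`],
    },
  };

  for (const name of sutServices) {
    // Each SUT service only gains a read-only mount and a start-order dependency;
    // its image, command, and environment are left as the user declared them.
    services[name] = {
      volumes: [`${volumeName}:/blackbox-otel:ro`],
      depends_on: { [initName]: { condition: "service_completed_successfully" } },
    };
  }

  return { services, volumes: { [volumeName]: {} } };
}
```

Serialized to YAML and written to a temporary path, an object like this would be merged by Compose on top of the user-owned file for the duration of the run.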

Why The Bootstrap Image Exists

The bootstrap travels as a Docker image instead of a host node_modules bind mount. That matters because package managers such as pnpm can represent dependencies as symlinks, and those links may point to targets that do not exist or do not resolve once the directory is mounted into a container filesystem.

The image contains real files:

bootstrap.cjs, the OpenTelemetry preload entrypoint.
node_modules/, with the OpenTelemetry SDK and instrumentations.
bin/node-wrap, the shell wrapper mounted over the SUT node binary.
bin/real-node, the Node binary the wrapper delegates to.

This keeps the user’s package manager out of the instrumentation path.

Why Blackbox Shadows `node`

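Docker Compose merges environment entries by key, so a later definition of a variable replaces an earlier one instead of being combined with it. If Blackbox simply set NODE_OPTIONS=--require=/blackbox-otel/bootstrap.cjs in an overlay, it could overwrite a value the user’s image, compose file, or env file already set.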

Instead, the wrapper composes NODE_OPTIONS inside the container after the final environment has already been resolved. It preserves existing flags, prepends the Blackbox bootstrap, and avoids double-prepending when child Node processes start.

Blackbox also avoids overriding the service entrypoint. In Compose, setting entrypoint also clears the image's default command, which would force the testbed to reconstruct every service command. Shadowing node keeps the service startup path closer to what the image already declares.
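The composition rule described above (preserve existing flags, prepend the bootstrap, do not prepend twice) can be sketched as a small pure function. The real wrapper is a shell script that runs inside the container. This TypeScript version only illustrates the logic, and the function name composeNodeOptions is hypothetical.

```typescript
const BOOTSTRAP_FLAG = "--require=/blackbox-otel/bootstrap.cjs";

// Returns the NODE_OPTIONS value a wrapper could export before exec'ing the real Node binary.
function composeNodeOptions(existing: string | undefined): string {
  const current = (existing ?? "").trim();

  // A child Node process inherits NODE_OPTIONS from its parent, so the flag may
  // already be present. Whitespace splitting is enough for this detection step.
  if (current.split(/\s+/).indexOf(BOOTSTRAP_FLAG) !== -1) {
    return current;
  }

  // Prepend so the bootstrap is required before any user-supplied preloads.
  return current === "" ? BOOTSTRAP_FLAG : `${BOOTSTRAP_FLAG} ${current}`;
}
```

For example, an existing value of --max-old-space-size=4096 becomes --require=/blackbox-otel/bootstrap.cjs --max-old-space-size=4096, and passing that result through the function again returns it unchanged.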

Testcontainers And Docker Compose

Blackbox does not require one topology tool. The important distinction is whether the dependencies are managed by the test run.

Setup	Best for	Blackbox concern
Testbed	Fastest known-good path	Learn the expected topology and artifacts
Testcontainers	Per-test managed dependencies	Keep services isolated and observable
Docker Compose	Existing multi-service local stack	Make sure services, networks, and telemetry are wired consistently
Remote E2E environment	Production-like journey	Expect more unmanaged dependency noise

System tests usually make better behavioral gates when dependencies are controlled. E2E environments are still useful, but failures may come from auth, email, payment, vendor APIs, or shared infrastructure rather than the behavior under test.

Minimal Workflow

A good first run should answer four questions:

Did the topology start?
Did the scenario exercise the boundary?
Did spans arrive?
Did Blackbox write artifacts?

Do not start by optimizing the whole pipeline. Start with one behavior and one output directory. Once the first proof run works, expand the catalog, reports, and CI gate.
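One way to keep a first run honest is to evaluate the four questions in order and stop at the first one that fails, since later answers are not meaningful until earlier ones hold. The sketch below assumes you gather the four observations yourself. The FirstRunObservation shape and the firstUnansweredQuestion function are hypothetical and are not part of Blackbox.

```typescript
// Hypothetical record of what you observed after one proof run.
interface FirstRunObservation {
  topologyStarted: boolean;
  boundaryExercised: boolean;
  spanCount: number;
  artifactCount: number;
}

// Returns the first question that is not yet answered "yes", or null if all four pass.
function firstUnansweredQuestion(obs: FirstRunObservation): string | null {
  if (!obs.topologyStarted) return "Did the topology start?";
  if (!obs.boundaryExercised) return "Did the scenario exercise the boundary?";
  if (obs.spanCount === 0) return "Did spans arrive?";
  if (obs.artifactCount === 0) return "Did Blackbox write artifacts?";
  return null;
}
```

A null result is the signal that the proof run works and that expanding the catalog, reports, and CI gate is worth the effort.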

Expected Artifacts

A successful run should produce some combination of:

Runtime evidence from the scenario.
Feature files or observation output.
Effect catalog or catalog coverage output.
OMC/DC coverage artifacts such as omcdc-propagation.json, omcdc-propagation.md, and omcdc.html.
Logs or diagnostics for the run.

The exact filenames depend on the command and configuration. The reference page Files and Artifacts owns the complete inventory.
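A quick post-run check can compare the files in your output directory against the artifact names you expect. The sketch below uses only the three OMC/DC filenames mentioned above. Because the actual set depends on the command and configuration, treat the list as an assumption to adjust, and note that the summarizeArtifacts function is hypothetical.

```typescript
// Filenames taken from the list above; adjust to match your command and configuration.
const OMCDC_ARTIFACTS: string[] = [
  "omcdc-propagation.json",
  "omcdc-propagation.md",
  "omcdc.html",
];

interface ArtifactSummary {
  present: string[];
  missing: string[];
}

// Split the expected artifact names into those found and those not found.
function summarizeArtifacts(fileNames: string[]): ArtifactSummary {
  const present = OMCDC_ARTIFACTS.filter((name) => fileNames.indexOf(name) !== -1);
  const missing = OMCDC_ARTIFACTS.filter((name) => fileNames.indexOf(name) === -1);
  return { present, missing };
}
```

In a Node script, the input could come from fs.readdirSync on your output directory. An empty present list usually means you are inspecting a different path from the one the run wrote to.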

Common Failure Points

If the first run fails, check these in order:

1. The service did not start or was not reachable from the runner.
2. The scenario targeted the wrong host, port, or network.
3. The bootstrap image was not built, pulled, or mounted into the SUT container.
4. The service did not start through the Node binary path Blackbox shadowed.
5. The wrapper loaded, but existing tracing configuration conflicted with the Blackbox bootstrap.
6. Artifacts were written somewhere different from the path you inspected.

Figure: A system-test run from Playwright to Docker Compose overlays, node-wrap, OpenTelemetry spans, and .blackbox-coverage/ artifacts.

Use No Spans Captured when the test runs but no runtime evidence appears. Use Docker and Testbed when the topology itself fails.