Writing Scenarios

Write scenarios and effect assertions that connect system tests to observability testing, span assertions, and Blackbox behavior proof.

Abstract

A Blackbox scenario is the bridge between a system test and behavioral proof. It names the workflow, exercises the boundary, and gives Blackbox enough structure to compare observed effects against what the team expects.

This page also covers the material that used to live in a separate asserting-effects guide: how scenarios, span assertions, trace assertions, and required or forbidden effects work together.

Audience

Developers writing Blackbox scenarios against system tests, Playwright flows, Docker Compose stacks, Testcontainers setups, or other controlled runtime environments.

Scenario Shape

A useful scenario has three jobs:

Establish the starting state.
Exercise a real workflow.
Assert the effects that must or must not appear.

The exact DSL can vary by runner and API surface, but the mental model stays close to Given, When, Then:

Given the system starts from a known state.
When the workflow runs.
Then required effects appear and forbidden effects stay absent.

Do not start with every workflow. Start with one behavior whose boundary effects matter.
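
As a rough illustration of the Given, When, Then shape, here is a minimal Python sketch. It is not the Blackbox DSL: the `Scenario` class, its field names, the effect strings, and the `seed_known_state` and `place_order` helpers are all hypothetical stand-ins for whatever your runner provides.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Scenario:
    """Hypothetical scenario shape for illustration; not a Blackbox API."""
    name: str
    given: Callable[[], dict]            # establish the starting state
    when: Callable[[dict], list[str]]    # run the workflow, return observed effects
    requires: set[str] = field(default_factory=set)
    forbids: set[str] = field(default_factory=set)

    def run(self) -> list[str]:
        """Return a list of failure messages; an empty list means the scenario passed."""
        state = self.given()
        observed = set(self.when(state))
        failures = [
            f"missing required effect: {effect}"
            for effect in sorted(self.requires - observed)
        ]
        failures += [
            f"forbidden effect appeared: {effect}"
            for effect in sorted(self.forbids & observed)
        ]
        return failures


def seed_known_state() -> dict:
    # Given: a known, empty starting state (assumed for this sketch).
    return {"orders": []}


def place_order(state: dict) -> list[str]:
    # When: stand-in for a real system test; returns the effects it "observed".
    state["orders"].append({"id": 1})
    return ["db:INSERT orders", "publish:order.created"]


scenario = Scenario(
    name="place order",
    given=seed_known_state,
    when=place_order,
    requires={"publish:order.created"},   # Then: must appear
    forbids={"call:legacy-billing"},      # Then: must stay absent
)
failures = scenario.run()
```

The sketch covers a single behavior with one required and one forbidden effect, which matches the advice to start small.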

Observability Testing Vs Effect Assertions

Observability testing is useful when traces, spans, logs, and runtime facts help explain what happened during a test. By itself, observability does not decide whether the behavior is acceptable.

Blackbox adds the assertion layer:

Evidence	Useful question	Blackbox layer
Span	Did this operation happen?	Span assertion or effect builder
Trace	Did the workflow cross expected boundaries?	Trace assertion or runtime evidence review
Effect	Did the externally visible behavior happen?	Required effect
Missing effect	Did expected behavior disappear?	Catalog failure
Extra effect	Did forbidden behavior appear?	Forbidden effect

The goal is not to assert every span. The goal is to assert the effects that represent product or system behavior.
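
One way to picture the step from spans to effects is a small mapping function that keeps boundary spans and drops internal ones. The sketch below is an assumption-heavy illustration, not how Blackbox derives effects: spans are plain dictionaries, the `span_to_effect` function and the effect string format are hypothetical, and the attribute keys are borrowed from OpenTelemetry semantic conventions, which may be named differently in your instrumentation version.

```python
from typing import Optional


def span_to_effect(span: dict) -> Optional[str]:
    """Map a span to an effect name, or None if it is not a boundary effect.

    Hypothetical mapping rules for illustration only.
    """
    kind = span.get("kind")
    attrs = span.get("attributes", {})
    if kind == "PRODUCER":
        # A message left the system: treat it as a publish effect.
        destination = attrs.get("messaging.destination.name", span["name"])
        return f"publish:{destination}"
    if kind == "CLIENT" and "db.system" in attrs:
        # An outbound database call.
        return f"db:{span['name']}"
    if kind == "CLIENT":
        # Any other outbound call, such as a downstream HTTP API.
        return f"call:{attrs.get('server.address', span['name'])}"
    # INTERNAL and SERVER spans are evidence, not effects, in this sketch.
    return None


spans = [
    {"name": "POST /orders", "kind": "SERVER", "attributes": {}},
    {"name": "validate_cart", "kind": "INTERNAL", "attributes": {}},
    {"name": "INSERT orders", "kind": "CLIENT",
     "attributes": {"db.system": "postgresql"}},
    {"name": "orders publish", "kind": "PRODUCER",
     "attributes": {"messaging.destination.name": "order.created"}},
]

effects = [e for e in map(span_to_effect, spans) if e is not None]
# effects == ["db:INSERT orders", "publish:order.created"]
```

Four spans produce two effects here. The internal helper span and the inbound server span remain available as evidence but are never asserted.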

Requires And Forbids In A Scenario

Required effects are positive assertions. They describe what must happen if the scenario is correct.

Forbidden effects are negative assertions. They describe what must not happen, even if the main path succeeds.

Good examples:

Require an event publication after a successful order.
Require a database write after a user update.
Forbid a deprecated downstream API call.
Forbid duplicate queue messages during retry.

Weak examples:

Assert every internal helper span.
Assert private implementation details that do not matter to the boundary.
Treat every observed span as a required behavior without review.
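
The good examples above can be sketched as a small evaluation function. This is a hypothetical illustration, not the Blackbox API: the `evaluate` function, its `at_most_once` parameter (one possible way to express "forbid duplicate queue messages during retry"), and the effect strings are all assumptions.

```python
from collections import Counter


def evaluate(
    observed: list[str],
    required: set[str],
    forbidden: set[str],
    at_most_once: set[str],
) -> list[str]:
    """Return failure messages for one scenario run; empty means pass."""
    counts = Counter(observed)
    failures = []
    for effect in sorted(required):
        if counts[effect] == 0:
            failures.append(f"required effect missing: {effect}")
    for effect in sorted(forbidden):
        if counts[effect] > 0:
            failures.append(f"forbidden effect observed: {effect}")
    for effect in sorted(at_most_once):
        if counts[effect] > 1:
            failures.append(
                f"duplicate effect: {effect} (seen {counts[effect]} times)"
            )
    return failures


# Effects observed during a user update that retried once (made-up data).
observed = [
    "db:UPDATE users",
    "publish:user.updated",
    "publish:user.updated",
]

failures = evaluate(
    observed,
    required={"db:UPDATE users", "publish:user.updated"},
    forbidden={"call:legacy-profile-api"},
    at_most_once={"publish:user.updated"},
)
# failures == ["duplicate effect: publish:user.updated (seen 2 times)"]
```

The main path succeeds in this run, yet the scenario still fails because the retry published the message twice. That is the kind of regression a forbidden-style assertion is meant to catch.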

Working With OpenTelemetry

OpenTelemetry gives Blackbox the runtime facts. The scenario gives those facts meaning.

A practical OpenTelemetry (OTel) testing workflow is:

Instrument the system under test.
Run one system or E2E scenario.
Capture spans and related runtime evidence.
Map the evidence into effects.
Promote important effects into required or forbidden behavior.
Let future runs fail when the behavior drifts.

This is where span assertions and trace assertions become behavior assertions. They are not only debugging tools; they become part of the verification gate.
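
The last three steps of the workflow (map, promote, fail on drift) can be sketched in plain Python. Everything here is a hypothetical illustration: the `promote` and `drift` functions, the catalog layout, and the effect strings are assumptions, and in a real setup the observed effects would be derived from spans captured by an OpenTelemetry exporter or collector rather than from hard-coded lists.

```python
import json


def promote(observed: list[str], approved_required: set[str],
            forbidden: set[str]) -> dict:
    """Build a catalog from a baseline run.

    Only effects that were both observed and approved in review become
    required; everything else stays out of the gate.
    """
    return {
        "required": sorted(set(observed) & approved_required),
        "forbidden": sorted(forbidden),
    }


def drift(catalog: dict, observed: list[str]) -> dict:
    """Compare a later run against the catalog."""
    seen = set(observed)
    required = set(catalog["required"])
    forbidden = set(catalog["forbidden"])
    return {
        "missing": sorted(required - seen),             # expected behavior disappeared
        "forbidden": sorted(forbidden & seen),          # forbidden behavior appeared
        "unreviewed": sorted(seen - required - forbidden),  # new, not yet promoted
    }


# Baseline run: effects mapped from captured spans (made-up data).
baseline_run = ["db:INSERT orders", "publish:order.created", "call:inventory"]
catalog = promote(
    baseline_run,
    approved_required={"db:INSERT orders", "publish:order.created"},
    forbidden={"call:legacy-billing"},
)

# The catalog is plain data, so it can be stored next to the tests.
catalog = json.loads(json.dumps(catalog))

# A later run where the event publication was lost and a legacy call crept in.
later_run = ["db:INSERT orders", "call:inventory", "call:legacy-billing"]
report = drift(catalog, later_run)
gate_failed = bool(report["missing"] or report["forbidden"])
```

In this sketch, unreviewed effects such as `call:inventory` are reported but do not fail the gate, which keeps observed spans from silently becoming required behavior without review.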