Scenario
The user brief
The test used a URL-shortener microservice brief because it combines engineering, QA, technical documentation, and conversion copy in one task without being a toy example.
Build a URL-shortener microservice: HTTP API to shorten and resolve URLs,
SQLite storage for the MVP with a migration path to Postgres, safety checks
against malicious redirects, a compact admin endpoint that lists recent links,
Jest + property-based tests for the hash function, Markdown API docs suitable
for a README, and a short landing-page hero block with the product name and
value prop.
Tested version: Guild v1.0.0-beta4, against a sibling workspace named guild-test-urlshortener. This page is the public narrative distilled from the internal run notes.
Specialist DAG
Five specialists, scoped by dependency
| Specialist | Planned ownership | Dependency |
|---|---|---|
| architect | System design, hash strategy, schema, blocklist placement, admin separation, ADR. | None |
| backend | Express app, SQLite data layer, blocklist enforcement, admin endpoint. | architect |
| qa | Jest suite and property-based tests for the short-code hash. | backend |
| technical-writer | Markdown API docs and README walkthrough. | backend |
| copywriter | Landing-page hero name, value proposition, and CTA. | Spec only |
architect -> backend -> qa
architect -> backend -> technical-writer
copywriter (spec only, parallel with engineering)
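The dependency column above can be read as a small graph. The sketch below is one way to turn it into execution waves, assuming "Spec only" means no dependency on another specialist. The function name `executionWaves` is hypothetical, and this is an illustration of the ordering, not Guild's actual scheduler.

```javascript
// Dependency map copied from the specialist table.
// Assumption: "Spec only" is modelled as an empty dependency list.
const dependsOn = {
  architect: [],
  backend: ["architect"],
  qa: ["backend"],
  "technical-writer": ["backend"],
  copywriter: [],
};

// Group specialists into waves: a specialist is ready once every one of its
// dependencies has finished in an earlier wave (Kahn-style layering).
function executionWaves(graph) {
  const done = new Set();
  const waves = [];
  let pending = Object.keys(graph);
  while (pending.length > 0) {
    const ready = pending.filter((name) =>
      graph[name].every((dep) => done.has(dep))
    );
    if (ready.length === 0) {
      throw new Error("cycle or unknown dependency among: " + pending.join(", "));
    }
    ready.forEach((name) => done.add(name));
    waves.push(ready);
    pending = pending.filter((name) => !done.has(name));
  }
  return waves;
}

console.log(executionWaves(dependsOn));
// [ [ 'architect', 'copywriter' ], [ 'backend' ], [ 'qa', 'technical-writer' ] ]
```

Under this reading, copywriter can start alongside architect, and qa and technical-writer can run in parallel once backend has handed off.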
Guild did not add security as a separate specialist because the MVP had no external integration, no secrets design, and only a simple bearer-token admin path.
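To make "a simple bearer-token admin path" concrete, here is a minimal sketch of such a check. The function name `adminStatus` and the decision to hash both tokens before comparing are assumptions made for illustration; the run's actual admin middleware is not shown on this page.

```javascript
const crypto = require("node:crypto");

// Returns the HTTP status an admin request should receive:
// 401 without a valid bearer token, 200 with one.
function adminStatus(authorizationHeader, expectedToken) {
  const match = /^Bearer (.+)$/.exec(authorizationHeader || "");
  if (!match) return 401;
  // Hash both sides so timingSafeEqual always compares equal-length buffers.
  const given = crypto.createHash("sha256").update(match[1]).digest();
  const expected = crypto.createHash("sha256").update(expectedToken).digest();
  return crypto.timingSafeEqual(given, expected) ? 200 : 401;
}

console.log(adminStatus(undefined, "s3cret"));       // 401
console.log(adminStatus("Bearer s3cret", "s3cret")); // 200
```

A check this small is consistent with the decision not to dispatch a separate security specialist, though a production service would likely need more than a single shared token.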
Execution trace
Plan vs what the agents did
| Lane | Plan | What the agent did | Evidence |
|---|---|---|---|
| architect | Design boundaries, schema, hash strategy, blocklist placement, admin separation, ADR. | Produced the design and ADR, including a short-code strategy and module boundary recommendation. | Design doc, ADR, architect handoff receipt. |
| backend | Implement Express API, better-sqlite3 storage, blocklist behavior, admin endpoint. | Built runtime routes, SQLite store, migration, blocklist checks, admin auth, and server entrypoint within the LOC cap. | Runtime files, migration, backend handoff receipt. |
| qa | Write Jest and property-based tests for hash behavior and route regressions. | Added property tests and integration tests covering shorten, resolve, blocklist, and admin behavior. | 2 test files, 8/8 tests passing. |
| technical-writer | Write API docs suitable for README use. | Documented endpoints, curl examples, schemas, errors, setup, and admin token behavior. | README and docs/api.md. |
| copywriter | Write a compact landing-page hero block with product name and value prop. | Created the Shortlane hero, 17-word value prop, and CTA. | docs/landing-hero.md. |
The point is not that every lane followed the plan to the letter; it is that every deviation was captured in a handoff receipt and reviewed afterward. The mismatch between the architect's design and the backend's implementation over the service module became the reflection signal described below.
End result
What existed after the run
Project files
Runtime, data layer, migrations, route handlers, tests, API docs, README, and landing hero copy.
src/app.js
src/data/sqlite.js
src/routes/shorten.js
src/routes/resolve.js
src/routes/admin.js
test/shortcode.property.test.js
test/routes.integration.test.js
docs/api.md
docs/landing-hero.md
Guild artifacts
Spec, team, plan, context bundles, specialist handoffs, assumptions, review, verify, reflection, and telemetry.
.guild/context/<run-id>/*.md
.guild/runs/<run-id>/handoffs/*.md
.guild/runs/<run-id>/review.md
.guild/runs/<run-id>/verify.md
.guild/reflections/run-<id>.md
.guild/runs/<run-id>/events.ndjson
Verification
Verification output
| Metric | Result |
|---|---|
| Specialists dispatched | 5 / 5 |
| Handoff receipts produced | 5 / 5 |
| Review lanes passing both stages | 5 / 5 |
| Verify checks green | 5 / 5 |
| npm test | 8 / 8 tests pass in 0.625 s |
| Runtime LOC | 193 / 500 cap |
| Guild artifacts on disk | 20 under .guild/ |
| Telemetry events captured | 93 in events.ndjson |
| Context bundle total size | 13.6 KB across 5 specialists |
Test and curl evidence
PASS test/routes.integration.test.js
PASS test/shortcode.property.test.js
Test Suites: 2 passed, 2 total
Tests: 8 passed, 8 total
Time: 0.625 s
POST /shorten -> 201 {"code":"iV3wO0R", ...}
GET /:code -> 302
GET /admin/links (no token) -> 401
GET /admin/links (with bearer token) -> 200
Verify also checked blocklist behavior, SQLite migration presence, standard HTTP status codes, documentation coverage, and changed-file scope traceability.
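As an illustration of the kind of blocklist behavior verified above, the sketch below validates a redirect target before it is stored. The blocklist entries, the function name `validateTarget`, and the 400 and 422 status codes are all assumptions; only the 201 for a successful shorten comes from the evidence shown here.

```javascript
// Hypothetical blocklist. The run's real list and where it is stored are not
// described on this page.
const BLOCKED_HOSTS = new Set(["malware.example", "phish.example"]);

// Decide whether a submitted URL is an acceptable redirect target.
function validateTarget(rawUrl) {
  let parsed;
  try {
    parsed = new URL(rawUrl); // throws on relative or malformed input
  } catch {
    return { ok: false, status: 400, reason: "not a valid absolute URL" };
  }
  // Reject schemes such as javascript: or data: that make redirects dangerous.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return { ok: false, status: 400, reason: "only http and https are allowed" };
  }
  // Match the host itself and any subdomain of a blocked host.
  const host = parsed.hostname.toLowerCase();
  const blocked = [...BLOCKED_HOSTS].some(
    (entry) => host === entry || host.endsWith("." + entry)
  );
  if (blocked) {
    return { ok: false, status: 422, reason: "target host is blocklisted" };
  }
  return { ok: true, status: 201 };
}

console.log(validateTarget("https://example.com/page").status);    // 201
console.log(validateTarget("javascript:alert(1)").status);         // 400
console.log(validateTarget("https://cdn.phish.example/x").status); // 422
```

Checking at shorten time keeps bad targets out of storage entirely; a service could also re-check at resolve time if the blocklist changes after a link is created.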
Self-evolution signal
The run produced a real improvement candidate
Reflection found silent contract drift between the architect's design and the plan deliverables. The architect proposed a separate service module; the deliverables list did not reserve it. Backend resolved the ambiguity conservatively, but the mismatch was real.
Proposed improvement: have context assembly detect plan/design deliverable mismatches and flag them in the specialist bundle before execution.
Guild classified the finding as a guild:plan or context-assembly improvement candidate rather than blaming a specialist. It stayed in proposal form, below the promotion threshold, as the self-evolution gate requires.
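The proposed check could be as small as a set difference between what the design names and what the plan reserves. The sketch below assumes both are available as lists of file paths. The function name `findUnreservedModules` and the path `src/services/links.js` are invented for illustration, since this page does not show the real module path or Guild's internal data shapes.

```javascript
// Return every module the design proposes that the plan never reserves.
function findUnreservedModules(designModules, planDeliverables) {
  const reserved = new Set(planDeliverables);
  return designModules.filter((path) => !reserved.has(path));
}

// Hypothetical inputs. "src/services/links.js" stands in for the architect's
// proposed service module; the real path is not given in the source.
const designModules = [
  "src/app.js",
  "src/data/sqlite.js",
  "src/services/links.js",
];
const planDeliverables = [
  "src/app.js",
  "src/data/sqlite.js",
  "src/routes/shorten.js",
  "src/routes/resolve.js",
  "src/routes/admin.js",
];

const unreserved = findUnreservedModules(designModules, planDeliverables);
if (unreserved.length > 0) {
  // In the proposal, this is where the mismatch would be flagged in the
  // specialist bundle before execution starts.
  console.log("Design modules missing from plan:", unreserved);
}
```

Running a check like this during context assembly would surface the drift to the backend specialist up front instead of leaving it to be discovered in reflection.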