Data Flow Specification

Recommended engineering data_flow_spec
Agent Prompt Snippet
Ensure the project has a data flow specification mapping sensor data from collection through edge processing to cloud ingestion.

Purpose

A data flow specification maps sensor data from collection through edge processing to cloud ingestion, defining protocols, serialization formats, and buffering strategies for reliable telemetry.
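As one illustration of the kind of buffering strategy such a specification might pin down, here is a minimal sketch of a bounded store-and-forward buffer at the edge. Everything in it is an assumption made for the example: the `EdgeBuffer` class, the `send` callback standing in for a cloud uploader, the JSON wire format, and the drop-oldest overflow policy are hypothetical and not prescribed by SpecBase.

```python
import json
from collections import deque
from typing import Callable, Deque, Dict, List


class EdgeBuffer:
    """Bounded store-and-forward buffer for serialized sensor readings (illustrative)."""

    def __init__(self, capacity: int, send: Callable[[List[str]], bool]) -> None:
        # deque(maxlen=...) discards the oldest entry once capacity is reached.
        self._queue: Deque[str] = deque(maxlen=capacity)
        # Hypothetical uploader: returns True when the batch was accepted.
        self._send = send

    def collect(self, reading: Dict[str, object]) -> None:
        # Serialize at the edge so the wire format is defined in one place.
        self._queue.append(json.dumps(reading, sort_keys=True))

    def flush(self) -> int:
        """Try to upload all buffered records; return how many were sent."""
        batch = list(self._queue)
        if not batch:
            return 0
        if self._send(batch):
            self._queue.clear()
            return len(batch)
        return 0  # upload failed: keep the records for the next attempt

    def pending(self) -> int:
        return len(self._queue)
```

A real specification would state these choices explicitly (capacity, overflow policy, retry behaviour, serialization format) along with the reasons for them, since a drop-oldest policy, for example, trades completeness for bounded memory.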

This is a Recommended document: most projects benefit from having one. It is not essential in every situation, but its absence often leaves gaps in the team's understanding of the system or in the quality of what gets built.

What Makes It Good vs Bad

A strong version of this document:

  • Provides enough detail that a new team member can understand the system
  • Includes diagrams or structured descriptions of components and data flows
  • Documents decisions and trade-offs, not just final choices
  • Stays current — updated when the system changes meaningfully
  • Clearly separates what exists today from what is planned
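A structured description of the data flow can be as simple as a typed list of stages that a script can check. The sketch below is one hypothetical way to do this; the `Stage` fields, the stage names, and the formats and transports shown are illustrative assumptions, not part of any SpecBase template.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Stage:
    name: str
    consumes: str   # serialization format this stage accepts
    produces: str   # serialization format this stage emits
    transport: str  # protocol used to deliver its output downstream
    status: str     # "implemented" or "planned"


def find_mismatches(stages: List[Stage]) -> List[Tuple[str, str]]:
    """Return (upstream, downstream) pairs whose formats do not line up."""
    return [
        (up.name, down.name)
        for up, down in zip(stages, stages[1:])
        if up.produces != down.consumes
    ]


def planned_stages(stages: List[Stage]) -> List[str]:
    """List the stages that are designed but not yet built."""
    return [s.name for s in stages if s.status == "planned"]


# Example pipeline; all values are placeholders.
PIPELINE = [
    Stage("sensor", "analog", "raw-frame", "SPI", "implemented"),
    Stage("edge-gateway", "raw-frame", "json", "MQTT", "implemented"),
    Stage("cloud-ingest", "json", "parquet", "HTTPS", "planned"),
]
```

Keeping a `status` field on every stage makes the split between what exists today and what is planned explicit, and a check like `find_mismatches` can run in CI to flag a description whose adjacent stages no longer agree on a format.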

Warning signs of a weak version:

  • So high-level it could describe any project
  • Outdated diagrams that no longer match the running system
  • Missing rationale — documents what but never why
  • No clear ownership or update cadence
  • Mixes aspirational design with actual implementation

Common Mistakes

  • Writing documentation after the fact that doesn’t reflect actual decisions made
  • Over-documenting trivial details while under-documenting critical design choices
  • Not versioning the document alongside the code it describes
  • Assuming readers have the same context the author had when writing

How to Use This Document

Write for the audience of a new team member joining six months from now. Lead with the why behind decisions — the what is usually visible in code, but context and trade-offs are not. Use diagrams for system topology and data flows. Keep the document close to the code it describes (same repo, linked from README).

For AI agents: Use this document as primary context when modifying the system. Before proposing changes, verify your understanding against the architecture and design documents. Cite specific sections when explaining why a change is compatible with existing design decisions.

Starter Template

SpecBase includes a ready-to-use template for this document: kb/templates/engineering/data_flow_spec.md.tmpl. Use the SpecBase CLI or MCP integration to generate it pre-filled for your project.

# Generate stubs via CLI
specbase init <archetype> --features <features> --dir ./docs

Further Reading

  • Documenting Software Architectures: Views and Beyond by Paul Clements et al. — The standard reference for how to document software architecture effectively.
  • Designing Data-Intensive Applications by Martin Kleppmann — Essential reading for understanding distributed systems, data models, and trade-offs.
  • A Philosophy of Software Design by John Ousterhout — Concise guide to managing complexity in software through better module design.
