Evidence-as-Code: Why Machine-Readable Compliance is the Future
Somewhere in your organization there's a compliance analyst with a folder of screenshots. Each one is timestamped (maybe), labeled (hopefully), and maps to a control (in theory). When audit time comes, they'll spend weeks matching screenshots to a checklist, hoping nothing expired, hoping the infrastructure hasn't changed since the screenshot was taken, hoping the assessor doesn't ask to see it run live.
This is the compliance evidence model most organizations use today. It's labor-intensive, error-prone, and fundamentally non-reproducible. Evidence-as-code replaces it with something better: machine-readable, cryptographically hashed, continuously generated compliance artifacts that assessors can verify without relying on someone's screenshot folder.
The problem with screenshot evidence
Screenshot-based evidence has five fundamental weaknesses:
- No provenance. A screenshot doesn't prove when it was taken, by whom, or whether the system depicted is the actual production system. It's a picture of pixels. An assessor has to trust that it's real and current.
- Point-in-time only. A screenshot shows what the system looked like at one moment. It says nothing about what happened before or after. An MFA configuration screenshot from January doesn't prove MFA is still enabled in April.
- Not machine-verifiable. No tool can read a screenshot and confirm it satisfies a control requirement. A human has to look at it, interpret it, and decide if it's sufficient. This doesn't scale.
- Stale by default. Infrastructure changes continuously. Screenshots don't. By the time your annual assessment arrives, your evidence may be months out of date. The only way to keep it current is to re-screenshot everything — constantly.
- Not defensible. In a dispute or investigation, a screenshot has no chain of custody. It can be fabricated, modified, or taken from the wrong system. There's no cryptographic proof of authenticity.
General-purpose GRC platforms improve on this by automating evidence collection via API integrations. Vanta, Drata, and similar tools pull configuration data from cloud providers and present it in dashboards. This is better than manual screenshots, but the evidence is still stored in the vendor's cloud, still point-in-time (updated periodically, not continuously), and still not in a machine-readable standard format like OSCAL.
What is evidence-as-code?
Evidence-as-code treats compliance evidence like software: it's versioned, tested, reproducible, and automatically generated from infrastructure. Instead of a screenshot proving that MFA is enabled, you have:
- A Prowler check (iam_root_hardware_mfa_enabled) that runs in CI on every commit
- The check produces structured output mapped to a specific NIST 800-171 practice
- An OSCAL emitter converts the result into an OSCAL assessment-result document
- The document is hashed with SHA-256 and committed to git alongside the commit that triggered it
- The hash is linked to the corresponding control in the OSCAL component-definition
The result: every compliance claim traces to a specific check, run at a specific time, against a specific version of your infrastructure, with a cryptographic hash proving it hasn't been tampered with. An assessor can verify the entire chain from claim to evidence to infrastructure state.
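As a rough illustration of that chain, the sketch below wraps one check result in an evidence record and hashes it. The record layout, the `CHECK_TO_PRACTICE` mapping, the commit value, and the function names are all assumptions made for this example; they are not Prowler's output format or a real OSCAL emitter.

```python
import hashlib
import json

# Hypothetical mapping from check IDs to NIST 800-171 practices (illustrative only).
CHECK_TO_PRACTICE = {"iam_root_hardware_mfa_enabled": "3.5.3"}


def build_evidence(check_id, status, commit_sha, collected_at):
    """Wrap one check result in a small evidence record tied to a practice and a commit."""
    return {
        "check_id": check_id,
        "practice": CHECK_TO_PRACTICE[check_id],
        "status": status,
        "commit": commit_sha,
        "collected_at": collected_at,
    }


def evidence_digest(record):
    """SHA-256 over a canonical JSON encoding, so the same record always hashes the same."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


record = build_evidence(
    "iam_root_hardware_mfa_enabled", "PASS", "abc1234", "2024-01-15T12:00:00+00:00"
)
digest = evidence_digest(record)
print(record["practice"], digest)
```

Sorting keys and fixing separators before hashing is a design choice: without a canonical encoding, two semantically identical records could produce different digests.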
OSCAL: the standard that makes it work
The key enabling technology is OSCAL (Open Security Controls Assessment Language), developed by NIST. OSCAL provides machine-readable formats for:
- Catalogs — the control requirements themselves (e.g., NIST 800-171 Rev 2)
- Profiles — which controls apply to your context (e.g., CMMC Level 2 selection)
- Component definitions — how your systems implement specific controls
- System Security Plans — the complete description of your security posture
- Assessment plans and results — how controls were tested and what the findings were
- POA&Ms — what's not yet implemented and when it will be
OSCAL is the compliance equivalent of Terraform for infrastructure. It makes the SSP a build artifact, not a document. It makes the assessment a diff, not a read. And it makes continuous compliance a pipeline property, not a dashboard feature.
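To make "the SSP as a build artifact" concrete, here is a minimal sketch that builds a component-definition-shaped JSON document for one control. The field names follow the OSCAL component-definition layout as best understood here, and the catalog path, version strings, and control ID format are assumptions; anything real should be validated against the official OSCAL schema.

```python
import json
import uuid
from datetime import datetime, timezone


def component_definition(title, control_id, description):
    """Build a minimal component-definition-shaped dict for a single control."""
    return {
        "component-definition": {
            "uuid": str(uuid.uuid4()),
            "metadata": {
                "title": f"{title} component definition",
                "last-modified": datetime.now(timezone.utc).isoformat(),
                "version": "0.1.0",           # document version, illustrative
                "oscal-version": "1.1.2",     # assumed OSCAL release
            },
            "components": [{
                "uuid": str(uuid.uuid4()),
                "type": "service",
                "title": title,
                "description": "Hypothetical in-scope system.",
                "control-implementations": [{
                    "uuid": str(uuid.uuid4()),
                    "source": "catalogs/nist-800-171-rev2.json",  # hypothetical path
                    "description": "Controls verified by automated checks.",
                    "implemented-requirements": [{
                        "uuid": str(uuid.uuid4()),
                        "control-id": control_id,
                        "description": description,
                    }],
                }],
            }],
        }
    }


doc = component_definition(
    "AWS IAM", "3.5.3",
    "Root MFA verified by Prowler check iam_root_hardware_mfa_enabled.",
)
print(json.dumps(doc, indent=2))
```

Because the output is plain JSON, it can be committed, diffed, and regenerated like any other build output.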
How evidence-as-code works in practice
Here's the pipeline that replaces the screenshot folder:
Infrastructure change (Terraform apply)
    |
    v
CI pipeline triggers
    |
    +--> Prowler: 200+ security checks against AWS
    +--> Steampipe: SQL queries against cloud APIs
    +--> OPA: Policy-as-code evaluation against Terraform state
    |
    v
OSCAL emitter maps results to control requirements
    |
    v
SHA256 hash + git commit
    |
    v
Component definition updated in SSP assembly
    |
    v
Failed checks --> auto-generated POA&M entries
Every infrastructure change runs through this pipeline. The evidence is always current because it's generated from the same commit that changed the infrastructure. There's no lag between "we changed something" and "the evidence reflects the change."
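The fan-in step of that pipeline, where results from several tools become findings and failed checks become POA&M entries, might look roughly like this. The `CheckResult` type, the `emit` function, and the Steampipe and OPA check names are hypothetical stand-ins, not the real output of those tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    tool: str        # which scanner produced the result
    check_id: str    # tool-specific check identifier
    practice: str    # NIST 800-171 practice the check is mapped to
    passed: bool


def emit(results, commit_sha):
    """Turn raw check results into findings, plus POA&M entries for every failure."""
    findings, poam = [], []
    for r in results:
        findings.append({
            "practice": r.practice,
            "check_id": r.check_id,
            "tool": r.tool,
            "status": "PASS" if r.passed else "FAIL",
            "commit": commit_sha,
        })
        if not r.passed:
            poam.append({
                "practice": r.practice,
                "check_id": r.check_id,
                "opened_at_commit": commit_sha,
            })
    return findings, poam


results = [
    CheckResult("prowler", "iam_root_hardware_mfa_enabled", "3.5.3", True),
    CheckResult("steampipe", "example_tls_query", "3.13.8", False),      # hypothetical check
    CheckResult("opa", "example_access_policy", "3.1.1", True),          # hypothetical check
]
findings, poam = emit(results, "abc1234")
print(len(findings), poam)
```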
What the assessor sees
Instead of a folder of screenshots, the assessor gets:
- An OSCAL component-definition for each in-scope system, with implemented-requirements pointing at verification methods
- Assessment-result documents with timestamped findings and git commit hashes
- A provenance chain from each claim to the CI run that produced the evidence
- An auto-generated POA&M for any failed checks, with the check ID and last-passing commit
Year one, the assessor reviews the full tree. Year two, they review the diff — what changed since last year, which new evidence was produced, which POA&M items were resolved. A re-assessment becomes hours, not weeks.
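The year-two review can be pictured as a diff over per-control findings. The sketch below assumes each assessment is reduced to a simple mapping of practice ID to PASS or FAIL; the function name and the sample data are made up for illustration.

```python
def diff_findings(previous, current):
    """Classify each control by how its status changed between two assessments."""
    changes = {"added": [], "removed": [], "resolved": [], "regressed": []}
    for control in sorted(set(previous) | set(current)):
        before, after = previous.get(control), current.get(control)
        if before is None:
            changes["added"].append(control)       # newly covered this year
        elif after is None:
            changes["removed"].append(control)     # no longer assessed
        elif before == "FAIL" and after == "PASS":
            changes["resolved"].append(control)    # POA&M item closed
        elif before == "PASS" and after == "FAIL":
            changes["regressed"].append(control)   # needs a new POA&M item
    return changes


last_year = {"3.1.1": "PASS", "3.5.3": "FAIL", "3.13.8": "PASS"}
this_year = {"3.1.1": "PASS", "3.5.3": "PASS", "3.13.8": "FAIL", "3.13.11": "PASS"}
report = diff_findings(last_year, this_year)
print(report)
```

Controls whose status did not change are deliberately left out of the report, which is what keeps the second-year review short.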
The court-defensible advantage
Evidence-as-code produces artifacts with properties that screenshot evidence simply cannot have:
- Immutability: Git commits are content-addressed. Changing the evidence changes the hash, which breaks the chain. You can't retroactively modify evidence without detection.
- Reproducibility: Given the same infrastructure state and the same commit, the pipeline produces the same evidence. An assessor (or a court) can verify this.
- Completeness: The pipeline runs against all in-scope systems on every commit. Missing evidence is a pipeline failure, not a human oversight.
- Currency: Evidence is never more than one commit old. There's no "when was this screenshot taken?" question.
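The immutability property comes down to a check like the one below: recompute the artifact's digest and compare it with the digest recorded at collection time. The function names and sample bytes are illustrative. Note that a hash only proves integrity if the recorded digest itself is trusted, for example because it lives in a signed commit.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of an evidence artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, recorded_digest: str) -> bool:
    """True only if the artifact still matches the digest recorded when it was collected."""
    return hmac.compare_digest(sha256_hex(data), recorded_digest)


original = b'{"practice": "3.5.3", "status": "PASS"}'
recorded = sha256_hex(original)

tampered = b'{"practice": "3.5.3", "status": "FAIL"}'
print(verify_artifact(original, recorded), verify_artifact(tampered, recorded))
```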
In a world where the DoJ is holding CISOs personally liable for compliance misrepresentations and the False Claims Act is being applied to cybersecurity assertions, the provenance of your compliance evidence matters. "We have a screenshot" is a different legal position than "we have a SHA-256-hashed artifact from a reproducible pipeline, committed at timestamp T, against infrastructure version V."
Getting started with evidence-as-code
You don't need to build this pipeline from scratch. Here's the practical path:
- Start with what you have. If you're already running Prowler, Steampipe, or cloud-native security tools, you have raw evidence. The gap is mapping it to controls and producing OSCAL output.
- Pick three controls. Don't try to automate all 110 CMMC practices at once. Start with your highest-value controls (AC.L2-3.1.1, IA.L2-3.5.3, SC.L2-3.13.8) and build the pipeline for those.
- Map to OSCAL. Use NIST's oscal-cli or the open-source compliance-trestle library to produce machine-readable component definitions.
- Commit and sign. Hash every evidence artifact, commit it to a git repository, and sign the commit. This is your provenance chain.
- Extend incrementally. Add controls to the pipeline one domain at a time until all 110 are covered.
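One way to track the incremental rollout is a small coverage report over the practices already automated. This is a sketch only: the function name is hypothetical, and the practice IDs are the three starter controls named above.

```python
from collections import Counter

TOTAL_PRACTICES = 110  # CMMC Level 2 practice count cited in the text


def coverage(automated):
    """Return (count, percent of all practices, per-domain counts) for automated practices."""
    unique = sorted(set(automated))
    # The domain is the prefix before the first dot, e.g. "AC" in "AC.L2-3.1.1".
    by_domain = Counter(practice.split(".", 1)[0] for practice in unique)
    percent = round(100 * len(unique) / TOTAL_PRACTICES, 1)
    return len(unique), percent, dict(by_domain)


done, percent, by_domain = coverage(["AC.L2-3.1.1", "IA.L2-3.5.3", "SC.L2-3.13.8"])
print(f"{done}/{TOTAL_PRACTICES} practices automated ({percent}%)", by_domain)
```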
Or, let us do it. We deploy this pipeline for clients in weeks, not months, mapping all 110 CMMC L2 practices to automated checks with OSCAL output.
See evidence-as-code in action
Download a sample evidence package with OSCAL component definitions, assessment results, and provenance chains.