Architecture overview

What runs where: scanner, generators, hosted trust center.

Last updated May 6, 2026

Attestly is a single Next.js application backed by Postgres, a managed background-job runner, and a small object store. There is no on-premises agent — your code is read through the GitHub API.

High level

GitHub OAuth ──▶ Attestly app (Next.js on Vercel)
                      │
                      ├── Postgres (Neon) - tenants, findings, doc versions
                      ├── Inngest         - background workers (scan, generate, drift)
                      ├── OpenAI API      - structured-output document generation
                      └── /trust/<slug> on the app host (+ optional *.trust.<root>) - public trust centers, ISR

The scan pipeline

1. Tarball download. We call GET /repos/:owner/:repo/tarball/:ref with the user's OAuth token and stream the gzipped archive into a temp directory.
2. AST + lockfile sweep. Two passes run in parallel: a TypeScript AST traversal (via ts-morph) and a dependency-file scan (npm, pnpm, and yarn lockfiles, plus requirements.txt, go.sum, and Cargo.lock).
3. Detector library. Around 200 detectors map package names and import paths to canonical entities: openai → "OpenAI, L.L.C.", @stripe/stripe-js → "Stripe, Inc.", and so on.
4. Findings persistence. Each match becomes a row in findings with sourcePath and sourceLine, which is how we cite back to your code.
5. Rolled-up subprocessor list. A view of the findings, de-duplicated by vendor key, becomes the canonical subprocessor list shown on your trust center.
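
The detector, persistence, and roll-up steps above can be sketched in a few lines. This is a minimal illustration, not Attestly's actual code: only sourcePath and sourceLine come from the text, while the vendorKey field, the ImportSite shape, the two-entry detector table, and the example file paths are assumptions made for the example.

```typescript
// Hypothetical shapes: only sourcePath and sourceLine are named in the docs.
interface Vendor { vendorKey: string; entity: string; }
interface ImportSite { packageName: string; sourcePath: string; sourceLine: number; }
interface Finding extends Vendor { sourcePath: string; sourceLine: number; }
interface Subprocessor extends Vendor { citations: string[]; }

// A two-entry stand-in for the detector library (package name -> vendor).
const DETECTORS = new Map<string, Vendor>([
  ["openai", { vendorKey: "openai", entity: "OpenAI, L.L.C." }],
  ["@stripe/stripe-js", { vendorKey: "stripe", entity: "Stripe, Inc." }],
]);

// Turn each recognised import into a finding; unknown packages are skipped.
function detect(sites: ImportSite[]): Finding[] {
  const findings: Finding[] = [];
  for (const site of sites) {
    const vendor = DETECTORS.get(site.packageName);
    if (vendor) {
      findings.push({ ...vendor, sourcePath: site.sourcePath, sourceLine: site.sourceLine });
    }
  }
  return findings;
}

// Collapse findings to one entry per vendor key, keeping file:line citations.
function rollUp(findings: Finding[]): Subprocessor[] {
  const byKey = new Map<string, Subprocessor>();
  for (const f of findings) {
    const entry = byKey.get(f.vendorKey) ??
      { vendorKey: f.vendorKey, entity: f.entity, citations: [] };
    entry.citations.push(`${f.sourcePath}:${f.sourceLine}`);
    byKey.set(f.vendorKey, entry);
  }
  // Sort by key so the list order does not depend on scan order.
  return Array.from(byKey.values()).sort((a, b) =>
    a.vendorKey < b.vendorKey ? -1 : a.vendorKey > b.vendorKey ? 1 : 0);
}

// Illustrative input: two openai imports, one Stripe import, one unknown package.
const exampleSites: ImportSite[] = [
  { packageName: "openai", sourcePath: "src/lib/llm.ts", sourceLine: 1 },
  { packageName: "@stripe/stripe-js", sourcePath: "src/app/checkout.tsx", sourceLine: 3 },
  { packageName: "openai", sourcePath: "src/jobs/generate.ts", sourceLine: 2 },
  { packageName: "left-pad", sourcePath: "src/util.ts", sourceLine: 1 },
];
const subprocessors = rollUp(detect(exampleSites));
```

Here the two openai imports collapse into a single subprocessor entry carrying both citations, and the unrecognised package produces no finding at all.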

Document generation

Each document type has its own Zod schema (see the generators reference). We pass that schema to the OpenAI API as a structured-output constraint, so the model's response must be a JSON object of the shape we expect. We then render that JSON to Markdown deterministically. There is no second LLM pass, so rendering cannot introduce new hallucinations, and the same JSON always produces the same Markdown.
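
To make the validate-then-render flow concrete, here is a dependency-free sketch. It rests on several assumptions: the SubprocessorDoc shape and the sample data are invented for illustration, a hand-written parseDoc function stands in for the Zod schema's parse step, and sorting entries by name is just one possible way to keep the output stable.

```typescript
// Hypothetical document shape; the real schemas live in the generators reference.
interface SubprocessorDoc {
  title: string;
  entries: { name: string; purpose: string }[];
}

// Stand-in for a schema parse step: reject anything that is off-shape.
function parseDoc(raw: unknown): SubprocessorDoc {
  if (typeof raw !== "object" || raw === null) throw new Error("not an object");
  const { title, entries } = raw as { title?: unknown; entries?: unknown };
  if (typeof title !== "string") throw new Error("title must be a string");
  if (!Array.isArray(entries)) throw new Error("entries must be an array");
  const parsed = entries.map((e: unknown) => {
    if (typeof e !== "object" || e === null) throw new Error("entry must be an object");
    const { name, purpose } = e as { name?: unknown; purpose?: unknown };
    if (typeof name !== "string" || typeof purpose !== "string") {
      throw new Error("entry needs a string name and purpose");
    }
    return { name, purpose };
  });
  return { title, entries: parsed };
}

// Pure function of its input: no clock, no randomness, no model call.
function renderMarkdown(doc: SubprocessorDoc): string {
  const rows = doc.entries
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0))
    .map((e) => `- ${e.name}: ${e.purpose}`);
  return [`# ${doc.title}`, ""].concat(rows).join("\n");
}

// Illustrative model output, as it might arrive from a structured-output call.
const modelOutput: unknown = JSON.parse(
  '{"title":"Subprocessors","entries":[' +
  '{"name":"Stripe, Inc.","purpose":"Payments"},' +
  '{"name":"OpenAI, L.L.C.","purpose":"Document generation"}]}'
);
const markdown = renderMarkdown(parseDoc(modelOutput));
```

Because renderMarkdown depends only on the validated JSON, re-rendering a stored document version yields byte-identical Markdown, which is what makes version-to-version diffs meaningful.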

Drift detection

We subscribe to GitHub's pull_request webhook for the opened and synchronize actions. For each PR, we run the scanner on the merge ref, diff its findings against main, and report the result as a check run. If a finding changed and the document it feeds is currently published, the PR is blocked until a reviewer approves the new document version.
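
The drift decision described above might look roughly like the sketch below. Several details are assumptions rather than documented behavior: findings are compared by a hypothetical vendorKey field (so moving an import between files does not count as drift), and the blocking state is expressed with the "action_required" check-run conclusion, which is a real GitHub value but not necessarily the one Attestly uses.

```typescript
// Hypothetical finding shape; vendorKey is assumed to be the comparison key.
interface FindingRef { vendorKey: string; sourcePath: string; sourceLine: number; }
interface Drift { added: string[]; removed: string[]; }
type CheckConclusion = "success" | "action_required";

// GitHub sends the event name in the X-GitHub-Event header and the action in the payload.
function shouldScan(event: string, action: string): boolean {
  return event === "pull_request" && (action === "opened" || action === "synchronize");
}

// Vendors present on only one side of the comparison count as drift.
function diffFindings(base: FindingRef[], head: FindingRef[]): Drift {
  const baseKeys = new Set(base.map((f) => f.vendorKey));
  const headKeys = new Set(head.map((f) => f.vendorKey));
  return {
    added: Array.from(headKeys).filter((k) => !baseKeys.has(k)).sort(),
    removed: Array.from(baseKeys).filter((k) => !headKeys.has(k)).sort(),
  };
}

// Block only when something changed AND the affected document is published.
function conclude(drift: Drift, documentPublished: boolean): CheckConclusion {
  const changed = drift.added.length > 0 || drift.removed.length > 0;
  return changed && documentPublished ? "action_required" : "success";
}

// Illustrative run: the PR introduces a Stripe import that main does not have.
const mainFindings: FindingRef[] = [
  { vendorKey: "openai", sourcePath: "src/lib/llm.ts", sourceLine: 1 },
];
const prFindings: FindingRef[] = [
  { vendorKey: "openai", sourcePath: "src/lib/llm.ts", sourceLine: 1 },
  { vendorKey: "stripe", sourcePath: "src/app/checkout.tsx", sourceLine: 3 },
];
const drift = diffFindings(mainFindings, prFindings);
const whenPublished = conclude(drift, true);
const whenDraft = conclude(drift, false);
```

In this run the new Stripe finding yields a blocking conclusion only when the document is published; the same change against an unpublished draft passes.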

Data we keep vs. don't

We keep	We don't keep
Detector findings (file:line, vendor key, metadata)	Source files (deleted after scan)
Generated document JSON + rendered Markdown	Code snippets beyond a 3-line context window
Subprocessor list	Customer-end-user data
Audit log of every approval and publish	Anything from repos you didn't explicitly connect
