Code Intelligence / AI Security · Product · 2026

CPG Scanner

Static analysis that maps the attack surface of any LangChain or LangGraph app — MITRE ATLAS taint chains, architecture, and control gaps, from source.

Year: 2026
Status: Product
Category: Code Intelligence / AI Security
Role: Architect & Lead

Key metrics

ATLAS Attacks: 6
Analysis Skills: 6
Node Types: 14
Tests: 237
Scan Time: ~1.2s

Architecture

A fetcher pulls source from a GitHub URL, git repo, or local path and converts notebooks to Python. A Python AST pass builds a Code Property Graph, then six YAML-driven analysis skills run in strict order over it — filtering to LLM-relevant nodes, classifying trust planes, extracting architecture, mapping MITRE ATLAS taint chains, detecting control gaps, and resolving model governance. The merged JSON report drives an interactive React SVG diagram. Nothing in the target code is ever executed.

Case study

The problem: an LLM app's attack surface is invisible before it ships

A LangChain or LangGraph application is a graph of untrusted inputs, model calls, retrievers, tools, and data stores, but that structure is not visible in any one place in the code. It is scattered across decorators, pipe chains, and conditional edges. By the time a security engineer can reason about where a prompt injection lands or where a poisoned document reaches the model, the app is already running. Pen-testing a live deployment is slow, partial, and late.

CPG Scanner answers the question before anything runs: given any LangChain / LangGraph Python codebase — a GitHub URL, a git repo, or a local path — what is its attack surface, and where are the controls missing?

[[toc]]

Building a Code Property Graph from the AST

The scanner never executes the target code. A fetcher pulls the source (cloning repos and converting .ipynb notebooks to .py as needed), then a Python AST pass walks every file and emits a Code Property Graph: nodes for CALL, IMPORT, ASSIGN, FUNC_DEF, and LITERAL, joined by DATA_FLOW and CALL edges. The default LangGraph adaptive-RAG examples produce 629 nodes and 103 edges from four files in about 1.2 seconds.

The CPG is the substrate. Everything downstream is derived from it, so every finding traces back to a specific file and line in the source.
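To make the AST-to-graph step concrete, here is a minimal sketch built on Python's standard `ast` module. It is not the scanner's `ast_runner.py`: the `CPG` class, `build_cpg`, and the exact rules for when a DATA_FLOW or CALL edge is emitted are assumptions made for illustration. Only the node and edge type names come from the description above. The source is parsed, never executed.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class CPG:
    """Toy Code Property Graph (hypothetical shape, for illustration only)."""
    nodes: list = field(default_factory=list)  # dicts: id, type, name, file, line
    edges: list = field(default_factory=list)  # tuples: (src_id, dst_id, edge_type)

    def add_node(self, node_type, name, file, line):
        node_id = len(self.nodes)
        self.nodes.append({"id": node_id, "type": node_type, "name": name,
                           "file": file, "line": line})
        return node_id


def call_name(func):
    """Best-effort dotted name for a call target, e.g. 'llm.invoke'."""
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return f"{call_name(func.value)}.{func.attr}"
    return "<expr>"


def build_cpg(source, file="<memory>"):
    tree = ast.parse(source)  # parse only; nothing in `source` is run
    cpg, ids = CPG(), {}      # ids maps AST node objects to CPG node ids

    # Pass 1: emit one CPG node per interesting AST node.
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                cpg.add_node("IMPORT", alias.name, file, node.lineno)
        elif isinstance(node, ast.FunctionDef):
            ids[node] = cpg.add_node("FUNC_DEF", node.name, file, node.lineno)
        elif isinstance(node, ast.Call):
            ids[node] = cpg.add_node("CALL", call_name(node.func), file, node.lineno)
        elif isinstance(node, ast.Assign):
            target = node.targets[0]
            name = target.id if isinstance(target, ast.Name) else "<target>"
            ids[node] = cpg.add_node("ASSIGN", name, file, node.lineno)
        elif isinstance(node, ast.Constant):
            ids[node] = cpg.add_node("LITERAL", repr(node.value), file, node.lineno)

    # Pass 2: connect nodes (assumed edge semantics).
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign) and node.value in ids:
            # The assigned value flows into the assignment target.
            cpg.edges.append((ids[node.value], ids[node], "DATA_FLOW"))
        elif isinstance(node, ast.Call):
            # Positional and keyword arguments flow into the call.
            inputs = list(node.args) + [kw.value for kw in node.keywords]
            for arg in inputs:
                if arg in ids:
                    cpg.edges.append((ids[arg], ids[node], "DATA_FLOW"))
        elif isinstance(node, ast.FunctionDef):
            # A function "calls" every call expression in its body.
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call):
                    cpg.edges.append((ids[node], ids[inner], "CALL"))
    return cpg


SAMPLE = '''
from langchain_openai import ChatOpenAI

def answer(question):
    llm = ChatOpenAI(model="gpt-4o")
    return llm.invoke(question)
'''

cpg = build_cpg(SAMPLE, file="app.py")
for n in cpg.nodes:
    print(n["type"], n["name"], f'{n["file"]}:{n["line"]}')
```

Because every node records its file and line at creation time, any finding derived later can point back to the exact place in the source.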

Six YAML-driven analysis skills

Six skills run in strict order over the graph. Each is configured entirely in YAML — new frameworks, attack chains, and control patterns are added without touching Python.

```mermaid
flowchart TD
  SRC["GitHub URL / git repo / local path"] --> FETCH["fetcher.py"]
  FETCH --> AST["ast_runner.py: Python AST to CPG (629 nodes, 103 edges)"]
  AST --> S1["llm_filter: prune to 7 LLM-relevant categories"]
  S1 --> S2["boundary_classifier: assign 8 trust planes"]
  S2 --> S3["arch_extractor: 14 architecture node types"]
  S3 --> S4["threat_mapper: MITRE ATLAS taint chains"]
  S4 --> S5["control_detector: 10 control types + gaps"]
  S5 --> S6["eci_resolver: model governance card"]
  S6 --> ASM["assembler.py: merge to scan_result.json"]
  ASM --> UI["React SVG attack surface map"]
```
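The strict ordering matters because the skills share one graph. The sketch below shows the idea with two toy skills; the config keys (`keep_names`, `planes`), the graph shape, and `run_pipeline` are hypothetical, and the config is written as a Python literal standing in for the scanner's YAML so the example has no dependencies.

```python
# Hypothetical pipeline config (the real one lives in YAML).
PIPELINE = [
    {"skill": "llm_filter",
     "keep_names": ["HumanMessage", "ChatOpenAI", "ToolNode"]},
    {"skill": "boundary_classifier",
     "planes": {"HumanMessage": "ingress", "ChatOpenAI": "model",
                "ToolNode": "privileged_action"}},
]


def llm_filter(graph, cfg, report):
    """Prune the graph IN PLACE so later skills see the filtered view."""
    keep = set(cfg["keep_names"])
    graph["nodes"][:] = [n for n in graph["nodes"] if n["name"] in keep]
    kept_ids = {n["id"] for n in graph["nodes"]}
    graph["edges"][:] = [e for e in graph["edges"]
                         if e[0] in kept_ids and e[1] in kept_ids]
    report["llm_filter"] = {"kept_nodes": len(graph["nodes"])}


def boundary_classifier(graph, cfg, report):
    """Tag each surviving node with a trust plane based on its name."""
    for node in graph["nodes"]:
        node["plane"] = cfg["planes"].get(node["name"], "unclassified")
    report["boundary_classifier"] = {n["name"]: n["plane"] for n in graph["nodes"]}


SKILLS = {"llm_filter": llm_filter, "boundary_classifier": boundary_classifier}


def run_pipeline(graph, pipeline):
    """Run skills in list order; each one reads and mutates the same graph."""
    report = {}
    for step in pipeline:
        SKILLS[step["skill"]](graph, step, report)
    return report


graph = {
    "nodes": [{"id": 0, "name": "HumanMessage"},
              {"id": 1, "name": "ChatOpenAI"},
              {"id": 2, "name": "print"}],
    "edges": [(0, 1, "DATA_FLOW"), (1, 2, "DATA_FLOW")],
}
report = run_pipeline(graph, PIPELINE)
print(report)
```

Swapping the two steps would make the classifier label nodes that the filter then discards, which is why the order is fixed rather than left to the config author.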

llm_filter prunes the raw CPG down to LLM-relevant nodes across seven categories, mutating the graph in place so every later skill sees the same filtered view.
boundary_classifier assigns eight trust planes from import and call patterns — the ingress, model, and privileged-action zones you see drawn as dashed boundaries in the UI.
arch_extractor maps the CPG onto 14 architecture node types, including LCEL pipe-operator chains (prompt | llm | parser) that other tools miss.
threat_mapper does the taint-chain analysis described below.
control_detector detects 10 guardrail types and emits a gap warning, with the specific library to add, whenever an expected control is absent.
eci_resolver matches model strings to a governance registry and produces a per-model card with an Epoch Capabilities Index score and blast radius.
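LCEL chains are easy to overlook because `prompt | llm | parser` is not a call at all: Python parses it as nested binary-OR expressions. The sketch below shows one way to recover the stages from the AST. The helper names are hypothetical, this is not the scanner's `arch_extractor`, and it requires Python 3.9+ for `ast.unparse`.

```python
import ast


def flatten_pipe(expr):
    """Return the operands of a left-nested `a | b | c`, left to right."""
    if isinstance(expr, ast.BinOp) and isinstance(expr.op, ast.BitOr):
        return flatten_pipe(expr.left) + flatten_pipe(expr.right)
    return [ast.unparse(expr)]


def find_pipe_chains(source):
    """Find assignments whose value is a `|` chain and list their stages."""
    chains = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Assign)
                and isinstance(node.value, ast.BinOp)
                and isinstance(node.value.op, ast.BitOr)):
            chains.append({"line": node.lineno,
                           "stages": flatten_pipe(node.value)})
    return chains


chains = find_pipe_chains("chain = prompt | llm | StrOutputParser()")
print(chains)
```

A purely syntactic check like this also matches bitwise OR and set or dict unions, so a real extractor would need to confirm that the operands resolve to LangChain runnables, for example by tracing them back to imports.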

MITRE ATLAS taint chains, with evidence

The threat mapper traces data-flow paths through the CPG and matches them against six MITRE ATLAS techniques. Each finding is a real taint chain, not a heuristic guess. Direct prompt injection (AML.T0051.000), for example, fires when a HumanMessage flows through add_messages into a ChatOpenAI call — and the report carries the source, line, and code snippet for every hop:

| Attack | MITRE ID | Severity | Trigger |
| --- | --- | --- | --- |
| Prompt Smuggling / Direct Injection | AML.T0051.000 | Critical | `HumanMessage` → `add_messages` → model |
| Indirect RAG Poisoning | AML.T0054.003 | High | loader → vectorstore → retriever |
| Prompt Exfiltration via Tool | AML.T0057 | High | `ToolNode` with web/shell after the model |
| Privilege Escalation via Agent | AML.T0043 | High | multi-agent supervisor, no HITL gate |
| Model Inversion via Embeddings | AML.T0040 | Medium | embeddings + store, no access control |
| Adversarial Example Injection | AML.T0020 | Medium | unvalidated docs into the retrieval pipeline |

Findings also line up with the OWASP LLM Top 10 (2025): prompt injection (LLM01), sensitive information disclosure (LLM02), and supply chain (LLM03).

To build a taint chain, the threat mapper starts at a single tainted source and follows its data flow across the graph, hop by hop, until it reaches a dangerous sink.
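That source-to-sink walk can be sketched as a breadth-first search over DATA_FLOW edges. The function below is an illustration under assumed data shapes (node dicts and `(src, dst, type)` edge tuples), not the scanner's `threat_mapper`; the file name and line numbers in the example are made up.

```python
from collections import deque


def find_taint_chain(nodes, edges, source_name, sink_name):
    """Breadth-first walk over DATA_FLOW edges from a source to a sink.

    Returns the list of node dicts along the shortest path, or None.
    """
    successors = {}
    for src, dst, kind in edges:
        if kind == "DATA_FLOW":  # other edge types do not carry taint here
            successors.setdefault(src, []).append(dst)

    by_id = {n["id"]: n for n in nodes}
    starts = [n["id"] for n in nodes if n["name"] == source_name]
    queue = deque([s] for s in starts)  # each queue entry is a path of ids
    seen = set(starts)

    while queue:
        path = queue.popleft()
        if by_id[path[-1]]["name"] == sink_name:
            return [by_id[i] for i in path]  # every hop keeps its evidence
        for nxt in successors.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Hypothetical evidence for the direct prompt injection chain.
nodes = [
    {"id": 0, "name": "HumanMessage", "file": "graph.py", "line": 12},
    {"id": 1, "name": "add_messages", "file": "graph.py", "line": 20},
    {"id": 2, "name": "ChatOpenAI", "file": "graph.py", "line": 31},
    {"id": 3, "name": "logger.info", "file": "graph.py", "line": 33},
]
edges = [
    (0, 1, "DATA_FLOW"),
    (1, 2, "DATA_FLOW"),
    (0, 3, "CALL"),  # ignored: not a data-flow edge
]

chain = find_taint_chain(nodes, edges, "HumanMessage", "ChatOpenAI")
for hop in chain:
    print(f'{hop["name"]}  ({hop["file"]}:{hop["line"]})')
```

The `seen` set keeps the walk from looping on cyclic graphs, which LangGraph apps with conditional edges can easily produce.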

The interactive attack surface map

The merged JSON report drives a React SVG diagram. Components are laid out across three trust planes — public ingress, the privileged action plane, and the data/tool plane — with each node carrying its CPG evidence, code snippets, and governance card. Nodes the scanner detected directly are badged CPG; the rest are inferred synthetic nodes that complete the picture.

@image[hero.png]

Clicking through surfaces the detail: attack paths with MITRE technique and mitigations, end-to-end data-flow paths with OWASP risk indicators, and the governance card per model. The diagram below is the scan of the bundled LangGraph adaptive-RAG example, with code snippets pulled straight from source.

@image[01-attack-surface-map.png]

Built to run anywhere

The tool ships as a Typer CLI (scan, serve, list-skills, reindex) and a FastAPI server, packaged for Docker and deployable via Terraform. Adding new LLM frameworks, attack taint chains, security-control patterns, or model registry entries is a YAML edit, not a code change — the analysis engine stays fixed while the rule set grows. The suite runs 237 tests with zero failures.
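The "rules as data, engine as code" split can be illustrated with a small control-gap check. Everything here is hypothetical: the rule fields, the pattern names, and the suggested library names are placeholders, and the rules are written as Python literals standing in for the scanner's YAML.

```python
# Hypothetical rule set; in the scanner these would be YAML entries.
CONTROL_RULES = [
    {"control": "output_validation", "expected_when": "ChatOpenAI",
     "satisfied_by": ["validate_output"], "suggest": "example-output-guard"},
    {"control": "human_in_the_loop", "expected_when": "ToolNode",
     "satisfied_by": ["approval_gate"], "suggest": "example-hitl-gate"},
]


def detect_gaps(call_names, rules):
    """Report every control that is expected but has no matching call."""
    present = set(call_names)
    gaps = []
    for rule in rules:
        if rule["expected_when"] not in present:
            continue  # the rule does not apply to this codebase
        if present.isdisjoint(rule["satisfied_by"]):
            gaps.append({"control": rule["control"], "suggest": rule["suggest"]})
    return gaps


calls = ["ChatOpenAI", "ToolNode", "approval_gate", "Chroma"]
gaps = detect_gaps(calls, CONTROL_RULES)

# Extending coverage is a data change; detect_gaps stays untouched.
extended_rules = CONTROL_RULES + [
    {"control": "store_access_control", "expected_when": "Chroma",
     "satisfied_by": ["check_access"], "suggest": "example-acl-lib"},
]
more_gaps = detect_gaps(calls, extended_rules)
print(gaps)
print(more_gaps)
```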

Tech stack

Python · AST / Code Property Graph · LangGraph · React · SVG · MITRE ATLAS · FastAPI · Docker · Terraform