Generative AI Agents · Prototype · 2025

GCoder

A graph-native coding agent that models code, runtime, infra, and security as one queryable SystemGraph — replacing flat-context copilots with localized subgraph retrieval.

Year: 2025
Status: Prototype
Category: Generative AI Agents
Role: Architect & Lead

Key metrics

MCP Tools: 9
Foundation: ~85%
E2E Tests: 6/6
Brain Pool: 8 slots

Architecture

Models an entire system — code structure, runtime traces, infra, CVEs, and visual UI state — as a single queryable graph (the SystemGraph) in FalkorDB. A pluggable core exposes nine MCP tools while hiding dozens of internal functions. Encoder models handle 80-90% of routing and similarity work; LLMs only generate. A LangGraph planner runs an 8-phase workflow with a hot-swap brain pool and a parallel executor with rollback. Targets 10-100x cost and reliability gains over flat-context copilots via localized subgraph retrieval instead of whole-repo dumps.

Case study

GCoder — Graph-Native AI Code Intelligence

GCoder is an experiment in fixing the core failure mode of large-context copilots: they treat a codebase as a pile of text, dump as much of it as possible into an LLM window, and hope relevance falls out. That approach is expensive, unreliable, and has no memory.

GCoder takes the opposite bet. It models the entire system — code structure, runtime behavior, infrastructure, security advisories, and even visual UI state — as a single queryable graph called the SystemGraph, and works on small, precisely-targeted subgraphs instead of whole repos.

The core insight

Traditional copilots fail at scale because they make expensive generative calls for work that is really search and matching, treat every file as equally relevant, and keep no persistent understanding between sessions.

GCoder front-loads understanding into the graph and into cheap encoder models, and reserves frontier LLMs for the small slice of work that genuinely requires generation.

The SystemGraph

Three interconnected layers live in one graph database:

  • Structural layer — what exists: repos, modules, files, symbols, bounded contexts, capabilities, endpoints, DB tables, queues, contracts, and third-party dependencies.
  • Runtime layer — what happens: deployments, requests, trace spans, log events, errors, and metrics, all linked back to the structural layer (an error points at the capability that produced it).
  • Planning layer — what we want and did: requirements, plan steps, decision nodes, editable design rules, CVE impacts, and test coverage.
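As a rough illustration of how the three layers can share one graph, here is a minimal in-memory sketch. It is not GCoder's schema: the `SystemGraph` class, the node ids, and the `PRODUCED_BY` and `TARGETS` relation names are all assumptions made for the example, and the real store is FalkorDB rather than Python dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: str
    layer: str  # "structural", "runtime", or "planning"
    kind: str   # e.g. "Capability", "Error", "Requirement"


@dataclass
class SystemGraph:
    """Toy stand-in for the graph store; every name here is hypothetical."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (source_id, relation, target_id)

    def add(self, node_id, layer, kind):
        self.nodes[node_id] = Node(node_id, layer, kind)

    def link(self, source, relation, target):
        self.edges.append((source, relation, target))

    def neighbors(self, source, relation):
        # Follow outgoing edges of one relation type from a single node.
        return [self.nodes[t] for s, r, t in self.edges
                if s == source and r == relation]


graph = SystemGraph()
graph.add("cap:checkout", "structural", "Capability")
graph.add("err:timeout-1", "runtime", "Error")
graph.add("req:faster-checkout", "planning", "Requirement")

# A runtime error points back at the structural capability that produced it.
graph.link("err:timeout-1", "PRODUCED_BY", "cap:checkout")
graph.link("req:faster-checkout", "TARGETS", "cap:checkout")

producers = graph.neighbors("err:timeout-1", "PRODUCED_BY")
print([n.id for n in producers])
```

Because every layer lives in the same node and edge space, a question such as "which capability produced this error" is a single hop rather than a join across separate tools.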

A visual extension models PageView → DomNode → StyleRule → DesignToken so that UI changes can be planned and impact-analyzed like code refactors.
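To make the impact-analysis idea concrete, the sketch below walks the PageView → DomNode → StyleRule → DesignToken chain backwards from a token to the pages that depend on it. The edge list, the id prefixes, and the `impacted_pages` helper are invented for illustration and say nothing about how GCoder stores or queries these nodes.

```python
from collections import defaultdict, deque

# Hypothetical edges, each pointing one step down the chain
# PageView -> DomNode -> StyleRule -> DesignToken.
EDGES = [
    ("page:home", "dom:hero-button"),
    ("page:pricing", "dom:plan-card"),
    ("page:about", "dom:footer-link"),
    ("dom:hero-button", "rule:.btn-primary"),
    ("dom:plan-card", "rule:.btn-primary"),
    ("dom:footer-link", "rule:.link-muted"),
    ("rule:.btn-primary", "token:color-brand"),
    ("rule:.link-muted", "token:color-gray"),
]


def impacted_pages(token, edges):
    """Walk the chain backwards from a token and collect dependent pages."""
    reverse = defaultdict(list)
    for source, target in edges:
        reverse[target].append(source)

    seen, queue = {token}, deque([token])
    while queue:
        current = queue.popleft()
        for parent in reverse[current]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sorted(n for n in seen if n.startswith("page:"))


print(impacted_pages("token:color-brand", EDGES))
# -> ['page:home', 'page:pricing']
```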

Encoder-first, LLM-last

A hierarchy of models keeps cost down. Encoder (BERT-style) models do semantic routing, intent classification, similarity search, and reranking at near-zero marginal cost. Small and mid-sized models handle templated edits. A frontier model (Claude Sonnet 4.5) is reserved for ambiguous, global problems — roughly 1% of operations. The design target is a 10-100x cost reduction versus naively dumping a repo into a 1M-token model.
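One way to picture the hierarchy is as a routing function that only escalates when cheaper tiers cannot handle the task. The sketch below is an assumption-laden toy: the `Task` fields, the 0.8 confidence threshold, and the tier names are hypothetical, and in practice the signals would come from encoder models rather than hand-set values.

```python
from dataclasses import dataclass


@dataclass
class Task:
    description: str
    intent_confidence: float  # hypothetical encoder classifier score in [0, 1]
    is_templated: bool        # True if a known edit template matches
    needs_generation: bool    # False for pure search or matching work


def choose_tier(task, confidence_threshold=0.8):
    """Pick the cheapest tier that can plausibly handle the task."""
    if not task.needs_generation:
        # Routing, similarity search, and reranking stay on encoders.
        return "encoder"
    if task.is_templated and task.intent_confidence >= confidence_threshold:
        # Well-understood, templated edits go to a small or mid-sized model.
        return "small_model"
    # Ambiguous or global problems escalate to the frontier model.
    return "frontier_model"


tasks = [
    Task("find capabilities similar to 'export invoice'", 0.95, False, False),
    Task("rename a field across one module", 0.90, True, True),
    Task("redesign the auth flow across services", 0.40, False, True),
]
for task in tasks:
    print(choose_tier(task), "<-", task.description)
```

The ordering of the checks is the design choice that matters here: generation is the last resort, not the default.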

Clean tool surface, deep internals

GCoder exposes just nine MCP tools (graph, code, runtime, tests, capability, bug, cve, visual, and workflow) while hiding dozens of internal functions behind them. Each tool is backed by a "skill kit," and the whole thing is built to be pluggable: the core graph schema stays stable while ingestion, rules, diagnostics, planning, and verification behaviors are added as plugins.
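The "small surface, deep internals" shape can be sketched as a registry in which the public tool names are fixed and internal functions attach to them as skills. This is plain Python rather than FastMCP, and the `ToolSurface` class, the `skill` decorator, and the two sample actions are hypothetical; only the nine tool names come from the description above.

```python
TOOL_NAMES = ("graph", "code", "runtime", "tests", "capability",
              "bug", "cve", "visual", "workflow")


class ToolSurface:
    """Fixed public tools, each backed by a growable kit of internal skills."""

    def __init__(self, names):
        self._kits = {name: {} for name in names}

    def skill(self, tool, action):
        # Plugins register internal functions under an existing tool;
        # they cannot add new public tools.
        if tool not in self._kits:
            raise KeyError(f"unknown tool: {tool}")

        def register(func):
            self._kits[tool][action] = func
            return func
        return register

    def call(self, tool, action, **kwargs):
        return self._kits[tool][action](**kwargs)

    def public_tools(self):
        return sorted(self._kits)


surface = ToolSurface(TOOL_NAMES)


@surface.skill("graph", "neighbors")
def _graph_neighbors(node_id):
    return {"node": node_id, "neighbors": []}  # placeholder result


@surface.skill("cve", "impact")
def _cve_impact(cve_id):
    return {"cve": cve_id, "affected_capabilities": []}  # placeholder result


print(len(surface.public_tools()))
print(surface.call("graph", "neighbors", node_id="cap:checkout"))
```

Adding a tenth internal function changes a skill kit, not the surface, so the set of tools a client sees stays stable.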

How a task flows

A feature request is encoded, classified by intent, matched against similar existing capabilities, and localized to a 100-1000 node subgraph. Editable design rules constrain the solution space before any LLM sees the problem. A LangGraph planner emits plan steps, coder agents execute them with minimal context each, and graph-scoped targeted tests prove the capability is healthy. Bug fixing, CVE mitigation, and visual changes follow the same localize → constrain → generate → verify shape.
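The localize → constrain → generate → verify shape can be sketched in a few lines. Everything below is illustrative: localization is approximated by a breadth-first walk with a node budget, design rules are plain predicates, and `generate` and `verify` are caller-supplied stand-ins for the planner, coder agents, and targeted tests. None of the function names come from GCoder itself.

```python
from collections import deque


def localize(adjacency, seeds, max_nodes=1000):
    """Breadth-first expansion from seed nodes, stopping at a node budget."""
    selected = list(dict.fromkeys(seeds))[:max_nodes]
    seen = set(selected)
    queue = deque(selected)
    while queue and len(selected) < max_nodes:
        current = queue.popleft()
        for neighbor in adjacency.get(current, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                selected.append(neighbor)
                queue.append(neighbor)
                if len(selected) >= max_nodes:
                    break
    return selected


def run_task(adjacency, seeds, rules, generate, verify, max_nodes=1000):
    subgraph = localize(adjacency, seeds, max_nodes)            # localize
    allowed = [n for n in subgraph if all(r(n) for r in rules)]  # constrain
    change = generate(allowed)                                   # generate
    return change if verify(change) else None                    # verify


# Hypothetical data: one capability, two files, one third-party dependency.
adjacency = {
    "cap:checkout": ["file:cart.py", "file:payment.py"],
    "file:payment.py": ["dep:vendor-sdk"],
}
rules = [lambda node: not node.startswith("dep:")]  # never edit vendored code
result = run_task(
    adjacency,
    seeds=["cap:checkout"],
    rules=rules,
    generate=lambda nodes: {"touched": nodes},      # stand-in for an LLM call
    verify=lambda change: len(change["touched"]) > 0,
)
print(result)
```

Note that the rule filter runs before `generate`, which mirrors the point above that rules narrow the solution space before any LLM sees the problem.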

Current state

GCoder is an active prototype. The foundation layer is roughly 85% complete: the FalkorDB-backed storage adapter, configuration system, hot-swap brain pool (8 slots), dual-mode coding agent, an 8-phase LangGraph planner with human-feedback re-entrancy, a FastMCP server scaffold, and a parallel executor with pause/resume and rollback are all in place. The end-to-end delegation test suite passes 6/6, covering plugin discovery, brain-pool management, orchestrator execution with live LLM code generation, storage integration, and a Docker build-and-run check.
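For readers unfamiliar with the pattern, a parallel executor with rollback can be sketched with the standard library alone. This is a simplified illustration, not the prototype's executor: the `Step` type and `run_with_rollback` function are hypothetical, steps are assumed to be independent, and pause/resume is left out.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    apply: Callable[[], None]
    undo: Callable[[], None]


def run_with_rollback(steps, max_workers=4):
    """Run independent steps in parallel; if any fails, undo the rest."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [(step, pool.submit(step.apply)) for step in steps]
    # Leaving the `with` block waits for every submitted step to finish.
    succeeded = [step for step, f in futures if f.exception() is None]
    failed = [step.name for step, f in futures if f.exception() is not None]

    rolled_back = []
    if failed:
        # Undo successful steps in reverse submission order.
        for step in reversed(succeeded):
            step.undo()
            rolled_back.append(step.name)
    return {"ok": not failed, "failed": failed, "rolled_back": rolled_back}


applied = set()


def fail():
    raise RuntimeError("simulated edit failure")


steps = [
    Step("edit-a", lambda: applied.add("a"), lambda: applied.discard("a")),
    Step("edit-b", lambda: applied.add("b"), lambda: applied.discard("b")),
    Step("edit-c", fail, lambda: None),
]
result = run_with_rollback(steps)
print(result, applied)
```

After the run, `applied` is empty again: the two edits that succeeded are undone because the third one failed.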

The graph layers themselves (structural, runtime, planning) are the next milestone, along with the full MCP server implementation and broader test coverage.

Why it matters

GCoder is not "copilot versus static analysis versus observability." It unifies all three on a graph foundation, enabling workflows that siloed tools cannot: fully automated CVE mitigation from detection through fix and validation, design-token refactors with real impact analysis, and multi-repo contract-change coordination. The bet is that persistent, structured understanding beats ever-larger context windows.

Tech stack

Python · LangGraph · FalkorDB (OpenCypher) · FastMCP · Claude Sonnet 4.5 · Docker
