I'm the fractional AI CTO who still writes the code.
I build the harnesses, code-intelligence graphs, voice pipelines, and hybrid retrieval that turn AI from a demo into a system you can run in production.
The Manager as Compiler
Running a mixed human–AI team without drowning in slop
Six months into serious AI adoption, output is up and the manager is more tired than ever. The problem is control systems, not tooling: cheap, unverified output is flooding a review pipeline that was never designed to filter it.
Read the article
Code harnesses & agentic coding
Steve builds the harnesses that turn AI coding agents from impressive demos into systems you can trust in a real codebase — quality gates, sandboxes, and orchestration he writes himself.
# compact context pack, not a raw dump
ctx = pack(repo) # ~3k chars, not 50k
ctx += adrs + conventions + dep_graph
ctx += symbol_index(treesitter_parse(repo))
prompt = user_intent + ctx
agent.run(prompt, budget=Budget(steps, tokens))
while not done and budget.ok():
    plan = llm(ctx) # reason over state
    call = select_tool(plan) # next action
    call = pre_tool(call) # mediated ↓
    obs = run(call) # sandboxed
    ctx += obs + post_tool(obs) # gate report
    done = plan.is_complete or budget.spent()
# intercept BEFORE execution — deny or rewrite
def pre_tool(call):
    assert call.path in scope # path/scope guard
    assert call.tool in allowlist # policy allowlist
    assert not secrets(call.args) # no creds leak
    call = policy.rewrite(call) # Cursor .cursor/rules
    return call # or raise Deny
# edit / exec inside the isolated worktree
def run(call):
    before = read(call.path)
    apply(call, cwd=wt) # write file / run cmd
    after = read(call.path)
    diff = unified(before, after) # captured change
    return Observation(diff, stdout, exit_code)
# Claude Code PostToolUse hook → 30 compiled rules
def post_tool(diff):
    checks = [lint, types, tests, ast_diff,
              secret_scan, no_stub, no_phantom_ref,
              import_exists] # 6 categories
    failures = [rule for rule in RULES if not rule(diff)] # 30, parallel
    gate = not failures # green only when no rule fails
    if not gate: rollback(diff) # revert in ms
    return Report(gate, failures) # Codex AGENTS.md
wt = worktree(repo) # isolated checkout
apply(diff, wt)
graph = code_graph(treesitter, wt)
ok = graph.resolve_refs(diff) # no phantom symbols
ok &= symbol_search(diff.new_imports).exist()
build(wt) # self-contained binary
return ok # OpenCode .opencode/
if report.pass:
    ctx += "✓ gate green"; continue # keep going
else:
    ctx += report.failures # actionable
    retries += 1
    if retries > MAX: escalate(human) # bounded
# agent re-plans against the failures →
Forge (forgeOS) v1.6.0 — Rust binary · Tree-sitter code graph · 30 deterministic rules · 12 MCP tools · 6+ languages · 638 tests. Hooks into Claude Code, Cursor, Codex & OpenCode.
Forge
McClawd
HyperCoder
MCP Stack
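The stages above can be condensed into one small runnable sketch. This is an illustration under stated assumptions, not Forge's actual implementation: the in-memory `files` dict stands in for the isolated worktree, the two toy rules stand in for the 30 deterministic rules, and the names `Call`, `Report`, `Deny`, `ALLOWLIST`, `SCOPE_PREFIX`, and `run_gated` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Call:
    tool: str       # tool the agent wants to use
    path: str       # file the tool would touch
    new_text: str   # proposed file contents


@dataclass
class Report:
    passed: bool
    failures: List[str] = field(default_factory=list)


class Deny(Exception):
    """Raised when a call is rejected before execution."""


ALLOWLIST = {"write_file"}   # hypothetical policy allowlist
SCOPE_PREFIX = "src/"        # hypothetical path scope


def pre_tool(call: Call) -> Call:
    # Intercept BEFORE execution: reject anything outside policy or scope.
    if call.tool not in ALLOWLIST:
        raise Deny(f"tool not allowed: {call.tool}")
    if not call.path.startswith(SCOPE_PREFIX) or ".." in call.path:
        raise Deny(f"path out of scope: {call.path}")
    return call


# Two toy deterministic rules standing in for a real rule set.
RULES: Dict[str, Callable[[str], bool]] = {
    "no_stub": lambda text: "TODO" not in text,
    "no_tabs": lambda text: "\t" not in text,
}


def post_tool(text: str) -> Report:
    # Run every rule and collect the names of the ones that fail.
    failures = [name for name, rule in RULES.items() if not rule(text)]
    return Report(passed=not failures, failures=failures)


def run_gated(files: Dict[str, str],
              propose: Callable[[List[str]], Call],
              max_retries: int = 2) -> str:
    failures: List[str] = []
    for _ in range(max_retries + 1):
        call = pre_tool(propose(failures))   # agent re-plans against failures
        before = files.get(call.path)
        files[call.path] = call.new_text     # apply the edit
        report = post_tool(call.new_text)    # gate the result
        if report.passed:
            return "accepted"
        # Gate is red: roll the edit back before retrying.
        if before is None:
            del files[call.path]
        else:
            files[call.path] = before
        failures = report.failures
    return "escalated"                       # bounded: hand off to a human
```

The design point the sketch tries to capture is that a failed gate never leaves the edit in place: the file is restored first, and only the failure names are fed back to the next proposal.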
AI Cybersecurity: Code Property Graphs
Steve builds code property graphs that let AI reason about software the way a senior engineer does — following control, data, and call flow across a whole codebase instead of one file at a time.
CPG Scanner
Vulnflow
GCoder
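To make "following call flow across a whole codebase" concrete, here is a minimal sketch. It assumes the graph has already been reduced to a plain adjacency map of call edges; a real code property graph also carries AST, control-flow, and data-flow edges. The function name `find_flow` and the example node names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional


def find_flow(edges: Dict[str, List[str]],
              source: str, sink: str) -> Optional[List[str]]:
    """Return one shortest call path from source to sink, or None."""
    parent: Dict[str, Optional[str]] = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == sink:
            # Walk the parent links back to the source.
            path: List[str] = []
            cur: Optional[str] = node
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        for nxt in edges.get(node, ()):
            if nxt not in parent:   # visit each node once
                parent[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical call edges spanning several files.
CALLS = {
    "handle_request": ["parse_params", "log"],
    "parse_params": ["build_query"],
    "build_query": ["db.execute"],
    "log": [],
}

print(find_flow(CALLS, "handle_request", "db.execute"))
# ['handle_request', 'parse_params', 'build_query', 'db.execute']
```

A breadth-first search is used here only because it returns the shortest path; any traversal over the same edges would answer the basic question of whether a request handler can reach a database call.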
Voice AI
Steve builds low-latency voice AI end to end — streaming audio in, transcription and understanding out — fast enough to feel like conversation rather than a form you talk at.
Hybrid search at scale
Steve builds hybrid retrieval that fuses semantic vector search with structured filtering, so AI answers stay both relevant and precise on large, messy document sets — not one or the other.
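One simple way to combine the two, sketched under assumptions: apply the structured filter first, then rank the survivors by cosine similarity. The document layout (`id`, `vec`, `meta`), the function names, and the tiny two-dimensional vectors are hypothetical; a production system would use a vector index rather than a linear scan, and may fuse the two signals differently.

```python
import math
from typing import Any, Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def hybrid_search(docs: List[Dict[str, Any]],
                  query_vec: Sequence[float],
                  filters: Dict[str, Any],
                  k: int = 3) -> List[str]:
    # Structured step: keep only documents whose metadata matches exactly.
    candidates = [
        d for d in docs
        if all(d["meta"].get(key) == val for key, val in filters.items())
    ]
    # Semantic step: rank the survivors by similarity to the query vector.
    ranked = sorted(candidates,
                    key=lambda d: cosine(d["vec"], query_vec),
                    reverse=True)
    return [d["id"] for d in ranked[:k]]


DOCS = [
    {"id": "a", "vec": [1.0, 0.0], "meta": {"year": 2024, "type": "contract"}},
    {"id": "b", "vec": [0.9, 0.1], "meta": {"year": 2023, "type": "contract"}},
    {"id": "c", "vec": [0.0, 1.0], "meta": {"year": 2024, "type": "contract"}},
    {"id": "d", "vec": [0.8, 0.6], "meta": {"year": 2024, "type": "memo"}},
]

print(hybrid_search(DOCS, [1.0, 0.0], {"year": 2024, "type": "contract"}))
# ['a', 'c']
```

Without the filter, document `b` would rank second on similarity alone even though it is from the wrong year, which is the "relevant but not precise" failure the filter removes.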
What else I've shipped
Thirty-five years of building under real consequences — banks, AWS, an automated hedge fund, pioneering DevOps in 2006. The proof, not the pitch.
Still shipping the code.
I write code, and I have for thirty-five years, usually in places where being wrong was expensive. I started doing AI for money in the 1990s, building a fully automated hedge fund on neural networks and genetic algorithms, and I have been close to the hard edge of trading, banking, and infrastructure ever since. Along the way I was an early DevOps pioneer (I wrote a DevOps vision paper at Morgan Stanley in 2006), invented one of the earliest container platforms and service meshes, served as a Principal Cloud Architect at AWS, and held Global Chief Architect roles in international banking. I have founded more than thirty startups, including Azara, an agentic AI workflow platform I built and launched as Founder and CTO. Today I build AI directly: the code harnesses that make agentic coding trustworthy, the code-intelligence graphs that let machines reason about software, voice pipelines, and hybrid retrieval at scale. I am the person you hire when AI has to actually work: as your fractional AI CTO or Chief AI Architect, and as someone who still ships the code.
Hire an AI leader who still ships the code.
Whether you need a fractional AI CTO to own the roadmap or a Chief AI Architect who writes the hard parts himself, the question is the same: can this person build AI you can trust in production? The harnesses, graphs, voice pipelines, and retrieval systems above are the answer. Let's talk about what you're building and where it has to hold up.