I make agentic AI shippable.
MacLeodLabs rescues AI products stuck between demo and market.
I audit messy codebases, expose the real launch blockers, fix the critical path, and get agentic AI systems ready for customers, investors, and production use.
For teams whose AI product works in demo but is not yet reliable, secure, scalable, or trusted enough to ship.
Product rescue · AI security · RAG and document intelligence · Voice AI · Cloud and platform architecture · Hands-on CTO execution
The Manager as Compiler
WorkSpec: the missing control layer for AI work
AI does not just change how work is produced. It changes how work must be specified, reviewed, constrained, and accepted. The problem is no longer output. The problem is control.
Read the article
Examples of pain points I've fixed
These are not abstract demos. They are recurring failure patterns I have fixed inside real AI products: codebases moving faster than control, agentic systems with invisible risk, voice AI that breaks outside the demo, and RAG/search systems users cannot trust. They are examples, not the full list — the same approach applies wherever an AI product has to get reliable, secure, and trusted enough to ship.
AI codebases moving faster than control
AI coding tools and agents create speed, but they also create fragile code, weak tests, phantom references, unclear ownership, and build instability.
I audit the codebase, identify the launch blockers, repair the critical path, and put quality gates around the parts that matter.
# compact context pack, not a raw dump
ctx = pack(repo) # ~3k chars, not 50k
ctx += adrs + conventions + dep_graph
ctx += symbol_index(treesitter_parse(repo))
prompt = user_intent + ctx
agent.run(prompt, budget=Budget(steps, tokens))
while not done and budget.ok():
    plan = llm(ctx)  # reason over state
    call = select_tool(plan)  # next action
    call = pre_tool(call)  # mediated ↓
    obs = run(call)  # sandboxed
    ctx += obs + post_tool(obs)  # gate report
    done = plan.is_complete or budget.spent()
# intercept BEFORE execution — deny or rewrite
def pre_tool(call):
    assert call.path in scope  # path/scope guard
    assert call.tool in allowlist  # policy allowlist
    assert not secrets(call.args)  # no creds leak
    call = policy.rewrite(call)  # Cursor .cursor/rules
    return call  # or raise Deny
# edit / exec inside the isolated worktree
def run(call):
    before = read(call.path)
    stdout, exit_code = apply(call, cwd=wt)  # write file / run cmd
    after = read(call.path)
    diff = unified(before, after)  # captured change
    return Observation(diff, stdout, exit_code)
# Claude Code PostToolUse hook → 30 compiled rules
def post_tool(diff):
    checks = [lint, types, tests, ast_diff,
              secret_scan, no_stub, no_phantom_ref,
              import_exists]  # 6 categories
    failures = [rule for rule in RULES if not rule(diff)]  # 30, parallel
    gate = not failures
    if not gate: rollback(diff)  # revert in ms
    return Report(gate, failures)  # Codex AGENTS.md
wt = worktree(repo) # isolated checkout
apply(diff, wt)
graph = code_graph(treesitter, wt)
ok = graph.resolve_refs(diff) # no phantom symbols
ok &= symbol_search(diff.new_imports).exist()
build(wt) # self-contained binary
return ok # OpenCode .opencode/
if report.gate:
    ctx += "✓ gate green"; continue  # keep going
else:
    ctx += report.failures  # actionable
    retries += 1
    if retries > MAX: escalate(human)  # bounded
# agent re-plans against the failures →
Forge (forgeOS) v1.6.0 — Rust binary · Tree-sitter code graph · 30 deterministic rules · 12 MCP tools · 6+ languages · 638 tests. Hooks into Claude Code, Cursor, Codex & OpenCode.
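The pre-tool stage in the pseudo-code can be made concrete with a small runnable sketch. This is an illustration of the idea only, not Forge's implementation: the `ToolCall` type, the allowlist, the `src` scope directory, and the secret markers are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Hypothetical policy values, for illustration only.
ALLOWED_TOOLS = {"read_file", "write_file", "run_tests"}
SCOPE = PurePosixPath("src")
SECRET_MARKERS = ("AKIA", "-----BEGIN", "api_key=")


class Deny(Exception):
    """Raised when a tool call violates policy."""


@dataclass
class ToolCall:
    tool: str
    path: str
    args: str = ""


def pre_tool(call: ToolCall) -> ToolCall:
    """Deny calls that are off the allowlist, out of scope, or carry secrets."""
    if call.tool not in ALLOWED_TOOLS:
        raise Deny(f"tool not allowed: {call.tool}")
    path = PurePosixPath(call.path)
    # Reject absolute paths, parent traversal, and anything not under SCOPE.
    if path.is_absolute() or ".." in path.parts or SCOPE not in path.parents:
        raise Deny(f"path out of scope: {call.path}")
    if any(marker in call.args for marker in SECRET_MARKERS):
        raise Deny("possible credential in arguments")
    return call
```

Raising an exception instead of using `assert` keeps the guard active when Python runs with optimizations enabled, which strips assertions.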
Forge
McClawd
HyperCoder
MCP Stack
Code harnesses, agentic coding workflows, AI-generated code control, codebase rescue, and release-readiness checks.
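One of the checks named above, catching phantom references, can be sketched for Python sources using only the standard library. This is an assumption-laden stand-in: it uses Python's `ast` module where the text describes a Tree-sitter code graph, it covers imports only, and the function name `phantom_imports` is hypothetical.

```python
import ast
import importlib.util


def phantom_imports(source: str, local_modules: frozenset = frozenset()) -> list:
    """Return top-level module names imported by `source` that cannot be resolved."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in local_modules:
                continue  # module provided by the repository itself
            # find_spec returns None when a top-level module is not installed.
            if importlib.util.find_spec(root) is None and root not in missing:
                missing.append(root)
    return missing
```

A gate built on this would fail the diff whenever the returned list is non-empty, so generated code that imports a package nobody installed never reaches the build.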
See how the control loop works
Agentic systems with invisible risk
Agentic AI risk hides across tools, memory, retrievers, APIs, files, permissions, prompts, and human approval boundaries.
I map what the system can read, write, call, trigger, or leak — then fix the control gaps before launch.
CPG Scanner
Vulnflow
GCoder
Code property graphs, trust-boundary mapping, control-surface analysis, agent permissions, and AI security review.
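A control-surface map can start as a plain inventory of what each tool is able to do. The sketch below is a simplified illustration, not the method used by the tools named above: the `Capability` fields and the rule for what counts as a gap are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    tool: str
    action: str            # "read", "write", "call", or "trigger"
    resource: str
    untrusted_input: bool  # can attacker-influenced text reach this tool?
    needs_approval: bool   # is a human approval step enforced?


# Actions that change state or reach outside the system (assumed grouping).
HIGH_IMPACT = {"write", "call", "trigger"}


def control_gaps(capabilities):
    """Return capabilities where untrusted input can drive a high-impact
    action with no human approval in between."""
    return [
        cap for cap in capabilities
        if cap.action in HIGH_IMPACT
        and cap.untrusted_input
        and not cap.needs_approval
    ]


surface = [
    Capability("retriever", "read", "docs-index", True, False),
    Capability("mailer", "trigger", "smtp", True, False),
    Capability("deployer", "write", "prod-cluster", False, True),
]
gaps = control_gaps(surface)
```

Here only the hypothetical `mailer` entry is flagged: it can be triggered by text the agent retrieved, and nothing requires a person to approve the send.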
See how the risk map works
Voice AI that works in demo but not production
Voice AI breaks on latency, silence detection, streaming, transcription quality, turn-taking, inference cost, telephony, deployment, and real users.
I turn fragile voice prototypes into production-ready pipelines.
Real-time speech, VAD, streaming, inference acceleration, call handling, deployment, and production-readiness.
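Silence detection and turn-taking are easier to reason about with a concrete baseline. The sketch below is a deliberately simple energy-based detector with a hangover window, not a production VAD: the threshold, the frame size, and the function names are assumptions, and real pipelines typically use a trained model instead of raw energy.

```python
import math


def frame_rms(samples):
    """Root-mean-square energy of one frame of PCM samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def end_of_turn_frame(frames, threshold=500.0, hangover_frames=25):
    """Return the index of the frame at which the speaker's turn is
    considered finished, or None if the turn has not ended.

    A turn ends only after speech has been heard and then
    `hangover_frames` consecutive frames fall below `threshold`,
    so short pauses inside a sentence do not cut the speaker off.
    """
    heard_speech = False
    silent_run = 0
    for index, frame in enumerate(frames):
        if frame_rms(frame) >= threshold:
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            silent_run += 1
            if silent_run >= hangover_frames:
                return index
    return None
```

The hangover length is the trade-off that demos tend to hide: too short and callers get interrupted mid-sentence, too long and every reply feels slow.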
See how voice becomes production-ready
RAG and search systems users cannot trust
Naive RAG gives plausible answers. Shippable RAG gives grounded, complete, cited, testable answers — and knows when to abstain.
I rescue document-intelligence systems that miss evidence, hallucinate specifics, fail exhaustive queries, or cannot prove where answers came from.
Hybrid retrieval, structured filtering, citation grounding, exhaustive search, evaluation, and trustworthy document intelligence.
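Hybrid retrieval and abstention can be illustrated in a few lines. The sketch below fuses a keyword ranking and a vector ranking with reciprocal rank fusion, a standard published technique named here as one possible choice, then returns cited passages or abstains. The grounding rule (every query term must appear in the passage) is a toy assumption, and both function names are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list.

    Each document scores 1 / (k + rank) per list it appears in;
    ties are broken by document id so the output is deterministic.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def cite_or_abstain(fused_ids, passages, query_terms, min_hits=1):
    """Return (doc_id, passage) citations that mention every query term,
    or None to signal that the system should abstain."""
    cited = [
        (doc_id, passages[doc_id])
        for doc_id in fused_ids
        if all(term.lower() in passages[doc_id].lower() for term in query_terms)
    ]
    return cited if len(cited) >= min_hits else None


keyword_hits = ["a", "b", "c"]
vector_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Returning `None` instead of the best available passage is the point of the second function: an explicit abstention is something an evaluation suite can count, while a plausible guess is not.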
See how trustworthy retrieval works
Start with a Product Rescue Diagnostic
The diagnostic is for teams with an AI product, prototype, or codebase that needs technical truth before it can ship, sell, fundraise, or scale. I review the codebase, architecture, launch path, control surface, retrieval quality, deployment risks, and operational readiness.
You get a clear answer to three questions
- What is really blocking launch?
- What must be fixed now?
- What can safely wait?
Deliverables
- Codebase and architecture audit
- Launch-blocker map
- Risk register
- Critical-path fix list
- Cut / fix / defer recommendations
- 30-day rescue plan
- Go / no-go release judgement
- Optional hands-on rescue sprint
Senior enough to make the hard calls. Hands-on enough to fix the code.
I have spent decades building systems where being wrong was expensive: trading, banking, cloud platforms, DevOps operating models, container-era infrastructure, and production AI systems. Before AI became a product category, I was already building automated systems that had to reason, decide, and act under pressure. Today that experience is focused on one problem: making agentic AI shippable.
- Took over and rebuilt several stalled AI platforms, getting them to market in weeks to a few months.
- Helped move a generative-AI security product from research toward go-to-market readiness.
- Built and led an agentic AI workflow platform through product buildout and acquisition.
- Designed cloud operating models and automation practices for high-stakes enterprise environments.
- Authored and implemented a pre-DevOps operating model in 2006, focused on automation, shared ownership, operational feedback loops, and reducing handoff failure.
- Built early LXC-era container platform and service-mesh systems before Docker became mainstream.
Ready to make your agentic AI product shippable?
If the demo works but the product is not yet credible enough for customers, investors, or production, start with a Product Rescue Diagnostic.