12 posts tagged with "ai-agents"

Durable Execution in LangGraph: Agents That Survive Failure and Resume Where They Left Off

· 12 min read
Vadim Nicolai
Senior Software Engineer

Most AI agents are built as a single process holding state in memory: a while loop, local variables, maybe a sleep(). That holds up until the workflow has to outlive the process that started it, and in production it always does. The math is unforgiving: chain ten steps that each succeed 85% of the time and the whole run finishes only about 20% of the time (0.85¹⁰ ≈ 0.20). Without durability, every one of those failures forces a restart from scratch. The model might be reliable; the tool calls aren't. Better LLMs don't fix network failures; only durable execution lets a run recover from them.
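The arithmetic above is easy to check. The sketch below assumes steps fail independently with the same probability; the function name is illustrative and not taken from the post. It also shows the contrast case: if each step can be retried on its own from a checkpoint, the expected number of step executions stays close to ten.

```python
def chain_success_probability(per_step: float, steps: int) -> float:
    """Probability that every step in a chain succeeds on the first try,
    assuming independent steps with identical success rates."""
    return per_step ** steps


def expected_executions_with_retry(per_step: float, steps: int) -> float:
    """Expected step executions when each failed step is retried in place
    (geometric distribution: 1 / per_step attempts per step on average)."""
    return steps / per_step


p_all = chain_success_probability(0.85, 10)
retry_cost = expected_executions_with_retry(0.85, 10)

print(f"whole run succeeds first time: {p_all:.3f}")      # 0.197
print(f"expected executions with per-step retry: {retry_cost:.2f}")  # 11.76
```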

The research consensus is that the infrastructure around the model, not the model itself, is where production agents live. The 2026 design-space analysis Dive into Claude Code found that only 1.6% of Claude Code's codebase is AI decision logic; the other 98.4% is operational infrastructure for context management, tool routing, and recovery. LangGraph's answer to that reality is durable execution through its persistence layer — making the agent a row in a checkpoint store, not a stack frame in a living process. This article dissects how that works, the sharp edges it creates, and how to observe a workflow that — by design — no longer runs as a single process.
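To make "a row in a checkpoint store" concrete, here is a minimal sketch of the checkpoint-and-resume idea using only the Python standard library. It is not LangGraph's API: `run_workflow`, `make_step`, the table layout, and the `fail_at` crash switch are all hypothetical names chosen for illustration, and an in-memory SQLite database stands in for a real on-disk store.

```python
import json
import sqlite3


def run_workflow(conn, thread_id, steps, fail_at=None):
    """Run steps in order, saving state after each one.
    On a later call with the same thread_id, resume from the last checkpoint."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints "
        "(thread_id TEXT PRIMARY KEY, next_step INTEGER, state TEXT)"
    )
    row = conn.execute(
        "SELECT next_step, state FROM checkpoints WHERE thread_id = ?",
        (thread_id,),
    ).fetchone()
    # Start fresh if no checkpoint exists, otherwise pick up where we stopped.
    next_step, state = (row[0], json.loads(row[1])) if row else (0, {"log": []})

    for i in range(next_step, len(steps)):
        if i == fail_at:
            # Hypothetical crash switch, used only to simulate a dying process.
            raise RuntimeError(f"simulated crash before step {i}")
        state = steps[i](state)
        # Persist progress so the next process can skip completed steps.
        conn.execute(
            "INSERT OR REPLACE INTO checkpoints VALUES (?, ?, ?)",
            (thread_id, i + 1, json.dumps(state)),
        )
        conn.commit()
    return state


def make_step(name):
    """Build a step that appends its name to the state log."""
    def step(state):
        return {"log": state["log"] + [name]}
    return step


steps = [make_step(n) for n in ("fetch", "plan", "act")]
conn = sqlite3.connect(":memory:")  # a real store would live on disk

try:
    run_workflow(conn, "run-1", steps, fail_at=2)
except RuntimeError:
    pass  # the "process" died after two steps were checkpointed

final_state = run_workflow(conn, "run-1", steps)
print(final_state["log"])  # ['fetch', 'plan', 'act']
```

The second call does not repeat "fetch" or "plan": the workflow's identity lives in the stored row, so any process holding the thread ID can continue it.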

CrewAI's Genuinely Unique Features: An Honest Technical Deep-Dive

· 14 min read
Vadim Nicolai
Senior Software Engineer

TL;DR — CrewAI's real uniqueness is that it models problems as "build a team of people" rather than "build a graph of nodes" (LangGraph) or "build a conversation" (AutoGen). The Crews + Flows dual-layer architecture is the core differentiator. The role-playing persona system and autonomous delegation are ergonomic wins, not technical breakthroughs. The hierarchical manager is conceptually appealing but broken in practice. This post separates what's genuinely novel from what's marketing.

5 Meta-Tools, 0 Ad-Hoc Edits: Structured Code Repair with AI Agents

· 9 min read
Vadim Nicolai
Senior Software Engineer

There's a difference between an AI that can edit code and an AI that can repair code. Editing is mechanical — find a string, replace it. Repair requires understanding what's broken, why it's broken, and what the minimal fix looks like within the constraints of an existing codebase.

The Code Improver is the fourth agent in our six-agent autonomous self-improvement pipeline for nomadically.work. It's the only agent that writes application code. The Trajectory Miner finds patterns, the Codebase Auditor diagnoses issues, and the Skill Evolver improves instructions — but the Code Improver is the one that actually opens files and changes them.

Five research papers informed its design, curated from the VoltAgent/awesome-ai-agent-papers collection. The central insight across all of them: structured repair workflows outperform ad-hoc fixing.

Your Linter Can't Trace Execution Paths. This Agent Can.

· 9 min read
Vadim Nicolai
Senior Software Engineer

Static analysis tools find pattern violations. Linters catch style issues. But neither traces an N+1 query from a GraphQL resolver through a DataLoader absence to a frontend performance degradation. That requires understanding execution paths — and that's what the Codebase Auditor does.

The Codebase Auditor is the second agent in our six-agent autonomous self-improvement pipeline for nomadically.work. It receives pattern IDs from the Trajectory Miner, investigates the actual code exhaustively, and produces findings with exact file:line references. It never modifies code — it only reads and reports.

Four research papers shaped its design, curated from the VoltAgent/awesome-ai-agent-papers collection. Here is how each one translated into practice.

How We Built an Agent That Edits Its Own Instructions

· 9 min read
Vadim Nicolai
Senior Software Engineer

Most AI systems have a hard boundary between the instructions they follow and the work they do. Developers write prompts; the AI executes them. If the prompts are wrong, a human fixes them. We built an agent that fixes its own prompts.

The Skill Evolver is the third agent in our six-agent autonomous self-improvement pipeline for nomadically.work. Its scope is precisely defined: it can edit skill files, commands, hooks, CLAUDE.md, and memory files. It cannot touch application source code — that's the Code Improver's job. This agent improves the instructions that all other agents follow.

Five research papers informed its design, curated from the VoltAgent/awesome-ai-agent-papers collection. Each one solved a different aspect of the self-modification problem.

Why Do AI Agents Keep Making the Same Mistakes?

· 8 min read
Vadim Nicolai
Senior Software Engineer

Every Claude Code session leaves a trace — tool calls made, files read, edits applied, errors encountered, and ultimately a score reflecting how well the task was completed. Most systems discard this history. We built an agent that mines it.

The Trajectory Miner is the first agent in our six-agent autonomous self-improvement pipeline for nomadically.work, a remote EU job board aggregator. Its job: analyze past sessions, extract recurring patterns and reusable skills, and feed structured intelligence to the rest of the team. It writes no code. It produces raw material that other agents — the Codebase Auditor, Skill Evolver, and Code Improver — consume.

The design draws from four research papers, curated from the VoltAgent/awesome-ai-agent-papers collection. Here is what each paper contributes and how we translated academic ideas into a working system.

We Built a Strategic Brain for Our AI Pipeline. Here's What It Learned.

· 10 min read
Vadim Nicolai
Senior Software Engineer

Five agents in our pipeline know how to mine patterns, audit code, evolve skills, fix bugs, and verify changes. None of them knows when to do any of those things. That is the Meta-Optimizer's job.

The Meta-Optimizer is the sixth and final agent in our autonomous self-improvement pipeline for nomadically.work. It is the strategic brain: it reads all reports from other agents, determines the current phase of the system, creates prioritized action plans, and enforces safety constraints. It never edits code or skills directly. It only decides what should happen next.

Six research papers shaped its design. Together, they address the hardest problem in autonomous improvement: knowing when to improve, when to stop, and when to call for help.

The Agent That Says No: Why Verification Beats Generation

· 8 min read
Vadim Nicolai
Senior Software Engineer

An autonomous improvement system without verification is just autonomous damage. The Code Improver can write fixes. The Skill Evolver can edit prompts. But neither should be trusted to judge its own work. That's why the Verification Gate exists.

The Verification Gate is the fifth agent in our six-agent autonomous self-improvement pipeline for nomadically.work. It validates every change made by the Skill Evolver and Code Improver before those changes are accepted. It never modifies code or skills — it only reads, checks, and reports a verdict.

Five research papers shaped its design, curated from the VoltAgent/awesome-ai-agent-papers collection. The common thread: autonomous systems need calibrated self-awareness about the quality of their own outputs.

How I Built a UX Team with Claude Code Agent Teams

· 16 min read
Vadim Nicolai
Senior Software Engineer

TL;DR

Set CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 in .claude/settings.json. Write a command file in .claude/commands/ and spawn prompts in .claude/team-roles/. Type /ux-team and three agents — UX Lead, UX Researcher, UI Designer — run in parallel: researcher defines personas and journeys, designer builds the component system, lead synthesizes into a spec. File ownership is enforced by persona, not by filesystem. BMAD Method v6 provides the Sally persona and a quality-gate checklist that runs before the spec is marked complete.

BMAD Method + Langfuse + Claude Code Agent Teams in Production

· 16 min read
Vadim Nicolai
Senior Software Engineer

Running AI agents in a real codebase means solving three intertwined problems at once: planning and quality gates (so agents don't drift), observability (so you know what's working), and orchestration (so multiple agents divide work without clobbering each other). In nomadically.work — a remote EU job board with an AI classification and skill-extraction pipeline — these problems are solved by three complementary systems: BMAD v6, Langfuse, and Claude Code Agent Teams. This article explains how each works and how they compose.