Grove — production-grade AI agents

96% of organizations run AI agents. Only 1 in 9 has them in production at scale.

The problem isn't the models. It's that today's agent frameworks were built as scripts: brittle, opaque, expensive, and impossible to debug once they're live. Most failures originate at the boundaries between processes — exactly where current tooling has nothing to say.

Grove takes the lessons from 30 years of Erlang/OTP fault tolerance, adds a workflow compiler, and wraps both in a developer experience that feels like Next.js. It is the runtime layer agents have been missing.

Seven things you didn't have before

Each one is real, tested, and verified live against Claude.

01 · supervision

Processes, not scripts

An OTP-style supervisor with one_for_one, one_for_all, and rest_for_one restart strategies, plus a restart-intensity guard that stops a crash loop from restarting forever. Let it crash; let the system heal.
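
To make the three strategies and the intensity guard concrete, here is a minimal sketch of the OTP semantics. It is not Grove's implementation: `RestartGuard`, `allowRestart`, and `childrenToRestart` are hypothetical names, and children are represented as plain strings.

```typescript
// Hypothetical sketch of OTP-style restart rules; not Grove's actual code.
type Strategy = 'one_for_one' | 'one_for_all' | 'rest_for_one'

// Which children restart when the child at index `failed` crashes.
function childrenToRestart(strategy: Strategy, children: string[], failed: number): string[] {
  switch (strategy) {
    case 'one_for_one':
      return children.slice(failed, failed + 1) // only the crashed child
    case 'one_for_all':
      return children.slice() // every child
    case 'rest_for_one':
      return children.slice(failed) // the crashed child and those started after it
  }
}

class RestartGuard {
  private restarts: number[] = [] // timestamps (ms) of recent restarts
  private readonly intensity: number
  private readonly period: number

  constructor(intensity: number, period: number) {
    this.intensity = intensity // max restarts allowed...
    this.period = period // ...within this many milliseconds
  }

  // Record a restart at time `now`. Returns false once the limit is exceeded,
  // which is where a supervisor would give up and escalate.
  allowRestart(now: number): boolean {
    this.restarts = this.restarts.filter((t) => now - t < this.period)
    this.restarts.push(now)
    return this.restarts.length <= this.intensity
  }
}
```

With `intensity: 5, period: 60_000`, as in the example further down, a sixth restart inside one minute would be refused under these rules.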

02 · compile

Compile-to-determinism

Static analysis identifies deterministic paths. `grove compile` then prewarms the runtime cache with each tool's declared examples. Verified live: 10× projected cost reduction on multi-agent topologies.
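
The prewarming step can be pictured roughly as follows. This is a sketch under assumptions, not the compiler itself: `ToolDef` and `prewarm` are hypothetical names, tools are synchronous for brevity, and `JSON.stringify` stands in for a canonical encoding of the input.

```typescript
// Hypothetical sketch: fill a cache from each tool's declared examples.
interface ToolDef<I, O> {
  name: string
  deterministic: boolean
  examples: I[]
  run: (input: I) => O
}

// Returns the number of cache entries added.
function prewarm<I, O>(tools: ToolDef<I, O>[], cache: Map<string, O>): number {
  let added = 0
  for (const t of tools) {
    if (!t.deterministic) continue // only deterministic tools are safe to cache
    for (const example of t.examples) {
      // Stand-in key; a real key would be independent of object key order.
      const key = `${t.name}:${JSON.stringify(example)}`
      if (!cache.has(key)) {
        cache.set(key, t.run(example))
        added++
      }
    }
  }
  return added
}
```

Any later call whose input matches a declared example would then be served from the cache instead of running the tool.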

03 · cache

Cross-process tool cache

Results of deterministic tools are keyed by a content hash of the canonical JSON of their input. Stored in persistent SQLite, with LRU eviction beyond 10K entries. Restart your worker and the warm cache stays warm.
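
A minimal in-memory sketch of the two ideas, canonical-JSON keys and LRU eviction, might look like this. `canonicalJson` and `LruCache` are hypothetical names; the hashing step and the SQLite persistence are deliberately left out, and inputs are assumed to be JSON-compatible.

```typescript
// Hypothetical sketch; Grove's hashing and SQLite storage are not shown.

// Serialize with object keys sorted at every depth, so logically equal
// inputs give the same string regardless of key order.
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((v) => canonicalJson(v)).join(',')}]`
  }
  if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>
    const body = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalJson(obj[k])}`)
      .join(',')
    return `{${body}}`
  }
  return JSON.stringify(value)
}

class LruCache<V> {
  private entries = new Map<string, V>()
  private readonly maxEntries: number

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries
  }

  get(key: string): V | undefined {
    if (!this.entries.has(key)) return undefined
    const value = this.entries.get(key) as V
    // Re-insert so this entry becomes the most recently used.
    this.entries.delete(key)
    this.entries.set(key, value)
    return value
  }

  set(key: string, value: V): void {
    this.entries.delete(key)
    this.entries.set(key, value)
    if (this.entries.size > this.maxEntries) {
      // A Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value
      if (oldest !== undefined) this.entries.delete(oldest)
    }
  }
}
```

Under this scheme `{ query: 'x', limit: 5 }` and `{ limit: 5, query: 'x' }` map to the same key, which is what lets a cache hit survive differences in how callers build their input objects.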

04 · bench

Live time-travel inspector

Every event recorded by default. Scrub through any past session, click any step, see the full data, fork from any moment. Keyboard scrubbing (←/→/Home/End/F).
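
Scrubbing and forking both fall out of an append-only event log. The sketch below is an illustration only: `SessionLog`, `SessionEvent`, and their methods are hypothetical names, not the inspector's actual data model.

```typescript
// Hypothetical sketch of an append-only event log with scrub and fork.
interface SessionEvent {
  step: number
  kind: string
  data: unknown
}

class SessionLog {
  private events: SessionEvent[] = []

  record(kind: string, data: unknown): SessionEvent {
    const event: SessionEvent = { step: this.events.length, kind, data }
    this.events.push(event)
    return event
  }

  // Scrub: the session as it looked after `step` (inclusive).
  at(step: number): SessionEvent[] {
    return this.events.slice(0, step + 1)
  }

  // Fork: a new log that shares history up to `step`, then diverges.
  // Event objects are copied shallowly; `data` payloads are shared.
  fork(step: number): SessionLog {
    const child = new SessionLog()
    child.events = this.at(step).map((e) => ({ ...e }))
    return child
  }

  get length(): number {
    return this.events.length
  }
}
```

Because the original log is never mutated by a fork, the past session stays intact while the fork continues from the chosen step.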

05 · hot reload

Per-child hot reload

Edit agent.ts and save: only the children whose definitions actually changed restart. Siblings keep running.
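
One way to decide which children restart is to fingerprint each definition and compare before and after the save. This is a sketch under assumptions: `ChildDef`, `fingerprint`, and `planReload` are hypothetical, and a JSON string of a fixed field list stands in for a real content hash.

```typescript
// Hypothetical sketch: diff child definitions to plan a per-child reload.
interface ChildDef {
  name: string
  model: string
  system: string
  tools: string[]
}

interface ReloadPlan {
  restart: string[] // definition changed
  start: string[] // newly added
  stop: string[] // removed
  keep: string[] // unchanged, left running
}

// Stand-in for a content hash of the definition.
function fingerprint(def: ChildDef): string {
  return JSON.stringify([def.name, def.model, def.system, def.tools])
}

function planReload(prev: ChildDef[], next: ChildDef[]): ReloadPlan {
  const before = new Map<string, string>()
  for (const d of prev) before.set(d.name, fingerprint(d))

  const plan: ReloadPlan = { restart: [], start: [], stop: [], keep: [] }
  for (const def of next) {
    const old = before.get(def.name)
    if (old === undefined) plan.start.push(def.name)
    else if (old !== fingerprint(def)) plan.restart.push(def.name)
    else plan.keep.push(def.name)
    before.delete(def.name)
  }
  // Whatever is left in `before` no longer exists in the new file.
  plan.stop = Array.from(before.keys())
  return plan
}
```

Editing only the `system` prompt of one agent would put that agent in `restart` and every sibling in `keep`.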

06 · eval

Behaviour diff in CI

Declare cases, save profiles, diff across runs. Each case is classified as same, drift, or regressed. The command exits non-zero on regression, so it drops straight into CI.
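
The classification could work along these lines. The rules here are assumptions made for illustration, not Grove's documented definitions: a case is "regressed" if it passed in the baseline and fails now (or is missing), "same" if the output is identical, and "drift" otherwise. `CaseResult`, `Profile`, `classify`, and `diffProfiles` are hypothetical names.

```typescript
// Hypothetical sketch of a behaviour diff between two saved profiles.
type Verdict = 'same' | 'drift' | 'regressed'

interface CaseResult {
  output: string
  passed: boolean
}

type Profile = Record<string, CaseResult>

function classify(baseline: CaseResult, current: CaseResult): Verdict {
  if (baseline.passed && !current.passed) return 'regressed'
  return baseline.output === current.output ? 'same' : 'drift'
}

function diffProfiles(
  baseline: Profile,
  current: Profile,
): { verdicts: Record<string, Verdict>; exitCode: number } {
  const verdicts: Record<string, Verdict> = {}
  let regressed = false
  for (const name of Object.keys(baseline)) {
    const before = baseline[name]
    const after = current[name]
    if (before === undefined) continue
    // Assumption: a case missing from the current run counts as a regression.
    const verdict = after === undefined ? 'regressed' : classify(before, after)
    verdicts[name] = verdict
    if (verdict === 'regressed') regressed = true
  }
  // A non-zero exit code is what lets a CI job fail on regression.
  return { verdicts, exitCode: regressed ? 1 : 0 }
}
```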

07 · mcp + prompt cache

MCP-native + Anthropic prompt caching

Mount tools from any MCP stdio server with one line. Anthropic prompt cache auto-applied to long system prompts; 90% input-token discount on cache reads. Verified live: 4399 tokens cached, 3959 tokens saved on second call.
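
The two figures above are consistent with each other: a 90% discount on 4399 cached tokens works out to about 3959 tokens' worth of input cost. A small worked check, where `tokensSaved` is a hypothetical helper and the higher price of the initial cache write is not modeled:

```typescript
// Worked arithmetic for the cache-read discount quoted above.
// With a 90% discount, a cached token costs 0.1 of a regular input token,
// so the saving is 0.9 token-equivalents per cached token.
function tokensSaved(cachedTokens: number, discount: number): number {
  return Math.round(cachedTokens * discount)
}

const saved = tokensSaved(4399, 0.9)
console.log(saved) // 3959
```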

Install

Targets Bun 1.3+ for its native SQLite, native TypeScript, native test runner, and native bundler.

# Install Bun (if you don't have it)
curl -fsSL https://bun.sh/install | bash

# Add Grove
bun add @vyntral/grove-core @vyntral/grove-runtime @vyntral/grove-cli

# Or scaffold a new project
bunx grove init agent.ts && bun run agent.ts

Hello, supervised agent

import { agent, supervise, tool } from '@vyntral/grove-core'
import { start } from '@vyntral/grove-runtime'
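import { z } from 'zod'

// Placeholder search backend so the example runs as written; swap in a real implementation.
async function fetchResults(query: string): Promise<string[]> {
  return [`stub result for: ${query}`]
}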

const search = tool({
  name: 'search',
  description: 'Search the web for a query.',
  schema: z.object({ query: z.string() }),
  deterministic: true,                    // ← cache hits cross-process
  examples: [{ query: 'sparse attention' }],  // ← prewarmed by `grove compile`
  run: async ({ query }) => fetchResults(query),
})

const research = agent({
  name: 'research',
  model: 'anthropic/claude-opus-4-7',
  system: 'You research topics rigorously.',
  tools: [search],
})

const tree = supervise({
  strategy: 'one_for_one',
  children: [research],
  restart: { intensity: 5, period: 60_000 },
})

const { handle, sessionId } = await start(tree)
const out = await handle.run('survey on prompt cache thresholds')
await handle.stop()

How it differs

Capabilities that adjacent frameworks ship today, compared with Grove.

	LangChain / CrewAI	DSPy / TextGrad	Replay / AgentOps	Grove
Supervised processes	–	–	–	✓
Restart strategies (OTP)	–	–	–	✓
Compile-time optimisation	–	prompt-only	–	workflow-level
Persistent runtime cache	–	–	–	✓
Hot reload (per-child)	–	–	–	✓
Time-travel inspector	–	–	✓	✓ + fork
Behaviour diff (CI-friendly)	–	–	–	✓
MCP-native (stdio)	partial	–	–	✓
One install	partial	–	–	✓
