
Cantrip README

Cantrip

A spellbook for summoning entities from language. Disguised as an Elixir agent runtime.

Putting language in a loop can make it come alive. You say words, the words change the room, the room changes you, you say different words. We call it chanting, and it is one of the oldest tools of magic.

An agent is the same shape. The model predicts a token; put it in a loop with an environment, and something emerges that wasn’t in the instructions. Cantrip names the parts:

  • Circle — the environment the entity is given to act within
  • Medium — the substrate the entity thinks in (conversation, Elixir, a shell)
  • Gates — boundary crossings where the circle opens outward (file reads, child entities, hot-loaded modules)
  • Wards — enforced runtime constraints (turn limits, recursion depth, medium options, hot-load policy)
  • Loom — every turn recorded as a tree of threads, forkable and replayable
  • Entity — what arises from the loop. You don’t build it. You design the circle, and it emerges.

A cantrip is the reusable value that binds an LLM, an identity, and a circle. When you cast or summon it, an entity appears in the loop. Its action space A is what the medium M and the gates G offer, minus what the wards W forbid:

A = M ∪ G − W
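
Read as set algebra, the formula can be sketched with MapSet. This is only a loose illustration: the action names below are hypothetical, Cantrip is not claimed to represent its action space this way, and grouping the formula as (M ∪ G) − W is an assumption about precedence.

```elixir
# Hypothetical action names, chosen only to illustrate the formula.
medium = MapSet.new([:eval_elixir, :bind_variable, :spawn_process])
gates = MapSet.new([:done, :read_file, :list_dir])
wards = MapSet.new([:spawn_process])

# A = (M ∪ G) − W: everything the medium and gates offer, minus what wards forbid.
action_space =
  medium
  |> MapSet.union(gates)
  |> MapSet.difference(wards)

IO.inspect(Enum.sort(action_space), label: "actions")
```

In this toy model the ward removes :spawn_process even though the medium would otherwise offer it.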

Quick Start

mix deps.get
cp .env.example .env

mix cantrip.cast "explain what a cantrip is"

That’s a bare conversation cantrip with a done gate. For the full code-medium coordinator that lives in your codebase:

mix cantrip.familiar
mix cantrip.familiar "summarize the loom storage modules"
mix cantrip.familiar --acp

Workflows

The same package primitives cover several distinct shapes:

  • Workspace cantrip — give an entity a medium, gates, wards, and a loom so it can work in a real environment with explicit controls.
  • Persistent entity — summon the cantrip into an OTP process when related prompts should share process-owned state.
  • Child cantrip composition — fan out work to specialized children and graft their results and looms back into the parent run.
  • Familiar coordinator — use the packaged codebase-facing entity when you want workspace gates, code-medium reasoning, durable memory, and delegation assembled for you.
  • Distributed Familiar — place child cantrips on named BEAM nodes and replicate Mnesia loom tables across the cluster.
  • Familiar evals — run curated prompt scenarios across multiple seeds, score them with rubric criteria, and persist transcripts for review.
  • Protocol surface — expose the same runtime through library calls, Mix tasks, streaming events, or stdio ACP.

Build a Workspace Cantrip

A code-medium cantrip that inspects a workspace through scoped filesystem gates and leaves a JSONL loom behind. The entity thinks in Elixir, uses list_dir, search, and read_file as host functions, and records every turn:

{:ok, llm} = Cantrip.LLM.from_env()
root = File.cwd!()

{:ok, cantrip} =
  Cantrip.new(
    llm: llm,
    identity: %{
      system_prompt: """
      You are a careful codebase analyst. Inspect the workspace through the
      available gates and call done with a concise findings list.
      """
    },
    circle: %{
      type: :code,
      gates: [
        :done,
        %{name: "list_dir", dependencies: %{root: root}},
        %{name: "search", dependencies: %{root: root}},
        %{name: "read_file", dependencies: %{root: root}}
      ],
      wards: [%{max_turns: 8}, %{sandbox: :port}, %{code_eval_timeout_ms: 5_000}]
    },
    loom_storage: {:jsonl, "tmp/cantrip-analysis.jsonl"}
  )

{:ok, result, _next, loom, meta} =
  Cantrip.cast(cantrip, """
  Find the modules responsible for loom storage and summarize their
  persistence choices, including any operational risks a deployer should know.
  """)

Provider configuration is routed through ReqLLM:

CANTRIP_LLM_PROVIDER=openai_compatible
CANTRIP_MODEL=gpt-5-mini
CANTRIP_API_KEY=sk-...
CANTRIP_BASE_URL=https://api.openai.com/v1
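
Before calling Cantrip.LLM.from_env(), it may help to check that the variables are actually set. The following is a minimal preflight sketch in plain Elixir; which variables are strictly required is an assumption here, since the README only shows them together in one example.

```elixir
# Names are taken from the example above. Treating these three as required
# is an assumption of this sketch, not documented Cantrip behavior.
required = ~w(CANTRIP_LLM_PROVIDER CANTRIP_MODEL CANTRIP_API_KEY)

# Collect every variable that is unset or empty.
missing = Enum.filter(required, fn name -> System.get_env(name) in [nil, ""] end)

case missing do
  [] -> IO.puts("provider configuration looks complete")
  names -> IO.puts("missing environment variables: " <> Enum.join(names, ", "))
end
```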

Cantrip.FakeLLM scripts deterministic responses for tests.

Keep an Entity Alive

Use summon when an entity should keep process-owned state across multiple intents:

{:ok, pid} = Cantrip.summon(cantrip)
{:ok, _first, _next, _loom, _meta} = Cantrip.send(pid, "Map the storage modules.")
{:ok, second, _next, loom, _meta} =
  Cantrip.send(pid, "Continue from there: compare JSONL and Mnesia.")
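
To see what "process-owned state" means in OTP terms, here is a toy stand-in that uses an Agent rather than Cantrip itself. The send_intent function and the reply text are hypothetical; the point is only that the second call sees state left behind by the first.

```elixir
# Hypothetical stand-in for a summoned entity: an Agent that remembers
# every intent it has been sent. This is not how Cantrip is implemented.
{:ok, pid} = Agent.start_link(fn -> [] end)

send_intent = fn pid, intent ->
  Agent.get_and_update(pid, fn history ->
    history = history ++ [intent]
    # First tuple element is the reply, second is the new process state.
    {{:ok, "handled #{length(history)} intent(s) so far"}, history}
  end)
end

{:ok, first} = send_intent.(pid, "Map the storage modules.")
{:ok, second} = send_intent.(pid, "Continue from there: compare JSONL and Mnesia.")

IO.puts(first)
IO.puts(second)
```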

Fan Out to Child Cantrips

Use ordinary cantrips as children. Results return in request order; each child also produces a loom.

{:ok, jsonl_reader} =
  Cantrip.new(
    llm: llm,
    identity: %{system_prompt: "Summarize the JSONL storage implementation."},
    circle: %{type: :conversation, gates: [:done], wards: [%{max_turns: 5}]}
  )

{:ok, mnesia_reader} =
  Cantrip.new(
    llm: llm,
    identity: %{system_prompt: "Summarize the Mnesia storage implementation."},
    circle: %{type: :conversation, gates: [:done], wards: [%{max_turns: 5}]}
  )

{:ok, summaries, _children, _looms, _meta} =
  Cantrip.cast_batch([
    %{cantrip: jsonl_reader, intent: "Focus on lib/cantrip/loom/storage/jsonl.ex"},
    %{cantrip: mnesia_reader, intent: "Focus on lib/cantrip/loom/storage/mnesia.ex"}
  ])
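
The "request order" guarantee can be pictured with plain Tasks. This sketch does not use Cantrip and makes no claim about how cast_batch is implemented; run_child and the delays are hypothetical. It shows that awaiting a list of tasks yields results in list order even when a later child finishes first.

```elixir
# Hypothetical stand-in for a child run: sleeps, then returns a summary.
run_child = fn %{name: name, delay_ms: delay_ms} ->
  Process.sleep(delay_ms)
  "#{name} finished"
end

requests = [
  %{name: "jsonl_reader", delay_ms: 50},
  %{name: "mnesia_reader", delay_ms: 10}
]

# Task.await_many/2 returns results in the order of the task list,
# not in completion order.
summaries =
  requests
  |> Enum.map(fn request -> Task.async(fn -> run_child.(request) end) end)
  |> Task.await_many(5_000)

IO.inspect(summaries)
```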

Launch the Familiar

The Familiar is the batteries-included coordinator for codebase work. It observes the workspace, reasons in Elixir, delegates to child cantrips, and persists its loom.

{:ok, familiar} = Cantrip.Familiar.new(llm: llm, root: File.cwd!())

{:ok, report, _next, _loom, _meta} =
  Cantrip.cast(familiar, "Inspect this repo and report the package shape.")

Hot-loading is opt-in. Pass evolve: true to include compile_and_load and an exact allowlist for Elixir.Cantrip.Hot.Tally. Be careful what you wish for; the Familiar is minimally warded.

Core API

Cantrip.new/1 builds a reusable cantrip value from an LLM tuple, identity, circle, loom storage, retry policy, and folding options.

Cantrip.cast/3 summons a one-shot entity for one intent:

{:ok, result, cantrip, loom, meta} =
  Cantrip.cast(cantrip, "Analyze this data", stream_to: self())

Cantrip.cast_batch/2 runs child cantrips concurrently and returns results in request order:

{:ok, results, children, looms, meta} =
  Cantrip.cast_batch([
    %{cantrip: analyst, intent: "Read chapter one."},
    %{cantrip: analyst, intent: "Read chapter two."}
  ])

Cantrip.cast_stream/2 returns {stream, task} for event consumers.

Cantrip.summon/1 and Cantrip.send/3 keep a supervised entity process alive across multiple intents.

Cantrip.Loom.fork/4 replays a loom prefix and branches from a prior turn.
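
The idea of branching from a prior turn can be sketched as a tree of turns with parent pointers. LoomSketch below is a hypothetical toy, not Cantrip.Loom; its data shape and function names are assumptions made for illustration.

```elixir
defmodule LoomSketch do
  # Each turn is %{id, parent, text}; a thread is the path from the root to a leaf.
  def append(turns, parent_id, text) do
    id = length(turns) + 1
    {id, turns ++ [%{id: id, parent: parent_id, text: text}]}
  end

  # Walk parent pointers back to the root and return the prefix in turn order.
  def prefix(turns, id) do
    by_id = Map.new(turns, &{&1.id, &1})

    id
    |> Stream.unfold(fn
      nil -> nil
      current -> {by_id[current], by_id[current].parent}
    end)
    |> Enum.reverse()
  end
end

{t1, turns} = LoomSketch.append([], nil, "intent")
{t2, turns} = LoomSketch.append(turns, t1, "first attempt")
{_t3, turns} = LoomSketch.append(turns, t2, "dead end")

# Fork: start a new thread from turn t2 instead of continuing the dead end.
{t4, turns} = LoomSketch.append(turns, t2, "alternative")

turns |> LoomSketch.prefix(t4) |> Enum.map(& &1.text) |> IO.inspect()
```

The forked thread replays "intent" and "first attempt" and then diverges, while the "dead end" turn stays in the tree on its own branch.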

See docs/public-api.md for a task-oriented API guide.

Mediums

The medium is the inside of the circle — what the entity thinks in.

Conversation. The LLM receives gates as tool definitions and responds with structured calls. Right when the work IS speech: interpretation, judgment, naming.

Code. The entity writes Elixir. Bindings persist across turns. Gates are injected as functions; loom is available as data. Right when the work is composition: gathering pieces, transforming them, aggregating, fanning out. Children are constructed through the public package API:

data = read_file.(path: "metrics.txt")
done.("Read #{byte_size(data)} bytes")
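
To try the calling convention outside a cantrip, the gates can be stubbed as anonymous functions. These stubs are hypothetical: the real gates are injected by the runtime, and their return shapes are not specified in this README.

```elixir
# Hypothetical stubs standing in for the injected gate functions.
read_file = fn opts ->
  path = Keyword.fetch!(opts, :path)
  "contents of #{path}"
end

done = fn answer -> {:done, answer} end

# Same shape as the entity-written code above: anonymous functions are
# called with a dot, and keyword arguments need no extra brackets.
data = read_file.(path: "metrics.txt")
result = done.("Read #{byte_size(data)} bytes")

IO.inspect(result)
```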

Plain code-medium cantrips use the safe port boundary by default: LLM-written Elixir is evaluated by Dune inside a child BEAM process, while gates, child cantrip API calls, stdio, and hot-loading are resolved through explicit parent/child protocol messages. The Familiar defaults to sandbox: :unrestricted for trusted operator-local coding work so native Elixir affordances such as binding/0 and Code.fetch_docs/1 match what its prompt teaches. The sandbox ward selects the evaluator:

  • Use %{sandbox: :port} when you want the default boundary to be explicit in a circle.
  • Use sandbox: :port_unrestricted only when you explicitly want raw Elixir in the child process.
  • Use sandbox: :dune when you want in-process language restriction with a deliberately smaller binding surface. See docs/port-isolated-runtime.md for the divergence; entity prompts need to match the variant in use.
  • Use sandbox: :unrestricted for trusted local development in the host BEAM.

Child-origin atoms outside Cantrip’s wire vocabulary cross the port boundary as strings, which keeps hot-loaded child code from forcing new atoms into the parent BEAM.
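
The atom rule guards against a well-known BEAM hazard: atoms are never garbage collected, so converting untrusted names into atoms can exhaust the atom table. A minimal sketch of the defensive pattern follows. WireSketch and its vocabulary are hypothetical; Cantrip's actual wire vocabulary is not listed in this README.

```elixir
defmodule WireSketch do
  # Hypothetical wire vocabulary, assumed here for illustration only.
  @vocabulary %{"ok" => :ok, "error" => :error, "done" => :done}

  # Known names become atoms; anything else stays a string, so untrusted
  # input can never grow the parent's atom table.
  def decode(name) when is_binary(name), do: Map.get(@vocabulary, name, name)
end

IO.inspect(WireSketch.decode("done"))
IO.inspect(WireSketch.decode("Elixir.Some.Generated.Module"))
```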

Bash. The entity writes shell commands. Each command runs in a fresh OS-sandboxed subprocess from the configured cwd. Shell state does not persist. Filesystem writes are denied except under %{bash_writable_paths: [...]}, and network is off unless %{bash_network: :on} is declared. Declared gates are projected as commands at the front of PATH: read_file README.md, list_dir ., search pattern lib, mix test, and cantrip_done "answer" for the done gate. SUBMIT: output still works for shell-only answers.

The Bash sandbox is release-tested against representative local shell workloads (git, make, jq, redirects through /dev/null, and common find/sed/grep pipelines); that workload suite is the support contract for expanding the adapter configuration over time. The workload tests opt into %{bash_network: :on} so GitHub-hosted runners can execute bubblewrap even when they cannot create a network namespace; separate tests pin the default network-deny command shape.

Gates

Built-in gates close over construction-time dependencies and produce observations the entity reads as data:

  • done(answer) — terminate with the final answer
  • echo(text) — visible observation
  • read_file(%{path}) — read a file under :root
  • list_dir(%{path}) — list a directory under :root
  • search(%{pattern, path}) — regex search returning %{path, line, text} matches
  • mix(%{task, args}) — run an allowlisted Mix task under :root
  • compile_and_load(%{module, source}) — compile and hot-load a module (opt-in via evolve: true on the Familiar)
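
As an illustration of the observation shape the search gate describes, here is a toy search over in-memory file contents. It is not the gate's implementation: the function signature, the in-memory files map, and the 1-based line numbering are all assumptions of this sketch.

```elixir
# Hypothetical toy search: takes a regex source and a map of path => contents.
search = fn pattern, files ->
  regex = Regex.compile!(pattern)

  for {path, contents} <- files,
      {text, line} <- contents |> String.split("\n") |> Enum.with_index(1),
      Regex.match?(regex, text) do
    # One observation per matching line, using the %{path, line, text} shape.
    %{path: path, line: line, text: text}
  end
end

files = %{"lib/a.ex" => "defmodule A do\n  def run, do: :ok\nend"}

IO.inspect(search.("def run", files))
```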

Errors are observations. A failed gate call returns to the entity as data so the next turn can adapt. Error as steering.
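
The pattern can be sketched as a gate wrapper that never raises and always returns data. The wrapper and the %{ok: ..., error: ...} observation shape below are hypothetical; the README does not specify how Cantrip encodes failed gate calls.

```elixir
# Hypothetical gate wrapper: failures come back as data instead of exceptions.
read_file_gate = fn %{path: path} ->
  case File.read(path) do
    {:ok, contents} -> %{ok: true, result: contents}
    {:error, reason} -> %{ok: false, error: "read_file failed: #{inspect(reason)}"}
  end
end

# The file does not exist, so the caller receives an error observation
# that it can inspect and react to on its next step.
observation = read_file_gate.(%{path: "does/not/exist.txt"})

IO.inspect(observation)
```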

Storage

The loom is the durable record of every turn the entity and its children have taken. Three backends:

base = [
  llm: llm,
  identity: %{system_prompt: "..."},
  circle: %{type: :conversation, gates: [:done], wards: [%{max_turns: 5}]}
]

Cantrip.new(Keyword.put(base, :loom_storage, :memory))
Cantrip.new(Keyword.put(base, :loom_storage, {:jsonl, "loom.jsonl"}))
Cantrip.new(Keyword.put(base, :loom_storage, {:mnesia, table: :cantrip_turns}))

Mnesia persistence across BEAM restarts requires a named node and a writable Mnesia directory. See DEPLOYMENT.md.
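
Both prerequisites can be checked from a running VM with standard calls. This is a small diagnostic sketch, not part of Cantrip; it only reports what the current node looks like.

```elixir
# Node.alive?/0 is true only when the VM is part of a distributed system,
# for example when it was started with --sname or --name.
named? = Node.alive?()

# The :mnesia application's :dir setting is where Mnesia writes its files;
# nil means it has not been set and Mnesia will fall back to its default.
mnesia_dir = Application.get_env(:mnesia, :dir)

IO.puts("named node: #{named?} (#{node()})")
IO.puts("mnesia dir: #{inspect(mnesia_dir)}")
```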

Safety

Plain code-medium circles default to the two-layer port boundary. Dune denies ambient File.*, System.*, Process.*, spawn, and similar capabilities inside the child; the port boundary keeps LLM-written code, hot-loaded modules, and spawned child work out of the host BEAM. Gate calls, hot-load validation, child cantrip construction, casting, loom grafting, telemetry, and provider access stay in the parent runtime. Timeouts close and kill the child process.
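
The timeout behavior has a familiar in-BEAM analogue. The sketch below uses a Task rather than a port-backed child process, so it illustrates only the "wait, then kill" shape and not Cantrip's port runner; the 100 ms budget is arbitrary.

```elixir
# Hypothetical runaway evaluation: a task that never returns on its own.
task = Task.async(fn -> Process.sleep(:infinity) end)

result =
  # Wait up to 100 ms for a reply; if none arrives, kill the task outright.
  case Task.yield(task, 100) || Task.shutdown(task, :brutal_kill) do
    {:ok, value} -> {:ok, value}
    {:exit, reason} -> {:error, {:exit, reason}}
    nil -> {:error, :timeout}
  end

IO.inspect(result)
```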

The Familiar default is the trusted host-BEAM evaluator because its audience is operator-local. For stricter operating-system policy — filesystem mounts, network egress, CPU/memory quotas, and user isolation — use sandbox: :port with :port_runner or run the host in a constrained container. The raw child-BEAM evaluator is sandbox: :port_unrestricted; the host-BEAM evaluator is sandbox: :unrestricted. See DEPLOYMENT.md for the full posture.

Paths by audience

Cantrip’s primitives are polymorphic on purpose. The Familiar is the one preassembly we ship today; other audiences assemble cantrips from the same Cantrip.new / cast / summon / cast_batch surface. Pick the entry that matches your use case.

Operator-local coding companion. You want an Elixir-native coding agent in your own workspace, with a durable loom keyed to that workspace. Run mix cantrip.familiar (REPL) or mix cantrip.familiar "your intent" (single-shot). The Familiar is the preassembly: code medium, scoped workspace gates, delegation, and Mnesia loom out of the box. See docs/public-api.md for the underlying surface.

Editor companion via ACP. You want the Familiar mounted inside Zed, JetBrains, Toad, or another ACP-aware editor. Run mix cantrip.familiar --acp and point your editor’s ACP client at it. See docs/acp-editor.md for a worked editor mount with configuration, smoke-test, and troubleshooting.

Research / evaluation substrate. You want to run prompt scenarios across seeds, score with rubric judges, and diff transcripts for regression work. Use Cantrip.Familiar.Eval and the eval harness. See docs/eval-harness.md for the harness, and evals/familiar/v1.3.3.exs for a curated 5-scenario starter suite covering gate-use, composition, synthesis quality (judge-graded), forbidden-pattern, and cross-summoning memory.

Reference docs

Package status

This package is 1.3.3. ACP support depends on agent_client_protocol ~> 0.1.0 from Hex. The package surface is checked with mix docs and mix hex.build.