OPEN SOURCE · FULL HARNESS IMPLEMENTED · APACHE 2.0

It's all about the harness.
It's AI's harness.

Not just a workflow. A complete reasoning and control architecture for AI agents.

A workflow tells your AI what to do. A harness makes sure it actually does it — with a world model that tracks beliefs and contradictions, a control layer that knows when to slow down or stop, nine-layer verification before every action, and recovery strategies when things go wrong. Its Harness has delivered the full architecture: draw it on a canvas, run it on any framework, trace every decision.

LangGraph CrewAI Mastra MS Agent Framework

Star on GitHub★ 0 View roadmap → Free & open source · Apache 2.0

Simple Agent Loop

Input / Caller

LLM Call

Tool Call↺ loop

Output

prompt in → answer out
no world model · no control state · no verification

Full Harness — Implemented

Caller Stateconstraints · clarification · propagation

World Modelbeliefs · contradictions · generation_id

Reasoningevidence · hypotheses (4 sources) · VOI

Control5-tier resolver · deadlock detectkey

Planningtask graph (6-state) · parallel concurrency

ExecutionVOI · review gate

Verification9 layers

Recovery6 strategies

Memorycompression · journal

Learningexperience store · warm start (optional)

Output & Reviewer Passcontract · 3-lens review

22 nodes · 11 layers · world model + 5-tier control · 9-layer verification

WHY A HARNESS, NOT JUST A WORKFLOW

A harness does what a workflow can't.

A workflow routes prompts from node to node. A harness governs what the AI believes, what it's allowed to do, how it catches its own mistakes, and what it learns for next time. Its Harness delivers the full architecture — canvas and framework adapters as the foundation, and the complete reasoning and control layer on top.

Reasoning, not just prompting

The full harness maintains a world model — typed beliefs, contradictions, hypotheses from four generation sources, and a value-of-information gate before every action. Not just "ask the LLM."

Control that holds

A five-tier control state resolver (NORMAL → CAUTIOUS → BLOCKED) governs every action. Diagnostic health vectors drive it. Deadlock detection stops it escalating forever. Your AI knows when to slow down.

Verification with teeth

Nine verification layers, a pre-execution review gate across five dimensions, an adversarial reviewer pass, and contract validation before every return. Trust, but verify — every time.

Recovery and learning

Six named recovery strategies, typed failure detection, local vs global replanning, and an optional experience store that reuses successful decompositions — not just success-rate priors — across runs.

Canvas + 4 framework adapters + Langfuse observability + full harness: world model, control state resolver, 9-layer verification, recovery, experience store, reviewer pass — 22 harness nodes · 379 tests passing.

FULL HARNESS ARCHITECTURE

Canvas foundation to full reasoning and control.

Its Harness is a complete reasoning and control architecture for AI agents. The canvas and framework adapters are the visible surface — below them sit world model management, diagnostic health vectors, a five-tier control state resolver, nine-layer verification, named recovery strategies, an experience store, and a three-lens reviewer pass. 379 acceptance tests across all four frameworks.

Canvas & execution layer

✓ Canvas with 24 node types (14 execution + 10 harness)

✓ 4 framework adapters (LangGraph, CrewAI, Mastra, MAF)

✓ Langfuse observability — harness traces, all 4 runtimes

✓ HITL pause/resume · REST/MCP/A2A deploy

✓ FlowSpec v0.2.0 — open, portable JSON format

✓ Process concepts — pre-seeded task graph scaffolds

Reasoning & control layer

✓ World model · typed beliefs · contradiction detection

✓ 5-tier control state resolver · deadlock detection

✓ Pre-execution review gate · 9-layer verification

✓ 6 named recovery strategies · typed failure library

✓ Experience store — cross-run structural reuse

✓ Adversarial reviewer pass · output contract validation

Read the full implementation plan → 22 nodes · 11 layers · 379 tests · all harness invariants

Foundation State Architecture

Evidence & Reasoning

World Model & Contradiction

Diagnostics & Control State

Planning & Task Graph

Execution & Verification

Recovery & Memory

Caller State & Escalation

Experience Store

Reviewer Pass & Output Contract

Canvas Integration

E2E Integration & Testing

State & reasoning infrastructure

World model with typed beliefs and generation_id tracking, evidence store, hypothesis system with diversity enforcement, contradiction detection, diagnostic health vectors, five-tier control state resolver, and a six-state task graph.

Execution hardening & learning

VOI estimation, nine-layer verification, pre-execution review gate, reversibility strategy, six named recovery strategies, context compression, experience store for cross-run structural reuse, adversarial reviewer pass, and output contract validation.

Canvas & full integration

Ten harness canvas node types, diagnostic health dashboard, updated framework adapters with harness-aware tracing in Langfuse, caller state and escalation, process concepts, and end-to-end tests across all four frameworks.

GET STARTED

Up and running in two commands.

Its Harness runs entirely on your machine via Docker. No cloud account, no sign-up — just clone and run.

terminal

# 1. Generate secrets and configure your environment
./setup-env.sh

# 2. Start all services
docker compose up

What starts up

Canvas — localhost:3000

The visual workflow editor. Draw your flows here.

API — localhost:8000

The backend that compiles and runs your flows.

Langfuse — localhost:3001

Monitoring dashboard. Every run is traced automatically.

Nine services in total. setup-env.sh handles all the secrets and configuration automatically — you only need to provide your LLM API key (or skip it and use a free local model instead).

AI MODELS

Use OpenAI, Claude, or run completely free locally.

Its Harness routes all AI calls through LiteLLM — a proxy that works with any model provider. You pick the model in your workflow; the rest is handled automatically.

OpenAI

Add your OPENAI_API_KEY and use gpt-4o or gpt-4o-mini in any flow.

Anthropic

Add your ANTHROPIC_API_KEY and use claude-sonnet, claude-haiku, or claude-opus.

Ollama — free & local

Run mistral, qwen3, or qwen2.5-coder locally. No API key, no cost, no data leaving your machine.

Any custom model

Edit one config file to add any OpenAI-compatible model or endpoint — including self-hosted or fine-tuned models.

Want to try it without an API key? Install Ollama, pull a model (ollama pull mistral), and run ./setup-ollama.sh — it tests all four frameworks end-to-end with no paid account needed.

HOW IT WORKS

One harness definition. Runs on any engine.

Its Harness stores your flow in an open format called FlowSpec. The same spec compiles to LangGraph, CrewAI, Mastra, or MAF — no rewriting. As the harness phases land, FlowSpec gains new node types for world model management, control state, verification gates, and recovery — all backwards-compatible with existing flows.

support-triage.flow.json

{
  "workflow": "support-triage",

  "nodes": [
    {
      "type": "llm_call",
      "prompt": "Classify severity"
    },
    {
      "type": "condition",
      "route": "high_priority"
    },
    {
      "type": "human_review"
    }
  ],

  "telemetry": {
    "provider": "langfuse"
  }
}

What you can build

Multi-step AI flows

Routing, branching, conditional logic, and tool calling — all drawn visually.

Safety & reliability

Guardrails, retry logic, and human approval checkpoints built right in.

Team collaboration

Edit workflows together, share with your team, and push to production.

Full observability

Langfuse traces give you complete visibility into every run, cost, and failure.

node taxonomy · execution layer14 types

i/o

inputoutput

llm

llm_call

tools

tool_invoke

agents

agent_roleagent_debate

flow control

conditionparallel_forkparallel_join

human

hitl_breakpoint

memory

memory_readmemory_write

composition

subgraphtransform

Harness canvas layer — 10 harness node types

world_model hypothesis_set control_state task_graph verification_gate recovery_node evidence_store experience_store reviewer_pass gather_evidence

SUPPORTED FRAMEWORKS

Use the AI framework your team already knows.

Most workflow tools are tied to one framework. Its Harness separates the workflow design from the execution — so you can experiment, migrate, or compare frameworks without rebuilding your flows each time.

LangGraph

Graph-based orchestration for complex, stateful AI pipelines.

CrewAI

Multi-agent workflows where AI "crew members" collaborate on tasks.

Mastra

TypeScript-native AI workflows, built for modern JS teams.

Microsoft Agent Framework

Enterprise AI workflows powered by Semantic Kernel.

Feature	LangGraph	CrewAI	Mastra	MS Agent Framework
llm_call · structured output Prompt templates, model overrides, validators.	full	full	full	full
agent_role · agent_debate Named personas, multi-agent loops. MAF maps natively to AgentGroupChat.	via synthesis	role native	via synthesis	AgentGroupChat native
condition · parallel_fork / join Branching, fan-out, configurable join reducers.	full	full	full	full
hitl_breakpoint Pause, inspect, edit state, resume.	full	partial	full	_HitlPause exception
memory · semantic (vector) Embeddings, top-k, similarity threshold.	via plugin	full	full	full
streaming · tokens / updates Token streaming and graph-update events.	full	partial	full	via adapter
checkpoint · Postgres / Redis Durable execution, resume from any node.	full	via plugin	full	via Dapr / Orleans

EXAMPLE FLOWS

Five ready-made flows to learn from or build on.

The repo ships with five real, working flows. Each one demonstrates a different set of features — open them in the canvas, run them, and modify them to fit your use case.

Flow Best framework What it shows

RAG Agent

Search a knowledge base, retrieve the most relevant chunks, and generate a grounded answer.

LangGraph Memory read, semantic search, caching

Content Moderation + Human Review

Classify content, auto-approve low-risk items, and pause for a human to review anything flagged as high risk.

Mastra Human-in-the-loop, structured output

Parallel Risk Assessment

Three specialist AI agents review a document simultaneously, then their findings are merged into one report.

CrewAI Parallel agents, fan-out/merge

Research Crew

A team of AI agents collaborate on a research task, with human approval required before any external tool is used.

CrewAI Multi-agent, tool approval

Debate Agent

Multiple AI agents argue different sides of a question until they reach a conclusion — then the flow is also exposed as an A2A agent other systems can call.

MS Agent Framework Multi-agent debate, A2A protocol

MONITORING

Know exactly what your AI did — and why it did it.

Every prompt, decision, tool call, failure, and slowdown is tracked automatically via Langfuse. Compare runs across frameworks, spot problems fast, and understand exactly what happened — without digging through logs.

As the harness phases land: traces will extend to harness-specific spans — world model generation_id per step, control state transitions (NORMAL → CAUTIOUS → BLOCKED), diagnostic health vectors, recovery strategy changes, and reviewer-pass findings — giving you a complete audit trail of every reasoning decision the agent made.

DEPLOY & INTEGRATE

One click to publish your flow as an API, a Claude tool, or an agent.

Once your flow is ready, deploying it takes a single API call. It goes live simultaneously as three different things — so whatever system needs to use it can call it the way that makes sense.

REST endpoint

Any app can trigger your flow with a standard HTTP POST. No special SDK needed.

MCP tool

Your flow becomes a tool that Claude Desktop (and any MCP client) can call directly in conversation.

A2A agent

Expose your flow as an agent other AI systems can discover and invoke using the open A2A protocol.

Embeddable canvas

Drop the visual editor into your own app with the @itsharness/canvas npm package.

PUBLIC ALPHA

Its Harness is in public alpha.
The full harness is here.

The canvas, adapters, observability layer, and the full reasoning and control architecture are all working and ready to use — world model, control state resolver, nine-layer verification, recovery, experience store, adversarial reviewer pass. Your real flows, bug reports, and contributions shape what gets fixed and what gets built next.

Public alpha — current and planned work is in the open

The canvas-and-adapters layer is stable but alpha — APIs may shift, Docker Compose behaviour may vary, and edge cases in less-common node combinations aren't fully covered yet. The full harness reasoning and control architecture is implemented and tested. Run it, break it, and tell us what you need. Every report shapes what gets prioritised.

track A · run & report

Run it against real flows

One docker compose up starts all nine services. Point it at a real flow — your actual use case, not a toy example. When something breaks, crashes silently, or produces wrong output, open a bug report with your FlowSpec JSON, the runtime you used, and the full error. The more specific, the faster it gets fixed.

Report a bug →

track B · shape the roadmap

Tell us what you need

Missing a node type? A harness feature that would change how you build? A runtime behaviour that doesn't map cleanly? Open a feature request. Describe what you're building and where Its Harness falls short — concrete use cases carry far more weight than abstract asks. Phase priority is influenced by community demand.

Request a feature →

track C · build with it

Build a node pack or contribute a phase

FlowSpec v0.2.0 is stable for third-party node packs (@itsharness/nodes/…). The full harness implementation is public and open for community contribution. The spec, adapter interface, and canvas package are all stable enough to build on today.

Read the spec → View the plan →

It's all about the harness. It's AI's harness.