Journal

11 essays · Latest 21 Jul 2026

Notes from the control plane

Essays on harnesses, evaluation, guardrails, and observability — the discipline of shipping AI systems you can stand behind.

Filed under: Enterprise · Evals · Harnesses · Observability · Prompting · Safety · Whitepaper

01 — Index

Nested concentric circles of light against a dark background.

Latest · 21 Jul 2026

Priors over priors

Why we are called Hyperpriors: what a hyperprior means in Bayesian terms, and why the control plane for production AI is exactly that — beliefs about how a system should hold beliefs.

Safety · Observability

16 Jun 2026

Anatomy of a harness

An agent harness is the structured runtime between a model and the world: tool mediation, context management, retries, fallbacks, and human escalation. A prompt and a while-loop is not an architecture.

Harnesses · Safety

Exposed wiring and labelled cable runs inside a server cabinet.

16 Jun 2026

The eval maturity model

A five-stage maturity model for LLM evaluation practice — from ad hoc spot checks to continuous production evaluation — with the failure modes of each stage and the exit criteria that mark genuine progress to the next.

Evals · Whitepaper

A staircase ascending through a concrete structure, each flight lit from above.

19 May 2026

Context engineering for agent systems

The context window is the scarcest resource in an agent system. A whitepaper-length treatment of budgeting it: retrieval allocation, tool-result summarisation, state across steps, and defending against context poisoning.

Prompting · Whitepaper

A shipping-container yard viewed from above, containers packed in ordered rows.

12 May 2026

Eval-driven development

Evals are the unit test suite of AI systems. How regression gates, golden sets, and honest LLM-as-judge practice keep model behaviour shippable — and why eval suites rot if you let them.

Evals

Rows of green and red status indicators on a monitoring dashboard.

14 Apr 2026

Defence in depth for LLM applications

No single control makes an LLM product safe, and none needs to. A layered architecture — input mediation, capability scoping, output enforcement, human review, audit — works because the layers fail independently. A whitepaper on building it deliberately.

Safety · Whitepaper

Translucent layers of orange and pink fabric overlapping in motion against a pale background.

14 Apr 2026

Deploying LLM systems in regulated environments

Audit trails, data residency, model-change management, validation documentation, and incident response for model-driven features — and how each requirement maps onto control-plane primitives you can actually operate.

Enterprise · Whitepaper

A bank of filing drawers with typed index labels, photographed in shallow depth of field.

10 Mar 2026

Prompts are interfaces, not incantations

A prompt is the interface contract between your deterministic system and a probabilistic component. That means version control, review, separation of concerns, and eval coverage — the same discipline you apply to any other interface.

Prompting

A printed technical specification laid out on a desk beside a keyboard.

10 Mar 2026

Regression gates for LLM systems

How to make evals block merges the way tests do: golden set construction, threshold design, flake management, and the gate discipline that makes provider migrations survivable.

Evals

A closed barrier gate on a road, lit by amber signal lights.

03 Mar 2026

The buy-vs-build question for AI controls

Every enterprise deploying LLM features ends up building an eval harness, a trace store, and a guardrail layer. Most build them badly, twice. Where the ownership boundary should actually sit.

Enterprise

Rows of identical shipping containers stacked in a freight yard.

03 Mar 2026

Uncertainty routing in production

A production AI system does not need to be right every time; it needs to know when it might be wrong and hand those cases to a person. Confidence signals, threshold calibration, escalation ergonomics, and the discipline of not crying wolf.

Safety

Abstract long-exposure photograph of pink and red light trails blurred against a soft blue horizon.

02 — Signal

Occasional notes, calibrated.

There is no email list on this site yet — we will not put your address in a query string and pretend a subscription happened. Read the journal here, or follow the RSS feed.

Subscribe via RSS · Request access to the product
