Strange Lab
We get your company ready for agents.
Every company already writes down what it does: mail, chat, docs, tickets, code. We turn that record into what agents need: company state, repeated work, decision rights, policy gates, and outcomes.
The first output is an Agent Deployment Map: what agents can train on, what they can shadow, what they can run, and where a human stays in the loop.
There is a bet underneath the map. Most work just continues: tomorrow looks like today, every forecaster looks like a genius, and agents are safe on exactly that kind of work. The expensive question is when things stop continuing, and answering it takes a model of how your company actually moves.
We test this on public records. The cleanest showcase is Bismarck: one dense PDF becomes 1,922 dated events, and later history is hidden. GPT and the world model see the same past; the world model reads the next pressure shifts more closely, and the page lets you try harder forks. The broader evidence set keeps the same discipline: stand at a date in the past, hide everything after it, and grade the forecast against what actually happened.
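For readers who want the discipline spelled out, here is a minimal sketch of a hidden-future test in Python. It is an illustration under stated assumptions, not our production code: the `Event` shape, the coarse `label` field, the `persistence` baseline, and the rule that a "break" is any next event whose label differs from the last visible one are all hypothetical, and the dates are made up.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Event:
    when: date
    label: str  # hypothetical coarse state label, e.g. "calm" or "crisis"


Forecaster = Callable[[Sequence[Event]], str]


def split_at_cutoff(
    events: Sequence[Event], cutoff: date
) -> Tuple[List[Event], List[Event]]:
    """Return (visible past, hidden future) for one cutoff date."""
    ordered = sorted(events, key=lambda e: e.when)
    past = [e for e in ordered if e.when <= cutoff]
    future = [e for e in ordered if e.when > cutoff]
    return past, future


def persistence(past: Sequence[Event]) -> str:
    """Baseline: predict that the latest visible label simply continues."""
    return past[-1].label


def backtest(
    events: Sequence[Event], cutoffs: Sequence[date], forecaster: Forecaster
) -> Dict[str, float]:
    """Score the next hidden event, split by continuation vs. break."""
    hits = {"continue": 0, "break": 0}
    totals = {"continue": 0, "break": 0}
    for cutoff in cutoffs:
        past, future = split_at_cutoff(events, cutoff)
        if not past or not future:
            continue  # nothing to condition on, or nothing to grade against
        prediction = forecaster(past)  # the forecaster never sees `future`
        actual = future[0].label
        # Assumed definition of a break: the next label differs from the last one.
        kind = "continue" if actual == past[-1].label else "break"
        totals[kind] += 1
        hits[kind] += int(prediction == actual)
    return {k: hits[k] / totals[k] for k in totals if totals[k]}


# Made-up event stream: three calm months, then a shift.
events = [
    Event(date(2000, 1, 1), "calm"),
    Event(date(2000, 2, 1), "calm"),
    Event(date(2000, 3, 1), "calm"),
    Event(date(2000, 4, 1), "crisis"),
    Event(date(2000, 5, 1), "crisis"),
]
scores = backtest(events, [e.when for e in events], persistence)
print(scores)  # {'continue': 1.0, 'break': 0.0}
```

The persistence baseline is perfect while things continue and wrong on the one break, by construction. That is why a single blended accuracy number flatters every forecaster, and why the score is worth splitting.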
You can watch that test run in the examples: Bismarck shows the whole PDF-to-world-model loop, Enron's mail archive replays company forks, two public-history surfaces do the same with macro data and Civil War-era news, and Star Wars / Middle-earth synthetic worlds check whether the method works when there is no internet answer key. The scored comparisons, including the trend-break split, live on the evidence page.
The same machinery now runs on a private company work record as an anonymized Decision Lab case: more than 46,000 canonical events, 15 workflow candidates, and 22 discovered skills. Those skills are discovery-grade until owners, reviewers, and replay tests exist; that boundary is the point of the Agent Deployment Map.
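The discovery-grade boundary can be read as a simple gate. The sketch below is a hypothetical illustration, not the Decision Lab schema: the `DiscoveredSkill` fields, the function names, and the example skill name are assumptions. The only rule it encodes is the one stated above: a skill stays discovery-grade while an owner, a reviewer, or replay tests are missing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DiscoveredSkill:
    name: str
    owner: Optional[str] = None     # who answers for the skill (assumed field)
    reviewer: Optional[str] = None  # who signs off on changes (assumed field)
    replay_tests: int = 0           # how many replay tests exist for it


def missing_gates(skill: DiscoveredSkill) -> List[str]:
    """List which of the three requirements the skill still lacks."""
    missing = []
    if not skill.owner:
        missing.append("owner")
    if not skill.reviewer:
        missing.append("reviewer")
    if skill.replay_tests < 1:
        missing.append("replay tests")
    return missing


def is_discovery_grade(skill: DiscoveredSkill) -> bool:
    """A skill is discovery-grade while any requirement is missing."""
    return bool(missing_gates(skill))


# Made-up example skill, freshly discovered with an owner but nothing else.
skill = DiscoveredSkill(name="invoice-triage", owner="ops-lead")
print(missing_gates(skill))       # ['reviewer', 'replay tests']
print(is_discovery_grade(skill))  # True
```

Returning the list of missing gates, rather than a bare yes or no, keeps the map actionable: each entry says what has to exist before the skill can move.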
Agents take the work that repeats; the model's job is to notice when it stops.
Working on something where this matters, or want to poke holes in it? The fastest way to reach us is through Strange Loop Canon.
See it run
- Examples: public world-model demos and what each one proves
- Evidence: every scored comparison we publish, with Bismarck as the easiest audit path
- Decision Lab: private-company workflow and skill discovery as an Agent Deployment Map (password-gated; ask us for the keys)
- Bismarck: one PDF becomes a dated event stream, a hidden-future comparison, and a fork explorer
- Enron: choose a historical cutoff, write an email as an Enron actor, and compare forecasts
- Public History: choose a macro cutoff and test an analyst move
- Civil War-era public news: choose a news cutoff from 1859-1865 and test a public response
Essays, papers, and code
- MarketBench: blog, paper
- The Future of Work Is Playing a Videogame
- The Future of Work Is World Models
- Homo Agenticus Sapiens: essay Seeing Like an Agent, GitHub list
- Management flight simulator: blog, VEI repo
- Aligned Agents Still Build Misaligned Organisations