Think North Learning
thinknorth.consulting
AGENTS · TOOLS · MCP · Prediction · 7 min

Hands for the Brain

01 · THE SETUP

2023. You ask a chatbot: “What's on my calendar Thursday? Book us a table somewhere near the office after my last meeting.” You get a paragraph of graceful apology: it has no access to your calendar, cannot make reservations, and is, after all, just a language model.

2026. You ask an assistant the same thing. It checks the calendar, sees the 4pm running long, finds a restaurant, books 6:30, and drops the confirmation in your inbox.

Before reading on, predict: what changed? Did the models get smart enough to act — or did something else happen?

Careful — the intuitive answer ('the AI got smarter') hides an assumption. A brain in a jar that doubles in intelligence is still in a jar. What would it take to get out?

02 · YOUR CALL

What actually let the model act on the world?

If you pick A

The assumption to catch: intelligence produces text, and no amount of it turns text into a restaurant booking. A 2026 frontier model with no connections is exactly as trapped as a 2023 one — just more eloquent about it. The change was outside the brain.

If you pick B — the mechanism

That's it — and notice how unglamorous it is. The model emits something like calendar.list(day=THU); software around it runs that for real and pastes the result back into the conversation; the model reads it and decides the next step. Brain unchanged, loop added. The loop is the whole revolution.

If you pick C

A reasonable architecture guess; that's roughly how some robotics stacks work. But for digital assistants, no separate action network exists. The same language model does everything; it just learned to express actions in a format ordinary software can execute. The magic is plumbing.

If you pick D

A wise suspicion — you've read the Thousand Humans lesson. And humans do gate the consequential steps in well-built systems. But the calendar check, the search, the booking call: those run as real software actions, chosen by the model, no hidden operator required.

Pick one — committing first is what makes the answer stick.


03 · NOT SO FAST

“The models got smarter” is true — and it's the wrong explanation. Smarter alone produces better apologies.

What it misses is that acting requires an exchange: the model must be able to reach out, and the world must be able to answer back, over and over, until the job is done. That exchange has a shape, and once you see it, you'll understand both why agents suddenly work and why they fail in the specific ways they do.

04 · THE MECHANISM
[Figure: REASON → TOOL CALL → EXECUTE → OBSERVE, with the plan updating on what actually happened. MCP, the socket: calendar, search, bookings, your CRM… Build a tool once, every agent can pick it up. The compounding trap: 95% reliable per step ≈ 36% after 20 steps.]
The agent loop: reason → call a tool → observe the result → repeat until done. MCP standardises the sockets.
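To make the loop concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: scripted_model is a hard-coded stand-in for a real language model, and calendar_list and bookings_create are toy functions, not a real calendar or booking API. What it shows is only the shape: the model proposes a tool call, ordinary software runs it, and the result goes back into the transcript before the next decision.

```python
from typing import Callable

# Hypothetical toy tools. A real system would call a calendar or booking service here.
def calendar_list(day: str) -> str:
    return f"{day}: last meeting ends 17:30"

def bookings_create(time: str) -> str:
    return f"table booked for {time}"

TOOLS: dict[str, Callable[..., str]] = {
    "calendar.list": calendar_list,
    "bookings.create": bookings_create,
}

def scripted_model(transcript: list[str]) -> dict:
    """Stand-in for a language model: chooses the next step from what it has seen so far."""
    if not any("calendar.list" in line for line in transcript):
        return {"tool": "calendar.list", "args": {"day": "THU"}}
    if not any("bookings.create" in line for line in transcript):
        return {"tool": "bookings.create", "args": {"time": "18:30"}}
    return {"final": "Booked 18:30 after your last meeting."}

def run_agent(task: str, max_steps: int = 5) -> tuple[str, list[str]]:
    transcript = [f"user: {task}"]
    for _ in range(max_steps):
        step = scripted_model(transcript)                 # REASON: decide the next step
        if "final" in step:
            return step["final"], transcript              # done: no more tool calls
        result = TOOLS[step["tool"]](**step["args"])      # EXECUTE the requested TOOL CALL
        transcript.append(f"{step['tool']} -> {result}")  # OBSERVE: feed the result back
    return "stopped: step limit reached", transcript

if __name__ == "__main__":
    answer, log = run_agent("What's on my calendar Thursday? Book a table.")
    print(answer)
    for line in log:
        print(line)
```

Note that the "brain" never touches the calendar itself. It only names an action; the loop around it does the acting, and the step limit keeps a confused model from looping forever.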

Then came a plumbing problem with big consequences. Every assistant needed a custom connector to every tool (calendars, CRMs, databases): with N assistants and M tools, an N×M explosion of one-off integrations. The fix was a standard: the Model Context Protocol (MCP), open-sourced by Anthropic in November 2024, a universal socket ('USB-C for AI') that lets any assistant use any tool that speaks it. Adoption tells the story: OpenAI and Google adopted it in 2025, it was donated to the Linux Foundation's new Agentic AI Foundation that December, and by early 2026 it counted tens of thousands of tool servers and SDK downloads in the tens of millions per month. Build a tool once; every agent can pick it up.
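The N×M point is easy to check with a few lines of arithmetic. This sketch assumes a simplified picture that is not spelled out in the lesson: without a standard, every assistant-tool pair needs its own connector; with a shared protocol, each assistant implements it once and each tool implements it once. The function names and the example numbers are made up for illustration.

```python
def custom_connectors(assistants: int, tools: int) -> int:
    """One bespoke integration per (assistant, tool) pair."""
    return assistants * tools

def standard_adapters(assistants: int, tools: int) -> int:
    """One protocol implementation per assistant plus one per tool."""
    return assistants + tools

if __name__ == "__main__":
    n, m = 5, 20  # illustrative numbers only
    print(custom_connectors(n, m), "one-off connectors without a standard")
    print(standard_adapters(n, m), "implementations with a shared protocol")
```

The gap widens as either side grows, which is why a boring standard can matter more than a clever integration.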

One piece of arithmetic separates agent hype from agent engineering: errors compound around the loop. A step that's 95% reliable sounds excellent, but run twenty such steps and, if each step fails independently, the chance everything went right is 0.95 multiplied by itself twenty times: about 36%. That's why agent quality in 2026 is less about brilliance and more about reliability: checkable intermediate results, recoverable errors, and a human gate on anything expensive to undo. And every tool an agent holds raises the stakes of the injection lesson from the Limits shelf: hands make the gullibility consequential.
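The compounding figure can be reproduced directly. This sketch assumes every step succeeds or fails independently with the same probability, which is a simplification; real agent steps are often correlated. The function names are illustrative.

```python
def chain_success(per_step: float, steps: int) -> float:
    """Chance that every step succeeds, assuming independent steps."""
    return per_step ** steps

def per_step_needed(target: float, steps: int) -> float:
    """Per-step reliability required to reach a target end-to-end success rate."""
    return target ** (1 / steps)

if __name__ == "__main__":
    print(f"{chain_success(0.95, 20):.1%} end-to-end at 95% per step over 20 steps")
    print(f"{per_step_needed(0.90, 20):.2%} per step needed for 90% end-to-end over 20 steps")
```

Run in reverse, the same arithmetic is sobering: under these assumptions, a twenty-step task needs roughly 99.5% reliability per step to finish correctly nine times out of ten. That is why checkpoints and recoverable errors tend to pay off more than a slightly cleverer model.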

05 · BACK TO THE OPENING

So the leap from apology to booked table wasn't the brain crossing an intelligence threshold — it was the jar getting hands, and then the hands getting a standard plug. Your prediction question resolves precisely: something else happened — a loop and a protocol. Which reframes every “can AI do this?” you'll hear at work: it's really two questions — can the model reason about it, and does the loop have the right, safe tools?

06 · TAKE THIS WITH YOU

Your rule — the good-delegation test: a task suits an agent when three things are true: success is checkable (you can verify the outcome cheaply), steps are reversible (or gated before the irreversible one), and the blast radius is bounded (the worst case is an annoyance, not a crisis). Booking dinner passes. Emailing your top 100 clients does not — yet.
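If it helps to see the rule as a checklist, here is one way to write it down. The Task fields mirror the three conditions above; the specific True/False values given to each example are assumptions made for illustration, since the lesson only says which task passes and which does not.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    checkable: bool             # can you verify the outcome cheaply?
    reversible_or_gated: bool   # steps can be undone, or a human gates the irreversible one
    bounded_blast_radius: bool  # worst case is an annoyance, not a crisis

def suits_agent(task: Task) -> bool:
    """The good-delegation test: all three conditions must hold."""
    return task.checkable and task.reversible_or_gated and task.bounded_blast_radius

# Illustrative judgments, not facts from the lesson.
dinner = Task("book dinner", checkable=True, reversible_or_gated=True, bounded_blast_radius=True)
clients = Task("email top 100 clients", checkable=True, reversible_or_gated=False, bounded_blast_radius=False)

if __name__ == "__main__":
    for task in (dinner, clients):
        print(task.name, "->", "delegate" if suits_agent(task) else "keep a human in charge")
```

The test is deliberately all-or-nothing: one failed condition is enough to keep a human in charge of that task.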

REFERENCES
  1. Anthropic — Introducing the Model Context Protocol
  2. Linux Foundation — Announcing the Agentic AI Foundation (MCP's new home)
  3. Anthropic — Building effective agents