● Agents pillar · Architecture-first · MCP-native · HITL by default

Agentic AI,
built to survive
contact with production.

Most agentic AI projects fail. Ours start architecture-first, run MCP-native by default, treat human-in-the-loop as policy rather than afterthought, and have evals written before the agent ships. We build the kind of agentic systems that survive the second incident.

Request a scoped proposal → Read the POV →

Why this pillar exists

The 40% Problem.

Gartner forecasts that more than 40% of agentic AI projects will be canceled by the end of 2027 — not because the models can't reason, but because organizations deploy agents before the data architecture, governance, and operating models around them are built. We exist to fix that ordering. The rest of this page is the how.

→ Read the full point of view · The 40% Problem

What is — and isn't — an agent.

An agentic system has three properties: it pursues a goal across multiple steps, it decides which tools to use at each step, and it has the latitude to recover when something goes wrong. Most of what's being marketed as agentic in 2026 doesn't meet that bar.

A chatbot with retrieval is not an agent. A workflow scripted in YAML that calls an LLM in one step is not an agent. An RPA bot rebranded as an "autonomous co-pilot" is not an agent. The Gartner term for this is agent washing — and it accounts for most of the projects that get canceled, because the agentic layer adds cost without adding capability.

A real agentic system is harder to build, riskier to deploy, and more valuable when it works. We build real agentic systems. We won't take engagements that ask us to ship agent-washed ones.

MCP-native by default.

The Model Context Protocol is the only agent protocol the frontier has agreed on. Anthropic introduced it in late 2024, donated it to the Agentic AI Foundation (under the Linux Foundation) in 2025, and by early 2026 every frontier vendor — OpenAI, Google, Microsoft, AWS — had shipped MCP support. 78% of enterprise AI teams now run at least one MCP-backed agent in production. 64% of large enterprises run their own custom internal MCP servers.

We design on MCP because it makes the agentic engineering job durable. The tool layer — the part you actually invest in — stays useful across every model and runtime rotation. The agent code becomes the cheap, swappable part. That sequencing is what an MCP-native engagement actually means.
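As a rough illustration of why the tool layer outlives the agent code, here is a minimal sketch of a tool registry that answers MCP-style `tools/list` and `tools/call` JSON-RPC messages. It is not a spec-complete MCP server (no transport, initialization handshake, or auth). The `lookup_order_status` tool, its data, and the `register` and `handle` helpers are hypothetical names invented for this example.

```python
from typing import Any, Callable

# Hypothetical registry: tool name -> (description, JSON Schema for inputs, handler).
TOOLS: dict[str, tuple[str, dict[str, Any], Callable[..., str]]] = {}


def register(name: str, description: str, input_schema: dict[str, Any]):
    """Decorator that exposes a plain function as a named tool."""
    def decorator(fn: Callable[..., str]) -> Callable[..., str]:
        TOOLS[name] = (description, input_schema, fn)
        return fn
    return decorator


@register(
    "lookup_order_status",
    "Return the status of an order by ID (illustrative data only).",
    {"type": "object",
     "properties": {"order_id": {"type": "string"}},
     "required": ["order_id"]},
)
def lookup_order_status(order_id: str) -> str:
    fake_db = {"A-100": "shipped", "A-101": "pending"}  # stand-in for a real system
    return fake_db.get(order_id, "unknown")


def _error(msg_id: Any, code: int, message: str) -> dict:
    return {"jsonrpc": "2.0", "id": msg_id, "error": {"code": code, "message": message}}


def handle(message: dict) -> dict:
    """Dispatch one JSON-RPC request. Nothing here depends on which model is calling."""
    msg_id = message.get("id")
    method = message.get("method")
    if method == "tools/list":
        tools = [{"name": n, "description": d, "inputSchema": s}
                 for n, (d, s, _) in TOOLS.items()]
        return {"jsonrpc": "2.0", "id": msg_id, "result": {"tools": tools}}
    if method == "tools/call":
        params = message.get("params", {})
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return _error(msg_id, -32602, "Unknown tool")
        text = tool[2](**params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
        return {"jsonrpc": "2.0", "id": msg_id, "result": result}
    return _error(msg_id, -32601, "Method not found")
```

Any agent runtime that can send these two messages can use the tool, so swapping the model or the framework leaves this layer untouched.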

The stack

The production stack — with opinions.

Every Datanation agentic engagement runs on the same opinionated technical scaffolding. Each choice is deliberate. Each is replaceable when the right reason appears — and "marketing" is not the right reason.

Agents production stack — six layers, with the default and the reason.
Layer	Default	Why
Runtime	Amazon Bedrock AgentCore (AWS) / Vertex AI Agent Builder (GCP)	Framework-agnostic; ships with memory, identity, OAuth-secured tool access via a managed MCP gateway — primitives we'd otherwise have to build ourselves.
Orchestration	LangGraph	Real production agentic workflows are state machines, not chains. LangGraph models that explicitly.
Durable execution	Temporal	When the agent needs to survive a crash mid-workflow, or run past a Lambda timeout, this is the answer.
Evals	Braintrust	Written before the agent ships, run on every change. Credible alternatives: Langfuse, Helicone.
Vector layer	pgvector → Pinecone / Weaviate at scale	Start with what you already run (Postgres). Migrate when scale or query patterns demand it.
Foundation models	GPT-5 · Claude Sonnet 4 · Gemini 2.5 (per-use-case model routing)	The right model is per use case, not house-wide.
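To make the "state machines, not chains" point in the Orchestration row concrete, here is a small sketch in plain Python. It deliberately does not use the LangGraph API; the `Step` states, `RunState`, `MAX_ATTEMPTS`, and the `act` callback are all hypothetical names chosen for illustration. The point is only that a failed action loops back to planning, and that repeated failure ends in an explicit escalation state rather than an exception.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable


class Step(Enum):
    PLAN = auto()
    ACT = auto()
    ESCALATE = auto()  # terminal: hand off to a human
    DONE = auto()      # terminal: goal reached


@dataclass
class RunState:
    goal: str
    attempts: int = 0
    history: list = field(default_factory=list)  # every state visited, in order


MAX_ATTEMPTS = 3  # assumed retry budget for this sketch


def run(goal: str, act: Callable[[str, int], bool]) -> RunState:
    """Drive the workflow; `act(goal, attempt)` returns True on success."""
    state = RunState(goal=goal)
    step = Step.PLAN
    while True:
        state.history.append(step)
        if step is Step.PLAN:
            step = Step.ACT
        elif step is Step.ACT:
            state.attempts += 1
            if act(goal, state.attempts):
                step = Step.DONE
            elif state.attempts < MAX_ATTEMPTS:
                step = Step.PLAN      # recover by replanning
            else:
                step = Step.ESCALATE  # retry budget exhausted
        else:
            return state              # DONE and ESCALATE are terminal
```

A linear chain has no edge from a failed action back to planning. Modeling the workflow as states and transitions makes recovery and escalation visible in the design, and makes the visited path (`history`) something that can be logged and evaluated.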

How we ship.

Every engagement ships with the same discipline — not as compliance paperwork, but as engineering practice:

A written human-in-the-loop policy that names which decisions the agent makes alone, which it escalates, and what the escalation surface is.
Model-version pinning per inference, so behavior is reproducible and any regression is traceable to the version that caused it.
Structured inference logging and drift dashboards, so quality regression is visible before it becomes a customer-facing failure.
A rollback runbook that has been tested at least once.
Evals as code — written before the agent ships, run on every change.
A capability map of which internal systems are exposed as MCP servers, with what authentication pattern.

None of this is exotic. Most of it is missing from the projects that get canceled.
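As one possible shape for the first three items above, the sketch below expresses a human-in-the-loop policy as data and attaches a pinned model version to every structured log record. The action names, the `PINNED_MODEL` identifier, and the record fields are assumptions made for this example, not a prescribed schema.

```python
import json
from datetime import datetime, timezone

# Hypothetical policy table: which actions the agent takes alone, which it escalates.
HITL_POLICY = {
    "draft_reply": "autonomous",
    "issue_refund": "escalate",
}
DEFAULT_DISPOSITION = "escalate"  # anything the policy does not name goes to a human

# Hypothetical pinned version ID; the point is that it is never a floating alias.
PINNED_MODEL = "example-model-2026-01-15"


def decide(action: str) -> str:
    """Return 'autonomous' or 'escalate' for an action, defaulting to escalation."""
    return HITL_POLICY.get(action, DEFAULT_DISPOSITION)


def log_record(action: str, prompt: str, output: str) -> str:
    """Build one structured log line for a single inference."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "model_version": PINNED_MODEL,  # ties any regression to an exact version
        "action": action,
        "disposition": decide(action),
        "prompt_chars": len(prompt),
        "output_chars": len(output),
    }
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    print(log_record("issue_refund", "Customer asks for a refund.", "Escalating to support."))
```

Defaulting unknown actions to escalation means a new tool cannot silently become autonomous; someone has to add it to the policy on purpose.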

How to start

Agent Readiness Assessment.

Two weeks. Fixed scope. No prerequisites.

A 10-working-day engagement that opens with the right question: is an agent the right answer here? About a third of these assessments conclude that the right next step isn't an agent at all; it's six to twelve weeks of data architecture work first.

Deliverable: a use-case scoring matrix · a production-architecture sketch · a written 90-day deployment plan.

Request a scoped proposal →

What we build beyond the sprint

The work that follows.

Once the architecture is clear, the engagement shapes around the work:

Agent platform builds — production-grade multi-agent systems on AgentCore + LangGraph, with full eval substrate and rollback discipline.
Custom internal MCP servers — the asset that compounds across every future model and framework rotation.
Eval & observability infrastructure — for teams that shipped agents before the eval substrate caught up.
Agentic system rescues — projects in the 40% failure category that can be saved. Honest answer up front about whether yours is one of them.

What we don't do.

A short list, because saying no is the cheapest insurance policy a buyer of agentic AI can take out in 2026:

We don't take agentic engagements without an architecture review first.
We don't reframe chatbots or RPA scripts as "agents" to make a deal feel bigger.
We don't start build engagements with fewer than 8 weeks of runway, because nothing that matters can be built in 4.
We don't ship without evals and a tested rollback runbook.
We don't pretend to be model-vendor-neutral while quietly steering every project to one vendor.

Engage

Two engineers. One honest answer.

A 30-minute briefing with the people who'd actually do the work. If we're not the right partner, we'll tell you — and probably tell you who is.