Retail Brand Comms · Internal AI
They'd been pitched a 12-week UI build. We put their team in production in days.
A 20-person mid-market retail brand comms agency had its product knowledge scattered across Google Docs, an internal wiki, and a shared drive nobody trusted. Three other shops had pitched custom UIs on twelve-week timelines. We argued for the opposite: defer the UI, ship the smallest surface that gets the team using it, and earn the right to build something bigger later.
In one sentence
A three-tier internal Claude rollout — Claude.ai Project for fast access, MCP server in Docker for tool-grounded queries, custom UI deliberately deferred — that put a 20-person team in production in days while preserving the path to Phase 2.
The Problem
The AI was supposed to relieve the bottleneck. It was becoming one.
A 20-person brand comms team kept its sales collateral, product taxonomy, and customer-onboarding playbook in a mix of Google Docs, an internal wiki, and a shared drive. None of it was reachable by an off-the-shelf chatbot without manual copy-paste. So the team did what teams do — they fell back on Slack. Every non-trivial question routed through the same two senior strategists. That was already the bottleneck. A generic AI assistant didn't fix it; it widened the gap between the people who knew where things lived and the people who didn't.
By the time the buyer reached out, three other shops had pitched. All three proposed the same thing: a custom UI on a twelve-week timeline. The CEO put it bluntly — he didn't want to spend a quarter watching slide decks, he wanted his team using something next week.
I don't want to spend a quarter watching slide decks. I want my team using something next week.
— The CEO, during the scoping call
The proposals weren't wrong on the technology. They were wrong on the order.
What he needed wasn't a platform. He needed a rollout — grounded in the team's actual knowledge, hosted directly against the Anthropic API (no third-party AI layer sitting between his data and the model), and shaped so the first useful version landed in days, not quarters.
The Approach
Defer the UI. Ship the smallest surface that gets the team using it.
The contrarian move was telling the buyer not to buy the most expensive thing first. A custom UI is the artifact that looks the most like “a product” — which is why agencies pitch it — but it's also the artifact that takes the longest to ship and the one whose design is hardest to get right without watching real users. Build the UI first and you commit to a surface before you know how the team actually asks questions.
So we proposed three concentric tiers, with the surface ordered by speed-to-value: Claude.ai Project at L1 for in-production-next-week access, an MCP server at L2 for tool-grounded queries against internal data, and the custom UI at L3 — deferred until usage patterns in L1 and L2 told us what to build. The buyer was already paying for Claude.ai seats. We just had to make them useful.
Key insight
Defer the custom UI. Start the team in production next week on the smallest surface that grounds Claude in their knowledge — earn the right to build something bigger only if usage patterns justify it.
L1: Claude.ai Project — the surface they already had
A Claude.ai Project loaded with the team’s product taxonomy and sales collateral. No infrastructure. No new login. Live the same week the SOW was signed. The team starts asking real questions; we start watching real usage.
L2: MCP server in Docker — one implementation, three surfaces
A containerized MCP server exposes tools for knowledge-base lookup, sales-collateral retrieval, and onboarding-playbook queries. The same tool definitions run inside Claude Desktop today and behind a custom UI tomorrow. We are not building three integrations — we are building one.
L3: Custom UI — deferred on purpose, not by accident
Claude Desktop is the interim L2 surface and it is genuinely good. If usage patterns prove a custom UI earns its build cost, Phase 2 ships it on top of the same MCP server. If they don’t, the buyer keeps the budget.
Direct Anthropic API — no Bedrock unless a compliance reason demands it
The reflex to route through Bedrock or a third-party AI gateway costs latency, observability, and dollars without buying anything most mid-market teams actually need. Direct API was simpler, cheaper, and faster to ship. Bedrock stays as a Phase 2 swap if a future requirement forces it.
A single insight from scoping collapsed the data problem
During scoping, the buyer’s EA noted that sales collateral is product-knowledge-based, not customer-specific — one clarification that collapsed the data-grounding complexity by an order of magnitude. Product knowledge is one corpus; per-customer context is a different problem we did not need to solve in Phase 1.
Implementation
What shipped, and why each piece was shaped the way it was.
Milestone 1A — Project space + knowledge ingestion
The L1 Project was loaded with the team's product taxonomy and the highest-traffic sales collateral first, not everything. Ingest-everything is the instinct; it's also the path to a Project that hallucinates because retrieval drowns in stale drafts. We started narrow on purpose: ship the pages the team actually opens, watch which queries hit dead ends, then expand. The first usable version was live before any custom infrastructure existed.
Milestone 1B — MCP server in a Docker container
The MCP server in Docker exposes tools for knowledge-base lookup, sales-collateral retrieval, and onboarding-playbook queries. Containerized so the same image runs on a small VM in Phase 1 and lifts to Cloud Run in Phase 2 with no rewrite. Tools are scoped narrowly and named for the question they answer, not the table they query — the model picks the right tool by name far more reliably when names describe intent.
Before and after, in one workflow
Before: a marketing manager opens four Google Docs to assemble a competitive brief, pings the senior strategist three times to verify a claim, and the senior strategist becomes the rate limiter for the whole team's output. After: the manager asks the Claude.ai Project, gets a grounded answer in roughly thirty seconds with attribution back to the source doc, and only escalates the cases that genuinely need senior judgment. If she needs a tool-grounded query — pulling a specific record, not a passage — the same MCP server backs Claude Desktop with no UI switch.
Anthropic API direct — and what that buys
One API contract. No middleware that re-implements rate limits badly, no third-party vendor in the data path, no per-token markup on top of list price. When the next Claude release lands, the team gets it the day it ships, not the quarter their gateway vendor decides to support it.
Deployment + buyer-trust discipline — Phase 2, pre-architected
Cloud Run migration, custom UI, and a retainer framed as a service bundle, not 20 reserved hours — proactive maintenance, SLA, included engagement hours, quarterly review — are all designed into the Phase 1 architecture so Phase 2 is a step, not a rebuild.
Results
What changed for the team.
In production in days, not a quarter
The team was using the L1 surface against real product knowledge before any custom UI had been designed — the outcome the CEO had asked for, on the timeline he had asked for.
One MCP server, three surfaces
The same tool definitions back Claude Desktop today and a custom UI tomorrow. One implementation to maintain, not three integrations to babysit.
The senior-strategist bottleneck cooled
Slack questions that used to escalate landed on the Project instead. The strategists kept the questions that genuinely needed their judgment and stopped acting as a search index for the rest of the team.
No middleware tax
Direct Anthropic API, not Bedrock, not a third-party AI gateway. Internal data stays under one API contract; new Claude releases land the day they ship.
Phase 2 is a step, not a pivot
Cloud Run migration, custom UI, and the retainer service bundle are pre-architected. Whether the buyer triggers any of them is a usage decision, not a re-platforming decision.
Engagement artifact
The rollout architecture deck, on request.
Three-tier rollout deck
The implementation repo is private. The architecture deck — three-tier design, MCP server pattern, deferred-UI rationale, retainer service-bundle structure — we share on request. Useful if your team is sizing a similar internal Claude deployment and wants the decision tree before the first SOW conversation.
Takeaway
If you take one thing from this case study, take this: most internal AI rollouts don't fail at the model — they fail at the surface. Pick the smallest surface that wins — the one that gets the team using it — ground it in the knowledge they actually need, and earn the right to build something bigger. The custom UI is sometimes the right answer. It's almost never the right first answer.
Stuck between three 12-week custom-UI proposals?
Let's pick the right tier shape before you commit to the wrong one.
30 minutes. We'll walk through the L1/L2/L3 decision tree, the MCP server pattern, and the deferred-UI rationale. You'll leave knowing whether you actually need a custom build — or whether your team can be in production next week.
Book a 30-min call