Cloudflare's AI Stack Fuels 93% Engineer Adoption: Inside the Platform Powering 241 Billion Tokens Monthly

Ninety-three percent of Cloudflare’s R&D organization now uses AI coding tools. In just 11 months, the company rolled out a stack that pushes 241 billion tokens through its AI Gateway each month, with 51.83 billion processed on Workers AI. Merge requests jumped from 5,600 a week to over 8,700, peaking at 10,952 in March 2026. This isn’t hype. It’s measurable output from a stack built entirely on products Cloudflare ships to customers.

The push started with the iMARS tiger team (Internal MCP Agent/Server Rollout Squad), since folded into the Dev Productivity group, which oversees CI/CD, builds, and automation. They rethought code review, onboarding, even repository standards. Every merge request passes through an AI code reviewer. No exceptions.

Picture this. 3,683 active users. That’s 60% of the company, 93% of R&D. 47.95 million AI requests in the last 30 days. 295 teams plugged in. OpenCode alone tallied 27.08 million messages. Windsurf added 434,900. And it’s all gated through Cloudflare Access for zero-trust authentication.

AI Gateway sits at the core. It routes requests, tracks costs, handles bring-your-own-keys, and enforces zero data retention. Last month: 20.18 million requests, 241.37 billion tokens. OpenCode AI Gateway alone: 688,460 requests a day, 10.57 billion tokens daily. Providers break down to frontier labs (OpenAI, Anthropic, Google) at 91.16% of requests. Workers AI takes 8.84%, but it’s climbing fast. Why? Kimi K2.5, Moonshot AI’s open model, which landed on Workers AI in March 2026 with 256k context, tool calling, and structured outputs. It processes 7 billion tokens a day at 77% less cost than proprietary rivals.

Workers AI runs serverless inference on global GPUs. No cross-cloud latency. Last month it took in 51.47 billion input tokens and produced 361.12 million output tokens. Kimi handles lightweight tasks like CI docs review and AGENTS.md generation. The setup? A single proxy Worker. Engineers run one command, opencode auth login https://opencode.internal.domain, and the client fetches config from a .well-known endpoint: a JSON document with auth details, providers, MCP servers, agents, and permissions. Auth via cloudflared yields a signed JWT. Boom. You’re in.
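What might that discovery document look like on the client side? The sketch below is a guess at the shape, not Cloudflare's actual schema: the five section names mirror the list in the article, but every field inside them, and every URL other than opencode.internal.domain, is hypothetical. The .well-known path is not given in the source, so the sample is inlined rather than fetched.

```python
import json

# Hypothetical discovery document. The source only says the JSON carries
# auth details, providers, MCP servers, agents, and permissions; every
# nested field and every path below is an illustrative assumption.
SAMPLE_DISCOVERY = """
{
  "auth": {"method": "cloudflared", "token_type": "jwt"},
  "providers": {"gateway": {"base_url": "https://opencode.internal.domain/proxy"}},
  "mcp_servers": {"portal": {"url": "https://mcp.internal.domain"}},
  "agents": {"reviewer": {"model": "example-model"}},
  "permissions": {"bash": "ask"}
}
"""

REQUIRED_SECTIONS = ("auth", "providers", "mcp_servers", "agents", "permissions")


def load_discovery(raw: str) -> dict:
    """Parse the discovery JSON and fail loudly if a section is absent."""
    doc = json.loads(raw)
    missing = [name for name in REQUIRED_SECTIONS if name not in doc]
    if missing:
        raise ValueError(f"discovery document missing sections: {missing}")
    return doc


config = load_discovery(SAMPLE_DISCOVERY)
```

Checking for required sections up front means a half-deployed config fails at login rather than halfway through an agent session.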

That proxy? A Hono app. It serves shared config compiled at deploy time and proxies to AI Gateway: it strips auth headers, then adds cf-aig-authorization and metadata carrying an anonymous UUID. No buffering. Hourly cron jobs pull model catalogs from models.dev, cache them in KV, and tag them with zero-retention flags. Tracking stays anonymous. Each email maps to a UUID in D1 or KV, and the Gateway sees only the UUID.
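The header rewrite is simple enough to sketch. The real proxy is a Hono app on Workers; this is a plain-Python illustration of the same idea, with loud assumptions: the set of stripped headers is a guess, the metadata is assumed to travel in a cf-aig-metadata header as JSON, and an in-memory dict stands in for D1 or KV.

```python
import json
import uuid

# In-memory stand-in for the D1/KV table that maps emails to UUIDs.
_email_to_uuid: dict[str, str] = {}

# Assumption: which inbound headers count as "auth headers" is not stated
# in the source. These three are illustrative.
STRIPPED_HEADERS = {"authorization", "cookie", "cf-access-jwt-assertion"}


def anonymous_id(email: str) -> str:
    """Return a stable random UUID for an email, creating it on first use."""
    key = email.strip().lower()
    if key not in _email_to_uuid:
        _email_to_uuid[key] = str(uuid.uuid4())
    return _email_to_uuid[key]


def rewrite_headers(incoming: dict[str, str], email: str, gateway_token: str) -> dict[str, str]:
    """Drop user credentials, then attach gateway auth and anonymous metadata."""
    outgoing = {
        name: value
        for name, value in incoming.items()
        if name.lower() not in STRIPPED_HEADERS
    }
    outgoing["cf-aig-authorization"] = f"Bearer {gateway_token}"
    # Assumption: metadata is sent as a JSON object in cf-aig-metadata.
    outgoing["cf-aig-metadata"] = json.dumps({"user": anonymous_id(email)})
    return outgoing
```

Because the UUID is random rather than derived from the email, nothing downstream of the proxy can reverse it without access to the mapping table.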

Config lives as code. Agents and commands are written in Markdown with YAML frontmatter, then compiled to JSON and schema-validated. And MCP? That’s Model Context Protocol: 13 production servers exposing 182+ tools. Backstage. GitLab. Jira. Sentry. All aggregated in one portal behind a single OAuth endpoint via Cloudflare Access. It’s built on McpAgent from the Agents SDK plus workers-oauth-provider, and lives in the monorepo with Bazel CI/CD.
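The Markdown-to-JSON compile step can be pictured like this. It is a minimal sketch, not the real build: the frontmatter parser handles only flat key: value lines (a production pipeline would use a full YAML parser), and the required keys name and description are assumed for illustration.

```python
import json

# Assumption: the real schema is not published; these keys are illustrative.
REQUIRED_KEYS = ("name", "description")


def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a Markdown file into flat frontmatter fields and the body."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("document does not start with '---'")
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        raise ValueError("frontmatter is never closed")
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {line!r}")
        meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


def compile_agent(text: str) -> str:
    """Validate the frontmatter and emit the agent definition as JSON."""
    meta, body = split_frontmatter(text)
    missing = [key for key in REQUIRED_KEYS if not meta.get(key)]
    if missing:
        raise ValueError(f"frontmatter missing required keys: {missing}")
    return json.dumps({**meta, "prompt": body}, sort_keys=True)


SAMPLE_AGENT = """---
name: docs-reviewer
description: Reviews documentation changes
---
Check that every public function has a docstring.
"""

compiled = json.loads(compile_agent(SAMPLE_AGENT))
```

Validating at compile time means a malformed agent file breaks the deploy, not an engineer's session.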

But token bloat hit hard. 15,000 tokens just for 34 GitLab tools, or 7.5% of a 200k context. Solution: Code Mode. The portal collapses its tools into two, portal_codemode_search and portal_codemode_execute, so the tool count can grow without exploding schemas. Agent-generated code runs sandboxed in Dynamic Workers.
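Here is the schema-collapsing idea in miniature, along with the 7.5% arithmetic. The sketch shows only the shape: two fixed entry points in front of any number of tools. It does not run agent-generated code and has no sandbox, which the real Code Mode does via Dynamic Workers. The ToolPortal class and the registered tools are hypothetical.

```python
from typing import Any, Callable


def context_share(schema_tokens: int, context_window: int) -> float:
    """Fraction of the context window consumed by tool schemas."""
    return schema_tokens / context_window


class ToolPortal:
    """Hypothetical registry that exposes two entry points, however many tools it holds."""

    EXPOSED = ("portal_codemode_search", "portal_codemode_execute")

    def __init__(self) -> None:
        self._tools: dict[str, tuple[str, Callable[..., Any]]] = {}

    def register(self, name: str, description: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = (description, fn)

    def exposed_tools(self) -> tuple[str, ...]:
        # The model's context only ever carries these two schemas.
        return self.EXPOSED

    def search(self, query: str) -> list[dict[str, str]]:
        """Return name and description for tools matching the query."""
        q = query.lower()
        return [
            {"name": name, "description": desc}
            for name, (desc, _) in sorted(self._tools.items())
            if q in name.lower() or q in desc.lower()
        ]

    def execute(self, name: str, **kwargs: Any) -> Any:
        """Dispatch to a registered tool by name (no sandboxing in this sketch)."""
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name][1](**kwargs)


portal = ToolPortal()
portal.register("gitlab_list_merge_requests", "List open merge requests for a project",
                lambda project: [f"{project}!1", f"{project}!2"])
portal.register("jira_get_issue", "Fetch a Jira issue by key", lambda key: {"key": key})
```

Registering a 35th or a 350th tool changes what search returns, but not what the model has to carry in its context.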

Knowledge layer next. Backstage, the open-source catalog, self-hosted. It tracks 2,055 services, 167 libraries, 122 packages, 228 APIs, 544 systems across 45 domains. 1,302 databases, 277 ClickHouse tables, 173 clusters. Dependency graphs everywhere. Its MCP server exposes 13 tools for agents to query ownership, APIs, Tech Insights.

Then AGENTS.md. One per repo. Details runtime, test commands, linting, navigation, conventions, boundaries. Auto-generated at scale. Pipeline pulls Backstage metadata, analyzes repos, maps to Engineering Codex standards. Model drafts it. Opens a merge request for review. 3,900 repos processed. Updates trigger on changes flagged by the AI reviewer.
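To make the file's shape concrete, here is a sketch of the skeleton such a pipeline might assemble before a model fills in the prose. The section names come from the list above; the render_agents_md function, the Markdown layout, and the TODO placeholder are all assumptions, and the real pipeline has a model write the draft rather than a template.

```python
# Section names follow the article's list of what AGENTS.md details.
SECTIONS = ("Runtime", "Test commands", "Linting", "Navigation", "Conventions", "Boundaries")


def render_agents_md(repo: str, facts: dict[str, list[str]]) -> str:
    """Render a skeleton AGENTS.md, marking sections with no known facts."""
    lines = [f"# AGENTS.md for {repo}", ""]
    for section in SECTIONS:
        lines.append(f"## {section}")
        entries = facts.get(section, [])
        if entries:
            lines.extend(f"- {entry}" for entry in entries)
        else:
            # Hypothetical placeholder so reviewers can spot gaps in the MR.
            lines.append("- TODO: add details")
        lines.append("")
    return "\n".join(lines)


draft = render_agents_md("example-service", {"Runtime": ["Python 3.12"]})
```

Leaving explicit gaps rather than guessing fits the review step: the merge request shows a human exactly what the pipeline could not determine.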

Enforcement seals it. AI Code Reviewer in every GitLab CI pipeline. Runs on merge request open or update. A multi-agent coordinator assigns each change a risk tier (trivial, lite, or full) and delegates to specialists: quality, security, codex, docs, performance, release. They hit AI Gateway, read Codex rules, parse AGENTS.md. Then they post structured comments, categorized by type and ranked by severity: Critical, Important, Suggestion, Nits. Rule citations carry across iterations.
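The coordinator's routing can be sketched in a few lines. Tier names, specialist names, and severity levels come from the article. Everything else is assumed for illustration: the line-count thresholds, the docs-only shortcut, and which specialists run at each tier. The real reviewer presumably weighs far more than diff size.

```python
from dataclasses import dataclass

SPECIALISTS = ("quality", "security", "codex", "docs", "performance", "release")
SEVERITY_ORDER = ("Critical", "Important", "Suggestion", "Nits")

# Assumption: the tier-to-specialist mapping is illustrative only.
PLAN = {
    "trivial": ("docs",),
    "lite": ("quality", "security", "codex"),
    "full": SPECIALISTS,
}


@dataclass(frozen=True)
class MergeRequest:
    changed_lines: int
    touches_docs_only: bool = False


@dataclass(frozen=True)
class Finding:
    specialist: str
    severity: str
    message: str


def assess_risk(mr: MergeRequest) -> str:
    """Assign a risk tier using hypothetical size thresholds."""
    if mr.touches_docs_only or mr.changed_lines <= 10:
        return "trivial"
    if mr.changed_lines <= 200:
        return "lite"
    return "full"


def specialists_for(mr: MergeRequest) -> tuple[str, ...]:
    """Pick the specialist agents that should review this change."""
    return PLAN[assess_risk(mr)]


def sort_findings(findings: list[Finding]) -> list[Finding]:
    """Order findings from Critical down to Nits."""
    return sorted(findings, key=lambda f: SEVERITY_ORDER.index(f.severity))
```

Tiering keeps the expensive full review for changes that warrant it, which matters when every merge request triggers a run.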

Workers AI powers 15% of the reviewer’s traffic, with Kimi on docs. Frontier models like Opus and GPT-5.4 take the tough stuff. Last 30 days: 5.47 million requests, 24.77 billion tokens. Engineering Codex? Distilled rules like ‘If X, use Y.’ Includes AI-specific prompts. Audited for compliance.

Cloudflare announced this stack on April 20, 2026, via their company blog. Their X account echoed it: “20 million requests routed through AI Gateway, 241 billion tokens processed, and inference running on Workers AI, serving more than 3,683 internal users.” Engineer Rajesh Bh added: “93% R&D adoption in 11 months… Every MR gets an AI code review. Built on the products we ship to customers.”

Recent moves tie in. During Agents Week in April 2026, they shipped Sandbox SDK to general availability. Durable Objects and Agents SDK enable stateful, long-running sessions. Workflows scaled 10x. Kimi K2.5 hit Workers AI in March, as noted in a separate blog post shared on X. Agents Week centered on agentic AI: autonomous systems, plus the tooling and workflows around them.

Background agents loom next. On-demand cloud agents with MCP portal access, git ops, testing. Clone repos in isolated Sandbox environments. Build. Test. Open MRs. All durable.

This stack proves dogfooding at scale. Cloudflare eats its own inference, routing, auth. Others talk agents. Cloudflare ships them internally first. Velocity shows it. 93% adoption doesn’t lie.

Challenges? Token efficiency. Schema explosion. Attribution without PII. They solved them. Others will copy. But Cloudflare’s edge: everything runs on their global network. No vendor lock beyond their own products.
