China’s Code Surge: Closing the Gap on U.S. AI Agents

American AI coding tools from companies like Anthropic and Replit have set a high bar, churning out functional apps in minutes and fueling a 60% year-over-year surge in new app releases on the iOS App Store, according to data from a16z New Media. This boom echoes the early days of the iPhone SDK, when rapid development drew users and revenue. Yet Chinese rivals such as Zhipu AI, with its GLM-4.7 model, are matching these feats, prompting questions about the durability of U.S. leads in agentic coding.

CNBC tested Zhipu’s new coding tool against U.S. counterparts and found it built a tracker for major Chinese public companies faster than Replit or Claude Code, though with rougher edges. “The user base of Zhipu GLM Coding Plan is primarily concentrated in the United States and China,” Zhipu noted, with demand so high it imposed access limits. This traction defies U.S. developer biases against Chinese models, as confirmed by AI builders.

Frontier Tools Reshape Development

Zhipu’s GLM-4.7 shines on benchmarks like SWE-Bench at 73.8%, up 5.8 points from GLM-4.6, and 66.7% on its multilingual variant, per Zhipu AI’s technical report. It supports “thinking before acting” in frameworks like Claude Code and Cline, delivering stable multi-step execution. LiveCodeBench V6 scores reached 84.8 for GLM-4.7, topping Claude 4.5 Sonnet in some evaluations, as reported on Reddit’s r/singularity.

Replit’s “Mobile Apps on Replit” feature lets users turn natural-language prompts into monetizable iOS apps via Stripe integration, outpacing larger players like OpenAI in accessibility, according to CNBC. Claude Code hit $1 billion in annualized revenue in six months, while Cursor reached a $29.3 billion valuation. Still, vulnerabilities persist; a Tenzai study flagged security risks in apps from Replit and Claude Code.

Google DeepMind CEO Demis Hassabis told CNBC that Chinese models trail U.S. ones by “a matter of months,” closer than the prior one-to-two-year gap. Epoch AI’s analysis shows a 7-month average lag since 2023 on the Epoch Capabilities Index, with DeepSeek-R1 and Qwen models narrowing the gap but not yet topping OpenAI’s o3.

Benchmarks Reveal Tight Race

IQuest Lab’s IQuest Coder V1, released January 1, 2026, by Chinese hedge fund Ubiquant, hit 81.4% on SWE-Bench Verified with its 40B model, edging Claude Sonnet 4.5’s 81.3% and topping GPT 5.1 Mini’s 77.5%, per DEV Community. Its Code-Flow training on commit histories enables evolution-aware coding, and the lineup includes variants such as Thinking, which targets algorithmic problems via reinforcement learning.

Open-source appeal bolsters Chinese advances. GLM-4.7 integrates seamlessly into tools like Roo Code, offering Claude-level output at under $3 monthly. MIT Technology Review predicts more Silicon Valley apps will run on Chinese open models in 2026, with release lags shrinking to weeks. Reuters reports Chinese researchers like Zhipu founder Tang Jie embracing high-risk innovation despite chip constraints.

Baseten CEO Tuhin Srivastava, whose firm raised funds with Nvidia backing, sees GLM-4.7 as a “DeepSeek-like breakout,” per CNBC. On X, developers note Chinese coders favor U.S. tools like Claude Code despite local options, but global adoption grows.

Efficiency Trumps Scale

China is optimizing for efficiency amid U.S. chip export controls. DeepSeek’s Sparse Attention cuts compute costs by 50% without performance hits, enabling agent data generation. The Council on Foreign Relations warns that 2026 export-control changes could accelerate China’s AI power by two to three years. Xi Jinping hailed 2025 AI and chip gains in his New Year address, per Euronews.

U.S. strengths lie in ecosystems and innovations like the transformer, Hassabis noted, but Chinese open-weight models like Qwen lower barriers worldwide. DW.com highlights China’s “AI Plus” push for industry integration. Capital Economics sees China matching U.S. rollout speed via state support.

X posts from @afrazhaowang describe Shanghai developers sticking with Claude despite free local alternatives, while @HeLiuLeo observes China leading in models and the U.S. leading in applications. This split favors hybrid futures.

Geopolitical Stakes Rise

The Atlantic Council forecasts that China’s open-source push will shape global infrastructure, with U.S. firms already using its LLMs. CSIS notes that China’s targeted policies aid commercial models for quicker economic wins. NBR.org emphasizes parity in the adoption of coding assistants, with AI writing more code than engineers.

Alibaba scientist Yao Shunyu gives China under 20% odds of leading in three to five years because of chip constraints, per SCMP, yet benchmarks show the gap narrowing. Agentic commerce is projected to reach trillions of dollars by 2030, per McKinsey.

Zhipu’s Hong Kong IPO and Baseten’s enterprise moves signal maturation. As CNBC asks, if openly available Chinese agents match the power of U.S. tools at lower cost, what moats remain for those tools?
