Anthropic Removes Secret Code Blocking Queries on Chinese AI Rivals

Anthropic has decided to remove a section of code from its Claude models that was designed to detect and block queries related to Chinese artificial intelligence companies. The change, first reported by The Register, marks a shift in how the San Francisco-based company handles competitive intelligence within its safety and alignment systems.

The code in question operated quietly in the background of Claude’s reasoning process. When users asked questions that appeared aimed at extracting information about the inner workings of models from firms such as DeepSeek, Moonshot, or Baichuan, the system would trigger a refusal. These refusals often came with explanations that the query violated policies on competitive intelligence or trade secrets. Anthropic had embedded this logic without drawing public attention to it, treating the protection of its own model details as a core safety concern on par with preventing harmful content.

Engineers discovered the mechanism through careful examination of Claude’s chain-of-thought traces, which occasionally revealed the hidden instructions. The code instructed the model to watch for references to specific Chinese organizations and to respond with standardized deflections. In some cases, even indirect questions about model architecture or training methods triggered the filter if the system linked them to a competitor’s identity. This approach reflected a broader industry pattern in which American AI developers have grown wary of rapid progress coming out of Chinese laboratories, especially after several Chinese models demonstrated capabilities that surprised Western observers.

The removal of this covert filter follows months of internal discussion at Anthropic about the effectiveness and appropriateness of such measures. Company representatives indicated that the decision stemmed from a reassessment of how these blocks actually performed in practice. In many instances, determined users could bypass the restrictions through careful phrasing or by avoiding direct company names. At the same time, the filter sometimes activated on innocent queries from researchers simply interested in comparative AI performance. Maintaining the code required constant updates to keep pace with new organizations and evolving query patterns, creating an ongoing maintenance burden.
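The two failure modes described above, evasion through rephrasing and false positives on innocent questions, are typical of any name-based filter. The following Python sketch is purely illustrative and is not Anthropic's code: the function name, the term lists, and the matching rule are all hypothetical, chosen only to show why such a check is brittle.

```python
import re

# Hypothetical watch list; the names come from the article, the logic does not.
WATCHED_NAMES = ("deepseek", "moonshot", "baichuan")
# Hypothetical list of technical topics the filter treats as sensitive.
TECHNICAL_TERMS = ("architecture", "training", "weights")


def is_blocked(query: str) -> bool:
    """Return True when a query names a watched organization AND a technical topic."""
    text = query.lower()
    name_hit = any(re.search(rf"\b{re.escape(name)}\b", text) for name in WATCHED_NAMES)
    topic_hit = any(term in text for term in TECHNICAL_TERMS)
    return name_hit and topic_hit


# A direct question is caught.
direct = "Describe the DeepSeek architecture in detail"
# The same question without the company name slips through.
rephrased = "Describe the architecture of the Hangzhou lab's latest model"
# An innocent benchmarking question is refused anyway.
innocent = "How does DeepSeek compare with Claude on public training-efficiency benchmarks?"

for query in (direct, rephrased, innocent):
    print(is_blocked(query), "-", query)
```

Under these assumptions, widening the term lists catches more rephrasings but also more innocent queries, which is the maintenance burden the article describes.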

Anthropic’s choice also aligns with a growing recognition across the sector that model information is becoming harder to keep secret. Once a system is released to the public, users can interact with it extensively, run benchmarks, and apply various analysis techniques to infer architectural details. Papers published by Chinese teams have already reverse-engineered aspects of leading American models with notable accuracy. In this environment, hard-coded refusals may offer limited protection while complicating normal research conversations.

The episode highlights tensions between commercial secrecy and the open exchange of ideas that has historically accelerated AI development. For years, leading laboratories maintained strict controls over training data, model weights, and system prompts. Yet as capabilities spread and the number of well-funded players increases globally, the boundary between protected knowledge and public science has grown increasingly porous. Anthropic appears to have concluded that blocking queries tied to specific countries or organizations no longer justifies the effort, particularly when similar information can be obtained through other channels.

Observers have pointed out that the filter’s focus on Chinese entities rather than all foreign competitors raised questions about consistency. Companies in Europe, Canada, and Israel have also produced advanced models, yet the code specifically targeted organizations based in China. This geographic emphasis mirrored wider concerns in Washington about technology transfer and national security implications. Lawmakers and intelligence officials have repeatedly warned that certain foreign governments encourage their domestic AI firms to gather information on American breakthroughs. By removing the code, Anthropic may be signaling that it prefers to address such risks through other means, such as improved monitoring of large-scale data scraping or stronger legal protections for proprietary information.

The timing of the change coincides with several notable developments in the global AI race. Chinese laboratories have released models that perform strongly on standard benchmarks, sometimes matching or exceeding Claude in specific domains. At the same time, export controls on advanced chips have complicated hardware access for researchers in China, leading many to focus intensely on algorithmic efficiency and novel training methods. This combination of constraints and ingenuity has produced innovations that Western companies now study closely. Rather than attempting to wall off discussion of these advances, Anthropic seems to be moving toward a posture that acknowledges the reality of global competition.

From a technical standpoint, eliminating the competitive intelligence filter simplifies Claude’s instruction hierarchy. Safety systems in large language models rely on layered directives that tell the model which behaviors to prioritize. Every additional rule increases complexity and creates potential conflicts with other goals, such as being helpful to users or providing accurate technical information. By stripping away this particular constraint, Anthropic reduces the chance that legitimate questions about model design or optimization techniques will be incorrectly refused. The company has long emphasized the importance of making its models more truthful and less prone to unnecessary refusals, and this adjustment fits within that broader effort.

Users who have experimented with Claude since the update report fewer abrupt blocks when discussing international AI developments. Questions about architectural similarities between different models or comparisons of training approaches now receive more substantive answers, though the system still respects boundaries around direct extraction of proprietary code or internal training details. This more balanced response pattern may encourage greater academic and industry collaboration, allowing researchers to build on each other’s findings rather than working in isolated silos.

The decision has sparked mixed reactions within the AI community. Some security experts argue that removing any defensive measure against targeted information gathering leaves American companies more exposed. They point to documented cases where state-affiliated actors have used public model interfaces to probe for weaknesses or extract training heuristics. Others counter that such concerns, while valid, are better addressed through network-level protections, employee screening, and careful design of application programming interfaces rather than through brittle prompt-based filters that users can often evade.

Anthropic has not provided extensive public commentary on the rationale behind the removal, citing the sensitive nature of competitive strategy. However, statements from executives in recent months have stressed the need for the company to focus its safety resources on higher-priority risks, including potential misuse of powerful models for biological or cyber threats. In this hierarchy of concerns, blocking queries about rival AI labs appears to have fallen in priority. The company continues to maintain strong safeguards against genuinely dangerous requests while allowing more open discussion of technical topics that were previously restricted.

This development occurs against a backdrop of increasing transparency initiatives across the industry. Several organizations have begun publishing more details about their model architectures, training procedures, and evaluation methods. Although full disclosure of weights and data remains rare, the trend points toward greater openness as a means of building trust and accelerating collective progress. Anthropic’s removal of the competitor-detection code can be seen as one small step in that direction, even if motivated primarily by practical considerations rather than ideological commitment to openness.

Looking ahead, the episode raises broader questions about how AI companies will protect their intellectual property as the technology matures. Traditional software businesses have relied on copyrights, patents, and trade secret laws to safeguard their advantages. AI models present unique challenges because much of their value emerges from patterns learned during training rather than explicit code. Once those patterns can be inferred through interaction, legal protections become difficult to enforce. Companies may increasingly turn to usage monitoring, rate limiting, and sophisticated watermarking techniques to detect when their models are being systematically analyzed for competitive gain.

For Anthropic specifically, the change suggests a maturing approach to competition. Founded with a strong emphasis on safety and alignment, the company has sometimes been perceived as more cautious than its peers. The quiet elimination of this particular safeguard indicates a willingness to adapt policies based on real-world performance rather than theoretical risks. It also reflects confidence that Claude’s underlying capabilities and the team’s ongoing research provide sufficient differentiation even when technical discussions flow more freely.

The broader AI community will likely watch closely to see whether other developers follow Anthropic’s lead. OpenAI, Google DeepMind, and Meta have each implemented various forms of competitive protection in their systems, though details remain scarce. If the removal proves successful in reducing unnecessary refusals without leading to significant information leakage, it could encourage a wider relaxation of similar filters. Such a shift would contribute to a more vibrant exchange of ideas across borders, potentially speeding up innovation while requiring all players to remain vigilant about genuine security threats.

In the meantime, researchers and developers can expect more productive conversations with Claude about international AI advancements. The removal of the specialized code allows the model to engage more naturally with questions that span multiple organizations and research traditions. While competitive pressures in artificial intelligence remain intense, Anthropic’s adjustment suggests that some barriers erected during the early rush of development may no longer serve their original purpose. As the field continues to expand, finding the right balance between protecting legitimate secrets and fostering scientific dialogue will remain an ongoing challenge for every major laboratory.
