OpenAI Unveils Its Web Crawler Bot and Provides Instruction on How to Block It

OpenAI has unveiled its own web crawler bot, GPTBot, and has provided web admins with the means to block it if they want to.
Written by Staff

    AI training methods have become a hot topic, with the industry still trying to figure out the legality and ethics of training AI models on data from the internet. OpenAI is addressing those concerns head-on by giving web admins the ability to block GPTBot. The company's documentation describes the crawler and the opt-out as follows:

    Usage

    Web pages crawled with the GPTBot user agent may potentially be used to improve future models and are filtered to remove sources that require paywall access, are known to gather personally identifiable information (PII), or have text that violates our policies. Allowing GPTBot to access your site can help AI models become more accurate and improve their general capabilities and safety. Below, we also share how to disallow GPTBot from accessing your site.

    Disallowing GPTBot

    To block GPTBot from accessing your site, add the following rule to your site’s robots.txt:

    User-agent: GPTBot
    Disallow: /
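
For admins who want to confirm the rule behaves as intended before deploying it, here is a minimal sketch using Python's standard-library robots.txt parser. The helper name `is_allowed` and the `example.com` URL are illustrative assumptions, not part of OpenAI's instructions, and the check only shows how a compliant parser reads the rule, not how GPTBot itself behaves.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule from the article, one directive per line.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Hypothetical helper: report whether robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain used only for illustration.
    for agent in ("GPTBot", "Googlebot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, "https://example.com/any/page"))
```

With this rule in place, the parser reports GPTBot as blocked from every path, while crawlers that are not named in the file remain allowed.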
