TechDogs-"Why Are Businesses Looking To Block ChatGPT-Creator OpenAI’s Latest Endeavor?"

By TD NewsDesk

Updated on Mon, Oct 2, 2023

Sharing is caring.  

We’ve all heard this phrase growing up, right? It’s even a principle that most businesses look to adopt in the products and services they provide.  

One company looking to instantly ease people’s content, research and language-gap problems is OpenAI, with its revolutionary product – ChatGPT. 

This tool, which instantly generates text, also managed to instantly generate users: it reached one million in under a week (5 days, to be exact) and 100 million in just 2 months, according to a report.

Even businesses went gaga over this tool and its capabilities, with 49% of companies using ChatGPT and 30% conveying plans to do so in the future, according to a study published in February 2023. 

However, two recent announcements made by OpenAI have people and businesses excited and concerned, respectively.  

The first one sees ChatGPT gain internet browsing capabilities to provide up-to-date results (yay!). 

The second one, which we’ll dive into now, is making businesses nervous and prompting them to take firm measures.  

So, what’s it all about? Let’s explore: 

What Did OpenAI Announce? 

  • On August 7, OpenAI announced its own web crawler – GPTBot. 

  • Data and information collected by the crawler could be used to improve future models of OpenAI’s chatbot. 

  • On the company’s website, OpenAI states that pages crawled by GPTBot are filtered to remove sources that “require paywall access”, “are known to gather personally identifiable information (PII)” or possess text that violates the company’s policies. 

  • According to OpenAI, “Allowing GPTBot to access your site can help AI models become more accurate and improve their general capabilities and safety.” 

  • The company also explains how to stop the crawler from accessing a site, either completely or partially. 

  • Essentially, a website that wishes to block the crawler completely needs only to add a rule to its “robots.txt” file that disallows the whole site for the GPTBot user agent, whereas for a partial block, the site can add the GPTBot token to the same file with rules that allow some directories and disallow others.  

  • Robots.txt is a decades-old method through which a site can tell a crawler to ignore it. 

  • While OpenAI has two separate user agents for web crawling and user browsing, they clarified that blocking one agent will result in blocking both, as the company’s opt-out system currently treats both user agents the same. 

    A Screenshot Showing The Method Through Which A Website Can Disallow OpenAI's GPTBot
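
For readers who want to try this themselves, here is a minimal sketch in Python of what a complete block and a partial block might look like, and how to check them. It assumes the rules follow standard robots.txt syntax. The directory names, the example.com URLs and the helper name can_gptbot_fetch are made up for illustration. The check relies on urllib.robotparser from Python's standard library, which only tells you what a well-behaved crawler should do with these rules.

```python
from urllib.robotparser import RobotFileParser

# Complete block: GPTBot is told not to fetch anything on the site.
FULL_BLOCK = """\
User-agent: GPTBot
Disallow: /
"""

# Partial block: the two directory names are hypothetical placeholders.
PARTIAL_BLOCK = """\
User-agent: GPTBot
Allow: /blog/
Disallow: /members/
"""


def can_gptbot_fetch(robots_txt: str, url: str) -> bool:
    """Return True if the given robots.txt text lets GPTBot fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("GPTBot", url)


if __name__ == "__main__":
    for label, rules in (("full", FULL_BLOCK), ("partial", PARTIAL_BLOCK)):
        for path in ("/blog/post", "/members/area"):
            url = "https://example.com" + path
            print(label, path, can_gptbot_fetch(rules, url))
```

With the complete block, the helper refuses every page. With the partial block, it refuses only pages under the disallowed directory.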

How Did Businesses Respond?  

  • In the initial few days after OpenAI released its web crawler, it was reported that around 69 of the top 1,000 most popular websites had blocked it.  

  • This amounted to at least 15% of the top 100 and 7% of the top 1,000 websites, and included sites like Amazon, NYTimes, Shutterstock, CNN, wikiHow, Tumblr and more. 

  • Now, as per the latest study dated September 22, 26% of the top 100 sites are blocking the web crawler. 

  • Essentially, in the past month, the number of popular websites blocking it has gone up by a whopping 250%. 

  • Reportedly, the number of top sites blocking GPTBot now sits at 250 and includes names such as Pinterest, Vimeo, GrubHub, Indeed, Apartments, Reuters, The Guardian, Live Science, USA Today, NPR, CBS News, CBS Sports, NBC News, CNBC, The New Yorker, People and more.

As more popular websites block the web crawler designed to help grow the most famous and widely used GenAI (generative artificial intelligence) chatbot, concerns are being raised. 

For one, such AI chatbots rely on accurate data to improve their performance, and most of the information used to train them comes from the internet. So, in depriving OpenAI of accurate content that can be used for training, are popular websites inadvertently reducing the output quality that OpenAI’s chatbots can achieve? 

What do you think about the response from websites in blocking GPTBot?  

Let us know in the comments below! 

First published on Mon, Oct 2, 2023
