OpenAI Rolls Out GPTBot for Enhanced Web Crawling

By Chimamanda U. Martha

The launch of GPTBot comes a few weeks after OpenAI filed a trademark application for “GPT-5”.

OpenAI, a US-based artificial intelligence (AI) company, has introduced GPTBot, a web crawling tool designed to improve future AI models such as those behind ChatGPT. According to the company’s blog post, the tool is a web crawler: a program that navigates the internet and collects website content. Unlike the crawlers behind traditional search engines such as Google and Bing, which index pages to serve search results, GPTBot gathers data to improve the accuracy, versatility, and depth of upcoming AI models.

GPTBot’s primary mission is to gather publicly available data from online sources across the globe. The company said it is committed to ethical data collection and has equipped the crawler to exclude sources that require payment to access, collect personal information, or violate its content guidelines.

Website Owners Can Control the Use of GPTBot

The company also said in its announcement that it had built GPTBot to respect the autonomy of website owners. Alongside GPTBot, OpenAI introduced a set of measures that allow website administrators to control how the crawler interacts with their sites.

Website operators can block the crawler from accessing their content by adding a “disallow” directive to robots.txt, the standard file on their servers that tells crawlers which pages they may visit. The move gives site owners control over whether their content is collected.
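
As a rough illustration of the opt-out described above, the sketch below uses Python’s standard urllib.robotparser module to check how a robots.txt file with a “disallow” rule for the GPTBot user agent would be interpreted. The robots.txt contents, the example.com URL, and the “ExampleBot” name are hypothetical and chosen only for illustration. Python’s parser is one implementation of the robots.txt convention, and OpenAI’s crawler may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site: block GPTBot from the
# whole site while leaving every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def can_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/1"  # illustrative URL only
    print("GPTBot allowed:", can_fetch("GPTBot", url))
    print("ExampleBot allowed:", can_fetch("ExampleBot", url))
```

A site owner who wanted to restrict only part of a site could replace “Disallow: /” with a narrower path such as “Disallow: /private/”. Compliance with robots.txt is voluntary on the crawler’s side, which is why the article describes this as an opt-out rather than an enforced block.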

As mentioned earlier, GPTBot was designed to filter the data it collects, removing personally identifiable information (PII) and content that violates OpenAI’s policies. However, according to Decrypt, some technology ethicists on Hacker News argued that the opt-out approach still raises concerns about consent and privacy protection.

The report noted that some users defended OpenAI’s decision, asserting that comprehensive data gathering is necessary to build capable AI tools in the future. One user emphasized that the GPT models need current data so that their knowledge does not remain stuck in September 2021.

“They still need current data, or their GPT models will be stuck in September 2021 forever,” the user said.

Another privacy-focused tech enthusiast criticized OpenAI for not adequately citing its sources, arguing that this obscures the derivative nature of its work.

OpenAI Files Trademark Application for GPT-5 Model

Meanwhile, the launch of GPTBot comes a few weeks after OpenAI filed a trademark application for “GPT-5”. The application, submitted to the US Patent and Trademark Office (USPTO) on July 18, suggests a potential successor to GPT-4.

However, filing a trademark application does not guarantee an imminent product launch. In June, the company’s CEO Sam Altman said the firm was “nowhere close to beginning training on the GPT-5 model”. He further noted that multiple safety audits must be completed before training begins.

The company recently rolled out an Android version of ChatGPT to increase user adoption. The app was initially made available to users in India, the US, Bangladesh, and Brazil, with plans to expand to other regions in the future.
