Search engine provider Perplexity AI is accused of acting like “North Korean hackers” after the company’s bots were found crawling websites with anti-scraping rules in place.
The accusation comes from Cloudflare, an internet infrastructure provider that’s developed safeguards to prevent AI companies from scraping data from third-party websites. On Monday, Cloudflare CEO Matthew Prince blasted Perplexity AI for invasive web crawling. (The AI company has also been found scraping data from media websites.)
“Some supposedly ‘reputable’ AI companies act more like North Korean hackers. Time to name, shame, and hard block them,” Prince tweeted.
Cloudflare conducted an investigation that allegedly found Perplexity AI “repeatedly modifying” its own web-crawling bots to evade anti-scraping measures on third-party websites.
In response, Cloudflare has delisted Perplexity AI as a “verified bot,” lumping the company’s web crawlers in with other untrusted activity, which could make it harder for Perplexity to index content. In addition, Cloudflare updated its own systems to block the “stealth crawling” from Perplexity AI.
Perplexity AI didn’t immediately respond to a request for comment. But the crackdown risks undermining its AI-powered search engine, which has also been flagged for scraping news websites in violation of their rules, without asking for permission or paying for a license.
“Today, over two and a half million websites have chosen to completely disallow AI training through our managed robots.txt feature or our managed rule blocking AI Crawlers,” Cloudflare says.
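For readers unfamiliar with robots.txt, here is a minimal Python sketch of how a well-behaved crawler would check such a rule before fetching a page. The bot name `ExampleAIBot`, the sample rules, and the URL are hypothetical placeholders, not Perplexity’s or Cloudflare’s actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"  # placeholder URL
    print(is_allowed(ROBOTS_TXT, "ExampleAIBot", url))
    print(is_allowed(ROBOTS_TXT, "Mozilla/5.0", url))
```

The check depends entirely on the crawler honestly identifying itself: the same request sent under a generic browser user-agent matches the catch-all rule instead, which is why robots.txt works only as a voluntary signal.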
Cloudflare flagged the alleged web-scraping after receiving complaints from customers, who were specifically blocking Perplexity’s bots from indexing their sites. Cloudflare then verified the claims by creating several test domains that were supposed to be deliberately hidden from search engines, but Perplexity AI still found a way to crawl them.
“We observed that Perplexity uses not only their declared user-agent, but also a generic browser intended to impersonate Google Chrome on macOS when their declared crawler was blocked,” the company found. In addition, the web crawler used multiple IP addresses outside of Perplexity’s official IP range, rotating through them if the data scraping was blocked.
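To illustrate the two signals Cloudflare describes, the following Python sketch classifies a request by its user-agent string and source IP address. It is a simplified illustration, not Cloudflare’s detection method; the token `ExampleAIBot` and the `192.0.2.0/24` range (a reserved documentation range) are assumptions standing in for a crawler’s published values.

```python
import ipaddress

# Hypothetical values standing in for a crawler's published identity.
DECLARED_UA_TOKEN = "ExampleAIBot"
OFFICIAL_RANGES = [ipaddress.ip_network("192.0.2.0/24")]


def classify(user_agent: str, remote_ip: str) -> str:
    """Label a request using only its user-agent and source IP."""
    ip = ipaddress.ip_address(remote_ip)
    in_range = any(ip in net for net in OFFICIAL_RANGES)
    declared = DECLARED_UA_TOKEN.lower() in user_agent.lower()

    if declared and in_range:
        return "declared-crawler"
    if declared:
        return "declared-ua-outside-official-range"
    if in_range:
        return "undeclared-ua-from-official-range"
    return "unattributed"


if __name__ == "__main__":
    chrome_ua = (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
    )
    print(classify("Mozilla/5.0 (compatible; ExampleAIBot/1.0)", "192.0.2.10"))
    print(classify(chrome_ua, "203.0.113.5"))
```

A request that presents a browser-style user-agent from an address outside the published range comes back as unattributed, so these two checks alone cannot tie it to any particular crawler.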
“This activity was observed across tens of thousands of domains and millions of requests per day,” Cloudflare added. “Of note: when the stealth crawler was successfully blocked, we observed that Perplexity uses other data sources — including other websites — to try to create an answer. However, these answers were less specific and lacked details from the original content, reflecting the fact that the block had been successful.”
The incident underscores the ongoing clash between AI programs’ insatiable demand for data and growing calls for AI companies to pay for the content they use. In response, some media companies have sued Perplexity AI and other providers, including OpenAI, for alleged copyright infringement.
In the meantime, Cloudflare anticipates Perplexity AI will update its web crawler to beat such anti-bot measures. The company adds that others, such as OpenAI, have been respecting the anti-scraping measures in place.

Disclosure: Ziff Davis, PCMag’s parent company, filed a lawsuit against OpenAI in April 2025, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.
About Michael Kan
Senior Reporter
