Advanced AI News

An AI Data Trap Catches Perplexity Impersonating Google

By Advanced AI Editor, August 5, 2025


If you want to succeed in AI, a good hack would be to impersonate Google. You just can’t get caught.

This is what just happened to Perplexity, a startup that competes with ChatGPT, Google’s Gemini, and other generative AI services.

Quality data is crucial for success in AI, but tech companies don’t want to pay for it, so they crawl the web and scrape information for free, often without permission. This has sparked a backlash from some content creators and others interested in preserving the incentives that built the web.

Cloudflare and its CEO, Matthew Prince, have stormed into this battle with new features that help websites block unwanted AI bot crawlers. Cloudflare is an infrastructure, security, and software company that helps run about 20% of the internet. It thrives when the web does well, hence its interest in helping sites get paid for content.

Some Cloudflare customers recently complained to the company that Perplexity was evading these blocks and continued to scrape and collect data without permission.

So, Cloudflare set a digital trap and caught this startup red-handed, according to a Monday blog post describing the escapade.

“Some supposedly ‘reputable’ AI companies act more like North Korean hackers,” Prince wrote on X on Monday. “Time to name, shame, and hard block them.”

Perplexity didn’t respond to a request for comment. 

The bait: Honeytrap domains and locked doors

Cloudflare created entirely new, unpublished websites and configured them with robots.txt files that explicitly blocked all crawlers — including Perplexity’s declared bots, PerplexityBot and Perplexity-User. These test sites had no public links, search engine entries, or metadata that would normally make them discoverable.
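The blocking rule described above can be sketched with Python's standard-library robots.txt parser. This is a minimal illustration, not Cloudflare's actual configuration: the robots.txt contents, the `is_allowed` helper, and the `trap.example` URL are assumptions made for the example. Only the two bot names come from the article.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an unpublished test site (an assumption, not
# Cloudflare's real file). It names the two declared Perplexity bots and
# also disallows every other crawler.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://trap.example/page.html"  # hypothetical test URL
    for bot in ("PerplexityBot", "Perplexity-User", "SomeOtherCrawler"):
        print(bot, "allowed:", is_allowed(ROBOTS_TXT, bot, page))
```

Note that robots.txt is advisory: it tells a crawler what the site owner wants but does nothing to enforce it, which is why the test depends on whether the crawler chooses to comply.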

Yet, when Cloudflare queried Perplexity’s AI with questions about these specific sites, the startup’s service responded with detailed information that could only have come from those restricted pages. The conclusion? Perplexity had accessed the content despite being clearly told not to.

The cloak: How Perplexity masked its crawl

Perplexity initially crawled these sites using its official user-agent string, complying with standard protocols. However, Cloudflare said it discovered that once blocked, Perplexity resorted to stealth tactics.

Cloudflare found that Perplexity began deploying undeclared crawlers disguised as normal web browsers and sending requests from unknown or rotated IP addresses and unofficial autonomous system numbers (ASNs), which are crucial identifiers that help route internet traffic efficiently.

When its official crawlers were blocked, Perplexity also used a generic web browser designed to impersonate Google’s Chrome browser on Apple Mac computers. (Business Insider asked Google whether it has told Perplexity to stop impersonating Chrome. Google did not respond.)
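To see why this kind of masking defeats simple filters, consider a block rule keyed only on the declared bot names. The sketch below is illustrative: the `blocked_by_user_agent` helper and both user-agent strings are assumptions written for this example, not values quoted from Cloudflare's findings.

```python
# Names of the declared bots, as given in the article.
DECLARED_BOT_TOKENS = ("PerplexityBot", "Perplexity-User")

# Illustrative user-agent strings (assumptions for this example).
DECLARED_UA = (
    "Mozilla/5.0 AppleWebKit/537.36 "
    "(KHTML, like Gecko; compatible; Perplexity-User/1.0)"
)
BROWSER_LIKE_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def blocked_by_user_agent(user_agent: str, tokens=DECLARED_BOT_TOKENS) -> bool:
    """Return True if the user-agent string contains a blocked bot name."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in tokens)


if __name__ == "__main__":
    print("declared bot blocked:", blocked_by_user_agent(DECLARED_UA))
    print("browser-like request blocked:", blocked_by_user_agent(BROWSER_LIKE_UA))
```

A request that presents itself as Chrome on macOS carries no bot name, so a name-based rule lets it through. Catching it requires other signals, such as the network a request comes from or how the client behaves.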

According to Cloudflare, Perplexity has been making millions of such “stealth” requests daily across tens of thousands of web domains.

This behavior not only violates web standards but also betrays the fundamental trust that underpins the functioning of the open web, Cloudflare explained.

The comparison: How OpenAI gets it right

To emphasize what good bot behavior looks like, Cloudflare compared Perplexity’s conduct to that of OpenAI’s crawlers, which scrape data for developing ChatGPT and giant AI models such as the upcoming GPT-5.

When OpenAI’s bots encountered a robots.txt file or a similar block, they simply backed off. No circumvention. No masking. No backdoor crawling, according to Cloudflare’s tests.

The fallout: De-verification and blocking

As a result of these findings, Cloudflare has de-listed Perplexity as a verified bot and rolled out new detection and blocking techniques across its network.

Cloudflare’s takedown serves as a cautionary tale in the AI arms race. While the web shifts toward stronger control over data access and usage, actors who flout these evolving norms may find themselves not just blocked, but publicly called out.

In an era where AI systems are hungry for training data, Cloudflare’s sting operation is a signal to startups and established players alike: Respect the rules of the web, or risk being exposed.

Reach out to me via email at abarr@businessinsider.com.


