Close Menu
  • Home
  • AI Models
    • DeepSeek
    • xAI
    • OpenAI
    • Meta AI Llama
    • Google DeepMind
    • Amazon AWS AI
    • Microsoft AI
    • Anthropic (Claude)
    • NVIDIA AI
    • IBM WatsonX Granite 3.1
    • Adobe Sensi
    • Hugging Face
    • Alibaba Cloud (Qwen)
    • Baidu (ERNIE)
    • C3 AI
    • DataRobot
    • Mistral AI
    • Moonshot AI (Kimi)
    • Google Gemma
    • xAI
    • Stability AI
    • H20.ai
  • AI Research
    • Allen Institue for AI
    • arXiv AI
    • Berkeley AI Research
    • CMU AI
    • Google Research
    • Microsoft Research
    • Meta AI Research
    • OpenAI Research
    • Stanford HAI
    • MIT CSAIL
    • Harvard AI
  • AI Funding & Startups
    • AI Funding Database
    • CBInsights AI
    • Crunchbase AI
    • Data Robot Blog
    • TechCrunch AI
    • VentureBeat AI
    • The Information AI
    • Sifted AI
    • WIRED AI
    • Fortune AI
    • PitchBook
    • TechRepublic
    • SiliconANGLE – Big Data
    • MIT News
    • Data Robot Blog
  • Expert Insights & Videos
    • Google DeepMind
    • Lex Fridman
    • Matt Wolfe AI
    • Yannic Kilcher
    • Two Minute Papers
    • AI Explained
    • TheAIEdge
    • Matt Wolfe AI
    • The TechLead
    • Andrew Ng
    • OpenAI
  • Expert Blogs
    • François Chollet
    • Gary Marcus
    • IBM
    • Jack Clark
    • Jeremy Howard
    • Melanie Mitchell
    • Andrew Ng
    • Andrej Karpathy
    • Sebastian Ruder
    • Rachel Thomas
    • IBM
  • AI Policy & Ethics
    • ACLU AI
    • AI Now Institute
    • Center for AI Safety
    • EFF AI
    • European Commission AI
    • Partnership on AI
    • Stanford HAI Policy
    • Mozilla Foundation AI
    • Future of Life Institute
    • Center for AI Safety
    • World Economic Forum AI
  • AI Tools & Product Releases
    • AI Assistants
    • AI for Recruitment
    • AI Search
    • Coding Assistants
    • Customer Service AI
    • Image Generation
    • Video Generation
    • Writing Tools
    • AI for Recruitment
    • Voice/Audio Generation
  • Industry Applications
    • Finance AI
    • Healthcare AI
    • Legal AI
    • Manufacturing AI
    • Media & Entertainment
    • Transportation AI
    • Education AI
    • Retail AI
    • Agriculture AI
    • Energy AI
  • AI Art & Entertainment
    • AI Art News Blog
    • Artvy Blog » AI Art Blog
    • Weird Wonderful AI Art Blog
    • The Chainsaw » AI Art
    • Artvy Blog » AI Art Blog
What's Hot

Standard Equipped with Audi’s Panoramic Assisted Driving, Starting at 235,900 Yuan_the_has_to

IBM Is Working on Combining Supercomputers with Quantum Computers

Moveworks Announces Partnership with Jamf to Transform Employee Support for Apple-First Organizations

Facebook X (Twitter) Instagram
Advanced AI News
  • Home
  • AI Models
    • OpenAI (GPT-4 / GPT-4o)
    • Anthropic (Claude 3)
    • Google DeepMind (Gemini)
    • Meta (LLaMA)
    • Cohere (Command R)
    • Amazon (Titan)
    • IBM (Watsonx)
    • Inflection AI (Pi)
  • AI Research
    • Allen Institue for AI
    • arXiv AI
    • Berkeley AI Research
    • CMU AI
    • Google Research
    • Meta AI Research
    • Microsoft Research
    • OpenAI Research
    • Stanford HAI
    • MIT CSAIL
    • Harvard AI
  • AI Funding
    • AI Funding Database
    • CBInsights AI
    • Crunchbase AI
    • Data Robot Blog
    • TechCrunch AI
    • VentureBeat AI
    • The Information AI
    • Sifted AI
    • WIRED AI
    • Fortune AI
    • PitchBook
    • TechRepublic
    • SiliconANGLE – Big Data
    • MIT News
    • Data Robot Blog
  • AI Experts
    • Google DeepMind
    • Lex Fridman
    • Meta AI Llama
    • Yannic Kilcher
    • Two Minute Papers
    • AI Explained
    • TheAIEdge
    • The TechLead
    • Matt Wolfe AI
    • Andrew Ng
    • OpenAI
    • Expert Blogs
      • François Chollet
      • Gary Marcus
      • IBM
      • Jack Clark
      • Jeremy Howard
      • Melanie Mitchell
      • Andrew Ng
      • Andrej Karpathy
      • Sebastian Ruder
      • Rachel Thomas
      • IBM
  • AI Tools
    • AI Assistants
    • AI for Recruitment
    • AI Search
    • Coding Assistants
    • Customer Service AI
  • AI Policy
    • ACLU AI
    • AI Now Institute
    • Center for AI Safety
  • Business AI
    • Advanced AI News Features
    • Finance AI
    • Healthcare AI
    • Education AI
    • Energy AI
    • Legal AI
LinkedIn Instagram YouTube Threads X (Twitter)
Advanced AI News
AI Search

understanding AI’s impact on content providers

By Advanced AI EditorJuly 3, 2025No Comments8 Mins Read
Share Facebook Twitter Pinterest Copy Link Telegram LinkedIn Tumblr Email
Share
Facebook Twitter LinkedIn Pinterest Email


Content publishers welcomed crawlers and bots from search engines because they helped drive traffic to their sites. The crawlers would see what was published on the site and surface that material to users searching for it. Site owners could monetize their material because those users still needed to click through to the page to access anything beyond a short title.

Artificial Intelligence (AI) bots also crawl the content of a site, but with an entirely different delivery model. These Large Language Models (LLMs) do their best to read the web to train a system that can repackage that content for the user, without the user ever needing to visit the original publication.

The AI applications might still try to cite the content, but we’ve found that very few users actually click through relative to how often the AI bot scrapes a given website. We have discussed this challenge in smaller settings, and today we are excited to publish our findings as a new metric shown on the AI Insights page on Cloudflare Radar.

Visitors to Cloudflare Radar can now review how often a given AI model sends traffic to a site relative to how often it crawls that site. We are sharing this analysis with a broad audience so that site owners can have better information to help them make decisions about which AI bots to allow or block and so that users can understand how AI usage in aggregate impacts Internet traffic.

How does this measurement work?

As HTML pages are arguably the most valuable content for these crawlers, the ratios displayed are calculated by dividing the total number of requests from relevant user agents associated with a given search or AI platform where the response was of Content-type: text/html by the total number of requests for HTML content where the Referer header contained a hostname associated with a given search or AI platform.

The diagrams below illustrate two common crawling scenarios, and show that companies may use different user agents depending on the purpose of the crawler. The top one represents a simple transaction where the example AI platform is requesting content for the purposes of training an LLM, representing itself as AIBot. The bottom one represents a scenario where the example AI platform is requesting content to service a user request — looking for flight information, for example. In this case, it is representing itself as AIBot-User. Request traffic from both of these user agents would be aggregated under a single platform name for the purposes of our analysis. 

When a user clicks on a link on a website or application, the client will often send a Referer: header as part of the request to the target site. In the diagram below, the example AI platform has returned content that contains links to external sites in response to a user interaction. When the user clicks on a link, a request is made to the content provider that includes ai.example.com in the Referer: header, letting them know where that request traffic came from. Hostnames are associated with their respective platforms for the purpose of our analysis.

The new metric is presented as a simple table, comparing the number of aggregate HTML page requests from crawlers (user agents) associated with a given platform to the number of HTML page requests from clients referred by a hostname associated with a given platform. The calculated ratio is always normalized to a single referral request.

The table below shows that for the period June 19-26, 2025, as an example, the ratios range from Anthropic’s 70,900:1 down to Mistral’s 0.1:1. This means that Anthropic’s AI platform Claude made nearly 71,000 HTML page requests for every HTML page referral, while Mistral sent 10x as many referrals as crawl requests. (However, traffic referred by Claude’s native app does not include a Referer: header, and we believe that the same holds true for traffic generated from other native apps as well. As such, because the referral counts only include traffic from the Web-based tools from these providers, these calculations may overstate the respective ratios, but it is unclear by how much.)

Of course, due in part to changes in crawling patterns, these ratios will change over time. The table above also displays the ratio changes as compared to the previous period, with changes ranging from increases of over 6% for DuckDuckGo and Yandex to Google’s 19.4% decrease. The week-over-week drop in Google’s ratio is related to an observed drop in crawling traffic from GoogleBot starting on June 24, while Yandex’s week-over-week growth is related to an observed increase in YandexBot crawling activity that started on June 21, as seen in the graphs below.

Radar’s Data Explorer includes a time series view of how these ratios change over time, such as in the Baidu example below. The time series data is also available through an API endpoint.

Patterns in referral traffic

Changes and trends in the underlying activity can be seen in the associated Data Explorer view, as well as in the raw data available via API endpoints (timeseries, summary). Note that the shares of both referral and crawl traffic are relative to the sets of referrers and crawlers included in the graphs, and not Cloudflare traffic overall.

For example, in the referrer-centric view below, covering nearly the first four weeks of June 2025, we can see that referral traffic is dominated by search platform Google, with a fairly consistent diurnal pattern visible in the data. (The google.* entry covers referral traffic from the main google.com site, as well as local sites, such as google.es or google.com.tw.) Because of prefetching driven by the use of speculation rules, referral traffic coming from Google’s ASN (AS15169) is specifically excluded from analysis here, as it doesn’t represent active user consumption of content.

Clear diurnal patterns are also visible in the referral request shares of other search platforms, although the request shares are a fraction of what is seen from Google. 

Throughout June, the share of traffic referred by AI platforms was significantly lower, even in aggregate, than the share of traffic referred by search platforms.

Changes in crawling traffic

As noted above, the change in ratio values over time can be driven by shifts in crawling activity. These shifts are visible in the crawling traffic shares available in Data Explorer, as well as in the raw data available via API endpoints (timeseries, summary). In the crawler-centric view below, covering nearly the first four weeks of June 2025, we can see that the share of requests related to Google’s crawling activity for both their Googlebot and GoogleOther identifiers falls over the course of the month, with several peak/valley periods. A similar pattern observed in HTTP request traffic from Google’s AS15169 during that same time period loosely matches this observed drop in share.

In addition, it appears that OpenAI’s GPTBot saw multiple periods where little-to-no crawling activity was observed throughout the month.

What this means for content providers

These ratios directly impact the viability of content publication on the Internet. While they will vary over time, the trend continues to be more crawls and fewer referrals when compared in relation to each other. Legacy search index crawlers would scan your content a couple of times, or less, for each visitor sent. A site’s availability to crawlers made their revenue model more viable, not less.

The new data we are observing suggests that is no longer the case. These models continue to consume more content, more frequently, despite sending the same or less traffic to the source of its content.

We have released new tools over the last year to help site owners take control back. With a single click, publishers can block the kinds of AI crawlers that train against their content. And today, we announced new ways to make the exchange of value fair for both sides of the equation. However, we continue to recommend that content creators audit and then enforce their preferred policies for AI crawlers.

In addition to providing these new insights around crawling and referral traffic and associated trends, we’ve also taken the opportunity to launch expanded Verified Bots content. The Bots page on Cloudflare Radar includes a paginated list of Verified Bots, displaying the bot name, owner, category, and rank (based on request volume). This list has now been expanded into a standalone directory in a new Bots section. The directory, shown below, displays a card for each Verified Bot, showing the bot name, a description, the bot owner and category, and verification status. Users can search the directory by bot name, owner, or description, and can also filter by category (selecting just Monitoring & Analytics bots, for example).

Clicking on a bot name within a card brings up a bot-specific page that includes metadata about the bot, information on how the bot’s user agent is represented in HTTP request headers and how it should be specified in robots.txt directives, and a traffic graph that shows associated HTTP request volume trends for the selected time period (with a default comparison to the previous period). Associated data is also available via the API. As we add additional information to these bot-specific pages in the future, we will document the updates in Changelog entries.



Source link

Follow on Google News Follow on Flipboard
Share. Facebook Twitter Pinterest LinkedIn Tumblr Email Copy Link
Previous ArticleChina’s Baidu Ramps Up AI Search, Unveils MuseSteamer AI Video Generator
Next Article Awesome Research For Everyone! – Two Minute Papers Channel Trailer
Advanced AI Editor
  • Website

Related Posts

Google defends AI summaries in search, says users want context over facts and 10 blue links

September 16, 2025

AI challenges the dominance of Google search

September 15, 2025

Apple AI and search executive Robby Walker may leave company

September 15, 2025
Leave A Reply

Latest Posts

Sylvester Stallone Owns Works by Warhol, Condo, and Other Art Stars

David Lynch’s Los Angeles Home and Studio on Sale for $15 M.

Picasso Inspires Name of Newly Discovered Microsnail

Rare Hieroglyphic Decree Identified in Egypt

Latest Posts

Standard Equipped with Audi’s Panoramic Assisted Driving, Starting at 235,900 Yuan_the_has_to

September 16, 2025

IBM Is Working on Combining Supercomputers with Quantum Computers

September 16, 2025

Moveworks Announces Partnership with Jamf to Transform Employee Support for Apple-First Organizations

September 16, 2025

Subscribe to News

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

Recent Posts

  • Standard Equipped with Audi’s Panoramic Assisted Driving, Starting at 235,900 Yuan_the_has_to
  • IBM Is Working on Combining Supercomputers with Quantum Computers
  • Moveworks Announces Partnership with Jamf to Transform Employee Support for Apple-First Organizations
  • Opus 2 – Litigation Doc AI Analysis – Artificial Lawyer
  • Sylvester Stallone Owns Works by Warhol, Condo, and Other Art Stars

Recent Comments

  1. im wettbüro des teufels on Accelerating Job Searches And Career Transitions
  2. Elvin on Cisco Unveils Foundation AI for Enhanced Security Integration
  3. szqiancong.com on China Using DeepSeek to Design Warplanes: Report
  4. روش انتخاب واحد پیام نور on 1-800-CHAT-GPT—12 Days of OpenAI: Day 10
  5. Wetten Quoten ErkläRung on A Bitter AI Lesson – Compute Reigns Supreme!

Welcome to Advanced AI News—your ultimate destination for the latest advancements, insights, and breakthroughs in artificial intelligence.

At Advanced AI News, we are passionate about keeping you informed on the cutting edge of AI technology, from groundbreaking research to emerging startups, expert insights, and real-world applications. Our mission is to deliver high-quality, up-to-date, and insightful content that empowers AI enthusiasts, professionals, and businesses to stay ahead in this fast-evolving field.

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

LinkedIn Instagram YouTube Threads X (Twitter)
  • Home
  • About Us
  • Advertise With Us
  • Contact Us
  • DMCA
  • Privacy Policy
  • Terms & Conditions
© 2025 advancedainews. Designed by advancedainews.

Type above and press Enter to search. Press Esc to cancel.