Saturday, January 11

Almost 90% of our AI spider traffic is from ByteDance

videobacks.net

This month, Fortune.com reported that ByteDance's scraper, called Bytespider, is aggressively scraping web content to fuel AI. We noticed the same thing when looking at the data produced by HAProxy Edge, the network that we ourselves use to serve traffic for haproxy.com. Some of the numbers we are seeing are fairly striking, so let's review the traffic sources and where they originate.

Our own statistics, gathered by HAProxy Edge and filtered to traffic for haproxy.com, reveal a number of interesting figures, chief among them that almost 90% of our AI spider traffic comes from ByteDance.

While Bytespider is currently the most common AI spider, which makes ByteDance the leading source, we have previously seen others (such as ClaudeBot) take the top spot. AI spider traffic, like other traffic, changes over time.
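As a rough illustration of how a per-spider share like this could be computed, here is a minimal sketch that groups requests by User-Agent substring. It is not how HAProxy Edge produces its figures; the function names, the simplified User-Agent strings, and the sample counts are all hypothetical, and only the two spiders named in this post are matched.

```python
from collections import Counter
from typing import Dict, List, Optional

# Substrings that identify AI spiders in a User-Agent header.
# Only the two bots named in this post are listed (an assumption;
# a real deployment would track many more).
AI_SPIDER_TOKENS = ("Bytespider", "ClaudeBot")


def classify(user_agent: str) -> Optional[str]:
    """Return the matching AI spider name, or None for other traffic."""
    lowered = user_agent.lower()
    for token in AI_SPIDER_TOKENS:
        if token.lower() in lowered:
            return token
    return None


def ai_spider_shares(user_agents: List[str]) -> Dict[str, float]:
    """Map each AI spider to its fraction of all AI spider requests."""
    counts: Counter = Counter()
    for ua in user_agents:
        name = classify(ua)
        if name is not None:
            counts[name] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}


# Hypothetical sample with simplified User-Agent strings,
# not real haproxy.com data.
sample = (
    ["Mozilla/5.0 (compatible; Bytespider)"] * 9
    + ["Mozilla/5.0 (compatible; ClaudeBot/1.0)"] * 1
    + ["Mozilla/5.0 (Windows NT 10.0) Firefox/133.0"] * 5
)
shares = ai_spider_shares(sample)
for name, share in sorted(shares.items()):
    print(f"{name}: {share:.0%} of AI spider requests")
```

Note that the shares are relative to AI spider requests only; the browser requests in the sample are ignored, which matches how the headline figure is phrased.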

What does AI traffic mean for us, and for you?

While content is not our primary business, we also consider ourselves to be a content business; we publish original, human-authored material that is valuable to our users and the wider community.

Content scrapers existed long before LLMs began crawling the web for AI training, and they have generally been considered bad for content-heavy sites. Many site owners would not welcome the scraping and possible reuse of their content, in whole or in part, by a third party.

AI spiders used by LLMs come with distinct risks and opportunities.

  1. On one hand, an LLM might reuse the original content in full, or with some changes, or remixed with other content at the level of an LLM token (roughly the level of a word). It is unlikely that a user would know where the original content came from. In cases where an LLM “hallucinates”, a user may get incorrect information, for instance when asking for instructions.

  2. On the other hand, with many users turning to AI as an alternative to search, this is becoming an important channel for brand and product visibility. Companies may want their brand or product information to be presented by chatbots in response to user queries. If a user asks for a list of relevant products, a company may want its product to be included in the list.

While we do not currently restrict AI spiders on our site, we will need to decide whether to continue to allow them or not. Other companies with content-heavy sites will likely find themselves needing to make the same choice: to restrict the scraping of their content, or to allow the dissemination of information about their brand and products through these AI tools.
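One common way a site can state such a choice is a robots.txt policy, which is advisory and depends on the spider choosing to honor it. The sketch below uses Python's standard `urllib.robotparser` to check what a hypothetical policy would permit; the policy text, the `is_allowed` helper, and the example.com URL are assumptions for illustration, not haproxy.com's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: disallow Bytespider, allow everyone else.
ROBOTS_TXT = """\
User-agent: Bytespider
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
# parse() accepts the file's lines directly, so no network access is needed.
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(agent: str, url: str = "https://www.example.com/blog/") -> bool:
    """Report whether the policy above permits `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)


for agent in ("Bytespider", "ClaudeBot"):
    verdict = "allowed" if is_allowed(agent) else "disallowed"
    print(f"{agent}: {verdict}")
```

Because robots.txt is only a request, a site that wants to enforce the decision would need to act on the traffic itself rather than rely on this file alone.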
