Top 10 Search Engine Crawler and Bot Names
There are many search bots. Here are ten of the most popular web crawlers.
1. GoogleBot
Googlebot is obviously one of the most popular web crawlers on the internet today as it is used to index content for Google’s search engine.
Googlebot Example in Robots.txt
This example is a little more granular in the instructions it defines. Here, the instructions apply only to Googlebot. More specifically, they tell Google not to crawl a specific page: your-page.html.
User-agent: Googlebot
Disallow: /no-index/your-page.html
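To see how a rule like this is interpreted, here is a minimal sketch using Python's standard urllib.robotparser module. The example.com URLs and the is_allowed helper are assumptions made for illustration; real crawlers may apply additional matching rules of their own.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule set from the example above.
RULES = """\
User-agent: Googlebot
Disallow: /no-index/your-page.html
"""


def is_allowed(user_agent: str, url: str, rules: str = RULES) -> bool:
    """Return True if `user_agent` may fetch `url` under `rules` (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder domain, not taken from the article.
blocked = is_allowed("Googlebot", "https://example.com/no-index/your-page.html")
other_page = is_allowed("Googlebot", "https://example.com/index.html")
other_bot = is_allowed("bingbot", "https://example.com/no-index/your-page.html")

print(blocked, other_page, other_bot)
```

Under these rules only Googlebot is kept away from your-page.html; other pages and other crawlers are unaffected.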
Besides its web search crawler, Google has several additional web crawlers:
Web Crawler | User-Agent String |
Googlebot News | Googlebot-News |
Googlebot Images | Googlebot-Image/1.0 |
Googlebot Video | Googlebot-Video/1.0 |
Google Adsense | Mediapartners-Google |
Google AdsBot (PPC landing page quality) | AdsBot-Google (+http://www.google.com/adsbot.html) |
Google app crawler (fetch resources for mobile) | AdsBot-Google-Mobile-Apps |
You can use the Fetch tool in Google Search Console to test how Google crawls or renders a URL on your site. See whether Googlebot can access a page on your site, how it renders the page, and whether any page resources are blocked to Googlebot.
Google+
Another one you might see pop up is Google+. It appears when a user shares a URL on Google+. This service is different from the Googlebot that crawls and indexes your site. Its requests do not honor robots.txt or other crawl mechanisms because they are user-initiated.
User-Agent
Google (+https://developers.google.com/+/web/snippet/)
2. Bingbot
Bingbot is a web crawler deployed by Microsoft in 2010 to supply information to their Bing search engine. It replaced what used to be the MSN bot.
User-Agent is bingbot
Bing also has a tool very similar to Google’s, called Fetch as Bingbot, within Bing Webmaster Tools.
3. Slurp Bot
Yahoo Search results come from the Yahoo web crawler Slurp and Bing’s web crawler, as a lot of Yahoo is now powered by Bing. Sites should allow Yahoo Slurp access in order to appear in Yahoo Mobile Search results.
Additionally, Slurp does the following:
- Collects content from partner sites for inclusion within sites like Yahoo News, Yahoo Finance and Yahoo Sports.
- Accesses pages from sites across the Web to confirm accuracy and improve Yahoo’s personalized content for its users.
User-Agent is Slurp
4. DuckDuckBot
DuckDuckBot is the web crawler for DuckDuckGo, a search engine that has become quite popular lately because it is known for privacy and for not tracking you. It now handles over 12 million queries per day. DuckDuckGo gets its results from over four hundred sources, including DuckDuckBot (their crawler) and crowd-sourced sites (Wikipedia). They also have more traditional links in the search results, which they source from Yahoo!, Yandex and Bing.
User-Agent is DuckDuckBot
5. Baiduspider
Baiduspider is the official name of the Chinese Baidu search engine’s web crawling spider. It crawls web pages and returns updates to the Baidu index. Baidu is the leading Chinese search engine, with an 80% share of the overall search engine market in mainland China.
User-Agent is Baiduspider
Web Crawler | User-Agent String |
Image Search | Baiduspider-image |
Video Search | Baiduspider-video |
News Search | Baiduspider-news |
Baidu wishlists | Baiduspider-favo |
Baidu Union | Baiduspider-cpro |
Business Search | Baiduspider-ads |
Other search pages | Baiduspider |
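Because the general Baiduspider token is a prefix of every specialized token in this table, a naive substring check would label all of them as "Other search pages". The sketch below tries the longest tokens first to avoid that. The classify_baidu helper is hypothetical, and the sample User-Agent strings passed to it are illustrative assumptions, not strings taken from Baidu's documentation.

```python
from typing import Optional

# Tokens and labels copied from the table above.
BAIDU_TOKENS = {
    "Baiduspider-image": "Image Search",
    "Baiduspider-video": "Video Search",
    "Baiduspider-news": "News Search",
    "Baiduspider-favo": "Baidu wishlists",
    "Baiduspider-cpro": "Baidu Union",
    "Baiduspider-ads": "Business Search",
    "Baiduspider": "Other search pages",
}


def classify_baidu(user_agent: str) -> Optional[str]:
    """Map a User-Agent header to a Baidu crawler label, or None if no token matches."""
    lowered = user_agent.lower()
    # Check longer tokens first so "Baiduspider-image" wins over plain "Baiduspider".
    for token in sorted(BAIDU_TOKENS, key=len, reverse=True):
        if token.lower() in lowered:
            return BAIDU_TOKENS[token]
    return None


# Illustrative header values (assumed for this example).
print(classify_baidu("Baiduspider-image"))
print(classify_baidu("Mozilla/5.0 (compatible; Baiduspider/2.0)"))
print(classify_baidu("Mozilla/5.0 (Windows NT 10.0)"))
```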
6. Yandex Bot
YandexBot is the web crawler for one of the largest Russian search engines, Yandex. According to LiveInternet, for the three months ended December 31, 2015, Yandex generated 57.3% of all search traffic in Russia.
User-Agent is YandexBot
8. Exabot
Exabot is a web crawler for Exalead, a search engine based in France. Exalead was founded in 2000 and currently has more than 16 billion pages indexed.
User-Agent is Exabot
9. Facebook External Hit
One of Facebook’s crawling bots is Facebot, which is designed to help improve advertising performance.
User-Agent is facebot
10. Alexa Crawler
ia_archiver is the web crawler for Amazon’s Alexa internet rankings. As you probably know, Alexa collects information to show rankings for both local and international sites.
User-Agent is ia_archiver
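The User-Agent tokens listed in this article can be used to get a rough picture of which crawlers visit a site. The following is a minimal sketch, assuming you already have the User-Agent header values from your server logs as a list of strings; the count_crawler_hits function and the sample headers are hypothetical. Keep in mind that a User-Agent header can be spoofed, so a match is a hint, not proof.

```python
from collections import Counter
from typing import Iterable

# Main tokens taken from the "User-Agent is ..." lines in this article.
CRAWLER_TOKENS = [
    "Googlebot", "bingbot", "Slurp", "DuckDuckBot", "Baiduspider",
    "YandexBot", "Exabot", "facebot", "ia_archiver",
]


def count_crawler_hits(user_agents: Iterable[str]) -> Counter:
    """Count requests per crawler using case-insensitive substring matching."""
    hits = Counter()
    for user_agent in user_agents:
        lowered = user_agent.lower()
        for token in CRAWLER_TOKENS:
            if token.lower() in lowered:
                hits[token] += 1
                break  # count each request once
    return hits


# Sample header values, assumed for illustration.
sample = [
    "Mozilla/5.0 (compatible; Googlebot/2.1)",
    "Googlebot-Image/1.0",
    "Mozilla/5.0 (compatible; bingbot/2.0)",
    "Mozilla/5.0 (Windows NT 10.0) Firefox/100.0",
]
hits = count_crawler_hits(sample)
print(hits)
```

Specialized crawlers such as Googlebot-Image are counted under the main Googlebot token here; split them out if you need per-crawler detail.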