Sogou Web Spider
| Vendor | Sogou (Tencent-owned) |
| Type | Search crawler (China) |
| robots.txt token | Sogou web spider |
| JavaScript rendering | Minimal |
| Honors robots.txt | Yes (per vendor docs; some reports of aggressive crawling) |
User-Agent strings
Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)
Variants:
Sogou News Spider/4.0
Sogou Pic Spider/3.0
Sogou Orion Spider/3.0
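The strings above share a common shape, so a server-side check can be sketched with a single pattern. This is a minimal sketch, not an official detection method: the function name `classify_sogou` and the regular expression are assumptions made for illustration, and since User-Agent headers can be spoofed, a match is only a hint.

```python
import re
from typing import Optional

# Matches the four variants listed above, case-insensitively.
# The pattern is an assumption derived from those strings only.
SOGOU_UA = re.compile(r"Sogou (web|News|Pic|Orion) spider/", re.IGNORECASE)


def classify_sogou(user_agent: str) -> Optional[str]:
    """Return the lowercased variant name ('web', 'news', 'pic', 'orion'),
    or None when the string does not look like a Sogou spider."""
    match = SOGOU_UA.search(user_agent)
    return match.group(1).lower() if match else None


if __name__ == "__main__":
    samples = [
        "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)",
        "Sogou News Spider/4.0",
        "Sogou Pic Spider/3.0",
        "Sogou Orion Spider/3.0",
        "Mozilla/5.0 (X11; Linux x86_64)",
    ]
    for ua in samples:
        print(classify_sogou(ua), "<-", ua)
```

Returning the variant name rather than a boolean makes it possible to log or throttle each spider type separately.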
Purpose
Sogou web spider is the crawler behind Sogou Search, a Chinese search engine owned by Tencent. Its market share is smaller than Baidu's but still non-trivial in China.
Quirks
- Webmasters have reported aggressive crawl patterns; a Crawl-delay directive is recommended for sites that experience load issues.
- Multiple spider variants for different content types.
- The robots.txt token contains a space (Sogou web spider), which is unusual. The space is required.
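Because the token contains a space, tooling that reads or generates robots.txt has to treat everything after the colon as one value. The sketch below illustrates this with a hypothetical helper, `user_agent_token`, which is not taken from any particular library, and contrasts it with a naive whitespace split that truncates the token.

```python
from typing import Optional


def user_agent_token(line: str) -> Optional[str]:
    """Return the token from a 'User-agent:' line, or None for any other line."""
    line = line.split("#", 1)[0]             # drop a trailing comment, if any
    field, sep, value = line.partition(":")  # split on the first colon only
    if not sep or field.strip().lower() != "user-agent":
        return None
    return value.strip()                     # trims the ends, keeps inner spaces


if __name__ == "__main__":
    line = "User-agent: Sogou web spider"
    print(repr(user_agent_token(line)))  # 'Sogou web spider'
    print(repr(line.split()[1]))         # 'Sogou' (naive split loses the rest)
```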
How to allow / block
# To allow, with a crawl delay:
User-agent: Sogou web spider
Crawl-delay: 10

# To block (use this group instead of the one above, not alongside it):
User-agent: Sogou web spider
Disallow: /
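One way to sanity-check rules like these before deploying them is Python's standard `urllib.robotparser`. The sketch below parses the throttling and blocking groups as two separate files, since they are alternatives. The URL `https://example.com/page.html` and the agent `OtherBot/1.0` are placeholders, and the results only describe how this particular parser reads the file; Sogou's own parser may behave differently.

```python
from urllib.robotparser import RobotFileParser

# Full User-Agent string as listed in this document.
SOGOU_UA = "Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)"

THROTTLE_RULES = """\
User-agent: Sogou web spider
Crawl-delay: 10
"""

BLOCK_RULES = """\
User-agent: Sogou web spider
Disallow: /
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content held in a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


if __name__ == "__main__":
    url = "https://example.com/page.html"  # placeholder URL
    throttled = load_rules(THROTTLE_RULES)
    blocked = load_rules(BLOCK_RULES)

    print("crawl delay:", throttled.crawl_delay(SOGOU_UA))
    print("throttled file allows Sogou:", throttled.can_fetch(SOGOU_UA, url))
    print("blocking file allows Sogou:", blocked.can_fetch(SOGOU_UA, url))
    print("blocking file allows others:", blocked.can_fetch("OtherBot/1.0", url))
```

Keeping the two rule sets in separate strings mirrors the intent of the snippet above: a site picks one group for the Sogou token rather than publishing both.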