About
Waseda Univ. and Baidu Collaborated Web Crawler

Yamana Lab., Computer Science and Engineering Div., Waseda Univ.
Baidu, Inc.

(Oct.2008)


If you have any problems or questions, please send an e-mail to
When you send us an e-mail, please include your Web server's URL.

IP address used by our W_Univ_BJ_spider

IP Addresses of W_Univ_BJ_spider
119.63.193.209

About our Project

Our research project is part of a collaborative research effort between Waseda University and Baidu, Inc. Our crawler gathers Web pages, mainly those distributed from Japan, in order to analyze their contents for our research. Some of the gathered Web pages will be indexed for the Baidu search engine.


About our Web Crawler

How to deny access from our crawler

To deny all access:
User-Agent: W_Univ_BJ_spider
Disallow: /
To deny access to URLs ending in ".pdf":
User-Agent: W_Univ_BJ_spider
Disallow: /*.pdf$
"*" matches any sequence of characters.
"$" anchors the pattern to the end of the URL.
In the above example, if "$" is omitted, every URL containing the specified characters, such as "abc.pdff", will be matched.
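
The effect of "*" and "$" can be checked with a small script. The following is a minimal sketch in Python, assuming the common wildcard interpretation described above; the helper names `robots_pattern_to_regex` and `is_disallowed` are hypothetical and are not part of W_Univ_BJ_spider itself.

```python
import re


def robots_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a regex (hypothetical helper)."""
    # A trailing "$" anchors the pattern to the end of the URL path.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with ".*" for every "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    # Without "$" the rule is a prefix match, so anything may follow.
    return re.compile("^" + body + (r"\Z" if anchored else ""))


def is_disallowed(pattern: str, path: str) -> bool:
    """Return True if the URL path is covered by the Disallow pattern."""
    return robots_pattern_to_regex(pattern).match(path) is not None


print(is_disallowed("/*.pdf$", "/docs/abc.pdf"))   # True
print(is_disallowed("/*.pdf$", "/docs/abc.pdff"))  # False
print(is_disallowed("/*.pdf", "/docs/abc.pdff"))   # True
```

The last call shows the case mentioned above: without "$", "abc.pdff" is also matched.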
To set a crawl delay:
User-Agent: W_Univ_BJ_spider
Crawl-delay: 600
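
To check that a crawl-delay entry is written so that a parser can read it, the standard Python module `urllib.robotparser` can be used. This is only a sketch of how a generic client reads the directive, not a description of how W_Univ_BJ_spider is implemented. This page does not state the unit of the value; crawlers that support the directive conventionally treat it as seconds.

```python
from urllib.robotparser import RobotFileParser

# The crawl-delay example from this page, as it would appear in robots.txt.
ROBOTS_TXT = """\
User-Agent: W_Univ_BJ_spider
crawl-delay:600
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The delay applies only to the named user agent.
delay = parser.crawl_delay("W_Univ_BJ_spider")
print(delay)                                # 600
print(parser.crawl_delay("SomeOtherBot"))   # None
```

Note that `urllib.robotparser` does not interpret the "*" and "$" wildcards, so it is not suitable for testing the ".pdf" rule.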
User-agent: e-SocietyRobot
Disallow: /