e-Society Project
Computer Science Div., School of Science and Engineering, Waseda Univ.
Yoichi MURAOKA and Hayato YAMANA
If you have any problems or questions, please send e-mail to desk@yama.info.waseda.ac.jp.
(Please do not send e-mail to me directly.)
About the Project
The research project "Technologies for the Knowledge Discovery from
the Internet" is one of the year 2003 leading projects of the Ministry
of Education, Culture, Sports, Science and Technology, Japan. The project
contractor is Waseda University. The goal of the project is to gather all
the Web pages in the world efficiently and to apply data mining to the
gathered pages to discover knowledge. The detailed subgoals are as
follows.
1. R&D on a new Web page crawler
Gather all the Web pages (about 12 billion pages) and keep the gathered copies fresh to within one month on average.
2. R&D on knowledge discovery
Discover the knowledge that users want to know.
In the project, 30 Web crawlers are running at 3 different sites. The behavior
of the crawlers is described in detail below. We run the crawlers carefully
so as not to disturb your Web sites. However, if you experience any trouble,
please let us know. We will resolve any problems as soon as we can.
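To give a rough sense of scale, the figures above (about 12 billion pages, 30 crawlers, one-month freshness) can be turned into an approximate fetch rate. The following is only a back-of-the-envelope sketch: it assumes that "one month" means every page is re-fetched once every 30 days and that the load is split evenly across the crawlers. Neither assumption is stated by the project, and the function name is hypothetical.

```python
# Back-of-the-envelope estimate, not a description of the actual crawler.
PAGES = 12_000_000_000   # about 12 billion pages (from the project goal)
CRAWLERS = 30            # number of crawlers (from the project description)
REFRESH_DAYS = 30        # ASSUMPTION: each page is re-fetched once per 30 days


def required_rate(pages, refresh_days, crawlers):
    """Return (overall, per-crawler) fetch rates in pages per second.

    ASSUMPTION: the load is split evenly across all crawlers.
    """
    seconds = refresh_days * 24 * 60 * 60
    overall = pages / seconds
    return overall, overall / crawlers


total, per_crawler = required_rate(PAGES, REFRESH_DAYS, CRAWLERS)
print(f"{total:.0f} pages/s overall, {per_crawler:.0f} pages/s per crawler")
# prints "4630 pages/s overall, 154 pages/s per crawler"
```

These rates are aggregated over all Web sites in the world; under these assumptions the load on any single site would be only a small fraction of them.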
We will publish the outcomes of the project widely as research papers.
The gathered Web pages themselves are not distributed to others; they
are used only to analyze links and keywords in order to discover
knowledge.
About our Web Crawler