IJEEEE 2013 Vol.3(6): 441-445 ISSN: 2010-3654
DOI: 10.7763/IJEEEE.2013.V3.275
Clustering Web Pages Considering the Position of Each Word and the Search Term
Ryutaro Akiyama, Katsutoshi Kanamori, and Hayato Ohwada
Abstract— Clustering the web pages returned by a search engine helps users find the pages they are seeking. The vector space method is often used for this clustering, but the conventional approach has low clustering accuracy and high computational cost. In this study, we propose a method that addresses both problems. We assume that words appearing near the search term are more important, and we therefore take into account each word's distance to the search term in the text. We conducted verification experiments in Japan using Japanese search terms. The results confirmed that the proposed method, which considers the distance to the search term in the text, achieves higher clustering accuracy and lower computational cost than the conventional method.
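The idea of weighting words by their distance to the search term can be illustrated with a minimal Python sketch. The abstract does not give the weighting function, so everything here is an assumption made for illustration: the function names `proximity_vector` and `cosine`, the `window` cutoff, and the `1 / distance` weight are hypothetical and are not the authors' formulation.

```python
import math
from collections import defaultdict


def proximity_vector(tokens, search_term, window=10):
    """Build a term-weight vector in which words near the search term weigh more.

    Assumptions (not from the paper): the weight is 1 / distance to the
    nearest occurrence of the search term, and words farther away than
    `window` tokens are dropped, which also keeps the vector small.
    """
    positions = [i for i, tok in enumerate(tokens) if tok == search_term]
    vector = defaultdict(float)
    if not positions:
        return {}
    for i, tok in enumerate(tokens):
        if tok == search_term:
            continue
        distance = min(abs(i - p) for p in positions)
        if distance <= window:
            vector[tok] += 1.0 / distance
    return dict(vector)


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


page_a = "apple pie recipe with jaguar car engine speed".split()
page_b = "zoo keeper feeds jaguar cat in rainforest".split()
vec_a = proximity_vector(page_a, "jaguar", window=2)
vec_b = proximity_vector(page_b, "jaguar", window=2)
similarity = cosine(vec_a, vec_b)
```

Pairwise similarities computed this way could then be passed to any standard clustering routine. Dropping words outside the window is one plausible way such a scheme could reduce computational cost, since each page contributes fewer vector dimensions.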
Index Terms— Information retrieval, search engine, web clustering, web mining.
Ryutaro Akiyama was with the Department of Industrial Administration, Faculty of Science and Technology, Tokyo University of Science, Japan. He is now with the Department of Industrial and Management Systems Engineering, Creative Science and Engineering, Graduate School, Waseda University, Japan (e-mail: r-akiyama@akane.waseda.jp).
Katsutoshi Kanamori and Hayato Ohwada are with the Department of Industrial Administration, Faculty of Science and Technology, Tokyo University of Science, Japan (e-mail: katsu@rs.tus.ac.jp, ohwada@rs.tus.ac.jp).
Cite: Ryutaro Akiyama, Katsutoshi Kanamori, and Hayato Ohwada, "Clustering Web Pages Considering the Position of Each Word and the Search Term," International Journal of e-Education, e-Business, e-Management and e-Learning, vol. 3, no. 6, pp. 441-445, 2013.
General Information
ISSN: 2010-3654 (Online)
Abbreviated Title: Int. J. e-Educ. e-Bus. e-Manag. e-Learn.
Frequency: Quarterly
DOI: 10.17706/IJEEEE
Editor-in-Chief: Prof. Kuan-Chou Chen
Executive Editor: Ms. Nancy Lau
Abstracting/ Indexing: EBSCO, Google Scholar, Electronic Journals Library, QUALIS, ProQuest, INSPEC (IET)
E-mail: ijeeee@iap.org