IJIET 2011 Vol.1(3): 212-220 (Aug. 2011) ISSN: 2010-3689
DOI: 10.7763/IJIET.2011.V1.35

Challenging Massive Information Retrieval in Persian

Mohadese Danesh, Behrouz Minaei and Omid Kashefi

Abstract—As the volume of information and the number of queries continue to grow, indexing is an effective way to manage the inherent complexity of text retrieval and to accelerate information retrieval across languages. N-gram indexing in particular addresses issues such as stemming, misspellings, multilingual text, and partial matching, and it has the advantages of language independence and error tolerance. Persian is a language common in the Middle East, spoken in countries such as Iran, Afghanistan, and Tajikistan. Consequently, many documents published on the web are written in Persian, yet little research has been done on Persian document retrieval. In this paper, we present a method for retrieving Persian documents using N-gram indexing and a distribution technique. The proposed index answers queries more effectively and substantially increases the quality of information retrieval for Persian documents. Because N-gram indexing is slow, we design a distributed N-gram indexing mechanism for large Persian-language systems. Compared with other methods in this field, our approach improves both the quality of the retrieved documents and the speed of information retrieval.

Index Terms—Information Retrieval, Indexing, N-Gram, Distributed, Persian
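
As a rough illustration of the idea summarized in the abstract, the following Python sketch builds an inverted index over character trigrams and splits the documents across several shards. It is not the authors' implementation: the class names (`NGramIndex`, `ShardedNGramIndex`), the shared-n-gram count used for ranking, and the modulo-based assignment of documents to shards are all assumptions made for this example. It also assumes the text uses a consistent character encoding, since no Persian-specific normalization is applied.

```python
from collections import defaultdict


def char_ngrams(text, n=3):
    """Return the set of character n-grams of a whitespace-normalized string."""
    normalized = " ".join(text.split())
    if len(normalized) < n:
        return {normalized} if normalized else set()
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


class NGramIndex:
    """Inverted index from each character n-gram to the ids of documents containing it."""

    def __init__(self, n=3):
        self.n = n
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for gram in char_ngrams(text, self.n):
            self.postings[gram].add(doc_id)

    def scores(self, query):
        """Count, per document, how many query n-grams it shares (assumed ranking)."""
        result = defaultdict(int)
        for gram in char_ngrams(query, self.n):
            for doc_id in self.postings.get(gram, ()):
                result[doc_id] += 1
        return dict(result)

    def search(self, query):
        found = self.scores(query)
        return sorted(found, key=lambda d: (-found[d], d))


class ShardedNGramIndex:
    """Hypothetical distribution scheme: each document lives on exactly one shard."""

    def __init__(self, num_shards=2, n=3):
        self.shards = [NGramIndex(n) for _ in range(num_shards)]

    def add(self, doc_id, text):
        # Assumption: integer document ids, assigned to shards by modulo.
        self.shards[doc_id % len(self.shards)].add(doc_id, text)

    def search(self, query):
        # Each shard could be queried in parallel; here they are queried in turn
        # and the partial results merged (document ids are disjoint across shards).
        merged = {}
        for shard in self.shards:
            merged.update(shard.scores(query))
        return sorted(merged, key=lambda d: (-merged[d], d))


docs = {0: "کتابخانه ملی ایران", 1: "دانشگاه علم و صنعت"}
index = ShardedNGramIndex(num_shards=2)
for doc_id, text in docs.items():
    index.add(doc_id, text)

# A partial query ("کتاب") matches on shared trigrams without any stemming step.
print(index.search("کتاب"))
```

Because matching works on overlapping character sequences rather than whole words, a query that is a fragment of an indexed word, or that contains a single wrong letter, still shares several trigrams with the intended document. The price is a larger index and more postings to scan per query, which is the cost that spreading the index over several shards is meant to offset.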

Mohadese Danesh is with the School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran (e-mail: mddanesh@comp.iust.ac.ir).
Behrouz Minaei is with the School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran (e-mail: b_minaei@iust.ac.ir).
Omid Kashefi is with the School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran (e-mail: kashefi@{iust.ac.ir, ieee.org}).

Cite: Mohadese Danesh, Behrouz Minaei and Omid Kashefi, "Challenging Massive Information Retrieval in Persian," International Journal of Information and Education Technology, vol. 1, no. 3, pp. 212-220, 2011.

