Light Stemming for Arabic Information Retrieval

Leah S. Larkey
Univ. of Massachusetts, Dept. of Computer Science, Amherst, MA 01003

Lisa Ballesteros
Computer Science Dept., Mt. Holyoke College, South Hadley, MA 01075

Margaret E. Connell
Univ. of Massachusetts, Dept. of Computer Science, Amherst, MA 01003

ABSTRACT

Computational Morphology is an urgent problem for Arabic Natural Language Processing, because Arabic is a highly inflected language.
- stem classes
- statistical machine translation approach
- text collections
- availability of standard Arabic data sets from NIST
- cross-language retrieval
- stemmer
- distinct words
- roots
- NIST
- Arabic