Download

Description

Category

Developer

Link

An XML-based file containing all Arabic characters (letters, vowels and punctuation marks). Each entry gives a description of the character, its Unicode encoding, three transliterations (Buckwalter, wiki and Buckwalter-Habash-Soudi) and its display forms (isolated, and at the beginning, middle and end of a word).

Arabic Characters

Taoufik Loukili

An XML-based file containing Arabic stop words. Two categories are considered: domain-independent and type-oriented stop words. For each class within these two categories, we provide the list of Arabic affixes that can be attached to its stop words.

Arabic Stop Words

Taoufik Yazidi & Hicham Baidouri

An XML-based file containing all Arabic prefixes and suffixes.

Arabic Affixes

Taoufik Loukili

A set of TREC (1500) and CLEF (800) questions in Arabic.
These questions have been expanded using a semantic query expansion process based on Arabic WordNet, which covers four expansion types: by synonyms, by definitions, by subtypes and by supertypes.

Arabic Questions for Testing an Arabic Q/A System

Lahcen Abouennour