Download

 Description

Category

Developer

Link

An XML-based file containing all Arabic characters (letters, vowels, and punctuation marks). Each entry gives a description, the Unicode code point, two transliterations (Buckwalter and wiki), and the character's display forms (isolated, and at the beginning, middle, and end of a word).

Arabic Characters

Taoufik Loukili

An XML-based file containing Arabic stop words. Two categories are considered: domain-independent and type-oriented stop words. For each class in these two categories, we provide the list of Arabic affixes that can be attached to its stop words.

Arabic Stop Words

Taoufik Yazidi & Hicham Baidouri

An XML-based file containing all Arabic prefixes and suffixes.

Arabic Affixes

Taoufik Yazidi & Hicham Baidouri

A set of TREC (1500) and CLEF (800) questions in Arabic.
The questions have been expanded using a semantic query expansion process based on Arabic WordNet, with four expansion types: by synonyms, by definitions, by subtypes, and by supertypes.

Arabic Questions for testing an Arabic Q/A system

Lahcen Abouennour
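
The four expansion types named for the question set (synonyms, definitions, subtypes, supertypes) can be illustrated with a minimal sketch. It does not reproduce the actual process used to build the resource: the lexicon below is a hypothetical toy dictionary standing in for Arabic WordNet, its entries are illustrative only, and the names `LEXICON`, `expand_term`, and `expand_question` are assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical toy lexicon standing in for Arabic WordNet (illustrative entries only).
LEXICON: Dict[str, Dict[str, List[str]]] = {
    "مدينة": {  # "city"
        "synonyms": ["بلدة"],
        "definitions": ["تجمع سكاني كبير"],
        "subtypes": ["عاصمة"],
        "supertypes": ["مكان"],
    },
}

# The four expansion types named in the resource description.
EXPANSION_TYPES = ("synonyms", "definitions", "subtypes", "supertypes")


def expand_term(term: str, expansion_type: str,
                lexicon: Dict[str, Dict[str, List[str]]] = LEXICON) -> List[str]:
    """Return the expansions of one term for a given expansion type."""
    if expansion_type not in EXPANSION_TYPES:
        raise ValueError(f"unknown expansion type: {expansion_type}")
    # Terms missing from the lexicon simply have no expansions.
    return list(lexicon.get(term, {}).get(expansion_type, []))


def expand_question(tokens: List[str], expansion_type: str,
                    lexicon: Dict[str, Dict[str, List[str]]] = LEXICON) -> List[str]:
    """Keep each original token and append its expansions, without duplicates."""
    expanded: List[str] = []
    for token in tokens:
        for candidate in [token] + expand_term(token, expansion_type, lexicon):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded
```

Calling `expand_question(["مدينة", "كبيرة"], "synonyms")` keeps both original tokens and adds the toy synonym after the first one. A real system would query Arabic WordNet instead of a dictionary literal and would normally tokenize and normalize the question first.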