ARABIC LANGUAGE COMPUTING RESEARCH @ Leeds
Call for Extended Abstracts by 30/10/13: Special Issue: Arabic NLP in the Journal of King Saud University - Computer and Information Sciences
WACL2 Second Workshop on Arabic Corpus Linguistics - * PROCEEDINGS *
Monday 22nd July 2013, Lancaster University, UK
Keynote speaker: Claire Brierley, University of Leeds
"Natural Language Processing Working Together With Arabic And Islamic Studies"
Organisers: Eric Atwell, University of Leeds; Andrew Hardie, Lancaster University
LREC'2012 and LRE-Rel - We attended the LREC'2012 Language Resources and Evaluation Conference in Istanbul, which had over a thousand delegates. We hosted a pre-conference workshop on Language Resources and Evaluation for Religious Texts, focusing mostly on the Quran and Islamic texts; see the LRE-Rel Workshop Proceedings.
We also contributed to the main LREC conference, co-authoring 5 papers on computer analysis of Quranic Arabic; see the LREC'2012 Conference Proceedings.
Video: Abdullah Alfaifi - Arabic Learner Corpus
Video: Abdul-Baquee Muhammad Sharaf - Text Mining the Quran
The Language research group in the School of Computing has an ongoing interest in corpus-based research on Arabic. Central to our research is the computational modelling of language data; a CORPUS is a text dataset representative of the language to be analysed.
Eric Atwell, Senior Lecturer.
Research Interests: Corpus Linguistics, Arabic language processing,
technologies for knowledge management applied to the Quran,
making sense of surveillance and intelligence data,
Unsupervised and Supervised Machine Learning from corpora,
chatbots and their applications,
national varieties of Arabic, Arab English,
morphosyntactic and Part-of-Speech tagging, evaluation.
Research Student projects in Arabic language computing
|Abdullah Alfaifi||Building an Arabic Learner Corpus (ALC) with Part-of-Speech (POS) Tagging and Error Annotation|
|Amal Alsaif||An Automatic analyser of Discourse structure for Arabic|
|Kais Dukes||Arabic Language Computing Applied to the Quran|
|Majdi Sawalha||Automatic Part-of-Speech Tagging of Arabic Language Text|
|Abdul-Baquee Sharaf||A Computational Model for Knowledge Representation of the Quran|
Alumni: graduates of the Arabic language computing research group
|Noorhan Abbas, 2009.||Quran 'Search for a Concept' tool and website|
|Bayan Abu Shawar, 2005.||A Corpus Based Approach to Generalise a Chatbot System|
|Latifa Al-Sulaiti, 2004.||Designing and Developing a Corpus of Contemporary Arabic|
|Eric Atwell, 2008.||Corpus Linguistics and Language Learning: Bootstrapping Linguistic Knowledge and Resources from Text|
|Andy Roberts, 2008.||Grammatical Inference and Corpus linguistics|
Research Facilities at Leeds University
Research facilities in the School of Computing at Leeds University include a dedicated high-speed network infrastructure, a wide range of corpora (Arabic, English and many other languages), software tools for corpus analysis, language analysis, machine learning and data mining, and software development. Staff teach research-led undergraduate and postgraduate courses, for example Natural Language Processing, Knowledge Management and Adaptive Systems, Language. Leeds University is unique in having a very wide range of language research expertise: Arabic language computing researchers can learn from and collaborate with researchers in a wide range of departments across Leeds University.
We welcome applications to join us as PhD research students, or as research sponsors and/or collaborators.
Our publications in Arabic language computing
[pdf] Sawalha, M; Atwell, ES Constructing and Using Broad-coverage Lexical Resource for Enhancing Morphological Analysis of Arabic in: Proceedings of LREC'2010 Language Resources and Evaluation Conference. 2010.
[pdf] Atwell, ES; Dukes, K; Abdul Baquee, S; Habash, N; Louw, B; Abu Shawar, B; McEnery, T; Zaghouani, W; El-Haj, M Understanding the Quran: a new Grand Challenge for Computer Science and Artificial Intelligence. Proceedings of GCCR'2010 Grand Challenges in Computing Research. 2010.
[pdf] Hassan, H; Daud, N; Atwell, ES Connectives in the World Wide Arabic corpus. Proceedings of IVACS'2010 Inter-Varietal Applied Corpus Studies Conference. 2010.
[pdf] Sawalha, M; Atwell, ES Fine-Grain Morphological Analyzer and Part-of-Speech Tagger for Arabic Text in: LREC'2010 Language Resources and Evaluation Conference. 2010.
[pdf] Abu Shawar, B; Atwell, ES Chatbots: Can they serve as Natural Language interfaces to QA corpus? in: Proceedings of ACSE'2010: Sixth IASTED International Conference on Advances in Computer Science and Engineering. 2010.
[pdf] Dukes, K; Atwell, ES; Abdul Baquee, S Syntactic Annotation Guidelines for the Quranic Arabic Dependency Treebank. LREC'2010 Language Resources and Evaluation Conference. 2010.
[pdf] Sawalha, M; Atwell, ES Adapting Language Grammar Rules for Building a Morphological Analyzer for Arabic Text (in Arabic). Proceedings of ALECSO Arab League Educational Cultural and Scientific Organization workshop on Arabic morphological analysis. 2009.
[pdf] Sharaf, A; Atwell, ES A Corpus-based Computational Model for Knowledge Representation of the Quran. Proceedings of CL2009 International Conference on Corpus Linguistics. 2009.
[pdf] Abu Shawar, B; Atwell, ES Arabic Question-Answering via Instance Based Learning from an FAQ Corpus. Proceedings of CL2009 International Conference on Corpus Linguistics. 2009.
[pdf] Atwell, ES; Al-Sulaiti, L; Sharoff, S Arabic and Arab English in the Arab World. Proceedings of CL2009 International Conference on Corpus Linguistics. 2009.
[pdf] Pritchard, J; Atwell, E; Newman, M; Dorling, D; Hall, F Mapping Language: From data to diaspora. Proceedings of Workshop on Research Infrastructure for Linguistic Variation. University of Oslo. 2009.
[pdf] Sawalha, M; Atwell, ES Linguistically Informed and Corpus Informed Morphological Analysis of Arabic. Proceedings of CL2009 International Conference on Corpus Linguistics. 2009.
[pdf] Sawalha, Majdi; Atwell, Eric. Comparative evaluation of Arabic language morphological analysers and stemmers. Proceedings of COLING 2008 22nd International Conference on Computational Linguistics. 2008.
[pdf] Atwell, Eric; Abbas, Noorhan; Abu Shawar, Bayan; Alsaif, Amal; Al-Sulaiti, Latifa; Roberts, Andrew; Sawalha, Majdi. Mapping Middle Eastern and North African diasporas: Arabic corpus linguistics research at the University of Leeds in: Proceedings of BRISMES Conference 2008. 2008.
[pdf] Atwell, Eric. A cross-language methodology for corpus Part-of-Speech tag-set development in: Proceedings of Corpus Linguistics 2007. 2007.
[pdf] Al-Sulaiti, Latifa; Atwell, Eric. The design of a corpus of contemporary Arabic. International Journal of Corpus Linguistics, vol. 11, pp. 135-171. 2006.
[pdf] Roberts, Andrew; Al-Sulaiti, Latifa; Atwell, Eric. aConCorde: Towards an open-source, extendable concordancer for Arabic. Corpora journal, vol. 1, pp. 39-57. 2006.
[pdf] Abu Shawar, Bayan; Atwell, Eric. Using corpora in machine-learning chatbot systems. International Journal of Corpus Linguistics, vol. 10, pp. 489-516. 2005.
[pdf] Al-Sulaiti, Latifa; Roberts, Andrew; Atwell, Eric. The use of corpora and concordance in the teaching of contemporary Arabic in: Proceedings of EuroCALL 2005. 2005.
[pdf] Al-Sulaiti, Latifa; Atwell, Eric. Extending the corpus of contemporary Arabic in: Proceedings of Corpus Linguistics 2005. 2005.
[pdf] Roberts, Andrew; Al-Sulaiti, Latifa; Atwell, Eric. aConCorde: towards a proper concordance of Arabic in: Proceedings of Corpus Linguistics 2005. 2005.
[pdf] Al-Sulaiti, Latifa. The North African Experience. ElSNews: Newsletter of the European Language and Speech Research Network, Vol 13.1, pp.11-12. 2004.
[pdf] Atwell, Eric. Clustering of word types and unification of word tokens into grammatical word-classes in: Bel, B & Marlien, I (editors) Proceedings of TALN04: XI Conference sur le Traitement Automatique des Langues Naturelles, Volume 1, pp. 27-32 ATALA. 2004.
[pdf] Abu Shawar, Bayan; Atwell, Eric. An Arabic chatbot giving answers from the Qur'an in: Bel, B & Marlien, I (editors) Proceedings of TALN04: XI Conference sur le Traitement Automatique des Langues Naturelles, Volume 2, pp. 197-202 ATALA. 2004.
[pdf] Atwell, Eric; Al-Sulaiti, Latifa; Al-Osaimi, Saleh; Abu Shawar, Bayan. A review of Arabic corpus analysis tools in: Bel, B & Marlien, I (editors) Proceedings of TALN04: XI Conference sur le Traitement Automatique des Langues Naturelles, Volume 2, pp. 229-234 ATALA. 2004.
[pdf] Abu Shawar, Bayan; Atwell, Eric. Evaluation of chatbot systems in: Proceedings of Eighth Maghrebian Conference on Software Engineering and Artificial Intelligence. 2004.
[pdf] Al-Sulaiti, Latifa; Atwell, Eric. Designing and developing a corpus of contemporary Arabic in: TALC 2004: Proceedings of the sixth Teaching And Language Corpora conference, pp. 92-93. 2004.
[pdf] Atwell, Eric; Abu Shawar, Bayan; Babych, Bogdan; Elliott, Debbie; Elliott, John; Gent, Paul; Hartley, Anthony; Hu, Xunlei Rose; Medori, Julia; Oba, Toshifumi; Roberts, Andy; Sharoff, Serge; Souter, Clive. Corpus Linguistics, Machine Learning and Evaluation: Views from Leeds. University of Leeds, School of Computing research report 2003.02. 2003.
[pdf] Al-Sulaiti, Latifa; Atwell, Eric. The Design of a Corpus of Contemporary Arabic (CCA). University of Leeds, School of Computing research report 2003.11. 2003.
[pdf] Al-Sulaiti, Latifa. Computer Assisted Language Learning (CALL). ElSNews: Newsletter of the European Language and Speech Research Network, Vol 12.1, pp.1-3. 2003.
[pdf] Al-Sulaiti, Latifa; Knowles, Gerry. A Multimedia Arabic Course. In Proceedings of the International Symposium on the processing of Arabic, University of Manouba, Tunis, Tunisia, pp. 94-105. 2002.
[pdf] Atwell, Eric. The Language Machine. 64pp. The British Council. 1999.
Brockett, A; Atwell, E S; Taylor, O; Page, M. An Arabic text database and glossary system for students in Proceedings of the Seminar on Bilingual Computing in Arabic and English, pp154-162, University of Cambridge. 1989.