Arabic Language Technologies

ALT


At QCRI we are dedicated to promoting the Arabic language in the information age by conducting world-class research in Arabic language technologies.

Ensuring that the Arabic language flourishes in the digital world is a primary focus of our research. Some of our current research projects address the lack of Arabic content and, equally important, the challenge of extracting that content.

QCRI strives to become the regional and global leader in Arabic language technologies in the areas of search, information retrieval and analysis, multilingual language processing, and advanced machine translation, and to lead efforts to increase and enrich Arabic language content online.

We are working hard to help close the gap caused by the lack of valuable Arabic content on the web by engaging in efforts that increase and enrich this content. Our partnership with the Wikimedia Foundation marked the first step in this initiative. Through our collaboration with the Arabic Wikipedian community, the number of editors and their productivity have increased. We are working with universities and educational institutions to integrate Wikipedia into their curricula, which will also help increase and enrich online content by growing knowledge and consumption. The initial goal is the addition of 10,000 Arabic articles of the highest quality. “Online content” is not restricted to documents. We are also working with YouTube/Google on video content, as well as with social media platforms such as Twitter. A critical component of the initiative is the creation of an outreach program to communicate with Arabic Internet users, raise awareness, and deliver the desired information.

QCRI’s initiatives not only address the lack of content; they also address the challenges of retrieving this content when it exists, making it accessible, and enabling information flow across language barriers. In this regard, development is underway to process the Arabic language in the search domain, using techniques such as morphological word analysis, named entity recognition and data learning technology to detect relevant content that can be used for more elaborate analysis. In addition, we are developing proofing tools such as typographical checks and language identification, and handling different forms of the Arabic language, including local dialects and Arabic written using Latin characters.

A major effort at QCRI goes into improving machine translation for both text and speech. Combining a “Speech-to-Text” engine that allows the instantaneous transcription of videos with a machine translation system for the Arabic language gives access to broadcast news and news distributed over the web. Future research will concentrate on applications such as lecture translation.

With our work in search and information retrieval, we have developed services that go beyond basic search functionality, enabling more exploratory search and, in turn, better analytics of search results. We have built search functionality that is more scalable and more language-aware. Much of our work has been done in the social media domain, yet it is transferable to other domains. Our expertise in natural language processing and machine translation has helped build the foundation for this research.

Bridging a gap identified in the education domain, we have established projects related to e-education, enabling people to access and learn material in a language that is not their native one. An e-book reader with native support for the Arabic language and an assistive language tutor are examples of such tools that will have an immediate impact on society and learning.

We have collaborated closely on our projects with many local and international organizations, including Al Jazeera, MIT and the Qatar Supreme Education Council.

Some of our achievements and focus areas include:

  • Arabic speech recognition and understanding in formal Arabic الفصحى, in various colloquial Arabic dialects اللهجات العامية, and in mixtures of these.
  • Machine translation of non-Arabic content (news, scientific articles, etc.), and making it available on the web for easier access by Arabic speakers.
  • Arabic information storage and retrieval including key-word and semantic content indexing, search, summarization, and understanding.
  • Multilingual search involving on-the-fly translation of non-Arabic content in response to queries in Arabic.
  • Creation of computational language models for Modern Standard Arabic suitable for algorithmic manipulation in support of the above activities.
  • Development of Arabic language tutoring systems to teach Arabic to native speakers (K-12 students) as well as to professionals whose native language is not Arabic.

Principal Scientist

Dr. Stephan Vogel

Being part of a research institute in start-up mode, helping to build a strong team doing world class research, and at the same time experiencing a different environment in terms of culture and language, geography and climate.

In the News

QCRI scientists recognized with best paper at ISCRAM 2013

22/05/2013

“Extracting Information Nuggets from Disaster Related Messages in Social Media,” authored by QCRI's Muhammad Imran, Carlos Castillo and Patrick Meier, former QCRI post-doc Shady Elbassuoni, and Fernando Diaz of Microsoft Research, was recognized as the best paper at this year's ISCRAM conference.

Crisis Maps: Harnessing the Power of Big Data to Deliver Humanitarian Assistance

19/05/2013

Dr. Patrick Meier talks about QCRI's work to solve major humanitarian challenges in Forbes.

Research proposal receives QNRF NPRP award

15/05/2013

Dr. Halima Bensmail, QCRI Scientific Computing, is a key investigator on a research proposal awarded in QNRF's sixth NPRP cycle, titled “Quantitative mapping of HIV incidence among stable couples and evaluation of impact of interventions targeting sero-discordant couples.”

Upcoming Events

2013

PETRAE

29/05/2013, Rhodes Island, Greece

Dr. Halima Bensmail, Senior Scientist on QCRI's Scientific Computing team, is an invited speaker at PETRAE, giving a talk on “Time course and Neural Representation of Memorability: faces and places: an alpha-sparse model for classifying memorability regions.”

For more information on PETRAE, please visit www.petrae.org.

NAACL 2013

09/06/2013 - 14/06/2013, Atlanta, Georgia, USA

The North American Chapter of the Association for Computational Linguistics (NAACL) Annual Conference takes place in Atlanta, Georgia (USA) from June 9 to 14, 2013.

2013 ACM SIGMOD/PODS

22/06/2013 - 27/06/2013, New York, New York, USA

Our Data Analytics team will be at the 2013 ACM SIGMOD/PODS conference in New York, where the team will present two papers.

Press Releases

QCRI invites applicants for summer internships

08/05/2013

Qatar Computing Research Institute (QCRI) will be kicking off its 2013 summer internship program soon, and invites computer science and computer engineering students to apply. The intensive two-month internship program provides students with the opportunity to work closely with top researchers, and receive practical work experience based on their studies. Applications will be accepted until May 12, 2013.

Presenting a Vision for the Future of Humanitarian Technology

31/03/2013

Dr. Patrick Meier, Director of Social Innovation at Qatar Computing Research Institute (QCRI) and the world’s foremost expert on humanitarian technology, explores the rise of digital humanitarian response and how new technologies are reshaping the humanitarian space in an upcoming talk.

QCRI and Boeing to collaborate on Data Analytics Research

24/03/2013

Joint project will seek to identify patterns in large data streams

Research collaboration broadens Boeing engagement in Qatar, strengthens relationship with Qatar Foundation