Read our feature in the December 2012 SIGMOD Record.
The Data Analytics group at QCRI has built expertise in three core data management challenges that must be addressed to make effective use of data, a growing asset class: extraction from its natural digital habitat, integration from a large and evolving number of sources, and robust cleaning processes to assure data quality and validation.
The Data Trio: Extraction, Integration, and Cleaning: Institutions and industries at a national level deal with large-scale, heterogeneous data collected from a large number of sources. The main challenge is to use this information judiciously, within and across organizations, to make informed decisions and to run operations effectively.
At QCRI, we are focusing on the interaction among three core data management challenges that will enable effective use of the continuously growing data: Information Extraction, Data and Schema Integration, and Data Cleaning.
Going beyond traditional ETL approaches, we are investigating multiple new directions, including: handling unstructured data; interleaving extraction, integration, and cleansing tasks in a more dynamic and interactive process that responds to evolving data sets and real-time decision-making constraints; and leveraging the power of human cycles to solve hard problems such as data cleaning and information integration.
Scalable Knowledge Models: Grand challenges mean big data. ‘Knowledge base’ is the term commonly used to refer to data, along with the rules and the logic that describe the information within this data. Large-scale knowledge management is a core computing challenge because reasoning about the data and inferring the facts and semantics embedded within it is expensive. We focus on developing efficient knowledge representation models, semantic-aware query languages, and processing engines that bring semantics to real applications. Main application domains include media and health, where current approaches are either too expensive or fall short of meeting user needs.
To be part of something different from what I had been used to at Purdue University and to contribute to the first computing research institution in the region.
Storing, managing, retrieving, and mining Big Data is one of the most difficult computing challenges of our times. Along with my colleagues in the Data Analytics Group at QCRI, I am interested in enabling end-users to utilize large datasets to the fullest by designing infrastructure and algorithms, and applying data and text mining techniques.
QCRI provides an ideal environment to conduct high-impact research which can transcend disciplinary boundaries.
In the Media
The Agora Dark Web market cited Tor Hidden Services security vulnerabilities that could allow de-anonymization attacks and temporarily shut down operations after detecting suspicious activity on its ...
Atletico Madrid used few predictable passing patterns in the 2013/14 season, and won the league that year. Who really calls the shots in team sports? The players? The...
VLDB is a premier annual international forum for data management and database researchers, vendors, practitioners, application developers, and users. The conference will feature research talks, ...
The premier conference on natural language processing, organized under SIGDAT, the Association for Computational Linguistics special interest group on linguistic data and corpus-based approaches to ...
Hands-On Programme Offers Undergraduate Students An Opportunity To Conduct Research And Gain Real-World Experience: Doha, Qatar, 02 June 2015 - Enjoying its fourth consecutive year of success, the ...
The Provost of Carnegie Mellon University Brings A Wealth Of Knowledge And Expertise To Qatar Foundation-Based Research Institute
Doha, Qatar, May 25 2015: Emergency responders are still dealing with the fallout of Nepal’s two devastating earthquakes, using advanced technology and platforms, some of which have been developed...