Embedded Human Computation for Knowledge Extraction and Evaluation
ERA-Net: CHIST ERA
Disciplines
Computer Sciences (100%)
Keywords
Human Computation,
Natural Language Processing,
Knowledge Resource Acquisition,
Ontology Engineering,
Open Evaluation Methods,
Heterogeneous Web Data
The rapid growth and fragmented character of social media and publicly available structured data challenge established approaches to knowledge extraction. Many algorithms fail when they encounter noisy, multilingual and contradictory input. Efforts to increase the reliability and scalability of these algorithms face a lack of suitable training data and gold standards. Given that humans excel at interpreting contradictory and context-dependent evidence, the uComp project will address the above-mentioned shortcomings by merging collective human intelligence and automated methods in a symbiotic fashion. The project will build upon the emerging field of Human Computation (HC) in the tradition of games with a purpose and crowdsourcing marketplaces. It will advance the field of Web Science by developing a scalable and generic HC framework for knowledge extraction and evaluation, delegating the most challenging tasks to large communities of users and continuously learning from their feedback to optimise automated methods as part of an iterative process. A major contribution is the proposed foundational research on Embedded Human Computation (EHC), which will advance and integrate the currently disjoint research fields of human and machine computation. EHC goes beyond mere data collection and embeds the HC paradigm into adaptive knowledge extraction workflows. An open evaluation campaign will validate the accuracy and scalability of EHC to acquire factual and affective knowledge. In addition to novel evaluation methods, uComp will also provide shared datasets and benchmark EHC against established knowledge processing frameworks. While uComp methods will be generic and evaluated across domains, climate change was chosen as the main use case because it is a challenging domain, subject to fluctuating and often conflicting interpretations.
Collaborating with international organisations such as EEA, NOAA and NASA will increase impact, provide a rich stream of input data, attract and retain a critical mass of users, and promote the adoption of EHC among a wide range of stakeholders.
Information systems that use artificial intelligence and Semantic Web technologies are increasingly common across a wide range of application domains. Such systems typically require a critical mass of training data to infer patterns and optimize their results. In many cases, human effort is still needed to provide this training data and to verify data generated by algorithms. Relying on human domain experts is typically very expensive and does not scale, so crowdsourcing techniques are used to tap into the collective intelligence of large user communities.
Within the uComp project, the research team at WU worked on integrating humans into knowledge creation and knowledge verification workflows. We designed, implemented and evaluated a plugin for the Protégé ontology editor, which we then used to measure the feasibility and scalability of crowdsourcing in knowledge acquisition. We studied various setups for making effective use of crowd workers, and compared the results from crowd workers with those of human experts. To obtain generalizable results, we worked in multiple domains of knowledge and used data sources in various languages. Furthermore, WU improved the underlying system for creating knowledge structures by adding support for learning from heterogeneous data sources (for example, unstructured data from news media and social media) and for storing the results in a semantic knowledge base. We also conducted research into optimizing and balancing systems that process redundant and potentially conflicting input data, including quality control processes for content acquisition and human computation.
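As an illustration of the comparison between crowd workers and domain experts described above, the following minimal sketch aggregates redundant crowd judgments by majority vote and measures how often the aggregated result matches an expert label. It is not the uComp implementation: the function names, the label values and the sample data are hypothetical, and majority voting is only one of several possible aggregation strategies.

```python
from collections import Counter


def majority_vote(judgments):
    """Return the most frequent label, or None if there is no clear winner."""
    counts = Counter(judgments).most_common()
    if not counts:
        return None  # no judgments collected for this item
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie between the two most frequent labels
    return counts[0][0]


def aggregate(crowd):
    """Map each item to the majority label of its crowd judgments."""
    return {item: majority_vote(labels) for item, labels in crowd.items()}


def agreement(aggregated, expert):
    """Share of items, among those with a clear crowd majority, where
    the crowd label equals the expert label."""
    shared = [item for item in expert if aggregated.get(item) is not None]
    if not shared:
        return 0.0
    matches = sum(aggregated[item] == expert[item] for item in shared)
    return matches / len(shared)


# Hypothetical judgments on whether a term is relevant to a climate change ontology.
crowd = {
    "glacier": ["relevant", "relevant", "irrelevant"],
    "banana": ["irrelevant", "irrelevant", "relevant"],
    "carbon": ["relevant", "irrelevant"],  # tie, so no majority label
}
expert = {"glacier": "relevant", "banana": "relevant", "carbon": "relevant"}

aggregated = aggregate(crowd)
score = agreement(aggregated, expert)
```

Items without a clear majority are excluded from the agreement score here; in practice they could instead be sent back to the crowd for additional judgments.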
- Wirtschaftsuniversität Wien - 100%
- Patrick Paroubek, The Computer Sciences Laboratory for Mechanics and Engineering Sciences - France
- Wim Peters, University of Sheffield
Research Output
- 79 Citations
- 17 Publications
-
2022
Title Docking simulation and ADMET prediction based investigation on the phytochemical constituents of Noni (Morinda citrifolia) fruit as a potential anticancer drug DOI 10.1007/s40203-022-00130-4 Type Journal Article Author Chandran K Journal In Silico Pharmacology Pages 14 -
2012
Title Dynamic Integration of Multiple Evidence Sources for Ontology Learning. Type Journal Article Author Sabou M Et Al -
2012
Title Confidence Management for Learning Ontologies from Dynamic Web Sources. Type Conference Proceeding Abstract Author Sabou M Et Al Conference Filipe, Dietz (Eds), 4th International Conference on Knowledge Engineering and Ontology Development (KEOD-2012). -
2016
Title Extracting Social Networks from Literary Text with Word Embedding Tools. Type Conference Proceeding Abstract Author Ilvovsky D Et Al Conference Workshop Language Technology Resources and Tools for Digital Humanities (LT4DH) at COLING. -
2015
Title Exploring and Exploiting(?) the Awkward Connections Between SKOS and OWL. Type Journal Article Author Belk S Journal Arenas M et al (Eds), The Semantic Web: 14th International Semantic Web Conference (ISWC). -
2014
Title Using an Ontology Learning System for Trend Analysis and Detection. Type Journal Article Author Schett M Et Al Journal Proceedings of the ISWC 2014 Posters and Demonstrations Track, 13th International Semantic Web Conference (ISWC 2014). -
2014
Title The uComp Protégé Plugin: Crowdsourcing Enabled Ontology Engineering DOI 10.1007/978-3-319-13704-9_14 Type Book Chapter Author Hanika F Publisher Springer Nature Pages 181-196 -
2015
Title Leveraging and Balancing Heterogeneous Sources of Evidence in Ontology Learning DOI 10.1007/978-3-319-18818-8_4 Type Book Chapter Author Wohlgenannt G Publisher Springer Nature Pages 54-68 Link Publication -
2015
Title Similarity Metrics in Ontology Evolution. Type Conference Proceeding Abstract Author Savenkov V Conference Klinov P, Mourmotsev D (Ed), KESW 2015, Posters and Position Papers. -
2015
Title Optimizing Ontology Learning Systems that Use Heterogeneous Sources of Evidence DOI 10.1007/978-3-319-26181-2_13 Type Book Chapter Author Wohlgenannt G Publisher Springer Nature Pages 137-148 -
2016
Title Using word2vec to Build a Simple Ontology Learning System. Type Journal Article Author Minic F Journal Groth P et al (Ed), 15th International Semantic Web Conference (ISWC), Proceedings. -
2016
Title A Comparison of Domain Experts and Crowdsourcing Regarding Concept Relevance Evaluation in Ontology Learning DOI 10.1007/978-3-319-49397-8_21 Type Book Chapter Author Wohlgenannt G Publisher Springer Nature Pages 243-254 -
2013
Title Computing Semantic Association: Comparing Spreading Activation and Spectral Association for Ontology Learning DOI 10.1007/978-3-642-44949-9_29 Type Book Chapter Author Wohlgenannt G Publisher Springer Nature Pages 317-328 -
2015
Title A Trend Detection Platform based on Ontology Learning. Type Conference Proceeding Abstract Author Karacsonyi M Et Al Conference Klinov P, Mourmotsev D (Ed), KESW 2015, Posters and Position Papers. -
2014
Title The uComp Protege Plugin for Crowdsourcing Ontology Validation. Type Journal Article Author Hanika F Journal Horridge M et al (Eds): Proceedings of the ISWC 2014 Posters and Demonstrations Track a track within the 13th International Semantic Web Conference (ISWC 2014). -
2016
Title Detection of Valid Sentiment-Target Pairs in Online Product Reviews and News Media Articles DOI 10.1109/wi.2016.0024 Type Conference Proceeding Abstract Author Vakulenko S Pages 97-104 -
2016
Title Crowd-based ontology engineering with the uComp Protégé plugin DOI 10.3233/sw-150181 Type Journal Article Author Wohlgenannt G Journal Semantic Web Pages 379-398 Link Publication