Homepage of Ansgar Scherp: About Me (http://ansgarscherp.net/#me)

Education and Professional Experience

Ansgar is Full Professor for Data Science and Big Data Analytics at Ulm University, Germany. Before that, he worked as Professor of Natural Language Processing and Data Analytics and was a member of the interdisciplinary Language and Computation group at the University of Essex, England, UK. Ansgar was an Associate Professor for Data Science and Predictive Analytics at the University of Stirling, Scotland, UK, from August to December 2018. He held a fixed-term Professorship of Knowledge Discovery at Kiel University (W2 level, roughly equivalent to associate professor) and ZBW – Leibniz Information Centre for Economics in Kiel, Germany, from January 2014 to July 2018. In Kiel, he was scientific leader of the EU Horizon 2020 project MOVING, which enables young researchers, decision makers, and public administrators to use machine learning and data mining tools to search, organize, and manage large-scale information sources on the web such as scientific publications, videos of research talks, and social media. Before joining Kiel University, Ansgar was Juniorprofessor (W1 level, roughly equivalent to assistant professor) at the University of Mannheim, and Postdoctoral Research Associate as well as Juniorprofessor at the University of Koblenz-Landau in Coblenz, Germany. In Coblenz, he was work package leader for the EU FP7 projects WeKnowIt and Social Sensor. Previously, Ansgar acquired a prestigious Marie Skłodowska-Curie Fellowship of the EU for a 1-year research stay at the University of California at Irvine, California.

Ansgar has an excellent research reputation in Text and Graph Mining, specifically in the combination of symbolic and subsymbolic (statistical) methods for data analysis. He won the Billion Triple Challenge at the International Semantic Web Conference in 2008 and 2011. The goal of the Billion Triple Challenge is to demonstrate scalability of semantic technologies. Ansgar was an elected speaker at the Rising Stars Symposium of the Special Interest Group on Multimedia (SIGMM) of the Association for Computing Machinery (ACM), held in October in Amsterdam, honoring his 10 years of research in metadata mining and semantics. He has published over 150 peer-reviewed conference papers and journal articles.

Research Interests

My research interests are in novel approaches for data analysis that combine symbolic and statistical methods. I bring together methods from Information Retrieval, Data Mining and Machine Learning, and the Semantic Web. I apply these approaches to, for example, very large, distributed Knowledge Graphs on the web with billions of edges, or to large-scale document corpora in domains like life sciences/medicine, social sciences, economics, and the web.

I contribute to multiple research areas. As a contribution to combining Machine Learning and Semantic Web, I have used and compared methods like Association Rules and Learning to Rank to provide a tool for modeling semantic data on the web [see CV: C54, C53]. I have used regression models to keep data caches up-to-date based on predicted data changes [see CV: C58] and have analyzed the evolution of Knowledge Graphs with logistic regression models and random forests for the purpose of change verification [see CV: C61]. I have also investigated classical and modern machine learning methods for classifying texts into a semantic thesaurus, comparing the use of only the titles with the use of the full text of documents [see CV: C60]. In a work from January 2018, I showed that modern Deep Learning methods applied to a very large number of titles of scientific documents can yield competitive or even better classification results compared to using the full text [see CV: C63].

Regarding Information Retrieval and Semantic Web, I have developed a novel profiling method called HCF-IDF that combines the statistical strength of the popular TF-IDF model with the semantics of domain-specific thesauri [see CV: C55]. With HCF-IDF, I have demonstrated in an online study with n=123 economists that scientific paper recommendations based on only the titles of the publications are competitive with recommendations based on the full text. In addition, I have developed, with SchemEX, an approach for a stream-based computation of a schema-level index for very large distributed graph data [see CV: J12]. The index can be used to search the web for specific data sources, just as Google is used to search for web documents [see CV: C27]. The idea of a stream-based computation of an index over graph data won the Billion Triple Challenge of the International Semantic Web Conference in 2011.

Honors and Awards

Community Service

Ansgar has been an editor of the Journal of Web Semantics (JWS) since 2010. He is a program committee member for conferences including World Wide Web (WWW), ACM Multimedia (MM), Multimedia Modeling (MMM), Extended Semantic Web Conference (ESWC), and International Semantic Web Conference (ISWC). He also reviews for journals including Proceedings of the VLDB Endowment (PVLDB), IEEE Multimedia, Springer's Multimedia Systems and Multimedia Tools and Applications (MTAP), ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP), Journal of Web Semantics (JWS), and International Journal of Human-Computer Studies (IJHCS). Ansgar is co-organizer of several scientific events such as the ACM Workshop series on Events in Multimedia, held in conjunction with ACM Multimedia in Beijing, China, in 2009, Florence, Italy, in 2010, and Scottsdale, AZ, USA, in 2011. The workshop aims at bringing together different disciplines interested in detecting, processing, representing, and using events in multimedia and social media. Due to the workshop's success, the topic became its own area at the ACM Multimedia conference in 2012. Furthermore, Ansgar led the doctoral programme of INFORMATIK, the annual meeting of the German computer science society, in 2013, 2014, and 2015.


Keynote Talks

Invited Talks

Most Important Publications

For a complete list, please refer to the list here or to my DBLP page.

Supervised PhD Theses

Supervised Master Theses

(incomplete list)

Last update: 10/13/2020.