Workshop: Deep Learning for Knowledge Graphs (DL4KG)
Organizers: Mehwish Alam, Davide Buscaldi, Michael Cochez, Francesco Osborne, Diego Reforgiato and Harald Sack
Over the past years, there has been rapid growth in the use and importance of Knowledge Graphs (KGs), along with their application to many important tasks. KGs are large networks of real-world entities described in terms of their semantic types and their relationships to each other. At the same time, Deep Learning has become an important area of research, achieving major breakthroughs in various fields, especially Natural Language Processing (NLP) and Image Recognition. To pursue more advanced methodologies, it has become critical that the Deep Learning, Knowledge Graph, and NLP communities join forces to develop more effective algorithms and applications. This workshop aims to reinforce the relationships between these communities and to be a focal point for work spanning Deep Learning, Knowledge Graphs, Natural Language Processing, Computational Linguistics, Big Data, and related topics.
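As a minimal illustration of how deep-learning-style methods operate on Knowledge Graphs, the sketch below scores triples with a TransE-style translation model. All embeddings are toy values invented for this example; they are not taken from any system described above.

```python
# Toy TransE-style scoring of Knowledge Graph triples.
# TransE models a relation r as a translation in embedding space:
# a triple (h, r, t) is plausible when h + r is close to t.

def transe_score(h, r, t):
    """Negative Euclidean distance ||h + r - t||; higher = more plausible."""
    return -sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)) ** 0.5

# Hypothetical 2-d embeddings, for illustration only.
emb = {
    "Berlin":     [1.0, 0.0],
    "Germany":    [1.0, 1.0],
    "France":     [4.0, 3.0],
    "capital_of": [0.0, 1.0],
}

# The true triple scores higher (closer to 0) than the corrupted one.
good = transe_score(emb["Berlin"], emb["capital_of"], emb["Germany"])
bad = transe_score(emb["Berlin"], emb["capital_of"], emb["France"])
print(good > bad)  # True
```

In a real system, the embeddings would be learned by gradient descent so that observed triples outscore corrupted ones; the scoring function itself is this simple.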
Workshop: Large Scale RDF Analytics (LASCAR)
Organizers: Hajira Jabeen, Damien Graux, Mohammed Saleem, Gezim Sejdiu and Jens Lehmann
The workshop on Large Scale RDF Analytics (LASCAR) invites papers and posters on the problems posed by the enormous growth of linked datasets and by the advancement of Semantic Web technologies in the domain of large-scale and distributed computing. LASCAR particularly welcomes research exploring the use of generic big data frameworks like Apache Spark and Apache Flink, or specialized libraries like Giraph, Tinkerpop, and SparkSQL, for Semantic Web technologies. The goal is to demonstrate the use of existing frameworks and libraries for Knowledge Graph processing and to discuss solutions to the challenges and issues arising therein. There will be a keynote by an expert speaker and a panel discussion among experts and scientists working in the area of distributed semantic analytics. LASCAR targets a range of interesting research areas in large-scale processing of Knowledge Graphs, such as querying, inference, and analytics; we therefore expect a wide audience to attend the workshop.
Workshop: Knowledge Graph Building (KGB)
Organizers: David Chaves-Fraga, Anastasia Dimou, Pieter Heyvaert, Freddy Priyatna and Juan F. Sequeda
More and more Knowledge Graphs are generated, whether for private use (e.g., Siri, Alexa) or public use (e.g., DBpedia, Wikidata). While techniques to automatically generate Knowledge Graphs from existing Web objects exist (e.g., scraping Web tables), most require extensive resources. Those with limited resources typically generate Knowledge Graphs by transforming the content of their datasets (RDB, CSV, etc.). Initially, generating Knowledge Graphs from existing datasets was considered an engineering task, but scientific methods have recently emerged: in particular, mapping languages for describing the rules that generate knowledge graphs, and processors to execute those rules. Addressing the challenges related to Knowledge Graph generation requires well-founded research, including the investigation of concepts and the development of tools and methods for their evaluation. KGB is a full-day workshop on Knowledge Graph generation with a special focus on mapping languages. Its main goal is to provide a venue for scientific discourse, systematic analysis and rigorous evaluation of languages, techniques and tools, as well as practical and applied experiences and lessons learnt, on generating knowledge graphs in academia and industry.
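The transformation step described above can be sketched in a few lines of Python. The mapping rule below is a hypothetical, hand-written dictionary, far simpler than real mapping languages such as RML; it turns each row of a CSV dataset into materialized RDF triples.

```python
import csv
import io

# A toy mapping rule: a subject IRI template plus column-to-predicate
# pairs. Real mapping languages (e.g., RML) declare this in RDF; this
# plain dict is for illustration only.
MAPPING = {
    "subject": "http://example.org/person/{id}",
    "predicates": {
        "name": "http://xmlns.com/foaf/0.1/name",
        "city": "http://example.org/vocab/city",
    },
}

def generate_triples(csv_text, mapping):
    """Materialize RDF triples (as Python tuples) from CSV content."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = mapping["subject"].format(**row)
        for column, predicate in mapping["predicates"].items():
            yield (subject, predicate, row[column])

data = "id,name,city\n1,Alice,Ghent\n2,Bob,Madrid\n"
triples = list(generate_triples(data, MAPPING))
print(len(triples))  # 4 triples: two per row
```

A mapping processor in the sense of the workshop does essentially this, but driven by a declarative, serialized mapping document rather than hard-coded logic.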
Tutorial: SANSA’s Leap of Faith: Scalable RDF and Heterogeneous Data Lakes
Organizers: Hajira Jabeen, Damien Graux, Mohammed Nadjib Mami, Gezim Sejdiu and Jens Lehmann
Scalable processing of Knowledge Graphs (KGs) is an important requirement for today's KG engineers. The Scalable Semantic Analytics Stack (SANSA) is a library built on top of Apache Spark that offers several APIs tackling various facets of scalable KG processing. SANSA is organized into several layers: (1) RDF data handling, e.g. filtering, computation of RDF statistics, and quality assessment; (2) SPARQL querying; (3) inference and reasoning; (4) analytics over KGs. In addition to processing native RDF, SANSA also allows users to query a wide range of heterogeneous data sources (e.g. files stored in Hadoop or other popular NoSQL stores) uniformly using SPARQL. This tutorial aims to provide an overview, a detailed discussion, and a hands-on session on SANSA, covering all the aforementioned layers through simple use cases.
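SANSA itself is a Scala library running on Spark; purely to illustrate what the RDF-statistics part of the data-handling layer computes, here is a plain-Python sketch (not SANSA's actual API) that counts predicate usage over a few N-Triples-style lines:

```python
from collections import Counter

# Minimal RDF statistics in the spirit of SANSA's "computation of RDF
# statistics": count how often each predicate occurs. This is a plain
# single-machine sketch; SANSA does the equivalent over Spark RDDs.
def predicate_counts(ntriples_lines):
    counts = Counter()
    for line in ntriples_lines:
        parts = line.split(None, 2)  # subject, predicate, rest of line
        if len(parts) == 3:
            counts[parts[1]] += 1
    return counts

graph = [
    '<http://ex.org/a> <http://ex.org/knows> <http://ex.org/b> .',
    '<http://ex.org/a> <http://ex.org/name> "Alice" .',
    '<http://ex.org/b> <http://ex.org/knows> <http://ex.org/c> .',
]
stats = predicate_counts(graph)
print(stats["<http://ex.org/knows>"])  # 2
```

In a distributed setting the same logic becomes a map (extract predicate) followed by a reduce (sum counts), which is exactly the shape of computation Spark parallelizes.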
Tutorial: Semantic Data Enrichment for Data Scientists
Organizers: Matteo Palmonari, Dumitru Roman, Vincenzo Cutrona, Nikolay Nikolov and Aljaž Košmerlj
The enrichment of a dataset with information from third-party data sources is a common data preparation task in data science. Semantic technologies and linked open data can provide valuable support for this task, paving the way for new data-scientist-friendly tools that ease effort-consuming, difficult and tedious data preparation activities. In this tutorial, we will provide the audience with: an explanation of the role that semantics plays in data enrichment for data science; a review of the advantages and limitations of the tools, methodologies and techniques for semantic data enrichment available today; and a practical dive into the creation of data transformations for enriching data and the use of the enriched data to train predictive models. For the latter, we will use tools that support the interactive specification of data transformations and their scalable execution on large datasets, and a use case where digital marketing data are enriched with weather data for predictive modeling.
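Conceptually, enrichment is a join between the analyst's dataset and a third-party source. The sketch below, with invented marketing and weather records, extends each campaign row with the weather on that day and place, producing the kind of widened table a predictive model would be trained on:

```python
# Toy semantic enrichment: extend marketing records with third-party
# weather attributes keyed on (date, city). All values are invented
# for illustration.
sales = [
    {"date": "2019-06-01", "city": "Milan", "clicks": 120},
    {"date": "2019-06-02", "city": "Milan", "clicks": 80},
]
weather = {  # stands in for a weather data service
    ("2019-06-01", "Milan"): {"temp_c": 24, "rain_mm": 0.0},
    ("2019-06-02", "Milan"): {"temp_c": 18, "rain_mm": 7.5},
}

def enrich(rows, lookup):
    """Return rows extended with matching third-party attributes."""
    return [{**row, **lookup.get((row["date"], row["city"]), {})}
            for row in rows]

enriched = enrich(sales, weather)
print(enriched[1]["rain_mm"])  # 7.5
```

The hard part in practice, and the focus of the tutorial, is reconciling the join keys: semantic technologies help link "Milan" in one dataset to the same real-world entity in another.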
Tutorial: Build a Question Answering system overnight
Organizers: Denis Lukovnikov, Gaurav Maheshwari, Jens Lehmann, Mohnish Dubey and Priyansh Trivedi
With this tutorial, we aim to provide participants with an overview of the field of Question Answering over Knowledge Graphs, insights into commonly faced problems, and its recent trends and developments. In doing so, we hope to provide a suitable entry point for people new to this field and ease their process of making informed decisions while creating their own QA systems. By the end of the tutorial, the audience will have hands-on experience of developing a working deep-learning-based QA system.
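A common baseline in KG question answering is to link entities in the question, pick a query template, and fill its slots. The sketch below shows that pipeline's skeleton with hypothetical templates and entity links and no learning at all; a deep learning system would instead rank templates and links with trained models:

```python
# Skeleton of template-based Question Answering over a Knowledge Graph.
# Templates and entity links are invented for illustration; in a real
# system both steps would be driven by learned models.
TEMPLATES = {
    "who wrote": "SELECT ?a WHERE {{ <{e}> <http://ex.org/author> ?a }}",
    "where is":  "SELECT ?p WHERE {{ <{e}> <http://ex.org/location> ?p }}",
}
ENTITY_LINKS = {"dracula": "http://ex.org/Dracula"}

def question_to_sparql(question):
    """Map a natural-language question to a filled SPARQL template."""
    q = question.lower().rstrip("?")
    for prefix, template in TEMPLATES.items():
        if q.startswith(prefix):
            entity = q[len(prefix):].strip()
            if entity in ENTITY_LINKS:
                return template.format(e=ENTITY_LINKS[entity])
    return None

query = question_to_sparql("Who wrote Dracula?")
print(query)
```

Running the resulting SPARQL query against the Knowledge Graph then yields the answer; the deep learning comes in when replacing the brittle string matching above with neural entity linking and query ranking.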
Tutorial: Practical and Scalable Pattern-based Ontology Engineering with Reasonable Ontology Templates
Organizers: Martin G. Skjæveland, Leif Harald Karlsen and Daniel P. Lupp
Ontology-based information systems are coming of age, and there is considerable industrial interest in using ontologies as a means for an effective and digitalised representation of information. To ensure successful adoption of these technologies, best-practice methodologies, languages and tools geared towards building large-scale ontologies and towards the needs of ontology experts, domain experts and end users are essential.
Tutorial participants will be introduced to the Reasonable Ontology Templates (OTTR) framework with which ontology modelling patterns can be efficiently represented and instantiated, and learn how such ontology abstraction mechanisms can be beneficial for building, interacting with, and maintaining large scale ontologies. A variety of tools will be introduced geared specifically to end users’, domain experts’ and ontology engineers’ needs. Participants will learn first-hand how to use these tools in order to efficiently create large ontologies.
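The core idea of templates as an ontology abstraction mechanism can be sketched as macro expansion: a named pattern with parameters expands into several axioms. The Python analogue below is a toy, nothing like OTTR's actual syntax or semantics, but it shows the shape of the mechanism:

```python
# Toy template expansion: a named pattern with parameters expands into
# several triples. OTTR defines this declaratively with its own RDF
# syntax; this dict-based version is only an analogue.
TEMPLATES = {
    "NamedPizza": [
        ("{pizza}", "rdf:type", "owl:Class"),
        ("{pizza}", "rdfs:subClassOf", "ex:Pizza"),
        ("{pizza}", "ex:hasTopping", "{topping}"),
    ],
}

def expand(template_name, **args):
    """Instantiate a template by substituting arguments into each triple."""
    return [tuple(part.format(**args) for part in triple)
            for triple in TEMPLATES[template_name]]

triples = expand("NamedPizza", pizza="ex:Margherita", topping="ex:Tomato")
print(len(triples))  # 3
```

The payoff is maintainability: correcting the pattern in one place updates every instantiation, which is why templates help when building and maintaining large ontologies.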
The tutorial is aimed at semantic web practitioners, ontology engineers, and those who wish to develop these skill sets but are having difficulties getting started with existing solutions. The tutorial will consist of presentations, plenary and individual exercises using open source tooling. Participants that wish to take part in the individual exercises should bring a laptop with Java 8 installed.
Tutorial: Querying Linked Data with Comunica
Organizers: Ruben Taelman, Joachim Van Herwegen, Miel Vander Sande and Ruben Verborgh
Querying Linked Data on the Web is a non-trivial endeavour because of the heterogeneity of Linked Data publication interfaces and the large variety of querying algorithms. We recently introduced a meta query engine, called Comunica, as a research platform that offers a way to cope with this complexity. To enable researchers to easily get started with Comunica, we offer this introductory tutorial.
The tutorial consists of an overview of the capabilities of this platform, and demonstrates its usage for research purposes. As a result, participants from different backgrounds will be able to query Linked Data with Comunica, and modify the querying process with custom algorithms. Ultimately, this will reduce the effort needed to develop and evaluate new Linked Data querying techniques.
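The motivation for a meta engine can be sketched abstractly: each publication interface exposes triples differently, and the engine evaluates a single query over all of them. The Python sketch below uses invented interfaces and data, and bears no resemblance to Comunica's actual (JavaScript) API; it only illustrates the unification idea:

```python
# Abstract sketch of one query over heterogeneous Linked Data
# interfaces: each source type answers a triple pattern in its own
# way, and the engine merges the results. Everything here is invented.

class InMemorySource:
    def __init__(self, triples):
        self.triples = triples

    def match(self, predicate):
        return [t for t in self.triples if t[1] == predicate]

class LineSource:  # stands in for e.g. a file- or server-backed interface
    def __init__(self, lines):
        self.lines = lines

    def match(self, predicate):
        return [tuple(line.split()) for line in self.lines
                if line.split()[1] == predicate]

def query(sources, predicate):
    """Evaluate one triple-pattern query across all sources."""
    return [t for source in sources for t in source.match(predicate)]

sources = [
    InMemorySource([("ex:a", "ex:knows", "ex:b")]),
    LineSource(["ex:c ex:knows ex:d", "ex:c ex:name Carol"]),
]
print(len(query(sources, "ex:knows")))  # 2
```

A modular engine goes further by making the per-source logic and the query algorithms pluggable, which is what lets researchers swap in custom algorithms without rebuilding the rest.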
Tutorial: Continuous analytics on linked data streams
Organizers: Riccardo Tommasini, Robin Keskisärkkä, Jean-Paul Calbimonte, Eva Blomqvist and Emanuele Della Valle
The goal of this tutorial is to outline how to develop and deploy a stream processing application in a Web environment in a reproducible way. To this end, we intend to (1) survey existing research outcomes from Stream Reasoning / RDF Stream Processing on querying and reasoning over a variety of highly dynamic data, (2) introduce stream reasoning techniques as powerful tools for addressing data-centric problems characterised by both variety and velocity (such as those typically found on the modern Web), (3) present a relevant Web-centric use case that requires addressing data velocity and variety simultaneously, and (4) guide the participants through the development of a Web stream processing application.
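The velocity side of the problem is usually handled with window operators over the stream. The sketch below is a toy tumbling-window count over timestamped triples, not the syntax or semantics of any particular RDF Stream Processing engine:

```python
from collections import defaultdict

# Toy tumbling-window aggregation over a timestamped triple stream,
# in the spirit of RDF Stream Processing; not any engine's real API.
stream = [  # (timestamp_seconds, subject, predicate, object)
    (0,  "ex:s1", "ex:observes", "21"),
    (3,  "ex:s1", "ex:observes", "22"),
    (7,  "ex:s2", "ex:observes", "19"),
    (12, "ex:s1", "ex:observes", "23"),
]

def tumbling_counts(events, width):
    """Count events per non-overlapping window of `width` seconds."""
    windows = defaultdict(int)
    for timestamp, *_ in events:
        windows[timestamp // width] += 1
    return dict(windows)

print(tumbling_counts(stream, 5))  # {0: 2, 1: 1, 2: 1}
```

Real engines continuously evaluate declarative queries over such windows; variety is then addressed by the fact that the windowed items are RDF triples rather than opaque records.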
Tutorial: Generating and querying (Virtual) Knowledge Graphs from heterogeneous data sources
Organizers: David Chaves-Fraga, Ahmad Alobaid, Andrea Cimmino, Freddy Priyatna and Oscar Corcho
Despite the emergence of RDF knowledge bases, exposed via SPARQL endpoints or as Linked Data, formats like CSV, JSON or XML are still the most widely used for exposing data on the Web. Solutions have been proposed to describe and integrate these resources using mapping languages (e.g. RML, CSVW, kR2RML), and many of them are equipped with associated RDF generators (e.g. RML-Mapper, the CSVW generator). As these solutions generate materialized RDF, they cannot efficiently deal with volatile data or provide a SPARQL entry point directly over the data sources.
In this tutorial, we explain how to use a suite of tools to manage and exploit data in heterogeneous formats (CSV, RDB, JSON, or REST APIs) without the need to load the resulting RDF into a triple store to query it. First, we present TADA, a tool for automatically annotating CSV files using existing Knowledge Graphs. Second, we present HELIO, a Linked Data publisher that provides unified real-time access to multiple heterogeneous data sources. Finally, we present an OBDA approach to exploiting CSV files published on the Web, providing access via SPARQL or GraphQL.
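The contrast with materialization can be sketched: instead of generating all triples up front, a virtual approach evaluates each query against the source on demand. The toy example below uses an invented mapping, far simpler than real OBDA mappings, to answer a single triple pattern straight from a CSV source:

```python
import csv
import io

# Toy virtual Knowledge Graph access: no RDF is materialized; the
# triple pattern is answered straight from the CSV source. The mapping
# is invented and far simpler than real OBDA mappings (e.g., R2RML).
CSV_SOURCE = "id,name\n1,Alice\n2,Bob\n"
MAPPING = {  # predicate -> (subject IRI template, source column)
    "foaf:name": ("http://example.org/person/{id}", "name"),
}

def answer_pattern(predicate):
    """Evaluate the pattern (?s, predicate, ?o) on demand against the CSV."""
    subject_template, column = MAPPING[predicate]
    return [(subject_template.format(**row), row[column])
            for row in csv.DictReader(io.StringIO(CSV_SOURCE))]

print(answer_pattern("foaf:name"))
```

Because nothing is materialized, a change in the CSV is visible to the very next query, which is what makes the virtual approach suitable for volatile data.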