21st Conference on Database Systems for
Business, Technology and Web (BTW 2025)
March 3 - 7, 2025 | Bamberg, Germany
Conference Agenda
Overview and details of the sessions of this conference.
Session Overview
Demo: Demo Reception
Presentations
RAGONITE: Iterative Retrieval on Induced Databases and Verbalized RDF for Conversational QA over KGs with RAG Fraunhofer IIS, Germany Conversational question answering (ConvQA) is a convenient means of searching over RDF knowledge graphs (KGs), where a prevalent approach is to translate natural language questions to SPARQL queries. However, SPARQL has certain shortcomings: (i) it is brittle for complex intents and conversational questions, and (ii) it is not suitable for more abstract needs. Instead, we propose a novel two-pronged system where we fuse: (i) SQL-query results over a database automatically derived from the KG, and (ii) text-search results over verbalizations of KG facts. Our pipeline supports iterative retrieval: when the results of any branch are found to be unsatisfactory, the system can automatically opt for further rounds. We put everything together in a retrieval augmented generation (RAG) setup, where an LLM generates a coherent response from accumulated search results. We demonstrate the superiority of our proposed system over several baselines on a knowledge graph of BMW automobiles.
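The iterative, two-pronged retrieval loop described in this abstract can be sketched roughly as follows. This is a hypothetical illustration, not the authors' implementation: the function names, the judging heuristic, and the toy data are all invented for this sketch.

```python
# Hypothetical sketch of the two-pronged retrieval loop: fuse SQL results
# over the induced database with text search over verbalized RDF facts,
# and retry when the accumulated evidence is unsatisfactory.

def retrieve(question, sql_branch, text_branch, judge, max_rounds=3):
    """Collect evidence from both branches, iterating until the judge
    (an LLM in the real system; a heuristic here) is satisfied."""
    evidence = []
    for _ in range(max_rounds):
        evidence += sql_branch(question)   # structured results from the induced DB
        evidence += text_branch(question)  # passages from verbalized KG facts
        if judge(question, evidence):      # deemed sufficient -> stop iterating
            break
    return evidence

# Toy stand-ins for the two branches and the judge:
sql = lambda q: [("BMW X5", "horsepower", 335)]
txt = lambda q: ["The BMW X5 xDrive40i produces 335 hp."]
enough = lambda q, ev: len(ev) >= 2

print(retrieve("How powerful is the X5?", sql, txt, enough))
```

In the demonstrated system, an LLM would then generate a coherent response from the accumulated evidence.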
Monitoring of Heterogeneous Datastores in Poly- and MultiStores University of Hamburg, Germany Due to the growing number of vastly different datastores, there have been several attempts to hide the complexity of selecting and combining these stores behind a common interface. However, these Poly- and MultiStores come with new challenges such as query planning, query optimisation and data placement. Given that polyglot systems typically do not have detailed information about system utilisation and data distributions at their disposal, we developed a monitoring system that is a first attempt to close this gap. Our system, presented in this demonstration, is able to measure system characteristics such as query execution times, memory consumption or the current number of connections for heterogeneous datastores. Furthermore, we added mechanisms to classify the distribution of attributes, discover functional dependencies and calculate selectivities for attribute values. During the on-site demonstration, we visualise the data collected by our monitoring system using a web interface in combination with carefully selected example data and example query workloads.
A Data Quality Dashboard for (Security) Knowledge Graphs 1Software Competence Center Hagenberg; 2Hasso Plattner Institute, Germany; 3LIMES Security GmbH Knowledge graphs play a crucial role in storing and reusing domain knowledge for data analytics. For example, knowledge graphs can be used to model security domain knowledge (such as technical standards) to support software architects in developing secure software systems. Assessing and assuring the quality of the data in these graphs is critical to enable trust in the use of this machine-readable domain knowledge, and to ensure high-quality results for downstream tasks that build on this knowledge. In this paper, we present a visual data quality dashboard, which allows domain experts to verify the quality of their domain knowledge graph along different dimensions. We demonstrate the use of the dashboard by means of a previously built security knowledge graph.
A Demonstration of Skyrise: A Serverless Query Processor Hasso Plattner Institute, University of Potsdam, Germany Data processing systems are increasingly being deployed in the cloud, because of the cost-effectiveness of short-term resource provisioning. In recent years, serverless cloud computing has embodied highly elastic resource pools. This elasticity has the potential to make cloud-based systems more cost-efficient, preventing resource over-provisioning and under-provisioning. In this paper, we demonstrate Skyrise, a serverless query processor for infrequent in-situ analytics on cold data in cloud storage. We highlight Skyrise's capabilities to run entirely on serverless compute resources and to complement them with virtual servers and cloud object storage.
A Showcase of LLMs in Action: SQL Generation from Natural Language (Demo Paper) OTH Regensburg, Germany Today, large language models are a very efficient tool for human-computer interaction using natural language. Chatbots like ChatGPT and their corresponding APIs can be used to solve a large variety of tasks that are provided in human-comprehensible sentences, for example generating SQL queries. Executing spoken SQL queries in a database system poses a challenge, because the various syntactical details of SQL are usually not provided verbally. Here, the LLM can help to augment the recognized raw query with the syntax elements needed for successful execution. Furthermore, the correct spelling of table and column names can be derived from the database schema provided in the LLM prompt. This work showcases four use cases in which LLMs assist in querying database systems: (1) A plugin for phpMyAdmin for voice-query input in natural language, (2) a chart generator, (3) an Alexa skill, and (4) a speech-controlled action game SQL Invaders.
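The schema-aware prompting described in this abstract can be sketched as follows. This is a hypothetical illustration: the actual model call is elided, and the schema and prompt wording are invented for this sketch, not taken from the demonstrated plugins.

```python
# Illustrative prompt construction for schema-aware SQL generation:
# embedding the schema lets the LLM emit correctly spelled table and
# column names, as the abstract describes.

def build_sql_prompt(question, schema):
    """Render the schema as DDL and prepend it to the question."""
    ddl = "\n".join(
        f"CREATE TABLE {t} ({', '.join(cols)});" for t, cols in schema.items()
    )
    return (
        "Translate the question into a single SQL query.\n"
        f"Schema:\n{ddl}\n"
        f"Question: {question}\nSQL:"
    )

# Made-up example schema, loosely inspired by the SQL Invaders use case:
schema = {"invaders": ["id INT", "name TEXT", "hits INT"]}
prompt = build_sql_prompt("Which invader was hit most often?", schema)
print(prompt)
```

The resulting prompt would then be sent to a chat-completion API, and the returned SQL executed against the database.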
Discovering Suitable Anonymization Techniques: A Privacy Toolbox for Data Experts 1Mercedes-Benz AG, Germany; 2University of Stuttgart, Germany Identifying the appropriate anonymization technique is a critical yet challenging task for developers, data scientists, and security practitioners. Our interactive toolbox addresses this challenge by providing a comprehensive overview of available anonymization techniques to assist privacy-conscious developers in selecting the right one for their specific use cases. The toolbox offers a hierarchical and classified overview of techniques, each detailed with meta-model information. It employs a modular approach, allowing techniques to be implemented and deployed independently. Additionally, it enables developers to evaluate these techniques on test datasets. Our toolbox allows for the easy addition of new categories and modules. This paper demonstrates the anonymization toolbox’s capabilities, simplifying the decision-making process in the Anonymization by Design cycle by ensuring overview, modularity, and flexibility.
Segmify: A Deep Learning-Based Interactive Tool for Real-Time Cell Segmentation and Morphological Analysis 1Fraunhofer ITEM, Germany; 2Goethe University Frankfurt, Germany Accurately segmenting cells in microscopy images is vital for biomedical research, yet remains challenging due to issues like overlapping cells, noise, and low contrast. We address these challenges with an interactive dashboard powered by a U-Net++ model. Our system integrates Contrast Limited Adaptive Histogram Equalization (CLAHE) for enhanced segmentation in low-contrast images, along with custom loss functions and data augmentation to boost accuracy. The tool supports both automated and manual segmentation, allowing real-time parameter adjustments and morphological analysis on the cellular level. This versatile platform provides researchers with the flexibility to adapt to diverse datasets, ensuring high-quality results.
Generating Federated REST API Servers RPTU Kaiserslautern-Landau, Germany The application of REST APIs for data transfer on the web is ubiquitous. In many cases, such APIs are described in the popular OpenAPI format that allows a standardized, human- and machine-readable description of provided services. In this paper, we propose the demonstration of a generator that produces a complete federated REST API server from a specification when multiple instances of compatible services are deployed. The resulting software not only provides an integrated view but is also able to apply optimizations to query processing while maintaining compatibility with the original API specification.
ReCLAIM: An Integrated Platform for Data on Nazi-Looted Cultural Assets Hasso Plattner Institute, Germany During the Second World War, the National Socialists looted cultural assets from people of Jewish descent. The looted assets were documented, first by the perpetrators and later by the Allies. The resulting archival artifacts are scattered across sources, complicating search and linking of the entries across those sources. The ReCLAIM platform collects, prepares, standardizes, and links archival data on Nazi-looted cultural assets from various sources. Designed for both provenance researchers and non-experts, it offers user-friendly features, such as intuitive full-text search, advanced search options for refined queries, and easy-to-use comparison options. Because the underlying data is only partially standardized and subject to errors from OCR digitization processes, a critical aspect of data preparation ensures that original values are preserved alongside processed counterparts. By streamlining access to this information, ReCLAIM aims to support provenance research and facilitate the study of cultural heritage affected by historical injustice.
Upcycling UnivIS: Discovering Study Planning 2.0 University of Bamberg, Germany Technical innovation in university settings often progresses slowly, with legacy systems frequently remaining in use for years. This can lead to outdated technologies that no longer meet user needs. In this demonstration, we explore the revival of such legacy systems through the example of UnivIS, a course administration system used at the University of Bamberg for over two decades. We introduce Baula, a digital study assistant that builds on UnivIS data and functionality, integrating them into a modern, user-friendly interface. By upcycling existing infrastructure, Baula addresses significant student concerns regarding usability and functionality. This demonstration illustrates how such legacy systems can be modernized to better align with contemporary user requirements, showcasing a model for innovation within a traditionally slow-moving environment.
Interactive specification and visualization of group movement patterns in urban traffic data streams 1University of Marburg, Germany; 2Technical University of Marburg, Germany The rapid growth of continuously generated spatio-temporal data streams has become the source for many innovative applications, e.g., in the context of smart cities. This demo addresses the challenge of detecting group patterns in spatio-temporal data streams derived from urban movement data, a problem that has received limited attention in the streaming context so far. Based on a group pattern operator known from spatio-temporal databases, it presents the first extension of the operator to data streams fully integrated into our JEPC event processing system. In order to support users of JEPC in specifying and optimizing queries, an interactive workflow is provided with a visual component for immediate feedback of group pattern queries. Our contributions are the workflow for online specification and visualization of spatio-temporal group pattern queries and a discussion of their efficient implementations. Moreover, the demo is based on a large-scale dataset of urban movement data revealing the benefits of group patterns for urban traffic analysis.
Disaggregated Pipeline Grouping LIVE Technische Universität Dresden, Germany The need for highly scalable systems grows with the amount of data that has to be processed. Traditional approaches such as scale-up or scale-out increasingly reach their limits. Disaggregated systems offer a solution through their very high scalability. However, they come at the cost of substantial data transfer over the network. We demonstrate how redundant data accesses can be reduced in disaggregated systems through our pipeline grouping approach.
Plaquette: Visualizing Redundancies in Relational Data University of Passau, Germany Functional dependencies are a fundamental topic in database education, yet students often find the concept abstract. To foster understanding of functional dependencies that capture redundancies in relational data, we propose an educational software tool called Plaquette. This tool offers a novel visualization of redundancies by coloring cells that contain redundant data, where deeper hues indicate stronger redundancies. In analogy to plaque tests at the dentist's office, we refer to these redundancies as "plaque". In our demo, we begin by recapping the definition of functional dependencies, followed by introducing our concept of plaque. Demo participants are then invited to interact with Plaquette, and assess sample scenarios that showcase small examples of relations and functional dependencies. By swiping through scenarios with correct but also fake plaque on a tablet in Tinder-style interaction, participants can playfully improve their intuition for the concept of functional dependencies. Beyond contributing a new teaching tool, our ultimate goal is to assist data analysts explore redundancies in real-world data, using tools like Plaquette.
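The "plaque" idea from this abstract can be sketched in a few lines: given a functional dependency X → Y, a Y-cell is redundant whenever the same X-value (and hence, by the FD, the same Y-value) occurs in more than one tuple, and the intensity of the coloring grows with the repetition count. The function name, column names, and data below are illustrative, not Plaquette's actual code.

```python
# Minimal sketch of redundancy detection under an FD lhs -> rhs:
# a rhs-cell is "plaque" when its lhs-value repeats across tuples.

from collections import Counter

def plaque(rows, lhs, rhs):
    """Return {row_index: intensity} for redundant rhs-cells, where the
    intensity is the number of tuples sharing that lhs-value."""
    counts = Counter(row[lhs] for row in rows)
    return {i: counts[row[lhs]] for i, row in enumerate(rows)
            if counts[row[lhs]] > 1}

rows = [
    {"zip": "96047", "city": "Bamberg"},
    {"zip": "96047", "city": "Bamberg"},
    {"zip": "10115", "city": "Berlin"},
]
print(plaque(rows, "zip", "city"))  # rows 0 and 1 carry redundant city values
```

Deeper hues in the tool would correspond to higher intensity values in this sketch.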
History-Based Active Learning Universität Augsburg, Germany In this demo, we will show how Active Learning (AL) can be used to establish and transfer classification information over partially/loosely related datasets, in particular fine-grained user roles on social media, with many, unbalanced classes, a large number of data points, and different internal structures or drift over time. The key idea is to incorporate the history of learning steps into the tool, allowing us to analyze, restart, and modify the transfer. We also provide a rich visualization that allows the human oracle to interpret the most critical cases.
ReProVide: Query Optimisation and Near-Data Processing on Reconfigurable SoCs for Big Data Analysis Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany The goal of ReProVide is to provide novel hardware and optimisation techniques for scalable, high-performance processing of Big Data. The Programmable System-on-Chip (PSoC) architecture of ReProVide includes a reconfigurable FPGA for the support of hardware accelerators for various operators on relational and streaming data. Such PSoCs can be used to process data directly at the source, such as data from attached NVMes, using application-specific accelerators. For example, compute-intensive tasks such as JSON parsing can be offloaded to the hardware accelerators, reducing CPU load. In addition, reducing the volume of data at an early stage avoids unnecessary data movements, resulting in lower energy consumption. This demo illustrates the opportunities and benefits of hardware-reconfigurable, FPGA-based PSoCs for near-data processing. The demo allows users to run two queries and select which operations should be pushed onto the SoC for near-data hardware acceleration. From no acceleration to maximum acceleration, a 52x improvement in throughput and 67x lower energy consumption can be observed.
Incremental Stream Query Merging In Action 1BIFOLD; 2TU Berlin, Germany Stream Processing Engines (SPEs) execute long-running queries on unbounded data streams. However, they primarily focus on achieving high throughput and low latency for a single query. To deploy multiple queries, the users instead scale the infrastructure, executing each query in isolation. As a result, SPEs overlook potential data and computation-sharing opportunities among several long-running queries. As streaming queries are continuous and long-running, identifying sharing opportunities among newly arriving and existing queries can reduce resource utilization. This allows for deploying more queries without the need to scale the infrastructure. In this demonstration, we present Incremental Stream Query Merging (ISQM), an end-to-end solution to identify and maintain sharing among stream queries. We showcase six different types of sharing identification techniques and their impact on query optimization and execution time.
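One simple form of the sharing identification this abstract mentions can be illustrated as follows: two streaming queries can share work when their operator chains have a common prefix. The operator representation and query contents below are invented for this sketch and do not reflect ISQM's actual techniques.

```python
# Toy illustration of sharing identification between streaming queries:
# operators up to the first divergence can be executed once and shared.

def shared_prefix(q1, q2):
    """Return the longest common operator prefix of two query plans."""
    shared = []
    for a, b in zip(q1, q2):
        if a != b:
            break
        shared.append(a)
    return shared

q_old = [("source", "cars"), ("filter", "speed>50"), ("window", "5s")]
q_new = [("source", "cars"), ("filter", "speed>50"), ("map", "speed*2")]
print(shared_prefix(q_old, q_new))  # the source and filter can be shared
```

ISQM goes well beyond prefix matching, showcasing six different sharing identification techniques, but the principle of executing common work once is the same.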
Compression in Main Memory Database Systems: Cost and Performance Trade-Offs of Workload-Driven Data Encoding Hasso Plattner Institute, Germany Automating physical design optimizations of database systems is challenging. Recent work on index selection or data compression has shown significant advantages of automated approaches. However, the impact on running systems is often hard to predict. Moreover, automated systems often lack the capabilities to help users understand the decisions taken. In this demonstration, we study the impact of optimal encoding configurations for in-memory database systems. We allow the user to set varying main memory budgets for which optimal encoding configurations are applied, as well as allow the user to manually configure the system. Effects on runtime performance and memory consumption can be directly observed. The user can further analyze the impact compression has on overall memory consumption and how compression ratios affect performance when the memory bandwidth is saturated.
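Selecting an encoding configuration under a memory budget, as described in this abstract, can be framed as a small constrained optimization. The sketch below is a hypothetical illustration with invented size and cost numbers, not the demonstrated system's selection algorithm, which would use workload-driven estimates.

```python
# Hypothetical illustration of budget-constrained encoding selection:
# exhaustively pick one encoding per column so that total size fits the
# memory budget while the estimated scan cost is minimized.

from itertools import product

def best_config(columns, budget):
    """columns: {name: [(encoding, size, scan_cost), ...]}.
    Returns (total_cost, {column: encoding}) or None if infeasible."""
    names = list(columns)
    best = None
    for choice in product(*(columns[n] for n in names)):
        size = sum(c[1] for c in choice)
        cost = sum(c[2] for c in choice)
        if size <= budget and (best is None or cost < best[0]):
            best = (cost, dict(zip(names, (c[0] for c in choice))))
    return best

# Invented per-column candidates (encoding, size in MB, relative scan cost):
cols = {
    "id":   [("plain", 8, 1.0), ("delta", 3, 1.2)],
    "name": [("plain", 20, 1.0), ("dict", 6, 1.1)],
}
print(best_config(cols, budget=10))  # a tight budget forces compressed encodings
```

Lowering the budget in this sketch mirrors the demo's interaction: tighter memory budgets push the system toward stronger compression at some runtime cost.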
Contact and Legal Notice · Privacy Statement · Conference: BTW 2025 Bamberg
Conference Software: ConfTool Pro 2.6.153+TC © 2001–2025 by Dr. H. Weinreich, Hamburg, Germany