21st Conference on Database Systems for
Business, Technology and Web (BTW 2025)
March 3 - 7, 2025 | Bamberg, Germany
Conference Agenda
Overview and details of the sessions of this conference.
Session Overview

Session
BigDS 3: Workshop on Big (and Small) Data in Science and Humanities 3
Session Abstract
We have scheduled 20 minutes for each presentation, including the discussion.

Presentations
ADISS: Authority Data Integration Search System
University of Bamberg, Germany

This paper introduces ADISS, a generic search system designed to integrate heterogeneous authority file providers. Authority data is used to unambiguously identify entities such as persons, places, and organizations. Since no single data provider offers both the quantity and the quality of data required, combined access to multiple datasets is often needed to support real-world use cases. In the context of the Digital Humanities, this combination improves the resolution of ambiguities in data curation processes. Our work is mainly motivated by two projects that require semi-automatic retrieval as well as user-centered search scenarios across different authority file providers. Instead of using multiple existing endpoints to access the various datasets, we gather the heterogeneous data and make it accessible via integrated query and result models. In this paper, we present our highly configurable search API, which offers a diverse range of search and filtering options. We show that, owing to its generic and highly configurable nature, our system is adaptable and reusable for a diverse set of use cases, and we conclude the paper with ideas for further steps and improvements.
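The integrated query and result models described in the abstract could be sketched roughly as below. All class and function names here are hypothetical illustrations, not the actual ADISS API: one provider-agnostic query fans out to adapter objects wrapping each heterogeneous authority source, and their results are merged into a common record shape.

```python
from dataclasses import dataclass

@dataclass
class AuthorityQuery:
    # Hypothetical integrated query model: one query is fanned out to all providers.
    term: str
    entity_type: str = "person"   # e.g. "person", "place", "organization"
    limit: int = 10

@dataclass
class AuthorityRecord:
    # Hypothetical integrated result model: a provider-agnostic record shape.
    provider: str
    identifier: str
    label: str

class Provider:
    """Adapter interface that each heterogeneous data source implements."""
    name = "base"

    def search(self, query: AuthorityQuery) -> list[AuthorityRecord]:
        raise NotImplementedError

class InMemoryProvider(Provider):
    # Stand-in for a real authority file harvested into local storage.
    def __init__(self, name: str, records: list[tuple[str, str]]):
        self.name = name
        self._records = records

    def search(self, query: AuthorityQuery) -> list[AuthorityRecord]:
        hits = [AuthorityRecord(self.name, ident, label)
                for ident, label in self._records
                if query.term.lower() in label.lower()]
        return hits[: query.limit]

def integrated_search(providers: list[Provider],
                      query: AuthorityQuery) -> list[AuthorityRecord]:
    """Fan the same query out to every provider and merge the results."""
    results: list[AuthorityRecord] = []
    for provider in providers:
        results.extend(provider.search(query))
    return results
```

Real adapters would translate `AuthorityQuery` into each provider's native endpoint or index query; the point of the sketch is only the single shared query/result shape.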
Historic to FAIR: Leveraging LLMs for Historic Term Identification and Standardization
1Institut für Angewandte Informatik (InfAI), Germany; 2Freie Universität Berlin, Germany; 3Chair of Computational Humanities, University of Passau, Germany; 4Fraunhofer FOKUS, Berlin, Germany

As society and scientific research progress, so does the language used to describe concepts, species, and objects. With the amount of historical data available online constantly growing, the need for it to be Findable, Accessible, Interoperable, and Reusable (FAIR) has become increasingly apparent. This study tackles the challenge of identifying historical common species names and historical scientific names in historic biodiversity texts. The research further identifies five challenges when working with historical common names: changes in spelling, the creation of new terms, the shift from broad historical common names to more specific modern ones, the reverse shift from specific historical names to broader modern ones, and the renaming of historical common names. The research investigates the use of a large language model, GPT-4, to aid in the aforementioned entity detection process and to solve the identified challenges. The findings demonstrate that, given a small amount of context, the large language model can effectively identify the historical common species names and scientific names. On a test dataset, the LLM achieved a 92% success rate in accurately detecting the mentioned historical common names. Furthermore, 98% of the scientific terms were correctly identified. For four out of the five challenges of historical common names, the LLM was able to provide meaningful input. It was demonstrated that the LLM can match the historical common names to their modern-day counterparts, showing an embedded understanding of the evolution of biodiversity terminology. These results emphasize the potential of LLMs for making the data more findable, accessible, interoperable, and reusable.
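The small-context detection-and-matching step described above might look roughly like the following sketch. The prompt wording, function names, and `historic -> modern` output format are my assumptions for illustration, not the paper's actual prompts; the model is passed in as any `str -> str` callable so a GPT-4 client (or a test stub) can be plugged in.

```python
def build_prompt(sentence: str, context: str = "") -> str:
    """Assemble a small-context prompt asking the model to mark historical
    common names and scientific names (illustrative wording, not the paper's)."""
    return (
        "Identify all historical common species names and scientific names "
        "in the sentence below. For each historical common name, give its "
        "modern-day counterpart if one exists.\n"
        f"Context: {context}\n"
        f"Sentence: {sentence}\n"
        "Answer as 'historic -> modern' pairs, one per line."
    )

def normalize_terms(sentence: str, llm, context: str = "") -> dict[str, str]:
    """Call an LLM (any callable str -> str) and parse its answer into a
    mapping from historical name to modern-day counterpart."""
    reply = llm(build_prompt(sentence, context))
    pairs: dict[str, str] = {}
    for line in reply.splitlines():
        if "->" in line:
            historic, modern = (part.strip() for part in line.split("->", 1))
            pairs[historic] = modern
    return pairs
```

Keeping the model behind a plain callable makes the parsing logic testable without network access and independent of any particular LLM vendor.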
Investigating Zero-shot Topic Labelling of Scientific Papers Using LLMs
1Trier University, Germany; 2Schloss Dagstuhl LZI, dblp group, Germany

In this paper, we focus on the problem of adding content labels from a given vocabulary to scientific publications using LLMs. After a short overview of the current state of the work, we present a first implementation of a zero-shot classification pipeline. This implementation is already realized with a focus on extensibility and customizability, so that it can easily be used for different data sets and use cases in the future. We select a subset of the DBLP Discovery Dataset and execute our pipeline on it. In the end, we discuss the results, suggest a comparison with a second data set, the STTCL journal from the humanities, and present its challenges. Both of the mentioned data sets comply with the FAIR data principles. Finally, we consider our plans for the next steps.
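The core of such a zero-shot labelling pipeline could be sketched as below. This is a generic illustration, not the authors' implementation: the scoring model is a pluggable `(text, label) -> float` callable (e.g. an LLM asked whether a paper belongs to a label, or an NLI-style entailment scorer), which keeps the pipeline reusable across data sets as the abstract emphasizes.

```python
def zero_shot_label(text: str,
                    vocabulary: list[str],
                    score,
                    threshold: float = 0.5) -> list[tuple[str, float]]:
    """Assign every vocabulary label whose score clears the threshold.

    `score(text, label) -> float` is a pluggable model; swapping it out
    changes the classifier without touching the pipeline itself.
    Returns (label, score) pairs sorted by descending score.
    """
    scored = [(label, score(text, label)) for label in vocabulary]
    accepted = [(label, s) for label, s in scored if s >= threshold]
    return sorted(accepted, key=lambda pair: pair[1], reverse=True)
```

In a real run, `text` would be the title plus abstract of a DBLP record and `vocabulary` the fixed label set; the threshold trades precision against recall.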
Contact and Legal Notice · Privacy Statement · Conference: BTW 2025, Bamberg
Conference Software: ConfTool Pro 2.6.153+TC © 2001–2025 by Dr. H. Weinreich, Hamburg, Germany