Open Repositories Conference 2024 - ConfTool Pro Printout

Session

Developer Track Session 3

Time:

Thursday, 06/June/2024:

09:00 - 10:30

Session Chair: Jonas Gilbert, University of Borås

Location: Drottningporten 2

200

Presentations

Repositories and Computation: Crossover Episode

Maximilian Moser, Martin Weise, Sotirios Tsepelakis, Tomasz Miksa, Andreas Rauber

TU Wien, Austria

Research is becoming increasingly data-driven, with substantial amounts of information being generated and analyzed to produce new insights and discoveries. As Virtual Research Environments (VREs) such as JupyterHub [1] become more widely available, the audience using these compute resources is becoming more heterogeneous and demands seamless identity management, data source integration, and high availability from VRE service providers. To make VRE research outputs more FAIR, they can be deposited in dedicated research data repositories. At TU Wien, we provide two different research data repositories [2,3] for the publication of research data results, and Jupyter [4] for data processing as part of our VRE.

To improve the user experience, we are building a library for use in Jupyter notebooks that lets researchers work with datasets directly in the VRE. The repositories provide the data seamlessly in the background, and the retrieval details are abstracted away from the researcher.

Because the library only depends on public APIs, it is not tied to our VRE and can be used in other deployments as well. Further, we aim to keep the design of the library intentionally simple so that it can be extended with support for additional repository types.

[1] https://jupyter.org/hub

[2] https://researchdata.tuwien.ac.at

[3] https://www.ifs.tuwien.ac.at/infrastructures/dbrepo/

[4] https://jupyter.hpc.tuwien.ac.at/
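
As a rough illustration of the extensibility goal described in the abstract above, the following sketch shows one way a notebook-side library could hide different repository types behind a common interface. All names (`RepositoryBackend`, `register_backend`, `load_dataset`, the `mem` scheme) are hypothetical and are not taken from the TU Wien library; the in-memory backend merely stands in for a real client that would call a repository's public API.

```python
from typing import Callable, Dict, Protocol


class RepositoryBackend(Protocol):
    """Hypothetical interface that every repository type would implement."""

    def fetch(self, identifier: str) -> bytes:
        ...


# Registry mapping a scheme such as "mem" to a backend factory (assumed design).
_BACKENDS: Dict[str, Callable[[], RepositoryBackend]] = {}


def register_backend(scheme: str, factory: Callable[[], RepositoryBackend]) -> None:
    """Make a repository type available under the given scheme."""
    _BACKENDS[scheme] = factory


def load_dataset(reference: str) -> bytes:
    """Resolve 'scheme:identifier' to a backend and return the raw bytes."""
    scheme, sep, identifier = reference.partition(":")
    if not sep or scheme not in _BACKENDS:
        raise ValueError(f"no backend registered for {reference!r}")
    return _BACKENDS[scheme]().fetch(identifier)


class InMemoryBackend:
    """Stand-in for a real repository client; serves data from a dict."""

    def __init__(self, store: Dict[str, bytes]) -> None:
        self._store = store

    def fetch(self, identifier: str) -> bytes:
        return self._store[identifier]


# Illustrative data only; a real backend would query a repository API instead.
_demo_store = {"example-1": b"a,b\n1,2\n"}
register_backend("mem", lambda: InMemoryBackend(_demo_store))
```

Under these assumptions, a notebook cell would call `load_dataset("mem:example-1")`, and supporting another repository type would only require registering one more backend.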

The FAIR Signposting Validator

Martin Klein¹, James Powell¹, Herbert Van de Sompel²

¹Los Alamos National Laboratory, United States of America; ²Data Archiving and Networked Services

Increasing the level of FAIRness of resources is high on the agenda of repository developers and managers. The FAIR Signposting profile offers concrete recipes to approach this goal and has therefore seen an increased level of adoption in the international repository community. To further support developers, we have designed and implemented the FAIR Signposting Validator. The validator provides immediate feedback to users regarding the syntactic and semantic validity of their FAIR Signposting implementation. This presentation will outline the design, implementation details, and utility of the validator. We will share the source code openly for the developer community, in order to enable local installations.
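
To give a sense of the kind of check such a tool might perform, here is a minimal sketch that parses an HTTP `Link` header value and reports which link relation types are absent. It is not the validator's code. The regular-expression parser handles only simple header values, and the set of expected relations (`cite-as`, `describedby`, `item`, `type`) is an illustrative assumption; the FAIR Signposting profile itself defines which relations apply to which resource. The example header and its URLs are made up.

```python
import re
from typing import Dict, Iterable, List, Set

# One link-value: <target> followed by zero or more ;name=value parameters.
_LINK = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,]*))*)')
_PARAM = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,]*))')

# Illustrative assumption, not the normative list from the profile.
ILLUSTRATIVE_RELATIONS = ("cite-as", "describedby", "item", "type")

# Made-up header value for demonstration.
EXAMPLE_HEADER = (
    '<https://doi.org/10.1234/example>; rel="cite-as", '
    '<https://example.org/meta.json>; rel="describedby"; type="application/json"'
)


def parse_link_header(value: str) -> List[Dict[str, str]]:
    """Split a Link header value into dicts with 'href' plus parameters."""
    links = []
    for target, params in _LINK.findall(value):
        link = {"href": target}
        for name, quoted, bare in _PARAM.findall(params):
            link[name.lower()] = quoted or bare.strip()
        links.append(link)
    return links


def relations(links: Iterable[Dict[str, str]]) -> Set[str]:
    """Collect all relation types; a rel value may list several, space-separated."""
    found: Set[str] = set()
    for link in links:
        found.update(link.get("rel", "").lower().split())
    return found


def missing_relations(header_value: str,
                      expected: Iterable[str] = ILLUSTRATIVE_RELATIONS) -> Set[str]:
    """Return the expected relation types that the header does not provide."""
    return set(expected) - relations(parse_link_header(header_value))
```

For the made-up `EXAMPLE_HEADER`, `missing_relations(EXAMPLE_HEADER)` reports `item` and `type` as absent. This is a presence check only and says nothing about semantic validity.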

Persisting complex, hierarchical repository content in an S3 object store

Paul Walk¹, Anusha Ranganathan²

¹Antleaf Ltd.; ²Cottage Labs

Antleaf and Cottage Labs are working with Ruhr University Bochum (RUB) to develop a Research Data Management System (RDMS) based on a Samvera Hyrax 3 repository. Three key requirements have, taken together, led to some interesting challenges:

1. RDMS must support multiple, complex data/metadata models.

2. RDMS must preserve data privacy within each research group, until it is ready for publishing.

3. RDMS must persist all data and metadata in RUB's S3 object storage facility.

The presentation will give a brief overview of the three requirements, before describing the challenges that these requirements introduce both separately and in conjunction. It will then describe the technical solutions to these challenges, commenting on their effectiveness and any trade-offs that have had to be accommodated.

The ORA Data Preservation Service – a lightweight, open-source, digital repository solution

Tom Wrobel

University of Oxford, United Kingdom

Digital preservation and data preservation are key concerns for repository owners and stakeholders. However, implementations of system-independent, versioned storage for digital repository content are often proprietary and expensive. Here, we present an open-source, decoupled, remote solution for preserving repository content in an OCFL (Oxford Common File Layout) file system, using the Fedora 6 repository application to manage ingest and export.

FAIRiCat: Supporting Discovery of a Repository's Interoperability Affordances

Herbert Van de Sompel¹, Martin Klein², Patrick Hochstenbach³

¹DANS, Austria; ²Los Alamos National Laboratory, USA; ³Ghent University, Belgium

A new effort under the “Signposting the Scholarly Web” umbrella specifies how repositories can advertise the interoperability affordances they support by publishing a FAIR Interoperability Catalog, or FAIRiCat for short. It also specifies how FAIRiCats can be discovered. Discovery supports gaining insight into the nature of a repository’s investments in interoperability, finding the machine entry points of repository-wide affordances (e.g. a SPARQL endpoint), and finding examples of affordances that are available when interacting with individual objects managed by the repository (e.g. IIIF support).
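
As a hedged sketch of how a client might read such a catalog, the code below assumes the catalog is serialized as a JSON Linkset (RFC 9264), which the abstract itself does not state. The example document, its URLs, and the use of the `service-doc` relation are illustrative assumptions, and `iter_links` is a hypothetical helper name.

```python
import json
from typing import Iterator, Tuple

# Made-up catalog in JSON Linkset form: each context object has an "anchor"
# plus one member per link relation type, holding a list of target objects.
EXAMPLE_CATALOG = json.loads("""
{
  "linkset": [
    {
      "anchor": "https://repo.example.org/sparql",
      "service-doc": [
        {"href": "https://repo.example.org/docs/sparql", "type": "text/html"}
      ]
    },
    {
      "anchor": "https://repo.example.org/oai",
      "service-doc": [
        {"href": "https://repo.example.org/docs/oai", "type": "text/html"}
      ]
    }
  ]
}
""")


def iter_links(doc: dict) -> Iterator[Tuple[str, str, str]]:
    """Yield (anchor, relation type, target href) for every link in the set."""
    for context in doc.get("linkset", []):
        anchor = context.get("anchor", "")
        for rel, targets in context.items():
            if rel == "anchor":
                continue  # "anchor" is the link context, not a relation type
            for target in targets:
                yield anchor, rel, target["href"]


def entry_points(doc: dict) -> list:
    """Return the distinct anchors, i.e. the advertised machine entry points."""
    seen = []
    for anchor, _rel, _href in iter_links(doc):
        if anchor not in seen:
            seen.append(anchor)
    return seen
```

With the made-up catalog, `entry_points(EXAMPLE_CATALOG)` lists the two anchor URLs, and `iter_links` pairs each with the documentation link advertised for it.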
