The 20th International Conference on
Open Repositories
Chicago, Illinois, USA | June 15-18, 2025
Conference Agenda
Overview and details of the sessions of this conference.
|
Session Overview |
Date: Sunday, 15/June/2025 | |
09:00 - 16:00 | Registration Location: Rogers Lobby |
09:30 - 12:30 | Discover and Create Rich Metadata with the DataCite REST API Location: N110- Orchestra Room |
|
Discover and Create Rich Metadata with the DataCite REST API DataCite Persistent identifier metadata is a key resource for research discovery and machine actionability. When organizations register Digital Object Identifiers (DOIs) for research outputs and resources, the accompanying metadata is harvested to power search and discovery platforms and contributes to preservation of the scholarly record. This workshop will guide participants on using the DataCite REST API to work with DataCite DOI metadata. Participants will learn how to: 1) Retrieve DOI metadata: Participants will learn how to use the DataCite REST API to search DOIs in the DataCite Metadata Store, including how to construct queries, apply filters, and retrieve a complete list of results. 2) Create DOIs with rich metadata: Participants will try out DOI registration in the DataCite test system and learn how to update DOI metadata, taking full advantage of the DataCite Metadata Schema’s capabilities. |
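The retrieval step described above (construct a query, apply a filter, and page through a complete list of results) can be sketched roughly as follows. This is a minimal illustration and not workshop material. It assumes the public `https://api.datacite.org/dois` endpoint with its `query`, `client-id`, `page[size]`, and `page[cursor]` parameters, and a JSON:API-style response carrying `data` and `links.next`. The helper names `build_search_url` and `iter_dois` are hypothetical, and the HTTP call is left to a caller-supplied function so the sketch runs without network access.

```python
from urllib.parse import urlencode

# Assumed public endpoint of the DataCite REST API.
DATACITE_API = "https://api.datacite.org/dois"


def build_search_url(query, client_id=None, page_size=100):
    """Build the first-page URL for a cursor-paginated DOI search (hypothetical helper)."""
    params = {"query": query, "page[size]": page_size, "page[cursor]": 1}
    if client_id:
        # Optional filter: restrict results to one repository account.
        params["client-id"] = client_id
    return f"{DATACITE_API}?{urlencode(params)}"


def iter_dois(first_url, fetch_json):
    """Yield DOI records page by page, following `links.next` until it is absent.

    `fetch_json` is any callable that maps a URL to the decoded JSON body,
    for example a thin wrapper around an HTTP client of your choice.
    """
    url = first_url
    while url:
        body = fetch_json(url)
        yield from body.get("data", [])
        url = body.get("links", {}).get("next")
```

Passing the fetch function in keeps the pagination logic separate from the HTTP client, so the same loop can be exercised against canned responses before it is pointed at the live or test API.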
09:30 - 12:30 | DSpace Developer Meet-Up and Q&A Location: N112- Band Room |
|
DSpace Developer Meet-Up and Q&A 1Lyrasis, United States of America; 2The Library Code, Germany |
09:30 - 17:00 | DSpace SEO and Statistics Master Class Location: C116- Community Gathering Room |
|
DSpace SEO and Statistics Master Class Atmire, Belgium Whether you’ve just launched your DSpace repository or you’re managing millions of items accumulated over two decades, the challenges of optimizing for search engine crawlers and leveraging repository statistics to meet stakeholder needs are ever-evolving. In an era where next-generation AI-based crawlers demand more data than ever, repository managers need to stay ahead of the curve to keep their content discoverable and relevant. In this engaging master class, DSpace committer and Atmire co-founder Bram Luyten will guide participants through proven strategies for search engine optimization and practical statistics configuration in DSpace. By exploring real-world examples and tackling common pain points, attendees will gain actionable insights to enhance their repositories—no matter the size or age. By the end of this session, you’ll be equipped with concrete tactics to boost discoverability, respond more effectively to stakeholder queries, and future-proof your repository in the evolving landscape of scholarly communication. |
09:30 - 17:00 | InvenioRDM Workshop Location: C119&121- Classrooms |
|
InvenioRDM Workshop 1Northwestern University, United States of America; 2Cottage Labs, United Kingdom; 3Ubiquity Press, United Kingdom; 4TU Wien, Austria; 5New York University, United States of America; 6CNUDST, Tunisia; 7European Organization for Nuclear Research (CERN), Switzerland The Invenio framework, developed by CERN, has existed in various forms supporting scientific resources and research outputs for over twenty years. The current form, InvenioRDM, is a next-generation, modular, turnkey repository and RDM platform supporting FAIR practices, research transparency, and discoverability. Both the back end of Zenodo.org and free, community-supported open-source software, InvenioRDM meets user needs by employing the robust DataCite metadata standard, powerful search features, ORCID- and ROR-enhanced creator and contributor fields, and much more. Partners in this strong international development effort include developers, librarians, administrators, and other professionals in Europe, Africa, Asia, and North America. This InvenioRDM workshop will introduce repository managers, developers, system/software administrators, decision makers, and librarians both to the open-source repository framework and to the inspirational and inclusive international community that supports it. This full-day workshop will feature presentations from project partners on implementations, innovations, and customizations. It will also include practical and motivational community-focused sessions filled with real-world expertise and tips about both how to sustain an InvenioRDM instance and how to make collaborative contributions towards the community’s goals. Throughout the workshop, participants will gain in-depth insights into the software and how participants at any technical or institutional level can become contributing members of this dynamic community. |
12:30 - 14:00 | Lunch Break |
14:00 - 15:30 | Data Repository Descriptive Metadata: Recommendations & Resources Location: N112- Band Room |
|
Data Repository Descriptive Metadata: Recommendations & Resources World Data System, Canada In this 90 minute workshop, we emphasize the importance of metadata used to describe data repositories, with a focus on the re3data Registry of Research Data Repositories, and recommendations from the RDA Common Descriptive Attributes of Research Data Repositories. While some of these attributes are straightforward, others are more complicated to represent; thus, reasons they are more difficult will be summarized and discussion time is allocated. We will also cover use cases and applications for these metadata. An interactive component will allow participants to try re3data features and introduce them to the form for creating and maintaining records. The target audience is data repository representatives who are responsible for maintaining re3data metadata records. A secondary audience would be users of data repository metadata, who may wish to perform meta-analysis about data repositories or identify fit-for-purpose repositories. The workshop will be delivered by World Data System (WDS) staff. The WDS, an affiliate member of the International Science Council, serves a membership of trusted data repositories and related organizations. The WDS mission is to enhance the capabilities, impact, and sustainability of our member data repositories and data services. |
14:00 - 17:00 | FAIR Metadata Bright Spots: Guides on the Road to Future Possibilities Location: N110- Orchestra Room |
|
FAIR Metadata Bright Spots: Guides on the Road to Future Possibilities Metadata Game Changers, United States of America Since the emergence of the World Wide Web during the late 1900s, many repositories have focused documentation efforts on data discovery to such an extent that the concepts “metadata” and “data discovery” have become inextricably intertwined in many repository practices. The FAIR Principles were proposed nearly a decade ago and have been applied in many contexts. For repositories, these principles broaden the focus from Findability to include Access, Interoperability, and Re-use. The DataCite metadata schema has over fifty metadata elements that support all four of these use cases. Repository support for these use cases in DataCite can be measured and expressed as metadata completeness. We have identified repositories that 1) are doing well in supporting these use cases and 2) have improved support during the last year. These repositories, termed bright spots, provide good examples for others, and understanding their practices can help raise the bar for the new FAIR use cases. |
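One simple way to express metadata completeness as a number is the share of records in which a given element is populated. The sketch below is an illustration under stated assumptions and not the presenters' method. The function name `completeness` is hypothetical, records are assumed to be plain dictionaries keyed by DataCite JSON attribute names such as `titles` or `rightsList`, and an element counts as present when its value is non-empty.

```python
def completeness(records, elements):
    """Return, per element, the fraction of records where it is non-empty.

    `records` is a list of dicts (one per DOI); `elements` lists the
    attribute names to check. An empty list or missing key counts as absent.
    """
    total = len(records)
    if total == 0:
        return {name: 0.0 for name in elements}
    return {
        name: sum(1 for record in records if record.get(name)) / total
        for name in elements
    }
```

Scores computed this way can be compared across repositories or across years, which is the kind of comparison the abstract describes when it mentions repositories that have improved their support over the last year.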
Date: Monday, 16/June/2025 | |
08:00 - 17:00 | Registration Location: Rogers Lobby |
09:00 - 09:40 | Opening Plenary Location: Griffin Auditorium 09:00-09:05- Introduction and housekeeping 09:05-09:15- Welcome to Chicago 09:15-09:30- 09:30-09:40-Steering Committee welcome, introductions & acknowledgements |
09:40 - 10:30 | Keynote speaker Heather Joseph Location: Griffin Auditorium As the Executive Director of SPARC, Heather Joseph is an internationally renowned and well-respected expert in open research policies, practices, and implementation strategies. Under her stewardship, SPARC has become the leading advocacy organization that promotes innovative, open, and equitable global systems of research and education. |
10:30 - 11:00 | Coffee Break Location: Rogers Lobby |
11:00am - 12:30pm | Presentations- Repository sustainability and future-proofing Location: Griffin Auditorium |
|
Beyond the Buzzwords: agile collaboration and rejecting a perfectionist mindset (how we did it, and you can too!) Princeton University, United States of America This presentation details Princeton University Library's shift in software design and cross-departmental collaboration to enhance the long-term stewardship of research data and scholarly publications. The transition involved moving away from legacy, monolithic systems toward a more flexible and scalable approach. Rather than merely replacing old software with newer systems, the team took a step back to evaluate the underlying organizational needs, user requirements, and the evolving role of software, as well as how to build strong relationships and communication patterns within teams and with stakeholders. This process required continuous work to strengthen ongoing collaborations, manage expectations, and foster mutual understanding across diverse professional backgrounds and departments across the institution. With an emphasis on learning from the past, supporting innovation, and building a sustainable, user-centric infrastructure for the future, we will share the successes and lessons learned from the iterative, collaborative approach we took. We will also share insights for other organizations that may be facing similar challenges. Democratization of Knowledge in the Open Science Era: The Role of Free Software and the Moara Network in Scientific and Technological Innovation Brazilian Institute of Information in Science and Technology, Brazil This study explores the evolution and impact of Open Science, driven by the Open Access movement and advancements in information technologies, particularly Free Software, in the contemporary scientific landscape. It emphasizes the importance of these technologies in democratizing access to knowledge, fostering collaboration, and ensuring scientific transparency. 
Employing a qualitative approach, the study examines how the dominance of foreign or restricted-access platforms challenges the universality and accessibility of Open Science, underscoring the need for inclusive strategies to address these barriers. The analysis highlights the role of virtual scientific communities, powered by digital technologies, as essential spaces for innovation and knowledge dissemination. Additionally, the work explores the contributions of Free Software and open-source code to the replicability and verifiability of scientific results—key factors in building a robust and reliable body of knowledge. Finally, the Moara Network is presented as an innovative Brazilian example that promotes scientific collaboration and collective engagement to democratize science, illustrating a transformative shift towards a more integrated and collaborative scientific community. After more than 20 years of eScholarship...where to now California Digital Library, United States of America eScholarship.org, the institutional repository and open access publishing platform for the 10-campus University of California system, has been running for almost 23 years on mostly home-grown solutions which are now reaching their end of life. As we look to the future, we are considering existing community solutions, but finding some difficulties because of our long and organically grown history. We will explore how we remained sustainable as a service and a system as long as we did, and how we are hoping to put ourselves on a footing to grow sustainably for the next 20 plus years. Repositories in the US Federal Funding Workflow: Lessons from the “Reasonable Costs for Public Access” Project Invest in Open Infrastructure, United States of America Repositories are a key component of the compliance ecosystem for federally funded research in the United States. 
Invest in Open Infrastructure conducted an ethnographic-style study on repository and institutional responses to public access policy changes initiated by the 2022 US Office of Science and Technology Policy Memorandum on Ensuring Free, Immediate and Equitable Access to Federally Funded Research. Through interviews, workflow diagramming, desk research, and surveys, we outline the landscape of cost and price associated with providing the infrastructure to implement the public access policies. We identify labor as a major factor impacting repository cost, and note the challenges in pricing models that do not scale to enable the growth of labor needed to address these policies. From our workflow project, intervention points are identified and shared as opportunities to financially support repositories and reduce last-minute labor burdens on staff. Harnessing Sustainable Technologies for Digital Preservation using the concept of Green Repositories 1Federal University Lokoja, Nigeria; 2University of Zululand, South Africa As digital repositories grow in scale and significance, ensuring their sustainability has become a critical challenge. Traditional methods of digital preservation often rely on energy-intensive infrastructure, contributing to environmental concerns. This paper explores the concept of green repositories, which leverage sustainable technologies to reduce environmental impact while ensuring the long-term accessibility and integrity of digital assets. The work examines the current limitations of repository systems, including their carbon footprint, resource consumption, and dependency on non-renewable energy sources. Drawing from advancements in green computing, energy-efficient data storage, and renewable energy integration, the paper proposes a framework for implementing environmentally friendly practices in digital preservation workflows. 
Key strategies include adopting energy-efficient hardware, utilizing cloud-based solutions powered by renewable energy, and integrating AI-driven tools to optimize storage and retrieval processes. The study also highlights real-world examples of repositories embracing sustainability and presents a roadmap for institutions aiming to transition toward greener operations. Ultimately, the concept of green repositories offers a dual solution: addressing environmental sustainability while safeguarding the longevity of cultural and academic resources. This approach not only supports the preservation mission of repositories but also aligns with broader global goals for environmental stewardship. |
11:00am - 12:30pm | Lightning (24x7) Presentations - Repository showcase Location: N110- Orchestra Room |
|
Tracing the Footprints of Academic Research in Zambia through the Institutional Repository: A Case of the University of Zambia Zachary Zulu, The University of Zambia, Zambia The University of Zambia established an institutional repository (IR), built on DSpace, to archive the university’s intellectual output and make it available to the research community. This presentation examines the repository's role in tracing the footprint of academic research in Zambia. It explores challenges such as repository management, underutilization, limited content diversity, and digital preservation issues that hinder the repository's effectiveness. Using a mixed-methods approach, the study analyzes the repository's current state, user engagement, and content accessibility. Findings reveal significant gaps in repository management and suggest strategies to enhance its functionality, including increased advocacy for open access, capacity-building for stakeholders, and policy enhancements. This work underscores the importance of institutional repositories in documenting national research outputs and their potential to drive academic visibility and collaboration in Zambia and beyond. Realizing UNSW’s Vision for a Next-Generation Repository UNSW Sydney, Australia UNSWorks, UNSW’s institutional repository, is advancing toward its vision of providing next-generation infrastructure to enhance the global visibility and impact of UNSW research. Guided by FAIR principles and aligned with UNSW’s strategic goals, UNSWorks integrates with UNSW HR, Grant, and CRIS systems to acquire authoritative metadata and persistent identifiers, including DOIs, ORCIDs, and grant IDs. The repository supports linking datasets with publications, automates DOI creation, and enhances content discoverability via global aggregators and search engines. 
This strategy contributes to a diverse approach to open science and open access, fostering inclusivity and collaboration in global research dissemination. Future plans include upgrading to DSpace 7.6.2 to enable item-level versioning and “COAR Notify” functionality, expanding preservation to achieve CoreTrustSeal certification, and broadening its PID Strategy to include RORs and external researcher ORCIDs. UNSWorks aspires to develop an Indigenous research collection aligned with UNSW’s Indigenous Strategy, incorporate SDG classifications to support societal impact, and introduce descriptions of research instruments. Further plans include harvesting UNSW OA research outputs from other platforms, to ensure comprehensive oversight and monitoring of research and impact activity. Inspired by COAR’s vision, UNSWorks sustains innovation, promotes inclusivity, and enhances accessibility, supporting research excellence, equitable access, and the future of open repositories. HAL: Strengthening Connections Between Publications, Data, and Software in the French National Open Science Ecosystem CCSD / CNRS, France HAL, the multidisciplinary French national open archive, hosts over 1.45 million academic documents, including articles, preprints, conference papers, and more. As a pillar of the national open science policy, HAL plays a critical role in promoting accessibility and visibility of research outputs. A key initiative of the second French plan for Open Science is to create an integrated ecosystem linking publications, research data, and software. This involves strengthening the connections between HAL, Recherche Data Gouv, and Software Heritage. Our work has focused on developing tools to manage the relationships among these diverse research objects, enhancing their visibility within HAL, and supporting their dissemination across other repositories via the COAR Notify protocol. 
This effort was undertaken as part of the HALiance project, funded by the French National Research Agency (ANR 21-ESRE-0047). It combines technical innovations with a commitment to fostering interoperability and discoverability within the global open science landscape. In this presentation, we will outline the development of this new service, detail the technical challenges addressed during its implementation, and share early user feedback following its launch in early 2025. George Eliot Scholars: (Middle)Marching Towards Open Access 1CoSector, University of London, United Kingdom; 2Auburn University, United States George Eliot is one of the most frequently studied authors in English literature but many of her works remain behind paywalls. This is despite the fact that there are open access pathways to much of this research. This talk describes George Eliot Scholars, a subject repository collating open access research on the subject of Eliot as part of the George Eliot Archive ecosystem. It describes five areas that need[ed] to be considered in the course of this project. These are: [1] amassing a collection of relevant scholarship [2] assessing the copyright and access agreements associated with the work once identified [3] working across continents with a frequently changing team of student volunteers [4] ensuring work is accessible for all potential stakeholders [5] publicising this resource to the communities that can benefit from it The Current Situation, Problems and Future Development of Institutional Repositories in China: Taking the Institutional Repository of the Chinese Academy of Sciences as an Example National Science Library, Chinese Academy of Sciences, China, People's Republic of Against the international backdrop of the continued development of institutional repositories around the world, China’s institutional repositories have also undergone significant development and changes in the past 20 years. 
The institutional repository of the Chinese Academy of Sciences is a case in point: supported by national policy, it has adopted a pilot-first, phased promotion approach and has continued to grow in scale, with rich resources and diverse functions. At the same time, problems have emerged in the construction of institutional repositories, especially in the process of upgrading to an institutional repository cloud (IR Cloud). For example, institutional repositories are built in a somewhat siloed way; some institutions see low user recognition, participation, and attention, which threatens sustainable development; and the data security of institutional repositories is drawing increasing attention from the scientific community. In view of these problems, we propose some thoughts on the future development of institutional repositories. Towards a new digital repository for Qatar National Library Qatar National Library, Qatar Qatar National Library (QNL) began its digital heritage efforts in 2012 with the Qatar Digital Library (QDL), which now hosts nearly 2.5 million pages. Further advancements include the Islandora-based QNL Digital Repository (2017) for local heritage collections and the Manara Repository (2022) for national research outputs. Since 2020, QNL has employed Archivematica for digital preservation. Currently, QNL is developing a unified repository system to replace outdated platforms, enhancing user experience, Arabic support, and AI-driven metadata enrichment. This presentation highlights ten years of QNL's repository operations, focusing on lessons learned and their impact on planning for its new digital repository project. |
11:00am - 12:30pm | Panel- Pushing the boundaries on citation tracking and usage reporting for open research outputs Location: N112- Band Room |
|
Pushing the boundaries on citation tracking and usage reporting for open research outputs DataCite, United Kingdom Reporting usage measures for research outputs, such as views, downloads, and citations, helps repositories showcase the value of their collections and provide recognition to researchers. Usage information is also crucial for institutions seeking to include open outputs in research evaluations. However, tracking citations consistently and at scale for outputs other than journal articles is challenging due to limitations in publisher workflows, the need for standardized frameworks to aggregate and normalize citations, and metadata gaps that limit the context information associated with a citation. This panel-led discussion will explore how repositories, institutions, and community initiatives are addressing these challenges. We will hear about repository efforts to capture and display citation information, insights on institutional needs for citation information in the context of policy implementation and research evaluation, and updates from the Data Citation Corpus project, which seeks to scale the data citations that are openly available to the community. The discussion will explore progress and lessons learnt from community efforts to identify, capture, and aggregate usage of open outputs through a diversity of methodologies, and share insights toward mechanisms to more broadly integrate usage events taking place beyond the repositories into repository platforms. |
11:00am - 12:30pm | Presentations- Data Repositories 1 Location: C116- Community Gathering Room |
|
The unification of the effort: the Swedish university RDM network Swedish National Data Service, Sweden The presentation offers an outline of the background of the Swedish National Data Service, a national collaborative network for research data management at Swedish universities, aimed at giving researchers and students direct access to needed help when it comes to finding and re-using research data, as well as structuring, organising and sharing data they have worked with themselves. The presentation will feature the history and structure of the network, the shared effort involved in the collaboration, steps taken to secure a willingness to participate, and the structure of the local network access points at the different participating universities. The presentation will also introduce the next step in this process, the launch of the new portal Researchdata.se (scheduled for March 2025). “Research Data Management repositories with special references to social science” 1Documentation Research and Training Centre Indian Statistical Institute, India; 2Documentation Research and Training Centre Indian Statistical Institute, India Effective management of research data is crucial, particularly in social sciences, where diverse and accessible data can drive meaningful progress. However, universities in Karnataka face several challenges, such as limited access to data, difficulty in collaborating, and a lack of integration across existing systems, which hinders researchers' work. In this study, we explore the current state of research data repositories at universities in Karnataka and identify the key obstacles researchers face. To address these issues, we propose developing a unified research data repository using DSpace. This centralized platform will gather datasets from universities across the region, offering researchers a simple, streamlined way to access a wide variety of data. 
The repository will be designed to be user-friendly, with intuitive tools for managing data, easy integration with other systems, and secure, role-based access. By encouraging data sharing and fostering collaboration, this platform will help spark innovation and make research more impactful. Our vision is to fill the existing gaps and set a new standard for research data management in Karnataka. We hope our approach will inspire similar initiatives in other regions, helping to create a future where research data is shared, accessible, and beneficial to all. Bridging the Silos of Institutional Data Repositories: Community Collaboration and Cross-Institutional Development 1University of Minnesota, United States of America; 2University at Buffalo Institutional repositories (IRs) play a crucial role in preserving and sharing scholarly outputs, yet they often operate in isolation, limiting their potential to support a truly integrated open data ecosystem. Through the Repository Readiness Project, led by the Data Curation Network (DCN), data stewards and repository specialists explored strategies for developing more interconnected and effective research data services. The project culminated in the Summit for Academic Institutional Readiness in Data Sharing (STAIRS), which brought together over 100 institutional stakeholders including librarians, IT professionals, and administrators from Offices of Research from US academic institutions to address research data management challenges. Key findings reveal the critical need for cross-institutional collaboration, shared standards, and a more holistic approach to data sharing infrastructure. This research highlights that effective data management cannot be achieved by libraries or individual institutions in isolation but requires strategic engagement and an active community of practice within and across institutions throughout the research lifecycle. 
In this presentation, we will provide background information on the project as well as recommendations for funding agencies, academic institutions, and research collaborators, focusing on three primary strategies: fostering cross-institutional collaboration, establishing centralized resource banks, and strengthening institutional data services. Advancing Indigenous Data Sovereignty through Dataverse and Local Contexts Integration 1Harvard University, United States of America; 2localcontexts.org In 2022, the annual Dataverse Community meeting theme was “Indigenous Data Sovereignty.” Following this meeting, Dataverse used an opportunity with the NIH-funded GREI repositories initiative and conversations with Local Contexts (LC) to start collaborating to advance Indigenous data sovereignty in the Dataverse-supported repository. The proof-of-concept integration allows data depositors to link to a project or item(s) registered in the Local Contexts Hub using a Local Contexts Project metadata block built into Dataverse and released in 2025. TK Labels and Notices are registered and maintained by Indigenous authorities in the LC Hub site and appear on the Dataverse Dataset metadata and landing pages. This presentation will provide a technical overview of the LC Hub - Dataverse Integration Project and highlight potential community use cases and planned improvements. |
11:00am - 12:30pm | Developer Track Session 1 Location: C119&121- Classrooms |
|
A year of Hybrid ML/AI cataloging aid in Archipelago Commons: The state, the lessons and probable future(s) explained through a real production implementation Metropolitan New York Library Council, United States of America During OR2024 we showcased our first release of a public ML/AI Image Similarity and semantic search prototype integrated into Archipelago Commons (1.4.0). A primer in OSS Repository systems. We also gave an introduction to the challenges, concerns and hows of our development, including a not so brief 101 on ML inference. A year of research later, we want to share the evolved state of the solution and what will be available for 1.5.0, all this by using a specific community prototype as a showcase: Revs Institute. We will explain through this project how we support field-specific human knowledge and responsible decision making to enrich unknowns (media/metadata) through transfer learning from well-cataloged and known Digital Objects without giving up control to unattended inference. We will also demonstrate how datasets for re-training and refining of ML Models are produced and reinjected through the act of assisted cataloging, allowing a cyclic, self-feeding ML system. Finally, and most importantly, we will share a closing and honest discussion of the expectations, limitations, and carbon footprint effects of ML in production: the scope of ML/AI from the perspective of our and our community’s ethical and system design goals. Deploying DataFed for Scientific Data Management: Lessons Learned 1College of Engineering, Drexel University; 2College of Computing & Informatics, Drexel University; 3Metadata Research Center, Drexel University This Developer Presentation shares our experience using DataFed, a novel scientific data management system, in research projects at Drexel University. We will address three main questions: What is DataFed, and why might Open Repositories participants find it useful? 
What challenges have we faced in deploying DataFed endpoints, and how did we overcome those? What should researchers know about the technical requirements and process of implementing DataFed in the university setting? DataFed, developed on an open-source basis at Oak Ridge National Laboratory (ORNL), is a distributed system for managing research data and metadata. It simplifies handling large datasets in alignment with the FAIR principles, providing researchers with scalable storage, a customizable metadata infrastructure, and optional federated sharing capabilities. Deploying a DataFed repository presents certain challenges: for example, it requires a server with a public static IP address and open ports for ingress, which institutional network administrators may be reluctant to allow. Additionally, DataFed depends on Globus Connect Server for secure data transfer, introducing further complexities. Despite such obstacles, we have found that the benefits of DataFed more than justify the effort. We are also collaborating with ORNL to streamline the deployment process for future users. Renovation and enhancement of statistics pages in DSpace 7 1University of Oklahoma, United States of America; 2University of Oregon, United States of America This proposal outlines the renovation and enhancement of statistics pages in DSpace 7 to improve their functionality, usability, and impact. Current limitations, such as a text-based design that lacks the information users are interested in, hinder their effectiveness. The project aims to introduce modernized interfaces, advanced data visualization, real-time analytics, and enhanced filtering options. By addressing these issues, the renovated statistics pages will empower administrators, researchers, and contributors with actionable insights, improve user engagement, and align with institutional goals. The proposed enhancements will make DSpace a more statistically efficient and user-friendly digital repository platform. 
Putting your middleware on steroids with DSpace 7+ REST API Atmire, Belgium For many institutions, highly customized DSpace code has long been necessary to handle complex integration requirements—a practice that often results in technical debt and impedes upgrades to DSpace installations. Since DSpace 7, however, the decoupling of the front end (DSpace Angular) and back end (DSpace REST) has fundamentally changed the way integrations can be managed. By “consuming its own dog food,” the DSpace community now channels all DSpace UI operations through the comprehensive and well-documented DSpace 7 REST API. In this proposal, we present a real-world implementation that demonstrates how data from a third-party business process can be seamlessly ingested into DSpace based on robust business rules—without any modifications to the core DSpace code. Instead, we leverage simple Bash scripts that interface with the DSpace 7 REST API, showcasing a sustainable approach to complex integrations that avoids the pitfalls of traditional code customization. We will discuss the design, implementation, and impact of this solution, highlighting best practices and lessons learned to help others optimize their own DSpace integrations. Jetstream2 and Cloud-Based Dev Tools for Data Curation Training University of California, Santa Barbara Research data curators draw on a wide range of software tools to manage and preserve repository submissions. There is no singular software “stack” for data curation. However, for the purposes of facilitating workshops on data curation practices, a shared environment with commonly used tools and resources is needed. As part of an ongoing project with the Data Curation Network (DCN), the author evaluated the feasibility of using cloud-based developer environments for conducting workshops on specialized data curation topics. 
This presentation provides an overview and demonstration of an exploratory infrastructure for data curation workshops running on Jetstream2, a computing resource available to US-based researchers and educators through the National Science Foundation (NSF). Coder (an open-source, self-hosted web service) is used to provision cloud-based environments for workshop participants. The infrastructure is defined using Terraform to facilitate configuration and deployment for each workshop; the code is available on GitHub (https://github.com/srerickson/js2-coder). |
12:30 - 13:30 | Lunch Break Location: Rogers Lobby |
13:30 - 15:00 | Presentations- Standards, Accessibility and Digital Preservation Location: Griffin Auditorium |
|
Building a Digital Preservation Service Model for Canadian Institutional Repositories Through Community Engagement 1Western University, Canada; 2Scholars Portal Preservation of repository content is a vital and complex responsibility. Institutional repositories (IRs) house unique content created by academic communities and hold a significant record of scholarship over time. Preserving this scholarship is crucial to ensure access for future generations. Preserving IR content involves many challenges, including developing policies and procedures, building technical workflows, considering the needs of diverse file formats and disciplinary practices, and managing costs. This presentation will look at digital preservation in IRs through one solution to those shared challenges. Scholaris is a new Canadian national, opt-in shared repository service built on the DSpace platform. Presenters are members of the Scholaris Digital Preservation Expert Group, which provides recommendations and advice on digital preservation planning requirements and pathways for the Scholaris service development team. We will share results from a nation-wide environmental scan and survey of repository managers to gather insights on current practices, capacity, and needs related to digital preservation, as well as inform the workflows and service model we are developing in response to these needs. We hope to leave audience members with tools you can use in your own preservation work, along with a sense of optimism and a success story of national collaboration. IIIF at one end, OCFL at the other, Fedora in the middle. 1University of Leeds, United Kingdom; 2Digirati, United Kingdom The emergence of two standards – IIIF and OCFL (International Image Interoperability Framework and Oxford Common File Layout) – means that a complete digital preservation and delivery infrastructure can take a standards-based and open-source approach all the way through. 
We will show how the University of Leeds and Digirati have used existing software and newly developed components to deliver a system, as part of the University of Leeds’ Digital Library Infrastructure Project (DLIP), that conforms to digital preservation best practices and is open to ad hoc application development and re-use. One key aspect of this is the use of Fedora as a gateway to OCFL and the ability for components other than Fedora to make use of this standards-based structure. The presentation will describe the project, problems it aims to address, the approach adopted and ask questions about the role of METS in between the two outer standards-based boundaries of IIIF and OCFL. Embedding Accessibility into ETD Workflows: A Case Study California State University San Marcos, United States of America Are you curious about formatting documents for accessibility? Interested in making your ETDs available to people using assistive technology? Embedding Accessibility into ETD Workflows: A Case Study analyzes how one campus took a flailing accessibility formatting program and turned it into a successful, scalable, and less-stressful process. We flipped the process completely: from having the students handle the formatting to taking it in-house and having library workers do the work. We will share our pinch-points and our successes with attendees. This case study examines the evolution of the ETD/accessibility workflow from how it started to the one that is currently in use and successful over more than a decade of constant iterative process improvement. The presenter aims to give others information that could help them to advocate for embedding accessibility formatting into their own workflows. With this structure in place, we have incorporated accessibility formatting into other workflows, including faculty publications and digital archives materials. 
Accessibility formatting takes time, can be complex, and still requires human intervention at this time, but it is important. ETDs are official campus documents. Making them available for people using assistive technology is the equitable - and right - thing to do. Practice research as a lens to enable a future with a FAIRer, more equitable scholarly research landscape 1University of Westminster, United Kingdom; 2Jisc, United Kingdom; 3CoSector, University of London, United Kingdom; 4University of Leeds, United Kingdom At OR2023 the Practice Research Voices and Sustaining Practice Assets for Research, Knowledge, Learning and Education project findings (10.5281/zenodo.8091553) highlighted the limitations of the current repositories landscape and open standards to enable Findable, Accessible, Interoperable and Reusable (FAIR) practice research. The teams have joined together, developed a metadata standard for practice research, engaged with discipline and open standards communities, published findings, and are working towards setting up a Research Data Alliance (RDA) Interest Group. Global engagement has demonstrated the need to articulate what practice research is and the benefits of using this lens to go beyond the assumption that ‘non-traditional research outputs’ are defined by their format and merely supplementary. This presentation will discuss opportunities for open and research data repositories to: (1) develop community-owned open infrastructure working in co-design with discipline communities; (2) make the research process visible as it is developed; (3) implement accessibility, user interface and user experience best practice; (4) capture a more inclusive range of contributors; (5) enable re-use that respects rights owners and provenance; and (6) act as a space to adopt and inform changes to open standards and influence a more nuanced, aligned strategy and policy landscape and initiatives. |
13:30 - 15:00 | Repository Showdown 1 Location: N110- Orchestra Room |
|
Figshare Figshare, United States of America Archipelago Commons: blooms, new growth and healthy trees from the community garden. Metropolitan New York Library Council, United States of America Repository Showdown: Dataverse Harvard University, United States of America Introducing Fedora - The flexible, modular, open-source repository platform for long-term digital preservation Fedora Repository Showdown: DSpace 1The Ohio State University Libraries, United States of America; 2The Library Code, Berlin, Germany; 34Science, Rome, Italy; 4Atmire, Leuven, Belgium; 5The University of Edinburgh, Edinburgh, Scotland InvenioRDM: Twenty Years of Supporting Research with FAIR and Transparent Practices CERN (European Organization for Nuclear Research), Switzerland |
13:30 - 15:00 | Presentations- Metadata and Harvesting Location: N112- Band Room |
|
Content-update Signaling and Alerting Protocol (CUSAP) 1Solutions Spectrum, LLC; 2Atypon The Content-update Signaling and Alerting Protocol (CUSAP) is a project of the International Association of STM Publishers (STM). CUSAP aims to develop a protocol and common service to actively signal and alert repositories and other stakeholders of updates or amendments to published scholarly content – for example errata, name changes, corrections, retractions, newer versions, or expressions of concern. Such a service is not currently available. The context is that the growth of Open Access is expected to further amplify the proliferation of copies of publications across (potentially many) different platforms – calling for additional measures to steward the use of the correct scholarly literature -- i.e., the Version of Record (VoR) -- and to prevent the use of out of date, amended, or retracted content. The project conducted outreach to potential users, and has developed a metadata package and specifications for an initial demonstrator system. SDG-Classify: Automating the classification of research outputs into UN SDGs CORE, The Open University, United Kingdom This paper presents SDG-Classify, a novel AI model for multi-label classification of research papers based on UN Sustainable Development Goals (SDGs), along with its integration into the CORE Dashboard, helping Higher Education Institutions (HEIs) to better understand how the content held in their repository contributes to SDGs. Using a few-shot, two-stage contrastive learning approach, the method generates contextual embeddings from publication metadata, including titles and abstracts, to train a classification head, leveraging an out-of-domain (OOD) multi-label SDG dataset from news articles. While the two-stage fine-tuned model performs effectively in OOD settings, incorporating additional context through label descriptions significantly enhances the model’s performance in the in-domain evaluations. 
Additionally, integrating SDG-Classify into the CORE Dashboard streamlines the monitoring of SDG contributions for HEIs and supports research managers in targeted resource allocation and impact-driven decision-making. Identifying and extracting Data Access Statements from full-text academic articles Open University, United Kingdom A Data Access Statement (DAS) is a formal declaration detailing how and where the underlying research data associated with a publication can be accessed. It promotes transparency, reproducibility, and compliance with funder and publisher data-sharing requirements. Funders such as Plan S, the European Union, UKRI, and NIH emphasise the inclusion of DAS in publications, underscoring its growing importance. While a DAS enhances research by increasing transparency, discoverability, and data quality while clarifying access protocols and elevating datasets as first-class research outputs, the repository community faces challenges in managing and curating DAS as a standard metadata component. Manual DAS curation remains labour-intensive and time-consuming, hindering efficient data-sharing practices. CORE has co-designed with the repository community a module that uses machine learning to identify and extract DAS from full-text articles. This tool facilitates the automated encoding, curation, and validation of DAS within metadata, reducing manual workload and improving metadata quality. This integration aligns with CORE's objective to enhance repository services by providing enriched metadata and supporting compliance with funder requirements. By streamlining DAS management and expanding metadata frameworks, CORE contributes to a more accessible and interconnected scholarly ecosystem, fostering data discoverability and reuse. 
Laying the Groundwork for the Future: Creating Tools to Better Harness Metadata and Data Packages National Transportation Library, United States of America In this repository showdown, I plan to demonstrate various tools I have created to enhance our metadata and lay the groundwork for our repository’s future capabilities. The presentation will cover the following tools: DOI Parser Version 2.0 for DataCite, DCAT-US Version 1.1 Generator, and CSV to Markdown Bulk README Template Converter. The DataCite DOI parser has directly led to more accurate and complete DOI metadata for all our repository’s DOIs, utilizing the power of persistent identifiers to create robust linked data. The DCAT-US generator has given researchers the tools they need to make these required metadata files for their data packages. Lastly, the README generator allows researchers, librarians, and catalogers to easily create documentation for their datasets. These tools have significantly improved our metadata for our DOIs and DCAT-US files, and they have also greatly increased the speed and accuracy of creating and managing our DOIs, metadata files, and README files. These tools are open to the public to be interpreted and repurposed for other repositories and users. These tools demonstrate that problems and improvements can be tackled one issue at a time, even if repository integration of these tools is deep into the future. |
13:30 - 15:00 | Presentations- Citations, Tracking and Impact Location: C116- Community Gathering Room |
|
Citations growth for journal articles that are in open digital repositories 1Brazilian Institute of Information in Science and Technology (IBICT), Brazil; 2Federal Center for Technological Education of Minas Gerais (CEFET-MG), Brazil The study, carried out as part of the BrCris, Laguna and Oasisbr projects and portal, investigated the impact of digital repositories on the average number of citations of Brazilian scientific articles. The Oasisbr portal aggregates approximately 5 million digital objects and collects metadata on 421,243 journal articles deposited in Brazilian repositories. In parallel, the OpenAlex database provided 1,901,216 records of articles exclusively linked to Brazilian institutions. The comparative analysis between articles in scientific journals and repositories (Class 1) and those only in journals (Class 2) revealed that, on average, articles deposited in repositories receive 83.10% more citations. This growth varies according to the year of publication, percentile of the journals' h-index and field of knowledge, with Social Sciences standing out (+206.02%) and a drop in fields such as Engineering and Computer Science (-8.21%). The discussion suggests that the greater visibility promoted by repositories, especially in open-sharing practices, is a determining factor in the increase in citations. Repositories facilitate the dissemination of articles on platforms such as Google Scholar, increasing the likelihood of citation, even in lower-impact journals. We conclude that repositories are strategic tools for fostering the impact and reuse of scientific knowledge. New Applications for measuring Data Impact in a Domain Science Open Repository University of Texas at Austin, United States of America; Texas Advanced Computing Center Data usage metrics are essential to inform repository administrators, funding agencies, and data authors and users about research impact. 
Like many open repositories, the DesignSafe Data Depot Repository (DDR), a natural hazards data repository, presents Make Data Count (MDC) compliant metrics for each published dataset. Over the last two years the DDR team has developed applications of MDC statistics for benchmarking levels of data usage and for examining usage patterns across time. These provide more revealing insights than a typical count of views or downloads per individual dataset. The applications leverage DesignSafe's extensive publication metadata, enabling a granular view of what data types are used, when, how much, and for how long. To date, the results have provided a picture of continued usage and stability in usage patterns for the majority of the repository's datasets. Other applications such as correlating MDC metrics with citation counts are in the works. The applications highlight the value of usage metrics which, in the form of longitudinal studies, can inform the implementation of long-term strategies to increase data usage, develop targeted data services, and to become more sustainable by demonstrating a repository's key role in the open research landscape. Enhancing Repository Integration with Crossref Services Crossref In the 20 years since Open Repositories began, the importance of metadata in enabling equitable access to research outputs has only grown. Repositories play a crucial role in making scholarly content discoverable, interoperable, and reusable. This presentation will explore how repositories can enhance their metadata workflows by integrating with Crossref’s services. We will provide practical examples of how repositories can utilize Crossref’s metadata to enrich their records, link content persistently, and comply with FAIR data principles. By discussing best practices, we will demonstrate how this integration fosters interoperability and discoverability, supporting the Open Repositories community in the preservation of digital content. 
Interoperability between Digital Repositories and OpenAlex: Challenges and Strategies 1Brazilian Institute of Information in Science and Technology (IBICT), Brazil; 2Federal Center for Technological Education of Minas Gerais (CEFET-MG), Brazil This study evaluates the representativeness of 199,446 Brazilian publications simultaneously present in the repositories indexed by Oasisbr and in OpenAlex, considering the period from 2000 to 2024. A steady increase in the number of publications was observed over the years, except for small drops in 2014 and 2015 and again from 2022 onwards, attributed to the interval between publication, deposit and indexing. Around 7% of publications (13,976 articles) do not have a DOI, highlighting the need for strategies to generate persistent identifiers, especially in areas such as Medicine, Agronomy and Education. The integration of these databases and the adoption of persistent identifiers are fundamental to enriching local repositories, increasing the visibility of Brazilian science and strengthening interoperability and scientific communication in line with the principles of Open Science. |
13:30 - 15:00 | Panel- Community-Driven Global Governance of the OpenAIRE Interoperability Guidelines: advancing sustainability, openness, and modernization Location: C119&121- Classrooms |
|
Community-Driven Global Governance of the OpenAIRE Interoperability Guidelines: advancing sustainability, openness, and modernization 1University of Minho, Portugal; 2OpenAIRE AMKE This panel discussion focuses on the ongoing updates to the OpenAIRE Interoperability Guidelines, overseen by a global working group comprising OpenAIRE AMKE members and representatives from the international repository community. These updates aim to simplify, harmonize, and globally align metadata guidelines, ensuring interoperability across regional and national repository networks. The working group, established in 2023 and active since September 2024, is committed to maintaining the guidelines, evaluating metadata standards, and aligning them with community-specific and regional needs. The session will highlight significant improvements supporting metadata flexibility, enhanced links to research outputs, alignment with Open Science mandates, and the promotion of FAIRness and quality. Other advancements will be presented, including the transition from different versions of the OpenAIRE guidelines, updates to CRIS guidelines, DataCite schema 4.5 changes and the integration of COAR Controlled Vocabularies. The panel will feature flash talks by representatives from initiatives such as LA Referencia, LIBSENSE, CARL, OpenAIRE, and organizations like COAR, EuroCRIS, and DataCite. Interactive elements, including live Q&A and real-time polling, will engage participants and gather feedback for further refinement of the guidelines. Insights from this session will contribute to a paper planned for late 2025, summarizing the updates and feedback, further promoting global adoption and alignment of the OpenAIRE guidelines. |
15:00 - 15:30 | Coffee Break Location: Rogers Lobby |
15:30 - 16:30 | Minute Madness Location: Griffin Auditorium |
|
Growing Up with Open-Source: A Digital Library Story Indiana University, United States of America Indiana University started using open-source digital repository software as early as 2003 and has followed that open-source digital repository path continuously since then. This poster visualizes this progression and growth, showing not only the value of using open-source digital repositories, but also the value of participating in open-source community development work. InvenioRDM features powering the EU Open Research Repository 1CERN (European Organization for Nuclear Research), Switzerland; 2Northwestern University, United States The EU Open Research Repository launched in 2024, supporting the EU Open Science policy, and providing access to more than 100,000 research outputs from about 12,000 EU funded projects. This was made possible thanks to the many InvenioRDM features which either already existed or were developed as part of this project. This poster will present how we managed to showcase EU funded projects under one consistent visual identity thanks to InvenioRDM features like branded communities and subcommunities. We will also present how we improved discoverability of research outputs by using high quality metadata like the European Science Vocabulary (EuroSciVoc) and projects grant links from the Community Research and Development Information Service (CORDIS) to classify records by subject areas and provide browsing by collections. Finally we will also see how we empowered many projects to perform distributed curation by leveraging InvenioRDM features like “requests” and subcommunities. Looking ahead, we will show how the same features are being used to improve search and discovery of NIH-funded data in Zenodo. This poster will demonstrate how InvenioRDM modularity and flexibility enables its diverse user base to implement complex projects at scale. 
Prim and Proper?: Exploring Best Practices for Useful Title Creation on Non-Text Digital Items Texas Tech University, United States of America As the use of digital content management systems such as DSpace expands to include more items that do not have a creator-created Title proper, issues of professional standardization and user needs arise. Seeking to find the balance between traditional archival practices and digital usability, this poster covers our experience in identifying accessibility issues related to Title conventions for digital items created without a name, exploring others' published policies, developing local standards, and assessing the impact of these changes. Reimagining DSpace Analytics: A Blue-Sky Approach to Accessible, Author-Centered Metrics University of Oregon, United States of America DSpace has seen significant advancements with versions 7 and 8, particularly in feature development. While built-in analytics tools provide basic insights, repository managers often rely on various external solutions to demonstrate repository value and impact. This proposal presents a blue-sky framework for enhancing analytics in future versions, emphasizing accessibility compliance, granular usage metrics, and author-centered analytics. Drawing from recent development experience at the University of Oregon, we seek to initiate discussions about forming a DSpace Analytics Interest Group to foster community-driven development and define essential features. This collaborative approach aims to create comprehensive, integrated analytics solutions that meet the diverse needs of the global DSpace community while addressing challenges such as bot traffic mitigation and impact assessment. 
Reinstating Central Open Access Repository in Nepal: Renovation, Collaboration and Wider Participation 1Social Science Baha; 2Nepal Library and Information Consortium Nepal Library and Information Consortium (NeLIC) established a Central Open Access Repository in Nepal in 2012 with support from EIFL. After the COVID-19 pandemic, NeLIC faced a financial crisis and could not bear the cost of maintaining the repository and hosting it on the web. It went offline. Now, in collaboration with the Nepal Research and Education Network (NREN) and the National Institute of Informatics, Japan, NeLIC plans to reinstate the repository on a new platform. A Secure Hub for Access, Reliability, and Exchange of Data (SHARED) University of Chicago, United States of America The University of Chicago is developing SHARED (Secure Hub for Access, Reliability, and Exchange of Data) as a comprehensive resource for data-driven research and an integrated data management platform. Funded by the NSF, SHARED offers federated data storage across disciplines to promote collaboration and exemplary data management practices. This infrastructure extends beyond managing 'active' data by integrating with UChicago's institutional repository for data sharing in line with FAIR principles. The initiative is spearheaded by the University Library and Research Computing Center, alongside several university departments. SHARED aligns with the university's data lifecycle strategy, ensuring data access, analysis, publication, distribution, and long-term archiving. It supports diverse scientific applications, from cosmological studies to linguistic research, with a four-petabyte Ceph-based storage platform. Projects include dark matter searches, simulations of cosmic reionization, and cognitive process studies. SHARED promotes interdisciplinary research and offers educational opportunities through collaborations with minority-serving institutions and K-12 student engagement. 
Initiatives like the Data Science Preceptorship program with Chicago City Colleges enhance workforce diversity and data-related education. This poster will showcase SHARED’s approach to integrating active storage with long-term data sharing and preservation through repositories. Building Inclusive Repositories: Addressing the Accessibility Gap for BVI Users University of Wisconsin-Milwaukee, U.S.A. Digital repositories are essential for disseminating knowledge and advancing research. However, significant accessibility barriers continue to hinder Blind and Visually Impaired (BVI) users, limiting their ability to navigate, search, and access content effectively. This poster presentation addresses this critical gap, analyzing the challenges posed by inaccessible interfaces and proposing actionable strategies to foster inclusivity. Existing repository designs often fall short, lacking proper semantic markup, alternative text for images, keyboard navigation support, and integration with assistive technologies such as screen readers. These deficiencies marginalize BVI users, restricting equitable access to information. Adopting a user-centered approach grounded in established accessibility guidelines, this work synthesizes insights from a comprehensive literature review to identify critical barriers. It proposes solutions such as accessible metadata schemas, alternative content formats (e.g., audio descriptions, braille), and enhanced compatibility with assistive technologies. By embedding accessibility throughout the repository lifecycle (from design and development to content creation and preservation) this work underscores the transformative potential of inclusive practices. Addressing the accessibility gap will empower BVI users, enabling their full participation in the digital research ecosystem and ensuring repositories fulfill their mission as equitable knowledge platforms. Enhancing Search in Digital Collections: Traditional vs. 
AI Keyword Searches Northwestern University Libraries, United States of America As AI-powered search tools become more common, users are starting to expect enhanced search capabilities across all platforms, including libraries. While generative AI search systems can provide efficient results, they are just one tool in our search arsenal with strengths and weaknesses like any other. Northwestern Libraries Digital Collections provides both lexical (traditional keyword-based) and semantic (AI-powered) search options. This poster will examine the different results offered when searching both systems with the same keywords. By comparing specific search terms, we aim to understand how each tool processes queries and what kind of results they produce. We will assess which system provides the most relevant and valuable resources for each search, helping us to both guide users in effectively using both systems and to refine metadata processes to improve future result relevance across all search methods. Publishing datasets with JoDaKISS and Episciences overlay journals 1CNRS - CCSD, France; 2University of Stuttgart, Germany This poster demonstrates an innovative publishing workflow implemented by JoDaKISS, a new diamond open access journal dedicated to simulation science data and software, hosted on the Episciences overlay journal platform. The workflow represents a significant advancement in scholarly communication by integrating multiple open infrastructures and tools to support the peer review and publication of research datasets. Through Episciences' overlay journal model, which builds upon existing open repositories, the workflow connects various components including data repositories, automated review tools, and Episciences peer-review tools. One of the features is the implementation of the COAR Notify Protocol, enabling real-time notifications and establishing connections between research objects across different services. 
The poster will visualize the complete workflow, highlighting the roles of different participants and systems, from initial dataset submission through technical quality checks, automated review, and scientific peer review, to final publication. By presenting this decentralized publishing solution, we aim to engage conference attendees in discussions about innovative approaches to dataset publication and inspire broader adoption of interoperable scholarly communication infrastructure. BNP Digital and its relevance to make visible the documentary bibliographic heritage and the native languages of Peru BIBLIOTECA NACIONAL DEL PERU, Peru The BNP Digital, launched in 2009, provides free access to and dissemination of the Bibliographic Documentary Heritage (PBD) held by the National Library of Peru. Since December 2023, a new version of this platform (DSpace 7.X) has been available, with technical and functional improvements: a more attractive design, standardized metadata, a more intuitive and powerful search engine, and interoperability. This has increased visits and ensures accessibility to the PBD for future generations, especially for underrepresented communities. The BNP Digital brings together documents by renowned Peruvian authors that make up the nation's cultural heritage as well as valuable bibliographical gems (incunabula and others) that are part of the UNESCO Memory of the World Program. It should be noted that several of these documents were written in indigenous Peruvian languages (Quechua, Aymara, Shipibo, among others). 
In general terms, indigenous languages have little presence on the Internet, therefore, within the framework of the International Decade of Indigenous Languages (2022-2032), through the BNP Digital, it is important and necessary to contribute to guaranteeing access, preservation and dissemination of digital PBD in and about indigenous languages, especially public domain and/or open access material, thereby achieving greater representation of this community by applying an intercultural approach. Concepts of Visibility, findability, discoverability, SEO and ASEO in digital repositories. 1Universitat Pompeu Fabra, Spain; 2Universidad de Chile, Chile This study investigates the concepts of visibility, findability, discoverability, Search Engine Optimization (SEO), and Academic Search Engine Optimization (ASEO) within the context of digital repositories. It specifically addresses the various definitions of each concept, their interrelationships, and the differing techniques applied to each. A scoping review was conducted to achieve this objective, encompassing documents published between 2019 and 2024 and indexed in Web of Science, Scopus, and OpenAlex, focusing on one or more of these concepts in digital repositories. Methodologically, this scoping review utilizes the SALSA framework (Grant & Booth, 2009), which involves the phases of Search, Appraisal, Synthesis, and Analysis. The anticipated outcome is to generate a conceptual overview outlining the differences and similarities between these concepts and the optimization techniques for each property in digital repositories. 
Experimentation, Implementation, and Evaluation of AI/ML Tools in Repository Submission Workflows Los Alamos National Laboratory, United States of America The Los Alamos National Laboratory (LANL) Research Library recently completed a major effort that encompassed successfully deploying new repository software and implementing an entirely new information management ecosystem for LANL’s corpus of unclassified scientific and technical information (STI). In parallel with this work, we explored open-source AI/ML tools to enhance user-mediated deposit of STI into our repository infrastructure. These potential enhancements included automated metadata extraction, keyphrase generation, image/table/chart extraction, transcript creation for AV materials, and content summarization. By using these tools to extract these features and supplement user-supplied metadata, we hope to obtain more complete metadata records, enable better discovery in internal/external catalogs, and provide options for accessibility. At the same time, outputs of these tools must be validated and verified by human actors to ensure fidelity and accuracy. As such, we have deployed human-in-the-loop workflows to ameliorate potential errors. Furthermore, we are incorporating analytic mechanisms to track the use, disuse, and correction of these outputs to better assess the utility and reliability of these AI/ML tools. Exploring ETD Embargo Policies: Survey Results and Practical Guidance for Repository Managers 1University of Texas at San Antonio, United States of America; 2Texas Tech University, United States of America This poster will present the results of a survey conducted in summer 2024 of university institutional repositories’ electronic theses and dissertations (ETDs) policies, specifically regarding embargoes. The goal of the survey was to gather examples of policies in order to develop a resource for repository managers who are creating or updating their ETD policies. 
We developed and distributed a Qualtrics survey with questions for each institution regarding definition of ETDs, embargo availability, length of embargoes, extension options, and more. We asked institutions to submit links to their publicly available ETD policies or share their internal policies if they wished. The results of the survey will be presented along with the resulting policy resource. This poster will be of particular interest to repository managers who have ETD collections or are going to start an ETD collection in the future and would benefit from examples from other institutions. Fostering good practices at the Cultural Heritage Open Scholarship Network (CHOSN) British Library, United Kingdom This poster will introduce a recently developed community of practice called the Cultural Heritage Open Scholarship Network (CHOSN) in the UK, present its progress to date, and outline the benefits of collective efforts in establishing good practices for open scholarship activities. We place openness at the centre of CHOSN’s activities and use open repositories as a means to make GLAM research openly accessible and visible. Improving Research Availability in Low-Bandwidth Areas: Eprints3v5 Bundle Export EPrints Services, University of Southampton, United Kingdom Loading large modern web pages in areas without fast internet connections means that precious bandwidth is used for non-essential data, such as colours and animations. A repository user is primarily interested in navigating effectively to and reading research, not the presentation of the research on a webpage. At OR24 there was a buzz surrounding the idea of digital repositories provisioning for internet-in-a-box applications. EPrints is very good at curating data and exporting to any kind of format. Therefore the idea is to create an export option where a user who is interested in a set of outputs can export a set of basic HTML pages and the PDFs of research where available. 
This set of HTML pages and PDFs is as small as it can be in order to be added to any computer. The user on the other end can navigate the HTML pages and find the research they are looking for and have it served locally, wherever that may be, around the globe. We plan on making this standard in the latest version of EPrints, 3v5. New Work Types are Here: Expanded ORCID Metadata Schema based on COAR ORCID, Inc., United States of America Since its founding 12 years ago, ORCID’s metadata schema has been based on the CASRAI standard. CASRAI ceased operating as an organization in 2020, so, in order to keep our vocabulary vibrant, ORCID will expand its work type vocabulary in 2025 along the lines of the COAR resource type list, with particular attention to non-traditional research outputs, which humanities and social science scholars have found difficult to identify clearly in ORCID. ORCID interacts with numerous systems and stakeholders in the research cycle, but repositories in particular stand to be most transformed by this output vocabulary. As this expansion will take place in early 2025, it will be possible to glean the first impacts in the ORCID registry data and research repositories at Open Repositories 2025. PHAIDRA: A Journey Towards Scalable Open Repositories University of Vienna, Austria PHAIDRA [1] is the repository for the permanent, secure storage of digital assets at the University of Vienna. Since its launch in 2008, the repository has evolved constantly and seen significant transformation. What began as a monolithic internal system has developed into a modern, microservices-based open-source solution, developed in close collaboration with the user community. The latest fully dockerized version enables faster, simpler, and automated installation, while modular metadata workflows ensure flexibility to meet diverse research needs. 
Supported by the migration from Fedora 3 to Fedora 6, these advancements ensure compliance with FAIR principles and long-term adaptability, while making PHAIDRA a scalable and versatile repository platform that can easily be adapted for various institutional needs. This poster outlines the stages of PHAIDRA’s evolution and architectural rebirth. It will visualise how the project addressed limitations of its original design and contemporary challenges in repository management. Key advancements include enhanced metadata flexibility through linked data standards (e.g., JSON-LD), alignment with FAIR principles and scalable solutions for long-term sustainability and adaptability through Docker. These efforts make PHAIDRA a versatile platform for diverse use cases - from theses and Open Educational Resources (OER) to digital archives. Plucking and Re-planting ORCIDs in Data Repository Datasets: Readme Harvesting for Metadata Improvement University of Minnesota Libraries, United States of America Persistent identifiers (PIDs) for research data authors are increasingly important and have been identified as major gaps in the metadata for data repositories including the Data Repository for the University of Minnesota (DRUM). Open Researcher and Contributor IDs (ORCIDs) are already strongly recommended as unique author IDs in the scholarly community and are collected in the DRUM repository through text-based Readme file templates. An increasing number of journals and funding bodies are requiring ORCIDs from submitters including more US federal agencies in May 2025. In DRUM, ORCIDs are present in the well-structured, text-based Readme files, but are not incorporated into the DSpace-based system metadata which prevents automation and reuse of the data in processes such as minting DOIs with DataCite. 
This metadata improvement project seeks to harvest the ORCID information in these Readme files and transform them into a format that will enable better integration with the repository metadata structure and the global scholarly infrastructure at large. Tools and resources used in this project include the DataCite API, DSpace API, OpenRefine, and Python scripts to collect and analyze local metadata and transform it. This process will be documented to enable automated and repeatable processes for a strategic metadata improvement project. TU Wien & OSTrails: Connecting services TU Wien, Austria Modern research infrastructure has arrived at a point where researchers have an array of services available to help with many tasks in their research workflows. However, these services still mostly exist as island solutions with limited interoperability. TU Wien has the technical lead in a European project called OSTrails aiming to define “Interoperability Frameworks” and APIs for common workflows spanning various types of research infrastructure services (DMPs, SKGs, FAIR assessment). These APIs are designed to enable (semi-)automatic execution of common workflows spanning different types of services, and thus eliminate some manual work for researchers. Further, by defining these APIs in a service-agnostic manner, the risk of vendor lock-in can be reduced which will lead to a more sustainable ecosystem. At OR2023, we presented a preview of such an integration between a research data repository and a DMP tool where users could report datasets as being reused in a DMP via the interface of the repository. In late 2024, TU Wien deployed this integration, connecting our InvenioRDM-based research data repository and Damap-based DMP tool. This poster is intended to spread awareness of the ongoing standardization effort in the OSTrails project, and perhaps even inspire follow-up projects beyond European borders. 
Change Platforms at the Next Station: A Repository Migration Itinerary National Transportation Library, United States of America Serving the U.S. Department of Transportation (USDOT), state departments of transportation, universities, and transportation organizations across the country, the National Transportation Library (NTL) supports a broad community of transportation researchers. The NTL provides an open access repository for research produced or funded by the USDOT as part of its legislative mandate. ROSA P – the Repository and Open Science Access Portal – is migrating to a new digital repository platform, aiming to improve functionality and features for both the front and back end. The journey to a new repository requires careful planning to ensure the community of stakeholders reaches the destination with minimal disruptions. This poster will demonstrate how the NTL is preparing a migration itinerary that includes training, documentation, and communication. Leveraging ReCiter to identify articles, notify authors, and facilitate deposition of manuscripts into Cornell’s eCommons Weill Cornell Medicine, United States of America Institutional repositories (IRs) are ideal venues for storing research products such as postprint manuscripts and data. However, IRs are generally under-utilized due to the significant time and administrative burdens placed on authors, IR facilitators, and institutional stakeholders. This project seeks to alleviate these burdens by integrating Cornell’s IR, eCommons, built in DSpace, with ReCiter, an authorship prediction algorithm that links authors with papers and contains authoritative institutional data about Weill Cornell faculty and their output. To identify which manuscripts can be uploaded into eCommons, journal embargo policy information has been added to the ReCiter database. This allows us to identify applicable papers based on author status and position, as well as what the embargo policy is for each record. 
Automated/semi-automated processes using Microsoft Power Automate and Airtable have been built to notify authors, collect manuscripts, fill out bulk deposition forms, and generate documentation required for uploading and hosting in eCommons. PROMOTING INCLUSIVITY IN OPEN REPOSITORIES: A COMMUNITY-CENTRIC APPROACH IN NEPAL OPEN ACCESS NEPAL, Nepal This proposal explores the significant challenges and opportunities in promoting inclusivity within open repositories in Nepal, a country with diverse linguistic, cultural, and technological landscapes. Despite the global momentum for open access, repositories in Nepal face specific barriers such as limited infrastructure, digital literacy challenges, and diverse linguistic needs. Open repositories are key to fostering equitable access to knowledge, but their effectiveness in Nepal is hindered by these issues. This work proposes a community-centric approach that leverages local partnerships, capacity-building programs, and multilingual support to ensure that repositories are accessible, inclusive, and representative of all communities. Through case studies and pilot projects initiated by Open Access Nepal, the poster will present tangible examples of how grassroots involvement can address barriers to access and engagement. These initiatives, such as the inclusion of indigenous languages in metadata and the development of localized training for digital literacy, aim to bridge gaps and make open repositories more inclusive. This poster aims to contribute to the broader global conversation about inclusivity in open repositories by sharing insights into the Nepalese context and offering strategies applicable to similar challenges worldwide. 
Toward accessible PDF documents in Open Access Repositories Instituto Tecnológico de Costa Rica, Costa Rica Currently, disability affects 15% of the world's population (approximately one billion people), according to data from the World Health Organization’s (WHO) first report on disability. Of these, 645 million people experience visual or hearing impairments, which can hinder the proper use of academic documents. Most documents in Open Access Repositories are deposited in PDF format, making it crucial to ensure their accessibility. Consequently, it is necessary to evaluate the accessibility of PDF documents in open access repositories available on DOAR (Directory of Open Access Repositories) within the Communications and Information Technology field. For developers and site managers, it is important to understand the accessibility status of PDF documents deposited in repositories and to learn how to address and resolve these issues effectively. Digital Repositories in the Arab World: Status, Challenges, and Future Prospects Bibliotheca Alexandrina, Egypt Digital repositories in the Arab world are vital for preserving cultural heritage and academic research. Despite advancements, they face challenges, such as infrastructure limitations and language barriers. This poster will outline the current state of these repositories, key obstacles, and the potential role of artificial intelligence (AI) for future growth. AI technologies offer promising solutions for metadata automation, enhanced retrieval, and language processing, supporting a more accessible and collaborative knowledge network. Helping to preserve and share Oaxaca's history San Diego State University, United States of America SDSU has a history of partnerships with cultural heritage organizations in Mexico. 
While our partnerships with organizations such as the Archivo Historico in Tijuana and the Biblioteca de Investigación Juan de Córdoba in Oaxaca are well established, we have recently embarked on a new project with the Fray Francisco Burgoa library in Oaxaca. The Burgoa holds a one-of-a-kind collection of 19th and early 20th century newspapers from the Oaxaca region. The collection is well known and heavily consulted. Unfortunately, the issues were folded and bound into unwieldy volumes resulting in serious damage to the fragile paper. The staff at the Burgoa have embarked on a preservation and digitization program. They are stabilizing the paper and creating high quality digital images, but they have no means of displaying the newspapers online. On our part, SDSU is working to compile the images, create bilingual metadata and make them accessible through our online repository, Archipelago. The collaboration will allow more researchers to access the historic material without damaging the fragile originals. This poster outlines and illustrates our cross-border collaboration. Bridging the Gap: Developing Open Digital Repositories for Namibian Cultural Heritage Namibia University of Science and Technology, Namibia Repositories have emerged as crucial hubs for preserving and disseminating knowledge, cultural heritage, and scientific research. However, to continue serving a diverse global community, they must adapt to evolving technological landscapes and societal needs. Digital repositories have the potential to democratize access to information and cultural heritage. However, significant disparities persist in accessing and utilizing these resources. This project aims to develop and implement inclusive digital repositories that adhere to accessibility standards, and to develop strategies to support multilingual content, user interfaces, and consent from rights holders. Currently, Namibia has few digital repositories that preserve indigenous knowledge. 
One of these is the Digital Namibian Archive (DNA), an innovative project that brings together international partners to develop a rich digital resource that presents the diversity of voices and cultural stories of Namibian people to individuals throughout the United States, Africa, and the world. However, analysis has yet to establish which technologies these repositories adopt for sustainability. Existing platforms offer limited visual content, especially videos, and limited multilingual accessibility; this is the gap this project intends to close. COAR Resource Type Vocabulary - Enhancing Interoperability Across the Repository Ecosystem 1COAR, Netherlands, The; 2CSIC, Spain COAR maintains three controlled vocabularies. In December 2024, the COAR Resource Type vocabulary, the most widely used of the three, was updated and now contains 105 terms that represent the wide variety of scholarly resources that are being deposited, managed, and made available through repositories around the world. Each concept is accompanied by a definition and is related to other concepts in the vocabulary, as well as linked to similar concepts in third-party vocabularies. Thanks to the efforts of COAR’s international membership, the vocabulary is now available in 25 languages, offering even greater value to the global open science ecosystem. This is an excellent illustration of the COAR community’s strong commitment to multilingualism in scholarly communications. Nice DSpace Indiana University Libraries, United States of America Repository systems exist within real world system architectures. They share resources with numerous and varied other software systems. For this arrangement to be sustainable all systems must use the available resources responsibly. 
At Indiana University Libraries, three production DSpace 7 instances share system resources with each other, with other repositories, and with cataloging, processing, and archiving systems, all living on university-wide enterprise hardware. Through research and ongoing experimentation, methods and tools have been implemented at IU to ensure DSpace instances remain well-behaved, utilizing enough resources to remain responsive, but never leaving other systems to starve for memory or CPU time. This poster will describe those methods and tools. Powering Institutional Repository Growth with OA Switchboard Carnegie Mellon University, United States of America This poster presents Carnegie Mellon University Libraries' approach to integrating OA Switchboard with CMU’s Figshare repository, called KiltHub, to automate the creation of new records for CMU-authored publications. This project leverages OA Switchboard notifications to efficiently add metadata records for Open Access (OA) publications that are not already represented in KiltHub. Through this integration, we hope to provide an accurate representation of nearly all OA publications by CMU authors in the institutional repository. This poster provides an overview of the ongoing project of connecting OA Switchboard with Figshare for an automated data feed, which enables the efficient transfer of standardized metadata for creating new records within KiltHub. Turning mirrors into windows: an opportunity to build open repository to promote access to Indigenous Knowledge at Lusaka Apex Medical University, Zambia Lusaka Apex Medical University, Zambia Indigenous Knowledge (IK) remains a key asset of rural communities in Zambia. This knowledge is passed on from one generation to the next through custodians, who are mostly elders, and local people rely on IK to make decisions on various aspects of their lives. 
However, one of the critical components in IK is traditional medicine, which is diminishing because of an increase in barriers that affect its transmission between and within community members. This paper aims to discuss an on-going project called TCAM (Traditional, Complementary, and Alternative Medicine) being undertaken at Lusaka Apex Medical University (LAMU) to document traditional medicines through capturing all the key data elements for each plant used by diverse communities. The documented data is deposited on the LAMU open repository, allowing access to wider communities. The paper will discuss the current status of the LAMU open repository, factors influencing the implementation, and challenges faced in implementing the TCAM project. Qualitative data will be gathered using an interview guide through face-to-face interviews with 8 participants who are part of the steering committee mandated with the responsibility to document and deposit information on the LAMU open repository. Open Access Repositories Tracking Project CORE, The Open University, United Kingdom We are conducting a new research project to improve the global understanding of repository content by generating yearly statistics on research materials held by repositories. The project will assess the volume of undiscoverable content, estimate the number of repositories that are not registered in public repository registries, and differentiate scholarly resources deposited directly in repositories from those merely linked. It will also consider issues of duplication and multiple versions, such as preprints and postprints, to produce statistics that can be easily interpreted. This work complements and extends existing Open Access monitoring efforts by including content that both contains and lacks persistent identifiers (PIDs), taking into account an estimated 100m+ scholarly records that usually go untracked. 
Ultimately, the project aims to build an infrastructure for better monitoring and understanding of repositories' roles in scholarly communication, facilitating future research into Open Access growth. Rescuing Legacy Data: Using Optical Character Recognition Technologies to Make Airline Consumer Data Accessible National Transportation Library, United States of America This poster highlights efforts to rescue and make accessible the U.S. Department of Transportation Air Travel Consumer Report data tables. This data has been previously made available as PDF documents or physical print documents that include data tables. These data tables are now being extracted and converted into an accessible tabular format for publication on NTL's Repository, ROSA P. Using ABBYY FineReader PDF software, this project transforms and rescues PDF-locked data tables into machine-readable formats, ensuring greater accessibility and usability for researchers and the public. This poster will not only cover how to achieve this goal of preservation using ABBYY FineReader PDF software, but it will also provide an accessible roadmap that can be repurposed for other projects and other repositories. Through rescue efforts such as these, other legacy data projects can be executed efficiently by data professionals. Engaging a City’s History through Consortial Search Chicago Collections Consortium, United States of America The city of Chicago has over a hundred collecting and archival institutions; most of which include serving the community of Chicago within their mission statements. The Chicago Collections Consortium is a non-profit organization that serves as a collaboration among organizations throughout the city of Chicago. This session will focus on how the Consortium (CCC) created a diverse and collaborative environment for cultural heritage workers across the city; creating shared authority and opportunities for connection and leadership through an active committee structure. 
This includes discussions on how CCC ideates, creates, and iterates content across physical and digital channels, for audiences from tourists to genealogists, students to families, and beyond. Using the Consortium’s work around the EXPLORE portal, this lightning talk will include a discussion of how to maximize efforts to bring collections into conversation with each other, provide a seamless user experience, and create a process that accommodates the voluntary nature of a consortial model. The purpose of this talk is to inspire cities around the country to adopt similar Consortium methods, bringing distinct ethnographies out of individual archives and together forming a multifaceted telling of history and culture. An attempt to create a sustainable data repository and CRIS using a bespoke application National Institute for Materials Science This presentation will introduce a use case of merging and migrating an existing institutional repository and researcher directory database to a new bespoke application. National Institute for Materials Science has been providing Materials Data Repository (MDR, https://mdr.nims.go.jp) and NIMS researchers directory database SAMURAI (https://samurai.nims.go.jp) for several years. They were developed and maintained separately for several years, but it has become much more difficult for the system administrators to maintain both services due to an emerging demand for immediate open access and a shortage of IT engineers. To solve the problem, we decided to develop a new CRIS application “MDR2” and merge those two applications into the new application. This presentation will describe why we decided to migrate and merge those two applications and how we designed the new data repository and CRIS application to simplify our implementation. 
Preserving Narratives of Disinformation: a digital repository for research and analysis of disinformation Instituto Brasileiro de Informação em Ciência e Tecnologia, Brazil This proposal addresses the growing issue of disinformation, which has been amplified by technological advancements such as the expansion of internet access and artificial intelligence tools. The COVID-19 pandemic, starting in 2020, highlighted the significant impact of disinformation on public health, contributing to a decline in vaccination rates in Brazil. The Brazilian Institute for Information in Science and Technology (Ibict) has been working for 70 years to strengthen Open Science, with initiatives like the Minerva Network that monitor digital discussions to counter disinformation. The proposal outlines the development of a Digital Repository to catalog disinformation narratives from the pandemic period, supporting future research. This initiative aligns with platforms like Tempora and datasets like FakeNewsNet, which collect and organize digital information for analysis. Key challenges include institutional support, defining metadata standards, ensuring long-term data preservation, and building an accessible platform for diverse audiences. The project aims to foster discussions about the importance of preserving disinformation narratives, not only for future studies on their social, cultural, and political impacts, but also for mitigating current disinformation. |
16:30 - 18:30 | Poster Session and Welcome Reception Location: Sherry Lansing Theatre |
Date: Tuesday, 17/June/2025 | |
08:00 - 17:00 | Registration Location: Rogers Lobby |
09:00 - 10:30 | Presentations- Advocacy and the Human Touch Location: Griffin Auditorium |
|
Our Common Cause: Advocating for Human Labor for Outreach, Technical Expertise, and the future of Institutional Repositories Syracuse University Libraries, United States of America Since 2021, SURFACE, Syracuse University’s institutional repository, has seen substantial growth in both content and commitment to long-term digital preservation. This progress is largely due to the Syracuse University Libraries’ investment in a dedicated three-person team focused on sustainability and development. As we face emerging technologies and AI, we question whether AI can replace human labor to reduce costs. Despite the financial appeal of cost-cutting, our presentation underscores the essential role of human labor. Through community-focused outreach and technical advancements like mass minting DOIs and restructuring our platform, we have demonstrated the irreplaceable value of human involvement. This presentation will explore the human labor behind the growth and maintenance of an Institutional Repository, arguing that AI cannot replace the nuanced and strategic efforts of human workers. We will discuss how our distributed labor approach has fostered a vibrant community and ensured long-term sustainability and preservation. Additionally, we will address the importance of outreach in transforming repositories into dynamic spaces of shared resources produced and maintained by collective effort. Through real-world examples, we will illustrate the limitations of AI and the critical need for human expertise in managing and preserving digital resources. 
Better Together: Growing Digital Repositories through Community-building and Collaboration 1California State University San Marcos, United States of America; 2California State University Office of the Chancellor; 3California State University Northridge; 4California State University Fullerton; 5California State University Dominguez Hills The California State University system is the largest public university system in the United States but does not have many shared services amongst the 23 campuses. The digital repository ecosystem with the CSUs has grown from individual campus repositories with no shared governance structure to a three-pronged systemwide service that has a steering committee, working groups, shared documentation/guidelines, and an annual meeting. But growing together isn’t always easy! The campuses have different resources, different expertise levels, and very different needs. This presentation examines how we have gone from an individual campus approach with no centralized infrastructure to one of shared repositories, shared documentation, workflows by consensus, and collaborative co-working sessions. Over the past decade, we have created governance structures, committees and communication paths, open forums and annual meetings. We have worked to bring each other along, sharing knowledge and work across this large (and under-funded) system. This collaborative approach has been slow at times, but with our ScholarWorks Institutional Repository, the CSU Digital Archives, and the CSU Open Journals, we are working together to build our skills and capacity across the system – together. 
Automating depositing research to the institutional repository and reshaping the ‘call to action’ for authors University of Oxford, United Kingdom Recent internal reports have shown conclusively that researcher engagement with the University of Oxford’s green Open Access (OA) service, consisting of Symplectic Elements and Oxford’s institutional research repository, ORA, has reduced since the last REF period ended in 2021. Around 30% of articles follow the University’s ‘Act on Acceptance’ (AoA) policy, which asks authors to deposit a digital copy of the accepted manuscript of journal articles and conference papers within three months of acceptance. Many funder policies are now out of step with AoA; action within three months is no longer sufficient as they require immediate OA on publication. The forthcoming Research Excellence Framework OA Policy also sees a move towards publication as the point for action. Publisher Transformative Agreements (TAs) have also had an impact on the decline in reliance on ‘green’ open access due to the wider inclusion and greater coverage of OA publishing through TAs, whether or not the author has been in receipt of research funding. This presentation explores the challenges for institutional repositories in ensuring that they can continue to collect OA content through innovative technical developments whilst providing value and vision to the academic community, using Oxford as a case study. Empowering Community Voices Cal Poly Humboldt, United States of America To support our communities is to support the students, faculty, and staff who live there; the politics, society, and environment upon which our campus is situated; the potential of our future students; and an empowered, educated, and engaged populace. Towards that goal, in 2015, the Cal Poly Humboldt Library opened their institutional repository and library publishing services to the community. 
In the ten years since, the library has added three online community collections and experienced steady growth in community publications across the first six years before leveling out at 10-15 publications annually. These community engagements have brought a wide range of benefits to the campus and community, including: • scholarly communications work for student assistants, advancing their skills, engagement, and post-graduation opportunities • expanded educational programming and opportunities for students across campus • greater collaboration between the university and the community • fundraising and marketing for both the community and the university • raised awareness of local environmental and social justice issues This presentation will detail the operations of the program; the benefits to the community, students, and university; and the resources expected to sustain a community publishing service. |
09:00 - 10:30 | Presentations- Repository Implementation Location: N110- Orchestra Room |
|
Towards a Plan S compliant repository: Building a safe and sustainable haven for scholarly content. 1Tilburg University; 2Atmire Since 2013, Tilburg University has been using the Elsevier application Pure as both a Current Research Information System (CRIS) and an Institutional Repository (IR). The use of Pure as an IR has come to be considered undesirable, because Pure lacks delete-recovery, versioning, and tombstone functionality, and therefore does not comply with Plan S. Therefore, in 2022 the decision was made to use a separate, open source IR as a safe haven for all CRIS content. Pure would continue to function as input for all data relating to publications, and the full-texts would be forwarded to the IR via the built-in Pure connector. After extensive market research and a tender procedure in 2024, TiU found a suitable IR application (DSpace) and an implementation and hosting partner (Atmire). Over the years, several institutional repositories have integrated with Pure. More generally, the idea of using a CRIS as the submission front end and leveraging the repository for preservation and dissemination is common. However, the new challenges of complying with Plan S requirements, in combination with the capabilities and limitations of the DSpace 7-Pure connection, both presented here, give the audience an update on the state of the art in this area. The Green Road to Open Access: potential and limitations in the experiences of the University of Brasilia (UnB) Universidade de Brasilia, Brazil In Brazil, the implementation of the Green Road took place through Institutional Repositories (IRs), which serve to collect, store, organize, and disseminate scientific output. However, studies show that the self-archiving model is not widely adopted in the country. Based on this observation, data was collected to assess the extent to which IRs are facilitating access to scientific publications. 
A case study involving 14 CNPq level 1A productivity researchers affiliated with the University of Brasilia (UnB) revealed that only 11% of the 2,200 articles self-reported by these researchers are deposited in the UnB IR. An analysis of 136 editorial policies indicated that this low percentage is primarily due to operational challenges rather than editorial restrictions. KWAME NKRUMAH UNIVERSITY: EVOLUTION AND CHALLENGES IN MANAGING POSTGRADUATE RESEARCH REPORTS KWAME NKRUMAH UNIVERSITY, Zambia Kwame Nkrumah University (KNU), Zambia’s fourth public university, has undergone a remarkable evolution since its founding in 1967 as Kabwe Teachers Training College. Initially established to train junior secondary school teachers, the institution achieved university status in 2014, expanding its academic offerings and infrastructure. Today, KNU serves over 10,000 students through a diverse range of programs under its schools of Natural Sciences, Business Studies, Humanities and Social Sciences, Health Sciences, and Education, alongside a Directorate of Research and Postgraduate Studies. The university’s reliance on physical storage for postgraduate research reports has posed significant challenges, including space constraints, accessibility issues, preservation difficulties, logistical costs, and limited discoverability. These issues, coupled with rising postgraduate enrollments and environmental concerns, underscored the need for innovative solutions. Health Educational Resources Repository (ARES): experience report on its updating process 1Universidade Aberto do Sistema Único de Saúde (UNASUS), Brazil; 2 Abstract Objective. This research aims to demonstrate the updating process of the Brazilian repository called Health Educational Resources Repository (ARES), which gathers educational resources in health produced under the Open University of the Unified Health System (UNASUS) of the Ministry of Health, aimed at the continuous training of health professionals. Methodology. 
This work is characterized as an experience report as it addresses the procedures necessary for updating the Health Educational Resources Repository (ARES). It presents the technologies employed, the changes in informational architecture and metadata standards, as well as the main advancements and challenges. Keywords Educational resources; Health educational resources; DSpace; Open University of SUS (UNASUS). Audience Managers and developers working with digital repositories based on DSpace. |
09:00 - 10:30 | Lightning (24x7)- Communities Location: N112- Band Room |
|
You Can’t Force a Community: Trusting Your Instinct in Digital Collection Partnerships Oregon Health & Science University, United States of America This presentation will discuss one grant-funded digital library partnership between the Oregon Health & Science University Library and the Northwest Narrative Medicine Collaborative. While the goals of the grant were accomplished, the partnership didn’t benefit both organizations equally. What didn’t work will be described and attendees will receive concrete takeaways for community-based repository projects. Longitudinal growth and use of Open Repositories in the U.S. since 2015 Virginia Commonwealth University, United States of America One indication of the maturity of institutional repositories (IRs) in 2015 was that the Association of College and Research Libraries (ACRL) added two questions focused on IRs to their Academic Library Trends and Statistics Survey. ACRL has continued to collect this data annually, documenting the number of IR items and downloads. This lightning talk will take a longitudinal dive into this data for a quantitative IR portrait of the past decade and include the type of U.S. higher education institution as defined by Carnegie classification. By looking at this self-reported data from the past decade, growth trends and benchmarks can be observed along with ongoing questions of the various complexities in measuring IR success. Models can also be built on this past record to anticipate future capacity needs for the IR ecosystem. Lessons Learned from the Internet Archive's Vault: Expanding Access to Digital Preservation through Community Collaboration Internet Archive, United States of America Ensuring the long-term sustainability of digital repositories requires innovative solutions that are both financially and environmentally sound. The digital preservation landscape favors larger institutions with more resources. 
Vault, by the Internet Archive, has aimed to democratize this process by providing an affordable and extensible solution tailored to the needs of smaller cultural heritage organizations. Vault’s service leverages open infrastructure and a non-profit service model to provide affordable and scalable digital preservation. This presentation explores how Vault addresses the unique challenges faced by smaller institutions, including budgetary constraints, staffing limitations, and technical capacity - ultimately contributing to a more sustainable digital preservation ecosystem. We will share insights from Vault’s growth so far, highlighting how direct engagement with diverse partners shaped the development of Vault's features, cost models, and community support. From Open Archive to Multifaceted Platform: Reconciling the Diverse Stakeholders Needs in HAL CNRS-CCSD, France HAL, launched in 2001 as a multidisciplinary open archive, has evolved into the national repository infrastructure for the French research community. With over 1.4 million files, including articles, preprints, and conference papers, HAL has experienced significant growth since 2019, with annual deposits exceeding 150,000 files. The platform provides long-term preservation and services for researchers and institutions, balancing their diverse needs. This presentation examines four key factors driving HAL’s evolution, highlighting its dual role as a technical system and a collaborative network. First, HAL has expanded its metadata capabilities beyond traditional bibliographic information, enabling research monitoring and integration with diverse information systems. Second, it has streamlined researcher tools, offering automated self-archiving and metadata extraction from deposited files to reduce workloads and increase adoption. Third, HAL’s federation model allows institutions to create customizable portals, fostering collaboration and contributing to a unified national research network. 
Finally, comprehensive user support, including training and dedicated assistance, strengthens the repository’s overall usability. While these innovations advance open science, they also raise challenges, such as managing increasing complexity, balancing simplicity with feature expansion, and ensuring HAL maintains its primary focus on open dissemination rather than administrative data management. Why and how the TRUST Principles for digital repositories are used 1World Data System, United States of America; 2University of Tennessee, Knoxville The TRUST Principles for digital repositories were formalized in 2020. Repositories must earn the communities' trust and demonstrate a long-term commitment to the data. The RDA/WDS TRUST Principles Working Group (TRUST WG) created a survey to identify what hesitancy exists, and where, regarding adopting the TRUST Principles. The survey asks about certification standards, extrinsic and intrinsic motivations behind repository certification, and public displays of the different aspects of the principles, and the responses will be reported. Additionally, a call for use cases has been issued, with analysis proceeding alongside collection, and a discussion of what is missing from the TRUST Principles points to worthwhile future work. A preliminary analysis of results collected as of November 3, 2024 (n=49) was presented at RDA P23 in Costa Rica. Still, this presentation will be the first to share the final survey results of approximately 77 responses from 26 countries, with an additional 16 responses in progress (as of 1/13/25; the survey closes 1/31/25, so the included statistics and figures may change). Cost-benefit of open research infrastructures: the case of the Portuguese repositories network RCAAP 1University of Minho, Portugal; 2FCT-FCCN; 3CSIL Open Science is transforming research and knowledge sharing by promoting transparency, collaboration, innovation, and scientific progress. 
Despite its integration into major policy frameworks and funding programs like Horizon Europe, there is a pressing need for evidence on its economic impacts to validate the investment made by funders and institutions. This presentation highlights the findings of a cost-benefit analysis of the Portuguese network of open access repositories, RCAAP, conducted as part of the PathOS project. PathOS aims to assess the effects of OS through impact pathway analysis, literature reviews, causal effect narratives, and CBA. RCAAP serves as Portugal’s central system for discovering and retrieving scientific content, hosting hundreds of thousands of outputs. Its goals include enhancing the visibility and accessibility of Portuguese research, facilitating access to research outputs, and integrating Portugal into international OA initiatives. The CBA of RCAAP analyzed its infrastructure, services, and usage data collected via desk research, surveys, and interviews. The study compared the benefits of RCAAP’s operation with a counterfactual scenario in which RCAAP does not exist. Findings revealed substantial net benefits, including cost savings in storage, infrastructure, and labor for institutions and reduced access costs for users. These results underscore the significant economic and research impact of OS practices. |
09:00 - 10:30 | Panel- Managing Access to Open Repositories in the Age of Generative AI Location: C116- Community Gathering Room |
|
Managing Access to Open Repositories in the Age of Generative AI 1CORE, The Open University, United Kingdom; 2Pacific Northwest National Laboratory (PNNL); 3The University of Glasgow; 4Confederation of Open Access Repositories (COAR); 5Metropolitan New York Library Council |
09:00 - 10:30 | Developer Track Session 2 Location: C119&121- Classrooms |
|
Benefits of integrating publication cost data into repositories: An aggregator's perspective Bielefeld University, Germany The financial dimension of the Open Access (OA) transformation is becoming increasingly significant. The diversity of business models makes it difficult for institutions to gain a comprehensive overview of cost flows and trends. To make informed decisions, transparent and comparable data on payments across institutions is essential. OpenAPC, located at Bielefeld University, functions as an aggregator that collects and disseminates cost data on OA publishing under an Open Database License. This presentation will highlight the pivotal role of repositories in managing publication costs sustainably and making this data freely accessible. Via the OAI-PMH interface, OpenAPC can harvest publication costs from participating repositories employing a standardized XML metadata schema developed by the openCost project. The demonstration will illustrate the OpenAPC harvesting routines using real-world use cases, showcasing the workflows for institutions that have already implemented the openCost schema. It will cover a variety of repository systems, including DSpace, EPrints, Invenio, and LibreCat. The presentation will concentrate on the benefits of storing cost data in repositories for institutions, as well as on the advantages this approach offers to aggregators such as OpenAPC, with the aim of showing how this strategy can reinforce the importance of repositories in the ongoing transformation. The IRD: Improving knowledge about the state of the repository system landscape through automated curation processes Antleaf Ltd., United Kingdom The International Repositories Directory (IRD) will be launched at Open Repositories 2025. The IRD will address concerns about repository metadata quality, including harvesting, de-duplication, responsiveness checks, and platform identification. 
This session will demonstrate the more technical aspects of this system, focussing on how the metadata quality is maintained with a set of automated curation functions. The session will conclude with a brief presentation of the improved dataset from the IRD, showing a more positive picture of repository functionality worldwide - notably that OAI-PMH support is actually 67% rather than 50%. Repository Insights with OpenSearch California Digital Library, United States of America The Merritt Digital Preservation Repository consists of 6 Java microservices and 1 Ruby microservice. Each service operates as a high-availability service running on load-balanced instances. In 2023, the Merritt team adopted OpenSearch for consolidated logging across each of these instances. The migration to a consolidated logging solution produced benefits beyond the team's expectations. This presentation will describe the Merritt team’s need for consolidated logging. Because the university has adopted AWS technologies, OpenSearch was a logical solution. The presentation will provide a brief overview of OpenSearch and an explanation of the distinction between OpenSearch and ElasticSearch. Once the team had published log records to OpenSearch, the team began to explore the data visualization capabilities of OpenSearch. The creation of data visualizations enabled the team to proactively identify problems before users had reported them. Additionally, the visualization capabilities of OpenSearch were so attractive that the team has begun publishing aggregated data records to OpenSearch for visualization purposes. Integrating Digital Repositories and Learning Management Systems using EduLink University of Wisconsin - Madison, United States of America Learning Management Systems have become ubiquitous in Higher Education and K-12, especially post-pandemic with the rise of remote and hybrid learning. 
This talk discusses the implementation and deployment of the new EduLink plugin for the Metavus digital collections platform, which leverages the Learning Tools Interoperability (LTI) Deep Linking protocol to integrate digital repository materials seamlessly into LMSes. Metavus underpins ATE Central, the official information hub and archive for the National Science Foundation's Advanced Technological Education program, and the new EduLink plugin is being used to power STEMLink, a service institutions can use to incorporate ATE-developed STEM curriculum into their LMS courses. In addition to looking at the overarching issues posed by repository/LMS integration, specific techniques used to overcome challenges using LTI in the face of tracking prevention in modern browsers are discussed, and a demo of using STEMLink to add a lab activity to a Canvas course is provided. Tailoring Research Data Management Solutions: Lessons from Ethiopian Institutions Addis Ababa University; AcaTech Technology, Ethiopia As problems grow increasingly complex, data-driven interventions become vital solutions, with repositories serving as the cornerstone. This paper presents innovative approaches used in the design and implementation of three key projects in Ethiopia: the Agri-Datahub, Research Data Repository (RDR) at Addis Ababa University, and National Academic Digital Repository of Ethiopia (NADRE) by the Ministry of Education. While open-source tools provide significant opportunities, we found that they often cannot be deployed "as-is" to address the intricate problems we faced. Each project required novel strategies to orchestrate existing technologies for specific workflows. The Agri-Datahub exemplifies this innovation, serving as a one-stop data service for value chain actors in Ethiopian agriculture. 
We integrate state-of-the-art data science and big data technologies, and offer robust features such as data acquisition, curation, scalable multimodal storage, and comprehensive access options, dashboards, and APIs. Complementing Agri-Datahub are RDR at AAU, tailored to address research data management needs using the Dataverse platform, and NADRE, launched during the COVID-19 pandemic to centralize resources for higher education. In general, the projects address data fragmentation, governance challenges, and technical constraints, demonstrating the power of stakeholder collaboration and innovation to transform Ethiopian agriculture and academia through data-driven solutions. |
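The OpenAPC abstract in this session mentions harvesting cost data over OAI-PMH. As a rough illustration only, the sketch below shows how one page of a `ListRecords` response can be requested and read using the Python standard library. It is not OpenAPC's harvesting routine: the base URL is a placeholder, `oai_dc` stands in for whatever metadata prefix a repository exposes for openCost data (the abstract does not name it), and `SAMPLE` is a hand-written response rather than real repository output.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 response documents.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix=None, resumption_token=None):
    """Build a ListRecords request URL.

    In OAI-PMH, resumptionToken is an exclusive argument: when it is
    present, metadataPrefix must not be sent again.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.iterfind(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # An empty resumptionToken element marks the last page.
    token = token_el.text.strip() if token_el is not None and token_el.text else None
    return identifiers, token


# Hand-written sample response (hypothetical identifiers and token).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    ids, token = parse_page(SAMPLE)
    print(ids, token)
    print(list_records_url("https://repo.example.org/oai", resumption_token=token))
```

A full harvester would fetch each URL, call `parse_page`, and repeat with the returned token until it is `None`.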
10:30 - 11:00 | Coffee Break Location: Rogers Lobby |
11:00 - 12:30 | Presentations- Software: repositories, metadata and citations Location: Griffin Auditorium |
|
Exploring global academic repositories for software. Queen's University Belfast, United Kingdom The present study investigates the occurrence of research software as part of academic outputs within international institutional repositories (IRs). Previous work [1] analysed 182 academic IRs from 157 UK universities and found that there are a limited number of records for research software in these repositories. Additionally, many IRs are unable to list software as independent research outputs due to the constraints of the underlying Research Information System (RIS) platforms. In the present study, data from OpenDOAR—a directory of global Open Access Repositories—was utilized to conduct similar analyses on international IRs, marking what is believed to be the first census of its kind. A total of 4,970 repositories across 125 countries were reviewed for the inclusion of software, along with associated metadata that could indicate relevant factors. The findings suggest that there is significant potential for making straightforward technical enhancements to RIS platforms to allow for the recognition and recording of software as distinct research outputs, including linking IRs with more development-friendly repositories. We explore the implications of these results, particularly concerning the evident lack of acknowledgment of software as a discrete output within the research process. Lastly, we examine the dedicated software underpinning repositories for academia and their usage. A Repository for Preserving and Managing Running Applications University of Vienna, Austria Software preservation poses a significant challenge in modern data management since data often relies on specific software for interpretation and utilization. Unlike relatively stable data, software operates within an ever-evolving technological landscape that includes hardware, operating systems, and libraries, making long-term preservation particularly difficult. 
Current software development cycles typically last between 3 and 5 years, which can render tools obsolete and jeopardize the accessibility and reproducibility of research that depends on them. Even preserving source code demands considerable effort to recreate functional environments, including necessary dependencies and configurations. To address these challenges, we propose a repository for managing and preserving running applications using containerization technologies such as Docker and Kubernetes. This approach encapsulates software with its dependencies, creating portable and consistent environments that ensure long-term usability. A proof of concept developed at the University of Vienna demonstrates the feasibility of this method, enabling applications to run securely in isolated "sandboxes" while maintaining their operational context. Enhancing Metadata Workflows and Long-Term Preservation of Research Outputs: Adopting FAIR Principles with Dagstuhl Publishing Schloss Dagstuhl Leibniz-Zentrum für Informatik GmbH, Germany In its final report, the EOSC working group on Scholarly infrastructures for Research Software (SIRS) identifies archiving, referencing, describing, and crediting research software as critical pillars for open science and long-term sustainability. Based on these recommendations, Dagstuhl Publishing has enhanced its workflows to improve the publication, preservation, and accessibility of scholarly outputs, particularly supplementary materials such as research software. The extended submission system (DSUB) now enables authors to submit research software alongside their articles. Long-term preservation of the software is ensured by archiving on Software Heritage. The two tools for this, SWH Archive Client and SWH Deposit Client, are available as open source. Metadata is automatically fetched from platforms such as GitHub or CITATION.cff files and manually curated to ensure completeness. A demo is available at http://faircore4eosc.dagstuhl.de/. 
Furthermore, the collected metadata is published in a collection on the DROPS repository which connects articles with their supplementary materials. This submission process follows FAIR principles and maintains minimum metadata standards: https://drops.dagstuhl.de/entities/collection/supplementary-materials. This presentation will highlight Dagstuhl's solutions, including automated metadata workflows using CodeMeta JSON and manual curation, showcasing a scalable model for software preservation and repository management. We will also explore lessons learned and future directions to advance open science and sustainability. Promoting Visibility into Collections through Object Analysis California Digital Library, United States of America At CDL we work with librarians and archivists from across the ten University of California campuses. More specifically, as the team that maintains and augments our digital preservation repository, Merritt, we assist a variety of depositors with their digital preservation initiatives. Though most depositors are extensively familiar with the file formats and metadata of their content when it is ingested, preserving collections will, in a chronological sense, stretch beyond any one group of individuals. Therefore, a key challenge we face over time is to provide the means for current and future staff to thoroughly analyze the contents of their collections, at scale. This presentation will discuss a solution the Merritt team at CDL created to address the need to analyze the metadata, digital object structure and file types in the numerous collections being preserved in the repository. The solution leverages Amazon OpenSearch and its data visualization tools. 
We’ve found that OpenSearch has allowed us to create a rich map of relationships across elements of object data, and in turn lend depositors the necessary knowledge to make informed preservation decisions related to the metadata and structure of the objects in their collections and regarding file format sustainability. |
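The Dagstuhl abstract in this session refers to metadata taken from CITATION.cff files and workflows based on CodeMeta JSON. The sketch below is a minimal illustration of that kind of mapping, not Dagstuhl's actual workflow: it assumes the CITATION.cff file has already been parsed into a Python dictionary, covers only a handful of fields, assumes `license` is a single SPDX identifier, and uses a made-up example record.

```python
import json

# JSON-LD context for CodeMeta 2.0.
CODEMETA_CONTEXT = "https://doi.org/10.5063/schema/codemeta-2.0"


def cff_to_codemeta(cff):
    """Map a parsed CITATION.cff dictionary to a small CodeMeta record.

    Only a subset of fields is handled; anything else is ignored and
    would be left to manual curation.
    """
    record = {
        "@context": CODEMETA_CONTEXT,
        "@type": "SoftwareSourceCode",
        "name": cff["title"],
    }
    authors = [
        {
            "@type": "Person",
            "givenName": a.get("given-names", ""),
            "familyName": a.get("family-names", ""),
        }
        for a in cff.get("authors", [])
    ]
    if authors:
        record["author"] = authors
    if "repository-code" in cff:
        record["codeRepository"] = cff["repository-code"]
    if "version" in cff:
        record["version"] = str(cff["version"])
    if "license" in cff:
        # Assumption: a single SPDX identifier, expressed as an SPDX URL.
        record["license"] = f"https://spdx.org/licenses/{cff['license']}"
    return record


# Hypothetical input, as a YAML parser might return it.
EXAMPLE_CFF = {
    "cff-version": "1.2.0",
    "title": "Example Tool",
    "authors": [{"given-names": "Ada", "family-names": "Example"}],
    "repository-code": "https://example.org/example-tool",
    "version": "1.0.0",
    "license": "MIT",
}

if __name__ == "__main__":
    print(json.dumps(cff_to_codemeta(EXAMPLE_CFF), indent=2))
```

Fields missing from the input are simply omitted from the output, which leaves room for the manual curation step the abstract describes.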
11:00 - 12:30 | Repository Showdown 2 Location: N110- Orchestra Room |
|
Hyku Samvera, United States of America Towards EPrints 3.5: repository developments, roadmap, and governance improvements 1EPrints Services, University of Southampton, UK; 2CoSector, University of London, UK; 3University of Glasgow, UK; 4Concordia University, Canada; 5University of York, UK; 6University of Leeds, UK Repository Showdown: Islandora Islandora Foundation, United States of America Hyrax Tufts University, United States of America Open Science Framework (OSF) Center for Open Science, United States of America Avalon Media System Indiana University Libraries, United States of America |
11:00 - 12:30 | Presentations- Cultural Heritage and Digital Humanities Location: N112- Band Room |
|
404 not found – Approaches for Ensuring the Sustainable Management of Living Resources apart from Data Repositories in the Digital Humanities Data Center for the Humanities (DCH), University of Cologne, Germany Especially within the Digital Humanities, living resources can be central research outputs. Examples are databases accessible through websites with additional tools for data analysis, or digital scholarly editions. They are often used for making research data available within a specific (and necessary) context and need to be made as findable, accessible, interoperable and reusable as research data. There are still no standardized workflows or infrastructural services, including from repositories, for handling living resources in a sustainable way. In fact, many living resources, along with their associated research data, are lost shortly after deployment. In my presentation I will discuss the challenges of handling living resources and relate them to already existing strategies and their vulnerabilities. I will present a new approach for managing living resources by considering the responsibilities of different stakeholders – researchers, funding institutions and data centers/libraries and infrastructural institutions – and argue for their orchestration. In this way, I am contributing to the Open Repositories conference by addressing a central challenge in research and software data management, which at first sight might seem atypical for the conference, but which is a discussion we must have. 
Conundrums of Open Repositories: Challenges in Establishing a Collaborative Framework for Digitizing Medieval Manuscript Collections Across the Midwestern United States Indiana University Bloomington, United States of America This paper explores the challenges, compromises, and inevitable opportunities that arise when a collaborative digitization project comprised of twenty-two partners largely representing smaller higher-education institutions as well as seminaries, a museum and a monastery across the Midwestern United States assemble in support of uncovering unknown or under-documented medieval manuscript collections. Issues emerged around connecting manuscript leaves held across partner institutions, representation – descriptive and structural – given the complexity of medieval manuscripts and the domain’s descriptive practices, and organization of content including metadata for the best legibility. Ultimately, the stakeholders had to balance particular requirements or expectations for this type of material with the reality that the host repository serves hundreds of collections of various types, not just manuscript collections. As part of this balancing act, improvements that would impact all collections were identified as we continue to grapple with providing the best user experience for discovering and reading medieval manuscripts in this general open repository environment. 
Ili-ili: i-Library and Index to Ilonggo Literature and Indigenous Knowledge 1College of Information and Communications Technology, West Visayas State University, Iloilo, Philippines; 2University Learning Resource Center, West Visayas State University, Iloilo, Philippines; 3Aquaculture Department, Southeast Asian Fisheries Development Center, Iloilo, Philippines; 4Office of the Vice President for Academic Affairs, West Visayas State University, Iloilo, Philippines Students and researchers have been complaining about the unavailability of, and lack of access to, publications on local history, literature, endemic species and disorders, and indigenous knowledge in and about the region. The Ili-ili or i-Library and Index to Ilonggo Literature and Indigenous Knowledge was developed to index and digitally archive the Hinilawod epic, Ilonggo literature, lubag (X-linked Dystonia-Parkinsonism), endemic species and disorders, and indigenous knowledge. It aims to facilitate easy access and retrieval in libraries, archives, and museums (LAMs). Literary works in Hiligaynon, Kinaray-a, and Aklanon will be indexed, digitized, and/or archived. Additional metadata and apps used, such as UN SDG contribution, Getty Thesaurus, GBIF, controlled vocabulary (UNESCO, MeSH, Library of Congress) and local language, will be described. Later, the repository aims to digitally archive oral histories, epics, traditions, and practices. The presentation will discuss the experiences and challenges encountered in developing and populating the online repository and index. It will describe the efforts of repository managers, developers, LIS professionals, and students. It will highlight the collaborative efforts undertaken with LAMs, researchers, writers, and Indigenous Peoples to preserve and promote their collections and works to address the challenge of digital inclusion and provide access to local and indigenous knowledge. 
Open Repositories beyond Academic Communities in Ethiopia: The Case of Ethiopian House of People Representatives Addis Ababa University, Ethiopia This presentation highlights a successful use case demonstrating the expansion of open repositories beyond academic and research environments to new communities, users, and types of content. Ethiopian higher education institutions have been pioneers in this field, launching repositories for theses, dissertations, journals, and preprints. As one of the few African countries to adopt a national open access policy, Ethiopia has seen widespread implementation of repositories across academic and research institutions. Following the success in academia, open repositories are now being expanded to other sectors, such as the legislative branch and national statistical agencies. Public organizations, however, require different workflows for document publishing and review compared to academic repositories. This presentation shares lessons learned while customizing the DSpace platform for the Ethiopian House of People’s Representatives by including workflows to capture real-time audio data from parliamentary sessions, transcribe recordings, and implement a multi-step editorial and approval process for parliamentary minutes. The presentation also covers customization efforts to localize metadata into Amharic and design improvements to suit local needs. The repository has been tested and is now live, with a significant number of documents uploaded. The insights will be useful for developers, librarians, and decision-makers looking to expand repositories into new contexts. |
11:00 - 12:30 | Presentations- Sharing Repositories and Resources Location: C116- Community Gathering Room |
|
Building a Sustainable Open Repository Network: The Launch of Open Repositories Ireland (ORI) 1University of Galway, Ireland; 2University of Limerick The launch of Open Repositories Ireland (ORI) marks a significant milestone in Ireland’s open research infrastructure, providing a sustainable, collaborative framework for repository development and preservation. ORI aims to address key challenges in the Irish repository landscape, such as fragmented metadata practices, staffing shortages, and the lack of preservation strategies. By aligning Irish repositories with international standards like OpenAIRE and Plan S, ORI seeks to make Irish research more visible and accessible globally. This talk will detail the role of ORI in promoting sustainable governance, training, and metadata alignment, ensuring long-term accessibility and preservation of repository content. The presentation will highlight lessons learned from the NORF Open Access Repository Project and provide practical insights for other countries and institutions aiming to strengthen their repository networks. The Big Picture: Visualizing Networks in the Shared Research Repository British Library, United Kingdom This presentation introduces work done to visualise co-authorship patterns across the British Library Research Repository and the Cultural Heritage Shared Research Repository. The visualisation program was developed to help repository managers of the Shared Research Repository visualize content and map co-authorship networks, graphical representations of relationships between authors based on shared publications. Co-authorship networks are crucial for identifying collaboration patterns, key contributors, and the structure of research communities. Using machine learning, subject clusters were also identified by analyzing metadata, such as keywords and abstracts. These clusters reveal thematic areas and were visually integrated into the co-authorship networks, linking authors to specific subjects. 
This approach allows repository managers to explore research themes, identify leading authors, and understand connections between works and cultural heritage institutions. The tool enhances the accessibility and usability of repository resources, providing transferable solutions for each partner repository in the Shared Repository. By offering actionable insights, it supports data dissemination to funders, administrators, and the public while advancing understanding of the cultural heritage sector. The Expanding and Overlapping Roles of Institutional and Generalist Repositories: Building an Interoperable Data Repository Ecosystem Together 1Northwestern University; 2Harvard University Library; 3Elsevier - Mendeley Data; 4University of Rochester; 5Vivli; 6Figshare Institutional repositories and generalist repositories are evolving to meet new challenges in research data sharing. The NIH Generalist Repository Ecosystem Initiative (GREI) brings together seven leading generalist repositories to determine common data repository standards and enhance interoperability. Building on GREI's presentation at Open Repositories 2024, this panel will explore how institutions are leveraging generalist repository infrastructure to support their data sharing needs. Through real-world examples, panelists will demonstrate how different institutions utilize generalist repository features and services to enhance their data sharing workflows. The discussion will highlight various implementation approaches, from hosted solutions to self-managed instances, and examine how these platforms accommodate different disciplinary requirements and administrative preferences. Panelists will share experiences with different deposit models, curation workflows, and strategies for ensuring FAIR data sharing while maintaining institutional identity. This session will benefit repository managers, librarians, research administrators, and others involved in institutional data sharing. 
Attendees will gain practical insights into how generalist repository infrastructure can support institutional needs, enhance research data discovery, and contribute to an interoperable repository ecosystem. The session will include interactive polling and dedicated time for audience Q&A to facilitate deeper exploration of specific use cases and implementation strategies. Ensuring the Future of Digital Repositories in West and Central Africa: A Case Study on BAOBAB and Sustainable Repository Development West And Central African Research and Education Network (WACREN), Ghana As part of WACREN's commitment to advancing equitable access to research outputs and digital resources across West and Central Africa, we have deployed BAOBAB—an Invenio-based digital repository that provides open access to academic and research content for our member institutions. BAOBAB is designed not only to store research content but also to ensure its long-term sustainability. A key feature is the automated assignment of Archival Resource Keys (ARKs), providing persistent identifiers to every uploaded item, which is critical for digital preservation and discovery. In this presentation, we discuss the implementation and adoption of BAOBAB, our integration of ARKs, and the challenges and lessons learned in building sustainable repository infrastructure for diverse institutional needs. Our goal is to inspire and inform repository managers, developers, and librarians on how regional repositories like BAOBAB can contribute to sustainable access to knowledge and preservation of content across generations. |
11:00 - 12:30 | Presentations- DSpace 1 Location: C119&121- Classrooms |
|
Towards enriched open scholarly information: integrating DSpace repositories and OpenAlex 1University of Cambridge, United Kingdom; 24Science The research and scholarly publishing environments are changing rapidly and there is an increasing expectation that research findings will be shared openly, both among funders and policy makers, and the wider research and public community. Institutional research repositories and scholarly platforms play a critical role in supporting these open research practices by capturing, preserving, and disseminating the research and scholarly outputs produced by institutions, but the current processes to support researchers and librarians in doing so are still fairly manual and time consuming. This presentation will describe the outputs of a project to address this issue by integrating one of the most widely used, open repository platforms, DSpace, with OpenAlex, a free and open catalogue of the world’s scholarly research system. Using OpenAlex’s open API (Application Programming Interface), this integration allows for the quick import of relevant research and scholarly (meta)data into DSpace repositories, which will in turn help institutions improve the quality and completeness of their research records and will streamline researcher publication and reporting workflows by providing accurate information in automated ways. Moreover, this solution will contribute to increasing and enhancing the availability of open and accurate information about research outputs in the wider scholarly ecosystem. Deep Integration of GND with DSpace’s External Sources Framework: A Case Study of Authority Data Utilization The Library Code, Germany Authority files are essential for organizing and enhancing metadata discoverability in repositories. 
The German National Authority File (GND), a cornerstone of metadata management in the German-speaking world and part of the Virtual International Authority File (VIAF), encompasses nearly 10 million entries, including persons, corporate bodies, subject headings, conferences, works, and geographics. This presentation showcases a comprehensive integration of GND into DSpace using the External Sources Framework. The integration enables seamless searching and linking of GND content within DSpace, with a configurable submission interface allowing users to filter results by type and display enriched information, including images from Wikimedia Commons. Geolocations stored in GND are leveraged alongside GeoNames to power features like map-based search results, item views with OpenStreetMap, and a browse-by-map functionality. This extension is highly customizable, supporting flexible metadata field configurations and linking or importing entries via DSpace’s authority or custom entities frameworks. Additionally, we developed external data providers for lobid.org, Wikimedia Commons, and GeoNames, with the source code planned for open release. A live demonstration will highlight the functionality, technical challenges, and design principles, inviting repository managers, librarians, and developers to explore the potential of this integration for their institutions. Bridging Communities: Vireo and DSpace Integration for Open Knowledge Sharing 1Texas State University; 2Texas Tech University; 3University of Texas; 4University of Texas at San Antonio This panel will explore how the integration of Vireo with DSpace fosters a community-centered approach to managing Electronic Theses and Dissertations (ETDs) and other research outputs. By streamlining submission workflows, enhancing repository functionality, and enabling interoperability, this integration empowers institutions to serve diverse communities more effectively. 
The discussion will include real-world case studies, insights into open-source collaboration, and strategies for promoting trust in digital repositories while supporting local and global knowledge sharing. The integration of Vireo, an open-source ETD submission and management system, with DSpace, a widely used open repository platform, represents a significant achievement in repository evolution. This panel aligns with the Community sub-theme by showcasing how this synergy addresses the diverse needs of global users while fostering trust and enabling repositories to better promote equitable access. With the increasing demand for interoperable systems that balance global connectivity and local context, Vireo and DSpace stand as powerful tools for supporting repository growth and innovation. DSpace 9.0 and Beyond: What’s next for DSpace Lyrasis, United States of America DSpace 9.0 is due to be released in May 2025. This presentation will discuss the new features and improvements to the DSpace platform that arrive in 9.0, including major performance enhancements, accessibility improvements, and upgrades to Bootstrap 5 and Angular 18. The exact set of features released with version 9.0 will be announced in May. We’ll briefly discuss the ongoing maintenance releases for both 8.x and 7.6.x, including which improvements have been backported from the 9.0 development processes. In addition, we’ll provide a brief update to the community on the early roadmap to DSpace 10.0, due in May 2026. This may include brief updates on the potential merger of DSpace-CRIS back into DSpace, as well as other DSpace Steering priorities for the 10.0 release. A separate panel session will discuss this potential merger in greater detail. 
With a view to supporting the development of DSpace 10.0 and future versions of DSpace, we’ll conclude by providing a brief overview of current and planned DSpace community activities, along with ways that institutions can get involved or support DSpace by becoming a member or contributor. |
11:00 - 12:30 | Lightning (24x7)- Repository case studies 2 Location: Sherry Lansing Theatre |
|
Documenting Maternal Health Practices: Building a Culturally Sensitive Repository for Kerala’s Indigenous Tribes University of Calicut, India
Managing traditional and scientific knowledge: A case study of the Takinahakỹ Center for Indigenous Higher Education at the Federal University of Goiás – Brazil 1Universidade Federal de Goiás, Brazil; 2Universidade Federal do Rio Grande do Sul, Brazil
Preserving Knowledge, Promoting Equity: The Current Landscape and Future Prospects of Institutional Repositories in India 1Chandigarh University, India; 2Akal University, India
Digital Curation in Deposita Dados: Challenges, Solutions and the Role in Academic Rigorosity 1Instituto Brasileiro de Informação em Ciência e Tecnologia, Brazil; 2Universidade Federal do Rio Grande do Sul
Preserving Our Past, Securing Our Future for Inclusivity: A Nigerian Perspective on Route To Digital Contents Sustainability National Library of Nigeria, Nigeria |
12:30 - 13:30 | Lunch Break Location: Rogers Lobby |
13:30 - 15:00 | Presentations- AI and Repositories Location: Griffin Auditorium |
|
A Data Curation, Interrogation, and Access System for the Texas Robotics DataVerse University of Texas at Austin, United States of America Research in robot autonomy and human-robot interaction involves multidisciplinary perspectives including engineering, computer science, and social sciences. Datasets derived from robotics studies are large, multimodal, and uniquely structured. Open repositories are increasingly publishing robotics datasets. However, the lack of specific metadata standards hinders their understanding and usability. Additionally, current repository infrastructure does not provide support to interrogate and compare multiple datasets. This work introduces a system that leverages curated metadata from robotics datasets published in the DataVerse-based Texas Data Repository to enable context-aware access through natural language interaction. A robotics-specific data model was implemented as a knowledge graph within the Texas Advanced Computing Center's open infrastructure. Descriptive and structural metadata from curated datasets in the Texas Data Repository are automatically harvested, mapped to the data model, and integrated into the knowledge graph. The datasets' metadata, selected data files, the knowledge graph schema, and related publications are used to train a ChatGPT-based chatbot, enabling users to query and retrieve data from the repository via natural language. The system’s design, implementation, and evaluation are demonstrated, showcasing its potential to enhance open datasets' interoperability and accessibility. Through continuing research, the system can be applied to datasets from different domains in open repositories. Can this robot query my Linked Data Store? 
Exploring retrieval-augmented models for Repository Search & Discovery University of Toronto The University of Toronto Scarborough Library's Digital Scholarship Unit has been developing multiformat and multilingual digital collections for over ten years, with a focus on post-custodial and non-extractive models. This presentation explores the integration of linked data and AI-powered chat-based search, using one specific project as a use case. The Dragomans project, which explores the role of diplomatic interpreter-translators in the Ottoman Empire, provides a multilingual dataset with linked data that allows us to assess the effectiveness of various retrieval-augmented generation methods when implementing chat-based search. This research offers insights that are applicable across various knowledge domains. The limitations of generative artificial intelligence in handling structured data, complex reasoning, and contextual understanding are well described. We compare the results of various retrieval-augmented models against those of direct SPARQL/Cypher queries. This allows us to evaluate these approaches and comment on their effectiveness as well as resource implications. Our findings uniquely illuminate the landscape of generative AI, providing valuable guidance for future repository infrastructure development and contributing to the broader discourse on the integration of AI and linked data in digital repositories. Repository of the Future: A Hybrid Approach of Human Expertise and AI-Driven Data Enrichment University of Wisconsin-Milwaukee The continuous advancement and complexities of digital repositories necessitate a forward-thinking approach that balances scalability and contextual accuracy. This proposal introduces the repository of the future that combines a fluid approach of human expertise and artificial intelligence (AI) innovation to produce smarter and enriched data for knowledge advancement. 
By leveraging the power of AI-enabled semantic web technologies like Wikibase, librarians could potentially change the way repositories are designed, used and interacted with. This combined approach could see humans and AI expertly execute different tasks for the overall good of the repository and mankind. For instance, while artificial intelligence could expertly handle tasks such as metadata generation, data cleaning, and resource recommendation, human oversight could ensure ethical and data integrity, cultural sensitivity, and trustworthiness. The presentation will detail a pilot Wikibase project at Glorious Vision University Library, where this hybrid approach has enhanced the library’s repository functionality and accessibility. Attendees will therefore gain insights into practical workflows, opportunities and challenges in adopting such a model to meet the evolving needs of the global community. Integrating Machine Actionable Data Management and Sharing Plans (maDMSP) into Campus-Based Open Research and Repository Workflows: A Case Study from the University of Colorado Boulder University of Colorado Boulder, United States of America Data Management and Sharing Plans (DMSPs) are now a well-established part of the modern scholarly ecosystem, as evidenced by the fact that they are a required component of grant applications at major funding agencies. Despite their widespread use, DMSPs are currently limited by the fact that they are generally intended solely for human audiences, and are rarely machine readable or actionable. As a result, while DMSPs communicate valuable data management and project information to human stakeholders, they are effectively “black boxes” with respect to the various digital systems embedded in the infrastructure of Open Science, including digital data repositories. 
This presentation explores an effort to better understand, develop, and implement “next-generation” DMSPs, known as Machine-Actionable Data Management and Sharing Plans (maDMSPs), at the University of Colorado Boulder, as part of a multi-institution grant led by the California Digital Library (CDL) and the Association of Research Libraries (ARL) and funded by the Institute of Museum and Library Services (IMLS). We discuss local use cases for maDMSPs, our project activities, and the challenges we encountered in integrating maDMSPs into campus workflows. We also explore the implications of maDMSPs for the University of Colorado’s institutional repository, and open repositories more generally. We Need to Chat: A presentation about real-world AI use cases 1Northwestern University Libraries, United States of America; 2Harvard University; 3Emory University Recent advances in artificial intelligence and large language models (LLMs) have created new opportunities for enhancing digital repositories and discovery. This panel presents three implementations that have moved beyond theoretical exploration to AI solutions that solve real-world problems. Northwestern University Libraries developed a discovery interface using Retrieval Augmented Generation (RAG) that enables natural language querying of IIIF-based digital collections, leading to an IMLS-funded open-source solution. Emory University Libraries implemented a hybrid intelligence approach for metadata enhancement, leveraging AWS infrastructure and LLMs to improve description while maintaining human oversight and addressing historical biases. Harvard Library's Collections Explorer project employs semantic search and generative AI to revolutionize special collections discovery, building on insights from their "Talk with HOLLIS" pilot and collaboration with Mozilla.ai. 
Through these case studies, the panel will examine practical implementation strategies, architectural decisions, and lessons learned while deploying AI in repository environments. Discussion will address how these technologies can meet evolving user expectations while maintaining institutional values and responsibilities. The session will provide attendees with concrete examples of successfully operationalizing AI tools for discovery and metadata generation in library contexts. |
13:30 - 15:00 | Lightning (24x7)- Repository tools and developments Location: N110- Orchestra Room |
|
Implementing UN SDG Auto-Tagging: A Practical Guide for Librarians Cal Poly Humboldt, United States of America This presentation offers a practical approach to using artificial intelligence (AI) for tagging graduate theses housed in an institutional repository with the United Nations Sustainable Development Goals (UN SDGs). Utilizing strategies requiring no prior programming experience, this presentation will provide a step-by-step guide, cost analysis, and lessons learned from employing two AI-based tagging methods. These methods, attempted with varying degrees of success, highlight the potential of using AI for the thematic tagging of digital library resources. Walking to IIIF Presentation 4.0 - Moving IIIF API Integrations Forward in Archipelago Metropolitan New York Library Council, United States of America Archipelago Commons, or simply Archipelago, is an open source repository platform developed and supported by the Digital Services Team at the Metropolitan New York Library Council (METRO). Archipelago features a curated list of open-source CMS and digital repository community-built software and services, and the custom Archipelago Strawberry Field modules. All data, including metadata, for digital objects and collections is stored in JSON, and cast into different metadata schemas and displays using a unique templating system. In addition to standard metadata and display templates, IIIF API support and IIIF compliant viewer customizations are woven into Archipelago’s architecture and standard configurations. During this presentation, participants will learn how current IIIF Presentation API versions are implemented in an Archipelago repository through extensible templates and configurations. 
By reviewing our platform’s functionality areas related to IIIF integration, participants will learn about the simple, efficient way Archipelago repositories will be able to move forward to support the latest IIIF Presentation 4.0 API scheduled for initial release in the second half of 2025. Participants will also see a sneak peek at an example IIIF Presentation 4.0 manifest, and live demonstrations of customized implementations of popular IIIF viewers such as Mirador and IA Bookreader. A Flexible Workspace: Using the Cloud to Stage & Ingest New Content California Digital Library, United States of America At CDL we work with librarians and archivists from across the ten University of California campuses. More specifically, as the team that maintains and augments our digital preservation repository, Merritt, we help depositors with their digital preservation projects. During recent collaborations, our team identified a need to be able to assist partner libraries with limited resources to stage their content ahead of ingest into the repository. This content is typically that which was generated as a result of work with a digitization vendor and resides on HDDs or other transfer devices. To facilitate staging, validation and ingest of HDD content, we created a workspace that incorporates AWS S3, while simultaneously incorporating shell scripts and other resources to manipulate content once it is uploaded to the cloud. To this end, we implemented Amazon FSx for Lustre as a bridge between an EC2 instance and S3. The system also incorporates a Lambda function for submission automation purposes, as well as configuration management via Sceptre. This workspace has allowed us to streamline the staging and ingest of newly digitized content while working with our library partners, via a collaborative solution that minimizes impact to resources of all teams involved. 
Integrating IIIF annotations in DSpace 4Science, Italy Since 2017, 4Science has been working on implementing support for IIIF in DSpace to provide a better user experience in enjoying images, especially in the cultural heritage domain. To achieve this goal, a dedicated add-on has been implemented, easily integrated with a set of external Image Servers, such as Cantaloupe or Digilib. To enrich the content related to the digital cultural heritage managed within DSpace, we are now implementing workflows aimed at saving IIIF annotations created with Mirador and at relating them with all the information provided by metadata, fulltexts and entities. The proposed paper illustrates such workflows and how to relate annotations to each other and to other entities structured at data model level, in order to integrate them in the repository knowledge base. Sustainable Development Goals in EPrints: Updates, Failures, and Experiments CoSector, University of London, United Kingdom At OR 2024 I presented on work I had undertaken to add sustainable development goals (SDGs) to EPrints. The goal of this work was to a) present SDGs in an accessible and attractive way in EPrints repositories and b) relieve the burden on repository administrators by incorporating automated identification of outputs related to SDGs. This talk will discuss progress towards these goals, and in particular an issue that arose: the automated searches produced excessive and imprecise identifications. Comparing the results of these searches in EPrints to those identified by the same search terms in Scopus, I will explore some of the possible reasons for these issues and some of the possible solutions. One specific area of investigation concerns the specific search engine and whether switching to a different mode of searching might alleviate this issue. 
AI as a Responsible Partner in FAIR Metadata Creation: Lessons Learned University of Michigan Inter-university Consortium for Political and Social Research, United States of America Creating FAIR (Findable, Accessible, Interoperable, Reusable) metadata is essential for research data reuse, yet it often poses a significant challenge for data depositors. At the Inter-university Consortium for Political and Social Research (ICPSR), located at the University of Michigan, we discovered a disconnect: while depositors are experts in their research, they struggle to translate this expertise into quality metadata. In response, ICPSR developed TurboCurator, an AI-driven tool that transforms depositor-provided content, like research summaries from publications/press releases and research plan methodologies, into metadata recommendations aligned with ICPSR’s standards. This presentation chronicles our journey to integrate responsible AI into the metadata creation process, blending machine intelligence with human expertise to enhance the depositor experience while upholding transparency and control. How do you describe software in record metadata? Open University, United Kingdom The discoverability, attribution, and reusability of open research software are often hindered by its inadequate representation in research manuscripts. Frequently mentioned only implicitly or buried within supplementary materials, software fails to achieve recognition as a distinct, citable output. Addressing this challenge requires systematic identification and assignment of persistent identifiers (PIDs) to software, ensuring compliance with FAIR (Findable, Accessible, Interoperable, and Reusable) principles. Despite its significance, most open research software remains underrepresented in metadata, with limited explicit links between software and the research papers introducing or using them. 
The SoFAIR project (2024–2025) seeks to enhance the identification and representation of software assets in research. By leveraging the global network of open repositories, the project aims to look into the current state of metadata standards and proposes adaptations to include software descriptions. The presentation will explore current metadata formats and propose actionable solutions for improving the discoverability and reusability of open research software, aligning with best practices for metadata interoperability. |
13:30 - 15:00 | Panel- Software dies, data should be forever: OCFL as a software agnostic storage approach Location: N112- Band Room |
|
Software dies, data should be forever: OCFL as a software agnostic storage approach 1Cornell University; 2Emory University; 3University of Oxford; 4University of Wisconsin, Madison; 5University of Texas at Austin; 6Harvard University With very few exceptions, the repository software we use now is not the software we were using 20 years ago and is not the software we will use 20 years from now. In most academic repositories, however, we expect and demand that data -- be it articles, theses, digital collections, or datasets -- placed in our repositories 20 years ago or now will be available 20 years hence and on into the future. Experience with repository software migrations has highlighted the difficulties and risks associated with data migration and sometimes the lock-in these difficulties create. This panel will highlight the features and benefits of adopting a storage layer upon which multiple repository systems can be implemented. The Oxford Common File Layout (OCFL) specification editors will introduce the format and describe the diversity of ways it is being used, including vendor systems, use as the storage foundation for Fedora 6 and local implementations. Most of the time will be devoted to panelist presentations on different implementations. Each panelist will cover the background of their repository application, reasons for selecting OCFL, implementation status and experience, and the utility of a software-agnostic storage layer. |
13:30 - 15:00 | Presentations- Provider Communities Location: C116- Community Gathering Room |
|
The Fedora Community finds its future path by learning from the past and bringing others together. Fedora/Lyrasis, Canada This year's Open Repositories theme, “Twenty Years of Progress, a Future of Possibilities”, resonates closely with the Fedora Community and reminds us how looking to the past can help chart our future. Throughout its 20+ year lifespan, Fedora has overcome challenges, embraced innovation, and experienced success. The release of Fedora 6.0 in 2021 marked a pivotal milestone, addressing technical hurdles while laying the groundwork for future sustainability in technology, operations, and community. Fedora’s success is deeply rooted in its connected global community. Recognizing the challenge of declining contributor capacity, the Fedora Governance Group led a two-year strategic planning effort, culminating in the 2022 Fedora Community Roadmap. This collaboratively developed roadmap prioritizes the needs of users, fosters cross-community engagement, and highlights the critical role of collaboration in supporting diverse stakeholders. This presentation will showcase Fedora’s intentional efforts to build partnerships with sister communities, create welcoming spaces for new voices, and innovate in user engagement. Presenters will share lessons learned from strategic planning and highlight new collaborative initiatives to ensure the long-term sustainability of our program. Attendees will gain insights into how collaboration and forward-thinking strategies can inspire other communities to “lift all boats” while navigating the future of open-source repository solutions. 
Balancing the Global and the Local at the Research Organization Registry (ROR) Crossref, United States of America When members of the Research Organization Registry (ROR) and Zenodo teams gave a workshop on ROR at Open Repositories in Stellenbosch in South Africa in 2023, more than one attendee mentioned that it would be useful if ROR’s organizations could be grouped by continent as well as by country to enable easier tracking of research associated with African research organizations and funders. The ROR team put this helpful suggestion on the ROR roadmap and then implemented it with the release of ROR metadata schema version 2.1: every one of the 110,000+ records for research organizations in ROR now includes information indicating the continent where that organization is located. In this presentation, we will share similar stories of how ROR works to serve a global community of research organizations and scholarly systems with diverse uses, needs, languages, alphabets, systems, organization structures, and regional identifiers through efforts to ensure global equity in ROR’s data, definitions, technology, and operations. With records for research organizations from 228 countries and research organization names in 128 languages, ROR already strives to be a truly global registry, but we know we can always do more, and we look to Open Repositories for help. USRN Discovery Pilot: Increasing the Discoverability of Open Access Content Through a National Network 1CORE, The Open University; 2Antleaf; 3Confederation of Open Access Repositories (COAR); 4SPARC This presentation will present the results of the USRN Discovery Pilot Project, a collaboration of SPARC, the Confederation of Open Access Repositories (COAR), CORE and Antleaf, to enhance the discoverability of research papers in US repositories leveraging CORE as an indexing service for USRN repositories. 
The project conducted actions in three strategic areas: Assessing and quantitatively measuring discoverability and barriers to it at the beginning and end of the pilot project, conducting interventions to increase discoverability, and supporting interventions by technology and guidelines (provided by CORE services), to minimise effort and maximise effect. The key results of the project include: Around three-quarters of a million research outputs held in the selected US repositories have been made discoverable (a 50% increase) compared to the year before; The project has made available the CORE Data Provider’s Guide as well as a selection of new and improved tools to support repositories in increasing their discoverability. These include the CORE Reindexing Button and Index Notification modules, Fresh Finds and the USRN Desirable Characteristics for Digital Publication Repositories checking tool. The project team is now exploring ways to scale out this work to include more repositories. |
13:30 - 15:00 | Presentations- Data Repositories 2 Location: C119&121- Classrooms |
|
Development Of An Integrated Lifecycle Of RDM Tools For Data Publication: Looking Back And Forward @ KU Leuven KU Leuven, Belgium KU Leuven started on the road to FAIR in 2018 with the creation of a general institutional RDM policy based on the FAIR principles and the motto “as open as possible, as closed as necessary”, accompanied by the decision to set up institutional tools to enable the implementation of the policy. This includes the development and launch of the Dataverse-based institutional data repository, RDR, and the local iRODS instance, ManGO, which supports data management in a structured and metadata-rich way during the research project. To fully enable FAIR data management, it was important to facilitate interoperability of data by connecting these infrastructures with each other and other RDM tools. In this context, a Dataverse plug-in was developed by the RDR team to easily pull data from other RDM tools and for ManGO, a connection was set up to push data to RDR as well. This ensures that researchers who use ManGO or RDR have a seamless experience in moving data from ManGO to RDR that are ready for publication. All work is shared in open-source where possible to make sure that not only KU Leuven data, but also RDM infrastructure is as open as possible. Scaling Up: Expanding Data Repository Support for Growth in Large Datasets 1University of Texas at Austin, United States of America; 2Texas Digital Library; 3Texas A&M University; 4Southern Methodist University; 5Baylor University Researchers are increasingly generating larger datasets thanks to improved technologies and methods, which can present a challenge for data repositories that need to scale their infrastructure and services to support publication of these datasets. 
The Texas Data Repository (TDR) has been working for several years on enhancing its multi-institutional service model, technical infrastructure, and data retention policy in order to better accommodate publishing of large research datasets. The TDR Steering Committee has recently approved recommendations produced by its Large Data group which will result in new work to scale support for large datasets while preserving flexibility for individual TDR member institutions. This presentation will share these recommendations, progress that has been made thus far, and our strategy for collaborating with the open source Dataverse community on codebase contributions that can also benefit other data repositories. Mapping the Infrastructure and Awareness Gap: A Landscape Analysis of Data Management Practices in Latin America 1DataCite, Germany; 2Remolino, Chile Latin America faces unique obstacles in increasing equal access to research outputs and repositories due to variations in infrastructure, resources, and awareness. This presentation summarizes the results of a detailed Landscape Analysis undertaken in the region, which identified important gaps and possibilities for promoting repository development and interoperability. This analysis provides concrete recommendations for closing these gaps through collaboration, capacity building, and the implementation of metadata standards. Attendees will gain a better understanding of the regional context, practical solutions to difficulties, and scalable approaches for global repository networks. Toward a Comprehensive Research Data Catalog at Texas A&M University Texas A&M University, United States of America Texas A&M University has had a standard administrative procedure in place for many years to ensure that publicly funded research is made available for the public good and is preserved for appropriate amounts of time. 
However, research practices have not always guaranteed retention or availability, and TAMU had cause to improve the accounting for research data output. In 2023, TAMU’s Vice President for Research began an initiative to address this state of affairs. The VPR started discussions with Libraries and Technology Services to develop a catalog to track grant-funded research data and increase its discoverability and impact. Such data include datasets in repositories, data housed on departmental or cloud services, and physical research artifacts in storage. To fulfill these aims, Libraries directed the development of a customized DSpace instance, leveraging the new Entities feature introduced in DSpace 7 to meet the required use-cases of the platform. With the Data@TAMU catalog now in production, the collaborating stakeholders each fulfill vital roles. The Libraries collaborate with researchers to manage and catalog data outputs, Technology Services provides development and networking services, and the VPR’s office ensures deans and faculty responsibly manage data to achieve compliance. |
15:00 - 15:30 | Coffee Break Location: Rogers Lobby |
15:30 - 17:00 | Presentations- DSpace 2 Location: Griffin Auditorium |
|
From Load Times to Bot Traffic: A Comprehensive Approach to DSpace Performance Atmire, Belgium Transitioning Repositories: From EPrints to DSpace CRIS – The Case of the Zurich Open Repository and Archive (ZORA) 1University of Zurich, Switzerland; 2PCG Academia, Poland DASH Stories: Implementing a qualitative feedback service in DSpace 8 1Harvard University, United States of America; 24Science, Italy Reimagining a trusted institutional repository: transforming EPFL’s Infoscience with DSpace-CRIS EPFL, Digital repositories and archives - Library, Switzerland |
15:30 - 17:00 | Presentations- Repository Retrospectives Location: N110- Orchestra Room |
|
17 years of PHAIDRA at the University of Vienna. Opportunities and challenges of managing an open repository at a large and heterogeneous university. University of Vienna, Austria Since 2008, the University of Vienna has been using a repository that is open to all disciplines and to all academic and administrative staff. This opens up many possibilities in the area of research data management, as data can be blocked or opened for certain groups of people at any time, if this is necessary for legal or ethical reasons. The content of the repository can also be linked in different ways, for example, publications and research data can be linked together. A number of metadata standards are provided to allow a very rich description of the objects. At the same time, there are a number of challenges to overcome in order to serve all interested parties and disciplines equally. In addition to technical, structural and financial challenges, the service has to be constantly adapted to different needs. The presentation will take a closer look at these challenges, focusing on solution strategies and positive results in terms of successful collaboration with researchers. Three Repositories Walk into a Library: recapping 30 years of repository development at Duke University Libraries Duke University Libraries, United States of America “Why do we have three repositories?” This is an over-simplified version of a question that a working group at Duke University Libraries (DUL) found themselves asking (and in some cases returning to) last year. A supremely helpful member of this working group produced an impeccable 24-page timeline detailing the history of repositories at Duke, beginning with the very first collection launch in 1995. 
Pulling the greatest hits from our colleague’s impressive timeline, this presentation will take attendees on a journey through repositories at DUL, with a bit of bragging about wins, invitation to commiserate about shared challenges, and sincere reflection with an eye towards the future. Advancing Open Science: Transforming UFPR’S Digital Repository Infrastructure Federal University of Paraná, Brazil The aim of this communication is to provide an overview of the twenty-year history of UFPR’s repository, offering a focused perspective on its creation and maintenance in Latin America. The Universidade Federal do Paraná (UFPR) launched its Institutional Digital Repository (RDI) to provide access to academic outputs, such as theses, dissertations, and other scholarly production, in 2004. Over two decades, the repository evolved to include Open Educational Resources (OER), institutional research projects, and Brazil's first public scientific data repository. UFPR has progressed, yet its siloed infrastructure disrupts integration and accessibility across repositories. This fragmentation prevents the effective management, sharing, and discovery of academic and research resources, challenging Open Science ideals. UFPR aims to build a consolidated institutional repository using an integrated digital platform to solve this. Its centralized portal will simplify submission and access by applying FAIR principles. Key technological innovations include developing machine learning tools to automate metadata extraction, enable dataset interconnectivity, and improve resource discovery. The unified platform will promote proactive knowledge management, capacity building through training, community engagement, and open data policy development. UFPR intends to lead Open Science in Brazil and Latin America by fostering international partnerships, citizen science, and collaborative research. 
Going deeper: the case for multiple repositories at UChicago University of Chicago, United States of America As US institutions look to how institutional data repositories can help meet federal funder data access requirements, we are also considering the purpose of repositories beyond compliance. If we are doing more than ticking boxes, does a generalist repository really meet our needs, or most importantly, the needs of those we want to find and use our data? With the receipt of NEH funding for the UChicago Node project, the Library is hosting three repositories at UChicago—Node and Unbound as specialist repositories and data infrastructure and the institutional repository supporting public access to research. Many institutions find themselves in similar situations. This paper explains why this approach currently makes sense, while considering the future and what might trigger a rethink of approach. UChicago Node builds on the long-standing OCHRE data service, developed to support the humanities division. With Node expanding and scaling this to the management of Library collections as data, and for initiatives across campus, we're exploring three questions, the current answers for which we believe are of interest to any institution that finds itself managing multiple repositories: - Can a single repository meet all of an institution’s needs? - If not, why not? - If not now, then when? Ensuring the Longevity of University of Johannesburg Institutional Repository through Innovative Practices University of Johannesburg, South Africa Institutional repositories are digital archives designed to collect, preserve, and disseminate the intellectual output of an institution, particularly in academic settings. Therefore, ensuring the longevity of the IRs is critical for preserving and disseminating academic scholarship. 
This study explores innovative practices that can enhance the sustainability and accessibility of University of Johannesburg's Institutional Repository (UJ IR), which serves as a vital resource for archiving a diverse range of intellectual outputs, including journal articles, theses, books and book chapters. By implementing advanced digital preservation technologies and adhering to best practices in archival standards, the repository can effectively safeguard its collections against technological obsolescence and unauthorized access. Furthermore, fostering collaboration among academic staff, library personnel, and researchers will promote a culture of open access and encourage the submission of high-quality research outputs. This study underscores the importance of continuous investment in infrastructure and training to ensure that UJ IR remains a reliable platform for future generations of scholars, thereby contributing to the global body of knowledge and enhancing the university's academic reputation. |
15:30 - 17:00 | Panel- Recognizing Software as a Critical Component in Open Science: Advancing an Interoperable, Community-Driven Vision for Infrastructures Location: N112- Band Room |
|
Recognizing Software as a Critical Component in Open Science: Advancing an Interoperable, Community-Driven Vision for Infrastructures 1Software Heritage, Inria; 2swMath, FIZ Karlsruhe - Leibniz Institute for Information Infrastructure; 3CCSD / CNRS; 4Schloss Dagstuhl – LZI, Publishing; 5Institute of Applied Biosciences (INAB), Centre for Research and Technology Hellas (CERTH) Software source code is crucial in Open Science, representing executable knowledge essential for advancing research. Yet, it often receives insufficient attention in open repositories for metadata management and archival strategies. This panel discussion will convene experts from various infrastructures who have crafted solutions tailored to the unique aspects of software as a digital object, including scholarly repositories, publisher platforms and aggregators. Representatives from HAL, Episciences, Dagstuhl, swMath and more will participate. We will discuss the archival of source code in Software Heritage, the universal source code archive, the application of the CodeMeta vocabulary for describing software and the use of the Software Hash Identifier (SWHID) for accurate referencing. Additionally, we will showcase how initiatives like European projects, such as FAIRCORE4EOSC, FAIR-IMPACT and EVERSE and collaborations with the SciCodes Consortium are creating vital connections between scholarly infrastructures. This panel will discuss both the advancements and the challenges faced, and will suggest practical steps that institutions, publishers, and researchers can take through collaboration, guided by a community-driven vision. We aim to deepen the understanding of software's role in research and its necessary recognition, encouraging wider adoption of best practices in academia to foster a more collaborative and inclusive scholarly environment. |
15:30 - 17:00 | Presentations- Repository Governance, Ethics and Curation Location: C116- Community Gathering Room |
|
The Governance of Open Repository Programs: Progress and Possibilities The Ohio State University Libraries, United States of America This presentation will focus on the role governance plays in ensuring the future of repositories and their content. How do we sustain and preserve open repositories while keeping the spark alive that started it all? How do we provide a reliable and stable repository platform and embrace blue-sky thinking? How do we keep the light of inspiration alive while working to keep the actual lights on? DSpace is now in its twenty-third year. How did it come this far, and what is the community roadmap of its future? arXiv is now in its thirty-fourth year and recently re-structured its governance. Comparing and contrasting these programs, this presentation will explore the impact governance models have in sustaining long-standing programs and how leadership teams stay the course while grappling with ever present challenges in a constantly changing scholarly communications ecosystem. An Integrated Open Ecosystem: Whose Responsibility Is it? 1Lyrasis; 2California State University San Bernardino; 3DataCite; 4Indiana University; 5University of Maryland; 6University of Cambridge, United Kingdom Digital Cultural Heritage institutions want to be able to use open solutions for their digital infrastructure. However, the landscape of open-source and open-access components tends to be siloed, leaving it up to users to figure out how to put the pieces of the puzzle together. This can present a barrier to adoption of open infrastructure, especially when commercial providers claim to offer easy access to a complete interoperable ecosystem. With the number of priorities to consider when implementing software, interoperability often falls through the cracks or defaults to low priority because by its nature interoperability requires the participation of multiple entities. It’s not always clear whose responsibility it is or why it matters. 
We will share observations on this topic from the perspective of the Lyrasis Research Infrastructure Communities and the Lyrasis Organizational Home for Community Supported Technologies, posing these questions for the wider open repository community to consider: (a) what role should individual communities play in creating a viable interconnected open ecosystem? (b) how can communities be accountable to each other to ensure that the ecosystem is created and is sustained? The Case for a National Repository of Policing Data in the United States 1University of Chicago, United States of America; 2CBS News Many resources exist for criminal justice data, but these data repositories focus on crime reports, court records, and victimization reports, with secondary focus on policing as a part of the criminal justice system. The lack of a dedicated resource for policing data--data collected or generated as part of policing activities--impedes transparency about how policing operates in practice. However, such a resource must address the privacy risks that arise from making this data available to researchers and others; overcome the technical challenges that arise from the absence of nationwide standards for the collection, organization, and storage of this data; and ensure meaningful public access. We enumerate details about each of these criteria for establishing a national repository of policing data in the United States. We argue current technology can meet these needs, but that, in the absence of regulations governing dissemination of data obtained through open records laws, it is vital to facilitate ethical research about policing in the United States through design decisions guided by the Belmont Report. We conclude by noting one possible database ontology that could meet our criteria and highlight the importance of participation by the public as part of the design process. 
Scenarios Motivating Integration and Re-Curation 1University of North Carolina at Greensboro, United States of America; 2Metadata Game Changers, United States of America; 3University of Pittsburgh, United States of America A vast landscape of open scientific and scholarly repositories has arisen over the past two decades as a result of the shared understanding among researchers of the purposes and motivations for creating and maintaining individual institutional and domain repositories. Many of the next developmental steps envisioned and being undertaken for this landscape entail it evolving into a more integrated ecosystem of repositories that enhance or re-curate one another’s content, especially metadata, in ways that further the interests of individual researchers, their communities, and the broader research ecosystem. Exploring possibilities of improved integration for enhancement and re-curation requires a shared understanding of the use cases and associated motivations for collaboration among researchers, repository managers, information service professionals (aka librarians), institutional administrators, and eventually research funders. This panel will catalyze and facilitate a discussion among these groups to identify the most compelling new scenarios for inter-repository integrated processes and re-curation functions, with a special focus on surfacing the motivations and value propositions for such scenarios. Panelists will serve to catalyze and facilitate a structured group discussion for these aims. The primary outcome from the panel-led discussion following OR2025 would be a consolidated summary of the discussion that invokes the Chatham House Rule. |
15:30 - 17:00 | Presentations- PIDs and Harvesting Location: C119&121- Classrooms |
|
The Decentralized Archival Resource Key (dARK) - from a Proof of Concept to a Service Implementation in the Brazilian Open Science Ecosystem 1Brazilian Institute of Information in Science and Technology (IBICT), Brazil; 2Red Latinoamericana para la Ciencia Abierta (LA Referencia), Spain; 3Universidade Federal de Campina Grande (UFCG), Brazil The Decentralized Archival Resource Key (dARK) project addresses challenges faced by institutions in the Global South regarding persistent identifier (PID) systems, including high costs and reliance on centralized models. dARK is a decentralized, scalable, and cost-effective solution operating on a blockchain-based infrastructure, ensuring compatibility with existing PID systems like DOIs while fostering equitable access to scholarly communication resources. In 2024, dARK transitioned from proof of concept to its first production phase, integrating with Brazil’s Oasisbr, a federated platform aggregating over 5 million digital objects. During the pilot phase, 400,000 objects received dARK identifiers, marking significant progress toward the goal of full coverage. Key innovations, such as hyperdrive middleware for metadata transfer and integration with the ARK Alliance’s global resolver, enhance discoverability and ensure long-term PID preservation. This presentation explores dARK’s evolution, technological advancements, and its transformative impact on the open science ecosystem, inviting stakeholders to contribute to a more inclusive and sustainable scholarly communication model. The Utility of PIDs in Harvesting Open Data Repositories: Challenges in Operating a National Data Discovery Service Digital Research Alliance of Canada, Canada Lunaris is a national data discovery service for Canada. 
Providing complete and structured, preferably standards-based, metadata for data records enables Lunaris and other discovery services to effectively find potentially relevant datasets, filter them to concretely identify Canadian datasets, and crosswalk them to an internal metadata schema. We will focus on practical challenges related to our mandate of harvesting Canadian datasets, but our recommendations will be applicable to discovery of other subsets of research data at large. The benefits of persistent identifiers (PIDs) are often discussed in an abstract or aspirational way. Lunaris’ experience harvesting repositories reveals concrete scenarios in which PIDs are critical to effectively handling repository metadata. This presentation will outline our procedure for harvesting a new repository and highlight situations in which repositories’ use of PIDs and other structured metadata improve this procedure, making explicit recommendations for structured metadata use. We will then discuss the way specific PIDs and other metadata elements enable us to help our users effectively discover records in this cross-repository environment. Finally, we will discuss future work in handling emerging PIDs and metadata standards. Analysis of Requirements and Solutions for Issuing Persistent Identifiers (DOIs) in the Brazilian Repository of Biodiversity - SiBBr 1Universidade Federal de Goiás, Brazil; 2Universidade Federal do Rio Grande do Sul, Brazil; 3Rede Nacional de Pesquisa, Brazil Definitions of data sharing and openness must also consider issues of institutional interest, national sovereignty, intra- and extra-country asymmetries, and reciprocity, in order to avoid increasing inequalities in scientific, technological, and public access to knowledge. In the biodiversity context, Brazil is highly relevant, since the country occupies almost half of South America and is the country with the greatest biodiversity in the world. 
The SiBBr Repository was developed as the Brazilian national repository of data and information on biodiversity, responsible for organizing, indexing, storing and making available data and information about biodiversity and Brazilian ecosystems, providing support for scientific research and government management related to conservation and sustainable use. This presentation will cover the results of a case study that aimed to bring out the challenges involved in understanding DOI attribution in biodiversity repositories: how can DOI attribution to biodiversity materials in SiBBr allow a relevant increase in the citations and visibility of Brazilian biodiversity data? Given SiBBr's role as the Brazilian national GBIF node, it is mandatory to implement best practices in the SiBBr repository, including better characterization, identification, location and (re)use of published data. |
17:30 - 22:00 | Conference Dinner on the Spirit of Chicago- prior registration required |
Date: Wednesday, 18/June/2025 | |
08:30 - 14:00 | Registration Location: Rogers Lobby |
09:00 - 10:30 | Presentations- Research Data Preservation Location: Griffin Auditorium |
|
The CURATE(D) Future of ROSA P: The Implementation of the Data Curation Network’s CURATE(D) Steps and Enhancement of Data Management and Preservation at the National Transportation Library National Transportation Library, United States of America The Future of Nuclear Data Preservation at the IAEA: Challenges and Opportunities IAEA, Austria From Legacy to Leadership: Transforming OLCF’s Data Repository for the Future of Open Scientific Data 1Oak Ridge National Laboratory, United States of America; 2CivicActions Advancing Data Stewardship: Developing a Research Data Retention Policy at the Texas Data Repository 1University of Houston, United States of America; 2University of Texas at Austin, United States of America; 3Baylor University, United States of America; 4Texas A&M University, United States of America; 5University of Texas Health San Antonio, United States of America; 6Texas State University, United States of America |
09:00 - 10:30 | Presentations- Building National Repositories Location: N110- Orchestra Room |
|
Shaping the future now: Piloting national institutional Repository the challenges and the way forward Ministry of Finance, Tanzania This research explores the evolving landscape of institutional repositories (IRs) within national research ecosystems. With increasing demands for open access, data transparency, and the preservation of academic and scientific knowledge, national IRs are pivotal in shaping the future of scholarly communication. This paper presents a case study of the piloting process of a national institutional repository, examining both the challenges faced during its development and implementation as well as the potential solutions to overcome them. Key challenges discussed include issues related to metadata standardization, digital preservation, interoperability, legal and copyright concerns, and stakeholder buy-in. Additionally, the paper identifies the critical success factors for a sustainable national repository, such as governance structures, collaboration between academic institutions and government agencies, and the integration of emerging technologies like machine learning for data curation. Finally, the paper outlines strategic recommendations for the future development and scaling of national repositories, with an emphasis on fostering international collaborations and ensuring long-term sustainability in the digital age. By synthesizing these insights, this work aims to contribute to the global conversation on optimizing national repositories for broader academic, societal, and technological impacts. Building a Digital Future: Can Bangladesh Develop a National Repository to Internationalize Research? University of Dhaka, Bangladesh, People's Republic of Universities in Bangladesh face challenges like limited research funding, quality education, access to global e-resources, and policy gaps in adopting open access (OA). 
Initiatives like BanglaJOL and UGC Digital Library provide partial support, and only a small fraction of Bangladeshi journals are indexed in Scopus or listed in the Directory of Open Access Journals (DOAJ). Open access remains underdeveloped due to the lack of awareness, resources, and policy frameworks. BANDAR, which means ‘port’ in Bengali, aims to build a centralized hub for archiving theses and dissertations from all public, private, and international universities in Bangladesh. With only 16 digital institutional repositories currently in the country, BANDAR will focus mainly on graduate and doctoral research outputs. The University Grants Commission (UGC) can lead this initiative, mandating electronic submission of theses to ensure global accessibility. The repository framework will include governance, ICT infrastructure, standardized metadata, and preservation mechanisms. Researchers will submit works that undergo validation before being archived for open access. This initiative will enhance research visibility, reduce duplication, and elevate the quality of academic output, connecting Bangladeshi research to the global academic community. Towards a National CRIS: Building and Perspectives for Open Science in the Dominican Republic Instituto Tecnologico de Santo Domingo (INTEC), Dominican Republic, Open Science Caribbean (OSCaribbean) Creating a national Current Research Information System (CRIS) in the Dominican Republic represents a crucial opportunity to enhance scientific dissemination and promote transparency in research processes. Currently, the limited number of digital platforms in academic and non-academic research institutions restricts the visibility of valuable scientific information such as projects, publications, patents, theses, researcher profiles, and laboratories, among others, hindering the efficient dissemination of knowledge. 
A national CRIS will consolidate a digital infrastructure that highlights scientific achievements, fosters cooperation among national and international institutions, enhances competitiveness, and drives research policy objectives. With this project, we aim to establish a culture of open science, integrating principles of transparency, open access, and collaboration in scientific research, which will strengthen the foundation for a more robust and sustainable future. In this proposal, building on successful cases and lessons learned, we outline the steps to develop a national CRIS with a comprehensive approach to advancing science in the Dominican Republic. A Repository of Repositories: Developing a National Registry of Repositories at the National Research Fund in Kenya using DSpace 7 1Mount Kenya University, Kenya; 2National Research Fund, Kenya This paper presents the development and implementation of a registry of repositories in Kenya, spearheaded by the National Research Fund (NRF) and powered by DSpace 7.6.2. The project attempts to address the need for a centralized directory of institutional and thematic repositories in the country, offering researchers, policymakers, and the public a comprehensive platform for discovering repositories and resources held therein. Unlike union repositories, this registry focuses on cataloging repositories rather than harvesting their content. Key features include integration with OpenStreetMap, custom interface design for the unique entity model, content organization in line with counties in Kenya, and integration with international metadata standards. This paper will explore the theoretical foundations of repository registries, highlighting their role in improving research visibility, fostering collaboration, and supporting national open-access policies. 
The technical section covers the process of building the registry on DSpace 7.6.2, addressing issues such as setting up a custom “Repository” entity and updating the metadata registry, as well as the challenges encountered. The paper will provide a replicable model for other countries seeking to establish similar registries, thereby contributing to global efforts in knowledge sharing and digital preservation. |
09:00 - 10:30 | Lightning (24x7) - Repository possibilities Location: N112- Band Room |
|
It doesn’t have to be this way: Reimagining Institutional Repositories in-Transition Southern Illinois University Edwardsville, United States of America We derive lessons from the migration of our institutional repository from BePress to Preservica, specifically looking at the parts of the endeavor that were unnecessary, onerous, unreasonable and otherwise presented extra institutional and emotional labor for us as librarians. We transmute these lessons into utopian possibilities for how interactions with vendors, institutional partners, and stakeholders could go. We imagine a context in which institutional data had its own set of rights and standards not just for preservation but for transmission and usability between platforms in the event of a vendor change. We imagine the potential for institutions, whose data create the necessity for preservation and access platforms, to be able to advocate through cooperation and solidarity against vendor-created obstacles and lack of care. We provide examples of the problems we encountered (e.g. reclaiming data, dealing with third party storage, connecting vendors to institutional IT) and how they present an opportunity to reimagine relationships with vendors and the data itself. Wikidata and repositories: opening up the future Saint Mary's College, United States of America Wikidata is an emerging resource and presents a new model for conducting scholarship. It provides an opportunity for repositories to expand their capabilities and for their staff to get hands-on experience with linked open data. This topic is important as I believe that Wikidata is not currently well understood as a concept or how it can be used by repositories in a practical, real world manner. This presentation will discuss enhancing institutional repositories with Wikidata, focusing on how it can be an extra layer of discovery. 
It will highlight how Wikidata can improve the discoverability, findability, and searchability of repositories, offering innovative ways to increase their visibility. The presentation will demonstrate how institutional repositories can achieve global reach, by leveraging linked data structures to maximize their impact for institutions, creators, and users alike. This session will be an invaluable chance for in person discussion and dialogue amongst individuals already using Wikidata in their repositories and librarians who are not familiar with or do not have experience in Wikidata or linked data principles. This will empower repositories to start projects on their own and to make a difference for both their users and the world through their work on Wikidata. Diamond Open Access: Repositories as journal publishing platforms, practical experiences University of Cambridge Cambridge University Library (CUL) is currently undertaking a pilot project to engage with Cambridge researchers wishing to publish their research in non-traditional journals. The project seeks to provide greater discoverability and availability of content by implementing suitable infrastructure, built on interoperable, open, and widely adopted platforms. During the first year we have implemented and launched the journal hosting platform, which is available at https://diamond-oa.lib.cam.ac.uk. It is based on DSpace, a widely adopted, open-source repository platform. This choice allows us to explore alternative publishing and review models, currently unavailable in more traditional journal publishing platforms. We are working very closely with our pilot participants to assess submission and editorial management workflows, as well as determining content structuring and journal pages design needs and gathering feedback from the journals. These are key activities that allow us, jointly with our participating journals, to determine the suitability of DSpace as a journal publishing platform. 
So far, feedback from participants has been very positive and they are finding the platform and publishing processes intuitive and easy to use. This presentation will provide an overview of the pilot, lessons learned so far and describe next steps and future areas for development as we transition into service. Possibilities for Accessible Repositories: a Case Study in ADA Title II Implementation University of Minnesota, United States of America Institutional repositories face two types of accessibility concerns. The first is the accessibility of the repository architecture as a system to navigate. The second issue is the accessibility of the repository's content. For the latter, this content is often produced outside of the repository, and repository managers have little ability to control or mitigate accessibility issues before it is deposited. For repositories that include media files, accessibility to the content requires additional steps to create accurate text-based representations of the resource. Staff from the University of Minnesota’s institutional repository share a brief overview of repository accessibility for users with disabilities, new rules in the United States regarding digital accessibility, as well as the findings of a one-year project to improve their repository’s accessibility through the identification and remediation of high-use inaccessible items. 120 years of dissertations, a new understanding of scholars Montana State University, United States of America Dissertations are cornerstones of many repository collections. They are also a source of unique information held by each university. While the dedication and acknowledgements of these papers are often skimmed if they are read at all, they can be a rich source of information. This presentation shares some of the findings from these sections of Montana State University dissertations and suggests future research on the topic. 
These items, already held by the repository, can offer inspiration for future library services or research across the community. This presentation will outline the motivation, process, and findings from a textual analysis of acknowledgments and dedications of dissertations from one institution. Barriers to the institutional repository network: how far is integration possible? Loughborough University, United Kingdom In our work on establishing and growing the Thoth Archiving Network, we have learned several lessons around the challenges in working with university systems, repository software versions, and the labyrinthine strands of metadata customisations that occur within repositories. Metadata schemas can be customised within one institution that has implemented an instance of open-source repository software quite differently from those at another institution on the network using the same software. Issues with institutional repositories and a sustainable network are not limited to security, metadata, and software versioning, however. More high-level concerns, some administrative, are also in play. These include the level of volatility that exists within HE institutions and the brevity of contracts between universities and repository software providers. We still believe there is a benefit to involving institutional repositories around the world in helping to archive and preserve at-risk scholarly knowledge in the form of open access monographs, but we recognise the barriers that exist, in physical, technical, and contractual components, which would need to be well-considered going forward, for the benefit of both the Network and the involved institutions. Surely solutions are possible, through cooperation and collaboration. In this talk we hope to move the conversation forward. 
Simplifying Data Curation Through Tooling And Automation KU Leuven, Belgium In its Dataverse-based institutional repository, KU Leuven has a review phase for each dataset before publication to support researchers in publishing their data in a FAIR way. To streamline our curation workflow and free up as much time as possible for support, we have developed a review dashboard to track who reviews what. The dashboard plugs in to our Dataverse instance and automates part of the review process with a general checklist and automated feedback generation in addition to easily assigning and following up on dataset reviews. With the review dashboard, we aim to support human curation through automation and not replace it. This also means exploring automation options by building in automated checks. We’ll share our road to the creation of the review dashboard and the work we are currently doing to further the implementation of curation supported by automated checks. We’ll show the UI-side, but also provide a look into the logic of the automated checks. We hope to spark conversation on how to further support the human task of curation through tools and technology without losing the important human touch and interpretation that is so valuable to making a dataset as FAIR as possible. |
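As a rough illustration of the kind of automated checks the KU Leuven abstract above refers to, the sketch below runs a few rule-based checks over a dataset record and turns the failures into reviewer feedback. It is not the KU Leuven dashboard's logic: the flat record layout (`description`, `license`, `files`), the specific rules, the threshold, and all function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str       # short identifier of the check
    passed: bool    # True when the dataset satisfies the rule
    feedback: str   # message a curator could send back on failure


def run_checks(dataset: dict, min_description_chars: int = 100) -> list[CheckResult]:
    """Run hypothetical curation checks on a flat dataset record.

    Assumed (not a Dataverse API) record layout:
    {"description": str, "license": str, "files": [{"name": str}, ...]}
    """
    description = (dataset.get("description") or "").strip()
    files = dataset.get("files") or []
    has_readme = any(
        (f.get("name") or "").lower().startswith("readme") for f in files
    )
    return [
        CheckResult(
            "description",
            len(description) >= min_description_chars,
            f"Please expand the description to at least {min_description_chars} characters.",
        ),
        CheckResult(
            "license",
            bool(dataset.get("license")),
            "Please choose a license so others know how the data may be reused.",
        ),
        CheckResult(
            "readme",
            has_readme,
            "Please add a README file documenting the dataset's structure.",
        ),
    ]


def feedback_for(results: list[CheckResult]) -> list[str]:
    """Collect feedback messages for failed checks only."""
    return [r.feedback for r in results if not r.passed]
```

A curator-facing tool could show `feedback_for(run_checks(record))` as a draft that the human reviewer edits before sending, which keeps the final judgment with the curator as the abstract emphasizes.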
09:00 - 10:30 | Presentations- COAR Notify Location: C116- Community Gathering Room |
|
Using PCI, COAR Notify and EPrints to Re-Invent the Publication Workflow CoSector, University of London, United Kingdom Impacts on the Repository of COAR Notify, and tools to help you Cottage Labs, United Kingdom Interoperable verification and dissemination of software assets in repositories using COAR Notify 1Open University, United Kingdom; 2Brno University of Technology; 3Software Heritage Moving repositories out of the periphery and into the center of scholarly publishing 1COAR, Netherlands; 2University of Minho, Portugal; 3Antleaf LTD, UK; 4Pacific Northwest National Laboratory, US |
09:00 - 10:30 | Developer Track Session 3 Location: C119&121- Classrooms |
|
Leveraging AI Programming Assistants for Digital Repository Development: A Practical Demonstration Oregon State University Libraries and Press, United States of America Digital repository development increasingly demands proficiency across multiple programming languages and frameworks, creating significant barriers for developers learning new technologies. While pair programming with experienced colleagues has traditionally facilitated this learning process, AI coding assistants offer a promising alternative. This presentation demonstrates practical applications of AI programming tools—specifically Cursor and GitHub Copilot—in developing digital repository solutions. Through live demonstrations, I will explore two real-world scenarios: implementing a search application using the Primo API and creating a Python-based preservation script for Hyrax repository content. The demonstration will showcase how AI assistants can accelerate development by providing real-time code generation, automated project structuring, and intelligent error-handling suggestions. This presentation offers insights into leveraging AI assistance to create more robust and maintainable repository systems by examining these tools' strengths and limitations. This approach reduces barriers to entry for new developers and enables faster implementation of digital repository features, ultimately contributing to more equitable access to digital resources. Building Flexible, AI-Powered Forms for Repositories with react-formule CERN, Switzerland The CERN Analysis Preservation (CAP) repository enables physicists at CERN to store and manage their analysis metadata. Supporting diverse experiments like CMS and LHCb, CAP faces the challenge of accommodating each experiment’s unique procedures and data requirements, which go beyond a one-size-fits-all schema. 
Initially, these schemas were manually designed in collaboration with each experiment, but this approach proved slow, inflexible, and a bottleneck for onboarding new experiments. To address this, we developed an interactive form builder, allowing experiments to independently create, maintain, and edit their schemas through a user-friendly interface. Over time, this tool evolved into a robust solution, react-formule, with powerful customization options and a versatile feature set, making it a valuable asset beyond CAP. Now open-sourced in alignment with CERN’s commitment to open science, react-formule offers a variety of field types, validation logic, and visual settings, along with new exciting AI-powered features that further simplify form creation. This presentation will introduce CAP and give an overview of react-formule, its main features, how to use it and how to integrate it in your applications. Automated Data Analytics: A Statistical Dashboard Built with GitLab CI/CD for a Data Repository Based on CKAN 1National Cheng Kung University, Taiwan; 2Academia Sinica, Taiwan This project presents a fully automated, cost effective, and highly available statistical dashboard providing an overview of the current status of the CKAN-based data repository depositar (https://data.depositar.io/). The dashboard is built entirely with open-source tools such as D3.js, DataTables, and Bootstrap. In addition, the dashboard is automatically refreshed daily with the help of GitLab’s CI/CD pipelines. We utilize CKAN's Action API which exposes all of CKAN's core features. The CKAN APIs allow access to JSON-formatted lists of a CKAN instance's datasets, as well as the JSON representations of all the datasets and resources at the instance. This practice demonstrates how to automate data analytics and visualization without incurring heavy costs, offering an innovative approach to collecting status data and sharing repository insights. 
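To make the CKAN Action API usage in the abstract above more concrete, here is a minimal sketch of how a scheduled job might collect summary statistics from a CKAN instance. It is not the depositar dashboard's code. The `package_search` action and the `success`/`result` response envelope are part of CKAN's Action API; the function names, the choice of statistics (dataset count and resource formats), and the `rows` limit are assumptions made for illustration.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://data.depositar.io"  # instance named in the abstract


def action_url(base_url: str, action: str, **params) -> str:
    """Build a CKAN Action API URL such as .../api/3/action/package_search."""
    url = f"{base_url.rstrip('/')}/api/3/action/{action}"
    return f"{url}?{urlencode(params)}" if params else url


def call_action(base_url: str, action: str, fetch=None, **params):
    """Call an action and return its 'result' payload.

    `fetch` is an injectable function (url -> bytes) so the logic can be
    exercised without network access; by default it uses urllib.
    """
    fetch = fetch or (lambda url: urlopen(url, timeout=30).read())
    payload = json.loads(fetch(action_url(base_url, action, **params)))
    if not payload.get("success"):
        raise RuntimeError(f"CKAN action {action!r} failed")
    return payload["result"]


def summarize(base_url: str, fetch=None, rows: int = 100) -> dict:
    """Return the total dataset count and a tally of resource formats.

    Only the first `rows` datasets are inspected for formats; a real job
    would page through results with the `start` parameter.
    """
    result = call_action(base_url, "package_search", fetch=fetch, rows=rows)
    formats = Counter(
        (res.get("format") or "unknown").upper()
        for dataset in result["results"]
        for res in dataset.get("resources", [])
    )
    return {"dataset_count": result["count"], "formats": dict(formats)}
```

A daily CI job could call `summarize(BASE_URL)` and write the returned dictionary to a JSON file for a static dashboard page to read, which would keep the setup free of any always-on server.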
Automating Data Imports in a DSpace-CRIS’s Institutional Repository EPFL, Switzerland The migration of Infoscience, EPFL’s institutional repository, to DSpace-CRIS required a custom Python-based pipeline to automate the ingestion of research outputs and datasets. Limitations in default DSpace-CRIS import tools, such as insufficient query controls, incomplete metadata mappings, and a lack of deduplication mechanisms, necessitated a tailored approach. The pipeline leverages the DSpace REST API to enable precise queries, metadata reconciliation, and robust deduplication. It incorporates fallback mechanisms, such as publisher-specific APIs, for full-text retrieval when standard tools like Unpaywall and CrossRef prove insufficient. Key challenges included reconciling authorship with EPFL directories, aligning metadata across diverse collections, and maintaining data consistency during imports. The developer track presentation will provide a visual breakdown of the pipeline’s architecture, highlight key challenges, and illustrate the solutions implemented. The presentation will complement this by delving deeper into the technical details and lessons learned. Both formats will offer practical insights for repository managers and developers seeking to automate data imports and optimize workflows in institutional repositories. Unraveling the Mystery of DSpace Backend Failures: A Debugging Journey dataquest s.r.o., Slovakia Efficient debugging is crucial for maintaining the stability and performance of repository systems like DSpace. In this developer track presentation, we will showcase the systematic techniques employed to identify and resolve a challenging backend database issue in DSpace. The problem, initially revealed through automated UI tests, was traced back to unresponsive database connections. By leveraging advanced debugging tools and methods, we isolated the root cause and developed several fixes. |
10:30 - 11:00 | Coffee Break Location: Rogers Lobby |
11:00 - 11:50 | Keynote speaker Ben Zhao Location: Griffin Auditorium Ben Zhao is the Neubauer Professor of Computer Science at the University of Chicago. He is an ACM (Association for Computing Machinery) distinguished scientist and has received numerous awards. He has authored more than 160 publications in such areas as security and privacy, machine learning, networked systems, Internet measurements, and human-computer interaction. |
11:50 - 12:30 | Closing Plenary Location: Griffin Auditorium |
12:30 - 13:30 | Lunch Break Location: Rogers Lobby |
13:30 - 15:00 | Optimizing Metadata Discoverability: A Lean Six Sigma Approach Location: N110- Orchestra Room |
|
Optimizing Metadata Discoverability: A Lean Six Sigma Approach U.S. Army Corps of Engineers, United States of America This workshop provides repository professionals with practical tools to apply Lean Six Sigma principles to improve metadata workflows and enhance discoverability within repository systems. Participants will explore the DMAIC framework (Define, Measure, Analyze, Improve, Control) and its application to solving metadata-related challenges, such as inconsistencies, errors, and inefficiencies. The workshop combines presenter-led instruction, hands-on exercises, and group discussions, enabling attendees to identify root causes of metadata issues, streamline workflows, and implement sustainable improvements. Real-world examples and scenarios will ensure practical application of the concepts. By the end of the session, participants will have a toolkit to optimize metadata processes, improve discoverability, and enhance the user experience in their repositories. This session is tailored for repository managers, metadata specialists, librarians, and professionals in research data management. |
13:30 - 17:00 | Fedora User Group Location: C116- Community Gathering Room |
13:30 - 17:00 | DataCite User Group Location: C119&121- Classrooms |