Session
Visual Cultural Heritage

Presentations
Workflows for Digital Scholarship in Three Dimensions
Maastricht University, The Netherlands

Reproducible workflows in the humanities have traditionally been embedded in research publications in the form of footnotes, endnotes, citations, and bibliographies. This scaffolding allows researchers to trace the arguments in a publication to their antecedent roots. Reproducible workflows have taken on a new urgency with the advent of big data: making available not just the underlying dataset but also the algorithms used to parse the data, so that the findings may be analysed, confirmed, or refuted. We would argue that scholarship in 3D sits somewhere between these two modalities.

Many of the workflows in 3D scholarship, particularly where that scholarship involves the reconstruction of physical spaces (which may or may not currently exist, or may exist in a different form from the digitally reconstructed state), remain invisible to those outside the team that created the model. The workflows in 3D reconstruction resemble traditional humanities scholarship, in which a plethora of decisions is taken in the creation of an argument, here resulting in the 3D model. A big data approach cannot make this workflow visible: even if the entire dataset (e.g. the model files) is deposited in a repository, it will not reveal the myriad decisions taken in the model’s construction. Perhaps the nearest equivalent for documenting the modelling workflow is the lab notebook. But even with the notebook, researchers would also need access to all the paradata the modeller/researcher used in the decision-making process, in effect creating an archive that does not benefit from the archivist’s expertise in ordering and cataloguing. Nevertheless, there is no doubt that scholarship in 3D lacks publication models through which it can become, in itself, a scholarly argument, as opposed to the ways in which 3D models are currently engaged with: as ‘twirly things’ available on a platform such as SketchFab (with limited annotation available) or as surrogates, including videos and 2D images in articles.

PURE3D (funded by the Dutch PDI-SSH, Platform Digitale Infrastructuur–Social Sciences and Humanities) is filling this gap, creating both an infrastructure and a workflow through which 3D scholarship can be published, argued, and interrogated. The infrastructure is modelled on traditional text-based editions with one caveat: the text in a PURE3D edition is the 3D model, surrounded by multimodal annotation (text, images, video, structured/unstructured data). The editions also document the creation process (paradata), providing users with direct access to modelling/interpretative decisions, and source material (which can be embedded in the edition or hyperlinked online). The goal of the PURE3D platform is not to provide researchers with a toolkit to create a reproducible model, but to provide them with the underlying dataset (deposited in a trusted digital repository) and the final published model/dataset, along with the decision-making process and source material. A peer review process adds a further layer of transparency, so that the edition becomes a knowledge site, making visible the decisions, assumptions, and levels of certainty, along with the paradata, on which the model has been built.
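To give a flavour of what such an edition bundles together, the sketch below models a model-plus-paradata record in plain Python. Every class, field, and value here is a hypothetical illustration of the idea, not the PURE3D schema:

```python
# A minimal sketch of a 3D "edition" record: the model is the text,
# surrounded by annotations, paradata, and sources. Names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Annotation:
    target: str       # anchor on the model, e.g. a named part or viewpoint
    media_type: str   # "text", "image", "video", or "dataset"
    content_uri: str  # where the annotation content lives

@dataclass
class Edition:
    model_uri: str    # the published 3D model: the edition's "text"
    dataset_doi: str  # the underlying dataset in a trusted repository
    annotations: list[Annotation] = field(default_factory=list)
    paradata: list[str] = field(default_factory=list)  # modelling decisions
    sources: list[str] = field(default_factory=list)   # embedded or linked sources

edition = Edition(
    model_uri="https://example.org/editions/chapel/model.glb",  # placeholder
    dataset_doi="10.5281/zenodo.0000000",                       # placeholder
)
edition.paradata.append("Roof pitch inferred from 1887 survey drawing; certainty: low.")
```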
Bridging the gap between visual cultural heritage collections and digital scholarship in DARIAH-FI
University of Helsinki, Finland

Despite advances in the field of computer vision that have inspired some to call for a visual turn in digital humanities [1], [2], multiple challenges prevent researchers in the humanities and cultural studies from benefiting from these advances or adopting digital workflows. Humanities scholars are often sensitive to the risk of bias in any automated interpretation of images, and critical discrepancies have been found between the most advanced AI image labeling services [3]. Before digital workflows can be made accessible to research communities interested in extensive historical and contemporary imagery, research infrastructures (RIs) must be developed to improve the state of data that is often unstructured, hard to discover, and poorly contextualized.

In this presentation, we focus on research workflows familiar to humanists interested in visual cultural heritage materials, highlighting insights from qualitative interviews with six visual researchers and the findings of recent studies on their data practices. Many researchers still resort to hybrid approaches, which involve visiting archives, digitizing resources themselves, or acquiring materials from both open and private sources [4]. As a result, researchers examining historical or contemporary visual artefacts face similar challenges. A growing trend is the accumulation of mid- to large-sized visual corpora, alongside documents, interviews, or archival materials that are susceptible to being transformed into well-contextualized research datasets. Here, we emphasize the terms “susceptible” and “datasets” because, despite the existence of computer-assisted software for qualitative data analysis or for annotating visual collections, digital means of conducting research and generating data are not systematically adopted by the community [5]. Furthermore, recent research in Sweden and Finland has shown that sharing data is difficult for visual researchers due to the GDPR, third-party ownership, copyright, and the inadequacy of data sharing platforms for their often heterogeneous research materials [6], [7].

To address the challenges encountered by visual research communities, DARIAH-FI, currently in its developmental stage, will prioritize issues of a technological and legal nature over the next two years. DARIAH-FI builds on years of experience in text-driven computational humanities in Finland, currently consolidating as a network of researchers in diverse fields of digital SSH from six universities, in collaboration with major hubs for computer science and digital cultural heritage aggregators (see www.dariah.fi). The RI’s vision is now to include visual research communities and the institutions giving access to Finnish photographic and other visual cultural heritage. The goal is to begin closing the workflow gap that exists among various scholarly practices in the humanities and cultural research, involving researchers whether or not they are familiar with digital humanities. The objectives are to: 1) create AI models that can generate usable metadata for researchers (a minimal illustration is sketched below), 2) connect visual objects to the archival and textual sources essential for their interpretation, and 3) relieve the visual culture research community, whose workflows are constrained either by the cumbersome accumulation of research materials or by barriers to publishing, sharing, and archiving data.

[See references in PDF document]
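As a minimal illustration of objective 1, the sketch below generates candidate metadata for a digitised image with an off-the-shelf classifier. The library, model name, and threshold are assumptions chosen for illustration rather than DARIAH-FI’s actual stack, and, given the labelling biases noted above, the output is meant for curatorial review, not automatic ingestion:

```python
# A sketch of machine-assisted metadata generation for a digitised photograph.
# The model is a generic example; a real pipeline would be fine-tuned on
# heritage material and keep a human reviewer in the loop.
from transformers import pipeline

labeler = pipeline("image-classification", model="google/vit-base-patch16-224")

def candidate_metadata(image_path: str, min_score: float = 0.3) -> list[dict]:
    """Return machine-generated labels to be reviewed, not trusted, by a curator."""
    return [
        {"label": pred["label"], "score": round(pred["score"], 3)}
        for pred in labeler(image_path)
        if pred["score"] >= min_score
    ]

print(candidate_metadata("photo.jpg"))  # e.g. [{"label": "church", "score": 0.74}]
```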
Improving workflows in digital art history: the usefulness of patrimonial image segmentation
¹École Pratique des Hautes Études, France / Université de Montréal, Canada / Centre National de la Recherche Scientifique, France; ²Université Rennes 2, France / Université de Montréal, Canada

Since Johanna Drucker questioned the existence of “digital art history” (Drucker, 2013), the last decade has seen a growing number of art historical projects make use of computer vision methods. However, digital art history remains a fragmented field: to date, these projects have preferred to develop their own technical and methodological solutions. This project-based logic prevents the reproducibility of workflows and thus requires significant investments of money and time, large quantities of data, and technical skills that are not accessible to all research units, let alone all researchers (Romein et al., 2020, 310). It ultimately hampers the standardization of digital practices, particularly in terms of data recording and algorithm training. In particular, the segmentation of patrimonial images is not subject to any standard or harmonization. The point is not to standardize the description of image content (an attempt that falls within the realm of ontology) but to propose a way to harmonize the recording of objects of interest within these images. While such initiatives exist in the field of literary studies (Chagué, Clérice & Romary, 2021), in digital art history the question remains little addressed (Bardiot, 2021). Yet corpora of segmented images, whose objects have been identified by their coordinates and recorded in a hierarchical and standardized way, would make it possible to create ground truths that could serve as a basis for training new algorithms and benefit every workflow in art history. As workflows in digital art history are generally based on transfer learning, which reduces the amount of data required to train algorithms and improves the results obtained, this question is crucial for the discipline. However, the available algorithms are for the most part trained on “natural” images, which makes them hardly suited to the specificity of patrimonial images. The absence of harmonization in the solutions proposed project after project culminates in a bitter realization: the available tools evolve swiftly, rendering them obsolete in the short term. It therefore appears vital today, in the age of Open Science, to pool our segmented images so they can serve as ground truths for future research, minimizing model training while enhancing the results obtained.

Segmentation always proposes an interpretation of the images. As such, how can we harmonize data segmentation while remaining faithful to the epistemological requirements of art history? Asking this question means putting into tension issues of interoperability and reproducibility, on the one hand, and epistemological reasoning in relation to the tools developed, on the other. These questions should not be addressed after tool development but upstream, and should accompany the entire workflow (Stutzmann, 2010, 247-278). To address these tensions, the specific needs of digital art history with regard to image segmentation have to be identified, especially in the case of small and limited corpora, and existing standards in other fields will be discussed.
The aim is twofold: to produce ready-to-use models and workflows adapted to patrimonial data, and to share the data enabling these models to be produced.
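One possible shape for such pooled ground truth is sketched below: each object of interest is stored with its pixel coordinates and a hierarchical label, so that records from different projects can be merged. The field names and label vocabulary are illustrative assumptions, not an existing standard:

```python
# A sketch of a harmonized segmentation record: coordinates plus a
# hierarchical label. Field names and the label path are hypothetical.
import json

def segment_record(image_uri: str, xywh: tuple[int, int, int, int],
                   label_path: list[str]) -> dict:
    x, y, w, h = xywh
    return {
        "image": image_uri,
        "region": {"x": x, "y": y, "w": w, "h": h},  # pixel coordinates
        "label": "/".join(label_path),               # hierarchical label
    }

record = segment_record(
    "https://example.org/iiif/ms-1234/f42r/full/full/0/default.jpg",  # placeholder
    (210, 340, 128, 96),
    ["figure", "human", "saint"],
)
print(json.dumps(record, indent=2))
```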