BORIS Portal

Bern Open Repository and Information System

Exploring Data Provenance in Handwritten Text Recognition Infrastructure: Sharing and Reusing Ground Truth Data, Referencing Models, and Acknowledging Contributions. Starting the Conversation on How We Could Get It Done

BORIS DOI
10.48350/194575
Publisher DOI
10.46298/jdmdh.10403
Description
This paper discusses best practices for sharing and reusing Ground Truth in Handwritten Text Recognition infrastructures, as well as ways to reference and acknowledge contributions to the creation and enrichment of data within these systems. We discuss how one can place Ground Truth data in a repository and, subsequently, inform others through HTR-United. Furthermore, we want to suggest appropriate citation methods for ATR data, models, and contributions made by volunteers. Moreover, when using digitised sources (digital facsimiles), it becomes increasingly important to distinguish between the physical object and the digital collection. These topics all relate to the proper acknowledgement of labour put into digitising, transcribing, and sharing Ground Truth HTR data. This also points to broader issues surrounding the use of machine learning in archival and library contexts, and how the community should begin to acknowledge and record both contributions and data provenance.
Date of Publication
2024
Publication Type
Article
Subject(s)
100 Philosophy
800 Literature, rhetoric & criticism
900 History
Keyword(s)
Handwritten Text Recognition
Ground Truth
Crowdsourcing
Citizen Science
Language(s)
en
Contributor(s)
Romein, C. Annemieke
Hodel, Tobias
Walter Benjamin Kolleg (WBKolleg)
Walter Benjamin Kolleg (WBKolleg) - Digital Humanities
Gordijn, Femke
Zundert, Joris J. van
Chagué, Alix
Lange, Milan van
Jensen, Helle Strandgaard
Stauder, Andy
Purcell, Jake
Terras, Melissa M.
Heuvel, Pauline van den
Keijzer, Carlijn
Rabus, Achim
Sitaram, Chantal
Bhatia, Aakriti
Depuydt, Katrien
Afolabi-Adeolu, Mary Aderonke
Anikina, Anastasiia
Bastianello, Elisa
Benzinger, Lukas Vincent
Bosse, Arno
Brown, David
Charlton, Ash
Dannevig, André Nilsson
Gelder, Klaas van
Go, Sabine C.P.J.
Goh, Marcus J.C.
Gstrein, Silvia
Hasan, Sewa
Heide, Stefan von der
Hindermann, Maximilian
Huff, Dorothee
Huysman, Ineke
Idris, Ali
Keijzer, Liesbeth
Kemper, Simon
Koenders, Sanne
Kuijpers, Erika
Rønsig Larsen, Lisette
Lepa, Sven
Link, Tommy O.
Nispen, Annelies van
Nockels, Joe
Noort, Laura M. van
Oosterhuis, Joost Johannes
Popken, Vivien
Estrella Puertollano, María
Puusaag, Joosep J.
Sheta, Ahmed
Stoop, Lex
Strutzenbladh, Ebba
Sijs, Nicoline van der
Spek, Jan Paul van der
Trouw, Barry Benaissa
Van Synghel, Geertrui
Vučković, Vladimir
Wilbrink, Heleen
Weiss, Sonia
Wrisley, David Joseph
Zweistra, Riet
Additional Credits
Walter Benjamin Kolleg (WBKolleg)
Walter Benjamin Kolleg (WBKolleg) - Digital Humanities
Series
Journal of data mining and digital humanities
Publisher
Episciences
ISSN
2416-5999
Access Rights
Open access