
BORIS Portal

Bern Open Repository and Information System


SNSF Spark Project 'Dynamic Data Ingestion' for server-side data harmonisation: Creating a database with 200k students and scholars 1200-1800: Method, concept and practical implementation

BORIS DOI
10.48350/198516
Date of Publication
June 4, 2021
Publication Type
Conference Paper
Division/Institute

Historisches Institut...

Author
Gubler, Kaspar
Historisches Institut - Mittelalterliche Geschichte
Subject(s)

900 - History::940 - ...

Language
English
Uncontrolled Keywords
nodegoat, data modeling, data analysis, data visualisation, digital humanities
Description
The linking of research data has been a dominant topic for years, especially in digital history, and Linked Open Data (LOD) is the buzzword at conferences and in research projects. The greatest challenge, however, is not collecting such data from the internet but harmonising it, because research databases are usually structured differently. It is therefore not surprising that, despite many initiatives, no research project in digital history has yet managed to harmonise data across several structural levels of the databases involved. This means, for example, not only linking persons across databases by their names, but going deeper into the data structure to harmonise, say, a person's geographical origin or the attributes of their education. Yet that is the aim: to answer scientific questions through structural data harmonisation. This is where our SPARK project comes in. The third and final phase of the project (Episode 3) was completed in January 2021.

What are the core results of this project? In essence, a software module (the DDI module, for 'dynamic data ingestion') and a method: research data is collected from different source databases and ingested on a central server by the module according to the spider principle, creating a new metadatabase. The collected data is harmonised in this newly built database as far as possible during ingestion itself, by mapping the database fields of the source databases onto corresponding fields of the metadatabase. If such a mapping is impossible or only partially possible, because the fields of the source database and the metadatabase are too dissimilar, an algorithm can be applied in a second step, once the data is stored on the central server, to bring uniformity to the data through reconciliation. In addition, the data can be automatically reclassified in order to standardise it.

These measures prepare the data for analysis and ultimately for publication, both of which can be done in the virtual research environment nodegoat.
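The two-step harmonisation described above can be illustrated with a minimal sketch. All field names, records, and the variant table below are invented for illustration; the actual DDI module operates against live source databases inside nodegoat, not against in-memory lists.

```python
# Illustrative sketch of the two harmonisation steps: field mapping at
# ingestion, then value reconciliation on the central server.
# All data and names here are hypothetical.

# Records as exported by two source databases with differently named fields.
SOURCE_A = [{"name": "Petrus de Berno", "origo": "Berna"}]
SOURCE_B = [{"person": "Johannes Faber", "birthplace": "Basel"}]

# Step 1: mapping -- source field names onto metadatabase field names.
FIELD_MAPS = {
    "source_a": {"name": "person_name", "origo": "place_of_origin"},
    "source_b": {"person": "person_name", "birthplace": "place_of_origin"},
}

# Step 2: reconciliation -- bring uniformity to values that field mapping
# alone cannot harmonise (e.g. Latin vs. modern place names).
PLACE_VARIANTS = {"berna": "Bern", "basilea": "Basel"}

def ingest(records, field_map):
    """Map each source record into the metadatabase schema,
    dropping fields that have no counterpart there."""
    return [
        {field_map[k]: v for k, v in record.items() if k in field_map}
        for record in records
    ]

def reconcile(record):
    """Normalise place names after ingestion."""
    place = record.get("place_of_origin", "")
    record["place_of_origin"] = PLACE_VARIANTS.get(place.lower(), place)
    return record

metadatabase = [
    reconcile(r)
    for source, records in (("source_a", SOURCE_A), ("source_b", SOURCE_B))
    for r in ingest(records, FIELD_MAPS[source])
]
print(metadatabase)
```

The key design point the abstract makes is the ordering: as much harmonisation as possible happens at ingestion (cheap, declarative field maps), and only the remainder is handled afterwards by reconciliation over the already-centralised data.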
Related URL
https://histdata.hypotheses.org/nodegoat-day-2021
Handle
https://boris-portal.unibe.ch/handle/20.500.12422/178748
File(s)
File: Nodegoat-Day-2021-Programm.pdf
File Type: text
Format: Adobe PDF
Size: 972.1 KB
License: https://www.ub.unibe.ch/services/open_science/boris_publications/index_eng.html#collapse_pane631832
Content: supplemental