Publication: SNSF Spark Projekt ‘Dynamic Data Ingestion’ for server-side data harmonisation: Creating a database with 200k students and scholars 1200-1800: Method, concept and practical implementation
cris.virtual.author-orcid | 0000-0002-6627-5045 | |
cris.virtualsource.author-orcid | 3c23c193-6e41-4334-bd86-0373143fa684 | |
datacite.rights | open.access | |
dc.contributor.author | Gubler, Kaspar | |
dc.date.accessioned | 2024-10-26T18:26:44Z | |
dc.date.available | 2024-10-26T18:26:44Z | |
dc.date.issued | 2021-06-04 | |
dc.description.abstract | The linking of research data has been a dominant topic for years, especially in digital history. Linked Open Data (LOD) is the buzzword at conferences and in research projects. However, it is not the collection of such data available on the internet that is the greatest challenge here, but its harmonisation, because research databases are usually structured differently. It is therefore not surprising that, despite many initiatives, no research project in digital history has yet been realised that can harmonise data across several structural levels of the databases. This means, for example, not only linking persons across databases by their names, but going deeper into the data structure to harmonise, for example, the geographical origin or attributes of a person’s education. That would be the aim: to answer scientific questions through structural data harmonisation. This is where our SPARK project comes in. The third and final phase of the project (Episode 3) was completed in January 2021. What are the core results of this project? In essence, it is a software module (the DDI module, for ‘dynamic data ingestion’) and a method: research data is collected from different source databases and ingested on a central server using the module according to the spider principle, creating a new metadatabase. The harmonisation of the collected data in this newly built database is done as far as possible during data ingestion, by mapping the database fields of the source databases onto corresponding database fields of the new metadatabase. If such a mapping is not possible, or only partially possible, because the database fields of the source database and the metadatabase are too dissimilar, then in a second step, as soon as the data is stored on the central server, an algorithm can be used to bring uniformity to this data through data reconciliation. In addition, the data can also be automatically reclassified in order to standardise it. 
These measures prepare the data for analysis and ultimately for publication, both of which can be done in the virtual research environment Nodegoat. | |
dc.description.sponsorship | Historisches Institut - Mittelalterliche Geschichte | |
dc.identifier.doi | 10.48350/198516 | |
dc.identifier.uri | https://boris-portal.unibe.ch/handle/20.500.12422/178748 | |
dc.language.iso | en | |
dc.relation.conference | nodegoat day 2021: From source to visualization: Data modeling and analysis with Nodegoat | |
dc.relation.organization | DCD5A442BA43E17DE0405C82790C4DE2 | |
dc.relation.organization | DCD5A442C509E17DE0405C82790C4DE2 | |
dc.subject | nodegoat | |
dc.subject | data modeling | |
dc.subject | data analysis | |
dc.subject | data visualisation | |
dc.subject | digital humanities | |
dc.subject.ddc | 900 - History::940 - History of Europe | |
dc.title | SNSF Spark Projekt ‘Dynamic Data Ingestion’ for server-side data harmonisation: Creating a database with 200k students and scholars 1200-1800: Method, concept and practical implementation | |
dc.type | conference_item | |
dspace.entity.type | Publication | |
dspace.file.type | text | |
oaire.citation.conferenceDate | 04.06.2021 | |
oaire.citation.conferencePlace | Universität Bern | |
oairecerif.author.affiliation | Historisches Institut - Mittelalterliche Geschichte | |
oairecerif.identifier.url | https://histdata.hypotheses.org/nodegoat-day-2021 | |
unibe.contributor.role | creator | |
unibe.date.licenseChanged | 2024-07-08 06:43:47 | |
unibe.description.ispublished | unpub | |
unibe.eprints.legacyId | 198516 | |
unibe.refereed | false | |
unibe.subtype.conference | speech |
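Illustrative note: the two-step harmonisation described in the abstract (field mapping during ingestion, then reconciliation of values already stored on the central server) can be sketched roughly as follows. This is a minimal sketch and not the DDI module itself: the source names, field names, variant table, sample records and the functions `ingest` and `reconcile` are all hypothetical and chosen only for illustration.

```python
# Hypothetical field mappings: source field name -> metadatabase field name.
FIELD_MAPS = {
    "source_a": {"name": "person_name", "origin": "place_of_origin"},
    "source_b": {"full_name": "person_name", "birthplace": "place_of_origin"},
}

# Hypothetical reconciliation table: lower-cased variant -> standard form.
PLACE_VARIANTS = {"berne": "Bern", "bern": "Bern", "basle": "Basel", "basel": "Basel"}


def ingest(source, record):
    """Step 1: map source fields onto metadatabase fields during ingestion."""
    mapping = FIELD_MAPS[source]
    harmonised = {"source": source}
    unmapped = {}
    for key, value in record.items():
        if key in mapping:
            harmonised[mapping[key]] = value
        else:
            # Fields without a counterpart are kept for later treatment.
            unmapped[key] = value
    harmonised["unmapped"] = unmapped
    return harmonised


def reconcile(record):
    """Step 2: standardise stored values against a list of known variants."""
    result = dict(record)
    place = result.get("place_of_origin")
    if place is not None:
        # Unknown spellings are left unchanged rather than guessed.
        result["place_of_origin"] = PLACE_VARIANTS.get(place.strip().lower(), place)
    return result


# Invented sample records from two differently structured sources.
rows = [
    ("source_a", {"name": "Person A", "origin": "Berne"}),
    ("source_b", {"full_name": "Person B", "birthplace": "basle", "degree": "mag. art."}),
]
metadatabase = [reconcile(ingest(source, record)) for source, record in rows]
```

In this sketch both sources end up with the same `person_name` and `place_of_origin` fields, while the `degree` field, which has no mapping, is carried along unchanged under `unmapped`.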
Files
Original bundle
- Name: Nodegoat-Day-2021-Programm.pdf
- Size: 972.1 KB
- Format: Adobe Portable Document Format
- File Type: text
- License: https://www.ub.unibe.ch/services/open_science/boris_publications/index_eng.html#collapse_pane631832
- Content: supplemental