Canonical Workflow for Experimental Research
(Dirk Betz, Claudia Biniossek, Christophe Blanchi, Felix Henninger, Thomas Lauer, Philipp Wieder, Peter Wittenburg, Martin Zünkeler),
2022-01-01
Data Depositing Services und der Text+ Datenraum
(Andreas Witt, Andreas Henrich, Jonathan Blumtritt, Christoph Draxler, Axel Herold, Marius Hug, Christoph Kudella, Peter Leinen, Philipp Wieder),
2022-01-01
Realising Data-Centric Scientific Workflows with Provenance-Capturing on Data Lakes
(Hendrik Nolte, Philipp Wieder),
2022-01-01
DOI: 10.1162/dint_a_00141
Realising Data-Centric Scientific Workflows with Provenance-Capturing on Data Lakes
(Hendrik Nolte, Philipp Wieder),
2022-01-01
URL: https://publications.goettingen-research-online.de/handle/2/121151
Toward data lakes as central building blocks for data management and analysis
(Philipp Wieder, Hendrik Nolte),
2022-01-01
DOI: 10.3389/fdata.2022.945720
Toward data lakes as central building blocks for data management and analysis
(Hendrik Nolte, Philipp Wieder),
2022-01-01
URL: https://publications.goettingen-research-online.de/handle/2/114449
2021
An Optimized Single Sign-On Schema for Reliable Multi-Level Security Management in Clouds
(Aytaj Badirova, Shirin Dabbaghi, Faraz Fatemi-Moghaddam, Philipp Wieder, Ramin Yahyapour),
In Proceedings of FiCloud 2021 – 8th International Conference on Future Internet of Things and Cloud,
2021-01-01
DOI: 10.1109/FiCloud49777.2021.00014
Certification Schemes for Research Infrastructures
(Felix Helfer, Stefan Buddenbohm, Thomas Eckart, Philipp Wieder),
2021-01-01
Sekundäre Nutzung von hausärztlichen Routinedaten ist machbar – Bericht vom RADAR Projekt
(Johannes Hauswaldt, Thomas Bahls, Arne Blumentritt, Iris Demmer, Johannes Drepper, Roland Groh, Stephanie Heinemann, Wolfgang Hoffmann, Valérie Kempter, Johannes Pung, Otto Rienhoff, Falk Schlegelmilch, Philipp Wieder, Ramin Yahyapour, Eva Hummers),
2021-01-01
2020
OLA-HD – Ein OCR-D-Langzeitarchiv für historische Drucke
(Triet Ho Anh Doan, Zeki Mustafa Doğan, Jörg-Holger Panzer, Kristine Schima-Voigt, Philipp Wieder),
2020-01-01
BibTeX: Toward data lakes as central building blocks for data management and analysis
@article{2_129372,
abstract = {"Data lakes are a fundamental building block for many industrial data analysis solutions and becoming increasingly popular in research. Often associated with big data use cases, data lakes are, for example, used as central data management systems of research institutions or as the core entity of machine learning pipelines. The basic underlying idea of retaining data in its native format within a data lake facilitates a large range of use cases and improves data reusability, especially when compared to the schema-on-write approach applied in data warehouses, where data is transformed prior to the actual storage to fit a predefined schema. Storing such massive amounts of raw data, however, has its very own challenges, spanning from the general data modeling, and indexing for concise querying to the integration of suitable and scalable compute capabilities. In this contribution, influential papers of the last decade have been selected to provide a comprehensive overview of developments and obtained results. The papers are analyzed with regard to the applicability of their input to data lakes that serve as central data management systems of research institutions. To achieve this, contributions to data lake architectures, metadata models, data provenance, workflow support, and FAIR principles are investigated. Last, but not least, these capabilities are mapped onto the requirements of two common research personae to identify open challenges. With that, potential research topics are determined, which have to be tackled toward the applicability of data lakes as central building blocks for research data management."},
author = {Hendrik Nolte and Philipp Wieder},
grolink = {https://resolver.sub.uni-goettingen.de/purl?gro-2/129372},
month = {01},
title = {Toward data lakes as central building blocks for data management and analysis},
type = {article},
url = {https://publications.goettingen-research-online.de/handle/2/114449},
year = {2022},
}
BibTeX: Toward data lakes as central building blocks for data management and analysis
@article{2_114449,
abstract = {"Data lakes are a fundamental building block for many industrial data analysis solutions and becoming increasingly popular in research. Often associated with big data use cases, data lakes are, for example, used as central data management systems of research institutions or as the core entity of machine learning pipelines. The basic underlying idea of retaining data in its native format within a data lake facilitates a large range of use cases and improves data reusability, especially when compared to the schema-on-write approach applied in data warehouses, where data is transformed prior to the actual storage to fit a predefined schema. Storing such massive amounts of raw data, however, has its very own challenges, spanning from the general data modeling, and indexing for concise querying to the integration of suitable and scalable compute capabilities. In this contribution, influential papers of the last decade have been selected to provide a comprehensive overview of developments and obtained results. The papers are analyzed with regard to the applicability of their input to data lakes that serve as central data management systems of research institutions. To achieve this, contributions to data lake architectures, metadata models, data provenance, workflow support, and FAIR principles are investigated. Last, but not least, these capabilities are mapped onto the requirements of two common research personae to identify open challenges. With that, potential research topics are determined, which have to be tackled toward the applicability of data lakes as central building blocks for research data management."},
author = {Philipp Wieder and Hendrik Nolte},
doi = {10.3389/fdata.2022.945720},
grolink = {https://resolver.sub.uni-goettingen.de/purl?gro-2/114449},
month = {01},
title = {Toward data lakes as central building blocks for data management and analysis},
type = {article},
year = {2022},
}
BibTeX: Realising Data-Centric Scientific Workflows with Provenance-Capturing on Data Lakes
@article{2_129373,
author = {Hendrik Nolte and Philipp Wieder},
grolink = {https://resolver.sub.uni-goettingen.de/purl?gro-2/129373},
month = {01},
title = {Realising Data-Centric Scientific Workflows with Provenance-Capturing on Data Lakes},
type = {article},
url = {https://publications.goettingen-research-online.de/handle/2/121151},
year = {2022},
}
BibTeX: Realising Data-Centric Scientific Workflows with Provenance-Capturing on Data Lakes
@article{2_121151,
author = {Hendrik Nolte and Philipp Wieder},
doi = {10.1162/dint_a_00141},
grolink = {https://resolver.sub.uni-goettingen.de/purl?gro-2/121151},
month = {01},
title = {Realising Data-Centric Scientific Workflows with Provenance-Capturing on Data Lakes},
type = {article},
year = {2022},
}
BibTeX: Data Depositing Services und der Text+ Datenraum
@misc{2_127235,
author = {Andreas Witt and Andreas Henrich and Jonathan Blumtritt and Christoph Draxler and Axel Herold and Marius Hug and Christoph Kudella and Peter Leinen and Philipp Wieder},
grolink = {https://resolver.sub.uni-goettingen.de/purl?gro-2/127235},
month = {01},
title = {Data Depositing Services und der Text+ Datenraum},
type = {misc},
year = {2022},
}
BibTeX: Canonical Workflow for Experimental Research
@article{2_121152,
author = {Dirk Betz and Claudia Biniossek and Christophe Blanchi and Felix Henninger and Thomas Lauer and Philipp Wieder and Peter Wittenburg and Martin Zünkeler},
grolink = {https://resolver.sub.uni-goettingen.de/purl?gro-2/121152},
month = {01},
title = {Canonical Workflow for Experimental Research},
type = {article},
year = {2022},
}
BibTeX: Sekundäre Nutzung von hausärztlichen Routinedaten ist machbar – Bericht vom RADAR Projekt
@article{2_97749,
abstract = {"Aim of the study: Real-world data from outpatient health care are difficult to obtain systematically and longitudinally in Germany. Our vision is a permanent data repository holding representative, de-identified patient and care data that are longitudinal, continuously updated, and drawn from different care providers, that can be linked with further data, for instance from patient surveys or biological research, and that are accessible to other researchers. We report methodological approaches and results from the RADAR project. Methods: Examination of the legal framework and development of prototypical technical processes and solutions, with a feasibility study to evaluate technical and content-related functionality as well as suitability for questions in health services research. Results: From 2016 onward, an interdisciplinary team of scientists developed a data protection concept for exports of care data from electronic practice management systems. A technical and organisational research infrastructure in the outpatient sector was developed and implemented in the use case 'oral anticoagulation' (OAC). In 7 general practices in Lower Saxony, 100 patients were recruited and, after informed consent, their selected treatment data, reduced to 40 relevant data fields, were extracted via the treatment data transfer interface (Behandlungsdatentransfer-Schnittstelle), separated on site into identifying and medical data, and transmitted in encrypted form to the trusted third party (Treuhandstelle, THS) and to the data holder, respectively. 75 patients who met the inclusion criteria (at least 1 year of OAC treatment) received a quality-of-life questionnaire by post via the THS. Of 66 returned questionnaires, 63 questionnaire results were linked with the treatment data in the data repository. Conclusion: The legally compliant feasibility of obtaining pseudonymised routine data from general practice with explicit informed patient consent, and of using them scientifically, including re-contacting patients and integrating questionnaire data, was demonstrated. The protection concepts of privacy by design and data minimisation (Article 25 with Recital 78 GDPR) were systematically integrated into the RADAR project and are a major reason why the proof of feasibility of legally compliant primary data collection and secondary use for research purposes succeeded. Using sufficiently anonymised yet still meaningful general practice health data without individual consent is hardly feasible within the existing legal framework in Germany."},
author = {Johannes Hauswaldt and Thomas Bahls and Arne Blumentritt and Iris Demmer and Johannes Drepper and Roland Groh and Stephanie Heinemann and Wolfgang Hoffmann and Valérie Kempter and Johannes Pung and Otto Rienhoff and Falk Schlegelmilch and Philipp Wieder and Ramin Yahyapour and Eva Hummers},
grolink = {https://resolver.sub.uni-goettingen.de/purl?gro-2/97749},
month = {01},
title = {Sekundäre Nutzung von hausärztlichen Routinedaten ist machbar – Bericht vom RADAR Projekt},
type = {article},
year = {2021},
}
BibTeX: Certification Schemes for Research Infrastructures
@misc{2_108259,
abstract = {"This working paper discusses the use and importance of various certification systems for the field of modern research infrastructures. For infrastructures such as CLARIAH-DE, reliable storage, management and dissemination of research data is an essential task. The certification of various areas, such as the technical architecture used, the work processes used or the qualification level of the staff, is an established procedure to ensure compliance with a variety of standards and quality criteria and to demonstrate the quality and reliability of an infrastructure to researchers, funders and comparable consortia. The working paper conducts this discussion based on an overview of selected certification systems that are of particular importance for CLARIAH-DE, but also for other research infrastructures. In addition to formalised certifications, the paper also addresses the areas of software-specific and self-assessment-based procedures and the different roles of the actors involved."},
address = {Göttingen},
author = {Felix Helfer and Stefan Buddenbohm and Thomas Eckart and Philipp Wieder},
grolink = {https://resolver.sub.uni-goettingen.de/purl?gro-2/108259},
month = {01},
title = {Certification Schemes for Research Infrastructures},
type = {misc},
year = {2021},
}
BibTeX: An Optimized Single Sign-On Schema for Reliable Multi-Level Security Management in Clouds
@inproceedings{2_121153,
author = {Aytaj Badirova and Shirin Dabbaghi and Faraz Fatemi-Moghaddam and Philipp Wieder and Ramin Yahyapour},
doi = {10.1109/FiCloud49777.2021.00014},
grolink = {https://resolver.sub.uni-goettingen.de/purl?gro-2/121153},
booktitle = {Proceedings of FiCloud 2021 – 8th International Conference on Future Internet of Things and Cloud},
month = {01},
title = {An Optimized Single Sign-On Schema for Reliable Multi-Level Security Management in Clouds},
type = {inproceedings},
year = {2021},
}
BibTeX: OLA-HD – Ein OCR-D-Langzeitarchiv für historische Drucke
@article{2_116509,
author = {Triet Ho Anh Doan and Zeki Mustafa Doğan and Jörg-Holger Panzer and Kristine Schima-Voigt and Philipp Wieder},
grolink = {https://resolver.sub.uni-goettingen.de/purl?gro-2/116509},
month = {01},
title = {OLA-HD – Ein OCR-D-Langzeitarchiv für historische Drucke},
type = {article},
year = {2020},
}