Distributed Privacy-Preserving Data Analysis in NFDI4Health With the Personal Health Train





Research Data Infrastructure, NFDI4Health, distributed data analytics, personal health train


Data sharing is often met with resistance in medicine and healthcare because health data are sensitive and heterogeneous. The lack of standardization and semantics further exacerbates the problems of data fragmentation and data silos, which makes data analytics challenging. NFDI4Health aims to develop a data infrastructure for personalized medicine and health research and to make data generated in clinical trials, epidemiological studies, and public health studies FAIR (Findable, Accessible, Interoperable, and Reusable). Since this research data infrastructure is distributed over various partners contributing their data, the Personal Health Train (PHT) complements it by providing the required analytics infrastructure, one that accounts for the distributed nature of the data collections. Our research has demonstrated the capability of conducting data analysis on sensitive data in various formats distributed across multiple institutions and has shown great potential to facilitate medical and health research.








How to Cite

Mou, Y., Li, F., Weber, S., Haneef, S., Meine, H., Caldeira, L., … Kirsten, T. (2023). Distributed Privacy-Preserving Data Analysis in NFDI4Health With the Personal Health Train. Proceedings of the Conference on Research Data Infrastructure, 1. https://doi.org/10.52825/cordi.v1i.282
Received 2023-04-26
Accepted 2023-06-29
Published 2023-09-07

Funding data