The "FAIR data toolkit"

Learning resources for FAIR data

Meet HEAP researchers, FAIR frog and Data Gator in the “Swamped?” video series, and discover FAIR data through practical examples, tips, tricks, and a few common mishaps.  

An overview of BIBBOX, a platform that supports researchers in publishing their datasets in a FAIR manner, published in the journal New Biotechnology in November 2023 by the Work Package 7 team from the Medical University of Graz.

Tools for FAIR data

FAIR data self-assessment service

The Self Assessment Service (SAS) is an online questionnaire that helps researchers assess how Findable, Accessible, Interoperable and Reusable (FAIR) their data are.


FAIR toolbox

A modular, component-based toolkit for life sciences. Open-source software is combined with ID and user management to record data provenance graphs, so that research workflows can be implemented in a FAIR manner.

More tools from HEAP

Hopsworks exposome Platform as a Service

PaaS for distributed management and analysis of exposome data. GDPR compliant, sharable data, bioinformatics tools and analysis pipelines. Provides Machine Learning out of the box.

How will the HEAP informatics platform help me as a researcher? HEAP software developer Alex explains the key features of HEAP in this 4-minute video. 

Exposome Toolbox

The HEAP tools are part of EHEN's virtual exposome toolbox, which is a signposting hub for the tools developed by the European Human Exposome Network projects. They include data models, guidelines and protocols,

Scientific publications from HEAP

Project profile paper

Published in the journal Environmental Epidemiology in December 2021, an overview of the HEAP platform as a research resource for the integrated and efficient management and analysis of human exposome data.


Published in the International Journal of Cancer in February 2023 by the Work Package 8 team at the University of Innsbruck. 

Using a DNA methylation signature to investigate changes in the epigenome caused by HPV.

Published in the journal Genome Biology in February 2022 by the Work Package 8 team at the University of Innsbruck. 

The findings imply that there are multiple epigenetic clocks, many of which are tissue-specific, and that the differential tick rate between these clocks may be an informative surrogate measure of disease risk.

Ethics and regulations

Policy comment 

Published in the journal Health Policy in June 2023, policy recommendations to address the perceived risks of the proposed European Health Data Space (EHDS). The authors included a legal and ethical expert from the HEAP Ethics and Regulations Work Package team.

GDPR and joint controllership

Published in Open Research Europe in June 2022 by the Work Package 2 team at MLCF Foundation (now Lygature). 

Proposal for a ‘funnel-and-sieves’ model for deciding who is, and who is not, a joint controller under the GDPR for data processing in research consortia.

Medical ethics

Published in the journal Computer in July 2021 by the Work Package 7 team at the Medical University of Graz and the Work Package 2 team at MLCF Foundation (now Lygature). 

Practical guidelines for those applying artificial intelligence, offering a concise checklist for a wide group of stakeholders.

Artificial Intelligence (AI)

Published in the journal New Biotechnology in May 2023 as part of a special issue entitled Artificial Intelligence for Life Sciences.

An overview of open research issues and challenges relating to biotechnology and Artificial Intelligence.

Published in Explainable AI: Foundations, Methodologies and Applications in October 2022. Authors included HEAP researchers at the Medical University of Graz (Data interoperability and sharing Work Package). 

Book chapter demonstrating the crucial role that human-AI interfaces play in conveying the trustworthiness of AI solutions to their users.

Published in the journal New Biotechnology in September 2022 by the Work Package 7 team at the Medical University of Graz. 

Description of concepts and examples of how explainability and causability are essential to demonstrate scientific validity, as well as analytical and clinical performance for future AI-based IVDs.

Published in the journal Computer in February 2022 by the Work Package 7 team at the Medical University of Graz. 

An argument for using causability in medical artificial intelligence (AI) to develop and evaluate future human–AI interfaces.

Published in the journal IEEE in February 2022 by the Work Package 7 team at the Medical University of Graz. 

A 5-step approach for developing personas to support human-centered design of AI applications, with practical examples from persona development for AI solutions in digital pathology.

Cohort data

Community-randomised HPV vaccination trial

Published in the journal Cell Host & Microbe in November 2023 by the Large sample Cohorts Work Package team at Karolinska Institutet and the University of Oulu. 

The study shows that in the years following vaccination, cancer-causing HPVs are replaced by vaccine-untargeted HPV types with low or no risk for cancer. This effect was most pronounced in communities where gender-neutral vaccination campaigns had been carried out.

Consumer Purchase Data (CPD)

Published in the journal Nature Scientific Reports in December 2023 by the Consumer Exposure Monitoring System Work Package team at Statens Serum Institut.

Description of the My Purchases cohort, a web-app-enabled, prospective collection of CPD covering several large retail chains in Denmark that enables linkage to health outcomes. Combined with extensive product databases and health outcomes, CPD could provide the basis for extensive investigations of how what we buy affects our health.

Consumer Purchase Data (CPD)

Published in the journal BMJ Open in June 2022 by the Consumer Exposure Monitoring System Work Package team at Statens Serum Institut.

Protocol for a population-based inception cohort study aiming to investigate the underlying mechanisms for the heterogeneous course of IBD, including need for, and response to, treatment. Environmental factors and quality of life will be assessed using questionnaires and, when available, automatic registration of purchase data.


Published in the journal Cancer Medicine in 2023. 

Analysis of bacterial and viral communities in colorectal cancer using non-targeted deep-sequencing strategies enabling full microbiome characterisation up to species level.

Published in the journal Nature Scientific Reports in July 2022 by the Large sample Cohorts Work Package team at Karolinska Institutet.

Description of “HPV-meta”, the first open-source pipeline aiming to specifically detect HPV transcripts in RNA sequencing data.

Published by the Multidisciplinary Digital Publishing Institute (MDPI) in 2022.

Identification of a microbiome signature for predicting the risk of colorectal cancer, which was validated in a new study, Colorectal Cancer Screening (COLSCREEN).


Published by Springer Nature Switzerland in March 2022 by the Data Interoperability and Sharing Work Package team at the Medical University of Graz. 

Standards and tools for biospecimen quality management must be democratised for biorepositories in a variety of settings to have a truly global impact on research.


Education and outreach

Published by Open Research Europe in February 2023 by the Education and Dissemination Work Package team at the International Agency for Research on Cancer (IARC). 

The exposome is a broad and recent concept that is challenging to define in a structured way. Personas have been used in computer science to improve our understanding of human-computer interaction. Using personas specific to exposome research is a useful way of supporting education activities for this complex scientific field.

Project reports and public deliverables

This document defines the life cycle and governance framework for all data to be collected, processed and generated during the HEAP project. 

This document provides guidance for the development and use of the Reference Architecture within the Human Exposome Assessment Platform (HEAP). It describes the technical architecture, data and metadata flow and means for accessing data in the platform.

The HEAP project reports are all available on Zenodo...

  • Trier Møller, Frederik, Ewes, Caroline, Wilkowski, Bartlomiej, Chong, Steven, Grønborg Junker, Thor. (December, 2022). Consumer cohort - secure platform and recruitment. Public deliverable 4.1. Zenodo.
  • Coombs, Heather, Kozlakidis, Zisis, Berger, Anouk. (December, 2022). Knowledge and Information - Phase 1 report. Public deliverable 11.3. Zenodo.
  • Ormenisan, Alex, Mukhedkar, Dhananjay, Arroyo, Sara, Pimenoff, Ville, Zhang, Allison, Trier Møller, Frederik, Herzog, Chiara, Bala, Piotr. (December, 2022). HEAP Knowledge Engine. Public deliverable 6.2. Zenodo.
  • Coombs, Heather, Berger, Anouk, Kozlakidis, Zisis. (December, 2021). Knowledge and Information Sharing Plan. Public deliverable 11.2. Zenodo.
  • Herzog, Chiara, Widschwendter, Martin. (May, 2022). Ageing DNAme signatures. Public deliverable 8.2. Zenodo.
  • Herzog, Chiara, Vavourakis, Charlotte, Widschwendter, Martin. (October, 2021). Epigenomics Analysis. Public deliverable 8.1. Zenodo.
  • Dowling, Jim, Ormenisan, Alex, Negru, Stefan, Merino, Roxana, Muller, Heimo. (June, 2021). HEAP Informatics Platform and Knowledge Engine. Public deliverable 6.1. Zenodo.
  • Muller, Heimo, Nitsche, Patrick, Kipperer, Bettina, Jungwirth, Emilian, Reihs, Robert. (January, 2022). FAIR toolbox and Bring Your Own Data workshops. Public deliverable 7.2. Zenodo.
  • Groos, Daniel, Boeckhout, Martin, Besic, Alma, Van Veen, Evert-Ben. (August, 2022). Report on consumer receipts data and sample-derived data in exposome research. Public deliverable 2.2. Zenodo.
  • Boeckhout, Martin, Van Veen, Evert-Ben. (November, 2022). Report on databases mapped - Human Exposome Assessment Platform (HEAP). Public deliverable 2.1. Zenodo.