Common Workflow Language (CWL): The Common Workflow Language (CWL) is an open standard for describing analysis workflows and tools in a way that makes them portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high-performance computing (HPC) environments.

Data Management Plan (DMP): A document describing the data management life cycle for the data to be collected, processed and generated by an EU Horizon 2020 project.

Digital Object Identifiers (DOI): A digital object identifier (DOI) is a persistent identifier or handle used to identify objects uniquely, standardized by the International Organization for Standardization (ISO). DOIs are widely used, mainly to identify academic, professional, and government information, such as journal articles, research reports and data sets, and official publications.

EFSA food classification system FoodEx2: A metadata format that will be used for the HEAP Consumer Cohort study.

Exposome: The exposome has been defined as the totality of exposure individuals experience over their lives and how those exposures affect health. Three exposome domains have been identified: internal, specific external and general external. Internal factors are those that are unique to the individual, and specific external factors include occupational exposures and lifestyle factors. The general external domain includes factors such as education level and financial status.

FAIR data principles: FAIR stands for Findable, Accessible, Interoperable and Reusable. A FAIR approach ensures that data are well managed and can lead to knowledge discovery and innovation, integration and reuse.

General Data Protection Regulation (GDPR): Regulation (EU) 2016/679.

Global Data Synchronisation Network (GDSN): A metadata format.

Global Product Classification (GPC): A metadata format.

Hopsworks: An open-source enterprise platform for the development and operation of machine learning (ML) pipelines at scale, allowing easy progress from data exploration and model development to running end-to-end machine learning pipelines.

Human Exposome Assessment Platform (HEAP): A scalable, technical research platform to assess the impact of the internal and external exposome on human health.

Information Commons: The HEAP Information Commons is the data repository infrastructure that provides on-demand data accessibility in line with the HEAP ethical and regulatory framework. It includes the physical resources to store and archive HEAP data and the interfaces to make these data available for analysis and future research through the HEAP Platform as a Service (PaaS).

Interoperability: Data interoperability means allowing data exchange and re-use between researchers, institutions, organizations and countries. This entails closely adhering to standards for formats that are compliant with available (open) software applications, and facilitating recombinations with different datasets from different origins.

Metagenomics: The study of genetic material recovered directly from environmental samples.

Minimum Information about Biobank Data Sharing (MIABIS) standard: A data standard available to biobanks, covering three core datasets, describing biobanks, sample collections, and studies, complete with 37 metadata attributes that enable a general standard for the integration of biobank databases.

Minimum Information about a Genome Sequence (MIGS): A metadata category used for metagenomics data analysis, assigned during data sampling.

Minimum Information about a Metagenome Sequence (MIMS): A metadata category used for metagenomics data analysis, assigned during data sampling.

Minimum Information about a Marker Gene Sequence (MIMARKS): A metadata category used for metagenomics data analysis, assigned during data sampling.

Minimum Information about any (x) sequence (MIxS): A metadata category used for metagenomics data analysis, assigned during data sampling.

MIRACOLIX: A German medical informatics platform.

Nagoya Protocol on access and benefit sharing: A 2010 supplementary agreement to the 1992 Convention on Biological Diversity (CBD). Its aim is the implementation of one of the three objectives of the CBD: the fair and equitable sharing of benefits arising out of the use of genetic resources, thereby enhancing species conservation and biodiversity.

Observational Health Data Sciences and Informatics (OHDSI): The OHDSI (pronounced “Odyssey”) programme is a multi-stakeholder, interdisciplinary collaborative to bring out the value of health data through large-scale analytics.

Observational Medical Outcomes Partnership Common Data Model (OMOP-CDM): The OMOP-CDM is becoming the de facto standard for observational health data, especially in medication-related studies, in the USA, and it is also being adopted in several European and Asian research networks.

Open source: The open-source model is a decentralized software development model that encourages open collaboration, meaning “any system of innovation or production that relies on goal-oriented yet loosely coordinated participants who interact to create a product (or service) of economic value, which they make available to contributors and non-contributors alike.” A main principle of the open-source model is peer production, with products such as source code, blueprints, and documentation freely available to the public.

Personal Identification Code (PIC): In Denmark, the Personal Identification Code is used in dealings with public agencies, from health care to the tax authorities. It is also used as a customer number in banks and insurance companies. People must be registered with a PIC number if they reside in Denmark, if they own property or if they pay tax.

Platform as a Service (PaaS): The HEAP PaaS, which is based on Hopsworks, includes the Data and Feature Warehouse and Knowledge Engine.

Resource Description Framework (RDF): A method for conceptual description or modelling of information that is implemented in web resources, using a variety of syntax notations and data serialization formats. It is also used in knowledge management applications.

SPARQL system: A standardized language for querying Resource Description Framework (RDF) data, including linked data.
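
As a rough illustration of the two entries above, RDF data can be thought of as a set of (subject, predicate, object) triples, and a SPARQL query as a pattern in which some positions are variables. The sketch below is a plain-Python toy, not an RDF library or a SPARQL engine; the `https://example.org/` URIs, the `collectedIn` and `hasTitle` predicates and the `match` helper are all hypothetical names invented for this example.

```python
# Toy triple store: each entry is (subject, predicate, object).
# All URIs and predicate names are hypothetical, for illustration only.
EX = "https://example.org/"
triples = {
    (EX + "sample/1", EX + "collectedIn", EX + "study/A"),
    (EX + "sample/2", EX + "collectedIn", EX + "study/A"),
    (EX + "sample/3", EX + "collectedIn", EX + "study/B"),
    (EX + "study/A", EX + "hasTitle", "Study A"),
}


def match(pattern, store):
    """Return the triples that fit the pattern, sorted.

    None in a position acts as a wildcard, loosely like a SPARQL variable.
    """
    return sorted(
        t for t in store
        if all(p is None or p == v for p, v in zip(pattern, t))
    )


# Loosely analogous to:
#   SELECT ?s WHERE { ?s ex:collectedIn <https://example.org/study/A> }
samples_in_a = [s for s, _, _ in match((None, EX + "collectedIn", EX + "study/A"), triples)]
print(samples_in_a)
```

A real deployment would use an RDF store and a SPARQL endpoint rather than an in-memory set; the point here is only the triple-and-pattern shape of the data model.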

Uniform Resource Identifier (URI): A sequence of characters that identifies a logical or physical resource to facilitate links between data. Examples of resources include electronic documents, elevator door sensors, XML namespaces, webpages and identification microchips for pets.
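
To make the structure of a URI concrete, the following minimal sketch splits a few URIs into their generic components using Python's standard `urllib.parse.urlsplit`. The URIs and the `describe` helper are hypothetical and chosen only for illustration; they do not refer to any actual HEAP resource.

```python
from urllib.parse import urlsplit

# Hypothetical URIs, for illustration only.
uris = [
    "https://example.org/datasets/cohort-1#sample-42",
    "urn:isbn:0451450523",
    "mailto:someone@example.org",
]


def describe(uri):
    """Split a URI into scheme, authority, path and fragment."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
        "fragment": parts.fragment,
    }


for uri in uris:
    print(describe(uri))
```

Only the first URI has an authority component (a host name); the `urn:` and `mailto:` examples identify a resource without saying where to retrieve it.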

