Vision

Population-based epidemiological cohort studies, such as the Cooperative Health Research in South Tyrol (CHRIS) study, provide a comprehensive overview of the health of the general population, capturing both the normal physiological range and a spectrum of pathological states. Integrating molecular profiling data with clinical and lifestyle metadata holds significant potential for elucidating the physiological changes and underlying mechanisms associated with early disease onset. However, even as the number of cohort studies grows and increasingly sophisticated data are collected, exploring and analyzing these datasets across multiple dimensions remains challenging, even for expert data scientists. Network-based approaches are well suited to this task: they integrate multi-level biomedical data into knowledge graphs that capture system-level and functionally relevant interactions.

In response to these advancements and challenges, we are developing two platforms for the dynamic exploration of population cohorts through multi-level network medicine approaches.

The DyHealthNet light platform supports the exploration of GWAS summary statistics, offering filtering, annotation, and aggregation of the data. Furthermore, by transforming variant-level results to the gene level, the platform enables the application of well-known algorithms and tools from network biology, for example for disease module discovery or functional enrichment analysis.
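To make the variant-to-gene step more concrete, the following is a minimal sketch of one possible aggregation: taking the smallest p-value among the variants mapped to a gene and applying a Bonferroni correction for the number of variants. This is an illustrative assumption, not necessarily the method used by the platform; the function name, variant IDs, gene names, and p-values are all hypothetical.

```python
from collections import defaultdict


def gene_level_pvalues(variant_pvalues, variant_to_gene):
    """Collapse variant-level p-values into one score per gene.

    Illustrative choice only: minimum p-value per gene, Bonferroni-corrected
    for the number of variants mapped to that gene and capped at 1.0.
    """
    per_gene = defaultdict(list)
    for variant, p in variant_pvalues.items():
        gene = variant_to_gene.get(variant)
        if gene is not None:  # variants without a gene annotation are skipped
            per_gene[gene].append(p)
    return {gene: min(1.0, min(ps) * len(ps)) for gene, ps in per_gene.items()}


# Hypothetical toy input; identifiers and values are made up.
variant_pvalues = {"rs1": 1e-8, "rs2": 0.03, "rs3": 0.2, "rs4": 0.5}
variant_to_gene = {"rs1": "GENE_A", "rs2": "GENE_A", "rs3": "GENE_B"}

gene_scores = gene_level_pvalues(variant_pvalues, variant_to_gene)
```

Gene-level scores of this kind can then serve as input to network biology tools that operate on genes rather than on individual variants.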

The second platform, DyHealthNet, integrates heterogeneous molecular and baseline metadata into a unified multi-omics network framework to elucidate complex associations among the different data types. The GWAS summary statistics functionality of the light version will also be integrated. If certain data layers are missing from a given dataset, this can be specified during platform setup, and all calculations and analyses will then be applied only to the data that are actually present. Users do not need to upload their data to a public server, since each user can set up their own instance of both platforms on the local computing facilities where the data are stored. This way, privacy-sensitive data remain in a secure and protected environment by “having the platform come to the data”.

The DyHealthNet platform supports both global exploratory analyses across the entire cohort data and dynamic computation of associations within user-defined population subsets, enabling the identification of subgroup-specific molecular mechanisms. Such subsets can be defined via logical combinations of conditions on omics variables (especially phenotypes) or on the presence or absence of genetic variants. For example, a user interested in the mechanisms underlying pain sensitivity might restrict the cohort to participants over 50 years old with a pressure pain sensitivity above a certain threshold, or to participants who carry a genetic variant known to be associated with pain sensitivity.
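The subset definition described above can be sketched as composable predicates over participant records. This is only an illustration of the idea of logical combinations, not the platform's actual interface; the field names, the threshold value of 4.0, and the variant ID `rs_example` are hypothetical.

```python
def both(a, b):
    """Logical AND of two participant predicates."""
    return lambda p: a(p) and b(p)


def either(a, b):
    """Logical OR of two participant predicates."""
    return lambda p: a(p) or b(p)


def select(participants, predicate):
    """Return the IDs of all participants satisfying the predicate."""
    return [p["id"] for p in participants if predicate(p)]


# Hypothetical toy cohort; all values are made up.
cohort = [
    {"id": "P1", "age": 62, "pressure_pain": 5.1, "variants": {"rs_example"}},
    {"id": "P2", "age": 45, "pressure_pain": 6.0, "variants": set()},
    {"id": "P3", "age": 70, "pressure_pain": 2.0, "variants": {"rs_example"}},
    {"id": "P4", "age": 55, "pressure_pain": 4.5, "variants": set()},
]

# Elementary conditions (threshold and variant ID are assumptions).
over_50 = lambda p: p["age"] > 50
pain_sensitive = lambda p: p["pressure_pain"] > 4.0
carrier = lambda p: "rs_example" in p["variants"]

older_and_sensitive = select(cohort, both(over_50, pain_sensitive))
variant_carriers = select(cohort, carrier)
combined = select(cohort, either(both(over_50, pain_sensitive), carrier))
```

Because each condition is an ordinary function, phenotype-based and genotype-based criteria can be nested to arbitrary depth without changing the selection logic.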

The cohort-specific multi-omics association network will be extended with external resources from public databases for the different omics layers, in order to “embed” the cohort-specific results in existing knowledge. Lastly, the platform’s differential context comparison mode will enable differential network analyses between the knowledge graphs of two subgroups. Returning to the pain sensitivity example, this functionality will allow the user to mine for subnetworks that differ between, for instance, the pain sensitivity knowledge graphs of men and women. In this way, we hope to uncover mechanisms and pathways of variables that, acting together, lead to significant functional differences, such as those in pain sensitivity between men and women.
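As a rough illustration of what a differential comparison between two subgroup networks could look like, the sketch below represents each network as a mapping from undirected edges to association strengths and keeps the edges whose strength differs by at least a chosen margin. This is a deliberately simple edge-wise comparison under stated assumptions, not the platform's algorithm; the node names, weights, and the `min_delta` parameter are hypothetical.

```python
def edge(u, v):
    """Canonical key for an undirected edge."""
    return tuple(sorted((u, v)))


def differential_edges(net_a, net_b, min_delta=0.3):
    """Edges whose association strength differs by at least min_delta.

    An edge absent from one network is treated as having strength 0.0.
    Returned values are strength in net_a minus strength in net_b.
    """
    diff = {}
    for e in net_a.keys() | net_b.keys():
        delta = net_a.get(e, 0.0) - net_b.get(e, 0.0)
        if abs(delta) >= min_delta:
            diff[e] = delta
    return diff


# Hypothetical subgroup networks; nodes and weights are made up.
men = {
    edge("geneX", "metaboliteY"): 0.8,
    edge("geneX", "proteinZ"): 0.5,
}
women = {
    edge("geneX", "metaboliteY"): 0.2,
    edge("geneX", "proteinZ"): 0.45,
    edge("proteinZ", "phenoP"): 0.6,
}

diff = differential_edges(men, women)
```

The connected components of the resulting edge set could serve as candidate differential subnetworks, although a real analysis would also need to account for statistical uncertainty in the edge weights.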

Both platforms are designed to be modular, configurable via cohort-specific configurations, and easily deployable. Thus, DyHealthNet makes disease association mining and exploratory analysis of population cohort data more broadly accessible, even to researchers without programming expertise.