Image: Researchers from the GCAT project at IGTP: Dr Rafael de Cid (left), last author of the study, and Natàlia Blay (right), first author.
Credit: IGTP
A research team from the GCAT|Genomes for Life project, one of the strategic initiatives of the Germans Trias i Pujol Research Institute (IGTP), has published a study in the journal Scientific Reports proposing a method to correct selection biases in population-based cohorts. The study represents a significant step forward in improving the reliability of data derived from such studies for public health research and precision medicine.
The GCAT cohort, made up of nearly 20,000 adults from Catalonia, is a unique platform for analysing the interaction between genetic and environmental factors in the development of complex diseases. As the cohort ages, it becomes especially valuable for identifying common patterns in disease incidence. However, like other volunteer-based cohorts, it is subject to what is known as the healthy volunteer bias: an overrepresentation of individuals in better health and with more favourable socioeconomic conditions than the general population. This imbalance can compromise the validity of conclusions drawn when trying to extrapolate findings to the broader population.
The study, led by first author Natàlia Blay under the supervision of Dr Rafael de Cid, scientific director of the GCAT project, also includes the participation of Dr Conxa Violán, a researcher from the Primary Care Research Support Unit of the Northern Metropolitan Area (USR-MN). The work is part of the collaborative research group GRIMTra (Research Group on the Impact of Chronic Diseases and Their Trajectories), which is integrated into IGTP's CORE Program (Program in Public Health and Primary Healthcare). It illustrates a strategic approach to linking biomedical research with population data and everyday clinical practice.
The analysis addresses the healthy volunteer bias by comparing data from the GCAT cohort with health records and population surveys from Catalonia. Using a statistical adjustment technique known as raked weighting, applied to key variables such as age, sex, educational level, smoking habits and perceived health status, the researchers achieved a substantial reduction in bias: up to 70% in demographic variables and 26% in disease prevalence estimates. In other words, the weighting reduces the sample imbalances and improves how well the cohort data represent the general population.
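For readers unfamiliar with the technique, the following is a minimal sketch of generic raking (iterative proportional fitting), not the authors' implementation. The function names, the toy participants and the population shares are all hypothetical and chosen only for illustration. The study itself used age, sex, educational level, smoking habits and perceived health status, with targets taken from Catalan health records and population surveys.

```python
def rake_weights(sample, targets, max_iter=100, tol=1e-8):
    """Return one weight per participant so that the weighted share of each
    category matches the population share given in `targets`.

    sample  : list of dicts, one per participant (hypothetical structure)
    targets : {variable: {category: population share}}, shares sum to 1
    """
    weights = [1.0] * len(sample)
    for _ in range(max_iter):
        max_shift = 0.0
        for var, shares in targets.items():
            # Current weighted total for each category of this variable.
            totals = {}
            for row, w in zip(sample, weights):
                totals[row[var]] = totals.get(row[var], 0.0) + w
            grand = sum(totals.values())
            # Scale each category so its weighted share hits the target.
            factors = {cat: shares[cat] * grand / totals[cat] for cat in totals}
            weights = [w * factors[row[var]] for row, w in zip(sample, weights)]
            max_shift = max(max_shift, max(abs(f - 1.0) for f in factors.values()))
        # Stop once a full pass over all variables barely changes the weights.
        if max_shift < tol:
            break
    return weights


def weighted_share(sample, weights, var, cat):
    """Weighted proportion of participants whose `var` equals `cat`."""
    total = sum(weights)
    return sum(w for row, w in zip(sample, weights) if row[var] == cat) / total


# Toy cohort (made-up data): women and higher education are overrepresented.
toy_sample = (
    [{"sex": "female", "education": "higher"}] * 4
    + [{"sex": "female", "education": "lower"}] * 2
    + [{"sex": "male", "education": "higher"}] * 3
    + [{"sex": "male", "education": "lower"}] * 1
)

# Made-up population shares, standing in for census or survey margins.
toy_targets = {
    "sex": {"female": 0.5, "male": 0.5},
    "education": {"higher": 0.3, "lower": 0.7},
}

toy_weights = rake_weights(toy_sample, toy_targets)
print(weighted_share(toy_sample, toy_weights, "sex", "female"))
print(weighted_share(toy_sample, toy_weights, "education", "higher"))
```

After weighting, estimates such as disease prevalence would be computed as weighted rather than simple proportions. The sketch assumes every category in the sample also appears in the targets and that no category is empty; production tools typically add safeguards such as weight trimming.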
This approach enables more accurate and representative findings, enhancing the potential of GCAT as an infrastructure for implementing community-scale precision medicine pilots and for developing health policies grounded in real-world data.
"This work strengthens the value of the GCAT cohort as a population laboratory not only for studying disease mechanisms, but also for generating evidence that is useful for public health and for implementing community-scale precision medicine pilots", says Dr Rafael de Cid.
Journal
Scientific Reports
Method of Research
Data/statistical analysis
Subject of Research
People
Article Title
Weighting health-related estimates in the GCAT cohort and the general population of Catalonia
Article Publication Date
16-May-2025
COI Statement
The authors declare no competing interests.