Data Linkage for Researchers
The data linkage programme enhances the research value of TwinsUK by integrating routinely collected health records with the detailed phenotypic and genotypic data available in the TwinsUK Biobank.
Researchers can access these linked datasets through the King’s College London Trusted Research Environment (TRE), enabling longitudinal analyses of healthcare use, diagnoses, and outcomes. Linked electronic health records are currently available for over 13,000 of the 16,000 participants in the cohort.

Hospital Episode Statistics – Admitted Patient Care (HES APC)
Captures patients admitted for treatment at NHS hospitals in England, whether as day cases or overnight stays, and whether as emergency or planned admissions. Each record represents one episode of care. This dataset supports analyses of hospitalisation patterns, long-term outcomes, and healthcare needs over time.
Hospital Episode Statistics – Outpatient (HES OP)
Captures outpatient appointments in English NHS hospitals and NHS England commissioned outpatient appointments in the independent sector. Researchers can explore patterns in outpatient service utilisation, care pathways, and follow-up trends.
Hospital Episode Statistics – Accident & Emergency (HES AE)
Provides data on attendances at Accident & Emergency departments. Useful for examining emergency care usage, acute health events, and healthcare-seeking behaviour.
Hospital Episode Statistics – Critical Care (HES CC)
Focuses on episodes of care in intensive and high-dependency units. This dataset enables investigation into critical illness, recovery trajectories, and severe health outcomes.
Emergency Care Dataset (ECDS)
A newer and more detailed dataset on emergency care attendances, offering enhanced insight into the nature of visits to emergency departments from April 2017 onwards. It supports analyses of acute care pathways and system demand.
Mental Health Minimum Dataset (MHMDS)
Includes information on specialist mental health service contacts and care. Suitable for studying mental health service use, psychiatric comorbidities, and treatment pathways.
Cancer Registration
The National Cancer Registration dataset is the population-based cancer registry for England. It collects, quality assures and analyses data on all people living in England who are diagnosed with malignant and pre-malignant neoplasms, with national coverage since 1971.
Mortality (Office for National Statistics)
The Civil Registration of Deaths dataset contains details of all registered deaths in England and Wales.
Working with Linked Data
Researchers can access the linked datasets in combination with TwinsUK phenotypic, genotypic, and omics data, allowing for integrated and longitudinal analyses. All linked data work must take place within the secure TwinsUK Trusted Research Environment (TRE), where outputs are disclosure-checked prior to release. Support is available for navigating and understanding these datasets, including metadata, cohort-specific documentation, and user guides.
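As an illustration of the kind of integration described above, the following is a minimal sketch in Python, using only the standard library. It assumes hypothetical field names (`participant_id`, `episode_start`, `bmi`) and made-up records; these are not the actual TwinsUK or HES variable names, which are described in the dataset documentation. It counts admitted patient care episodes per participant and attaches the count to each phenotype record.

```python
from collections import Counter

# Made-up example records; field names are assumptions for illustration only.
# Each episode row stands for one HES APC record (one episode of care).
episodes = [
    {"participant_id": "P001", "episode_start": "2015-03-02"},
    {"participant_id": "P001", "episode_start": "2018-11-20"},
    {"participant_id": "P003", "episode_start": "2020-01-15"},
]

# Hypothetical phenotype records, one row per participant.
phenotypes = [
    {"participant_id": "P001", "bmi": 24.1},
    {"participant_id": "P002", "bmi": 27.8},
    {"participant_id": "P003", "bmi": 22.5},
]


def attach_episode_counts(phenotypes, episodes):
    """Return a copy of each phenotype row with an added n_episodes field."""
    # Count how many episode rows exist for each participant identifier.
    counts = Counter(e["participant_id"] for e in episodes)
    # Participants with no episode rows receive a count of 0.
    return [
        {**p, "n_episodes": counts.get(p["participant_id"], 0)}
        for p in phenotypes
    ]


linked = attach_episode_counts(phenotypes, episodes)
for row in linked:
    print(row)
```

In real analyses, a count of zero is only interpretable for participants whose records have been linked, so a sketch like this would need to be restricted to the linked subset of the cohort first.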
Access Process
To access linked data through TwinsUK:
- Review Available Data – Visit the TwinsUK website to explore all phenotypic and omics data available. (https://twinsuk.ac.uk/researchers/explore-our-data-and-samples/)
- Submit a Proposal – Researchers must submit a data access request, which includes ethical approvals and project details. More information is available on the Accessing TwinsUK data and samples page. (https://twinsuk.ac.uk/researchers/access-data-and-samples/request-access/)
- Work Within the TRE – Once approved, projects are provisioned in a secure environment. Remote access allows researchers to analyse the data using standard tools (e.g. R, Stata, Python).
We encourage researchers to get in touch early to ensure their proposed use of linked data aligns with available resources.
Questions
If you have questions about data linkage, dataset structure, or working within the TRE, please contact twinsuk@kcl.ac.uk.