Routinely collected health data to study inherited heart disease: a systematic review (2000–2016)
  1. Bianca Blanch1,2,
  2. Joanna Sweeting1,2,
  3. Christopher Semsarian1,2,3 and
  4. Jodie Ingles1,2,3
  1. 1Agnes Ginges Centre for Molecular Cardiology, Centenary Institute, Sydney, New South Wales, Australia
  2. 2Sydney Medical School, University of Sydney, Sydney, New South Wales, Australia
  3. 3Department of Cardiology, Royal Prince Alfred Hospital, Camperdown, New South Wales, Australia
  1. Correspondence to Dr Jodie Ingles; j.ingles{at}


Objective Our understanding of inherited heart disease is predominantly based on retrospective specialised clinic cohorts, which have inherent selection bias. Population-based routinely collected data can provide insight into unbiased, large-scale patterns of treatment and care but may be limited by the granularity of clinical information available. We sought to synthesise the global literature to determine whether we can identify patients with inherited heart diseases using routinely collected health data.

Methods Medline, Embase, CINAHL, PreMEDLINE and Google Scholar citation databases were searched for relevant articles published between 1 January 2000 and 31 October 2016.

Results A total of 5641 titles/abstracts were screened and 46 full-text articles were retrieved. Twelve peer-reviewed, English-language manuscripts met our inclusion criteria. Studies predominantly focused on Marfan syndrome (41%) or hypertrophic cardiomyopathy (29%). All studies used International Classification of Disease diagnosis codes to define inherited heart disease populations; three studies also used procedure codes. Nine of the 17 definitions for inherited heart disease were repeated across studies.

Conclusions Inherited heart disease populations can be identified using routinely collected health data, though challenges relate to existing diagnosis codes. This is an underutilised resource with the potential to inform patterns of care, patient outcomes and overall disease burden.

  • cardiomyopathy hypertrophic
  • marfans
  • arrhythmogenic right ventricular dysplasia

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:



Inherited heart diseases include the genetic cardiomyopathies, primary arrhythmogenic disorders and familial connective tissue and valve diseases. The prevalence of inherited heart diseases ranges from 1 in 200 to 1 in 10 000 in the general population1 2 and most are characterised by marked clinical and genetic heterogeneity.3 A diagnosis is contingent on clinical investigations with a cardiologist; a detailed three-generation family history noting any sudden deaths or heart disease; and in some cases, genetic testing.4 Clinical and family management of inherited heart diseases is based on published guidelines.3 5–7 However, to date, much of this evidence is derived from clinical trials and patient registries, which have inherent selection biases.4

Routinely collected data are obtained for administrative and clinical purposes without any specific a priori research goals.8 These health data are linked, person-level and longitudinal, providing researchers with the opportunity to examine one patient’s interactions with the healthcare system, for any medical reason, over the entire observation period. Routinely collected health data are also population-based, meaning they capture information for all persons receiving treatment. These strengths are particularly noteworthy when investigating rare diseases, as they may be more effective at identifying a representative sample, which increases the generalisability of findings and reduces selection biases. However, routinely collected health data typically include only a limited amount of clinical information so it is unclear whether these data alone can be used to accurately identify an inherited heart disease population. We sought to conduct a global systematic review to examine current methods of identifying patients with inherited heart diseases using routinely collected health data exclusively.


Eligible studies

The review included English-language, peer-reviewed articles, published between 1 January 2000 and 31 October 2016, which identified patients with an inherited heart disease using routinely collected health data exclusively. We included the following inherited heart diseases: arrhythmogenic right ventricular cardiomyopathy; bicuspid aortic valve disease; Brugada syndrome; catecholaminergic polymorphic ventricular tachycardia; familial dilated cardiomyopathy; familial hypercholesterolaemia; familial restrictive cardiomyopathy; hypertrophic cardiomyopathy; long QT syndrome; left ventricular non-compaction and Marfan syndrome. The RECORD (REporting of studies Conducted using Observational Routinely-collected health Data) definition of routinely collected health data was used: ‘data obtained for administrative and clinical purposes without specific a priori research goals’.8 All grey literature (research that is unpublished or published in a non-commercial form), government reports, case reports/studies, editorials, commentaries, letters, conference abstracts, protocols and review articles were excluded.

Search strategy

On 31 October 2016, four bibliographic databases (Medline, PreMEDLINE, EMBASE and Cumulative Index to Nursing and Allied Health Literature [CINAHL]) were searched, combining subject headings and keywords to capture relevant studies. Search terms included those related to data type (eg, hospitalisation, medical records); methodological design (eg, cohort studies, epidemiological methods) and heart disease (eg, hypertrophic cardiomyopathy, Marfan syndrome). Online supplementary table 1A–D outlines the full search strategy. Back references and citing articles (via Google Scholar) of all manuscripts included in this review were searched to identify additional relevant articles.

Supplementary file 1

Abstracts and titles of all articles were screened (BB) to identify potentially relevant studies. Two reviewers (BB and JS) independently assessed each article based on a 5-item tool specifying the eligibility criteria (online supplementary figure 1). A third reviewer (JI) independently assessed any article for which consensus was not reached (2% of articles).

Data extraction

The following is reported for each article:

  • Study details: first author surname; year of publication; publishing journal; funding source; setting; data source(s); data coverage; observation period and study objectives relating to inherited heart disease(s). We also calculated the publication lag (year of publication – final year of observation).

  • Medical condition of interest: inherited heart disease(s) studied and their definition using routinely collected health data, such as International Classification of Diseases (ICD) code,9 and whether any family members of affected individuals were identified.

  • Cohort details: cohort eligibility criteria; number of cohorts; cohort demographics including number of persons, age (median with interquartile range [IQR] or mean with standard deviation [SD]), proportion of women; and where relevant, variables used to match cohorts.

  • Outcome measures: any outcome measure related to an inherited heart disease cohort. Summary statistics are shown for statistically significant results, and summaries of non-statistically significant results are provided. In studies where no statistical analyses were performed, we report the results of relevant outcome measures. We assigned each outcome measure a theme: resource utilisation (eg, length of hospital stay, use of intensive care unit); mortality; costs (eg, hospitalisation, burden to society); occurrence of other disease/condition (eg, hypertension, cardiac condition); in-hospital procedures/intervention (eg, aortoiliac dissection, cardiovascular intervention); post-event outcomes or complications (eg, post-procedural complications, aortic repair) and prescription drug use (eg, beta-blockers).

  • Summary statistics: numbers with percentages or other reported statistics (eg, mean with SD; median with IQR or range; odds ratio with confidence intervals; prevalence of condition) for relevant outcome measures. Where possible, we calculated the number and/or proportion if not provided in the original study.

  • Any other relevant findings or conclusions. 

  • Comprehensiveness of reporting (BB only): we scored each manuscript against the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) checklist.10 11 The STROBE checklist comprises 22 items relating to the features of the scientific process that should be included in an accurate and complete report of an observational study.

Online supplementary figure 2 features the main data extraction tool.


The study objectives and outcome measures varied across reviewed manuscripts. Therefore, it was not appropriate to use traditional meta-analysis approaches to pool individual study results. A descriptive analysis, with details of the key findings of individual studies, was provided. Summaries of study features are shown in the tables and figures. The review is consistent with A Measurement Tool to Assess Systematic Reviews (AMSTAR) and Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines12 13 (online supplementary table 2A,B).


Studies identified

The titles/abstracts of 5 641 articles were screened, and 46 full-text manuscripts were retrieved and reviewed. Twelve manuscripts met the eligibility criteria; nine were identified from database searches and three from back reference/citing article search (figure 1). For a list of all included and excluded studies, see online supplementary tables 3 and 4, respectively.

Figure 1

Flow chart of study identification and selection.

Study features

The studies were set in the USA (seven studies), Germany (two studies), Taiwan (one study) and Hong Kong (one study); one study did not specify the setting (table 1). Eight articles (67%) were published between 2013 and 2016. For the 11 studies (92%) that reported an observation period, the median was 8 years (range: 3–14 years) and the median publication lag was 4 years (range: 2–6 years). One study used two data sources; the remaining 11 used one data source. Hospitalisation data were the most commonly used data source (eight studies [62%]). One study used medical records from two urban hospitals14; one study used state-based data15 and the remaining 10 studies (83%) used a national dataset that covered 1%,16 9%,17 18 20%19–22 or 100%23 24 of the population. Only one-quarter of studies reported a funding source, most commonly research/government grants (three studies). One study defined six inherited heart diseases; the other 11 studies focused on one disease, either Marfan syndrome (seven studies [41%]) or hypertrophic cardiomyopathy (four studies [29%]) (table 1).

Table 1

Characteristics of included studies focusing on inherited heart disease (N=12)

Cohort characteristics

Nine studies (75%) reported a cohort size. Our review included 17 269 persons with an inherited heart disease; 5 794 persons with Marfan syndrome (range: 12–2 329) and 11 475 persons with hypertrophic cardiomyopathy (range: 227–11 248). Two studies reported the number of procedures rather than people (total=1023, range: 358–665). One study did not report a sample size (online supplementary table 5).

Nine studies (75%) reported the mean age of an inherited heart disease cohort (range: 12–68 years). Marfan syndrome cohorts were younger than hypertrophic cardiomyopathy cohorts (range: 12–49 vs 57–68, respectively). Eleven studies (92%) reported cohort sex demographics. Of these, two studies (17%) restricted their cohort to women. In the remaining nine studies, hypertrophic cardiomyopathy cohorts (four studies) were predominantly women (range: 53%–62%), whereas Marfan syndrome cohorts (five studies) were predominantly men (female proportion range: 37%–43%) (online supplementary table 5).

Three studies (25%) reported a control cohort including 7 095 507 persons (range: 554–7 094 061).16 17 19 Two studies matched controls to the inherited heart disease population using multiple variables including sex, age, comorbidities, pharmaceutical claims and/or year of surgical procedure.16 17

None of the studies used routinely collected data to identify family members of patients with an inherited heart disease. One study reported maternal and fetal outcomes of pregnant women with Marfan syndrome. Compared with controls, women with Marfan syndrome required significantly more medical interventions during the birth, their fetuses were smaller and they had a higher frequency of pre-term birth.19

Definition of inherited heart disease using routinely collected health data

Seven of the 12 inherited heart diseases included in our search strategy were defined in at least one study (table 2). All studies specified ICD-9 or ICD-10 diagnosis codes to define an inherited heart disease population; three studies also used procedure codes.20–22 Across all 12 studies, there were 17 definitions of inherited heart disease. Of these, nine definitions were used in at least two studies or conditions.

Table 2

Definitions of identifying inherited heart disease using routinely collected health data

Marfan syndrome

Two of the seven Marfan syndrome studies did not further detail relevant ICD-9 code(s).14 23 The remaining five studies had a consistent definition of Marfan syndrome based on the coding system used by the routinely collected data source. Specifically, the two German studies used the ICD-10-GM (German Modification) diagnosis code Q87.4, given as either an in-hospital diagnosis or two outpatient diagnoses within 6 months.17 18 The three studies using ICD-9-CM (Clinical Modification) all used the diagnosis code 759.82.19 24 25 One of these studies required the diagnosis code to be either the primary or secondary diagnosis.
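As a rough illustration, the German case definition described above (Q87.4 as an in-hospital diagnosis, or two outpatient diagnoses within 6 months) could be applied to one patient's claims records along the following lines. This is a minimal sketch only: the record layout, the function name `meets_marfan_definition`, the setting labels and the 183-day approximation of "within 6 months" are assumptions made for this example and are not taken from the reviewed studies.

```python
from dataclasses import dataclass
from datetime import date

MARFAN_ICD10_GM = "Q87.4"  # diagnosis code reported for the two German studies
WINDOW_DAYS = 183          # assumption: "within 6 months" approximated as 183 days


@dataclass(frozen=True)
class Diagnosis:
    """One coded diagnosis for a patient (hypothetical record layout)."""
    patient_id: str
    code: str
    setting: str           # "inpatient" or "outpatient" (hypothetical labels)
    recorded_on: date


def meets_marfan_definition(diagnoses: list[Diagnosis]) -> bool:
    """Return True if one patient's diagnoses satisfy the case definition."""
    marfan = [d for d in diagnoses if d.code == MARFAN_ICD10_GM]

    # Criterion 1: a single in-hospital diagnosis is sufficient on its own.
    if any(d.setting == "inpatient" for d in marfan):
        return True

    # Criterion 2: two outpatient diagnoses no more than WINDOW_DAYS apart.
    # After sorting, checking adjacent pairs is enough to find any close pair.
    dates = sorted(d.recorded_on for d in marfan if d.setting == "outpatient")
    return any((later - earlier).days <= WINDOW_DAYS
               for earlier, later in zip(dates, dates[1:]))
```

In this sketch two outpatient diagnoses recorded on the same day would also qualify; whether that is appropriate depends on the data source and would need to be decided by the analyst.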

Hypertrophic cardiomyopathy

Four of the five hypertrophic cardiomyopathy studies used the ICD-9-CM diagnosis code of 425.1 to identify hypertrophic cardiomyopathy16 20–22; one study used the ICD-9-CM code of 425.18.15 Two of the five studies used the ICD diagnosis code alone to identify patients with hypertrophic cardiomyopathy. The other three studies required the patient to have a procedure code for a septal myectomy, septal ablation or alcohol septal ablation, in addition to the ICD diagnosis code of 425.1. One of these three studies also required the ICD diagnosis of hypertrophic cardiomyopathy to be the primary diagnosis.
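The variants described above (diagnosis code alone, diagnosis plus a septal procedure code, and an optional primary-diagnosis requirement) can be sketched as a single filter over admissions. The `Admission` layout and the function name are hypothetical, and the procedure codes are deliberately left as a parameter because the text does not list them; any codes passed in are the analyst's own choice.

```python
from dataclasses import dataclass, field

HCM_ICD9_CM = "425.1"  # diagnosis code used by four of the five studies


@dataclass
class Admission:
    """One hospital admission (hypothetical record layout)."""
    primary_diagnosis: str
    secondary_diagnoses: list[str] = field(default_factory=list)
    procedures: list[str] = field(default_factory=list)


def is_hcm_admission(adm, septal_procedure_codes=None, require_primary=False):
    """Apply one hypertrophic cardiomyopathy definition to an admission.

    septal_procedure_codes: set of procedure codes for septal myectomy or
        ablation, supplied by the analyst (none are given in the text).
        None means the diagnosis code alone defines the cohort.
    require_primary: if True, 425.1 must be the primary diagnosis.
    """
    if require_primary:
        has_diagnosis = adm.primary_diagnosis == HCM_ICD9_CM
    else:
        has_diagnosis = HCM_ICD9_CM in [adm.primary_diagnosis,
                                        *adm.secondary_diagnoses]
    if not has_diagnosis:
        return False
    if septal_procedure_codes is None:
        return True  # diagnosis-only definition
    return any(p in septal_procedure_codes for p in adm.procedures)
```

Keeping the three criteria as arguments rather than separate functions makes it straightforward to rerun the same analysis under each definition.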

Other inherited heart diseases

One study defined six inherited heart diseases using ICD-9 diagnosis codes.15 In addition to hypertrophic cardiomyopathy, these conditions include Brugada syndrome (ICD-9 code of 746.89), catecholaminergic polymorphic ventricular tachycardia (427.1) and long QT syndrome (426.82). Despite left ventricular non-compaction and arrhythmogenic right ventricular cardiomyopathy being distinct diseases, both were identified using one ICD-9 code (425.4).
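The practical consequence of a shared code can be made explicit with a small lookup. The sketch below uses only the ICD-9 codes reported in the text for this study; the function names are hypothetical, and the point is simply that a record coded 425.4 cannot be assigned to a single condition from the code alone.

```python
from collections import defaultdict

# ICD-9 codes as reported in the text for the study defining six conditions.
ICD9_BY_CONDITION = {
    "hypertrophic cardiomyopathy": "425.18",
    "Brugada syndrome": "746.89",
    "catecholaminergic polymorphic ventricular tachycardia": "427.1",
    "long QT syndrome": "426.82",
    "left ventricular non-compaction": "425.4",
    "arrhythmogenic right ventricular cardiomyopathy": "425.4",
}


def conditions_by_code(mapping):
    """Invert the condition -> code mapping into code -> list of conditions."""
    by_code = defaultdict(list)
    for condition, code in mapping.items():
        by_code[code].append(condition)
    return dict(by_code)


def ambiguous_codes(mapping):
    """Return only the codes that are shared by more than one condition."""
    return {code: names
            for code, names in conditions_by_code(mapping).items()
            if len(names) > 1}
```

Calling `ambiguous_codes(ICD9_BY_CONDITION)` returns a single entry, for 425.4, listing both left ventricular non-compaction and arrhythmogenic right ventricular cardiomyopathy.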

Prevalence of inherited heart disease

Four Marfan syndrome studies calculated the prevalence rate, which ranged from 0.5 to 3/10 000 persons.17 19 23 24 One study reported 5-year age-specific prevalence, and the highest prevalence was in persons aged 15–19 years (32.3/100 000 persons) followed by 10–14 years (23.6/100 000 persons) and 20–24 years (18.6/100 000 persons).24
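The figures above are quoted against two different denominators (per 10 000 and per 100 000 persons). A minimal sketch of the underlying arithmetic, with hypothetical function names, shows how a prevalence is computed from a case count and how a reported rate can be moved between denominators for comparison.

```python
def prevalence_per(cases: int, population: int, per: int = 10_000) -> float:
    """Prevalence expressed as cases per `per` persons."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * per


def rescale(rate: float, from_per: int, to_per: int) -> float:
    """Convert a rate from one denominator to another."""
    return rate * to_per / from_per
```

For example, `rescale(32.3, 100_000, 10_000)` gives approximately 3.23, so the peak age-specific prevalence of 32.3/100 000 corresponds to roughly 3.2/10 000.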

Outcome measures

We categorised the outcome measures into seven themes (table 3); reviewed studies reported a median of three themes (range: 1–5). The most commonly reported outcome measures were resource utilisation (seven studies); mortality (seven studies); occurrence of disease (six studies); costs (five studies) and in-hospital interventions (five studies).

Table 3

Summary of outcome measures examined in reviewed studies (n=12)

Compared with a control group, persons with inherited heart disease have increased costs and risk of adverse outcomes.16 17 19 For example, persons with Marfan syndrome had greater annual costs than controls, of up to €2 366 for direct medical costs, €5 875 for direct non-medical costs and €7 487 for indirect costs.17 Furthermore, the annual personal and societal costs associated with Marfan syndrome were up to €61 million and the societal burden was up to €387 million.17 Persons with Marfan syndrome also had increased risk of maternal delivery and morbidity/mortality outcomes, most notably pneumothorax (OR 51.95, 95% CI 6.18 to 437.10), aortic repair (OR 42.54, 95% CI 3.62 to 500.33), maternal death (OR 22.38, 95% CI 2.92 to 171.81) and use of forceps during delivery (OR 6.35, 95% CI 4.10 to 9.83) compared with controls.19 Patients with hypertrophic cardiomyopathy had a significantly higher frequency of death (6.7% vs 2.5%), myocardial infarction (2.2% vs 0.3%) or either of these outcomes (8.8% vs 2.7%) after non-cardiac surgery compared with controls.16
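For readers less familiar with how odds ratios and their intervals arise, the sketch below computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2×2 table. It is illustrative only: the reviewed studies may have used adjusted or exact methods, the function name is hypothetical, and the event counts are not given in the text, so any counts passed in are the reader's own.

```python
import math


def odds_ratio_wald(exposed_events, exposed_non_events,
                    control_events, control_non_events, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    Returns (odds_ratio, lower, upper). The interval is built on the
    log scale, so it is symmetric around log(odds_ratio).
    """
    cells = (exposed_events, exposed_non_events,
             control_events, control_non_events)
    if min(cells) <= 0:
        raise ValueError("the Wald interval needs all four cells > 0")

    odds_ratio = (exposed_events * control_non_events) / (
        exposed_non_events * control_events)
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

Because each cell contributes its reciprocal to the standard error, a very small event count in any cell widens the interval considerably, which is one plausible reason for wide intervals when outcomes are rare.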

Comprehensiveness of reporting

The median STROBE score was 22 (range: 14–28) out of a possible 36 (online supplementary table 6). At least 17 STROBE items were reported in 92% of studies. The methodological aspects that were the least reported include: identifying study design (reported in five studies), sources of bias (four studies), study sample size calculation (three studies), detailing number of persons eligible at each stage of study design (two studies) or addressing missing data (one study).

Nine studies were published from 2009, after the STROBE statement was published. The median STROBE score was slightly lower for studies published prior to the STROBE statement publication (median: 20; range 14–22) compared with those published afterwards (median: 22; range 17–28).


Routinely collected health data can be used to identify persons with inherited heart diseases, but these data are currently underutilised. The range of outcome measures and information extracted from only 12 studies demonstrates the vast potential of routinely collected health data to make significant inroads to better understand patterns of care, patient outcomes, burden of disease and resource utilisation. These data can also generate population-level evidence for priority research areas such as examining the natural history of disease and effectiveness of treatments in the real world.26 This is particularly important for inherited heart diseases as clinical trials are often impracticable and may under-represent certain populations, including women and older patients.26 Furthermore, routinely collected health data have the potential to complement other research efforts based around primary data collections, such as existing disease registries. As our knowledge of inherited heart diseases and patterns of inheritance increases, routinely collected health data can provide valuable insight by examining whole-of-population associations between patient factors, treatment and patient outcomes to inform clinical guidelines. Despite the challenges and shortcomings of current ICD coding, routinely collected health data provide the opportunity to examine the real-world impact of these diseases.

Globally, few studies have examined the societal burden of inherited heart diseases, an important consideration given the relatively young age of patients, inherent risk to family members and potential for the severe outcomes of heart failure and sudden cardiac death. In Marfan syndrome, the primary cost drivers were inpatient treatments, care by non-physicians, reduced work productivity and loss of production due to absence, disability or death.17 Understanding the burden of disease at a societal level allows for a global view of the overall impact of disease. In the setting of a disease such as Marfan syndrome with a population prevalence of approximately 2–3 in 10 000,27 taking into consideration the younger age of patients, often being adolescents and young adults, will allow a greater appreciation of the true impact of disease.

All reviewed studies used an ICD diagnosis code to define the inherited heart disease cohort, which allows for simple cross-jurisdictional comparisons and examination of trends over time for specific conditions. The majority of studies only required one diagnosis code to define the cohort. Six of the 17 inherited heart disease definitions required additional information such as a procedure code20–22 25 or multiple diagnoses within a specific time frame.17 18 The impact of these additional criteria on cohort demographics and/or outcomes is unclear. Researchers should perform sensitivity analyses and/or validation studies using various cohort definitions to ensure that stricter definitions do not inadvertently affect study findings.
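One simple form such a sensitivity analysis could take is to build the cohort under each candidate definition and compare the resulting patient sets against the broadest one. The sketch below assumes each cohort is already available as a set of patient identifiers; the function name and the definition labels in the usage note are hypothetical.

```python
def compare_definitions(cohorts, reference):
    """Summarise each cohort relative to a reference definition.

    cohorts: dict mapping a definition name to a set of patient identifiers.
    reference: name of the definition to compare against (usually the broadest).
    """
    ref_members = cohorts[reference]
    summary = {}
    for name, members in cohorts.items():
        overlap = len(members & ref_members)
        summary[name] = {
            "n": len(members),
            # Share of the reference cohort that this definition retains.
            "retained": overlap / len(ref_members) if ref_members else 0.0,
            # Patients captured by this definition but not by the reference.
            "outside_reference": len(members - ref_members),
        }
    return summary
```

For instance, passing `{"diagnosis_only": {...}, "diagnosis_plus_procedure": {...}}` with `reference="diagnosis_only"` shows how much of the broad cohort survives the stricter definition; outcome measures can then be recomputed for each set to see whether conclusions change.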

Importantly, the ICD diagnosis coding system does not explicitly define many rare diseases, with only 500 of the 6 000 rare diseases having an ICD diagnosis code.28 For countries such as Australia using the ICD-10-AM, only three of the 12 inherited heart diseases included in our search strategy have an explicit ICD diagnosis code (table 4). The codes best matching the other nine conditions are broad descriptive diagnoses (eg, other cardiomyopathies to describe arrhythmogenic right ventricular cardiomyopathy), which encompass non-inherited heart diseases. This observation likely accounts for the focus on Marfan syndrome and hypertrophic cardiomyopathy in the reviewed studies, given that they have specific ICD diagnosis codes. The current underutilisation of routinely collected health data to identify inherited heart diseases may be due to the limitations of the current coding system, which points to a need for more explicit codes to truly realise the potential of routinely collected health data.

Table 4

List of inherited heart diseases and relevant International Statistical Classification of Diseases and Related Health Problems codes (ICD-10-AM)

Despite using a comprehensive search strategy to identify relevant articles, one-quarter of reviewed studies were identified from reference and citation searches in the current study. Results were restricted to English-language manuscripts, which may have excluded relevant studies, and a journal contents search was not completed as our 12 reviewed articles were published in unique journals. Grey literature was also excluded from our search strategy. Further, identifying relevant studies was challenging as only six inherited heart diseases have been mapped to a subject heading in at least one of the bibliographic databases used in our search strategy. To overcome such limitations, future systematic reviews in this area would be enhanced if all inherited heart diseases were mapped to a subject heading.


Routinely collected data are an underutilised resource to understand clinical management and treatment issues affecting patients with inherited heart diseases. Despite some challenges, routinely collected data in this setting are able to provide evidence around a number of outcome measures and would have even greater utility if ICD codes were more explicit. While observational and registry-based studies have played a fundamental role in our understanding of inherited heart diseases to date, use of routinely collected data may provide an unbiased and global perspective on the true impact of these diseases.


JS is the recipient of the Elizabeth and Henry Hamilton-Brown scholarship from the University of Sydney. CS is the recipient of a National Health and Medical Research Council (NHMRC) Practitioner Fellowship (No. 1059156). JI is the recipient of a National Heart Foundation of Australia Future Leader Fellowship (No. 100833). This study is funded in part by an NHMRC Project Grant (No. 1059515).




  • Contributors Authors made substantial contributions to the concept or design of the work (BB, CS, JI), and to the acquisition, analysis (BB, JS, JI) and interpretation (BB, JI) of the data. The work was drafted by BB and JS and critically revised by JI and CS. All authors (BB, JS, CS, JI) gave final approval and agree to be accountable for all aspects of the work.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement Any person may contact the corresponding author for access to any of the data used in the systematic review.