Abstract
Background Screening programmes using echocardiography offer opportunity for intervention through identification and treatment of early (latent) rheumatic heart disease (RHD). We aimed to compare two methods for classifying progression or regression of latent RHD: serial review method and blinded, side-by-side review.
Methods A four-member expert panel reviewed 799 enrolment (in 2018) and completion (in 2020) echocardiograms from the GOAL Trial of latent RHD in Uganda to make consensus determination of normal, borderline RHD or definite RHD. Serial interpretations (enrolment and completion echocardiograms read at two different time points, 2 years apart, not beside one another) were compared with blinded side-by-side comparisons (enrolment and completion echocardiograms displayed beside one another in random order on the same screen) to determine outcomes according to prespecified definitions of disease progression (worsening), regression (improving) or no change. We calculated inter-rater agreement using Cohen’s kappa.
Results There were 799 pairs of echocardiogram assessments included. A higher number, 54 vs 36 (6.8% vs 4.5%), were deemed as progression by serial interpretation compared with side-by-side comparison. There was good inter-rater agreement between the serial interpretation and side-by-side comparison methods (kappa 0.89). Disagreement was most often a result of the difference in classification between borderline RHD and mild definite RHD. Most discrepancies between interpretation methods (46 of 47, 98%) resulted from differences in valvular morphological evaluation, with valves judged to be morphologically similar between enrolment and final echocardiograms when compared side by side but classified differently on serial interpretation.
Conclusions There was good agreement between the methods of serial and side-by-side interpretation of echocardiograms for change over time, using the World Heart Federation criteria. Side-by-side interpretation has higher specificity for change, with fewer differences in the interpretation of valvular morphology, as compared with serial interpretation.
- Heart Valve Diseases
- Echocardiography
- Global Health
Data availability statement
Data are available upon reasonable request. Authors will share deidentified data on any reasonable request through direct request.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
WHAT IS ALREADY KNOWN ON THIS TOPIC
Screening programmes using echocardiography offer opportunity for intervention through identification and treatment of early (latent) rheumatic heart disease (RHD).
WHAT THIS STUDY ADDS
We propose an alternative method for reading and interpretation of follow-up echocardiography studies for determining progression or regression of latent RHD.
Independent reviewer side-by-side comparison showed good agreement with the four-member panel adjudicated interpretations made by the serial read method, but resulted in a lower number of studies judged as progression.
The majority of discordant outcomes were between borderline and mild definite RHD (solely based on interpretation of valve pathology), highlighting the challenge in reproducibly distinguishing these diagnostic categories.
HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY
Side-by-side review of echocardiographic images to determine the course of latent RHD is more robust, but serial interpretation offers a reasonable alternative and may be more practical in clinical settings.
Borderline and mild definite RHD exist on a continuum. Interpretation of valvular morphology by the World Heart Federation (WHF) criteria is less reproducible than other features. The 2012 WHF criteria may warrant revision to address these issues.
Introduction
The global burden of rheumatic heart disease (RHD) remains high, disproportionately affecting marginalised populations in low-income and middle-income countries.1 Outcomes for children diagnosed clinically with RHD remain poor, with a recent study from Uganda showing nearly 30% mortality 9 months after diagnosis.2–4 In part, poor clinical outcomes are driven by late presentation and the missed opportunity to benefit from secondary antibiotic prophylaxis (SAP).
Screening with echocardiography offers a potential opportunity for earlier intervention through identification of early, latent RHD. Recently, the GOAL Trial, a randomised clinical trial of SAP in latent RHD in Uganda, showed that initiation of SAP after screening significantly reduced the progression of latent RHD as compared with no prophylaxis.5 However, practical challenges exist for implementation of a strategy that relies on echocardiography for RHD diagnosis and follow-up.
One important consideration is how reviewers assess outcomes on echocardiogram for children who are diagnosed with latent RHD. Most prior research in this area has followed natural history cohorts of latent RHD using serial interpretation of echocardiograms (initial and follow-up echocardiograms read independently of each other at two different time points, not beside one another), and the interpretations were compared.6–10 The GOAL Trial used a different approach, with side-by-side comparison (enrolment and completion echocardiograms displayed beside one another on the same screen which mimics clinical practice in high-resource settings) to assess functional and morphological differences between the two sets of studies.5
In this substudy of the GOAL Trial, we aimed to compare how these two methods of outcomes interpretation (serial vs side-by-side interpretation) changed echocardiographic diagnoses. We aimed to: (1) Compare determination of echocardiographic progression or regression of latent RHD when adjudicated with serial-read review as compared with blinded side-by-side review; (2) Determine the inter-rater agreement between serial-read and side-by-side comparison methods; and (3) Describe sources of discrepancies between serial-read and side-by-side interpretation.
Methods
Study design
In this planned substudy of the GOAL Trial, we compared determination of 2-year progression or regression in children diagnosed with latent RHD between serial adjudication (the traditional methodology used in prior RHD research studies) and side-by-side adjudication (review of both studies on the same screen, which mimics clinical practice in high-resource settings) by the same four-member expert panel (figure 1).
Parent study
Echocardiograms included in this study were obtained during enrolment (2018) and completion (2020) evaluations for the GOAL Trial, a 2-year randomised controlled trial to evaluate the impact of secondary prophylaxis with intramuscular penicillin G benzathine, as compared with no prophylaxis, in Ugandan children and adolescents with latent (borderline and mild definite) RHD.5 Moderate and severe RHD were excluded from the study. The methods of acquisition, interpretation and adjudication of the echocardiograms in this study have been previously described.11 In brief, qualified echocardiographers obtained a standard 13-view protocol on Vivid Q and Vivid IQ fully functional echocardiography machines (General Electric, Milwaukee, Wisconsin, USA) with ECG gating for both the enrolment and final studies.11 Patients and parents of patients were involved in design of the parent study, specifically focused on patient support group structure.
Echocardiography adjudication
The same four-person adjudication panel reviewed enrolment and final echocardiograms.5 11 Review of enrolment echocardiograms was completed during two in-person meetings (August 2018 in Paris, France and December 2018 in Dubai, United Arab Emirates) for a total of approximately 49 hours over 5 days. Due to the SARS-CoV-2 pandemic, in-person meetings were not possible in 2020 for review of study completion echocardiograms. Instead, nine Zoom videoconferencing meetings were held for a total of 36 hours.
Serial Interpretation
Following enrolment of GOAL Trial participants in 2018, the panel members were provided with complete 13-view echocardiograms of moving DICOM images (RadiAnt for Windows or Horos for Mac) in 2D and colour Doppler: parasternal long axis, parasternal short axis, apical four chamber and apical five chamber views; and continuous wave Doppler of the mitral valve (MV) and aortic valve (AV). No still frame 2D or colour Doppler images or measurements were provided. The panel was required to measure mitral regurgitation (MR) and aortic regurgitation colour Doppler jet lengths and anterior mitral leaflet (AMVL) thickness, according to guidance provided in the 2012 World Heart Federation (WHF) criteria.12 The same process was repeated following GOAL Trial completion in 2020. At both time points, the panel determined the WHF category (normal, borderline RHD or definite RHD) and subcategory (A, B, C or D) (box 1), and determined consensus classification for all functional and morphological components of the WHF criteria.
Box 1 (A) 2012 WHF guidelines for the diagnosis of RHD. (B) Operational definitions of RHD progression or regression for the GOAL Trial
A. 2012 WHF guidelines for the diagnosis of RHD12
Definite RHD (either A, B, C or D):
A: Pathological MR and at least two morphological features of RHD of the MV
B: MS mean gradient ≥4 mm Hg
C: Pathological AR and at least two morphological features of RHD of the AV
D: Borderline disease of both the MV and AV
Borderline RHD (either A, B or C):
A: At least two morphological features of RHD of the MV without pathological MR or MS
B: Pathological MR
C: Pathological AR
B. Operational definitions of RHD progression or regression for the GOAL Trial
RHD progression: A change in diagnostic category from borderline (A, B or C) to definite (A, B, C or D) or from borderline or definite to moderate/severe disease.
RHD regression: A change in diagnostic category from definite (A, B, C or D) to borderline (A, B or C) or normal; or from borderline (A, B or C) to normal.
AR, aortic regurgitation; AV, aortic valve; MR, mitral regurgitation; MS, mitral stenosis; MV, mitral valve; RHD, rheumatic heart disease; WHF, World Heart Federation.
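The operational definitions in part B of the box can be expressed as a comparison of ordered diagnostic categories. The following is a minimal sketch only, not the trial's software: the `Category` enum, its numeric ordering and the function name `classify_change` are assumptions introduced for illustration, and the sketch handles only participants who enter with borderline or definite (latent) RHD.

```python
from enum import IntEnum


class Category(IntEnum):
    """Diagnostic categories, ordered by severity (ordering assumed for this sketch)."""
    NORMAL = 0
    BORDERLINE = 1        # borderline RHD (A, B or C)
    DEFINITE = 2          # mild definite RHD (A, B, C or D)
    MODERATE_SEVERE = 3   # moderate/severe disease


def classify_change(enrolment: Category, completion: Category) -> str:
    """Apply the operational definitions to one participant's pair of categories."""
    # Only latent RHD (borderline or mild definite) is defined at enrolment.
    if enrolment not in (Category.BORDERLINE, Category.DEFINITE):
        raise ValueError("enrolment category must be borderline or definite RHD")
    if completion > enrolment:
        # borderline -> definite, or borderline/definite -> moderate/severe
        return "progression"
    if completion < enrolment:
        # definite -> borderline/normal, or borderline -> normal
        return "regression"
    return "no change"
```

For example, `classify_change(Category.BORDERLINE, Category.DEFINITE)` returns `"progression"`, while `classify_change(Category.DEFINITE, Category.NORMAL)` returns `"regression"`.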
Side-by-side comparison
Separate adjudication meetings were held for side-by-side comparison following GOAL Trial completion in 2020. Prior to these meetings, entry and final echocardiograms, blinded by date, were preread by all four members of the panel. Each case, focusing on differences between panel members, was also reviewed during teleconference using PowerPoint presentation. These presentations were constructed to show enrolment and final images concurrently (beside one another on the same screen) with random blinded display to the left or right of the screen. The panel was tasked with deciding if the study on the right of the screen presentation was better, worse or the same according to strict study definitions (box 2). These results were unblinded by a single additional team member who then assigned a result of progression, regression or the same based on the unblinded ordering of the echocardiograms and the panel decision.
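The random left/right display and the subsequent unblinding step can be illustrated with a short sketch. This is an assumption-laden illustration rather than the procedure actually used by the team: the function names `assign_right_study` and `unblind`, and the string labels, are hypothetical. It shows how a panel call about the right-hand study maps to an outcome once the display order is revealed.

```python
import random

CALLS = {"better", "worse", "same"}
STUDIES = {"enrolment", "completion"}


def assign_right_study(rng: random.Random) -> str:
    """Randomly choose which study is displayed on the right of the screen."""
    return rng.choice(["enrolment", "completion"])


def unblind(panel_call: str, right_study: str) -> str:
    """Convert the panel's blinded call into progression, regression or no change.

    panel_call:  'better', 'worse' or 'same', describing the right-hand study
                 relative to the left-hand study.
    right_study: which study was actually shown on the right.
    """
    if panel_call not in CALLS:
        raise ValueError(f"unknown panel call: {panel_call!r}")
    if right_study not in STUDIES:
        raise ValueError(f"unknown study label: {right_study!r}")
    if panel_call == "same":
        return "no change"
    # The completion study is worse if it was on the right and called 'worse',
    # or if enrolment was on the right and called 'better'.
    completion_is_worse = (panel_call == "worse") == (right_study == "completion")
    return "progression" if completion_is_worse else "regression"
```

A call of "worse" with the completion study on the right, and a call of "better" with the enrolment study on the right, both unblind to progression.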
Box 2 Criteria for side-by-side comparison and determination of echocardiographic progression or regression of latent RHD
Echocardiographic progression of latent RHD*
New pathological regurgitation (as defined by 2012 WHF criteria)12 at a previously unaffected valve
Worsening grade of existing mitral or aortic regurgitation (non-pathological to pathological by WHF criteria or from none/mild to moderate/severe by ASE criteria)25
Development of two morphological features consistent with RHD (2012 WHF criteria)12 at a valve that previously had normal morphology, or the addition of one morphological feature at a valve previously only showing a single morphological abnormality.
*Needed to have at least one out of the three criteria. In all cases, progression involved a change in diagnostic category (borderline to definite (mild or moderate/severe) or definite mild to definite moderate/severe).
Echocardiographic regression of latent RHD**
Disappearance of existing mitral or aortic regurgitation
Change from pathological regurgitation to physiological regurgitation (2012 WHF criteria)12
Disappearance of a morphological feature consistent with RHD (2012 WHF criteria)12 at a valve that previously had abnormal morphology.
**Needed to have at least one out of the three criteria. In all cases, regression involved a change in diagnostic category (borderline to normal or definite mild to borderline/normal).
ASE, American Society of Echocardiography; RHD, rheumatic heart disease; WHF, World Heart Federation.
This review process resulted in two related but distinct classifications for final echocardiograms (figure 1): the first a serial interpretation and the second a side-by-side direct comparison. The latter was used to determine the GOAL Trial primary outcome as defined in the study protocol.
Statistical analysis
The proportion of progression and regression during the study as assessed by serial interpretation and side-by-side comparison was calculated. Agreement was calculated as the proportion of assessments that were the same using the two different methods. We estimated Cohen’s kappa to evaluate chance-adjusted agreement between the serial interpretation and side-by-side comparison. All p values and CIs are two-sided.
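As an illustration of the agreement statistics described above, the sketch below computes the observed agreement and Cohen's kappa, kappa = (p_o - p_e) / (1 - p_e), for two lists of paired outcome labels. It is a minimal example under stated assumptions: the function names are hypothetical, the six toy labels are invented purely to exercise the code and are not study data, and the authors' actual statistical software is not stated in the text.

```python
from collections import Counter
from typing import Sequence


def observed_agreement(a: Sequence[str], b: Sequence[str]) -> float:
    """Proportion of paired assessments given the same label by both methods."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a: Sequence[str], b: Sequence[str]) -> float:
    """Chance-adjusted agreement between two sets of categorical ratings."""
    n = len(a)
    p_o = observed_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Expected agreement if the two methods assigned labels independently.
    p_e = sum(count_a[label] * count_b[label] for label in set(a) | set(b)) / n**2
    if p_e == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_o - p_e) / (1 - p_e)


# Toy labels for illustration only (not data from the study).
serial = ["progression", "regression", "no change",
          "regression", "no change", "progression"]
side_by_side = ["no change", "regression", "no change",
                "regression", "no change", "progression"]

kappa = cohens_kappa(serial, side_by_side)  # 0.75 for these toy labels
```

In the toy example the two methods agree on five of six pairs (observed agreement 0.83), expected chance agreement is 0.33, and kappa is therefore 0.75.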
Ethics
Written informed consent was obtained from a parent or guardian of every child, as well as written assent for all participants over the age of 8 years. Participants who turned 18 years during the study follow-up period were asked to provide informed consent to continue being in the trial.
Results
There were 799 pairs of echocardiogram assessments included in this analysis. Of these, a higher number, 54 as compared with 36 (6.8% vs 4.5%), were determined to have progressed by serial interpretation as compared with side-by-side comparison. A higher number, 409 (51.2%) regressed by serial interpretation as compared with 386 (48.3%) by side-by-side comparison. Overall, there was good inter-rater agreement between the serial interpretation and side-by-side comparison methods for the primary outcome (progression), secondary outcome (regression) and for stable disease (table 1, kappa 0.89).
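The proportions quoted above follow directly from the reported counts and the denominator of 799 pairs. The short check below reproduces them; the dictionary layout and the helper name `percent` are illustrative choices, while the counts themselves are taken from the text.

```python
TOTAL_PAIRS = 799  # pairs of echocardiogram assessments in the analysis

# Counts reported in the text, keyed by (outcome, method).
counts = {
    ("progression", "serial"): 54,
    ("progression", "side-by-side"): 36,
    ("regression", "serial"): 409,
    ("regression", "side-by-side"): 386,
}


def percent(count: int, total: int = TOTAL_PAIRS) -> float:
    """Percentage of all pairs, rounded to one decimal place."""
    return round(100 * count / total, 1)


percentages = {key: percent(value) for key, value in counts.items()}
```

Here `percentages` evaluates to 6.8 and 4.5 for progression (serial and side-by-side) and 51.2 and 48.3 for regression, matching the reported figures.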
When disagreement was observed between methods, it was most often a result of the difference in classification between borderline RHD and mild definite RHD. Table 2 shows study enrolment adjudication findings using serial methodology (rows 1 and 5), study completion adjudication findings using serial methodology (rows 2 and 6), number of cases when side-by-side methodology (used in the original GOAL publication) agreed with serial methodology (rows 3 and 7) and number of cases when side-by-side methodology disagreed with serial methodology (rows 4 and 8). Columns 2–5 represent study completion diagnoses by serial methodology. Side-by-side comparison resulted in less recategorisation of disease severity between baseline and study completion time points. Most notably, for participants classified as borderline RHD at study entry, 17 were classified as mild definite RHD at completion by serial interpretation but judged to have no change by side-by-side comparison. Similarly, for those classified as definite RHD at enrolment, 25 were classified by serial interpretation as borderline RHD at study completion, while side-by-side comparison interpreted no changes. There was near perfect agreement between serial and side-by-side methods for participants determined as having a normal study or as having moderate/severe RHD at study completion.
Most discrepancies (46 of 47, 97.9%) between interpretation methods resulted from differences in the morphological evaluation of the mitral (n=37), aortic (n=8), or mitral and aortic (n=1) valves (table 3). In these cases, the valves were judged to be morphologically similar between enrolment and final echocardiograms when compared side by side but classified differently on serial interpretation.
As an exploratory analysis, we looked at the effect of serial interpretation compared with side-by-side read on the primary and secondary outcomes of the GOAL Trial: RHD progression and regression among children with latent RHD receiving and not receiving SAP. We would still have drawn the same conclusions in GOAL had we used the serial method, although the effect size was smaller (online supplemental table 1).
Discussion
Evaluation of RHD progression is fundamental to the follow-up of children diagnosed with latent RHD through echocardiographic screening, an effort that has gained momentum as an attractive public health approach for the control of RHD in endemic regions.13 To fully align with the requirements of a disease appropriate for screening, it is critical for the scientific community to understand and describe the course of latent RHD.14 Accurate description of this course hinges on interval testing with a test and criteria that are highly reproducible, such that reports of disease stability, progression or regression truly represent the said entities.
In contrast to previous cohort studies of latent RHD which used the traditional serial-read method,6–9 the GOAL Trial used direct side-by-side comparison of entry and final echocardiogram images for decision of outcome after 24 months of follow-up. The advantage of this methodology is that it may reduce the subjective nature of the WHF criteria by allowing direct comparison of images and findings.
When applied to RHD echocardiogram images, side-by-side comparison showed higher specificity for disease progression. In this study, nearly all of the discrepancy between interpretation methods resulted from differences in morphological interpretation, which is partly subjective. However, even AMVL thickness, the only morphological feature of RHD with objective measurement capability, has only moderate repeatability within readers and poor reproducibility.13 15 The remaining three MV morphology features, and all of the four AV morphology features are assessed subjectively. Previous inter-reviewer reliability assessments have shown good-to-perfect agreement on presence of and diagnostic category of MR, but much less agreement on MV morphology features. Additionally, morphological criteria have repeatedly been found to be neither sensitive nor specific, with poor inter-reviewer reliability compared with functional assessment.6 16–19
Latent RHD follow-up cohorts which used serial reads have identified that the morphological features of the WHF criteria are the most challenging to reproduce at interval evaluation, and are marred by lack of intra-rater reproducibility.6–8 One study of echocardiographic interpretation with serial testing within a 12-month period demonstrated large interstudy variability with the diagnosis of borderline RHD, largely arising from morphological features of the applied criteria.20 Our study reconfirms the challenges of applying the morphological evaluation of MV and AV, as prescribed by the WHF criteria.
Previous inter-rater reliability assessments of RHD diagnosis using the WHF criteria have shown only moderate inter-rater agreement and have emphasised that two or more reviewers improve diagnostic accuracy.11 16 21 22 This has significant human resource implications for RHD screening in endemic regions. In the GOAL Trial evaluation, we found that independent side-by-side comparison showed good agreement with a four-member panel adjudication of the echocardiographic outcomes. Thus, independent side-by-side comparison can potentially be a solution to the need for multiple reviewers doing serial reads.
With this new evidence, we recommend side-by-side review of echocardiographic images when evaluating disease progression in the clinical setting. However, this relies on a functional Picture Archiving and Communication System (PACS), from which past echocardiographic images can be retrieved in routine clinical practice, together with streamlined and consistent identification of patients, for example, use of national identifiers. This may be difficult to achieve in many RHD endemic regions where availability and/or functionality of PACS and national identifiers is still limited. Side-by-side reading therefore requires investment in information technology as well as personnel to archive and retrieve images.
We note that, with serial reads, the diagnosis of progression from borderline to mild definite RHD was largely based on morphological changes. This raises the possibilities of missed abnormal morphology at enrolment or overestimation of the abnormalities at the study completion echocardiogram reads.23 24 These findings emphasise that evaluation of true morphological changes is difficult to replicate in a serial manner, and underscore the advantages of side-by-side comparison.
These findings also support previous findings that there is significant overlap between borderline and mild definite disease, and that these conditions, in reality, exist on a continuum. This has important implications for the 2012 WHF criteria. They support the need to revise the criteria, with the aim of improving inter-rater agreement.
There are several limitations to this study. First, we used PowerPoint for side-by-side echocardiogram image presentation and not original DICOM images because of the challenge of blinding the dates of the echocardiogram studies with the latter during virtual review. PowerPoint could have led to loss of image resolution, is more time-consuming and is not a practical strategy to use at scale. Second, we acknowledge that the ideal strategy would have been to re-review and reclassify the entry echocardiograms by the serial read method along with the final 24-month studies. However, this was not possible due to the impact of the SARS-CoV-2 pandemic on many resources and logistics. We were also not able to have in-person adjudication panel meetings at the end of the study. Virtual meetings are likely to be associated with some subtle loss of rigour as compared with in-person meetings. However, our ability to complete the trial’s echocardiogram interpretations virtually shows that this is a viable avenue for international expert clinical and research support of echocardiogram interpretation for constrained teams in resource-limited settings.
Conclusion
In this study, we present an alternative method for reading of interval echocardiographic studies of latent RHD—the side-by-side direct comparison of images, which has multiple advantages over the traditional serial read method. Specifically, the side-by-side direct comparison method presents higher specificity for disease progression by reducing the subjectivity of morphological features of the WHF criteria. For these reasons, this method can serve as a gold standard method for future clinical trials and outcome studies of latent RHD. Policy decisions on screening for RHD require a formalised process to determine echocardiographic disease progress with or without intervention, and this study contributes important new knowledge to better defining this process.
Ethics statements
Patient consent for publication
Ethics approval
This study involves human participants and was approved by the institutional review boards at Makerere University School of Medicine, Kampala, Uganda (REC 2018–048) and Children’s National Hospital (P000010408) in Washington DC. Participants gave informed consent to participate in the study before taking part.
Acknowledgments
The authors thank the Karp Family Foundation, Gift of Life International, Children’s National Hospital Foundation (Zachary Blumenfeld Fund and Race for Every Child (Team Jocelyn)), the Elias–Ginsburg Family, Wiley Rein, Philips Foundation, AT&T Foundation, Heart Healers International, Huron Philanthropies, and the Cincinnati Children’s Hospital Heart Institute Research Core for the support received for this project.
References
Footnotes
Twitter @drmirabel
Contributors Study design: JR, AB, EO, DE, RS, AG, AS, CS. Study conduct: JR, AB, EO, MN, NF, JP, AS, CS. Echocardiogram adjudication: MM, MN, LZ, CS. Data analysis: JR, AB, EO, DE, AG, AS, CS. Manuscript preparation: JR, AB, CS. Manuscript review: all authors. Guarantor: CS.
Funding This study was funded by the Thrasher Research Fund Award #13908.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.