Automated external defibrillators (AEDs) and implantable cardioverter defibrillators (ICDs) are used to treat life-threatening arrhythmias. AEDs and ICDs use shock advice algorithms to classify ECG tracings as shockable or non-shockable rhythms in clinical practice. Machine learning algorithms have recently been assessed for shock decision classification with increasing accuracy. Outside of rhythm classification alone, they have been evaluated in diagnosis of causes of cardiac arrest, prediction of success of defibrillation and rhythm classification without the need to interrupt cardiopulmonary resuscitation. This review explores the many applications of machine learning in AEDs and ICDs. While these technologies are exciting areas of research, there remain limitations to their widespread use including high processing power, cost and the ‘black-box’ phenomenon.
- Heart Arrest
- Tachycardia, Ventricular
- Ventricular Fibrillation
- Defibrillators, Implantable
Data availability statement
Data sharing not applicable as no data sets generated and/or analysed for this study.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
Artificial intelligence (AI) is a broad term that encompasses the many uses of machine-based data processing to achieve outcomes that would typically require human cognitive function.1 In recent years, AI has expanded its role within medicine. In particular, machine learning, a type of AI in which a model is trained by a learning algorithm on a data set and then applied to new data sets, has been widely used in a variety of medical fields. The availability of large data sets, combined with advances in machine learning technology, has led to an increasing number of medical applications in the last few years.1 In this review, we examine the use of machine learning in rhythm classification in automated external defibrillators (AEDs), particularly without interruption of cardiopulmonary resuscitation (CPR), and in predicting successful shocks and electrical storm in implantable cardioverter defibrillators (ICDs). The European Society of Cardiology (ESC) and European Resuscitation Council encourage the use of AEDs by emergency services and non-medical members of the public to reduce time to defibrillation.2 The ESC also recommends the use of ICDs in patients with documented ventricular fibrillation (VF) or haemodynamically unstable ventricular tachycardia (VT) without reversible causes or 48 hours after myocardial infarction (MI) on chronic optimal medical therapy.3 While this is an exciting new area, there are some limitations to the widespread use of these technologies, which we evaluate in the Discussion section.
First, we will explain some of the key AI concepts that are discussed in this paper and are currently being used in ECG detection. Table 1 summarises some of these key concepts. There are multiple machine learning techniques, which can be broadly categorised as supervised and unsupervised learning (figure 1).1
Unsupervised machine learning recognises patterns in unlabelled data sets. This can be useful in identifying subgroups from complex data and where labelled data sets are not available.1 The clusters or patterns found may not be related to the outcome of interest and complex data can require large amounts of preprocessing prior to use in order to yield useful outcomes. However, unsupervised methods have still been used in clinical applications, for example, to find patterns in electronic health record data where noise, heterogeneity and incompleteness limit the use of supervised methods.4
Supervised machine learning, on the other hand, involves training models to correctly classify input data with labelled outputs. This requires large numbers of labelled data sets for training. Once trained, these models can then be used to predict outcomes on new data sets in a process known as testing. This can be used for classifying distinct groups, that is, types of arrhythmias or for regression models in data with continuous outcomes. Common types of supervised machine learning algorithm include deep learning, support vector machines (SVMs), random forest and K-nearest neighbour (k-NN).5
Deep learning is a type of machine learning that mimics neural networks in the brain to perform high levels of data processing.5 Artificial neural networks (ANN) contain layers of nodes which manipulate and transform input data; the layers between the input and output layers are termed ‘hidden layers’. Weighted connections between these hidden layers adjust the signal based on importance. During training, these weights are typically ascribed a random value close but not equal to zero. Using these initial weights, an initial output classification is produced by a process called forward propagation. This prediction is then compared with the true outcome and an error signal is fed back to the model, so that weights can be adjusted in a process called back propagation. In this way, the model is optimised.5 Deep learning has been used since the 1950s for multiple types of data inputs. However, its use was initially limited due to ‘overfitting’ - when there is too much focus on specific data points rendering it no longer generalisable to new unseen data sets.6 There have since been various techniques developed to avoid overfitting. In ANNs, drop-out regularisation techniques are commonly used to counteract overfitting and prevent excessive coadaptation of neurons. This involves randomly removing neurons and their weighted connections either temporarily or permanently during training.7
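The training loop described above can be illustrated with a minimal sketch. It assumes a single sigmoid neuron rather than a multi-layer network, and a toy two-feature data set; the function names, learning rate and data are hypothetical and are not taken from any of the cited studies.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(weights, bias, x):
    """Forward propagation: weighted sum of the inputs passed through a sigmoid."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_neuron(samples, labels, epochs=2000, learning_rate=0.5, seed=0):
    rng = random.Random(seed)
    # Small random starting weights, close to zero.
    weights = [rng.uniform(-0.05, 0.05) for _ in samples[0]]
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            prediction = forward(weights, bias, x)
            # Error signal: difference between the prediction and the true label.
            error = prediction - y
            # Back propagation (one layer): move each weight against its gradient.
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


# Hypothetical toy data: label is 1 if either feature is present.
toy_inputs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
toy_labels = [0, 1, 1, 1]
weights, bias = train_neuron(toy_inputs, toy_labels)
```

In a real ANN, the same error signal is passed backwards through every hidden layer using the chain rule; the single-neuron case is shown only because it keeps each step visible.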
Convolutional neural networks (CNNs) are a type of ANN, which extract high-level features directly from raw data.5 They have been used extensively in medical imaging but can be used to analyse multiple types of one-dimensional, two-dimensional and three-dimensional data sets. As in ANNs, the inputs, for example, two-dimensional pixels or three-dimensional voxels, are passed through multiple layers of neurons before reaching the output. Each layer has a convolutional filter or kernel, which extracts the high-level features such as locality and subsimilarity. This removes the need for manual feature selection and introduction of human bias.8 For example, a CNN model was used by Cohen-Shelly et al for screening of moderate to severe aortic stenosis (AS).9 The CNN model has 62 convolutional layers and one classification output layer (moderate to severe AS or mild to no AS; figure 2). Each ECG represented a 12×5000 matrix, which was the input for the CNN. In the CNN, the weights and bias are constantly modified to reduce the difference between the given output and the labelled outcome in the data set.9
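As a rough illustration of what a single convolutional filter does to a one-dimensional signal such as an ECG lead, the sketch below slides a three-point kernel over a short synthetic trace. The kernel values and the signal are hand-picked assumptions for illustration; in a trained CNN the kernel weights are learnt from data.

```python
def conv1d(signal, kernel):
    """Slide the kernel along the signal (no padding) and return one value per position."""
    width = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(width))
        for i in range(len(signal) - width + 1)
    ]


def relu(values):
    """Rectified linear unit: keep positive responses, zero out the rest."""
    return [max(0.0, v) for v in values]


# Hypothetical trace with a single sharp deflection at index 3.
signal = [0.0, 0.0, 0.1, 1.0, 0.1, 0.0, 0.0, 0.0]
# Hand-picked kernel that responds strongly to narrow peaks.
kernel = [-1.0, 2.0, -1.0]

feature_map = relu(conv1d(signal, kernel))
peak_position = feature_map.index(max(feature_map))
```

The feature map is non-zero only where the kernel is centred on the deflection, which is the sense in which a filter extracts a local feature without manual feature selection.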
SVMs, random forest and k-NN are also supervised machine learning models.5 SVMs are used for binary classification. SVMs determine the optimum hyperplane to separate data into two classes. They do this by maximising the distance between the hyperplane and the points to which it lies closest, also known as support vectors. Random forest uses a large number of decision trees called estimators. Each of these estimators is trained using a random subset of samples and features from the training set, which increases the generalisability of the outcomes. The final output is the mode (classification) or mean (regression) of the outcomes among the estimators. k-NN classification does not learn patterns from the training data to apply to new data sets but instead directly compares new data with training data. New data are compared with the k most similar points in the training set and assigned the most common value (classification) or the mean/median (regression). k is a positive, non-zero integer that must be selected based on the specific data set, number of features and individual problem.5
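The k-NN procedure described above is simple enough to sketch directly. The two features and all data points below are hypothetical and chosen only so that the two classes are clearly separated.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Assign the query the most common label among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, normalised two-feature descriptions of ECG segments.
train_points = [
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.15),  # non-shockable examples
    (0.90, 0.80), (0.80, 0.90), (0.85, 0.85),  # shockable examples
]
train_labels = ["non-shockable"] * 3 + ["shockable"] * 3

decision = knn_classify(train_points, train_labels, (0.80, 0.80), k=3)
```

No model is fitted here: every new segment is compared against the stored training set, which is why the choice of k and of the distance measure matters for each data set.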
Medical applications of AI
The clinical applications of AI have been rapidly expanding. One of the most commonly used settings for machine learning is medical imaging and diagnostics. In cardiac imaging, several AI techniques have been used for identification of structures of the heart, lesion detection and segmentation of heart tissue and histological tissue classification.10
ECG interpretation and classification of cardiac arrhythmias is another obvious application of AI. Manual ECG interpretation is subjective, error-prone and varies widely depending on the knowledge and experience of the clinician. Computer-generated ECG interpretation has been widely available since it was developed in the 1960s; however, its manual feature recognition algorithms have faced criticism for missing the complexities and nuances of ECGs.11 Deep learning models, in particular, CNNs, have been used for ECG interpretation with human-like accuracy, with one model even out-performing cardiologists.12 Whether complex deep learning algorithms such as CNNs will be used routinely for automated ECG interpretation remains to be seen.
In the current age of wearable technology and smartwatches with single-lead ECG capabilities, automatic ECG interpretation is becoming particularly important. The Kardia Band (KB) records a single-lead ECG in Apple Watches. This is then paired with an app which uses CNNs to detect atrial fibrillation (AF). Bumgarner et al found that KB interpreted AF with 93% sensitivity and 84% specificity, compared with physician interpretations of KB recordings with 99% sensitivity and 83% specificity. Of the 113 ECG and KB recordings available, 57 of them were uninterpretable by the KB algorithm but were reviewed by clinicians with 100% sensitivity and 80% specificity. Therefore, this technology still requires clinician input and oversight for the best results and is not yet able to function autonomously.13
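Sensitivity and specificity are used throughout this review, so a short worked example may help. The counts below are hypothetical, chosen only to reproduce percentages of the same size as those quoted for the KB algorithm; they are not the study's actual counts.

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of true cases (for example, AF) that the test correctly flags."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Proportion of non-cases that the test correctly clears."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts for illustration only.
example_sensitivity = sensitivity(true_positives=93, false_negatives=7)
example_specificity = specificity(true_negatives=84, false_positives=16)
```

With these counts, 7 of every 100 true AF recordings would be missed and 16 of every 100 non-AF recordings would be flagged incorrectly.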
Not only can AI be used for standard ECG interpretation but also studies have been assessing its use as a screening tool for asymptomatic moderate to severe AS, asymptomatic left ventricular dysfunction and early pulmonary hypertension—helping in early diagnosis and intervention.9 14 15 Attia et al used paired 12-lead ECG and echocardiogram data from nearly 45 000 patients at the Mayo Clinic to train a CNN for the identification of asymptomatic left ventricular dysfunction using the 12-lead ECG data alone. Their model had a sensitivity and specificity of 86.3% and 85.7%, respectively, and they found that those with a positive AI screen had a four times greater risk of developing ventricular dysfunction in the future than those without.14 ECGs are low cost, non-invasive and widely available—making them an ideal candidate for a screening tool.
Another use of this ECG recognition technology is in defibrillators. AEDs were developed for use by untrained bystanders on those who have a sudden cardiac arrest in a public place.2 ICDs are implanted in those with a high risk of sudden cardiac death.3 The key to an appropriate and potentially life-saving shock from the ICD or AED is the recognition of a shockable rhythm such as VF and VT. These rhythms can result in a patient’s death unless a shock is delivered quickly. This is where AI may have a major role to play in reducing time to shock and increasing efficiency of recognition of shockable rhythms.
A search was carried out on Medline and Embase on 3 April 2021 using the terms ‘AED’ ‘ICD’ ‘defibrillator’ together with ‘AI’ and ‘deep learning’. This resulted in 221 abstracts which were screened for relevance to our topic of ‘Applications of machine learning in AEDs and ICDs’.
ECG interpretation for AEDs
Both traditional machine learning and deep learning techniques have been used to classify shockable and non-shockable rhythms. Table 2 shows examples of the techniques, which have been evaluated for use in shock advice algorithms (SAAs) as well as additional applications of this ECG interpretation technology, for example, diagnosis of prearrest MI.
SVMs have been used in rhythm classification of ECG readings, see table 2. Rhythm analysis in AEDs needs to have both high specificity and sensitivity and low processing power, so the machines are cheap and easily available. Therefore, optimising the parameters for the algorithm can increase efficiency. Alonso-Atienza et al initially used an SVM with 13 ECG parameters, which have been used previously to characterise VF and shockable rhythms.16 They then examined the utility of each ECG parameter individually using three different feature selection filters. They found threshold sample count, sample entropy (measure of similarity with an ECG signal segment) and VF filter (measure of residue after a narrowband elimination filter is applied) to be the most effective in diagnosis of VF. Therefore, a system using just these three features could decrease processing power while maintaining accuracy.16 Li et al similarly optimised their SVM algorithm with the use of only two parameters selected using a genetic algorithm, which mimics natural selection and eliminates weaker combinations to find the optimum combinations.17 They achieved higher sensitivities and specificities with two parameters compared with Alonso-Atienza et al. However, the two studies used different window sizes, parameters and databases, making them difficult to compare directly. Difficulties also arise as many of these databases use ECG traces from Holter monitors, which differ from out of hospital cardiac arrest (OHCA) traces, which often have more noise. In future, a single public OHCA ECG database with training and test data sets would be useful to allow for comparison of algorithms and to provide training sets that more closely resemble actual OHCA traces.
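Of the features named above, sample entropy has a compact standard definition: the negative logarithm of the ratio between the number of matching template pairs of length m+1 and of length m. The sketch below is a plain implementation of that definition, assuming the commonly used defaults m=2 and a tolerance of 0.2 times the signal's standard deviation; these settings and the synthetic signals are assumptions and may differ from those used in the cited studies.

```python
import math
import random
import statistics


def sample_entropy(signal, m=2, r=0.2):
    """Sample entropy: lower values indicate a more regular, self-similar signal."""
    tolerance = r * statistics.pstdev(signal)
    n = len(signal)

    def count_matching_pairs(length):
        # Use the same number of templates (n - m) for both lengths.
        templates = [signal[i:i + length] for i in range(n - m)]
        matches = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                distance = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if distance <= tolerance:
                    matches += 1
        return matches

    b = count_matching_pairs(m)
    a = count_matching_pairs(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no templates match
    return -math.log(a / b)


# Synthetic stand-ins: a periodic signal versus uniform random noise.
regular = [math.sin(2 * math.pi * i / 20) for i in range(200)]
rng = random.Random(0)
irregular = [rng.uniform(-1.0, 1.0) for _ in range(200)]
```

The periodic signal yields a lower sample entropy than the noise, which is the property that makes the measure useful for separating organised rhythms from disorganised ones.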
SVMs have also been used for diagnosis of the cause of arrest based on ECG parameters. Thannhauser et al used an SVM to identify previous MI from VF waveforms.18 The diagnosis of previous MI based on VF morphology had previously been performed in animal studies, but this was the first human study that demonstrated ‘proof-of-concept’. This could be used to inform decision-making postcardiac arrest. Elucidating the cause of cardiac arrest is important postresuscitation for prevention of further episodes. However, this is in the early stages and would need to be used with the whole clinical picture for decision-making purposes.
Building on previous SVM models, Krasteva et al assessed a CNN for characterisation of rhythms.19 They used large samples of ECG traces, over 3000 and 6000 for training and validation, respectively. Their model used ECG traces as short as 2 s with maximal performance at 5 s, meaning their system would cause only a minimal break in CPR before a shock decision is reached. Previous studies have found the average preshock pause in AEDs to be 18 s; therefore, this new technology could greatly reduce breaks in CPR.20 Krasteva et al found that their model outperformed five CNN models in the literature on public and OHCA databases, as well as a current AED shock advisory programme using a decision tree classifier, particularly on shorter 2 s ECG traces. This represents a significant step forward compared with previous models. While they used a large sample size for training and validation of their deep learning model, their data set was imbalanced, with more than four times as many non-shockable rhythm samples as shockable rhythm samples. This is a commonly encountered issue with current databases and can lead to bias within the algorithms.
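One widely used way to offset this kind of imbalance is to weight each class inversely to its frequency during training. The sketch below shows that heuristic on a hypothetical 4:1 data set; it is a general technique and is not a description of how Krasteva et al handled their data.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical data set with four times as many non-shockable samples.
labels = ["non-shockable"] * 800 + ["shockable"] * 200
class_weights = balanced_class_weights(labels)
```

Here each shockable sample counts four times as much as each non-shockable sample in the loss, so errors on the rarer class are not drowned out by the more common one.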
We can see from multiple studies in table 2 that CNN models have high sensitivity to detect shockable rhythms and high specificity to rule out non-shockable rhythms. Nonetheless, use of these more advanced machine learning algorithms is currently limited in practice due to the difficulties in embedding them into AEDs with their limited processing power. AEDs must be cost-effective to allow widespread use. While the above studies demonstrate high sensitivity and specificity, the algorithms have only been tested on computer-based systems and not in AED simulations. Bench studies, such as those used by Jekova et al to assess the accuracy of a commercial AED arrhythmia analysis algorithm in the presence of electromagnetic interference, help to evaluate algorithms in simulated real-life scenarios.21
Rhythm classification during CPR
One of the major limitations of AED ECG recognition is that CPR must be interrupted for reliable diagnosis, as current algorithms are unable to classify shockable and non-shockable rhythms during CPR due to artefacts. CPR is often suspended for 15 s or more for rhythm classification to occur.22 Even small breaks in CPR can impact outcomes; an increase in preshock pause of just 5 s decreases survival by 18%.20 The ability to continue chest compressions while analysing the rhythm would help to minimise interruptions. Table 3 summarises the use of machine learning technologies to analyse rhythms during CPR.
Adaptive filters have been used to remove CPR artefacts. These adaptive filters, such as least mean squares or recursive least squares, use signals recorded by defibrillators, including compression depth and thoracic impedance, to model the artefact and remove it prior to rhythm classification. Isasi et al used a recursive least squares filter to remove CPR artefacts and a CNN for rhythm classification. They found sensitivities and specificities of 95.8% and 96.1%, respectively. The use of such adaptive filters is limited practically as they rely on additional reference channels for information, which are not readily available in all standard AEDs.23 Similarly, Yu et al used noise-assisted multivariate empirical mode decomposition and least mean squares.24 Even after adaptive filtering, ECG segments recorded during CPR can still contain more noise than standard ECGs; machine learning algorithms designed for these segments can therefore improve accuracy. Yu et al constructed a neural network to assess the rhythms and identify VF. They found sensitivities >95% and specificities >80%. However, the CPR artefacts were taken from porcine ECGs of pigs in asystole receiving chest compressions, not from real-life OHCA ECGs.24
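The principle behind a least mean squares (LMS) adaptive filter can be sketched as follows. The example assumes a synthetic 5 Hz waveform standing in for the underlying rhythm, a 2 Hz sinusoid standing in for a compression-related reference channel, and an artefact that is a scaled copy of that reference; the tap count and step size are arbitrary illustrative choices rather than values from the cited studies.

```python
import math


def lms_filter(noisy, reference, n_taps=4, step_size=0.02):
    """Estimate the artefact from the reference channel and subtract it sample by sample."""
    weights = [0.0] * n_taps
    cleaned = []
    for n in range(len(noisy)):
        # Most recent reference samples, zero-padded at the start of the recording.
        recent = [reference[n - k] if n - k >= 0 else 0.0 for k in range(n_taps)]
        artefact_estimate = sum(w * x for w, x in zip(weights, recent))
        residual = noisy[n] - artefact_estimate  # the cleaned ECG sample
        # LMS update: nudge the weights to shrink the residual correlated with the reference.
        weights = [w + 2 * step_size * residual * x for w, x in zip(weights, recent)]
        cleaned.append(residual)
    return cleaned


# Synthetic signals sampled at 100 Hz for 30 s (all values are assumptions).
fs = 100
n_samples = 3000
reference = [math.sin(2 * math.pi * 2 * i / fs) for i in range(n_samples)]
ecg = [0.5 * math.sin(2 * math.pi * 5 * i / fs) for i in range(n_samples)]
noisy = [e + 3.0 * r for e, r in zip(ecg, reference)]

cleaned = lms_filter(noisy, reference)
tail = range(2000, n_samples)
error_before = sum((noisy[i] - ecg[i]) ** 2 for i in tail)
error_after = sum((cleaned[i] - ecg[i]) ** 2 for i in tail)
```

The filter has no access to the clean rhythm; it can only remove what correlates with the reference channel, which is why this approach depends on the defibrillator recording such a channel.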
Didon et al developed a new protocol termed ‘Analyse While Compressing’ (AWC). AWC is a two-step process where the rhythm is initially analysed during chest compressions and if a shock is advised, the rhythm is confirmed in the absence of chest compressions prior to shock delivery. Reconfirmation of rhythm was still required in 34.4% of non-shockable rhythm cases where the rhythm was not able to be accurately classified, therefore CPR interruptions still took place.25
To avoid the need for adaptive filters or external feedback devices, end-to-end analysis of the rhythm has been evaluated. Jekova et al aimed to optimise an end-to-end CNN model for shock advisory decision during CPR using real-life AED recordings in OHCA.26 Their CNN was able to extract features from raw ECGs during CPR with sensitivities and specificities of 89.0% and 91.7%, respectively. They tested their model on 5591 real-life cardiac arrest rhythms during CPR. Nevertheless, their sensitivities and specificities remain below the American Heart Association (AHA) recommendations for SAA by 1% for VF and 3.9% for asystole.27 Their database unfortunately lacked enough shockable VT rhythms, less than 0.2% of the total number of rhythms, therefore they were unable to report statistically significant sensitivities for VT.26 There is scope for further optimisation of the model possibly with further training datasets or additional layers and channels in the CNN to make the model useful clinically.
Implantable Cardioverter Defibrillators
ICDs rely on recognition of life-threatening VT and VF rhythms before delivering a shock. The SAA must differentiate between shockable rhythms and non-shockable rhythms including normal sinus rhythm, supraventricular tachycardias (SVTs), sinus bradycardia, AF and idioventricular rhythms. The SAA must have high sensitivity for shockable rhythms and high specificity for non-shockable rhythms, where the delivery of shock will confer no benefit and can even result in deterioration of the rhythm. Given the catastrophic consequences of missing potentially fatal rhythms, ICDs are programmed with a high sensitivity threshold in order to avoid missed shocks. However, this can lead to high numbers of inappropriate shocks. As a result of these shocks, there are device complications such as reduced battery life and requirement of earlier reimplantation. Moreover, for the patient, there is pain associated with the shocks, worse quality of life and increased risk of dangerous arrhythmias.28
Table 4 summarises our search on the use of AI in ICDs. Outside of the SAA, machine learning can be used to predict appropriate candidates for ICD insertion and identify adverse events secondary to ICD including risk of electrical storm. The use of machine learning to predict the success of defibrillation will be discussed below.
Electrical storm is a life-threatening condition defined as three or more sustained episodes of VT, VF or appropriate ICD shocks in a 24-hour period. This can be life threatening despite an ICD and, therefore, identifying those at high risk is important. Models for prediction of electrical storm have been assessed; they found percentage of ventricular pacing, cycle length parameters and number of previously untreated tachycardias to be risk factors.29 30
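The definition above translates directly into a sliding-window check over episode timestamps. In the sketch below, the function name and the example timestamps are hypothetical, and treating an interval of exactly 24 hours as falling within the window is an assumption.

```python
from datetime import datetime, timedelta


def has_electrical_storm(episode_times, min_episodes=3, window=timedelta(hours=24)):
    """Return True if at least min_episodes episodes fall within any single window."""
    times = sorted(episode_times)
    for i in range(len(times) - min_episodes + 1):
        # Compare the first and last of each run of min_episodes consecutive episodes.
        if times[i + min_episodes - 1] - times[i] <= window:
            return True
    return False


# Hypothetical device logs of sustained VT/VF episodes or appropriate shocks.
clustered = [
    datetime(2021, 4, 3, 2, 0),
    datetime(2021, 4, 3, 6, 30),
    datetime(2021, 4, 3, 11, 45),
]
spread_out = [
    datetime(2021, 4, 1, 2, 0),
    datetime(2021, 4, 2, 6, 30),
    datetime(2021, 4, 3, 11, 45),
]
```

A check of this kind only labels a storm once it has occurred; the prediction models cited above aim to identify patients at risk before that point.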
Predicting success of defibrillator shocks
There are multiple potential benefits to the prediction of successful defibrillation. Currently, in OHCA, shocks are delivered, depending on rhythm assessment, following 2 min of CPR.31 This resuscitation protocol does not consider the likelihood of shock delivery being successful at any point during the arrest. AI algorithms can be used to predict likelihood of shock success in the hopes that shocks could be delivered at the optimum time—a summary of these papers is in table 5.
SVMs have been used to predict successful defibrillation in VF arrest.32–34 Multiple VF waveform characteristics were used in these studies; the best predictors of termination of VF included amplitude spectrum area (a frequency domain characteristic) and slope and root mean square amplitude (time domain characteristics). Howe et al found an accuracy of 81.9% using their model with the aforementioned VF waveform characteristics.33 However, this was only based on a small retrospective study of 41 patients with 115 defibrillation ECGs. Larger sample sizes would be required to validate this system. The accuracy of defibrillation success prediction was improved with waveform capnography. Capnography is being used more frequently in cardiac arrest scenarios as it can also be used for early indication of return of spontaneous circulation. However, use would be limited in community AEDs, where capnography is not commonplace and could be difficult to implement without training, and in ICDs, where it is not available.
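The three waveform characteristics named above can be computed from a short VF segment as sketched below. The definitions used here are common simple forms and are assumptions: amplitude spectrum area is taken as the sum of spectral amplitude multiplied by frequency over an assumed 2 to 48 Hz band, and slope as the mean absolute change per second. The cited studies may use different bands, windows or normalisation.

```python
import cmath
import math


def rms_amplitude(signal):
    """Root mean square amplitude (time domain)."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))


def mean_slope(signal, fs):
    """Mean absolute change between consecutive samples, per second (time domain)."""
    total_change = sum(abs(b - a) for a, b in zip(signal, signal[1:]))
    return total_change * fs / (len(signal) - 1)


def amplitude_spectrum_area(signal, fs, f_low=2.0, f_high=48.0):
    """Sum of amplitude x frequency over the chosen band (frequency domain)."""
    n = len(signal)
    total = 0.0
    for k in range(1, n // 2):
        freq = k * fs / n
        if f_low <= freq <= f_high:
            # Direct discrete Fourier transform of bin k (slow but dependency-free).
            coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            total += (2 * abs(coeff) / n) * freq
    return total


# Synthetic 4 s segments at 100 Hz: unit-amplitude sinusoids at 5 Hz and 8 Hz.
fs = 100
slow_vf = [math.sin(2 * math.pi * 5 * i / fs) for i in range(400)]
fast_vf = [math.sin(2 * math.pi * 8 * i / fs) for i in range(400)]
```

For a pure sinusoid of unit amplitude, the amplitude spectrum area equals its frequency, so the 8 Hz segment scores higher than the 5 Hz segment even though both have the same root mean square amplitude.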
Shandilya et al constructed a similar SVM algorithm assessing VF waveform characteristics with accuracy of 83.3%.34 The patients in their study had received low-energy (120 J) shocks. While this is considered equivalent to higher energy shocks, it may affect how the results can be compared with similar studies. VF is the initial rhythm in only 20%–30% of cardiac arrests.35 We have not yet seen studies assessing prediction of shock success in other rhythms such as VT.
In a more recent paper, Shandilya et al performed a retrospective analysis of 153 patients with OHCA who received at least one shock for VF. Using a multiple domain integrative model, a type of AI model, to classify ECG rhythms and predict defibrillation success, they found 78.8% accuracy with ECG rhythms alone. As above, addition of end-tidal CO2 increased accuracy to 83.3%; unfortunately, this information was only available for 48 patients.36 They did not control for preshock pauses and ‘no-flow’ time before defibrillation, which have previously been shown to impact success.35 This was a relatively small study, and larger sample sizes will be required to get more meaningful data. Current sensitivities and specificities are unlikely to be sufficient to justify changing the current protocols.
AI could also be used to aid decision-making for implantation of an ICD. Patients with previous MI are separated into high arrhythmia risk groups—who could benefit from an ICD—and low arrhythmia risk groups based on clinical guidelines. Clearly, it would be beneficial to risk stratify patients individually to appropriately provide ICDs to those who might benefit. Markers such as left ventricular ejection fraction and myocardial scar size have been used in AI systems to evaluate arrhythmia risk. Kotu et al used cardiac MRI features including size, location and texture of scarred myocardium to characterise labelled high and low risk groups.32 Using an SVM classifier, they were able to obtain an average accuracy of 92.6% with a combination of scar size and heterogeneity. This technology could be used clinically to aid decision-making, nevertheless the final decision would still need to be clinician led and on a case-by-case basis.
As well as predicting the success of ICD shocks, predicting the need for a shock prior to delivery would be useful clinically to warn patients and avoid side effects of ‘surprise shocks’. Au-Yeung et al used data from the Sudden Cardiac Death Heart Failure Trial where they collected preventricular tachyarrhythmia and regular rhythms from patients with congestive heart failure.37 They analysed heart rate variability data 5 min and 10 s before tachyarrhythmia in an attempt to identify a ‘signature’ of VF/VT onset. They used both random forest and SVM to assess the data. They found a specificity of 75% for 5 min prediction and 80% for 10 s prediction. With these results, however, there would likely be many false positives. The study was limited as it only assessed patients with heart failure. It is possible that using additional features or more sensitive AI programmes could yield higher sensitivities that could be used in clinical practice.
We have outlined above the enormous potential of AI in cardiology and specifically in AEDs and ICDs. Machine learning offers exciting prospects to reduce peri-shock pauses both with increased efficiency of SAAs and the ability of SAAs to classify rhythms without interrupting CPR. In ICDs, machine learning has a number of applications, which could improve the quality of life of patients, including prediction of shock and electrical storm. Despite the enormous potential of AI in the field of defibrillators, there are some limitations to be aware of. Commercially available AEDs already exhibit high specificity. Compared with ICDs, AEDs favour specificity over sensitivity to reduce inappropriate shocks. International standards advise AED sensitivity >90% and specificity of >95% for detecting coarse VF.27 Nishiyama et al found on assessment of four commercially available AEDs that VF was diagnosed and treated correctly in almost all cases.38 Given the technology already has such high rates, it could be argued that newer AI algorithms increase the cost and complexity of machines with minimal gain. However, none of the AEDs investigated could obtain both a >75% sensitivity for VT and >95% specificity for SVT.38 In the future, improving VT and SVT discrimination could be a key area for AI.
Overfitting represents a challenge to AI algorithms, whereby the model has learnt in such a way that the rules are only applicable to the training sample and are no longer generalisable.6 7 As well as drop-out regularisation, the large data sets now available help to mitigate overfitting in training of algorithms. In medical imaging, data augmentation has been used to artificially increase the data sets available by creating variants of original images in the data sets.39 Whether this could also be used with ECG traces is unclear. Multiple studies that were used in this review also discussed the issue of not having a single large database to use, so that algorithms could be compared. Therefore, the benefits of a single large database would be twofold.
A common issue within the field of AI is the ‘black-box problem’. This is the fact that some AI models, in particular, neural networks, lack interpretability in their decision-making process.40 Many of the studies we reported above have detailed in their methods, which parameters have been used in their algorithms. Nonetheless, it can be difficult to fully explain the outcomes reached based on these parameters. As neural networks become more complex with increasing numbers of layers, they become more difficult to interpret. Explainable AI has been a key area of research, particularly with potential medicolegal issues of incorrect shock decisions.
In the USA, bystander AED use occurs in only 2% of OHCAs.41 Another application of AI which we have not yet discussed is in drone delivery of AEDs in order to increase availability of AEDs and reduce time to initial defibrillation. A recent simulation study in rural Canada found that drone-delivered AEDs decreased time to defibrillation by between 1.8 min and 8.0 min, which would have a great impact on mortality.42 AI could be used to calculate optimum geographical location and possible patrols to allow greatest access to AEDs. There remain some limitations with drone delivery currently, including flight path restrictions and an inability to fly in rainy and windy conditions, which would need to be overcome before widespread use.
One of the most exciting future advances in machine learning use in AEDs is in rhythm recognition during CPR. This technology has developed from adaptive filters to remove CPR artefacts to the development of end-to-end SAAs. AHA recommended sensitivities and specificities have not yet been reached but with further optimisation of algorithms, this could become a reality soon. One key step will be the development of a large database of real-life AED traces during CPR. Jekova et al were able to use a large database but the proportions of VF for example did not meet criteria, and for further optimisation, more studies will be required.26 Current models have not been able to reduce ‘hands-off’ time completely as they often still require reconfirmation of the rhythm in the absence of chest compressions.25 Further optimisation of these algorithms remains an exciting area of research.
Machine learning remains a promising new technology for SAAs in AEDs and ICDs. These technologies have the potential to increase survival in OHCA by removing the need to stop CPR during resuscitation and by optimising the timing of shock delivery. They can also be used to help diagnose the cause of arrest, for example, previous MI, and to improve patient quality of life by reducing inappropriate ICD shocks, all of which could have life-changing outcomes for patients. Even small improvements in sensitivities and specificities of these widely used defibrillators could save hundreds of lives. In the future, a single large database of real-life training and testing ECGs would be useful for building and assessing algorithms to allow for comparison of different technologies. We hope to see this technology being integrated into clinical practice in the near future.
Patient consent for publication
Contributors GB—lead author. DA, MA, NK, NP and VM—tables and reviewing references. MA, SC, DP and JJHB—editing and co-authors.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.