Introduction

Recent health policy initiatives prioritize improved organization and delivery of care to reduce fragmentation and prevent expensive complications of chronic illness or iatrogenic disease. This approach may miss an important opportunity to address quality concerns and rising health care spending: the overuse of low-value services. In 2012, the Institute of Medicine estimated that 30% ($750 billion) of annual health care spending is wasteful and that over half of this spending is on unnecessary services and inefficient care.1 Elimination of low-value services as a cost control strategy has much economic appeal because it would improve quality while reducing costs.

Society increasingly recognizes the importance of excess medical care, but it is difficult to pinpoint the services and populations that represent health care overuse. There is general agreement on the definition of overtreatment (treatment of indolent disease, aggressive treatment at the end of life) and overdiagnosis (diagnosis and treatment of disease that would not have affected the lives of patients), but consensus has not been sufficient to facilitate their identification in clinical practice.2 Identification of low-value and potentially harmful services is an essential first step in improving quality and reducing overuse. The second critical step is engaging physicians and patients in efforts to reduce use of these services. Together with physician specialty societies, the American Board of Internal Medicine (ABIM) Foundation launched the Choosing Wisely initiative in 2011 to advance both of these aims. Over 60 participating physician societies have now each identified five specialty-specific, low-value services whose avoidance would improve the efficiency of care through higher quality, reduced risks and lower costs.3

In this study, we developed claims-based algorithms to examine 11 services identified in one or more Choosing Wisely lists and estimated the prevalence of these services at the regional and national levels. We created a regional composite measure of overuse based on the prevalence of these 11 services and explored the demographic, health and health care system correlates of overuse at a regional level. Based on this information, we estimate the magnitude of the harm and wasteful spending attributable to each service. This information may aid decision makers in prioritizing areas for intervention and provide a baseline against which to test the impact of policies aimed at reducing use of low-value services.

Methods

Data

We used 100% Medicare administrative claims data (2006–2011) to determine the prevalence of low-value services. We limited our analysis to fee-for-service beneficiaries enrolled in Medicare Parts A and B (inpatient and outpatient insurance); we also required enrollment in Part D (prescription insurance) for three measures of Choosing Wisely services related to prescription drugs (analyses employing Part D data were limited to a 40% sample). We used residential ZIP codes to assign each beneficiary to a Dartmouth Atlas of Health Care hospital referral region (HRR).

Choosing Wisely Measurement

We developed claims-based algorithms for 11 services, representing 37 Choosing Wisely recommendations due to overlap in recommendations across specialty societies. A team of two physician health services researchers, two health economists and a Medicare claims data analyst reviewed all 130 Choosing Wisely recommendations available as of July 1, 2013, for inclusion in the study. Each recommendation was scored based on: (i) the applicability to the over-65 Medicare population; and (ii) the feasibility of measuring prevalence using claims. Those Choosing Wisely recommendations that scored highest were included in the analysis. Choosing Wisely services excluded from analysis were either universally difficult to measure in claims data (e.g., “Don’t perform stenting of non-culprit lesions during percutaneous coronary intervention for uncomplicated hemodynamically stable ST-segment elevation myocardial infarction”) or not applicable to the elderly Medicare population (e.g., “Don’t schedule elective, non-medically indicated inductions of labor or Cesarean deliveries before 39 weeks, 0 days gestational age”). Measured services included a non-indicated subset of the following: back pain imaging; benign prostatic hypertrophy imaging; cardiac screening in low-risk patients; cervical cancer screening; dual-energy x-ray absorptiometry testing; preoperative cardiac testing in low-risk patients ahead of low-risk surgery (cataract and non-cardiac); vitamin D screening; prescribing antipsychotics and feeding tubes in dementia patients; and prescribing opioids for migraine. We categorize eight of the measures as low-value diagnostic services and three as low-value treatments (Table 1).

Table 1 Measures developed to assess prevalence of services identified as low-value through Choosing Wisely

We used a combination of International Classification of Diseases, Ninth Revision (ICD-9), and current procedural terminology (CPT) codes to construct cohorts at risk for 11 Choosing Wisely services and to identify health service events highlighted by the Choosing Wisely recommendations (Online Appendix 2). We also used Medicare Part D prescription records, where applicable, for cohort inclusion/exclusion or to identify Choosing Wisely prescription service events. In all cases, we conservatively excluded beneficiaries not targeted by the Choosing Wisely recommendation. We limited our analysis to non-indicated tests and procedures, excluding services with claims diagnoses that suggest appropriate medical indication. We drew from measure definitions in the literature and conducted claims-based sensitivity analyses to optimize the measure construct when possible.4–10 For example, we studied the characteristics and follow-up events for those we deemed “low risk” for the cardiac screening measure. All measures not drawn from the literature were developed by a clinician; each was then reviewed by a second clinician. Disagreements were resolved via discussion. Although we used 2006–2011 data, some measures were limited to smaller windows to permit sufficient look-back periods within the data to identify, for example, prevalent disease states (e.g., long-standing back pain that would result in denominator exclusion for the back pain imaging measure). In Table 1, we describe the data, the time window for cohort qualification, event definitions, measure-specific cohorts, and cohort and event exclusions for each measure.
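As an illustration only, the general shape of such a claims-based measure (an at-risk cohort, denominator exclusions, and a numerator event) can be sketched in Python. This is not the study's algorithm: the code sets below are made-up placeholders rather than real ICD-9 or CPT codes, and the `Beneficiary` record, `in_cohort` and `prevalence` are hypothetical names introduced for this sketch. The actual definitions are those in Table 1 and Online Appendix 2.

```python
from dataclasses import dataclass, field


@dataclass
class Beneficiary:
    bene_id: str
    # Diagnosis codes observed during the look-back window (placeholders here).
    diagnoses: set = field(default_factory=set)
    # Procedure codes billed during the measurement window (placeholders here).
    procedures: set = field(default_factory=set)


# Hypothetical placeholder code sets; NOT real ICD-9 or CPT codes.
COHORT_DX = {"DX_LOW_BACK_PAIN"}                     # defines the at-risk cohort
EXCLUSION_DX = {"DX_CANCER", "DX_TRAUMA",            # diagnoses suggesting the
                "DX_LONGSTANDING_BACK_PAIN"}         # service may be indicated
EVENT_PROC = {"PROC_LUMBAR_MRI", "PROC_LUMBAR_XRAY"}  # the low-value service


def in_cohort(b):
    """At risk: has a qualifying diagnosis and no excluding diagnosis."""
    return bool(b.diagnoses & COHORT_DX) and not (b.diagnoses & EXCLUSION_DX)


def prevalence(beneficiaries):
    """Share of the at-risk cohort with at least one low-value event."""
    cohort = [b for b in beneficiaries if in_cohort(b)]
    if not cohort:
        return None  # measure is undefined when nobody is at risk
    events = sum(1 for b in cohort if b.procedures & EVENT_PROC)
    return events / len(cohort)
```

Applying the exclusions to the denominator rather than only to the numerator keeps beneficiaries with a plausible indication out of the measure entirely, which is the conservative direction described above.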

Area-Level Variables

Based on a conceptual framework for decisions regarding health care services, we created HRR-level covariates to include in an exploratory regression analysis.11 These HRR-level measures characterized population demographics, health and health care systems for each area based on Medicare, Behavioral Risk Factor Surveillance System (BRFSS), U.S. Census and American Community Survey data. Explanatory variables included the following: per-beneficiary Medicare spending (a measure of health care use intensity); physician group concentration (a measure of market competition); the ratio of specialists to primary care physicians; the age-, sex- and race-adjusted mortality rate and the percent of adults reporting fair or poor health (measures of health state); the percent of Medicare beneficiaries of black race; the percent of Medicare beneficiaries of Hispanic ethnicity; a Medicare effective care use score; the percent of HRR residents living in a rural area; and the percent of residents below 150% of the federal poverty limit.

Statistical Analysis

We calculated an average annual prevalence in the at-risk population for each Choosing Wisely service, both nationally and at the HRR level, along with the coefficient of variation across HRRs. We estimated national spending associated with each service by multiplying observed average spending per low-value care event by the national number of low-value care events among fee-for-service Medicare beneficiaries. We constructed an overall composite measure of low-value care for each HRR, equal to the average of the 11 standardized rates (z scores or standard deviations from the mean, Cronbach’s alpha = 0.66). We examined geographic variation in the overall composite measure by dividing the HRRs into quintiles and mapping the results. We used ordinary least squares regression to determine the association of HRR-level characteristics with the composite low-value care scores (N = 306 HRRs).
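The coefficient of variation, the standardized rates and the composite can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's SAS or Stata code: it uses the population standard deviation (the text does not say which form was used), weights every measure equally, and assigns quintiles by simple rank. All function names are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(rates):
    """Standard deviation across HRRs divided by the mean across HRRs."""
    return pstdev(rates) / mean(rates)


def z_scores(rates):
    """Standardize one measure: distance from the mean in standard deviations."""
    mu, sigma = mean(rates), pstdev(rates)
    return [(r - mu) / sigma for r in rates]


def composite_scores(rates_by_measure):
    """Average the standardized rates across measures for each HRR.

    rates_by_measure maps a measure name to a list of HRR-level rates,
    with HRRs in the same order in every list.
    """
    standardized = [z_scores(rates) for rates in rates_by_measure.values()]
    return [mean(hrr_z) for hrr_z in zip(*standardized)]


def quintiles_by_rank(scores):
    """Assign each HRR to a quintile (1 = lowest composite score)."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    result = [0] * len(scores)
    for rank, idx in enumerate(order):
        result[idx] = rank * 5 // len(scores) + 1
    return result
```

Standardizing before averaging keeps a high-prevalence measure from dominating the composite simply because its rates are numerically larger.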

Statistical analyses were performed using SAS and Stata software. The study was approved by the institutional review board at Dartmouth College. See Online Appendix 1 for further methodology details.

Results

Of the 11 health care services included in our analysis (Table 1), preoperative cardiac testing ahead of low-risk non-cardiac surgery was the most prevalent, with 46.5% of those identified receiving preoperative tests (Table 2). Use of antipsychotics in dementia patients and of opioids in migraine patients was also highly prevalent (31.0% and 23.6%, respectively). Low-value services with low prevalence included non-indicated imaging for benign prostatic hypertrophy (1.2%), non-indicated cervical cancer screening (3.1%) and non-indicated vitamin D screening (8.8%).

Table 2 Average annual prevalence of, variation in and spending associated with Choosing Wisely procedures and tests (N = 306 hospital referral regions)

The prevalence of low-value care varied across the United States. Non-indicated imaging for benign prostatic hypertrophy had the highest variation (coefficient of variation of 0.82), likely in part due to its low prevalence and relatively small affected population. Use of antipsychotics in dementia patients and non-indicated imaging for back pain had relatively low levels of regional variation (coefficients of variation of 0.12 and 0.16, respectively). Overall, use of low-value services, as indicated by our composite measure, was highest in the southern and eastern parts of the United States (Fig. 1).

Fig. 1 Variation in the composite measure of Choosing Wisely test and treatment use (N = 306 hospital referral regions)

The spending amount associated with each of the low-value services was a function of the prevalence of the service, the size of the affected population and the cost of test or treatment. Non-indicated use of antipsychotics in dementia patients had the highest amount of associated spending ($765.1 million), followed by non-indicated vitamin D screening ($198.6 million). Non-indicated imaging for benign prostatic hypertrophy and non-indicated preoperative cardiac testing for cataract surgery had the lowest levels of associated spending ($0.3 and $0.6 million, respectively).
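The way these three factors combine can be sketched with a short Python function. The function name and parameters are hypothetical, and any inputs passed to it are illustrative rather than values from Table 2.

```python
def estimated_spending(at_risk_population, prevalence, mean_spending_per_event):
    """Spending = number of low-value events x average spending per event.

    The number of events is approximated here as the at-risk population
    multiplied by the prevalence of the service in that population.
    """
    events = at_risk_population * prevalence
    return events * mean_spending_per_event
```

Because the three factors multiply, a common but inexpensive test in a small cohort can account for far less spending than a less common but costly treatment in a large one.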

Health care and health system characteristics were associated with use of low-value services at the regional level in the Medicare population (Table 3). In our exploratory regression model, we found that higher age-, sex-, race- and price-adjusted total Medicare spending per capita, a higher ratio of specialists to primary care physicians, a higher proportion of minority beneficiaries and a higher proportion of residents with poor or fair health were each associated with greater low-value care utilization. In contrast, a higher proportion of residents with income under 150% of the federal poverty limit and a higher physician group concentration were associated with lower low-value care utilization. Notably, use of low-value services was not associated with the quality index, a standardized rate of underuse for a collection of measures thought to represent more effective care.

Table 3 Multivariate linear regression of regional characteristics associated with Choosing Wisely service use (N = 306 hospital referral regions)

Discussion

The Choosing Wisely initiative identified a set of low-value services via high-level expert opinion and consensus. We carefully constructed 11 claims-based algorithms to quantify and track utilization likely to represent overuse by relying on the recommendations from the Choosing Wisely program. Analysis of these services revealed substantial overuse and variation in overuse in the Medicare population by measure and geography. From both patient and societal perspectives, use of these services may have substantial health and economic implications. Some of the measured services represent treatments that may directly confer risk of harm (e.g., opioids in migraine patients), some may directly confer risk of harm and result in significant spending (e.g., antipsychotics in dementia patients) and others may indirectly confer risk of downstream harm by prompting additional testing and possibly resulting in false positive results (e.g., non-indicated preoperative cardiac testing). Our analysis provides an estimate of the opportunity for improving quality while reducing spending on these 11 services.

We found adjusted Medicare spending was positively associated with use of low-value services after controlling for regional health indicators. Many areas identified by others as having consistently high adjusted Medicare spending (e.g., McAllen, TX; Manhattan and Long Island, NY; Miami, FL; and Los Angeles, CA) also have high use of low-value services, indicating that at least some of their high spending results from wasteful services. The strong association between the proportion of racial and ethnic minority beneficiaries in the region and greater use of low-value services in these exploratory regressions raises questions. We suspect this association is not due to individual-level differences in treatment between racial and ethnic groups, but rather is an artifact of practice styles in regions where these population sub-groups live. Previous research has shown that where a patient lives can affect the level and quality of health care the patient receives independent of individual characteristics, and that overuse patterns do not differ by insurance type.12–15

Recent evidence indicates provider organizations and regions with a higher proportion of primary care physicians have lower utilization and spending and better use of recommended preventive and chronic care.16 Moreover, workforce characteristics explain 42% of the state-level variation in Medicare spending per beneficiary.17 The magnitude of the association between specialist ratio and low-value care in our study echoes these results, but does not suggest an obvious policy intervention. It is unknown whether this observation reflects excess testing by specialists or by all types of physicians in regions with a higher relative concentration of specialists.

Overuse of other services not included in our analysis may display different patterns than the 11 services we measured. Our estimates of overuse, however, include generalist- and specialist-directed care, expensive and inexpensive tests and procedures, and a broad range of specialty society lists. The main limitation of this research is our reliance on administrative claims to identify and describe use of low-value services.18 Claims may not provide the clinical detail needed to definitively identify certain examples of low-value care. Claims may miss important patient history such as long-term, untreated back pain that contributes to clinical decision-making and justifies services that would appear in claims as low-value. Often the same service can be high- or low-value depending on the patient; if the cohort exclusions are not adequately detailed, the measure will represent utilization of the procedure and not overuse. While claims data are not ideal for measurement of patient risk or symptoms, we provide algorithms to represent each recommendation and believe they are conservative starting points to estimate the use of these services, associated spending, variation in spending and correlates of use. These algorithms are valuable for research and discussion; use of these algorithms for quality measurement or payment by payers will require validation by chart review.

We do not expect the “right” rate for these claims-based measures to be zero, but the differences across geography suggest what is achievable. In research on claims-based measurement of cancer treatment quality, Earle et al. define the 10th percentile as the benchmark for health care systems to set as a goal.19 In their work evaluating the intensity of end-of-life cancer care, for example, this meant that hospitals would be providing appropriate-intensity care if less than 2% of patients started a new chemotherapy regimen in the last 30 days of life. We report the 25th percentile for each measure in Table 2 as a conservative initial benchmark for clinicians and health systems to work toward. This benchmark may have to change as the quality of care improves.
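A percentile benchmark of this kind can be computed from HRR-level rates as in the sketch below. It assumes linear interpolation between observed rates (the "inclusive" method of Python's `statistics.quantiles`); the text does not state which percentile convention was used, and the function names and HRR labels are hypothetical.

```python
from statistics import quantiles


def benchmark_25th(hrr_rates):
    """25th percentile of HRR-level rates (needs at least two regions).

    Assumption: the "inclusive" method, which interpolates linearly
    between the observed minimum and maximum.
    """
    return quantiles(hrr_rates, n=4, method="inclusive")[0]


def gap_to_benchmark(rate_by_hrr):
    """Percentage-point reduction each HRR would need to reach the benchmark.

    rate_by_hrr maps an HRR label to its prevalence for one measure;
    regions already at or below the benchmark get a gap of zero.
    """
    benchmark = benchmark_25th(list(rate_by_hrr.values()))
    return {hrr: max(0.0, rate - benchmark) for hrr, rate in rate_by_hrr.items()}
```

Recomputing the benchmark on newer data would lower the target as practice improves, which is one way the benchmark could change over time.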

Our analysis of the correlates of low-value care was exploratory and aimed at generating hypotheses. As an ecological correlation analysis, it was not based on individual patient- and provider-level modeling. Each low-value test or procedure likely has its own profile and is differentially affected by payment incentives, malpractice liability concerns, physician comfort with diagnostic uncertainty and patient demand for services, among other factors. Nonetheless, several of the patterns observed, including the association of higher spending and greater specialist supply with a greater provision of low-value care, are consistent with previous work and should serve as the basis for developing a conceptual framework for decision making around low-value care utilization.17,20

The measures developed for this study may help policymakers and payers focus attention on the forms of low-value care that are most harmful, prevalent or costly. Our conservative estimate of the spending for the low-value care services included in our analysis represents a small part of the overall cost problem and does not include other costs associated with the service or downstream costs, but is still an important starting point and opportunity for savings. Reduction in use of the services we measure will improve quality while lowering costs – changes that are hard to find in health care. Our measures also provide a baseline against which to test the impact of policies aimed at controlling costs and improving the efficiency of health care delivery, including, but not limited to, those that target low-value services directly.

The Choosing Wisely initiative has labeled services as low-value and has begun educating both patients and physicians through outreach material.21 Future work should examine the effects of the Choosing Wisely initiative and related programs on use of these services, as well as other reforms (such as accountable care organizations or value-based insurance design) that are intended to slow spending growth and reduce waste in health care. Identifying and eliminating low-value care is a critical component of health care reform and one in which careful measurement and targeting of policies will be essential to maximizing value and minimizing unintended harm.