SAS macros for estimation of direct adjusted cumulative incidence curves under proportional subdistribution hazards models

https://doi.org/10.1016/j.cmpb.2010.07.005

Abstract

The cumulative incidence function is commonly reported in studies with competing risks. The aim of this paper is to compute the treatment-specific cumulative incidence functions, adjusting for potentially imbalanced prognostic factors among treatment groups. The underlying regression model considered in this study is the proportional hazards model for a subdistribution function [1]. We propose estimating the direct adjusted cumulative incidences for each treatment using the pooled samples as the reference population. We develop two SAS macros for estimating the direct adjusted cumulative incidence function for each treatment based on two regression models. One model assumes the constant subdistribution hazard ratios between the treatments and the alternative model allows each treatment to have its own baseline subdistribution hazard function. The macros compute the standard errors for the direct adjusted cumulative incidence estimates, as well as the standard errors for the differences of adjusted cumulative incidence functions between any two treatments. Based on the macros’ output, one can assess treatment effects at predetermined time points. A real bone marrow transplant data example illustrates the practical utility of the SAS macros.

Introduction

In biomedical studies, problems often arise in analyzing competing risks data, in which each subject is at risk of failure from K different causes and failure due to one risk precludes the occurrence of any other risk. With competing risks data, one practical problem is to evaluate treatment efficacy as well as assess the effects of other predictors on the cumulative incidence of a specific risk. Researchers are often interested in comparing the cumulative incidence functions for patients receiving different treatments. The standard statistical approach is to construct the cause-specific hazard functions for all risks and model the cumulative incidence function as a function of all cause-specific hazards [2], [3], [4], [5]. This approach has been criticized for the following drawbacks: first, the hazards of all the risks need to be correctly modeled even when only one risk is of study interest; second, the cumulative incidence does not possess a monotonic relation to the covariate effect and it may be difficult to conclude the effect of a specific covariate on the cumulative incidence curve; third, it may be difficult to identify which specific covariates have time-varying effects on the cumulative incidence function.
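For reference, the dependence of the cumulative incidence on every cause-specific hazard can be written out explicitly. The notation below ($\lambda_k^{cs}$ and $\Lambda_k^{cs}$ for the cause-specific hazard of cause $k$ and its integral) is introduced only for this sketch and is not taken from the paper.

```latex
F_1(t) = \int_0^t \lambda_1^{cs}(u)\,
         \exp\Big\{-\sum_{k=1}^{K} \Lambda_k^{cs}(u)\Big\}\,du,
\qquad
\Lambda_k^{cs}(u) = \int_0^u \lambda_k^{cs}(s)\,ds .
```

Because the exponential term involves all $K$ causes, a covariate that raises $\lambda_1^{cs}$ need not raise $F_1$ if it also changes the hazards of the competing causes, which is the source of the first two drawbacks listed above.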

Some new regression approaches have been proposed to model the cumulative incidence function directly. Fine and Gray [1] proposed a Cox-type regression model on the subdistribution hazard function, $\lambda_1(t; z) = -d\log\{1 - F_1(t; z)\}/dt = \lambda_{10}(t)\exp(\beta^T z)$, where $F_1(t; z)$ and $\lambda_{10}(t)$ are the cumulative incidence function for given covariates $z$ and the baseline subdistribution hazard function for risk 1, respectively. Sun et al. [6] considered an alternative flexible model for the subdistribution hazard. Klein and Andersen [7] proposed a pseudo-value approach to conduct regression analysis on the cumulative incidence function at pre-fixed time points. Scheike et al. [8] constructed a linear transformation model on the cumulative incidence function based on binomial regression models. Direct regression modeling on the cumulative incidence function has been frequently adopted in biomedical research; however, the interpretation is somewhat awkward. Recently, Zhang and Fine [9] proposed inferences for the difference, ratio and odds ratio between cumulative incidence functions based on nonparametric estimates.
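Integrating the Fine and Gray subdistribution hazard gives the cumulative incidence in closed form, which is what makes the covariate effect monotone under this model. The symbol $\Lambda_{10}$ for the cumulative baseline subdistribution hazard is an assumption of this sketch.

```latex
\lambda_1(t; z) = -\frac{d}{dt}\log\{1 - F_1(t; z)\} = \lambda_{10}(t)\exp(\beta^T z)
\;\Longrightarrow\;
F_1(t; z) = 1 - \exp\{-\Lambda_{10}(t)\exp(\beta^T z)\},
\qquad
\Lambda_{10}(t) = \int_0^t \lambda_{10}(u)\,du .
```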

Some statistical packages have been developed to implement the aforementioned approaches. Fine and Gray’s (FG) model has been implemented in the cmprsk package in R. This package includes several useful features such as estimation of the cumulative incidence function and time-dependent covariates. Klein et al. [10] developed a SAS macro and an R function for the pseudo-value approach. The direct binomial regression modeling approach has been implemented in the R package timereg, which can be used to fit a class of flexible models and develop model identification procedures [11].

Direct adjustment is a common method to tackle the issue of imbalanced distributions of the risk factors (covariates) between treatment groups, when the treatment-specific survival quantities need to be reported. Chen and Tsiatis [12] proposed to compare the direct adjusted restricted mean lifetimes of different treatments for time to event data. The direct adjustment method has also been adopted for univariate survival data to generate the overall survival curves for different treatments [13], [14]. For this approach, each subject’s predicted survival probability is estimated with all possible treatment assignments, regardless of the actual treatment received. The direct adjusted survival probability of one treatment is obtained by averaging these predicted survival probabilities of this treatment over the pooled samples. The objective of this paper is to study direct adjusted cumulative incidence estimates for treatments.
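The same averaging idea carries over to cumulative incidence. The following is a minimal sketch, assuming a proportional subdistribution hazards model whose coefficients and baseline cumulative subdistribution hazard have already been estimated elsewhere. The function name, argument names, and input layout are hypothetical and are not taken from the SAS macros.

```python
import numpy as np

def direct_adjusted_cif(cum_base_haz, beta_trt, beta_cov, Z):
    """Direct adjusted cumulative incidence for each treatment (illustrative).

    cum_base_haz : (n_times,) baseline cumulative subdistribution hazard
    beta_trt     : (n_treatments,) treatment log hazard ratios (0 for reference)
    beta_cov     : (p,) coefficients of the adjustment covariates
    Z            : (n, p) covariates of the pooled sample
    Returns an (n_treatments, n_times) array.
    """
    cum_base_haz = np.asarray(cum_base_haz, dtype=float)
    beta_trt = np.asarray(beta_trt, dtype=float)
    lin_pred = np.asarray(Z, dtype=float) @ np.asarray(beta_cov, dtype=float)

    # Relative risk of every subject under every possible treatment,
    # regardless of the treatment actually received: shape (K, n).
    risk = np.exp(beta_trt[:, None] + lin_pred[None, :])

    # Predicted cumulative incidence per treatment, subject, and time: (K, n, T).
    cif = 1.0 - np.exp(-risk[:, :, None] * cum_base_haz[None, None, :])

    # Average over the pooled sample to obtain the direct adjusted curves.
    return cif.mean(axis=1)
```

Because every treatment is averaged over the same pooled covariate distribution, differences between the resulting curves reflect the treatment coefficients rather than covariate imbalance.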

We adopted the FG model [1] as the underlying regression model and developed a SAS macro to compute the direct adjusted cumulative incidence curves for treatments. In biomedical studies, often the treatment effect varies over time. To allow the treatments to have time-varying effects, we considered a stratified proportional subdistribution hazards model and developed an alternative SAS macro implementing the proposed stratified FG model. The macros also compute the standard errors for the adjusted cumulative incidence functions and for the difference between two cumulative incidence estimates. We demonstrate our SAS macros with bone marrow transplant data from the Center for International Blood and Marrow Transplant Research (CIBMTR).
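The two models can be contrasted in symbols as follows. This is a sketch only: the index $i$ for treatment, the coefficient $\gamma_i$, and the symbol $\Lambda_{1i0}$ are assumptions introduced here, and the notation may differ from the paper's own displayed equations.

```latex
% Unstratified: common baseline, constant treatment hazard ratios
\lambda_1(t; i, z) = \lambda_{10}(t)\exp(\gamma_i + \beta^T z)

% Stratified: treatment-specific baseline, so treatment effects may vary over time
\lambda_1(t; i, z) = \lambda_{1i0}(t)\exp(\beta^T z)

% Direct adjusted cumulative incidence for treatment i under the stratified model
\hat F_{1i}(t) = \frac{1}{n}\sum_{j=1}^{n}
  \Big[1 - \exp\{-\hat\Lambda_{1i0}(t)\exp(\hat\beta^T Z_j)\}\Big]
```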

The outline of the remainder of the paper is as follows. In Section 2, we briefly describe the Cox-type model on a subdistribution hazard function and present the direct adjusted cumulative incidence curves for treatment groups. The SAS macros are introduced in Section 3. Bone marrow transplant data is analyzed in Section 4 to demonstrate our macros. The concluding remarks are given in Section 5.

Direct adjusted cumulative incidence curve

Let $T_j$ and $C_j$ be the event time and right censoring time of the $j$th individual and let $\epsilon_j$ denote the failure type. We assume that we observe $n$ independent identically distributed (i.i.d.) replications of $\{(X_j, \Delta_j, Z_j), j = 1, \ldots, n\}$, where $X_j = \min(T_j, C_j)$, $\Delta_j = \epsilon_j I(T_j \le C_j)$, and $Z_j = (Z_{j,1}, \ldots, Z_{j,p})^T$ is the vector of covariates. We assume that $(T_j, \epsilon_j)$ are independent of $C_j$ given the covariates $Z_j$.
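As a small illustration of this data structure, the sketch below builds the observed time and the status indicator from latent event times, censoring times, and failure types. The function name is hypothetical; the convention that status 0 marks a censored observation follows from the indicator being zero when censoring occurs first.

```python
import numpy as np

def observed_competing_risks(T, C, eps):
    """Return (X, Delta) from event times T, censoring times C, failure types eps."""
    T = np.asarray(T, dtype=float)
    C = np.asarray(C, dtype=float)
    eps = np.asarray(eps, dtype=int)

    X = np.minimum(T, C)       # observed time: whichever comes first
    delta = eps * (T <= C)     # failure type if the event is observed, else 0
    return X, delta

# Three subjects: failure from cause 1, censored, failure from cause 2 (tie counts as event).
X, delta = observed_competing_risks([2.0, 5.0, 3.0], [4.0, 1.0, 3.0], [1, 2, 2])
```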

To directly assess covariate effects on the cumulative incidence of a specific risk, Fine and Gray [1]

The SAS macros

We have developed two SAS macros, %CIFCOX and %CIFSTRATA, for computing the direct adjusted cumulative incidence curves and the associated standard errors based on models (2.2) and (2.3), respectively. The macros have been saved in two text files, “CIFCOX.txt” and “CIFSTRATA.txt”. The input data set structure and the macro parameters are the same for both macros. In this section, we use %CIFSTRATA to illustrate data preparation and how to invoke the macros.

The macro should be loaded to the current

The example

For illustration purposes, we consider a transplant outcome data set from the Center for International Blood and Marrow Transplant Research (CIBMTR) [15], [16]. The CIBMTR comprises clinical and basic scientists who confidentially share data on their blood and bone marrow transplant patients with the CIBMTR Data Collection Center located at the Medical College of Wisconsin. The CIBMTR is a repository of information about results of transplants at more than 450 transplant centers worldwide. The

Concluding remarks

We have presented two SAS macros to compute the direct adjusted cumulative incidence functions for different treatments based on the unstratified and stratified FG models for a subdistribution hazard function. The treatment effect on a specific cause of failure can be assessed by constructing a confidence interval for the difference between two direct adjusted estimates. An inference based on a confidence interval, however, is only valid for assessing the effect at a given time point. For
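A pointwise interval of this kind can be sketched as follows, assuming a Wald-type normal approximation (an assumption of this example, not necessarily the construction used in the paper). The function and argument names are hypothetical. The standard error of the difference should be supplied directly, since the two adjusted estimates come from the same fitted model and are generally correlated.

```python
from statistics import NormalDist

def difference_ci(cif_a, cif_b, se_diff, level=0.95):
    """Pointwise Wald-type interval for cif_a - cif_b at one fixed time point."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for a 95% interval
    diff = cif_a - cif_b
    return diff, diff - z * se_diff, diff + z * se_diff

# Illustrative numbers only, not results from the paper.
diff, lower, upper = difference_ci(0.30, 0.20, 0.05)
significant = lower > 0 or upper < 0  # interval excludes zero at this time point
```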

Acknowledgment

Mei-Jie Zhang was supported by National Cancer Institute grant 5R01CA054706.

