An application of changepoint methods in studying the effect of age on survival in breast cancer

https://doi.org/10.1016/S0167-9473(98)00096-6

Abstract

The role of age in prognostic studies in breast cancer remains to be clearly established. There is reasonable agreement that younger patients have a higher risk of an unfavorable outcome, but there is little agreement on the precise nature of the relationship between age and prognosis. A first step in studying any such relationship can be based on the division of patients into two groups: a high risk and a low risk group. A simple and popular classification rule consists in determining a cutoff value of the continuous variable age. How to choose the actual cutoff, however, is not a straightforward problem (Lausen and Schumacher, Biometrics 48 (1992) 73–75; Comput. Statist. Data Anal. 21 (1996) 307–326; Altman et al., J. Natl. Cancer Institute 86 (1994) 829–835). We address this problem in a way similar to that of Lausen and Schumacher by showing that the asymptotic distribution of a re-scaled rank statistic is the same as the distribution of the Brownian bridge. Our approach avoids arbitrarily eliminating potential cutpoints near the extremities. The maximisation of the proposed statistic enables estimation of a cutpoint and the calculation of its significance. The statistical problem is presented in the general case and is detailed in the case of survival analysis with censored data. Simulations suggest that our approach has smaller bias and greater power in certain situations than that of Lausen and Schumacher. We present a Monte-Carlo study and an illustration of the approach in a study of the effect of age at diagnosis on subsequent survival in breast cancer.

Introduction

In clinical and epidemiological studies, the quantification of the effect of a continuously measured prognostic factor is not straightforward. A variable measured on a continuous scale may have prognostic significance concerning survival time, but the form of its effect may be nonlinear, contrary to what the model often implies, and is generally not easy to evaluate.

An attractive approach to problems in which we know little, apart from a reasonable assumption of monotonicity, concerning the relation between the covariable and survival, is to create two groups on the basis of a cutpoint. If the unknown relationship is indeed linear then information is lost but the analysis nonetheless maintains validity. This validity holds for any monotonic relationship so, whilst a changepoint approach is suboptimal, it is a useful alternative having wide generality when we do not have good reasons for pursuing some particular functional form.

The importance of this problem is clear in cancer studies, and in particular in studies involving breast cancer. Beyond trying to explain observed variability, research focuses on the distinction of low and high risk groups. Using changepoint methods to establish low and high risk groups only makes sense in conjunction with input from the medical investigators in the clinical context. This is only part of the analysis and, if the main goal is to establish risk groups, other approaches may be helpful, such as using the predicted 5-year survival probabilities.

Our particular application concerned the effect of age at diagnosis on survival in breast cancer. The problem remains controversial. Several authors have reported different effects of age. Some studies indicate a poorer prognosis for patients under 45 (Marubini et al., 1990), whilst others make a distinction between patients under and over 40 (Ribeiro and Swidell, 1981), or 33 (De La Rochefordiere et al., 1993). Some have concluded that prognosis is better for young women (Rutquist and Wallgren, 1983), whilst others conclude there is no effect of age (Wallgren et al., 1977; Sutherland and Mather, 1986). Høst and Lund (1986) explain these differences by different choices in age categorization. Zahl and Tretli (1997) explain them by interactions and time-varying effects of age.

Patients may be grouped according to whether or not the value of the prognostic factor is above some cutpoint. The difference between the two groups is quantified by the outcome of interest. In the case of survival analysis, the difference between the two groups of patients is quantified by the relative risk. A number of possible approaches are available for such classification. First, the selection of the cutpoint can be done a priori, on the basis of external information, biological reasoning or simplicity. Another widespread method is to select the cutpoint that corresponds to the most significant difference in prognosis between groups, that is, the cutpoint that minimizes the p-value relating the prognostic factor to outcome. When the outcome of interest is survival time, as in the case of breast cancer, the minimization is carried out with respect to the p-value of the log-rank statistic. However, with such a procedure, investigators are confronted with the problem of multiple testing.
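
The minimum p-value procedure described above can be sketched as follows. This is an illustrative sketch only, not the authors' code: the function names `logrank_z` and `min_pvalue_cutpoint`, the `min_group_size` parameter and the toy data are all assumptions introduced here. It computes a standardized two-sample log-rank statistic for each candidate cutpoint and keeps the one with the largest absolute value, which is the one with the smallest unadjusted p-value.

```python
import math

def logrank_z(times, events, group):
    """Standardized two-sample log-rank statistic (approx. N(0,1) under H0).

    times:  observed follow-up times
    events: 1 if the event was observed, 0 if censored
    group:  1 if the subject belongs to group 1, 0 otherwise
    """
    obs_minus_exp = 0.0
    variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)                                   # subjects at risk
        n1 = sum(group[i] for i in at_risk)                # of which in group 1
        dying = [i for i in at_risk if times[i] == t and events[i] == 1]
        d = len(dying)                                     # events at time t
        d1 = sum(group[i] for i in dying)                  # of which in group 1
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:                                          # hypergeometric variance
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp / math.sqrt(variance) if variance > 0 else 0.0

def min_pvalue_cutpoint(z, times, events, min_group_size=1):
    """Scan candidate cutpoints; return (cutpoint, |Z|, unadjusted p-value)."""
    best = (None, 0.0, 1.0)
    candidates = sorted(set(z))[:-1]   # the largest value would leave a group empty
    for cut in candidates:
        group = [1 if zi <= cut else 0 for zi in z]
        smaller = min(sum(group), len(group) - sum(group))
        if smaller < min_group_size:
            continue
        stat = abs(logrank_z(times, events, group))
        p = math.erfc(stat / math.sqrt(2))   # two-sided normal p-value, NOT adjusted
        if stat > best[1]:
            best = (cut, stat, p)
    return best

# Hypothetical toy data: age at diagnosis, follow-up time, event indicator.
ages = [30, 35, 38, 42, 47, 50, 55, 60]
follow_up = [2, 3, 4, 5, 20, 22, 25, 30]
died = [1, 1, 1, 1, 1, 1, 0, 1]
print(min_pvalue_cutpoint(ages, follow_up, died))
```

The p-value returned here is deliberately left unadjusted: because it is the smallest of many correlated p-values, it cannot be read as an ordinary significance level.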

Particular attention has to be paid to the interpretation of results from such an approach. Courdi et al. (1988) and Altman et al. (1994), among others, describe the methodological difficulties of the method. There is a tendency to overestimate the significance of the obtained cutoff values (McGuire et al., 1992; Hilsenbeck et al., 1992). More recently, Schulgen et al. (1994) proposed a method for adjusting results, to correct the p-value. Lausen and Schumacher (1992) and Lausen and Schumacher (1996), hereafter referred to as L&S, generalize the theoretical approach of Miller and Siegmund (1982) and propose a procedure to estimate the cutpoint and adjust the p-value. They show that the asymptotic null distribution of the maximally selected rank statistic is the distribution of the supremum of the absolute value of a standardized Gaussian process over an interval.
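
The overestimation of significance can be illustrated with a small simulation. The sketch below is not taken from the paper. For simplicity it assumes an uncensored outcome and uses a standardized Wilcoxon rank-sum statistic in place of the log-rank statistic; it also restricts cutpoints to the central 80% of the sample (the `trim` parameter), a common convention assumed here. The function names are hypothetical. Data are generated with no association between covariate and outcome, and the maximally selected statistic is naively compared with the usual 1.96 threshold.

```python
import math
import random

def max_abs_rank_statistic(z, x, trim=0.1):
    """Maximum over cutpoints of the |standardized Wilcoxon rank-sum statistic|."""
    n = len(z)
    by_covariate = sorted(range(n), key=lambda i: z[i])
    by_outcome = sorted(range(n), key=lambda i: x[i])
    rank_of = {i: r + 1 for r, i in enumerate(by_outcome)}   # ranks of the outcome
    lo = max(1, int(trim * n))
    hi = min(n - 1, int((1 - trim) * n))
    best, rank_sum = 0.0, 0.0
    for m in range(1, n):              # group 1 = the m smallest covariate values
        rank_sum += rank_of[by_covariate[m - 1]]
        if lo <= m <= hi:
            mean = m * (n + 1) / 2
            var = m * (n - m) * (n + 1) / 12
            best = max(best, abs(rank_sum - mean) / math.sqrt(var))
    return best

def naive_rejection_rate(n=40, replications=500, seed=1):
    """Share of null data sets in which the selected statistic exceeds 1.96."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(replications):
        z = [rng.random() for _ in range(n)]
        x = [rng.random() for _ in range(n)]   # independent of z: H0 is true
        if max_abs_rank_statistic(z, x) > 1.96:
            rejections += 1
    return rejections / replications

print(naive_rejection_rate())
```

For a single pre-specified cutpoint the rejection rate would be close to the nominal 5%; selecting the best cutpoint after looking at the data pushes it well above that level, which is why an adjusted p-value is needed.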

In this paper, we present a technique similar to that of L&S for estimating the cutpoint and its significance. In Section 2, we define the underlying model, present the theoretical development of the method, and describe the test. In particular, we see that the asymptotic null distribution of a process based on re-scaled rank statistics is the distribution of the Brownian bridge. This simplifies inference. We focus on the case of survival analysis with censoring. Results are presented in Section 3. A Monte-Carlo study is presented to investigate small-sample behaviour, followed by an application to a study of breast cancer carried out by the Institut Curie (Paris, France). We are interested in the effect of age at diagnosis on survival with breast cancer for premenopausal women. We also determine a two-group classification of these women based on age at diagnosis. Results for data used by L&S on the influence of S-phase on survival with breast cancer are briefly presented, as an aid to comparison.
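
One reason a Brownian bridge limit simplifies inference is that the distribution of its supremum is available in closed form. The sketch below assumes that the test statistic is the supremum of the absolute value of the re-scaled process; the exact form of the authors' statistic is not given in this excerpt, so this is an assumption, and the function name is hypothetical. It uses the classical series P(sup |B(t)| > b) = 2 Σ_{k≥1} (−1)^{k+1} exp(−2k²b²) for a standard Brownian bridge B on [0, 1].

```python
import math

def brownian_bridge_sup_pvalue(b, terms=100):
    """P(sup_{0<=t<=1} |B(t)| > b) for a standard Brownian bridge B."""
    if b < 0.2:
        # The alternating series converges slowly for tiny b, where the
        # tail probability is numerically indistinguishable from 1 anyway.
        return 1.0
    total = 0.0
    for k in range(1, terms + 1):
        total += (-1) ** (k + 1) * math.exp(-2.0 * k * k * b * b)
    return min(1.0, max(0.0, 2.0 * total))

# An observed supremum of about 1.358 corresponds to a p-value close to 0.05.
print(brownian_bridge_sup_pvalue(1.358))
```

Under this assumption, no restriction of the cutpoint range is needed to obtain a p-value, since the supremum is taken over the whole interval.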

Cutpoint model

The prognostic factor is denoted by Z and the outcome by X. The observations are the pairs (Zi,Xi);i=1,…,n. The variable Z is assumed to have a continuous distribution.

The population is split into two groups: patients for whom the variable Z lies below the cutpoint μ, and patients for whom the variable Z lies above the cutpoint. Generally, the choice of the cutpoint is based on the maximisation of a measure of interest. We are interested not only in the estimation of the cutpoint, but also in

A Monte Carlo study

We carried out a Monte Carlo simulation in order to study the behaviour of the estimators of the cutpoint μ̂, their significance p and associated biases. To be able to compare the estimator developed here with that of L&S, we used the same models and values for the simulations. Simulations and all calculations were performed on a SUN workstation. The pseudo-random number generation was done via the linear congruential algorithm and 48-bit integer arithmetic, implemented in C.
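
A linear congruential generator with 48-bit integer arithmetic can be sketched as below. The text does not say which constants were used; the multiplier, increment and seeding rule here are those of the standard C library `drand48`/`srand48` family, which is an assumption on our part, and the class name `Lcg48` is hypothetical. The sketch is in Python rather than C purely for illustration.

```python
class Lcg48:
    """48-bit linear congruential generator: X_{k+1} = (A * X_k + C) mod 2**48."""

    A = 0x5DEECE66D          # multiplier (drand48 convention, assumed)
    C = 0xB                  # increment (drand48 convention, assumed)
    MASK = (1 << 48) - 1     # keeps the low 48 bits, i.e. reduces mod 2**48

    def __init__(self, seed):
        # srand48 convention: the seed fills the high 32 bits of the state
        # and the low 16 bits are set to 0x330E.
        self.state = (((seed & 0xFFFFFFFF) << 16) | 0x330E) & self.MASK

    def random(self):
        """Advance the state and return a float in [0, 1)."""
        self.state = (self.A * self.state + self.C) & self.MASK
        return self.state / (1 << 48)

generator = Lcg48(seed=12345)
sample = [generator.random() for _ in range(5)]
print(sample)
```

Fixing the seed makes a simulation reproducible, which matters when comparing two estimators on the same simulated data sets.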

For both the

Discussion and conclusion

Much effort is devoted to studying the importance of prognostic factors, especially in breast cancer. Considerable attention is given to continuous variables, but the prognostic importance of such variables is not simple to establish. A common practice is to transform the continuous factor into a categorical variable. The patients are grouped according to the values of Z, often into just two groups.

Special care has to be taken over the choice of the cutpoint which defines the groups. The proposed

Acknowledgements

The authors would like to thank the editors and reviewers for having detected a number of errors in the text and for making a number of invaluable suggestions for improving the clarity of presentation.

References (21)

  • B. Lausen et al. Evaluating the effect of optimized cutoff values in the assessment of prognostic factors. Comput. Statist. Data Anal. (1996)
  • G.G. Ribeiro et al. The prognosis of breast carcinoma in women aged less than 40 years. Clin. Radiol. (1981)
  • D.G. Altman et al. Dangers of using “optimal” cutpoints in the evaluation of prognostic factors. J. Nat. Cancer Instit. (1994)
  • Billingsley, P., 1968. Convergence of Probability Measures. Wiley, New...
  • A. Courdi et al. Prognostic value of continuous variables in breast cancer and head and neck cancer. Dependence on the cut-off level. Br. J. Cancer (1988)
  • D.R. Cox. Regression models and life tables (with discussion). J.R. Statist. Soc. B (1972)
  • De La Rochefordiere, A., Asselain, B., Campana, F., Scholl, S.M., Fenton, J., Vilcoq, J.R., Durand, J., Poillart, P.,...
  • S. Hilsenbeck et al. Why do so many prognostic factors fail to pan out? Breast Cancer Res. Treatment (1992)
  • H. Høst et al. Age as prognostic factor in breast cancer. Cancer (1986)
  • B. Lausen et al. Maximally selected rank statistics. Biometrics (1992)
There are more references available in the full text version of this article.
