Weighted estimation of parameters in Cox regression
Disciplines
Biology (100%)
Keywords
Survival analysis,
Therapy studies,
Prognostic factor studies,
Cox regression model,
Regression with a large number of predictors,
Time-dependent effects of covariates
Cox's (1972) proportional hazards regression model (CR) continues to be one of the most popular tools in the analysis of censored survival data. Often the effect of at least one of the prognostic factors included in such a model changes over time, which violates the proportional hazards assumption. As a consequence, the average relative risk (over time) for such a prognostic factor is under- or overestimated and the efficiency of the parameter estimate decreases. Several methods have been developed to accommodate time-dependent effects of prognostic factors, but in this project we focus on the use of weighted parameter estimates for Cox regression (WCR). The method appears to have fallen into oblivion despite promising reports by Schemper (1992), Sasieni (1993), and Valsecchi, Silvestri and Sasieni (1996). WCR is attractive because (1) it permits unbiased estimation of average hazard ratios (over time), regardless of the type of violation of the proportional hazards assumption, (2) the efficiency of the estimates is maintained over a wide range of time-dependent effects, (3) the estimates are more robust in the presence of outlying survival times or contaminated distributions of survival, and (4) it extends the Breslow (1970) and Prentice (1978) tests to a multi-covariate situation, just as the standard Cox model extends Mantel's (1966) logrank test. Because of these attractive properties, and because of positive experiences with applying WCR in previous years, a systematic investigation of potential indications for the use of WCR and the development of extensions appear promising. The project will deal with WCR in three domains of biostatistics: (A) prognostic factor studies, (B) treatment effects in controlled clinical trials, and (C) studies of gene expressions and gene polymorphisms.
Within these domains the following issues will be addressed:
A: Advantages of WCR over CR under several scenarios likely to occur in practice, explicit modeling of time-dependent effects within the WCR framework, and construction and performance of model diagnostics within WCR.
B: Sometimes, unanticipated at the planning stage of a clinical trial, the treatment effect fades over time, resulting in low power of the standard logrank test. The Prentice and Breslow tests, however, are powerful under converging hazards. For such situations Tarone's (1981) test, based on the maximum of the Breslow and logrank statistics, is useful. An analogous procedure for the joint analysis of a treatment effect by CR and WCR will be developed and validated.
C: Under the "p (# predictors) >> n (# events)" paradigm, gene expressions from microarrays or (single nucleotide) gene polymorphisms are screened for their effects on survival or are used to develop a predictive model. WCR-based approaches could be superior to CR-based ones because, in addition to genuine violations, the proportional hazards assumption will often be violated by chance, owing to the variability inherent in small samples. We will investigate this issue within penalized Cox regression, but the conclusions will also extend to other approaches accommodating a large number of predictors within Cox-type modeling.
Empirical results for this project will be derived from extensive Monte Carlo studies as well as from detailed, comparative analyses of medical data sets. Possible gains due to the use of WCR-type methods will be quantified in terms of bias and variability of estimates and by means of predictive accuracy, explained variation and adaptations of ROC analysis to the survival setting. The software to be developed along the lines of the project will be made available for free download from our website, as R, S-Plus and SAS macro versions.
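The two ingredients of a Tarone-type maximum test mentioned under B can be sketched as follows. This is a minimal illustration, not the project's procedure: it assumes two groups, uses the number at risk as the Breslow-type weight, and relies on the usual hypergeometric variance terms. The data, the function names and the returned dictionary keys are hypothetical. The sketch stops at the two standardized statistics and their correlation; a p-value for the maximum would additionally need their joint (bivariate normal) distribution, which is omitted here.

```python
import math

def event_time_terms(times, events, group):
    """Per distinct event time: (number at risk, observed - expected in group 1, variance)."""
    terms = []
    for t in sorted({s for s, e in zip(times, events) if e}):
        n = sum(1 for s in times if s >= t)                                  # at risk, both groups
        n1 = sum(1 for s, g in zip(times, group) if s >= t and g == 1)       # at risk, group 1
        d = sum(1 for s, e in zip(times, events) if s == t and e)            # events at t
        d1 = sum(1 for s, e, g in zip(times, events, group)
                 if s == t and e and g == 1)                                 # events at t, group 1
        # Hypergeometric variance; zero when only one subject remains at risk.
        var = d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1) if n > 1 else 0.0
        terms.append((n, d1 - d * n1 / n, var))
    return terms

def max_logrank_breslow(times, events, group):
    """Standardized logrank and Breslow-type statistics, their correlation, and the maximum."""
    terms = event_time_terms(times, events, group)
    u_lr = sum(oe for _, oe, _ in terms)            # logrank: weight 1
    u_br = sum(n * oe for n, oe, _ in terms)        # Breslow-type: weight = number at risk (assumed)
    v_lr = sum(var for _, _, var in terms)
    v_br = sum(n * n * var for n, _, var in terms)
    cov = sum(n * var for n, _, var in terms)
    z_lr = u_lr / math.sqrt(v_lr)
    z_br = u_br / math.sqrt(v_br)
    return {
        "u_logrank": u_lr, "u_breslow": u_br,
        "z_logrank": z_lr, "z_breslow": z_br,
        "corr": cov / math.sqrt(v_lr * v_br),
        "max_abs_z": max(abs(z_lr), abs(z_br)),
    }

# Hypothetical toy data: follow-up times, event indicators (0 = censored), group labels.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
events = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
group = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]

result = max_logrank_breslow(times, events, group)
print(result)
```

Because both statistics are built from the same observed-minus-expected terms and differ only in their weights, they are strongly correlated, which is why the maximum cannot be referred to a standard normal distribution directly.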
An important task of clinical biometrics is to investigate the effects of possible prognostic factors (treatments and patient characteristics such as age, sex or disease subtype) on the survival time after initiation of treatment for a chronic disease (such as cancer). For this purpose a mathematical model is fit to the sample provided by a clinical study. It is of medical interest to quantify the effect of a prognostic factor on the remaining survival time, e.g., by the difference in expected (median) survival time resulting from two alternative treatments given to distinct groups of patients. In recent decades the proportional hazards model by Cox has become the standard for investigating the effects of prognostic factors on survival. This model, for example, quantifies the difference between the effects of two treatments by means of a hazard ratio: the ratio of the risk of dying under the first treatment to the risk of dying under the second treatment, both risks expressed for an arbitrary day of the patients' follow-up. In its most basic specification the proportional hazards model assumes that the hazard ratio for two levels of a prognostic factor stays constant over follow-up time. If, for a given sample, the hazard ratio changes with time, this change can be modeled by including time-dependent effect terms in the model. An alternative option within the proportional hazards model is to estimate an average effect over time, an average hazard ratio. This method has received relatively little attention in the past, and the goal of our research project has been to study this methodology and to develop new statistical tools for obtaining average hazard ratios by means of weighted parameter estimation within Cox's proportional hazards regression model.
The method is parsimonious with respect to model parameters and may therefore be appealing if available sample sizes are small and/or the number of prognostic factors is large, or if there is little medical interest in the time-dependence of an effect. Within the project, interesting mathematical relationships to other effect size measures have been found. Issues of appropriately dealing with the censoring of some patients' survival times were also resolved. More complex issues of statistical inference, i.e., quantifying the degree of randomness of estimated effects, were addressed satisfactorily. Extensive statistical simulation experiments confirm satisfactory performance of the suggested statistical procedures. User-friendly implementations of the procedures in SAS and R software are available for free download from our website.
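As a rough illustration of what weighted parameter estimation in a Cox model involves, the sketch below solves a weighted partial-likelihood score equation for a single covariate by Newton-Raphson. It is not the project's SAS or R implementation. The weights shown (unit weights, which reproduce ordinary Cox regression, and the number at risk as a Breslow-type choice) are assumptions for illustration only, as are the toy data and all function names. The information term is used solely to compute the Newton step; valid standard errors for weighted estimates would need a robust variance, which is not shown.

```python
import math

def weighted_cox_score(beta, times, events, x, weights):
    """Weighted partial-likelihood score and its negative derivative at beta."""
    score, info = 0.0, 0.0
    for j, (t_j, d_j) in enumerate(zip(times, events)):
        if not d_j:
            continue                                   # censored subjects add no score term
        risk = [i for i, t in enumerate(times) if t >= t_j]
        s0 = sum(math.exp(beta * x[i]) for i in risk)
        s1 = sum(x[i] * math.exp(beta * x[i]) for i in risk)
        s2 = sum(x[i] ** 2 * math.exp(beta * x[i]) for i in risk)
        mean = s1 / s0                                 # risk-set weighted mean of x
        score += weights[j] * (x[j] - mean)
        info += weights[j] * (s2 / s0 - mean ** 2)
    return score, info

def fit_weighted_cox(times, events, x, weights, tol=1e-10, max_iter=50):
    """Solve the weighted score equation for one covariate by Newton-Raphson."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = weighted_cox_score(beta, times, events, x, weights)
        step = score / info
        beta += step
        if abs(step) < tol:
            return beta
    raise RuntimeError("Newton-Raphson did not converge")

# Hypothetical toy data: follow-up times, event indicators (0 = censored), binary covariate.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
events = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
x = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]

unit_weights = [1.0] * len(times)                                        # ordinary Cox regression
at_risk_weights = [float(sum(1 for t in times if t >= t_j)) for t_j in times]  # assumed choice

beta_cr = fit_weighted_cox(times, events, x, unit_weights)
beta_wcr = fit_weighted_cox(times, events, x, at_risk_weights)
print("log hazard ratio, unit weights:   ", beta_cr)
print("log hazard ratio, at-risk weights:", beta_wcr)
```

Weights that decrease over follow-up, such as the number at risk, give early event times more influence on the estimate than late ones, which is the mechanism by which a weighted estimate can summarize a time-varying effect in a single number.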
Research Output
- 337 Citations
- 4 Publications
2008
Title: Avoiding infinite estimates of time-dependent effects in small-sample survival studies; DOI: 10.1002/sim.3418; Type: Journal Article; Author: Heinze G; Journal: Statistics in Medicine; Pages: 6455-6469
2011
Title: Non-parametric estimation of relative risk in survival and associated tests; DOI: 10.1177/0962280211431022; Type: Journal Article; Author: Wakounig S; Journal: Statistical Methods in Medical Research; Pages: 856-870
2009
Title: The estimation of average hazard ratios by weighted Cox regression; DOI: 10.1002/sim.3623; Type: Journal Article; Author: Schemper M; Journal: Statistics in Medicine; Pages: 2473-2489
2010
Title: Gene selection in microarray survival studies under possibly non-proportional hazards; DOI: 10.1093/bioinformatics/btq035; Type: Journal Article; Author: Dunkler D; Journal: Bioinformatics; Pages: 784-790