Toric section: Difference between revisions

en>Addbot
m Bot: Migrating 3 interwiki links, now provided by Wikidata on d:q985864
en>David Eppstein
'''Proportional hazards models''' are a class of [[survival analysis|survival models]] in [[statistics]].  Survival models relate the time that passes before some event occurs to one or more [[covariate]]s that may be [[association (statistics)|associated]] with that quantity of time.  In a proportional hazards model, the unique effect of a unit increase in a covariate is multiplicative with respect to the [[hazard rate]]. For example, taking a drug may halve one's hazard rate for a stroke, and changing the material from which a manufactured component is constructed may double its hazard rate for failure.  Other types of survival models, such as [[accelerated failure time model]]s, do not in general exhibit proportional hazards.  An accelerated failure time model describes a situation where the biological or mechanical life history of an event is accelerated.
 
==Introduction==
 
Survival models can be viewed as consisting of two parts: the underlying [[hazard function]], often denoted <math>\lambda_0(t)</math>, describing how the risk of event per time unit changes over time at ''baseline'' levels of covariates; and the effect parameters, describing how the hazard varies in response to explanatory covariates. A typical medical example would include covariates such as treatment assignment, as well as patient characteristics such as age at start of study, gender, and the presence of other diseases at start of study, in order to reduce variability and/or control for confounding.
 
The ''proportional hazards condition''<ref>{{cite journal
| doi = 10.2307/1402659
| last = Breslow |first = N. E. |authorlink= Norman Breslow
| title = Analysis of Survival Data under the Proportional Hazards Model
| journal = International Statistical Review / Revue Internationale de Statistique
| year = 1975 | volume = 43 | issue=1 |pages = 45–57
| jstor = 1402659}}</ref> states that covariates are multiplicatively related to the hazard. In the simplest case of stationary coefficients, for example, treatment with a drug may halve a subject's hazard at any given time <math>t</math>, while the baseline hazard may vary. Note, however, that this does not in general double the lifetime of the subject; the precise effect of the covariates on the lifetime depends on the form of <math>\lambda_0(t)</math>. The [[covariate]] is not restricted to binary predictors; in the case of a continuous covariate <math>x</math>, it is typically assumed that the hazard responds exponentially, so that each unit increase in <math>x</math> results in proportional scaling of the hazard. The Cox partial likelihood shown below is obtained by using Breslow's estimate of the baseline hazard function, plugging it into the full likelihood and then observing that the result is a product of two factors. The first factor is the partial likelihood shown below, in which the baseline hazard has "canceled out". The second factor is free of the regression coefficients and depends on the data only through the [[censoring (statistics)|censoring pattern]]. The effect of covariates estimated by any proportional hazards model can thus be reported as [[hazard ratio]]s.
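
The remark that halving the hazard need not double the lifetime can be checked numerically. The following is only a sketch under an assumed Weibull baseline, with cumulative baseline hazard <math>(t/\text{scale})^{\text{shape}}</math>; the function name <code>weibull_median</code> and the parameter values are illustrative and do not come from the sources cited here.

```python
import math

def weibull_median(scale, shape, hazard_ratio=1.0):
    """Median survival time when the cumulative baseline hazard is
    (t / scale) ** shape and the hazard is multiplied by hazard_ratio."""
    # S(t) = exp(-hazard_ratio * (t / scale) ** shape); solve S(t) = 0.5 for t.
    return scale * (math.log(2.0) / hazard_ratio) ** (1.0 / shape)

# Assumed illustrative values: scale 10, shape 2, treatment halves the hazard.
baseline = weibull_median(scale=10.0, shape=2.0)
treated = weibull_median(scale=10.0, shape=2.0, hazard_ratio=0.5)
ratio = treated / baseline  # equals 2 ** (1 / shape)
print(f"median ratio: {ratio:.3f}")
```

With shape 2 the median lifetime grows by a factor of about 1.41 rather than 2; only for shape 1 (a constant baseline hazard) does halving the hazard double the median.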
 
[[David Cox (statistician)|Sir David Cox]] observed that if the proportional hazards assumption holds (or is assumed to hold), then it is possible to estimate the effect parameter(s) without any consideration of the baseline hazard function. This approach to survival data is called application of the '''''Cox proportional hazards model''''',<ref>{{cite journal | last=Cox | first=David R | authorlink=David Cox (statistician) | year=1972 | journal=Journal of the Royal Statistical Society, Series B | volume=34 | issue=2 | title=Regression Models and Life-Tables | pages=187–220 | jstor=2985181}} {{MR|0341758}}</ref> sometimes abbreviated to '''''Cox model''''' or to ''proportional hazards model''. However, Cox also noted that biological interpretation of the proportional hazards assumption can be quite tricky.<ref>{{cite journal
| last = Reid |first = N.
| title = A Conversation with Sir David Cox
| journal = Statistical Science
| year = 1994 | volume = 9 | issue=3 | pages = 439–455
}}</ref> <ref>{{cite conference
| last = Cox |first = D. R. |authorlink= David Cox (statistician)
| title = Some remarks on the analysis of survival data
| conference = the First Seattle Symposium of Biostatistics: Survival Analysis
| year = 1997 }}</ref>
 
==The partial likelihood==
 
Let ''Y''<sub>''i''</sub> denote the observed time (either censoring time or event time) for subject ''i'', and let ''C''<sub>''i''</sub> be the indicator that the time corresponds to an event (i.e. if ''C''<sub>''i''</sub>&nbsp;=&nbsp;1 the event occurred and if ''C''<sub>''i''</sub>&nbsp;=&nbsp;0 the time is a censoring time).  The hazard function for the Cox proportional hazards model has the form
 
::<math>
\lambda(t|X) = \lambda_0(t)\exp(\beta_1X_1 + \cdots + \beta_pX_p) = \lambda_0(t)\exp(\beta^\prime X).
</math>
 
This expression gives the hazard at time ''t'' for an individual with covariate vector (explanatory variables) ''X''. Based on this hazard function, a partial likelihood can be constructed from the data as
 
::<math>
L(\beta) = \prod_{i:C_i=1}\frac{\theta_i}{\sum_{j:Y_j\ge Y_i}\theta_j},
</math>
 
where ''θ''<sub>''j''</sub>&nbsp;=&nbsp;exp(''β''<sup>''′''</sup>''X''<sub>''j''</sub>) and ''X''<sub>1</sub>, ..., ''X''<sub>''n''</sub> are the covariate vectors for the ''n'' independently sampled individuals in the dataset (treated here as column vectors).
 
The corresponding log partial likelihood is
 
::<math>
\ell(\beta) = \sum_{i:C_i=1} \left(\beta^\prime X_i - \log \sum_{j:Y_j\ge Y_i}\theta_j\right).
</math>
 
This function can be maximized over ''β'' to produce maximum partial likelihood estimates of the model parameters.
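
As a minimal sketch, the log partial likelihood can be evaluated directly from its definition. The function name <code>cox_log_partial_likelihood</code> and the toy data are illustrative assumptions; the quadratic-time loop over risk sets is chosen for readability rather than speed, and ties receive no special handling.

```python
import math

def cox_log_partial_likelihood(beta, times, events, covariates):
    """Log partial likelihood of the Cox model (no special handling of ties).

    beta: list of p coefficients
    times: observed times Y_i
    events: indicators C_i (1 = event, 0 = censored)
    covariates: list of n covariate vectors X_i, each of length p
    """
    # Linear predictors beta' X_i and risk scores theta_i = exp(beta' X_i).
    eta = [sum(b * x for b, x in zip(beta, row)) for row in covariates]
    theta = [math.exp(e) for e in eta]
    total = 0.0
    for i, (y_i, c_i) in enumerate(zip(times, events)):
        if c_i != 1:
            continue  # censored subjects contribute only through risk sets
        # Sum of theta_j over the risk set {j : Y_j >= Y_i}.
        risk_sum = sum(th for y_j, th in zip(times, theta) if y_j >= y_i)
        total += eta[i] - math.log(risk_sum)
    return total

# Hypothetical toy data: four subjects, one binary covariate.
times = [2, 3, 5, 7]
events = [1, 0, 1, 1]
covariates = [[1], [0], [1], [0]]
ll0 = cox_log_partial_likelihood([0.0], times, events, covariates)
```

At ''β''&nbsp;=&nbsp;0 every ''θ''<sub>''j''</sub> equals 1, so each event contributes minus the logarithm of the size of its risk set; for the toy data the risk sets have sizes 4, 2 and 1, giving −log&nbsp;8.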
 
The partial [[Score (statistics)|score function]] is
::<math>
\ell^\prime(\beta) = \sum_{i:C_i=1} \left(X_i - \frac{\sum_{j:Y_j\ge Y_i}\theta_jX_j}{\sum_{j:Y_j\ge Y_i}\theta_j}\right),
</math>
 
and the [[Hessian matrix]] of the partial log likelihood is
 
::<math>
\ell^{\prime\prime}(\beta) = -\sum_{i:C_i=1} \left(\frac{\sum_{j:Y_j\ge Y_i}\theta_jX_jX_j^\prime}{\sum_{j:Y_j\ge Y_i}\theta_j} - \frac{\sum_{j:Y_j\ge Y_i}\theta_jX_j\times \sum_{j:Y_j\ge Y_i}\theta_jX_j^\prime}{[\sum_{j:Y_j\ge Y_i}\theta_j]^2}\right).
</math>
 
Using this score function and Hessian matrix, the partial likelihood can be maximized using the [[Newton's method|Newton-Raphson]] algorithm. The inverse of the negative Hessian matrix (the observed information), evaluated at the estimate of ''β'', can be used as an approximate variance-covariance matrix for the estimate, and used to produce approximate [[standard error]]s for the regression coefficients.
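
The Newton-Raphson iteration can be sketched for the simplest case of a single covariate, where the score and Hessian are scalars. This is an illustration only: it assumes no tied event times and a finite maximizer, uses no step-halving or other safeguards, and the function names and toy data are hypothetical. The standard error is taken from the inverse of the negative Hessian at the estimate.

```python
import math

def score_and_hessian(beta, times, events, x):
    """Partial score and Hessian for a single covariate (scalar beta)."""
    theta = [math.exp(beta * xi) for xi in x]
    score, hessian = 0.0, 0.0
    for y_i, c_i, x_i in zip(times, events, x):
        if c_i != 1:
            continue  # only event times contribute terms
        at_risk = [(th, xj) for y_j, th, xj in zip(times, theta, x) if y_j >= y_i]
        s0 = sum(th for th, _ in at_risk)             # sum of theta_j
        s1 = sum(th * xj for th, xj in at_risk)       # sum of theta_j X_j
        s2 = sum(th * xj * xj for th, xj in at_risk)  # sum of theta_j X_j^2
        score += x_i - s1 / s0
        hessian -= s2 / s0 - (s1 / s0) ** 2
    return score, hessian

def fit_newton_raphson(times, events, x, tol=1e-10, max_iter=50):
    """Maximize the log partial likelihood starting from beta = 0."""
    beta = 0.0
    for _ in range(max_iter):
        score, hessian = score_and_hessian(beta, times, events, x)
        if abs(score) < tol:
            break
        beta -= score / hessian  # Newton-Raphson update
    _, hessian = score_and_hessian(beta, times, events, x)
    std_error = math.sqrt(-1.0 / hessian)  # inverse of the negative Hessian
    return beta, std_error

# Hypothetical toy data: six uncensored subjects, one binary covariate.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 1, 1]
x = [1, 0, 1, 0, 0, 1]
beta_hat, std_error = fit_newton_raphson(times, events, x)
print(f"beta = {beta_hat:.3f}, hazard ratio = {math.exp(beta_hat):.3f}, SE = {std_error:.3f}")
```

Because the log partial likelihood is concave in ''β'', the iteration typically converges in a few steps from a starting value of zero, but production software adds safeguards that are omitted here.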
 
===Tied times===
 
Several approaches have been proposed to handle situations in which there are ties in the time data. ''Breslow's method'' describes the approach in which the procedure described above is used unmodified, even when ties are present. An alternative approach that is considered to give better results is ''Efron's method''.<ref>{{cite journal | last=Efron | first=Bradley | year=1977 | journal=Journal of the American Statistical Association | pages=557–565 | volume=72 | title=The Efficiency of Cox's Likelihood Function for Censored Data | issue=359 | jstor=2286217}}</ref> Let ''t''<sub>''j''</sub> denote the unique times, let ''H''<sub>''j''</sub> denote the set of indices ''i'' such that ''Y''<sub>''i''</sub>&nbsp;=&nbsp;''t''<sub>''j''</sub> and ''C''<sub>''i''</sub>&nbsp;=&nbsp;1, and let ''m''<sub>''j''</sub>&nbsp;=&nbsp;|''H''<sub>''j''</sub>|; in the formulas that follow, ''m'' is shorthand for ''m''<sub>''j''</sub>.  Efron's approach maximizes the following partial likelihood.
 
::<math>
L(\beta) = \prod_j \frac{\prod_{i\in H_j}\theta_i}{\prod_{\ell=0}^{m-1}[\sum_{i:Y_i\ge t_j}\theta_i - \frac{\ell}{m}\sum_{i\in H_j}\theta_i]
}.
</math>
 
The corresponding log partial likelihood is
 
::<math>
\ell(\beta) = \sum_j \left(\sum_{i\in H_j} \beta^\prime X_i -\sum_{\ell=0}^{m-1}\log\left(\sum_{i:Y_i\ge t_j}\theta_i - \frac{\ell}{m}\sum_{i\in H_j}\theta_i\right)\right),
</math>
 
the score function is
 
::<math>
\ell^\prime(\beta) = \sum_j \left(\sum_{i\in H_j} X_i -\sum_{\ell=0}^{m-1}\frac{\sum_{i:Y_i\ge t_j}\theta_iX_i - \frac{\ell}{m}\sum_{i\in H_j}\theta_iX_i}{\sum_{i:Y_i\ge t_j}\theta_i - \frac{\ell}{m}\sum_{i\in H_j}\theta_i}\right),
</math>
 
and the Hessian matrix is
 
::<math>
\ell^{\prime\prime}(\beta) = -\sum_j \sum_{\ell=0}^{m-1} \left(\frac{\sum_{i:Y_i\ge t_j}\theta_iX_iX_i^\prime - \frac{\ell}{m}\sum_{i\in H_j}\theta_iX_iX_i^\prime}{\phi_{j,\ell,m}} - \frac{Z_{j,\ell,m}\times Z_{j,\ell,m}^\prime}{\phi_{j,\ell,m}^2}\right),
</math>
 
where
 
::<math>
\phi_{j,\ell,m} = \sum_{i:Y_i\ge t_j}\theta_i - \frac{\ell}{m}\sum_{i\in H_j}\theta_i
</math>
::<math>
Z_{j,\ell,m} = \sum_{i:Y_i\ge t_j}\theta_iX_i - \frac{\ell}{m}\sum_{i\in H_j}\theta_iX_i.
</math>
 
Note that when ''H''<sub>''j''</sub> is empty (all observations with time ''t''<sub>''j''</sub> are censored), the summands in these expressions are treated as zero.
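
Efron's log partial likelihood can be sketched in the same direct style. The function name <code>efron_log_partial_likelihood</code> and the toy data are illustrative assumptions, and times at which no event occurs are simply skipped, in line with the convention that their summands are zero.

```python
import math

def efron_log_partial_likelihood(beta, times, events, covariates):
    """Log partial likelihood using Efron's correction for tied event times."""
    eta = [sum(b * x for b, x in zip(beta, row)) for row in covariates]
    theta = [math.exp(e) for e in eta]
    total = 0.0
    # Only times with at least one event contribute.
    event_times = sorted({y for y, c in zip(times, events) if c == 1})
    for t_j in event_times:
        # H_j: subjects with an event exactly at t_j.
        tied = [i for i, (y, c) in enumerate(zip(times, events))
                if y == t_j and c == 1]
        m = len(tied)
        risk_sum = sum(th for y, th in zip(times, theta) if y >= t_j)
        tied_sum = sum(theta[i] for i in tied)
        total += sum(eta[i] for i in tied)
        for l in range(m):
            # Remove a fraction l/m of the tied subjects from the risk set.
            total -= math.log(risk_sum - (l / m) * tied_sum)
    return total

# Hypothetical toy data: two events tied at time 1, one event at time 2.
times = [1, 1, 2]
events = [1, 1, 1]
covariates = [[0.0], [1.0], [0.0]]
ll_efron = efron_log_partial_likelihood([0.0], times, events, covariates)
```

At ''β''&nbsp;=&nbsp;0 the tied pair contributes −(log&nbsp;3&nbsp;+&nbsp;log&nbsp;2), so the value is −log&nbsp;6, whereas applying the untied formula unmodified would give −2&nbsp;log&nbsp;3&nbsp;=&nbsp;−log&nbsp;9.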
 
==Time-varying predictors and coefficients==
 
Extensions to time-dependent variables, time-dependent strata, and multiple events per subject can be incorporated by the counting process formulation of Andersen and Gill.<ref>
{{cite journal
| doi = 10.1214/aos/1176345976
| last1 = Andersen| first1 = P. |last2 = Gill |first2= R.
| year = 1982
| title = Cox's regression model for counting processes, a large sample study.
| journal = Annals of Statistics |volume = 10 | issue=4 | pages = 1100–1120 | jstor=2240714}}</ref>
 
In addition to allowing [[time-varying covariate]]s (i.e., predictors), the Cox model may be generalized to time-varying coefficients as well. That is, the proportional effect of a treatment may vary with time; e.g. a drug may be very effective if administered within one month of [[morbidity]], and become less effective as time goes on. The hypothesis of no change with time (stationarity) of the coefficient may then be tested. Details and software are available in Martinussen and Scheike (2006).<ref>Martinussen & Scheike (2006) ''Dynamic Regression Models for Survival Data'' (Springer).</ref>
 
In this context, it could also be mentioned that it is theoretically possible to specify the effect of covariates by using additive hazards,<ref>{{cite conference
| last = Cox |first = D. R. |authorlink= David Cox (statistician)
| title = Some remarks on the analysis of survival data
| conference = the First Seattle Symposium of Biostatistics: Survival Analysis
| year = 1997 }}</ref> i.e. specifying
::<math>
\lambda(t|X) = \lambda_0(t) + \beta_1X_1 + \cdots + \beta_pX_p = \lambda_0(t) + \beta^\prime X.
</math>
However, care must be taken to restrict <math>\lambda(t|X)</math> to non-negative values, if such [[additive hazards model]]s are used. Perhaps as a result of this complication, such models are seldom seen.
 
==Specifying the baseline hazard function==
 
The Cox model may be specialized if a reason exists to assume that the baseline hazard follows a particular form. In this case, the baseline hazard <math>\lambda_0(t)</math> is replaced by a given function. For example, assuming the hazard function to be the ''Weibull'' hazard function gives the ''Weibull proportional hazards model''.
 
Incidentally, the Weibull baseline hazard (which includes the exponential as a special case) is the only case in which the model satisfies both the proportional hazards and the [[accelerated failure time model|accelerated failure time]] assumptions.
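
A brief sketch of why the Weibull case fits both descriptions, under an assumed parametrization that is not fixed by the text above: take the baseline survival function <math>S_0(t) = \exp(-\rho t^k)</math> with rate <math>\rho > 0</math> and shape <math>k > 0</math>, so that <math>\lambda_0(t) = \rho k t^{k-1}</math>, and let the accelerated failure time model rescale time by <math>\exp(-\gamma^\prime X)</math>. Then

::<math>
S(t|X) = S_0\left(t\,e^{-\gamma^\prime X}\right) = \exp\left(-\rho t^k e^{-k\gamma^\prime X}\right),
</math>

and differentiating <math>-\log S(t|X)</math> with respect to <math>t</math> gives the hazard

::<math>
\lambda(t|X) = \rho k t^{k-1}\exp(-k\gamma^\prime X) = \lambda_0(t)\exp(\beta^\prime X), \qquad \beta = -k\gamma,
</math>

which has the proportional hazards form with coefficients rescaled by the shape parameter.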
 
The generic term ''parametric proportional hazards models'' can be used to describe proportional hazards models in which the hazard function is specified.  The Cox proportional hazards model is sometimes called a ''[[semiparametric model]]''  by contrast.
 
Some authors (e.g. Bender, Augustin and Blettner<ref>Bender, R., Augustin, T. and Blettner, M. (2006). ''Generating survival times to simulate Cox proportional hazards models'', Statistics in Medicine 2005; 24:1713–1723. {{doi|10.1002/sim.2369}}
</ref>) use the term ''Cox proportional hazards model'' even when specifying the underlying hazard function, to acknowledge the debt of the entire field to David Cox.
 
The term ''Cox regression model'' (omitting ''proportional hazards'') is sometimes used to describe the extension of the Cox model to include time-dependent factors.  However, this usage is potentially ambiguous since the Cox proportional hazards model can itself be described as a regression model.
 
==Relationship to Poisson models==
 
There is a relationship between proportional hazards models and [[Poisson regression]] models which is sometimes used to fit approximate proportional hazards models in software for Poisson regression.  The usual reason for doing this is that calculation is much quicker.  This was more important in the days of slower computers but can still be useful for particularly large data sets or complex problems.  Authors giving the mathematical details include Laird and Olivier (1981),<ref>
{{cite journal|doi=10.2307/2287816|author=Nan Laird and Donald Olivier
|title=Covariance Analysis of Censored Survival Data Using Log-Linear Analysis Techniques
|journal=Journal of the American Statistical Association
|volume=76|issue=374|year=1981|pages=231–240 | jstor=2287816}}</ref> who remark
<blockquote>
    "Note that we do not assume [the Poisson model] is true, but simply use it as a device for deriving the likelihood."
</blockquote>
The book on generalized linear models by McCullagh and Nelder<ref>
{{cite book|author=P. McCullagh and J. A. Nelder
|edition=Second|year=2000
|title=Generalized Linear Models
|location=Boca Raton, Florida|publisher=Chapman & Hall/CRC
|chapter=Chapter 13: Models for Survival Data
|isbn=0-412-31760-5}} (Second edition 1989; first CRC reprint 1999.)
</ref> has a chapter on converting proportional hazards models to [[generalized linear model]]s.
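
The connection can be illustrated in the simplest setting, which is an assumption made here for brevity and not the general construction given in the works cited above: a constant baseline hazard exp(<code>log_rate</code>) and a single covariate. Treating each event indicator as a Poisson count whose mean is the follow-up time multiplied by the hazard yields a log-likelihood that differs from the survival log-likelihood only by a term free of the parameters, so both are maximized by the same estimates. All function names and data below are hypothetical.

```python
import math

def exponential_ph_loglik(log_rate, beta, times, events, x):
    """Log-likelihood of a proportional hazards model with constant baseline hazard."""
    total = 0.0
    for t, d, xi in zip(times, events, x):
        hazard = math.exp(log_rate + beta * xi)
        # Events contribute log(hazard); every subject contributes -hazard * t.
        total += d * math.log(hazard) - hazard * t
    return total

def poisson_loglik(log_rate, beta, times, events, x):
    """Poisson log-likelihood for the event indicators with offset log(time)."""
    total = 0.0
    for t, d, xi in zip(times, events, x):
        mean = t * math.exp(log_rate + beta * xi)
        total += d * math.log(mean) - mean - math.lgamma(d + 1)
    return total

# Hypothetical toy data.
times = [2.0, 3.5, 1.2, 4.0]
events = [1, 0, 1, 1]
x = [0, 1, 1, 0]

# The two log-likelihoods differ by sum(d_i * log t_i), which is free of the parameters.
offset_term = sum(d * math.log(t) for t, d in zip(times, events))
diff_a = poisson_loglik(-1.0, 0.3, times, events, x) - exponential_ph_loglik(-1.0, 0.3, times, events, x)
diff_b = poisson_loglik(0.2, -0.7, times, events, x) - exponential_ph_loglik(0.2, -0.7, times, events, x)
```

This is consistent with the remark quoted above: the Poisson model is not assumed to be true, but serves as a device that produces the same likelihood up to a constant.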
 
==See also==
{{Portal|Statistics}}
* [[Accelerated failure time model]]
* [[One in ten rule]]
* [[Weibull distribution]]
 
==Notes==
{{Reflist}}
 
==References==
*{{cite journal |first=V. |last=Bagdonavicius |first2=R. |last2=Levuliene |first3=M. |last3=Nikulin |year=2010 |title=Goodness-of-fit criteria for the Cox model from left truncated and right censored data |journal=Journal of Mathematical Sciences |volume=167 |issue=4 |pages=436–443 |doi=10.1007/s10958-010-9929-6 }}
*{{cite book |first=D. R. |last=Cox |first2=D. |last2=Oakes |year=1984 |title=Analysis of Survival Data |publisher=Chapman & Hall |location=New York |isbn=041224490X }}
*{{cite book |first=D. |last=Collett |year=2003 |title=Modelling Survival Data in Medical Research |edition=2nd |publisher=CRC |location=Boca Raton |isbn=1584883251 }}
*{{cite book |first=T. M. |last=Therneau |first2=P. M. |last2=Grambsch |year=2000 |title=Modeling Survival Data: Extending the Cox Model |publisher=Springer |location=New York |isbn=0387987843 }}
 
{{Statistics|analysis}}
 
{{DEFAULTSORT:Proportional Hazards Models}}
[[Category:Survival analysis]]
[[Category:Regression analysis]]
[[Category:Statistical models]]
[[Category:Poisson processes]]

Latest revision as of 07:03, 6 May 2014
