Model verification

On the basis of an estimated coefficient and its standard error we can infer whether the independent variable for which the coefficient was estimated has a significant effect on the dependent variable. For that purpose we use the Wald test.

Hypotheses:

\begin{array}{cc}
\mathcal{H}_0: & \beta_i=0,\\
\mathcal{H}_1: & \beta_i\ne 0.
\end{array}

or, equivalently:

\begin{array}{cc}
\mathcal{H}_0: & HR_i=1,\\
\mathcal{H}_1: & HR_i\ne 1.
\end{array}

The Wald test statistic is calculated according to the formula:

\begin{displaymath}
\chi^2=\left(\frac{b_i}{SE_{b_i}}\right)^2
\end{displaymath}

This statistic has, asymptotically (for large sample sizes), the Chi-square distribution with $1$ degree of freedom.

The p-value, determined on the basis of the test statistic, is compared with the significance level $\alpha$:

\begin{array}{ccl}
$ if $ p \le \alpha & \Longrightarrow & $ reject $ \mathcal{H}_0 $ and accept $ 	\mathcal{H}_1, \\
$ if $ p > \alpha & \Longrightarrow & $ there is no reason to reject $ \mathcal{H}_0. \\
\end{array}
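
For illustration, the sketch below computes the Wald statistic and its p-value from a single coefficient and its standard error; the values of $b_i$ and $SE_{b_i}$ are assumed for this example only.

<code python>
from scipy import stats

b_i = 1.53     # assumed coefficient estimate
se_b_i = 0.41  # assumed standard error of the estimate

chi2_stat = (b_i / se_b_i) ** 2           # Wald statistic
p_value = stats.chi2.sf(chi2_stat, df=1)  # Chi-square distribution with 1 degree of freedom

print(f"chi2 = {chi2_stat:.3f}, p = {p_value:.4f}")
# Reject H0 (beta_i = 0) when p <= alpha, e.g. alpha = 0.05.
</code>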

A good model should fulfill two basic conditions: it should fit the data well and be as simple as possible. The quality of the Cox proportional hazard model can be evaluated with a few general measures based on: $L_{FM}$ - the maximum value of the likelihood function of the full model (with all variables),

$L_0$ - the maximum value of the likelihood function of the null model (a model containing only the intercept),

$d$ - the observed number of failure events.

  • Information criteria are based on the information entropy carried by the model (model uncertainty), i.e. they evaluate the information lost when a given model is used to describe the studied phenomenon. We should, therefore, choose the model with the minimum value of a given information criterion.

$AIC$, $AICc$, and $BIC$ are a kind of compromise between goodness of fit and complexity. The second term of the sum in the formulas for the information criteria (the so-called penalty function) measures the simplicity of the model. It depends on the number of parameters ($k$) in the model and the number of complete observations ($d$). In both cases the term grows with the number of parameters, and the growth is faster the smaller the number of observations.

The information criterion, however, is not an absolute measure, i.e. if all the compared models describe reality poorly, the information criterion will not warn about it.

  • Akaike information criterion

\begin{displaymath}
AIC=-2\ln L_{FM}+2k,
\end{displaymath}

It is an asymptotic criterion, appropriate for large sample sizes.

  • Corrected Akaike information criterion

\begin{displaymath}
AICc=AIC+\frac{2k(k+1)}{d-k-1},
\end{displaymath}

Because the correction of the Akaike information criterion takes the sample size (the number of failure events) into account, it is the recommended measure (also for smaller samples).

  • Bayesian information criterion or Schwarz criterion

\begin{displaymath}
BIC=-2\ln L_{FM}+k\ln(d),
\end{displaymath}

Just like the corrected Akaike criterion, it takes the sample size (the number of failure events) into account, Volinsky and Raftery (2000)1).

  • Pseudo R$^2$ - the so-called McFadden R$^2$ - is a goodness-of-fit measure of the model (an equivalent of the coefficient of multiple determination $R^2$ defined for multiple linear regression).

The value of that coefficient falls within the range $[0; 1)$, where values close to $1$ mean an excellent goodness of fit of the model and $0$ a complete lack of fit. Coefficient $R^2_{Pseudo}$ is calculated according to the formula:

\begin{displaymath}
R^2_{Pseudo}=1-\frac{\ln L_{FM}}{\ln L_0}.
\end{displaymath}

As coefficient $R^2_{Pseudo}$ never reaches the value of $1$ and is sensitive to the number of variables in the model, its corrected values are calculated (a numerical sketch of all these measures is given after the formulas below):

\begin{displaymath}
R^2_{Nagelkerke}=\frac{1-e^{-(2/d)(\ln L_{FM}-\ln L_0)}}{1-e^{(2/d)\ln L_0}} \quad \textrm{or}\quad R^2_{Cox-Snell}=1-e^{-\frac{(-2\ln L_0)-(-2\ln L_{FM})}{d}}.
\end{displaymath}
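
The sketch below computes the information criteria and the pseudo-$R^2$ coefficients defined above directly from the two log-likelihoods; the values of $\ln L_{FM}$, $\ln L_0$, $k$ and $d$ are assumed for illustration only.

<code python>
import math

lnL_FM = -83.1   # assumed maximum log-likelihood of the full model
lnL_0 = -93.7    # assumed maximum log-likelihood of the null model
k = 2            # assumed number of parameters in the model
d = 30           # assumed number of observed failure events

AIC = -2 * lnL_FM + 2 * k
AICc = AIC + (2 * k * (k + 1)) / (d - k - 1)
BIC = -2 * lnL_FM + k * math.log(d)

R2_pseudo = 1 - lnL_FM / lnL_0                                 # McFadden
R2_cox_snell = 1 - math.exp(-(2 / d) * (lnL_FM - lnL_0))       # Cox-Snell
R2_nagelkerke = R2_cox_snell / (1 - math.exp((2 / d) * lnL_0)) # Nagelkerke

print(f"AIC = {AIC:.2f}, AICc = {AICc:.2f}, BIC = {BIC:.2f}")
print(f"McFadden = {R2_pseudo:.3f}, Cox-Snell = {R2_cox_snell:.3f}, "
      f"Nagelkerke = {R2_nagelkerke:.3f}")
</code>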

The basic tool for the evaluation of the significance of all variables in the model is the Likelihood Ratio test. The test verifies the hypothesis:

\begin{array}{cc}
\mathcal{H}_0: & \textrm{all }\beta_i=0,\\
\mathcal{H}_1: & \textrm{there is }\beta_i\neq0.
\end{array}

The test statistic has the form presented below:

\begin{displaymath}
\chi^2=-2\ln(L_0/L_{FM})=-2\ln(L_0)-(-2\ln(L_{FM})).
\end{displaymath}

This statistic has, asymptotically (for large sample sizes), the Chi-square distribution with $k$ degrees of freedom.

The p-value, determined on the basis of the test statistic, is compared with the significance level $\alpha$:

\begin{array}{ccl}
$ if $ p \le \alpha & \Longrightarrow & $ reject $ \mathcal{H}_0 $ and accept $ 	\mathcal{H}_1, \\
$ if $ p > \alpha & \Longrightarrow & $ there is no reason to reject $ \mathcal{H}_0. \\
\end{array}
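
A minimal sketch of the likelihood ratio test is given below; the log-likelihood values and the number of estimated coefficients $k$ are assumed for illustration only.

<code python>
from scipy import stats

lnL_FM = -83.1   # assumed maximum log-likelihood of the full model
lnL_0 = -93.7    # assumed maximum log-likelihood of the null model
k = 2            # assumed number of estimated coefficients

chi2_stat = -2 * (lnL_0 - lnL_FM)         # equals -2 ln(L0 / LFM)
p_value = stats.chi2.sf(chi2_stat, df=k)  # Chi-square distribution with k degrees of freedom

print(f"chi2 = {chi2_stat:.3f}, df = {k}, p = {p_value:.4f}")
</code>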

Hypotheses for the area under the ROC curve ($AUC$):

\begin{array}{cl}
\mathcal{H}_0: & AUC=0.5, \\
\mathcal{H}_1: & AUC\neq 0.5.
\end{array}

The test statistic has the form:

\begin{displaymath}
Z=\frac{AUC-0.5}{SE_{0.5}},
\end{displaymath}

where:

$SE_{0.5}$ - the standard error of the area ($AUC$).

The statistic $Z$ has, asymptotically (for large sample sizes), a normal distribution.

The p-value, determined on the basis of the test statistic, is compared with the significance level $\alpha$:

\begin{array}{ccl}
$ if $ p \le \alpha & \Longrightarrow & $ reject $ \mathcal{H}_0 $ and accept $ 	\mathcal{H}_1, \\
$ if $ p > \alpha & \Longrightarrow & $ there is no reason to reject $ \mathcal{H}_0. \\
\end{array}
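
The sketch below evaluates the $Z$ statistic for an assumed $AUC$ value and group sizes; $SE_{0.5}$ is computed here from the Hanley-McNeil formula evaluated under the null hypothesis ($AUC=0.5$), which is one common choice and may differ from the exact formula used by a given package.

<code python>
import math
from scipy import stats

auc = 0.78  # assumed area under the ROC curve
n1 = 25     # assumed number of observations with the event
n2 = 30     # assumed number of observations without the event

# Hanley-McNeil standard error of the area under H0 (AUC = 0.5), where Q1 = Q2 = 1/3
se_05 = math.sqrt((0.25 + (n1 + n2 - 2) / 12) / (n1 * n2))

z = (auc - 0.5) / se_05
p_value = 2 * stats.norm.sf(abs(z))  # two-sided p-value

print(f"Z = {z:.3f}, p = {p_value:.4f}")
</code>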

In addition, for the ROC curve a proposed cut-off value of the combination of independent variables and model parameters is given.
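
As a sketch of how such a cut-off can be chosen, the code below selects the threshold of the linear predictor that maximises Youden's $J$ index; both the simulated data and the use of Youden's index are assumptions made for illustration, as the rule used by the software may differ.

<code python>
import numpy as np

rng = np.random.default_rng(0)
# assumed values of the linear combination of independent variables and
# model parameters (the linear predictor), together with event indicators
lin_pred = np.concatenate([rng.normal(0.0, 1.0, 40), rng.normal(1.2, 1.0, 40)])
event = np.concatenate([np.zeros(40, dtype=bool), np.ones(40, dtype=bool)])

# sensitivity and specificity for every candidate cut-off of the predictor
cutoffs = np.unique(lin_pred)
sens = np.array([(lin_pred[event] >= c).mean() for c in cutoffs])
spec = np.array([(lin_pred[~event] < c).mean() for c in cutoffs])

best = np.argmax(sens + spec - 1)  # Youden's J index
print(f"proposed cut-off: {cutoffs[best]:.3f}")
</code>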

EXAMPLE cont. (remissionLeukemia.pqs file)

1)
Volinsky C.T., Raftery A.E. (2000), Bayesian information criterion for censored survival models. Biometrics, 56(1):256–262