## Comparing (Fancy) Survival Curves with Weighted Log-rank Tests

We have just added weighted Log-rank tests to the survminer package, thanks to survMisc::comp. What are they and why are they useful? Read this blog post to find out. I used ggthemr to make the presentation a little bit more bizarre.

# Log-rank statistic for 2 groups

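The Log-rank test, based on the Log-rank statistic, is a popular tool for determining whether 2 (or more) estimated survival curves differ significantly. As stated in the literature, the Log-rank test for comparing survival (estimates of survival curves) in 2 groups ($A$ and $B$) is based on the statistic

$$LR = \frac{U^2}{V} \sim \chi^2(1),$$

where

$$U = \sum_{i=1}^{T} w_{t_i}\left(o_{t_i}^A - e_{t_i}^A\right), \qquad V = Var(U) = \sum_{i=1}^{T} w_{t_i}^2 \frac{n_{t_i}^A n_{t_i}^B o_{t_i}(n_{t_i} - o_{t_i})}{n_{t_i}^2 (n_{t_i} - 1)},$$

and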

• $t_i$ for $i=1, \dots, T$ are the possible event times,
• $n_{t_i}$ is the overall risk set size at time $t_i$ ($n_{t_i} = n_{t_i}^A+n_{t_i}^B$),
• $n_{t_i}^A$ is the risk set size at time $t_i$ in group $A$,
• $n_{t_i}^B$ is the risk set size at time $t_i$ in group $B$,
• $o_{t_i}$ is the overall number of observed events at time $t_i$ ($o_{t_i} = o_{t_i}^A+o_{t_i}^B$),
• $o_{t_i}^A$ is the number of observed events at time $t_i$ in group $A$,
• $o_{t_i}^B$ is the number of observed events at time $t_i$ in group $B$,
• $e_{t_i}$ is the overall number of expected events at time $t_i$ ($e_{t_i} = e_{t_i}^A+e_{t_i}^B$),
• $e_{t_i}^A$ is the number of expected events at time $t_i$ in group $A$,
• $e_{t_i}^B$ is the number of expected events at time $t_i$ in group $B$,
• $w_{t_i}$ is a weight for the statistic.

Also remember that the expected numbers of events are defined as

$$e_{t_i}^A = n_{t_i}^A \frac{o_{t_i}}{n_{t_i}}, \qquad e_{t_i}^B = n_{t_i}^B \frac{o_{t_i}}{n_{t_i}},$$

so $e_{t_i}^A + e_{t_i}^B = o_{t_i}^A + o_{t_i}^B$ and hence $o_{t_i}^A - e_{t_i}^A = -(o_{t_i}^B - e_{t_i}^B)$. That’s why we can substitute group $A$ with $B$ in $U$ and receive the same result: $U$ only changes sign, so $U^2$ stays the same.
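To make the statistic a bit more tangible, here is a minimal sketch in plain Python. It is only an illustration of the arithmetic, not how survMisc::comp is implemented. The function name `weighted_logrank`, its arguments and the toy data are all invented for this example. The weight is passed as a function of the overall risk set size, and the variance term is skipped when the risk set has size 1, where it would be 0/0.

```python
def weighted_logrank(times_a, events_a, times_b, events_b, weight=lambda n: 1.0):
    """Return (U, V, LR) for group A versus group B.

    times_*  : follow-up times
    events_* : 1 if the event was observed, 0 if censored
    weight   : function of the overall risk set size (default: plain Log-rank)
    """
    group_a = list(zip(times_a, events_a))
    group_b = list(zip(times_b, events_b))
    # Distinct times at which at least one event was observed.
    event_times = sorted({t for t, e in group_a + group_b if e == 1})

    u, v = 0.0, 0.0
    for t in event_times:
        # Risk sets: subjects still under observation at time t.
        n_a = sum(1 for s, _ in group_a if s >= t)
        n_b = sum(1 for s, _ in group_b if s >= t)
        # Observed events exactly at time t.
        o_a = sum(1 for s, e in group_a if s == t and e == 1)
        o_b = sum(1 for s, e in group_b if s == t and e == 1)
        n, o = n_a + n_b, o_a + o_b
        e_a = n_a * o / n  # expected events in group A
        w = weight(n)
        u += w * (o_a - e_a)
        if n > 1:  # the variance contribution is undefined for n == 1
            v += w ** 2 * n_a * n_b * o * (n - o) / (n ** 2 * (n - 1))
    return u, v, u * u / v


# Toy data (invented): follow-up times and event indicators per group.
a_times, a_events = [1, 3, 5, 7], [1, 1, 0, 1]
b_times, b_events = [2, 4, 6, 8], [1, 0, 1, 1]

u_ab, v_ab, lr_ab = weighted_logrank(a_times, a_events, b_times, b_events)
u_ba, v_ba, lr_ba = weighted_logrank(b_times, b_events, a_times, a_events)
print(u_ab, u_ba)    # U changes sign when the groups are swapped
print(lr_ab, lr_ba)  # the test statistic is the same either way

# A weighted variant: w = n (risk set size) instead of w = 1.
print(weighted_logrank(a_times, a_events, b_times, b_events, weight=lambda n: n))
```

Swapping the groups flips the sign of $U$ but leaves $U^2/V$ untouched, which is the symmetry described above.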

# Weighted Log-rank extensions

The regular Log-rank comparison uses $w_{t_i} = 1$, but many modifications to that approach have been proposed. The most popular modifications, called weighted Log-rank tests, are available in ?survMisc::comp:

• `n` Gehan and Breslow proposed to use $w_{t_i} = n_{t_i}$ (this is also called generalized Wilcoxon),
• `sqrtN` Tarone and Ware proposed to use $w_{t_i} = \sqrt{n_{t_i}}$,
• `S1` Peto-Peto’s modified survival estimate $w_{t_i} = S1({t_i}) = \prod_{j: t_j \le t_i}\left(1-\frac{o_{t_j}}{n_{t_j}+1}\right)$,
• `S2` modified Peto-Peto (by Andersen) $w_{t_i} = S2({t_i}) = \frac{S1({t_i})n_{t_i}}{n_{t_i}+1}$,
• `FH` Fleming-Harrington $w_{t_i} = S(t_i)^p(1 - S(t_i))^q$.

Watch out for FH: I reported an issue in the survMisc repository because I think their mathematical notation for Fleming-Harrington is misleading.
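As a rough illustration of how the first four weights behave, here is a small Python sketch (again just the arithmetic, not the survMisc code). It assumes that S1 is the running product of $1 - o/(n+1)$ over the event times up to and including the current one, i.e. the usual Peto-Peto modified survival estimate. The function name `simple_weights` and the risk set / event counts are invented.

```python
from math import sqrt


def simple_weights(n, o):
    """Weights at each ordered event time.

    n : overall risk set sizes at the event times
    o : overall numbers of observed events at the event times
    """
    weights = {"1": [], "n": [], "sqrtN": [], "S1": [], "S2": []}
    s1 = 1.0  # Peto-Peto modified survival estimate, updated as we go
    for n_i, o_i in zip(n, o):
        s1 *= 1.0 - o_i / (n_i + 1.0)
        weights["1"].append(1.0)                      # regular Log-rank
        weights["n"].append(float(n_i))               # Gehan-Breslow
        weights["sqrtN"].append(sqrt(n_i))            # Tarone-Ware
        weights["S1"].append(s1)                      # Peto-Peto
        weights["S2"].append(s1 * n_i / (n_i + 1.0))  # modified Peto-Peto
    return weights


# Invented example: four event times with shrinking risk sets.
n = [10, 8, 5, 2]
o = [1, 2, 1, 1]
w = simple_weights(n, o)
for name, values in w.items():
    print(name, [round(x, 3) for x in values])
```

All of the non-constant weights shrink as the risk set shrinks, so the later event times contribute less to $U$ than they do in the regular Log-rank test.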

## Why are they useful?

The regular Log-rank test is sensitive to differences in late survival times, whereas the Gehan-Breslow and Tarone-Ware propositions might be used if one is interested in early differences in survival times. The Peto-Peto modifications are also useful for early differences and are more robust (than Tarone-Ware or Gehan-Breslow) in situations where many observations are censored. The most flexible is the Fleming-Harrington method for weights, where a high p puts the emphasis on early differences and a high q puts the emphasis on differences in late survival times. But there is always the issue of how to choose p and q.
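The effect of p and q is easy to see on a toy example. The sketch below is my own illustration in Python, not the survMisc implementation. It assumes the weight is built from the Kaplan-Meier estimate just before each event time, which is the convention I know for Fleming-Harrington weights. The function name `fh_weights` and the numbers are invented.

```python
def fh_weights(n, o, p, q):
    """Fleming-Harrington style weights S^p * (1 - S)^q at ordered event times.

    n : overall risk set sizes at the event times
    o : overall numbers of observed events at the event times
    Assumption: S is the Kaplan-Meier estimate just BEFORE each event time.
    """
    weights = []
    km = 1.0  # Kaplan-Meier estimate before the first event time
    for n_i, o_i in zip(n, o):
        weights.append(km ** p * (1.0 - km) ** q)
        km *= 1.0 - o_i / n_i  # update the estimate after this event time
    return weights


# Invented example: four event times with shrinking risk sets.
n = [10, 8, 5, 2]
o = [1, 2, 1, 1]

early = fh_weights(n, o, p=1, q=0)  # large weights at the beginning
late = fh_weights(n, o, p=0, q=1)   # large weights at the end
flat = fh_weights(n, o, p=0, q=0)   # all ones: the regular Log-rank test
print([round(x, 3) for x in early])
print([round(x, 3) for x in late])
print(flat)
```

With p = 1 and q = 0 the weights fall over time, with p = 0 and q = 1 they rise, and with p = q = 0 the regular Log-rank test is recovered.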

Remember that test selection should be performed at the research design level! Not after looking at the dataset.

# Plots

After implementing the functionality requested in the GitHub issue "Other tests than log-rank for testing survival curves", we are now able to compute p-values for various Log-rank tests in the survminer package. Let us see some examples of running all the possible tests.

### ggthemr

Let’s make it more interesting (or not) with the ggthemr package, which has many predefined palettes.

After installation, one can set up a global ggplot2 palette/theme with `ggthemr()` and check the current colors with `swatch()`.

Note: the first colour in a swatch is a special one. It is reserved for outlining boxplots, text etc. For coloured lines the first colour is not used.

# References

• Gehan A 1965 A Generalized Wilcoxon Test for Comparing Arbitrarily Singly-Censored Samples. Biometrika 52(1/2):203-23.

• Tarone RE, Ware J 1977 On Distribution-Free Tests for Equality of Survival Distributions. Biometrika 64(1):156-60.

• Peto R, Peto J 1972 Asymptotically Efficient Rank Invariant Test Procedures. J Royal Statistical Society 135(2):186-207.

• Fleming TR, Harrington DP, O’Sullivan M 1987 Supremum Versions of the Log-Rank and Generalized Wilcoxon Statistics. J American Statistical Association 82(397):312-20.

• Billingsley P 1999 Convergence of Probability Measures. New York: John Wiley & Sons.