Picking the Proper Test for Survival Analysis

Wed Nov 29, 2023

When comparing survival curves, several statistical tests are available, each with its own characteristics and assumptions. These tests are designed to assess whether there are significant differences between the survival experiences of two or more groups. Here is a concise overview of the Log-rank, Gehan-Breslow, Tarone-Ware, and Peto-Peto tests.

Log-rank Test

  1. The Log-rank test is the most widely used test for comparing the survival functions of two or more groups.
  2. It is most powerful when the hazards are proportional over time (the proportional hazards assumption). It remains valid when they are not, but it can lose power, for example when the survival curves cross.
  3. This test gives equal weight to events (like death or relapse) at all time points throughout the study period.

Example for Log-rank Test: You are comparing the survival times of patients with a specific cancer who are treated with two different chemotherapy drugs. The Log-rank test would be suitable if you expect the relative effect of the two drugs (the hazard ratio) to stay roughly constant over the entire study period.
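
To make the equal weighting concrete, here is a minimal Python sketch of the two-group Log-rank statistic, written with the standard library only. The function name `logrank_test` and the small dataset are made up for illustration; for real analyses an established statistics package would be the safer choice.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test (events: 1 = event observed, 0 = censored).

    Returns (chi-square statistic, p-value) on 1 degree of freedom.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)                # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)   # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e == 1)     # events at time t
        d_a = sum(1 for u, e, g in data if u == t and e == 1 and g == 0)
        # Every event time contributes with the same weight (1).
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("variance is zero; the test is undefined for these data")
    chi2 = observed_minus_expected ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 df
    return chi2, p_value

# Hypothetical follow-up times in months, invented for this example.
drug_a_times, drug_a_events = [6, 7, 10, 15, 19, 25], [1, 0, 1, 1, 0, 1]
drug_b_times, drug_b_events = [1, 2, 4, 8, 11, 14], [1, 1, 1, 0, 1, 1]

chi2, p = logrank_test(drug_a_times, drug_a_events, drug_b_times, drug_b_events)
print(f"chi-square = {chi2:.3f}, p = {p:.4f}")
```

The statistic sums observed minus expected events in one group over all event times, which is why an early death and a late death count equally.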

Gehan-Breslow (Generalized Wilcoxon) Test

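  1. The Gehan-Breslow test weights each event by the number of individuals still at risk, so it gives more weight to events that occur earlier in the study.
  2. It can be more sensitive to differences between groups when those differences are larger at the beginning of the study period.
  3. It does not assume proportional hazards, but because its weights depend on the number at risk, it can be affected by heavy or unequal censoring.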

Example for Gehan-Breslow Test: Suppose you're analyzing a dataset of patients who underwent a surgical procedure, and you're particularly interested in early postoperative complications leading to mortality. The Gehan-Breslow test would emphasize early events, which are crucial in this context.

Tarone-Ware Test

  1. The Tarone-Ware test is a compromise between the Log-rank and Gehan-Breslow tests.
  2. It weights each event by the square root of the number of individuals at risk, which is less extreme than the Gehan-Breslow test's weighting by the number at risk itself.
  3. It can detect differences when there is a moderate change in hazards over time.

Example for Tarone-Ware Test: Consider a study comparing the survival of patients receiving standard treatment versus those enrolled in a new treatment program that's expected to improve survival more in the mid-term rather than immediately or at the very end of the study period.

Peto-Peto Test

  1. The Peto-Peto test is particularly suited for situations where the event rate is low and the sample size is small.
  2. It is similar to the Log-rank test but is more robust when the proportional hazards assumption may not hold.
  3. This test also gives more weight to earlier events. Its weight is an estimate of the pooled survival function rather than the number at risk, so it is less affected by censoring than the Gehan-Breslow test.

Example for Peto-Peto Test: Imagine a trial for a rare disease with a very low incidence rate of the event of interest. The Peto-Peto test would be appropriate here, especially if you have a small cohort of patients and you suspect that the new treatment may have a more substantial effect early in the follow-up period.
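
The four tests described above can be written as one weighted statistic that differs only in the weight attached to each event time. The notation below is introduced here for illustration and is not from the original post: at the j-th distinct event time, n_j is the total number at risk, d_j the total number of events, and n_{1j}, d_{1j} the same quantities in group 1. The Peto-Peto weight is shown in one common form; some software uses slight variants.

```latex
U_w = \sum_j w_j \left( d_{1j} - \frac{n_{1j}\, d_j}{n_j} \right),
\qquad
V_w = \sum_j w_j^2 \, \frac{n_{1j}\,(n_j - n_{1j})\, d_j\,(n_j - d_j)}{n_j^2\,(n_j - 1)},
\qquad
\chi^2 = \frac{U_w^2}{V_w}

\begin{aligned}
\text{Log-rank:}      \quad & w_j = 1 \\
\text{Gehan-Breslow:} \quad & w_j = n_j \\
\text{Tarone-Ware:}   \quad & w_j = \sqrt{n_j} \\
\text{Peto-Peto:}     \quad & w_j = \tilde{S}(t_j), \qquad
  \tilde{S}(t) = \prod_{t_i \le t} \left( 1 - \frac{d_i}{n_i + 1} \right)
\end{aligned}
```

Because n_j and the survival estimate both shrink as follow-up goes on, every weight other than the Log-rank's constant 1 emphasizes early events.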

In summary, while the Log-rank test is most suitable when the hazards are proportional, the Gehan-Breslow, Tarone-Ware, and Peto-Peto tests can be preferable when this assumption doesn't hold or when early events are of particular interest. Each test is tailored to different research questions and study designs in survival analysis.
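
As a rough illustration of this summary, the sketch below computes all four tests on the same data by changing only the weight given to each event time. It is a standard-library Python sketch, not a validated implementation: the function name `weighted_logrank`, the weight labels, and the dataset are assumptions made for this example, and the Peto-Peto weight uses one common form of the modified survival estimate.

```python
import math

def weighted_logrank(times_a, events_a, times_b, events_b, weight="logrank"):
    """Weighted two-group log-rank test (events: 1 = event, 0 = censored).

    weight is one of "logrank", "gehan-breslow", "tarone-ware", "peto-peto".
    Returns (chi-square statistic, p-value) on 1 degree of freedom.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})

    score, variance, s_tilde = 0.0, 0.0, 1.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)               # at risk, pooled
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e == 1)    # events at time t
        d_a = sum(1 for u, e, g in data if u == t and e == 1 and g == 0)
        s_tilde *= 1 - d / (n + 1)  # modified pooled survival estimate
        w = {
            "logrank": 1.0,               # equal weight at every event time
            "gehan-breslow": float(n),    # number at risk
            "tarone-ware": math.sqrt(n),  # square root of number at risk
            "peto-peto": s_tilde,         # pooled survival estimate
        }[weight]
        score += w * (d_a - d * n_a / n)
        if n > 1:
            variance += w ** 2 * d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("variance is zero; the test is undefined for these data")
    chi2 = score ** 2 / variance
    return chi2, math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 df

# Hypothetical data in which the groups differ mainly early in follow-up.
surgery_times, surgery_events = [1, 1, 2, 3, 20, 24, 30, 36], [1, 1, 1, 1, 1, 0, 1, 0]
control_times, control_events = [8, 10, 14, 18, 22, 26, 30, 36], [1, 1, 0, 1, 1, 1, 0, 0]

for name in ("logrank", "gehan-breslow", "tarone-ware", "peto-peto"):
    chi2, p = weighted_logrank(surgery_times, surgery_events,
                               control_times, control_events, weight=name)
    print(f"{name:14s} chi-square = {chi2:.3f}  p = {p:.4f}")
```

Running the four variants side by side on your own data shows how much the conclusion depends on the weighting, which is a useful check before committing to one test.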

Deciding which statistical test to use in survival analysis depends on the characteristics of your data and the assumptions you can reasonably make:

1.    Assessment of Proportional Hazards Assumption: If you expect the risk (hazard) to be proportional over time between your groups, the Log-rank test is typically the appropriate choice.

2.    Event Distribution Over Time: If early events carry more clinical significance or if you expect most events to occur early on, you might choose the Gehan-Breslow test due to its weighting towards earlier events.

3.    Weighting Preferences: The Tarone-Ware test can be a middle ground if you expect the proportional hazards assumption might not strictly hold and you want a test that is somewhat sensitive to the timing of events.

4.    Small Sample Size or Low Event Rate: The Peto-Peto test can be particularly useful when dealing with small sample sizes or low event rates, as it is more robust under these conditions.
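
For the first consideration in the list above, one informal way to assess proportional hazards is to compare log(-log S(t)) curves between groups: under proportional hazards they should be roughly parallel, separated by a constant vertical gap. The sketch below uses only the Python standard library. The helper names `kaplan_meier` and `log_minus_log` and the data are invented for illustration, and the check is a visual aid rather than a formal test.

```python
import math

def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (event time, S(t)) pairs."""
    surv, curve = 1.0, []
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        n = sum(1 for u in times if u >= t)                              # at risk
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)  # events
        surv *= 1 - d / n
        curve.append((t, surv))
    return curve

def log_minus_log(curve):
    """Return (log t, log(-log S(t))) pairs, skipping points where S is 0 or 1."""
    return [(math.log(t), math.log(-math.log(s)))
            for t, s in curve if t > 0 and 0 < s < 1]

# Hypothetical groups, invented for this example.
group_a = kaplan_meier([2, 4, 6, 9, 12, 15], [1, 1, 0, 1, 1, 0])
group_b = kaplan_meier([3, 7, 10, 14, 18, 21], [1, 0, 1, 1, 0, 1])

for label, curve in (("A", group_a), ("B", group_b)):
    for log_t, cloglog in log_minus_log(curve):
        print(f"group {label}: log(t) = {log_t:.2f}, log(-log S) = {cloglog:.2f}")
```

Plotting the two sets of points against log(t) is the usual next step. Curves that converge, diverge, or cross suggest that one of the weighted tests may be worth considering.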

 

You can use multiple tests to analyze your survival data. Each test may provide different insights, especially if the proportional hazards assumption is questionable. However, it is important to interpret the results with caution, as different tests may lead to different conclusions. Running several tests also increases the risk of type I error (false positives), so a proper statistical adjustment or a pre-specified primary analysis is recommended.
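
One possible adjustment is the Holm step-down procedure, sketched below in standard-library Python. The choice of Holm, the function name `holm_adjust`, and the p-values are assumptions made for this example. Because the four tests are computed on the same data and are strongly correlated, this kind of correction tends to be conservative, which is one reason a single pre-specified primary test is often preferred.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Multiply the k-th smallest p-value by (m - k + 1) and keep the
        # adjusted values non-decreasing.
        running_max = max(running_max, (m - rank) * p_values[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted

# Hypothetical p-values from the four tests, invented for illustration.
raw = {"logrank": 0.048, "gehan-breslow": 0.012, "tarone-ware": 0.020, "peto-peto": 0.015}
for name, p_adj in zip(raw, holm_adjust(list(raw.values()))):
    print(f"{name:14s} raw p = {raw[name]:.3f}  Holm-adjusted p = {p_adj:.3f}")
```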

 

Dr Shamshad Ahmad
Associate Professor, Department of Community and Family Medicine, AIIMS Patna
