- Journal List
- J Adv Pract Oncol
- v.7(1); Jan-Feb 2016
- PMC5045282

As a library, NLM provides access to scientific literature. Inclusion in an NLM database does not imply endorsement of, or agreement with, the contents by NLM or the National Institutes of Health.

Learn more: PMC Disclaimer | PMC Copyright Notice

J Adv Pract Oncol. 2016 Jan-Feb; 7(1): 91–100.

Published online 2016 Jan 1. doi:10.6004/jadpro.2016.7.1.8

PMCID: PMC5045282

PMID: 27713848

William N. Dudley,1, PhD, Rita Wickham,2, PhD, RN, AOCN®, and NICHOLAS COOMBS,3, MS

Author information Copyright and License information PMC Disclaimer

Studies of how patients respond to treatment over time are fundamentally important to understanding how therapies influence quality of life and progression of disease during survivorship. When investigators examine change over time in continuous variables (e.g., patient self-reports of pain, fatigue, or nausea) in the same individuals, repeated measures are typically analyzed using analysis of variance (ANOVA) or perhaps latent growth curve modeling (Brant et al., 2011; ). Other studies—particularly those that compare the long-term effects of new drugs or other therapeutic regimens to some "standard" therapy—focus on time to *binary* (yes/no) disease-related events of interest, such as death (time to event). Such studies are particularly apropos to generating improvements in cancer therapies, in which new treatments are compared to "standard" regimens, and are shown or disproved to extend progression-free survival (PFS), time to progression, or overall survival (OS) in patients with a particular cancer.

Time-to-event studies typically employ two closely related statistical approaches, Kaplan-Meier (K-M) analysis and Cox proportional hazards model analysis (sometimes abbreviated as proportional hazards model or Cox model). K-M is a univariate approach, while Cox analysis is multivariable. Both use many familiar aspects of parametric and nonparametric statistical techniques (e.g., independent and dependent variables, null hypothesis testing, and confidence intervals). On the other hand, survival analyses employ other analytical techniques, terms, and computations that some oncology advanced practitioners (APs) may be less familiar with.

No published research that addressed oncology APs’ knowledge and ability to interpret statistical tests was found, but a study of medical residents examined their knowledge within the context of statistical procedures used in medical studies (). Results proved that there was a mismatch between statistical procedures used and these clinicians’ understanding, and therefore the ability to judge the quality and veracity of published research. That is, more than 81% correctly interpreted relative risk, but only 10.5% understood K-M, and 11.9% could interpret the 95% confidence interval (CI) and statistical significance. Given that APs’ knowledge deficits may be somewhat similar (consider your own understanding of these concepts), this article will succinctly describe and illustrate K-M analysis.

## KAPLAN-MEIER ANALYSIS

Kaplan and Meier (1958) first described the approach and formulas for the statistical procedure that took their name in their seminal paper, *Nonparametric Estimation From Incomplete Observations*. They described the term "death," which could be used metaphorically to represent any *potential* event subject to random sampling, particularly when complete observations of all members of a random sample cannot be made. Incomplete observations often occur because contact with some sample members is lost before the event, some other intervening variable affects the *event*, or insufficient time has passed to observe the event in all sample members. Any of these cases would result in a participant being *censored*, as discussed further below. An event is a binary variable that can only have a *yes* or *no* value (e.g., death, hospital discharge or readmission, heart attack, recovery from an infection, or relapse from smoking cessation, etc.). K-M analyses are not unique to medical studies; they are used by researchers in other disciplines to study time to particular events.

**Survival Analyses**

Survival analyses are statistical methods used to examine changes over time to a specified event. K-M is the most frequent survival analysis method used in randomized (phase III and some phase II) medical clinical trials in which the following criteria are met:

Patients are randomly assigned to different treatment arms;

All patients do not enter the study at the same time;

Patients drop out of or are lost from the study at different time intervals after entering the study; and

The outcome variable of interest may or may not occur during the study observation period (Rich et al., 2010).

K-M can calculate how long after starting a particular treatment that the studied event (e.g., death, disease progression, etc.) occurred for individuals who were not otherwise lost to the sample—or until the study has ended (Peto et al., 1977; Rich et al., 2010).

**Underlying Concepts and Terms**

Understanding studies analyzed with K-M requires appreciation of associated concepts, terms, assumptions, and methods. Other important concepts are the "rules" for the study and K-M analysis set before the study is implemented. These include the conditions under which a study will be stopped early, stopping boundaries, how to deal with missing data, and the number and points of data analyses.

One important factor is that patients enter clinical trials and are eliminated from the sample (and data analysis) at different times: when a study opens, as accrual continues, for a predetermined period, or until a desired sample size is reached, as patients die (or experience another event of interest) or are lost from the sample for another reason. When patients are lost from a K-M study for any reason, they are considered to be censored. Being censored does not have any negative connotation; it is merely part of the language of K-M.

Censoring is a major difference between K-M and more traditional parametric analyses, in that researchers must adjust the data at each point where one or more patients are lost from the study for any reason to take censored cases into account (Rich et al., 2010). Sample members become censored when investigators cannot determine if or when a subject ultimately experiences the negative event, and it can occur during (when the subject experiences the event or otherwise drops out or is lost from the study) or at the end of the study (right censoring of all remaining subjects because no further data will be collected). Important assumptions are that censored patients have the same likelihood of survival as those continuing in the study (an assumption not easily testable), and that survival probabilities are the same whether individuals enter a study early or late (can be examined with split-half analysis; Jager et al. 2008). Censored patients are included in probability estimates of the event to the evaluation point preceding their censoring, adding a maximum amount of data, but are eliminated from subsequent analyses (Blagoev, Wilkerson, & Fojo, 2012).

Missing data is a problem that can potentially bias data analysis and statistics. One important way to deal with this is to use the intent-to-treat strategy, which includes all patients who entered the study in the sample denominator and requires patient follow-up and data collection whenever possible (Shih, 2002). Another strategy that can help address this problem is to track the numbers of patients in each arm who withdraw and reasons for withdrawal and to include this information in research reports.

A way to envision these concepts is to consider a hypothetical trial, and the first 10 consenting patients randomized to arm A shown in Figure 1A. We can appreciate the sequential order that patients in the cohort entered the study, and whether they experienced the event (E) or were censored (C). We cannot determine how long each patient remained on study (his or her serial time) before E or C occurred, but this might be a brief or extended period. Some study participants do not experience the study event, and others are dropped or are withdrawn from the study for one or more reasons. For instance, since data collection has not yet ended, patient 2 has not experienced the outcome and has not yet been censored, as the serial line is continuing (if this were the endpoint for data collection, all remaining patients become censored). Before data analysis, all patients in each cohort are first arranged from the shortest to longest serial time (time on study) and are analyzed as if they all began the study at the same time point, as shown in Figure 1B. In this representation, it is easier to see that patients have varying serial times to the event or to becoming censored (; Rich et al., 2010).

Figure 1

(A) Hypothetical first patients assigned to arm A as they sequentially enter the study. Each patient’s serial time (on study) ends with an E (for death in this example) or a C (for censored). If the data analysis is completed at the end of the period marked by the right border of the x-axis, all patients who have not died or already been censored are censored at that point. (B) The same hypothetical patients are arranged from shortest to longest serial time before K-M analysis, allowing us to see the initial intervals that will be graphed on the horizontal (time) axis.

In addition, rules for boundaries to stop a study early, and the number of endpoints of planned data analyses, should be explicit before a study is implemented. A predetermined stopping boundary is a method to determine if a study can be stopped early—for instance, when the primary outcome variable has been reached (Pocock, 2005). A stopping boundary must be stringent (e.g., have a small p value) to support meaningful clinical differences in treatments, which is suggested to be .01 to confirm clinical benefit. This is crucial to legitimately support clinically relevant, evidence-based practice; to achieve an adequate sample size within the intended duration of a trial; to positively change particular therapies; to meet the goals of researchers and regulators to conduct scientifically rigorous studies; and to disseminate data supporting therapy advances as rapidly as possible (Zannad et al., 2012). When a highly statistically significant clinical trial benefit is confirmed and leads to early stopping, the researchers have an ethical responsibility to offer the better treatment to patients in the less effective treatment arm (considering that adverse effects and treatment burdens do not outweigh benefits). For example, costs of therapy may be a burdensome limitation for some patients because of insurance reimbursement policies.

**Interpreting a Kaplan-Meier Plot**

The statistical output for a K-M analysis offers a visual representation of predicted survival curves (i.e., from not experiencing the event of interest) of two or more groups. It is not a smooth curve or line, but it has a distinctive monotonic (one-direction) stair-step appearance. For any K-M estimator, the horizontal x-axis represents the time variable expressed in a linear fashion (i.e., weeks, months, years, etc.). All patients start at the top (1.0 or 100%) of the y-axis, which indicates the sample proportion that has not experienced the studied event. Each horizontal line (except for the first) begins and ends with the occurrence of the event in two subsequent patients in a treatment arm (Jager et al., 2008; Rich et al., 2010). Only the event influences the duration of a particular interval, whereas censored patients are usually indicated by tick marks (or dots) along the interval in which they were censored.

In cancer clinical trials, negative events (e.g., PFS or OS) result in a left to right descending pattern as patients no longer "survive" the event: They experience disease progression or death. Survival curves can actually go down or up to show the same information over time; downward plots display patients who have not experienced the event, whereas upward plots illustrate the cumulative patients who did experience the event ().

The length of each horizontal line represents the survival duration for that interval, and all survival estimates to a given point represent the cumulative probability of surviving to that time. Intervals are not identical, and a strength of the K-M plot is that it can manage varying interval lengths (Rich et al., 2010). Figure 2A shows how each study interval (after the first) begins with the studied event in one patient and ends with the event in the next patient in that cohort. This leads to the K-M plot looking like a series of downward steps. The probability of surviving an interval is related to the number of patients in that interval: Both the numerator and the denominator decrease by the number of patients who experienced the event plus those who were censored. Each of these probabilities contributes to the subsequent and final probability of not experiencing the event (e.g., progression or death).

Figure 2

(A) A hypothetical Kaplan-Meier curve of one cohort (arm). Each horizontal portion is the interval between the studied event between one and the next subject in that arm. Only the event influences the interval length, whereas tick marks indicate censored subjects. (B) Median survival (from experiencing the studied event) can be estimated in both arms by drawing a line on the y-axis at 0.5 (50%). Locating the point at which each intersects 0.5 shows median survival is approximately 6.5 months in cohort B and 11 months in cohort A.

The "steps" of the K-M plot provide the visual representation of individuals who have or have not experienced the event. We can look at the K-M plot in Figure 2A and calculate predicted survival for the first interval. Assuming the original sample had 10 patients, if we did not consider the censored patient, the estimated survival at this point (the first drop) would be 9/10 (90%). However, this is actually 8/9 (88.8%). Each interval is assumed to be independent, and each affects the subsequent interval. The vertical lines connect each interval and represent the decrease in likelihood of having experienced the event at that point. In Figure 2A, at 2.5 months after starting the study, the chance of not experiencing the event is 88.8%. It is important to understand that this example only serves to illustrate a K-M "curve" in a very simple fashion. Very small samples are prone to error and are less valid and reliable than large samples. If patients were still accruing to the study, analysis would occur at a later, more reasonable time to judge treatment efficacy.

K-M curves with many small steps have larger sample sizes, while those with large steps usually have a limited number of subjects and are thus less accurate (Rich et al., 2010). "Drops" can be seen in the curve at variable intervals, and in studies with long observation times, bigger decreases toward the right side of the plot are seen because later events are larger fractions of the probability estimate for the remaining cohort (Blagoev et al., 2012). Similarly, few surviving patients at the right side of the K-M curve mean less accurate survival estimates and greater uncertainty compared to when many patients are in a study. It has been recommended to halt estimations of survival curves when the proportion of patients who have not experienced the event becomes unduly small, perhaps when only 10% to 20% of an original large sample or fewer than 10 patients in a small study are still being followed (Bollschweiler, 2003; Pocock et al., 2002).

If we look at the same hypothetical K-M plot with both treatment arms included (Figure 2B), we can see that both curves pass through the 50th percentile point. If a curve passes through 50%, the reader can quickly estimate median survival for patients in that treatment arm by drawing a vertical line from where the curve crosses the 50% to the x (time) axis and comparing median survival if both curves pass through the 50% point. We can see in the hypothetical example that median survival (50% of patients would be estimated to be surviving) is about 11 months for treatment A and 6.5 months for treatment B. Median survival is reported in most studies because survival times are usually skewed, and the median is a better measure of centrality than the mean. Furthermore, there is no way to know if or when patients who are alive and not censored at the end of a study will experience the event of interest, so a mean cannot be calculated (Jager et al., 2008). The reader can also see that the curves appear to have separated. Again, this is not a realistic K-M example, but if this separation were consistent over time, it would give us confidence about real treatment differences.

K-M estimates are most commonly reported with the log-rank test or with hazard ratios. The log-rank test calculates chi-squares (𝝌₂) for each event time, which are summed to calculate an ultimate chi-square for each arm (Jager et al., 2008; Rich et al., 2010). Log-rank results compare the full curves of each group and generate a significance level (*p* value; Rich et al., 2010). The log-rank test allows between-group comparisons of survival estimates but not the size of a potential difference or of confounding variables such as age.

Hazard ratios quantify the opposite likelihood: that the "hazardous" event will occur during study intervals (Blagoev et al., 2012), and are similarly calculated by summing 𝝌₂ for each event, providing the final observed and expected numbers for the full K-M curve (Rich et al., 2010). In the simplest terms, a hazard ratio expresses the chance (or hazard) of the events occurring in the treatment arm as a ratio of the events occurring in the control group. A hazard ratio has no dimensions and by itself provides only information about the uniformity and reliability of the data (Blagoev et al., 2012). Hazard ratios change over time and are reflected in the slope of the K-M plot. Reported hazard ratios assume that the differences between groups are a constant distance apart (i.e., the K-M survival curves) and are proportional. If this assumption is not met, a reported hazard ratio is irrelevant. A hazard ratio of greater than 1 or less than 1 means that survival was better in one of the groups ().

## DISCUSSION: CLEOPATRA ANALYSIS

The Clinical Evaluation of Pertuzumab and Trastuzumab (CLEOPATRA) trial, a randomized, double-blind, multinational phase III study, accrued 808 patients in the intent-to-treat population over 29 months (2008 to 2010). Study findings were published in three articles (Baselga et al., 2012; Swain et al., 2013, 2015). The study timeline is briefly summarized in Table 1. The CLEOPATRA study was sponsored by a pharmaceutical company that, in partnership with the senior academic authors, collected and analyzed the data (no independent statistician was reported). An overview of the CLEOPATRA trial can be found in the article by Karen Herold.

Table 1

CLEOPATRA Overview

The study was planned with two prespecified K-M analyses: an interim one and a final one done after 381 events of disease progression or death from any cause (Baselga et al., 2012). Random assignment resulted in 406 patients in the control group (placebo + trastuzumab + docetaxel) and 402 patients in the pertuzumab group (pertuzumab + trastuzumab + docetaxel); treatment was planned for every 3 weeks to progression. Only the dose of docetaxel could be altered. Eligible patients had locally recurrent, unresectable, or metastatic (no central nervous system spread) HER2-positive breast cancer. Independent tumor assessments were done every 9 weeks until disease progression or death. The primary endpoint was PFS, defined as "radiographic confirmation of disease progression by Response Evaluation Criteria in Solid Tumors (RECIST) guidelines or death from any cause within 18 weeks after the last independent tumor assessment." The study employed the common practice of using date of the radiographic confirmation as the date of disease progression, which most likely occurred somewhere between the 9-week tumor assessment intervals. This would increase the duration of PFS and introduce bias into calculated median survival ().

In accordance with the requirements for a trustworthy K-M trial, a single, prespecified interim analysis of the primary study outcome of PFS after 381 patients experienced progression was planned (Baselga et al., 2012). This was calculated to give the study an 80% power to detect a 33% improvement in median PFS in the pertuzumab group. PFS is frequently the primary endpoint, particularly as new targeted therapies become available (Panageas et al., 2007). Progression-free survival is a desirable outcome because it is not influenced by later-line therapies and can be measured earlier than OS. This reduces new drug development time and more rapidly brings effective agents to market.

In CLEOPATRA, K–M was used to estimate the independently assessed median PFS in each group, log-rank test to compare PFS between the two groups, and Cox proportional-hazards model to estimate the hazard ratio and 95% CIs. Analysis of OS was planned to be done after 385 patients had died if the stopping boundary had not been crossed (Baselga et al., 2012). Other secondary endpoints objective response (OR) rate and safety would be analyzed at this point.

In this analysis, 80.2% of patients in the pertuzumab group and 69.3% in the control group experienced an OR (Baselga et al., 2012). Median PFS was 18.5 months in the pertuzumab group and 12.4 months in the control group. This exceeded the study hypothesis that median PFS would be 33% greater in the pertuzumab than in the control group. The hazard ratio for progression or death was 0.62 (95% CI = 0.51–0.75), *p* < .001 in favor of pertuzumab.

Similar to odds ratios and relative risk, a hazard ratio is interpreted as such: Those in the treatment (pertuzumab) group experienced death at a rate 38% less than those in the control group. This decrease could be as great as 49% or as little as 25% with 95% confidence. With this interval ranged less than and not including the value of one, we would conclude the hazard ratio is both *protective* and *statistically significant*. Interim analysis of OS was done after 165 events: 96 deaths in the control group and 69 in the pertuzumab group. The data showed a "strong trend toward a survival benefit" with pertuzumab and the hazard ratio was 0.64 (95% CI = 0.47–0.88; *p* = .005), which the authors stated was not statistically significant (recall from the earlier discussion that the p value should be ≤ .01). The number of patients censored and the reasons for censoring were not included in this paper, possibly because both disease progression and death were included in the definition of not meeting PFS.

After another year of follow-up, a second *unplanned* interim analysis of CLEOPATRA was done because European health authorities requested more information about OS of study patients (Swain et al., 2013). By that time, 267 deaths—154 in the control group and 113 in the pertuzumab group (69% of prespeciﬁed total for OS)—had occurred. The stopping boundary for each interim analysis was preset to use the O’Brien-Fleming approach to deal with multiple data analyses ("multiple looks"), and the stopping boundary was crossed.

Multiple looks, or data-dependent stopping, are used to find evidence of a significantly large treatment difference to end a study earlier than originally planned. The major problem with this is the likelihood of type I error (falsely rejecting the null hypothesis that there is no difference between treatments) increases with each interim analysis, so unplanned interim analyses are discouraged. A second problem is "multiple outcomes," which occurs when a study that focuses on one outcome necessarily focuses on other outcomes that are likely interdependent and not independent. For instance, the definition of PFS includes disease progression or death, which overlaps with OS. In addition, most patients probably died from their disease, but deaths could be related to other factors.

Swain and colleagues (2013) correctly recognized the importance of not increasing the risk for type I error in the analysis of OS. They amended the protocol again to apply the O’Brien-Fleming stopping boundary, defined as a significance level of ≤ .0138 and a hazard ratio of ≤ 0.739. More patients in the control group died than did those in the pertuzumab cohort, 154 of 406 (38%) and 113 of 402 (28%), respectively. The hazard ratio of 0.66 (95% CI = 0.52−0.84, *p* = .0008) crossed the preset O’Brien-Fleming stopping boundary, leading the authors to conclude there was a statistically significant OS benefit for patients who had received pertuzumab in addition to trastuzumab plus docetaxel.

The final CLEOPATRA article was descriptive and updated OS and PFS (Swain et al., 2015). This was essentially the icing on the cake because significant benefit for adding pertuzumab to trastuzumab plus docetaxel had been established in the reported second interim analysis (Swain et al., 2013). The definition of PFS was changed to "the time from randomization to documented radiographic evidence of progression" (no mention of death).

Patients who were alive or lost to follow-up were censored at the last date they were known to be alive, which is what we would expect in K-M analysis (Swain et al., 2015). The most common reason for censoring was disease progression, followed distantly by life-threatening adverse treatment-related events (see Table 2). In addition to changing definitions for the final analysis, K-M curves were handled differently. That is, in Baselga et al. (2012), the PFS K-M plot indicates all points at which events took place as tick marks on the plot (Figure 3). The reason for this was not given, but it illustrates the importance of reading figure legends. On the other hand, Swain and colleagues (2015) showed the K-M curve we would expect, with tick marks showing the time points at which patients were censored (Figure 4).

Table 2

Patients Withdrawn (Censored) From CLEOPATRA Study

Figure 3

Kaplan-Meier estimates of progression- free survival in patients in the intention to- treat population in the CLEOPATRA trial. Tick marks designate the times of events. This highlights the importance of carefully reading legends, particularly in Kaplan-Meier curves in which tick marks or dots usually indicate censored individuals. Adapted from Baselga et al. (2012).

Figure 4

Kaplan-Meier estimates of overall survival in the intention-to-treat population in the CLEOPATRA trial. In this curve, tick marks indicate censored patients. Because this curve shows overall survival, censored patients most likely experienced progressive disease, and some of the early ones were probably docetaxel-related toxicity. If we look at the plot and estimate overall survival, our calculations will be close to what was found in the statistical analysis (56.5 months in the pertuzumab group and 30.8 months in the control group). Note that the number at risk decreases as the curve moves to the right, and most patients have been censored or died. According to recommendations, analysis after 60 months would not be recommended because of decreased accuracy. Adapted from Swain et al. (2015).

A total of 168 (41.8%) in the pertuzumab group and 221 (54.4%) in the control group had died by the time of the final report (Swain et al., 2015). As expected, the hazard ratio favored the pertuzumab cohort (0.68; 95% CI = 0.56–0.84; *p* < .001). Median OS in the pertuzumab group was 56.5 months (95% CI = 49.3 mo–not reached) and 40.8 months (95% CI = 35.8–48.3 mo) in the control group: a difference of 15.7 months. Estimates of OS shown in Table 3 illustrate the fact that the likelihood of being alive was greater at 1, 2, 3, and 4 years for patients receiving pertuzumab than those in the control group (Swain et al., 2015). The reported CIs overlapped only at year 1, meaning that these values were within the bounds of random chance (Pocock et al., 2002).

Table 3

Overall Predicted Survival in the CLEOPATRA Trial

## CONCLUSION

In sum, K-M analyses of the CLEOPATRA study met the expectations of the statistical technique and addressed potential limitations. For instance, the issue of multiple looks was correctly addressed by making the p value more stringent. The authors also included important measures of statistical uncertainty—confidence intervals—that support the CLEOPATRA conclusions and give readers confidence in the research reports.

## Footnotes

The authors have no potential conflicts of interest to disclose.

## References

1. Baselga José, Cortés Javier, Kim Sung-Bae, Im Seock-Ah, Hegg Roberto, Im Young-Hyuck, Roman Laslo, Pedrini José Luiz, Pienkowski Tadeusz, Knott Adam, Clark Emma, Benyunes Mark C, Ross Graham, Swain Sandra M. Pertuzumab plus trastuzumab plus docetaxel for metastatic breast cancer. *The New England journal of medicine*. 2012;366:109–119. [PMC free article] [PubMed] [Google Scholar]

2. Blagoev Krastan B, Wilkerson Julia, Fojo Tito. Hazard ratios in cancer clinical trials--a primer. *Nature reviews. Clinical oncology*. 2012;9:178–183. [PMC free article] [PubMed] [Google Scholar]

3. Bollschweiler Elfriede. Benefits and limitations of Kaplan-Meier calculations of survival chance in cancer surgery. *Langenbeck’s archives of surgery / Deutsche Gesellschaft für Chirurgie*. 2003;388:239–244. [PubMed] [Google Scholar]

4. Brant J M, Beck S. L, Dudley W N, Cobb P, Pepper G, Miaskowski C. Symptom trajectories in posttreatment cancer survivors. . *Cancer Nursing*. 2011;34(1):67–77. [PubMed] [Google Scholar]

5. Dudley William N, McGuire Deborah B, Peterson Douglas E, Wong Bob. Application of multilevel growth-curve analysis in cancer treatment toxicities: the exemplar of oral mucositis and pain. *Oncology nursing forum*. 2009;36:E11–19. [PubMed] [Google Scholar]

6. Jager Kitty J, van Dijk Paul C, Zoccali Carmine, Dekker Friedo W. The analysis of survival data: the Kaplan-Meier method. *Kidney international*. 2008;74:560–565. [PubMed] [Google Scholar]

7. Kaplan E L, Meier P. Nonparametric estimation from incomplete observations. *Journal of the American Statistical Association*. 1958;53:457–481. [Google Scholar]

8. Panageas Katherine S, Ben-Porat Leah, Dickler Maura N, Chapman Paul B, Schrag Deborah. When you look matters: the effect of assessment schedule on progression-free survival. *Journal of the National Cancer Institute*. 2007;99:428–432. [PubMed] [Google Scholar]

9. Peto R, Pike M C, Armitage P, Breslow N E, Cox D R, Howard S V, Mantel N, McPherson K, Peto J, Smith P G. Design and analysis of randomized clinical trials requiring prolonged observation of each patient. II. analysis and examples. *British journal of cancer*. 1977;35:1–39. [PMC free article] [PubMed] [Google Scholar]

10. Pocock Stuart J. When (not) to stop a clinical trial for benefit. *JAMA*. 2005;294:2228–2230. [PubMed] [Google Scholar]

11. Pocock Stuart J, Clayton Tim C, Altman Douglas G. Survival plots of time-to-event outcomes in clinical trials: good practice and pitfalls. *Lancet (London, England)* 2002;359:1686–1689. [PubMed] [Google Scholar]

12. Rich Jason T, Neely J Gail, Paniello Randal C, Voelker Courtney C J, Nussenbaum Brian, Wang Eric W. A practical guide to understanding Kaplan-Meier curves. *Otolaryngology--head and neck surgery : official journal of American Academy of Otolaryngology-Head and Neck Surgery*. 2010;143:331–336. [PMC free article] [PubMed] [Google Scholar]

13. Shih W J. Problems in dealing with missing data and informative censoring in clinical trials. *Current Controlled Trials in Cardiovascular Medicine*. 2002;3:4–11. [PMC free article] [PubMed] [Google Scholar]

14. Spruance S L, Reid J E, Grace M, Samore M. Hazard ratio in clinical trials. *Antimicrobial Agents and Chemotherapy*. 2004;48:2787–2792. [PMC free article] [PubMed] [Google Scholar]

15. Swain Sandra M, Baselga José, Kim Sung-Bae, Ro Jungsil, Semiglazov Vladimir, Campone Mario, Ciruelos Eva, Ferrero Jean-Marc, Schneeweiss Andreas, Heeson Sarah, Clark Emma, Ross Graham, Benyunes Mark C, Cortés Javier. Pertuzumab, trastuzumab, and docetaxel in HER2-positive metastatic breast cancer. *The New England journal of medicine*. 2015;372:724–734. [PMC free article] [PubMed] [Google Scholar]

16. Swain Sandra M, Kim Sung-Bae, Cortés Javier, Ro Jungsil, Semiglazov Vladimir, Campone Mario, Ciruelos Eva, Ferrero Jean-Marc, Schneeweiss Andreas, Knott Adam, Clark Emma, Ross Graham, Benyunes Mark C, Baselga José. Pertuzumab, trastuzumab, and docetaxel for HER2-positive metastatic breast cancer (CLEOPATRA study): overall survival results from a randomised, double-blind, placebo-controlled, phase 3 study. *The Lancet. Oncology*. 2013;14:461–471. [PMC free article] [PubMed] [Google Scholar]

17. Windish Donna M, Huot Stephen J, Green Michael L. Medicine residents’ understanding of the biostatistics and results in the medical literature. *JAMA*. 2007;298:1010–1022. [PubMed] [Google Scholar]

18. Zannad Faiez, Gattis Stough Wendy, McMurray John J V, Remme Willem J, Pitt Bertram, Borer Jeffrey S, Geller Nancy L, Pocock Stuart J. When to stop a clinical trial early for benefit: lessons learned and future approaches. *Circulation. Heart failure*. 2012;5:294–302. [PubMed] [Google Scholar]

Articles from Journal of the Advanced Practitioner in Oncology are provided here courtesy of **BroadcastMed LLC**