Later is not necessarily better: limitations of survival analysis in studies of long-term drug treatment of psychiatric conditions
  1. Joanna Moncrieff1,
  2. Janus Christian Jakobsen2,3,
  3. Max Bachmann4
  1. 1Division of Psychiatry, University College London, London, UK
  2. 2Copenhagen Trial Unit, Centre for Clinical Intervention Research, Copenhagen University Hospital, Copenhagen, Denmark
  3. 3Department of Regional Health Research, The Faculty of Health Sciences, University of Southern Denmark, Odense, Denmark
  4. 4University of East Anglia, Norwich, UK
  1. Correspondence to Dr Joanna Moncrieff, Division of Psychiatry, University College London, London WC1E 6BT, UK; j.moncrieff{at}



  • mental disorders
  • health services research
  • psychiatry
  • methods

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:



Survival analysis was initially developed to analyse risk of death over time, but is now used for the analysis of many categorical outcomes in health research, including relapse of mental health conditions. The outcome that survival analysis assesses is the time to the outcome or ‘event’ in question. Common methods include Cox regression analysis, which produces a hazard ratio (HR) and is dependent on the assumption that the ratio between the hazard rates in the two groups is constant over time (proportional hazards assumption). The log rank test does not make this assumption and tests the statistical significance of the difference in the overall survival time between the groups but does not yield an effect estimate. The Kaplan-Meier method graphically describes survival over time.
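To make the idea of ‘survival over time’ concrete, the following is a minimal Python sketch of the Kaplan-Meier estimator. The function name `kaplan_meier` and the five follow-up records are hypothetical and chosen only for illustration; they are not taken from any trial discussed in this article.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time for each participant
    events : 1 if the event (eg, relapse) was observed at that time,
             0 if the participant was censored at that time
    """
    n_events_at = Counter(t for t, e in zip(times, events) if e)
    survival, curve = 1.0, []
    for t in sorted(n_events_at):
        # Participants still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1 - n_events_at[t] / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical data: months of follow-up; 1 = relapse observed, 0 = censored.
times = [2, 4, 4, 6, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
for month, surv in curve:
    print(f"month {month}: estimated proportion relapse-free = {surv:.2f}")
```

Each step of the curve depends only on who is still at risk at that time, which is why censored participants contribute information up to the point at which they leave the study.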

Tests of differences between groups based on survival analysis are statistically efficient when the duration of participants’ follow-up varies, and are therefore commonly preferred to the alternative method of comparing the proportion of events at a particular time.1 2 Limitations to do with censoring and low precision of the last stages of survival curves are well-recognised.3 In the following paragraphs, we consider problems of analysing studies of drug treatment for relapse prevention in psychiatric conditions. These follow from the fact that survival analysis depends on the time it takes for an event to occur, combined with the potential for withdrawal of previous drug treatment to exert a differential influence on the risk of relapse in intervention trials that employ a discontinuation design.

Time to event is not necessarily the key outcome

Survival analysis is useful when the timing of the outcome is important, and its relevance depends on people wanting to avoid early adverse events more than late ones. Tests based on survival analysis may show highly significant results if relapse is delayed by a few weeks, for example. However, the clinical relevance of a temporary delay in relapse in a long-term psychiatric condition that may last for decades has not been established, and statistically significant results are a questionable basis for implementing an intervention that may be of limited importance to patients. There has been no published research or discussion about whether patients, carers or clinicians consider a delay in relapse to be important, the minimum length of delay that would qualify, and how these views are influenced by the possible causes of relapse. Despite the widespread use of survival analysis there has been little consideration of its implications or how it compares with other approaches. In an analysis of relapse definitions in 81 trials of antipsychotics, for example, none considered whether time to relapse or overall risk was more relevant.4

Non-proportional hazards

If follow-up is short, differences between groups may be misleading, because short-term differences may not persist with longer-term follow-up. In other words, it cannot be assumed that the HR or risk ratio between the groups remains constant over a longer duration. The difficulties caused by non-proportional hazards or risks when survival curves converge or cross are well-recognised by statisticians, yet frequently ignored when trials are reported.5 In theory, a single HR is misleading in this situation, and Cox regression, which depends on the assumption of proportional hazards, should not be conducted. In practice, non-proportional hazards are common, but the proportional hazards assumption is frequently not tested and single HRs are often quoted.5 Although the log rank test is technically correct in this situation, since it simply tests for a difference in the survival curves without assuming proportional hazards, it does not account for the pattern of the curves and its use forecloses discussion about the causes of varying hazards. Therefore, the use of the log rank test may also lead to misleading results, or at least results that are difficult to interpret clinically.6
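A simple way to see why a single HR can mislead is to estimate the ratio separately within successive intervals of follow-up. The Python sketch below does this with entirely hypothetical counts of relapses and person-months at risk, and it assumes the hazard is roughly constant within each interval so that events divided by person-time approximates the hazard rate. The group labels, interval widths and numbers are illustrative assumptions, not data from any trial.

```python
def rate(events, person_time):
    """Crude hazard rate: events per unit of person-time at risk."""
    return events / person_time


def pooled_rate(group):
    """Single rate over the whole follow-up, ignoring how it changes."""
    return sum(e for e, _ in group) / sum(pt for _, pt in group)


# Hypothetical data per 6-month interval: (relapses, person-months at risk).
discontinuation = [(30, 500), (8, 400), (5, 350)]
maintenance = [(10, 560), (8, 500), (7, 450)]

# Ratio of hazard rates within each interval.
interval_hrs = [rate(*d) / rate(*m) for d, m in zip(discontinuation, maintenance)]

# One summary ratio for the whole period.
overall_ratio = pooled_rate(discontinuation) / pooled_rate(maintenance)

for i, hr in enumerate(interval_hrs):
    print(f"months {6 * i}-{6 * (i + 1)}: rate ratio = {hr:.2f}")
print(f"whole period: rate ratio = {overall_ratio:.2f}")
```

In this hypothetical case the ratio is large in the first interval and falls below 1 in the last, so the single summary figure describes none of the intervals well.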

Various statistical solutions have been proposed to manage situations in which non-proportional hazards are found.6 7 However, these are still based on ‘time to event’ analysis, and hence assume that delaying the event is the key desirable outcome.

Withdrawal effects in psychiatric trials

Trials of long-term treatment in psychiatry, including trials referred to as relapse prevention trials, employ a discontinuation design in which the withdrawal of previous treatment is compared with its continuation. Many trials of psychiatric drugs, including antipsychotics, antidepressants and lithium,8–11 reveal that people randomised to switch to placebo or no treatment show a high rate of early relapse that declines over subsequent months or years (eg, see figures 1 and 2). This high rate of early relapse is absent or less marked in those randomised to continue treatment, and hence there are substantial differences in relapse rates to begin with, which diminish over time, as demonstrated in meta-analyses of antipsychotic and antidepressant relapse prevention trials, for example.12–14 The reason for the high rate of early relapse is contested, and there are several possibilities.15 It has been argued that it reflects a naturally high risk of relapse in some disorders, such as schizophrenia,16 but the untreated risk in most psychiatric disorders, including schizophrenia, is not known. Where historical evidence exists, as in bipolar disorder, it suggests the underlying risk is lower than following discontinuation of drug treatment.17 18 In chronic conditions, early relapse may also represent the re-emergence of underlying symptoms that were previously suppressed by medication, but this can explain only a minority of early relapses, because most people do not have severe chronic symptoms and long-term medication is usually prescribed for prophylaxis. Another possibility is that the withdrawal of previous treatment increases the risk of adverse outcomes over and above the underlying risk, through withdrawal effects or relapse induced by the process of withdrawal.


Figure 1. Time to relapse (days) among people with first-episode psychosis randomised to maintenance antipsychotic treatment (MT) or supported reduction (DR). Reproduced with permission from Wunderink et al.8 Copyright (2013) American Medical Association. All rights reserved.5 DR, discontinuation; MT, maintenance treatment.


Figure 2. Time to relapse among people with treatment-resistant depression randomised to esketamine plus antidepressant or placebo plus antidepressant. Reprinted from Singh et al.34 Copyright (2020) with permission from Elsevier.21

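It is well-established that psychiatric drugs of all kinds produce physiological withdrawal effects when they are stopped, especially if people have been using them for long periods.19 These are manifested in physical and psychological symptoms, and may be mistaken for relapse, since symptoms overlap and there are no definitive ways of distinguishing the two situations.20 Although withdrawal symptoms are traditionally thought to be short-lived, accumulating evidence suggests they can sometimes be protracted over many weeks or months.21–23 In addition, withdrawal itself may precipitate relapse in some conditions, including schizophrenia or psychosis and bipolar disorder, increasing the risk of relapse for several months.19 24 25 For example, studies of lithium treatment in people diagnosed with bipolar disorder show that the risk of developing an episode after lithium is stopped is higher than the risk before lithium was started.10 26 27 The possibility that relapse may be induced by drug withdrawal is also supported by the fact that some evidence suggests relapse is less likely following gradual compared with abrupt discontinuation.28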

Survival analysis in the presence of withdrawal effects

Several commentators have highlighted how withdrawal effects confound the interpretation of relapse prevention studies.15 19 24 25 These studies may not, in fact, provide reliable data about the benefits of starting long-term medication, only about the adverse effects of stopping it. Nevertheless, trials of treatment discontinuation are valuable since many people are established on long-term treatment that may not be beneficial or that they want to stop. Whatever the purpose of a treatment withdrawal trial, the pattern of adverse outcomes following randomisation is important to understand. Survival curves can be helpful in illustrating such patterns if follow-up is long enough for withdrawal effects to evolve and dissipate. However, a single, global test of the difference between those who continue on treatment and those who withdraw over the whole period based on the time to relapse obscures these effects. This is important because the occurrence of withdrawal effects may affect how people view the desirability of remaining on, or coming off treatment. Moreover, such tests are often presented as if they are equivalent to tests of cumulative differences in risk at the end of follow-up. In a situation of constant hazards this is the case, but where hazard rates diverge or cross, results of tests based on survival analysis may conflict with tests of the eventual cumulative risk ratio. As figures 1 and 2 illustrate, if survival curves show an early divergence and then meet, tests based on the overall difference of survival curves will indicate a positive effect for the treatment that produces the fewest early relapses, but tests based on the cumulative risks at later follow-up may show no difference. Two examples illustrate these arguments.
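The conflict between the two kinds of test can be reproduced with a small, entirely hypothetical data set before turning to published trials. The Python sketch below implements the standard two-group log rank statistic and applies it to invented data in which half of each group relapses, but the withdrawal group relapses early and the maintenance group relapses late. The function name `log_rank_chi2`, the group sizes and the relapse days are assumptions made for illustration only.

```python
def log_rank_chi2(times_a, events_a, times_b, events_b):
    """Two-group log rank chi-squared statistic (1 degree of freedom)."""
    all_times = times_a + times_b
    all_events = events_a + events_b
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for u in times_a if u >= t)  # at risk in group A
        n_b = sum(1 for u in times_b if u >= t)  # at risk in group B
        d_a = sum(1 for u, e in zip(times_a, events_a) if e and u == t)
        d_b = sum(1 for u, e in zip(times_b, events_b) if e and u == t)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    return (observed_a - expected_a) ** 2 / variance


# Hypothetical trial: 200 people per group, everyone followed to day 1000.
end = 1000
withdraw_times = list(range(1, 101)) + [end] * 100    # 100 early relapses
withdraw_events = [1] * 100 + [0] * 100
maintain_times = list(range(501, 601)) + [end] * 100  # 100 later relapses
maintain_events = [1] * 100 + [0] * 100

chi2 = log_rank_chi2(withdraw_times, withdraw_events,
                     maintain_times, maintain_events)

# With no censoring before the end, cumulative risk is a simple proportion.
risk_withdraw = sum(withdraw_events) / len(withdraw_events)
risk_maintain = sum(maintain_events) / len(maintain_events)

print(f"log rank chi-squared = {chi2:.2f} (3.84 is the 5% critical value)")
print(f"cumulative risk at day {end}: {risk_withdraw:.2f} vs {risk_maintain:.2f}")
```

In this invented scenario the log rank statistic exceeds the conventional 5% critical value even though the cumulative risk of relapse at the end of follow-up is identical in the two groups, which is the pattern described above.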

Example 1: antipsychotics and relapse prevention in first-episode psychosis

A trial conducted in the Netherlands comparing antipsychotic discontinuation with maintenance treatment in people with a first episode of psychosis followed people up initially at 18 months and then again after 7 years.

At the 18-month follow-up, survival analysis revealed a constant HR, which was estimated by Cox regression analysis and indicated that the risk of relapse was 2.3 times higher in the group randomised to discontinuation.29

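Data from the 7-year follow-up, however, showed that survival curves converged at around 3 years and then crossed; therefore, the proportional hazards assumption of Cox regression would not be fulfilled (figure 1).8 However, the log rank test would probably indicate an advantage for the maintenance group, because early relapses were more frequent in the discontinuation group.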

As discussed, there is no evidence about whether people value a delay in relapse in this situation, and if they do, what duration of delay would balance out the considerable adverse effects associated with antipsychotics; furthermore, there is evidence from the long-term follow-up of this particular trial that antipsychotics may impair social functioning.8 Although antipsychotics were reduced more gradually in this than in other studies, the fact that the excess risk of relapse occurred early on in the group randomised to antipsychotic reduction suggests it may have been associated with the process of discontinuation. The study demonstrates the importance of long-term follow-up, and of not assuming that short-term outcomes are equivalent to long-term outcomes. The HR generated at the 18-month follow-up is not equivalent to the ratio of the cumulative proportions of people relapsing after 7 years, even though hazards were initially proportional. Use of the log rank test would obscure this withdrawal effect. Therefore, the cumulative proportion of people relapsing at different follow-up points, including the final follow-up, would be a better test of the overall outcome in this situation. This is what the authors presented in the follow-up report.5


Example 2: esketamine and relapse prevention in treatment-resistant depression

A further example is provided by the results of a relapse prevention trial of esketamine for people with treatment-resistant depression.30 Although esketamine is a relatively new preparation, withdrawal symptoms following recreational use of ketamine (a similar drug) are recognised, and include low mood and anxiety, which may be mistaken for relapse.31 Psychological factors may also precipitate relapse following withdrawal, since significant unblinding is likely to occur after switching to placebo (due to loss of the psychoactive effects of esketamine), leading to anxiety about treatment withdrawal.

The original analysis of this trial was performed using the Kaplan-Meier method and the log rank test showed a significant difference between esketamine and placebo (p=0.003). Survival curves indicated varying HRs but did not cross. The maximum divergence of risk between the groups occurred within the first 8 weeks following randomisation, suggesting a likely withdrawal effect. Although the authors of the esketamine trial asserted there was no evidence of a withdrawal syndrome, no details were provided, and it was not explained how relapse was distinguished from possible withdrawal.30 Subsequent data from this study and from others suggest that withdrawal symptoms occur commonly, are similar to those reported with ketamine,32 and may be interpreted as a relapse of depression.

A letter published in response to this trial pointed out that results were strongly influenced by an ‘outlier’ site that reported a particularly large difference between the groups. The author of the letter analysed the data excluding this site by comparing the proportions of participants relapsing in both groups over the whole trial duration using Fisher’s exact test.33 This found no statistically significant difference between the groups (p=0.13). The authors of the original trial objected that this was not the best way to analyse the data and applied survival analysis, as they had done in the original trial report.34 This revealed survival curves that crossed at around 9 months of follow-up (figure 2). The log rank test was used to compare the groups and indicated a statistically significant difference (p=0.048), and Cox regression was apparently used to estimate an HR.

Since the proportional hazards assumption is violated because the survival curves converge, it is incorrect to use Cox regression to calculate an overall HR, but the log rank test could also be misleading, because it obscures the convergence of the survival curves after what appears to be an early withdrawal effect. Treatment-resistant depression is a long-term condition, and no research has yet clarified whether delaying symptom recurrence for a few months would outweigh adverse effects or represent value for money. It is also important to recognise the likelihood of a withdrawal effect, rather than to assume that the difference between the groups is the result of the treatment per se. Evidence of a withdrawal effect has a bearing on the cost-benefit analysis of starting treatment, and is particularly important in view of the fact that acute trials of esketamine have not demonstrated a clinically relevant effect.35


There was no direct patient and public involvement in the preparation of this article, due to the lack of funding. However, there is a high degree of concern among patients and the public about withdrawal effects from psychiatric drugs and how their impact has been misunderstood and underestimated.36 This article was partly inspired by the first author’s discussions with patients who are concerned about the interpretation of randomised drug trials.


Both examples illustrate how survival analysis may be misleading because it obscures a possible withdrawal effect, and delaying time to relapse has not been established as a worthwhile outcome. Both studies also underline the importance of conducting long-term follow-up, since the outcomes of different interventions and treatment approaches can vary over time. Further research is needed on how patients and carers value a delay in relapse, and further debate is required about whether time to relapse should be preferred over the overall risk of relapse, especially in view of withdrawal-related adverse effects. Until then, we suggest that survival analysis should not be routinely employed in trials of interventions aimed at relapse prevention in long-term psychiatric conditions. Statistical methods for comparing proportions, such as the χ² test or logistic regression, should be used instead, complemented by Kaplan-Meier survival curves to illustrate the timing of outcomes.
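As an illustration of the suggested alternative, the Python sketch below compares cumulative proportions relapsing at two follow-up points using the Pearson χ² test for a 2×2 table (1 degree of freedom, no continuity correction). The function name `compare_proportions` and all counts are hypothetical assumptions chosen to show an early difference that disappears later; they are not results from either trial discussed above.

```python
from math import erfc, sqrt


def compare_proportions(events_a, n_a, events_b, n_b):
    """Risk ratio and Pearson chi-squared test for two proportions."""
    a, b = events_a, n_a - events_a  # group A: relapsed, not relapsed
    c, d = events_b, n_b - events_b  # group B: relapsed, not relapsed
    n = n_a + n_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(chi2 / 2))   # upper tail of chi-squared with 1 df
    risk_ratio = (events_a / n_a) / (events_b / n_b)
    return risk_ratio, chi2, p_value


# Hypothetical cumulative relapses: discontinuation vs maintenance, 100 per group.
early = compare_proportions(40, 100, 20, 100)  # at an early follow-up point
late = compare_proportions(62, 100, 60, 100)   # at the final follow-up

for label, (rr, chi2, p) in (("early", early), ("final", late)):
    print(f"{label} follow-up: risk ratio = {rr:.2f}, "
          f"chi-squared = {chi2:.2f}, p = {p:.3f}")
```

Reporting the comparison at each follow-up point separately keeps an early, possibly withdrawal-related, difference distinct from the eventual cumulative risk.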

Ethics statements

Patient consent for publication



  • Twitter @joannamoncrieff

  • Contributors All authors contributed ideas to the manuscript. JM wrote the first draft, and MB and JCJ helped to revise it.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests JM is the chief investigator on an NIHR-funded trial of antipsychotic reduction and co-investigator on an NIHR-funded trial of methods of antidepressant discontinuation. MB and JCJ have no competing interests.

  • Provenance and peer review Not commissioned; externally peer reviewed.
