Clinical trials are the foundations of evidence-based treatments. Trials must be critically appraised to confirm the validity of conclusions. Further analysis is required to show if the results from the trial, where patients are carefully selected and followed up in detail, can be extrapolated to other patients and different settings. Data from additional sources including other trials, meta-analyses, practice guidelines, trusted opinions and clinical experiences modify prescribing practices.
Clinical trials are prospective studies designed to show the effects of treatments on specified outcomes in defined patient-groups. Randomised controlled trials have become the 'gold standard' for assessing therapeutic interventions. Rational prescribing practices are shaped by how we interpret trials, incorporate additional scientific data and extrapolate to the treatment of patients in our own settings, with the published and/or unpublished guidance of colleagues.1
Appraising clinical trials
We learn about new treatments from print or electronic publications, and at lectures, workshops, clinical and scientific meetings. Other sources are lay media, pharmaceutical companies and patients.
Trials help to resolve current therapeutic uncertainties and to provide frameworks for future decisions. Careful consideration of any trial exposes potential difficulties in the interpretation of studies addressing similar problems.
Questions to consider in appraising trials
What kind of outcomes were determined?
Health outcomes like death and clinical manifestations of disease are more tangible, both to ourselves and our patients, than non-clinical outcomes or 'surrogates' (e.g. effects on laboratory indices). 'Surrogate markers' are only useful when meaningful correlations with clinical events are well established.
Were there differences in beneficial and/or adverse outcomes between groups?
It is sometimes difficult to be certain whether a significant difference has been confirmed or excluded. Apparent differences may be reported when there are no real differences (type I error), or real differences may be missed (type II error).
The probability that chance alone might account for the apparent differences between the groups studied is often expressed as a 'p' value. A 'p' value of 0.05 means that, if there were truly no difference between the treatments and the study were repeated 100 times, a difference at least as large as the one observed would be expected by chance about 5 times. Requiring a small 'p' value before accepting a difference as real limits false-positive conclusions (type I errors). A negative result might arise because the trial was not powerful enough to detect differences of a size considered to be clinically important. This is a type II error and is a very common problem in clinical trials.
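The link between trial size and type II error can be illustrated with a rough power calculation. The sketch below uses a normal approximation for comparing two proportions. The function name, the event rates (10% versus 5%) and the group sizes are hypothetical values chosen for illustration and do not come from any trial discussed here.

```python
from math import sqrt
from statistics import NormalDist


def approximate_power(p_control, p_treated, n_per_group, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions.

    Normal approximation with equal group sizes; an illustrative sketch,
    not a substitute for a formal sample-size calculation.
    """
    standard_normal = NormalDist()
    # Critical value for a two-sided test at the chosen type I error rate.
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    # Standard error of the difference between the two observed proportions.
    se = sqrt(
        p_control * (1 - p_control) / n_per_group
        + p_treated * (1 - p_treated) / n_per_group
    )
    true_difference = abs(p_control - p_treated)
    # Probability that the observed difference exceeds the critical value
    # (the negligible chance of significance in the wrong direction is ignored).
    return standard_normal.cdf(true_difference / se - z_alpha)


# Hypothetical event rates: 10% on control, 5% on treatment.
for n in (100, 500, 1000):
    power = approximate_power(0.10, 0.05, n)
    print(f"{n} patients per group: power about {power:.2f}, "
          f"type II error about {1 - power:.2f}")
```

Under these assumed figures, a trial with 100 patients per group would miss a real halving of the event rate more often than it would detect it, which is why a 'negative' small trial does not exclude a clinically important effect.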
'Confidence intervals' (usually 95%) give a range of values within which the true difference between the groups is likely to lie: if the trial were repeated 100 times, about 95 of the intervals calculated in this way would be expected to contain the true difference. Confidence intervals express both type I and type II concepts. They provide additional information about the precision of differences found between groups by giving a range of effects rather than a single 'point estimate'. The narrower the confidence interval, the more precise the estimate of the effect. When a broad confidence interval only just includes the value representing no effect of the intervention, a more powerful study might show a significant result.
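As a minimal sketch of how a confidence interval reflects trial size, the code below calculates a 95% interval for the absolute difference in risk between two groups using the simple normal-approximation (Wald) method. The function name and the patient counts are hypothetical and are not taken from any trial cited here; other interval methods exist and may be preferable for small numbers.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference_ci(events_control, n_control, events_treated, n_treated,
                       level=0.95):
    """Return (difference, lower, upper) for the absolute risk difference.

    Uses the simple normal-approximation (Wald) interval.
    """
    risk_control = events_control / n_control
    risk_treated = events_treated / n_treated
    difference = risk_control - risk_treated
    se = sqrt(
        risk_control * (1 - risk_control) / n_control
        + risk_treated * (1 - risk_treated) / n_treated
    )
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return difference, difference - z * se, difference + z * se


# Hypothetical trials with the same event rates (10% vs 5%) but different sizes.
for label, counts in {
    "small trial": (10, 100, 5, 100),
    "large trial": (100, 1000, 50, 1000),
}.items():
    diff, lower, upper = risk_difference_ci(*counts)
    includes_no_effect = lower <= 0 <= upper
    print(f"{label}: difference {diff:.3f}, "
          f"95% CI {lower:.3f} to {upper:.3f}, "
          f"includes no effect: {includes_no_effect}")
```

Both hypothetical trials have the same point estimate, but only the larger one yields an interval that excludes zero, the value representing no effect.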
Were the differences in outcome due to the treatment?
There are other factors which can affect the result of a trial and these should be considered when a trial is analysed.
Were patients allocated at random to different interventions?
Claims of random allocation of patients to different treatment-groups should be confirmed.2 Genuine random allocation ensures that results were not influenced by particular interventions being chosen by patients, their medical advisers, or those undertaking the trial.
Were similar groups of patients compared?
Randomisation into groups for alternative treatments is intended to make the patient groups similar. For a sufficiently large number of patients, randomisation should balance known and unknown factors. It is still possible for groups to differ in ways that might influence the outcome of the trial, especially when numbers are small and entry criteria broad. Groups allocated to different interventions following randomisation should have been similar in basic demographic parameters such as age and sex. When death or disability are outcomes, the groups should have been evenly matched for known prognostic factors. For example, the results will be unreliable if patients with a good prognosis were over-represented in the group given the trial treatment. Efforts should be made to eliminate the possibility of such 'random confounding', which distorts the interpretation of what might have caused a particular outcome.
Where appropriate, were the patients and/or the outcome-assessors and/or those undertaking the intervention 'blinded'?
Knowledge of the nature of the treatment (e.g. placebo or drug) might have resulted in differences in outcomes spuriously attributed to the intervention. At times, adverse consequences of interventions (e.g. drug adverse effects) can effectively 'unblind' a study.
Were the different groups of patients managed identically except for the interventions in question?
Differences in management (e.g. frequency and nature of follow-up visits) might also have altered outcomes. All aspects of management should be the same in each group of patients.
Were all patients in the different groups accounted for?
It is possible to draw false conclusions when assumptions have been made concerning patients lost to follow-up across the different treatment groups. Patients who have 'dropped out' affect the interpretation of the trial.
Were there differences in the uptake of treatment across patient groups?
Compliance - There is evidence to suggest that 'compliant' patients (even when the intervention is a placebo) have better outcomes in some circumstances.3 The report should comment on compliance in the different groups of patients compared. Poor compliance affects the precision of a trial as any effect of treatment is diluted.
'Cross overs' - In some trials (e.g. comparisons of medical and surgical interventions in patients with coronary heart disease), patients may elect to 'cross over', for medical or other reasons, into the alternative intervention group following randomisation. Because an element of choice is then involved in selecting treatments, the most appropriate comparison is usually between the groups as originally randomised ('intention to treat'), rather than according to 'treatment received'. The magnitude and direction of such 'cross overs' may lead to contradictory interpretations.
How big was the effect of treatment?
Although a treatment may result in a demonstrable effect in a trial, the magnitude of the results needs to be carefully considered.
Does the outcome have biological relevance?
Some differences in outcomes large enough to have statistical significance may be too small to have biological relevance for future patients considering the treatment.
Relative versus absolute outcome differences
The results of clinical trials are commonly expressed in terms of 'relative risks', usually with confidence intervals. 'Relative risk' is the ratio of the incidences of outcomes in two groups of patients being compared. Large values for relative risk reduction are often incorrectly assumed to mean large effects on outcomes for individual patients. Absolute benefit depends on both the absolute incidence of outcome and the relative risk reduction after intervention (see clinical example 1).
The number of patients who need to be treated to prevent one outcome event (NNT) expresses absolute outcomes concisely.4 For example, in middle-aged men with hypercholesterolaemia and no prior myocardial infarction, 50 need to be treated with pravastatin for 5 years to prevent one non-fatal myocardial infarction (NNT = 50).5
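The arithmetic linking baseline risk, relative risk reduction and NNT can be sketched as follows, using the figures from clinical example 1. The relative risk reductions of 90% and 10% are read from the multipliers in that example's table; the function names are illustrative only.

```python
def absolute_risk_reduction(baseline_risk, relative_risk_reduction):
    """Events prevented per patient treated."""
    return baseline_risk * relative_risk_reduction


def number_needed_to_treat(baseline_risk, relative_risk_reduction):
    """Patients who must be treated to prevent one event."""
    return 1 / absolute_risk_reduction(baseline_risk, relative_risk_reduction)


# Figures from clinical example 1:
# (risk of death if untreated, relative risk reduction with treatment).
diseases = {"A": (0.05, 0.90), "B": (0.80, 0.10)}

for name, (baseline, rrr) in diseases.items():
    arr = absolute_risk_reduction(baseline, rrr)
    nnt = number_needed_to_treat(baseline, rrr)
    print(f"Disease {name}: relative risk reduction {rrr:.0%}, "
          f"absolute risk reduction {arr:.1%}, NNT about {nnt:.1f}")
```

The treatment with the much larger relative risk reduction (disease A) has the larger NNT because so few untreated patients would have died. The unrounded values correspond to the NNTs of 22 and 13 quoted in the example.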
How do trial outcomes ('efficacy') relate to the treatment of your next patient ('effectiveness')?
Although the purpose of clinical trials is to enhance evidence-based interventions, it is sometimes unreliable to extrapolate directly from trial results to treatments for other patients.
How similar is your patient, setting and follow-up to those reported?
When considering if the results of a trial can be generalised, one should know how patients were recruited (e.g. newspaper advertisements, referral from particular practitioners or hospitals) and what specific inclusion and exclusion criteria were applied, in addition to characteristics such as age, gender, ethnicity, social factors and medical co-morbidities (see clinical example 2). Furthermore, reported trial outcomes may only be possible with close attention to follow-up. This may be unachievable in community settings, in contrast to teaching hospitals.
Clinical example 1
If disease A has a 5% chance and disease B an 80% chance of causing death, and treatment reduces deaths by 90% in A but by only 10% in B, the NNT (number of patients needing treatment for one to benefit) is 22 if A is treated and 13 if B is treated:
|Disease|Number of patients|Deaths if untreated|Deaths prevented by treatment|Deaths if treated|NNT to prevent one death|
|---|---|---|---|---|---|
|A|100|5|5 x 0.9 = 4.5|0.5|100/4.5 = 22.2 (rounded to 22)|
|B|100|80|80 x 0.1 = 8|72|100/8 = 12.5 (rounded to 13)|
Are results attributed to 'subgroups' of trial patients valid?
Incorrect conclusions may result when patients are sub-divided into smaller groupings.7 Confidence intervals are broader and the probability of spurious conclusions greater, especially when the analyses are made retrospectively after reviewing outcomes. Subgroups should have been specified before the trial commenced.
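A rough calculation shows why multiple subgroup analyses invite spurious conclusions. The sketch below assumes that the treatment truly has no effect, that each subgroup is tested at the 5% level, and that the tests are independent of one another. These are simplifying assumptions, and the function name is hypothetical.

```python
def chance_of_spurious_result(n_subgroups, alpha=0.05):
    """Probability that at least one of n independent subgroup tests is
    'significant' at level alpha when the treatment truly has no effect."""
    return 1 - (1 - alpha) ** n_subgroups


for n in (1, 5, 10, 20):
    print(f"{n:>2} subgroups: {chance_of_spurious_result(n):.0%} chance "
          f"of at least one false-positive finding")
```

With ten such subgroups, the chance of at least one spurious 'significant' finding is about 40%, compared with 5% for a single pre-specified comparison.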
Group versus individual outcomes
When outcomes reflect recurrent events in groups (e.g. fracture rate per thousand patients per year in osteoporosis trials), it is especially difficult to determine the benefits for an individual patient.8
Acceptability of interventions
Some interventions may be unacceptable to your patient. Although 'efficacious' in the trial situation, they are 'ineffective' in practice.
Clinical example 2
Your patient is a depressed 60-year-old man of Central European background. A heavy smoker, he has a strong family history of coronary disease and marked hyperlipidaemia in spite of an appropriate diet. He is in atrial fibrillation, but has never been known to have had a myocardial infarction.
You wonder if you should recommend simvastatin on the basis of a study showing beneficial outcomes of lipid reduction in asymptomatic men.5
You will need to make a judgement about the potential benefits of simvastatin to your patient. A different but closely-related drug was used in the trial. It is possible, but most unlikely, that the trial findings are not applicable to Central European men. Although men with psychiatric disease were excluded, it seems unlikely that their inclusion would have altered the outcome. Likewise, the presence of atrial fibrillation seems unlikely to alter the potential benefits of lipid reduction, but could indicate significant non-coronary heart disease for which lipid reduction would be inappropriate. Furthermore, the magnitude of the benefits of lipid reduction in patients with atrial fibrillation associated with coronary heart disease and/or hyperlipidaemia may be different.
Clinical trials in perspective
Individual patients are unique and their particular settings mean that the application of trial results requires judgements based on experience, additional scientific knowledge and opinion. Issues include benefit, risk, cost and patient choice.
The unpublished or published opinions (e.g. reviews, editorials) of 'experts' are valuable both to help us understand trial results and to share clinical experiences. 'Experts', however, may disagree and their views may be incorrect and/or biased.
Additional clinical trials
Prescribers often encounter clinical trials with apparently different results for similar interventions. Appraisal of trials usually reveals why the results disagree. Some resolution is usually possible when different trials are 'weighted' according to their qualities.
Systematic reviews and meta-analyses (see future article)
Systematic reviews select clinical trials fulfilling specific criteria from all previous trials (including unpublished trials and diverse publication languages) and then calculate net outcomes. They offer more precision by pooling results over larger patient groups. Like trials, they vary in quality9 (e.g. whether or not data from individual patients are included, how individual trials are weighted, how differences in patients and interventions are handled). As the techniques of clinical mega-trials and systematic reviews evolve, especially when they appear to give contradictory results,10 there is much debate concerning their relative worth. They are best viewed as complementary techniques. The potential contribution of the Cochrane Collaboration,11 which prepares, maintains and disseminates systematic reviews, is enormous.
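One common way of weighting trials when pooling is the inverse-variance fixed-effect method, in which more precise trials count for more. The sketch below is an illustration of that technique only, not the method of any particular review. The trial results are hypothetical, and the function name is an assumption made for this example.

```python
from math import exp, log, sqrt


def pool_fixed_effect(relative_risks, standard_errors):
    """Inverse-variance fixed-effect pooling on the log relative-risk scale.

    standard_errors are those of the log relative risks.
    Returns (pooled relative risk, lower 95% limit, upper 95% limit).
    """
    # Each trial is weighted by the inverse of its variance.
    weights = [1 / se ** 2 for se in standard_errors]
    log_rrs = [log(rr) for rr in relative_risks]
    pooled_log = sum(w * x for w, x in zip(weights, log_rrs)) / sum(weights)
    pooled_se = sqrt(1 / sum(weights))
    return (
        exp(pooled_log),
        exp(pooled_log - 1.96 * pooled_se),
        exp(pooled_log + 1.96 * pooled_se),
    )


# Hypothetical trials: relative risks and standard errors of log(relative risk).
trial_rrs = [0.70, 0.90, 0.80]
trial_ses = [0.30, 0.25, 0.10]

pooled, lower, upper = pool_fixed_effect(trial_rrs, trial_ses)
print(f"Pooled relative risk {pooled:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The pooled standard error is smaller than that of the most precise single trial, which is the sense in which pooling offers more precision. A fixed-effect model assumes the trials estimate the same underlying effect, so it does not address differences in patients and interventions between trials.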
Clinical practice guidelines
These are designed to blend scientific knowledge with local consensus opinion in a form that is acceptable to prescribers and/or patients with the objective of higher quality and, where possible, more efficient health care.12
Combining risk assessment with trial outcomes
When relative risk is shown to be constant over a range of clinical circumstances, the absolute benefit will be greater (i.e. the NNT is smaller) when the intervention is applied to those at greater risk. For example, in non-valvular atrial fibrillation, when all other factors are equal, NNT will be smaller, and therefore anticoagulation more effective, when used in patients known to be at relatively higher risk of embolic stroke.13
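The effect of baseline risk on NNT when relative risk reduction is constant can be sketched as follows. The relative risk reduction and the annual event risks below are hypothetical values chosen for illustration. They are not estimates for anticoagulation in atrial fibrillation and are not taken from the cited reference.

```python
def nnt_for_risk(baseline_risk, relative_risk_reduction):
    """NNT when the same relative risk reduction applies at any baseline risk."""
    return 1 / (baseline_risk * relative_risk_reduction)


# Hypothetical values chosen for illustration only.
ASSUMED_RRR = 0.60
annual_event_risk = {"lower risk": 0.01, "moderate risk": 0.04, "higher risk": 0.10}

for group, risk in annual_event_risk.items():
    print(f"{group}: untreated risk {risk:.0%} per year, "
          f"NNT about {nnt_for_risk(risk, ASSUMED_RRR):.0f} for one year")
```

Because the NNT is the reciprocal of the absolute risk reduction, a tenfold increase in baseline risk gives a tenfold decrease in NNT under this assumption.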
Seeking better treatments by interpreting clinical trials is challenging and rewarding. Patients benefit from the critical appraisal of clinical trials and the evaluation of health-care interventions relevant to their problems. Extrapolation from the results of trials to sound treatment recommendations for particular patients requires additional assumptions and judgements. In those areas of medicine where trials have been undertaken, critical appraisal is a necessary requirement for rational prescribing.
Further reading
Guyatt GH, Sackett DL, Cook DJ. Users' guides to the medical literature. II. How to use an article about therapy or prevention. A: Are the results of the study valid? Evidence-Based Medicine Working Group. JAMA 1993;270:2598-601.
Guyatt GH, Sackett DL, Cook DJ. Users' guides to the medical literature. II. How to use an article about therapy or prevention. B: What were the results and will they help me in caring for my patients? Evidence-Based Medicine Working Group. JAMA 1994;271:59-63.
Sackett DL. Applying overviews and meta-analyses at the bedside. J Clin Epidemiol 1995;48:61-6.
Sackett DL. Evidence-based medicine: how to practice and teach EBM. New York: Churchill Livingstone, 1997.
Spitzer WO, editor. The Potsdam International Consultation on meta-analysis. J Clin Epidemiol 1995;48:1-171.
Doing more good than harm: the evaluation of health care interventions. Conference Proceedings. New York, March 22-25, 1993. Ann NY Acad Sci 1993;703:1-341.
References
1. Evidence-Based Medicine Working Group. Evidence-based medicine. A new approach to teaching the practice of medicine. JAMA 1992;268:2420-5.
2. Schulz KF. Subverting randomization in controlled trials. JAMA 1995;274:1456-8.
3. Horwitz RI, Viscoli CM, Berkman L, Donaldson RM, Horwitz SM, Murray CJ, et al. Treatment adherence and risk of death after a myocardial infarction. Lancet 1990;336:542-5.
4. Laupacis A, Sackett DL, Roberts RS. An assessment of clinically useful measures of the consequences of treatment. N Engl J Med 1988;318:1728-33.
5. Shepherd J, Cobbe SM, Ford I, Isles CG, Lorimer AR, MacFarlane PW, et al. Prevention of coronary heart disease with pravastatin in men with hypercholesterolaemia. West of Scotland Coronary Prevention Study Group. N Engl J Med 1995;333:1301-7.
6. The West of Scotland Coronary Prevention Study Group. A coronary primary prevention study of Scottish men aged 45-64 years: trial design. J Clin Epidemiol 1992;45:849-60.
7. Oxman AD, Guyatt GH. A consumer's guide to subgroup analyses. Ann Intern Med 1992;116:78-84.
8. Windeler J, Lange S. Events per person year - a dubious concept. Br Med J 1995;310:454-6.
9. Oxman AD, Cook DJ, Guyatt GH. Users' guides to the medical literature. VI. How to use an overview. Evidence-Based Medicine Working Group. JAMA 1994;272:1367-71.
10. Borzak S, Ridker PM. Discordance between meta-analyses and large-scale randomized controlled trials. Examples from the management of acute myocardial infarction. Ann Intern Med 1995;123:873-7.
11. Bero L, Rennie D. The Cochrane Collaboration. Preparing, maintaining and disseminating systematic reviews of the effects of health care. JAMA 1995;274:1935-8.
12. National Health and Medical Research Council. Standing Committee on Quality of Care and Health Outcomes. Guidelines for the development and implementation of clinical practice guidelines. Canberra: Australian Government Publishing Service, 1995.
13. Glasziou PP, Irwig LM. An evidence based approach to individualising treatment. Br Med J 1995;311:1356-9.