Key concepts of clinical trial analysis for everyone to understand. 
‘Texas shooter’ 2

Leading expert in clinical trial methodology and epidemiology, Dr. Stephen Evans, MD, explains key statistical concepts for patients. He clarifies what an underpowered trial is and why it fails to detect real treatment effects. Dr. Evans details the importance of pre-specified primary endpoints to avoid bias. He also breaks down the Number Needed to Treat (NNT) metric, highlighting its uses and limitations. These concepts are vital for interpreting medical news and understanding treatment efficacy.

Understanding Clinical Trial Analysis: Power, Endpoints, and NNT Explained

Underpowered Clinical Trials

An underpowered clinical trial lacks sufficient participants to reliably detect a real treatment effect. Dr. Stephen Evans, MD, explains that a trial's power is its ability to find a true difference if one exists. He uses COVID-19 treatment trials as an example, noting that studying mortality requires a large sample size because, even among hospitalized patients, only a minority die.

For instance, detecting a reduction in mortality from 10% to 7% requires a large number of patients. If a trial is too small, it becomes underpowered and may miss a clinically important benefit. Early COVID-19 trials were often underpowered for mortality outcomes. Dr. Stephen Evans, MD, emphasizes that power relates directly to the specific outcome being studied.
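The 10% versus 7% example can be made concrete with a rough calculation. The sketch below is not taken from the interview. It uses a standard normal-approximation formula for comparing two proportions, and the function names, the two-sided 5% significance level, and the 80% power target are all illustrative assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def sample_size_per_arm(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate patients needed in each arm (assumed: two-sided test,
    equal arm sizes, normal approximation to the binomial)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)


def approximate_power(p_control, p_treated, n_per_arm, alpha=0.05):
    """Approximate chance that a trial of this size detects the difference
    (ignores the negligible chance of significance in the wrong direction)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    standard_error = sqrt(variance / n_per_arm)
    return _STD_NORMAL.cdf(abs(p_control - p_treated) / standard_error - z_alpha)


if __name__ == "__main__":
    needed = sample_size_per_arm(0.10, 0.07)
    print("Patients per arm for 10% vs 7% mortality:", needed)
    print("Power with 100 per arm:", round(approximate_power(0.10, 0.07, 100), 2))
    print("Power with", needed, "per arm:",
          round(approximate_power(0.10, 0.07, needed), 2))
```

Under these assumptions, the calculation suggests on the order of 1,350 patients per arm, while a trial with only 100 patients per arm would have roughly a 12% chance of detecting the same difference. Different assumptions (for example, 90% power) would change the numbers.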

Primary vs. Secondary Endpoints

Clinical trials define primary and secondary endpoints to measure treatment success. The primary endpoint is the main outcome the trial is designed to evaluate. Dr. Stephen Evans, MD, notes that mortality is a crucial but challenging primary endpoint because it requires large patient numbers.

Researchers often choose easier-to-study primary outcomes, like time to recovery or viral load. Such outcomes can require fewer participants. Time to recovery can be somewhat subjective when it rests on clinical assessment, whereas a measure such as viral load is objective. Dr. Evans cautions that these definitions must be clear and established before the trial begins. Changing endpoints after seeing results introduces significant bias and undermines the findings.

Texas Sharpshooter Fallacy

The Texas Sharpshooter Fallacy is a critical concept in clinical trial integrity. Dr. Stephen Evans, MD, describes it as drawing a target around bullet holes after firing a gun. In research, this means changing the trial's primary outcome after seeing the data to get a desired result.

This practice introduces severe bias and undermines the trial's validity. Legitimate reasons to change an endpoint do exist, but any change must be made before the researchers know what the results show. Dr. Evans stresses that pre-specification of endpoints is essential for credible clinical trial analysis. This prevents researchers from choosing, after the fact, whichever outcome happens to produce a false positive result.
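A small calculation illustrates why choosing the outcome after seeing the data is so damaging. The sketch below is an illustration rather than anything stated in the interview. It assumes a treatment with no real effect, several candidate outcomes that are statistically independent (a simplification, since real trial outcomes are usually correlated), and a conventional 5% significance threshold. The function name is hypothetical.

```python
def chance_of_false_positive(n_outcomes, alpha=0.05):
    """Probability that at least one of n_outcomes looks 'significant'
    purely by chance, assuming independent outcomes and no true effect."""
    return 1 - (1 - alpha) ** n_outcomes


if __name__ == "__main__":
    for n_outcomes in (1, 5, 10):
        chance = chance_of_false_positive(n_outcomes)
        print(f"{n_outcomes} candidate outcome(s): {chance:.0%} chance of a false positive")
```

With one pre-specified outcome the false positive risk stays at 5%. If researchers may pick the best of five outcomes afterward, it rises to about 23%, and with ten outcomes to about 40%, under these assumptions.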

Number Needed to Treat (NNT)

The Number Needed to Treat (NNT) is a useful metric for patients to understand treatment benefit. Dr. Stephen Evans, MD, defines NNT as the number of patients who need to receive a treatment to prevent one bad outcome. For example, if a drug reduces mortality from 10% to 5%, the NNT is 20.

This means 20 people must be treated to prevent one death. However, Dr. Stephen Evans, MD, notes important limitations. NNT is not a pure number; it depends on follow-up time and outcome definition. Comparisons between treatments are only valid if the NNT is calculated identically. Despite its simplicity, NNT requires careful interpretation.
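The arithmetic behind NNT, and its dependence on follow-up time, can be sketched in a few lines. The first function follows the definition given above (one divided by the absolute difference in risk). The second is purely illustrative: it assumes a constant monthly hazard in each group, and the hazard values and function names are hypothetical, not data from any trial.

```python
from math import exp


def number_needed_to_treat(risk_control, risk_treated):
    """NNT = 1 / absolute risk reduction."""
    absolute_risk_reduction = risk_control - risk_treated
    if absolute_risk_reduction <= 0:
        raise ValueError("treatment does not reduce risk; NNT is undefined")
    return 1 / absolute_risk_reduction


def nnt_at_followup(hazard_control, hazard_treated, months):
    """NNT at a given follow-up, assuming constant monthly hazards
    (an illustrative model only)."""
    risk_control = 1 - exp(-hazard_control * months)
    risk_treated = 1 - exp(-hazard_treated * months)
    return number_needed_to_treat(risk_control, risk_treated)


if __name__ == "__main__":
    # The same 5-point absolute difference gives the same NNT at any baseline.
    for control, treated in ((0.10, 0.05), (0.20, 0.15), (0.50, 0.45)):
        nnt = number_needed_to_treat(control, treated)
        print(f"{control:.0%} -> {treated:.0%}: NNT = {nnt:.1f}")

    # Hypothetical hazards: the same treatment, different follow-up times.
    for months in (6, 24):
        nnt = nnt_at_followup(0.01, 0.005, months)
        print(f"Follow-up {months} months: NNT = {nnt:.1f}")
```

In this hypothetical model the same treatment has an NNT of about 35 at six months and about 10 at two years, which is why two NNT figures can only be compared when follow-up time and outcome definition match.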

Interpreting Trial Results

Correctly interpreting clinical trial results requires understanding key statistical concepts. Dr. Stephen Evans, MD, advises looking for adequately powered studies with pre-specified endpoints. This ensures the findings are reliable and not due to chance or bias.

Patients should consider the clinical relevance of the outcomes. A statistically significant result may not be meaningful if the NNT is very high. Dr. Anton Titov, MD, highlights the importance of these concepts for public health literacy. Understanding power, endpoints, and NNT helps everyone critically evaluate medical news and make informed decisions.
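The gap between statistical significance and clinical relevance can be shown with an invented example. The trial below is hypothetical (50,000 patients per arm, 10% versus 9.5% mortality), and the test used is a standard pooled two-proportion z-test, chosen here for simplicity rather than because the interview prescribes it.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    rate_a = events_a / n_a
    rate_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    # Hypothetical trial: 5,000 deaths of 50,000 (control)
    # versus 4,750 deaths of 50,000 (treated).
    p_value = two_proportion_p_value(5000, 50000, 4750, 50000)
    nnt = 1 / (5000 / 50000 - 4750 / 50000)
    print(f"p-value: {p_value:.4f}")
    print(f"NNT: {nnt:.0f}")
```

Here the p-value falls well below 0.05, yet about 200 patients would need treatment to prevent one death. Whether that is worthwhile depends on the treatment's cost and side effects, which the p-value alone does not reveal.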

Full Transcript

Dr. Anton Titov, MD: Professor Evans, there are several basic concepts in clinical trials. What does it mean, for example, that a trial is underpowered? Clinical trial terminology is now at the forefront; it's out in the newspapers. People have to understand these basic concepts. So what does it mean if a trial is underpowered? What is NNT, the number needed to treat, and what are its pluses and minuses? What are the primary and secondary endpoints of clinical trials? Clearly, a few trials have been moving the goalposts, and this is well known in the medical community.

Dr. Stephen Evans, MD: We'll try to take nearly all of our examples from the current situation with COVID-19. If we are going to study mortality, that will require a fairly large number of people. Fortunately, not everyone will die, even in a hospital situation. Let's say 10% of people die within 30 days of beginning treatment. A difference that would probably be quite important would be reducing that mortality rate from 10% down to 7%. We're going to need a large number of patients to be able to find out whether such a difference is actually occurring.

We do statistical analysis on that. But if the numbers are too small in the trial, then that is a trial we call underpowered. The power of the study to detect a real difference, if it exists, was too low. This was true for some of the early trials that were carried out on potential treatments for COVID-19.

Whereas if we study thousands of patients, then it's unlikely that the trial will be underpowered for mortality as an outcome, provided we're dealing with differences that are reasonable. If we wanted to detect a difference between a 10% mortality rate and a 9.9% mortality rate, we would need tens of thousands of patients. That, of course, is not a difference that would be very useful to individual patients.

So underpowered trials are a problem. It's underpowered in relation to the outcome you study. If you made mortality your primary outcome, you would need a lot of patients. Very often, what people do is make mortality a secondary outcome and make their primary outcome something that is easier to study and for which we need fewer patients.

In this kind of situation, that is often the time to recovery from the disease. The problem with that is that it can be slightly subjective. You can define someone reaching a level of recovery based on a clinical assessment, but it may be based on viral load or something of that kind, which is an objective assessment.

So we may be able to have an objective assessment for a primary outcome that is easier to study than mortality. The problem is that when we look at recovery, we have a definition for it. But it may be that people don't meet those definitions. It becomes obvious in the trial that the outcome you set out as a primary one is not going to give you any useful data.

There can be legitimate reasons for changing it. But the difficulty is that if people know what the results are showing, they can change the question and therefore get the answer they want. In epidemiology, this is called the Texas sharpshooter syndrome, where the Texas gunman stands at the side of a barn and fires his gun at the barn, and then afterward walks up and draws a target.

You need in a trial to have a target specified beforehand, then do the trial and see what the results are, rather than changing the target while the trial is running. In general, there can be legitimate reasons for changing your outcome. But you've got to be very careful and make sure that you're not doing it after having already fired your gun and seen where the bullets fall.

You need to do it before you know where the bullets are falling.

When we come to measuring the outcome, one of the things we can do is say, what is the mortality rate? Let's say we have a treatment difference of 10% down to 5%. That means in every hundred people, there will be five people who do not die as a result of having treatment. For every 20 people, there will be one person who does not die.

When we turn that upside down, we say that the number needed to treat to prevent one death will be 20, with our difference between 10% and 5%. That would also be the case if there was a difference between 20% and 15% or between 50% and 45%. It is a measure of the number of patients who need to be treated to prevent one death.

Sometimes, instead of death, we look at a particular event like myocardial infarction or a stroke. The problem with this number is that it isn't a pure number. It depends on how long you followed patients up for. It also has some other statistical problems with it.

So it's not one that I particularly like, even though it sounds quite a nice thing to say: "Oh, this drug needs 20 patients to be treated to get the benefit, whereas that drug needs 50 patients to be treated." If you've used the same rules for both, then NNT can be quite helpful. But you've got to be careful to make sure that your definition of the NNT, which isn't a pure number, is used in exactly the same way when you make comparisons between treatments.