Guides: Evidence-Based Practice: Biostatistics

Descriptive Statistics

Descriptive statistics are used to DESCRIBE the study population using calculations, tables and/or graphs.

Statistics of central tendency:

Mean	The sum of all values in a group divided by the number of items in the group (Average)
Median	The value in the middle of a group of values once they are sorted from lowest to highest (Typical)
Mode	The value that appears the most in a group of values (Most Common)
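
The three measures above can be sketched with Python's standard statistics module. The blood pressure readings are hypothetical values chosen for illustration and do not come from this guide.

```python
import statistics

# Hypothetical systolic blood pressure readings (mmHg), for illustration only.
values = [118, 120, 120, 124, 130, 135, 160]

mean_value = statistics.mean(values)      # sum of the values / number of values
median_value = statistics.median(values)  # middle value once the data are sorted
mode_value = statistics.mode(values)      # value that appears most often

print(f"mean = {mean_value:.1f}, median = {median_value}, mode = {mode_value}")
```

In this sample the single high reading (160) pulls the mean above the median, which is one reason the median is often reported for skewed data.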

Statistics of variation:

Range

Range = (Highest # – Lowest #)

The simplest way to describe variation in a set of values

Very sensitive to data that doesn’t fit the typical pattern (called outliers)

Interquartile Range (IQR)

Identifies variation in a set of values while limiting the influence of outliers: it ignores the lowest 25% and highest 25% of values and focuses on the middle 50% of the data, centered on the median

Reported as a range of numbers

Standard Deviation (SD)

Identifies variation in a set of values by estimating the average distance of each score from the mean

Small SD = more concentrated

Large SD = less concentrated
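
As a rough sketch, the three measures of variation can be computed with Python's standard statistics module. The readings are hypothetical, and the quartiles use the module's default method; other software may use a slightly different quartile rule and report a slightly different IQR.

```python
import statistics

# Hypothetical systolic blood pressure readings (mmHg), for illustration only.
values = [118, 120, 120, 124, 130, 135, 160]

# Range: highest value minus lowest value.
value_range = max(values) - min(values)

# Quartiles split the sorted data into four equal parts (default method).
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1  # spread of the middle 50% of the data

# Sample standard deviation: typical distance of a value from the mean.
sd = statistics.stdev(values)

print(f"range = {value_range}")
print(f"IQR = {q1} to {q3} (width {iqr})")
print(f"SD = {sd:.1f}")
```

The outlying reading of 160 stretches the range, while the IQR is unaffected by it.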

Inferential Statistics

Inferential statistics use data to make JUDGEMENTS about the differences between study groups for generalizing to the overall population.

P-value	Evaluates the statistical significance of the difference between two study groups or the relationship between two study variables. It is the probability of seeing a result at least as extreme as the one observed if the null hypothesis (that there is no difference or relationship) were true. Statistical significance is conventionally defined as p < 0.05, meaning that a result this extreme would occur less than 5% of the time by chance alone if the null hypothesis were true.
T-test	Evaluates the difference in means between 2 study groups for a specific thing (called a variable)
Analysis of Variance [ANOVA]	Evaluates the difference in means between 3+ study groups for a specific variable
Correlation coefficient [r]	Evaluates how two variables change in relation to each other. Positive: variables increase or decrease similarly (both up or both down). Negative: variables increase or decrease oppositely (one up, one down).
Risk ratio/ Relative risk [RR]	Evaluates how risk changes based on exposure. Most common in RCTs and cohort studies. RR > 1: exposure is associated with increased risk (possibly causal). RR < 1: exposure is associated with decreased risk (possibly protective). RR = 1: no difference in risk between groups.
Odds ratio [OR]	Evaluates how exposure impacts odds of an outcome or disease. Most common in case-control studies. OR > 1: outcome is more likely in the exposed group (possibly causal). OR < 1: outcome is less likely in the exposed group (possibly protective). OR = 1: no difference in odds between groups.
Confidence interval [95% CI]	Describes the precision of an estimate by giving a range of values that is likely to contain the true population value. With a 95% CI, about 95% of intervals calculated this way from repeated studies would contain the true value. Narrower CI = greater precision. Wider CI = less precision.
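
To make the risk ratio and odds ratio rows above concrete, here is a minimal Python sketch. The function names and the counts (30 of 100 exposed people and 15 of 100 unexposed people developing the outcome) are hypothetical and chosen only for illustration.

```python
def risk_ratio(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    exposed_risk = exposed_events / exposed_total
    unexposed_risk = unexposed_events / unexposed_total
    return exposed_risk / unexposed_risk


def odds_ratio(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Odds of the outcome in the exposed group divided by odds in the unexposed group."""
    exposed_odds = exposed_events / (exposed_total - exposed_events)
    unexposed_odds = unexposed_events / (unexposed_total - unexposed_events)
    return exposed_odds / unexposed_odds


# Hypothetical counts, for illustration only.
rr = risk_ratio(30, 100, 15, 100)
odds = odds_ratio(30, 100, 15, 100)
print(f"RR = {rr:.2f}, OR = {odds:.2f}")  # RR = 2.00, OR = 2.43
```

Both values are above 1, so the outcome is more common in the exposed group. The OR is further from 1 than the RR here because the outcome is fairly common; the two measures are close only when the outcome is rare.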

Biostatistical Terms - Therapies

Biostatistical Term	Abbreviation	Definition
Relative risk reduction	RRR	The percentage difference in risk or outcomes between treatment and control groups. Example: if mortality is 30% in controls and 20% with treatment, RRR is (30-20)/30 ≈ 33%.
Absolute risk reduction	ARR	The arithmetic difference in risk or outcomes between treatment and control groups. Example: if mortality is 30% in controls and 20% with treatment, ARR is 30-20=10%.
Number needed to treat	NNT	The number of patients who need to receive an intervention instead of the alternative in order for one additional patient to benefit. The NNT is calculated as: 1/ARR. Example: if the ARR is 4%, the NNT = 1/4% = 1/0.04 = 25.
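
The worked examples in the table above (30% mortality in controls, 20% with treatment) can be reproduced with a short Python sketch. The function name is hypothetical, and the sketch assumes the treatment lowers risk, so the ARR is positive. Exact fractions are used so that rounding error does not distort the NNT, which is conventionally rounded up to a whole patient.

```python
from fractions import Fraction
import math


def therapy_measures(control_risk, treatment_risk):
    """Return (ARR, RRR, NNT); assumes treatment_risk < control_risk."""
    arr = control_risk - treatment_risk  # absolute risk reduction
    rrr = arr / control_risk             # relative risk reduction
    nnt = math.ceil(1 / arr)             # number needed to treat, rounded up
    return arr, rrr, nnt


# Example from the table: 30% mortality in controls, 20% with treatment.
arr, rrr, nnt = therapy_measures(Fraction(30, 100), Fraction(20, 100))
print(f"ARR = {float(arr):.0%}, RRR = {float(rrr):.0%}, NNT = {nnt}")
# ARR = 10%, RRR = 33%, NNT = 10
```

The same relative reduction can hide very different absolute benefits: a drop from 3% to 2% is also an RRR of 33%, but the ARR is 1% and the NNT is 100.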

Biostatistical Terms - Diagnostic Testing

Biostatistical Term	Abbreviation	Definition
Sensitivity	Sn	Percentage of patients with disease who have a positive test for the disease in question. (True positives) *chance that the test will correctly identify someone with the disease
Specificity	Sp	Percentage of patients without disease who have a negative test for the disease in question. (True negatives) *chance that the test will correctly identify someone who is disease-free
Positive predictive value	PPV	Percentage of patients with a positive test for a disease who do have the disease in question. (True positives) *impacted by prevalence of the disease in a population
Negative predictive value	NPV	Percentage of patients with a negative test who do not have the disease in question. (True negatives) *impacted by prevalence of the disease in a population
Pre-test probability		Probability of disease before a test is performed.
Post-test probability		Probability of disease after a test is performed.
Likelihood ratio	LR	LR > 1 indicates an increased likelihood of disease. LR < 1 indicates a decreased likelihood of disease. The most helpful tests generally have a ratio of < 0.2 or > 5.
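
The diagnostic terms above can all be derived from a 2x2 table of test results against true disease status. The Python sketch below uses hypothetical counts (1,000 patients, 10% prevalence) and hypothetical function names. The likelihood ratio formulas, LR+ = Sn / (1 - Sp) and LR- = (1 - Sn) / Sp, are the standard definitions and are not spelled out in this guide. The sketch assumes every cell of the table is non-zero.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Compute test characteristics from true/false positive/negative counts."""
    sensitivity = tp / (tp + fn)  # positive tests among patients with disease
    specificity = tn / (tn + fp)  # negative tests among patients without disease
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # disease present among positive tests
        "npv": tn / (tn + fn),  # disease absent among negative tests
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Hypothetical counts: 100 patients with disease, 900 without.
measures = diagnostic_measures(tp=90, fp=45, fn=10, tn=855)
post = post_test_probability(0.10, measures["lr_positive"])

for name, value in measures.items():
    print(f"{name}: {value:.3f}")
print(f"post-test probability after a positive test: {post:.3f}")
```

When the pre-test probability equals the prevalence in the table (10% here), the post-test probability after a positive test matches the PPV, which shows why predictive values shift with prevalence while sensitivity and specificity do not.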