Descriptive statistics are used to DESCRIBE the study population using calculations, tables and/or graphs.
Statistics of central tendency:
Mean | The sum of all values in a group divided by the number of values in the group (Average) |
Median | The value in the middle of a group of values once they are sorted in order (Typical) |
Mode | The value that appears the most in a group of values (Most Common) |
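
The three measures above can be sketched with Python's standard `statistics` module. The sample values (lengths of hospital stay in days) are made up for illustration and are not from any study.

```python
import statistics

# Hypothetical sample: length of hospital stay in days (made-up values).
stays = [2, 3, 3, 4, 5, 6, 19]

mean_stay = statistics.mean(stays)      # sum of values / number of values
median_stay = statistics.median(stays)  # middle value of the sorted list
mode_stay = statistics.mode(stays)      # most frequently occurring value

print(mean_stay, median_stay, mode_stay)  # 6 4 3
```

In this made-up sample the single long stay (19 days) pulls the mean above the median, which is why the median is often described as the "typical" value.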
Statistics of variation:
Range | Range = (Highest # - Lowest #). The simplest way to describe variation in a set of values. Very sensitive to data that doesn’t fit the typical pattern (called outliers). |
Interquartile Range (IQR) | Identifies variation in a set of values while limiting the influence of outliers (focuses on the middle 50% of the data, between the 25th and 75th percentiles, centered on the median). Reported as a range of numbers. |
Standard Deviation (SD) | Identifies variation in a set of values by estimating the average distance of each score from the mean. Small SD = values more concentrated around the mean. Large SD = values less concentrated (more spread out). |
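
A minimal sketch of the three measures of variation, again using Python's standard `statistics` module and made-up values. Quartile conventions differ between software packages; this sketch assumes the default ("exclusive") method of `statistics.quantiles`, and it uses the sample standard deviation.

```python
import statistics

# Hypothetical sample: length of hospital stay in days (made-up values).
stays = [2, 3, 3, 4, 5, 6, 19]

# Range: highest value minus lowest value (driven entirely by the extremes).
value_range = max(stays) - min(stays)

# Quartiles split the sorted data into four parts; IQR = Q3 - Q1.
q1, q2, q3 = statistics.quantiles(stays, n=4)
iqr = q3 - q1

# Sample standard deviation (divides by n - 1).
sd = statistics.stdev(stays)

print(value_range, (q1, q3), iqr, round(sd, 2))
```

With these values the range is 17 days while the IQR is only 3 days, which shows how a single outlier inflates the range but not the IQR.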
Inferential statistics use data to make JUDGEMENTS about the differences between study groups for generalizing to the overall population.
P-value | Evaluates the statistical significance of the differences between two study groups or the relationships between two study variables. It is the probability of seeing a difference at least as large as the one observed if the null hypothesis (that there is no difference between the two things) were true. Statistical significance is conventionally defined as p < 0.05, meaning a result this extreme would occur less than 5% of the time by chance alone if the null hypothesis were true. |
T-test | Evaluates the difference in means between 2 study groups for a specific thing (called a variable) |
Analysis of Variance [ANOVA] | Evaluates the difference in means between 3+ study groups for a specific variable |
Correlation coefficient [r] | Evaluates how two variables change in relation to each other; r ranges from -1 to +1. Positive: variables increase or decrease together (both up or both down). Negative: variables move in opposite directions (one up, one down). |
Risk ratio/Relative risk [RR] | Evaluates how risk changes based on exposure. Most common in RCTs and cohort studies. Causal: exposure is associated with increased risk (RR > 1). Protective: exposure is associated with decreased risk (RR < 1). RR = 1 means no difference in risk. |
Odds ratio [OR] | Evaluates how exposure impacts the odds of an outcome or disease. Most common in case-control studies. Causal: outcome is more likely in the exposed group (OR > 1). Protective: outcome is less likely in the exposed group (OR < 1). |
Confidence interval [95% CI] | Provides a range of values that is likely to contain the true population value. With a 95% CI, about 95% of intervals calculated this way from repeated samples would contain the true value. Narrower CI = greater precision. Wider CI = less precision. |
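
To make the RR, OR and 95% CI rows concrete, here is a minimal sketch computed from a 2x2 table. The counts are made up. The confidence interval uses the common log-scale (normal approximation) method, which is an assumption of this sketch and is not described in the table above.

```python
import math

# Hypothetical 2x2 table (made-up counts):
#               outcome   no outcome
# exposed        a = 20     b = 80
# unexposed      c = 10     d = 90
a, b, c, d = 20, 80, 10, 90

risk_exposed = a / (a + b)      # risk of the outcome in the exposed group
risk_unexposed = c / (c + d)    # risk of the outcome in the unexposed group

rr = risk_exposed / risk_unexposed   # risk ratio (relative risk)
odds_ratio = (a * d) / (b * c)       # odds ratio (cross-product of the table)

# Approximate 95% CI for RR, computed on the log scale (assumed method).
se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
rr_low = math.exp(math.log(rr) - 1.96 * se_log_rr)
rr_high = math.exp(math.log(rr) + 1.96 * se_log_rr)

print(round(rr, 2), round(odds_ratio, 2), round(rr_low, 2), round(rr_high, 2))
```

In this made-up example RR is 2.0 and OR is 2.25, but the 95% CI for RR runs from roughly 0.99 to 4.05. Because the interval includes 1 (no difference in risk), the result would not be considered statistically significant at the 0.05 level.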
Biostatistical Term | Abbreviation | Definition |
---|---|---|
Relative risk reduction | RRR | The proportional (percentage) difference in risk or outcomes between treatment and control groups, relative to the control group. Example: if mortality is 30% in controls and 20% with treatment, RRR is (30 - 20)/30 = 33%. |
Absolute risk reduction | ARR | The arithmetic difference in risk or outcomes between treatment and control groups. Example: if mortality is 30% in controls and 20% with treatment, ARR is 30% - 20% = 10%. |
Number needed to treat | NNT | The number of patients who need to receive an intervention instead of the alternative in order for one additional patient to benefit. The NNT is calculated as: 1/ARR. Example: if the ARR is 4%, the NNT = 1/4% = 1/0.04 = 25. |
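
The worked examples in this table (30% mortality in controls, 20% with treatment) can be reproduced with a short sketch. It uses `fractions.Fraction` to keep the arithmetic exact, and it rounds the NNT up to a whole patient, which is a common convention assumed here and not stated in the table.

```python
import math
from fractions import Fraction

# Example from the table: mortality is 30% in controls and 20% with treatment.
control_risk = Fraction(30, 100)
treatment_risk = Fraction(20, 100)

arr = control_risk - treatment_risk   # absolute risk reduction
rrr = arr / control_risk              # relative risk reduction
nnt = math.ceil(1 / arr)              # number needed to treat, rounded up (assumed convention)

print(float(arr), round(float(rrr), 2), nnt)  # 0.1 0.33 10
```

The same 10-point absolute reduction corresponds to a 33% relative reduction, and ten patients would need to be treated for one additional patient to benefit.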
Biostatistical Term | Abbreviation | Definition |
---|---|---|
Sensitivity | Sn | Percentage of patients with disease who have a positive test for the disease in question (true positives). The chance that the test will correctly identify someone with the disease. |
Specificity | Sp | Percentage of patients without disease who have a negative test for the disease in question (true negatives). The chance that the test will correctly identify someone who is disease-free. |
Positive predictive value | PPV | Percentage of patients with a positive test for a disease who do have the disease in question (true positives). Impacted by the prevalence of the disease in a population. |
Negative predictive value | NPV | Percentage of patients with a negative test who do not have the disease in question (true negatives). Impacted by the prevalence of the disease in a population. |
Pre-test probability | | Probability of disease before a test is performed. |
Post-test probability | | Probability of disease after a test is performed. |
Likelihood ratio | LR | LR > 1 indicates an increased likelihood of disease. LR < 1 indicates a decreased likelihood of disease. The most helpful tests generally have a ratio of < 0.2 or > 5. |
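
The diagnostic test measures in this table can be sketched from the four cells of a 2x2 table of test results against true disease status. The counts are made up. The likelihood ratio formulas (LR+ = Sn / (1 - Sp), LR- = (1 - Sn) / Sp) and the odds-based conversion from pre-test to post-test probability are the standard definitions, but they are not spelled out in the table, so treat them as assumptions of this sketch.

```python
import math

# Hypothetical counts (made up): 1000 patients, 100 of whom have the disease.
tp, fn = 90, 10    # diseased patients: test positive / test negative
fp, tn = 50, 850   # disease-free patients: test positive / test negative

sensitivity = tp / (tp + fn)   # positive tests among patients with disease
specificity = tn / (tn + fp)   # negative tests among patients without disease
ppv = tp / (tp + fp)           # patients with disease among positive tests
npv = tn / (tn + fn)           # disease-free patients among negative tests

lr_pos = sensitivity / (1 - specificity)   # likelihood ratio of a positive test
lr_neg = (1 - sensitivity) / specificity   # likelihood ratio of a negative test

# Pre-test probability taken as the prevalence in this sample, then
# converted to odds, multiplied by LR+, and converted back to a probability.
pretest_prob = (tp + fn) / (tp + fn + fp + tn)
pretest_odds = pretest_prob / (1 - pretest_prob)
posttest_odds = pretest_odds * lr_pos
posttest_prob = posttest_odds / (1 + posttest_odds)

print(round(sensitivity, 2), round(specificity, 2), round(ppv, 2), round(npv, 2))
print(round(lr_pos, 1), round(lr_neg, 2), round(posttest_prob, 2))
```

When the pre-test probability equals the prevalence in the tested sample, as assumed here, the post-test probability after a positive test matches the PPV. This is one way to see why PPV and NPV depend on prevalence while sensitivity and specificity do not.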