Glossary

A
Absolute Benefit Increase (ABI)
Absolute Risk Increase (ARI)
Absolute Risk Reduction (ARR)
Adaptive Randomisation
Absolute sensitivity. See Sensitivity
Allocation concealment
Attrition Bias


Absolute Risk Increase (ARI)
ARI is used when the intervention increases the probability of a bad outcome. ARI is the absolute difference in rates of events (bad outcomes) between the intervention group and the control group. ARI = EER – CER, with 95% confidence intervals. See also Relative Risk Increase (RRI) and Number needed to harm (NNH).

Absolute Benefit Increase (ABI)
ABI is used when the intervention increases the probability of a good outcome. ABI is the absolute difference in rates of events (good outcomes) between the intervention group and the control group. ABI = EER – CER, with 95% confidence intervals. See also Relative Benefit Increase (RBI) and Number needed to treat (NNT).

The corresponding index when the intervention reduces the probability of a bad outcome is the Absolute Risk Reduction (ARR):

Absolute risk (or rate) reduction ARR = CER – EER. The inverse of this value is the Number Needed to Treat (NNT), i.e. NNT = 1/ARR.

 

Adaptive randomisation
These techniques change allocation probabilities as the allocation proceeds, to correct imbalances in group numbers or in baseline characteristics.

 

Allocation concealment
This is when the person who is enrolling patients into a study is unaware of the group (control or intervention) to which the next patient will be allocated.
 
Attrition bias
This bias is due to patient attrition (drop-out). If subjects leave the study for any reason (e.g. death from other causes, adverse reactions, personal reasons), valuable information may be lost with each subject. If only a few subjects drop out, attrition may not be a problem, but if more than 20% are lost to follow-up this might significantly affect the outcome and make it less valid.
 

B
Bar charts
Bayesian theory
Baseline variables/assessment
Best case / worst case method
Bias
Bimodal distributions
Binomial distribution
Blinding
Blocked randomisation
Box and Whisker Plot


Bar charts 

Bar charts are used for qualitative or discontinuous quantitative variables. The categories tend to be on the x-axis (horizontal axis) and frequency or percentage along the y-axis (vertical axis). There are gaps between the columns to illustrate the distinct categories. See example Graph 1A below.  

Often, bar charts are used for showing the relationship between 2 or more variables. For instance, if we also knew the number of hospital admissions per year for the sample in Graph 1A  by compliance, we could plot compliance with treatment on the x-axis against hospital admissions on the y-axis. It would give us an idea if there might be a link between the two variables. (What would you expect to see?)

[Graph 1A]

 

Bayesian theory
This is a whole theorem with its own probability calculus. You might come across it as a completely different way of looking at the probability involved in analysing investigations - a whole different way of doing the statistics, in fact.

It was first developed by Thomas Bayes in the 18th century, so it is based on an older, more philosophical view of probability than the frequency-based view that we use most. Put rather simply, it takes a different approach when looking at a trial. The researchers start by assigning a specific probability, known as the prior probability, to the hypothesis under study. They then carry out the study, collect the data and modify the prior probability depending on the results obtained. The revised probability is known as the posterior probability. Different statistical methods are used that we may touch on later in the online statistics tutorial course.

This way of looking at statistics is becoming increasingly popular so you may come across it now and then.
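To make the prior-to-posterior idea concrete, here is a minimal sketch in Python (assuming SciPy is available; the prior parameters and trial results are invented purely for illustration). With a Beta prior on a proportion and binomial trial data, the posterior is again a Beta distribution:

# A minimal sketch of Bayesian updating, using a Beta prior for a proportion.
# The prior parameters (2, 2) and the trial data (12 successes out of 20) are
# hypothetical numbers chosen purely for illustration.
from scipy import stats

prior_a, prior_b = 2, 2          # Beta(2, 2) prior: mild belief that p is near 0.5
successes, failures = 12, 8      # hypothetical trial data

# With a Beta prior and binomial data, the posterior is also a Beta distribution.
post_a, post_b = prior_a + successes, prior_b + failures
posterior = stats.beta(post_a, post_b)

print("Prior mean:    ", prior_a / (prior_a + prior_b))
print("Posterior mean:", posterior.mean())
print("95% credible interval:", posterior.interval(0.95))

The posterior mean sits between the prior mean and the observed proportion, which is the essence of the Bayesian update.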

 
Baseline variables / assessment
It is important to assess and record the variables that may affect the outcomes for each subject. These can then be compared to assess whether you have a good randomisation: i.e. are the groups similar, or are there certain characteristics in one group that differ from the other and might affect the results?

Best case / worst case method
This method is used to minimise attrition bias by accounting for all the drop-outs and cross-overs (subjects moving to another group). The analysis compares the rate of loss between the groups and then presents a worst case and a best case scenario, analysing the results as if the missing patients had all had a bad outcome or all had a good outcome (analysed within their original groups).

 

 
Bias
Bias is a tendency within a study for uncontrolled influences to affect the outcomes and hence distort the final results.

Binomial distribution

The Binomial distribution is an example of a discrete probability distribution. It is appropriate in cases where there are only two outcomes in a trial, success or failure. Graphs 3a-c below show binomial distributions where p is the probability of success on a single trial and n is the number of trials, for a drug with p = 0.3.

This distribution is very relevant to medicine as we often ask questions that have only two answers - e.g. Did the new drug help? Yes or no.

[Graphs 3a-c]

Adapted from: Bland, M. (2000). An introduction to Medical Statistics. (3rd ed.). Oxford: Oxford University Press.
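As a rough illustration of the same idea, the sketch below (assuming SciPy is available; n = 10 trials is an arbitrary choice) computes binomial probabilities for p = 0.3, the value used in the graphs above:

# A small sketch of the binomial distribution for p = 0.3, computed with SciPy.
# n and the k values are chosen purely for illustration.
from scipy import stats

p = 0.3          # probability of "success" on a single trial
n = 10           # number of trials

for k in range(n + 1):
    prob = stats.binom.pmf(k, n, p)
    print(f"P(exactly {k} successes out of {n}) = {prob:.3f}")

# Probability of 5 or fewer successes (cumulative probability):
print("P(5 or fewer successes) =", stats.binom.cdf(5, n, p))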

 

Blinding
This is a research technique used to eliminate bias due to patients, researchers or clinicians being aware of which particular intervention group the patient is in. Knowledge of which intervention a patient/population is receiving can affect how the patient(s) responds to the intervention, and how the researchers measure the outcome(s) and interpret the results. A double-blinded study is where both patients and researchers are unaware of the allocation. A triple-blinded study involves the above, and even the researcher(s) who do the analysis are blinded too. This is the ultimate in blinded trial design.

 

Blocked randomisation
This is a technique that is used to ensure more equal numbers across the groups. The order in which the interventions are assigned is randomly allocated (usually with random numbers) within each block of subjects (e.g. blocks of 6). This means that at the end of every block (e.g. every 6 subjects) the group numbers are equal again.
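A minimal sketch of how this could be implemented, assuming two groups and a block size of 6 (both illustrative choices):

# A minimal sketch of blocked randomisation with a block size of 6 (3 per group),
# assuming two groups. Group labels and block size are illustrative only.
import random

def blocked_allocation(n_subjects, block_size=6, groups=("A", "B")):
    """Return an allocation list in which every complete block is balanced."""
    per_group = block_size // len(groups)
    allocation = []
    while len(allocation) < n_subjects:
        block = list(groups) * per_group       # e.g. ['A', 'B', 'A', 'B', 'A', 'B']
        random.shuffle(block)                  # random order within the block
        allocation.extend(block)
    return allocation[:n_subjects]

print(blocked_allocation(12))   # after every 6 subjects, group sizes are equal

Within each shuffled block of 6 there are exactly 3 allocations to each group, so the running totals can never drift far apart.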

Box and whisker plot

A good way to represent quantiles is the box and whisker plot. These show the shape of the distribution by marking its median, quartiles, maximum and minimum. The box itself does not represent any quantity; the plot carries information along only one dimension.

The median is usually indicated with a point or a mid-line (the 50% quantile) in the box. The ends of the box mark the 25% and 75% quartiles, and the whiskers mark the extremes, as shown in the plot below.

[Graph 5]

C
Case-control study
Case report and case series
Central limit theorem
Cohort study
Co-intervention bias
Complete sensitivity. See Sensitivity
Compliance bias
Confidence Intervals
Confounding bias
Confounding variable
Contamination bias
Control event rate (CER). See Event rate.
Correlation
Cross-sectional study


 

Case-control study
Case-control studies are always retrospective - cases of the health state we are interested in are found, and the study works backwards to try to draw some conclusions about what might be causing that health state. This is done by comparing the association between the possible cause and the cases with the association between the same possible cause and people who do not have the health state of interest (the controls).

Case report and case series
This is the most basic type of descriptive study, and describes in detail one or more individual cases of interest. It has little statistical significance, but can be the stimulus for further research to look for evidence of links between interventions or exposure and a disease. Many major discoveries about health concerns start this way. For instance, a pulmonary embolism in pre-menopausal women taking the oral contraceptive pill was first described in a case report and led to analytical research that showed a causal link. This led to significant changes in the drug composition and how it is prescribed.

 

Central limit theorem

If random samples are taken from a population in which a variable has a Normal distribution, then the distribution of sample means for this variable will also be Normal. Further, even if the underlying distribution is not Normal, a frequency plot of the means of multiple samples will be approximately Normal (provided the samples are reasonably large).

OK, so what does that mean? Well, it tells us that if we know a particular characteristic is normally distributed in the population we want to study (e.g. height of female adults in NSW), then we can take a sample of the population and infer the 'true' value in the whole population. 

Further, we can take two samples in different places and expect them both to be normal distributions. Statistical testing can then tell us if the two samples are really different to each other (i.e. from different populations) or only different because of random variation. If we take a whole lot of samples and then make a graph of the mean of each one - that too will be a normal distribution if they all come from the same population. 

The mean of the population and the mean of our distribution of sample means tend towards the same value.

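A small simulation sketch of this idea (the underlying distribution, sample size and number of samples are arbitrary illustrative choices): even though the individual observations come from a skewed exponential distribution, the means of repeated samples pile up in a roughly Normal, bell-shaped heap.

# A small simulation sketch of the central limit theorem: the underlying data are
# deliberately skewed (exponential), yet the means of repeated samples are roughly
# Normally distributed. Sample sizes and counts are arbitrary illustrative choices.
import random
import statistics

random.seed(1)

def sample_mean(n):
    return statistics.mean(random.expovariate(1.0) for _ in range(n))

sample_means = [sample_mean(50) for _ in range(2000)]

print("Mean of the sample means:", round(statistics.mean(sample_means), 3))  # close to 1.0
print("SD of the sample means:  ", round(statistics.stdev(sample_means), 3)) # close to 1/sqrt(50)
# Plotting a histogram of sample_means would show the familiar bell shape.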

 

Cohort study
In this form of analytical study, the recruited patients are allocated to two or more groups according to their exposure to the risk factor of interest, or their current or proposed treatment for a disorder of interest; however, there is no randomised allocation. It can be prospective or retrospective. Outcomes of the two groups are measured and compared statistically.

Co-intervention bias
This bias is a special case of contamination, but happens unintentionally, when an intervention group receives an additional intervention, or different care from that to which they were assigned, at the same time.

Compliance bias
Poor compliance is often due to adverse effects of the medication, or occurs when an intervention is impractical, uncomfortable, exhausting, unpleasant etc. If compliance problems are large, they may introduce bias through drop-outs or through a weakened outcome effect, as subjects are not actually taking or doing the intervention as prescribed.

Confidence intervals (CI) 

A CI quantifies how uncertain we are about an estimate. CIs are calculated from the standard error of the mean or proportion, or of the difference between means or proportions. See Standard error for the relevant formulae.

We usually quote the 95% CI as defined below:

The 95% confidence interval may be defined as the range of values within which we can be 95% sure the true value or difference for the whole population lies

However, CIs at other levels are used if a different degree of certainty is required, e.g. the 99% CI, where the SE is multiplied by 2.58 rather than the 1.96 used to calculate the 95% CI. This gives a wider confidence interval, and we can be more confident that the true mean lies between its limits.
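As a rough sketch (with invented data, and using the usual SE = SD/√n for a mean), the 95% and 99% confidence intervals can be computed like this:

# A minimal sketch of 95% and 99% confidence intervals for a mean,
# using SE = SD / sqrt(n). The data values are invented.
import math
import statistics

data = [4.1, 3.8, 5.2, 4.7, 4.0, 5.1, 4.4, 3.9, 4.8, 4.6]

mean = statistics.mean(data)
sd = statistics.stdev(data)              # sample SD (n - 1 in the denominator)
se = sd / math.sqrt(len(data))

lower95, upper95 = mean - 1.96 * se, mean + 1.96 * se   # 95% CI
lower99, upper99 = mean - 2.58 * se, mean + 2.58 * se   # wider 99% CI

print(f"Mean = {mean:.2f}, 95% CI = ({lower95:.2f}, {upper95:.2f})")
print(f"99% CI = ({lower99:.2f}, {upper99:.2f})")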

Confounding bias
Confounding is the presence of certain variables that influence the effect that you are studying, but act independently of the intervention or its comparison. Confounding usually occurs because the subjects with the confounding variable are distributed differently between the trial groups and hence bias the conclusions.

Confounding Variable
A confounding variable is not one of the variables under study (and may even be unknown), but it has an effect on the outcome under investigation. It can affect the study results, but it can be allowed for and adjusted for.

Contamination bias
This bias occurs when the intervention group and the control are exposed inadvertently to the same therapy. That is, both the intervention group and the control group may actually receive the study intervention. This might lead to a contamination of the experiment (or study) and a muddying of the results.

Correlation
A statistical method of analysing the relationship between two quantitative variables. Correlation tells us how close the relationship between x and y is.

Cross-sectional study
This type of descriptive study looks at a particular, well-defined population and analyses both health status / exposure and disease status at a particular point in time (or over a time-interval). Huge studies are carried out to assess the health status of populations and are useful in suggesting areas for further research. However, they fail to show whether exposure preceded the disease development or if having a disease affected the individual’s exposure level. Hence cohort studies are usually the next level of study to investigate a possible link between exposure and disease.


D
Degrees of Freedom

Drop-outs See Attrition
Detection bias


Degrees of freedom

See also Standard deviation.

In general, the degrees of freedom is given by the number of pieces of information we have minus the number of parameters estimated during the calculation of the parameter we are trying to estimate.

For variance, we need to calculate the mean first. That is one parameter estimated, so we use n-1 as our degrees of freedom. To look at it another way, we have used the pieces of information once already to get the mean, so we now have only n-1 pieces of truly independent information left.

Detection bias
This type of selection bias occurs in observational studies where causes of disease are being sought. Subjects who have a particular risk factor are preferentially included in the study because the exposure has caused a sign or symptom that is looked for, and detected, during the search for the disease under study. The risk factor is then blamed for causing the disease, since everyone found to have the disease also has the risk factor, when in fact there is no causal relationship.

E
Enrolment
Enrolment bias
Event rate
Evidence-based Medicine (EBM)
Exclusion criteria. See Selection criteria
Experimental event rate (EER). See Event rate.
External validity


Enrolment
This encompasses explaining the process of the trial, and its implications for the subject's health and life, to the subject.

Enrolment bias
It is always important to obtain consent and enrol each participant before allocation to groups in your trial if this is at all possible. This prevents conscious or accidental influence (enrolment bias) on who gets allocated to each trial group (e.g. subjects may be less willing to participate if they know they are in a control group).

 
Event rate
The event rate is the proportion of participants in a group who experience the event in question (e.g. a heart attack may be the outcome of interest (the event) in a study assessing the efficacy of a new coronary-protective drug). The control event rate (CER) is the risk of the event happening in the control group. The experimental event rate (EER) is the risk of the event happening in the experimental group. The patient expected event rate (PEER) is the expected rate of events in patients who receive conventional or no treatment (equivalent in most cases to the CER).

Let's say that in the above example of the trial of the new coronary-protective drug, the number of patients in each group is 50, and there are 5 heart attacks in the intervention group and 10 in the control group (who receive conventional treatment). Then, the EER is 5/50 = 1/10 or 10% and the CER is 10/50 = 1/5 or 20%.
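Putting those numbers into code, a small sketch of the event rates and the derived measures defined elsewhere in this glossary (relative risk, relative risk reduction, absolute risk reduction and NNT):

# A worked sketch of the event-rate calculations from the example above
# (50 patients per group, 5 heart attacks on the new drug, 10 on control).
events_experimental, n_experimental = 5, 50
events_control, n_control = 10, 50

EER = events_experimental / n_experimental   # experimental event rate = 0.10
CER = events_control / n_control             # control event rate      = 0.20

RR  = EER / CER          # relative risk
RRR = (CER - EER) / CER  # relative risk reduction
ARR = CER - EER          # absolute risk reduction
NNT = 1 / ARR            # number needed to treat

print(f"EER = {EER:.2f}, CER = {CER:.2f}")
print(f"RR = {RR:.2f}, RRR = {RRR:.0%}, ARR = {ARR:.2f}, NNT = {NNT:.0f}")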

 

Evidence-based medicine (EBM)
Evidence-based medicine (EBM) is “the integration of the best research evidence with our clinical expertise and our patient’s unique values and circumstances.”
(Straus, S.E., Richardson, W.S., Glasziou, P., Haynes, R.B. (2005). Evidence-based Medicine: How to Practise and Teach EBM (3rd ed.). Edinburgh: Churchill Livingstone.)

Evidence-based health care or practice is evidence-based medicine extended to all the professions involved in health care.

 
External validity
External validity concerns whether the results of a study can be confidently applied to the general population or to a particular patient (also known as “generalisability”). Is it useful and relevant to your patient(s)? Do the trial method, analysis and conclusion suit your needs?
For instance, the side-effects and cost of the treatment may influence your decision, or the severity of the illness in your particular patient, and the sex, gender and age of your patient, may differ from those of the trial patients. These and other factors can affect how you might apply the results of a study in clinical practice.

F
Fisher's exact test
Fixed allocation
Frequency, Frequency distribution, Frequency polygon
Funnel plot


Fisher's exact test
Chi-square testing is applicable to larger samples - the commonly accepted rule is that it should be used only when 80% of the expected values are greater than 5. So Fisher's exact test should be used if more than 20% of the expected values are less than 5, or when any of the expected values are 1 or 0. Most stats packages have this option, and the better packages (for our purposes) warn the user when Fisher's exact test might be appropriate. At this stage, you do not need to know more than that.
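A hedged sketch of running the test on a small 2 x 2 table (the counts are invented, and SciPy is assumed to be available):

# A sketch of Fisher's exact test on a 2 x 2 table with small expected counts,
# using scipy.stats.fisher_exact. The table values are invented.
from scipy.stats import fisher_exact

#                 improved   not improved
table = [[8, 2],            # treatment group
         [3, 7]]            # control group

odds_ratio, p_value = fisher_exact(table)
print("Odds ratio:", round(odds_ratio, 2))
print("P value:   ", round(p_value, 4))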

 

Fixed allocation
This randomisation procedure assigns subjects to an intervention with a pre-specified probability that is usually equal across groups and fixed during the allocation. One type is simple randomisation, which can be done using an unbiased coin each time a subject is deemed eligible for the trial: e.g. Heads is allocated to group 1 and Tails to group 2.

 

Frequency

Frequency is the count of individuals with a particular quality.

Proportional or relative frequency is the proportion of the study group having that particular quality.

Cumulative frequency for a value is the number of subjects with values less than or equal to that value. It is used only for qualitative variables that can be ordered (ordinal variables) or for quantitative variables, and is useful to illustrate how the data accumulate as the data series progresses.

Relative cumulative frequency is the cumulative frequency expressed as a percentage of the total.

Frequency distribution is the set of frequencies of all the possible categories.

Frequency distribution shapes: see Skew, and Unimodal and Bimodal distributions (under Mode).

Frequency polygon

Graphs are essential in analysis as well as in presentation of data. The frequency polygon is basically a smoothed-out plot of the frequency distribution. Here, rather than bars or columns representing the data frequency, we draw lines between the points. For example, Graph A below shows the frequency distribution of the data we used in the MS1 tutorial.

Graph A.  Frequency Polygon of MCV parent sample


From the frequency polygon (see above), we can derive some simple measures to examine the shape of the data. 

Survival curves are a special type of cumulative frequency polygon. For examples, see: http://www.lifeexpectancy.com/survival.shtml

Funnel plot
This is a plot of effect size (x) against sample size (y) used in meta-analysis to detect publication bias. If there is a bias, the plot looks like half an upside-down funnel or an asymmetric triangle.


H
Hawthorne effect
Heterogeneity
Histograms
Hypothesis formation

I
Inclusion criteria. See Selection criteria
Intention-to-treat analysis
Intention-to-treat strategy
Internal validity


Hawthorne effect
The Hawthorne effect is a particular type of bias that occurs when subjects change their behaviour because they are being observed during measurement.

Heterogeneity
This is tested for in systematic reviews. Basically, it measures the amount of incompatibility between the studies under review, due to differences in the clinical design of the studies or because the results are statistically different. The more homogeneous the studies are, the more reliable the overall result of the meta-analysis.

 

Histograms 

The best (and most common) way to represent the frequency distribution graphically is by drawing a histogram (see example in Graph A below). This graphically shows the shape of the distribution and tells us a lot about the relative frequencies of the variable. 

Bar charts and histograms are often used interchangeably, but it is useful to make a distinction between them. Bar charts are used for qualitative data or quantitative discontinuous data. Histograms are used for continuous data and therefore it may be necessary to create class intervals for the data prior to plotting the histogram.

Graph A. Histogram of MCV parent sample


Histograms of frequency densities are very useful. When we represent frequency like this we can visually compare the distribution of the frequencies. 

It is best to use frequency per unit as the vertical axis for direct comparison. This is useful as you will then know that the area under each column represents the frequency for that interval. Graph A does not do this (it has frequency per 2 units) so we cannot say this about the area of its columns.

In theory a histogram should not have gaps between the columns, as the data are continuous, but in practice anything goes these days (see Graph B below)! Bar charts are used for qualitative or discontinuous variables, so they do have gaps between the columns - showing that the data are distinct groups. If using Microsoft Excel to plot a histogram, you need to use the "Format Data Series" option to remove the gaps.

[Graph B]

Boundaries for continuous variables

To form a frequency distribution for a continuous variable, we need to divide the data into class intervals. This makes the data easier to handle. 

If the data are values with 1 decimal place, you have to decide where to put, say, 69.9 or 70.0. The cut-off point is usually taken so that 69.9 goes in the lower interval and 70.0 starts the new interval. However, sometimes a histogram of the data looks better if the intervals do not start on the integer (e.g. if a lot of observations are recorded at .5, then starting the class intervals at .75 gives a better picture, as each interval captures more of the values). So, it can be worth trying out different interval sizes and cut-off points and then plotting the histograms to see what suits your data best.

Plot, plot, plot!!!
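A small sketch of that trial-and-error process (the data are invented and NumPy is assumed to be available): the same data are counted into bins of width 2 that start on the integer, and into bins offset by 0.5.

# A sketch of trying different class intervals (bin widths and cut-off points)
# for a continuous variable, using numpy histograms. The data are invented.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=70.0, scale=5.0, size=200)   # hypothetical measurements

# Bins starting on the integer vs. bins offset by 0.5, both 2 units wide
bins_on_integer = np.arange(50, 91, 2)
bins_offset     = np.arange(50.5, 91.5, 2)

counts1, _ = np.histogram(data, bins=bins_on_integer)
counts2, _ = np.histogram(data, bins=bins_offset)

print("Counts with integer cut-offs:", counts1)
print("Counts with offset cut-offs: ", counts2)
# Plot both (e.g. with matplotlib) and keep whichever shows the shape more clearly.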

 

Hypothesis formation

Forming a hypothesis to test for significance is one of the most important tasks when you design a study. For example, say that we observe from our practice that the serum rhubarb is often elevated in patients suffering from Crohn's Disease. This might lead us to the hypothesis that serum rhubarb is predictably higher in these patients than in people who do not have Crohn's. In order to test this, we must devise a research protocol that will allow us to test the clinical question: "Is the serum rhubarb level in patients with Crohn's Disease higher than it is in patients who do not have Crohn's Disease?"

So, what is the null hypothesis? To test whether our initial hypothesis above should be accepted or rejected we actually pose the null hypothesis that there is no statistically significant difference between the variable under test in the two groups. The null hypothesis for our question above would be "There is no difference in serum rhubarb between those who have Crohn's and those who do not." This is the hypothesis that we actually go out and test with our statistical methods. 

So, why do we try to disprove the null hypothesis rather than prove the clinical hypothesis? Basically, we think that this is just by convention - early statisticians found that disproving the null hypothesis is mathematically simpler and more elegant than proving the clinical hypothesis, and so it has remained this way. If you can get your head around this part of the process - you are well over half way to understanding p values!
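A hedged sketch of what testing that null hypothesis might look like with an unpaired t-test (the serum rhubarb values are entirely hypothetical, and SciPy is assumed to be available):

# A sketch of testing the null hypothesis from the serum rhubarb example with an
# unpaired t-test. The measurements are entirely hypothetical.
from scipy import stats

crohns   = [14.2, 15.1, 13.8, 16.0, 15.4, 14.9, 15.7, 14.5]   # serum rhubarb, Crohn's
controls = [12.9, 13.4, 12.1, 13.8, 12.6, 13.1, 12.8, 13.5]   # serum rhubarb, no Crohn's

t_stat, p_value = stats.ttest_ind(crohns, controls)

print(f"t = {t_stat:.2f}, P = {p_value:.4f}")
if p_value < 0.05:
    print("Reject the null hypothesis: the groups appear to differ.")
else:
    print("Cannot reject the null hypothesis at the 5% level.")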

Intention-to-treat analysis
Intention-to-treat analysis is a method that is used in the analysis of randomized control trials to maintain the benefits gained in the randomisation of patients. It specifically analyses all the patients originally enrolled to each intervention or control group, regardless of whether they completed the trial or not.

Intention-to-treat strategy
This is a method to deal with attrition bias, in which the final outcomes of patients who did not finish the follow-up, or who switched to another group, are analysed in their original group. It should be used if there is >20% loss due to attrition (or less if attrition is thought to have a strong effect on the outcome).

 
Internal Validity
Internal validity is how sound the study design is regarding bias and confounding factors.
 

L
Life Tables


 

Life Tables (see also Poisson distribution)

Life tables are produced nationally by the Australian Bureau of Statistics (http://www.abs.gov.au) for a hypothetical group of 100,000 Australian people. They are gender-specific and show age by:

lx — number of people alive at exact age x

qx — proportion of persons dying between exact age x and exact age (x+1)

Lx — number of person-years lived within the age interval x to (x+1)

e°x — expectation of life at exact age x
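As a rough sketch of how the columns relate to each other (the lx values below are invented, not real ABS figures), qx can be derived from successive values of lx:

# A small sketch deriving qx (the proportion dying between age x and x + 1)
# from lx (the number still alive at each exact age). The lx values are invented.
lx = {0: 100000, 1: 99500, 2: 99430, 3: 99390}   # hypothetical survivors at each age

for age in sorted(lx)[:-1]:
    deaths = lx[age] - lx[age + 1]
    qx = deaths / lx[age]
    print(f"q{age} = {deaths} / {lx[age]} = {qx:.5f}")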

M
Mann-Whitney U test
McNemar's test for matched samples
Mean
Measuring Dispersion
Median
Membership bias
Meta-analysis
Misclassification bias
Mode
Multifactorial trial and methods
Multiple logistic regression
Multiple regression
Multivariate analysis


Mann-Whitney U-test
The Mann-Whitney U test (also called the Wilcoxon rank sum test) is the non-parametric equivalent of the unpaired t-test, for relatively small samples.
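A hedged sketch of running the test on two small independent samples (the scores are invented ordinal data, and SciPy is assumed to be available):

# A sketch of the Mann-Whitney U test for two small independent samples,
# using scipy.stats.mannwhitneyu. The scores are invented ordinal data.
from scipy.stats import mannwhitneyu

group_a = [3, 5, 4, 6, 2, 5, 4]   # e.g. pain scores on treatment A
group_b = [6, 7, 5, 8, 7, 6, 9]   # e.g. pain scores on treatment B

u_stat, p_value = mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U = {u_stat}, P = {p_value:.4f}")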

 

McNemar's test for matched samples
This is the Chi-squared equivalent of the paired t-test. It may be used when comparing proportions of individuals under two different conditions or treatments (i.e. the data is non-independent).

 

Measuring Dispersion

The quantiles help show the shape of the frequency distribution, and the mean and median measure what is called the central tendency, but to really get a measure of the spread or variability we need to look at the dispersion of the distribution. 

To do this we use: 

  • range 
  • deviations from the mean 
  • sum of squares

Range

The range is simply the difference between the maximum and minimum values. However, the range tells us nothing about what lies between the extremes - how many values there are or the shape of their distribution.

Interquartile range

The interquartile range is the difference between the first and third quartiles. It tells us more than the simple range - it gives us the spread of the central 50% of values, around the median. However, this is still not a very useful measure, as it only shows us the spread around the median.

Deviations from the mean

The most commonly used measures of dispersion are variance and standard deviation.  The theory behind this is pretty complex, but can be simplified if we look at the Normal Distribution.

If a frequency distribution follows the Normal Distribution, then we find that:

  • 68% of the observations lie between the points 1 standard deviation above and 1 standard deviation below the mean
  • 95% of the observations lie between the points 2 standard deviations above and 2 standard deviations below the mean
  • 99.7% of the observations lie between the points 3 standard deviations above and 3 standard deviations below the mean

This is great, as we can now estimate the ranges for  a dataset (that approximates the ND) that would be expected to include 68%, 95%, and 99.7% of the observations. See Graph 6 below. Many biological characteristics conform to the ND, so we can use this for many medical problems. More of this below.

Graph 6. Showing a normal distribution curve with different SD values from the mean


You do not need to know how variance and standard deviation are derived, but the derivations are given under Standard deviation in this glossary if you are interested to know more.

 

Mean

The arithmetic mean is the sum of all the observation values divided by the number of observations. 

x̄ = (x1 + x2 + x3 + … + xn) / n

where x̄ is the mean (pronounced "x bar"), x1 + x2 + x3 + … + xn is the sum of the observations, and n is the total number of observations.

The mean is commonly used to illustrate the central tendency and is useful in comparisons with other datasets of the same variable. 

The mean is more useful than the mode or median, as it:

  • uses all the values
  • gives a sense of location (it is a central value)
  • but also gives a sense of dispersion (the width of the range of the values)

The mean, median and mode are all measures of central tendency, but for the same dataset they can all be different. The only time that they are all equal is when the frequency distribution is completely symmetrical (e.g. bell-shaped or box-shaped). In all other cases, they will be different. For instance, if the distribution is skewed to the right, the mean will be greater than the median and the mode; if skewed to the left, the mean will be lower than the median and the mode.
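A small sketch illustrating this with an invented, positively skewed dataset (using Python's statistics module): the mean comes out above the median, which in turn sits above the mode.

# Comparing mean, median and mode on a positively skewed dataset (values invented).
import statistics

data = [1, 2, 2, 2, 3, 3, 4, 5, 6, 9, 14]

print("Mean:  ", round(statistics.mean(data), 2))
print("Median:", statistics.median(data))
print("Mode:  ", statistics.mode(data))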

Medians and Quantiles

Quantiles are used to summarize frequency distributions by dividing it up so that there are a given proportion of observations below the quantile. The median and a particular type of quantile called quartiles are the most commonly used. 

The median divides the distribution into two equal halves, as it is the central value. It is not susceptible to extreme high or low values (outliers). We also do not need to know all of the values exactly to calculate the median, and we could change some of the values above or below the median without changing its value. The median is therefore useful as a measure of location, and when the mean would be affected by extreme value(s).

Quantiles

However, we can use quantiles to divide up the distribution and help to describe it further. 

If you ever need to calculate quantiles, it might be useful to have the following formulae as described in: Chapter 4, section 4.5 of Bland, M. (2000). An introduction to Medical Statistics. (3rd ed.). Oxford : Oxford University Press.  

1. To estimate the q quantile from n ordered observations, which divide the scale into n + 1 parts: the proportion of the distribution that lies below the ith observation is estimated as i/(n + 1). So q = i/(n + 1), which means that the required observation is the ith, where i = q(n + 1). If i is an integer, then the ith observation is the required quantile estimate.

2. If i is not an integer, the quantile estimate is xj + (xj+1 - xj) × (i - j), where j is the integer part of i (the part before the decimal point). This quantile will lie between the jth and the (j + 1)th observations.

Sounds complicated but it works.
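A small sketch implementing the estimate described above (a helper function written to follow Bland's method; the data values are invented):

# Estimating a quantile from ordered observations: i = q(n + 1), interpolating
# between observations when i is not a whole number.
def quantile(ordered_values, q):
    """Estimate the q-th quantile (0 < q < 1) of an ordered list of values."""
    n = len(ordered_values)
    i = q * (n + 1)                      # position of the quantile
    j = int(i)                           # integer part of i
    if j < 1:
        return ordered_values[0]
    if j >= n:
        return ordered_values[-1]
    x_j, x_j1 = ordered_values[j - 1], ordered_values[j]   # jth and (j+1)th values
    return x_j + (x_j1 - x_j) * (i - j)  # interpolate between them

data = sorted([2, 4, 5, 7, 8, 10, 12, 15, 18, 21, 25])
for q in (0.25, 0.5, 0.75):
    print(f"{int(q * 100)}th centile estimate: {quantile(data, q)}")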

Quartiles divide the distribution into four equal quarters by finding the points below which 25%, 50% and 75% of the data lie. The second quartile is the median (50%). So, for the above formula you would use q = 0.25 for the 25% point, q = 0.5 for the 50% point, etc.

These quartiles can then be used to describe the variation of the data. The interquartile range is the distance between the 1st and 3rd quartiles and gives us a measure of how far from the central tendency the observations beyond these points are. 

Percentiles

Centiles are another type of quantile. There are 99 centiles or percentiles. The median is the 50th centile.    

 
Meta-analysis
Meta-analysis is a particular type of systematic review that uses quantitative methods to analyse the results of valid studies. For instance, all the outcome results from the valid studies are pooled to give a common estimate of the effect of the treatment or risk factor.
 
Membership Bias
This bias might occur if the groups that you chose for the intervention and control were selected from different subgroups that had different health and socio-economic characteristics etc.

Misclassification bias
This is essentially a bias due to misclassification of the outcome, caused by an inaccurate diagnosis or incorrect assignment of criteria. It is more common in case-control and cohort studies.

Multifactorial trial and methods
This is a trial design used to look at more than one intervention at a time. Multiple regression or Multiple logistic regression are mostly used to analyse the results.

Multivariate analyses
These are techniques used to estimate the relationship between variables when there are more than two variables involved in the regression equation. Multiple regression and Multiple logistic regression are examples.

Multiple regression
Multiple regression is a technique used when there are more than two variables involved in the regression equation. One measure is the dependent variable (outcome) y, and the analysis aims to find the linear combination of the other measures (independent or predictor variables) that best predicts this dependent variable, e.g.
y = b0 + (b1 x age) + (b2 x ht)
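A minimal sketch of fitting such an equation by ordinary least squares (the age, height and outcome values are invented, and NumPy is assumed to be available):

# Fitting y = b0 + b1*age + b2*ht by least squares with numpy. Data are invented.
import numpy as np

age = np.array([30, 45, 52, 60, 38, 47])
ht  = np.array([1.62, 1.75, 1.80, 1.68, 1.71, 1.77])   # height in metres
y   = np.array([5.1, 6.4, 7.0, 6.9, 5.8, 6.6])         # hypothetical outcome

# Design matrix with a column of ones for the intercept b0
X = np.column_stack([np.ones(len(age)), age, ht])
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)

b0, b1, b2 = coeffs
print(f"y = {b0:.2f} + ({b1:.3f} x age) + ({b2:.2f} x ht)")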

Multiple logistic regression
This is similar to multiple regression but is used when the dependent variable is binary (e.g. yes or no).

 

Mode

The mode of a continuous variable is the most frequent value. It is easy to see where the mode lies in a frequency table, a histogram or a polygon.  

The mode can be useful when you need to know which is the most frequently found observation or used to find the most frequent interval (the modal interval). Let's say that in a study of a sample of people with thalassemia you have found out their use of painkillers in tablets per day. Here, you might find it useful to know which was the most frequently used number of tablets per day. 

However, the mode is often misleading as most data is skewed in its distribution (see below). So, it is not often used. 

Unimodal distributions

This is a frequency distribution curve with one peak. The mode is the peak (maximum frequency value). Many biological distributions follow this pattern, e.g. the plot of the height of all UNSW male medical students. Examples of this curve are below in Graphs A.

Graphs A


Bimodal distributions

This is a less common frequency distribution curve that has two peaks. It indicates that the data are a mixture of two separate distributions. For example, a plot of the height of all UNSW medical students, male and female, would probably have 2 distinguishable peaks: one for males and one for females. 

Graph B below is another example, showing the bimodal distribution of MCV in subjects under study in Hong Kong.

Graph B. Shows the distribution of 2640 subjects according to their erythrocyte mean corpuscular volume (MCV) at 2fL intervals. 


Data source: Chan, L.C. et al. (2001). Should we screen for globin gene mutations in blood samples with mean corpuscular volume (MCV) greater than 80 fL in areas with a high prevalence of thalassemia? J. Clin. Pathol., 54, 317-320.

N
Non-respondent bias
Normal distribution
Normal plot
Null hypothesis. See Hypothesis formation
Number needed to treat (NNT)
Number needed to harm (NNH)


Non-respondent bias
This is the bias due to non-response during a survey. Those who do respond are likely to differ from those who don't, so the subjects recruited to the survey may under-represent the level of sickness or behaviour under study.

 

Normal Distribution

[Graph: Normal Distribution curves]

All normal curves look like those shown above - they have values clustered around the middle, with fewer and fewer values out in the 'tails'. This distribution of values produces the familiar 'bell-shaped' curve and because there are equal numbers of values out in each tail, the curve is symmetrical around the middle. 

Mathematically, all can be described using two parameters, the mean and the standard deviation (SD). The mean is simply the average value (add up all the values and divide by the number of values you have), while the standard deviation is a measure of how variable the data is around that average. 

The Standard Normal Distribution

This is the 'reference' Normal Distribution and by definition has a mean = zero and a standard deviation = 1. All normal distributions can be transformed into the standard curve, with the advantage that we can describe any individual score in terms of the number of standard deviations it is away from the mean.

Now, the Normal distribution has some more important characteristics. Knowing the mean and the standard deviation (SD) can tell us something about the probability of an individual having a certain value.

The 68 - 95 - 99.7 rule

[Graph: the standard normal curve, with shaded areas at 1, 2 and 3 SD from the mean]

The graph above shows the standard normal curve with units of SD along the X-axis and moving away from the mean of zero, with the proportion of the total represented in the shaded areas (being symmetrical, the numbers are the same on each side of course). 

We can see that about: 

  • 68% of values are within plus or minus 1 SD of the mean (0.341 + 0.341)
  • 95% are within +/- 2 SDs of the mean (0.68 + 0.136 + 0.136)
  • and nearly all values, more than 99%, are within +/- 3 SDs of the mean (actually about 99.7%).

This is always true for a normal distribution!

Knowing how far a value is from the mean in SDs tells us how extreme or unlikely that particular value really is in the sample we have. For example, only just over 2% of the population lie more than 2 SDs below the mean, and the same proportion lie more than 2 SDs above it.
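A quick numerical check of the rule, using the standard Normal distribution in SciPy (assumed available):

# Checking the 68-95-99.7 rule with the standard Normal distribution.
# norm.cdf gives the proportion of values below a given z.
from scipy.stats import norm

for k in (1, 2, 3):
    within = norm.cdf(k) - norm.cdf(-k)
    print(f"Proportion within +/- {k} SD of the mean: {within:.4f}")

# Proportion more than 2 SD below the mean (a little over 2%):
print("Proportion below -2 SD:", round(norm.cdf(-2), 4))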

Normal plot

There are many ways to help you decide if a set of data like the one above is acceptably close to normally distributed. The most useful and common method (if a little unsatisfying to the mathematically minded) is simply to plot the data and look at it! That is what we did above.

Secondly, you can transform the data into what is called (somewhat confusingly) a normal plot. The normal plot is essentially a plot of the actual data values against a set of 'ideal values' - the values your data would have if they conformed exactly to a normal distribution. The resulting plot should be a straight line - the closer to a straight line, the better. If it is not close, then the data set does not conform to a normal distribution.

Here is the data set from β-thalassemia patients discussed in MS1 tutorial represented in a normal plot (see below). Do you think this line is 'acceptably straight'? I agree, it is difficult to interpret!

Normal plot of age of onset of life-threatening medical complications for a hypothetical group of 85 patients


The third method is to perform a numerical test, of which several are described. The one my statistics programme uses is called the 'Shapiro-Wilk W test'. All such tests have limitations, and we don't think you need to know any more than that they exist.
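A hedged sketch of producing a normal plot and running the Shapiro-Wilk test (the data are randomly generated for illustration; SciPy and matplotlib are assumed to be available):

# Drawing a normal plot (Q-Q plot) and running the Shapiro-Wilk W test.
# The data are invented (randomly generated) for illustration.
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
data = rng.normal(loc=28, scale=6, size=85)   # hypothetical ages of onset

stats.probplot(data, dist="norm", plot=plt)   # points near a straight line => roughly Normal
plt.title("Normal plot")
plt.show()

w_stat, p_value = stats.shapiro(data)         # Shapiro-Wilk W test
print(f"Shapiro-Wilk W = {w_stat:.3f}, P = {p_value:.3f}")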

 
Number needed to treat (NNT)
This is the inverse of the absolute risk reduction (ARR): NNT = 1/ARR, with 95% confidence intervals. The NNT is the number of patients who would need to be treated with the intervention under trial in order for 1 additional patient to gain a good outcome compared to the control group (the outcome in the trial may be either a beneficial outcome in itself or may be the prevention of a bad outcome). This is a very useful measure when applying trial results to clinical situations. See also Number needed to harm (NNH).

Number needed to harm (NNH)
NNH is used when the intervention treatment increases the probability of a bad outcome. NNH is the inverse of the absolute risk increase (ARI): NNH = 1/ARI, with 95% confidence intervals. The NNH is the number of patients who would need to be treated with the intervention under trial for 1 additional patient to be harmed compared to the control group. This is a very useful measure when applying trial results to clinical situations. See also Number needed to treat (NNT).

O
Observer bias
Odds ratio


Observer bias
This is a similar bias to subject bias but affects the observer of the subject. They too may consciously or subconsciously tinker with the measuring or recording of the observed outcomes.

Odds ratio
Odds are the ratio of events to non-events; the odds ratio is the ratio of two odds. For instance, for a cohort study or systematic review, it is the ratio of the odds of having the disorder in question in the experimental (exposed) group to the odds of having it in the control group. In a case-control study, it is the ratio of the odds of having been exposed to the risk factor in question among the cases to the odds of exposure among the controls (who do not have the disorder in question).
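A minimal sketch of the calculation from a 2 x 2 table (the counts are invented for illustration):

# Calculating an odds ratio from a 2 x 2 table. The counts are invented:
# a/c have the disorder, b/d do not, in exposed vs unexposed groups.
a, b = 20, 80    # exposed:   20 with the disorder, 80 without
c, d = 10, 90    # unexposed: 10 with the disorder, 90 without

odds_exposed   = a / b
odds_unexposed = c / d
odds_ratio = odds_exposed / odds_unexposed   # equivalently (a * d) / (b * c)

print("Odds ratio:", round(odds_ratio, 2))   # 2.25 here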

P
Parameter
Percentage points
Percentiles
Pie charts
Poisson distribution
Power
Probability
Probability distribution
Publication bias


Poisson Distribution (see also Life Tables)

The Poisson distribution is a discrete probability distribution for counts of events occurring over time, as when we are looking at death rates, survival rates, birth rates etc. It is useful when we are interpreting mortality rates, patient survival, life tables etc. You do not need to know the equations for this distribution, but they are available in Bland.

 

Probability

According to Bland the probability that an event will happen under given circumstances is:

"the proportion of repetitions of those circumstances in which an event would occur in the long run."

Basic rules of probability  

A probability always lies between 0.0 and 1.0. So, when an event for a random variable never happens the probability is zero. When it always happens the probability is 1.0. 

  • Addition rule 

If two events are mutually exclusive (that is, they cannot occur together), then the probability that one or the other occurs is the SUM of their individual probabilities.

PROB (E1 + E2) = PROB (E1) + PROB (E2)

  • Multiplicative rule

If there are two or more independent events (one event does not influence the other), then we MULTIPLY the probabilities together to get the overall probability.

PROB (E1E2) = PROB (E2) x PROB (E1)

  • Conditional probability

This is important when  we want to know the probability of an event if it follows another event. This is very relevant in medicine as we often want to know how one treatment might work following another treatment, or what diagnosis might follow a particular combination of symptoms  etc. This is particularly relevant in decision tree methods and computer-assisted diagnosis software (if interested, further reading is available in Bland Chapter 15.7, p 288). 

So, if a probability of the outcome of one event E1, is affected by the outcome of an event E2, then we say that the probability of E1 is conditional on E2. We can write this as: PROB (E1/E2).

These are non-independent events, but we use the multiplicative rule again. We multiply the individual probabilities, but we are more careful about what we are multiplying. For instance, if we want to know the probability of E1 and E2 occurring together, we need to know the probability of event 1 happening and the probability of event 2 happening given that event 1 has happened already, or vice versa. This can be notated as follows:

PROB (E1E2) = PROB (E1) x PROB (E2/E1) = PROB (E2) x PROB (E1/E2)  

Probability distribution

A probability distribution is a graph or notation showing all the possible values of a random variable and their probabilities. It is analogous to a relative frequency distribution. Common probability distributions include the binomial and the Poisson.

Publication Bias
This is simply the bias that occurs when journals publish papers with positive conclusions more often than negative ones. A funnel plot is a visual way of representing this during meta-analysis to show whether the analysis is being unduly influenced by small studies with large effect sizes that actually might be flawed by bias.

 

 

Parameter

A parameter is a summary value which characterises the nature of the population in the variable that is under study.

Percentage points (*ADVANCED STUFF*)

Percentage points tables are another way of tabulating a distribution instead of using the Normal Distribution tables and you are unlikely to come across them at this stage. 

The one-sided percentage point of a distribution is the value z such that there is a probability of P% of an observation from that distribution being greater than or equal to z. Graph A of the Normal distribution shows that the tail on the upper side gives you the z value for being above a particular point (e.g. here for 2 SD = 5%).

Graph A. One-sided percentage point 


 

The two-sided percentage point of a distribution is the value z such that there is a probability of P% of an observation from that distribution being greater than or equal to z or less than or equal to -z. Graph B below shows the tails on both sides, giving you the z value above or below a particular point on the x-axis (e.g. here 2 SD = 2.5% on either side).

Graph B. Two-sided percentage point 


Pie charts

We can also represent the data as a pie chart - a good way to show % values. The total 360 degrees is 100%, so each % is represented by 3.6 degrees. Example in graph 1B below. 

Most software that handles data will automatically be able to draw a pie chart for you (eg. Microsoft Excel, GraphPad Instat). 

[Graph 1B]

Power

The power of a study is the ability of that study to detect a given difference. In general, we want to do studies that have a reasonable chance of success, so we try to make sure the power of a study is enough to find a difference. The usual convention is that we should have an 80% chance of detecting the difference we seek, i.e. the power of a study should be at least 0.8.

Q
Quantiles
Quartiles


Quantiles. See Medians and Quantiles.

Quartiles. See Medians and Quantiles.

Percentiles. See Medians and Quantiles.

R
Random variable
Randomisation
Randomised control trial (RCT)
Range
Recall bias
Referral bias
Regression
Regression to the mean
Relative Benefit Increase (RBI)
Relative risk
Relative Risk Increase (RRI)
Relative risk reduction (RRR)
Risk ratio. See Relative risk


Random variable  

A random variable is a quantity which takes any of the values within a specified set with a specified probability. For instance, the number of heads which you might get from tossing a coin is a random variable.

 

Randomisation (random allocation)
The aim of randomisation is to generate trial groups that are very similar in all characteristics except the intervention that they will receive. This minimises potential confounding factors and hence protects the internal validity of the trial.
Random allocation implies that each trial patient has the same chance of receiving any of the trial treatments and that each individual allocation is independent of any other. Various methods are used to assign patients as randomly as possible to the treatment groups: e.g. modern studies often use pre-stuffed, unmarked allocation envelopes allocated by a computer random number programme. The aim of this is to achieve, on average, control of other risk factors affecting the outcome(s) under investigation, including any unknown ones. This reduces bias due to allocation to particular treatment groups and reduces confounding of the results for the exposure or treatment by other risk factors not under study.

 
Randomised control trial (RCT)
This design is an intervention trial, where the patients of interest are recruited and then randomised into groups. Following that, the therapies (often a new treatment compared to a current treatment or no treatment) are given and the outcomes measured. The study proceeds forward from identifying the subjects, giving the therapy and measuring the effect - this is therefore a prospective study.
 

Recall bias
This bias occurs mostly in retrospective trials where cases or cohort study subjects and controls are asked about certain exposures. People with the disease are more likely to recall being exposed than those who do not have the disease.

Referral Bias
This is a special form of selection bias due to the recruitment of subjects from referral centres (hospitals, outpatients etc). These patients tend to be more severely affected by the disease in question and this causes a bias in the result to show a different effect than one would expect from the general population.

Regression
A statistical method of analysing the relationship between two quantitative variables. Regression estimates the numerical relationship between the two variables x and y:

y = a + bx

where b is the regression coefficient (the slope) and a is the intercept.

 
Regression to the mean
This is the phenomenon seen when extreme values move closer to the mean if the measurement is repeated. This is an effect of chance. It has huge implications for the results of clinical trials, as improvements seen in a trial might just be due to this chance effect, rather than a true effect due to the new treatment under research. P values and confidence intervals do not allow for this either. The only way to reduce its effect is to use repeated measurements to gain an average value or to use a control group for comparison, as this would hopefully balance out any shift towards the mean due to the effect. A 2003 BMJ article by Morton and Torgerson sums this up beautifully and should be read [Morton, V. & Torgerson,D.J.(2003). Effect of regression to the mean on decision making in health care. BMJ, 326 (7398), 1083 - 1084. ]
 

Relative Benefit Increase (RBI)
RBI is used when the intervention increases the probability of a good outcome. RBI is the proportional increase in rates of events (good outcomes) between the participants in the intervention and the control groups. RBI = (EER – CER) / CER, with 95% confidence intervals. See also Absolute Benefit Increase (ABI) and Number needed to treat (NNT).

Relative risk
Relative Risk (RR) is the ratio of the risk of an event happening in the intervention group (experimental event rate EER) divided by the risk of an event happening in the control group (control event rate CER): RR=EER/CER. See also Event rate.

Relative risk reduction (RRR)
RRR is used when the intervention treatment reduces the probability of a bad outcome. RRR is the proportional reduction in rates of events (bad outcomes) between the intervention group and the control group participants. RRR = (EER – CER) / CER, with 95% confidence intervals. See also Absolute risk reduction (ARR) and Number needed to treat (NNT).

Relative Risk Increase (RRI)
RRI is used when the intervention increases the probability of a bad outcome. RRI is the proportional increase in rates of bad outcome events between the intervention group and the control group participants. RRI = (EER – CER) / CER, with 95% confidence intervals. See also Absolute Risk Increase (ARI) and Number needed to harm (NNH).

S
Sampling Distribution
Scatter diagrams
Selection bias
Selection criteria
Sensitivity
Significance levels and types of error
Skew
Spectrum bias
Standard deviation
Standard error
Stratified randomisation
Subject bias
Systematic review


Sensitivity

Absolute sensitivity relates to the bare minimum sensitivity of a test - only unequivocally positive diagnoses are included as positives in the calculation.

Complete sensitivity includes the suspicious or borderline positive test results - so complete sensitivity measures the test's ability to detect all abnormal patients for further study (e.g. borderline or suspicious cases as well as the outright definite positive diagnoses), as in a breast screening mammogram.

So absolute sensitivity is the minimum sensitivity, whilst complete sensitivity includes equivocal results as positives. Which is most relevant depends on what the test is being used for (e.g. a gold-standard diagnostic test versus basic screening).

 

Skew

Asymmetrical unimodal curves are described as skewed positively (to the right) or negatively (to the left).

 The 3 common unimodal shapes:

1.    Symmetrical and bell-shaped, e.g. female height. This is the shape of the Normal Distribution.

2.    Positively skewed or skewed to the right (the long tail is on the right), e.g. triceps skinfold measurement

3.    Negatively skewed or skewed to the left (the long tail is to the left), e.g. period of gestation

These curves all have high frequencies in the centre and low frequencies at the two extremes.

Graph A Examples

image

Adapted from: Kirkwood, B.R. (1997). Essentials of Medical Statistics. (1st ed.). Oxford: Blackwell Science.

Sampling distribution 

The sampling distribution is the frequency distribution of a statistic over an infinite number of samples from the population. It tends to have the same mean as the population, but its SD (the standard error) is much smaller than the population SD.

Scatter diagrams

These graphs are useful when showing the relationship between two continuous variables (e.g. the obvious ones of height and weight). A bar chart would be rather clumsy here. Each variable is represented along one axis and the data point is represented by a dot or cross on the graph. 

 

Selection Bias (Sampling bias)
Also known as sampling bias, this occurs when subjects are selected into the trial in such a way that the outcome of the study is adversely affected, because the characteristics of the subjects differ from those intended.

Selection Criteria
These are the criteria by which the researchers choose the subjects for the trial. They are important, as they can have a big effect on the external validity of the trial. For instance, if a trial has a long list of exclusion criteria you may not be able to apply the results to many people in the target population.
Exclusion criteria: These are the criteria with which you exclude subjects you really cannot have in the trial e.g. those with a terminal illness, those who might bias your trial in a particular way (e.g. those with a poor renal function).
Inclusion criteria: These are the subjects that you wish to have in the trial; they are your trial population and should reflect the target population for your intervention.

Spectrum bias
This is very similar to referral bias and has a similar effect on the external validity of the trial. It is when subjects are recruited from patients with classic or severe symptoms of the disease in question.

Stratified randomisation
This is used in smaller studies to ensure that the baseline characteristics of the groups are comparable. Randomisation should achieve this, but may not if the sample size is small. You stratify according to the factors that you think may confound the results and then randomise within those strata (using blocked randomisation if necessary).

Subject bias
This is when the subject consistently distorts the measurement of an outcome. This usually happens because trial subjects want to please the assessor (the nurse, doctor etc) and give the responses to questions that they think the assessor wants to hear (e.g. that they are no longer feeling pain in their back, when in fact it is as bad as ever). This bias can be committed consciously or subconsciously.

Significance levels and types of error  

     1. Type I error is generally set at P < 0.05

    2. Type II error is generally set at P = 0.2

    3. These levels are 'set' by deciding the difference the study should be looking for, and then working out the appropriate sample size. This implies prior knowledge about likely effects and the amount of variability in the data. 

Type I error 

Type I error can be defined as the chance of falsely rejecting the null hypothesis. By convention, we only reject the null hypothesis when there is a 5% (1 in 20) chance or less that this conclusion would be wrong.

When the probability of seeing a difference as big as the one observed, if the null hypothesis were true, falls below 0.05 (P < 0.05), we reject the null hypothesis and accept that there is likely to be a real difference.

Type II error can be defined as the mistake of falsely accepting the null hypothesis. There is a real difference between our groups, but we have been unable to confidently reject the null hypothesis. This happens because the uncertainty (variability) in our data has drowned out the real difference between the groups we are comparing. The only way of improving our certainty is to increase the sample size. This works because:

    1. If there is a real difference it will be maintained no matter how big the sample gets, but

    2. The variability of the data will decrease as a proportion of that 'real difference' as the sample gets bigger.

Thus, the bigger our samples, the more powerful they are to detect differences (if they exist). That is why the ability to detect differences is called 'the power' of the study. 

For any given sample size, making the type I error rate smaller will automatically reduce the power of the study. That is to say, it will automatically increase the type II error rate. The same is true in reverse - making the power 0.9 instead of 0.8 cannot be achieved (for a given sample size) without the type I error rate being relaxed.
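These trade-offs can be explored by simulation. The sketch below (assuming numpy and scipy are available; the true difference, SD and sample sizes are all invented) repeatedly simulates a trial in which a real difference exists and counts how often a t-test at P < 0.05 detects it. The proportion detected is an estimate of the power, and it rises as the sample size grows.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def estimated_power(n_per_group, true_difference=5, sd=15, alpha=0.05, n_trials=2000):
    """Simulate many trials with a real difference between groups and count how
    often the t-test (correctly) rejects the null at the chosen alpha."""
    rejections = 0
    for _ in range(n_trials):
        control = rng.normal(100, sd, n_per_group)                      # invented baseline mean of 100
        treatment = rng.normal(100 + true_difference, sd, n_per_group)  # real difference present
        _, p = stats.ttest_ind(treatment, control)
        if p < alpha:
            rejections += 1
    return rejections / n_trials

for n in (20, 50, 100):
    print(f"n per group = {n:3d}: estimated power = {estimated_power(n):.2f}")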

Standard deviation 

See also: Mean, Measuring Dispersion and Deviations from the mean

We do not think that you need to know these formulae or how to derive them, but you may find that it helps your knowledge about SD. If it doesn't - don't bother!

To calculate variance and standard deviation we need to start with deviations from the mean. Firstly, we calculate the mean. Then we can calculate the difference between each observation and the mean, (xi − mean).

If the values of x are scattered far from the mean, the differences (xi − mean) will be big, but if the observations are distributed close to the mean, the differences will be smaller. These values therefore give an idea of the spread of the observations. We need some sort of average of these deviations from the mean, but if we simply sum them the total is zero (the positive and negative deviations cancel out). So, instead we square the deviations, (xi − mean)², and then sum them. This is the sum of the squares (about the mean).

Sum of squares = Σ (xi − mean)²

where Σ (sigma, the Greek capital S) is used here to mean "sum of".

 Variance

So we have a measure, the sum of squares, which gives us an idea of the spread of the observations about the mean. However, it would be more useful to find a measure that we could use to compare with the scatter around the population mean. This is where variance and degrees of freedom come in.

Variance (s²) is the sum of the squares divided by the degrees of freedom.

s² = Σ (xi − mean)² / (n − 1)

Degrees of freedom

The minimum number of observations that are required from which we can calculate variability is 2. One observation alone cannot be used. So, instead of using the number of observations n as the divisor for the variance, we use (n-1). This is known as degrees of freedom.

 Standard deviation

The variance gives us an estimate of the variability about the mean, but it is calculated from squared values and so has different units than the mean. So, to get a dispersion measure that has the same units as the mean, we just take the square root of the variance. This is the standard deviation.

SD (s) = √ [ Σ (xi − mean)² / (n − 1) ]

For a dataset with a very skewed distribution, the standard deviation will be over-inflated and not give us a good idea of the variability. A better way would be to transform the data (e.g. with a log transformation) which might make the distribution more symmetrical and hence the SD would be reasonable and more useful. More simply, we could just quote the interquartile range.

Online Medical Statistics at Square One (Swinscow, T.D.V. (1997). Statistics at Square One (9th ed.). BMJ Publishing Group) shows how to use a calculator to work out the SD for a dataset using a "calculator friendly" version of the formula.
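The steps above can also be followed in a few lines of Python (the dataset is invented, purely to illustrate the arithmetic): deviations from the mean, sum of squares, variance using n − 1, and finally the square root.

import math
import statistics

# A small invented dataset, purely to illustrate the steps above.
x = [4, 7, 6, 9, 5, 8, 6, 7]
n = len(x)

mean = sum(x) / n
deviations = [xi - mean for xi in x]               # (xi - mean); these sum to zero
sum_of_squares = sum(d ** 2 for d in deviations)   # Σ (xi - mean)²

variance = sum_of_squares / (n - 1)                # divide by the degrees of freedom
sd = math.sqrt(variance)                           # back to the same units as the mean

print(f"mean = {mean:.2f}, sum of squares = {sum_of_squares:.2f}")
print(f"variance = {variance:.2f}, SD = {sd:.2f}")
print(f"cross-check: statistics.stdev gives {statistics.stdev(x):.2f}")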

 

Systematic review

A systematic review is a study method developed to synthesize the evidence systematically across all trials of a given intervention. A systematic review can provide the most complete, accurate and authoritative guide to a therapy or risk factor. Visit the Cochrane Library website for more information and thousands of reviews.

Standard error

The standard error of the mean is used to estimate confidence limits for a mean. In doing this it assumes that the standard error of the sample in question is a good estimate of the standard deviation of the population.*

SEM = SD (sample) / √n

or 

SEM = √ [ SD² (sample) / n ]

When the sample size (n) is large, the SEM is small.

To work out the 95% CI for a sample, use the formula to estimate SEM and then the 95% CI will be: mean +/- (1.96 x SEM)

*Assuming that the frequency distribution of the means has a Normal distribution
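A minimal sketch of this calculation in Python (the sample values are invented, purely for illustration):

import math
import statistics

# Invented sample of haemoglobin values (g/dL), for illustration only.
sample = [13.1, 12.4, 14.2, 13.8, 12.9, 13.5, 14.0, 12.7, 13.3, 13.9]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)

sem = sd / math.sqrt(n)                      # SEM = SD / sqrt(n)
ci_low, ci_high = mean - 1.96 * sem, mean + 1.96 * sem

print(f"mean = {mean:.2f}, SEM = {sem:.2f}")
print(f"95% CI: {ci_low:.2f} to {ci_high:.2f}")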

 

The standard error of a proportion is used to estimate confidence limits for a proportion. In doing this it assumes that the standard error of the sample in question is a good estimate of the standard deviation of the population.*

 SE (proportion) = √ [p (1-p) / n]

If p = r/n, where n is the size of the random sample and r is the number in the sample with the condition, then:

SE (proportion) = √ [ r/n (1-r/n) / n]

To work out the 95% CI for a sample, use the formula to estimate SE and then the 95% CI will be: proportion +/- (1.96 x SE)

* Assuming a Normal distribution, and for this to apply, np and n(1-p) should both be > 5.
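A minimal sketch in Python (the figures for r and n are invented, purely for illustration):

import math

# Invented figures: r = 36 of n = 120 sampled patients have the condition.
r, n = 36, 120
p = r / n

# Check the Normal approximation is reasonable: np and n(1-p) should both be > 5.
assert n * p > 5 and n * (1 - p) > 5

se = math.sqrt(p * (1 - p) / n)
ci_low, ci_high = p - 1.96 * se, p + 1.96 * se

print(f"proportion = {p:.2f}, SE = {se:.3f}")
print(f"95% CI: {ci_low:.2f} to {ci_high:.2f}")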

 

The standard error of the difference between 2 proportions is used to estimate confidence limits for a difference between 2 proportions. In doing this it assumes that the standard error of the sample in question is a good estimate of the standard deviation of the population.*

SE (of difference between 2 proportions) = √ [ p1 (1 − p1) / n1 + p2 (1 − p2) / n2 ]

To work out the 95% CI for a sample, use the formula to estimate SE and then the 95% CI will be: difference between the proportions +/- (1.96 x SE)
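A minimal sketch in Python (the event counts and group sizes are invented, purely for illustration):

import math

# Invented figures: events in two groups.
r1, n1 = 30, 100   # e.g. events in the intervention group
r2, n2 = 45, 100   # e.g. events in the control group
p1, p2 = r1 / n1, r2 / n2

se_diff = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
diff = p1 - p2
ci_low, ci_high = diff - 1.96 * se_diff, diff + 1.96 * se_diff

print(f"difference between proportions = {diff:.2f}, SE = {se_diff:.3f}")
print(f"95% CI: {ci_low:.2f} to {ci_high:.2f}")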

 

The standard error of the difference between 2 means is used to estimate confidence limits for a difference between 2 means. In doing this it assumes that the standard error of the sample in question is a good estimate of the standard deviation of the population.*

SE (of difference between 2 means) = √ (s1²/n1 + s2²/n2)

To work out the 95% CI for a sample, use the formula to estimate SE and then the 95% CI will be: difference between the means +/- (1.96 x SE)

*Assuming that the frequency distribution of the difference between the means has a Normal distribution
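A minimal sketch in Python (the two samples are invented; note that with samples this small you would usually replace 1.96 with the appropriate value from the t-distribution, discussed below):

import math
import statistics

# Two small invented samples (e.g. a measurement in treatment vs control).
group1 = [5.1, 6.3, 5.8, 6.0, 5.5, 6.4, 5.9, 6.1]
group2 = [4.8, 5.2, 5.0, 5.6, 4.9, 5.3, 5.1, 5.4]

m1, m2 = statistics.mean(group1), statistics.mean(group2)
v1, v2 = statistics.variance(group1), statistics.variance(group2)   # s1², s2²
n1, n2 = len(group1), len(group2)

se_diff = math.sqrt(v1 / n1 + v2 / n2)
diff = m1 - m2
ci_low, ci_high = diff - 1.96 * se_diff, diff + 1.96 * se_diff

print(f"difference between means = {diff:.2f}, SE = {se_diff:.3f}")
print(f"95% CI: {ci_low:.2f} to {ci_high:.2f}")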

T
Tables
t-distribution
t-tests


Tables

A table is a good way of organising information (data, results, etc) into columns and rows. Tables are easier to follow than plain text but may not show you as much information as a graph. Tables are most often used to display the main results of a study, and to give a brief summary of the dataset. Usually a table contains important information - so take a good look at them! Remember to make your own tables clear and concise as readers often glance at the text and focus on the information in tables and graphs.

Always label the table clearly: label the rows and columns, taking care to show units and to distinguish between rates, percentages, frequencies and proportions. Row and column totals ("TOTAL") help the reader, especially for percentages.

 

The t-distribution 

When we take a small sample of the population, random chance variation means it is quite likely that the distribution will not look all that normal. What's more, when we take repeated small samples, the distributions may not look much like each other either. 

As you now know, it is the usual case in Medical research that we need to extrapolate our conclusions about medical populations from small samples studied formally in trials. We want to know if the apparent difference between them is 'real' (they are likely to be samples from different populations), or if it is more likely to be due to random variation (they come from the same population, and chance is enough to account for the difference). 

One of our most used tools to make this distinction is a second family of distributions where the 'tails' are bigger than for the normal distribution. This allows for our increased uncertainty about the 'real' normal distribution of the population. Of great personal interest, the chap who worked these distributions out for us was trying to help his employer solve some problems in the brewing industry! His employer wanted him to remain anonymous, so he published his work under the name 'Student'. This is therefore still known as "Student's t distribution" and "Student's t test". Publication took the form of a whole lot of tables for values of t for different sample sizes (measured as degrees of freedom). Indeed, there are many of us left who are ancient enough to remember the little books of tables we had to have at hand. We would work out the value of t (see below) and then look up the corresponding P-value. Fortunately, we have computer programs to do all that for us now. 

In this family of distributions, the smaller the number of independent data points, the greater the uncertainty and the bigger the tails in the distribution. Each distribution curve looks a bit like a normal curve, but the shape is changed according to the number of data points in our sample. The term used is 'degrees of freedom' (df), where this is the number of data points minus one (df = n-1). Have a look at these curves:

Distribution curves for a variable for different degrees of freedom

 


Figure 5.1 A small family of t-distributions

The smaller the degrees of freedom, the wider the tails. With enough data points, the curve becomes identical with the standard normal curve above. In fact, there is little difference between the t-distribution and the normal distribution above about 30 degrees of freedom (sample sizes of 31 or more). 

The standard normal distribution is a special case of the t-distribution when df = infinity.

 

T-tests

Calculating t-values

The numerical value of the t statistic is best thought of as the number of standard errors that one group mean lies from another. Usually we think of this as comparing the mean of a test group to the mean of a control group. The probability of that test group mean being from the same population as the control group mean is given with reference to the t-distribution rather than the normal distribution; by this method we adjust for the uncertainty of small sample sizes.

Once the sample size is big enough so that the t-distribution approaches Normal, then the t-statistic will equal 1.96 when there is a 5% probability that the test group is actually no different from the control group (from the point of view of the characteristic being measured).

To calculate t, we need to work out the two group means ( T = treatment and C = control for this formula), and estimate the variance of each group (which is the standard deviation squared). 

 

t = (mean T − mean C) / √ (sT²/nT + sC²/nC)

Figure 5.2 Calculation of the t-statistic

 

To put this in words, t is given by the difference between the means of the two groups, divided by the square root of the sum of (variance divided by n) for each group. The bottom term is actually the Standard Error of the difference. One useful analogy is to think of this as a signal (difference between the means) to noise (variability in each group) ratio.

Once t is calculated you look it up in a t table (e.g. Appendix B of Statistics at Square One) with the appropriate degrees of freedom (here, this will be the total of both groups minus 2, since (nt − 1) + (nc − 1) = (nt + nc) − 2).
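A minimal sketch of the calculation in Python (the treatment and control values are invented; scipy's ttest_ind with equal_var=False uses the same denominator as the formula above, so it is a useful cross-check for the t value):

import math
import statistics
from scipy import stats

# Invented treatment and control measurements, purely to illustrate the formula.
treatment = [12.1, 13.4, 12.8, 13.9, 12.5, 13.2, 14.0, 12.9]
control   = [11.2, 12.0, 11.8, 12.3, 11.5, 11.9, 12.1, 11.6]

mt, mc = statistics.mean(treatment), statistics.mean(control)
vt, vc = statistics.variance(treatment), statistics.variance(control)
nt, nc = len(treatment), len(control)

# t = (mean T - mean C) / sqrt(sT²/nT + sC²/nC): a signal-to-noise ratio.
t_by_hand = (mt - mc) / math.sqrt(vt / nt + vc / nc)
df = nt + nc - 2

print(f"t (by hand) = {t_by_hand:.2f}, degrees of freedom = {df}")

# Library cross-check (same denominator when equal_var=False).
t_lib, p_lib = stats.ttest_ind(treatment, control, equal_var=False)
print(f"t (scipy)   = {t_lib:.2f}, P = {p_lib:.4f}")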

1 and 2 sided t tests  

Most statistical tests will give answers that are 'one-sided' and 'two-sided'. The one-sided value is exactly half of the two-sided value. The two-sided value allows for a possible difference, of the magnitude measured, in either direction, e.g. if looking at age: younger or older. The one-sided value allows only for a difference in the direction measured (e.g. if looking at age: older only). The 'tails' referred to are the unlikely values at each end of our distribution.

Graphs showing t-distribution with the two 'tails' highlighted. Curves for DF=5 and DF=100

The higher the degrees of freedom, the smaller the tails for any given P-value. The numbers on the graph are values for 't' shown at the P=0.05 level.


Which we should use depends on our hypothesis, and on what we know about our question. A two-sided test is most appropriate if we admit we have no idea whether one group will differ from the other by being bigger or smaller; either is a possibility. In the example above, we would have no way of knowing, before we did the experiment, whether the mean age in the active group would be older or younger, so a two-sided test is appropriate.

In general, two-sided tests are wise. If you are absolutely confident that a difference can only occur in a single direction, then a one-tailed test could be appropriate. But be prepared for an argument!
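A minimal sketch (assuming scipy is available; the t value and degrees of freedom are invented) showing that the one-sided P-value is exactly half the two-sided one:

from scipy import stats

# Invented example: t = 2.1 with 18 degrees of freedom.
t_value, df = 2.1, 18

p_one_sided = stats.t.sf(t_value, df)            # area in one tail beyond t
p_two_sided = 2 * stats.t.sf(abs(t_value), df)   # area in both tails

print(f"one-sided P = {p_one_sided:.3f}")
print(f"two-sided P = {p_two_sided:.3f}")        # exactly twice the one-sided value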

 

U
Unimodal distributions

V
Validity. See Internal Validity or External Validity
Variance
Variables


Variance 

See also Standard deviation

Variance (s²) is the sum of the squares divided by the degrees of freedom.

s² = Σ (xi − mean)² / (n − 1)

Variables

Variables can be classified into 2 types:

QUALITATIVE - individuals can be divided into separate classes: e.g. gender (male or female); hair colour (brown, blonde, black).  

These qualitative variables are either:

unordered (nominal), such as: gender (male/female); thalassemia sufferer, thalassemia carrier or non-carrier; or blood group O, A, B, AB. The order in which we look at these classes does not matter. For example, we cannot say that people with blood group O lie in any particular order in relation to those with group A, B or AB.

or 

ordered (ordinal). These are ordered responses, such as: severe, moderate, mild or no problem; or, patients could be asked if they agree that they are worried about their health and either "agree", "have no strong feeling either way" or "disagree". In these cases the order that we look at the variable does matter and we should usually take it into account when we manipulate the data.

QUANTITATIVE - Numerical: e.g. numbers of students in a class; height and weight. These data can be divided into class intervals to make them easier to handle (see later).

Continuous quantitative data is where there are no gaps between the data, i.e. the variable is a measurement on a continuous scale (e.g. height, weight, blood pressure, haemoglobin values). 

Discontinuous (or discrete) quantitative data can only take certain separate values, e.g. shoe size (which comes only in whole and half sizes), or the number of cases of thalassemia detected each year by prenatal testing.

Y
Yates continuity correction


Yates' continuity correction
Some discrepancies arise in Chi-square analysis because the observed frequencies can only change in whole numbers (one extra person at a time). Yates' continuity correction brings each observed frequency closer to the corresponding expected frequency by ½. It is best thought of as a 'conservative factor' when estimating probabilities between 0.1 and 0.01. It is really an approximation to Fisher's exact test.
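A minimal sketch (assuming numpy and scipy are available; the 2 × 2 table of observed frequencies is invented) comparing the chi-squared result with and without Yates' correction:

import numpy as np
from scipy.stats import chi2_contingency

# An invented 2x2 table of observed frequencies (rows: exposed/unexposed,
# columns: outcome yes/no), purely for illustration.
observed = np.array([[12, 28],
                     [22, 18]])

# Standard chi-squared test (no correction)...
chi2_plain, p_plain, _, _ = chi2_contingency(observed, correction=False)
# ...and with Yates' continuity correction (scipy's default for a 2x2 table).
chi2_yates, p_yates, _, _ = chi2_contingency(observed, correction=True)

print(f"without correction:     chi2 = {chi2_plain:.2f}, P = {p_plain:.3f}")
print(f"with Yates' correction: chi2 = {chi2_yates:.2f}, P = {p_yates:.3f}")  # more conservative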