Error and bias


All measures of diet, physical activity, and anthropometry contain some degree of error. Understanding the source of this error is vital to help refine and develop the method, and strengthen the inferences that are made from the data. It is also important to understand that:

  • Measurement error and bias are not the same thing
  • Measurement error does not necessarily cause bias
  • Bias can occur even in the absence of measurement error

Estimated Value = True Value + Total Measurement Error

The sources of measurement error fall into two categories:

  • Systematic error
  • Random error

These two types are not mutually exclusive, and often co-exist. Scrutinising the source and magnitude of both types of error is an important step in deciding whether or not a method is appropriate to help answer the research question.

Total Measurement Error = Systematic Error + Random Error
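As a hypothetical illustration of this decomposition (all numbers below are invented for the example), a short Python sketch simulates repeated measurements of a known true value with a constant systematic offset plus random noise:

```python
import numpy as np

rng = np.random.default_rng(42)

true_value = 70.0       # hypothetical true body weight (kg)
systematic_error = 1.5  # constant offset, e.g. from a miscalibrated scale (assumed)
random_sd = 0.8         # standard deviation of the random component (assumed)

# Estimated value = true value + systematic error + random error
estimates = true_value + systematic_error + rng.normal(0.0, random_sd, size=1000)

print(f"Mean of estimates: {estimates.mean():.2f} kg")  # ~71.5: shifted by systematic error
print(f"SD of estimates:   {estimates.std():.2f} kg")   # ~0.8: reflects random error
```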

Effect of random error on estimated values

  • Random errors cause the estimated values to deviate from the true value due to chance alone
  • They lead to measurements that are imprecise in an unpredictable way, resulting in less certain conclusions
  • Random error is assumed to cause estimated values to fluctuate above and below the true value to the same extent
  • This means that estimates of central tendency such as the mean or median are unaffected, but greater random error does increase the variability about the mean
  • Methods with less random error produce values which are closer together; this is known as the method’s precision

For example, a calibrated set of digital weighing scales may not show exactly the same reading on each measurement because of small fluctuations in posture and instrumentation. Measured values will be dispersed in a normal distribution around the true weight (Figure C.4.1). Methods with high random error will be more dispersed, and methods with low random error less so. For either method, the true value can be estimated with repeated measurement as the mean of the measured values. Estimates of the mean from both methods may be similar; however, we can be more certain about the estimate given by the method with the smaller random error.

Figure C.4.1 The normal distribution of measured values due to random errors. Note that the mean of the measured values approximates the true population mean.
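A minimal simulation of this example, assuming two hypothetical unbiased weighing methods that differ only in the size of their random error:

```python
import numpy as np

rng = np.random.default_rng(7)
true_weight = 70.0  # kg, hypothetical

# Two unbiased methods differing only in the size of their random error
precise = true_weight + rng.normal(0.0, 0.2, size=200)    # low random error
imprecise = true_weight + rng.normal(0.0, 2.0, size=200)  # high random error

# Both means approximate the true weight, but the dispersion differs tenfold
for name, values in [("low random error ", precise),
                     ("high random error", imprecise)]:
    print(f"{name}: mean = {values.mean():.2f} kg, SD = {values.std():.2f} kg")
```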

Sources of random error

Random error can occur for a variety of reasons. These include:

  • Individual biological variation, e.g. in biochemical measurements
  • Sampling error, e.g. when only a subset of a larger population is selected
  • Measurement error, e.g. estimation of portion sizes in a dietary assessment

Examples of random error in diet, physical activity and anthropometric measurement include, but are not limited to, errors resulting from:

  • Inadequate or confused explanations from the investigators
  • Changes in behaviour due to the measurement procedure
  • Dietary coding errors
  • Participant error in estimating portion sizes
  • Limitations within a nutrient database
  • Improper calibration of instruments
  • Inconsistent placements of instruments on the body
  • Incorrect translations or coding of outcome data

If the error can be directly linked to specific factors, the error is not random, but systematic. For example, errors caused by inadequate recall may differ by specific personal traits, such as age – this would be a systematic error (see section below).

Example: sources of variation in nutrient intakes among men in Shanghai, China [2]

Example: sources of variation in daily physical activity levels as measured by an accelerometer [10]

Controlling random error

Random error generally affects a method’s reliability. Random errors may be reduced or compensated for by:

  • Increasing the number of measurements taken per participant (see the sketch after this list)
  • Increasing the sample size or the length of follow up in a cohort study
  • Frequent calibration of instruments
  • Incorporating quality control and assurance in the assessment process:
    • Rigorous standard operating procedures
    • Manual of procedures
    • Standardised rigorous and consistent training of fieldworkers/interviewers undertaking assessments
    • Inter- and intra-observer reliability checks
  • Pilot studies, particularly when designing a new questionnaire
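The first of these points can be sketched numerically. Assuming a purely random within-person error, the standard error of the mean of n replicate measurements falls as 1/√n (the values below are invented):

```python
import numpy as np

rng = np.random.default_rng(0)
true_value = 70.0  # hypothetical true value
random_sd = 2.0    # assumed within-person random error (SD per measurement)

for n in (1, 4, 16, 64):
    # Simulate many participants, each measured n times, and average the replicates
    means = rng.normal(true_value, random_sd, size=(10_000, n)).mean(axis=1)
    print(f"n = {n:2d}: SD of the mean = {means.std():.2f} "
          f"(theory: {random_sd / np.sqrt(n):.2f})")
```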

Effect of systematic error on estimated values

Systematic error causes deviation away from the true value in a particular direction (i.e. consistently higher or lower, as shown in Figure C.4.2). Unlike random error (see section above), systematic error distorts the mean and median of the estimated values, and is more commonly associated with the validity of a method.

Figure C.4.2 The distribution of measured values distorted due to systematic error. Note that the mean of the estimated values has deviated from the true population mean.

The validity page describes systematic error in the context of taking height measurements while participants are still wearing their shoes. This causes deviation from the true value in an upwards direction. Although random error can be compensated for by an increased number of measurements or a larger sample size, this systematic error would persist to the same extent no matter how many replicate measurements were taken, as illustrated in the sketch below.
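A hypothetical sketch of this point, assuming an offset of 2 cm from the shoes and a small random error per measurement:

```python
import numpy as np

rng = np.random.default_rng(1)
true_height = 170.0  # cm, hypothetical
shoe_offset = 2.0    # assumed systematic error from measuring with shoes on
random_sd = 0.5      # small random error per measurement (assumed)

for n in (1, 10, 100, 1000):
    replicates = true_height + shoe_offset + rng.normal(0.0, random_sd, size=n)
    # Averaging suppresses the random error, but the constant offset remains
    print(f"n = {n:4d}: mean = {replicates.mean():.2f} cm (true = {true_height} cm)")
```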

Systematic error does not always result in bias

If physical activity measurements were performed only in summer, individuals' levels of physical activity over a year would probably be overestimated, because people are generally more physically active in summer than in winter. This is a systematic error due to insufficient sampling across seasons.

However, if the degree of the overestimation is consistent across individuals, the ranking of individuals by physical activity level remains unbiased; in other words this is still a valid method for ranking individuals.

If researchers are primarily interested in relative levels, rather than in estimating absolute levels of a key variable, systematic error is not always problematic. Ideally this would be tested in a subset of the study to confirm that systematic error does not distort the ranking of individuals.
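A minimal sketch of this idea, applying an assumed constant 20% summer overestimation to invented individual activity levels; the absolute values are biased, but the ranking is essentially preserved:

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(2)

# Hypothetical true annual physical activity levels (arbitrary units)
true_levels = rng.gamma(shape=4.0, scale=10.0, size=500)

# Summer-only measurement: assumed constant 20% overestimation plus random error
measured = true_levels * 1.20 + rng.normal(0.0, 2.0, size=500)

rho, _ = spearmanr(true_levels, measured)
print(f"Mean true level:     {true_levels.mean():.1f}")
print(f"Mean measured level: {measured.mean():.1f}")  # inflated by ~20%
print(f"Spearman rank correlation: {rho:.3f}")        # close to 1: ranking preserved
```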

In research on population health, assessments of diet, physical activity and anthropometry are implemented in different population subgroups depending on the research question. These subgroups may be defined by sex, age, socioeconomic status, lifestyle behaviours, risk factors for a disease, or disease status, and the degree of systematic and random error may vary by these characteristics.

If the error varies by one or more of these characteristics, the error is differential; if not, it is non-differential. For example:

  • Men and women may both over-report physical activity (systematic error), but the over-reporting may be greater in men, leading to a higher proportion of men being categorised as meeting guidelines (differential systematic error by sex), as sketched below
  • Alternatively, the misclassification can occur non-differentially, whereby the degree of misclassification is the same in each group (i.e. men and women over-report physical activity to the same degree; non-differential systematic error by sex)
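The differential case in the first bullet can be sketched with invented numbers: if men over-report more than women, a larger share of men is misclassified as meeting a guideline threshold:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
guideline = 150.0  # minutes/week of moderate activity (illustrative threshold)

# Hypothetical true weekly activity, similar in men and women
true_men = rng.normal(140.0, 40.0, size=n)
true_women = rng.normal(140.0, 40.0, size=n)

# Differential systematic error: men assumed to over-report more than women
reported_men = true_men + 30.0      # +30 min/week over-reporting
reported_women = true_women + 10.0  # +10 min/week over-reporting

for label, true, reported in [("men  ", true_men, reported_men),
                              ("women", true_women, reported_women)]:
    print(f"{label}: truly meeting guideline = {(true >= guideline).mean():.0%}, "
          f"reported as meeting = {(reported >= guideline).mean():.0%}")
```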

Wording can vary in the literature; terms such as homogeneity (non-differential) or heterogeneity (differential) in errors by subgroups are often used.

Different types of errors are visualised in Figure C.4.3, which illustrates:

  • If random error is present, variability is wide
  • If systematic error is present, the mean estimate deviates from the true mean
  • If differential error is present, random error, systematic error, or both differ between men and women

Figure C.4.3 Display of random, systematic, differential, and non-differential errors with respect to men and women.

Bias is the term used to describe the cumulative effect of deviations of the observations from the truth that investigators wish to measure. In the height example above, measuring participants with their shoes on was one form of systematic error producing bias: it distorted the height data consistently upwards.

There are at least 50 forms of bias in population health sciences, and several different systems for classifying and subcategorising them. One of the simplest systems is to separate them into:

  • Information bias
  • Selection bias
  • Confounding

Further information regarding the categories and subcategories of bias can be found in the following article [3].

Information bias

Information bias is a result of errors introduced during data collection and is therefore the form of bias most applicable to the issue of measurement. These errors originate from:

  • Those measuring the variable, known as observer bias. For example, researcher knowledge of hypotheses and group allocations could influence the way information is collected and interpreted
  • The tool being used for measurement, known as measurement bias. For example, a malfunctioning gas analyser or a poorly designed questionnaire
  • Those being measured, known as respondent bias. This can take many forms, such as:
    • Reporting bias, whereby participants provide answers in the hypothesised direction
    • Social desirability bias, which can cause over- or under-reporting of certain behaviours, in order to appear favourable or avoid criticism
    • Recall bias, which occurs due to differences in the accuracy or completeness of recollections by study participants relating to prior events or experiences

These forms of bias can result from different combinations of errors (e.g. a mixture of differential and non-differential systematic and random errors).

Observer and respondent bias can be minimised through a range of approaches: training observers, asking participants not to change their lifestyle during a study or to behave differently while being monitored, and the other measures described above (see maximising reliability).

In a randomised controlled trial, participants and observers are ideally blinded to the intervention so that they do not behave differently. The same principles apply to measurement, since administering an assessment is itself a form of intervention.

Selection bias

Selection bias occurs when the study sample is systematically unrepresentative of the target population about which conclusions are to be drawn, resulting in data with insufficient external validity. The resulting data may therefore be flawed in its ability to answer any research questions about the target population, even if the study itself has a high level of internal validity. Selection bias can occur because of:

  • The way the study population (or subgroups) are defined
  • The inclusion or exclusion criteria
  • Study withdrawals and rates of follow-up
  • Non-responders

Selection bias can often be reduced through careful study design, ensuring that no element of recruitment or data collection systematically favours one type of individual over another, as well as by actively monitoring the characteristics of the sample population during the course of the study.

If selection bias is likely to exist by design, it is helpful to collect other information which makes it possible to estimate the likely size and direction of the bias in the results. For instance, it is possible to collect the demographic characteristics of non-responders, who can then be compared with the responders to determine whether they differ systematically by social status, education etc.
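As a sketch of such a comparison, using hypothetical counts of responders and non-responders by education level, a chi-squared test can indicate whether the two groups differ systematically:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts: rows = responders / non-responders,
# columns = education level (low, medium, high)
table = np.array([
    [120, 300, 380],  # responders
    [ 90, 110,  60],  # non-responders
])

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-squared = {chi2:.1f}, p = {p:.3g}")
# A small p-value suggests the non-responders differ systematically by
# education, pointing to possible selection bias in the achieved sample.
```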

Confounding

Confounding does not result from measurement error and is not a measurement issue per se, but it is an important epidemiological concept relating to study design, analysis and interpretation of data. Although confounding does not generally affect the choice of assessment method, it is important to ensure that potential confounding variables are measured using reliable and valid methods in order to reduce the effect of residual confounding. There are three criteria for confounding:

  1. The variable has an independent association with the outcome of interest
  2. The variable has an association with the exposure of interest
  3. The variable is not on the causal pathway between the exposure and the outcome

In typical epidemiological research on a lifestyle–disease association, confounding is explained by the relationships depicted in Figure C.4.4.

Figure C.4.4 Example of relationship between exposure, outcome and confounder.

In the above example, one of the assumptions is that smoking status is not on the causal pathway of the coffee–cancer relationship; in other words, coffee consumption does not change smoking status. Confounding is controlled for in the design of a study by randomisation, restriction, or matching of participants, and during the analysis phase by stratification or statistical modelling, as sketched below.
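A minimal sketch of control by statistical modelling, with invented continuous data in the spirit of Figure C.4.4: smoking raises both coffee consumption and disease risk, so the crude coffee coefficient appears positive even though the simulated true effect of coffee is zero, and adjusting for smoking removes it:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10_000

# Invented data: smoking (confounder) raises both coffee intake and risk
smoking = rng.binomial(1, 0.3, size=n).astype(float)
coffee = 2.0 * smoking + rng.normal(2.0, 1.0, size=n)  # exposure
risk = 1.5 * smoking + rng.normal(0.0, 1.0, size=n)    # outcome: no true coffee effect

def ols_coefs(predictors, y):
    """Ordinary least squares with an intercept; returns fitted coefficients."""
    X = np.column_stack([np.ones_like(y)] + predictors)
    return np.linalg.lstsq(X, y, rcond=None)[0]

crude = ols_coefs([coffee], risk)[1]              # coffee only
adjusted = ols_coefs([coffee, smoking], risk)[1]  # coffee + smoking

print(f"Crude coffee coefficient:    {crude:.3f}")    # positive: confounded
print(f"Adjusted coffee coefficient: {adjusted:.3f}") # ~0 after adjustment
```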

References

  1. Bellach B, Kohlmeier L. Energy adjustment does not control for differential recall bias in nutritional epidemiology. Journal of Clinical Epidemiology. 1998;51(5):393-8.
  2. Cai H, Yang G, Xiang YB, Hebert JR, Liu DK, Zheng W, et al. Sources of variation in nutrient intakes among men in Shanghai, China. Public Health Nutrition. 2005;8(8):1293-9.
  3. Delgado-Rodriguez M, Llorca J. Bias. Journal of Epidemiology and Community Health. 2004;58(8):635-41.
  4. Dodd KW, Guenther PM, Freedman LS, Subar AF, Kipnis V, Midthune D, et al. Statistical methods for estimating usual intake of nutrients and foods: a review of the theory. Journal of the American Dietetic Association. 2006;106(10):1640-50.
  5. Ferrari P, Friedenreich C, Matthews CE. The role of measurement error in estimating levels of physical activity. American Journal of Epidemiology. 2007;166(7):832-40.
  6. Ferrari P, Kaaks R, Fahey MT, Slimani N, Day NE, Pera G, et al. Within- and between-cohort variation in measured macronutrient intakes, taking account of measurement errors, in the European Prospective Investigation into Cancer and Nutrition study. American Journal of Epidemiology. 2004;160(8):814-22.
  7. Freedman LS, Midthune D, Carroll RJ, Krebs-Smith S, Subar AF, Troiano RP, et al. Adjustments to improve the estimation of usual dietary intake distributions in the population. Journal of Nutrition. 2004;134(7):1836-43.
  8. Jakes RW, Day NE, Luben R, Welch A, Bingham S, Mitchell J, et al. Adjusting for energy intake – what measure to use in nutritional epidemiological studies? International Journal of Epidemiology. 2004;33(6):1382-6.
  9. Kaaks R, Ferrari P, Ciampi A, Plummer M, Riboli E. Uses and limitations of statistical accounting for random error correlations, in the validation of dietary questionnaire assessments. Public Health Nutrition. 2002;5(6a):969-76.
  10. Matthews CE, Ainsworth BE, Thompson RW, Bassett DR Jr. Sources of variance in daily physical activity levels as measured by an accelerometer. Medicine and Science in Sports and Exercise. 2002;34(8):1376-81.
  11. Wong MY, Day NE, Bashir SA, Duffy SW. Measurement error in epidemiology: the design of validation studies I: univariate situation. Statistics in Medicine. 1999;18(21):2815-29.
  12. Wong MY, Day NE, Wareham NJ. Measurement error in epidemiology: the design of validation studies II: bivariate situation. Statistics in Medicine. 1999;18(21):2831-45.