   (Thursday 6th, 16:00-18:00)

A real data approach to teaching the consequences of non-random sampling



Most statisticians fully understand that many statistical inference procedures are valid only when the data come from a random sample. However, the consequences of violating this assumption are seldom demonstrated. In this paper, we discuss how heart rate data collected by students may be used to demonstrate this concept. When asked to collect data from a sample of five people, students never follow true random selection. The resulting data give the instructor a large number of samples of size n=5, ready-made for estimating the coverage probability of confidence intervals for the mean heart rate. This estimated coverage probability has never failed to fall far below the stated confidence level. We then pool the data and divide the observations randomly into samples of size n=5. When confidence intervals are constructed from each of these random samples, the coverage probability matches the stated confidence level.
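The comparison described above can be imitated with simulated data. The following is a minimal sketch, not the authors' procedure or data. It assumes that non-random selection shows up as a shared offset within each student's sample (for example, a student measuring five friends of similar age and fitness), that heart rates are roughly normal around 72 beats per minute, and that the mean of the pooled data stands in for the true mean. The function names, the seed, and all numeric settings are hypothetical choices made for illustration.

```python
import math
import random
import statistics

# Two-sided 95% t critical value for n = 5 (4 degrees of freedom).
T_CRIT_95_DF4 = 2.776


def t_interval(sample, t_crit=T_CRIT_95_DF4):
    """Return the (lower, upper) t confidence interval for the mean."""
    mean = statistics.fmean(sample)
    half_width = t_crit * statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - half_width, mean + half_width


def coverage(samples, target):
    """Fraction of samples whose interval contains the target mean."""
    hits = 0
    for sample in samples:
        lower, upper = t_interval(sample)
        if lower <= target <= upper:
            hits += 1
    return hits / len(samples)


def simulate(num_students=2000, n=5, seed=0):
    """Compare coverage for student-collected and re-randomized samples.

    Assumption (hypothetical): each student's five subjects share a common
    offset from the overall mean, which mimics convenience sampling.
    """
    rng = random.Random(seed)
    student_samples = []
    for _ in range(num_students):
        shared_offset = rng.gauss(0, 8)  # assumed between-student spread
        sample = [rng.gauss(72 + shared_offset, 6) for _ in range(n)]
        student_samples.append(sample)

    # Pool all observations and use the pooled mean as the target value.
    pooled = [x for sample in student_samples for x in sample]
    target = statistics.fmean(pooled)

    # Divide the pooled observations randomly into new samples of size n.
    shuffled = pooled[:]
    rng.shuffle(shuffled)
    random_samples = [shuffled[i:i + n] for i in range(0, len(shuffled), n)]

    return coverage(student_samples, target), coverage(random_samples, target)


if __name__ == "__main__":
    as_collected, re_randomized = simulate()
    print(f"coverage, samples as collected:  {as_collected:.3f}")
    print(f"coverage, re-randomized samples: {re_randomized:.3f}")
```

Under these assumptions, the intervals built from the samples as collected cover the pooled mean much less often than 95% of the time, while the re-randomized samples give coverage near the nominal level. The size of the gap depends entirely on the assumed shared offset, so it should not be read as an estimate of what real classroom data would show.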