Towards statistical thinking: making real data real



  • Frauke Kreuter


Although the statistics education community has long advocated using real data to teach introductory statistics, these data sets are often not recognizably real to statisticians: students’ limited experience with “real” statistical software and data management techniques precludes the use of truly messy data. Yet grappling with messy, complex data sets is important for teaching statistical thinking (broadly defined as “thinking like a statistician”) and is appropriate for an introductory statistics course. We describe our experience collecting rich data sets and developing computer lab assignments in Stata that use these data sets to teach statistical thinking to first-year university students. Collecting usable, real data sets turns out to be fairly difficult for several reasons, and teaching data management and analysis without resorting to rote rules is quite challenging.