Contributed paper list
(Wednesday 16th, 10:55-12:25) In session C10C
It is time to include data management in introductory statistics
Robert H Carver, Mia Stephens
Presenter Robert H Carver
Real data sets and computational software have been widely adopted in the teaching of introductory statistics. To sustain these two developments and to keep pace with the explosion in freely available data from public sources, it is important for students to learn methods for obtaining, cleaning, organizing, and manipulating large data sets from multiple sources prior to analysis. Though experimental studies remain central, the practice of data analytics in many disciplines begins with observational data. This paper begins with a rationale for including foundational concepts of data management, such as joining tables, selecting rows, and inserting rows, as well as practice with automated approaches to missing and dirty data. It then provides illustrative examples of how such topics can be taught in engaging and accessible ways, and suggests existing course topics that can be cut back to make room for them.
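
As a rough illustration of the data management operations named in the abstract (joining tables, selecting rows, inserting rows, and handling missing or dirty values), the following minimal sketch uses only the Python standard library. It is not taken from the paper: the tables, column names, values, and the helper functions clean_score and inner_join are all hypothetical, and the authors may well use different software and examples.

```python
# Hypothetical toy tables; names and values are illustrative only.
students = [
    {"id": 1, "name": "Ana", "major": "Biology"},
    {"id": 2, "name": "Ben", "major": "Economics"},
    {"id": 3, "name": "Cy", "major": None},      # missing value
]
scores = [
    {"id": 1, "score": "88"},
    {"id": 2, "score": " 92 "},                  # dirty: stray whitespace
    {"id": 3, "score": "n/a"},                   # dirty: non-numeric placeholder
]


def clean_score(raw):
    """Return the score as a float, or None if it cannot be parsed."""
    try:
        return float(raw.strip())
    except (AttributeError, ValueError):
        return None


def inner_join(left, right, key):
    """Join two lists of dicts on a shared key, keeping only matched rows."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


# Inserting a row: add a new student record to the first table.
students.append({"id": 4, "name": "Dee", "major": "Nursing"})

# Joining tables: student 4 has no score record, so the inner join drops it.
joined = inner_join(students, scores, "id")

# Cleaning dirty data: convert score strings to numbers, flagging failures as None.
for row in joined:
    row["score"] = clean_score(row["score"])

# Selecting rows: keep only records with no missing values.
complete = [
    row for row in joined
    if row["score"] is not None and row["major"] is not None
]

for row in complete:
    print(row["id"], row["name"], row["major"], row["score"])
```

In this sketch the join silently drops the unmatched student and the final selection drops the record with missing values. Both are choices a student would need to notice and justify before analysis, which is the kind of judgment the abstract argues belongs in an introductory course.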