Contributed paper list


   (Monday 14th, 10:55-12:25)   In session C2B

A shiny new opportunity for big data in statistics education


Presenter

Karsten Maurer (United States)

Abstract

As truly massive data sets become more widely available, it is enticing to incorporate them into the curriculum of an undergraduate statistics course. Major barriers to including big data arise from the computationally intensive nature of working with large databases. Difficulties include gaining access to the database, interacting with database management software, and obtaining manageable subsamples from the database for student use. This paper describes a web-based application, the Shiny Database Sampler, which allows instructors to bypass these barriers using a simple JavaScript-based tool. The tool is built in R with the packages Shiny and RMySQL, and lets the instructor and/or students sample observations from several large databases, using selected sampling schemes, for use in the statistics classroom.
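To make the idea of drawing manageable subsamples under a chosen sampling scheme concrete, here is a minimal sketch. It is not the Shiny Database Sampler's code: it uses Python and an in-memory SQLite table in place of R, Shiny, and a remote MySQL database, and the table name `records`, its columns, and all function names are hypothetical choices made for illustration. It shows two common schemes, simple random sampling and stratified sampling, where only the sampled rows are returned to the user.

```python
import random
import sqlite3


def build_demo_db(n_rows=10_000, seed=1):
    """Create an in-memory SQLite table standing in for a large remote database.

    Assumption: the table `records` and its columns are invented for this sketch.
    """
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (id INTEGER PRIMARY KEY, region TEXT, value REAL)"
    )
    rows = [
        (i, rng.choice(["north", "south", "east"]), rng.gauss(50, 10))
        for i in range(1, n_rows + 1)
    ]
    conn.executemany("INSERT INTO records VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def fetch_by_ids(conn, ids):
    """Fetch full rows for the chosen primary keys.

    Keep the number of ids small: SQLite limits the bound parameters per query.
    """
    placeholders = ",".join("?" * len(ids))
    query = f"SELECT id, region, value FROM records WHERE id IN ({placeholders})"
    return conn.execute(query, ids).fetchall()


def simple_random_sample(conn, n, seed=None):
    """Draw n rows without replacement, each row equally likely."""
    rng = random.Random(seed)
    ids = [row[0] for row in conn.execute("SELECT id FROM records")]
    return fetch_by_ids(conn, rng.sample(ids, n))


def stratified_sample(conn, n_per_stratum, seed=None):
    """Draw up to n_per_stratum rows without replacement from each region."""
    rng = random.Random(seed)
    strata = [
        row[0]
        for row in conn.execute("SELECT DISTINCT region FROM records ORDER BY region")
    ]
    sample = []
    for stratum in strata:
        ids = [
            row[0]
            for row in conn.execute(
                "SELECT id FROM records WHERE region = ?", (stratum,)
            )
        ]
        chosen = rng.sample(ids, min(n_per_stratum, len(ids)))
        sample.extend(fetch_by_ids(conn, chosen))
    return sample


conn = build_demo_db()
srs = simple_random_sample(conn, n=100, seed=42)
strat = stratified_sample(conn, n_per_stratum=20, seed=42)
print(len(srs), len(strat))
```

Sampling the primary keys first and then fetching only the selected rows keeps the data transferred to the student small, which is the point of subsampling a large database. For a table too large to list every key, a real deployment would likely need a different key-selection strategy; that choice is outside what the abstract specifies.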