Introducing large data sets into the classroom: a graphical user interface for teaching with databases


Ulrike Genschel, Heike Hofmann, Danielle S. Wrolstad


Ulrike Genschel (United States)


Analysis of large, complex data sets is increasingly relevant for today’s statisticians. To facilitate training in databases and SQL (Structured Query Language) at the undergraduate level, we propose a graphical user interface that supports statistical analyses of large databases using subsampling techniques. The example database contains 25 variables on more than 120 million commercial flights across the United States since 1987, including originating and destination airports, temporal information such as the planned flight schedule and actual take-off and landing times, and further qualitative variables. A textual record of each session's SQL commands summarizes students' attempts at interacting with the database. This record gives feedback to the instructor and also serves as a starting point for more complex aspects of the SQL language, similar to SAS (Statistical Analysis System) scripts generated from JMP sessions.
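
As a rough illustration of the kind of subsampling query a student session might produce, the following minimal sketch builds a small synthetic flight table in an in-memory SQLite database and estimates the mean departure delay from a random subsample. The table name `flights`, its columns, the airport codes, and both function names are hypothetical stand-ins chosen for this example; they are not the schema or the interface described above, and the data are randomly generated.

```python
import random
import sqlite3


def build_toy_flights(n_rows=10_000, seed=0):
    """Create an in-memory table of synthetic flights (hypothetical schema)."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flights ("
        "id INTEGER PRIMARY KEY, origin TEXT, dest TEXT, dep_delay INTEGER)"
    )
    airports = ["ORD", "DSM", "ATL", "LAX", "JFK"]
    rows = []
    for _ in range(n_rows):
        # Pick two distinct airports so origin and destination differ.
        origin, dest = rng.sample(airports, 2)
        rows.append((origin, dest, rng.randint(-10, 120)))
    conn.executemany(
        "INSERT INTO flights (origin, dest, dep_delay) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return conn


def subsample_mean_delay(conn, sample_size):
    """Estimate the mean departure delay from a simple random subsample."""
    query = """
        SELECT COUNT(*), AVG(dep_delay)
        FROM (SELECT dep_delay FROM flights ORDER BY RANDOM() LIMIT ?)
    """
    n, mean_delay = conn.execute(query, (sample_size,)).fetchone()
    return n, mean_delay


conn = build_toy_flights()
n, mean_delay = subsample_mean_delay(conn, 500)
print(f"sampled {n} flights, mean departure delay {mean_delay:.1f} min")
```

`ORDER BY RANDOM() LIMIT ?` is the simplest way to draw a random subsample in SQLite, but it sorts the entire table, so on a database with more than 120 million rows a different sampling strategy would likely be needed.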