Proceedings





Invited Talks

Topic 4: Statistics education at the post-secondary (tertiary) level

This topic addresses the modern requirements of data analysis, and the role of context, in statistics education. Various new approaches to practical problems have been developed in recent years: resampling, Bayesian inference, nonparametric smoothing, computer-intensive techniques, multivariate software, and data mining, among others. These innovations have been made accessible to a wide variety of researchers and professionals outside the statistics profession. The topics in this session have been suggested to facilitate our involvement in the modernization of statistics curricula.



Session 4A: A taxonomy of statistics courses


4A1: Banishing the theory-applications dichotomy from statistics education

Larry Weldon   Simon Fraser University, Canada

The math-stat versus applications dichotomy in statistics courses has had a regressive influence on the modernization of statistics education. Statistics theory involves much more than mathematics, and safe application of statistical methods requires an understanding of the theory behind the methods. Courses which focus on theory alone, or on applications alone, lack the linkages needed for a useful knowledge of the subject. But while all statistics courses should include guided experience in applying statistical theory in application contexts, the choice of contexts should reflect the needs of the particular group of students targeted. In this paper, the implications of a context-dependent taxonomy of courses are explored. This apprenticeship approach has advantages for student motivation and for the authenticity of student learning.

Paper


4A2: Accommodating specialists and non-specialists in statistics courses

Kevin Keen   University of Northern British Columbia, Canada

Seven upper-division statistics courses in support of a new Minor in Statistics have been developed, to be taught by two statisticians at the University of Northern British Columbia. The courses have been designed so that non-specialist undergraduate and postgraduate students can enrol. Five undergraduate courses are paired with parallel courses for postgraduate credit in a proposed Master of Science to be offered by the School of Business and a proposed Health Sciences doctoral program. Through discussions with the curriculum committees for these programs, the course prerequisites became the calculus and linear algebra courses taught to non-science majors. This requirement generally exceeds what is expected for typical non-specialist courses but is less than is usual for specialist courses. The R statistical software package and the fundamentals of graphical display will be incorporated throughout this new curriculum.

Paper


4A3: Specialized basic courses for engineering students: a necessity or a nuisance?

Lena Zetterqvist   Lund University, Sweden

Basic statistics courses for engineering students often focus on general applications in engineering, using course literature intended for ‘engineers’. However, these students are not a homogeneous group, and the differences among engineering programmes appear to be increasing. One way to meet the challenge of the diversity of needs within a discipline is to adapt the courses for different programmes. This could also be a way of increasing students’ motivation. During the past decade, several of the statistics courses given at the Faculty of Engineering at Lund University have been specialized with regard to syllabi, applications and teaching methods. We discuss the factors involved in specialized course development, as well as the challenging implications for the teachers and department involved.

Paper




Session 4B: Less parametric methods in statistics


4B1: The use of statistical software to teach nonparametric curve estimation: from Excel to R

Ricardo Cao   University of La Coruña, Spain
Salvador Naya   University of La Coruña, Spain

The advantages of using R and Excel for teaching nonparametric curve estimation are presented in this paper, illustrated by means of several well-known data sets. Computation of histogram and kernel density estimators, as well as kernel and local polynomial regression estimators, is presented using Excel and R. Interactive changes in the sample and the smoothing parameter are illustrated using both tools. R incorporates sophisticated routines for crucial issues in nonparametric curve estimation, such as smoothing parameter selection. The paper concludes by summarizing the relative merits of the two tools for teaching nonparametric curve estimation and by presenting RExcel, a free add-in for Excel that can be downloaded from the R distribution network.
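The kernel density estimator discussed above has a very compact form: an average of rescaled kernels centred at the observations. As a rough sketch (in Python rather than the R/Excel used in the paper; the data, bandwidth and function name are purely illustrative):

```python
import numpy as np

def kernel_density(x_grid, data, h):
    """Gaussian kernel density estimate: (1/nh) * sum_i K((x - X_i) / h)."""
    u = (x_grid[:, None] - data[None, :]) / h
    k = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)   # standard normal kernel
    return k.mean(axis=1) / h

rng = np.random.default_rng(0)
data = rng.normal(0, 1, 200)                        # simulated sample
grid = np.linspace(-4, 4, 81)
dens = kernel_density(grid, data, h=0.4)            # h is the smoothing parameter
```

The bandwidth h is the smoothing parameter whose interactive variation the paper demonstrates; its data-driven selection is the kind of crucial issue for which R provides sophisticated routines.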

Paper


4B2: On teaching bootstrap confidence intervals

Joachim Engel   University of Education Ludwigsburg, Germany

The basis for most inferential procedures is the idea of a sampling distribution. Computer simulation lets students gain experience with and intuition for this concept. The bootstrap can reinforce that learning. While today the bootstrap belongs to the toolkit of any professional statistician, its vast pedagogical potential is still in the process of being discovered. We discuss several bootstrap methods to compute confidence intervals from a teaching perspective.
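Among the bootstrap confidence interval methods the paper discusses, the percentile interval is the simplest to show in class: resample the observed data with replacement, recompute the statistic each time, and read off quantiles of the resulting bootstrap distribution. A minimal sketch (Python, with illustrative names and settings):

```python
import numpy as np

def bootstrap_ci(sample, stat, n_boot=5000, level=0.95, seed=1):
    """Percentile bootstrap confidence interval for a statistic."""
    rng = np.random.default_rng(seed)
    n = len(sample)
    # Resample with replacement and recompute the statistic each time
    reps = np.array([stat(rng.choice(sample, size=n, replace=True))
                     for _ in range(n_boot)])
    alpha = (1 - level) / 2
    return np.quantile(reps, alpha), np.quantile(reps, 1 - alpha)

rng = np.random.default_rng(0)
sample = rng.exponential(scale=2.0, size=100)
lo, hi = bootstrap_ci(sample, np.mean)
```

Other bootstrap intervals (basic, studentized, BCa) differ only in how the resampled statistics are converted into interval endpoints, which makes the family a natural teaching progression.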

Paper


4B3: Exploring data with non- and semiparametric models

Marlene Müller   Beuth University of Applied Sciences, Germany

Today the use of exploratory and graphical techniques to analyze data is practically standard, and with R (www.R-project.org) the appropriate software tools are available to everyone. We address in particular kernel density estimation and non- and semiparametric kernel regression techniques, methods that on the one hand can help to explore data and on the other hand may assist in finding appropriate parametric models for fitting data. We discuss how to introduce these methods in class and show some examples using R.
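A minimal sketch of the kind of kernel regression referred to above is the Nadaraya-Watson estimator, a locally weighted average of the responses (Python rather than the R used in the paper; the simulated data and bandwidth are illustrative):

```python
import numpy as np

def nadaraya_watson(x_grid, x, y, h):
    """Kernel regression: m_hat(x) is a Gaussian-weighted average of the y_i."""
    u = (x_grid[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * u**2)                 # unnormalized Gaussian weights
    return (w * y).sum(axis=1) / w.sum(axis=1)

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 2 * np.pi, 150))
y = np.sin(x) + rng.normal(0, 0.3, x.size)   # noisy nonparametric signal
grid = np.linspace(0.5, 2 * np.pi - 0.5, 50)
fit = nadaraya_watson(grid, x, y, h=0.3)
err = np.max(np.abs(fit - np.sin(grid)))     # recovery of the true curve
```

The bandwidth h again controls the amount of smoothing, which is what connects this estimator to the density estimation case in the classroom.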

Paper




Session 4C: Methods for ordinal data analysis


4C1: Teaching: a way of implementing novel statistical methods for ordinal data to researchers

Elisabeth Svensson   Swedish Business School, Sweden

The use of questionnaires, rating scales and other kinds of ordered classifications is unlimited and interdisciplinary, so it can take a long time before novel statistical methods presented in statistical journals reach researchers in the applied sciences. Teaching is therefore an effective way of introducing novel methods to researchers at an early stage. Assessments on scales produce ordinal data having rank-invariant properties only, which means that suitable statistical methods are nonparametric and often rank-based. These limited mathematical properties have been taken into account in my research on the development of statistical methods for paired ordinal data. The aim is to present a statistical method for paired ordinal data that has been successfully introduced to researchers from various disciplines, together with statisticians, attending interactive problem-solving courses in biostatistics.

Paper


4C2: Fitting transition models to longitudinal ordinal response data using available software

Mojtaba Ganjali   Shahid Beheshti University, Iran

In many areas of medical and social research, there has been increasing use of repeated ordinal categorical response data in longitudinal studies. Many methods are available to analyze complete and incomplete longitudinal ordinal responses. In this paper a general transition model is presented for analyzing complete and incomplete longitudinal ordinal responses. How one may obtain Maximum Likelihood (ML) estimates of the transition probabilities using existing software is also illustrated. The approach is applied to a real data set, for which two important results are underlined: (1) some transition probabilities may be estimated to be zero, and (2) the model for the current response, which conditions on the previous response, may reduce the effects of some covariates that had previously been strongly significant.
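In the simplest (saturated, first-order, covariate-free) case, the ML estimates of the transition probabilities are just row-normalized transition counts. A toy sketch (Python, with invented sequences; the paper's actual model also conditions on covariates and handles incompleteness):

```python
import numpy as np

# Invented sequences of ordinal responses (3 categories) over 4 time points
sequences = [[0, 1, 1, 2], [1, 1, 2, 2], [0, 0, 1, 1], [2, 2, 2, 1]]
K = 3

# Count observed one-step transitions prev -> curr
counts = np.zeros((K, K))
for seq in sequences:
    for prev, curr in zip(seq, seq[1:]):
        counts[prev, curr] += 1

# ML estimate: row-normalized counts (rows with no transitions stay zero)
row = counts.sum(axis=1, keepdims=True)
P = np.divide(counts, row, out=np.zeros_like(counts), where=row > 0)
```

This toy example already shows the paper's first point: a transition never observed in the data (here 0 to 2) is estimated to have probability zero.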

Paper


4C3: An illustration of multilevel models for ordinal response data

Ann A O’Connell   Ohio State University, United States

Variables measured on an ordinal scale may be meaningful and simple to interpret, but their statistical treatment as response variables can create challenges for applied researchers. When research data are obtained through natural hierarchies, such as from children nested within schools or classrooms, clients nested within health clinics, or residents nested within communities, the complexity of studies examining ordinal outcomes increases. The purpose of this paper is to present an application of multilevel ordinal models for the prediction of proficiency data. Implications for teaching and learning of multilevel ordinal analyses are discussed.

Paper




Session 4D: Innovations in teaching statistics at the tertiary level


4D1: Real-life module statistics: a happy Harvard experiment

Xiao-Li Meng   Harvard University, United States
Kari Lock   Harvard University, United States

Five years ago, a discussion ensued over wine about how to make learning statistics a “happy” experience. This turned into many discussions over dinners and wine, and the formation of the “happy team”: a team of faculty and grad students dedicated to creating the course “Real-Life Statistics: Your Chance for Happiness (or Misery)”. The course is module based, featuring modules such as “Romance”, “Wine and Chocolate”, “Finance”, “Medical”, and more. We’ve taught this course at Harvard three times; twice as a second level course and once with no prerequisites. Here we discuss the team approach to creating a course, the module approach to teaching statistics, and the happiness (and misery) involved both for us and our students.

Paper


4D3: Enriching statistics courses with statistical diversions

Eric Sowey   University of New South Wales, Australia
Peter Petocz   Macquarie University, Australia

Memorable teaching—teaching that makes memorable not only the teacher but also what the student has learned—is necessarily strong in both the cognitive and the affective domain. University statistics courses make heavy cognitive demands of students: these demands become more intense as the level of study increases. In such a setting, many students value an opportunity to break away for a while from the focus on technical aspects, to engage with a thematically-related challenge that the teacher has designed to provoke them or to pique their curiosity. We explore the design of such statistical diversions to enrich undergraduate courses at various levels. Diversions seem to be little recognized or utilized in statistics education, yet they can reinvigorate students’ interest in their study of the subject. Diversions can also produce clarifying perspectives and other enlightening insight—characteristic cognitive strengths of memorable teaching.

Answers to the statistical diversions presented as examples in this paper are available at:
www.stat.mq.edu.au/our_staff/staff_-_alphabetical/staff/peter_petocz/publications/

Paper


4D4: Stats2: An applied statistics modeling course

Jeffrey Witmer   Oberlin College, United States

The typical Stats1 introductory course at the tertiary level covers one-sample and two-sample inference and ends with regression or perhaps one-way ANOVA. We propose that a second course in statistics be built around the idea of statistical modeling (“data = model + error”), beginning with a review of simple linear regression and continuing through two-way ANOVA and logistic regression. Unlike the situation in Stats1, students in Stats2 require access to powerful and flexible computing, which suggests that R be used to fit models. We discuss a Stats2 course that includes both traditional, normal-theory based, inference and randomization tests.

Paper




Session 4E: Heterogeneity of student levels


4E1: Teaching critical thinking to first year university students

Jennifer Brown   University of Canterbury, New Zealand
Irene David   University of Canterbury, New Zealand

We discuss a major change in the way we teach our first-year statistics course at the University of Canterbury. In 2008 we redesigned the course with an emphasis on teaching critical thinking. The catalyst for change was the recognition that most of the students take the course for general knowledge and in support of other majors, and very few are planning to major in statistics. We identified the essential aspects of a first-year statistics course, given this student mix, by focusing on a simple question: “Given this is the last chance you have to teach statistics, what are the essential statistics skills your bank manager, car salesman or primary school teacher needs to have?” We have moved from thinking about the statistics skills needed by a statistician to the skills needed by a manager in today’s society. We have changed the way we deliver the course, with less emphasis on lectures and more on computer-based tutorials, Excel, computer skills testing, and written assignments.

Paper


4E2: Medical students and statistics: challenges in teaching, learning and assessment

Philip Sedgwick   St George’s University of London, United Kingdom

Over the last fifteen years, graduate-entry programmes for admission to undergraduate MBBS courses have become increasingly popular in the UK. At St George’s, University of London, students entering the graduate-entry MBBS programme have a first degree in any discipline, whilst some have further degrees. Challenges typically arise when teaching and learning medical statistics. Students do not necessarily expect to study statistics at medical school, whilst confidence and expertise vary depending on the student’s previous degree and how long ago they studied mathematics. The presentation will focus on how the teaching of medical statistics, when integrated with the basic and clinical sciences, plus learning through group work, can help address heterogeneity in students. In particular, students are encouraged to be independent, identifying and meeting their own individual learning needs in order to progress.

Paper


4E3: An overview of techniques used in the teaching and assessing of knowledge and application of statistical skills across undergraduate levels

Rosie McNiece   Kingston University, United Kingdom

Teaching statistical methodology and its application requires a varied approach depending on the abilities and level of expertise of students. This paper outlines some of the varying methods used in teaching and assessing statistics at different levels of an undergraduate degree program in Statistics. The discussion includes teaching methods used to instil, in a student body with varied academic backgrounds, a thorough understanding of the basic concepts that underpin statistical analysis, and assessment strategies designed to provide frequent monitoring of students’ progress, for both students and instructors. Different approaches to the delivery and assessment of more advanced methodologies and concepts, including approaches to data analysis, problem-solving skills and practical applications incorporating statistical IT literacy, are also reviewed.

Paper




Session 4F: Sensible use of multivariate software


4F1: Effect sizes and confidence intervals for multivariate analysis: how complete are published accounts of research in psychology?

Fiona Fidler   La Trobe University, Australia
Lisa L Harlow   University of Rhode Island, United States
Geoff Cumming   La Trobe University, Australia
Jacenta Abbott   La Trobe University, Australia

Effect sizes (ESs) and confidence intervals (CIs) are widely advocated in psychology and other disciplines. To date most expository articles have focused only on univariate analyses, despite there being similarly good reasons for reporting and interpreting ESs and CIs following multivariate analyses. We surveyed articles published in leading psychology journals in 2008 to discover: a) which multivariate methods were in common use, b) what types of ESs accompany typical multivariate reports, c) whether CIs on ESs were routinely reported, d) whether error bars are reported in figures, and e) what software authors were using to conduct these analyses. Our results revealed varying traditions of ES reporting for different multivariate techniques, but CIs were rare in all cases. These results highlight areas for software development and for increased educational efforts.

Paper


4F2: A sampling of analyses and software use for cluster randomized trials over the last decade

Elly Korendijk   Utrecht University, The Netherlands
Joop Hox   Utrecht University, The Netherlands

In experimental research it is not always possible or desirable to randomize at the individual level; instead, clusters of individuals are assigned to conditions. The clusters may be existing groups, like classes, families or communities, or may be established for the purpose of the trial, like therapy groups; in either case, subjects within clusters are likely to respond more alike than subjects between clusters. Due to this dependency, it is necessary to use hierarchical linear models, also referred to as multilevel models, when analyzing data from cluster randomized trials. A review of the literature of the last decade summarizes the models with which data from cluster randomized trials have been analyzed and the software packages that have been used.

Paper


4F3: Applying idiographic research methods: two examples

Wayne Velicer   University of Rhode Island, United States

Idiographic methods focus on time-dependent variation within a single individual (intra-subject variability) in contrast to group-level relationships (inter-subject variability) that may yield different results. Equivalent results occur only if two, probably unrealistic, conditions specified by Ergodic Theorems are met: (1) Each trajectory follows the same dynamic laws, and (2) Each trajectory has equal mean levels and serial dependencies. Two studies illustrate the difficulty of meeting ergodic conditions and unique types of research questions that can be addressed by idiographic methods. The first example involves longitudinal smoking patterns. The second study involves reactions of autistic students to environmental stressors. Neither study supports the ergodic conditions. Both studies illustrate research questions unique to idiographic methods, which produce information that is likely to be distinct from that provided by group-based methods.

Paper


4F4: Exploratory factor analysis in Mplus, R and SPSS

Sigbert Klinke   Humboldt University of Berlin, Germany
Andrija Mihoci   Humboldt University of Berlin, Germany
Wolfgang Härdle   Humboldt University of Berlin, Germany

In teaching, factor analysis and principal component analysis are often used together, although they are quite different methods. We first summarise the similarities and differences between the two approaches. From submitted theses it appears that students have difficulty seeing the differences. Although books and online resources mention some of the differences, their accounts are incomplete. A view oriented either to the similarities or to the differences is reflected in software implementations. We therefore look at the implementations of factor analysis in Mplus, R and SPSS, and finish with some conclusions for the teaching of Multivariate Statistics.

Paper




Session 4G: Learning statistics through projects


4G1: Incorporating a research experience into an early undergraduate statistics course

Shonda Kuiper   Grinnell College, United States

This paper describes guided interdisciplinary projects for early undergraduate courses that encourage students to experience the role of a research scientist and to understand how statistics is used in advancing scientific knowledge. An inquiry-based introductory lab activity walks students through a relatively advanced statistical topic. After the introductory lab, students conduct a research project that involves reading primary journal articles, developing their own research hypothesis, conducting a study, and presenting their results. The global warming hockey stick controversy described in this paper is one example of several intriguing real-world projects that can demonstrate the intellectual content and broad applicability of statistics as a discipline.

Paper


4G2: Student discovery projects in data analysis

Mike Forster   University of Auckland, New Zealand
Helen MacGillivray   Queensland University of Technology, Australia

At Queensland University of Technology, students have been doing self-selected group projects in data analysis as a major component of their course assessment for well over a decade. These projects provide experiential learning of statistical problem-solving and of the data investigation cycle (plan, collect, process, discuss) in topics of their choice. Datasets must involve at least four, and preferably more, variables. The University of Auckland is introducing similar discovery projects in a second-year data analysis course in 2010. In this paper, we discuss the background and motivation for such projects; experiences in setting up, running, assessing and administering self-selected student project work; the types of projects students choose to do and the guidance they receive; the effects of the students’ work and reports; our experiences with project students; and finally, our assessments of the value to students of doing self-selected group project work.

Paper


4G3: Formulating statistical questions and implementing statistics projects in an introductory applied statistics course

Katherine Halvorsen   Smith College, United States

Students taking introductory statistics in the Mathematics and Statistics Department at Smith College, a liberal arts college for women in Northampton, Massachusetts, conduct independent research projects as part of their coursework. Instructions for the project, given to students with the syllabus, include a list of deliverables: research proposal, peer reviews, data, progress reports, draft analyses, and due dates for each. Students, working in groups of two to four, write a research proposal that includes their research question and an outline of their data collection and analysis methods. Before students begin data collection, research proposals must be approved by the instructor(s) and, when human or animal subjects are used, by the Smith College Institutional Review Board. Students carry out the research during the term and present their work in a poster session, oral presentation, or a research paper at the end of term.

Paper




Session 4H: Integrating consulting with graduate education


4H1: Experiences with research teams comprised of graduate students, faculty researchers, and a statistical consulting team

Heather Smith   California Polytechnic State University, United States
John Walker   Cal Poly State University, United States

Each year at Cal Poly, statistics faculty provide consulting services to over 100 non-statistician graduate students and research faculty from across the university as part of our Statistical Consulting Service. In addition, all undergraduate/tertiary statistics majors take a capstone course titled ‘Statistical Communication and Consulting’. This course is a blend of the theoretical and practical aspects of statistical consulting, helping our statistics majors develop the tools necessary to participate successfully as statistical consultants. Following this training, many of these statistics majors work collaboratively with clients in their research efforts. These research teams often comprise a faculty statistician, a trained statistics major, non-statistician graduate students, and research faculty. We report on the key aspects of the statistical consulting course and provide examples of these research efforts, emphasizing the learning benefits of this arrangement to both the statistics majors and the graduate students.

Paper


4H2: Communication in statistical consultation

Wessel Hendrik Moolman   University of Kwazulu-Natal, South Africa

Almost all statisticians get requests for help with data analysis from clients in other fields. In order to provide such help it is essential that the statistician understands the client’s problem prior to solving it and is able to explain the answer to the problem to the client once it has been solved. From the client’s point of view, the problem should be properly explained to the statistician and the statistician’s solution should be understood so that it can be used. This need for understanding the problem and explaining the solution leads to a series of communications between the statistician and client. The purpose of this study is to explain how this communication unfolds and the issues that might influence it.

Paper


4H3: Lessons we have learned from post-graduate students

Sue Finch   University of Melbourne, Australia
Ian Gordon   University of Melbourne, Australia

The Statistical Consulting Centre at The University of Melbourne has been providing a postgraduate consulting service for many years. Each year, the service is used by about 200 students who come from a wide range of disciplines. Consultants can assist these highly motivated students with any part of the research cycle, from refining a research question and developing a suitable design to presenting and communicating findings. The rich and varied consulting interactions with these students provide us with an opportunity to observe applied researchers using statistical knowledge to find out about things that matter to them. We report on a survey of consulting sessions and identify important aspects of statistical thinking that commonly arise across the research cycle. These observations help us develop approaches to graduate education that are applied and contextually relevant.

Paper




Session 4I: Integrating Bayesian methods with traditional statistics education


4I1: Psychology students’ understanding of elementary Bayesian inference

Carmen Díaz   University of Huelva, Spain

We explore the possibility of introducing basic ideas of Bayesian inference to undergraduate psychology students and report on the outcomes of our training. We present empirical results on how 78 psychology students learned the basics of Bayesian inference in a 12-hour teaching experience, which included Bayes’ theorem and inference on proportions (discrete and continuous cases) and means. Learning was assessed through a questionnaire that included multiple choice items and open-ended problems that were solved with the help of computers. In this paper we report part of our results, which show that a majority of students reached a good intuitive understanding of most teaching goals, even with limited teaching time. We also remark that the main problems detected do not relate directly to Bayesian inference. Difficulties in distinguishing a conditional probability from its inverse, which have been repeatedly pointed out in the literature, arose among our students and influenced general performance.
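The discrete case of inference on a proportion can be conveyed in a few lines: a prior over candidate values of the proportion is updated by the binomial likelihood via Bayes’ theorem. A sketch (Python, with an invented data set of 7 successes in 10 trials and a uniform prior over a grid of candidate values):

```python
from math import comb

# Candidate proportions and a uniform (discrete) prior over them
thetas = [0.1 * k for k in range(1, 10)]
prior = [1 / len(thetas)] * len(thetas)

# Observed data: 7 successes in 10 trials; binomial likelihood at each theta
successes, n = 7, 10
lik = [comb(n, successes) * t**successes * (1 - t)**(n - successes)
       for t in thetas]

# Bayes' theorem: posterior proportional to prior times likelihood
unnorm = [p * l for p, l in zip(prior, lik)]
posterior = [u / sum(unnorm) for u in unnorm]
best = thetas[posterior.index(max(posterior))]   # posterior mode
```

With a uniform prior the posterior mode coincides with the grid value closest to the observed proportion, a point students can verify by hand for small grids.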

Paper


4I2: Comparing the Bayesian and likelihood approaches to inference: a graphical approach

Bill Bolstad   University of Waikato, New Zealand

Both likelihood inference and Bayesian inference arise from a surface defined on the inference universe, which is the Cartesian product of the parameter space and the sample space. Likelihood inference uses the sampling surface, which is a probability distribution in the sampling dimension only. Bayesian inference uses the joint probability distribution defined on the inference universe. The likelihood function and the Bayesian posterior distribution come from cutting the respective surfaces with a (hyper)plane parallel to the parameter space and through the observed sample values. Unlike the likelihood function, the posterior distribution will always be a probability distribution. This is responsible for the different choices of estimators, and for the different ways the two approaches have of dealing with nuisance parameters. In this paper we present a graphical approach for teaching the difference between the two approaches.
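The geometric picture described here can be mimicked numerically: build the joint surface on a parameter-by-sample grid and cut it at the observed sample value. A Python sketch for a binomial example (the grid and flat prior are our own illustrative choices, not the paper's):

```python
import numpy as np
from math import comb

# Inference universe: parameter grid (rows) x sample space (columns)
thetas = np.linspace(0.01, 0.99, 99)
n = 10
ys = np.arange(n + 1)
sampling = np.array([[comb(n, y) * t**y * (1 - t)**(n - y) for y in ys]
                     for t in thetas])               # sampling surface p(y | theta)

prior = np.ones_like(thetas) / thetas.size           # flat prior over the grid
joint = prior[:, None] * sampling                    # joint surface p(theta, y)

y_obs = 7
likelihood = sampling[:, y_obs]                      # cut of the sampling surface
posterior = joint[:, y_obs] / joint[:, y_obs].sum()  # cut of the joint, normalized
```

The posterior slice normalizes to 1 while the likelihood slice does not, which is precisely the distinction the paper draws between the two cuts.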

Paper


4I3: The very beginning of a class on inference: classical vs Bayesian

Lisbeth Cordani   Maua Institute of Technology, Brazil

Although the original Bayesian theory was settled in the 18th century, computational difficulties meant that only in the last 20 years has the Bayesian method grown substantially. This may explain why only the classical approach has been offered at the educational level. In our view, it is important to present both approaches to undergraduate students, enlarging their vision not only of statistical tools but also of the philosophies inherent in those schools. Both approaches offer tools to solve practical problems. The students may have quite different backgrounds and come from different courses, and can be encouraged to select, in the future, the best method for their purposes. Our suggestion is to present the main ideas of inference starting with some issues about conditional logic and its influence on inferential conclusions, comparing the classical and Bayesian approaches. Examples will be presented.

Paper


4I4: Teaching young grownups how to use Bayesian networks

Stefan Krauss   University of Regensburg, Germany
Georg Bruckmaier   University of Regensburg, Germany
Laura Martignon   Ludwigsburg University of Education, Germany

A Bayesian network, or directed acyclic graphical model, is a probabilistic graphical model that represents conditional dependencies and conditional independencies among a set of random variables. Each node is associated with a probability function that takes as input a particular set of values of the node’s parent variables and gives the probability of the variable represented by the node, conditioned on the values of its parents. Links represent probabilistic dependencies, while the absence of a link between two nodes denotes a conditional independence between them. Bayesian networks can be updated by means of Bayes’ theorem. Because Bayesian networks are a powerful representational and computational tool for probabilistic inference, it makes sense to instruct young grownups in their use and even provide familiarity with software packages like Netica. We present introductory schemes with a variety of examples.
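The node-level update can be shown on the smallest possible network, two binary nodes A and B with a link from A to B: observing B revises the belief about A via Bayes’ theorem. A sketch (Python, with invented conditional probability table entries; packages like Netica automate this propagation for larger graphs):

```python
# Two-node network A -> B, specified by a prior and a conditional table (CPT)
p_a = {True: 0.01, False: 0.99}            # P(A)
p_b_given_a = {True: 0.9, False: 0.05}     # P(B = True | A)

# Evidence: B = True is observed. Bayes' theorem gives P(A = True | B = True).
evidence = sum(p_a[a] * p_b_given_a[a] for a in (True, False))
posterior_a = p_a[True] * p_b_given_a[True] / evidence
```

Even this two-node case makes the pedagogical point: a rare cause (prior 0.01) remains fairly unlikely (posterior about 0.15) despite strong evidence, because the false-positive path through the more common state dominates.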

Paper




Session 4J: Sampling populations


4J1: Teaching survey sampling with the “sampling” R package

Alina Matei   University of Neuchâtel, Switzerland
Yves Tillé   University of Neuchâtel, Switzerland

The R language is a free software environment for statistical computing and graphics. This software is complemented by almost 2000 packages developed by researchers. R is thus the most complete statistical software environment. At the University of Neuchâtel, Alina Matei and Yves Tillé have developed the ‘sampling’ R package that contains a set of tools for selecting and calibrating samples. Modern procedures, like the cube algorithm for selecting balanced samples or calibration with several distances and bounds, are implemented. This package is very easy to use and is thus an interesting tool for teaching. Simulations can be done very quickly. Different methods of sampling and calibration can be run and compared. We will present several exercises that can be done quickly and that allow a practical application of the sampling theory.

Paper


4J2: The use of Monte Carlo simulations in teaching survey sampling

Anne Ruiz-Gazen   Toulouse School of Economics, France
Camelia Goga   University of Bourgogne, France

Our objective is to illustrate the use of simulations in the teaching of a graduate course on survey sampling theory. The students come from a master’s degree in statistics and have a strong mathematical background. The course consists essentially of theoretical lectures and exercises, but it also contains some computer-based training, which appears to be very helpful for students in understanding the theoretical concepts taught. For the computer-based training, real populations from an official French survey database are used, and students are asked to carry out Monte Carlo simulations. The simulations consist of generating a large number of samples according to different sampling designs and estimating some finite population parameters by different estimation methods. Many properties of the sampling designs and of the estimation methods can be recovered using simulations, and this point will be illustrated in more detail.
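The kind of exercise described can be sketched in a few lines: repeatedly draw samples under two designs from a fixed finite population and compare the resulting estimator distributions. A Python sketch with a synthetic two-stratum population (the official French data are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(3)
# Finite population with two strata of very different means
pop = np.concatenate([rng.normal(10, 2, 600), rng.normal(30, 2, 400)])
N, n = pop.size, 100
true_mean = pop.mean()

def srs_mean(rng):
    # Simple random sampling without replacement
    return rng.choice(pop, n, replace=False).mean()

def stratified_mean(rng):
    # Stratified sampling with proportional allocation (60 / 40)
    s1 = rng.choice(pop[:600], 60, replace=False).mean()
    s2 = rng.choice(pop[600:], 40, replace=False).mean()
    return 0.6 * s1 + 0.4 * s2

reps_srs = np.array([srs_mean(rng) for _ in range(2000)])
reps_str = np.array([stratified_mean(rng) for _ in range(2000)])
```

Both designs recover the true mean on average, but the Monte Carlo spread of the stratified estimator is visibly smaller, which is exactly the design property students are meant to rediscover by simulation.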

Paper


4J3: Understanding sample survey theory with the “replicates-duplicates” approach

Pierre Lavallée   Statistics Canada, Canada

In sampling, a sample is selected from a finite population in order to produce estimates for this population. We want these estimates to be unbiased and as precise as possible. Good precision corresponds to the situation where different samples produce about the same estimate; in other words, we want the replicates (i.e., the results of the sample selection process) to behave like duplicates. The use of auxiliary information (e.g., through a linear regression estimator) also helps in making replicates behave like duplicates, and the concept of superpopulations helps alleviate some emerging conceptual problems. Based on this “replicate-duplicate” approach, we can develop a complete philosophy of teaching sampling theory in which, at the start, formulas are set aside in order to concentrate on developing the intuitive aspects of sampling theory.
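The role of auxiliary information can be demonstrated by simulation: across repeated samples, a regression estimator that exploits the known population mean of an auxiliary variable varies far less than the plain sample mean, so its replicates come much closer to being duplicates. A Python sketch with synthetic data (all names and constants are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
N, n = 1000, 50
x = rng.uniform(10, 50, N)            # auxiliary variable, known for all units
y = 3 * x + rng.normal(0, 5, N)       # study variable, correlated with x
X_bar = x.mean()                      # known population mean of x

def one_sample(rng):
    idx = rng.choice(N, n, replace=False)
    xs, ys = x[idx], y[idx]
    mean_est = ys.mean()                              # plain expansion estimator
    b = np.cov(xs, ys)[0, 1] / xs.var(ddof=1)         # fitted slope
    reg_est = ys.mean() + b * (X_bar - xs.mean())     # regression estimator
    return mean_est, reg_est

reps = np.array([one_sample(rng) for _ in range(1000)])
```

Comparing the two columns of `reps` shows the regression estimator's replicates clustered tightly around the population mean, which is the intuition the "replicate-duplicate" approach builds before any formulas appear.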

Paper