Proceedings





Invited Talks

Topic 5 : Assessment in statistics education

Assessment plays a vital role in the teaching/learning process in statistics education, and its content, methods, and results influence learners, teachers, programs, institutions, and other stakeholders. Yet, assessment poses many challenges in our discipline. Assessment needs to mirror changing instructional goals which involve understanding of context- and data-based arguments, statistical reasoning, and communication skills, while requiring students to integrate abstract ideas and linked conceptual networks with procedural and technical skills. Our diverse learners possess a wide range of ages, backgrounds, and preconceptions in data analysis and probability, as well as diverse goals and perceptions regarding how they will apply statistical knowledge. Further, learners of all ages are increasingly expected to be able to use various technologies and understand their role and promise as part of accessing, learning, analyzing, or communicating about statistical issues, and in parallel new technologies expand the realm of feasible assessment techniques. Lastly, current emphasis on accountability demands reliable and valid assessments that can be applied on a larger scale or at the program level. With these issues in mind, session topics will describe and take a critical look at current and recommended practices, explore challenges and future directions, and present research results and suggested improvements that can contribute to the reliability, validity, practicality, fairness, and overall value of assessment procedures in statistics education.



Session 5A: Assessing progress and performance with authentic and alternative assessment techniques


5A1: Assessment within Census at School: a pilot program in the United States

Martha Aliaga   American Statistical Association, United States
Rebecca Nichols   American Statistical Association, United States

Census at School is an internationally developed program for grades 4–12 operating in the United Kingdom, New Zealand, Australia, Canada, and South Africa. By collecting data about their classmates, students learn data analysis and statistical concepts. Introducing the program in the United States now is particularly opportune, as the decennial census is being conducted this year. The U.S. Census at School evaluation involves two parts: implementation analysis and impact analysis. The first stage, implementation analysis, will cover teacher training, rating of trainers, video class instruction, survey of teachers, and time by activity. In this presentation, we will focus on aspects of the study that are under way, including training and observing teachers.

Paper


5A2: Contrasting cases: the “B versus C” assessment tool for activating transfer

Rachelle Hackett   University of the Pacific, United States

This paper focuses on an assessment method that has been employed on exams given to education students in an applied graduate-level statistics course, but that could be incorporated as a class activity or given as homework in undergraduate or graduate courses in other fields. Students review the work of two presumably competent statistical consultants, labeled “B” and “C”, who have each attempted to address the same research hypothesis using the same data. After contrasting the cases, students write letters to whichever consultant (or both) they think is in error, explaining the nature of the mistakes. Sample “B vs. C” problems are presented, including descriptions of the consultants’ work and the key features on which the scoring of student answers focuses. In addition to identifying theoretical underpinnings (especially Bransford & Schwartz, 1999), student reactions to this assessment method are shared.

Paper


5A3: Assessing pre-service teachers’ conceptions of randomness through project work

Carmen Batanero   University of Granada, Spain
Pedro Arteaga   University of Granada, Spain
Blanca Ruiz   Institute of Technology Monterrey, Mexico
Rafael Roa   Institute of Technology Monterrey, Mexico

In this paper we present results of assessing conceptions of randomness in a sample of 215 prospective primary school teachers in Spain. Data were collected as part of a statistical project in which the teachers first collected data from a classical experiment designed to assess their intuitions about randomness, then analysed these data and produced a report in which they had to justify their conclusions. Conceptions are analysed first from the data collected in the experiments and then from the teachers’ written reports. Results show a good perception of expected values in a series of experiments but poor conceptions of variation and independence in random sequences. These results also indicate a need for better statistics preparation of these teachers, and illustrate the usefulness of statistical projects in assessing teachers’ knowledge and improving their statistical and pedagogical knowledge.

Paper




Session 5B: Methods for large scale assessment of meaningful knowledge of statistics


5B1: The statistics items in the Brazilian National Student Performance Exam (ENADE)

Claudette Vendramini   San Francisco University, Brazil
Samantha Oliveira Nogueira   San Francisco University, Brazil
Fernanda Luzia Lopes   San Francisco University, Brazil

The National Student Performance Exam is part of the Brazilian Higher Education Evaluation System and aims to assess the acquisition of competences, the development of abilities, and knowledge considered essential to students’ education. The main aim of this exam is to analyze students’ changes and gains over their trajectory in higher education institutions. We analyzed the items of the exam using Item Response Theory, highlighting the items concerning statistics. We used data from 823,892 students in 48 knowledge areas, selected by stratified random sampling from all Brazilian courses from 2004 to 2006. The exam was composed of 40 questions. The statistics questions presented a higher level of difficulty, and significant differences by gender and career were found on the statistics items, which were the most difficult regardless of area.
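Item difficulty in IRT analyses of this kind is typically expressed through a logistic item response function. The sketch below shows a two-parameter logistic (2PL) model, a common IRT formulation; the parameter values are illustrative only and are not estimates from the ENADE data.

```python
import math

def p_correct(theta, a, b):
    """Probability that an examinee of ability theta answers an item
    correctly, given item discrimination a and difficulty b (2PL model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# A harder item (larger difficulty b) yields a lower probability of a
# correct answer at the same ability level.
easy_item = p_correct(theta=0.0, a=1.0, b=-1.0)
hard_item = p_correct(theta=0.0, a=1.0, b=1.5)
print(round(easy_item, 3), round(hard_item, 3))
```

Fitting such a model to response data yields the per-item difficulty estimates on which comparisons like the ones reported above rest.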

Paper


5B2: What do you know? Assessment beyond the traditional roles

Susan Starkings   London South Bank University, United Kingdom

The main emphasis of this paper is to look at non-traditional ways of assessing students’ work. Assessment is often described as having two purposes: one is measuring a student’s performance, to indicate how well the student is progressing and to allocate a grade or mark; the other is helping the student to learn. The National Union of Students (NUS) in the UK and other educators who study assessment often argue that there is too much emphasis on performance at the expense of aiding students’ learning. The NUS has stressed that assessment should be ‘for’ learning and not simply ‘of’ learning, and calls for more formative assessment. The aim of this paper is threefold, namely: (i) to briefly look at traditional methods; (ii) to identify other assessment methods; and (iii) to elucidate the advantages and disadvantages of these methods and their usefulness in statistics education.

Paper


5B3: Text analytic tools for the cognitive diagnosis of student writings

Tjaart Imbos   Maastricht University, The Netherlands
Ton Ambergen   Maastricht University, The Netherlands

Students can be stimulated to become active learners using a tool for active writing. At our university we developed such a tool, POLARIS. Students’ active writings about statistical concepts are valuable to both students and teachers: in their writings, students show their understanding of statistical topics. The problem then is how to interpret and score students’ writings in relation to their proficiency in statistics. In this paper, text-analytic tools are used to cluster and score sample papers from students. Two approaches are compared: a statistics-based approach, Latent Semantic Analysis (LSA), and a linguistics-based approach, natural language processing (NLP). The key features of both approaches are discussed, as well as their usability, reliability, and validity.
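The statistics-based approach named above can be sketched in a few lines. This is a generic LSA pipeline (TF-IDF term-document matrix, truncated SVD, clustering in the reduced semantic space), not the paper’s actual scoring procedure; the sample writings are invented for illustration.

```python
# Minimal LSA sketch: project short student writings into a low-dimensional
# semantic space and cluster them. Texts and parameters are illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans

writings = [
    "The sample mean estimates the center of the distribution.",
    "The mean is a measure of the center of a distribution.",
    "A p-value measures the strength of evidence against the null hypothesis.",
    "The p-value quantifies evidence against the null hypothesis.",
]

# Term-document matrix weighted by TF-IDF.
tfidf = TfidfVectorizer().fit_transform(writings)

# LSA proper: truncated SVD maps documents into a semantic space in which
# texts using related vocabulary lie close together.
semantic = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)

# Group the writings in the semantic space, e.g. by the concept addressed.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(semantic)
print(labels)
```

In practice the reduced space would be built from a large reference corpus, and a student text would be scored by its similarity to model answers in that space.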

Paper




Session 5D: The use of innovative technologies to enhance assessment of statistical knowledge


5D1: Statistics assessment: the good, the bad, and the ugly

James Nicholson   University of Durham, United Kingdom
Jim Ridgway   University of Durham, United Kingdom
Sean McCusker   University of Durham, United Kingdom

Assessment tasks and scoring schemes convey information to students and teachers about the nature of each discipline and what is valued. It is important that tasks provoke desirable classroom practices and motivate students to continue to engage with the discipline. Tasks from high-stakes national examinations are presented that are likely to have the opposite effect: most are ‘toy’ problems where a technical exercise is presented in a context which is uninteresting and oversimplified. In contrast, we show tasks that engage students in reasoning with authentic data from large-scale surveys, and that require statistical reasoning to challenge assertions about matters of fact or plausible courses of action. We believe that better tasks are an important element in addressing Hand’s (2009) concerns about the public image of statistics.

Paper


5D2: Issues for the assessment and measurement of statistical understanding in a technology-rich environment

Rosemary Callingham   University of Tasmania, Australia

As diverse technology use increases in education, a number of issues are raised for assessment. This is true in all subjects, but may be particularly pertinent to statistics because students can now use large data sets and deal with multiple variables as part of their learning experiences. The issues are of two kinds. In classrooms, how do teachers assess work that has been produced through the use of technology? Criteria developed for pencil-and-paper assessment may not be sufficient to capture the nature of students’ thinking when the burden of computational data analysis and data display is removed. Outside assessment processes are also challenged by the advent of tools such as computer adaptive testing and complex interactive data displays. The implications of using technology as part of assessment processes, both inside and outside the classroom, are explored.

Paper


5D3: Technologies for enhancing project assessment in large classes

Michael Bulmer   University of Queensland, Australia

The use of technology in statistics assessment is widespread. These uses include assessment tasks that are moderated by technology, such as formative or summative online quizzes, as well as the more fundamental empowerment of students to be able to tackle realistic data sets and more sophisticated modelling in their assessment tasks through technology. In this paper we identify and survey a third role of technology, supporting project assessment in large classes, and give two key examples of this. The first will be the use of virtual environments to engage students with statistics in context, including issues in experimental design and measurement. The second will look at using technology to enable a one-day statistics conference where each student in a class of eight hundred can give a ten-minute oral presentation on the use of statistics in scientific research.

Paper




Session 5E: Assessing statistical literacy and critical understanding of real-world messages related to statistics, probability, and risk


5E1: Assessing the interpretation of two-way tables as part of statistical literacy

Jane Watson   University of Tasmania, Australia
Erica Nathan   University of Tasmania, Australia

This paper analyses the interview responses of 29 teachers to a question based on the interpretation of a two-way table. The teachers were asked to articulate the big statistical ideas behind the problem, to suggest appropriate and inappropriate responses they would expect from their students, and then to indicate how they would respond to two selected specific responses given by students in previous student surveys. Hierarchical rubrics were developed for assessing the teachers’ pedagogical content knowledge and examples are given of teachers’ responses at each level. Implications for the statistical literacy curriculum, for the use of authentic problems, and for teacher professional learning are discussed.

Paper


5E2: It’s not what you know, it’s recognising the power of what you know: assessing understanding of utility

Janet Ainley   University of Leicester, United Kingdom
Dave Pratt   University of London, United Kingdom

Traditional approaches to assessing ‘understanding’ in mathematics and statistics education tend to focus on the two strands of procedural competence and conceptual knowledge. We take as our starting point the idea that this does not fully capture what it is to understand mathematical and statistical ideas, and suggest a third dimension of understanding which we call utility; that is, knowing why, when and how a particular idea can be used and the power which it offers. We suggest that this is a key feature of statistical literacy, without which knowledge of statistical ideas cannot be effectively applied. In this paper we draw on examples from our current and past research to explore how the assessment of understanding of utility may be approached.

Paper


5E3: Post secondary and adult statistical literacy: assessing beyond the classroom

Jennifer Kaplan   Michigan State University, United States
Justin Thorpe   Michigan State University, United States

There is no question that an informed citizenry needs to be statistically literate. Many definitions for statistical literacy have been proposed. While there are certain similarities among them, consensus has not yet been reached. This paper defines adult statistical literacy as the set of skills and knowledge used by expert consumers of statistics and then provides a potential framework, based in the literature, to describe the components of statistical literacy. The paper then illustrates an assessment method, interpreting 40 responses to an authentic task through the lens of Watson and Callingham’s Statistical Literacy Construct. Potential extensions are discussed.

Paper




Session 5F: Assessing statistical reasoning and statistical thinking


5F1: Assessing student learning about statistical inference

Beth Chance   California Polytechnic State University, United States
John Holcomb   Cleveland State University, United States
Allan Rossman   California Polytechnic State University, United States
George Cobb   Mount Holyoke College, United States

Statistical significance and p-values can be a particularly challenging topic for introductory statistics students. In an effort to assess curricular changes aimed at deepening student understanding of significance, we have developed assessment strategies to diagnose students’ conceptualization of p-value and their ability to communicate their understanding. We will present our approaches and discuss student performance after participating in randomization-based modules introducing the concept of significance.

Paper


5F2: Development of an instrument to assess statistical thinking

Auðbjörg Björnsdóttir   University of Minnesota, United States
Andrew Zieffler   University of Minnesota, United States
Joan Garfield   University of Minnesota, United States
Robert C delMas   University of Minnesota, United States

The Comprehensive Assessment of Outcomes in a First Statistics course (CAOS) test consists of 40 multiple-choice items that were judged by a group of statistics education experts in 2004 to cover important learning outcomes for a first course in statistics (delMas, Garfield, Ooms & Chance, 2007). More recently, the Guidelines for Assessment and Instruction in Statistics Education (GAISE) college report suggested several important learning goals for students enrolled in an introductory statistics course, which have been endorsed by the American Statistical Association. This paper describes the development of a new instrument to assess the desired student learning outcomes presented in the GAISE college report. It discusses the process used to select and add items, which was based not only on content analysis but also on psychometric methods (e.g., item response theory, differential item functioning).

Paper


5F3: Towards assessing understanding of prerequisite knowledge for sampling distributions

Michelle Sisto   International University of Monaco, Monaco
Tisha Hooks   Winona State University, United States
Michael Posner   Villanova University, United States
Dale Berger   Claremont Graduate University, United States

Sampling distributions play a vital role in understanding how statistical inferences are made, yet students often fail to achieve proficiency in this important topic. The goals of our project are threefold: to examine the literature on sampling distributions and on concepts that may be prerequisite knowledge for them (sampling, variability, and distribution); to develop an assessment tool that can provide instructors with timely feedback on the understanding and misconceptions that students have about sampling distributions and the prerequisite concepts; and to better understand those misconceptions in order to improve the teaching and learning of basic statistics. We developed new assessment items, and modified existing ones, to study the relationship between comprehension of sampling distributions and mastery of each prerequisite area. Open-ended questions were refined through a pilot study, and we share our experiences in developing the items.

Paper