This quasi-experiment compares student learning outcomes across three college statistics courses to investigate whether greater coverage of randomization tests explains gains in conceptual understanding of inference, after adjusting for prior knowledge and mathematical ability. The study uses a 34-item Reasoning about P-values and Statistical Significance (RPASS) scale to measure gains in students’ inferential understanding. Two of the courses are introductory: the first has limited randomization content (n1 = 55), and the second emphasizes randomization, simulation, and P-values throughout (n2 = 26). The third is a second course in statistics that reviews randomization tests at the beginning of the course (n3 = 24). Comparative results, score reliability, and changes in respondents’ correct conceptions and misconceptions are reported. Directions for future research are discussed.