Dawson (1995) presented a data set giving a population at risk and fatalities for an “unusual episode” (the sinking of the ocean liner Titanic) and discussed the use of the data set in a first statistics course as an elementary exercise in statistical thinking, the goal being to deduce the origin of the data. Simonoff (1997) discussed the use of this data set in a second statistics course to illustrate logistic regression. Moore (2000) used an abbreviated form of the data set in a chapter exercise on the chi-square test. This article describes an activity that illustrates contingency table (two-way table) analysis. Students use contingency tables to analyze the “unusual episode” data (from Dawson 1995) and attempt to use their analysis to deduce the origin of the data. The activity is appropriate for use in an introductory college statistics course or in a high school AP statistics course.
The activity begins with an explanation of the Caesar Shift for message encryption (Singh, 1999). The Caesar Shift is a translation of the alphabet; for example, a five-letter shift would code the letter a as f, b as g, … z as e. We describe a five-step process for decoding an encrypted message. First, students in groups of four construct a frequency table of the letters in two lines of a coded message. Second, students construct a bar chart of the letter frequencies in a reference message that represents typical English text. Third, students create a bar chart of the coded message. Fourth, students visually compare the bar chart of the reference message (step 2) to the bar chart of the coded message (step 3). Based on this comparison, students hypothesize a shift. Fifth, students apply the shift to the coded message.
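The five steps can be sketched in a few lines of Python. This is only an illustration, not the activity's own materials: the sample message, the function names, and the shortcut of matching the most common coded letter to "e" (in place of the visual bar-chart comparison) are all assumptions, and the shortcut can fail on short messages.

```python
import string
from collections import Counter

ALPHABET = string.ascii_lowercase


def caesar_shift(text, shift):
    """Shift every letter by `shift` places, wrapping around the alphabet."""
    shifted = []
    for ch in text:
        if ch in string.ascii_lowercase:
            shifted.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        elif ch in string.ascii_uppercase:
            new = ALPHABET[(ALPHABET.index(ch.lower()) + shift) % 26]
            shifted.append(new.upper())
        else:
            shifted.append(ch)  # leave spaces and punctuation unchanged
    return "".join(shifted)


def letter_counts(text):
    """Frequency table of the letters in `text` (case-insensitive)."""
    return Counter(ch for ch in text.lower() if ch in ALPHABET)


def guess_shift(coded):
    """Hypothesize a shift by assuming the most common coded letter stands for 'e'."""
    top_letter, _ = letter_counts(coded).most_common(1)[0]
    return (ALPHABET.index(top_letter) - ALPHABET.index("e")) % 26


# Hypothetical message, chosen so that "e" is clearly the most frequent letter.
message = "meet me at the elm tree near the seven gates"
coded = caesar_shift(message, 5)        # encrypt with a five-letter shift
shift = guess_shift(coded)              # steps 1-4: frequency table and comparison
decoded = caesar_shift(coded, -shift)   # step 5: apply the hypothesized shift
print(coded)
print(shift, decoded)
```

Comparing the full frequency profile against the reference message, as the activity does with bar charts, is more robust than relying on a single letter.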
Students explore the definition and interpretations of the probability of an event by investigating the long-run proportion of times a sum of 8 is obtained when two balanced dice are rolled repeatedly. Making use of hand calculations, computer simulations, and descriptive techniques, students encounter the laws of large numbers in a familiar setting. By working through the exercises, students will gain a deeper understanding of the qualitative and quantitative relationships between theoretical probability and long-run relative frequency. In particular, students investigate the proximity of the relative frequency of an event to its probability and conclude, from data, that the dispersion of the relative frequency diminishes on the order 1/√n, where n is the number of rolls.
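A minimal simulation sketch of this investigation follows. The function names, the seed, and the numbers of rolls are illustrative assumptions rather than part of the activity. The exact probability is found by enumerating the 36 equally likely outcomes, and the standard deviation of the relative frequency after n rolls is sqrt(p(1 - p)/n), which shrinks on the order 1/√n.

```python
import random
from fractions import Fraction
from math import sqrt


def exact_probability(target=8):
    """Exact probability that two balanced dice sum to `target`, by enumeration."""
    outcomes = [(a, b) for a in range(1, 7) for b in range(1, 7)]
    favorable = sum(1 for a, b in outcomes if a + b == target)
    return Fraction(favorable, len(outcomes))


def relative_frequency(n_rolls, rng, target=8):
    """Proportion of `n_rolls` simulated two-dice rolls whose sum equals `target`."""
    hits = sum(
        1 for _ in range(n_rolls)
        if rng.randint(1, 6) + rng.randint(1, 6) == target
    )
    return hits / n_rolls


def theoretical_sd(n_rolls, target=8):
    """Standard deviation of the relative frequency: sqrt(p * (1 - p) / n)."""
    p = float(exact_probability(target))
    return sqrt(p * (1 - p) / n_rolls)


rng = random.Random(2024)  # arbitrary seed, used only for reproducibility
print("exact probability:", exact_probability())
for n in (100, 1_000, 10_000):
    print(n, relative_frequency(n, rng), round(theoretical_sd(n), 4))
```

Quadrupling the number of rolls halves the theoretical standard deviation, which is the 1/√n behavior students are asked to discover from data.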
This activity provides students with 24 histograms representing distributions with differing shapes and characteristics. By sorting the histograms into piles that seem to go together, and by describing those piles, students develop awareness of the different versions of particular shapes (e.g., different types of skewed distributions, or different types of normal distributions), that not all histograms are easy to classify, and that there is a difference between models (normal, uniform) and characteristics (skewness, symmetry, etc.).
This activity leads students to appreciate the usefulness of simulations for approximating probabilities. It also provides them with experience calculating probabilities based on geometric arguments and using the bivariate normal distribution. We have used it in courses in probability and mathematical statistics, as well as in an introductory statistics course at the post-calculus level.
This article describes an interactive activity illustrating sampling distributions for means, properties of confidence intervals, properties of hypothesis testing, confidence intervals for means, and hypothesis tests for means. Students generate and analyze data and explore these concepts through simulation. The activity has three parts, which can be used in sequence or individually as stand-alone activities, giving the educator flexibility. Part I illustrates the sampling distribution of the sample mean. Part II illustrates confidence intervals for the population mean. Part III illustrates hypothesis tests for the population mean. This activity is appropriate for use in an introductory college or high school AP statistics course.
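The ideas in Parts I and II can be sketched with a short simulation. The article's own procedure is not reproduced here: the normal population, its parameters (mean 50, standard deviation 10), the sample size, and the function names are all assumptions made for illustration, and the interval is a z-interval that treats the population standard deviation as known.

```python
import random
from statistics import NormalDist, mean, stdev


def sample_means(mu, sigma, n, repetitions, rng):
    """Means of `repetitions` samples of size `n` from a normal population."""
    return [
        mean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(repetitions)
    ]


def coverage(mu, sigma, n, repetitions, rng, level=0.95):
    """Proportion of z-intervals (known sigma) that contain the true mean."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sigma / n ** 0.5
    hits = 0
    for x_bar in sample_means(mu, sigma, n, repetitions, rng):
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / repetitions


rng = random.Random(7)  # arbitrary seed, used only for reproducibility
means = sample_means(50, 10, 25, 2000, rng)
# Part I: the sample means center on mu with spread close to sigma / sqrt(n) = 2.
print(round(mean(means), 2), round(stdev(means), 2))
# Part II: roughly 95% of the intervals should capture the population mean.
print(coverage(50, 10, 25, 2000, rng))
```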
As teachers of statistics, we know that residual plots and other diagnostics are important in deciding whether or not linear regression is appropriate for a set of data. Even after we discuss this with our students, many of them might believe that if the correlation coefficient is strong enough, these diagnostic checks are not important. The data set included in this activity was created to lure students into a situation that looks on the surface to be appropriate for the use of linear regression but is instead based (loosely) on a quadratic function.
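The point can be illustrated with a small sketch. The data below are hypothetical (an exact quadratic, y = x² for x = 1 to 10), not the data set supplied with the activity, and the helper name `least_squares` is an assumption. The correlation is very strong, yet the residuals from the fitted line follow a clear U-shape.

```python
from math import sqrt


def least_squares(xs, ys):
    """Return slope, intercept, and correlation for a simple linear fit."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical data: an exact quadratic, NOT the activity's data set.
xs = list(range(1, 11))
ys = [x ** 2 for x in xs]

slope, intercept, r = least_squares(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print("correlation:", round(r, 4))
# Positive at both ends and negative in the middle: a curved pattern
# that the correlation coefficient alone does not reveal.
print("residuals:", [round(e, 2) for e in residuals])
```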
This activity is an advanced version of the “Keep your eyes on the ball” activity by Bereska et al. (1999). Students should gain experience with differentiating between independent and dependent variables, using linear regression to describe the relationship between these variables, and drawing inference about the parameters of the population regression line. Each group of students collects data on the rebound heights of a ball dropped multiple times from each of several different heights. By plotting the data, students quickly recognize the linear relationship. After obtaining the least squares estimate of the population regression line, students can set confidence intervals or test hypotheses on the parameters. Predictions of rebound height can be made for new values of the drop height as well. Data from different groups can be used to test for equality of the intercepts and slopes. By focusing on a particular drop height and multiple types of balls, one can also introduce the concept of analysis of variance.
In these activities designed to introduce sampling distributions and the Central Limit Theorem, students generate several small samples and note patterns in the distributions of the means and proportions that they themselves calculate from these samples.
This group activity illustrates the concepts of size and power of a test through simulation. Students simulate binomial data by repeatedly rolling a ten-sided die, and they use their simulated data to estimate the size of a binomial test.
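A sketch of how such a simulation might look is given below. The abstract does not state the hypotheses or the rejection rule, so everything specific here is an assumption: a roll counts as a success when the die shows 1, 2, or 3 (so p = 0.3 under the null hypothesis), each experiment uses 20 rolls, and the test rejects when 10 or more successes occur. The estimated size is compared with the exact binomial tail probability.

```python
import random
from math import comb


def exact_size(n, p0, cutoff):
    """Exact P(X >= cutoff) for X ~ Binomial(n, p0): the size of the test."""
    return sum(
        comb(n, k) * p0 ** k * (1 - p0) ** (n - k)
        for k in range(cutoff, n + 1)
    )


def simulated_size(n, faces_counted, cutoff, repetitions, rng):
    """Estimate the size by rolling a ten-sided die `n` times per experiment.

    A roll is a success when the face is <= `faces_counted`, so the null
    success probability is faces_counted / 10 (an assumed setup).
    """
    rejections = 0
    for _ in range(repetitions):
        successes = sum(
            1 for _ in range(n) if rng.randint(1, 10) <= faces_counted
        )
        if successes >= cutoff:
            rejections += 1
    return rejections / repetitions


rng = random.Random(11)  # arbitrary seed, used only for reproducibility
print("exact size:    ", round(exact_size(20, 0.3, 10), 4))
print("simulated size:", simulated_size(20, 3, 10, 20_000, rng))
```

Changing `faces_counted` while keeping the same cutoff gives the rejection rate under an alternative success probability, which is an estimate of the power of the test.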
An important objective in hiring is to ensure diversity in the workforce. The race or gender of individuals hired by an organization should reflect the race or gender of the applicant pool. If certain groups are under-represented or over-represented among the employees, then there may be a case for discrimination in hiring. On the other hand, there may be a number of random factors unrelated to discrimination, such as the timing of the interview or competition from other employers, that might cause one group to be over-represented or under-represented. In this exercise, we ask students to investigate the role of randomness in hiring, and to consider how this might be used to help substantiate or refute charges of discrimination.
The activity is designed to help students develop a better intuitive understanding of what is meant by variability in statistics. Emphasis is placed on the standard deviation as a measure of variability. As they learn about the standard deviation, many students focus on the variability of bar heights in a histogram when asked to compare the variability of two distributions. For these students, variability refers to the variation in bar heights. Other students may focus only on the range of values, or the number of bars in a histogram, and conclude that two distributions are identical in variability even when this is clearly not the case. This activity can help students discover that the standard deviation is a measure of the density of values about the mean of a distribution and become more aware of how clusters, gaps, and extreme values affect the standard deviation.
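A small numerical sketch makes the distinction concrete. The two data sets below are hypothetical and are not taken from the activity. They share the same mean and the same range, yet their standard deviations differ because one is clustered tightly about the mean and the other sits at the extremes.

```python
from statistics import mean, pstdev

# Hypothetical data sets with the same mean (5) and the same range (1 to 9).
clustered = [1, 5, 5, 5, 5, 5, 5, 9]   # most values sit at the mean
polarized = [1, 1, 1, 1, 9, 9, 9, 9]   # every value sits far from the mean

for name, data in (("clustered", clustered), ("polarized", polarized)):
    data_range = max(data) - min(data)
    # pstdev is the population standard deviation (divides by n).
    print(name, "mean:", mean(data), "range:", data_range, "sd:", pstdev(data))
```

Judging by range alone, the two distributions look equally variable; the standard deviation separates them because it reflects how densely the values sit around the mean.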
The Food and Drug Administration requires pharmaceutical companies to establish a shelf life for all new drug products through a stability analysis. This is done to ensure that the quality of the drug taken by an individual is within established levels. The purpose of this out-of-class project or in-class example is to determine the shelf life of a new drug by fitting simple linear regression models and correctly interpreting confidence and prediction intervals. An Excel spreadsheet and SAS program are given to help perform the analysis.
This article describes an interactive activity illustrating general properties of hypothesis testing and hypothesis tests for proportions. Students generate, collect, and analyze data. Through simulation, students explore hypothesis testing concepts. Concepts illustrated are: interpretation of p-values, type I error rate, type II error rate, power, and the relationship between type I and type II error rates and power. This activity is appropriate for use in an introductory college or high school statistics course.
This group activity focuses on conducting an experiment to determine which of two brands of paper towels is more absorbent by measuring the amount of water absorbed. A two-sample t-test can be used to analyze the data, or simple graphics and descriptive statistics can be used as an exploratory analysis. Students are asked to think about design issues, and to write a short report stating their results and conclusions, along with an evaluation of the experimental design.
Christopher J. Malone, Christopher R. Bilder, Deborah J. Rumsey, John E. Boyer, Kansas State University, Ohio State University, Oklahoma State University
This course focuses on the problem of supervised learning from the perspective of modern statistical learning theory, starting with the theory of multivariate function approximation from sparse data. It develops basic tools such as Regularization, including Support Vector Machines for regression and classification. It derives generalization bounds using both stability and VC theory. It also discusses topics such as boosting and feature selection and examines applications in several areas: Computer Vision, Computer Graphics, Text Classification, and Bioinformatics. Final projects, hands-on applications, and exercises are planned, paralleling the rapidly increasing practical uses of the techniques described in the subject.
Statistical Mechanics is a probabilistic approach to equilibrium properties of large numbers of degrees of freedom. In this two-semester course, basic principles are examined. Topics include: thermodynamics, probability theory, kinetic theory, classical statistical mechanics, interacting systems, quantum statistical mechanics, and identical particles.
This course discusses the principles and methods of statistical mechanics. Topics covered include classical and quantum statistics, grand ensembles, fluctuations, molecular distribution functions, other concepts in equilibrium statistical mechanics, and topics in thermodynamics and statistical mechanics of irreversible processes.
This course is an introduction to statistical data analysis. Topics are chosen from applied probability, sampling, estimation, hypothesis testing, linear regression, analysis of variance, categorical data analysis, and nonparametric statistics.