Learning Unit
Data Analysis
Learning Unit Title: Data Analysis
Author: Amy Fox
Grade Level: 8-12
School Address: Morrisville-Eaton School
Subject Area: Math
School Phone/Fax: (315) 684 9121
CONTENT KNOWLEDGE
| Declarative | Procedural |
| · Steps to conducting a randomized sample · Vocabulary terms: trend, variation, outlier, etc. | · Gather valid data in a randomized sample · Analyze data and the resulting graph |
| · Formulas for Standard Deviation, Mean, etc. | · Input raw data into organized spreadsheet form |
| · Difference between categorical and quantitative | · Graph or display data using the computer |
| · Steps to analyzing data and graphs using statistical measurements | · Create histograms, boxplots, scatter plots, five number summaries, and line graphs |
| · Steps to creating graphical displays by hand and on the computer | · Compute the standard deviation, variance, and five number summary for data with a calculator, and using a computer |
NOTE: Depending on age or ability, some material will have already been previously learned and need only be reviewed.
ESSENTIAL QUESTION
· How can graphs be used in almost every aspect of society to make us a more efficient, productive, and informed society?
INITIATING ACTIVITY
Ask students to wear or bring sneakers to class. Weigh each pair and plot the weights on a prepared graph. Have students use their graph to predict the weight of sneakers of other sizes. Introduce factors that could conflict with their predictions, such as other brands, the sneaker's purpose (basketball versus running shoes, for example), the age of the shoe, and the original cost. Discuss the importance of graphs such as this in marketing, government, commerce, banking, investments, and other areas of society.
Unit Schedule/Time Plan:
The unit and culminating activity should take roughly three weeks of classes on a schedule of one hour every day. To shorten the time requirement, instructors may wish to incorporate the culminating activity on a daily basis to go along with each lesson. To do this, the first lesson would need to be changed to include a discussion of random sampling, lurking variables, and the difference between categorical and quantitative data. I chose to conduct the culminating activity at the end of the unit instead, in order to assess students' holistic understanding of the unit rather than their immediate understanding of each day's lesson.
LEARNING EXPERIENCES
Textbook Used: Moore and McCabe, Introduction to the Practice of Statistics, 1996
Lesson 1 - May take one or two days.
Introduction and Benefits: Introduce (or reintroduce) vocabulary terms: variable, value, quantitative and categorical variables, variation, mean, median, mode, range, symmetric, skewed, gaps, outliers, frequency, trend, seasonal variation, etc. Draw on students' previous statistics knowledge, graphs used in previous years, and geometry to connect to previous learning. Illustrate each term with labeled graphic representations and scenario examples for visual cues. Compare the types of graphs students are already accustomed to, including bar graphs, pie graphs, line graphs, and histograms. Using fictional data, charts, and these types of graphs, show what each term would "look like". When finished with the definitions, have students identify key examples of each on a prepared worksheet (see Attachment #1) filled with different types of graphs, charts, and data. By the end of this declarative experience, each student will be able to discuss, recognize, and illustrate each term listed above. The teacher should assess student understanding continuously by informal means and then more formally by checking each student's worksheet. Lastly, the teacher should come prepared with several examples of how graphs are used in everyday society. These should be shown during a discussion of where students have already seen statistics and graphs in use. Good places to look: electric bills, newspapers, magazines, budgets, the stock market. Be sure to initiate talk about how statistics and graphs can help us live our lives better, more efficiently, knowledgeably, and economically. Also, illustrate how advertising can mislead consumers by presenting statistics selectively or in distorted displays. Be sure students are aware that they will need to discuss the usefulness of statistics and graphs later in their culminating performance.
Lesson 2 - May take two to three days.
Displaying Distributions: Students will (re)learn the steps for creating histograms, pie graphs, bar graphs, line graphs, and scatterplots. The teacher should illustrate each with at least one short example. The class will be given five or six different charts of actual data. Working in collaborative pairs, students will decide which type(s) of graph(s) would be most efficient and appropriate to illustrate the data. In the same pairs, they will then practice graphing each using the steps given previously. During the discussion and graphing, the instructor should point out common errors like forgetting to label or title the graph, or confusing the independent and dependent variables. Also, initiate discussion on the similarities of bar graphs, pictographs, and histograms. Discuss what kinds of data line graphs and pie graphs should be used for. Lastly, have each pair present and discuss their graphs in front of the class, then display the graphs. This presentation and discussion can be used to assess how effective the lesson has been.
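For instructors who want to show the counting step behind a histogram before moving to graphing software, the following is a minimal sketch in Python. The test scores, the class width of 10, and the starting value of 50 are all invented for illustration, and the function name `class_counts` is hypothetical. Each class includes its lower bound and excludes its upper bound, and the sketch assumes no value falls below the starting value.

```python
from collections import Counter

def class_counts(values, class_width, start):
    """Count how many values fall in each class [low, low + class_width)."""
    # Index of the class each value belongs to (0 for the first class).
    index_counts = Counter(int((v - start) // class_width) for v in values)
    classes = []
    for i in range(max(index_counts) + 1):
        low = start + i * class_width
        classes.append((low, low + class_width, index_counts.get(i, 0)))
    return classes

# Fictional test scores, invented for this illustration.
scores = [52, 67, 71, 74, 78, 81, 83, 85, 88, 94]

for low, high, count in class_counts(scores, class_width=10, start=50):
    # One asterisk per observation stands in for the height of the bar.
    print(f"{low:>3} to under {high:<3} | {'*' * count}")
```

Students can compare the printed rows of asterisks with the bars of their hand-drawn histogram for the same data.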
Lesson 3 - May take four to five days.
Describing Distributions: Define: mean, median, resistance, variability, quartiles, interquartile range, five number summary, variance, and standard deviation. Discuss the relevance of each, and give the formulas and/or steps for calculating the mean, median, quartiles, interquartile range, five number summary, variance, and standard deviation. Show how to create a boxplot using the five number summary. As in Lessons 1 and 2 above, give examples of how we have used things like mean, median, and quartiles before under different names: average, midpoint, and fourths or quarters. Also, give examples specific to each in the shape of a graph or type of data. Have students practice the steps for calculating these statistical measurements on actual data and discuss what they tell us about the data. Once students have conveyed their understanding of the relevance and importance of these descriptive measurements, introduce them to Microsoft Excel (or some other statistical analysis tool, including graphing calculators). List the steps for entering data onto a spreadsheet, saving their work, and calculating each of the previously discussed measurements. If possible, illustrate these steps visually using an LCD display to project the instructor's computer screen onto an overhead projection screen so the whole class can follow along. Using the LCD display, model the opening of a new spreadsheet file, the entering of data, the saving of the file, and the various commands, processes, and/or icons for computing the mean, five number summary, variance, standard deviation, etc. Finally, allow the students full computer independence to practice computer-generated statistical computations. During the practice, check each student's progress. When finished, ask students to compare their computer-generated computations with their own paper-and-pencil calculations.
Before completing this lesson, have students calculate specific statistical measurements for a given data set in a written quiz (see Attachment #2) for more formal assessment of student understanding.
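As one way to check hand calculations, the measurements from this lesson can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the eight quiz scores are invented, the function name `five_number_summary` is hypothetical, the quartiles are taken as the medians of the lower and upper halves of the sorted data (leaving out the middle value when the count is odd), and the variance and standard deviation use the sample (n - 1) formulas. Spreadsheet programs and calculators may use a different quartile rule, so small differences from their output are possible.

```python
from statistics import mean, median, variance, stdev

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]                               # values below the median position
    upper = data[half + 1:] if n % 2 else data[half:]  # values above the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Fictional quiz scores, invented for this illustration.
scores = [4, 8, 6, 5, 3, 7, 9, 6]

minimum, q1, med, q3, maximum = five_number_summary(scores)
print("Mean:", mean(scores))
print("Five number summary:", minimum, q1, med, q3, maximum)
print("Interquartile range:", q3 - q1)
print("Sample variance:", variance(scores))
print("Sample standard deviation:", stdev(scores))
```

For these eight scores the mean is 6, the five number summary is 3, 4.5, 6, 7.5, 9, the sample variance is 4, and the sample standard deviation is 2, which students can confirm by hand.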
Lesson 4 - May take one to two days.
Interpreting Scatterplots & Making Predictions: Define and show examples of scatterplots. Have students construct their own scatterplots given a set of data. Discuss any observations like possible linear trends or bell-shaped distributions. Define: positive and negative association, correlation, linear relationship, clusters, outliers, explanatory and response variables. Illustrate the terms by way of example data sets and graphs. Have students classify their scatterplots' trend(s) as positive, negative, linear, having clusters, outliers, etc. Using data with an obvious linear relationship, teach the steps for creating a scatterplot and generating a regression line using a computer. Allow students to practice these steps on their own computers using data already saved in files. Next, teach the steps for making predictions based on this regression line using the substitution method (substituting a value of the explanatory variable into the equation of the line). Model these steps using a set of actual data, then let students practice them in cooperative pairs using different data. Lastly, discuss and practice the formula and steps for finding the correlation of data and what this number tells us about the strength of the relationship between the variables. Assess student understanding based on informal observations and the cooperative pairs' work on scatterplots, predictions, and correlation.
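The regression, substitution, and correlation steps in this lesson can be sketched as follows. This is a hedged illustration, not the procedure of any particular software package: the shoe sizes and weights are invented (loosely echoing the sneaker activity), the function names are hypothetical, and the line is the ordinary least-squares line.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def regression_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def correlation(xs, ys):
    """Return the correlation r between the explanatory and response variables."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def predict(x, slope, intercept):
    """Substitution method: plug a value of x into the equation of the line."""
    return intercept + slope * x

# Invented data: shoe size (explanatory) and weight of the pair in ounces (response).
sizes = [6, 7, 8, 9, 10]
weights = [20.1, 21.9, 24.2, 25.8, 28.0]

slope, intercept = regression_line(sizes, weights)
print(f"Regression line: weight = {intercept:.2f} + {slope:.2f} * size")
print(f"Predicted weight for size 11: {predict(11, slope, intercept):.2f} ounces")
print(f"Correlation r: {correlation(sizes, weights):.3f}")
```

Size 11 lies outside the sizes that were measured, so this prediction is an extrapolation, which is a useful talking point when the class later questions the accuracy of predictions.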
Lesson 5 - May take one to two days.
Producing Valid Data & Assessing Accuracy of Predictions: As the last prerequisite experience before beginning work on the culminating performance, students will be asked to challenge predictions based on statistical measurements, the sampling technique used, extrapolation, and lurking variables. Begin by illustrating how certain sampling methods can produce misleading data. For example, polling people at a health club about whether they smoke, in order to estimate the number of smokers in an area, would probably yield a lower proportion of smokers than is true of the larger population, since health club users are likely to smoke less than non-members. Explain that when sampling, we need to resist using a single type of group as a basis for the whole. At this point, the instructor should initiate a discussion of good sampling strategies to use for two or three different kinds of questions. When assured that students are capable of creating a suitably random sample, the instructor will finally bring the entire unit to a close by questioning how accurate a hypothesis is based on statistical measurements like correlation, variance, and standard deviation, and on the sampling method used. Students will then need to consider whether a prediction for a given data set is reasonable and/or accurate based on the statistical measurements and sampling methods. Introduce the culminating activity (see Culminating Performance) at the end of this lesson and allow students to choose partners and to brainstorm ideas. Students should be thinking about what question they would like to research, how they will conduct a random sample, what kind of graph they will use to display their data, and what statistical measurements they will use to learn about their data. They should leave class with their question and sampling method finalized so that they can conduct their sample in their free time before the next class if at all possible.
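The health club example can be simulated to make the bias visible. The sketch below is built entirely on invented numbers: a hypothetical town of 1,000 adults in which 200 belong to a health club, with 10 smokers among the members and 200 among the non-members. It compares a sample drawn only from club members with a simple random sample drawn from the whole town.

```python
import random

# Hypothetical town of 1,000 adults; every count here is invented for illustration.
# Each person is recorded as (club_member, smoker).
population = (
    [(True, True)] * 10       # club members who smoke
    + [(True, False)] * 190   # club members who do not smoke
    + [(False, True)] * 200   # non-members who smoke
    + [(False, False)] * 600  # non-members who do not smoke
)

def smoking_rate(people):
    """Proportion of smokers in a list of (club_member, smoker) records."""
    return sum(1 for _, smoker in people if smoker) / len(people)

rng = random.Random(2024)  # fixed seed so the run can be repeated in class

club_members = [person for person in population if person[0]]
biased_sample = rng.sample(club_members, 50)  # poll taken only at the health club
random_sample = rng.sample(population, 50)    # simple random sample of the town

print(f"True smoking rate in the town: {smoking_rate(population):.0%}")
print(f"Rate in the health club sample: {smoking_rate(biased_sample):.0%}")
print(f"Rate in the simple random sample: {smoking_rate(random_sample):.0%}")
```

Because only 10 club members smoke, the health club sample can never report more than 20 percent smokers, which is below the town's true rate of 21 percent no matter which 50 members are polled. The random sample will vary from run to run if the seed is changed, which can lead into a discussion of sampling variability.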
Lesson 6 - May take three to four days.
Culminating Activity: Students should work independently of teacher throughout most of the culminating activity so as to allow partnerships to form and discovery learning to occur. Be sure to provide and review the rubric within the first day or two so students are well aware of the expectations, requirements, and grading policy. It may also be helpful to offer a set of deadlines which students are to meet in a timely fashion in order to maintain a constant effort on the project. Be sure to offer guidance and encouragement when students appear confused. On the last day of the project, students should present their work to the class.
Connections to Standards
"The Data-ing Game"
Data Analysis
In This Unit Students Will:
Standard: MST 2 - Access, generate, and process information using technology.
Benchmarks: Understand and use more advanced features of spreadsheets.
Standard: MST 3 - Apply mathematics in real-world settings through data analysis.
Benchmarks: Input probabilities in real-world situations.
Make predictions based on interpolation and extrapolations from data.
Apply the concept of random variable to generate and interpret probability distributions.
Use computers to analyze mathematics.
Standard: MST 6 - Understand the relationships and common themes that connect math, science, and technology and apply them to other areas of learning.
Benchmarks: Analyze data by making tables and graphs and looking for patterns of change.
Select an appropriate model to begin the search for answers or solutions to a question or problem.
Standard: MST 7 - Apply the knowledge and thinking skills of mathematics and technology to address real-life problems and make informed decisions.
Benchmarks: Gather and process information. Generate and analyze ideas. Present results.
Standard: ELA 1 - Listen, speak, read, and write for information and understanding.
Benchmarks: Collect data, facts, and ideas. Discover relationships. Use oral and written language that follows the accepted conventions of the English language to apply and transmit information.
CULMINATING PERFORMANCE
Students will work in collaborative pairs to conduct a randomized sampling. The data they collect should be informative, interesting, and useful. In the final write-up, pairs will need to discuss how and/or why their data could be useful on a larger scale. Once the data is collected, pairs will choose an appropriate graph type and generate a well-labeled and titled graphic display using the computer. Also, pairs will need to compute all necessary statistical measurements using the software program(s) provided in order to analyze the data and to predict future or alternative outcomes or values. This analysis should also include measures of the accuracy of the predictions made. Finally, in a formal essay-format write-up, pairs will need to discuss the overall relevance of their data and all graphs in at least one aspect of society. Included in this write-up should be a discussion of the importance of random, unbiased samples. To conclude the culminating performance, the pairs will summarize and discuss their sampling techniques, data, graph, analysis, and prediction(s) in front of their classmates in an organized two to four minute talk. All relevant computer-generated products will be displayed within the school after the talk has been completed.
Rubric for Culminating Activity
| Elements | Graph and Chart | Statistical Measurements and Analysis | Prediction(s) and Accuracy Assessment | Written Report | Oral Report |
| WEIGHT | 25 % | 25 % | 25 % | 15 % | 10 % |
| A | Graph and chart are appropriate for the data type. Both are well labeled, titled, organized, and attractive. Up to 25 Points | Includes each of the appropriate measurements correctly calculated via computer. Analysis of data is correct based on calculated measurements. Up to 25 Points | Predictions were made for the data using the correct method(s). Predictions were assessed in terms of their validity and accuracy using all the appropriate measures and techniques. Up to 25 Points | Report is typed and well organized. Uses proper grammar, sentence structure, and statistical terminology. Up to 15 Points | Always uses a clear voice that can be easily heard. Rate of speech is appropriate. Consistently makes eye contact with audience. Report is within time frame of 2-4 minutes. Up to 10 Points |
| A | Graph or chart is missing fewer than three of the following: label, title, and good organization. Is not the most effective or appropriate graph or chart for the data type. Up to 20 Points | Includes most of the appropriate measurements correctly calculated via computer. Analysis of data is correct based on calculated measurements. Up to 20 Points | Predictions were made for the data using appropriate methods. Predictions were assessed in terms of their validity and accuracy using most of the appropriate techniques and measures. Up to 20 Points | More than a few errors which do not cause interference in understanding. Up to 12 Points | Usually uses a clear voice that can be heard well. Rate of speech is sometimes appropriate. Makes eye contact most of the time. Report is within or close to the time frame. Up to 8 Points |
| Almost There | Is either unattractive or poorly organized. Missing more than three labels or titles. Is not the most appropriate graph or chart for data type. Up to 12 Points | Many of the calculated measurements are incorrect or missing. Up to 12 Points | Predictions were made using the correct methods. Predictions were not assessed in terms of their validity and accuracy using the appropriate measures. Up to 12 Points | Multiple errors are present and are significant enough to hinder understanding. A few of the statistical terms are used incorrectly. Up to 8 Points | Uses a somewhat clear voice that is not always heard well. Rate of report is too fast or slow. Somewhat over or short of the time frame. Up to 5 Points |
| Awful | Graph or chart is inappropriate for data type. Is unattractive and/or poorly organized. Missing most labels & titles. Up to 6 Points | Very few or none of the appropriate measurements are correct or included. Analysis based on any obtained measurements is completely incorrect. Up to 6 Points | Predictions were made using incorrect methods. The validity of predictions was not assessed using the appropriate measures. Up to 6 Points | Up to 4 Points | Incoherent. Uses a very unclear voice. Rate is too fast or slow. No eye contact. Well over or short of the time frame. Up to 3 Points |
GRADE = 0.25 × Graph points + 0.25 × Statistical points + 0.25 × Prediction points + 0.15 × Written Report points + 0.1 × Oral Report points
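A short sketch of the grade arithmetic may help when scoring. It rests on an assumption about how the formula is meant to be read: the weights apply when each element is first scored on a 0 to 100 scale. The point values listed in the rubric cells are already scaled to each element's weight (25, 25, 25, 15, and 10), so under that reading the rubric points are simply added. The function and dictionary names are hypothetical, and the sample scores are invented.

```python
# Rubric weights in percent, taken from the WEIGHT row of the rubric.
WEIGHTS = {"graph": 25, "statistics": 25, "prediction": 25, "written": 15, "oral": 10}

def weighted_grade(percent_scores):
    """Combine element scores (each on a 0-100 scale) using the rubric weights."""
    return sum(WEIGHTS[name] * percent_scores[name] for name in WEIGHTS) / 100

def total_rubric_points(points):
    """Add rubric points that are already scaled to each element's weight."""
    return sum(points[name] for name in WEIGHTS)

# Invented example: the same performance expressed both ways.
percent_scores = {"graph": 80, "statistics": 100, "prediction": 48, "written": 80, "oral": 80}
rubric_points = {"graph": 20, "statistics": 25, "prediction": 12, "written": 12, "oral": 8}

print("Grade from weighted percentages:", weighted_grade(percent_scores))
print("Grade from summed rubric points:", total_rubric_points(rubric_points))
```

Both calculations give 77 for this invented pair of students, which shows that the two readings agree as long as the weights are applied only once.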
Name __________________________ Statistics Vocabulary Worksheet #1 Attachment #1



Attachment #2
Name __________________________ Statistics Quiz Computing Statistical Measurements
Using the data displayed in the table below, find the following statistical measurements of spread and variability. Be sure to show all work and calculations or write what was done on your calculator to get the answer.
Number of Home Runs Hit by Babe Ruth
| Year | 1914 | 1915 | 1916 | 1917 | 1918 | 1919 | 1920 | 1921 | 1922 | 1923 | 1924 | 1925 | 1926 | 1927 | 1928 |
| # Home Runs | 0 | 4 | 3 | 2 | 11 | 29 | 54 | 59 | 35 | 41 | 46 | 25 | 47 | 60 | 54 |
| Year | 1929 | 1930 | 1931 | 1932 | 1933 | 1934 | 1935 |
| # Home Runs | 46 | 49 | 46 | 41 | 34 | 22 | 6 |