
Inferential Statistics

Introduction

If you have data on all of the observations, you now know how to summarize the data with some basic descriptive statistics.  If you were given the grades of the 20 students in an upper level ECN class, you could calculate the mean and standard deviation as well as construct a histogram.   But how often do we have data on the entire population, and if the data were available, how often would it be too costly to collect and analyze in a timely fashion? 

As mentioned in the sampling distribution section, the answer is not too often.  In the majority of instances we are concerned with understanding some features of a population without going to the expense of collecting data for the entire population.   A candidate running for office may want to know if there is enough support to justify staying in a campaign; a president may want to find out about the public's attitude before taking a position on some policy; the university may want to get students' views on a new general education curriculum; policy officials may want to know how many people are unemployed; and advertisers may want to know how many people watch a certain television show.  Those of you who followed the impeachment hearings of President Clinton know full well the power of these polls conducted to gauge the public's support for Clinton.  In economics you will find that nearly all of the macroeconomic data for the US economy, such as the unemployment rate, average earnings, GDP, and inflation, are numbers that are based on small samples of the US workers and producers.  

Another example of the relationship between a sample and the population would be the census of the US population that is conducted every ten years.  The Census Bureau has proposed moving to a sample to provide a more cost-effective means of counting certain parts of the 270+ million people in the US for the census of 2000.  The proposal has not been universally supported in Congress, in part because of a political dimension to the results.  Since the sampling results in different population numbers, you can expect those who represent areas where the population figures will be revised upward by the sampling technique will support the approach while those losing people will oppose the technique. [For a discussion of the Census 2000, you can check out the Census 2000 web site.]

The common denominator with these questions is that in each of these cases gathering the data for the entire population would be too expensive to justify. As we saw earlier, there is a potential alternative. It is possible to gather data from a subset of the population, a sample of observations, that would allow one to make inferences about the entire population. The polls that you hear about at election time, the reports of students' position on an issue, the unemployment rate, and the TV ratings are all based on small samples of the populations. In our grade example, we could have looked at the data for one of the four groups and taken the average of the group as our estimate of the average for the entire class. In that example, the ECN202 class average was 84.454 and the four sample averages ranged from 82.5756 to 85.3138.

There are, however, a very large number of possible samples that could be selected. Getting back to the dart throwing, you could have ten lists of all scores drawn up and then have ten people throw thirteen darts at the list. Each person would then record the list of thirteen scores that were "hit" by the darts and this would become the sample of grades. What you know is that the chances that any two of the samples would be the same are slim to none. As another sampling technique, a little safer if you play darts as I do, you might place each of the scores on an index card and then select thirteen cards. If you drew thirteen cards twenty times, it is unlikely that any two of the twenty samples of scores would be the same.

Once you move from the population to the sample you realize that there are many potential samples that can be drawn from the population. We have seen this with the assignments where you each rolled the dice, picked cards from a deck, and flipped a coin.  For each assignment you calculated the mean and standard deviation and the numbers that you recorded were not all the same.  There was a distribution of possible results which is what we discussed in the section on sampling distributions where we looked at the distribution of possible values for the sample mean.  To make our lives real easy, the Central Limit Theorem allowed us to approximate the distribution of sample means with a normal distribution.  This means that once we have the mean and standard deviation of the sample mean, we can construct a Z variable and compute probabilities using the standard normal distribution.  In the grade example, we could compute the probability of a 15 student sample providing a mean value below 82 when the true class average was 85. 

Estimation and confidence intervals.

In this section we want to move beyond saying something about the distribution of the sample statistics and use the sample statistics to make some inferences about the complete population.  We want to be able to say things like: 30 percent of RIU students work 20 hours a week; 40 percent of students support affirmative action policies at the university; I am 95 percent confident that candidate X will win the election; I am 95 percent confident that the average on the first exam will be between 70 and 75.

We will begin with the concept of estimators - both point and interval estimates and then move on to hypothesis testing.

Point estimation provides us with a "best" estimate for the value of a specific characteristic of the population. One of the properties that we would want an estimator to have is unbiasedness - we want our estimate to center on the true value. In more technical terms, we would want the expected value of the sample estimator to equal the population parameter.

An example of an unbiased sample estimator would be the sample mean. It is an unbiased estimator of the population mean: $E(\bar{X}) = \mu$. Another example would be the sample variance, which is an unbiased estimator of the population variance: $E(s^2) = \sigma^2$. If you want to know the percentage of the RIU students that support spending funds for a new convocation center, then you would take a small sample of students and use the share of those students that support the center. If you were concerned with the variance in Rhode Islanders' income, then you could take a sample of state residents and calculate the variance in their incomes and this would be an unbiased estimate of the state figure.

Interval estimation allows us to introduce some notion of uncertainty into our estimation procedure.  For example, consider the two sets of exam scores below.  The mean value for both classes is the same, but the distribution of scores is very different.  In class 1 the scores range from 43 to 98, while in class 2 the range is only 65 to 78.  If you were to take a sample from the two classes, the expected value of the mean for the two classes would be the same, but you would have more confidence in the result from the second class being close to the true mean.

Exam Results

Student   Class 1   Class 2
1         55        65
2         68        65
3         43        68
4         98        78
5         78        78
6         68        69
7         91        71
8         48        68
9         46        65
10        74        77
11        79        69
12        82        72
13        86        71
14        76        76
Mean      70.85     70.85
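The comparison between the two classes can be checked with a short Python sketch. It uses only the scores in the table; treating the 14 scores as the whole population (so dividing by n in the standard deviation) is an assumption made here for illustration.

```python
from statistics import mean, pstdev

# Scores copied from the Exam Results table
class_1 = [55, 68, 43, 98, 78, 68, 91, 48, 46, 74, 79, 82, 86, 76]
class_2 = [65, 65, 68, 78, 78, 69, 71, 68, 65, 77, 69, 72, 71, 76]

for name, scores in (("Class 1", class_1), ("Class 2", class_2)):
    # pstdev divides by n, i.e. it treats the 14 scores as the population
    print(name,
          "mean:", round(mean(scores), 2),
          "range:", (min(scores), max(scores)),
          "std dev:", round(pstdev(scores), 2))
```

Both classes have the same mean, but the standard deviation for Class 1 is much larger, which is why a sample from Class 2 is more likely to land close to the true mean.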

This notion of confidence is formalized in the concept of confidence intervals that will allow one to make a statement such as "I am 95 percent confident that the mean for the scores is between 68 and 73." It is also the concept behind those election time statements such as "with a margin of error of plus or minus five points" that you hear with the release of all poll data.

We will look at two situations.  The first is one where the variance of the population is known and the second is where the variance is unknown.

Confidence intervals for normal means: known variance

The concept of a confidence interval can be illustrated by an earlier experiment that you conducted. You were asked to compute the mean, variance, and standard deviation on the score that you obtained when you rolled two dice ten times. You will recall that the mean values computed for each student's sample were not identical since the samples that they chose were different. There was, however, a pattern to the means that were calculated. We know from our earlier work that the sample means will be approximately normally distributed. The probability / frequency distribution of possible scores would look like the diagram below. Most of the sample means will be clustered around the true value for the population mean, but there will be some sample means that are outliers. The sample mean for a sample of size n is distributed with a mean of μ and a standard deviation of σ/SQRT(n).

Normal Distribution

In the case of the dice, the mean value for the number rolled is 7 and the standard deviation is 2.415. If each sample consists of 16 rolls, then the sample means would have a mean of 7 and a standard deviation of .60375 = 2.415/SQRT(16) = 2.415/4.
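The dice figures can be verified by listing all 36 equally likely outcomes of two fair dice. The sketch below does that in Python; the function name std_dev_of_sample_mean is made up for this illustration.

```python
from itertools import product
from math import sqrt

# All 36 equally likely totals from rolling two fair dice
totals = [a + b for a, b in product(range(1, 7), repeat=2)]

mu = sum(totals) / len(totals)
sigma = sqrt(sum((t - mu) ** 2 for t in totals) / len(totals))

def std_dev_of_sample_mean(sigma, n):
    """Standard deviation of the mean of n independent rolls."""
    return sigma / sqrt(n)

print(mu)                                  # 7.0
print(round(sigma, 3))                     # 2.415
print(std_dev_of_sample_mean(sigma, 16))   # roughly 0.60
```

Enumerating the outcomes gives a standard deviation of about 2.415 for a single roll of two dice, and about 0.60 for the mean of 16 rolls.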

For another example, assume that each ECN306 student takes his/her own sample of 5 test grades from the 14 grades reported above. You would expect that the mean grade would be different for each sample. Furthermore, you would expect that the differences between the observed means would be smaller in Class 2, where the scores are less spread out. The distribution of mean scores would be represented by the diagram above where the mean (μ) would be 70.85.

In both instances, the grades and the dice, because you know that the sample mean is distributed approximately normally, you know that approximately 95 percent of the time the sample mean will fall within two standard deviations of the population mean. If we turn this statement around we can say that we are 95 percent confident that μ lies within two standard deviations of the sample mean that we observe.

We are now ready to look at confidence intervals for sample means where the variance is known.  Assume that you have rolled the dice and calculated a mean score or sampled the grades and calculated an average grade. Because the sample mean is an unbiased estimator of the population mean, we will take this as the estimator of the mean and establish confidence intervals about the sample mean. We will construct a confidence interval so that we are 95% confident that the true mean falls within this range. 

Because the sample means are distributed normally with a mean of μ and a standard deviation of σ/SQRT(n), you can construct a standardized normal form with a zero mean and standard deviation of one so that we can use the normal tables to answer probability questions. The standardized normal variable is $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$. The standardized normal distribution diagram appears below. 68% of the time we would expect a sample mean to be within one standard deviation of the population mean, while 95 percent of the time it will be within two standard deviations.

If we want a 95% level of confidence on our estimate, then we want to set the width (number of standard deviations) such that only 5% of the time we would expect observations to fall outside of the range.

Standard Normal Distribution - Z

In the diagram below, we would want to choose a value for Z such that 95% of observations fell within the shaded area in the center diagram.  This would happen if we chose Z such that the areas shaded in the left and right side diagrams were each .025.  You would look up in the standardized normal table to find a Z value where this was true.  You would find a Z = 1.96, a result that can be interpreted as follows: for a 95% confidence interval we would need to establish an interval that extended 1.96 standard deviations on either side of the sample mean. In practice, you will use a Z of 2 as the limit for the 95% confidence limit.
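Instead of a printed table, the Z value can be looked up with Python's standard library. This is a minimal sketch; the helper name z_critical is an assumption for illustration, while NormalDist and inv_cdf come from the statistics module.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Z value that leaves (1 - confidence)/2 in each tail of the standard normal."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

print(round(z_critical(0.95), 2))  # 1.96
print(round(z_critical(0.98), 2))  # 2.33
```

With .025 in each tail the function returns 1.96, matching the table lookup described above.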

Normal Distribution and Z Values

Let's look at this a bit more formally, along the lines that you might run into in a statistics text. We begin by taking the mean of a sample from a population with a known variance of σ². The mean of the sample mean is the population mean μ and its variance is σ²/n. The 100(1-α)% confidence interval for the population mean when the population is normally distributed and the variance σ² is known, is the interval bounded by the confidence limits. 1-α is the confidence coefficient, 100(1-α)% is the confidence level, and the confidence limits are:

lower limit L = $\bar{X} - Z_{\alpha/2}\,\sigma/\sqrt{n}$

upper limit U = $\bar{X} + Z_{\alpha/2}\,\sigma/\sqrt{n}$

Enough of the jargon. The lower limit is derived by subtracting from the sample mean the appropriate Z value times the standard deviation of the sample mean. The upper limit is derived by adding to the sample mean the appropriate Z value times the standard deviation of the sample mean. In our example above where we were interested in a 95% confidence level, we would want to be within 1.96 standard deviations of the mean. The value for Zα/2 would be 1.96, which would then be multiplied by the standard deviation of the sample mean, σ/SQRT(n).

Now let's look at a few examples. If we knew grades were normally distributed with a mean of μ and a variance of 4, what would be the 95% confidence interval for the mean if a sample of 9 students turned up a mean of 70? The standard deviation of the sample mean would be SQRT(4)/SQRT(9) = 2/3. For a 95% confidence interval, the confidence coefficient would be .95 = (1-α) so α = .05 and α/2 = .025. You would now look up the value for Z that leaves .025 in the upper tail and you get 1.96. If you move 1.96 standard deviations to either side of the mean, you will expect to include 95% of the possible sample means. If we round 1.96 to 2, we get the limits as 70 - 2*(2/3) and 70 + 2*(2/3). Given the sample mean of 70, we would be 95% confident that the population mean falls within the range of 68.67 to 71.33.
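The grade example can be reproduced with a small Python sketch. The function name mean_ci_known_variance is hypothetical; it uses the unrounded Z value, so the interval is slightly narrower than the one obtained by rounding Z to 2.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci_known_variance(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

# Variance of 4 means sigma = 2; sample of 9 students with a mean of 70
low, high = mean_ci_known_variance(70, sigma=2, n=9)
print(round(low, 2), round(high, 2))  # 68.69 71.31
```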

As for the rolling of the dice example, if we had a sample of nine rolls and calculated the mean, the standard deviation for the sample mean would be the standard deviation of scores divided by the SQRT of the sample size. Given that σ = 2.415, we would have the standard deviation for the sample mean of 2.415/SQRT(9) = 2.415/3 = .805. If the mean for the sample was 6.9, then the 95% confidence interval for the mean will extend from approximately two standard deviations below the sample mean to two above it (1.96 to be exact). The 95% confidence interval will be 5.3 to 8.5, which does include the true parameter value of 7.

Confidence intervals for normal means: unknown variance

What happens if the variance is not known? It turns out that the procedure is very much the same except that the Z distribution is not used. What is used is the t-distribution, which allows for greater uncertainty due to the fact that there is one more unknown - the variance. The t-distribution looks very much like the standardized normal, except that its shape changes with the number of observations. As the number of observations increases and our confidence in the sample statistics increases, the t-distribution approaches the Z distribution.

You can convince yourself by looking up the Z and t statistics for a confidence level of 95%. How far from the mean do we need to move to be confident that only 5 percent of the time the mean will fall in one of the tails? As we saw earlier, the Z term that is associated with an α/2 value of .025 is 1.96. When you turn to the t-statistic, you will notice that the t value depends on the sample size. For a sample of 10, the t-statistic would be based on 9 degrees of freedom (degrees of freedom = n - 1) and it would be 2.262. For a sample size of 31 (30 degrees of freedom), the t-statistic would be 2.042. With the smaller sample we would need to move 2.262 standard deviations to either side of the sample mean to be 95% certain that we have included the population mean, while in the larger sample we would need to move only 2.042 standard deviations to either side. The lower and upper limits using the t-distribution are:

lower limit L = $\bar{X} - t_{\alpha/2,\,n-1}\,s/\sqrt{n}$

upper limit U = $\bar{X} + t_{\alpha/2,\,n-1}\,s/\sqrt{n}$
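The t values quoted from the table can be approximated without a statistics package. The sketch below is one possible approach, not the only one: it integrates the t density numerically with Simpson's rule and finds the critical value by bisection. The function names are made up for this illustration.

```python
from math import gamma, sqrt, pi

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, using Simpson's rule on the interval [0, x]."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(df, confidence=0.95):
    """Two-sided critical value, found by bisection on the CDF."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for df in (9, 30, 120):
    print(df, round(t_critical(df), 3))
```

The critical value shrinks toward the Z value of 1.96 as the degrees of freedom grow, which is the pattern described above.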

To see how this works, assume that a study has been done in which 100 families were surveyed and the mean and standard deviation in family credit card debt were $1,200 and $300. What is the 95% confidence interval for the mean? Because the variance is unknown, you will generally want to use the t-statistic. The lower and upper tails would each be set at .025. The standard deviation of the sample mean is s/SQRT(n) = 300/SQRT(100) = 300/10 = 30. With 99 degrees of freedom the t-statistic is about 1.98; if we round it to 2, the confidence interval runs from 1200 - 2*30 = 1140 to 1200 + 2*30 = 1260. The 95% confidence interval is the range 1140 to 1260 so that you would be 95% confident that the true mean for the population was somewhere within this range. This would make it very unlikely that the average credit card debt per family for the population was 1300 since it lies outside of the confidence interval.
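The credit card arithmetic can be written out as a short Python sketch. The function name mean_ci and the choice to pass the multiplier in directly (here the rounded value of 2) are assumptions made for illustration.

```python
from math import sqrt

def mean_ci(sample_mean, s, n, multiplier):
    """Interval of sample_mean +/- multiplier * s / sqrt(n)."""
    se = s / sqrt(n)  # standard deviation of the sample mean
    return sample_mean - multiplier * se, sample_mean + multiplier * se

low, high = mean_ci(1200, 300, 100, multiplier=2)
print(low, high)            # 1140.0 1260.0
print(low <= 1300 <= high)  # False: 1300 lies outside the interval
```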

Sample size and binomial distribution

There is another set of questions one could look at when dealing with confidence intervals - sample size. How large a sample do I need to obtain a certain level of confidence? If we look back at the formula for the situation where the variance is known, the lower limit to the confidence interval is given by the formula Lower Limit = $\bar{X} - Z_{\alpha/2}\,\sigma/\sqrt{n}$. The interval's size is therefore dependent on the sample size. As the sample size increases (n gets larger), the interval decreases in size since the precision of our estimate increases.
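To see how the interval narrows as n grows, the sketch below computes the half-width of a 95% interval for a few sample sizes. It uses the two-dice standard deviation of 2.415 as the known population value; the function name half_width is made up for this illustration.

```python
from math import sqrt

def half_width(sigma, n, z=1.96):
    """Distance from the sample mean to either confidence limit."""
    return z * sigma / sqrt(n)

for n in (9, 36, 144):
    print(n, round(half_width(2.415, n), 3))
```

Because n enters through its square root, quadrupling the sample size only halves the width of the interval.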

As a last example of the concept of confidence intervals, let's return to the binomial distribution. The 100(1-α)% confidence interval for the binomial proportion of successes p is the interval bounded by the confidence limits below where f is the observed frequency.

lower limit L = $f - Z_{\alpha/2}\sqrt{f(1-f)/n}$

upper limit U = $f + Z_{\alpha/2}\sqrt{f(1-f)/n}$

For an example of a binomial distribution problem, let's assume that a sample of 100 students was taken and that 25 supported a proposed curriculum reform. What is the 95% confidence interval? Because it is a 95% confidence interval, we want a Z value of 1.96. If we then plug in the values for f (.25) and n (100) we get the following as upper and lower limits.

L = .25 - 1.96* SQRT[(.25)(.75)/100] = .25 - 1.96*.043301 = .25 - .085 = .165

U = .25 + 1.96* SQRT[(.25)(.75)/100] = .25 + 1.96*.043301 = .25 + .085 = .335

Based on these findings, we can be 95% confident that the percentage of the students supporting the proposal falls within the range of 16.5% to 33.5%.
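The same limits can be computed with a few lines of Python. The function name proportion_ci is hypothetical; the calculation follows the L and U arithmetic shown above.

```python
from math import sqrt

def proportion_ci(successes, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    f = successes / n                    # observed frequency
    half = z * sqrt(f * (1 - f) / n)     # Z times the standard deviation of f
    return f - half, f + half

low, high = proportion_ci(25, 100)
print(round(low, 3), round(high, 3))  # 0.165 0.335
```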

We have now looked at the concept of confidence intervals and it is time to move to our last topic, hypothesis testing. 

Hypothesis testing

We are now ready to extend our analysis of inferential statistics that began with the treatment of sampling distributions and continued on through the discussion of estimation and confidence intervals. The common denominator in all of these sections is the relationship between sample statistics and population parameters.  The relationship between the three concepts can be best seen by returning to an earlier experiment where you were asked to roll two dice ten times and record the mean and standard deviation of the number that appeared on the dice.  It is highly unlikely that any two samples were the same and that the sample statistics (mean and standard deviation) were the same.  What we observed was that there was a distribution of values for the sample mean and that as the sample size increased, the sample mean was distributed normally for most underlying distributions - a conclusion to be drawn from the central limit theorem. 

In the second section we examined the link between the population parameters mean (μ) and variance (σ²) and the sample statistics mean (X̄) and variance (s²). We found that the sample statistics were unbiased estimates of the true population parameters. We also looked into the question of how confident we were in our estimates, specifically the estimate of the population mean. We used our knowledge of the distribution of sample means to construct confidence intervals around the observed sample mean. In the case of the rolled dice, we found that when we had a sample of 9, the 95% confidence interval around the sample mean of 6.9 extended from 5.3 to 8.5.

In this section we will extend our analysis to hypothesis testing.  In the dice rolling example, we could want to test the hypothesis that the true mean is 7 or that the dice are fair.  In the case such as this where we are talking about the relationship between the sample mean and the population mean, hypothesis testing will be a simple extension of the analysis that we did on confidence intervals.   Unfortunately, there are many types of hypotheses that we would want to test so this section will contain a number of subsections devoted to specific types of hypotheses.   The good news is that once you figure out the hypothesis type, you will simply need to follow the script for that type of hypothesis test.

The script that you will follow will have a common form regardless of the hypothesis type.  In this section we are going to adopt a cookbook approach - similar to the cookbook approach used for S&D.  There will be six steps in the hypothesis testing process and you should follow each step without skipping over any one. 

The six steps are:

  1. Formulate the null hypothesis
  2. Formulate the alternative hypothesis
  3. Set the level of significance and the sample size
  4. Select the appropriate statistic and the rejection region
  5. Collect the data and calculate the statistic.
  6. If the calculated statistic falls in the rejection region, reject the null hypothesis.  If the calculated statistic falls outside the rejection region, do not reject the null hypothesis.

Formulate the hypotheses.   You begin the process by setting up a hypothesis concerning a population parameter.   For example, you could set up the hypothesis that the population mean is 7 in the dice rolling example or that the class average on the exam was 68.  These would be examples of null hypotheses.  The null hypothesis always refers to a specified value of a population parameter and not a sample statistic and the statement of the null hypothesis generally contains an = sign regarding the specified value of the parameter.

Once you set up a null hypothesis, the alternative hypothesis contains all the other potential values for the population parameter. For example, in the two examples above, the alternative hypothesis would be that the means were not 7 or 68. The alternative hypothesis never refers to a specified value of a population parameter and the statement of the alternative hypothesis never contains an = sign regarding the specified value of the parameter.

Generally the hypothesis testing situations are set up in such a way that we accept the null hypothesis unless there is some strong evidence to suggest that it is wrong.  For example, in our exam scores example, we would tend to accept the null hypothesis that the average score is 70 unless the sample mean turned out to be substantially higher or lower than 70.  If the sample mean turned out to be 90, there may be some real reason to doubt that the class average was 70 and we would be tempted to reject the null hypothesis.

Selecting the level of significance and the sample size. You will recall from the earlier discussion of sampling distributions and confidence intervals that the standard deviation of the distribution of sample means is dependent upon the sample size - the larger the sample the smaller the standard deviation. You can get more precision in your estimates if you increase the sample size. We would also need to determine the level of significance, which is the complement of the confidence level: a 95% confidence level corresponds to a 5% level of significance. Normally you will be concerned with situations where the level of significance is 5% or less (a confidence level of at least 95%).

We can demonstrate the situation with the diagram below, one that you have already seen. The diagram is of a normal distribution that would be the appropriate distribution of sample means, centered here on the observed sample mean. The null hypothesis would be that the population mean has a specific value. You would then determine the level of significance. If you wanted a 5% significance level (95% confidence), you would choose the critical values such that 95% of the possible values for the mean fell within the red area. If the specified value for the mean fell within this range, then you would not be able to reject the null hypothesis at the 5% level of significance. If on the other hand the specified value falls outside of that area, then there is reason to reject the null hypothesis.

Distribution of Sample Means


In the example of the dice, the sample mean was 6.9 and the Z value for a 95% confidence level is 1.96. The standard deviation of the sample mean for a sample size of 9 is σ/SQRT(n) = 2.415/SQRT(9) = .805. We now multiply this standard deviation by the Z value and we get 1.58, which we use to obtain the critical values. If we add and subtract 1.58 from the sample mean we obtain the two critical values. The lower limit equals 6.9 - 1.58 = 5.32 and the upper limit equals 6.9 + 1.58 = 8.48, or roughly 5.3 and 8.5. If the hypothesis is that the mean is 9, then this can be rejected at the 95% confidence level since 9 does not lie within the 95% confidence range. If the null hypothesis was that the mean was 8, however, then the hypothesis could not be rejected.

Once we have specified the critical value, we can see that there are two types of potential error. You could reject a true hypothesis or you could accept a false one. The first of these is a Type I error while the second is a Type II error. The probability of a Type I error, denoted by α, is called the level of significance of the statistical test, while the confidence coefficient (1-α) is the probability that the null hypothesis will not be rejected when in fact it is true. We will ignore for now any discussion of the probability of Type II error.
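A small simulation can make the meaning of the significance level concrete. The sketch below assumes, purely for illustration, a normally distributed population with the dice mean of 7 and standard deviation of 2.415; it repeatedly draws samples for which the null hypothesis is true and counts how often a two-tailed Z test at the 5% level rejects it anyway. The function name and the fixed random seed are choices made for this example.

```python
import random
from math import sqrt

def type_i_error_rate(trials=4000, n=9, mu=7.0, sigma=2.415, z_crit=1.96, seed=0):
    """Share of samples in which a true null hypothesis is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where the null (mean = mu) is true
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        z = (sample_mean - mu) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1  # a Type I error
    return rejections / trials

print(type_i_error_rate())  # should come out close to 0.05
```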

As we move into our discussion of hypothesis testing, you should keep in mind that there are two approaches to the problem of testing the hypothesis.  These two approaches will be described in the next section where we discuss tests for the hypotheses concerning the mean. 

Hypotheses concerning the mean when σ is known

Should we accept that the mean is μ given the results of a sample? This is a classic manufacturing problem where we have filling machines or stamping machines that are supposed to be filling containers or making parts that meet certain specifications. A quality control expert would want to know if the machine was working correctly, if the specifications were being met. If you were a gambler, you might want to know if the cards or dice were fair.

The procedure is real simple. You begin by computing the appropriate Z value for the level of significance. With this you then calculate the critical values and then determine whether the hypothesized mean falls within the critical values. If it does, then you cannot reject the null hypothesis.

Let's look at an example. We have a filling machine that is supposed to fill bags with 8 pounds of cereal with a standard deviation of .15 pounds. Assume that a sample of 50 bags has been chosen and that the mean weight is 7.98 pounds. Is there any reason to believe that the machine is not working correctly at the 95% confidence level?

The first step will be the development of the two hypotheses. The null hypothesis will be that the mean is 8 while the alternative hypothesis will be that it is not 8. The standard deviation of the sample means is .15/SQRT(50) = .0212. The Z value for a 95% level of confidence is 1.96 so the critical values are 7.98 - 1.96*.0212 = 7.94 and 7.98 + 1.96*.0212 = 8.02. Because the value of 8 lies between the critical values, the hypothesis cannot be rejected.

Another way to construct the test would be to estimate the Z value for the hypothesized mean. In this case Z = (7.98 - 8)/.0212 = -.02/.0212 = -.94. Because the critical Z value for the 95% confidence level is 1.96 and the computed Z is smaller in absolute value, we cannot reject the hypothesis.

There is a third way of testing the hypothesis - one that has become popular in recent years. The good news is that it is simply a new package for the old information. If you examine a Z table you will find that there is a probability associated with every value of Z. For example, where the Z value is 1.5, the probability that a number is more than 1.5 standard deviations above the mean would be .0668. The probability that a number is more than 1.5 standard deviations below the mean would also be .0668. In this case, the probability that we would find a number further than 1.5 standard deviations from the mean in either direction would be 2*.0668 = .1336. This would be the published p-value. In the example above where the Z value was -.94, the p-value would be about .347. Because this is well above .05, a sample mean this far from 8 could easily arise by chance, so we would not reject the hypothesis.
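The filling machine test can be run in Python in both the Z value and p-value forms. This is a minimal sketch; the function name two_sided_z_test is made up for illustration. Because it keeps the unrounded Z of about -.943 rather than -.94, the p-value it reports differs slightly in the third decimal from a table lookup.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (Z, p-value, reject?) for a two-tailed test of mean = mu0."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Probability of a value at least this far from the mean in either tail
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha

z, p, reject = two_sided_z_test(7.98, 8, 0.15, 50)
print(round(z, 2), round(p, 2), reject)  # -0.94 0.35 False
```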

Now for a short cut. When you are dealing with 95% confidence levels, you will find that the hypothesis will not be rejected as long as the absolute value of the Z-value is less than 1.96. If the confidence level were 98%, then the critical Z-value would be approximately 2.33. You would not be able to reject the null hypothesis if the absolute value of Z was less than 2.33.

Sometimes, however, there is reason to conduct a one-tailed test. For example, consider a situation where a firm is supplying fasteners that must be 1.4 inches in length to be useful. The real concern is with fasteners that are less than this length since they cannot be used. If the mean length for a sample of 25 is 1.38 and the standard deviation for the population is .1 inches, should we accept that the equipment is operating correctly?

We start with calculating the standard deviation of the sample mean, which is .1/SQRT(25) = .1/5 = .02. Next we can compute the Z value. Z = (1.38-1.4)/.02 = -1. Here is where things are a bit different. In the past we selected a Z value of 1.96 for the 95% confidence level because it allowed for a 2.5% chance of being too high and a 2.5% chance of being too low. Here we want to examine a situation where we are concerned only with the results being too low. The 5% in this case will be concentrated in the lower tail so you would look for a Z value with .05 rather than .025 in the tail. The critical Z value is -1.645 and therefore you would not reject the hypothesis as long as the computed Z value was greater than -1.645. In this case where the Z value was -1, you would not be able to reject the hypothesis that the machine is operating effectively. If we took the p-value approach to hypothesis testing, the p-value for Z = -1 would be .1587. Because this p-value is greater than the significance level of .05, you would not be able to reject the null hypothesis.
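The one-tailed fastener test can be sketched the same way. The function name lower_tail_z_test is hypothetical; it places the whole 5% in the lower tail because only short fasteners are a concern.

```python
from math import sqrt
from statistics import NormalDist

def lower_tail_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """One-tailed test where only values below mu0 count against the null."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    critical = NormalDist().inv_cdf(alpha)  # all of alpha in the lower tail
    p_value = NormalDist().cdf(z)           # probability of a Z this low or lower
    return z, critical, p_value, z < critical

z, critical, p, reject = lower_tail_z_test(1.38, 1.4, 0.1, 25)
print(round(z, 2), round(critical, 3), round(p, 4), reject)
# -1.0 -1.645 0.1587 False
```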

The procedure for testing a hypothesis when we have the binomial distribution is the same except that we have the different formula for the Z value.  When you have a binomial distribution, the Z statistic is computed as

Z = (ps - p)/SQRT(p(1-p)/n)

or

Z = (X - np)/SQRT(np(1-p))

Let's look at a new tutoring package that is being evaluated. In the past, 25 percent of the students who used the service improved their grade. With the new procedure, 29 of 100 students who went through the program improved their grade. Is the new procedure more effective?

We'll use the second approach and plug the values into the equation.  Z = (29 - .25*100)/SQRT(100*.25*.75) = .924.  Because this value is less than 1.96, there is no reason to reject the hypothesis that there has been no change.   Would your answer change if the number of students with improved grades was 36?   We would use the same procedure and the Z value would be  Z = (36 - .25*100)/SQRT(100*.25*.75) = 2.54.  Based on this Z value you would reject the null hypothesis and accept the alternative that there was a difference. 
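Both tutoring calculations can be checked with a short Python sketch that uses the second form of the formula, Z = (X - np)/SQRT(np(1-p)). The function name proportion_z is made up for illustration.

```python
from math import sqrt

def proportion_z(successes, n, p0):
    """Z statistic for an observed count against a hypothesized proportion p0."""
    return (successes - n * p0) / sqrt(n * p0 * (1 - p0))

print(round(proportion_z(29, 100, 0.25), 3))  # 0.924
print(round(proportion_z(36, 100, 0.25), 2))  # 2.54
```

With 29 successes the Z value is below 1.96, while with 36 it is above, which matches the two conclusions reached in the text.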

Hypotheses concerning the mean when σ is unknown

The procedure when the variance is unknown is virtually identical to what we just did, the only difference being that the t-distribution rather than the Z-distribution would be used to test hypotheses.

Hypotheses concerning differences between two means

Now that we are on a roll, let's look at the procedure for comparing the differences between two means. The good news is the procedure is basically the same - set up the hypothesis, determine the critical values associated with the significance level, and calculate the Z and t  statistics.  The bad news is that the formula will be a bit more cumbersome.  If the two samples are 1 and 2, then the Z and t statistics would be

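$Z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$

$t = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{S^2\,(1/n_1 + 1/n_2)}}$

where:

$S^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$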

To see how this test works, let's look at the familiar grade problem.  Below you will find the test grades for two sections of ECN201.  Our interest is in determining whether there is a significant difference between the two groups. 

Exam Scores

Student   Class 1   Class 2
1         70.59     67.80
2         71.99     66.38
3         73.68     66.65
4         71.39     69.63
5         74.42     67.06
6         73.85     68.61
7         71.34     66.19
8         74.54     67.28
9         74.80     70.91
10        71.90     69.29
11        73.85     69.38
12        73.01     70.36
13        71.96     70.80
14        71.20     69.01
15        74.19     66.81
16        70.07     69.29
17        74.88     69.46
18        70.91     67.60
19        74.61     67.25
Mean      72.80     68.41

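The means for the two classes are 72.8 and 68.4 and the standard deviations are 1.57 and 1.48, with 19 scores in each class (the final row of the table is the class mean). We can plug these numbers into the formula and get

$t = \dfrac{72.8 - 68.4}{\sqrt{S^2\,(1/19 + 1/19)}}$

If we plug into the formula for S2 we get

$S^2 = \dfrac{18(1.57)^2 + 18(1.48)^2}{36} = 2.33$

Combining all of this information we get

$t = \dfrac{4.4}{\sqrt{2.33 \times 0.1053}} = \dfrac{4.4}{0.495}$ = 8.89.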

Based on this t-statistic value, you would reject the null hypothesis that the two performances were the same.  There is support for the hypothesis that the two classes performed differently.
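The two-class comparison can be sketched in Python from the summary figures alone. Two assumptions are made here because the original formula images are not available: that the standard pooled-variance t formula is the one intended, and that each class has 19 scores. The exact statistic depends on those choices; under them it comes out near 8.9, far above any conventional critical value, so the decision to reject is the same. The function name pooled_t is made up for illustration.

```python
from math import sqrt

def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Two-sample t statistic using a pooled variance; returns (t, df)."""
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))  # std. deviation of the difference
    return (mean1 - mean2) / se, n1 + n2 - 2

# Assumed inputs: the means and standard deviations quoted in the text, n = 19 each
t, df = pooled_t(72.8, 1.57, 19, 68.4, 1.48, 19)
print(round(t, 2), df)  # 8.89 36
```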

We have now examined in some detail hypothesis testing when we are dealing with sample means.  There are other tests for sample medians and variances that you can find in standard texts, but this will be the end of our discussion of hypothesis testing.