Testing a Population Mean

November 29, 2011

This is a straightforward hypothesis test about a population mean. We give only the value to test and the sample statistics, with no interpretation, so you can see how the method works.

Notice we use a t-score and the t-distribution in this case. We do this whenever the hypothesis test is about a population mean and we only have the sample standard deviation s to work with (which is the usual situation).
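For readers who want to check the arithmetic by hand, here is a minimal Python sketch of the t-score calculation. It assumes the usual one-sample formula t = (\bar x - \mu_0) / (s / \sqrt n) with n - 1 degrees of freedom. The function name `t_score` and the data are made up for illustration and are not the numbers from the Triola problem.

```python
import math
import statistics

def t_score(sample, mu0):
    """Return (t, degrees of freedom) for testing H_0: mu = mu0."""
    n = len(sample)
    x_bar = statistics.mean(sample)   # sample mean
    s = statistics.stdev(sample)      # sample standard deviation (divides by n - 1)
    t = (x_bar - mu0) / (s / math.sqrt(n))
    return t, n - 1

# Made-up data, only to show the mechanics
sample = [98.2, 98.6, 98.0, 98.8, 98.4]
t, df = t_score(sample, mu0=98.6)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The p-value then comes from the t-distribution with that many degrees of freedom, using a table or a spreadsheet.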

Problem adapted from Triola’s Elementary Statistics

Height of Women

November 29, 2011

This problem involves the \bar x-distribution and asks how likely it is to get a sample mean of a certain size.

Here the idea is to change areas in the \bar x-distribution into areas for the standard normal curve and look them up in the table.

The \mu is the population mean and it corresponds to z=0 as a z-score. The \bar x is a sample mean and it corresponds to a data value in the left graph above. You change it into a z-value in the right graph by using the formula for the z-score:

z = \frac{\text{data} - \text{mean}}{\text{stddev}} = \frac{\bar x - \mu}{\frac{\sigma}{\sqrt n}}

Notice that the standard deviation this time on the bottom is

stddev = \frac{\sigma }{{\sqrt n }}

This comes from the Central Limit Theorem and is the form you use when you are working with the \bar x-distribution.
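The conversion from a sample mean to a z-value, and then to an area, can be sketched in Python as follows. This is only an illustration: the function names are invented here, the numbers are hypothetical (not the ones from the Larson/Farber problem), and the area is computed with `math.erfc` from the standard library instead of being looked up in the table.

```python
import math

def z_for_sample_mean(x_bar, mu, sigma, n):
    """z-score of a sample mean, using stddev = sigma / sqrt(n)."""
    return (x_bar - mu) / (sigma / math.sqrt(n))

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

# Hypothetical values: population mean 64, sigma 2.5, samples of size 25
z = z_for_sample_mean(x_bar=65.0, mu=64.0, sigma=2.5, n=25)
prob_below = normal_cdf(z)        # chance the sample mean is below 65
prob_above = 1.0 - prob_below     # chance the sample mean is above 65
print(f"z = {z:.2f}, area to the left = {prob_below:.4f}")
```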

Problem adapted from Larson/Farber’s Elementary Statistics

Huge z-values

November 17, 2011

What to do for “huge z-values”? They do sometimes turn up, especially when the evidence for a hypothesis test is very strong one way or the other. Anyway, just use a spreadsheet to find your p-values instead of the standard normal table. The following formulas work for finding p-values using a spreadsheet (and work for any size z-values actually, not just “huge” ones):

1) left tailed test, use =\text{NORMSDIST}(z)

For example if z=-3.8

p=\text{NORMSDIST}(-3.8)=0.000072348261104

2) right tailed test, use =1.0-\text{NORMSDIST}(z)

For example if z=4.4

p=1.0-\text{NORMSDIST}(4.4)=0.000005408458265

3) two tailed test, and z is negative, use =2*\text{NORMSDIST}(z)

For example if z = -4.8

p=2*\text{NORMSDIST}(-4.8)\approx 0.00000159

4) two tailed test, and z is positive, use =2*\text{NORMSDIST}(-z)

For example if  z = 5.1

p=2*\text{NORMSDIST}(-5.1)\approx 0.00000034

Of course any time the z-value is “huge” that means the p-value is going to be small.

And remember a very small p-value means very strong evidence against the null hypothesis H_0.

You should still find the actual p-value and state it when you do a hypothesis test, even if the “huge” z-value already tells you it is going to give very strong evidence against H_0.
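As a rough cross-check outside a spreadsheet, the four formulas above can be sketched in Python. The names `normsdist` and `p_value` are made up here, and the normal area is computed with `math.erfc` from the standard library rather than with a spreadsheet function. Far out in the tails the trailing digits depend on how the area is computed, so do not expect every digit to match what a particular spreadsheet prints.

```python
import math

def normsdist(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def p_value(z, tail):
    """p-value for a z test statistic; tail is 'left', 'right' or 'two'."""
    if tail == "left":
        return normsdist(z)
    if tail == "right":
        # Same area as 1 - normsdist(z), but keeps precision for huge z
        return normsdist(-z)
    if tail == "two":
        # Covers cases 3) and 4): double the tail beyond |z|
        return 2 * normsdist(-abs(z))
    raise ValueError("tail must be 'left', 'right' or 'two'")

for z, tail in [(-3.8, "left"), (4.4, "right"), (-4.8, "two"), (5.1, "two")]:
    print(f"z = {z:5.1f}, {tail:5s}-tailed: p = {p_value(z, tail):.3e}")
```

Using `-abs(z)` for the two-tailed case means you do not have to check the sign of z first.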

How Do You Know Which Test to Use?

November 12, 2011

When you are doing a hypothesis test, how do you decide which form to use? Say we are going to test whether the population proportion is .27. Which of the three possibilities do we choose?

H_0 : \theta  = .27
H_a : \theta  \ne .27

or

H_0 : \theta  = .27
H_a : \theta < .27

or

H_0 : \theta  = .27
H_a : \theta  > .27

Let’s look at some typical phrases used with these forms.

If you claim the population proportion “is smaller than 27%” or “is less than 27%” then you would use the form

H_0 : \theta  = .27
H_a : \theta < .27 \text{\,\,      (your claim) }

and your claim corresponds to the alternative hypothesis.

If you claim the population proportion “is greater than 27%” or “is more than 27%” then you would use the form

H_0 : \theta  = .27
H_a : \theta > .27 \text{\,\,      (your claim) }

and your claim corresponds to the alternative hypothesis.

If you claim the population proportion “differs from 27%” or “is not equal to 27%” then you would use this form

H_0 : \theta  = .27
H_a : \theta \ne .27 \text{\,\,      (your claim) }

and your claim corresponds to the alternative hypothesis.

When to use p-value, when to use level of significance

November 12, 2011

When you do a hypothesis test, you use a sample to come up with a test statistic (z-value). Then you use the z-value to come up with a p-value, where the p-value is the area of the “tail(s)” involved.

Now the book says that you decide the result of the hypothesis test based on the size of the p-value (page 438). The smaller the p-value, the more evidence against the null hypothesis you have. This is what you do if you are not given a level of significance (i.e. no \alpha).

But what happens if the problem says to use 1% or 5% level of significance (\alpha=.01 or \alpha=.05) ?

This just means that when you are done you compare the p-value to this \alpha to decide whether you reject the null hypothesis or not.

If p \leqslant \alpha, then you reject the null hypothesis and say the result is “significant”.

If p >  \alpha, then you say there is not enough evidence to reject the null hypothesis and say the result is “not significant”.

Example (Using a level of significance in hypothesis test):

As an example suppose you are using a level of significance \alpha=.05 and your p-value is .032. Since .032 \leqslant .05, you reject the null hypothesis and say the result is significant.

Or suppose your level of significance is \alpha=.01 and your p-value is .025. Since .025 > .01, you do not have enough evidence to reject the null hypothesis, and you say the result is not significant.

So the upshot is: when you are given a level of significance (\alpha), compare the p-value to it to decide whether to reject. If the problem doesn’t give a level of significance, then decide your conclusion by using the size of the p-value and the table on p. 438 of our book.
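The comparison rule is simple enough to write as a tiny Python function. This is just a sketch of the rule stated above; the name `decide` and the wording of the returned strings are made up for illustration.

```python
def decide(p_value, alpha):
    """Compare a p-value to the level of significance alpha."""
    if p_value <= alpha:
        return "reject H_0 (significant)"
    return "do not reject H_0 (not significant)"

# The two examples from the text
print(decide(0.032, alpha=0.05))  # p is at most alpha, so reject
print(decide(0.025, alpha=0.01))  # p is bigger than alpha, so do not reject
```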

Extraterrestrials

November 11, 2011

Here is an example of a two tailed hypothesis test.

For a two-tailed test you have to find the z-value and then take the p-value to be twice the area of the tail you get. Since for a two-tailed test the alternative hypothesis is H_a : \theta  \ne \theta_0, we have to allow that the sample proportion could be bigger than \theta_0 or it could be smaller than \theta_0. That is why we need two tails.
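Here is a minimal Python sketch of the two-tailed calculation. It assumes the usual one-proportion z statistic, z = (\hat p - \theta_0) / \sqrt{\theta_0 (1 - \theta_0) / n}, which is not written out in this post. The function name and the numbers are hypothetical and are not the ones from the extraterrestrial problem.

```python
import math

def two_tailed_proportion_test(successes, n, theta0):
    """Return (z, p) for H_0: theta = theta0 against H_a: theta != theta0."""
    p_hat = successes / n                              # sample proportion
    std_err = math.sqrt(theta0 * (1 - theta0) / n)     # stddev assuming H_0 is true
    z = (p_hat - theta0) / std_err
    # erfc(|z| / sqrt(2)) equals twice the area of one tail beyond |z|
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Hypothetical survey: 30 of 100 people say yes, testing theta0 = .25
z, p = two_tailed_proportion_test(successes=30, n=100, theta0=0.25)
print(f"z = {z:.3f}, two-tailed p-value = {p:.4f}")
```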

Hmm… even if USA Today is right I’m beginning to wonder under what conditions people did see an extraterrestrial. Late at night I bet…

Problem adapted from Larson/Farber’s Elementary Statistics

Do You Eat Breakfast

November 11, 2011

Here’s another example of a hypothesis test. Again you tell the type of test (right-tailed or left-tailed or two-tailed) from the form of the alternative hypothesis.

Notice in this one we got a fairly big p-value compared to the level of significance we were using. So we didn’t even come close to rejecting the null hypothesis this time.

Problem adapted from Larson/Farber’s Elementary Statistics

Network News

November 11, 2011

This is a hypothesis test. This time the claim made by the research involved corresponds to the alternative hypothesis.

Maybe the data this problem talks about is old. Can it really be 55% watching network news? What about all those people who only watch podcasts or only read Google News?

Problem adapted from Larson/Farber’s Elementary Statistics

Outlawing Cigarettes

November 11, 2011

This problem shows an example of a hypothesis test. This one is a right tailed test, which you can tell from the form of the alternative hypothesis.

Note that if you are not using a level of significance to decide whether to reject or not, then you wind up with a p-value and make some conclusion about how much evidence there is against the null hypothesis based on the size of the p-value. The smaller the p-value the less likely you are to believe in the null hypothesis.

Problem adapted from Larson/Farber’s Elementary Statistics

Coke vs Pepsi – Some Terms You Need

October 28, 2011

Let’s clear up some terms…. you will need these when you are working on the Spreadsheet Assignment for Module 5

  • Population – the ideal group you want to study… usually it is too big to get all the data from this group
  • Sample – the smaller group that you do actually collect data from (the members of your sample come from your population…. they are some smaller portion of the population)
  • Parameter – the number or quantity that you are interested in about the population
  • Statistic – the number or quantity that you are interested in, but computed using only the data from your sample

Now for problems that involve the percentage of your population that feel a certain way, the parameter is called the population proportion (\theta) and the statistic is called the sample proportion (\hat p).

Let’s do an example similar to the one you need to do for the spreadsheet assignment.

Example: Suppose I want to know what percent of all people prefer Coke over Pepsi. I sample some people and get 163 say they prefer Coke and 87 say they prefer Pepsi.

a) What is the population proportion?
b) What is the sample proportion?
c) What is the population?
d) What is the parameter?
e) What is the sample?
f) What is the sample size?
g) What is the statistic?

If you can answer these questions… then you understand the terms…  Here are the answers:

Solution:

a) What is the population proportion?

The proportion (or percentage) of all people that prefer Coke. This is called \theta.
(Note: we don’t know this value exactly…)

b) What is the sample proportion?

The proportion of people in the sample that prefer Coke.

This is called \hat p

To compute this we divide the number in the sample that prefer Coke by the entire sample size (163 + 87 = 250):

\hat p =  \frac{163}{250} = .652

So the sample proportion is 65.2%.
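The same arithmetic in Python, as a quick check. The variable names are made up for illustration; the counts are the ones from the example.

```python
coke, pepsi = 163, 87      # counts from the example
n = coke + pepsi           # sample size
p_hat = coke / n           # sample proportion

print(f"n = {n}, p-hat = {p_hat:.3f} ({p_hat:.1%})")
```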

c) What is the population?

all people

d) What is the parameter?

Here it is the population proportion (or percentage) of all people that prefer Coke. (This is the same as in part a) above.)

e) What is the sample?

“some people”. It looks like there are 163 + 87 = 250 people in the sample.

f) What is the sample size?

n = 250

g) What is the statistic?

Here it is the sample proportion (or percentage) of people in the sample that prefer Coke. This is the same as in part b) above.

