Statistics Public Forums

Confidence Intervals and Hypothesis Testing: Example 6.6

alexisvdb
Aug 23

Hi Everyone,

I guess I am looking for a bit of a reality check...

In Example 6.6 (3rd edition), we are looking for a sample size that will lead to a certain margin of error at a 95% confidence level. The answer is that we would need to have a sample of 601 students to meet the described conditions.

So far so good.

Then it suddenly occurred to me that 601 students could be more than 10% of the total population of students of some universities. In that case (>10%) we could no longer regard the observations as independent.

Perhaps I am confused now, but in practice, what would be the correct way to proceed? It seems strange to reduce the sample size here for the sake of meeting the independence condition...?

Best,

Alex

David
Aug 23

Hi Alex,

The 10% guideline has some peculiarities to it, and it's actually something we may retire in future editions.

The main reason it's less critical than other conditions is that, when it's broken, the SE should actually become smaller, so the usual formula overstates the uncertainty rather than understating it. You can see this in the extreme: sample an entire population, and the SE for the estimate characterizing that population should now be 0.
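To make the shrinking SE concrete, here is a rough Python sketch. It is an illustration under assumptions, not the book's method: it assumes the example uses a margin of error of 0.04 and a worst-case p = 0.5 (values that happen to reproduce the 601 figure), and it applies the standard finite population correction factor sqrt((N - n) / (N - 1)) for sampling without replacement. The function names and the population sizes are made up for the illustration.

```python
import math

def required_n(margin, p=0.5, z=1.96):
    """Smallest n with z * sqrt(p(1-p)/n) <= margin (usual large-population formula)."""
    return math.ceil(z**2 * p * (1 - p) / margin**2)

def se_proportion(p, n, population_size=None):
    """SE of a sample proportion.

    If population_size is given, apply the finite population correction
    for sampling without replacement.
    """
    se = math.sqrt(p * (1 - p) / n)
    if population_size is not None:
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se

# Assumed inputs: margin of error 0.04, p = 0.5, 95% confidence (z = 1.96).
n = required_n(0.04)

# Hypothetical university sizes, from very large down to "we sampled everyone".
for N in (100_000, 6_000, 2_000, n):
    usual = se_proportion(0.5, n)
    corrected = se_proportion(0.5, n, population_size=N)
    print(f"N={N:>7}  usual SE={usual:.4f}  corrected SE={corrected:.4f}")
```

The corrected SE is never larger than the usual one and drops to 0 when the sample is the whole population, so keeping the sample of 601 at a small university is safe: the usual interval is just wider than it needs to be.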

A second consideration is more nuanced. If we're wondering whether a relationship between variables in a population is due to chance or not, then the 10% guideline is irrelevant. Even if we sample the entire population and find a minuscule difference, that difference could be due to randomness, since populations themselves grow from random processes. (This is pretty nuanced, so don't worry if it isn't clear.)

Hope this helps!

Best,

David

alexisvdb
Aug 23

Hi David,

Thanks for the explanation!

Alex
