Why n-1 standard deviation

So, we take the square root to get us back to the original units.

Sample standard deviation

And then there's the formula for the sample standard deviation. The names "sample" and "population" give some indication of the difference between the two types of standard deviation. For a sample standard deviation, you are sampling: you don't have all the data. That kind of makes the choice easy, because in the real world you never have all the data. Then again, are we looking for the variation in one lot of product, or the variation that the production equipment is capable of?
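For reference, the two formulas being compared are usually written as below. The symbols are the conventional ones and are an assumption here: $\mu$ is the true population mean, $\bar{x}$ is the average of the sampled points, and $N$ and $n$ are the respective counts.

```latex
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2}
\qquad\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2}
```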

In general, you don't have all the data, so all you can compute is the sample standard deviation. Let's look at the other differences. The two formulas use different symbols for the center of the data, conventionally μ in the population formula and x̄ in the sample formula. The first symbol stands for the actual value of the average of all the data.

The latter stands for an estimate of the average of all the data. Estimate of the average? I have a subtle distinction to make. We are used to thinking that the statistical mean is just a fancy word for "average", but there is a subtle difference.

The average (or should I say "an" average) is one estimate of the mean. If I take another collection of data points from the whole set of them (if I sample the population), then I get another estimate of the mean. One may ask, "How good is this estimate?" If you take one data point to compute the average (kind of a silly average, since there is only one), then you have no idea how good the average is. But if you have the luxury of taking a bunch of data points, then you have some information about how close the average might be to the mean.

I'm not being very statistical here, but it seems like a good guess that the true mean would lie somewhere between the smallest data point and the largest. Let's be a bit more precise. The typical error of the average is the standard deviation of the data divided by the square root of the number of points averaged. This is kind of an important result. If you wish to improve the statistical accuracy of your estimate of the mean by, for example, a factor of two, then you need to average four points together.
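The claim that averaging four points halves the uncertainty can be checked numerically. The following is a minimal sketch, assuming data drawn from a normal distribution with standard deviation 1; the function name `spread_of_averages` and the trial count are made up for illustration.

```python
import random
import statistics

def spread_of_averages(n, trials=5000, seed=0):
    """Standard deviation of many averages, each taken over n draws from N(0, 1)."""
    rng = random.Random(seed)
    averages = [
        statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.pstdev(averages)

# The spread should shrink roughly like 1 / sqrt(n):
# about 1 for n = 1, about 0.5 for n = 4, about 0.1 for n = 100.
for n in (1, 4, 100):
    print(n, round(spread_of_averages(n), 3))
```

The exact numbers depend on the random seed, but the ratio between them should stay close to the square-root rule.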

If you want to improve your estimate by a factor of ten, you will need to average 100 data points.

Difference between sample and population standard deviation

Finally, I can state a little more precisely how to decide which formula is correct. It all comes down to how you arrived at your estimate of the mean. If you have the actual mean, then you use the population standard deviation, and divide by n. If you come up with an estimate of the mean based on averaging the data, then you should use the sample standard deviation, and divide by n-1.

But let's think about why the estimate you get by dividing by n would be biased, and why we might want to have an estimate that is larger.
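The decision rule (known mean: divide by n; mean estimated from the same data: divide by n-1) maps directly onto two functions in Python's standard `statistics` module. This is a small sketch with made-up measurements; the "known" mean of 10.0 is an assumption for illustration.

```python
import statistics

data = [9.8, 10.1, 10.4, 9.9, 10.3]  # made-up measurements

# Case 1: the true mean is known from elsewhere (assumed to be 10.0 here),
# so deviations are measured from it and the sum of squares is divided by n.
known_mean = 10.0
sd_known = statistics.pstdev(data, mu=known_mean)

# Case 2: the mean is estimated by averaging the same data,
# so the sum of squared deviations is divided by n - 1.
sd_estimated = statistics.stdev(data)

print(sd_known, sd_estimated)
```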

And then maybe in the future, we could have a computer program or something that really makes us feel better, that dividing by n minus 1 gives us a better estimate of the true population variance. So let's imagine all the data in a population. And I'm just going to plot them on a number line.

So this is my number line. And let me plot all the data points in my population. So this is some data.

Here's some data. And here is some data here. And I can just do as many points as I want. So these are just points on the number line. So this is my entire population. So let's see how many. I have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, ... So in this case, what would be my big N?

My big N would be the number of points I just counted. Now, let's say I take a sample, a lowercase n, and let's say my sample size is 3. Before I even think about that, let's think about roughly where the mean of this population would sit. The way I drew it (and I'm not going to calculate it exactly), it looks like the mean might sit someplace roughly right over here.

So the mean, the true population mean, the parameter, is going to sit right over here. Now, let's think about what happens when we sample. And I'm going to do just a very small sample size just to give us the intuition, but this is true of any sample size. So let's say we have a sample size of 3. So there is some possibility, when we take our sample of size 3, that we happen to sample it in a way that our sample mean is pretty close to our population mean.

So for example, if we sampled that point, that point, and that point, I could imagine that our sample mean might actually sit pretty close to our population mean. But there's a distinct possibility that maybe when I take a sample, I sample that and that. And the key idea here is that when you take a sample, your sample mean is always going to sit within your sample. And so there is a possibility that when you take your sample, the true population mean could even be outside of the sample.

And so in this situation (and this is just to give you an intuition), your sample mean is going to be sitting someplace in there. And so if you were to just calculate the distance from each of these points to the sample mean (this distance, that distance), square it, and divide by the number of data points you have, this is going to be a much lower estimate than the true variance measured from the actual population mean, where these things are much, much further away.

Now, you're not always going to have the true population mean outside of your sample. But it's possible that you do. So in general, when you just take your points and find the squared distance to your sample mean, which is always going to sit inside of your data even though the true population mean could be outside of it or at one end of your data, you are likely to be underestimating the true population variance.
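The underestimate described here is easy to see with a small simulation. This is a sketch under stated assumptions: samples of size 3 are drawn from a normal population whose true variance is 1 (the talk itself uses a hand-drawn set of points), and the function name `average_variance_estimates` is made up for illustration.

```python
import random
import statistics

def average_variance_estimates(n=3, trials=50000, seed=1):
    """Average the divide-by-n and divide-by-(n-1) variance estimates over many samples."""
    rng = random.Random(seed)
    total_div_n = 0.0
    total_div_n_minus_1 = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # true variance is 1
        mean = statistics.fmean(sample)                   # sample mean, inside the sample
        ss = sum((x - mean) ** 2 for x in sample)         # squared distances to it
        total_div_n += ss / n
        total_div_n_minus_1 += ss / (n - 1)
    return total_div_n / trials, total_div_n_minus_1 / trials

biased, corrected = average_variance_estimates()
# Dividing by n should average near (n - 1) / n = 2/3 of the true variance;
# dividing by n - 1 should average near the true variance of 1.
print(round(biased, 2), round(corrected, 2))
```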

So this right over here is an underestimate.

You want to draw conclusions about the population. The Bayesian approach would be to evaluate the posterior predictive distribution over the sample, which is a generalized Student's t distribution (the origin of the t-test). The generalized Student's t distribution has three parameters and makes use of all three of your statistics.

If you decide to throw out some information, you can further approximate your data using a two-parameter normal distribution, as described in your question. From a Bayesian standpoint, you can imagine that uncertainty in the hyperparameters of the model (distributions over the mean and variance) causes the variance of the posterior predictive to be greater than the population variance.

I'm jumping VERY late into this, but would like to offer an answer that is possibly more intuitive than others, albeit incomplete. The non-bold numeric cells show the squared difference.

My goodness, it's getting complicated!

I thought the simple answer was You just don't have enough data outside to ensure you get all the data points you need randomly. The n-1 helps expand toward the "real" standard deviation.

Tal Galili

You ask them "why this?

Watch this, it precisely answers your question.

Michael Lew

In essence, the correction is n-1 rather than n-2 etc. because the n-1 correction gives results that are very close to what we need.

More exact corrections are shown here: en.

What if it overestimates?

Dror Atariah

Why is it that the total variance of the population would be the sum of the variance of the sample from the sample mean and the variance of the sample mean itself? How come we sum the variances?

See here for intuition and proof.

I have to teach the students with the n-1 correction, so dividing by n alone is not an option. As written before me, mentioning the connection to the second moment is not an option.

Although mentioning how the mean was already estimated, thereby leaving us with less "data" for the sd, is important.

Regarding the bias of the sd: I remembered encountering it. Thanks for driving that point home.

In other words, I interpreted "intuitive" in your question to mean intuitive to you.

Thank you for the vote of confidence. The loss of a degree of freedom in estimating the expectation is one that I was thinking of using in class.

But combining it with some of the other answers given in this thread will be useful to me, and I hope to others in the future.

You know non-mathers like us can't tell.

I did say gradually.

Mooncrater
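On the question raised in the thread of why the variances add, the identity behind it is short enough to write out. This is a sketch assuming the $x_i$ are independent draws with mean $\mu$ and variance $\sigma^2$; the notation is conventional rather than taken from the thread.

```latex
\begin{align*}
\sum_{i=1}^{n}(x_i-\mu)^2
  &= \sum_{i=1}^{n}(x_i-\bar{x})^2 + n(\bar{x}-\mu)^2
  && \text{since } \sum_{i=1}^{n}(x_i-\bar{x}) = 0, \\
n\sigma^2
  &= E\Big[\sum_{i=1}^{n}(x_i-\bar{x})^2\Big] + n\cdot\frac{\sigma^2}{n}
  && \text{taking expectations}, \\
E\Big[\sum_{i=1}^{n}(x_i-\bar{x})^2\Big]
  &= (n-1)\,\sigma^2 .
\end{align*}
```

The last line says that the sum of squared distances to the sample mean is, on average, only n-1 times the true variance, which is why dividing it by n-1 rather than n gives an unbiased estimate of the variance.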


