Questions concerning the power of the standard deviation
The formula for standard deviation is
$$S_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^n(x_i-\bar{x})^2}$$
I learned that $68\%$ of the values fall within $S_x$ of the mean, $95\%$ fall within $2S_x$, and $99.7\%$ fall within $3S_x$.
My question is: why the second power? Could it also be $(x_i-\bar{x})^4$, or any other even power?
What is the reason behind the second power? Is it just easy to use, or is there some other meaning to it?
statistics
asked Sep 10 at 20:58
Larry
3 Answers
Some reasons to define the variance and standard deviation the way they're defined:
With this definition, the mean minimizes the variance, meaning: if we compute the mean square deviation from some value $\mu$, it is minimal when $\mu$ is the mean:
\begin{eqnarray*}
f(\mu)&=&\sum_i(x_i-\mu)^2\;,\\
f'(\mu)&=&-2\sum_i(x_i-\mu)\;,\\
f'(\mu)=0&\Leftrightarrow&\mu=\frac1n\sum_ix_i\;.
\end{eqnarray*}
This doesn't work the same way with higher even powers, e.g.:
\begin{eqnarray*}
f(\mu)&=&\sum_i(x_i-\mu)^4\;,\\
f'(\mu)&=&-4\sum_i(x_i-\mu)^3\;,\\
f'(\mu)=0&\Leftrightarrow&\sum_i(x_i-\mu)^3=0\;,
\end{eqnarray*}
a cubic equation for $\mu$ without a natural interpretation. Thus, the median minimizes the mean absolute deviation, and the mean minimizes the mean square deviation, whereas the number minimizing the mean quartic deviation isn't known to have any nice properties.
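The three minimizers mentioned above can be compared numerically. The following is only a minimal sketch: the helper name `argmin_power` and the sample values are made up for illustration, and the minimum is found by a plain ternary search, which works here because each objective is convex in $\mu$.

```python
import statistics


def argmin_power(xs, p, iters=200):
    """Return the m minimizing sum(|x - m|**p) over m (hypothetical helper).

    The objective is convex for p >= 1, so ternary search on
    [min(xs), max(xs)] converges to a minimizer.
    """
    lo, hi = min(xs), max(xs)

    def cost(m):
        return sum(abs(x - m) ** p for x in xs)

    for _ in range(iters):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if cost(a) < cost(b):
            hi = b
        else:
            lo = a
    return (lo + hi) / 2


data = [1.0, 2.0, 3.0, 4.0, 10.0]  # made-up sample with one large value

m1 = argmin_power(data, 1)  # minimizer of the absolute deviations
m2 = argmin_power(data, 2)  # minimizer of the squared deviations
m4 = argmin_power(data, 4)  # minimizer of the quartic deviations

print("p=1:", round(m1, 4), "median:", statistics.median(data))
print("p=2:", round(m2, 4), "mean:  ", statistics.mean(data))
print("p=4:", round(m4, 4), "(root of the cubic equation)")
```

For this sample the $p=1$ and $p=2$ minimizers agree with the median and the mean, while the $p=4$ minimizer is just the root of $\sum_i(x_i-\mu)^3=0$ and is pulled further toward the large value.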
The variance of independent random variables is additive:
\begin{eqnarray*}
\mathsf{Var}(X+Y)&=&\mathsf E\left[(x+y-\bar x-\bar y)^2\right]\\
&=&\mathsf E\left[(x-\bar x)^2\right]+\mathsf E\left[(y-\bar y)^2\right]+2\mathsf E\left[xy-\bar x y-x\bar y+\bar x\bar y\right]\\
&=&\mathsf E\left[(x-\bar x)^2\right]+\mathsf E\left[(y-\bar y)^2\right]+2(\bar x\bar y-\bar x\bar y-\bar x\bar y+\bar x\bar y)\\
&=&\mathsf E\left[(x-\bar x)^2\right]+\mathsf E\left[(y-\bar y)^2\right]\\
&=&\mathsf{Var}(X)+\mathsf{Var}(Y)\;.
\end{eqnarray*}
This, too, wouldn't work with higher even powers. This sort of additivity is at the heart of important theorems like the central limit theorem.
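To see concretely that additivity holds for the second power but fails for the fourth, here is a small exact check. It is a sketch under stated assumptions: the two distributions (a fair die and a fair coin) and the helper names `central_moment` and `sum_of_independent` are chosen purely for illustration.

```python
from fractions import Fraction
from itertools import product


def central_moment(dist, k):
    """k-th central moment of a finite distribution {value: probability}."""
    mean = sum(v * p for v, p in dist.items())
    return sum((v - mean) ** k * p for v, p in dist.items())


def sum_of_independent(dx, dy):
    """Distribution of X + Y when X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(dx.items(), dy.items()):
        out[x + y] = out.get(x + y, Fraction(0)) + px * py
    return out


die = {v: Fraction(1, 6) for v in range(1, 7)}  # fair six-sided die
coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}   # fair coin

total = sum_of_independent(die, coin)

# Second power: the gap is exactly zero, so variances add.
var_gap = central_moment(total, 2) - (
    central_moment(die, 2) + central_moment(coin, 2)
)

# Fourth power: a cross term 6 * Var(X) * Var(Y) is left over.
quartic_gap = central_moment(total, 4) - (
    central_moment(die, 4) + central_moment(coin, 4)
)

print("variance gap:", var_gap)
print("quartic gap: ", quartic_gap)
```

Expanding $(a+b)^4$ with $a=X-\bar x$ and $b=Y-\bar y$ shows where the leftover comes from: the terms with an odd power of $a$ or $b$ vanish in expectation, but $6\,\mathsf E[a^2]\,\mathsf E[b^2]$ does not.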
answered Sep 10 at 21:29
joriki
Those statements about $68\%$, $95\%,$ and $99.7\%$ apply to the normal distribution, but certainly do not apply to all distributions.
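As a quick illustration of that point, the sketch below compares the probability of landing within $k$ standard deviations of the mean for a normal distribution and for a uniform distribution. The uniform case is just one convenient counterexample, and the function names are made up here.

```python
import math


def normal_coverage(k):
    """P(|X - mu| <= k * sigma) for a normal distribution."""
    return math.erf(k / math.sqrt(2))


def uniform_coverage(k):
    """P(|X - mu| <= k * sigma) for a uniform distribution on an interval.

    For Uniform(a, b), sigma = (b - a) / sqrt(12), so the half-width
    (b - a) / 2 equals sqrt(3) * sigma and the coverage is k / sqrt(3),
    capped at 1.
    """
    return min(1.0, k / math.sqrt(3))


for k in (1, 2, 3):
    print(k, round(normal_coverage(k), 4), round(uniform_coverage(k), 4))
```

The normal column reproduces the familiar figures of roughly 68%, 95% and 99.7%, while the uniform distribution puts only about 58% of its mass within one standard deviation and all of it within two.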
Defining the variance with $n-1$ in the denominator, where $n$ is the sample size, is done only when using the sample variance to estimate the population variance, or otherwise drawing inferences about the population from a random sample. The population variance is $\operatorname{E}((X-\mu)^2)$ where $\mu=\operatorname{E}(X),$ and if the population consists of $n$ equally probable outcomes, then the standard deviation is given by a formula that looks like what you wrote except that it has $n$ where you have $n-1.$
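The two denominators can be put side by side on a small data set. This is only a sketch with made-up numbers; it uses the standard library's `statistics.variance` (denominator $n-1$) and `statistics.pvariance` (denominator $n$) as a cross-check of the hand computation.

```python
import statistics

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up values

n = len(sample)
mean = sum(sample) / n
sum_sq = sum((x - mean) ** 2 for x in sample)  # sum of squared deviations

population_var = sum_sq / n        # treats the data as the whole population
sample_var = sum_sq / (n - 1)      # estimates the variance of a larger population

print("n denominator:    ", population_var, statistics.pvariance(sample))
print("n - 1 denominator:", sample_var, statistics.variance(sample))
```

Which one is appropriate depends on whether the eight numbers are the entire population or a random sample from it.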
The reason the second power is used in measuring dispersion is that if $X_1,\ldots,X_n$ are independent, then
$$
\operatorname{var}(X_1+\cdots+X_n) = \operatorname{var}(X_1)+\cdots + \operatorname{var}(X_n).
$$
You need that whenever you apply the central limit theorem.
answered Sep 11 at 0:17
Michael Hardy
Standard deviation is one way to measure the spread of some data. You could certainly introduce another measure of spread that used 4th powers and took the fourth root. It would have different properties, and might not be useful.
For example, with data that is normally distributed, the property you cite about 68% and 95% would not hold with such a different measure of spread.
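To put a rough number on this, here is a minimal sketch for the population version of such a measure (ignoring the $n-1$ normalization). It relies on the standard fact that a normal distribution has fourth central moment $3\sigma^4$, so the fourth root of the fourth moment is $3^{1/4}\sigma$; the variable names are made up.

```python
import math

# For a normal distribution, E[(X - mu)^4] = 3 * sigma^4, so a spread
# defined as the fourth root of the fourth central moment is 3**0.25 sigmas.
quartic_spread_in_sigmas = 3 ** 0.25

# Probability of falling within one "unit of spread" of the mean.
coverage_sd = math.erf(1 / math.sqrt(2))
coverage_quartic = math.erf(quartic_spread_in_sigmas / math.sqrt(2))

print("within one standard deviation:", round(coverage_sd, 4))
print("within one quartic spread:    ", round(coverage_quartic, 4))
```

Under these assumptions about 81% of a normal population lies within one unit of the quartic measure, rather than 68%, so the familiar percentages would all need to be recomputed.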
There are genuine reasons to work with a measure of spread that involves squaring the residuals like standard deviation/error does. I don't know that I could be successful at explaining them in a short SE post. Maybe someone else will though.
answered Sep 10 at 21:04
alex.jordan