What is Expected Prediction Error (EPE) a function of?
Context
I am self-studying Elements of Statistical Learning (2nd ed), by Friedman, Hastie & Tibshirani. I have a question about what the EPE, as defined in this text, is a function of. On page 10 (equation 2.9), EPE is defined as:
$\text{EPE}(f) = \text{E}_{X, Y}(Y - f(X))^2$
This implies that EPE is a function of the learner, $f$, and that it is an expectation over the joint distribution of $X$ and $Y$.
I accept and understand this definition, since it distinguishes EPE from mean squared error (MSE), which is defined for some $x_0 \in \Omega_X$ as:
$\text{MSE}(x_0) = \text{E}_{\mathcal{T}}(y_0 - \hat{f}(x_0))^2$
where $\mathcal{T}$ represents the training set, and $\hat{f}$ represents the corresponding learner. Note that the above equation assumes the relationship between $X$ and $Y$ is deterministic -- $y_0$ is assumed to be a constant.
Question
Where my confusion arises is in the use of EPE on page 18 (equation 2.27). The context of its use is this: the relationship between $Y$ (the dependent variable) and $X$ (the independent variable) is assumed to be linear in $X$:
$Y = X^T\beta + \epsilon$, where $\epsilon \sim \mathcal{N}(0, \sigma^2)$ independently of $X$.
Then, for some arbitrary test point $x_0$, equation 2.27 is stated in the following manner:
$\text{EPE}(x_0) = \text{E}_{y_0}\text{E}_{\mathcal{T}}(y_0 - \hat{y}_0)^2$, where $\mathcal{T}$ is the training set.
My issue with the above is that I cannot see how the use of EPE in equation 2.27 is equivalent to that of equation 2.9; is there a way to show that they are? Or is the latter notation incorrect, and EPE is only a function of the learner?
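To make concrete what equation 2.27 averages over, here is a minimal Monte Carlo sketch in Python. It is not from the book: the Gaussian training inputs, the values of `beta`, `x0`, `sigma` and `n_train`, and the function name `simulate_epe_at_x0` are all illustrative assumptions. For a fixed test point it redraws the training set and the test response many times, and compares the average squared prediction error with $\sigma^2$ plus the average squared error against the noise-free target $x_0^T\beta$.

```python
import numpy as np


def simulate_epe_at_x0(x0, beta, sigma=1.0, n_train=50, n_rep=20000, seed=0):
    """Monte Carlo estimates of EPE(x0) and MSE(x0) for least squares.

    Assumed setup (illustrative only): training inputs are i.i.d. standard
    normal, and Y = X^T beta + eps with eps ~ N(0, sigma^2) independent of X.
    """
    rng = np.random.default_rng(seed)
    p = len(beta)
    f_x0 = x0 @ beta  # noise-free target at the test point
    sq_pred_err = np.empty(n_rep)  # (y0 - y0_hat)^2, random y0
    sq_est_err = np.empty(n_rep)   # (f(x0) - y0_hat)^2, fixed target
    var_term = np.empty(n_rep)     # sigma^2 * x0^T (X^T X)^{-1} x0

    for r in range(n_rep):
        # Draw a fresh training set T and fit ordinary least squares.
        X = rng.standard_normal((n_train, p))
        y = X @ beta + sigma * rng.standard_normal(n_train)
        xtx_inv = np.linalg.inv(X.T @ X)
        beta_hat = xtx_inv @ X.T @ y
        y0_hat = x0 @ beta_hat

        # Draw a fresh test response at x0, independent of T.
        y0 = f_x0 + sigma * rng.standard_normal()

        sq_pred_err[r] = (y0 - y0_hat) ** 2
        sq_est_err[r] = (f_x0 - y0_hat) ** 2
        var_term[r] = sigma**2 * x0 @ xtx_inv @ x0

    return sq_pred_err.mean(), sq_est_err.mean(), var_term.mean()


# Hypothetical values chosen only for illustration.
x0 = np.array([1.0, 0.5, -0.5])
beta = np.array([2.0, -1.0, 0.5])
sigma = 1.0

epe_x0, mse_x0, var_x0 = simulate_epe_at_x0(x0, beta, sigma=sigma)
print(f"EPE(x0) estimate:           {epe_x0:.4f}")
print(f"sigma^2 + MSE(x0) estimate: {sigma**2 + mse_x0:.4f}")
print(f"sigma^2 + variance term:    {sigma**2 + var_x0:.4f}")
```

Under these assumptions the three printed numbers should agree up to simulation noise, which suggests reading $\text{EPE}(x_0)$ as the pointwise MSE plus the irreducible noise variance rather than as the functional of equation 2.9.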
Related question
I have just taken a look at the "related questions" after posting this one, and have found the following question. It appears to express a similar sentiment to mine. The accepted answer, as I understand it, seemingly implies the usage of EPE in equation 2.27 is inconsistent. If this is the case, and the two usages cannot be reconciled, I suppose this question can be taken down as a duplicate (? New here.).
statistics terminology statistical-inference descriptive-statistics
1
From what I remember of this topic, it is correct to say that the two uses of $\text{EPE}$ are not the same. For the record, I've given up on trying to understand the notation in Elements of Statistical Learning. Inconsistencies galore.
– Clarinetist
Aug 20 at 14:00
edited 22 hours ago
asked Aug 20 at 13:25
Nurmister
267
1 Answer
accepted
The two usages of EPE are not the same; the latter is closer to an "enhanced MSE". Namely, in equation 2.27 the expectation is of the squared difference between the true and predicted values of the dependent variable, taken over the distribution of $y_0$ conditional on $X = x_0$ and also over the distribution of the training set $\mathcal{T}$. The "conditional on $X = x_0$" part distinguishes it from MSE as I have understood it to be defined. It is needed here because the relationship between $X$ and $Y$ is not deterministic.
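One way to relate the two quantities, sketched under the assumption (not stated in the question) that the test pair $(X, Y)$ is drawn independently of the training set $\mathcal{T}$: write $\hat{f}_{\mathcal{T}}$ for the learner fitted on $\mathcal{T}$, plug it into equation 2.9, average over training sets, and exchange the order of expectation:

$\text{E}_{\mathcal{T}}\,\text{EPE}(\hat{f}_{\mathcal{T}}) = \text{E}_{\mathcal{T}}\text{E}_{X, Y}(Y - \hat{f}_{\mathcal{T}}(X))^2 = \text{E}_{X}\left[\text{E}_{Y \mid X}\text{E}_{\mathcal{T}}(Y - \hat{f}_{\mathcal{T}}(X))^2\right] = \text{E}_{x_0}\,\text{EPE}(x_0)$

So the pointwise quantity of equation 2.27, averaged over test points, recovers the training-set average of the quantity in equation 2.9, even though neither is literally the other. In the linear model of the question, with $f(x_0) = x_0^T\beta$ and $y_0 = f(x_0) + \epsilon_0$, the cross term vanishes because $\epsilon_0$ has mean zero and is independent of $\mathcal{T}$, which gives

$\text{EPE}(x_0) = \text{E}_{\epsilon_0}\text{E}_{\mathcal{T}}(\epsilon_0 + f(x_0) - \hat{y}_0)^2 = \sigma^2 + \text{E}_{\mathcal{T}}(f(x_0) - \hat{y}_0)^2 = \sigma^2 + \text{MSE}(x_0)$

where $\text{MSE}(x_0)$ is taken against the noise-free target $f(x_0)$.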
answered 22 hours ago
Nurmister
267