Bandwidth choice for the kernel density estimator: why is MISE usually minimized instead of ISE?

Before presenting my question (which I have already formulated in the title of this post), it is important to establish the context of my problem:



Let $\xi$ be a random variable with unknown density function $f$. Given a sample $\widehat{\xi}_1,\ldots,\widehat{\xi}_N$ of $\xi$, the kernel density estimator (KDE) of $f$ is
$$\widehat{f}_h(x):=\frac{1}{Nh}\sum_{i=1}^{N}\mathcal{K}\left(\frac{x-\widehat{\xi}_i}{h}\right),$$
where $h> 0$ and $\mathcal{K}$ is a probability density function such that $\int x \mathcal{K}(x)\,dx=0$ and $\int x^2 \mathcal{K}(x)\,dx=1$.
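To make the definition concrete, here is a minimal sketch of the estimator in Python with NumPy. The Gaussian kernel is an assumption chosen for illustration (it is a density with first moment 0 and second moment 1, as required above); the names `gaussian_kernel` and `kde`, the simulated standard normal sample, and the bandwidth `h=0.4` are likewise illustrative and not part of the original setup.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density: integrates to 1, first moment 0, second moment 1."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kde(x, sample, h):
    """Evaluate f_hat_h at the points x: (1 / (N h)) * sum_i K((x - xi_i) / h)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    sample = np.asarray(sample, dtype=float)
    # Rows index evaluation points, columns index sample points.
    u = (x[:, None] - sample[None, :]) / h
    return gaussian_kernel(u).sum(axis=1) / (sample.size * h)

# Illustrative data: a simulated sample of size N = 200 (assumption).
rng = np.random.default_rng(0)
sample = rng.normal(size=200)

grid = np.linspace(-8.0, 8.0, 2001)
f_hat = kde(grid, sample, h=0.4)

# Riemann-sum check that the estimate is itself (numerically) a density.
dx = grid[1] - grid[0]
total_mass = f_hat.sum() * dx
```

Because $\mathcal{K}$ is a density, $\widehat{f}_h$ is nonnegative and integrates to 1 for every $h>0$; `total_mass` checks the latter numerically on the grid.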



In the literature on this topic it is customary to say that $h$ must be chosen so as to minimize the MISE, where



\begin{equation}
\mathrm{MISE}(h):=\mathbb{E}_{\mathbb{P}^N}\left[ \int \left( \widehat{f}_h(x)- f(x)\right)^2dx \right].
\end{equation}
Here the randomness is in the vector $\left(\widehat{\xi}_1,\ldots,\widehat{\xi}_N\right)$, which has distribution $\mathbb{P}^N=\mathbb{P}\times\cdots \times\mathbb{P}$, where
$\mathbb{P}(A):=\int_A f(x)\,dx.$
Therefore, $\mathbb{P}$ is unknown. There are many techniques to estimate the MISE.
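As an illustration of what the expectation over $\mathbb{P}^N$ means, the following sketch approximates $\mathrm{MISE}(h)$ by Monte Carlo: it draws many independent samples of size $N$, computes the integrated squared error for each, and averages. This is only possible in a simulation where $f$ is known; here $f$ is assumed to be the standard normal density and the kernel is assumed Gaussian. The function names, the grid, and the values of `n`, `n_rep` and `bandwidths` are hypothetical choices for the example.

```python
import numpy as np

def kde(grid, sample, h):
    """Gaussian-kernel KDE evaluated on grid (kernel choice is an assumption)."""
    u = (grid[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (sample.size * h * np.sqrt(2.0 * np.pi))

def ise(sample, h, grid, f_true):
    """Integrated squared error for one sample, via a Riemann sum on grid."""
    dx = grid[1] - grid[0]
    return np.sum((kde(grid, sample, h) - f_true) ** 2) * dx

def mise_monte_carlo(h, n, n_rep, grid, f_true, rng):
    """Average the ISE over n_rep independent samples of size n."""
    return np.mean([ise(rng.normal(size=n), h, grid, f_true) for _ in range(n_rep)])

rng = np.random.default_rng(1)
grid = np.linspace(-6.0, 6.0, 601)
f_true = np.exp(-0.5 * grid**2) / np.sqrt(2.0 * np.pi)  # known only in simulation

bandwidths = np.array([0.05, 0.4, 2.0])
mise_values = [mise_monte_carlo(h, 100, 50, grid, f_true, rng) for h in bandwidths]

for h, m in zip(bandwidths, mise_values):
    print(f"h = {h:4.2f}   Monte Carlo MISE = {m:.4f}")
```

In this setting a very small bandwidth inflates the variance term and a very large one inflates the bias term, so the intermediate bandwidth should give the smallest of the three values.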



The question: If $h_{\mathrm{MISE}}$ minimizes $\mathrm{MISE}$, then that same $h_{\mathrm{MISE}}$ is used to determine $\widehat{f}_{h_{\mathrm{MISE}}}$ for every sample of size $N$. Would it not be better to find an $h$ for each sample?



Given a sample $\widehat{\xi}_1,\ldots,\widehat{\xi}_N$, consider the expression
$$\mathrm{ISE}(h):= \int \left( \widehat{f}_h(x)- f(x)\right)^2dx. $$
If $h_{\mathrm{ISE}}$ minimizes $\mathrm{ISE}$, is $\widehat{f}_{h_{\mathrm{ISE}}}$ not a better estimator than $\widehat{f}_{h_{\mathrm{MISE}}}$?
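The comparison can be explored numerically. The sketch below, under the same illustrative assumptions as the definitions above (Gaussian kernel, $f$ standard normal, a finite grid of candidate bandwidths), computes the ISE curve for several simulated samples, takes the per-sample minimizer as $h_{\mathrm{ISE}}$, and takes the minimizer of the averaged curve as a Monte Carlo stand-in for $h_{\mathrm{MISE}}$. All names and numerical settings are hypothetical.

```python
import numpy as np

def kde(grid, sample, h):
    """Gaussian-kernel KDE evaluated on grid (kernel choice is an assumption)."""
    u = (grid[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (sample.size * h * np.sqrt(2.0 * np.pi))

def ise_curve(sample, bandwidths, grid, f_true):
    """ISE(h) for one sample at every candidate bandwidth."""
    dx = grid[1] - grid[0]
    return np.array([np.sum((kde(grid, sample, h) - f_true) ** 2) * dx
                     for h in bandwidths])

rng = np.random.default_rng(2)
n, n_rep = 100, 40
grid = np.linspace(-6.0, 6.0, 601)
f_true = np.exp(-0.5 * grid**2) / np.sqrt(2.0 * np.pi)  # needed to evaluate ISE
bandwidths = np.linspace(0.1, 1.0, 19)

# One ISE curve per simulated sample: shape (n_rep, number of bandwidths).
curves = np.array([ise_curve(rng.normal(size=n), bandwidths, grid, f_true)
                   for _ in range(n_rep)])

h_ise = bandwidths[curves.argmin(axis=1)]   # one minimizer per sample
mise_curve = curves.mean(axis=0)            # Monte Carlo approximation of MISE(h)
h_mise = bandwidths[mise_curve.argmin()]    # a single bandwidth for all samples

ise_at_h_ise = curves.min(axis=1)
ise_at_h_mise = curves[:, mise_curve.argmin()]

print("h_MISE (Monte Carlo):", h_mise)
print("h_ISE range across samples:", h_ise.min(), "to", h_ise.max())
print("mean ISE at h_ISE :", ise_at_h_ise.mean())
print("mean ISE at h_MISE:", ise_at_h_mise.mean())
```

By construction, $\mathrm{ISE}(h_{\mathrm{ISE}})\le \mathrm{ISE}(h_{\mathrm{MISE}})$ for every sample, which is the sense in which $h_{\mathrm{ISE}}$ looks better. The sketch also makes visible that evaluating the ISE curve requires `f_true`, which is available here only because the data are simulated, and that $h_{\mathrm{ISE}}$ changes from sample to sample.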



Why is it usual to minimize the MISE instead of the ISE?










  • Check this "sciencedirect.com/science/article/pii/016771529190163L" it will help you a lot.
    – Angel
    Sep 1 at 6:12














statistics probability-distributions density-function parameter-estimation






asked Apr 15 at 19:45









Diego Fonseca















