Estimate how many trucks the city has (almost done, but cannot find whether the estimator is biased or not)

I am trying to estimate the number of trucks in the city by observing their serial numbers. Assume that the serial numbers are drawn from a uniform probability distribution ranging from 0 to an unknown parameter $\theta$, which I take as the number of trucks in the city. I use the maximum likelihood method to estimate the unknown parameter.

I observe $L$ trucks and write down their serial numbers $(s_1, s_2, \ldots, s_L)$. Assuming the observations are independent, the maximum likelihood estimate of $\theta$ is $\hat\theta = \max\{s_1, s_2, \ldots, s_L\}$.

To find out whether the estimator is biased, I take the expectation of $\hat\theta$:
$$E(\hat\theta)=\sum_{i=0}^{\theta} i\,\frac{(i+1)^L-i^L}{(\theta+1)^L},$$
where $\frac{(i+1)^L-i^L}{(\theta+1)^L}$ is the probability that the estimator $\hat\theta$ equals $i$.

The problem is that I cannot find a closed-form formula for the expectation. Can anyone help?
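
Even without a closed form, the setup above can be explored numerically. The following is a minimal simulation sketch, assuming the serial numbers are independent draws, uniform on the integers $0, 1, \ldots, \theta$ (sampling with replacement). The function name `simulate_mean_max` and the values of `theta`, `L`, `trials` and `seed` are illustrative choices, not part of the question.

```python
import random


def simulate_mean_max(theta, L, trials=20000, seed=0):
    """Average of max(s_1, ..., s_L) over many simulated samples.

    Assumption: each serial number is an independent draw, uniform on
    the integers 0, 1, ..., theta (sampling with replacement).
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    total = 0
    for _ in range(trials):
        # randint includes both endpoints, matching the range 0..theta
        total += max(rng.randint(0, theta) for _ in range(L))
    return total / trials


if __name__ == "__main__":
    theta, L = 100, 5
    print("true theta:", theta)
    print("average of the sample maximum:", simulate_mean_max(theta, L))
```

If the simulated average sits consistently below `theta`, that suggests the sample maximum underestimates $\theta$ on average.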







asked Aug 16 at 3:39









Chuan Huang












  • I seem to recall that your estimate should be $\frac{L+1}{L}$ times the highest serial number observed.
    – Ross Millikan
    Aug 16 at 3:50

  • @RossMillikan I don't understand. Could you show me why that is?
    – Chuan Huang
    Aug 16 at 6:35

  • It is because if you pick $L$ items independently from a uniform distribution, the expected value of the largest is $\frac{L}{L+1}$ of the upper limit. I don't have a demonstration of that, but it is a standard result.
    – Ross Millikan
    Aug 16 at 14:29

  • You might be interested in the "German tank problem": en.wikipedia.org/wiki/German_tank_problem
    – awkward
    Aug 16 at 18:55
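
The standard result quoted in the comments can be checked directly under a simplifying assumption: treat the serial numbers as continuous and uniform on $[0, \theta]$ (the discrete case differs slightly). With $\hat\theta = \max\{s_1, \ldots, s_L\}$ and $f_{\hat\theta}$ denoting its density,
$$\begin{align*}
\Pr[\hat\theta \le t] &= \left(\frac{t}{\theta}\right)^L, \quad 0 \le t \le \theta, \\
f_{\hat\theta}(t) &= \frac{L\,t^{L-1}}{\theta^L}, \\
\operatorname{E}[\hat\theta] &= \int_0^\theta t\,\frac{L\,t^{L-1}}{\theta^L}\,dt = \frac{L}{L+1}\,\theta.
\end{align*}$$
Under that assumption $\frac{L+1}{L}\hat\theta$ has expectation $\theta$, which is where the factor $\frac{L+1}{L}$ comes from.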
















1 Answer



accepted










For independent and identically distributed observations $\boldsymbol s = (s_1, \ldots, s_n)$ from a discrete uniform distribution $S$ on $\{1, 2, \ldots, \theta\}$, the estimator $$\hat\theta = \max_i s_i$$ has the probability mass function
$$\begin{align*} \Pr[\hat\theta = x]
&= \Pr[\hat\theta \le x] - \Pr[\hat\theta \le x-1] \\
&= \prod_{i=1}^n \Pr[S_i \le x] - \prod_{i=1}^n \Pr[S_i \le x-1] \\
&= \left(\frac{x}{\theta}\right)^n - \left(\frac{x-1}{\theta}\right)^n, \quad x \in \{1, 2, \ldots, \theta\}, \quad n \in \mathbb Z^+.
\end{align*}$$

The expectation is then
$$\begin{align*} \operatorname{E}[\hat\theta]
&= \sum_{x=1}^\theta x \Pr[\hat\theta = x] \\
&= \theta^{-n} \sum_{x=1}^\theta x\left(x^n - (x-1)^n\right) \\
&= \theta^{-n} \left( \sum_{x=1}^\theta x^{n+1} - \sum_{x=0}^{\theta-1} (x+1)x^n \right) \\
&= \theta^{-n} \left( \sum_{x=1}^\theta x^{n+1} - \sum_{x=0}^{\theta-1} \left(x^{n+1} + x^n\right) \right) \\
&= \theta^{-n} \left( \theta^{n+1} - 0^{n+1} - \sum_{x=1}^{\theta-1} x^n \right) \\
&= \theta - \frac{H_{\theta-1,-n}}{\theta^n},
\end{align*}$$
where $H_{m,n} = \sum_{x=1}^m \frac{1}{x^n}$ is a generalized harmonic number. It is trivial to see that $\operatorname{E}[\hat\theta] < \theta$ for $\theta \ge 2$, since $$\frac{H_{\theta-1,-n}}{\theta^n} = \sum_{x=1}^{\theta-1} \left(\frac{x}{\theta}\right)^n$$ is clearly a finite sum of positive rationals.

It is worth noting that the above calculation is completely unnecessary to establish that $\hat\theta$ is necessarily biased, since $$\Pr[\hat\theta > \theta] = 0,$$ yet, for $\theta \ge 2$, $$\Pr[\hat\theta < \theta] > 0.$$
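
The closed form above can be sanity-checked with exact rational arithmetic. The sketch below sums the probability mass function directly and compares the result with $\theta - \sum_{x=1}^{\theta-1} (x/\theta)^n$. The function names and the test values of `theta` and `n` are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def pmf_max(x, theta, n):
    """Pr[max of n draws = x] for draws uniform on {1, ..., theta}."""
    return Fraction(x, theta) ** n - Fraction(x - 1, theta) ** n


def expected_max(theta, n):
    """E[max], summed directly from the probability mass function."""
    return sum(x * pmf_max(x, theta, n) for x in range(1, theta + 1))


def closed_form(theta, n):
    """theta minus the sum of (x/theta)^n for x = 1, ..., theta - 1."""
    return theta - sum(Fraction(x, theta) ** n for x in range(1, theta))


if __name__ == "__main__":
    for theta, n in [(10, 1), (10, 3), (100, 5)]:
        direct = expected_max(theta, n)
        # Both routes use exact fractions, so they should agree exactly.
        assert direct == closed_form(theta, n)
        print(theta, n, float(direct), "bias:", float(direct - theta))
```

Using `Fraction` avoids floating-point rounding, so the equality check is exact rather than approximate. The printed bias is negative whenever `theta` is at least 2.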














        edited Aug 16 at 5:17

























        answered Aug 16 at 5:08









        heropup























             
