Chernoff-like Concentration Bounds on Permutations

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP











up vote
5
down vote

favorite












Suppose I have $n$ balls. Among them, there are $m leq n$ black balls and the other $n - m$ balls are white. Fix a random permutation $pi$ over these balls and denote by $Y_i$ the number of black balls in the first $i$ positions of our permutation $pi$. I would like to show that $Y_i$ is sharply concentrated around its mean.



The expected value



It is not hard to compute the expected value of $Y_i$: the probability that a black ball appears somewhere along the first $i$ positions is $i/n$; moreover, there are $m$ black balls, hence $$mathbbE[Y_i] = fracmcdot in.$$



But can we show that $Y_i$ is actually very close to its mean?



A failed attempt



For every black ball $b$, one can define an indicator variable $X_b$ which equals 1 if $b$ appears in the first $i$ positions of $pi$ and 0 otherwise. Denoting the set of all black balls by $B$, we have



$$ Y_i = sum_b in B X_b.$$



I was hoping to be able to use a Chernoff bound type of a concentration bound, but unfortunately for two black balls $b$ and $b'$, $X_b$ and $X_b'$ are not completely independent. If $X_b = 1$, then there are only $i-1$ spots left for $b'$ to appear in the first $i$ locations and hence the probability of $X_b'$ being 1 gets smaller.







share|cite|improve this question
























    up vote
    5
    down vote

    favorite












    Suppose I have $n$ balls. Among them, there are $m leq n$ black balls and the other $n - m$ balls are white. Fix a random permutation $pi$ over these balls and denote by $Y_i$ the number of black balls in the first $i$ positions of our permutation $pi$. I would like to show that $Y_i$ is sharply concentrated around its mean.



    The expected value



    It is not hard to compute the expected value of $Y_i$: the probability that a black ball appears somewhere along the first $i$ positions is $i/n$; moreover, there are $m$ black balls, hence $$mathbbE[Y_i] = fracmcdot in.$$



    But can we show that $Y_i$ is actually very close to its mean?



    A failed attempt



    For every black ball $b$, one can define an indicator variable $X_b$ which equals 1 if $b$ appears in the first $i$ positions of $pi$ and 0 otherwise. Denoting the set of all black balls by $B$, we have



    $$ Y_i = sum_b in B X_b.$$



    I was hoping to be able to use a Chernoff bound type of a concentration bound, but unfortunately for two black balls $b$ and $b'$, $X_b$ and $X_b'$ are not completely independent. If $X_b = 1$, then there are only $i-1$ spots left for $b'$ to appear in the first $i$ locations and hence the probability of $X_b'$ being 1 gets smaller.







    share|cite|improve this question






















      up vote
      5
      down vote

      favorite









      up vote
      5
      down vote

      favorite











      Suppose I have $n$ balls. Among them, there are $m leq n$ black balls and the other $n - m$ balls are white. Fix a random permutation $pi$ over these balls and denote by $Y_i$ the number of black balls in the first $i$ positions of our permutation $pi$. I would like to show that $Y_i$ is sharply concentrated around its mean.



      The expected value



      It is not hard to compute the expected value of $Y_i$: the probability that a black ball appears somewhere along the first $i$ positions is $i/n$; moreover, there are $m$ black balls, hence $$mathbbE[Y_i] = fracmcdot in.$$



      But can we show that $Y_i$ is actually very close to its mean?



      A failed attempt



      For every black ball $b$, one can define an indicator variable $X_b$ which equals 1 if $b$ appears in the first $i$ positions of $pi$ and 0 otherwise. Denoting the set of all black balls by $B$, we have



      $$ Y_i = sum_b in B X_b.$$



      I was hoping to be able to use a Chernoff bound type of a concentration bound, but unfortunately for two black balls $b$ and $b'$, $X_b$ and $X_b'$ are not completely independent. If $X_b = 1$, then there are only $i-1$ spots left for $b'$ to appear in the first $i$ locations and hence the probability of $X_b'$ being 1 gets smaller.







      share|cite|improve this question












      Suppose I have $n$ balls. Among them, there are $m leq n$ black balls and the other $n - m$ balls are white. Fix a random permutation $pi$ over these balls and denote by $Y_i$ the number of black balls in the first $i$ positions of our permutation $pi$. I would like to show that $Y_i$ is sharply concentrated around its mean.



      The expected value



      It is not hard to compute the expected value of $Y_i$: the probability that a black ball appears somewhere along the first $i$ positions is $i/n$; moreover, there are $m$ black balls, hence $$mathbbE[Y_i] = fracmcdot in.$$



      But can we show that $Y_i$ is actually very close to its mean?



      A failed attempt



      For every black ball $b$, one can define an indicator variable $X_b$ which equals 1 if $b$ appears in the first $i$ positions of $pi$ and 0 otherwise. Denoting the set of all black balls by $B$, we have



      $$ Y_i = sum_b in B X_b.$$



      I was hoping to be able to use a Chernoff bound type of a concentration bound, but unfortunately for two black balls $b$ and $b'$, $X_b$ and $X_b'$ are not completely independent. If $X_b = 1$, then there are only $i-1$ spots left for $b'$ to appear in the first $i$ locations and hence the probability of $X_b'$ being 1 gets smaller.









      share|cite|improve this question











      share|cite|improve this question




      share|cite|improve this question










      asked Aug 12 at 0:35









      Jeff Cooper

      484




      484




















          2 Answers
          2






          active

          oldest

          votes

















          up vote
          4
          down vote



          accepted










          Chernoff's bound applies to negatively correlated random variables, such as your hypergeometric distribution. You can find a full treatment in Dubhashi and Panconesi's very useful monograph Concentration of measure for the analysis of algorithms, as well as in many lecture notes, such as this one by Hariharan Ramesh.






          share|cite|improve this answer




















          • The lecture note of Ramesh that is provided seems to be giving a very strong "dimension free" concentration bound which is exactly the same as Chernoff's bound. I haven't been able to find such bounds for hypergeometric distributions in any formal references. The bounds mentioned in the book of Dubhashi and Paconesi (posted below in another answer) are way weaker. @Yuval do you know any other resources giving similar bounds? I am actually afraid that it is wrong.
            – Jeff Cooper
            Aug 12 at 23:31











          • See e.g. ahlenotes.wordpress.com/2015/12/08/hypergeometric_tail which also mentions the issue mentioned in the comment above.
            – Jeff Cooper
            Aug 12 at 23:43











          • If no proof is given and the bound seems excessive, it is better not to use it.
            – Yuval Filmus
            Aug 13 at 3:39










          • Ok, I actually found another note that nicely describes the proof: cs.cmu.edu/~dwajc/notes/Negative%20Association.pdf
            – Jeff Cooper
            Aug 13 at 18:26

















          up vote
          2
          down vote













          Following Yuval's answer, I found the following inequality in Dubhashi and Paconesi's book:



          $$ Pr Big[big|Y_i - mathbbE[Y_i]big| geq t Big] leq expBig(frac-(n-1)t^22(n-i)(i-1) Big)$$






          share|cite|improve this answer




















            Your Answer




            StackExchange.ifUsing("editor", function ()
            return StackExchange.using("mathjaxEditing", function ()
            StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
            StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
            );
            );
            , "mathjax-editing");

            StackExchange.ready(function()
            var channelOptions =
            tags: "".split(" "),
            id: "419"
            ;
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function()
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled)
            StackExchange.using("snippets", function()
            createEditor();
            );

            else
            createEditor();

            );

            function createEditor()
            StackExchange.prepareEditor(
            heartbeatType: 'answer',
            convertImagesToLinks: false,
            noModals: false,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: null,
            bindNavPrevention: true,
            postfix: "",
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            );



            );








             

            draft saved


            draft discarded


















            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcs.stackexchange.com%2fquestions%2f96172%2fchernoff-like-concentration-bounds-on-permutations%23new-answer', 'question_page');

            );

            Post as a guest






























            2 Answers
            2






            active

            oldest

            votes








            2 Answers
            2






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes








            up vote
            4
            down vote



            accepted










            Chernoff's bound applies to negatively correlated random variables, such as your hypergeometric distribution. You can find a full treatment in Dubhashi and Panconesi's very useful monograph Concentration of measure for the analysis of algorithms, as well as in many lecture notes, such as this one by Hariharan Ramesh.






            share|cite|improve this answer




















            • The lecture note of Ramesh that is provided seems to be giving a very strong "dimension free" concentration bound which is exactly the same as Chernoff's bound. I haven't been able to find such bounds for hypergeometric distributions in any formal references. The bounds mentioned in the book of Dubhashi and Paconesi (posted below in another answer) are way weaker. @Yuval do you know any other resources giving similar bounds? I am actually afraid that it is wrong.
              – Jeff Cooper
              Aug 12 at 23:31











            • See e.g. ahlenotes.wordpress.com/2015/12/08/hypergeometric_tail which also mentions the issue mentioned in the comment above.
              – Jeff Cooper
              Aug 12 at 23:43











            • If no proof is given and the bound seems excessive, it is better not to use it.
              – Yuval Filmus
              Aug 13 at 3:39










            • Ok, I actually found another note that nicely describes the proof: cs.cmu.edu/~dwajc/notes/Negative%20Association.pdf
              – Jeff Cooper
              Aug 13 at 18:26














            up vote
            4
            down vote



            accepted










            Chernoff's bound applies to negatively correlated random variables, such as your hypergeometric distribution. You can find a full treatment in Dubhashi and Panconesi's very useful monograph Concentration of measure for the analysis of algorithms, as well as in many lecture notes, such as this one by Hariharan Ramesh.






            share|cite|improve this answer




















            • The lecture note of Ramesh that is provided seems to be giving a very strong "dimension free" concentration bound which is exactly the same as Chernoff's bound. I haven't been able to find such bounds for hypergeometric distributions in any formal references. The bounds mentioned in the book of Dubhashi and Paconesi (posted below in another answer) are way weaker. @Yuval do you know any other resources giving similar bounds? I am actually afraid that it is wrong.
              – Jeff Cooper
              Aug 12 at 23:31











            • See e.g. ahlenotes.wordpress.com/2015/12/08/hypergeometric_tail which also mentions the issue mentioned in the comment above.
              – Jeff Cooper
              Aug 12 at 23:43











            • If no proof is given and the bound seems excessive, it is better not to use it.
              – Yuval Filmus
              Aug 13 at 3:39










            • Ok, I actually found another note that nicely describes the proof: cs.cmu.edu/~dwajc/notes/Negative%20Association.pdf
              – Jeff Cooper
              Aug 13 at 18:26












            up vote
            4
            down vote



            accepted







            up vote
            4
            down vote



            accepted






            Chernoff's bound applies to negatively correlated random variables, such as your hypergeometric distribution. You can find a full treatment in Dubhashi and Panconesi's very useful monograph Concentration of measure for the analysis of algorithms, as well as in many lecture notes, such as this one by Hariharan Ramesh.






            share|cite|improve this answer












            Chernoff's bound applies to negatively correlated random variables, such as your hypergeometric distribution. You can find a full treatment in Dubhashi and Panconesi's very useful monograph Concentration of measure for the analysis of algorithms, as well as in many lecture notes, such as this one by Hariharan Ramesh.







            share|cite|improve this answer












            share|cite|improve this answer



            share|cite|improve this answer










            answered Aug 12 at 1:31









            Yuval Filmus

            180k12170329




            180k12170329











            • The lecture note of Ramesh that is provided seems to be giving a very strong "dimension free" concentration bound which is exactly the same as Chernoff's bound. I haven't been able to find such bounds for hypergeometric distributions in any formal references. The bounds mentioned in the book of Dubhashi and Paconesi (posted below in another answer) are way weaker. @Yuval do you know any other resources giving similar bounds? I am actually afraid that it is wrong.
              – Jeff Cooper
              Aug 12 at 23:31











            • See e.g. ahlenotes.wordpress.com/2015/12/08/hypergeometric_tail which also mentions the issue mentioned in the comment above.
              – Jeff Cooper
              Aug 12 at 23:43











            • If no proof is given and the bound seems excessive, it is better not to use it.
              – Yuval Filmus
              Aug 13 at 3:39










            • Ok, I actually found another note that nicely describes the proof: cs.cmu.edu/~dwajc/notes/Negative%20Association.pdf
              – Jeff Cooper
              Aug 13 at 18:26
















            • The lecture note of Ramesh that is provided seems to be giving a very strong "dimension free" concentration bound which is exactly the same as Chernoff's bound. I haven't been able to find such bounds for hypergeometric distributions in any formal references. The bounds mentioned in the book of Dubhashi and Paconesi (posted below in another answer) are way weaker. @Yuval do you know any other resources giving similar bounds? I am actually afraid that it is wrong.
              – Jeff Cooper
              Aug 12 at 23:31











            • See e.g. ahlenotes.wordpress.com/2015/12/08/hypergeometric_tail which also mentions the issue mentioned in the comment above.
              – Jeff Cooper
              Aug 12 at 23:43











            • If no proof is given and the bound seems excessive, it is better not to use it.
              – Yuval Filmus
              Aug 13 at 3:39










            • Ok, I actually found another note that nicely describes the proof: cs.cmu.edu/~dwajc/notes/Negative%20Association.pdf
              – Jeff Cooper
              Aug 13 at 18:26















            The lecture note of Ramesh that is provided seems to be giving a very strong "dimension free" concentration bound which is exactly the same as Chernoff's bound. I haven't been able to find such bounds for hypergeometric distributions in any formal references. The bounds mentioned in the book of Dubhashi and Paconesi (posted below in another answer) are way weaker. @Yuval do you know any other resources giving similar bounds? I am actually afraid that it is wrong.
            – Jeff Cooper
            Aug 12 at 23:31





            The lecture note of Ramesh that is provided seems to be giving a very strong "dimension free" concentration bound which is exactly the same as Chernoff's bound. I haven't been able to find such bounds for hypergeometric distributions in any formal references. The bounds mentioned in the book of Dubhashi and Paconesi (posted below in another answer) are way weaker. @Yuval do you know any other resources giving similar bounds? I am actually afraid that it is wrong.
            – Jeff Cooper
            Aug 12 at 23:31













            See e.g. ahlenotes.wordpress.com/2015/12/08/hypergeometric_tail which also mentions the issue mentioned in the comment above.
            – Jeff Cooper
            Aug 12 at 23:43





            See e.g. ahlenotes.wordpress.com/2015/12/08/hypergeometric_tail which also mentions the issue mentioned in the comment above.
            – Jeff Cooper
            Aug 12 at 23:43













            If no proof is given and the bound seems excessive, it is better not to use it.
            – Yuval Filmus
            Aug 13 at 3:39




            If no proof is given and the bound seems excessive, it is better not to use it.
            – Yuval Filmus
            Aug 13 at 3:39












            Ok, I actually found another note that nicely describes the proof: cs.cmu.edu/~dwajc/notes/Negative%20Association.pdf
            – Jeff Cooper
            Aug 13 at 18:26




            Ok, I actually found another note that nicely describes the proof: cs.cmu.edu/~dwajc/notes/Negative%20Association.pdf
            – Jeff Cooper
            Aug 13 at 18:26










            up vote
            2
            down vote













            Following Yuval's answer, I found the following inequality in Dubhashi and Paconesi's book:



            $$ Pr Big[big|Y_i - mathbbE[Y_i]big| geq t Big] leq expBig(frac-(n-1)t^22(n-i)(i-1) Big)$$






            share|cite|improve this answer
























              up vote
              2
              down vote













              Following Yuval's answer, I found the following inequality in Dubhashi and Paconesi's book:



              $$ Pr Big[big|Y_i - mathbbE[Y_i]big| geq t Big] leq expBig(frac-(n-1)t^22(n-i)(i-1) Big)$$






              share|cite|improve this answer






















                up vote
                2
                down vote










                up vote
                2
                down vote









                Following Yuval's answer, I found the following inequality in Dubhashi and Paconesi's book:



                $$ Pr Big[big|Y_i - mathbbE[Y_i]big| geq t Big] leq expBig(frac-(n-1)t^22(n-i)(i-1) Big)$$






                share|cite|improve this answer












                Following Yuval's answer, I found the following inequality in Dubhashi and Paconesi's book:



                $$ Pr Big[big|Y_i - mathbbE[Y_i]big| geq t Big] leq expBig(frac-(n-1)t^22(n-i)(i-1) Big)$$







                share|cite|improve this answer












                share|cite|improve this answer



                share|cite|improve this answer










                answered Aug 12 at 21:04









                Jeff Cooper

                484




                484






















                     

                    draft saved


                    draft discarded


























                     


                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function ()
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcs.stackexchange.com%2fquestions%2f96172%2fchernoff-like-concentration-bounds-on-permutations%23new-answer', 'question_page');

                    );

                    Post as a guest













































































                    這個網誌中的熱門文章

                    How to combine Bézier curves to a surface?

                    Mutual Information Always Non-negative

                    Why am i infinitely getting the same tweet with the Twitter Search API?