Variation of coupon collector problem with distinct coupons in each type

up vote
2
down vote

favorite

I am interested in yet another variation of the coupon collector problem. Specifically, we have an urn with $n_i$ "distinct" coupons of type $i$, where $i = 1,ldots, K$ (for example, a type can be thought of as a color but within that color the coupons are numbered). The goal is to sample sufficiently with replacement to get a collection having at least $m_i$ distinct coupons from type $i$ for $i = 1,ldots, K$ (any $m_i$ out of the $n_i$ different coupons within that type would work). Note that this is different from the variation in:

N. Shank and H. Yang, "Coupon Collector Problem for Non-Uniform Coupons and Random Quotas," Electronic J. of Combinatorics 20(2) 2013

since the coupons here are numbered and the analysis needs to ensure the $m_i$ coupons from type $i$ are distinct. So the question is:

For a given number of samples $n$, what is the probability that we achieve the requirement above (i.e., obtain $m_i$ distinct coupons from the types $i = 1,ldots,K$). Upper and lower bounds would also be useful. Also, what is the expected value of the number of samples $E[N]$ to achieve this requirement?

asked Aug 12 at 2:43

Abas

8810

Two answers were provided by @joriki and esg at math.stackexchange.com/questions/2825115/â€¦
â€“Â Abas
Aug 12 at 2:56

please note the correction of the typo (missing $s^k$, observed by joriki) in the answer below
â€“Â esg
Aug 12 at 14:27

I corrected another typo. Let me know when anything needs more explanation.
â€“Â esg
Aug 13 at 16:17

Thank you @esg. It's a bit involved so I am trying to read the [Shank and Yang] paper carefully then will let you know if I am having difficulties. I also wonder if this can be easily extended to the case where the probabilities of the different coupons within each type are different? I guess in this case, one will have to sum over the probabilities of the sets corresponding to the surjection mappings (I dunno if this is tractable).
â€“Â Abas
Aug 13 at 16:28

1

(1) $[t^k] F(t)$ denotes the coefficient of $t^k$ in the power series $F(t)$ (with obvious generalization to several variables), the operator $[t^k]$ is frequently called "(extraction of) the $k$-th coefficient (operator)" (2) For example, $[t^k] (e^t-1)^r$ is the number that you get when you expand $(e^t-1)^r$ as a power series of the variable $t$ and take the $k$-th coefficient - it's just a number and does not depend on $t$ in any way. (3) I've answered the last question in the text
â€“Â esg
Aug 14 at 17:27

Â |Â
show 3 more comments

up vote
2
down vote

favorite

N. Shank and H. Yang, "Coupon Collector Problem for Non-Uniform Coupons and Random Quotas," Electronic J. of Combinatorics 20(2) 2013

since the coupons here are numbered and the analysis needs to ensure the $m_i$ coupons from type $i$ are distinct. So the question is:

asked Aug 12 at 2:43

Abas

8810

Two answers were provided by @joriki and esg at math.stackexchange.com/questions/2825115/â€¦
â€“Â Abas
Aug 12 at 2:56

please note the correction of the typo (missing $s^k$, observed by joriki) in the answer below
â€“Â esg
Aug 12 at 14:27

I corrected another typo. Let me know when anything needs more explanation.
â€“Â esg
Aug 13 at 16:17

Thank you @esg. It's a bit involved so I am trying to read the [Shank and Yang] paper carefully then will let you know if I am having difficulties. I also wonder if this can be easily extended to the case where the probabilities of the different coupons within each type are different? I guess in this case, one will have to sum over the probabilities of the sets corresponding to the surjection mappings (I dunno if this is tractable).
â€“Â Abas
Aug 13 at 16:28

1

(1) $[t^k] F(t)$ denotes the coefficient of $t^k$ in the power series $F(t)$ (with obvious generalization to several variables), the operator $[t^k]$ is frequently called "(extraction of) the $k$-th coefficient (operator)" (2) For example, $[t^k] (e^t-1)^r$ is the number that you get when you expand $(e^t-1)^r$ as a power series of the variable $t$ and take the $k$-th coefficient - it's just a number and does not depend on $t$ in any way. (3) I've answered the last question in the text
â€“Â esg
Aug 14 at 17:27

Â |Â
show 3 more comments

up vote
2
down vote

favorite

N. Shank and H. Yang, "Coupon Collector Problem for Non-Uniform Coupons and Random Quotas," Electronic J. of Combinatorics 20(2) 2013

since the coupons here are numbered and the analysis needs to ensure the $m_i$ coupons from type $i$ are distinct. So the question is:

asked Aug 12 at 2:43

Abas

8810

N. Shank and H. Yang, "Coupon Collector Problem for Non-Uniform Coupons and Random Quotas," Electronic J. of Combinatorics 20(2) 2013

since the coupons here are numbered and the analysis needs to ensure the $m_i$ coupons from type $i$ are distinct. So the question is:

asked Aug 12 at 2:43

Abas

8810

asked Aug 12 at 2:43

Abas

8810

asked Aug 12 at 2:43

Abas

8810

asked Aug 12 at 2:43

Abas

8810

Two answers were provided by @joriki and esg at math.stackexchange.com/questions/2825115/â€¦
â€“Â Abas
Aug 12 at 2:56

please note the correction of the typo (missing $s^k$, observed by joriki) in the answer below
â€“Â esg
Aug 12 at 14:27

I corrected another typo. Let me know when anything needs more explanation.
â€“Â esg
Aug 13 at 16:17

Thank you @esg. It's a bit involved so I am trying to read the [Shank and Yang] paper carefully then will let you know if I am having difficulties. I also wonder if this can be easily extended to the case where the probabilities of the different coupons within each type are different? I guess in this case, one will have to sum over the probabilities of the sets corresponding to the surjection mappings (I dunno if this is tractable).
â€“Â Abas
Aug 13 at 16:28

1

(1) $[t^k] F(t)$ denotes the coefficient of $t^k$ in the power series $F(t)$ (with obvious generalization to several variables), the operator $[t^k]$ is frequently called "(extraction of) the $k$-th coefficient (operator)" (2) For example, $[t^k] (e^t-1)^r$ is the number that you get when you expand $(e^t-1)^r$ as a power series of the variable $t$ and take the $k$-th coefficient - it's just a number and does not depend on $t$ in any way. (3) I've answered the last question in the text
â€“Â esg
Aug 14 at 17:27

Â |Â
show 3 more comments

Two answers were provided by @joriki and esg at math.stackexchange.com/questions/2825115/â€¦
â€“Â Abas
Aug 12 at 2:56

please note the correction of the typo (missing $s^k$, observed by joriki) in the answer below
â€“Â esg
Aug 12 at 14:27

I corrected another typo. Let me know when anything needs more explanation.
â€“Â esg
Aug 13 at 16:17

Thank you @esg. It's a bit involved so I am trying to read the [Shank and Yang] paper carefully then will let you know if I am having difficulties. I also wonder if this can be easily extended to the case where the probabilities of the different coupons within each type are different? I guess in this case, one will have to sum over the probabilities of the sets corresponding to the surjection mappings (I dunno if this is tractable).
â€“Â Abas
Aug 13 at 16:28

1

(1) $[t^k] F(t)$ denotes the coefficient of $t^k$ in the power series $F(t)$ (with obvious generalization to several variables), the operator $[t^k]$ is frequently called "(extraction of) the $k$-th coefficient (operator)" (2) For example, $[t^k] (e^t-1)^r$ is the number that you get when you expand $(e^t-1)^r$ as a power series of the variable $t$ and take the $k$-th coefficient - it's just a number and does not depend on $t$ in any way. (3) I've answered the last question in the text
â€“Â esg
Aug 14 at 17:27

Two answers were provided by @joriki and esg at math.stackexchange.com/questions/2825115/â€¦
â€“Â Abas
Aug 12 at 2:56

please note the correction of the typo (missing $s^k$, observed by joriki) in the answer below
â€“Â esg
Aug 12 at 14:27

I corrected another typo. Let me know when anything needs more explanation.
â€“Â esg
Aug 13 at 16:17

Thank you @esg. It's a bit involved so I am trying to read the [Shank and Yang] paper carefully then will let you know if I am having difficulties. I also wonder if this can be easily extended to the case where the probabilities of the different coupons within each type are different? I guess in this case, one will have to sum over the probabilities of the sets corresponding to the surjection mappings (I dunno if this is tractable).
â€“Â Abas
Aug 13 at 16:28

(1) $[t^k] F(t)$ denotes the coefficient of $t^k$ in the power series $F(t)$ (with obvious generalization to several variables), the operator $[t^k]$ is frequently called "(extraction of) the $k$-th coefficient (operator)" (2) For example, $[t^k] (e^t-1)^r$ is the number that you get when you expand $(e^t-1)^r$ as a power series of the variable $t$ and take the $k$-th coefficient - it's just a number and does not depend on $t$ in any way. (3) I've answered the last question in the text
â€“Â esg
Aug 14 at 17:27

Â |Â
show 3 more comments

1 Answer
1

active

oldest

votes

up vote
1
down vote

accepted

I use the same notation as the cited article (so that the results can be directly compared), so
consider an urn which for $iin1,ldots n$ contains $x_igeq 1$ distinguishable coupons of type $i$, altogether
$X_n:=x_1+ldots+x_n$ coupons.

Coupons are drawn with replacement from this urn until (for each $i$) $m_i$ (where $1leq m_ileq x_i$) mutually different coupons
of type $i$ have been drawn.

Let $m:=(m_1,ldots,m_n)$ and $T(m)$ the random time at which this happens, and let $p_x_i(m_i,t):=sum_k=0^m_i-1x_i choose k,t^k$.

Let $Y_1(k),ldots,Y_n(k)$ be the random variables $Y_i(k):=$ number of different coupons of type $i$'' that have been drawn
at "time" $k$.

I use generating functions and start from the following basic

Proposition
The generating function of (the joint distribution of)
$Y_1(k),ldots,Y_n(k)$ is given by:
beginequation*
mathbbE, t_1^Y_1(k)ldots t_n^Y_n(k)=frack! X_n^k[t^k] (1+(e^t-1),t_1)^x_1cdotldotscdot(1+(e^t-1),t_n)^x_n
endequation*

Proof Let $j_1+ldots +j_mleq k$.
Clearly
$$mathbbP(Y_1(k)=j_1,ldots,Y_n(k)=j_n)=frac1X_n^kcdot x_1 choose j_1cdots x_m choose j_mcdot Sur(k,j_1+ldots+j_m)$$
where $Sur(k,r)$ denotes the number of surjective mappings from $1,ldots,k$ onto $1,ldots,r$. It is known that
$Sur(k,r)=k!,[t^k],(e^t-1)^r$ (since a such a surjective mapping corresponds uniquely to an ordered partition of
$1,ldots,k$ into $r$ non-empty subsets, and $e^t-1$ is the exponential generating function for non-empty sets). The assertion
about the g.f. follows. End of proof.

(I) The distribution of $T(m)$

Since $,T(m)leq k=,Y_1(k)geq m_1,ldots,Y_n(k)geq m_n,$ the above gives
beginequation*
mathbbP(T(m)leq k) =frack! X_n^k[t^k] (e^tx_1 -p_x_1(m_1,e^t-1))cdotldotscdot(e^tx_n -p_x_n(m_n,(e^t-1),)
endequation*

From g.f. to probability:
we have to compute beginalign*mathbbP(,Y_1(k)geq m_1,ldots,Y_n(k)geq m_n,)
&=sum_j_1geq m_1,ldots, j_ngeq m_n mathbbP(Y_1(k)=j_1,ldots,Y_n(k)=j_n)\
&=sum_j_1geq m_1,ldots, j_ngeq m_n [t_1^j_1ldots t_n^j_n]mathbbE, t_1^Y_1(k)ldots t_n^Y_n(k)
endalign*
Since the function after $[t^k]$ is a product of factors each containing only one of the $t_i$ variables we can treat these factors individually.
E.g. to account for $mathbbP(Y_1(k)geq m_1,...)$ we have to sum up all the coefficients $[t_1^j]$ with $jgeq m_1$ of the first factor.
We may equivalently sum up all coefficients (i.e. put $t_1=1$) and subtract the sum of the first $m_1$ coefficients. Doing that
"inside" (and leaving the $t^k$ extraction "outside" (coefficients may be extracted in arbitrary order)) we arrive at the factor $e^tx_1-p_x_1(m_1,e^t-1)$, etc..

(II) The expectation of $T(m)$

Finally, using $mathbbE T(m)=sum_kgeq 0 mathbbP(T(m)>k)$ and writing $k! over X_n^k =X_n,int_0^infty s^k e^-X_ns,ds$ leads to

$$mathbbE(T(m))=X_nint_0^infty big(1-prod_i=1^n left(1-p_x_i(m_i,e^s-1),e^-x_i sright)big),ds$$

edited Aug 14 at 17:35

answered Aug 12 at 11:44

esg

20614

add a commentÂ |Â

Your Answer

StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\$","\$"]]);
);
);
, "mathjax-editing");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "69"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
noCode: true, onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f2879927%2fvariation-of-coupon-collector-problem-with-distinct-coupons-in-each-type%23new-answer', 'question_page');

);

Post as a guest

Name

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

up vote
1
down vote

accepted

Coupons are drawn with replacement from this urn until (for each $i$) $m_i$ (where $1leq m_ileq x_i$) mutually different coupons
of type $i$ have been drawn.

Let $m:=(m_1,ldots,m_n)$ and $T(m)$ the random time at which this happens, and let $p_x_i(m_i,t):=sum_k=0^m_i-1x_i choose k,t^k$.

Let $Y_1(k),ldots,Y_n(k)$ be the random variables $Y_i(k):=$ number of different coupons of type $i$'' that have been drawn
at "time" $k$.

I use generating functions and start from the following basic

(I) The distribution of $T(m)$

(II) The expectation of $T(m)$

Finally, using $mathbbE T(m)=sum_kgeq 0 mathbbP(T(m)>k)$ and writing $k! over X_n^k =X_n,int_0^infty s^k e^-X_ns,ds$ leads to

$$mathbbE(T(m))=X_nint_0^infty big(1-prod_i=1^n left(1-p_x_i(m_i,e^s-1),e^-x_i sright)big),ds$$

edited Aug 14 at 17:35

answered Aug 12 at 11:44

esg

20614

add a commentÂ |Â

up vote
1
down vote

accepted

Coupons are drawn with replacement from this urn until (for each $i$) $m_i$ (where $1leq m_ileq x_i$) mutually different coupons
of type $i$ have been drawn.

Let $m:=(m_1,ldots,m_n)$ and $T(m)$ the random time at which this happens, and let $p_x_i(m_i,t):=sum_k=0^m_i-1x_i choose k,t^k$.

Let $Y_1(k),ldots,Y_n(k)$ be the random variables $Y_i(k):=$ number of different coupons of type $i$'' that have been drawn
at "time" $k$.

I use generating functions and start from the following basic

(I) The distribution of $T(m)$

(II) The expectation of $T(m)$

Finally, using $mathbbE T(m)=sum_kgeq 0 mathbbP(T(m)>k)$ and writing $k! over X_n^k =X_n,int_0^infty s^k e^-X_ns,ds$ leads to

$$mathbbE(T(m))=X_nint_0^infty big(1-prod_i=1^n left(1-p_x_i(m_i,e^s-1),e^-x_i sright)big),ds$$

edited Aug 14 at 17:35

answered Aug 12 at 11:44

esg

20614

add a commentÂ |Â

up vote
1
down vote

accepted

Coupons are drawn with replacement from this urn until (for each $i$) $m_i$ (where $1leq m_ileq x_i$) mutually different coupons
of type $i$ have been drawn.

Let $m:=(m_1,ldots,m_n)$ and $T(m)$ the random time at which this happens, and let $p_x_i(m_i,t):=sum_k=0^m_i-1x_i choose k,t^k$.

Let $Y_1(k),ldots,Y_n(k)$ be the random variables $Y_i(k):=$ number of different coupons of type $i$'' that have been drawn
at "time" $k$.

I use generating functions and start from the following basic

(I) The distribution of $T(m)$

(II) The expectation of $T(m)$

Finally, using $mathbbE T(m)=sum_kgeq 0 mathbbP(T(m)>k)$ and writing $k! over X_n^k =X_n,int_0^infty s^k e^-X_ns,ds$ leads to

$$mathbbE(T(m))=X_nint_0^infty big(1-prod_i=1^n left(1-p_x_i(m_i,e^s-1),e^-x_i sright)big),ds$$

edited Aug 14 at 17:35

answered Aug 12 at 11:44

esg

20614

Coupons are drawn with replacement from this urn until (for each $i$) $m_i$ (where $1leq m_ileq x_i$) mutually different coupons
of type $i$ have been drawn.

Let $m:=(m_1,ldots,m_n)$ and $T(m)$ the random time at which this happens, and let $p_x_i(m_i,t):=sum_k=0^m_i-1x_i choose k,t^k$.

Let $Y_1(k),ldots,Y_n(k)$ be the random variables $Y_i(k):=$ number of different coupons of type $i$'' that have been drawn
at "time" $k$.

I use generating functions and start from the following basic

(I) The distribution of $T(m)$

(II) The expectation of $T(m)$

Finally, using $mathbbE T(m)=sum_kgeq 0 mathbbP(T(m)>k)$ and writing $k! over X_n^k =X_n,int_0^infty s^k e^-X_ns,ds$ leads to

$$mathbbE(T(m))=X_nint_0^infty big(1-prod_i=1^n left(1-p_x_i(m_i,e^s-1),e^-x_i sright)big),ds$$

edited Aug 14 at 17:35

answered Aug 12 at 11:44

esg

20614

edited Aug 14 at 17:35

answered Aug 12 at 11:44

esg

20614

answered Aug 12 at 11:44

esg

20614

answered Aug 12 at 11:44

esg

20614

add a commentÂ |Â

draft saved

draft discarded

draft saved

draft discarded

Post as a guest

Name

搜尋此網誌

Vtyjkyuk