Variation of coupon collector problem with distinct coupons in each type
Clash Royale CLAN TAG#URR8PPP
up vote
2
down vote
favorite
I am interested in yet another variation of the coupon collector problem. Specifically, we have an urn with $n_i$ "distinct" coupons of type $i$, where $i = 1,ldots, K$ (for example, a type can be thought of as a color but within that color the coupons are numbered). The goal is to sample sufficiently with replacement to get a collection having at least $m_i$ distinct coupons from type $i$ for $i = 1,ldots, K$ (any $m_i$ out of the $n_i$ different coupons within that type would work). Note that this is different from the variation in:
N. Shank and H. Yang, "Coupon Collector Problem for Non-Uniform Coupons and Random Quotas," Electronic J. of Combinatorics 20(2) 2013
since the coupons here are numbered and the analysis needs to ensure the $m_i$ coupons from type $i$ are distinct. So the question is:
For a given number of samples $n$, what is the probability that we achieve the requirement above (i.e., obtain $m_i$ distinct coupons from the types $i = 1,ldots,K$). Upper and lower bounds would also be useful. Also, what is the expected value of the number of samples $E[N]$ to achieve this requirement?
probability coupon-collector
 |Â
show 3 more comments
up vote
2
down vote
favorite
I am interested in yet another variation of the coupon collector problem. Specifically, we have an urn with $n_i$ "distinct" coupons of type $i$, where $i = 1,ldots, K$ (for example, a type can be thought of as a color but within that color the coupons are numbered). The goal is to sample sufficiently with replacement to get a collection having at least $m_i$ distinct coupons from type $i$ for $i = 1,ldots, K$ (any $m_i$ out of the $n_i$ different coupons within that type would work). Note that this is different from the variation in:
N. Shank and H. Yang, "Coupon Collector Problem for Non-Uniform Coupons and Random Quotas," Electronic J. of Combinatorics 20(2) 2013
since the coupons here are numbered and the analysis needs to ensure the $m_i$ coupons from type $i$ are distinct. So the question is:
For a given number of samples $n$, what is the probability that we achieve the requirement above (i.e., obtain $m_i$ distinct coupons from the types $i = 1,ldots,K$). Upper and lower bounds would also be useful. Also, what is the expected value of the number of samples $E[N]$ to achieve this requirement?
probability coupon-collector
Two answers were provided by @joriki and esg at math.stackexchange.com/questions/2825115/â¦
â Abas
Aug 12 at 2:56
please note the correction of the typo (missing $s^k$, observed by joriki) in the answer below
â esg
Aug 12 at 14:27
I corrected another typo. Let me know when anything needs more explanation.
â esg
Aug 13 at 16:17
Thank you @esg. It's a bit involved so I am trying to read the [Shank and Yang] paper carefully then will let you know if I am having difficulties. I also wonder if this can be easily extended to the case where the probabilities of the different coupons within each type are different? I guess in this case, one will have to sum over the probabilities of the sets corresponding to the surjection mappings (I dunno if this is tractable).
â Abas
Aug 13 at 16:28
1
(1) $[t^k] F(t)$ denotes the coefficient of $t^k$ in the power series $F(t)$ (with obvious generalization to several variables), the operator $[t^k]$ is frequently called "(extraction of) the $k$-th coefficient (operator)" (2) For example, $[t^k] (e^t-1)^r$ is the number that you get when you expand $(e^t-1)^r$ as a power series of the variable $t$ and take the $k$-th coefficient - it's just a number and does not depend on $t$ in any way. (3) I've answered the last question in the text
â esg
Aug 14 at 17:27
 |Â
show 3 more comments
up vote
2
down vote
favorite
up vote
2
down vote
favorite
I am interested in yet another variation of the coupon collector problem. Specifically, we have an urn with $n_i$ "distinct" coupons of type $i$, where $i = 1,ldots, K$ (for example, a type can be thought of as a color but within that color the coupons are numbered). The goal is to sample sufficiently with replacement to get a collection having at least $m_i$ distinct coupons from type $i$ for $i = 1,ldots, K$ (any $m_i$ out of the $n_i$ different coupons within that type would work). Note that this is different from the variation in:
N. Shank and H. Yang, "Coupon Collector Problem for Non-Uniform Coupons and Random Quotas," Electronic J. of Combinatorics 20(2) 2013
since the coupons here are numbered and the analysis needs to ensure the $m_i$ coupons from type $i$ are distinct. So the question is:
For a given number of samples $n$, what is the probability that we achieve the requirement above (i.e., obtain $m_i$ distinct coupons from the types $i = 1,ldots,K$). Upper and lower bounds would also be useful. Also, what is the expected value of the number of samples $E[N]$ to achieve this requirement?
probability coupon-collector
I am interested in yet another variation of the coupon collector problem. Specifically, we have an urn with $n_i$ "distinct" coupons of type $i$, where $i = 1,ldots, K$ (for example, a type can be thought of as a color but within that color the coupons are numbered). The goal is to sample sufficiently with replacement to get a collection having at least $m_i$ distinct coupons from type $i$ for $i = 1,ldots, K$ (any $m_i$ out of the $n_i$ different coupons within that type would work). Note that this is different from the variation in:
N. Shank and H. Yang, "Coupon Collector Problem for Non-Uniform Coupons and Random Quotas," Electronic J. of Combinatorics 20(2) 2013
since the coupons here are numbered and the analysis needs to ensure the $m_i$ coupons from type $i$ are distinct. So the question is:
For a given number of samples $n$, what is the probability that we achieve the requirement above (i.e., obtain $m_i$ distinct coupons from the types $i = 1,ldots,K$). Upper and lower bounds would also be useful. Also, what is the expected value of the number of samples $E[N]$ to achieve this requirement?
probability coupon-collector
asked Aug 12 at 2:43
Abas
8810
8810
Two answers were provided by @joriki and esg at math.stackexchange.com/questions/2825115/â¦
â Abas
Aug 12 at 2:56
please note the correction of the typo (missing $s^k$, observed by joriki) in the answer below
â esg
Aug 12 at 14:27
I corrected another typo. Let me know when anything needs more explanation.
â esg
Aug 13 at 16:17
Thank you @esg. It's a bit involved so I am trying to read the [Shank and Yang] paper carefully then will let you know if I am having difficulties. I also wonder if this can be easily extended to the case where the probabilities of the different coupons within each type are different? I guess in this case, one will have to sum over the probabilities of the sets corresponding to the surjection mappings (I dunno if this is tractable).
â Abas
Aug 13 at 16:28
1
(1) $[t^k] F(t)$ denotes the coefficient of $t^k$ in the power series $F(t)$ (with obvious generalization to several variables), the operator $[t^k]$ is frequently called "(extraction of) the $k$-th coefficient (operator)" (2) For example, $[t^k] (e^t-1)^r$ is the number that you get when you expand $(e^t-1)^r$ as a power series of the variable $t$ and take the $k$-th coefficient - it's just a number and does not depend on $t$ in any way. (3) I've answered the last question in the text
â esg
Aug 14 at 17:27
 |Â
show 3 more comments
Two answers were provided by @joriki and esg at math.stackexchange.com/questions/2825115/â¦
â Abas
Aug 12 at 2:56
please note the correction of the typo (missing $s^k$, observed by joriki) in the answer below
â esg
Aug 12 at 14:27
I corrected another typo. Let me know when anything needs more explanation.
â esg
Aug 13 at 16:17
Thank you @esg. It's a bit involved so I am trying to read the [Shank and Yang] paper carefully then will let you know if I am having difficulties. I also wonder if this can be easily extended to the case where the probabilities of the different coupons within each type are different? I guess in this case, one will have to sum over the probabilities of the sets corresponding to the surjection mappings (I dunno if this is tractable).
â Abas
Aug 13 at 16:28
1
(1) $[t^k] F(t)$ denotes the coefficient of $t^k$ in the power series $F(t)$ (with obvious generalization to several variables), the operator $[t^k]$ is frequently called "(extraction of) the $k$-th coefficient (operator)" (2) For example, $[t^k] (e^t-1)^r$ is the number that you get when you expand $(e^t-1)^r$ as a power series of the variable $t$ and take the $k$-th coefficient - it's just a number and does not depend on $t$ in any way. (3) I've answered the last question in the text
â esg
Aug 14 at 17:27
Two answers were provided by @joriki and esg at math.stackexchange.com/questions/2825115/â¦
â Abas
Aug 12 at 2:56
Two answers were provided by @joriki and esg at math.stackexchange.com/questions/2825115/â¦
â Abas
Aug 12 at 2:56
please note the correction of the typo (missing $s^k$, observed by joriki) in the answer below
â esg
Aug 12 at 14:27
please note the correction of the typo (missing $s^k$, observed by joriki) in the answer below
â esg
Aug 12 at 14:27
I corrected another typo. Let me know when anything needs more explanation.
â esg
Aug 13 at 16:17
I corrected another typo. Let me know when anything needs more explanation.
â esg
Aug 13 at 16:17
Thank you @esg. It's a bit involved so I am trying to read the [Shank and Yang] paper carefully then will let you know if I am having difficulties. I also wonder if this can be easily extended to the case where the probabilities of the different coupons within each type are different? I guess in this case, one will have to sum over the probabilities of the sets corresponding to the surjection mappings (I dunno if this is tractable).
â Abas
Aug 13 at 16:28
Thank you @esg. It's a bit involved so I am trying to read the [Shank and Yang] paper carefully then will let you know if I am having difficulties. I also wonder if this can be easily extended to the case where the probabilities of the different coupons within each type are different? I guess in this case, one will have to sum over the probabilities of the sets corresponding to the surjection mappings (I dunno if this is tractable).
â Abas
Aug 13 at 16:28
1
1
(1) $[t^k] F(t)$ denotes the coefficient of $t^k$ in the power series $F(t)$ (with obvious generalization to several variables), the operator $[t^k]$ is frequently called "(extraction of) the $k$-th coefficient (operator)" (2) For example, $[t^k] (e^t-1)^r$ is the number that you get when you expand $(e^t-1)^r$ as a power series of the variable $t$ and take the $k$-th coefficient - it's just a number and does not depend on $t$ in any way. (3) I've answered the last question in the text
â esg
Aug 14 at 17:27
(1) $[t^k] F(t)$ denotes the coefficient of $t^k$ in the power series $F(t)$ (with obvious generalization to several variables), the operator $[t^k]$ is frequently called "(extraction of) the $k$-th coefficient (operator)" (2) For example, $[t^k] (e^t-1)^r$ is the number that you get when you expand $(e^t-1)^r$ as a power series of the variable $t$ and take the $k$-th coefficient - it's just a number and does not depend on $t$ in any way. (3) I've answered the last question in the text
â esg
Aug 14 at 17:27
 |Â
show 3 more comments
1 Answer
1
active
oldest
votes
up vote
1
down vote
accepted
I use the same notation as the cited article (so that the results can be directly compared), so
consider an urn which for $iin1,ldots n$ contains $x_igeq 1$ distinguishable coupons of type $i$, altogether
$X_n:=x_1+ldots+x_n$ coupons.
Coupons are drawn with replacement from this urn until (for each $i$) $m_i$ (where $1leq m_ileq x_i$) mutually different coupons
of type $i$ have been drawn.
Let $m:=(m_1,ldots,m_n)$ and $T(m)$ the random time at which this happens, and let $p_x_i(m_i,t):=sum_k=0^m_i-1x_i choose k,t^k$.
Let $Y_1(k),ldots,Y_n(k)$ be the random variables $Y_i(k):=$ number of different coupons of type $i$'' that have been drawn
at "time" $k$.
I use generating functions and start from the following basic
Proposition
The generating function of (the joint distribution of)
$Y_1(k),ldots,Y_n(k)$ is given by:
beginequation*
mathbbE, t_1^Y_1(k)ldots t_n^Y_n(k)=frack! X_n^k[t^k] (1+(e^t-1),t_1)^x_1cdotldotscdot(1+(e^t-1),t_n)^x_n
endequation*
Proof Let $j_1+ldots +j_mleq k$.
Clearly
$$mathbbP(Y_1(k)=j_1,ldots,Y_n(k)=j_n)=frac1X_n^kcdot x_1 choose j_1cdots x_m choose j_mcdot Sur(k,j_1+ldots+j_m)$$
where $Sur(k,r)$ denotes the number of surjective mappings from $1,ldots,k$ onto $1,ldots,r$. It is known that
$Sur(k,r)=k!,[t^k],(e^t-1)^r$ (since a such a surjective mapping corresponds uniquely to an ordered partition of
$1,ldots,k$ into $r$ non-empty subsets, and $e^t-1$ is the exponential generating function for non-empty sets). The assertion
about the g.f. follows. End of proof.
(I) The distribution of $T(m)$
Since $,T(m)leq k=,Y_1(k)geq m_1,ldots,Y_n(k)geq m_n,$ the above gives
beginequation*
mathbbP(T(m)leq k) =frack! X_n^k[t^k] (e^tx_1 -p_x_1(m_1,e^t-1))cdotldotscdot(e^tx_n -p_x_n(m_n,(e^t-1),)
endequation*
From g.f. to probability:
we have to compute beginalign*mathbbP(,Y_1(k)geq m_1,ldots,Y_n(k)geq m_n,)
&=sum_j_1geq m_1,ldots, j_ngeq m_n mathbbP(Y_1(k)=j_1,ldots,Y_n(k)=j_n)\
&=sum_j_1geq m_1,ldots, j_ngeq m_n [t_1^j_1ldots t_n^j_n]mathbbE, t_1^Y_1(k)ldots t_n^Y_n(k)
endalign*
Since the function after $[t^k]$ is a product of factors each containing only one of the $t_i$ variables we can treat these factors individually.
E.g. to account for $mathbbP(Y_1(k)geq m_1,...)$ we have to sum up all the coefficients $[t_1^j]$ with $jgeq m_1$ of the first factor.
We may equivalently sum up all coefficients (i.e. put $t_1=1$) and subtract the sum of the first $m_1$ coefficients. Doing that
"inside" (and leaving the $t^k$ extraction "outside" (coefficients may be extracted in arbitrary order)) we arrive at the factor $e^tx_1-p_x_1(m_1,e^t-1)$, etc..
(II) The expectation of $T(m)$
Finally, using $mathbbE T(m)=sum_kgeq 0 mathbbP(T(m)>k)$ and writing $k! over X_n^k =X_n,int_0^infty s^k e^-X_ns,ds$ leads to
$$mathbbE(T(m))=X_nint_0^infty big(1-prod_i=1^n left(1-p_x_i(m_i,e^s-1),e^-x_i sright)big),ds$$
add a comment |Â
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
1
down vote
accepted
I use the same notation as the cited article (so that the results can be directly compared), so
consider an urn which for $iin1,ldots n$ contains $x_igeq 1$ distinguishable coupons of type $i$, altogether
$X_n:=x_1+ldots+x_n$ coupons.
Coupons are drawn with replacement from this urn until (for each $i$) $m_i$ (where $1leq m_ileq x_i$) mutually different coupons
of type $i$ have been drawn.
Let $m:=(m_1,ldots,m_n)$ and $T(m)$ the random time at which this happens, and let $p_x_i(m_i,t):=sum_k=0^m_i-1x_i choose k,t^k$.
Let $Y_1(k),ldots,Y_n(k)$ be the random variables $Y_i(k):=$ number of different coupons of type $i$'' that have been drawn
at "time" $k$.
I use generating functions and start from the following basic
Proposition
The generating function of (the joint distribution of)
$Y_1(k),ldots,Y_n(k)$ is given by:
beginequation*
mathbbE, t_1^Y_1(k)ldots t_n^Y_n(k)=frack! X_n^k[t^k] (1+(e^t-1),t_1)^x_1cdotldotscdot(1+(e^t-1),t_n)^x_n
endequation*
Proof Let $j_1+ldots +j_mleq k$.
Clearly
$$mathbbP(Y_1(k)=j_1,ldots,Y_n(k)=j_n)=frac1X_n^kcdot x_1 choose j_1cdots x_m choose j_mcdot Sur(k,j_1+ldots+j_m)$$
where $Sur(k,r)$ denotes the number of surjective mappings from $1,ldots,k$ onto $1,ldots,r$. It is known that
$Sur(k,r)=k!,[t^k],(e^t-1)^r$ (since a such a surjective mapping corresponds uniquely to an ordered partition of
$1,ldots,k$ into $r$ non-empty subsets, and $e^t-1$ is the exponential generating function for non-empty sets). The assertion
about the g.f. follows. End of proof.
(I) The distribution of $T(m)$
Since $,T(m)leq k=,Y_1(k)geq m_1,ldots,Y_n(k)geq m_n,$ the above gives
beginequation*
mathbbP(T(m)leq k) =frack! X_n^k[t^k] (e^tx_1 -p_x_1(m_1,e^t-1))cdotldotscdot(e^tx_n -p_x_n(m_n,(e^t-1),)
endequation*
From g.f. to probability:
we have to compute beginalign*mathbbP(,Y_1(k)geq m_1,ldots,Y_n(k)geq m_n,)
&=sum_j_1geq m_1,ldots, j_ngeq m_n mathbbP(Y_1(k)=j_1,ldots,Y_n(k)=j_n)\
&=sum_j_1geq m_1,ldots, j_ngeq m_n [t_1^j_1ldots t_n^j_n]mathbbE, t_1^Y_1(k)ldots t_n^Y_n(k)
endalign*
Since the function after $[t^k]$ is a product of factors each containing only one of the $t_i$ variables we can treat these factors individually.
E.g. to account for $mathbbP(Y_1(k)geq m_1,...)$ we have to sum up all the coefficients $[t_1^j]$ with $jgeq m_1$ of the first factor.
We may equivalently sum up all coefficients (i.e. put $t_1=1$) and subtract the sum of the first $m_1$ coefficients. Doing that
"inside" (and leaving the $t^k$ extraction "outside" (coefficients may be extracted in arbitrary order)) we arrive at the factor $e^tx_1-p_x_1(m_1,e^t-1)$, etc..
(II) The expectation of $T(m)$
Finally, using $mathbbE T(m)=sum_kgeq 0 mathbbP(T(m)>k)$ and writing $k! over X_n^k =X_n,int_0^infty s^k e^-X_ns,ds$ leads to
$$mathbbE(T(m))=X_nint_0^infty big(1-prod_i=1^n left(1-p_x_i(m_i,e^s-1),e^-x_i sright)big),ds$$
add a comment |Â
up vote
1
down vote
accepted
I use the same notation as the cited article (so that the results can be directly compared), so
consider an urn which for $iin1,ldots n$ contains $x_igeq 1$ distinguishable coupons of type $i$, altogether
$X_n:=x_1+ldots+x_n$ coupons.
Coupons are drawn with replacement from this urn until (for each $i$) $m_i$ (where $1leq m_ileq x_i$) mutually different coupons
of type $i$ have been drawn.
Let $m:=(m_1,ldots,m_n)$ and $T(m)$ the random time at which this happens, and let $p_x_i(m_i,t):=sum_k=0^m_i-1x_i choose k,t^k$.
Let $Y_1(k),ldots,Y_n(k)$ be the random variables $Y_i(k):=$ number of different coupons of type $i$'' that have been drawn
at "time" $k$.
I use generating functions and start from the following basic
Proposition
The generating function of (the joint distribution of)
$Y_1(k),ldots,Y_n(k)$ is given by:
beginequation*
mathbbE, t_1^Y_1(k)ldots t_n^Y_n(k)=frack! X_n^k[t^k] (1+(e^t-1),t_1)^x_1cdotldotscdot(1+(e^t-1),t_n)^x_n
endequation*
Proof Let $j_1+ldots +j_mleq k$.
Clearly
$$mathbbP(Y_1(k)=j_1,ldots,Y_n(k)=j_n)=frac1X_n^kcdot x_1 choose j_1cdots x_m choose j_mcdot Sur(k,j_1+ldots+j_m)$$
where $Sur(k,r)$ denotes the number of surjective mappings from $1,ldots,k$ onto $1,ldots,r$. It is known that
$Sur(k,r)=k!,[t^k],(e^t-1)^r$ (since a such a surjective mapping corresponds uniquely to an ordered partition of
$1,ldots,k$ into $r$ non-empty subsets, and $e^t-1$ is the exponential generating function for non-empty sets). The assertion
about the g.f. follows. End of proof.
(I) The distribution of $T(m)$
Since $,T(m)leq k=,Y_1(k)geq m_1,ldots,Y_n(k)geq m_n,$ the above gives
beginequation*
mathbbP(T(m)leq k) =frack! X_n^k[t^k] (e^tx_1 -p_x_1(m_1,e^t-1))cdotldotscdot(e^tx_n -p_x_n(m_n,(e^t-1),)
endequation*
From g.f. to probability:
we have to compute beginalign*mathbbP(,Y_1(k)geq m_1,ldots,Y_n(k)geq m_n,)
&=sum_j_1geq m_1,ldots, j_ngeq m_n mathbbP(Y_1(k)=j_1,ldots,Y_n(k)=j_n)\
&=sum_j_1geq m_1,ldots, j_ngeq m_n [t_1^j_1ldots t_n^j_n]mathbbE, t_1^Y_1(k)ldots t_n^Y_n(k)
endalign*
Since the function after $[t^k]$ is a product of factors each containing only one of the $t_i$ variables we can treat these factors individually.
E.g. to account for $mathbbP(Y_1(k)geq m_1,...)$ we have to sum up all the coefficients $[t_1^j]$ with $jgeq m_1$ of the first factor.
We may equivalently sum up all coefficients (i.e. put $t_1=1$) and subtract the sum of the first $m_1$ coefficients. Doing that
"inside" (and leaving the $t^k$ extraction "outside" (coefficients may be extracted in arbitrary order)) we arrive at the factor $e^tx_1-p_x_1(m_1,e^t-1)$, etc..
(II) The expectation of $T(m)$
Finally, using $mathbbE T(m)=sum_kgeq 0 mathbbP(T(m)>k)$ and writing $k! over X_n^k =X_n,int_0^infty s^k e^-X_ns,ds$ leads to
$$mathbbE(T(m))=X_nint_0^infty big(1-prod_i=1^n left(1-p_x_i(m_i,e^s-1),e^-x_i sright)big),ds$$
add a comment |Â
up vote
1
down vote
accepted
up vote
1
down vote
accepted
I use the same notation as the cited article (so that the results can be directly compared), so
consider an urn which for $iin1,ldots n$ contains $x_igeq 1$ distinguishable coupons of type $i$, altogether
$X_n:=x_1+ldots+x_n$ coupons.
Coupons are drawn with replacement from this urn until (for each $i$) $m_i$ (where $1leq m_ileq x_i$) mutually different coupons
of type $i$ have been drawn.
Let $m:=(m_1,ldots,m_n)$ and $T(m)$ the random time at which this happens, and let $p_x_i(m_i,t):=sum_k=0^m_i-1x_i choose k,t^k$.
Let $Y_1(k),ldots,Y_n(k)$ be the random variables $Y_i(k):=$ number of different coupons of type $i$'' that have been drawn
at "time" $k$.
I use generating functions and start from the following basic
Proposition
The generating function of (the joint distribution of)
$Y_1(k),ldots,Y_n(k)$ is given by:
beginequation*
mathbbE, t_1^Y_1(k)ldots t_n^Y_n(k)=frack! X_n^k[t^k] (1+(e^t-1),t_1)^x_1cdotldotscdot(1+(e^t-1),t_n)^x_n
endequation*
Proof Let $j_1+ldots +j_mleq k$.
Clearly
$$mathbbP(Y_1(k)=j_1,ldots,Y_n(k)=j_n)=frac1X_n^kcdot x_1 choose j_1cdots x_m choose j_mcdot Sur(k,j_1+ldots+j_m)$$
where $Sur(k,r)$ denotes the number of surjective mappings from $1,ldots,k$ onto $1,ldots,r$. It is known that
$Sur(k,r)=k!,[t^k],(e^t-1)^r$ (since a such a surjective mapping corresponds uniquely to an ordered partition of
$1,ldots,k$ into $r$ non-empty subsets, and $e^t-1$ is the exponential generating function for non-empty sets). The assertion
about the g.f. follows. End of proof.
(I) The distribution of $T(m)$
Since $,T(m)leq k=,Y_1(k)geq m_1,ldots,Y_n(k)geq m_n,$ the above gives
beginequation*
mathbbP(T(m)leq k) =frack! X_n^k[t^k] (e^tx_1 -p_x_1(m_1,e^t-1))cdotldotscdot(e^tx_n -p_x_n(m_n,(e^t-1),)
endequation*
From g.f. to probability:
we have to compute beginalign*mathbbP(,Y_1(k)geq m_1,ldots,Y_n(k)geq m_n,)
&=sum_j_1geq m_1,ldots, j_ngeq m_n mathbbP(Y_1(k)=j_1,ldots,Y_n(k)=j_n)\
&=sum_j_1geq m_1,ldots, j_ngeq m_n [t_1^j_1ldots t_n^j_n]mathbbE, t_1^Y_1(k)ldots t_n^Y_n(k)
endalign*
Since the function after $[t^k]$ is a product of factors each containing only one of the $t_i$ variables we can treat these factors individually.
E.g. to account for $mathbbP(Y_1(k)geq m_1,...)$ we have to sum up all the coefficients $[t_1^j]$ with $jgeq m_1$ of the first factor.
We may equivalently sum up all coefficients (i.e. put $t_1=1$) and subtract the sum of the first $m_1$ coefficients. Doing that
"inside" (and leaving the $t^k$ extraction "outside" (coefficients may be extracted in arbitrary order)) we arrive at the factor $e^tx_1-p_x_1(m_1,e^t-1)$, etc..
(II) The expectation of $T(m)$
Finally, using $mathbbE T(m)=sum_kgeq 0 mathbbP(T(m)>k)$ and writing $k! over X_n^k =X_n,int_0^infty s^k e^-X_ns,ds$ leads to
$$mathbbE(T(m))=X_nint_0^infty big(1-prod_i=1^n left(1-p_x_i(m_i,e^s-1),e^-x_i sright)big),ds$$
I use the same notation as the cited article (so that the results can be directly compared), so
consider an urn which for $iin1,ldots n$ contains $x_igeq 1$ distinguishable coupons of type $i$, altogether
$X_n:=x_1+ldots+x_n$ coupons.
Coupons are drawn with replacement from this urn until (for each $i$) $m_i$ (where $1leq m_ileq x_i$) mutually different coupons
of type $i$ have been drawn.
Let $m:=(m_1,ldots,m_n)$ and $T(m)$ the random time at which this happens, and let $p_x_i(m_i,t):=sum_k=0^m_i-1x_i choose k,t^k$.
Let $Y_1(k),ldots,Y_n(k)$ be the random variables $Y_i(k):=$ number of different coupons of type $i$'' that have been drawn
at "time" $k$.
I use generating functions and start from the following basic
Proposition
The generating function of (the joint distribution of)
$Y_1(k),ldots,Y_n(k)$ is given by:
beginequation*
mathbbE, t_1^Y_1(k)ldots t_n^Y_n(k)=frack! X_n^k[t^k] (1+(e^t-1),t_1)^x_1cdotldotscdot(1+(e^t-1),t_n)^x_n
endequation*
Proof Let $j_1+ldots +j_mleq k$.
Clearly
$$mathbbP(Y_1(k)=j_1,ldots,Y_n(k)=j_n)=frac1X_n^kcdot x_1 choose j_1cdots x_m choose j_mcdot Sur(k,j_1+ldots+j_m)$$
where $Sur(k,r)$ denotes the number of surjective mappings from $1,ldots,k$ onto $1,ldots,r$. It is known that
$Sur(k,r)=k!,[t^k],(e^t-1)^r$ (since a such a surjective mapping corresponds uniquely to an ordered partition of
$1,ldots,k$ into $r$ non-empty subsets, and $e^t-1$ is the exponential generating function for non-empty sets). The assertion
about the g.f. follows. End of proof.
(I) The distribution of $T(m)$
Since $,T(m)leq k=,Y_1(k)geq m_1,ldots,Y_n(k)geq m_n,$ the above gives
beginequation*
mathbbP(T(m)leq k) =frack! X_n^k[t^k] (e^tx_1 -p_x_1(m_1,e^t-1))cdotldotscdot(e^tx_n -p_x_n(m_n,(e^t-1),)
endequation*
From g.f. to probability:
we have to compute beginalign*mathbbP(,Y_1(k)geq m_1,ldots,Y_n(k)geq m_n,)
&=sum_j_1geq m_1,ldots, j_ngeq m_n mathbbP(Y_1(k)=j_1,ldots,Y_n(k)=j_n)\
&=sum_j_1geq m_1,ldots, j_ngeq m_n [t_1^j_1ldots t_n^j_n]mathbbE, t_1^Y_1(k)ldots t_n^Y_n(k)
endalign*
Since the function after $[t^k]$ is a product of factors each containing only one of the $t_i$ variables we can treat these factors individually.
E.g. to account for $mathbbP(Y_1(k)geq m_1,...)$ we have to sum up all the coefficients $[t_1^j]$ with $jgeq m_1$ of the first factor.
We may equivalently sum up all coefficients (i.e. put $t_1=1$) and subtract the sum of the first $m_1$ coefficients. Doing that
"inside" (and leaving the $t^k$ extraction "outside" (coefficients may be extracted in arbitrary order)) we arrive at the factor $e^tx_1-p_x_1(m_1,e^t-1)$, etc..
(II) The expectation of $T(m)$
Finally, using $mathbbE T(m)=sum_kgeq 0 mathbbP(T(m)>k)$ and writing $k! over X_n^k =X_n,int_0^infty s^k e^-X_ns,ds$ leads to
$$mathbbE(T(m))=X_nint_0^infty big(1-prod_i=1^n left(1-p_x_i(m_i,e^s-1),e^-x_i sright)big),ds$$
edited Aug 14 at 17:35
answered Aug 12 at 11:44
esg
20614
20614
add a comment |Â
add a comment |Â
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f2879927%2fvariation-of-coupon-collector-problem-with-distinct-coupons-in-each-type%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Two answers were provided by @joriki and esg at math.stackexchange.com/questions/2825115/â¦
â Abas
Aug 12 at 2:56
please note the correction of the typo (missing $s^k$, observed by joriki) in the answer below
â esg
Aug 12 at 14:27
I corrected another typo. Let me know when anything needs more explanation.
â esg
Aug 13 at 16:17
Thank you @esg. It's a bit involved so I am trying to read the [Shank and Yang] paper carefully then will let you know if I am having difficulties. I also wonder if this can be easily extended to the case where the probabilities of the different coupons within each type are different? I guess in this case, one will have to sum over the probabilities of the sets corresponding to the surjection mappings (I dunno if this is tractable).
â Abas
Aug 13 at 16:28
1
(1) $[t^k] F(t)$ denotes the coefficient of $t^k$ in the power series $F(t)$ (with obvious generalization to several variables), the operator $[t^k]$ is frequently called "(extraction of) the $k$-th coefficient (operator)" (2) For example, $[t^k] (e^t-1)^r$ is the number that you get when you expand $(e^t-1)^r$ as a power series of the variable $t$ and take the $k$-th coefficient - it's just a number and does not depend on $t$ in any way. (3) I've answered the last question in the text
â esg
Aug 14 at 17:27