Expected number of probes for a search miss in a given hash table using linear probing

$m$ is the size of the hash table and $n$ is the number of entries in the hash table.
We assume the hash function uniformly and independently distributes the keys among the values $0$ to $m-1$.
In Robert Sedgewick's book Algorithms, 4th ed., he writes:
If a cluster is of length $t$, then the expression $(t+(t-1)+\ldots
+2+1)/m=t(t+1)/(2m)$ counts the contribution of that cluster to the grand total.
The sum of the cluster lengths is $n$, so, adding this cost for all
entries in the table, we find that the total average cost for a search
miss is $1+n/(2m)$ plus the sum of the squares of the lengths of the
clusters, divided by $2m$. Thus, given a table, we can quickly compute
the average cost of a search miss in that table.
My question is why we add $n/(2m)$ to the expected number of probes. I understand that the $1$ is added because every entry requires at least one probe, and that adding the square of the length of a cluster (divided by $2m$) accounts for a search miss beginning within that cluster, but I don't see how the $n/(2m)$ is relevant.
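One way to see where the term could come from is to expand the quoted per-cluster expression: $t(t+1)/(2m) = t^2/(2m) + t/(2m)$. Summing the second piece over all clusters gives $\sum_i t_i/(2m) = n/(2m)$, because the cluster lengths sum to $n$; the first piece gives the sum of squares divided by $2m$. The sketch below checks this numerically on one small table. It is an illustration only, not code from the book: the function names, the use of `None` for an empty slot, the subscripted $t_i$, and the example table are all assumptions, and the table must contain at least one empty slot. A search miss is taken to start at each of the $m$ slots with equal probability, as the uniform hashing assumption suggests, and the probe of the terminating empty slot is counted.

```python
from fractions import Fraction


def miss_cost_brute_force(table):
    """Average probes for a search miss, starting once from every slot.

    Assumes `None` marks an empty slot and that at least one slot is empty;
    the probe that finds the empty slot is counted.
    """
    m = len(table)
    total = 0
    for start in range(m):
        probes = 1  # the slot we start at is always examined
        i = start
        while table[i] is not None:
            i = (i + 1) % m  # linear probing with wrap-around
            probes += 1
        total += probes
    return Fraction(total, m)


def cluster_lengths(table):
    """Lengths of maximal runs of occupied slots, treating the table as circular."""
    m = len(table)
    # Start scanning right after an empty slot so a cluster that wraps
    # around the end of the array is counted once, not split in two.
    first_empty = next(i for i, v in enumerate(table) if v is None)
    lengths = []
    run = 0
    for k in range(1, m + 1):
        if table[(first_empty + k) % m] is None:
            if run:
                lengths.append(run)
            run = 0
        else:
            run += 1
    return lengths


def miss_cost_formula(table):
    """1 + n/(2m) + (sum of squared cluster lengths)/(2m)."""
    m = len(table)
    lengths = cluster_lengths(table)
    n = sum(lengths)  # the cluster lengths add up to the number of entries
    return 1 + Fraction(n, 2 * m) + Fraction(sum(t * t for t in lengths), 2 * m)


# Hypothetical table with m = 8 and n = 5; the cluster 'y', 'x' wraps around.
example = ["x", None, "a", "b", "c", None, None, "y"]
print(cluster_lengths(example))        # [3, 2]
print(miss_cost_brute_force(example))  # 17/8
print(miss_cost_formula(example))      # 17/8
```

On this table the direct count and the formula agree, and dropping the $n/(2m)$ term would leave the formula short by $5/16$. Under this reading, the $1$ is the probe of the empty slot that ends every search miss, and $n/(2m)$ is the linear part of $t(t+1)/2$ rather than a separate correction.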
algorithms computational-complexity
asked Aug 9 at 15:13
Ken Tjhia