What enables hash function to produce uniform distribution given any distribution of input
Clash Royale CLAN TAG#URR8PPP
up vote
0
down vote
favorite
I always take the uniformity of hash output as a given and didn't think much of it. Now I am kind of curious, how does good hash function like sha guarantees output uniformity.
Intuitively, given 1:1 input cardinality to output cardinality, the same amount distribution that has high entropy (uniform) in hash output must be equal to the amount of distribution that has low entropy in hash output.
So that means approximately 50% of the possible input distribution will experience less uniformity after hashing.
This cannot be right, can someone point out where my logic is wrong? And guide me on the how good hash function output uniformity for any random distribution?
Edit:
I realize my question is a bit ambiguous, so I think I should try to clarify.
For simplification assuming sha1 is a function that map Integer 2^64 -> Integer 2^64
My question is that given a stream of n
random number where n << 2^62, the uniformity property of good hash function (eg. sha1) will redistribute these n number uniformly on a number line [0, 2^k]
such that the space between each number is roughly equal (uniformly distributed on a number line).
My first impression is that its not always possible.
If you conduct this experiment multiples times (say m
times) where each experiment is to supply n
random number and calculate how uniformly distributed the output is on the number line. You will see approximately 50% (or m/2 ) of the experiment to be less uniform and 50% of experiment to be more uniform.
I think my intuition is wrong, but I don't know where is it wrong.
untagged
migrated from security.stackexchange.com Aug 15 at 13:02
This question came from our site for information security professionals.
 |Â
show 11 more comments
up vote
0
down vote
favorite
I always take the uniformity of hash output as a given and didn't think much of it. Now I am kind of curious, how does good hash function like sha guarantees output uniformity.
Intuitively, given 1:1 input cardinality to output cardinality, the same amount distribution that has high entropy (uniform) in hash output must be equal to the amount of distribution that has low entropy in hash output.
So that means approximately 50% of the possible input distribution will experience less uniformity after hashing.
This cannot be right, can someone point out where my logic is wrong? And guide me on the how good hash function output uniformity for any random distribution?
Edit:
I realize my question is a bit ambiguous, so I think I should try to clarify.
For simplification assuming sha1 is a function that map Integer 2^64 -> Integer 2^64
My question is that given a stream of n
random number where n << 2^62, the uniformity property of good hash function (eg. sha1) will redistribute these n number uniformly on a number line [0, 2^k]
such that the space between each number is roughly equal (uniformly distributed on a number line).
My first impression is that its not always possible.
If you conduct this experiment multiples times (say m
times) where each experiment is to supply n
random number and calculate how uniformly distributed the output is on the number line. You will see approximately 50% (or m/2 ) of the experiment to be less uniform and 50% of experiment to be more uniform.
I think my intuition is wrong, but I don't know where is it wrong.
untagged
migrated from security.stackexchange.com Aug 15 at 13:02
This question came from our site for information security professionals.
Are you essentially asking how a hash function provides confusion and diffusion?
â forest
Aug 15 at 0:18
@forest Are you writing up an answer? I'll take a stab at it, but yours will probably be better.
â Mike Ounsworth
Aug 15 at 0:22
@MikeOunsworth I'm thinking of writing one up, but I'm trying to fully understand what OP is asking first. I don't want to write an answer explaining how the avalanche effect works just to find out that it doesn't answer the question. In particular, OP's comment on the now-deleted answer confused me.
â forest
Aug 15 at 0:23
lol yup, I have wikipedia/avalanche_effect open, and was about to brush up on the SHA3 spec. I'll leave this one to you!
â Mike Ounsworth
Aug 15 at 0:24
OP, can you rephrase your question? I'm not sure if you're asking about permutation cycle length (i.e. chaining a single hash function a massive number of times), or how the uniformity in the digest is achieved. I can explain the properties of a hash (and how it differs from a random oracle), and perhaps that will allow you to find the answer? Otherwise this may be more of a mathematical question.
â forest
Aug 15 at 0:26
 |Â
show 11 more comments
up vote
0
down vote
favorite
up vote
0
down vote
favorite
I always take the uniformity of hash output as a given and didn't think much of it. Now I am kind of curious, how does good hash function like sha guarantees output uniformity.
Intuitively, given 1:1 input cardinality to output cardinality, the same amount distribution that has high entropy (uniform) in hash output must be equal to the amount of distribution that has low entropy in hash output.
So that means approximately 50% of the possible input distribution will experience less uniformity after hashing.
This cannot be right, can someone point out where my logic is wrong? And guide me on the how good hash function output uniformity for any random distribution?
Edit:
I realize my question is a bit ambiguous, so I think I should try to clarify.
For simplification assuming sha1 is a function that map Integer 2^64 -> Integer 2^64
My question is that given a stream of n
random number where n << 2^62, the uniformity property of good hash function (eg. sha1) will redistribute these n number uniformly on a number line [0, 2^k]
such that the space between each number is roughly equal (uniformly distributed on a number line).
My first impression is that its not always possible.
If you conduct this experiment multiples times (say m
times) where each experiment is to supply n
random number and calculate how uniformly distributed the output is on the number line. You will see approximately 50% (or m/2 ) of the experiment to be less uniform and 50% of experiment to be more uniform.
I think my intuition is wrong, but I don't know where is it wrong.
untagged
I always take the uniformity of hash output as a given and didn't think much of it. Now I am kind of curious, how does good hash function like sha guarantees output uniformity.
Intuitively, given 1:1 input cardinality to output cardinality, the same amount distribution that has high entropy (uniform) in hash output must be equal to the amount of distribution that has low entropy in hash output.
So that means approximately 50% of the possible input distribution will experience less uniformity after hashing.
This cannot be right, can someone point out where my logic is wrong? And guide me on the how good hash function output uniformity for any random distribution?
Edit:
I realize my question is a bit ambiguous, so I think I should try to clarify.
For simplification assuming sha1 is a function that map Integer 2^64 -> Integer 2^64
My question is that given a stream of n
random number where n << 2^62, the uniformity property of good hash function (eg. sha1) will redistribute these n number uniformly on a number line [0, 2^k]
such that the space between each number is roughly equal (uniformly distributed on a number line).
My first impression is that its not always possible.
If you conduct this experiment multiples times (say m
times) where each experiment is to supply n
random number and calculate how uniformly distributed the output is on the number line. You will see approximately 50% (or m/2 ) of the experiment to be less uniform and 50% of experiment to be more uniform.
I think my intuition is wrong, but I don't know where is it wrong.
untagged
asked Aug 14 at 22:48
python
1012
1012
migrated from security.stackexchange.com Aug 15 at 13:02
This question came from our site for information security professionals.
migrated from security.stackexchange.com Aug 15 at 13:02
This question came from our site for information security professionals.
Are you essentially asking how a hash function provides confusion and diffusion?
â forest
Aug 15 at 0:18
@forest Are you writing up an answer? I'll take a stab at it, but yours will probably be better.
â Mike Ounsworth
Aug 15 at 0:22
@MikeOunsworth I'm thinking of writing one up, but I'm trying to fully understand what OP is asking first. I don't want to write an answer explaining how the avalanche effect works just to find out that it doesn't answer the question. In particular, OP's comment on the now-deleted answer confused me.
â forest
Aug 15 at 0:23
lol yup, I have wikipedia/avalanche_effect open, and was about to brush up on the SHA3 spec. I'll leave this one to you!
â Mike Ounsworth
Aug 15 at 0:24
OP, can you rephrase your question? I'm not sure if you're asking about permutation cycle length (i.e. chaining a single hash function a massive number of times), or how the uniformity in the digest is achieved. I can explain the properties of a hash (and how it differs from a random oracle), and perhaps that will allow you to find the answer? Otherwise this may be more of a mathematical question.
â forest
Aug 15 at 0:26
 |Â
show 11 more comments
Are you essentially asking how a hash function provides confusion and diffusion?
â forest
Aug 15 at 0:18
@forest Are you writing up an answer? I'll take a stab at it, but yours will probably be better.
â Mike Ounsworth
Aug 15 at 0:22
@MikeOunsworth I'm thinking of writing one up, but I'm trying to fully understand what OP is asking first. I don't want to write an answer explaining how the avalanche effect works just to find out that it doesn't answer the question. In particular, OP's comment on the now-deleted answer confused me.
â forest
Aug 15 at 0:23
lol yup, I have wikipedia/avalanche_effect open, and was about to brush up on the SHA3 spec. I'll leave this one to you!
â Mike Ounsworth
Aug 15 at 0:24
OP, can you rephrase your question? I'm not sure if you're asking about permutation cycle length (i.e. chaining a single hash function a massive number of times), or how the uniformity in the digest is achieved. I can explain the properties of a hash (and how it differs from a random oracle), and perhaps that will allow you to find the answer? Otherwise this may be more of a mathematical question.
â forest
Aug 15 at 0:26
Are you essentially asking how a hash function provides confusion and diffusion?
â forest
Aug 15 at 0:18
Are you essentially asking how a hash function provides confusion and diffusion?
â forest
Aug 15 at 0:18
@forest Are you writing up an answer? I'll take a stab at it, but yours will probably be better.
â Mike Ounsworth
Aug 15 at 0:22
@forest Are you writing up an answer? I'll take a stab at it, but yours will probably be better.
â Mike Ounsworth
Aug 15 at 0:22
@MikeOunsworth I'm thinking of writing one up, but I'm trying to fully understand what OP is asking first. I don't want to write an answer explaining how the avalanche effect works just to find out that it doesn't answer the question. In particular, OP's comment on the now-deleted answer confused me.
â forest
Aug 15 at 0:23
@MikeOunsworth I'm thinking of writing one up, but I'm trying to fully understand what OP is asking first. I don't want to write an answer explaining how the avalanche effect works just to find out that it doesn't answer the question. In particular, OP's comment on the now-deleted answer confused me.
â forest
Aug 15 at 0:23
lol yup, I have wikipedia/avalanche_effect open, and was about to brush up on the SHA3 spec. I'll leave this one to you!
â Mike Ounsworth
Aug 15 at 0:24
lol yup, I have wikipedia/avalanche_effect open, and was about to brush up on the SHA3 spec. I'll leave this one to you!
â Mike Ounsworth
Aug 15 at 0:24
OP, can you rephrase your question? I'm not sure if you're asking about permutation cycle length (i.e. chaining a single hash function a massive number of times), or how the uniformity in the digest is achieved. I can explain the properties of a hash (and how it differs from a random oracle), and perhaps that will allow you to find the answer? Otherwise this may be more of a mathematical question.
â forest
Aug 15 at 0:26
OP, can you rephrase your question? I'm not sure if you're asking about permutation cycle length (i.e. chaining a single hash function a massive number of times), or how the uniformity in the digest is achieved. I can explain the properties of a hash (and how it differs from a random oracle), and perhaps that will allow you to find the answer? Otherwise this may be more of a mathematical question.
â forest
Aug 15 at 0:26
 |Â
show 11 more comments
active
oldest
votes
active
oldest
votes
active
oldest
votes
active
oldest
votes
active
oldest
votes
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f2883569%2fwhat-enables-hash-function-to-produce-uniform-distribution-given-any-distributio%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Are you essentially asking how a hash function provides confusion and diffusion?
â forest
Aug 15 at 0:18
@forest Are you writing up an answer? I'll take a stab at it, but yours will probably be better.
â Mike Ounsworth
Aug 15 at 0:22
@MikeOunsworth I'm thinking of writing one up, but I'm trying to fully understand what OP is asking first. I don't want to write an answer explaining how the avalanche effect works just to find out that it doesn't answer the question. In particular, OP's comment on the now-deleted answer confused me.
â forest
Aug 15 at 0:23
lol yup, I have wikipedia/avalanche_effect open, and was about to brush up on the SHA3 spec. I'll leave this one to you!
â Mike Ounsworth
Aug 15 at 0:24
OP, can you rephrase your question? I'm not sure if you're asking about permutation cycle length (i.e. chaining a single hash function a massive number of times), or how the uniformity in the digest is achieved. I can explain the properties of a hash (and how it differs from a random oracle), and perhaps that will allow you to find the answer? Otherwise this may be more of a mathematical question.
â forest
Aug 15 at 0:26