What enables hash function to produce uniform distribution given any distribution of input

up vote
0
down vote

favorite

I always take the uniformity of hash output as a given and didn't think much of it. Now I am kind of curious, how does good hash function like sha guarantees output uniformity.

Intuitively, given 1:1 input cardinality to output cardinality, the same amount distribution that has high entropy (uniform) in hash output must be equal to the amount of distribution that has low entropy in hash output.

So that means approximately 50% of the possible input distribution will experience less uniformity after hashing.

This cannot be right, can someone point out where my logic is wrong? And guide me on the how good hash function output uniformity for any random distribution?

Edit:

I realize my question is a bit ambiguous, so I think I should try to clarify.

For simplification assuming sha1 is a function that map Integer 2^64 -> Integer 2^64

My question is that given a stream of n random number where n << 2^62, the uniformity property of good hash function (eg. sha1) will redistribute these n number uniformly on a number line [0, 2^k] such that the space between each number is roughly equal (uniformly distributed on a number line).

My first impression is that its not always possible.

If you conduct this experiment multiples times (say m times) where each experiment is to supply n random number and calculate how uniformly distributed the output is on the number line. You will see approximately 50% (or m/2 ) of the experiment to be less uniform and 50% of experiment to be more uniform.

I think my intuition is wrong, but I don't know where is it wrong.

asked Aug 14 at 22:48

python

1012

migrated from security.stackexchange.com Aug 15 at 13:02

This question came from our site for information security professionals.

Are you essentially asking how a hash function provides confusion and diffusion?
â€“Â forest
Aug 15 at 0:18

@forest Are you writing up an answer? I'll take a stab at it, but yours will probably be better.
â€“Â Mike Ounsworth
Aug 15 at 0:22

@MikeOunsworth I'm thinking of writing one up, but I'm trying to fully understand what OP is asking first. I don't want to write an answer explaining how the avalanche effect works just to find out that it doesn't answer the question. In particular, OP's comment on the now-deleted answer confused me.
â€“Â forest
Aug 15 at 0:23

lol yup, I have wikipedia/avalanche_effect open, and was about to brush up on the SHA3 spec. I'll leave this one to you!
â€“Â Mike Ounsworth
Aug 15 at 0:24

OP, can you rephrase your question? I'm not sure if you're asking about permutation cycle length (i.e. chaining a single hash function a massive number of times), or how the uniformity in the digest is achieved. I can explain the properties of a hash (and how it differs from a random oracle), and perhaps that will allow you to find the answer? Otherwise this may be more of a mathematical question.
â€“Â forest
Aug 15 at 0:26

Â |Â
show 11 more comments

up vote
0
down vote

favorite

I always take the uniformity of hash output as a given and didn't think much of it. Now I am kind of curious, how does good hash function like sha guarantees output uniformity.

So that means approximately 50% of the possible input distribution will experience less uniformity after hashing.

This cannot be right, can someone point out where my logic is wrong? And guide me on the how good hash function output uniformity for any random distribution?

Edit:

I realize my question is a bit ambiguous, so I think I should try to clarify.

For simplification assuming sha1 is a function that map Integer 2^64 -> Integer 2^64

My first impression is that its not always possible.

I think my intuition is wrong, but I don't know where is it wrong.

asked Aug 14 at 22:48

python

1012

migrated from security.stackexchange.com Aug 15 at 13:02

This question came from our site for information security professionals.

Are you essentially asking how a hash function provides confusion and diffusion?
â€“Â forest
Aug 15 at 0:18

@forest Are you writing up an answer? I'll take a stab at it, but yours will probably be better.
â€“Â Mike Ounsworth
Aug 15 at 0:22

@MikeOunsworth I'm thinking of writing one up, but I'm trying to fully understand what OP is asking first. I don't want to write an answer explaining how the avalanche effect works just to find out that it doesn't answer the question. In particular, OP's comment on the now-deleted answer confused me.
â€“Â forest
Aug 15 at 0:23

lol yup, I have wikipedia/avalanche_effect open, and was about to brush up on the SHA3 spec. I'll leave this one to you!
â€“Â Mike Ounsworth
Aug 15 at 0:24

OP, can you rephrase your question? I'm not sure if you're asking about permutation cycle length (i.e. chaining a single hash function a massive number of times), or how the uniformity in the digest is achieved. I can explain the properties of a hash (and how it differs from a random oracle), and perhaps that will allow you to find the answer? Otherwise this may be more of a mathematical question.
â€“Â forest
Aug 15 at 0:26

Â |Â
show 11 more comments

up vote
0
down vote

favorite

I always take the uniformity of hash output as a given and didn't think much of it. Now I am kind of curious, how does good hash function like sha guarantees output uniformity.

So that means approximately 50% of the possible input distribution will experience less uniformity after hashing.

This cannot be right, can someone point out where my logic is wrong? And guide me on the how good hash function output uniformity for any random distribution?

Edit:

I realize my question is a bit ambiguous, so I think I should try to clarify.

For simplification assuming sha1 is a function that map Integer 2^64 -> Integer 2^64

My first impression is that its not always possible.

I think my intuition is wrong, but I don't know where is it wrong.

asked Aug 14 at 22:48

python

1012

I always take the uniformity of hash output as a given and didn't think much of it. Now I am kind of curious, how does good hash function like sha guarantees output uniformity.

So that means approximately 50% of the possible input distribution will experience less uniformity after hashing.

This cannot be right, can someone point out where my logic is wrong? And guide me on the how good hash function output uniformity for any random distribution?

Edit:

I realize my question is a bit ambiguous, so I think I should try to clarify.

For simplification assuming sha1 is a function that map Integer 2^64 -> Integer 2^64

My first impression is that its not always possible.

I think my intuition is wrong, but I don't know where is it wrong.

asked Aug 14 at 22:48

python

1012

asked Aug 14 at 22:48

python

1012

asked Aug 14 at 22:48

python

1012

asked Aug 14 at 22:48

python

1012

migrated from security.stackexchange.com Aug 15 at 13:02

This question came from our site for information security professionals.

migrated from security.stackexchange.com Aug 15 at 13:02

This question came from our site for information security professionals.

Are you essentially asking how a hash function provides confusion and diffusion?
â€“Â forest
Aug 15 at 0:18

@forest Are you writing up an answer? I'll take a stab at it, but yours will probably be better.
â€“Â Mike Ounsworth
Aug 15 at 0:22

@MikeOunsworth I'm thinking of writing one up, but I'm trying to fully understand what OP is asking first. I don't want to write an answer explaining how the avalanche effect works just to find out that it doesn't answer the question. In particular, OP's comment on the now-deleted answer confused me.
â€“Â forest
Aug 15 at 0:23

lol yup, I have wikipedia/avalanche_effect open, and was about to brush up on the SHA3 spec. I'll leave this one to you!
â€“Â Mike Ounsworth
Aug 15 at 0:24

OP, can you rephrase your question? I'm not sure if you're asking about permutation cycle length (i.e. chaining a single hash function a massive number of times), or how the uniformity in the digest is achieved. I can explain the properties of a hash (and how it differs from a random oracle), and perhaps that will allow you to find the answer? Otherwise this may be more of a mathematical question.
â€“Â forest
Aug 15 at 0:26

Â |Â
show 11 more comments

Are you essentially asking how a hash function provides confusion and diffusion?
â€“Â forest
Aug 15 at 0:18

@forest Are you writing up an answer? I'll take a stab at it, but yours will probably be better.
â€“Â Mike Ounsworth
Aug 15 at 0:22

@MikeOunsworth I'm thinking of writing one up, but I'm trying to fully understand what OP is asking first. I don't want to write an answer explaining how the avalanche effect works just to find out that it doesn't answer the question. In particular, OP's comment on the now-deleted answer confused me.
â€“Â forest
Aug 15 at 0:23

lol yup, I have wikipedia/avalanche_effect open, and was about to brush up on the SHA3 spec. I'll leave this one to you!
â€“Â Mike Ounsworth
Aug 15 at 0:24

OP, can you rephrase your question? I'm not sure if you're asking about permutation cycle length (i.e. chaining a single hash function a massive number of times), or how the uniformity in the digest is achieved. I can explain the properties of a hash (and how it differs from a random oracle), and perhaps that will allow you to find the answer? Otherwise this may be more of a mathematical question.
â€“Â forest
Aug 15 at 0:26

Are you essentially asking how a hash function provides confusion and diffusion?
â€“Â forest
Aug 15 at 0:18

@forest Are you writing up an answer? I'll take a stab at it, but yours will probably be better.
â€“Â Mike Ounsworth
Aug 15 at 0:22

@MikeOunsworth I'm thinking of writing one up, but I'm trying to fully understand what OP is asking first. I don't want to write an answer explaining how the avalanche effect works just to find out that it doesn't answer the question. In particular, OP's comment on the now-deleted answer confused me.
â€“Â forest
Aug 15 at 0:23

lol yup, I have wikipedia/avalanche_effect open, and was about to brush up on the SHA3 spec. I'll leave this one to you!
â€“Â Mike Ounsworth
Aug 15 at 0:24

OP, can you rephrase your question? I'm not sure if you're asking about permutation cycle length (i.e. chaining a single hash function a massive number of times), or how the uniformity in the digest is achieved. I can explain the properties of a hash (and how it differs from a random oracle), and perhaps that will allow you to find the answer? Otherwise this may be more of a mathematical question.
â€“Â forest
Aug 15 at 0:26

Â |Â
show 11 more comments

active

oldest

votes

Your Answer

StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\$","\$"]]);
);
);
, "mathjax-editing");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "69"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
noCode: true, onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f2883569%2fwhat-enables-hash-function-to-produce-uniform-distribution-given-any-distributio%23new-answer', 'question_page');

);

Post as a guest

Name

active

oldest

votes

draft saved

draft discarded

draft saved

draft discarded

Post as a guest

Name

搜尋此網誌

Vtyjkyuk