Bayes theorem confusion with likelihood [duplicate]
This question already has an answer here:
What is the difference between “likelihood” and “probability”?
9 answers
I learned that Bayes' theorem is defined as follows:
$$p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)}$$
But then today I came across a definition in terms of the likelihood:
$$p(\theta \mid y) = \frac{L(\theta \mid y)\, p(\theta)}{p(y)} = \frac{L(\theta \mid y)\, p(\theta)}{\int L(\theta \mid y)\, p(\theta)\, d\theta}$$
What is the link between the two?
probability bayesian conditional-probability likelihood
edited Aug 27 at 7:49
The Laconic
asked Aug 26 at 20:30
user1607
marked as duplicate by Xi'an
Aug 27 at 19:55
This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.
The likelihood is defined as $L(\theta|y)=p(y|\theta)$.
– Robin Ryder
Aug 26 at 21:18
stats.stackexchange.com/questions/2641/…
– Alex
Aug 27 at 4:58
2 Answers
$L(\theta|y) = p(y|\theta)$. I assume that $y$ is the observation here and that we are inferring the value of the parameter $\theta$. Thus, $p(y|\theta)$ can be viewed as a function $L$ of the (unknown) parameter $\theta$.
For the denominator, $p(y) = \int p(y,\theta)\,d\theta = \int p(y|\theta)\,p(\theta)\,d\theta = \int L(\theta|y)\,p(\theta)\,d\theta$.
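As a numerical check of the identity for the denominator, here is a minimal Python sketch. It is an illustration under stated assumptions, not part of the original answer: the binomial model, the data (y = 7 successes in n = 10 trials), the uniform prior, and the helper names `likelihood`, `prior`, `trapezoid` and `grid_posterior` are all invented for the example. It approximates $p(y)$ as the integral of likelihood times prior on a grid over $[0, 1]$, then divides by that value to get the posterior.

```python
import math


def likelihood(theta, y, n):
    """L(theta | y) = p(y | theta) for y successes in n Bernoulli trials."""
    return math.comb(n, y) * theta**y * (1.0 - theta) ** (n - y)


def prior(theta):
    """Uniform prior density on [0, 1] (an assumption made for this sketch)."""
    return 1.0


def trapezoid(values, step):
    """Trapezoidal rule for values sampled on an evenly spaced grid."""
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))


def grid_posterior(y, n, num_points=2001):
    """Return the grid, the posterior density on it, and the estimate of p(y)."""
    step = 1.0 / (num_points - 1)
    grid = [i * step for i in range(num_points)]
    unnormalised = [likelihood(t, y, n) * prior(t) for t in grid]
    # Denominator of Bayes' theorem: integral of L(theta | y) * p(theta).
    evidence = trapezoid(unnormalised, step)
    posterior = [u / evidence for u in unnormalised]
    return grid, posterior, evidence


grid, posterior, evidence = grid_posterior(y=7, n=10)
print(f"estimated p(y): {evidence:.6f}")
print(f"integral of posterior: {trapezoid(posterior, grid[1] - grid[0]):.6f}")
```

Under these assumptions the exact value is $p(y) = 1/(n+1) = 1/11$, so the printed estimate should be close to that, and the posterior should integrate to 1 up to floating-point error.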
answered Aug 26 at 21:41
Yi Yang
The second formula is wrong: the outside parts are equal to each other, but the middle part is merely proportional to (and not necessarily equal to) the outside parts. The likelihood is defined by $L(\theta \mid y) = p(y \mid \theta) / k(y) \propto p(y \mid \theta)$, where $k$ is some constant of proportionality that does not depend on $\theta$. This means you have:
$$p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)} = \frac{k(y)\, L(\theta \mid y)\, p(\theta)}{p(y)} \propto \frac{L(\theta \mid y)\, p(\theta)}{p(y)}.$$
Using the law of total probability you also have $p(y) = \int p(y \mid \theta)\, p(\theta)\, d\theta$, which gives:
$$p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)} = \frac{k(y)\, L(\theta \mid y)\, p(\theta)}{k(y) \int L(\theta \mid y)\, p(\theta)\, d\theta} = \frac{L(\theta \mid y)\, p(\theta)}{\int L(\theta \mid y)\, p(\theta)\, d\theta}.$$
In the special case where $k(y) = 1$ you have $L(\theta \mid y) = p(y \mid \theta)$, and so in this case you get the second equation you specified. However, it is common when using likelihood functions to use a constant of proportionality that effectively removes multiplicative terms that do not depend on $\theta$.
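To see the cancellation concretely, here is a small Python sketch. It is an illustration under assumptions, not the answerer's own code: the binomial example with a uniform prior, the data (y = 7, n = 10) and all function names are invented here. `sampling_density` is the full $p(y \mid \theta)$, while `kernel_likelihood` drops the binomial coefficient, a multiplicative factor that depends on $y$ but not on $\theta$.

```python
import math


def sampling_density(theta, y, n):
    """Full p(y | theta): binomial probability of y successes in n trials."""
    return math.comb(n, y) * theta**y * (1.0 - theta) ** (n - y)


def kernel_likelihood(theta, y, n):
    """Likelihood with the theta-free binomial coefficient dropped."""
    return theta**y * (1.0 - theta) ** (n - y)


def trapezoid(values, step):
    """Trapezoidal rule for values sampled on an evenly spaced grid."""
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))


def normalise(func, y, n, num_points=2001):
    """Divide func(theta) * prior(theta) by its integral over [0, 1]."""
    step = 1.0 / (num_points - 1)
    grid = [i * step for i in range(num_points)]
    # Uniform prior assumed, so p(theta) = 1 and is omitted from the product.
    values = [func(t, y, n) for t in grid]
    integral = trapezoid(values, step)
    return [v / integral for v in values], integral


y, n = 7, 10
post_full, p_y = normalise(sampling_density, y, n)
post_kernel, kernel_integral = normalise(kernel_likelihood, y, n)

# Normalising by the integral of L * prior gives the same posterior either way.
max_gap = max(abs(a - b) for a, b in zip(post_full, post_kernel))
print(f"largest gap between the two posteriors: {max_gap:.2e}")

# Dividing the kernel version by the true p(y) does not give a density.
print(f"integral of L * prior / p(y): {kernel_integral / p_y:.6f}")
```

Normalising by the integral of likelihood times prior should give the same posterior from either version. Dividing the kernel version by the true $p(y)$ instead gives a function that integrates to 1/120 in this example (the reciprocal of the dropped binomial coefficient) rather than to 1, which is the sense in which the middle expression is only proportional to the posterior.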
edited Aug 27 at 0:50
answered Aug 26 at 22:41
Ben
It is wrong if they define it as unnormalized, but I've seen it defined as $L(\theta | y) = p(y | \theta)$ in some texts...
– Tim♦
Aug 27 at 9:18