How to read out the loss function in YOLO algorithm?

How do I read out the loss function used in YOLO?
I somehow need it for a class that I'm attending.

EDIT

Got an answer on Reddit!

edited Sep 10 at 1:02

asked Sep 7 at 9:26

Maning

112

What do you mean by "Read out"? Do you just mean how to understand it?
– user3658307
Sep 8 at 2:44

@user3658307 Sorry about that, what I was trying to say was how do I read it out loud.
– Maning
Sep 9 at 13:06

optimization machine-learning

1 Answer

It's a bit of an unexpected question, but I guess I would read it out by describing one term at a time. (Hopefully you meant a high-level description, not literally a phonetic sequence.) I'd say something like this when "reading it out":

Overall, we want to perform simultaneous object detection and classification. The indicator functions $\mathbb{1}_{ij}^{\text{obj}}$ denote when the $j$th box predictor in cell $i$ is responsible for the prediction (in the original YOLO paper, this is the predictor whose box has the highest IoU with the ground truth). Similarly, the indicator $\mathbb{1}_{i}^{\text{obj}}$ denotes whether there is an object in cell $i$.
Hatted quantities (e.g. $\widehat{x}$, $\widehat{C}$, $\widehat{p}_i$) are predictions of their unhatted counterparts.
The sums over $i$ run over the grid cells of the image, while the sums over $j$ run over the bounding box predictors in each cell.
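For reference, this is the loss as I recall it from the original YOLO (v1) paper, so do check it against the version used in your class. The symbols $S^2$ (number of grid cells), $B$ (box predictors per cell), and the weights $\lambda_{\text{coord}}$ and $\lambda_{\text{noobj}}$ are not defined in the description above; they are the paper's notation as I remember it, including the summation limits.

$$
\begin{aligned}
&\lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}} \left[ (x_i - \widehat{x}_i)^2 + (y_i - \widehat{y}_i)^2 \right] \\
&+ \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}} \left[ \left(\sqrt{w_i} - \sqrt{\widehat{w}_i}\right)^2 + \left(\sqrt{h_i} - \sqrt{\widehat{h}_i}\right)^2 \right] \\
&+ \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}} \left( C_i - \widehat{C}_i \right)^2 \\
&+ \lambda_{\text{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{noobj}} \left( C_i - \widehat{C}_i \right)^2 \\
&+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{\text{obj}} \sum_{c \,\in\, \text{classes}} \left( p_i(c) - \widehat{p}_i(c) \right)^2
\end{aligned}
$$

Each line of the display is one of the five terms, in the order they are described here.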

The first term checks that the predicted object box centers are close to the real ones, based on the squared distance between the centers.

The second term checks that the sizes (width $w$ and height $h$) of the predicted and true boxes are close to each other, to maximize overlap between them.

The third and fourth terms measure the existence confidence (or objectness), i.e. $C_i$ gives the probability of an object being in cell $i$ at all, so the loss wants the confidence of our learner to match whether or not an object is actually present.

The fifth term is the classification loss, so that the network correctly categorizes each object if an object exists there.
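If it helps to see the five terms side by side, here is a minimal NumPy sketch of how they could be computed for a single image. It is an illustration under my own assumptions, not code from the YOLO authors: the function name `yolo_v1_loss`, the argument names, and the array layout are hypothetical, and the default weights 5 and 0.5 are the values I recall from the paper. The square roots on width and height are how I remember the paper's second term; they require non-negative sizes.

```python
import numpy as np


def yolo_v1_loss(pred_box, true_box, pred_conf, true_conf,
                 pred_cls, true_cls, resp, obj,
                 lambda_coord=5.0, lambda_noobj=0.5):
    """Sum-of-squares YOLO-style loss for one image (illustrative sketch).

    With N grid cells, B box predictors per cell and K classes:
      pred_box, true_box   : (N, B, 4) arrays holding (x, y, w, h), w and h >= 0
      pred_conf, true_conf : (N, B) confidence values
      pred_cls, true_cls   : (N, K) class probabilities
      resp                 : (N, B) 0/1 indicator, predictor j in cell i is responsible
      obj                  : (N,)   0/1 indicator, an object is present in cell i
    """
    noobj = 1.0 - resp  # indicator for predictors that are not responsible

    # Term 1: squared distance between predicted and true box centers.
    center_sq = np.sum((true_box[..., :2] - pred_box[..., :2]) ** 2, axis=-1)
    center = np.sum(resp * center_sq)

    # Term 2: squared difference of square-rooted widths and heights.
    size_sq = np.sum(
        (np.sqrt(true_box[..., 2:]) - np.sqrt(pred_box[..., 2:])) ** 2, axis=-1
    )
    size = np.sum(resp * size_sq)

    # Terms 3 and 4: confidence error with and without an object.
    conf_sq = (true_conf - pred_conf) ** 2
    conf_obj = np.sum(resp * conf_sq)
    conf_noobj = np.sum(noobj * conf_sq)

    # Term 5: classification error, only for cells that contain an object.
    cls_sq = np.sum((true_cls - pred_cls) ** 2, axis=-1)
    cls = np.sum(obj * cls_sq)

    terms = {
        "center": center,
        "size": size,
        "conf_obj": conf_obj,
        "conf_noobj": conf_noobj,
        "cls": cls,
    }
    total = (lambda_coord * (center + size)
             + conf_obj + lambda_noobj * conf_noobj + cls)
    return total, terms
```

Returning the individual terms alongside the total makes it easy to check which part of the loss a given prediction error lands in, which is also a natural way to "read out" the formula one term at a time.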

It might be helpful to look at other YOLO questions: [1], [2], [3], [4], [5], [6].

answered Sep 9 at 17:55

user3658307

4,3143945
