How to read out the loss function in YOLO algorithm?

How do I read out the loss function used in YOLO?
I somehow need it for a class that I'm attending.



EDIT



Got an answer in Reddit!










  • What do you mean by "Read out"? Do you just mean how to understand it?
    – user3658307
    Sep 8 at 2:44










  • @user3658307 Sorry about that, what I was trying to say was how do I read it out loud.
    – Maning
    Sep 9 at 13:06














optimization machine-learning






edited Sep 10 at 1:02

























asked Sep 7 at 9:26









Maning

1 Answer
It's a bit of an unexpected question, but I guess I would read it out by describing one term at a time. (Hopefully you meant a high-level description, not literally a phonetic sequence.) I'd say something like this when "reading it out":



  • Overall, we want to perform simultaneous object detection and classification. The indicator function $\mathbb{1}_{ij}^{\text{obj}}$ denotes whether the $j$th box predictor in cell $i$ is the one responsible for the object there (in the YOLO paper, this is the predictor whose box has the highest IOU with the ground truth). Similarly, the indicator $\mathbb{1}_{i}^{\text{obj}}$ denotes whether there is an object in cell $i$.
    Hatted quantities (e.g. $\widehat{x}$, $\widehat{C}$, $\widehat{p}_i$) are predictions of their unhatted counterparts.
    The sums over $i$ are over the gridded cells of the image, while the sums over $j$ iterate over the bounding box predictors (per cell).


  • The first term checks that the predicted object box centers are close to the real ones, based on the squared distance between the centers.


  • The second term checks that the sizes (width $w$ and height $h$) of the predicted and true boxes are close to each other, to maximize overlap between them.


  • The third and fourth terms measure the existence confidence (or objectness), i.e. $C_i$ gives the probability of an object being in cell $i$ at all, so the loss wants the confidence of our learner to match whether or not an object is actually present.


  • The fifth term is the classification loss, so that the network correctly categorizes each object if an object exists there.
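
For reference while reading the bullets above, here is the loss written out as I recall it from the original YOLO (v1) paper; treat it as a sketch and check it against the version you are presenting. The symbols $S$ (grid size), $B$ (boxes per cell), $\lambda_{\text{coord}}$ and $\lambda_{\text{noobj}}$ (term weights, 5 and 0.5 in the paper) and $\mathbb{1}_{ij}^{\text{noobj}}$ (the complement of $\mathbb{1}_{ij}^{\text{obj}}$) are not named in the description above. Note that the size term compares square roots of widths and heights, so that the same absolute error counts for more on a small box than on a large one.

$$
\begin{aligned}
\mathcal{L} ={}& \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}} \left[ (x_i - \widehat{x}_i)^2 + (y_i - \widehat{y}_i)^2 \right] \\
&+ \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}} \left[ \left(\sqrt{w_i} - \sqrt{\widehat{w}_i}\right)^2 + \left(\sqrt{h_i} - \sqrt{\widehat{h}_i}\right)^2 \right] \\
&+ \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}} \left( C_i - \widehat{C}_i \right)^2 \\
&+ \lambda_{\text{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{noobj}} \left( C_i - \widehat{C}_i \right)^2 \\
&+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{\text{obj}} \sum_{c \,\in\, \text{classes}} \left( p_i(c) - \widehat{p}_i(c) \right)^2
\end{aligned}
$$

Read aloud, each line is "a weight, times a sum over cells and boxes, of an indicator times a squared error", which is what the five bullets describe in order (with the third and fourth lines together forming the objectness bullet).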


It might be helpful to look at other YOLO questions:
[1],
[2],
[3],
[4],
[5],
[6].
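
If it helps to see the term-by-term reading as arithmetic, here is a minimal NumPy sketch that computes the five terms separately. Everything in it is an assumption made for illustration, not the reference YOLO implementation: the function name `yolo_v1_loss`, the array layout, the default weights (5 and 0.5, the values I recall from the paper), and the choice to derive the per-cell indicator from the per-box one. The target confidences are simply passed in, and widths and heights are assumed non-negative so the square roots are defined.

```python
import numpy as np


def yolo_v1_loss(pred_box, true_box, pred_conf, true_conf,
                 pred_cls, true_cls, resp,
                 lambda_coord=5.0, lambda_noobj=0.5):
    """Sum-of-squares loss split into the five terms described above.

    Shapes (N = number of grid cells, B = boxes per cell, K = classes):
      pred_box, true_box   : (N, B, 4) holding (x, y, w, h), with w, h >= 0
      pred_conf, true_conf : (N, B)
      pred_cls, true_cls   : (N, K)
      resp                 : (N, B) 0/1 indicator, 1 if box j in cell i is responsible
    """
    resp = resp.astype(float)
    noobj = 1.0 - resp            # complement of the per-box indicator
    obj_cell = resp.max(axis=1)   # assumption: a cell has an object iff some box is responsible

    # Term 1: squared distance between box centers.
    d = pred_box - true_box
    center = lambda_coord * np.sum(resp * (d[..., 0] ** 2 + d[..., 1] ** 2))

    # Term 2: squared difference of square-rooted widths and heights.
    ds = np.sqrt(pred_box[..., 2:]) - np.sqrt(true_box[..., 2:])
    size = lambda_coord * np.sum(resp * np.sum(ds ** 2, axis=-1))

    # Terms 3 and 4: confidence error for responsible and non-responsible boxes.
    conf_err = (pred_conf - true_conf) ** 2
    conf_obj = np.sum(resp * conf_err)
    conf_noobj = lambda_noobj * np.sum(noobj * conf_err)

    # Term 5: classification error, counted only in cells that contain an object.
    cls = np.sum(obj_cell[:, None] * (pred_cls - true_cls) ** 2)

    terms = {"center": center, "size": size, "conf_obj": conf_obj,
             "conf_noobj": conf_noobj, "class": cls}
    terms["total"] = sum(terms.values())
    return terms


# Toy example: one cell, two box predictors, two classes (made-up numbers).
true_box = np.array([[[0.5, 0.5, 0.25, 0.25], [0.5, 0.5, 0.25, 0.25]]])
pred_box = np.array([[[0.6, 0.5, 0.25, 0.25], [0.0, 0.0, 0.0, 0.0]]])
resp = np.array([[1, 0]])              # the first box is responsible
true_conf = np.array([[1.0, 0.0]])
pred_conf = np.array([[0.5, 0.5]])
true_cls = np.array([[1.0, 0.0]])
pred_cls = np.array([[1.0, 0.0]])

terms = yolo_v1_loss(pred_box, true_box, pred_conf, true_conf,
                     pred_cls, true_cls, resp)
print(terms)
```

In the toy example only the center and the two confidence terms are nonzero: the second box is badly wrong, but the indicator masks it out of the coordinate terms, so it contributes only through the down-weighted "no object" confidence term.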






        answered Sep 9 at 17:55









        user3658307
