Does the $\sum_k$ remain after differentiating $\log\left(\frac{\sum_k W_{ik}H_{kj}}{\text{const.}}\right)$ wrt $H_{kj}$? And why?

I would like to differentiate this equation, where $W_{ik}$ and $H_{kj}$ are matrix entries:

$0 = \sum_{ij}\log\left(\frac{\sum_k W_{ik}H_{kj}}{\text{const.}}\right)$ wrt $H_{kj}$

After applying the chain rule to the log:

$0 = \sum_{ij}\left(\frac{\text{const.}}{\sum_k W_{ik}H_{kj}}\right)\left(\frac{\sum_k W_{ik}}{\text{const.}}\right)$

I wonder whether the $\sum_k$ in the term $\frac{\sum_k W_{ik}}{\text{const.}}$ should remain.

What confuses me is this: since $H_{kj}$ has already been differentiated away, $W_{ik}$ no longer takes part in a matrix multiplication, so the sum may not be required anymore. On the other hand, the final answer is zero, and if $W_{ik}$ is not summed, the result will be a vector (not zero). I am not sure which is right. Could someone explain the reasoning?







asked Aug 14 at 3:39 by Jan (edited Aug 14 at 3:44)




















2 Answers






Accepted answer (2 votes)






Omit the summation symbols and use straight matrix notation.

Let $\beta = \tfrac{1}{\rm const.}$, then the differential and gradient of the function can be calculated as
$$\eqalign{
\phi &= 1:\log(\beta WH) \cr
d\phi &= 1:d\log(\beta WH)
= 1:\frac{\beta W\,dH}{\beta WH}
= W^T\Big(\frac{1}{WH}\Big):dH \cr
\frac{\partial\phi}{\partial H} &= W^T\Big(\frac{1}{WH}\Big) \cr
}$$
where the dimensions of the variables are:
$W\in\mathbb R^{m\times n}$,
$H\in\mathbb R^{n\times p}$,
$1\in\mathbb R^{m\times p}$, and
$\phi\in\mathbb R$.

Further, $\log(X)$ and $\frac{1}{X}$ are taken to be element-wise operations, and a colon has been used to denote the trace/Frobenius product, i.e.
$$\eqalign{A:B = {\rm tr}(A^TB)}$$






answered Aug 14 at 5:08 by greg
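For anyone who wants to convince themselves numerically, here is a minimal sketch (not part of greg's answer) that compares the closed form $W^T\big(\frac{1}{WH}\big)$ against a central finite-difference gradient of $\phi$. The matrix sizes, the random positive entries, the value of `beta`, and the helper names `matmul`, `phi`, `grad_formula` and `grad_numeric` are all assumptions chosen for illustration. It uses only the Python standard library.

```python
import math
import random


def matmul(A, B):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def phi(W, H, beta):
    """phi = sum_ij log(beta * [WH]_ij), i.e. 1 : log(beta WH)."""
    WH = matmul(W, H)
    return sum(math.log(beta * x) for row in WH for x in row)


def grad_formula(W, H):
    """Closed form W^T (1/(WH)), with the reciprocal taken element-wise."""
    WH = matmul(W, H)
    R = [[1.0 / x for x in row] for row in WH]
    Wt = [list(col) for col in zip(*W)]  # transpose of W
    return matmul(Wt, R)


def grad_numeric(W, H, beta, eps=1e-6):
    """Central finite differences of phi with respect to each H_uv."""
    G = [[0.0] * len(H[0]) for _ in H]
    for u in range(len(H)):
        for v in range(len(H[0])):
            Hp = [row[:] for row in H]
            Hm = [row[:] for row in H]
            Hp[u][v] += eps
            Hm[u][v] -= eps
            G[u][v] = (phi(W, Hp, beta) - phi(W, Hm, beta)) / (2 * eps)
    return G


random.seed(0)
m, n, p = 3, 2, 4  # illustrative sizes: W is m x n, H is n x p
# Positive entries keep every [WH]_ij > 0 so the log is defined.
W = [[random.uniform(0.5, 2.0) for _ in range(n)] for _ in range(m)]
H = [[random.uniform(0.5, 2.0) for _ in range(p)] for _ in range(n)]
beta = 0.25  # plays the role of 1/const.

G_formula = grad_formula(W, H)
G_numeric = grad_numeric(W, H, beta)
max_err = max(abs(a - b)
              for ra, rb in zip(G_formula, G_numeric)
              for a, b in zip(ra, rb))
print("largest |formula - finite difference| =", max_err)
```

If the formula is right, `max_err` should be tiny, limited only by the finite-difference error. Note that `beta` does not appear in `grad_formula`: the constant cancels, as in the derivation.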











• May I ask how you know it should start from $\phi = 1:\log(\beta WH)$? I mean, why do you define $\phi$ as a product of $1$ and $\log(\beta WH)$? I am not sure I understand the importance of $1$.
  – Jan
  Aug 14 at 6:16

• @Jan: I think $\mathbf 1$ is the all-ones matrix, such that the Frobenius product $\mathbf 1 : \log\left(\beta \mathbf W \mathbf H \right) = \sum_{i,j} \log\left( \beta \left[WH\right]_{i,j} \right)$ ... if this makes sense to you and if greg approves my interpretation
  – user550103
  Aug 14 at 10:17

• @user550103 Your explanation is exactly right.
  – greg
  Aug 14 at 13:41
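To spell out the role of the $1$ (a worked restatement of the comments, assuming $1$ is the $m\times p$ all-ones matrix from the answer and using $A:B = {\rm tr}(A^TB) = \sum_{i,j}A_{ij}B_{ij}$): taking the Frobenius product with $1$ simply adds up all entries, so $\phi$ is the asker's double sum written without indices.

$$\eqalign{
1:\log(\beta WH)
&= {\rm tr}\big(1^T\log(\beta WH)\big) \cr
&= \sum_{i=1}^{m}\sum_{j=1}^{p} 1\cdot\log\big(\beta\,[WH]_{ij}\big) \cr
&= \sum_{i,j}\log\frac{\sum_k W_{ik}H_{kj}}{\text{const.}} \cr
}$$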
















Mixed indices confuse. $\displaystyle\frac{\partial}{\partial H_{uv}}\sum_{i,j}\log\frac{\sum_k W_{ik} H_{kj}}{\text{const.}} = \sum_i\frac{W_{iu}}{\sum_k W_{ik} H_{kv}}$.






answered Aug 14 at 4:21 by metamorphy
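The intermediate steps behind this one-line result can be sketched as follows. This is a reconstruction, not part of metamorphy's answer, and the Kronecker delta $\delta$ is notation introduced here. Only the term with $k=u$ and $j=v$ depends on $H_{uv}$, so

$$\eqalign{
\frac{\partial}{\partial H_{uv}}\sum_k W_{ik}H_{kj}
&= \sum_k W_{ik}\,\delta_{ku}\,\delta_{jv}
= W_{iu}\,\delta_{jv} \cr
\frac{\partial}{\partial H_{uv}}\sum_{i,j}\log\frac{\sum_k W_{ik}H_{kj}}{\text{const.}}
&= \sum_{i,j}\frac{W_{iu}\,\delta_{jv}}{\sum_k W_{ik}H_{kj}}
= \sum_i\frac{W_{iu}}{\sum_k W_{ik}H_{kv}} \cr
}$$

So the $\sum_k$ in the numerator collapses to the single term $W_{iu}$, the $\sum_k$ in the denominator stays, the sum over $j$ collapses to $j=v$, and the sum over $i$ stays. The result is the $(u,v)$ entry of $W^T\big(\frac{1}{WH}\big)$, one equation per entry of $H$, which agrees with the matrix form in the accepted answer.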






















                     
