Where am I going wrong in solving $\frac{\partial}{\partial \mathbf w}(\mathbf y - \mathbf X\mathbf w)^T(\mathbf y - \mathbf X \mathbf w) = 0$?
























I have the following equation which I wish to solve:



$$\frac{\partial}{\partial \mathbf w}(\mathbf y - \mathbf X\mathbf w)^T(\mathbf y - \mathbf X \mathbf w) = 0$$

Here $\mathbf y$ is $n \times 1$, $\mathbf X$ is $n \times 2$, and $\mathbf w$ is $2 \times 1$.



My solution (done on paper because MathJax is a bit difficult for me to use):



enter image description here



Also, is my reasoning for step 4 correct?















      edited Aug 16 at 8:02

























      asked Aug 16 at 7:45









      rjmessibarca





















          3 Answers



          accepted






          Going from line $3$ to line $4$, note that
          $$
          \frac{\partial}{\partial w} (y^TXw) = X^Ty,
          $$
          then you'll get the right answer
          $$
          \hat{w} = (X^TX)^{-1}X^Ty.
          $$

          Explicit derivation:
          Note that, when the first column of $X$ is a column of ones,
          $$
          y^TXw = w_1\sum_{i=1}^n y_i + w_2\sum_{i=1}^n y_i x_{1i}+\cdots+w_p\sum_{i=1}^n y_i x_{p-1,i},
          $$
          so taking the derivative w.r.t. the vector $w$, $w \in \mathbb{R}^p$, results in a gradient, i.e., a vector with $p$ rows and $1$ column, namely
          $$
          \begin{pmatrix}
          \sum y_i \\
          \sum y_i x_{1i}\\
          \vdots \\
          \sum y_i x_{p-1,i}
          \end{pmatrix},
          $$
          where the $j$th row is the derivative of $y^TXw$ w.r.t. $w_j$.
          Now, as $X^T$ is $p\times n$ and $y$ is $n \times 1$, $X^Ty$ is $p \times 1$, as required.
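
A quick numerical sanity check can make this concrete. The sketch below is plain Python with made-up data: the matrix `X`, the vectors `y` and `w`, and all helper names are illustrative assumptions, not taken from the question. It compares the entries of $X^Ty$ with central finite differences of $y^TXw$, taken one coordinate of $w$ at a time.

```python
# Finite-difference check that d/dw (y^T X w) = X^T y.
# All data below are made-up illustrative values.

def f(X, y, w):
    """Scalar y^T X w, with X given as a list of rows."""
    return sum(y_i * sum(x_ij * w_j for x_ij, w_j in zip(row, w))
               for y_i, row in zip(y, X))

def xt_y(X, y):
    """The p-vector X^T y."""
    p = len(X[0])
    return [sum(row[j] * y_i for row, y_i in zip(X, y)) for j in range(p)]

def numerical_gradient(func, w, h=1e-6):
    """Central differences, perturbing one coordinate of w at a time."""
    grad = []
    for j in range(len(w)):
        up = w[:j] + [w[j] + h] + w[j + 1:]
        down = w[:j] + [w[j] - h] + w[j + 1:]
        grad.append((func(up) - func(down)) / (2 * h))
    return grad

X = [[1.0, 2.0], [1.0, -1.0], [1.0, 0.5]]  # n = 3, p = 2, first column of ones
y = [3.0, 1.0, 2.0]
w = [0.3, -0.7]

analytic = xt_y(X, y)
numeric = numerical_gradient(lambda v: f(X, y, v), w)
print("X^T y            :", analytic)
print("finite difference:", numeric)
```

The $j$th finite difference only ever sees column $j$ of $X$ weighted by $y$, which is why the result is the column vector $X^Ty$ with one entry per component of $w$.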














          edited Aug 19 at 21:19

























          answered Aug 18 at 14:48









          V. Vancak












          • Can you please explain why? I thought it would be y'X. I can see from the order of the matrices that my solution is wrong and that yours is correct. But in general, how do I solve such problems involving matrix calculus? My current method is to write out each element of the matrix and then find the partial derivatives.
            – rjmessibarca
            Aug 19 at 16:08











          • @rjmessibarca Please see the edited answer.
            – V. Vancak
            Aug 19 at 21:20


























          No, your reasoning in step 4 is wrong. For example, if $X$ is a square matrix, $\mathbf{X}^T \mathbf{X}$ will not be a scalar. Therefore your result is wrong. Do note that $$\frac{\partial}{\partial \mathbf{w}} \left(\mathbf{w}^T \mathbf{X}^T \mathbf{X} \mathbf{w} \right) = 2 \mathbf{X}^T \mathbf{X} \mathbf{w}$$



          I am sure that you can get to the right answer from here.
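
The identity quoted above can also be checked numerically. The following is a minimal sketch in plain Python, using made-up values for `X` and `w` and hypothetical helper names; it compares $2\mathbf{X}^T\mathbf{X}\mathbf{w}$ with central finite differences of $\mathbf{w}^T\mathbf{X}^T\mathbf{X}\mathbf{w}$.

```python
# Finite-difference check that d/dw (w^T X^T X w) = 2 X^T X w.
# All data below are made-up illustrative values.

def quad(X, w):
    """w^T X^T X w, computed as the squared length of X w."""
    return sum(sum(x * w_j for x, w_j in zip(row, w)) ** 2 for row in X)

def two_xtx_w(X, w):
    """The p-vector 2 X^T X w."""
    Xw = [sum(x * w_j for x, w_j in zip(row, w)) for row in X]
    return [2 * sum(row[j] * r for row, r in zip(X, Xw)) for j in range(len(w))]

def numerical_gradient(func, w, h=1e-6):
    """Central differences, perturbing one coordinate of w at a time."""
    grad = []
    for j in range(len(w)):
        up = w[:j] + [w[j] + h] + w[j + 1:]
        down = w[:j] + [w[j] - h] + w[j + 1:]
        grad.append((func(up) - func(down)) / (2 * h))
    return grad

X = [[1.0, 2.0], [1.0, -1.0], [1.0, 0.5]]  # n = 3, p = 2
w = [0.3, -0.7]

analytic = two_xtx_w(X, w)
numeric = numerical_gradient(lambda v: quad(X, v), w)
print("2 X^T X w        :", analytic)
print("finite difference:", numeric)
```

Here $\mathbf{X}$ is $3 \times 2$, so $\mathbf{X}^T\mathbf{X}$ is $2 \times 2$ and the gradient has two entries. Nothing in the check requires $\mathbf{X}$ to be square.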















          edited Aug 16 at 8:27

























          answered Aug 16 at 8:17









          Jan












          • $X$ is an $n \times 2$ matrix, so $X^TX$ is $2\times 2$.
            – Jaap Scherphuis
            Aug 16 at 8:19










          • Yes, so not a scalar.
            – Jan
            Aug 16 at 8:20






          • Yes, I'm just disagreeing with the "If X is a square matrix" bit in your answer. It does not have to be square, though $X^TX$ will be.
            – Jaap Scherphuis
            Aug 16 at 8:22










          • Ooh yeah I know about that, I was just giving an example to show why this was not true but I'll clarify.
            – Jan
            Aug 16 at 8:26










          • @Jan How did you get that result? BTW, I used the expression that you mentioned in step 5. But still answer is wrong.
            – rjmessibarca
            Aug 16 at 8:31


























          Let $M = \mathbf{y} - \mathbf{X} \mathbf{w}$

          and $f = M^T M = M : M$.

          We will utilize the following identities:

          • Trace and Frobenius product relation $$A:B={\rm tr}(A^TB)$$ or $$A^T:B={\rm tr}(AB)$$

          • Cyclic property of Trace/Frobenius product $$\eqalign{
            A:BC
            &= AC^T:B \cr
            &= B^TA:C \cr
            &= \text{etc.} \cr
            }$$

          Now, we obtain the differential first and thereafter we obtain the gradient.

          So,
          \begin{align}
          df &= \left( dM : M \right) + \left( M : dM \right)\\
          &= 2M : dM \\
          &= 2M : \left( - \mathbf{X} \, d\mathbf{w} \right) \\
          &= - 2\mathbf{X}^T M : d\mathbf{w} \hspace{8mm} \text{note: utilized cyclic property of Frobenius product} \\
          &= - 2\mathbf{X}^T \left( \mathbf{y} - \mathbf{X} \mathbf{w} \right) : d\mathbf{w} .
          \end{align}

          Thus, the gradient reads
          \begin{align}
          \frac{\partial}{\partial \mathbf{w}} f
          = - 2\mathbf{X}^T \left( \mathbf{y} - \mathbf{X} \mathbf{w} \right) .
          \end{align}

          Then you can set the gradient to $0$ and obtain your $$\mathbf{w} = \left( \mathbf{X}^T \mathbf{X} \right)^{-1} \mathbf{X}^T \mathbf{y}$$
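
For the $n \times 2$ case in the question, the final formula can be tried out directly. The sketch below is plain Python with made-up data; the function names and the explicit $2 \times 2$ inverse are illustrative choices, not something from the thread. It solves $\mathbf{X}^T\mathbf{X}\mathbf{w} = \mathbf{X}^T\mathbf{y}$ and then evaluates the gradient $-2\mathbf{X}^T(\mathbf{y} - \mathbf{X}\mathbf{w})$ at the solution, which should vanish up to rounding.

```python
# Solve the normal equations for an n-by-2 matrix X and confirm that the
# gradient -2 X^T (y - X w) vanishes there. Data are made-up values.

def normal_equations_2col(X, y):
    """Return w = (X^T X)^{-1} X^T y using the closed-form 2x2 inverse."""
    a = sum(r[0] * r[0] for r in X)          # (X^T X)[0][0]
    b = sum(r[0] * r[1] for r in X)          # (X^T X)[0][1] = (X^T X)[1][0]
    d = sum(r[1] * r[1] for r in X)          # (X^T X)[1][1]
    g0 = sum(r[0] * y_i for r, y_i in zip(X, y))  # (X^T y)[0]
    g1 = sum(r[1] * y_i for r, y_i in zip(X, y))  # (X^T y)[1]
    det = a * d - b * b
    if det == 0:
        raise ValueError("X^T X is singular: columns of X are linearly dependent")
    return [(d * g0 - b * g1) / det, (a * g1 - b * g0) / det]

def gradient(X, y, w):
    """The 2-vector -2 X^T (y - X w)."""
    resid = [y_i - (r[0] * w[0] + r[1] * w[1]) for r, y_i in zip(X, y)]
    return [-2 * sum(r[j] * e for r, e in zip(X, resid)) for j in range(2)]

X = [[1.0, 2.0], [1.0, -1.0], [1.0, 0.5]]  # n = 3, first column of ones
y = [3.0, 1.0, 2.5]

w_hat = normal_equations_2col(X, y)
grad = gradient(X, y, w_hat)
print("w_hat   :", w_hat)
print("gradient:", grad)
```

The explicit inverse is only reasonable because $\mathbf{X}^T\mathbf{X}$ is $2 \times 2$ here. For larger problems a linear solver or a least-squares routine is the usual choice.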















          edited Aug 16 at 9:01

























          answered Aug 16 at 8:29









          user550103












          • I did not ask for how to solve. I asked what is wrong with my solution.
            – rjmessibarca
            Aug 16 at 10:12















