Gradient descent vs. system of equations

Given the matrices $\mathbf{t}_{M\times 1}$ and $\mathbf{Q}_{M\times N}$, we want to find $\mathbf{p}_{N\times 1}$ that minimizes $\epsilon = \|\mathbf{t} - \mathbf{Q}\mathbf{p}\|_2$.

To do so, we could use gradient descent. My question is: assuming gradient descent reaches the best possible result, will that result be equivalent to solving for $\mathbf{p}$ in the equation $\mathbf{t} = \mathbf{Q}\mathbf{p}$ as

$\mathbf{Q}^T \mathbf{t} = \mathbf{Q}^T \mathbf{Q}\mathbf{p}$,

$(\mathbf{Q}^T \mathbf{Q})^{-1} \mathbf{Q}^T \mathbf{t} = (\mathbf{Q}^T \mathbf{Q})^{-1} (\mathbf{Q}^T \mathbf{Q}) \mathbf{p}$,

and so
$\mathbf{p} = (\mathbf{Q}^T \mathbf{Q})^{-1} \mathbf{Q}^T \mathbf{t}$?

I see that by premultiplying by $\mathbf{Q}^T$ we are 'collapsing' our system of equations (M equations, N unknowns) to N equations with N unknowns, but I can't see how this minimizes $\epsilon$ (if it does).
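
One way to probe this numerically: the $\mathbf{p}$ obtained from $\mathbf{Q}^T \mathbf{t} = \mathbf{Q}^T \mathbf{Q}\mathbf{p}$ leaves a residual $\mathbf{t} - \mathbf{Q}\mathbf{p}$ that is orthogonal to every column of $\mathbf{Q}$, so moving $\mathbf{p}$ in any direction can only increase the error. The following is a minimal sketch, assuming NumPy, randomly generated $\mathbf{Q}$ and $\mathbf{t}$ (not data from the question), and $M > N$ with $\mathbf{Q}$ of full column rank so that $\mathbf{Q}^T\mathbf{Q}$ is invertible.

```python
import numpy as np

# Illustrative data only: sizes and values are assumptions, not from the post.
rng = np.random.default_rng(0)
M, N = 8, 3                      # overdetermined system, M > N
Q = rng.standard_normal((M, N))  # a random Gaussian Q has full column rank almost surely
t = rng.standard_normal(M)

# Solve Q^T Q p = Q^T t directly instead of forming an explicit inverse.
p = np.linalg.solve(Q.T @ Q, Q.T @ t)

# The residual of this solution is orthogonal to every column of Q.
residual = t - Q @ p
orthogonality = Q.T @ residual


def error(p_vec):
    """Return epsilon = ||t - Q p||_2 for a candidate p."""
    return np.linalg.norm(t - Q @ p_vec)


# Randomly perturbed candidates should never beat the normal-equations solution.
perturbed_errors = [error(p + 1e-2 * rng.standard_normal(N)) for _ in range(100)]

print("Q^T (t - Q p) =", orthogonality)
print("error at p:", error(p), "smallest perturbed error:", min(perturbed_errors))
```

Because the residual has no component along the columns of $\mathbf{Q}$, any change $\mathbf{Q}\mathbf{d}$ adds $\|\mathbf{Q}\mathbf{d}\|^2$ to the squared error, which is why the perturbed errors come out larger.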

asked Sep 7 at 9:12

Carlos Navarro Astiasarán

454

Minimizing the error is equivalent to minimizing the squared error, which is $\epsilon^2 = \|t-Qp\|^2 = t^Tt-2t^TQp+p^TQ^TQp$. The minimum is attained when its gradient, $2Q^TQp-2Q^Tt$, equals zero, which occurs when $p=(Q^TQ)^{-1}Q^Tt$. Thus the two approaches are equivalent.
– Rahul
Sep 7 at 9:20
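
The equivalence described in this comment can be checked numerically. The sketch below runs plain gradient descent on $\epsilon^2$ using the gradient $2Q^TQp-2Q^Tt$ and compares the result with the solution of $Q^TQp = Q^Tt$. It is an illustration under stated assumptions, not part of the original discussion: the data are random, $Q$ is assumed to have full column rank, and the fixed step size $1/(2\lambda_{\max}(Q^TQ))$ and the iteration count are choices made here for the example.

```python
import numpy as np

# Illustrative data only: sizes and values are assumptions, not from the post.
rng = np.random.default_rng(1)
M, N = 20, 4
Q = rng.standard_normal((M, N))
t = rng.standard_normal(M)

# Closed form: solve the normal equations Q^T Q p = Q^T t.
p_closed = np.linalg.solve(Q.T @ Q, Q.T @ t)

# Gradient descent on epsilon^2 = ||t - Q p||^2.
# The gradient is 2 Q^T Q p - 2 Q^T t; its Lipschitz constant is
# 2 * lambda_max(Q^T Q), so a step of 1 / lipschitz is a safe fixed step.
lipschitz = 2.0 * np.linalg.eigvalsh(Q.T @ Q).max()
step = 1.0 / lipschitz

p_gd = np.zeros(N)
for _ in range(5000):
    grad = 2.0 * Q.T @ (Q @ p_gd - t)
    p_gd = p_gd - step * grad

gap = np.linalg.norm(p_gd - p_closed)
print("gradient descent:", p_gd)
print("normal equations:", p_closed)
print("distance between the two:", gap)
```

In practice a least-squares routine such as `np.linalg.lstsq(Q, t, rcond=None)` is usually preferred over forming $Q^TQ$, since it also copes with rank-deficient $Q$, where $(Q^TQ)^{-1}$ does not exist.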

linear-algebra gradient-descent
