Vector Sum Reduction

I’m currently reading Jeremy Howard’s paper “The Matrix Calculus You Need For Deep Learning” and came across a bit I don’t understand.



In section 4.4 (Vector Sum Reduction), they are calculating the derivative of $y = \operatorname{sum}(\mathbf{f}(\mathbf{x}))$, like so:
\begin{align*}
\frac{\partial y}{\partial \mathbf{x}}
&= \left[
\frac{\partial y}{\partial x_1},
\frac{\partial y}{\partial x_2},
\dotsc,
\frac{\partial y}{\partial x_n}
\right] \\
&= \left[
\frac{\partial}{\partial x_1} \sum_i f_i(\mathbf{x}),
\frac{\partial}{\partial x_2} \sum_i f_i(\mathbf{x}),
\dotsc,
\frac{\partial}{\partial x_n} \sum_i f_i(\mathbf{x})
\right] \\
&= \left[
\sum_i \frac{\partial f_i(\mathbf{x})}{\partial x_1},
\sum_i \frac{\partial f_i(\mathbf{x})}{\partial x_2},
\dotsc,
\sum_i \frac{\partial f_i(\mathbf{x})}{\partial x_n}
\right]
\end{align*}
(Original image here.)
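As a sanity check (this sketch is not from the paper, and the particular $\mathbf{f}$ below is an arbitrary choice, picked so that every $f_i$ genuinely depends on all of $\mathbf{x}$), the identity can be verified numerically: the gradient of $y = \operatorname{sum}(\mathbf{f}(\mathbf{x}))$ should equal the column sums of the Jacobian of $\mathbf{f}$.

```python
# Numerical check (assumed example, not from the paper): for y = sum(f(x)),
# dy/dx_j should equal sum_i df_i/dx_j, i.e. the j-th column sum of the Jacobian of f.
import numpy as np

def f(x):
    # f_i(x) = x_i * sum(x): every component of f uses the whole vector x
    return x * x.sum()

def y(x):
    return f(x).sum()

x = np.array([1.0, 2.0, 3.0])
eps = 1e-6
basis = np.eye(len(x))

# Central-difference gradient of y
grad_fd = np.array([(y(x + eps * e) - y(x - eps * e)) / (2 * eps) for e in basis])

# Central-difference Jacobian of f (J[i, j] = df_i/dx_j), then sum each column over i
jac_fd = np.column_stack([(f(x + eps * e) - f(x - eps * e)) / (2 * eps) for e in basis])
grad_from_jacobian = jac_fd.sum(axis=0)

print(grad_fd)             # approximately [12. 12. 12.]
print(grad_from_jacobian)  # approximately [12. 12. 12.] -- matches, as the derivation says
```

Both prints give (approximately) $[12, 12, 12]$ here, since for this $\mathbf{f}$ we have $y = \left(\sum_i x_i\right)^2$ and $\partial y / \partial x_j = 2\sum_i x_i = 12$ at $\mathbf{x} = (1, 2, 3)$.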



Here, the following is noted:




Notice we were careful here to leave the parameter as a vector $\mathbf{x}$ because each function $f_i$ could use all values in the vector, not just $x_i$.




What does that mean? I thought it meant that we can’t reduce $f_i(\mathbf{x})$ to $f_i(x_i)$ to $x_i$, but right afterwards, that is precisely what they do:




Let’s look at the gradient of the simple $y = \operatorname{sum}(\mathbf{x})$.
The function inside the summation is just $f_i(\mathbf{x}) = x_i$ and the gradient is then:
\begin{align*}
\nabla y
&= \left[
\sum_i \frac{\partial f_i(\mathbf{x})}{\partial x_1},
\sum_i \frac{\partial f_i(\mathbf{x})}{\partial x_2},
\dotsc,
\sum_i \frac{\partial f_i(\mathbf{x})}{\partial x_n}
\right] \\
&= \left[
\sum_i \frac{\partial x_i}{\partial x_1},
\sum_i \frac{\partial x_i}{\partial x_2},
\dotsc,
\sum_i \frac{\partial x_i}{\partial x_n}
\right].
\end{align*}



(Original image here.)
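For completeness (this step is not shown in the quoted excerpt): since $\frac{\partial x_i}{\partial x_j} = 1$ when $i = j$ and $0$ otherwise, each of those sums collapses to a single term, giving
\begin{align*}
\nabla y = \left[ 1, 1, \dotsc, 1 \right],
\end{align*}
the row vector of ones.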




Does it have something to do with the summation being taken out? Or am I reading this completely wrong?

























  • It’s reminding you that in general $f_i(\mathbf{x}) = f_i(x_1, x_2, \ldots, x_n)$ is a function of all the variables $x_j$. Then they do a particular example where $f_i(x_1, x_2, \ldots, x_n) = x_i$ just depends on the one variable.
    – Lord Shark the Unknown, Aug 24 at 5:03










  • Ohh got it! Thank you!
    – General Thalion, Aug 25 at 8:42
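To make the comment’s distinction concrete, here is a small sketch along the same lines as the one above (again not from the paper or the thread; the second $\mathbf{f}$ is an arbitrary choice):

```python
# Contrast the two cases from the comment: f_i depending only on x_i vs. on all of x.
import numpy as np

def jacobian(f, x, eps=1e-6):
    # Central-difference Jacobian: J[i, j] = df_i / dx_j
    return np.column_stack([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                            for e in np.eye(len(x))])

x = np.array([1.0, 2.0, 3.0])

# Case 1: f_i(x) = x_i -- each f_i depends only on its own component
J1 = jacobian(lambda v: v, x)
print(np.round(J1, 3))    # identity matrix
print(J1.sum(axis=0))     # column sums -> [1. 1. 1.], the gradient of sum(x)

# Case 2: f_i(x) = x_i * sum(x) -- each f_i depends on every component of x
J2 = jacobian(lambda v: v * v.sum(), x)
print(np.round(J2, 3))    # dense: sum(x) * I + outer(x, ones)
print(J2.sum(axis=0))     # column sums -> approximately [12. 12. 12.]
```

In the first case the off-diagonal entries are all zero, which is exactly why the sums $\sum_i \partial x_i / \partial x_j$ collapse to $1$; the general derivation never assumes that, which is the point of keeping the argument as the full vector $\mathbf{x}$.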













