Vector Sum Reduction
I'm currently reading Jeremy Howard's paper "The Matrix Calculus You Need For Deep Learning" and came across a bit I don't understand.
In section 4.4 (Vector Sum Reduction), they are calculating the derivative of $y = \operatorname{sum}(\mathbf{f}(\mathbf{x}))$, like so:
\begin{align*}
\frac{\partial y}{\partial \mathbf{x}}
&= \left[
\frac{\partial y}{\partial x_1},
\frac{\partial y}{\partial x_2},
\dotsc,
\frac{\partial y}{\partial x_n}
\right] \\
&= \left[
\frac{\partial}{\partial x_1} \sum_i f_i(\mathbf{x}),
\frac{\partial}{\partial x_2} \sum_i f_i(\mathbf{x}),
\dotsc,
\frac{\partial}{\partial x_n} \sum_i f_i(\mathbf{x})
\right] \\
&= \left[
\sum_i \frac{\partial f_i(\mathbf{x})}{\partial x_1},
\sum_i \frac{\partial f_i(\mathbf{x})}{\partial x_2},
\dotsc,
\sum_i \frac{\partial f_i(\mathbf{x})}{\partial x_n}
\right]
\end{align*}
(Original image here.)
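To make that last line concrete, here is a small worked example of my own (it is not from the paper): take $n = 2$ with $f_1(\mathbf{x}) = x_1 x_2$ and $f_2(\mathbf{x}) = x_1 + x_2$. Then
\begin{align*}
\frac{\partial y}{\partial x_1} &= \sum_i \frac{\partial f_i(\mathbf{x})}{\partial x_1} = x_2 + 1,
&
\frac{\partial y}{\partial x_2} &= \sum_i \frac{\partial f_i(\mathbf{x})}{\partial x_2} = x_1 + 1,
\end{align*}
so each partial can involve every entry of $\mathbf{x}$, not just the one being differentiated.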
Here, the following is noted:

"Notice we were careful here to leave the parameter as a vector $\mathbf{x}$ because each function $f_i$ could use all values in the vector, not just $x_i$."

What does that mean? I thought it meant that we can't reduce $f_i(\mathbf{x})$ to $f_i(x_i)$ to $x_i$, but right afterwards, that is precisely what they do:

"Let's look at the gradient of the simple $y = \operatorname{sum}(\mathbf{x})$. The function inside the summation is just $f_i(\mathbf{x}) = x_i$ and the gradient is then:"
\begin{align*}
\nabla y
&= \left[
\sum_i \frac{\partial f_i(\mathbf{x})}{\partial x_1},
\sum_i \frac{\partial f_i(\mathbf{x})}{\partial x_2},
\dotsc,
\sum_i \frac{\partial f_i(\mathbf{x})}{\partial x_n}
\right] \\
&= \left[
\sum_i \frac{\partial x_i}{\partial x_1},
\sum_i \frac{\partial x_i}{\partial x_2},
\dotsc,
\sum_i \frac{\partial x_i}{\partial x_n}
\right].
\end{align*}
(Original image here.)
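For reference, carrying that last line one step further (my own continuation of the quoted excerpt): since $\frac{\partial x_i}{\partial x_j}$ is $1$ when $i = j$ and $0$ otherwise, each sum collapses to a single term, giving
$$
\nabla y = [1, 1, \dotsc, 1].
$$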
Does it have something to do with the summation being taken out? Or am I reading this completely wrong?
summation partial-derivative vector-analysis matrix-calculus jacobian
It's reminding you that in general $f_i(\mathbf{x}) = f_i(x_1, x_2, \ldots, x_n)$ is a function of all the variables $x_j$. Then they do a particular example where $f_i(x_1, x_2, \ldots, x_n) = x_i$ just depends on the one variable.
– Lord Shark the Unknown
Aug 24 at 5:03
Ohh, got it! Thank you!
– General Thalion
Aug 25 at 8:42
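To see that distinction numerically, here is a small sketch of my own (not from the paper or the comments). The function `f_general` is a made-up example in which every component depends on more than one entry of $\mathbf{x}$, while `f_identity` is the paper's special case $f_i(\mathbf{x}) = x_i$:

```python
import numpy as np

# Made-up example where components mix several variables:
# f_i(x) = x_i * x_{n-1-i}, so y = sum(f(x)) couples the entries of x.
def f_general(x):
    return x * x[::-1]

# The paper's simple special case: f_i(x) = x_i.
def f_identity(x):
    return x

def grad_of_sum(f, x, eps=1e-6):
    """Central finite-difference gradient of y = sum(f(x)) with respect to x."""
    grad = np.zeros_like(x)
    for j in range(len(x)):
        e = np.zeros_like(x)
        e[j] = eps
        grad[j] = (f(x + e).sum() - f(x - e).sum()) / (2 * eps)
    return grad

x = np.array([1.0, 2.0, 3.0])

# Special case: each sum_i d x_i / d x_j collapses to 1, so the gradient is all ones.
print(grad_of_sum(f_identity, x))  # ~ [1. 1. 1.]

# General case: each partial sum_i d f_i / d x_j picks up terms from
# several components, so the gradient is no longer a vector of ones.
print(grad_of_sum(f_general, x))   # ~ [6. 4. 2.]
```

The finite-difference gradient of $\operatorname{sum}(\mathbf{f}(\mathbf{x}))$ is all ones only in the special case; in general each partial $\sum_i \partial f_i / \partial x_j$ mixes contributions from several components, which is exactly why the paper keeps the argument as the full vector $\mathbf{x}$.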