In a CNN, do we have to learn kernel values at every convolution layer?
I'm new to machine learning, and one thing I don't understand about CNNs is whether we have to learn the kernel values at every convolutional layer, or whether we learn a single set of kernel values and reuse it at every convolution layer.
machine-learning conv-neural-network
asked Aug 20 at 5:21 by thegoodguy; edited Aug 20 at 5:44 by Frans Rodenburg
2 Answers
[6 upvotes, accepted]
The answer by @Shehryar Malik is correct (+1), but it may sound a bit confusing, especially to people new to convolutional neural networks.
In the usual CNN scenario, each layer has its own set of convolution kernels that have to be learned. This is easy to see in the following (famous) image:
[figure: visualizations of kernels learned at three different depths of a trained CNN]
The left block shows kernels learned in the first layer. The central and right blocks show kernels learned in deeper layers¹. This is a very important property of convolutional neural networks: at different layers, the network learns to detect features at different levels of abstraction. The kernels are therefore different at every layer.
In theory, nothing prevents you from using the same kernels at each layer. In fact, such a network is called a recurrent convolutional neural network.
¹ More precisely, they show what kind of image features these kernels respond to, since directly visualizing a kernel of shape 3×3×256 is not very easy, intuitive, or useful.
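To make the "own set of kernels per layer" point concrete, here is a minimal numpy sketch; the layer shapes are made up for illustration. Each layer holds an independent kernel tensor, and each tensor is a separate set of learnable parameters with its own gradient updates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical kernel shapes: (out_channels, in_channels, height, width).
# The sizes are illustrative, not from any particular network.
kernels_layer1 = rng.standard_normal((32, 3, 3, 3))    # layer 1: 32 kernels over an RGB input
kernels_layer2 = rng.standard_normal((64, 32, 3, 3))   # layer 2: 64 kernels over 32 feature maps

# Each layer contributes its own independent block of learnable parameters:
params_1 = kernels_layer1.size   # 32 * 3 * 3 * 3  = 864
params_2 = kernels_layer2.size   # 64 * 32 * 3 * 3 = 18432
print(params_1, params_2)
```

Note that the two tensors cannot even be shared naively here: layer 2's kernels must have 32 input channels to consume layer 1's output, while layer 1's kernels have 3.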
Interesting, but speaking of confusion, "recurrent convolutional network" seems a bit of an unfortunate name if I understand it correctly: is an RCNN different from an RNN with convolutional layers?
– Frans Rodenburg, Aug 20 at 9:31
The name RCNN is indeed unfortunate, as it is also established as an acronym for a family of object-detection networks. But to answer your question: I think the difference is whether you apply the same convolutional filter repeatedly to the same data (a recurrent convnet), or pass the data through some convolutional layers and then apply a standard RNN (e.g. LSTM units), in which case the recurrence does not actually involve convolutions. Both are possible, I think, though I am no expert on recurrent nets.
– Jan Kukacka, Aug 20 at 9:35
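As a toy illustration of the "same filter applied repeatedly" idea from the comment above, here is a sketch with 1-D arrays instead of images and a hand-written valid convolution (not any particular library's API):

```python
import numpy as np

def conv1d(x, k):
    """'Valid' 1-D convolution (really cross-correlation, as in most DL libraries)."""
    n = len(x) - len(k) + 1
    return np.array([np.dot(x[i:i + len(k)], k) for i in range(n)])

x = np.arange(8, dtype=float)

# Recurrent-convnet style: one kernel, reused at every "layer".
k_shared = np.array([1.0, 0.0, -1.0])
h = x
for _ in range(2):                 # two layers sharing the same kernel
    h = conv1d(h, k_shared)

# Standard-CNN style: a different kernel per layer.
k1 = np.array([1.0, 0.0, -1.0])
k2 = np.array([0.5, 0.5, 0.5])
g = conv1d(conv1d(x, k1), k2)

print(h)  # the shared difference kernel applied twice flattens this ramp to zeros
print(g)
```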
[1 upvote]
That is entirely up to you. You can define only one set of kernel values and use it for all your layers, or you can define a separate set of kernel values for each layer. Of course, it is more prudent to define a different set for each layer: a kernel's job is to extract specific information from its input, and different sets of kernel values at each layer give the network greater flexibility in deciding which features are best to extract at each layer.
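Both options can be sketched in numpy (the shapes below are illustrative). Reusing one kernel tensor across layers, i.e. tying the weights, means any update to it is seen by every layer at once; it also only makes sense when a layer's input and output channel counts match.

```python
import numpy as np

rng = np.random.default_rng(1)

# Separate kernels per layer (the usual choice): each layer is free to
# specialize in different features.
separate = [rng.standard_normal((16, 16, 3, 3)) for _ in range(3)]

# Tied kernels: one tensor object reused by every layer.
shared = rng.standard_normal((16, 16, 3, 3))
tied = [shared, shared, shared]

tied[0] += 0.1   # an in-place "gradient step" on layer 0's weights...
# ...changes every layer's weights, because all three entries are one tensor:
print(np.array_equal(tied[0], tied[2]))

# Note: sharing requires in_channels == out_channels (16 here); otherwise a
# deeper layer could not consume the previous layer's output with the same
# kernel shape.
```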
If you define one kernel for each layer, you will still have to tie the weights to accomplish what @thegoodguy asked.
– Frans Rodenburg, Aug 20 at 5:42
@FransRodenburg what do you mean by 'tie the weights'? For a CNN, the 'kernel' is the 'weight matrix', and that is essentially what the network is trying to learn.
– Shehryar Malik, Aug 20 at 5:55
Even if you have the same kernel dimensions for each convolutional layer, you will still learn different weights. The OP asked whether the values are the same, which is only the case if you force them to be (i.e. tie the weights).
– Frans Rodenburg, Aug 20 at 6:00
So the conclusion is: I have to learn kernel values for each convolution layer.
– thegoodguy, Aug 20 at 6:19
@FransRodenburg when I said 'kernel' I meant a set of kernel values (a kernel is, after all, only a weight matrix, and two different weight matrices can have the same dimensions). Nevertheless, I am editing my answer to make this more apparent.
– Shehryar Malik, Aug 20 at 7:56
answered Aug 20 at 8:47 (edited Aug 20 at 9:22) – Jan Kukacka
answered Aug 20 at 5:37 (edited Aug 20 at 7:56) – Shehryar Malik