In a CNN, do we have to learn kernel values at every convolution layer?

I'm new to machine learning, and one of the things I don't understand about CNNs is whether we have to learn the kernel values at every convolutional layer, or whether we learn a single set of kernel values and reuse it at every convolutional layer.







asked Aug 20 at 5:21 by thegoodguy · edited Aug 20 at 5:44 by Frans Rodenburg




















2 Answers

















Accepted answer (6 votes)










The answer by @Shehryar Malik is correct (+1), but it may sound a bit confusing, especially to people new to convolutional neural networks.



In the usual CNN scenario, each layer has its own set of convolution kernels that have to be learned. This can be seen clearly in the following (famous) image:



[image: visualizations of features learned by convolutional layers at three increasing depths]



The left block shows kernels learned in the first layer. The central and right blocks show kernels learned in deeper layers¹. This is a very important property of convolutional neural networks: at different layers, the network learns to detect features at different levels of abstraction. Therefore, the kernels are different.



In theory, nothing prevents you from using the same kernels at each layer. In fact, such an architecture is called a recurrent convolutional neural network.




¹ More precisely, they show what kind of image features these kernels respond to, since directly visualizing a kernel of shape $3 \times 3 \times 256$ is not very easy, intuitive, or useful.
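The "each layer has its own kernels" point can be sketched numerically. Below is a minimal, illustrative example (assuming NumPy; the `conv2d` helper and all names are mine, not from the answer) in which two layers each own an independently initialized kernel bank — in a real network both banks would be trainable parameters updated by backpropagation:

```python
import numpy as np

def conv2d(x, kernels):
    """Valid 2D cross-correlation of a single-channel image `x`
    with a bank of kernels of shape (n_kernels, kh, kw)."""
    n, kh, kw = kernels.shape
    h, w = x.shape
    out = np.empty((n, h - kh + 1, w - kw + 1))
    for k in range(n):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[k, i, j] = np.sum(x[i:i + kh, j:j + kw] * kernels[k])
    return out

rng = np.random.default_rng(0)
# Separate, independently initialized kernel banks: one per layer.
layer1_kernels = rng.standard_normal((4, 3, 3))
layer2_kernels = rng.standard_normal((4, 3, 3))

x = rng.standard_normal((8, 8))
h1 = conv2d(x, layer1_kernels)                # shape (4, 6, 6)
# Collapse channels to keep the toy helper single-channel; real conv
# layers mix input channels inside the kernel instead.
h2 = conv2d(h1.sum(axis=0), layer2_kernels)   # shape (4, 4, 4)
```

Training such a network means adjusting `layer1_kernels` and `layer2_kernels` independently, which is why the learned filters at different depths end up looking so different.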






edited Aug 20 at 9:22 · answered Aug 20 at 8:47 by Jan Kukacka






















• Interesting, but speaking of confusion, "recurrent convolutional network" seems a bit of an unfortunate name if I understand it correctly: is an RCNN different from an RNN with convolutional layers?
            – Frans Rodenburg
            Aug 20 at 9:31










• RCNN is indeed unfortunate, as it is also established as an acronym for object-detection networks. But to answer your question, I think the difference is whether you apply the same conv. filter repeatedly to the same data (recurrent convnet), or pass the data through some conv. layers and then apply a standard RNN (e.g. LSTM units), so that the recurrence does not actually involve convolutions. Both are possible, I think, though I am no expert on recurrent nets.
            – Jan Kukacka
            Aug 20 at 9:35


















Answer (1 vote)













That is entirely up to you. You can define a single set of kernel values and use it for all your layers, or you can define a separate set of kernel values for each layer. Of course, it is more prudent to define different sets of kernel values for each layer. This is because a kernel's job is to extract specific information from its input. Different sets of kernel values at each layer give the network greater flexibility in deciding the best features to extract at each layer.
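To make the trade-off concrete, here is a toy parameter count comparing per-layer kernels with one tied set reused at every layer (input-channel dimensions are ignored for brevity; the function and numbers are illustrative, not from the answer):

```python
def conv_params(n_layers, n_kernels, kh, kw, tied=False):
    """Number of learnable kernel weights for `n_layers` conv layers
    that all use `n_kernels` kernels of size kh x kw.
    tied=True means one shared set reused at every layer."""
    per_layer = n_kernels * kh * kw
    return per_layer if tied else n_layers * per_layer

separate = conv_params(n_layers=5, n_kernels=32, kh=3, kw=3)
shared   = conv_params(n_layers=5, n_kernels=32, kh=3, kw=3, tied=True)
print(separate, shared)  # 1440 vs 288
```

Tying the weights shrinks the model fivefold here, but it also removes the freedom to learn different features at different depths, which is the usual reason each layer gets its own set.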






edited Aug 20 at 7:56 · answered Aug 20 at 5:37 by Shehryar Malik






















          • If you define one kernel for each layer, you will still have to tie the weights to accomplish what @thegoodguy asked.
            – Frans Rodenburg
            Aug 20 at 5:42










          • @FransRodenburg what do you mean by 'tie the weights'? For a CNN, the 'kernel' is the 'weight matrix' and that is essentially what the network is trying to learn.
            – Shehryar Malik
            Aug 20 at 5:55










          • Even if you have the same kernel dimensions for each convolutional layer, you will still learn different weights. The OP asked whether the values are the same, which is only the case if you force them to be (i.e. tie the weights).
            – Frans Rodenburg
            Aug 20 at 6:00











• So the conclusion is: I have to learn kernel values for each convolutional layer.
            – thegoodguy
            Aug 20 at 6:19










          • @FransRodenburg when I said kernel I meant a set of kernel values (a kernel is after all only a weight matrix, and two different weight matrices might have the same dimensions). Nevertheless, I am editing my answer to make this more apparent.
            – Shehryar Malik
            Aug 20 at 7:56










          Your Answer




          StackExchange.ifUsing("editor", function ()
          return StackExchange.using("mathjaxEditing", function ()
          StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
          StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
          );
          );
          , "mathjax-editing");

          StackExchange.ready(function()
          var channelOptions =
          tags: "".split(" "),
          id: "65"
          ;
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function()
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled)
          StackExchange.using("snippets", function()
          createEditor();
          );

          else
          createEditor();

          );

          function createEditor()
          StackExchange.prepareEditor(
          heartbeatType: 'answer',
          convertImagesToLinks: false,
          noModals: false,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          );



          );








           

          draft saved


          draft discarded


















          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f362988%2fin-cnn-do-we-have-learn-kernel-values-at-every-convolution-layer%23new-answer', 'question_page');

          );

          Post as a guest






























          2 Answers
          2






          active

          oldest

          votes








          2 Answers
          2






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes








          up vote
          6
          down vote



          accepted










          The answer by @Shehryar Malik is correct (+1), but it sounds a bit confusing, especially for people new to convolutional neural networks.



          In the usual CNN scenario, each layer has its own set of convolution kernels that has to be learned. This can be easily seen in the following (famous) image:



          enter image description here



          The left block shows learned kernels in the first layer. The central and right block show kernels learned in deeper layers1. This is very important feature of convolutional neural networks: At different layers the network learns to detect stuff at different levels of abstraction. Therefore the kernels are different.



          In theory, nothing prevents you from using the same kernels at each layer. In fact, that thing is called recurrent convolutional neural network.




          1 More precisely, they show to what kind of image features these kernels respond to, since visualizing kernel with shape 3$times$3$times$256 is not very easy/intuitive/useful.






          share|cite|improve this answer






















          • Interesting, but speaking of confusion, recurrent convolutional network seems a bit of an unfortunate name if I understand it correctly: Is RCNN it different from an RNN with convolutional layers?
            – Frans Rodenburg
            Aug 20 at 9:31










          • RCNN is indeed unfortunate, as it is also established as an acronym for networks for object detection. But to answer your question, I think the difference is if you apply the same conv. filter repeatedly on the same data (recurrent convnet) or if you pass the data through some conv. layers and then apply standard RNN (as LSTM units), so the recurrence actually does not involve convolutions. Both is possible I think, even though I am no expert on recurrent nets.
            – Jan Kukacka
            Aug 20 at 9:35















          up vote
          6
          down vote



          accepted










          The answer by @Shehryar Malik is correct (+1), but it sounds a bit confusing, especially for people new to convolutional neural networks.



          In the usual CNN scenario, each layer has its own set of convolution kernels that has to be learned. This can be easily seen in the following (famous) image:



          enter image description here



          The left block shows learned kernels in the first layer. The central and right block show kernels learned in deeper layers1. This is very important feature of convolutional neural networks: At different layers the network learns to detect stuff at different levels of abstraction. Therefore the kernels are different.



          In theory, nothing prevents you from using the same kernels at each layer. In fact, that thing is called recurrent convolutional neural network.




          1 More precisely, they show to what kind of image features these kernels respond to, since visualizing kernel with shape 3$times$3$times$256 is not very easy/intuitive/useful.






          share|cite|improve this answer






















          • Interesting, but speaking of confusion, recurrent convolutional network seems a bit of an unfortunate name if I understand it correctly: Is RCNN it different from an RNN with convolutional layers?
            – Frans Rodenburg
            Aug 20 at 9:31










          • RCNN is indeed unfortunate, as it is also established as an acronym for networks for object detection. But to answer your question, I think the difference is if you apply the same conv. filter repeatedly on the same data (recurrent convnet) or if you pass the data through some conv. layers and then apply standard RNN (as LSTM units), so the recurrence actually does not involve convolutions. Both is possible I think, even though I am no expert on recurrent nets.
            – Jan Kukacka
            Aug 20 at 9:35













          up vote
          6
          down vote



          accepted







          up vote
          6
          down vote



          accepted






          The answer by @Shehryar Malik is correct (+1), but it sounds a bit confusing, especially for people new to convolutional neural networks.



          In the usual CNN scenario, each layer has its own set of convolution kernels that has to be learned. This can be easily seen in the following (famous) image:



          enter image description here



          The left block shows learned kernels in the first layer. The central and right block show kernels learned in deeper layers1. This is very important feature of convolutional neural networks: At different layers the network learns to detect stuff at different levels of abstraction. Therefore the kernels are different.



          In theory, nothing prevents you from using the same kernels at each layer. In fact, that thing is called recurrent convolutional neural network.




          1 More precisely, they show to what kind of image features these kernels respond to, since visualizing kernel with shape 3$times$3$times$256 is not very easy/intuitive/useful.






          share|cite|improve this answer














          The answer by @Shehryar Malik is correct (+1), but it sounds a bit confusing, especially for people new to convolutional neural networks.



          In the usual CNN scenario, each layer has its own set of convolution kernels that has to be learned. This can be easily seen in the following (famous) image:



          enter image description here



          The left block shows learned kernels in the first layer. The central and right block show kernels learned in deeper layers1. This is very important feature of convolutional neural networks: At different layers the network learns to detect stuff at different levels of abstraction. Therefore the kernels are different.



          In theory, nothing prevents you from using the same kernels at each layer. In fact, that thing is called recurrent convolutional neural network.




          1 More precisely, they show to what kind of image features these kernels respond to, since visualizing kernel with shape 3$times$3$times$256 is not very easy/intuitive/useful.







          share|cite|improve this answer














          share|cite|improve this answer



          share|cite|improve this answer








          edited Aug 20 at 9:22

























          answered Aug 20 at 8:47









          Jan Kukacka

          4,06811233




          4,06811233











          • Interesting, but speaking of confusion, recurrent convolutional network seems a bit of an unfortunate name if I understand it correctly: Is RCNN it different from an RNN with convolutional layers?
            – Frans Rodenburg
            Aug 20 at 9:31










          • RCNN is indeed unfortunate, as it is also established as an acronym for networks for object detection. But to answer your question, I think the difference is if you apply the same conv. filter repeatedly on the same data (recurrent convnet) or if you pass the data through some conv. layers and then apply standard RNN (as LSTM units), so the recurrence actually does not involve convolutions. Both is possible I think, even though I am no expert on recurrent nets.
            – Jan Kukacka
            Aug 20 at 9:35

















          • Interesting, but speaking of confusion, recurrent convolutional network seems a bit of an unfortunate name if I understand it correctly: Is RCNN it different from an RNN with convolutional layers?
            – Frans Rodenburg
            Aug 20 at 9:31










          • RCNN is indeed unfortunate, as it is also established as an acronym for networks for object detection. But to answer your question, I think the difference is if you apply the same conv. filter repeatedly on the same data (recurrent convnet) or if you pass the data through some conv. layers and then apply standard RNN (as LSTM units), so the recurrence actually does not involve convolutions. Both is possible I think, even though I am no expert on recurrent nets.
            – Jan Kukacka
            Aug 20 at 9:35
















          Interesting, but speaking of confusion, recurrent convolutional network seems a bit of an unfortunate name if I understand it correctly: Is RCNN it different from an RNN with convolutional layers?
          – Frans Rodenburg
          Aug 20 at 9:31




          Interesting, but speaking of confusion, recurrent convolutional network seems a bit of an unfortunate name if I understand it correctly: Is RCNN it different from an RNN with convolutional layers?
          – Frans Rodenburg
          Aug 20 at 9:31












          RCNN is indeed unfortunate, as it is also established as an acronym for networks for object detection. But to answer your question, I think the difference is if you apply the same conv. filter repeatedly on the same data (recurrent convnet) or if you pass the data through some conv. layers and then apply standard RNN (as LSTM units), so the recurrence actually does not involve convolutions. Both is possible I think, even though I am no expert on recurrent nets.
          – Jan Kukacka
          Aug 20 at 9:35





          RCNN is indeed unfortunate, as it is also established as an acronym for networks for object detection. But to answer your question, I think the difference is if you apply the same conv. filter repeatedly on the same data (recurrent convnet) or if you pass the data through some conv. layers and then apply standard RNN (as LSTM units), so the recurrence actually does not involve convolutions. Both is possible I think, even though I am no expert on recurrent nets.
          – Jan Kukacka
          Aug 20 at 9:35













          up vote
          1
          down vote













          That is entirely up to you. You can define only one set of kernel values and use it for all your layers or instead you could define a separate set of kernel values for each layer. Of course, it would be more prudent to define different sets of kernel values for each layer. This is because the kernel's job is to extract specific information from an input image. Different sets of kernel values at each layer will allow the network greater flexibility in deciding the best features to extract at each layer.






          share|cite|improve this answer






















          • If you define one kernel for each layer, you will still have to tie the weights to accomplish what @thegoodguy asked.
            – Frans Rodenburg
            Aug 20 at 5:42










          • @FransRodenburg what do you mean by 'tie the weights'? For a CNN, the 'kernel' is the 'weight matrix' and that is essentially what the network is trying to learn.
            – Shehryar Malik
            Aug 20 at 5:55










          • Even if you have the same kernel dimensions for each convolutional layer, you will still learn different weights. The OP asked whether the values are the same, which is only the case if you force them to be (i.e. tie the weights).
            – Frans Rodenburg
            Aug 20 at 6:00











          • So the conclusion is ,I have to learn kernel values for each convolution layer.
            – thegoodguy
            Aug 20 at 6:19










          • @FransRodenburg when I said kernel I meant a set of kernel values (a kernel is after all only a weight matrix, and two different weight matrices might have the same dimensions). Nevertheless, I am editing my answer to make this more apparent.
            – Shehryar Malik
            Aug 20 at 7:56














          up vote
          1
          down vote













          That is entirely up to you. You can define only one set of kernel values and use it for all your layers or instead you could define a separate set of kernel values for each layer. Of course, it would be more prudent to define different sets of kernel values for each layer. This is because the kernel's job is to extract specific information from an input image. Different sets of kernel values at each layer will allow the network greater flexibility in deciding the best features to extract at each layer.






          share|cite|improve this answer






















          • If you define one kernel for each layer, you will still have to tie the weights to accomplish what @thegoodguy asked.
            – Frans Rodenburg
            Aug 20 at 5:42










          • @FransRodenburg what do you mean by 'tie the weights'? For a CNN, the 'kernel' is the 'weight matrix' and that is essentially what the network is trying to learn.
            – Shehryar Malik
            Aug 20 at 5:55










          • Even if you have the same kernel dimensions for each convolutional layer, you will still learn different weights. The OP asked whether the values are the same, which is only the case if you force them to be (i.e. tie the weights).
            – Frans Rodenburg
            Aug 20 at 6:00











          • So the conclusion is ,I have to learn kernel values for each convolution layer.
            – thegoodguy
            Aug 20 at 6:19










          • @FransRodenburg when I said kernel I meant a set of kernel values (a kernel is after all only a weight matrix, and two different weight matrices might have the same dimensions). Nevertheless, I am editing my answer to make this more apparent.
            – Shehryar Malik
            Aug 20 at 7:56












          up vote
          1
          down vote










          up vote
          1
          down vote









          That is entirely up to you. You can define only one set of kernel values and use it for all your layers or instead you could define a separate set of kernel values for each layer. Of course, it would be more prudent to define different sets of kernel values for each layer. This is because the kernel's job is to extract specific information from an input image. Different sets of kernel values at each layer will allow the network greater flexibility in deciding the best features to extract at each layer.






          share|cite|improve this answer














          That is entirely up to you. You can define only one set of kernel values and use it for all your layers or instead you could define a separate set of kernel values for each layer. Of course, it would be more prudent to define different sets of kernel values for each layer. This is because the kernel's job is to extract specific information from an input image. Different sets of kernel values at each layer will allow the network greater flexibility in deciding the best features to extract at each layer.







          share|cite|improve this answer














          share|cite|improve this answer



          share|cite|improve this answer








          edited Aug 20 at 7:56

























          answered Aug 20 at 5:37









          Shehryar Malik

          212




          212











          • If you define one kernel for each layer, you will still have to tie the weights to accomplish what @thegoodguy asked.
            – Frans Rodenburg
            Aug 20 at 5:42










          • @FransRodenburg what do you mean by 'tie the weights'? For a CNN, the 'kernel' is the 'weight matrix' and that is essentially what the network is trying to learn.
            – Shehryar Malik
            Aug 20 at 5:55










          • Even if you have the same kernel dimensions for each convolutional layer, you will still learn different weights. The OP asked whether the values are the same, which is only the case if you force them to be (i.e. tie the weights).
            – Frans Rodenburg
            Aug 20 at 6:00











          • So the conclusion is ,I have to learn kernel values for each convolution layer.
            – thegoodguy
            Aug 20 at 6:19










          • @FransRodenburg when I said kernel I meant a set of kernel values (a kernel is after all only a weight matrix, and two different weight matrices might have the same dimensions). Nevertheless, I am editing my answer to make this more apparent.
            – Shehryar Malik
            Aug 20 at 7:56
















          • If you define one kernel for each layer, you will still have to tie the weights to accomplish what @thegoodguy asked.
            – Frans Rodenburg
            Aug 20 at 5:42










          • @FransRodenburg what do you mean by 'tie the weights'? For a CNN, the 'kernel' is the 'weight matrix' and that is essentially what the network is trying to learn.
            – Shehryar Malik
            Aug 20 at 5:55










          • Even if you have the same kernel dimensions for each convolutional layer, you will still learn different weights. The OP asked whether the values are the same, which is only the case if you force them to be (i.e. tie the weights).
            – Frans Rodenburg
            Aug 20 at 6:00











          • So the conclusion is ,I have to learn kernel values for each convolution layer.
            – thegoodguy
            Aug 20 at 6:19










          • @FransRodenburg when I said kernel I meant a set of kernel values (a kernel is after all only a weight matrix, and two different weight matrices might have the same dimensions). Nevertheless, I am editing my answer to make this more apparent.
            – Shehryar Malik
            Aug 20 at 7:56















          If you define one kernel for each layer, you will still have to tie the weights to accomplish what @thegoodguy asked.
          – Frans Rodenburg
          Aug 20 at 5:42




          If you define one kernel for each layer, you will still have to tie the weights to accomplish what @thegoodguy asked.
          – Frans Rodenburg
          Aug 20 at 5:42












          @FransRodenburg what do you mean by 'tie the weights'? For a CNN, the 'kernel' is the 'weight matrix' and that is essentially what the network is trying to learn.
          – Shehryar Malik
          Aug 20 at 5:55




          @FransRodenburg what do you mean by 'tie the weights'? For a CNN, the 'kernel' is the 'weight matrix' and that is essentially what the network is trying to learn.
          – Shehryar Malik
          Aug 20 at 5:55












          Even if you have the same kernel dimensions for each convolutional layer, you will still learn different weights. The OP asked whether the values are the same, which is only the case if you force them to be (i.e. tie the weights).
          – Frans Rodenburg
          Aug 20 at 6:00





          Even if you have the same kernel dimensions for each convolutional layer, you will still learn different weights. The OP asked whether the values are the same, which is only the case if you force them to be (i.e. tie the weights).
          – Frans Rodenburg
          Aug 20 at 6:00













          So the conclusion is ,I have to learn kernel values for each convolution layer.
          – thegoodguy
          Aug 20 at 6:19




          So the conclusion is ,I have to learn kernel values for each convolution layer.
          – thegoodguy
          Aug 20 at 6:19












          @FransRodenburg when I said kernel I meant a set of kernel values (a kernel is after all only a weight matrix, and two different weight matrices might have the same dimensions). Nevertheless, I am editing my answer to make this more apparent.
          – Shehryar Malik
          Aug 20 at 7:56




          @FransRodenburg when I said kernel I meant a set of kernel values (a kernel is after all only a weight matrix, and two different weight matrices might have the same dimensions). Nevertheless, I am editing my answer to make this more apparent.
          – Shehryar Malik
          Aug 20 at 7:56












           

          draft saved


          draft discarded


























           


          draft saved


          draft discarded














          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f362988%2fin-cnn-do-we-have-learn-kernel-values-at-every-convolution-layer%23new-answer', 'question_page');

          );

          Post as a guest













































































          這個網誌中的熱門文章

          How to combine Bézier curves to a surface?

          Why am i infinitely getting the same tweet with the Twitter Search API?

          Is there any way to eliminate the singular point to solve this integral by hand or by approximations?