In a CNN, do we have to learn kernel values at every convolutional layer?

Score: 3

I'm new to machine learning, and one thing I don't understand about CNNs is whether we have to learn the kernel values at every convolutional layer, or whether we learn a single set of kernel values and reuse it at every convolutional layer.

edited Aug 20 at 5:44 by Frans Rodenburg
asked Aug 20 at 5:21 by thegoodguy

2 Answers

Score: 6 (accepted)

The answer by @Shehryar Malik is correct (+1), but it may be a bit confusing, especially for people new to convolutional neural networks.

In the usual CNN setup, each layer has its own set of convolution kernels, and each set has to be learned. This is easy to see in the following (famous) image:

[Image not preserved: three blocks visualizing kernels learned in the first layer (left) and in deeper layers (center, right)]

The left block shows kernels learned in the first layer. The central and right blocks show kernels learned in deeper layers¹. This is a very important feature of convolutional neural networks: at different layers, the network learns to detect features at different levels of abstraction. Therefore, the kernels are different.

In theory, nothing prevents you from using the same kernels at each layer (provided the layers have matching input and output channel counts). In fact, this architecture is called a recurrent convolutional neural network.
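As a rough illustration of the difference between per-layer kernels and one kernel reused at every layer, here is a minimal pure-Python sketch. It is not taken from either answer: the helper names (`conv2d_valid`, `forward`, `count_parameters`), the 7×7 random input, and the single-channel 3×3 kernels are all assumptions chosen to keep the example small, and the kernels are random rather than learned.

```python
import random


def conv2d_valid(image, kernel):
    """Single-channel 'valid' cross-correlation (what most deep-learning
    libraries call convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    return [[max(0.0, v) for v in row] for row in feature_map]


def random_kernel(rng, size=3):
    # Hypothetical initialisation; a real network would learn these values.
    return [[rng.uniform(-1.0, 1.0) for _ in range(size)] for _ in range(size)]


def forward(image, kernels):
    """Apply one convolution + ReLU per entry in `kernels`, in order."""
    x = image
    for kernel in kernels:
        x = relu(conv2d_valid(x, kernel))
    return x


def count_parameters(kernels):
    """Count weights, counting a kernel reused in several layers only once."""
    unique = {id(k): k for k in kernels}
    return sum(len(k) * len(k[0]) for k in unique.values())


rng = random.Random(0)
image = [[rng.random() for _ in range(7)] for _ in range(7)]

# Usual CNN: every layer owns its own kernel.
untied = [random_kernel(rng), random_kernel(rng)]

# Tied variant: the very same kernel object is used at every layer.
shared = random_kernel(rng)
tied = [shared, shared]

out_untied = forward(image, untied)
out_tied = forward(image, tied)

print(len(out_untied), len(out_untied[0]))               # 3 3
print(count_parameters(untied), count_parameters(tied))  # 18 9
```

In the untied case the two layers hold 18 weights between them. In the tied case only 9 weights exist, because both layers point at the same kernel.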




¹ More precisely, they show what kind of image features these kernels respond to, since visualizing a kernel with shape $3 \times 3 \times 256$ is not very easy, intuitive, or useful.

• Interesting, but speaking of confusion, "recurrent convolutional network" seems like a bit of an unfortunate name, if I understand it correctly: is an RCNN different from an RNN with convolutional layers?
  – Frans Rodenburg, Aug 20 at 9:31

• RCNN is indeed unfortunate, as it is also an established acronym for object-detection networks. But to answer your question, I think the difference is whether you apply the same conv. filter repeatedly to the same data (recurrent convnet), or pass the data through some conv. layers and then apply a standard RNN (such as LSTM units), in which case the recurrence does not actually involve convolutions. Both are possible, I think, even though I am no expert on recurrent nets.
  – Jan Kukacka, Aug 20 at 9:35

Score: 1

That is entirely up to you. You can define only one set of kernel values and use it for all your layers, or you can define a separate set of kernel values for each layer. It is usually more prudent to define a different set of kernel values for each layer, because a kernel's job is to extract specific information from its input. Different sets of kernel values at each layer give the network greater flexibility in deciding the best features to extract at each layer.

• If you define one kernel for each layer, you will still have to tie the weights to accomplish what @thegoodguy asked.
  – Frans Rodenburg, Aug 20 at 5:42

• @FransRodenburg What do you mean by "tie the weights"? For a CNN, the "kernel" is the "weight matrix", and that is essentially what the network is trying to learn.
  – Shehryar Malik, Aug 20 at 5:55

• Even if you have the same kernel dimensions for each convolutional layer, you will still learn different weights. The OP asked whether the values are the same, which is only the case if you force them to be (i.e. tie the weights).
  – Frans Rodenburg, Aug 20 at 6:00

• So the conclusion is that I have to learn kernel values for each convolutional layer.
  – thegoodguy, Aug 20 at 6:19

• @FransRodenburg When I said "kernel", I meant a set of kernel values (a kernel is, after all, only a weight matrix, and two different weight matrices might have the same dimensions). Nevertheless, I am editing my answer to make this more apparent.
  – Shehryar Malik, Aug 20 at 7:56
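The point about tied weights raised in the comments above can be sketched with a toy example. This is only an illustration under stated assumptions, not anyone's actual training code: it uses a hypothetical two-layer 1D network (convolution, ReLU, convolution), made-up input and target values, a squared-error loss, and finite-difference gradients instead of backpropagation. Two untied layers start from identical kernel values, yet they receive different gradients and so diverge after one update. With a tied kernel there is a single parameter vector, whose gradient is the sum of the two per-layer gradients.

```python
def conv1d_valid(signal, kernel):
    """1D 'valid' cross-correlation."""
    n = len(signal) - len(kernel) + 1
    return [sum(signal[i + a] * kernel[a] for a in range(len(kernel)))
            for i in range(n)]


def relu(values):
    return [max(0.0, v) for v in values]


def loss(k1, k2, x, target):
    """conv(k1) -> ReLU -> conv(k2), followed by a squared-error loss."""
    hidden = relu(conv1d_valid(x, k1))
    y = conv1d_valid(hidden, k2)
    return sum((yi - ti) ** 2 for yi, ti in zip(y, target))


def numerical_grad(f, k, eps=1e-6):
    """Central finite-difference gradient of f with respect to the list k."""
    grad = []
    for i in range(len(k)):
        up = k[:i] + [k[i] + eps] + k[i + 1:]
        down = k[:i] + [k[i] - eps] + k[i + 1:]
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


# Made-up data and hyperparameters, chosen only for illustration.
x = [1.0, 3.0, 0.0, -1.0]
target = [1.0, 0.0]
k_init = [0.5, -0.25]
lr = 0.1

# Untied: both layers start from the same values but are separate parameters.
g1 = numerical_grad(lambda k: loss(k, k_init, x, target), k_init)
g2 = numerical_grad(lambda k: loss(k_init, k, x, target), k_init)
k1 = [w - lr * g for w, g in zip(k_init, g1)]
k2 = [w - lr * g for w, g in zip(k_init, g2)]

# Tied: one parameter vector is used in both layers.
g_tied = numerical_grad(lambda k: loss(k, k, x, target), k_init)
k_tied = [w - lr * g for w, g in zip(k_init, g_tied)]

print("layer 1 after one step:", k1)
print("layer 2 after one step:", k2)
print("tied kernel after one step:", k_tied)
```

So identical kernel dimensions, or even identical starting values, do not make the learned values equal. They stay equal only if the layers are forced to share one set of parameters.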










Accepted answer: answered Aug 20 at 8:47 by Jan Kukacka, edited Aug 20 at 9:22

Second answer: answered Aug 20 at 5:37 by Shehryar Malik, edited Aug 20 at 7:56