What is the difference between projected gradient descent and ordinary gradient descent?

I just read about projected gradient descent, but I did not see the intuition for using the projected variant instead of ordinary gradient descent. Could you explain the reasoning and the situations in which projected gradient descent is preferable? What does the projection contribute?










Tags: optimization · numerical-optimization · gradient-descent






asked Nov 17 '13 at 22:47 by erogol · edited Jul 23 '16 at 3:01 by Rodrigo de Azevedo
2 Answers

















Accepted answer (score 43)










          At a basic level, projected gradient descent is just a more general method for solving a more general problem.



          Gradient descent minimizes a function by moving in the negative gradient direction at each step. There is no constraint on the variable.
$$
\text{Problem 1:} \quad \min_x f(x)
$$
$$
x_{k+1} = x_k - t_k \nabla f(x_k)
$$



          On the other hand, projected gradient descent minimizes a function subject to a constraint. At each step we move in the direction of the negative gradient, and then "project" onto the feasible set.



$$
\text{Problem 2:} \quad \min_x f(x) \quad \text{subject to } x \in C
$$

$$
y_{k+1} = x_k - t_k \nabla f(x_k) \\
x_{k+1} = \arg\min_{x \in C} \| y_{k+1} - x \|
$$
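To make the two-step update concrete, here is a minimal runnable sketch in Python/NumPy (an editorial illustration, not part of the original answer). It assumes a least-squares objective $f(x) = \|Ax - b\|^2$ with the nonnegative orthant as $C$, where the Euclidean projection is just elementwise clipping at zero; the function names, step size, and iteration count are arbitrary choices.

    import numpy as np

    # Minimal sketch of projected gradient descent (illustrative, not the answerer's code).
    def projected_gradient_descent(grad, project, x0, step=1e-3, iters=5000):
        x = x0
        for _ in range(iters):
            y = x - step * grad(x)  # gradient step: y_{k+1} = x_k - t_k * grad f(x_k)
            x = project(y)          # projection:   x_{k+1} = argmin_{x in C} ||y_{k+1} - x||
        return x

    # Example: minimize ||Ax - b||^2 subject to x >= 0 (C = nonnegative orthant).
    rng = np.random.default_rng(0)
    A, b = rng.standard_normal((20, 5)), rng.standard_normal(20)

    grad = lambda x: 2.0 * A.T @ (A @ x - b)  # gradient of the least-squares objective
    project = lambda y: np.maximum(y, 0.0)    # Euclidean projection: clip negatives to zero

    print(projected_gradient_descent(grad, project, np.zeros(5)))

Dropping the `project` call recovers ordinary gradient descent, which makes the relationship between the two methods explicit.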






answered Nov 19 '13 at 1:45 by p.s. · edited Nov 20 '13 at 0:48
Comments:

• pretty good answer thanks – erogol, Nov 19 '13 at 19:43

• sexy answer, sir – Enlightened One, Jul 23 '16 at 3:02

• simplicity is far under-appreciated. Thank you for this answer. – Kai, Jun 1 at 17:08

















Answer (score -1)













The previous answer is perfect. I would like to add one thing: imagine you have to optimize a convex function subject to a non-convex constraint. Gradient descent can still help, but at every iteration we must make sure the solution does not leave the constraint domain, so at each step we project the iterate back onto the constraint set, repeating until convergence is reached. In such situations this method is extremely useful.
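As a concrete illustration of this point (an editorial example, not the answerer's): the unit sphere $\{x : \|x\| = 1\}$ is a non-convex set whose Euclidean projection is simply normalization, so the projected iteration stays cheap even though the usual convex convergence guarantees no longer apply.

    import numpy as np

    # Illustrative example (not from the answer): projection onto a non-convex set.
    def project_sphere(y, eps=1e-12):
        """Euclidean projection onto the unit sphere {x : ||x|| = 1}."""
        n = np.linalg.norm(y)
        return y / n if n > eps else np.eye(y.size)[0]  # arbitrary point when y is ~0

    # Minimize the convex objective f(x) = ||x - c||^2 over the sphere.
    c = np.array([2.0, 1.0, 0.0])
    x = np.array([0.0, 1.0, 0.0])
    for _ in range(100):
        y = x - 0.1 * 2.0 * (x - c)  # gradient step with grad f(x) = 2 (x - c)
        x = project_sphere(y)        # keep the iterate feasible at every step
    print(x, c / np.linalg.norm(c))  # the iterate approaches c / ||c||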






answered Sep 3 at 10:30 by explorer · edited Sep 3 at 10:57 by amWhy





















