
Gradient Descent Optimization

Can anyone familiar with gradient descent optimization (G.D.O.) techniques help me with this? These are some of the techniques applied in G.D.O.:

1. Adam
2. AdaMax
3. Adagrad
4. Nesterov momentum
5. RMSprop
6. L1 and L2 regularization

My questions:

1. Which of these algorithms can be used together?
2. Please share links to resources for further reading.
3. Can I apply them to linear regression? The scikit-learn documentation does not seem to mention them (see the sketch below for what I mean).

Thanks in advance.
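For context on question 3: scikit-learn's LinearRegression uses a closed-form least-squares solver, and SGDRegressor only exposes plain SGD with optional L1/L2 penalties, not Adam. Below is a minimal sketch of what applying Adam to linear regression by hand would look like; the synthetic data and hyperparameter values are made up for illustration.

import numpy as np

# Synthetic 1-D linear data (illustrative): y = 3x + 2 + noise
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=200)
y = 3.0 * X + 2.0 + rng.normal(0, 0.1, size=200)

# Model parameters: weight w and bias b
w, b = 0.0, 0.0

# Adam state: first (m) and second (v) moment estimates for [w, b]
m = np.zeros(2)
v = np.zeros(2)
lr, beta1, beta2, eps = 0.01, 0.9, 0.999, 1e-8

for t in range(1, 2001):
    # Gradient of mean squared error w.r.t. w and b
    err = (w * X + b) - y
    grad = np.array([np.mean(err * X), np.mean(err)])

    # Adam update: decayed moment estimates with bias correction
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad**2
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    w -= lr * m_hat[0] / (np.sqrt(v_hat[0]) + eps)
    b -= lr * m_hat[1] / (np.sqrt(v_hat[1]) + eps)

print(w, b)  # should approach the true values 3 and 2

L2 regularization would combine with this by adding a penalty term to the loss (i.e. an extra lambda * w term in the gradient), which is one sense in which these techniques can be used together.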

18th Oct 2019, 3:53 AM
Dan Rhamba
2 Answers