Gradient Descent Optimization
Could anyone familiar with gradient descent optimization (G.D.O.) techniques help me with this? Some of the techniques applied in gradient descent optimization are:

1. Adam
2. AdaMax
3. Adagrad
4. Nesterov momentum
5. RMSprop
6. L1 and L2 regularization

My questions:

1. Which of these algorithms can be used together?
2. Could you share links to resources for further reading?
3. Can I apply them to linear regression? I have looked at scikit-learn's implementation, and its documentation does not mention them. (There is a small sketch of what I mean at the end of this question.)

Thanks in advance.
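To make questions 1 and 3 concrete, here is a minimal sketch of the kind of thing I mean: Adam combined with an L2 penalty, applied to ordinary linear regression in plain NumPy. Everything below (the synthetic data, hyperparameters, and variable names) is just my own illustration, not taken from scikit-learn:

```python
import numpy as np

# Synthetic linear-regression data: y = X @ w_true + noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=200)

# Adam hyperparameters (the defaults suggested in the Adam paper)
alpha, beta1, beta2, eps = 0.01, 0.9, 0.999, 1e-8
lam = 0.01  # L2 regularization strength (my arbitrary choice)

w = np.zeros(3)  # weights to learn
m = np.zeros(3)  # first-moment (mean of gradients) estimate
v = np.zeros(3)  # second-moment (mean of squared gradients) estimate

for t in range(1, 2001):
    # Gradient of mean squared error plus the L2 penalty term
    grad = 2 * X.T @ (X @ w - y) / len(y) + 2 * lam * w
    # Adam update: exponential moving averages with bias correction
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad**2
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    w -= alpha * m_hat / (np.sqrt(v_hat) + eps)

print(w)  # should end up close to w_true
```

If I understand correctly, the two combine naturally here because the L2 penalty only adds a term to the gradient, while Adam only decides how that gradient is applied to the weights.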
2 Answers
José Ailton Batista da Silva, thanks for the links.