+ 4
Why does ReLU train faster than sigmoid?
2 Answers
+ 2
Efficiency: ReLU is cheaper to compute than the sigmoid function, and so is its derivative. ReLU needs only a comparison, while sigmoid needs an exponential and a division. This makes a noticeable difference to training and inference time for neural networks: only a constant factor, but constants can matter.
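To make the cost difference concrete, here is a minimal pure-Python sketch of both activations and their derivatives. The function names are my own, and taking the ReLU derivative to be 0 at exactly x = 0 is a common convention that I am assuming here, not something the math forces.

```python
import math


def relu(x: float) -> float:
    # A single comparison; no transcendental function involved.
    return x if x > 0.0 else 0.0


def relu_grad(x: float) -> float:
    # 1 for positive inputs, 0 otherwise (0 at x == 0 by assumed convention).
    return 1.0 if x > 0.0 else 0.0


def sigmoid(x: float) -> float:
    # Needs an exponential and a division. The two branches keep the
    # argument of exp() non-positive so large |x| cannot overflow.
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sigmoid_grad(x: float) -> float:
    # sigma'(x) = sigma(x) * (1 - sigma(x)); reuses the forward value,
    # but still costs the exponential plus a multiply.
    s = sigmoid(x)
    return s * (1.0 - s)
```

If you want to see the constant factor on your own machine, timing these with the standard `timeit` module is one way to do it; the exact ratio will depend on hardware and on whether you use a vectorized library instead of scalar Python.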
0
SITHU Nyein, does it also have to do with the fact that ReLU has less noise (it completely deactivates neurons whose input is below zero, unlike sigmoid)?