
We have seen that simple linear classification schemes like logistic regression, \[p (y | x^T w) = \frac{e^{x^Tw_y}}{1+e^{x^Tw_y}},\] can work well but are limited, since their decision boundaries are linear in the input features. Alternative approaches like binning work well only for low input dimensionality. That’s where neural networks fill the gap. Neural networks allow us to learn not only the weights but also the useful features. In practice, they are what we call universal approximators, i.e. approximators to any function in a bounded continuous d...
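
As a small illustration of the logistic regression formula above, here is a minimal Python sketch that evaluates \(e^{x^Tw_y}/(1+e^{x^Tw_y})\) for one input. The function name `logistic_probability` and the example vectors are assumptions made for this illustration; they do not come from the original text.

```python
import math

def logistic_probability(x, w_y):
    """Evaluate e^{x^T w_y} / (1 + e^{x^T w_y}) for one input vector x.

    x and w_y are equal-length sequences of floats (hypothetical inputs).
    """
    # Linear score x^T w_y: this is the only place the input features enter.
    z = sum(x_i * w_i for x_i, w_i in zip(x, w_y))
    # Two algebraically equivalent forms, chosen to avoid overflow in exp().
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Example: x^T w_y = 1.0 * 0.5 + 2.0 * (-0.25) = 0, so the probability is 0.5.
x = [1.0, 2.0]
w_y = [0.5, -0.25]
p = logistic_probability(x, w_y)
```

The output depends on `x` only through the single linear score `z`, which is why the model cannot build new features on its own.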