By now, you've likely heard about different "regimes" of learning in wide neural nets --- in particular, the contrast between the "(neural tangent) kernel" regime, which is said not to learn features, and the "feature learning" or "mu-parameterized" regime, in which features are learned but analysis is much harder. This is a foundational new idea in deep learning theory (as evidenced by its centrality to Sho + Guillaume's excellent talks), and it's becoming clear that understanding it is often important for both theory + practice.
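To give a concrete flavor of the "features move vs. features don't move" contrast in advance, here's a minimal numpy sketch (the toy task, the scalings, and the train_and_measure helper are my own illustrative choices, not anyone's official code). It trains a 2-layer net f(x) = n^{-gamma} a^T tanh(Wx), where gamma = 1/2 gives the NTK scaling and gamma = 1 gives the mu / mean-field scaling (the two coincide at depth 2). With learning rates chosen so the output moves at order one per step, the first-layer weights should move like 1/sqrt(n) in the kernel regime but stay order one in the feature-learning regime:

```python
# A minimal, hypothetical numpy sketch (toy task and helper are invented
# for illustration).
import numpy as np

def train_and_measure(n, gamma, lr, steps=50, d=10, seed=0):
    """Train f(x) = n**-gamma * a @ tanh(W x) by gradient descent on MSE
    and return the average relative movement of the first-layer features."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(32, d)) / np.sqrt(d)   # toy inputs, norm ~ 1
    y = np.sin(X @ rng.normal(size=d))          # toy targets
    W = rng.normal(size=(n, d))                 # first layer (the "features")
    a = rng.normal(size=n)                      # second layer
    W0 = W.copy()
    for _ in range(steps):
        h = np.tanh(X @ W.T)                    # (batch, n) hidden features
        err = h @ a / n**gamma - y              # residual f(x) - y
        # gradient descent on the mean squared error
        a -= lr * (h.T @ err) / (n**gamma * len(y))
        W -= lr * ((err[:, None] * (1 - h**2) * a).T @ X) / (n**gamma * len(y))
    return np.mean(np.linalg.norm(W - W0, axis=1) / np.linalg.norm(W0, axis=1))

for n in [256, 1024, 4096]:
    ntk = train_and_measure(n, gamma=0.5, lr=1.0)      # NTK: lr = Theta(1)
    mup = train_and_measure(n, gamma=1.0, lr=1.0 * n)  # muP: lr = Theta(n)
    # expect the NTK column to shrink roughly like 1/sqrt(n); muP stays O(1)
    print(f"n={n:5d}   NTK feature movement: {ntk:.4f}   muP: {mup:.4f}")
```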
For that reason, I figured it'd be useful to give a tutorial-style talk explaining what this is all about. It'll be very informal and aimed at non-experts. I'm not going to try to prove much, just explain what things are and give an intuitive flavor of why this all matters, how we should think about it, etc. Let's meet at 3pm tomorrow in the auditorium.