- Very few high schools in America offer these classes. Even fewer people take them. The lie to yourself is not recognizing your bubble. You might think you're encouraging others, but you're doing the opposite. People who had those opportunities are likely not the ones that feel like ML is beyond their capabilities.
- While you can be successful in ML without math, this does not mean you should discourage its pursuit (just as you shouldn't make it a gatekeeping requirement; even Calc and LA aren't required!).
- Math is about a way of thinking and approaching problems. These skills generalize beyond the ability to solve mathematical functions.
- The mathematical knowledge compounds and will make your models better. This may be nonobvious, especially since, given your suggested background, you've lived with this knowledge for quite some time. But if you haven't gone into things like statistical theory (more than ISLR), probability, metric theory, optimization, and so on, it is quite difficult to see how these help you, in the same way it's hard to see what's on a shelf above you. It can also be difficult to explain how these help if you lack the language. But if you want to build good products (that work in the real world and not just in a demo), you'll find this knowledge invaluable. If you don't understand why, let this be a signal of your overconfidence. Models aren't worth shit if they don't generalize (I'm not talking about AGI, I'm talking about generalizing to customer data)[0].
[0] Being an ML researcher, I specifically have a horse in this race. The more half-assed scam products (e.g. Rabbit, Devin, etc) that get out there, the more the public turns to believing ML is another Silicon Valley hype scam. Hype is (unfortunately) essential and allows for bootstrapping, but the game is to replace the bubble before it pops. The more you put into that bubble the more money comes, but also the more ground you have to make up, and the less time you have to do so. Success is the bubble popping without anyone noticing, not how loud it pops.
Hey godelski, thanks for actually taking the time to clarify this; it's always great to be taken seriously. I don't really have much to add, but I would love to see your energy go into educating the people who are commenting here.
Thanks. I am passionate haha. I do like to educate, but it is often difficult to do on these forums as I do not know people's backgrounds. If you throw material that is too high level at them, I find it does more harm than good. And while I know many here took things like calculus, we also need to recognize that skills degrade with time (though they can be regained more easily).
I do understand that it is difficult to find the math pathways in ML. What I do suggest is focusing on generative models[0], but the important part is to keep asking "why?" and "what does this actually mean?" I think it is too easy to get stuck in knowing the answer and accepting it without understanding that there is no "right" answer, only less wrong answers. The path gets clearer if this attitude is adopted. This is where I've found teaching highly beneficial[1], as you face the basics and you'll find many questions that are easy to dismiss. To be a good teacher you have to take dumb questions seriously to determine if they're actually dumb or "dumb"[2]. Things you probably asked when learning: "how many layers?", "how many neurons per layer?", "is the optimization space actually smooth like the 2D descent graphs we saw?", "does data lie on a manifold?", "what is a manifold?", "does data always lie on a lower dimensional manifold?", "is data always a distribution?", "is my data actually representative of my goals?", "what does this measurement mean?", "what does this measurement not tell me?", and so on. These are all extremely important questions that are almost universally ignored, but that all appear to have simple answers.
The reason I do like category theory (Bartosz also has video lectures for those) is that it helps connect the many different disciplines of math that are needed to answer some of these. Seeing the generalizations of things like surjective (epimorphic) and injective (monomorphic) functions plays a role in answering the layer and neuron questions. It was the way I could start understanding how field theory wasn't just cool but impractical math.
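One guess at how injective/surjective connects to the "how many neurons per layer?" question, as a minimal sketch: a bias-free linear layer that narrows cannot be injective (distinct inputs collide), and one that widens cannot be surjective (some outputs are unreachable). The `linear` helper and the weight values are made up for illustration, not taken from any model or from the comment above.

```python
def linear(weights, x):
    """Apply a bias-free linear layer given as a list of weight rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

# Narrowing layer, 2 inputs -> 1 output: cannot be injective.
narrow = [[1.0, 1.0]]
out_a = linear(narrow, [1.0, 0.0])
out_b = linear(narrow, [0.0, 1.0])
# Two different inputs collapse to the same output, so information is lost.
collision = out_a == out_b

# Widening layer, 1 input -> 2 outputs: cannot be surjective.
wide = [[1.0], [2.0]]
outputs = [linear(wide, [t]) for t in (-1.0, 0.5, 3.0)]
# Every output sits on the line out[1] == 2 * out[0], so a target such as
# [1.0, 0.0] is never produced, whatever the input is.
on_line = all(out[1] == 2 * out[0] for out in outputs)

print(collision, on_line)
```

Nonlinearities and stacking change the details, but the same counting argument is one reason layer widths are not an arbitrary choice.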
But for ML, I think there's a hard gap, and I'm not sure of a good resource that fills it (I'm working on one myself). There are lots of basics (blogs like "the math behind transformers" that show the equations and little more) and there's also plenty at a high level by experts in cat theory, set theory, algebraic geometry, or others. The former aren't very useful, and the latter can be easily found if you have the requisite knowledge but are impenetrable if you don't.
But with diffusion models and score matching being all the rage now, I highly suggest reading into Aapo Hyvärinen's[3] work. At lower levels I suggest Gelman's book and/or McElreath's. Needham has impressive illustrations and writes so well that you can only try to emulate him; he's at a different level. I found Shao's Mathematical Statistics greatly helpful, but it is not easy to parse. Gallier and Quaintance are also worth looking into. But if you need something on the easier side, Tomczak is your friend. On the YouTube side I'll recommend some that are more easily missed: mathemaniac, EpsilonDeltaMain, jHan, ron-math, and Mutual_Information. You should find more from these too. Also search the #some{,2,3} tags, there are a lot of hidden gems. There's also a Cats4AI group btw, and many of them have now created a startup.
[0] Realistically all models are generative. The definition is ill-defined and you can even demonstrate that classifiers are EBMs. https://arxiv.org/abs/1912.03263
[1] I'm finishing a PhD program, so I do teach an ML course
[2] In either case you have to answer nicely. But it is easy to trick yourself into thinking something is simple when it is not.
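To make footnote [0] a bit more concrete, here is a minimal sketch of the reinterpretation described in the linked paper, as far as I can tell: the logits f(x) of an ordinary softmax classifier define an unnormalized joint density exp(f(x)[y]), and summing over classes gives an energy for the input itself, E(x) = -logsumexp_y f(x)[y]. The logit values and function names are made up for illustration; nothing here comes from a trained model.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def class_probs(logits):
    """The usual softmax classifier output p(y | x)."""
    lse = logsumexp(logits)
    return [math.exp(f - lse) for f in logits]

def energy(logits):
    """Energy of the input under the EBM reading: E(x) = -logsumexp_y f(x)[y]."""
    return -logsumexp(logits)

# Hypothetical logits f(x) for two inputs from some 3-class classifier.
logits_a = [2.0, 0.5, -1.0]
logits_b = [-3.0, -2.5, -4.0]

# The unnormalized joint exp(f(x)[y]) factorizes as exp(-E(x)) * p(y | x).
probs_a = class_probs(logits_a)
factorizes = all(
    math.isclose(math.exp(f), math.exp(-energy(logits_a)) * p)
    for f, p in zip(logits_a, probs_a)
)

# Shifting every logit by a constant leaves p(y | x) untouched but changes
# E(x): the classifier has a degree of freedom the labels never pin down.
shifted = [f + 5.0 for f in logits_a]
same_probs = all(
    math.isclose(p, q) for p, q in zip(probs_a, class_probs(shifted))
)
energy_drop = energy(logits_a) - energy(shifted)

print(factorizes, same_probs, energy_drop)
print(energy(logits_a), energy(logits_b))
```

Lower energy corresponds to higher unnormalized density, so under this reading input `a` is treated as more plausible than input `b`, even though neither number is a normalized probability.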
/ quote /
But if you haven't gone into things like statistical theory (more than ISLR), probability, metric theory, optimization, and so on, it is quite difficult to see how these help you in the same way it's hard to see what's on a shelf above you. It can also be difficult to explain how these help if you lack the language.
That is actually correct, I'm missing those steps. If you can recommend anything besides d2l.ai (which I haven't finished yet), let me know! Enjoy your summer and train those smiling-muscles every once in a while.