It depends on what you mean by "easy" here, but if you define it as "throw a big enough network at it, and it'll figure it out on its own in training", the research cited shows that is the case.
The power you describe comes in part from network size (which is why NNs became more useful as we could size up as computing got more powerful/cheaper/more efficient). A smaller network means a less powerful model. The research showed smaller networks didn't work, but larger ones did. Seems like you'd agree?
The research showed that small networks did work, but they took manual intervention during training.
I think the whole point was that a small network is capable of solving this problem, but training from scratch without intervention only rarely produced a viable solution.
From the article:
- But in most cases the trained neural network did not find the optimal solution, and the performance of the network decreased even further as the number of steps increased. The result of training the neural network was largely affected by the chosen set of training examples as well as the initial parameters.
So a solution was clearly possible even in the smaller networks when trained from scratch, it just wasn't likely.
So I agree with your first point - the bigger networks did indeed "figure it out". But I don't agree at all with your second - the smaller networks weren't lacking the power to solve the problem, they were just unlikely to reach the right solution with the available training data and initial parameters.
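You can see that capacity-vs-trainability distinction in a toy experiment (my own sketch, not from the article): a two-hidden-unit MLP has enough capacity to represent XOR, yet plain gradient descent from a random init quite often fails to find that solution, while a wider net of the same depth almost always does. The architecture, learning rate, and step count here are all my assumptions:

```python
# Toy illustration (not from the article): capacity vs. trainability.
# A 2-hidden-unit sigmoid MLP *can* solve XOR, but training from scratch
# gets stuck for many random inits; a wider net rarely does.
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def train(hidden, seed, steps=5000, lr=1.0):
    """Train a 2-layer MLP on XOR; return final accuracy (0.0 .. 1.0)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 1, (2, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 1, (hidden, 1)); b2 = np.zeros(1)
    sig = lambda z: 1 / (1 + np.exp(-z))
    for _ in range(steps):
        h = sig(X @ W1 + b1)                    # hidden activations
        out = sig(h @ W2 + b2)                  # network output
        d_out = (out - y) * out * (1 - out)     # grad at output pre-activation
        d_h = (d_out @ W2.T) * h * (1 - h)      # grad at hidden pre-activation
        W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(0)
        W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(0)
    preds = sig(sig(X @ W1 + b1) @ W2 + b2) > 0.5
    return float(np.mean(preds == y))

for hidden in (2, 8):
    solved = sum(train(hidden, s) == 1.0 for s in range(50))
    print(f"{hidden} hidden units: solved XOR in {solved}/50 random inits")
```

The exact counts vary with the hyperparameters, but the pattern matches the article's finding: the small net solves it sometimes, so the capacity is there; the wider net just makes a good basin far easier to land in.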