The architecture doesn't keep yielding better results, Jevons paradox doesn't a...

impossiblefork · on Jan 27, 2025

But surely it can be scaled up? Or does the compression make the approach good only for small models? (I haven't read the DeepSeek papers; I can't allocate time to them.)

joak · on Jan 27, 2025

Anyhow, if you can deliver more with less, this is hugely good news for the AI industry.

After some readjustment, we can expect AI companies to start using the new method to deliver more. Science fiction might happen sooner than expected.

Buy the dip.