
The architecture doesn't keep yielding better results, so Jevons paradox doesn't apply.


But surely it can be scaled up? Or does the compression make the approach good only for small models? (I haven't read the DeepSeek papers; I can't allocate time to them.)


Anyhow, if you can deliver more with less, this is hugely good news for the AI industry.

After some readjustment, we can expect AI companies to start using the new method to deliver more. Science fiction might happen sooner than expected.

Buy the dip.



