
It's more about hardware: these models were trained on TPUs, while GPT-NeoX is being trained on GPUs graciously provided by CoreWeave.


Any idea what the required GPU time would cost if it weren't donated? Will GPT-3 soon just be a commodity?


Our current estimate is that it requires between 2,000 and 4,000 V100-months.
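
For rough intuition, here's a back-of-envelope conversion of that range into dollars. The hours-per-month and ~$1.50/hr V100 rate below are illustrative assumptions, not quoted prices; actual cloud and reserved rates vary widely:

  # Back-of-envelope cost for 2,000-4,000 V100-months.
  # The hourly rate is an assumed illustrative figure, not a quote.
  HOURS_PER_MONTH = 730          # ~365 * 24 / 12
  V100_USD_PER_HOUR = 1.50       # assumed cloud on-demand rate

  for v100_months in (2000, 4000):
      cost = v100_months * HOURS_PER_MONTH * V100_USD_PER_HOUR
      print(f"{v100_months} V100-months -> ~${cost / 1e6:.1f}M")

  # 2000 V100-months -> ~$2.2M
  # 4000 V100-months -> ~$4.4M

At those assumed rates the range works out to roughly $2-4M, in the same ballpark as the figures quoted elsewhere in this thread.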


With training improvements such as DeepSpeed, the GPU cost will likely be substantially lower than it was when OpenAI trained GPT-3. Still not free, though.
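
For context, much of DeepSpeed's saving comes from ZeRO, which partitions optimizer state and gradients (and, at higher stages, parameters) across GPUs instead of replicating them. A minimal sketch of enabling it; the toy model and config values are assumptions for illustration, not the actual GPT-NeoX setup:

  import torch
  import deepspeed

  # Toy stand-in model; a real run would use a transformer LM.
  model = torch.nn.Linear(1024, 1024)

  # Minimal config: fp16 training plus ZeRO stage 2, which shards
  # optimizer state and gradients across data-parallel workers.
  # Values here are illustrative assumptions.
  ds_config = {
      "train_batch_size": 32,
      "fp16": {"enabled": True},
      "zero_optimization": {"stage": 2},
      "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
  }

  # Run under the `deepspeed` launcher; initialize returns a
  # wrapped engine that handles the sharding transparently.
  engine, optimizer, _, _ = deepspeed.initialize(
      model=model,
      model_parameters=model.parameters(),
      config=ds_config,
  )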

The hard part with GPT-3 is that it's big enough to be genuinely difficult to deploy.
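
Rough memory arithmetic makes the point. The 175B parameter count is from the GPT-3 paper; the bytes-per-parameter values are just the standard fp16/fp32 sizes:

  # Weight memory for a 175B-parameter model at common precisions.
  PARAMS = 175e9

  for name, bytes_per_param in (("fp16", 2), ("fp32", 4)):
      gb = PARAMS * bytes_per_param / 1e9
      print(f"{name}: ~{gb:.0f} GB of weights alone")

  # fp16: ~350 GB of weights alone
  # fp32: ~700 GB of weights alone
  # Far beyond a single 32 GB V100, so even inference needs
  # model parallelism across roughly a dozen GPUs for weights,
  # before counting activations and KV caches.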


The number thrown around for GPT-3 is $4.6 million, but I'm not sure where that figure originates.


It was a number tossed around by a GPU hosting provider, based on their own costs: https://lambdalabs.com/blog/demystifying-gpt-3/

The reality is that GPT-3 was likely "free" to train on Azure, as Microsoft has provided a lot of resources to OpenAI.


If this is true, I wonder what sort of exchange of social capital is going on instead.


~$4M per full training run, give or take.



