
I would go with Hetzner: https://www.hetzner.com/dedicated-rootserver/ex51-ssd-gpu

A GTX 1080 for $100 a month. Granted, it is older, but it still works for DL. Say you run 10 experiments a month at ~20 hours each: that's $0.50/hour, and I don't think the cloud alternative is 3 times faster.

If you then do even more training, the effective price per hour drops further.
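A quick back-of-the-envelope check of that $0.50/hour figure (the usage numbers are just the hypothetical workload from above):

```python
# Effective hourly cost of a flat-rate server, given some assumed usage.
monthly_fee = 100.0          # Hetzner EX51-SSD-GPU, roughly USD/month
experiments_per_month = 10   # hypothetical workload
hours_per_experiment = 20

gpu_hours = experiments_per_month * hours_per_experiment   # 200 h
effective_rate = monthly_fee / gpu_hours                   # USD per GPU-hour
print(f"{effective_rate:.2f} $/h")  # 0.50 $/h
```

And since the fee is flat, every extra hour of training pushes that rate down further.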

//DISCLAIMER: I do not work for them, but I used it for DL in the past and it was definitely cheaper than GCP or AWS. If you have to run lots of experiments (for more than a year), go with your own hardware, but do not underestimate the convenience of >100 MByte/s if you download many big training sets.
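To put that >100 MByte/s in perspective, a rough download-time estimate (the dataset size is a made-up example):

```python
# How long a big training set takes to fetch at a given link speed.
dataset_gb = 150          # hypothetical dataset size, GB
speed_mb_per_s = 100      # the >100 MByte/s mentioned above

seconds = dataset_gb * 1000 / speed_mb_per_s
print(f"{seconds / 60:.0f} minutes")   # 150 GB at 100 MB/s -> 25 minutes
```

At typical home-connection speeds the same download takes hours, which is why the datacenter uplink matters even for a "cheap" box.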



For traditional FP32 workloads, the RTX 6000 will probably not be 3x faster. For workloads that can use the tensor cores (mixed-precision matrix multiply, basically), the RTX 6000 may be as much as 10-100x faster.
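Rough peak-throughput numbers (approximate figures from public spec sheets, so treat them as ballpark) illustrate why the answer depends so heavily on the workload:

```python
# Approximate peak throughput in TFLOPS, from vendor spec sheets.
gtx_1080_fp32 = 8.9          # GTX 1080, FP32
rtx_6000_fp32 = 16.3         # Quadro RTX 6000, FP32
rtx_6000_tensor = 130.5      # Quadro RTX 6000, FP16 tensor cores

print(f"FP32 speedup:   {rtx_6000_fp32 / gtx_1080_fp32:.1f}x")    # ~1.8x
print(f"Tensor speedup: {rtx_6000_tensor / gtx_1080_fp32:.0f}x")  # ~15x
```

Real-world speedups are lower than peak ratios, but the gap between the FP32 path and the tensor-core path is the point.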


Agree, had exactly the same experience.

It is not a server card; however, it is much faster than older AWS instances costing $1k/month (if you happen to be an AWS user and did not want to upgrade because of the price going up 3x). TBH, $100 per month is practically free. Since most researchers do not have $1k/month for a server, it is often cheaper to buy hardware and put Linux on it.

There are of course other options and Linode is kinda late to the party, but I am happy they made this move.


>There are of course other options and Linode is kinda late to the party, but I am happy they made this move.

Considering that their main competitors, DO, Vultr, and UpCloud, offer no GPU instances at all, I don't think they are late. If anything, they are first in their market segment.


vast.ai is even cheaper https://vast.ai/console/create/


How does data in/out work in practice with them? I see this 4 Tbit bandwidth figure, but do you happen to know what that translates to in practice, and what happens if you exceed it?

Also, the availability check currently shows a 5-day wait: “EX51-SSD-GPU for Falkenstein (FSN1): Due to very high demand for these server models, its current setup time is approximately up to 5 workdays.*” Or maybe there are other regions/DCs.


I have like 18 of their auction servers that are unmetered at 1 Gbps, and I really make that bandwidth sweat. I've honestly never had issues, and they've never tried to dreamhost me. I love it.


So far I have not reached that limit (I used it to train networks for image segmentation), so I had mostly ingress; I only downloaded large amounts to the machine, not from it (and inbound traffic is free, as with most providers).

But you can just ask them.

I have to say that not everything was 100% smooth: sometimes the proprietary NVidia driver crashed my Linux instance (you have to use the right CUDA and driver combination) and hung the system, so I had to hard-reboot it (which is supported via their admin console), which takes a few minutes. However, that's not their fault; from what I hear, the driver is a big pile of crap anyway, because NVidia is too embarrassed to post it to LKML.


I thought you are not allowed to put consumer graphics cards in a datacenter?

Or is that prohibited in the US only?


Technically speaking, it’s nVidia’s GeForce driver that restricts datacenter usage, not the card itself.

I haven’t deep-dived into it, but maybe using nouveau instead of the GeForce driver works around that restriction.

You are allowed to use the driver in data centres for cryptocurrency mining. The EULA’s datacenter restriction hasn’t really been challenged in court yet, and both sides would have an argument. NVidia are using the EULA to restrict an activity that a user would be allowed to do if the location of that activity were different (and that’s not even getting into type of industry, though that’s probably in the EULA too). On the other hand, it’s nVidia’s software; they are free to license it however they like.


You can't use nouveau for CUDA, which sort of negates the whole point of having a GPU in a server in the first place.


I've not deep-dived into nouveau for a while, so I wasn't sure if they had added CUDA support in the couple of years since I last played with it, which is why I only said "maybe".


Why?! Is NVIDIA the police and a sovereign state now?


It's a flat fee of $100/month, correct? What would be the best option if the amount of training you do is rather "occasional" (but simply using colab doesn't cut it anymore)?


I don't have specific experience with ML, but AWS spot pricing was by far the best GPU deal last time I checked. You can get something much more powerful than a GTX 1080 and finish your task more quickly. The downside is that your instance can be shut down at any time, with only a short warning to back up your progress, so it may or may not be suitable for what you're doing.
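One way to live with spot interruptions is to checkpoint regularly, so a terminated instance only costs you the work done since the last save. A minimal sketch (the file name, save interval, and toy "training" update are all made up for illustration):

```python
import os
import pickle
import tempfile

def train_with_checkpoints(state, steps, ckpt_path, save_every=100):
    """Toy training loop that persists its state periodically, so a
    spot termination only loses work since the last checkpoint."""
    # Resume from a previous run if a checkpoint already exists.
    if os.path.exists(ckpt_path):
        with open(ckpt_path, "rb") as f:
            state = pickle.load(f)
    for step in range(state["step"], steps):
        state["loss"] = 1.0 / (step + 1)   # stand-in for a real update
        state["step"] = step + 1
        if state["step"] % save_every == 0:
            with open(ckpt_path, "wb") as f:
                pickle.dump(state, f)      # good enough for a sketch
    return state

ckpt = os.path.join(tempfile.mkdtemp(), "spot_ckpt.pkl")
final = train_with_checkpoints({"step": 0, "loss": None}, 500, ckpt)
print(final["step"])  # 500
```

If the instance is killed mid-run, relaunching the same script picks up from the most recent checkpoint instead of step 0.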


Does the price actually depend on whether you are using the GPU, or simply on the instance you choose? Let's say you need to do some work that requires a GPU, so you spend 5 hours setting up an environment, doing some light programming/experiments in a Jupyter notebook, downloading datasets, and looking at the data. Then you train for an hour, spend one more hour looking at the data, drinking coffee, stuff like that. Then train again.

So you were using the environment for 10 hours, but only 3 of them in total used the GPU. Will you pay for 10 hours of GPU usage, or will only the 3 hours be expensive and the other 7 cheap?


If you use a GPU instance, you pay its full rate whether or not you use the actual GPU. If the GPU time is short relative to the other work you are doing (like data cleanup), it might make sense to do your non-GPU setup on a different, cheaper instance first.
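With some hypothetical hourly prices (made-up numbers, not real AWS/Linode rates), the split-instance approach looks like this:

```python
# Hypothetical hourly prices; real rates vary by provider, region, and type.
gpu_rate = 1.50   # USD/h for a GPU instance (made-up number)
cpu_rate = 0.10   # USD/h for a plain CPU instance (made-up number)

all_on_gpu = 10 * gpu_rate                  # whole session on the GPU box
split      = 3 * gpu_rate + 7 * cpu_rate    # GPU only for actual training
print(f"${all_on_gpu:.2f} vs ${split:.2f}")  # $15.00 vs $5.20
```

The trade-off is the hassle of moving your environment and data between the two instances, which is why people often just eat the idle GPU hours.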


I have one of these instances. It's awesome and I recommend it highly.



