
I would go with Hetzner: https://www.hetzner.com/dedicated-rootserver/ex51-ssd-gpu

A GTX 1080 for $100 a month. Granted, it is older, but it still works for DL. Say you run 10 experiments a month at ~20 hours each: that's $0.50/hour, and I don't think the cloud alternative is 3 times faster.

If you then do even more training, the effective price per hour drops further.
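A quick back-of-the-envelope check of that $0.50/hour figure (the usage numbers are just the hypothetical workload from above):

```python
# Effective hourly cost of a flat-rate server, given some assumed usage.
monthly_fee = 100.0          # Hetzner EX51-SSD-GPU, roughly USD/month
experiments_per_month = 10   # hypothetical workload
hours_per_experiment = 20

gpu_hours = experiments_per_month * hours_per_experiment   # 200 h
effective_rate = monthly_fee / gpu_hours                   # USD per GPU-hour
print(f"{effective_rate:.2f} $/h")  # 0.50 $/h
```

And since the fee is flat, every extra hour of training pushes that rate down further.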

//DISCLAIMER: I do not work for them, but I used it for DL in the past and it was definitely cheaper than GCP or AWS. If you have to run lots of experiments (for more than a year), go with your own hardware, but do not underestimate the convenience of >100 MByte/s if you download many big training sets.
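To put that >100 MByte/s in perspective, a rough download-time estimate (the dataset size is a made-up example):

```python
# How long a big training set takes to fetch at a given link speed.
dataset_gb = 150          # hypothetical dataset size, GB
speed_mb_per_s = 100      # the >100 MByte/s mentioned above

seconds = dataset_gb * 1000 / speed_mb_per_s
print(f"{seconds / 60:.0f} minutes")   # 150 GB at 100 MB/s -> 25 minutes
```

At typical home-connection speeds the same download takes hours, which is why the datacenter uplink matters even for a "cheap" box.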



For traditional FP32 workloads, the RTX 6000 will probably not be 3x faster. For workloads that can use the tensor cores (mixed-precision matrix multiply, basically), the RTX 6000 may be as much as 10-100x faster.
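Rough peak-throughput numbers (approximate figures from public spec sheets, so treat them as ballpark) illustrate why the answer depends so heavily on the workload:

```python
# Approximate peak throughput in TFLOPS, from vendor spec sheets.
gtx_1080_fp32 = 8.9          # GTX 1080, FP32
rtx_6000_fp32 = 16.3         # Quadro RTX 6000, FP32
rtx_6000_tensor = 130.5      # Quadro RTX 6000, FP16 tensor cores

print(f"FP32 speedup:   {rtx_6000_fp32 / gtx_1080_fp32:.1f}x")    # ~1.8x
print(f"Tensor speedup: {rtx_6000_tensor / gtx_1080_fp32:.0f}x")  # ~15x
```

Real-world speedups are lower than peak ratios, but the gap between the FP32 path and the tensor-core path is the point.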


Agree, had exactly the same experience.

It is not a server card; however, it is much faster than older AWS instances costing $1k/month (if you happen to be an AWS user and did not want to upgrade because of the price going up 3x). TBH, $100 per month is practically free. Since most researchers do not have $1k/month for a server, it is often cheaper to buy hardware and put Linux on it.

There are of course other options and Linode is kinda late to the party, but I am happy they made this move.


>There are of course other options and Linode is kinda late to the party, but I am happy they made this move.

Considering that their main competitors, DO, Vultr, and UpCloud, offer no GPU instances at all, I don't think they are late. If anything, they are first in their market segment.


vast.ai is even cheaper https://vast.ai/console/create/


How does data in/out work in practice with them? I see this 4 Tbit bandwidth figure, but do you happen to know what that translates to in practice, and what happens if you exceed it?

Also, the availability check currently shows a 5-day wait: “EX51-SSD-GPU for Falkenstein (FSN1): Due to very high demand for these server models, its current setup time is approximately up to 5 workdays.*” Or maybe there are other regions/DCs.


I have like 18 of their auction servers that are unmetered at 1 Gbps, and I really make that bandwidth sweat. I've honestly never had issues, and they've never tried to dreamhost me. I love it.


So far I have not reached that limit (I used it to train networks for image segmentation), so I had mostly ingress; I only downloaded large amounts to the machine, not from it (and inbound traffic is free, as with most providers).

But you can just ask them.

I have to say that not everything was 100% smooth: sometimes the proprietary NVidia driver crashed my Linux instance (you have to use the right CUDA and driver combination) and hung the system, so I had to hard-reboot it (which is supported via their admin console), which takes a few minutes. However, that's not their fault; from what I hear, the driver is a big pile of crap anyway, because NVidia is too embarrassed to post it to LKML.


I thought you are not allowed to put consumer graphics cards in a datacenter?

Or is that prohibited in the US only?


Technically speaking, it’s nVidia’s GeForce driver that restricts datacenter usage, not the card itself.

I haven’t deep-dived into it, but maybe using nouveau instead of the GeForce driver works around that restriction.

You are allowed to use the driver in data centres for cryptocurrency mining. The EULA’s datacenter restriction hasn’t really been challenged in court yet, and both sides would have an argument. NVidia are using the EULA to restrict an activity that a user would be allowed to do if the location of that activity were different (and that’s not even getting into type of industry, though that’s probably in the EULA too). On the other hand, it’s nVidia’s software; they are free to license it however they like.


You can't use nouveau for CUDA, which sort of negates the whole point of having a GPU in a server in the first place.


I've not deep-dived into nouveau for a while, so I wasn't sure if they had added CUDA support in the couple of years since I last played with it, which is why I only said "maybe".


Why?! Is NVIDIA the police and a sovereign state now?


It's a flat fee of $100/month, correct? What would be the best option if the amount of training you do is rather "occasional" (but simply using colab doesn't cut it anymore)?


I don't have specific experience with ML, but AWS spot pricing was by far the best GPU deal last time I checked. You can get something much more powerful than a GTX 1080 and finish your task more quickly. The downside is that your instance can be shut down at any time, with only a short warning to back up your progress, so it may or may not be suitable for what you're doing.
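One way to live with spot interruptions is to checkpoint regularly, so a terminated instance only costs you the work done since the last save. A minimal sketch (the file name, save interval, and toy "training" update are all made up for illustration):

```python
import os
import pickle
import tempfile

def train_with_checkpoints(state, steps, ckpt_path, save_every=100):
    """Toy training loop that persists its state periodically, so a
    spot termination only loses work since the last checkpoint."""
    # Resume from a previous run if a checkpoint already exists.
    if os.path.exists(ckpt_path):
        with open(ckpt_path, "rb") as f:
            state = pickle.load(f)
    for step in range(state["step"], steps):
        state["loss"] = 1.0 / (step + 1)   # stand-in for a real update
        state["step"] = step + 1
        if state["step"] % save_every == 0:
            with open(ckpt_path, "wb") as f:
                pickle.dump(state, f)      # good enough for a sketch
    return state

ckpt = os.path.join(tempfile.mkdtemp(), "spot_ckpt.pkl")
final = train_with_checkpoints({"step": 0, "loss": None}, 500, ckpt)
print(final["step"])  # 500
```

If the instance is killed mid-run, relaunching the same script picks up from the most recent checkpoint instead of step 0.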


Does the price actually depend on whether you are using the GPU, or simply on the instance you choose? Let's say you need to do some work that requires a GPU, so you spend 5 hours setting up an environment, doing some light programming/experiments in a Jupyter notebook, downloading datasets, and looking at the data. Then you train for an hour, spend one more hour looking at the data, drinking coffee, stuff like that. Then train again.

So you were using the environment for 10 hours, but only 3 of them in total used the GPU. Will you pay for 10 hours of GPU usage, or will only the 3 hours be expensive and the other 7 cheap?


If you use a GPU instance, you pay its full rate whether or not you use the actual GPU. If the GPU time is short relative to the other work you are doing (like data cleanup), it might make sense to do your non-GPU setup on a different, cheaper instance first.
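With some hypothetical hourly prices (made-up numbers, not real AWS/Linode rates), the split-instance approach looks like this:

```python
# Hypothetical hourly prices; real rates vary by provider, region, and type.
gpu_rate = 1.50   # USD/h for a GPU instance (made-up number)
cpu_rate = 0.10   # USD/h for a plain CPU instance (made-up number)

all_on_gpu = 10 * gpu_rate                  # whole session on the GPU box
split      = 3 * gpu_rate + 7 * cpu_rate    # GPU only for actual training
print(f"${all_on_gpu:.2f} vs ${split:.2f}")  # $15.00 vs $5.20
```

The trade-off is the hassle of moving your environment and data between the two instances, which is why people often just eat the idle GPU hours.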


I have one of these instances. It's awesome and I recommend it highly.



