The sad part is that it's trivial to get around CF's bot protection if you're wr...

homebrewer · on Jan 2, 2025

It's the same for spam email, yet most spam gets caught in spamassassin rules that were written 20 years ago and haven't seen much improvement since then. Most bad guys just don't bother to do anything above the bare minimum. For example, I see lots of email getting caught in a rule that checks for incorrectly formatted pseudo-Outlook mailer header, which is trivial to circumvent if you pay any attention to it (the difference is in excessive whitespace, or a slightly incorrect "Outlook" version, or something like that).
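A rule of that kind can be sketched in a few lines. This is only an illustration, not the actual SpamAssassin rule: the function name `looks_like_forged_outlook` and the "well-formed" pattern are assumptions made for the example, and real Outlook header strings vary more than this.

```python
import re

# ASSUMPTION: for this sketch, a "well-formed" value is exactly
# "Microsoft Outlook <major>.<minor>" with single spaces.
# Real mailers vary; this pattern is illustrative only.
WELL_FORMED = re.compile(r"Microsoft Outlook \d{1,2}\.\d")


def looks_like_forged_outlook(x_mailer: str) -> bool:
    """Return True if the header claims Outlook but is malformed."""
    if "outlook" not in x_mailer.lower():
        # Not claiming to be Outlook at all, so this rule does not apply.
        return False
    # Claims Outlook: any deviation (extra whitespace, odd version) is flagged.
    return WELL_FORMED.fullmatch(x_mailer) is None


if __name__ == "__main__":
    for value in ("Microsoft Outlook 16.0", "Microsoft  Outlook 16.0", "Thunderbird"):
        print(repr(value), looks_like_forged_outlook(value))
```

The point is the asymmetry: the check is cheap to write and cheap to evade, yet it keeps catching senders who never look at their own headers.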

somat · on Jan 3, 2025

See also: the surprising effectiveness of simply asking the spam server to try again (sometimes called graylisting). It shouldn't work at all, but it proves to filter an awful lot of the worst mail noise.

http://man.openbsd.org/spamd
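The core of graylisting fits in a short sketch. This is not how spamd itself is implemented; the class name `Greylist`, the 300-second delay, and the in-memory dictionary are assumptions for illustration. The first delivery attempt for a given (client IP, sender, recipient) triple gets a temporary failure (SMTP 451), and a retry after the delay is accepted.

```python
import time


class Greylist:
    """Minimal in-memory graylisting sketch (hypothetical, not spamd's code)."""

    def __init__(self, min_delay=300.0, clock=time.time):
        self.min_delay = min_delay  # ASSUMPTION: seconds a sender must wait
        self.clock = clock          # injectable so the logic can be tested
        self.first_seen = {}        # (ip, sender, recipient) -> first attempt time

    def check(self, client_ip, sender, recipient):
        """Return an SMTP-style code: 451 = try again later, 250 = accept."""
        key = (client_ip, sender, recipient)
        now = self.clock()
        # Record the first attempt; later attempts reuse the stored timestamp.
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.min_delay:
            return 250
        return 451


if __name__ == "__main__":
    gl = Greylist(min_delay=1.0)
    triple = ("192.0.2.1", "a@example.com", "b@example.org")
    print(gl.check(*triple))  # first attempt is deferred
    time.sleep(1.1)
    print(gl.check(*triple))  # a patient retry is accepted
```

A legitimate mail server queues the message and retries, so it passes on a later attempt; a fire-and-forget spam sender never comes back.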

derefr · on Jan 2, 2025

> it's pretty much impossible to bypass as a human if their magical black box doesn't like your browser and/or IP address

There are residential-IP-backed VPN services that you can use just like commercial VPN services — but they're mostly built on the backs of botnets, so it's ethically questionable to use them.

michaelmior · on Jan 2, 2025

FWIW, StarVPN claims to have "ethically sourced" IPs. That is, not from botnets. Their pricing is quite a bit higher than many (cheapest plan is $20/month), but could be worth trying.

https://www.starvpn.com/

mike_d · on Jan 2, 2025

The "residential VPN" providers set up fake ISPs or buy AT&T/Verizon business circuits with large blocks of IPs and sell them as residential.

They are easily detected if you are buying IP intelligence from one of the higher quality providers: https://app.spur.us/context?q=STARVPN_PROXY

duckmysick · on Jan 3, 2025

The linked page shows a sign-in screen.

michaelmior · on Jan 3, 2025

Spur access requires a free account.

michaelmior · on Jan 3, 2025

That's helpful to know. I wasn't aware of this.

devilbunny · on Jan 2, 2025

You could also use Tailscale back to your own IP if the goal is not having to trust public WiFi.

makeitdouble · on Jan 3, 2025

Note that IP is only a part of it, and the full extent of what's baked into a CF score will never be made explicit (for obvious reasons).

With Cloudflare being way past the simple blocking of IP addresses, I wouldn't be surprised if a mismatch between your IP block and your language/cookies were enough to lower your score.

ghxst · on Jan 2, 2025

This is great for bypassing the server side bot detection but not the client side one, where it will attempt to verify the integrity of your browser environment.

hedora · on Jan 3, 2025

Well yeah, if you’re a legitimate user, CF will block you.

It’s only easy to bypass if you’re scraping or doing nefarious stuff.

shadowgovt · on Jan 2, 2025

Surprisingly, it still works as intended. Yes, it won't keep professionals and dedicated bot-fabricators out, but that's like 5% of the botters out there; the rest are the bot equivalent of script kiddies who can't be bothered, and it filters them great. Meanwhile, the script kiddies have a process that still works on non-CF sites, so they don't need to improve their process.

hedora · on Jan 3, 2025

We bypassed it by switching to Starlink. Now my IP address is a too-big-to-fail CGNAT.

The old IP address was a mom-and-pop CGNAT.

Thanks CF, for protecting us from capitalism, I guess?

PrimaryAlibi · on Jan 4, 2025

That's the same for almost all surveillance/tracking tech. It's always trivial for criminals/abusers to bypass. The surveillance is just about controlling the sheep.

solardev · on Jan 2, 2025

How does it get around captchas?

tedivm · on Jan 2, 2025

If they don't think you're suspicious they don't make you do the captchas, and as others have mentioned you can always outsource it to captcha farms. There are also AI models which solve a fairly decent share of them, and since most captchas let you repeat attempts with new patterns, you can have a pretty high error rate and still get past them. Then there's the ADA, which requires accessibility: many captchas have an audio component as a backup, and those are easy for models to interpret.
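The arithmetic behind "a pretty high error rate" is easy to check. A rough sketch, assuming each attempt is independent and has the same success probability (a simplification; real captcha services may rate-limit or escalate after failures). The function names are made up for this example.

```python
import math


def pass_probability(per_attempt_success: float, attempts: int) -> float:
    """Chance of at least one success in `attempts` independent tries."""
    return 1.0 - (1.0 - per_attempt_success) ** attempts


def attempts_needed(per_attempt_success: float, target: float) -> int:
    """Smallest number of tries whose overall pass chance reaches `target`."""
    if not (0.0 < per_attempt_success < 1.0 and 0.0 < target < 1.0):
        raise ValueError("both probabilities must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - per_attempt_success))


if __name__ == "__main__":
    # A solver that is wrong 70% of the time, allowed 10 tries.
    print(round(pass_probability(0.3, 10), 4))
    print(attempts_needed(0.3, 0.95))
```

Under those assumptions, even a solver that fails most individual attempts gets through with high probability once retries are free.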

michaelmior · on Jan 2, 2025

curl-impersonate doesn't solve CAPTCHAs, but the goal is to look enough like a human that Cloudflare doesn't present a CAPTCHA in the first place.

gruez · on Jan 2, 2025

Cloudflare Turnstile isn't even a captcha. The user just has to tick a box. Behind the scenes there's a JavaScript challenge to make sure you're vaguely a browser and not some script making a bazillion requests per minute.

xdfgh1112 · on Jan 2, 2025

It's also used for proof of work, as many scrapers are using thousands of IPs but only a few CPUs.
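For anyone unfamiliar with the idea, here is a generic hashcash-style sketch of proof of work. It is not Cloudflare's actual challenge, whose details are not public; the challenge bytes, the SHA-256 choice, and the difficulty value are all assumptions for illustration. The client must find a nonce whose hash has a given number of leading zero bits, which costs CPU time to find but takes one hash to verify.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a byte string."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: bytes, nonce: int, difficulty: int) -> bool:
    """Server side: one hash is enough to check a claimed solution."""
    digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: bytes, difficulty: int) -> int:
    """Client side: brute-force search, about 2**difficulty hashes on average."""
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce


if __name__ == "__main__":
    challenge = b"example-challenge"  # hypothetical server-issued value
    nonce = solve(challenge, difficulty=16)
    print(nonce, verify(challenge, nonce, 16))
```

The cost scales with CPU, not with IP count, which is why it bites a scraper that rotates through thousands of addresses from a handful of machines.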

gjsman-1000 · on Jan 2, 2025

You pay contract workers in a third world country a tiny amount of money per day, to spend all day clicking boxes.