Cloudflare to AI Crawlers: Pay or be blocked (blog.cloudflare.com)
from wosat@lemmy.world to technology@lemmy.world on 01 Jul 16:05
https://lemmy.world/post/32323115

Cloudflare, along with a majority of the world’s leading publishers and AI companies, is changing the default to block AI crawlers unless they pay creators for content.

#technology

Bell@lemmy.world on 01 Jul 16:16

A nice explanation of what’s wrong with web-based AI. I hope other content providers follow suit.

InternetCitizen2@lemmy.world on 01 Jul 16:59

Kind of reminds me of that time CF bait-and-switched that gambling website. Everyone was wondering which of the two they found more predatory and distasteful.

sucoiri@lemmy.world on 01 Jul 21:38

iirc it came out that the gambling site was intentionally cycling through Cloudflare’s IP range to avoid IP blocking, causing legitimate websites to be flagged as unsafe. Cloudflare said they’d still host the site; it just needed its own IP instead of one of Cloudflare’s. Horribly communicated to the customer, though.

InternetCitizen2@lemmy.world on 01 Jul 21:47

That makes more sense now.

romantired@shibanu.app on 01 Jul 17:05

Awesome, clear, you know how to do it, you can do it!

3dcadmin@lemmy.relayeasy.com on 01 Jul 17:16

Cloudflare aren’t perfect, but I still use them because for a free account the benefits outweigh negatives like this… However, to say the world’s leading publishers and AI companies are on board is simply not true…

drmoose@lemmy.world on 01 Jul 17:39

This has been tried before many times. The problem is that this exchange can never satisfy all of the parties.

Would a site host take $0.01 for a page? If it’s Walmart.com, they’d happily forgo even $0.10 or more per page if it means competitors can’t analyze their products or cause other perceived IP damage.

For example, let’s assume they do the business math and conclude that anything below $5 is a no-deal. What scraper would pay $5 for a single product page scrape? Maybe OpenAI can pay that, but is this what we want, where public scraping is only accessible to billionaires? What if you’re just a user who wants to track Walmart prices to build your own budgeting script? Are you paying $5 on every request?
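
To put rough numbers on that, here is a minimal sketch of what a hobby price tracker would pay per month at the three per-request prices mentioned above. The function name, the 20 tracked products, and the once-a-day check are hypothetical values picked purely for illustration; only the $0.01, $0.10 and $5 prices come from the comment.

```python
def monthly_scrape_cost(price_per_request, pages, checks_per_day, days=30):
    """Total cost of fetching `pages` pages `checks_per_day` times a day for `days` days."""
    return price_per_request * pages * checks_per_day * days


if __name__ == "__main__":
    # Hypothetical budgeting script: 20 product pages, each checked once a day.
    for price in (0.01, 0.10, 5.00):
        cost = monthly_scrape_cost(price, pages=20, checks_per_day=1)
        print(f"${price:.2f}/request -> ${cost:,.2f}/month")
```

Under those assumptions the bill goes from pocket change at a cent per request to thousands of dollars a month at $5 per request, which is the gap between "fine for a hobbyist" and "only viable for a large company".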

Now, for creative content like blogs etc., it could actually work, and micropayments have been the holy grail here forever. But what’s more likely to happen is that free content will outcompete paid: when an LLM asks “do you want to read this for free or pay $2 for this other source?”, 99% of users will pick free, because some unknown source has zero authority in the end user’s eyes to justify the risk.

What I suspect will happen is similar to what happened with the rise of SEO spam, but it’ll be a bit better because LLMs are harder to game than Google. Most content will be free but carry injected biases, shilling, or other promotions or agendas to subsidize the costs. On the other hand, a lot of content will remain high quality and free as a legitimate way of signaling authority within relevant industries, which is already a big thing.

Anyway, thanks for coming to my TED talk.

MCasq_qsaCJ_234@lemmy.zip on 01 Jul 18:54

Could this conflict with the fair use rules for AI training?

ter_maxima@jlai.lu on 01 Jul 19:02

No, the fact it’s technically legal doesn’t mean you have to make it easy for them.

joyjoy@lemmy.zip on 01 Jul 19:57

Yup. “Legal” just means the government won’t punish you.

ter_maxima@jlai.lu on 01 Jul 19:01

If they could stop blocking real users at the same time they block AI crawlers, that would be nice.

Imgonnatrythis@sh.itjust.works on 01 Jul 20:28

You just need to pay!

AceFuzzLord@lemmy.zip on 01 Jul 22:45

I absolutely hate Cloudflare because they always block me whenever I visit a site while connected to any VPN server on Proton, regardless of country. Even when I’m not connected to a VPN, I sometimes have trouble with them deciding my connection is suspicious despite there being nothing that should trigger it.

fmstrat@lemmy.nowsci.com on 02 Jul 13:19

Make a dummy Google Account, and log into it when on the VPN. Having an ad history usually avoids the blocks. (Note: only do this if your browsing is not activist-related, etc.)

Also, if it’s image captchas that never end, switch to the accessibility option for the captcha.