punrca@piefed.world to Selfhosted@lemmy.world • Based on this graph, and this graph alone, guess at what time I completely blocked OpenAI crawlers · English · 4 days ago
punrca@piefed.world to Reddit@lemmy.world · English · 2 months ago · Reddit is Being Manipulated by Professional Shills Every Day (reupload) - YouTube (www.youtube.com)
It’s best to use either Cloudflare (best IMO) or Anubis.
If you don’t want any AI bots, you can set up Anubis (open source; requires the end user to have JavaScript enabled): https://github.com/TecharoHQ/anubis
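To give a rough idea of why JavaScript is needed: tools like Anubis make the visitor's browser solve a small proof-of-work puzzle before letting the request through. The sketch below is only an illustration of that general idea, not Anubis's actual code; the function names, the challenge string format, and the "leading zero hex digits" difficulty rule are all assumptions made for the example.

```python
import hashlib
from itertools import count


def solve(challenge: str, difficulty: int) -> int:
    """Brute-force the first nonce whose SHA-256 digest starts with
    `difficulty` zero hex digits (hypothetical difficulty rule)."""
    prefix = "0" * difficulty
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server-side check: a single hash, so verifying is cheap."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


if __name__ == "__main__":
    # The client does the expensive search, the server does one hash.
    nonce = solve("example-challenge", 3)
    print(nonce, verify("example-challenge", nonce, 3))
```

The point is the asymmetry: one visitor barely notices the work, but a crawler hitting thousands of pages has to pay for it every time.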
Cloudflare can automatically set up a robots.txt file that blocks “AI crawlers” (you can still choose to allow “AI search” for better SEO). E.g.: https://blog.cloudflare.com/control-content-use-for-ai-training/#putting-up-a-guardrail-with-cloudflares-managed-robots-txt
Cloudflare also has an “AI Labyrinth” option that serves a maze of fake data to AI bots that don’t respect the robots.txt file.
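If you'd rather write the robots.txt yourself, here is a minimal sketch of the kind of file described above. It is not Cloudflare's managed file; the list of user-agent tokens is just an example I picked, and `build_robots_txt` and `is_allowed` are hypothetical helper names. It uses Python's standard `urllib.robotparser` to check how a well-behaved crawler would read the result.

```python
from urllib.robotparser import RobotFileParser

# Example AI-training crawler tokens (an assumed list; adjust to taste).
# Search-oriented bots such as OAI-SearchBot are deliberately left off.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "CCBot", "Google-Extended"]


def build_robots_txt(blocked_agents):
    """Return robots.txt text that disallows each listed agent
    and allows everyone else."""
    lines = []
    for agent in blocked_agents:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    lines += ["User-agent: *", "Allow: /"]
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt, agent, url="https://example.com/"):
    """Check whether a crawler that honors robots.txt may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    txt = build_robots_txt(AI_CRAWLERS)
    print(txt)
    print("GPTBot allowed:", is_allowed(txt, "GPTBot"))
    print("Browser allowed:", is_allowed(txt, "Mozilla/5.0"))
```

Keep in mind that robots.txt is only a request. Bots that ignore it still need something like AI Labyrinth or Anubis in front of them.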