I just started using this myself, seems pretty great so far!

It clearly doesn’t stop all AI crawlers, but it does stop a significant chunk of them.

  • Possibly linux@lemmy.zip
    11 hours ago

    It is not great on many levels.

    • It only runs against the Firefox user agent. This is not great, as the user agent can easily be changed. It may work now, but tomorrow that could all change.

    • It doesn’t measure load, so even if your website has only a few people accessing it, they will still have to do the proof of work.

    • The POW algorithm is not well designed and requires a lot of compute on the server which means that it could be used as a denial of service attack vector. It also uses sha256 which isn’t optimized for a proof of work type calculation and can be brute forced pretty easily with hardware.

    • I don’t really care for the animé cat girl thing. This is more of a personal thing but I don’t think it is appropriate.

    In summary, the Tor implementation is a lot better. I would love to see someone port it to the clearnet. I think this project was created by someone lacking experience, which I find a bit concerning.

    • zutto@lemmy.fedi.zutto.fiOP
      11 hours ago
      1. It doesn’t run against Firefox only; it runs against whatever you configure it to. Also, from personal experience, I can tell you that the majority of AI crawlers have the keyword “Mozilla” in the user agent.

      2. Yes, this isn’t Cloudflare, but I’m pretty sure that’s on the to-do list. If not, please open an issue on the project.

      3. The computational requirements on the server side are a tiny fraction of what the bots have to spend, literally. A non-issue. This tool is meant to combat the denial of service that these bots cause by accessing high-cost services, such as git blame on GitLab. My phone can do 100k sha256 sums per second (single-threaded), and you can safely assume any server outperforms this ARM chip, so you’d need so many resources to cause a denial of service that you might as well overload the server with traffic rather than with sha256 calculations.
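To illustrate the asymmetry in point 3, here is a minimal sketch of a generic hash-based proof of work. It is not the tool’s actual scheme: the challenge string, the `challenge + nonce` layout, and "difficulty = number of leading zero hex digits" are all assumptions made for the example.

```python
import hashlib
from itertools import count


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one SHA-256 call checks a submitted nonce.

    Assumed rule (hypothetical): the hex digest must start with
    `difficulty` zero digits.
    """
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force nonces until one passes verify()."""
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce


if __name__ == "__main__":
    # Hypothetical challenge value; a real deployment would issue a fresh one.
    nonce = solve("example-challenge", difficulty=3)
    print(nonce, verify("example-challenge", nonce, 3))
```

Under these assumptions the client needs about 16^difficulty hashes on average, while the server spends a single hash per submitted answer.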


      And this isn’t really comparable to Tor. This is a self-hostable service that sits between your web server/CDN and the service that is being attacked by mass crawling.

    • nrab@sh.itjust.works
      10 hours ago

      …you do realize that brute forcing it is the work you use to prove yourself, right? That’s the whole point of PoW

      • Possibly linux@lemmy.zip
        9 hours ago

        True, I should have phrased that better.

        The issue is that sha256 is fairly easy to do at scale. Modern high-performance hardware is well optimized for it, so you could still perform an attack with a bunch of GPUs. AI scrapers tend to have a lot of those.
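A rough way to put numbers on the hardware argument, as a sketch only. It assumes a challenge that requires a given number of leading zero hex digits (so about 16^difficulty attempts on average). The 100k hashes per second figure is the phone number quoted earlier in the thread; the GPU rate is a made-up placeholder, not a benchmark.

```python
def expected_hashes(difficulty: int) -> int:
    """Average attempts needed when each hash succeeds with probability 16**-difficulty."""
    return 16 ** difficulty


def seconds_to_solve(difficulty: int, hashes_per_second: float) -> float:
    """Average solve time at a given hash rate."""
    return expected_hashes(difficulty) / hashes_per_second


PHONE_RATE = 100_000        # single-thread phone figure quoted in the thread
GPU_RATE = 1_000_000_000    # hypothetical placeholder, not a measured value

if __name__ == "__main__":
    for difficulty in (4, 5, 6):
        phone = seconds_to_solve(difficulty, PHONE_RATE)
        gpu = seconds_to_solve(difficulty, GPU_RATE)
        print(f"difficulty {difficulty}: phone ~{phone:.2f}s, GPU ~{gpu:.6f}s")
```

Whatever the real rates are, the ratio between the two columns is just the ratio of the hash rates, which is why fast sha256 hardware shrinks the cost for a scraper much more than for an ordinary visitor.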