
algernon

@algernon@lemmy.ml

A tiny mouse, a hacker.


algernon ,

Sadly, that's not code Linus wrote. Nor one he merged. (It's from git, copied from rsync, committed by Junio)

algernon ,

...and here I am, running a blog that wouldn't bat an eye at 15k hits a second, and that I could run on a potato. Probably because I don't serve hundreds of megabytes of garbage to visitors. (The preview image is also controllable iirc, so just, like, set it to something reasonably sized.)

algernon ,

I only serve bloat to AI crawlers.

# map lives in the http {} context: flag known AI crawler user agents
map $http_user_agent $badagent {
  default     0;
  # list of AI crawler user agents in "~crawler 1" format
}

# in the server {} block: send flagged agents to the decoy location
if ($badagent) {
   rewrite ^ /gpt;
}

location /gpt {
  proxy_pass https://courses.cs.washington.edu/courses/cse163/20wi/files/lectures/L04/bee-movie.txt;
}

...is a wonderful thing to put in my nginx config. (you can try curl -Is -H "User-Agent: GPTBot" https://chronicles.mad-scientist.club/robots.txt | grep content-length: to see it in action ;))
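
Same check as a quick Python sketch, if curl isn't handy (only the URL and the "GPTBot" user agent from the curl example above are assumed):

# Compare what a normal user agent and an AI crawler user agent
# receive for robots.txt (HEAD requests only, no body downloaded).
import urllib.request

URL = "https://chronicles.mad-scientist.club/robots.txt"

def content_length(user_agent):
    req = urllib.request.Request(URL, method="HEAD",
                                 headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req) as resp:
        return resp.headers.get("Content-Length")

print("browser UA:", content_length("Mozilla/5.0"))  # the real, small robots.txt
print("GPTBot UA: ", content_length("GPTBot"))       # the bee movie script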

algernon ,

That would result in those fediverse servers theoretically requesting 333333 * 114 MB = ~38 TB/s.

On the other hand, if the linked site didn't serve garbage and fit in ~1 MB like a normal site, this would be only ~325 GB/s, and while that's still high, it's not the end of the world. If it's a site that actually puts effort into being optimized, and a request fits in ~300 kB (still a lot, in my book, for what is essentially a preview, with only tiny parts of the actual content loaded), then we're looking at ~95 GB/s.

If said site puts effort into making its previews reasonable and serves ~30 kB, then that's ~9 GB/s. It's 3190 in the Year of Our Lady Discord. A potato can serve that.
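
Back-of-the-envelope, the same arithmetic in a few lines of Python (decimal prefixes, so the numbers land a touch above the 1024-based ones used above):

# Bandwidth needed at 333333 preview fetches per second for the
# response sizes discussed above (decimal prefixes, illustrative only).
REQS_PER_SEC = 333_333

sizes_bytes = {
    "bloated page (114 MB)":    114e6,
    "normal page (~1 MB)":      1e6,
    "optimized page (~300 kB)": 300e3,
    "lean preview (~30 kB)":    30e3,
}

for label, size in sizes_bytes.items():
    gb_per_s = REQS_PER_SEC * size / 1e9
    print(f"{label}: {gb_per_s:,.1f} GB/s")

# bloated page (114 MB):    38,000.0 GB/s  (~38 TB/s)
# normal page (~1 MB):      333.3 GB/s
# optimized page (~300 kB): 100.0 GB/s
# lean preview (~30 kB):    10.0 GB/s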

algernon ,

I don't think serving 86 kilobytes to AI crawlers will make any difference in my bandwidth use :)

algernon ,

It's not. It just doesn't get enough hits for that 86 kB to matter. Fun fact: most AI crawlers hit /robots.txt first; they get served a bee movie script, fail to interpret it, and leave without crawling further. If I let them crawl the entire site, that'd result in about two megabytes of traffic. By serving an 86 kB file that doesn't pass as robots.txt and has no links, I actually save bandwidth. Not on a single request, but by preventing a hundred others.
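
Roughly, per deterred crawler (a tiny sketch using the ~86 kB and ~2 MB figures above):

# Net saving per crawler that fetches the decoy robots.txt and gives up,
# versus crawling the whole site (~86 kB decoy, ~2 MB full crawl).
decoy_kb = 86
full_crawl_kb = 2000  # ~2 MB

saved_kb = full_crawl_kb - decoy_kb
print(f"~{saved_kb} kB saved per deterred crawler "
      f"({saved_kb / full_crawl_kb:.0%} less traffic)")
# ~1914 kB saved per deterred crawler (96% less traffic)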

algernon ,

It's about 5 times longer than previous releases were maintained for, and it's an experiment. If there's a need for a longer-term support branch, there will be one. It's pointless to start maintaining a 5+ year branch with 0 users and a handful of volunteers, none of whom are paid for doing the maintenance.

So yes, in that context, 15 months is long.
