404media.co

gravitas_deficiency , to Not The Onion in Google Is Paying Reddit $60 Million for Fucksmith to Tell Its Users to Eat Glue

Perfectly balanced. As all things should be.

You didn’t create this dragon, Google. We did. And we are weirder than you understand.

misophonium , to Not The Onion in Google Is Paying Reddit $60 Million for Fucksmith to Tell Its Users to Eat Glue

Fucksmith forges finest fuck

lewdian69 , to Not The Onion in Google Is Paying Reddit $60 Million for Fucksmith to Tell Its Users to Eat Glue

Paywall

JATtho , to Technology in Google Is Paying Reddit $60 Million for Fucksmith to Tell Its Users to Eat Glue

I once said that the current "AI" is just an Excel spreadsheet with a few billion rows, from which every answer gets interpolated...

NutWrench , to Technology in Google Is Paying Reddit $60 Million for Fucksmith to Tell Its Users to Eat Glue

They also highlight the fact that Google’s AI is not a magical fountain of new knowledge, it is reassembled content from things humans posted in the past indiscriminately scraped from the internet and (sometimes) remixed to look like something plausibly new and “intelligent.”

This. "AI" isn't coming up with new information on its own. The current state of "AI" is a drooling moron, plagiarizing any random scrap of information it sees in a desperate attempt to seem smart. The people promoting AI are scammers.

AutistoMephisto , to Technology in Google Is Paying Reddit $60 Million for Fucksmith to Tell Its Users to Eat Glue
Resol ,

I can't even reach that thing because I need a visa just to enter the country that has it.

AutistoMephisto ,

My guy, Google pays Reddit $60 million a year for this. $60 million.

https://lemmy.world/pictrs/image/e1c18a68-a57a-451d-972f-0c10bbaa5413.png

I remember I once got told, years ago, that I was stupid for saying "Data is the new Oil", and now look! Do you know what I could do if I had $60 million in my bank right now? And Google isn't the only one! Companies the world over are paying through the nose for user-generated content, and business is booming! If I'm an oil well, it's time my oil came with a price tag. I was a Reddit user for YEARS! Almost since the beginning of Reddit! I made some of the training data that Google and others are using! Where's my cut of that $60M?

Resol ,

That picture will forever haunt me in my dreams.

SlothMama ,

I want a whole Lemmy subreddit (community?) of AI overviews gone wild like this. It's funny af

Maven ,

You should make one. I'd sub immediately

crusa187 , to Technology in Google Is Paying Reddit $60 Million for Fucksmith to Tell Its Users to Eat Glue

There’s an old adage in computing which really applies here:

Garbage in, garbage out.

Breve , to Technology in Google Is Paying Reddit $60 Million for Fucksmith to Tell Its Users to Eat Glue

I've used an LLM that provides references for most things it says, and it really ruined a lot of the magic when I saw the answer was basically copied verbatim from those sources with a little rewording to mash it together. I can't imagine trusting an LLM that doesn't do this now.

blusterydayve26 ,

Which one?!

Breve ,

Kagi's FastGPT. It's handy for quick answers to questions I'd normally punch in a search engine with the same ability to vet the sources.

Same ,

I'd hate to defend an LLM, but Kagi FastGPT explicitly works by rewording search sources through an LLM. It's not actually a standalone LLM; that's why it's able to cite its sources.

Hackerman_uwu , to Technology in Google Is Paying Reddit $60 Million for Fucksmith to Tell Its Users to Eat Glue

Is this real though? Does ChatGPT just literally take whole snippets of texts like that? I thought it used some aggregate or probability based on the whole corpus of text it was trained on.

bionicjoey ,

It does, but the thing with the probability is that it doesn't always pick the most likely next bit of text, it basically rolls dice and picks maybe the second or third or in rare cases hundredth most likely continuation. This chaotic behaviour is part of what makes it feel "intelligent" and why it's possible to reroll responses to the same prompt.
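
To make the dice-rolling concrete, here is a minimal sketch of that kind of weighted random pick. It is not how ChatGPT or Google's model is actually implemented: the function name `sample_next_token`, the `temperature` knob as written here, and the toy probabilities are all assumptions made up for illustration.

```python
import random

def sample_next_token(probs, temperature=1.0, rng=None):
    """Pick one continuation at random, weighted by its probability.

    probs: dict mapping a candidate continuation to its probability
           (hypothetical toy numbers, not real model output).
    temperature: 0 means "always take the most likely one"; higher
           values flatten the odds so unlikely picks happen more often.
    """
    rng = rng or random.Random()
    tokens = list(probs)
    if temperature == 0:
        # No dice roll at all: always the single most likely continuation.
        return max(tokens, key=lambda t: probs[t])
    # Raising each probability to 1/T and letting choices() renormalize
    # is equivalent to the usual "divide the logits by T" rescaling.
    weights = [probs[t] ** (1.0 / temperature) for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Made-up odds for whatever comes after "to make it stick, add more ..."
toy_probs = {"cheese": 0.6, "sauce": 0.25, "glue": 0.1, "rocks": 0.05}

if __name__ == "__main__":
    # "Rerolling" the same prompt a few times can give different answers.
    for _ in range(5):
        print(sample_next_token(toy_probs, temperature=1.0))
```

With `temperature=0` the toy always answers "cheese"; with `temperature=1.0` the less likely continuations turn up some of the time, which is the reroll behaviour described above.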

uranos ,

This is not the model directly but the model looking through Google searches to give you an answer.

NutWrench , to Technology in Google Is Paying Reddit $60 Million for Fucksmith to Tell Its Users to Eat Glue

I've been trying out SearX and I'm really starting to like it. It reminds me of early Internet search results before Google started adding crap to theirs. There are currently 82 instances to choose from here:

https://searx.space/

vox ,

it literally just proxies/aggregates google/bing search results tho?

Voroxpete ,

So does pretty much every search engine. Running your own web crawler requires a staggering amount of resources.

Mojeek is one you can check out if that's what you're looking for, but its index is noticeably constrained compared to other search engines. They just don't have the compute power or bandwidth to maintain an up-to-date index of the entire web.

Mojeek ,

we're working on it 😉 slow and steady and all that; we also fixed a bug with recrawl recently that should be improving things

catalog3115 , to Technology in Samsung Requires Independent Repair Shops to Share Customer Data, Snitch on People Who Use Aftermarket Parts, Leaked Contract Shows

The use of aftermarket parts in repair is relatively common. This provision requires independent repair shops to destroy the devices of their own customers, and then to snitch on them to Samsung. 

That's just pure evil and bullying. If you have aftermarket parts, they will destroy the device and force you to pay for it. This is the reason we need right to repair. Every consumer should support it.

HeyMrDeadMan , to Reddit in Reddit’s Goon Cave Community Has Been Banned

So obviously I'm an idiot, I thought GoonCaves was the group people posted pictures of computer rooms overflowing with empty coke bottles, cigarette butts, fast food containers, and the occasional piss jug. What group am I thinking of?

EDIT: NeckbeardNests, that's what I was thinking of. Now see, that's just wholesome internet content.

MargotRobbie , to Technology in Google Is Paying Reddit $60 Million for Fucksmith to Tell Its Users to Eat Glue

Reddit, and by extension, Lemmy, offers the ideal format for LLM datasets: human-generated conversational comments, which, unlike traditional forums, are organized in a branched, nested format and scored with votes in the same way that LLM reward models are built.

There is really no way of knowing about, much less preventing, public-facing data being scraped and used to build LLMs. But let's do a thought experiment: what if, hypothetically speaking, there were some particular individual who wanted to poison that dataset with shitposts in a way that is hard to detect or remove with any easily automated method, by camouflaging their own online presence within common human-generated text data created during this time period, let's say, the internet marketing campaign of a major Hollywood blockbuster?

Since scrapers do not understand context, by creating shitposts in similar format to, let's say, the social media account of an A-list celebrity starring in this hypothetical film being promoted (ideally, it would be someone who no longer has a major social media presence to avoid shitpost data dilution), whenever an LLM aligned on a reward model built on said dataset is prompted for an impression of this celebrity, it's likely that shitposts in the same format would be generated instead, with no one being the wiser.

That would be pretty funny.

Again, this is entirely hypothetical, of course.

kjaeselrek ,

What’s this about shitposting? I’m just here to talk about rampart.

just_another_person , to Technology in Google Is Paying Reddit $60 Million for Fucksmith to Tell Its Users to Eat Glue

Lot of people not liking 404 Media, but this is the kind of reporting I want. Point out what's going wrong. Bring it to a conversation without a lot of skew. Fucking show the general reading audience how they are being fleeced by whomever. Didn't Vice do this at one point?

Shialac ,

Maybe. All I know Vice for is articles like "What's the sexiest sex in the sexroom among sexy sexers" or something like that. So the average r/askreddit post

Drusenija ,

So if they were basically regurgitating Reddit already, does that mean they were using AI before it was cool? They might have just used the Amazon approach to AI (i.e., why use technology when we can throw a bunch of minimum-wage workers at the problem).

aStonedSanta ,

I recall Vice doing that at one time also.

sheogorath ,

Isn't 404 Media the guys from Vice who left before it imploded?

aStonedSanta ,

https://www.nytimes.com/2023/08/22/business/media/404-media-vice-motherboard.html

Apparently so! I dunno how to remove the paywall for others; I just use reader mode.

ours ,

The article's author was the Editor-in-chief of Vice's Motherboard as stated in his bio.

DudeImMacGyver , to Technology in Google Is Paying Reddit $60 Million for Fucksmith to Tell Its Users to Eat Glue

I love that my almost 2 decades of shitposting will be put to... use?

SkyezOpen ,

Yes. Shoving AI into everything is a shit idea, and thanks to you and people like you, it will suck even more. You have done the internet a great service, and I salute you.
