Automatically clean up tracking and other tags in posted URLs

Example (spam post containing an Amazon affiliate link; the post has hopefully been deleted by now, but I assume mods/admins can still see it): https://lemmy.world/post/15846936

Also, there are tons of links that people post legitimately but that carry tracking parameters: gclid=this, fbclid=that, etc. Those can be cleaned up too.

By editing out these parameters automatically when the link is posted, people's privacy can be protected and the incentive to post affiliate spam can be decreased.
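To make the idea concrete, here is a minimal sketch of what stripping parameters at post time could look like. It is not Lemmy code: the function name `strip_tracking` and the parameter sets are assumptions for illustration, and the `utm_` prefix is my addition on top of the `gclid`/`fbclid` examples above.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Illustrative only: a tiny hand-picked set, not an authoritative tracker list.
TRACKING_PARAMS = {"gclid", "fbclid"}
# Assumption: also drop anything starting with "utm_" (common campaign tags).
TRACKING_PREFIXES = ("utm_",)


def strip_tracking(url: str) -> str:
    """Return the URL with known tracking query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
        and not key.lower().startswith(TRACKING_PREFIXES)
    ]
    # Rebuild the URL with only the parameters that survived the filter.
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    print(strip_tracking("https://example.com/item?id=7&fbclid=abc&utm_source=x"))
```

One caveat with this approach: round-tripping through `parse_qsl` and `urlencode` can change how the remaining parameters are percent-encoded, so a real implementation might prefer to edit the raw query string instead.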

It could be a server config parameter and/or put into the posting UI: "your post contains [link] with flagged parameters, choose between a) post cleaned up version (shown), or b) post link without changes (may go into moderation queue depending on community settings)."
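Roughly, the decision flow I have in mind could be sketched like this. All names here (`check_link`, `resolve_post`, `community_moderates_flagged`) are hypothetical, and the flagged set is just the two examples from above; how the UI and moderation queue would really be wired up is left open.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical flagged set; a server config option could supply this instead.
FLAGGED_PARAMS = {"gclid", "fbclid"}


@dataclass
class LinkCheck:
    original: str
    cleaned: str
    flagged: list[str]  # names of the flagged parameters that were found


def check_link(url: str) -> LinkCheck:
    """Split a link into its cleaned form and the flagged parameters found."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    flagged = [k for k, _ in pairs if k.lower() in FLAGGED_PARAMS]
    kept = [(k, v) for k, v in pairs if k.lower() not in FLAGGED_PARAMS]
    cleaned = urlunsplit(parts._replace(query=urlencode(kept)))
    return LinkCheck(original=url, cleaned=cleaned, flagged=flagged)


def resolve_post(url: str, keep_original: bool,
                 community_moderates_flagged: bool) -> tuple[str, bool]:
    """Return (url_to_post, needs_moderation) for the poster's choice."""
    check = check_link(url)
    if not check.flagged:
        return url, False  # nothing flagged, so no prompt is needed
    if keep_original:
        # Option b): post unchanged; community settings decide on moderation.
        return check.original, community_moderates_flagged
    # Option a): post the cleaned-up version.
    return check.cleaned, False
```

The UI would call `check_link` to show the cleaned version, then pass the poster's choice to something like `resolve_post`.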

scrubbles ,
@scrubbles@poptalk.scrubbles.tech

This is one of those things that sounds really easy but would just be a ton of work, and Lemmy is just 2 devs afaik. How do you know this query parameter is a tracker vs something required for the page? There's no way to know programmatically, so they'd need to maintain a list of known bad parameters - and even then those are probably site specific. What if one site uses the query parameter as tracking and another uses the same name coincidentally and it's a part of filtering or sorting?

This sounds like a better task for a browser extension or an npm package they could consume instead, rather than it being a part of Lemmy itself.

unexposedhazard ,

I mean, the lists for this already exist, because there are Firefox plugins that do this and these are often open source. This could surely be a case for crowdsourced / collaborative data.

solrize OP ,

I maintain a list and it's not that big a deal. It doesn't try to be leakproof. It has maybe a dozen actions (code fragments) and a table mapping site names to actions.
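For illustration only (this is not the actual list), a table of that shape could look something like the sketch below. The names `drop_params`, `SITE_ACTIONS` and `clean` are made up, and treating `tag` as the Amazon affiliate parameter is an assumption. Keying actions by site also means the same parameter name can be left alone elsewhere.

```python
from typing import Callable
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

Action = Callable[[str], str]  # an action takes a URL and returns a URL


def drop_params(*names: str) -> Action:
    """Build an action that removes the named query parameters."""
    targets = {n.lower() for n in names}

    def action(url: str) -> str:
        parts = urlsplit(url)
        kept = [
            (k, v)
            for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in targets
        ]
        return urlunsplit(parts._replace(query=urlencode(kept)))

    return action


# Hypothetical table: site name -> actions. "*" applies to every site.
SITE_ACTIONS: dict[str, list[Action]] = {
    "amazon.com": [drop_params("tag")],  # assumed affiliate parameter
    "*": [drop_params("gclid", "fbclid")],
}


def clean(url: str) -> str:
    """Apply the site-specific actions first, then the global ones."""
    host = (urlsplit(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    for action in SITE_ACTIONS.get(host, []) + SITE_ACTIONS["*"]:
        url = action(url)
    return url
```

With this table, `clean` removes `tag` from an amazon.com link but keeps `tag=python` on any other site, since only the global actions apply there.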

scrubbles ,

Then I'd say that'd be a great 3rd party plugin for Lemmy (which they just enabled support for btw). I don't think that should be a core lemmy feature, to be dependent on a list of query parameters that need to be kept up to date, but it would be a perfect use case for a plugin, which could call out to a database/script/file of known trackers and their domains, and throw up a warning like you're saying.

Not trying to demoralize you, actually the exact opposite, this would be a great project for you if you already have a list! If you built a plugin that could do that for Lemmy I would be one of the first instances to install it!

solrize OP ,

Tbh, 5 or so parameters take care of 80% of cases. I'm not conversant enough with Lemmy's implementation to know the difference between core and non-core features, but it has to be on the server side because there are too many clients.

scrubbles ,

Plugins are server side.
