Spedwell

@[email protected]


Spedwell ,

It's only double counted in a situation where you're actually counting both sides. This is a Canadian study published by a Canadian outlet about the impacts of Canadian policy.

They're not trying to balance the books, so to speak; they're evaluating transactions on a single account.

Spedwell ,

Right concept, except you're off in scale. A MULT instruction would exist in both RISC and CISC processors.

The big difference is that CISC tries to provide instructions to perform much more sophisticated subroutines. This video is a fun look at some of the most absurd ones, to give you an idea.
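As a toy sketch of the difference (Python standing in for hardware, so obviously not real machine code; but the VAX really did have a POLY instruction that evaluated a whole polynomial in one opcode):

```python
# Toy illustration: both "ISAs" below have a simple multiply, but the
# CISC-style one also exposes an entire subroutine as a single
# instruction, like the VAX POLY instruction did.

def poly_eval(coeffs, x):
    """RISC style: build the result from simple MUL/ADD steps."""
    acc = 0
    for c in coeffs:       # Horner's method
        acc = acc * x + c  # one MUL, one ADD per coefficient
    return acc

RISC_OPS = {"MUL": lambda a, b: a * b}          # POLY must be done in software
CISC_OPS = {"MUL": lambda a, b: a * b,
            "POLY": poly_eval}                  # whole routine, one "opcode"

print(poly_eval([2, 0, 1], 3))         # 2*3^2 + 0*3 + 1 = 19
print(CISC_OPS["POLY"]([2, 0, 1], 3))  # same answer, "one instruction"
```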

Spedwell ,

Ah, yes. The famously singular "westerners" who all 100% agreed with every foreign affairs policy of their government over the past century.

Spedwell , (edited)

> concepts embedded in them

> internal model

You used both phrases in this thread, but those are two very different things. It's a stretch to say this research supports the latter.

Yes, LLMs are still next-token generators. That is a descriptive statement about how they operate. They just have embedded knowledge that allows them to generate text that is sometimes meaningful.
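To be concrete about what "next-token generator" means mechanically, here's a toy sketch (stdlib only; a hand-rolled bigram table stands in for the network, but the loop is the same one an LLM runs): predict a distribution over the next token, sample one, append it, repeat.

```python
import random

# Toy stand-in for the network: a bigram table mapping the current
# token to a distribution over next tokens. A real LLM conditions a
# transformer on the whole context, but the outer loop is identical.
MODEL = {
    "<s>": {"the": 0.7, "a": 0.3},
    "the": {"cat": 0.5, "dog": 0.5},
    "a":   {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.6, "ran": 0.3, "</s>": 0.1},
    "dog": {"sat": 0.4, "ran": 0.5, "</s>": 0.1},
    "sat": {"</s>": 1.0},
    "ran": {"</s>": 1.0},
}

def generate(max_len=10):
    tokens = ["<s>"]
    while len(tokens) < max_len:
        dist = MODEL[tokens[-1]]                 # the "forward pass"
        nxt = random.choices(list(dist.keys()),
                             weights=list(dist.values()))[0]
        if nxt == "</s>":
            break
        tokens.append(nxt)                       # feed it back in
    return " ".join(tokens[1:])

print(generate())  # e.g. "the cat sat"
```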

Spedwell ,

> 330 micrograms per gram

That seems like... a lot. For scale, that's 330 ppm, or about 0.033% by mass. Way more than I expected or am comfortable thinking about.

Spedwell ,

It's not really stupid at all. See the matrix code example from this article: https://spectrum.ieee.org/ai-code-generation-ownership

You can't really know whether the genAI is synthesizing from thousands of inputs or just outright reciting copyrighted code. Not kosher if it's the latter.
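To make "outright reciting" concrete, here's the crudest check you could imagine running (all names and the threshold are made up for illustration): normalize whitespace, then look for long verbatim runs shared with known licensed code.

```python
import re

def normalize(code: str) -> str:
    """Collapse whitespace so trivial reformatting doesn't hide a copy."""
    return re.sub(r"\s+", " ", code).strip()

def longest_shared_run(generated: str, licensed: str) -> int:
    """Length of the longest verbatim substring shared by both texts.
    O(n*m) dynamic programming: fine for a sketch, too slow at scale."""
    a, b = normalize(generated), normalize(licensed)
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

# Hypothetical usage: flag anything sharing a suspiciously long run
# with a file of copyleft code you must not copy.
generated = "for (i = 0; i < n; i++) { sum += a[i]; }"
licensed  = "int i; for (i = 0; i < n; i++) { sum += a[i]; } return sum;"
if longest_shared_run(generated, licensed) > 30:
    print("possible verbatim recitation, check the license")
```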

Spedwell ,

I get that there are better choices now, but let's not pretend a straw you blow into is the technological stopping point for limb-free computer control (sorry if that's not actually the best option; it's just the one I'm familiar with). There are plenty of things to trash-talk Neuralink about without pretending this technology (or its future form) is meritless.

Spedwell ,

We should already be at that point. We have already seen LLMs' potential to inadvertently backdoor your code and to inadvertently help you violate copyright law (I guess we do need to wait to see what the courts rule, but I'll be rooting for the open-source authors).

If you use LLMs in your professional work, you're crazy. I would never be comfortable opening myself up to the legal and security liabilities of AI tools.

Spedwell ,

The issue on the copyright front is the same kind of professional standards and professional ethics that should stop you from just outright copying open-source code into your application. It may be very small portions of code, and you may never get caught, but you simply don't do that. If you wouldn't steal a function from a copyleft open-source project, you shouldn't use that function when copilot suggests it.

Idk if copilot has added license tracing yet (been a while since I used it), but absent that feature you are entirely blind to the extent to which its output is infringing on licenses. That's a huge legal liability to your employer, and an ethical coinflip. About the best you can do yourself is the crude self-check sketched below.
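For instance (markers and usage invented for illustration; absence of a match proves nothing, presence is just a red flag):

```python
import re

# Crude, made-up heuristics: strings that tend to appear in or near
# copyleft-licensed code. Only catches the laziest cases.
LICENSE_MARKERS = [
    r"SPDX-License-Identifier:\s*(GPL|AGPL|LGPL)",
    r"GNU (Affero )?General Public License",
    r"Copyright \(c\) \d{4}",
]

def flag_suggestion(code: str) -> list[str]:
    """Return the license-indicative markers found in an AI-suggested snippet."""
    return [m for m in LICENSE_MARKERS if re.search(m, code)]

snippet = "/* SPDX-License-Identifier: GPL-3.0-only */\nint foo(void);"
print(flag_suggestion(snippet))  # flags the SPDX marker
```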


Regarding understanding of code, you're right. You have to own what you submit into the codebase.

The drawbacks/risks of using LLMs or copilot have more to do with the fact that they generate the likely code, which means they're statistically biased toward whatever common and unnoticeable bugged logic exists in the average github repo they trained on. At some point it will hand you code that you read and think "yep, looks right to me," when it actually has a subtle buffer overflow issue or fails in an edge case, in a way that is just unnoticeable enough. Something like the sketch below.
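A hand-written example of the kind of thing I mean (not actual copilot output): a function that reads fine at a glance.

```python
def moving_average(xs: list[float], window: int) -> list[float]:
    """Mean of each window-sized slice. Looks right at a glance."""
    out = []
    # BUG: range stops one window early, silently dropping the last
    # slice; it should be range(len(xs) - window + 1).
    for i in range(len(xs) - window):
        out.append(sum(xs[i:i + window]) / window)
    return out

print(moving_average([1, 2, 3, 4], 2))  # [1.5, 2.5] ... where did 3.5 go?
```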

And you can make the argument that it's your responsibility to find that (it is). But I've seen some examples thrown around on twitter of just slightly bugged loops; I've seen examples of it replicating known vulnerabilities; and we have that package-name fiasco in the first article above.

If I ask myself "would I definitely have caught that?", the answer is only a maybe. If it replicates a vulnerability that existed in open-source code for years before it was noticed, do you really trust yourself to identify it the moment copilot suggests it to you?
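And on the package-name front: models hallucinate plausible-but-unregistered package names, which squatters can then claim. A minimal guard before installing anything an LLM suggested (stdlib only, against PyPI's public JSON endpoint; note that a name existing is not the same as it being trustworthy):

```python
import urllib.request
import urllib.error

def exists_on_pypi(name: str) -> bool:
    """True if the package name is registered on PyPI (404 means it isn't).
    Existence is NOT trust: a squatter may have registered it already."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as e:
        if e.code == 404:
            return False
        raise

# "definitely-not-a-real-pkg-xyz" is a presumably unregistered name,
# standing in for an LLM-suggested dependency.
for pkg in ["requests", "definitely-not-a-real-pkg-xyz"]:
    print(pkg, exists_on_pypi(pkg))
```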

I guess it all depends on stakes, too. If you're generating buggy JavaScript, who cares?

Spedwell ,

That's significantly worse privacy-wise, since Google gets a copy of everything.

A recovery email in this case was used to uncover the identity of the account holder. Unless you're using Proton Mail anonymously (if you're just replacing your personal Gmail, then probably not), you don't need to consider the recovery email a weakness.
