xcjs ,
@xcjs@programming.dev avatar

No offense intended, but are you sure it's using your GPU? Twenty minutes is about how long my CPU-locked instance takes to run some 70B parameter models.

On my RTX 3060, I generally get responses in seconds.

  • All
  • Subscribed
  • Moderated
  • Favorites
  • [email protected]
  • kbinchat
  • All magazines