

1·
19 days ago@[email protected] I was using it like a year and a half ago. With the web UI that looks like it’s from 2008 lol. But I think at this point, I’d use it for the OpenAI compatible endpoint anyway.
@[email protected] I was using it like a year and a half ago. With the web UI that looks like it’s from 2008 lol. But I think at this point, I’d use it for the OpenAI compatible endpoint anyway.
Is the KoboldCPP UI a bit better these days? Or maybe it’s best to hook it up as an OpenAI connection via OpenWebUI… I mostly use it with ollama.
What are the benefits of EXL3 vs the more normal quantizations? I have 16gb of VRAM on an AMD card. Would I be able to benefit from this quant yet?