I’m curious about what the consensus is here for which models are used for general purpose stuff (coding assist, general experimentation, etc)

What do you consider the “best” model under ~30B parameters?

  • FrankLaskey@lemmy.ml
    5 days ago

    In my opinion, Qwen3-30B-A3B-2507 would be the best here. The thinking version is likely best for most tasks, as long as you don’t mind a slight speed penalty in exchange for more accuracy. I use the quantized IQ4_XS models from Bartowski or Unsloth on HuggingFace.

    I’ve seen the new OSS-20B models from OpenAI ranked well in benchmarks, but I have not liked the output at all. It typically seems lazy and not very comprehensive, and it makes obvious errors.

    If you want something even smaller and faster, the DeepSeek R1 0528 distill of Qwen3 8B is great for its size (especially if you’re trying to free up some VRAM to use larger context lengths).
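    To get a feel for why a smaller model frees up room for longer context, here is a rough back-of-the-envelope sketch of KV-cache VRAM use. The formula is the standard one for GQA transformers; the config numbers (36 layers, 8 KV heads, head dim 128, fp16 cache) are assumptions roughly matching an 8B-class model, not taken from any official spec, so check the actual model config before relying on them.

    ```python
    def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
        # One K and one V cache entry per layer, per KV head, per head
        # dimension, per token; fp16 elements by default (2 bytes each).
        return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem

    # Assumed (hypothetical) 8B-class GQA config: 36 layers, 8 KV heads,
    # head dim 128, fp16 cache, 32k-token context.
    gib = kv_cache_bytes(36, 8, 128, 32768) / 2**30
    print(f"{gib:.1f} GiB")  # → 4.5 GiB
    ```

    So at 32k context the cache alone wants several GiB on top of the weights, which is exactly the VRAM an 8B quant leaves free compared to a 30B one.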

    • xavierA
      5 days ago

      Qwen-30 is also my choice when using tools, but I switch to Mistral Small for writing, since it handles Spanish better IMHO. OSS-120 works great with tools, but it’s slow to the point of being unusable on my computer (4060 Ti and 64 GB RAM). Maybe I’ll try OSS-20, but I’ve read it is heavily censored.