@Franconian_Nomad

Franconian_Nomad@feddit.org · 4 hours ago

Yeah, a higher quant would be nice. I actually try not to go below Q5, but you can only do so much with 16 GB of VRAM and DDR4 system RAM.

But I must say I’m pretty impressed by Qwen3.6-35b, not only by its capabilities but also by its hardware requirements. MoE for the win, I guess.

RWKV sounds interesting, have to look into it, thanks!
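
For anyone wondering about the quant vs. VRAM trade-off, here is a rough back-of-the-envelope sketch. It only estimates the size of the weights as parameters × bits per weight / 8. The bits-per-weight values are illustrative assumptions, not exact figures for any particular GGUF quant, and the KV cache and runtime overhead are ignored. `estimate_weight_gb` is a made-up helper name.

```python
def estimate_weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of quantized weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


VRAM_GB = 16.0   # VRAM mentioned in the comment above
N_PARAMS = 35e9  # taken from the "35b" in the model name

# Assumed, illustrative bits-per-weight values; real quants differ somewhat.
for bits in (4.5, 5.5, 8.5):
    size = estimate_weight_gb(N_PARAMS, bits)
    verdict = "fits in VRAM" if size <= VRAM_GB else "needs offload to system RAM"
    print(f"{bits} bits/weight: ~{size:.1f} GB of weights, {verdict}")
```

Under these assumptions none of the variants fits fully into 16 GB, so part of the model ends up in system RAM. A MoE model only activates a fraction of its parameters per token, which is probably why that remains usable.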

Franconian_Nomad@feddit.org · 11 hours ago

Yeah, I wouldn’t use it for coding. It’s a bit dumb unfortunately.

Franconian_Nomad@feddit.org · 11 hours ago

Looping was a problem once the context window reached a certain size. The llama.cpp flag --flash-attn on and the repetition penalty settings helped.
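
A minimal sketch of what such a launch could look like, assembled in Python so it is easy to tweak. The model path, context size and penalty value are placeholders I made up, not the settings actually used here; `build_llama_server_cmd` is a hypothetical helper. The flags themselves (`-m`, `--ctx-size`, `--flash-attn`, `--repeat-penalty`) are llama.cpp options, but check `llama-server --help` for your build, since older versions take `--flash-attn` without a value.

```python
import shlex


def build_llama_server_cmd(model_path: str,
                           ctx_size: int = 16384,
                           repeat_penalty: float = 1.1) -> list[str]:
    """Assemble a llama-server command line (all values are placeholders)."""
    return [
        "llama-server",
        "-m", model_path,                         # path to the GGUF file
        "--ctx-size", str(ctx_size),              # context window in tokens
        "--flash-attn", "on",                     # flag mentioned in the comment
        "--repeat-penalty", str(repeat_penalty),  # discourages repeated tokens
    ]


cmd = build_llama_server_cmd("model.gguf")
print(shlex.join(cmd))  # copy the printed line into a shell to run it
```

The list can also be passed directly to `subprocess.run(cmd)` if llama-server is on the PATH.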

Franconian_Nomad@feddit.org · 11 hours ago

I’m not a coder, so I don’t know exactly. It is able to code, but I would say somebody with experience should guide it and keep an eye on the results.

Franconian_Nomad@feddit.org · 16 hours ago

Have you tried qwen3.5-9b? It’s pretty solid for its size.

Franconian_Nomad@feddit.org · 16 hours ago

I don’t exactly host it; I just use it when I’m not using my graphics card for gaming. I run Qwen3.6-35b on my RX 9070 XT with 16 GB of VRAM at 34 t/s. I use it as an IT advisor, admin and Linux teacher for my CachyOS gaming PC.