Yeah, I wouldn’t use it for coding. It’s a bit dumb unfortunately.
Joined 2 years ago
Cake day: September 27th, 2024
Looping became a problem once the context grew past a certain size. The llama.cpp flag --flash-attn on and repetition penalties helped.
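For anyone who wants to try the same thing, here is a rough sketch of how such a launch could look, written as a small Python helper that builds the llama-server command line. The model path, context size and penalty values are placeholders I made up, not the settings from the comment above. Recent llama.cpp builds take --flash-attn on/off/auto; older builds used it as a plain switch, so check --help on your version.

```python
import shlex


def build_llama_server_cmd(model_path, ctx_size=32768,
                           repeat_penalty=1.1, repeat_last_n=256):
    """Build an argv list for llama-server with flash attention
    and a repetition penalty enabled.

    All default values are illustrative assumptions, not tuned settings.
    """
    return [
        "llama-server",
        "-m", model_path,                         # path to the GGUF file (placeholder)
        "-c", str(ctx_size),                      # context window size in tokens
        "--flash-attn", "on",                     # flash attention, as mentioned above
        "--repeat-penalty", str(repeat_penalty),  # penalize repeated tokens
        "--repeat-last-n", str(repeat_last_n),    # how many recent tokens the penalty covers
    ]


cmd = build_llama_server_cmd("model.gguf")
# Print a copy-pasteable shell command; pass `cmd` to subprocess.run to launch it.
print(shlex.join(cmd))
```

If the model still loops, raising the repeat penalty slightly or widening the --repeat-last-n window are the obvious knobs to try first.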
I’m not a coder, so I don’t know exactly. It is able to code, but I would say somebody with experience should guide it and keep an eye on the results.
Have you tried qwen3.5-9b? It’s pretty solid for its size.
I don’t host it exactly, I just use it when I’m not using my graphics card for gaming. I run Qwen3.6-35b on my RX 9700 XT with 16 GB of VRAM at 34 t/s. I use it as an IT advisor, admin and Linux teacher for my CachyOS gaming PC.


Yeah, a higher quant would be nice. I actually try not to go below Q5, but you can only do so much with 16 GB of VRAM and DDR4 system RAM.
But I must say I’m pretty impressed by Qwen3.6-35b, not only by its capabilities but also by its hardware requirements. MoE for the win, I guess.
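To put rough numbers on the quant vs. VRAM trade-off, here is a back-of-the-envelope sketch. It only estimates the weight file size as parameters times bits per weight, and ignores KV cache and runtime overhead. The bits-per-weight figures are approximate values I am assuming for common GGUF quants, and 35 billion is just the parameter count implied by the model name.

```python
def estimate_weights_gb(params_billion, bits_per_weight):
    """Rough weight size in GB: parameters * bits per weight / 8 bits per byte."""
    return params_billion * bits_per_weight / 8


# Approximate bits per weight for some GGUF quant types (assumed, not exact).
APPROX_BPW = {"Q4_K_M": 4.85, "Q5_K_M": 5.7, "Q8_0": 8.5}

VRAM_GB = 16        # VRAM of the card mentioned above
PARAMS_BILLION = 35  # taken from the model name, an assumption

for quant, bpw in APPROX_BPW.items():
    size = estimate_weights_gb(PARAMS_BILLION, bpw)
    spill = max(0.0, size - VRAM_GB)  # the part that has to live in system RAM
    print(f"{quant}: ~{size:.1f} GB of weights, ~{spill:.1f} GB beyond {VRAM_GB} GB VRAM")
```

Whatever does not fit in VRAM ends up in the slower DDR4 system RAM. A MoE model only reads a fraction of its weights per token, which is probably why the speed stays usable despite the spill.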
RWKV sounds interesting, have to look into it, thanks!