Do you host your own AI?
SuspiciousCarrot78@aussie.zone to Selfhosted@lemmy.world · English · 17 hours ago · 151 comments

Domi@lemmy.secnd.me · 4 hours ago
Yes, I got a Strix Halo machine before the RAM price hike and use it to run all my ML stuff.
Currently using llama-swap with llama.cpp/ComfyUI, and opencode/Open WebUI as frontends.
I’m running Qwen3.6-27b, Voxtral Mini 4b, Piper and Qwen Image. Also some embedding and reranking models.
I use them for:
- Tagging and classification of my documents in Paperless
- Home Assistant (voice assistant)
- Translations (both text and image)
- Transcriptions
- Some light coding and debugging
- Avatar/Backdrop generation for DnD sessions

SuspiciousCarrot78@aussie.zone (OP) · 3 hours ago
What sort of tok/s are you getting on the Strix?

Domi@lemmy.secnd.me · 1 hour ago
About 200 t/s prompt processing and 10-20 t/s generation with MTP.
It depends greatly on the task: predictable output like code generates at 18-20 t/s, creative writing more like 10-17 t/s.
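
For a rough sense of what those rates mean in practice, here is a back-of-the-envelope sketch. It assumes total response time is simply prompt tokens divided by prompt speed plus output tokens divided by generation speed, and it ignores model load and swap time. The 2000-token prompt and 500-token reply are made-up illustration values, not measurements from this machine, and `response_seconds` is a hypothetical helper.

```python
def response_seconds(prompt_tokens, output_tokens, prompt_tps, gen_tps):
    """Estimate wall-clock seconds for one request.

    Simplifying assumption: prompt processing and generation run back to
    back at constant speed, with no model load or swap overhead.
    """
    return prompt_tokens / prompt_tps + output_tokens / gen_tps


# Hypothetical request: 2000-token prompt, 500-token reply.
# 200 t/s prompt processing is the figure quoted above; 10 and 20 t/s
# are the two ends of the quoted generation range.
slow = response_seconds(2000, 500, prompt_tps=200, gen_tps=10)
fast = response_seconds(2000, 500, prompt_tps=200, gen_tps=20)
print(f"between {fast:.0f} s and {slow:.0f} s")
```

Under these assumptions generation dominates: the prompt takes about 10 s either way, while the reply takes 25 to 50 s depending on where in the 10-20 t/s range the task lands.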

SuspiciousCarrot78@aussie.zone (OP) · 32 minutes ago
Damn - I thought Strix would do a bit better than that, for how much it costs.