Do you host your own AI?
SuspiciousCarrot78@aussie.zone to Selfhosted@lemmy.world · English · 18 hours ago · 151 comments
hexagonwin@lemmy.today · 3 hours ago
a haswell xeon e5-1650 machine, i remember running llama 7b in llama.cpp in like 2023 and it was quite sluggish. guess i should try whisper at some point…
SuspiciousCarrot78@aussie.zone (OP) · 3 hours ago (edited)
Ha. You were doing CPU inference on a Haswell-era chip. Been there, done that.
OTOH, whisper.cpp is heavily optimised for CPU.
Plus, you’re doing batch transcription, not real-time, so slow doesn’t actually matter.
Fire off Whisper small or medium overnight and wake up to searchable text.
PS: if you want a good, fast little LLM, something like Qwen 3.6 2B will work well on the Xeon.
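If it helps, here is a rough sketch of what the overnight batch run could look like. It is only an illustration under assumptions that are not from this thread: the binary name (`whisper-cli`; older whisper.cpp builds call it `main`), the model path, and the `recordings` folder are all placeholders. It also assumes your audio is already in a format your whisper.cpp build accepts (many builds expect 16 kHz WAV, so convert with ffmpeg first if needed).

```python
import subprocess
from pathlib import Path

# Assumptions, not from the thread: binary name and model location.
WHISPER_BIN = "whisper-cli"            # older whisper.cpp builds name this "main"
MODEL = Path("models/ggml-small.bin")  # hypothetical path to a Whisper small model


def pending_files(folder: Path) -> list[Path]:
    """Return the .wav files in folder that have no matching .txt transcript yet."""
    return sorted(
        wav for wav in folder.glob("*.wav")
        if not wav.with_suffix(".txt").exists()
    )


def build_command(wav: Path, model: Path = MODEL, binary: str = WHISPER_BIN) -> list[str]:
    """Build the whisper.cpp command line for a single audio file."""
    return [
        binary,
        "-m", str(model),                 # model file
        "-f", str(wav),                   # input audio
        "-otxt",                          # write a plain-text transcript
        "-of", str(wav.with_suffix("")),  # output path, extension added by whisper.cpp
    ]


def transcribe_folder(folder: Path) -> None:
    """Transcribe every pending file, one at a time; slow is fine for a batch job."""
    for wav in pending_files(folder):
        subprocess.run(build_command(wav), check=True)
```

Calling `transcribe_folder(Path("recordings"))` before bed would leave one `.txt` next to each `.wav`. Because files that already have a transcript are skipped, an interrupted run can simply be started again.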