Do you host your own ML / AI / LLM? What do you use, and what do you use it for?

  • SuspiciousCarrot78@aussie.zone (OP)
    3 hours ago

    How ancient is ancient? TTS and STT are much lighter than LLMs (e.g. Whisper, Piper, Kokoro, Coqui)…you might have more capability than you think, especially if you’re doing batch processing like that.

    • hexagonwin@lemmy.today
      3 hours ago

      A Haswell Xeon E5-1650 machine. I remember running LLaMA 7B in llama.cpp around 2023 and it was quite sluggish. Guess I should try Whisper at some point…

      • SuspiciousCarrot78@aussie.zone (OP)
        3 hours ago

        Ha. You were doing LLM inference on a Haswell-era CPU. Been there, done that.

        OTOH…whisper.cpp is heavily optimised for CPU inference.

        Plus, you’re doing batch transcription, not real-time, so slow doesn’t actually matter.

        Fire off Whisper small or medium overnight and wake up to searchable text.

        PS: if you want a good, fast little LLM, something like Qwen 3.6 2B will work well on the Xeon.
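
The overnight batch run suggested above could look roughly like this. It is only a sketch under assumptions: it shells out to the whisper.cpp command-line tool (called `whisper-cli` in recent builds, `main` in older ones), the model path and the `recordings` folder are hypothetical placeholders, the audio is assumed to be already converted to 16 kHz WAV, and the default of 6 threads is a guess at the physical core count of that Xeon. Files that already have a transcript are skipped, so the job can be restarted.

```python
import subprocess
from pathlib import Path

# Assumptions: adjust both to match your own whisper.cpp build and model download.
WHISPER_BIN = "whisper-cli"        # older whisper.cpp builds name this binary "main"
MODEL = "models/ggml-small.bin"    # hypothetical path to a ggml Whisper model


def build_command(wav: Path, threads: int = 6) -> list[str]:
    """Build the whisper.cpp command line for one WAV file."""
    return [
        WHISPER_BIN,
        "-m", MODEL,                       # model file
        "-f", str(wav),                    # input audio
        "-t", str(threads),                # CPU threads to use
        "-otxt",                           # write a plain-text transcript
        "-of", str(wav.with_suffix("")),   # output path, extension added by the tool
    ]


def pending(folder: Path) -> list[Path]:
    """Return WAV files that do not yet have a .txt transcript beside them."""
    return sorted(
        wav for wav in folder.glob("*.wav")
        if not wav.with_suffix(".txt").exists()
    )


def transcribe_all(folder: Path) -> None:
    """Transcribe every pending file, one after another."""
    for wav in pending(folder):
        subprocess.run(build_command(wav), check=True)

# Example (hypothetical folder): transcribe_all(Path("recordings"))
```

Once the `.txt` files exist, any plain-text search tool will work on them.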