Do you host your own ML / AI / LLM? What do you use, and what do you use it for?

  • chaospatterns@lemmy.world
    link
    fedilink
    English
    arrow-up
    4
    ·
    edit-2
    4 hours ago

    Partially. I started with hosting my own llama3.2 + granite4 models using Ollama for my Home Assistant smart home and for general chat with OpenWebUI. I also run whisper for speech-to-text locally on my 1080 Ti GPU. I like the privacy and ownership of my self-hosted models, but I started to run into limitations with the small weights. So I built some tools that allow me to selectively route traffic to larger models hosted on DeepInfra depending on my need. For example, to GLM/Kimi models for code reviews or for my custom harnesses or harder problems.