I wanted to share my enthusiasm, which makes me feel like a little boy (despite being 50+) fascinated by how such complex systems can be managed so easily by novices. I started using Proxmox recently. I had a machine running one VM with various Docker images installed, but its NVMe was tiny. So I set up another node and pointed it at the same NFS share on the NAS where I had saved full backups of the VM. Once I had added the NFS share to the new node (which has a bigger local ZFS partition), I simply restored the VM from the backup the original node had written to that share. It imported seamlessly and started. Then I cloned it on the new node so that I could get it onto the new ZFS partition. The next task is to get a bigger NVMe for the original machine, install Proxmox from scratch, and add it to the cluster so that it shares the backup NFS share. Then I just need to understand how to get HA up and running so that VMs are always synced flawlessly. Proxmox is super brilliant. I feel like I have a data center at home :-) I could not imagine this system was so flexible and relatively easy to use. The people who deliver and contribute to this stuff are super cool. A couple of Proxmox nodes, a TrueNAS SCALE NAS and a good backup strategy, and your data is really safe and rock solid … I hope :-)

  • 4am@lemmy.zip
    7 hours ago

    There are ways to do it with a network disk being present or something, but generally HA in Proxmox needs an odd number of nodes to reach quorum. Basically, if an HA node detects that it is isolated, it freezes all of its VMs: it assumes that it is the one having network issues and that other nodes, which may not be isolated themselves, could be running the same VMs. The whole point of HA is that if a node and its VMs disappear, the remaining nodes take over duties until the missing node returns.

    If you have an even number of nodes, you need a tiebreaker vote to reach quorum - half the total nodes plus 1 for a majority is the default.

    You can adjust the total number of node “votes” that dictate what quorum is, but if you have two nodes and you set it to 1, then you’ll always have “split brain”, where copies of the same VMs keep running on both nodes. If you set it to 2, then either node going down will freeze the other as well (both will assume they are the one with problems, since they’d both be below quorum). Therefore you need an odd number of votes.
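The vote arithmetic above can be checked with a small sketch. It is only an illustration of the majority rule as described in this comment, not Proxmox or Corosync code; the function names `majority` and `sides_with_quorum` are made up for the example, and it assumes every member carries exactly one vote.

```python
from typing import Optional, Tuple


def majority(total_votes: int) -> int:
    """Default quorum as described above: half the total votes plus one."""
    return total_votes // 2 + 1


def sides_with_quorum(
    side_a: int, side_b: int, required: Optional[int] = None
) -> Tuple[bool, bool]:
    """Split the cluster into two groups that cannot see each other and
    report which group still has enough votes to keep running VMs.

    `required` overrides the default majority (hypothetical knob standing in
    for manually adjusting the vote threshold).
    """
    total = side_a + side_b
    need = majority(total) if required is None else required
    return (side_a >= need, side_b >= need)


# Two nodes, default majority of 2: a 1/1 split leaves both sides frozen.
print(sides_with_quorum(1, 1))              # (False, False)

# Two nodes with the threshold forced to 1: both sides keep running (split brain).
print(sides_with_quorum(1, 1, required=1))  # (True, True)

# Three votes (third node or tiebreaker): exactly one side keeps quorum.
print(sides_with_quorum(2, 1))              # (True, False)

# Four nodes split evenly: neither side has a majority of 3.
print(sides_with_quorum(2, 2))              # (False, False)
```

With an odd number of votes, a two-way split can never be a tie, so at most one side keeps running, which is the property the tiebreaker vote is there to guarantee.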

    The best way is to have a third host (or a 5th, or a 7th, etc. 😅); but there is a way (tutorial on Proxmox’s docs) to set up the presence of a network share as a tie-breaking vote, rather than a full additional node; the idea being that if the node can see the disk, that means it can see the network and therefore it is the node you’d want running the VMs.

    So plan carefully around this; it’s not fun when a cluster you’ve become dependent on for services deadlocks itself 😅 Ask my wife how I know this.