Engineering Note
Proxmox cluster quorum and QDevice setup
Why two-node Proxmox clusters lock up, how corosync counts votes, setting up a corosync QDevice tiebreaker, and the safe node-maintenance procedure.
TL;DR
Proxmox allows cluster changes only when more than half of all votes are present, so a two-node cluster freezes solid the moment either node goes down — one vote out of two is not a majority. The fix is a third vote: a full third node, or a lightweight corosync QDevice on any always-on box. This note covers the vote math, the QDevice setup commands, what actually breaks at quorum loss, and the maintenance procedure that avoids the self-inflicted outage.
The math
Corosync grants quorum when votes_present > votes_total / 2 — strictly
greater, which is the part people forget. Consequences worth memorizing:
| Nodes (votes) | Survives the loss of |
|---|---|
| 2 | nothing — either node down = no quorum |
| 2 + QDevice | 1 node |
| 3 | 1 node |
| 4 | 1 node (2 of 4 is not a majority!) |
| 5 | 2 nodes |
Even node counts buy you nothing over the odd count below them. My lab runs 3 votes; if you’re at 4 nodes, add a QDevice to make it 5.
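The table can be reproduced with a few lines of shell arithmetic. This is a minimal sketch: the function names quorum_threshold and tolerated_failures are made up for illustration and are not part of corosync or pvecm.

```shell
#!/bin/sh
# Hypothetical helpers (not part of corosync or pvecm) that mirror the
# "strictly greater than half" rule described above.

# Smallest vote count that is strictly greater than half of the total.
quorum_threshold() {
    echo $(( $1 / 2 + 1 ))
}

# Number of votes that can disappear while the rest still meet the threshold.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for total in 2 3 4 5; do
    echo "votes=$total quorum=$(quorum_threshold "$total") tolerates=$(tolerated_failures "$total")"
done
```

Integer division does the work here: 4 votes need 3 present, so a 4-vote cluster tolerates one loss, exactly like a 3-vote one.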
Without quorum, Proxmox doesn’t just pause HA: /etc/pve (pmxcfs) goes
read-only, so you can’t start VMs, edit configs, or migrate. The cluster is
protecting you from split-brain, in which both halves write divergent state
and you get to spend a weekend reconciling VM configs.
QDevice: the two-node fix
A QDevice adds an outside vote. The corosync-qnetd daemon runs on any always-on Linux box outside the cluster (a Raspberry Pi, a NAS container, a tiny VM at another site), and the corosync-qdevice daemon on each cluster node consults it. The qnetd box is not a Proxmox node; it just answers “which partition should win.”
On the tiebreaker box (Debian/Ubuntu):
apt install corosync-qnetd
On the cluster (install the package on every node, then run the setup from any one node with root SSH to the qnetd box):
apt install corosync-qdevice # on every cluster node
pvecm qdevice setup <QNETD_IP> # on one node only
Verify:
pvecm status
# Expected: "Total votes: 3", Qdevice line present,
# Flags: Quorate Qdevice
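If you want to script that check instead of eyeballing it, the status text can be parsed. A minimal sketch, assuming the output contains a Flags: line in the form shown in the comment above. The function name has_qdevice_vote is hypothetical, and it reads the status text on stdin so it can be tried against saved output.

```shell
# Hypothetical helper: succeeds only if the status text on stdin has a
# Flags: line listing both Quorate and Qdevice.
# Assumes a line shaped like "Flags:            Quorate Qdevice".
has_qdevice_vote() {
    awk '
        /^Flags:/ {
            quorate = 0; qdevice = 0
            for (i = 2; i <= NF; i++) {
                if ($i == "Quorate") quorate = 1
                if ($i == "Qdevice") qdevice = 1
            }
            found = (quorate && qdevice)
        }
        END { exit (found ? 0 : 1) }
    '
}

# Usage on a cluster node:
#   pvecm status | has_qdevice_vote && echo "QDevice vote is active"
```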
Gotchas I’ve hit:
- The setup command needs root SSH (key auth) to the qnetd host; set that up first or it fails halfway with a partial config.
- Removing a node later? Remove the qdevice first (pvecm qdevice remove), fix the cluster, re-add it. Vote math with a stale qdevice config gets confusing fast.
- Put the qdevice on a third failure domain. A qdevice plugged into the same UPS as node 1 answers the wrong question during a power event.
Safe maintenance procedure
The self-inflicted outage pattern: reboot node A for updates, and while it’s down, touch something on node B that needs quorum — or worse, node B hiccups and now nothing has quorum. The procedure that avoids it:
- pvecm status: confirm quorate and all votes present before starting.
- Migrate or shut down guests on the target node (ha-manager handles HA guests if configured).
- Reboot/patch the node. The remaining node + qdevice keep quorum.
- Wait for it to rejoin (pvecm status shows full votes) before touching the next node. Never overlap.
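The first and last steps are the same check, so it is worth scripting. A minimal sketch, assuming pvecm status prints "Quorate:", "Expected votes:" and "Total votes:" lines. The function name all_votes_present is hypothetical, and it reads the status text on stdin so it can be tested without a cluster.

```shell
# Hypothetical pre-flight gate: succeeds only if the status text on stdin
# reports a quorate cluster with no vote missing.
# Assumes lines shaped like "Quorate: Yes", "Expected votes: 3", "Total votes: 3".
all_votes_present() {
    awk -F': *' '
        /^Quorate:/        { quorate = ($2 ~ /^Yes/) }
        /^Expected votes:/ { expected = $2 + 0 }
        /^Total votes:/    { total = $2 + 0 }
        END { exit ((quorate && expected > 0 && total == expected) ? 0 : 1) }
    '
}

# Usage before touching a node, and again before moving to the next one:
#   pvecm status | all_votes_present || { echo "votes missing, stop" >&2; exit 1; }
```

Comparing total against expected votes, rather than only checking the quorate flag, is deliberate: a cluster with one node already down can still be quorate, and that is exactly the state in which a second reboot causes the outage.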
Emergency-only: if a node is permanently dead and you must operate the
survivor of a two-node cluster, pvecm expected 1 unlocks it. Understand
what you’re asserting: that the other node cannot possibly be alive. Power
the dead node off at the PDU before you run the command, and rebuild proper
quorum the same week; this override has a way of becoming permanent (see the
lab-discipline argument in building a production-grade lab).
Frequently asked questions
- Why does my two-node Proxmox cluster become read-only when one node is off?
- Quorum requires strictly more than half of total votes. With two votes total, one node has exactly half — not a majority — so pmxcfs mounts read-only and VM starts/migrations are refused. This is by design to prevent split-brain; add a QDevice or third node for a tiebreaking vote.
- Is 'pvecm expected 1' safe to use?
- Only as a deliberate emergency override when you are certain the other node is truly dead and disconnected. It tells the survivor to proceed without a majority, which switches off the very check that prevents split-brain. Never leave it as a standing configuration, and never use it while the other node might still be running.