VMHA Stuck in "Waiting"
Problem
VMHA remains stuck in the "waiting" state during enablement.
Environment
- Private Cloud Director Virtualization - v2025.4 and higher
- Self-Hosted Private Cloud Director Virtualization - v2025.4 and higher
Cause
A decommissioned host was still listed in Nova's service records. VMHA therefore tried to use that host during setup, hit an error, and remained stuck in the "waiting" state.
Diagnostics
SaaS customers: contact the Platform9 Support Team to confirm whether you are hitting the issue described in this article.
- Check the VMHA logs:
command
$ kubectl exec deploy/hamgr -n <REGION_NAMESPACE> -- cat /var/log/pf9/hamgr/hamgr.log | grep -A1 'Enabling HA'
Look for log entries like:
hamgr.log
Enabling HA on some of the hosts [...] including host '[HOST-ID]'
WARNING Role status of host [HOST-ID] is not ok
- List the compute services and check whether any hypervisor shows "Status" as disabled and "State" as down.
Identify services that are down, disabled, or associated with non-existent or decommissioned hosts. In the sample output, HOST2.EXAMPLE.COM is the decommissioned node.
command
$ openstack compute service list

#sample output
+--------------------+--------------+---------------------+--------+----------+-------+-------------+
| ID                 | Binary       | Host                | Zone   | Status   | State | Updated At  |
+--------------------+--------------+---------------------+--------+----------+-------+-------------+
| [HOST1_SERVICE_ID] | nova-compute | [HOST1.EXAMPLE.COM] | [zone] | enabled  | up    | [TIMESTAMP] |
| [HOST2_SERVICE_ID] | nova-compute | [HOST2.EXAMPLE.COM] | [zone] | disabled | down  | [TIMESTAMP] |
| [HOST3_SERVICE_ID] | nova-compute | [HOST3.EXAMPLE.COM] | [zone] | enabled  | up    | [TIMESTAMP] |
+--------------------+--------------+---------------------+--------+----------+-------+-------------+
- List the hypervisors and validate the host mapping. In the sample output, the node [HOST2.EXAMPLE.COM] is in a down state. Check its associated service ID to validate the host mapping.
command
$ openstack hypervisor list

#sample output:
+--------------+---------------------+-----------------+-------------+-------+
| ID           | Hypervisor Hostname | Hypervisor Type | Host IP     | State |
+--------------+---------------------+-----------------+-------------+-------+
| [HOST1_UUID] | [HOST1.EXAMPLE.COM] | QEMU            | [IP-ADDR-1] | up    |
| [HOST2_UUID] | [HOST2.EXAMPLE.COM] | QEMU            | [IP-ADDR-2] | down  |
+--------------+---------------------+-----------------+-------------+-------+

$ openstack hypervisor show <HYPERVISOR_ID>

#sample output
$ openstack hypervisor show [HOST2_UUID]
+---------------------+----------------------+
| Field               | Value                |
+---------------------+----------------------+
| aggregates          | []                   |
| cpu_info            | None                 |
| host_ip             | [IP-ADDR-2]          |
| hypervisor_hostname | [HOST2.EXAMPLE.COM]  |
| hypervisor_type     | QEMU                 |
| hypervisor_version  | [HYPERVISOR_VERSION] |
| id                  | [HOST2_UUID]         |
| service_host        | [SERVICE_HOST_UUID]  |
| service_id          | [HOST2_SERVICE_ID]   |
| state               | down                 |
| status              | disabled             |
+---------------------+----------------------+
Resolution
- Identify the stale compute service entry from the output of the command below. In the sample output, the node HOST2.EXAMPLE.COM is down.
command
$ openstack compute service list

#sample output
+--------------------+--------------+---------------------+--------+----------+-------+-------------+
| ID                 | Binary       | Host                | Zone   | Status   | State | Updated At  |
+--------------------+--------------+---------------------+--------+----------+-------+-------------+
| [HOST1_SERVICE_ID] | nova-compute | [HOST1.EXAMPLE.COM] | [zone] | enabled  | up    | [TIMESTAMP] |
| [HOST2_SERVICE_ID] | nova-compute | [HOST2.EXAMPLE.COM] | [zone] | disabled | down  | [TIMESTAMP] |
| [HOST3_SERVICE_ID] | nova-compute | [HOST3.EXAMPLE.COM] | [zone] | enabled  | up    | [TIMESTAMP] |
+--------------------+--------------+---------------------+--------+----------+-------+-------------+
- Delete the stale service using the command below. After the stale entry is deleted, the sample environment still has the minimum of two working hypervisors required to enable VMHA.
command
$ openstack compute service delete <HOST2_SERVICE_ID>
- Wait for VMHA to retry the operation automatically, or disable and re-enable VMHA to trigger a fresh attempt.
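On larger clusters, picking out the stale entry by eye can be error-prone. The following is a minimal sketch, not an official Platform9 tool, of how the candidates could be filtered. It assumes the service list is exported as plain rows of "ID Host Status State" and treats a row as stale only when it is both disabled and down, as in the sample output. The function name filter_stale_services is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print compute services that look stale (Status=disabled, State=down).
# Reads rows of "ID Host Status State" on stdin, for example from:
#   openstack compute service list --service nova-compute \
#     -f value -c ID -c Host -c Status -c State
# filter_stale_services is a hypothetical helper name, not part of any CLI.
filter_stale_services() {
  # Field 3 is Status, field 4 is State; print only the service ID and host.
  awk '$3 == "disabled" && $4 == "down" { print $1, $2 }'
}
```

Pipe the openstack command shown in the comment into filter_stale_services and review each printed host before deleting anything. A host that is down for maintenance also matches this filter, so confirm that it is really decommissioned first.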
Validation:
- Ensure the VMHA state transitions from waiting to enabled.
- Confirm no additional stale hosts remain.
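One way to check for remaining stale hosts is to look for further role-status warnings in hamgr.log. The sketch below assumes the warning text matches the sample shown under Diagnostics ("Role status of host [HOST-ID] is not ok"). The function name hosts_with_bad_role is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: list the unique host IDs that hamgr flagged with a bad role status.
# Reads hamgr.log content on stdin. The matched message format is taken from
# the sample log line in this article and may differ between versions.
# hosts_with_bad_role is a hypothetical helper name.
hosts_with_bad_role() {
  sed -n 's/.*Role status of host \([^ ]*\) is not ok.*/\1/p' | sort -u
}
```

To use it, feed it the log output, for example by piping the kubectl exec ... cat /var/log/pf9/hamgr/hamgr.log command from Diagnostics into hosts_with_bad_role. The log also keeps entries from before the fix, so compare timestamps before concluding that a host is still affected.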
Additional Information:
- At least two working hypervisors are required to enable VMHA.
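The two-hypervisor requirement can be checked before deleting a service or re-enabling VMHA. This is a rough sketch that assumes State "up" in the hypervisor list is a reasonable proxy for a working hypervisor. It does not verify anything else about host health. Both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: count hypervisors whose State is "up" and compare with the minimum.
# Reads rows of "Hostname State" on stdin, for example from:
#   openstack hypervisor list -f value -c "Hypervisor Hostname" -c State
# count_up_hypervisors and has_min_hypervisors are hypothetical helper names.
count_up_hypervisors() {
  # The last field of each row is the State column.
  awk '$NF == "up" { n++ } END { print n + 0 }'
}

has_min_hypervisors() {
  # Succeeds (exit 0) when at least two hypervisors on stdin report "up".
  [ "$(count_up_hypervisors)" -ge 2 ]
}
```

Piping the hypervisor list into has_min_hypervisors returns a zero exit status when the requirement is met, so it can guard a later step in a script.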