server: investigate pending HA work when executing in new MS session #10167
For HA work items that are created for a host state change, checks must be done when execution is called in a new management server session. A new column, reason, has been added to the cloud.op_ha_work table to track the reason for the HA work. When HighAvailabilityManager starts, it finds all pending HA work items and puts them in the Investigating state. During execution, if an HA work item is found in the Investigating state, checks are done to verify that the work is still valid. If the job is found to be invalid, it is cancelled. Signed-off-by: Abhishek Kumar <[email protected]>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## 4.20 #10167 +/- ##
============================================
+ Coverage 16.07% 16.99% +0.92%
- Complexity 12884 12977 +93
============================================
Files 5639 5245 -394
Lines 494184 461985 -32199
Branches 59924 54191 -5733
============================================
- Hits 79420 78513 -907
+ Misses 405934 374799 -31135
+ Partials 8830 8673 -157
Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
@blueorangutan package
@shwstppr a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
sureshanaparti
left a comment
clgtm
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12012
@blueorangutan test
@shwstppr a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests
[SF] Trillian test result (tid-12060)
Signed-off-by: Abhishek Kumar <[email protected]>
@blueorangutan package
@shwstppr a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12025
Signed-off-by: Abhishek Kumar <[email protected]>
@blueorangutan package
@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12041
Signed-off-by: Abhishek Kumar <[email protected]>
@blueorangutan package
@shwstppr a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 12145
@blueorangutan test
@kiranchavala a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests
[SF] Trillian test result (tid-12189)
kiranchavala
left a comment
LGTM. Tested the HostMaintenance, HostDown and HostDegraded scenarios.
After a management server restart, CloudStack validates the entries in the op_ha_work table and takes action based on the host state.
| Test Case Execution | Result |
|---|---|
| Test host maintenance scenario | Pass |
| Test host HA / host down scenario | Pass |
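As a rough illustration of what "takes action based on the host state" could mean, the sketch below compares the recorded reason of a work item with the host's current state. The enum names mirror the scenarios named in the review comment, but the types, the method `isStillValid`, and the exact reason-to-state mapping are assumptions made for this example; they are not taken from the CloudStack code in this PR.

```java
// Hypothetical sketch: NOT CloudStack's actual classes or validation rules.
class HaWorkValiditySketch {
    // Reasons named after the scenarios tested in the review (assumed values).
    enum Reason { Unknown, HostMaintenance, HostDown, HostDegraded }

    // Simplified host states, assumed for illustration only.
    enum HostState { Up, Down, Maintenance, Degraded }

    // Assumed rule: work stays valid only while the host is still in the
    // state that caused the work to be created.
    static boolean isStillValid(Reason reason, HostState current) {
        switch (reason) {
            case HostMaintenance:
                return current == HostState.Maintenance;
            case HostDown:
                return current == HostState.Down;
            case HostDegraded:
                return current == HostState.Degraded;
            default:
                // No recorded reason: nothing to check against, so do not cancel.
                return true;
        }
    }
}
```

Under this assumed rule, a work item created because a host went down would be cancelled if the host is back up by the time a restarted management server executes it.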
…ession (apache#519) Fixes apache#511 For HA work items that are created for a host state change, checks must be done when execution is called in a new management server session. A new column, reason, has been added to the cloud.op_ha_work table to track the reason for the HA work. When HighAvailabilityManager starts, it finds all pending HA work items and puts them in the Investigating state. During execution, if an HA work item is found in the Investigating state, checks are done to verify that the work is still valid. If the job is found to be invalid, it is cancelled. Upstream PR: apache#10167 Signed-off-by: Abhishek Kumar <[email protected]>
Description
For HA work items that are created for a host state change, checks must be done when execution is called in a new management server session.
A new column, reason, has been added to the cloud.op_ha_work table to track the reason for the HA work.
When HighAvailabilityManager starts, it finds all pending HA work items and puts them in the Investigating state. During execution, if an HA work item is found in the Investigating state, checks are done to verify that the work is still valid. If the job is found to be invalid, it is cancelled.
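The flow described above can be sketched in a few lines of Java. This is a minimal, self-contained model and not the code from this PR: the class `HaWorkSketch`, the `WorkItem` type, the `Step` and `Reason` enums, and both method names are hypothetical, and the validity check is passed in as a predicate because the description does not spell out the actual rules.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical, simplified model of the flow; not CloudStack's real classes.
class HaWorkSketch {
    // Assumed lifecycle steps for a work item.
    enum Step { Scheduled, Investigating, Done, Cancelled }

    // Assumed values for the new "reason" column.
    enum Reason { Unknown, HostMaintenance, HostDown, HostDegraded }

    static final class WorkItem {
        final long id;
        final Reason reason;
        Step step;

        WorkItem(long id, Reason reason, Step step) {
            this.id = id;
            this.reason = reason;
            this.step = step;
        }
    }

    // On manager start: move every still-pending item to Investigating so
    // that it is re-checked before it runs in the new session.
    static void markPendingForInvestigation(List<WorkItem> items) {
        for (WorkItem item : items) {
            if (item.step == Step.Scheduled) {
                item.step = Step.Investigating;
            }
        }
    }

    // On execution: an item under investigation is cancelled if the supplied
    // check says it is no longer valid; otherwise the work proceeds.
    static Step execute(WorkItem item, Predicate<WorkItem> stillValid) {
        if (item.step == Step.Done || item.step == Step.Cancelled) {
            return item.step; // already finished, nothing to do
        }
        if (item.step == Step.Investigating && !stillValid.test(item)) {
            item.step = Step.Cancelled;
            return item.step;
        }
        item.step = Step.Done; // stands in for running the real HA work
        return item.step;
    }
}
```

Keeping the validity check as a separate predicate keeps the restart handling independent of how a given reason is verified against the host's current state.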
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Screenshots (if appropriate):
How Has This Been Tested?
How did you try to break this feature and the system with this change?