Slot migration fails silently when SetSlotRange times out or faults
Describe the bug
During cluster slot migration, MigrateSession.TrySetSlotRanges can silently fail without marking the migration as failed, leaving slots in an indeterminate state.
The method uses a .ContinueWith(TaskContinuationOptions.OnlyOnRanToCompletion).WaitAsync().Result pattern to send CLUSTER SETSLOTRANGE commands to the target node. When the underlying task times out, is canceled, or faults, the OnlyOnRanToCompletion continuation never executes. The resulting exception is caught by the generic catch (Exception) block, which returns false but does not set Status = MigrateState.FAIL. This leaves the migration session in a PENDING state rather than FAIL, which can prevent proper recovery and cleanup.
Additionally, the error message from the initial failure is lost, so the root cause cannot be diagnosed from the logs.
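The failure mode can be reproduced in isolation. The sketch below is not the Garnet code: `FakeSetSlotRangeAsync` is a hypothetical stand-in for the call that sends CLUSTER SETSLOTRANGE, and the 5-second timeout is arbitrary. It assumes .NET 6 or later, where `Task.WaitAsync` is available. Because the antecedent faults, the `OnlyOnRanToCompletion` continuation is never scheduled and its task ends up canceled, so the caller observes a cancellation rather than the original error.

```csharp
using System;
using System.Threading.Tasks;

public static class ContinuationDemo
{
    // Hypothetical stand-in for the call that sends CLUSTER SETSLOTRANGE.
    // It always faults, simulating a network error on the target node.
    static Task<string> FakeSetSlotRangeAsync() =>
        Task.FromException<string>(new InvalidOperationException("target node unreachable"));

    // Returns the type name of the exception the caller actually observes.
    public static string ObservedExceptionType()
    {
        try
        {
            // Same shape as the pattern described in this issue.
            FakeSetSlotRangeAsync()
                .ContinueWith(t => t.Result, TaskContinuationOptions.OnlyOnRanToCompletion)
                .WaitAsync(TimeSpan.FromSeconds(5))
                .Wait();
            return "none";
        }
        catch (Exception ex)
        {
            // Blocking on a task wraps failures in AggregateException; unwrap it.
            var inner = ex is AggregateException agg ? agg.GetBaseException() : ex;
            return inner.GetType().Name;
        }
    }

    public static void Main()
    {
        // Expected to report a cancellation, not the InvalidOperationException above.
        Console.WriteLine(ObservedExceptionType());
    }
}
```

The original "target node unreachable" message does not appear anywhere in the exception the caller sees, which matches the lost error message described above.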
Steps to reproduce the bug
Trigger a migration that fails
Inspect logs
Expected behavior
When SetSlotRange fails for any reason (timeout, cancellation, network error), the migration session should:
- Set Status = MigrateState.FAIL
- Log a diagnostic message that identifies the affected slots and the nature of the failure
- Return false so the caller can trigger recovery
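A minimal sketch of the behavior listed above, under stated assumptions. `MigrateSessionSketch`, its constructor, the injected `send` delegate, the `"OK"` response check, the `SUCCESS` enum member, and logging to `Console.Error` are all hypothetical and chosen for illustration; only `Status`, `MigrateState.FAIL`, and `PENDING` come from this issue. The idea is to block on the task directly (no `OnlyOnRanToCompletion` continuation) so that timeouts, cancellations, and faults all reach one handler that sets the state, logs the slots and the cause, and returns false. It assumes .NET 6 or later for `Task.WaitAsync`.

```csharp
using System;
using System.Threading.Tasks;

// Stand-in for the real enum; SUCCESS is an assumed member.
public enum MigrateState { PENDING, SUCCESS, FAIL }

public sealed class MigrateSessionSketch
{
    public MigrateState Status { get; private set; } = MigrateState.PENDING;
    public string LastError { get; private set; } = "";

    // Hypothetical delegate that sends SETSLOTRANGE for slots [start, end].
    private readonly Func<int, int, Task<string>> send;
    private readonly TimeSpan timeout;

    public MigrateSessionSketch(Func<int, int, Task<string>> send, TimeSpan timeout)
    {
        this.send = send;
        this.timeout = timeout;
    }

    public bool TrySetSlotRange(int start, int end)
    {
        try
        {
            // GetAwaiter().GetResult() rethrows the original exception
            // (or TimeoutException from WaitAsync) without an AggregateException wrapper.
            var response = send(start, end).WaitAsync(timeout).GetAwaiter().GetResult();
            if (response == "OK") return true;
            return Fail(start, end, $"unexpected response '{response}'");
        }
        catch (Exception ex)
        {
            return Fail(start, end, $"{ex.GetType().Name}: {ex.Message}");
        }
    }

    // Single failure path: mark the session failed, record the cause, return false.
    private bool Fail(int start, int end, string reason)
    {
        Status = MigrateState.FAIL;
        LastError = $"SetSlotRange failed for slots {start}-{end}: {reason}";
        Console.Error.WriteLine(LastError);
        return false;
    }

    public static void Main()
    {
        var session = new MigrateSessionSketch(
            (s, e) => Task.FromException<string>(new InvalidOperationException("target node unreachable")),
            TimeSpan.FromSeconds(1));
        Console.WriteLine($"{session.TrySetSlotRange(0, 100)} {session.Status}");
    }
}
```

Whether the state change belongs inside the method or in its caller is a design choice for the maintainers; the sketch only shows that every failure path can converge on the same three actions.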
Screenshots
No response
Release version
No response
IDE
No response
OS version
No response
Additional context
No response