reconcile: wait till PVC is ready#1971

Merged
vrutkovs merged 1 commit into master from wait-for-pvc-status on Mar 20, 2026

AndrewChubatiuk (Contributor) commented Mar 16, 2026

fixes #1970


Summary by cubic

Wait for PVCs to be fully ready (Bound with requested capacity and not Resizing) before continuing reconcile and during StatefulSet PVC expansion. This pauses pod rollouts until storage is actually usable.

  • Bug Fixes
    • Added waitForPVCReady with polling via VM_PVC_WAIT_READY_INTERVAL and VM_PVC_WAIT_READY_TIMEOUT; waits for requested capacity, PVC to be Bound, and not Resizing; skips if PVC is deleting.
    • Wrapped PVC get/create/update in a retry-on-conflict flow and skipped updates for PVCs with a non-zero DeletionTimestamp.
    • Invoked the wait after PVC reconcile and for each PVC in StatefulSet expansion; unified readiness polling via config (reconcile.Init now takes BaseOperatorConf; added VM_WAIT_READY_INTERVAL for VM CRs; renamed internals to AppWaitReadyTimeout and PodWaitReadyInterval, env names unchanged).

Written for commit 1dbeca6. Summary will update on new commits.

}

return updatePVC(ctx, rclient, &existingObj, newObj, prevObj, owner)
func waitForPVCBound(ctx context.Context, rclient client.Client, nsn types.NamespacedName, generation int64) error {
Collaborator

Let's add unit tests for this function.

cubic-dev-ai bot (Contributor) left a comment

4 issues found across 3 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="internal/controller/operator/factory/reconcile/reconcile.go">

<violation number="1" location="internal/controller/operator/factory/reconcile/reconcile.go:27">
P1: The new PVC wait uses a hardcoded 5s timeout, so PVC binding can fail much earlier than the operator's configured readiness deadlines.</violation>
</file>

<file name="internal/controller/operator/factory/reconcile/statefulset_pvc_expand.go">

<violation number="1" location="internal/controller/operator/factory/reconcile/statefulset_pvc_expand.go:124">
P1: `waitForPVCBound` is called with the StatefulSet name instead of the PVC name, so it polls a non-existent PVC and times out.</violation>
</file>

<file name="internal/controller/operator/factory/reconcile/pvc.go">

<violation number="1" location="internal/controller/operator/factory/reconcile/pvc.go:51">
P1: Terminating PVCs no longer short-circuit successfully; this unconditional wait turns the previous skip path into a reconcile error.</violation>

<violation number="2" location="internal/controller/operator/factory/reconcile/pvc.go:51">
P1: Waiting for PVCs to reach `Bound` here can block valid `WaitForFirstConsumer` claims before their Deployment is created.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
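For the `statefulset_pvc_expand.go` finding above, it may help to recall how Kubernetes names the claims it creates from a StatefulSet's volumeClaimTemplates: `<template name>-<StatefulSet name>-<ordinal>`. The sketch below only illustrates that naming rule; the helper names are hypothetical, it assumes ordinals start at 0 (the default), and it is not how the PR resolves PVC names.

```go
package main

import "fmt"

// statefulSetPVCName builds the name Kubernetes gives to a PVC created from
// a StatefulSet volumeClaimTemplate: <template>-<statefulset>-<ordinal>.
func statefulSetPVCName(claimTemplate, stsName string, ordinal int) string {
	return fmt.Sprintf("%s-%s-%d", claimTemplate, stsName, ordinal)
}

// statefulSetPVCNames lists the claim names for every replica, assuming
// ordinals run from 0 to replicas-1.
func statefulSetPVCNames(claimTemplate, stsName string, replicas int) []string {
	names := make([]string, 0, replicas)
	for i := 0; i < replicas; i++ {
		names = append(names, statefulSetPVCName(claimTemplate, stsName, i))
	}
	return names
}
```

Polling by the StatefulSet name alone would look up a claim that does not exist, which matches the timeout the review describes.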

AndrewChubatiuk changed the title from "reconcile: wait till PVC is in bound phase" to "reconcile: wait till PVC is ready" on Mar 17, 2026
AndrewChubatiuk (Contributor, Author)

@cubic-dev-ai

cubic-dev-ai bot (Contributor) commented Mar 17, 2026

@AndrewChubatiuk I have started the AI code review. It will take a few minutes to complete.

cubic-dev-ai bot (Contributor) left a comment

3 issues found across 15 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="internal/controller/operator/factory/reconcile/deploy.go">

<violation number="1" location="internal/controller/operator/factory/reconcile/deploy.go:102">
P1: `ProgressTimeoutExceeded` is not the Deployment condition reason Kubernetes sets here; use `ProgressDeadlineExceeded` or rollout timeout failures will be missed.</violation>
</file>

<file name="internal/controller/operator/factory/reconcile/statefulset_pvc_expand.go">

<violation number="1" location="internal/controller/operator/factory/reconcile/statefulset_pvc_expand.go:127">
P1: This waits on a stale PVC generation, so a just-updated PVC can be treated as ready before the resize spec is observed.</violation>
</file>

<file name="internal/controller/operator/factory/k8stools/interceptors.go">

<violation number="1" location="internal/controller/operator/factory/k8stools/interceptors.go:82">
P2: This now calls `Status().Update` for StatefulSet, Deployment, and PVC too, but the fake client is not configured with status subresources for those types, so intercepted updates will fail in tests.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
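The `deploy.go` finding above turns on the exact reason string. When a Deployment rollout exceeds its progress deadline, Kubernetes reports it on the `Progressing` condition with reason `ProgressDeadlineExceeded`. A minimal sketch of such a check follows; `deploymentCondition` is a hypothetical trimmed-down mirror of the real condition type, and `rolloutTimedOut` is an illustrative name, not a function from the PR.

```go
package main

// deploymentCondition is a hypothetical, reduced mirror of the fields of a
// Deployment status condition that matter for timeout detection.
type deploymentCondition struct {
	Type   string
	Status string
	Reason string
}

// progressDeadlineExceeded is the reason Kubernetes sets on the Progressing
// condition when a rollout exceeds spec.progressDeadlineSeconds.
const progressDeadlineExceeded = "ProgressDeadlineExceeded"

// rolloutTimedOut reports whether the Progressing condition carries the
// deadline-exceeded reason.
func rolloutTimedOut(conds []deploymentCondition) bool {
	for _, c := range conds {
		if c.Type == "Progressing" && c.Reason == progressDeadlineExceeded {
			return true
		}
	}
	return false
}
```

A comparison against a misspelled reason such as `ProgressTimeoutExceeded` would never match, so a stalled rollout would be missed.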

AndrewChubatiuk force-pushed the wait-for-pvc-status branch 5 times, most recently from f58e467 to e137c63, on March 17, 2026 11:04
AndrewChubatiuk force-pushed the wait-for-pvc-status branch 2 times, most recently from aec1218 to e330290, on March 17, 2026 14:18
if !existingObj.CreationTimestamp.IsZero() {
size = existingObj.Spec.Resources.Requests[corev1.ResourceStorage]
}
if err = waitForPVCReady(ctx, rclient, nsn, size); err != nil {
Collaborator

Should we wait for a ready PVC only after the DeletionTimestamp check? If the VMCluster is being deleted, we don't need to wait for a bound PVC.

AndrewChubatiuk (Contributor, Author), Mar 19, 2026

There are two loops: one for creation/update and another that waits for PVC readiness. Deletion may happen during either loop, which is why the deletion timestamp is checked in both; in that case the function exits without an error and logs a warning. No further reconciliation is needed until the user takes manual action.

Collaborator

Ah, that makes sense
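The creation/update loop described in this thread can be sketched as a bounded retry-on-conflict loop that re-reads the claim on every attempt and stops quietly when the claim is terminating. Everything here is a simplified assumption: `claim`, `errConflict`, the `get`/`update` callbacks, and `ensureSize` are hypothetical stand-ins for the Kubernetes client and its conflict errors, not code from the PR.

```go
package main

import "errors"

// errConflict is a hypothetical stand-in for an API-server conflict error
// (the object changed between read and write).
var errConflict = errors.New("conflict")

// claim is a hypothetical, reduced view of a PVC.
type claim struct {
	Deleting  bool  // DeletionTimestamp is non-zero
	SizeBytes int64 // requested storage size
}

// ensureSize re-reads the claim on every attempt, skips claims that are
// being deleted, and retries the update only on conflict errors.
// It returns skipped=true when the claim is terminating.
func ensureSize(get func() (claim, error), update func(claim) error,
	want int64, maxAttempts int) (bool, error) {
	for attempt := 0; attempt < maxAttempts; attempt++ {
		c, err := get()
		if err != nil {
			return false, err
		}
		if c.Deleting {
			// Terminating claim: the caller logs a warning, no error.
			return true, nil
		}
		if c.SizeBytes >= want {
			return false, nil // already at the desired size
		}
		c.SizeBytes = want
		err = update(c)
		if err == nil {
			return false, nil
		}
		if !errors.Is(err, errConflict) {
			return false, err
		}
		// Conflict: loop to fetch a fresh copy and try again.
	}
	return false, errors.New("gave up after repeated update conflicts")
}
```

Because the deletion check sits inside the loop, a claim that starts terminating between attempts is still caught, which is the point made in the reply above.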

AndrewChubatiuk (Contributor, Author)

@vrutkovs could you please take a look?

@vrutkovs vrutkovs merged commit e7ef10e into master Mar 20, 2026
6 checks passed
@vrutkovs vrutkovs deleted the wait-for-pvc-status branch March 20, 2026 13:46
AndrewChubatiuk added a commit that referenced this pull request Mar 20, 2026
Development

Successfully merging this pull request may close these issues.

Cluster is operational prematurely on PVC expand

2 participants