Tracking issue filed here: velero-io/velero#3055
We used the AWS plugin provider with MinIO as the backup storage location.
To reproduce, simply try to back up the cluster resources:
velero create backup mybackup-1 -n spp-velero
The backup remains in phase "InProgress" and never reaches the "Completed" phase.
When monitoring the Velero pod, we see that it is restarted during the backup, which might be why the backup never reaches the Completed phase.
What did you expect to happen:
The Velero backup of the cluster resources should finish in the Completed phase.
The output of the following commands will help us better understand what's going on:
(Pasting long output into a GitHub gist or other pastebin is fine.)
kubectl logs deployment/velero -n velero
oc logs -p deployment/velero -n spp-velero
time="2020-11-05T18:44:10Z" level=info msg="Setting up backup log" backup=spp-velero/mybackup-1 controller=backup logSource="pkg/controller/backup_controller.go:512"
time="2020-11-05T18:44:10Z" level=info msg="Setting up backup temp file" backup=spp-velero/mybackup-1 logSource="pkg/controller/backup_controller.go:534"
time="2020-11-05T18:44:10Z" level=info msg="Setting up plugin manager" backup=spp-velero/mybackup-1 logSource="pkg/controller/backup_controller.go:541"
time="2020-11-05T18:44:10Z" level=info msg="Getting backup item actions" backup=spp-velero/mybackup-1 logSource="pkg/controller/backup_controller.go:545"
time="2020-11-05T18:44:10Z" level=info msg="Setting up backup store to check for backup existence" backup=spp-velero/mybackup-1 logSource="pkg/controller/backup_controller.go:551"
time="2020-11-05T18:44:10Z" level=info msg="Writing backup version file" backup=spp-velero/mybackup-1 logSource="pkg/backup/backup.go:236"
time="2020-11-05T18:44:10Z" level=info msg="Including namespaces: *" backup=spp-velero/mybackup-1 logSource="pkg/backup/backup.go:242"
time="2020-11-05T18:44:10Z" level=info msg="Excluding namespaces: " backup=spp-velero/mybackup-1 logSource="pkg/backup/backup.go:243"
time="2020-11-05T18:44:10Z" level=info msg="Including resources: *" backup=spp-velero/mybackup-1 logSource="pkg/backup/backup.go:246"
time="2020-11-05T18:44:10Z" level=info msg="Excluding resources: " backup=spp-velero/mybackup-1 logSource="pkg/backup/backup.go:247"
time="2020-11-05T18:44:10Z" level=info msg="Backing up all pod volumes using restic: false" backup=spp-velero/mybackup-1 logSource="pkg/backup/backup.go:248"
time="2020-11-05T18:44:23Z" level=info msg="Getting items for group" backup=spp-velero/mybackup-1 group=v1 logSource="pkg/backup/item_collector.go:76"
time="2020-11-05T18:44:23Z" level=info msg="Getting items for resource" backup=spp-velero/mybackup-1 group=v1 logSource="pkg/backup/item_collector.go:165" resource=pods
time="2020-11-05T18:44:23Z" level=info msg="Listing items" backup=spp-velero/mybackup-1 group=v1 logSource="pkg/backup/item_collector.go:291" namespace= resource=pods
time="2020-11-05T18:44:24Z" level=info msg="Retrieved 215 items" backup=spp-velero/mybackup-1 group=v1 logSource="pkg/backup/item_collector.go:297" namespace= resource=pods
time="2020-11-05T18:44:24Z" level=info msg="Getting items for resource" backup=spp-velero/mybackup-1 group=v1 logSource="pkg/backup/item_collector.go:165" resource=persistentvolumeclaims
time="2020-11-05T18:44:24Z" level=info msg="Listing items" backup=spp-velero/mybackup-1 group=v1 logSource="pkg/backup/item_collector.go:291" namespace= resource=persistentvolumeclaims
time="2020-11-05T18:44:24Z" level=info msg="Retrieved 3 items" backup=spp-velero/mybackup-1 group=v1 logSource="pkg/backup/item_collector.go:297" namespace= resource=persistentvolumeclaims
time="2020-11-05T18:44:24Z" level=info msg="Getting items for resource" backup=spp-velero/mybackup-1 group=v1 logSource="pkg/backup/item_collector.go:165" resource=persistentvolumes
time="2020-11-05T18:44:24Z" level=info msg="Listing items" backup=spp-velero/mybackup-1 group=v1 logSource="pkg/backup/item_collector.go:291" namespace= resource=persistentvolumes
time="2020-11-05T18:44:24Z" level=info msg="Retrieved 3 items" backup=spp-velero/mybackup-1 group=v1 logSource="pkg/backup/item_collector.go:297" namespace= resource=persistentvolumes
time="2020-11-05T18:44:24Z" level=info msg="Getting items for resource" backup=spp-velero/mybackup-1 group=v1 logSource="pkg/backup/item_collector.go:165" resource=namespaces
time="2020-11-05T18:44:24Z" level=info msg="Listing items" backup=spp-velero/mybackup-1 group=v1 logSource="pkg/backup/item_collector.go:291" namespace= resource=namespaces
time="2020-11-05T18:44:24Z" level=info msg="Retrieved 64 items" backup=spp-velero/mybackup-1 group=v1 logSource="pkg/backup/item_collector.go:297" namespace= resource=namespaces
time="2020-11-05T18:44:24Z" level=info msg="Getting items for resource" backup=spp-velero/mybackup-1 group=v1 logSource="pkg/backup/item_collector.go:165" resource=events
time="2020-11-05T18:44:24Z" level=info msg="Listing items" backup=spp-velero/mybackup-1 group=v1 logSource="pkg/backup/item_collector.go:291" namespace= resource=events
time="2020-11-05T18:44:24Z" level=info msg="Retrieved 2148 items" backup=spp-velero/mybackup-1 group=v1 logSource="pkg/backup/item_collector.go:297" namespace= resource=events
time="2020-11-05T18:44:25Z" level=info msg="Getting items for resource" backup=spp-velero/mybackup-1 group=v1 logSource="pkg/backup/item_collector.go:165" resource=secrets
time="2020-11-05T18:44:25Z" level=info msg="Listing items" backup=spp-velero/mybackup-1 group=v1 logSource="pkg/backup/item_collector.go:291" namespace= resource=secrets
velero backup describe or kubectl get backup/ -n velero -o yaml
velero backup get -n spp-velero
NAME         STATUS       ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
mybackup-1   InProgress   0        0          2020-11-05 19:44:10 +0100 CET   29d       default
[root@nevada16 install]# velero backup describe mybackup-1 -n spp-velero
Name:         mybackup-1
Namespace:    spp-velero
Labels:       velero.io/storage-location=default
Annotations:  velero.io/source-cluster-k8s-gitversion=v1.18.3+2fbd7c7
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=18+
Phase:  InProgress
Errors:    0
Warnings:  0
Namespaces:
  Included:  *
  Excluded:
Resources:
  Included:        *
  Excluded:
  Cluster-scoped:  auto
Label selector:
Storage Location:  default
Velero-Native Snapshot PVs:  auto
TTL:  720h0m0s
Hooks:
Backup Format Version:  1.1.0
Started:    2020-11-05 19:44:10 +0100 CET
Completed:  <n/a>
Expiration:  2020-12-05 19:44:10 +0100 CET
Velero-Native Snapshots:
velero backup logs
velero backup logs mybackup-1 -n spp-velero
Logs for backup "mybackup-1" are not available until it's finished processing. Please wait until the backup has a phase of Completed or Failed and try again.
velero restore describe or kubectl get restore/ -n velero -o yaml
velero restore logs
Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]
The pod always restarts while, or right after, retrieving the secrets resource. This is reproducible.
A backup that excludes secrets works fine:
velero create backup mybackup-2 --exclude-resources secrets -n spp-velero
velero get backup -n spp-velero
NAME         STATUS       ERRORS   WARNINGS   CREATED                         EXPIRES   STORAGE LOCATION   SELECTOR
mybackup-1   InProgress   0        0          2020-11-05 19:44:10 +0100 CET   29d       default
mybackup-2   Completed    0        1          2020-11-05 19:48:35 +0100 CET   29d       default
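To double-check which resource listing was still in flight when the pod restarted, the log of the previous container can be filtered mechanically. The following is only a sketch, not part of Velero: the function name unfinished_listings is made up for illustration, and it assumes the log format shown above, where each "Listing items" line is normally followed by a "Retrieved N items" line for the same resource.

```shell
# Hypothetical helper (not a Velero command): reads a Velero server log on
# stdin and prints every resource that has a "Listing items" line without a
# later "Retrieved N items" line for the same resource.
unfinished_listings() {
  awk '
    /msg="Listing items"/ {
      # "resource=" is 9 characters long; keep only the value after it.
      if (match($0, /resource=[^ ]+/))
        pending[substr($0, RSTART + 9, RLENGTH - 9)] = 1
    }
    /msg="Retrieved [0-9]+ items"/ {
      # The listing finished, so the resource is no longer pending.
      if (match($0, /resource=[^ ]+/))
        delete pending[substr($0, RSTART + 9, RLENGTH - 9)]
    }
    END { for (r in pending) print r }
  '
}
```

With the function defined in the current shell, it could be used as: oc logs -p deployment/velero -n spp-velero | unfinished_listings. For the log excerpt above, the only listing without a matching "Retrieved" line is secrets, so that is the one resource it would report.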
Environment:
OCP 4.5.15; we see the same behavior on 4.5.6 and 4.6.1.
Velero version (use velero version): 1.5.2 and 1.4.3
Velero features (use velero client config get features):
Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2-0-g52c56ce", GitCommit:"b66f2d3a6893be729f1b8660309a59c6e0b69196", GitTreeState:"clean", BuildDate:"2020-08-10T04:49:09Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18+", GitVersion:"v1.18.3+2fbd7c7", GitCommit:"2fbd7c7", GitTreeState:"clean", BuildDate:"2020-10-09T11:41:17Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"linux/amd64"}
Kubernetes installer & version:
Cloud provider or hardware configuration:
OS (e.g. from /etc/os-release): CoreOS 4.5