Remove pgBackrest repo pod name from backup #2451

nonemax · 2021-05-14T08:20:58Z

Checklist:

Have you added an explanation of what your changes do and why you'd like them to be included?
Have you updated or added documentation for the change, as applicable?
Have you tested your changes on all related environments with successful results, as applicable?

Type of Changes:

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

What is the current behavior? (link to any open issues here)
Currently, the name of the pgBackRest pod must be specified when creating a backup.

What is the new behavior (if this is a feature change)?
This PR sets the pgBackRest pod name automatically when creating a backup.
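The automatic lookup could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Pod` type and `findRepoPod` function are hypothetical stand-ins for the Kubernetes client types, and identifying the repo pod through the `pg-cluster` and `pgo-backrest-repo` labels (with a value of `"true"`) is an assumption based on the labels mentioned elsewhere in this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// Pod is a hypothetical, simplified stand-in for a Kubernetes Pod.
type Pod struct {
	Name   string
	Labels map[string]string
	Phase  string
}

// findRepoPod returns the name of the single running pgBackRest repo Pod
// for the given cluster, so the caller does not have to supply it by hand.
// The label keys and the "true" value are assumptions for illustration.
func findRepoPod(pods []Pod, cluster string) (string, error) {
	var matches []string
	for _, p := range pods {
		if p.Labels["pg-cluster"] == cluster &&
			p.Labels["pgo-backrest-repo"] == "true" &&
			p.Phase == "Running" {
			matches = append(matches, p.Name)
		}
	}
	switch len(matches) {
	case 0:
		return "", errors.New("no running pgBackRest repo pod found for cluster " + cluster)
	case 1:
		return matches[0], nil
	default:
		return "", fmt.Errorf("expected 1 pgBackRest repo pod for cluster %s, found %d", cluster, len(matches))
	}
}

func main() {
	// Example data only; the pod names are made up.
	pods := []Pod{
		{Name: "hippo-abcd", Labels: map[string]string{"pg-cluster": "hippo"}, Phase: "Running"},
		{Name: "hippo-repo-xyz", Labels: map[string]string{"pg-cluster": "hippo", "pgo-backrest-repo": "true"}, Phase: "Running"},
	}
	name, err := findRepoPod(pods, "hippo")
	fmt.Println(name, err)
}
```

Returning an error when zero or several pods match keeps the backup from silently running against the wrong pod.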

Other information:

This error check is using an `err` variable that is defined and handled earlier in the code, which leads to an extra error in the logs.

When performing an in-place PostgreSQL cluster restore (such as when using the 'pgo restore' command), the PVC for the current primary database (includes the PGDATA PVC, along with any WAL and/or tablespace PVCs) will now be preserved if found (as identified by the 'current-primary' annotation on the pgcluster custom resource). This will cause the 'crunchy-postgres-ha' container to attempt a pgBackRest "delta" restore when bootstrapping the restored cluster, therefore leveraging any existing data within the PGDATA directory to efficiently restore the database. Issue: [ch9878]

The pgo-backrest and pgo-backrest-repo containers have been moved to the Crunchy Containers project. As such, the associated files are no longer needed in this repository. Additionally, the references to these containers are now updated to match the new naming convention being used, and the image tag and prefix values are updated to reflect the new location of the containers. This change removes debug flags and references to the unused sshd_port env variable. It also fixes a minor copy paste error in the docs. Co-authored-by: TJ Moore <[email protected]>

A post-failover backup is now only triggered for non-standby clusters. Therefore, if a failover occurs within a standby cluster, an automatic backup will no longer be run. Issue: [ch9912] Issue: CrunchyData#2102

As part of the compaction changes, some labels and label checks were changed. This PR reverts these changes. The `pgo-scheduler` code was updated to check for a `crunchy-pgbackrest-repo` label instead of the `pgo-backrest-repo` label. The deployment templates were not updated to use the updated label, so the scheduler would fail to create a backup job when scheduling a backup. The pgo.sqlrunner template was updated to have the `sqlrunner` label instead of `pgo-sqlrunner`.

This was originally written for the pgAdmin 4 integration, but can serve multiple purposes for some of the advanced updating logic.

A rolling update of a PostgreSQL cluster involves applying any updates that may require downtime to each replica within a PostgreSQL cluster, followed by the promotion of a replica deemed suitable to be a primary, followed by the update being applied to the former primary. This commit introduces an interface to perform this exact behavior, allowing any updates to the Deployments of PostgreSQL instances to be applied in a rolling fashion. Issue: [ch9881]
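The ordering described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the interface introduced by the commit: `Instance`, `rollingUpdate`, and the `update`/`promote` callbacks are hypothetical names, and the first replica stands in for "a replica deemed suitable to be a primary".

```go
package main

import "fmt"

// Instance is a hypothetical handle on one PostgreSQL instance Deployment.
type Instance struct {
	Name      string
	IsPrimary bool
}

// rollingUpdate updates every replica first, then promotes a replica, and
// finally updates the former primary. It returns the names of the instances
// in the order they were updated.
func rollingUpdate(instances []Instance, update, promote func(Instance) error) ([]string, error) {
	var order []string
	var replicas []Instance
	var primary *Instance
	for i := range instances {
		if instances[i].IsPrimary {
			primary = &instances[i]
		} else {
			replicas = append(replicas, instances[i])
		}
	}
	// Step 1: apply the change to each replica.
	for _, r := range replicas {
		if err := update(r); err != nil {
			return order, fmt.Errorf("updating replica %s: %w", r.Name, err)
		}
		order = append(order, r.Name)
	}
	if primary == nil {
		return order, nil
	}
	// Step 2: promote a replica (assumption: the first one is suitable).
	if len(replicas) > 0 {
		if err := promote(replicas[0]); err != nil {
			return order, fmt.Errorf("promoting %s: %w", replicas[0].Name, err)
		}
	}
	// Step 3: apply the change to the former primary.
	if err := update(*primary); err != nil {
		return order, fmt.Errorf("updating former primary %s: %w", primary.Name, err)
	}
	return append(order, primary.Name), nil
}

func main() {
	// Example data only; the instance names are made up.
	instances := []Instance{{Name: "hippo", IsPrimary: true}, {Name: "hippo-abcd"}, {Name: "hippo-efgh"}}
	noop := func(Instance) error { return nil }
	order, err := rollingUpdate(instances, noop, noop)
	fmt.Println(order, err)
}
```

Stopping at the first failed step leaves the remaining instances untouched, which keeps the primary serving traffic if a replica update goes wrong.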

As this is a change that can cause downtime, it is prudent to try to limit the downtime by applying a rolling update methodology.

Modifying the annotations on the template portion of a Deployment Spec causes each Pod that is under management of a Deployment to be restarted. For managing a database server, this can be less than ideal. As such, it is prudent to employ a rolling update strategy for annotation updates on database instances to minimize downtime on the primary instance.

The `pgo update cluster --tablespace` functionality now leverages the rolling update algorithm to minimize the appearance of downtime to any connected clients.

The `--rolling` flag allows one to specify that a restart of a PostgreSQL cluster should occur in a rolling fashion, i.e. all the replicas are restarted, then a switchover occurs, then the newly demoted primary is restarted. This subsequently creates a task custom resource to perform the rolling update, as said updates can take some time to process. Issue: [ch9881]

The bootstrap Pod, a remnant of a cluster restore, gets caught up in the `pgo df` search, but unfortunately this is not a valid Pod. This excludes this Pod from being considered. Issue: [ch2029] Issue: CrunchyData#2029

When the bootstrap Job completes successfully after a restore, it contains information that ends up being consumed by other parts of the Operator system, such as Patroni. As the logs from the Job do not provide much, if any, helpful information after a restore succeeds, it's best to have the Operator eliminate the job. As such, this changes the behavior so that the bootstrap Job is removed. As this has led to some buggy behavior, this is being considered as a bug fix, as regular operational work would dictate that the Job is removed anyway. Issue: [ch9919]

By adding this limitation, Pods such as Evicted Pods would not be considered as a part of `pgo test` output, as this could present some odd scenarios, such as the presence of two primaries. Issue: [ch9931] Issue: CrunchyData#2095

This moves several home-constructed methods to using a similarly constructed one that is maintained upstream. Provides more consistency across the code that can serve future implementation.

The functionality of the crunchy-backrest-restore container is now included in the new crunchy-pgbackrest. As such, the existing reference to the obsolete container can now be removed.

For a variety of reasons, including the need to exec into Pods to get PVC status with `pgo df`, only running Pods should be considered for this command and, in particular, no evicted Pods. Issue: [ch9959] Issue: CrunchyData#2129
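The "only running Pods" rule used for `pgo df` and `pgo test` can be illustrated with a small filter. This is a sketch under stated assumptions: the `Pod` type and `runningPods` function are hypothetical, and the actual change may instead rely on a server-side filter such as the Kubernetes `status.phase=Running` field selector. Evicted Pods are reported by Kubernetes with phase `Failed` and reason `Evicted`, so a check on the phase excludes them.

```go
package main

import "fmt"

// Pod is a hypothetical, simplified stand-in for a Kubernetes Pod status.
type Pod struct {
	Name   string
	Phase  string
	Reason string
}

// runningPods keeps only the Pods whose phase is "Running", dropping
// evicted, pending, and completed Pods.
func runningPods(pods []Pod) []Pod {
	var running []Pod
	for _, p := range pods {
		if p.Phase == "Running" {
			running = append(running, p)
		}
	}
	return running
}

func main() {
	// Example data only; the pod names are made up.
	pods := []Pod{
		{Name: "hippo-abcd", Phase: "Running"},
		{Name: "hippo-efgh", Phase: "Failed", Reason: "Evicted"},
		{Name: "hippo-bootstrap", Phase: "Succeeded"},
	}
	for _, p := range runningPods(pods) {
		fmt.Println(p.Name)
	}
}
```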

pgBouncer is an optional deployment; as such, we should proceed if pgBouncer is not found.

Updates to a PostgreSQL cluster that warrant a rolling update are now aggregated to only trigger a single rolling update per action taken on a PostgreSQL cluster. This allows the changes to be rolled out more rapidly and limits the number of downtime events that need to take place.

This adds a CRD attribute to pgcluster called `exporter`, which will ultimately allow for the toggling on/off of the metrics sidecar within a PostgreSQL cluster. Includes an upgrade path for eliminating confusing labels for the enablement of the exporter.

This updates the "exporter.json" template, which is used for deploying the "crunchy-postgres-exporter" sidecar for metrics collection in a PostgreSQL cluster, to not have a preceding "," in it. This in turn allows for the file to be mapped into a Kubernetes Container object, for convenience of manipulation within a program.
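To illustrate why the leading comma matters, the sketch below decodes a rendered template fragment with Go's standard `encoding/json` package. The `Container` struct is a hypothetical, trimmed-down stand-in for the Kubernetes Container type, and the JSON shown is made up for the example rather than copied from "exporter.json".

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Container is a hypothetical, trimmed-down stand-in for a Kubernetes
// Container object.
type Container struct {
	Name  string `json:"name"`
	Image string `json:"image"`
}

// parseContainer decodes a rendered template into a Container. A fragment
// that starts with "," is not a valid JSON document, so decoding fails.
func parseContainer(rendered string) (Container, error) {
	var c Container
	err := json.Unmarshal([]byte(rendered), &c)
	return c, err
}

func main() {
	_, err := parseContainer(`,{"name":"exporter","image":"crunchy-postgres-exporter"}`)
	fmt.Println("leading comma fails:", err != nil)

	c, err := parseContainer(`{"name":"exporter","image":"crunchy-postgres-exporter"}`)
	fmt.Println(c.Name, err)
}
```

Once the fragment is a complete JSON object on its own, a program can decode it, adjust fields, and append it to a Deployment's container list instead of splicing strings.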

Match labels are immutable, and because some potentially mutable labels exist within the match labels for the PostgreSQL Deployment objects, it is necessary to reduce this set of labels to the minimum needed for properly deploying a cluster. This reduces the current set of match labels for a PostgreSQL instance to the following:

- vendor
- pg-cluster -- the name of the PostgreSQL cluster (group of all instances)
- deployment-name -- the name of the PostgreSQL instance
- pgo-pg-database
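As a rough sketch of this reduction, the helper below keeps only the four keys listed above when building a selector from a full label set. The `minimalMatchLabels` function and the label values in `main` are hypothetical and only serve as an illustration.

```go
package main

import "fmt"

// matchLabelKeys lists the label keys named in the commit message as the
// minimum set used for a PostgreSQL instance's match labels.
var matchLabelKeys = []string{"vendor", "pg-cluster", "deployment-name", "pgo-pg-database"}

// minimalMatchLabels copies only the stable keys from the full label set,
// leaving mutable labels out of the (immutable) selector.
func minimalMatchLabels(all map[string]string) map[string]string {
	out := make(map[string]string, len(matchLabelKeys))
	for _, k := range matchLabelKeys {
		if v, ok := all[k]; ok {
			out[k] = v
		}
	}
	return out
}

func main() {
	// Example values only; they are not taken from the PR.
	all := map[string]string{
		"vendor":          "example-vendor",
		"pg-cluster":      "hippo",
		"deployment-name": "hippo-abcd",
		"pgo-pg-database": "true",
		"role":            "replica", // mutable, so it stays out of the selector
	}
	fmt.Println(minimalMatchLabels(all))
}
```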

This commit introduces the ability to enable/disable the metrics collection sidecar (`crunchy-postgres-exporter`) during the lifetime of a PostgreSQL cluster. This can be toggled in multiple ways, including:

- The `exporter` attribute in pgclusters.crunchydata.com
- `pgo update cluster --enable-metrics`, which adds the sidecar
- `pgo update cluster --disable-metrics`, which removes the sidecar

As adding/removing a sidecar results in modifying a Deployment template, this action will trigger a rolling update of the PostgreSQL cluster in an effort to minimize any downtime.

This also has the net effect of moving the "ccp_monitoring" user that is created to being fully managed by the Postgres Operator. The `CollectSecretName` attribute is now removed from the pgcluster CRD, as is the "PgMonitorPassword" attribute from the `pgo-deployer` and other installers.

Issue: [ch7270] Issue: CrunchyData#1413

This introduces the `--exporter-rotate-password` flag to `pgo update cluster` so that the metrics collection password can be rotated.

Since the primary PVC for the cluster is now retained during an in-place PostgreSQL cluster restore in support of pgBackRest delta restores, when preparing a cluster for a restore we can no longer rely on the deletion of all PVCs as an indicator that the 'config' and 'leader' ConfigMaps created by Patroni can be removed. Therefore, the Operator now specifically waits for all Deployments to be successfully removed prior to deleting these resources. Issue: [ch9926]

This is a convenience for development, allowing all of the Golang binaries to be built from a single target.

While these should rarely, if ever, happen, the world of distributed computing is unpredictable and we should ensure our code can fail gracefully in these scenarios.

There were cases where this was failing due to too many quotes being used, so this should avoid said issues. Issue: [ch9981] Issue: CrunchyData#2108

This moves Grafana to 6.7.5 and Prometheus to 2.23.0. Note that this continues to use the upstream version.

This adds examples to the monitoring architecture and tutorial around how to enable metrics collection on an existing PostgreSQL cluster.

There is an additional requirement for the exporter image to include findutils.

This redirects to upstream projects to find out more about the available metrics.

This makes it easier for PGO Pods to work with other systems. Issue: CrunchyData#2407

The "secretName" for the "ssh-config" volume in the "cluster-bootstrap-job.json" template has been updated to reference the proper Secret as needed for restoring across namespaces.

This provides some modernity to help with reporting issues or requesting features for PGO.

When performing an in-place restore (e.g. using 'pgo restore'), any existing "bootstrap" Secrets are now deleted. This facilitates retry attempts, e.g. after a restore attempt fails, by ensuring all resources can be properly recreated in order to re-attempt the restore. Additionally, fixes the spelling of variable "BootstrapConfigPrefix", and moves it to a location where it can be utilized by both the "cluster" and "pgbackrest" packages.

These are unavailable on older versions of PostgreSQL. Issue: [ch11334]

Ensure it is easy to do test builds.

If $GOPATH is unset, this will default to using the standard module GOPATH.

This ensures the file perms on LICENSE.txt do not change when script is run.

The apiserver will reconcile its own TLS certificate -- it has been doing so for a while. This unifies the method to ensure that only the apiserver will generate its certificate, unless the user explicitly provides one. Issue: [ch11380]

As we are no longer calling openssl from the Ansible scripts, we do not need to install it explicitly. However, this is included in most, if not all, of the base images we use.

This includes some of the process debugging utils, vi, and less, and ensures tzdata is present [1].

[1] https://access.redhat.com/solutions/5616681

Issue: [ch11367]

This label is not referenced anywhere.

The label in question (pgremove) was used to indicate that the PVC was managed by PGO, but there are other labels that handle that.

This allows custom labels to be extended to the following objects that are managed by PGO:

- Pods
- Deployments
- Jobs
- PVCs
- ConfigMaps
- Secrets
- Services

Issue: [ch11329]

This allows for custom labels to be edited on all of the managed objects within a cluster. This works both from editing "userlabels" within the pgclusters custom resource, as well as via API calls. The changes are applied to Postgres instances using a rolling update. Issue: [ch11329]
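One way to picture how custom labels might be combined with the labels PGO already manages is a simple map merge. This is only a sketch: the `withUserLabels` function is hypothetical, and letting the operator-managed labels take precedence over user-supplied ones on a key collision is an assumption, not something stated in this PR.

```go
package main

import "fmt"

// withUserLabels returns a new label set containing the user-supplied
// labels plus the operator-managed ones. On a key collision the managed
// value wins (an assumption made for this sketch), so a custom label
// cannot break the selectors the operator depends on.
func withUserLabels(managed, user map[string]string) map[string]string {
	merged := make(map[string]string, len(managed)+len(user))
	for k, v := range user {
		merged[k] = v
	}
	for k, v := range managed {
		merged[k] = v
	}
	return merged
}

func main() {
	// Example values only; they are not taken from the PR.
	managed := map[string]string{"pg-cluster": "hippo"}
	user := map[string]string{"team": "payments", "pg-cluster": "other"}
	fmt.Println(withUserLabels(managed, user))
}
```

The same merged set could then be applied to each managed object type (Pods, Deployments, Jobs, PVCs, ConfigMaps, Secrets, Services), with the Postgres instances picking it up through a rolling update.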

This updates pgMonitor to v4.5-RC3, and makes additional changes from da556b9. Issue: [ch11334]

This includes some fixes and changes since 4.7.0-beta.2

Due to some reshuffling, these were not appearing in the "pgo show cluster" command, but now they are.

Different casing was used in different parts of the documentation for backrestResources and backrestLimits. This fixes it to be consistent with what's actually used.

Add `disable_fsgroup: "true"` for the OpenShift 3.11 installer.

…a#45)

* K8SPG-42 Remove podName from backup task
* K8SPG-42 Change error messages for getting backrest pod
* K8SPG-42 Change error messages for getting backrest pod, get back main to operator.yaml
* K8SPG-42 Fix typo
* K8SPG-42 Remove stoorageSpec from backup.yaml

andrewlecuyer · 2022-07-06T23:52:25Z

Closing since this PR targets PGO v4, which is no longer being maintained in the master branch.

If there is still a specific change related to this PR that you are interested in implementing within PGO v5, please feel free to submit a new issue or PR.

jmckulk and others added 30 commits December 2, 2020 12:31

Remove orphaned error check

b1e5421

This error check is using an `err` variable that is defined and handled earlier in the code, which leads to an extra error in the logs.

No post-failover backups for standby clusters

5c16b64

A post-failover backup is now only triggered for non-standby clusters. Therefore, if a failover occurs within a standby cluster, an automatic backup will no longer be run. Issue: [ch9912] Issue: CrunchyData#2102

Refactor wait for deployment function in context of a cluster

83aef43

This was originally written for the pgAdmin 4 integration, but can serve multiple purposes for some of the advanced updating logic.

Modify cluster update resources to utilize rolling updates

d621d4e

As this is a change that can cause downtime, it is prudent to try to limit the downtime by applying a rolling update methodology.

Modify adding tablespaces to use rolling updates

b5a84cf

The `pgo update cluster --tablespace` functionality now leverages the rolling update algorithm to minimize the appearance of downtime to any connected clients.

Do not consider bootstrap Pod in pgo df

1aefe22

The bootstrap Pod, a remnant of a cluster restore, gets caught up in the `pgo df` search, but unfortunately this is not a valid Pod. This excludes this Pod from being considered. Issue: [ch2029] Issue: CrunchyData#2029

Only consider running Pods for pgo test

57c9815

By adding this limitation, Pods such as Evicted Pods would not be considered as a part of `pgo test` output, as this could present some odd scenarios, such as the presence of two primaries. Issue: [ch9931] Issue: CrunchyData#2095

Use poll utility provided by Kubernetes for waiting

1b5fe49

This moves several home-constructed methods to using a similarly constructed one that is maintained upstream. Provides more consistency across the code that can serve future implementation.

Remove references to crunchy-backrest-restore

e42fe12

The functionality of the crunchy-backrest-restore container is now included in the new crunchy-pgbackrest. As such, the existing reference to the obsolete container can now be removed.

Do not consider evicted Pods with pgo df

2155584

For a variety of reasons, including the need to exec into Pods to get PVC status with `pgo df`, only running Pods should be considered for this command and, in particular, no evicted Pods. Issue: [ch9959] Issue: CrunchyData#2129

Ignore a pgBouncer not found error when updating annotations

fdef989

pgBouncer is an optional deployment; as such, we should proceed if pgBouncer is not found.

Allow for the metrics agent password to be rotated

021effb

This introduces the `--exporter-rotate-password` flag to `pgo update cluster` so that the metrics collection password can be rotated.

Add a "build" Makefile target

475d57c

This is a convenience for development, allowing all of the Golang binaries to be built from a single target.

Fix several edge case out-of-index panics

693a786

While these should rarely, if ever, happen, the world of distributed computing is unpredictable and we should ensure our code can fail gracefully in these scenarios.

Modify syntax for checking for recovery status via SQL (CrunchyData#2133)

c209b16

There were cases where this was failing due to too many quotes being used, so this should avoid said issues. Issue: [ch9981] Issue: CrunchyData#2108

Bump Grafana & Prometheus versions for Postgres Operator Monitoring

2fba41a

This moves Grafana to 6.7.5 and Prometheus to 2.23.0. Note that this continues to use the upstream version.

Provide more documentation on metrics enablement

4e53236

This adds examples to the monitoring architecture and tutorial around how to enable metrics collection on an existing PostgreSQL cluster.

jkatz and others added 24 commits April 26, 2021 16:12

Update package requirements for exporter image

fe78c1f

There is an additional requirement for the exporter image to include findutils.

Simplify the Crunchy Postgres Exporter docs

52892dc

This redirects to upstream projects to find out more about the available metrics.

Give names to the ports

129273b

This makes it easier for PGO Pods to work with other systems. Issue: CrunchyData#2407

Updates Secret Name In Bootstrap Job Template

4836191

The "secretName" for the "ssh-config" volume in the "cluster-bootstrap-job.json" template has been updated to reference the proper Secret as needed for restoring across namespaces.

Update GitHub issue templates

70a52be

This provides some modernity to help with reporting issues or requesting features for PGO.

Restrict pg_stat_statement_reset queries to PG12+ (CrunchyData#2418)

da556b9

These are unavailable on older versions of PostgreSQL. Issue: [ch11334]

Update Makefile targets

31f7412

Ensure it is easy to do test builds.

Update defaults for running aggregator script

314da25

If $GOPATH is unset, this will default to using the standard module GOPATH.

Small tweak to script

7f0ede5

This ensures the file perms on LICENSE.txt do not change when script is run.

Remove explicit package installation in deployer

62f5d60

As we are no longer calling openssl from the Ansible scripts, we do not need to install it explicitly. However, this is included in most, if not all, of the base images we use.

Include additional dependencies into minimal image

f87c88a

This includes some of the process debugging utils, vi, and less, and ensures tzdata is present [1].

[1] https://access.redhat.com/solutions/5616681

Issue: [ch11367]

Remove superfluous label from user labels

d407593

This label is not referenced anywhere.

Remove superfluous label on managed PVCs

7228e85

The label in question (pgremove) was used to indicate that the PVC was managed by PGO, but there are other labels that handle that.

Extend custom labels to PGO managed objects

99399e2

This allows custom labels to be extended to the following objects that are managed by PGO:

- Pods
- Deployments
- Jobs
- PVCs
- ConfigMaps
- Secrets
- Services

Issue: [ch11329]

Update pgMonitor version

ff7ef0e

This updates pgMonitor to v4.5-RC3, and makes additional changes from da556b9. Issue: [ch11334]

Update release notes

e27e167

This includes some fixes and changes since 4.7.0-beta.2

Bump version v4.7.0-beta.3

bf0f51d

Ensure custom labels appear in "pgo show cluster" command

9029cf8

Due to some reshuffling, these were not appearing in the "pgo show cluster" command, but now they are.

Fix typos in casing of backrest parameters

671856d

Different casing was used in different parts of the documentation for backrestResources and backrestLimits. This fixes it to be consistent with what's actually used.

Add explicit disable_fsgroup to OpenShift 3.11 manifest

081b9d3

Add `disable_fsgroup: "true"` for the OpenShift 3.11 installer.

nonemax changed the title ~~K8SPG-42 Remove pgBackrest repo pod name from backup.yaml (#45)~~ Remove pgBackrest repo pod name from backup.yaml (#45) May 14, 2021

nonemax changed the title ~~Remove pgBackrest repo pod name from backup.yaml (#45)~~ Remove pgBackrest repo pod name from backup(#45) May 14, 2021

nonemax changed the title ~~Remove pgBackrest repo pod name from backup(#45)~~ Remove pgBackrest repo pod name from backup May 14, 2021

andrewlecuyer force-pushed the master branch from 507d34a to 21905f0 July 7, 2021 15:07

andrewlecuyer closed this Jul 6, 2022

14 participants
