Check for PostgreSQL replication delay/lag -part3

November 26, 2016

By postgresdba

I think the best one would be 2 combined with a check if the wal receiver process is running before running that

query with something like:

$ ps aux | egrep ‘wal\sreceiver’

postgres 3858 0.0 0.0 2100112 3312 ? Ss 19:35 0:01 postgres:

wal receiver process streaming 36/900A738

This solution would only be run on the slave and it is pretty easy to setup.

Check For Postgresql Replication Delay/Lag -Part2

November 26, 2016

By postgresdba

Master:

SELECT pg_current_xlog_location();

Slave:

SELECT pg_last_xlog_receive_location();

and by comparing these two values you could see if the servers are in sync.

The problem yet again is that if streaming replication fails, both of these

functions will continue to return same values and you could still end up

thinking the replication is working. But also you need to query both the

master and slave to be able to monitor this, which is not that easy on

monitoring systems, and you still don’t have the information about the

actual lag in seconds, so you would still need to run the first query.

Check For Postgresql Replication Delay/Lag

November 26, 2016

By postgresdba

This can be achieved by comparing pg_last_xlog_receive_location() and

pg_last_xlog_replay_location() on the slave, and if they are the same it

returns 0, otherwise it runs the above query again:

SELECT

CASE

WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location() THEN 0

ELSE EXTRACT (EPOCH FROM now() – pg_last_xact_replay_timestamp())::INTEGER

END

AS replication_lag;

This query is all good, but the problem is that it is not safe. If for some

reason the master stops sending transaction logs, this query will continue

to return 0 and you will think the replication is working, when it is not.

Check Replication Delay/Lag In Postgressql -Part1

November 26, 2016

By postgresdba

I used to check replication delay/lag by running the following query on the

slave:

SELECT EXTRACT(EPOCH FROM (now() – pg_last_xact_replay_timestamp()))::INT;

This query works great and it is a very good query to give you the lag in

seconds. The problem is if the master is not active, it doesn’t mean a

thing. So you need to first check if two servers are in sync and if they

are, return 0.

Monitoring Approach For Streaming Replication With Hot Standby In Postgresql 9.3

November 26, 2016

By postgresdba

The people using PostgreSQL and the Streaming Replication feature seem to ask many of the same questions:

1. How best to monitor Streaming Replication?

2. What is the best way to do that?

3. Are there alternatives, when monitoring on Standby, to using the pg_stat_replication view on Master?

4. How should I calculate replication lag-time, in seconds, minutes, etc.?

In light of these commonly asked questions, I thought a blog would help. The following are some methods I’ve found to be useful.

Monitoring is critical for large infrastructure deployments where you have Streaming Replication for:

1. Disaster recovery

2. Streaming Replication is for High Availability

3. Load balancing, when using Streaming Replication with Hot Standby

PostgreSQL has some building blocks for replication monitoring, and the following are some important

functions and views which can be use for monitoring the replication:

1. pg_stat_replication view on master/primary server.

This view helps in monitoring the standby on Master. It gives you the following details:

pid: Process id of walsender process

usesysid: OID of user which is used for Streaming replication.

usename: Name of user which is used for Streaming replication

application_name: Application name connected to master

client_addr: Address of standby/streaming replication

client_hostname: Hostname of standby.

client_port: TCP port number on which standby communicating with WAL sender

backend_start: Start time when SR connected to Master.

state: Current WAL sender state i.e streaming

sent_location: Last transaction location sent to standby.

write_location: Last transaction written on disk at standby

flush_location: Last transaction flush on disk at standby.

replay_location: Last transaction flush on disk at standby.

sync_priority: Priority of standby server being chosen as synchronous standby

sync_state: Sync State of standby (is it async or synchronous).

e.g.:

postgres=# select * from pg_stat_replication ;

-[ RECORD 1 ]—-+———————————

pid | 1114

usesysid | 16384

usename | repuser

application_name | walreceiver

client_addr | 172.17.0.3

client_hostname |

client_port | 52444

backend_start | 15-MAY-14 19:54:05.535695 -04:00

state | streaming

sent_location | 0/290044C0

write_location | 0/290044C0

flush_location | 0/290044C0

replay_location | 0/290044C0

sync_priority | 0

sync_state | async

2. pg_is_in_recovery() : Function which tells whether standby is still in recovery mode or not.

e.g.postgres=# select pg_is_in_recovery();

pg_is_in_recovery

——————-

(1 row)

3. pg_last_xlog_receive_location: Function which tells location of last transaction log which was streamed by Standby and also written on standby disk.

e.g.postgres=# select pg_last_xlog_receive_location();

pg_last_xlog_receive_location

——————————-

0/29004560

(1 row)

4. pg_last_xlog_replay_location: Function which tells last transaction replayed during recovery process. e.g is given below:

postgres=# select pg_last_xlog_replay_location();

pg_last_xlog_replay_location

——————————

0/29004560

(1 row)

5. pg_last_xact_replay_timestamp: This function tells about the time stamp of last transaction which was replayed during recovery. Below is an example:

postgres=# select pg_last_xact_replay_timestamp();

pg_last_xact_replay_timestamp

———————————-

15-MAY-14 20:54:27.635591 -04:00

(1 row)

Above are some important functions/views, which are already available in PostgreSQL for monitoring the streaming replication.

So, the logical next question is, “What’s the right way to monitor the Hot Standby with Streaming Replication on Standby Server?”

If you have Hot Standby with Streaming Replication, the following are the points you should monitor:

1. Check if your Hot Standby is in recovery mode or not:

For this you can use pg_is_in_recovery() function.

2.Check whether Streaming Replication is working or not.

And easy way of doing this is checking the pg_stat_replication view on Master/Primary. This view gives information only on master if Streaming Replication is working.

3. Check If Streaming Replication is not working and Hot standby is recovering from archived WAL file.

For this, either the DBA can use the PostgreSQL Log file to monitor it or utilize the following functions provided in PostgreSQL 9.3:

pg_last_xlog_replay_location();

pg_last_xact_replay_timestamp();

4. Check how far off is the Standby from Master.

There are two ways to monitor lag for Standby.

i. Lags in Bytes: For calculating lags in bytes, users can use the pg_stat_replication view on the master with the function pg_xlog_location_diff function. Below is an example:

pg_xlog_location_diff(pg_stat_replication.sent_location, pg_stat_replication.replay_location)

which gives the lag in bytes.

ii. Calculating lags in Seconds. The following is SQL, which most people uses to find the lag in seconds:

SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location()

THEN 0

ELSE EXTRACT (EPOCH FROM now() – pg_last_xact_replay_timestamp())

END AS log_delay;

Including the above into your repertoire can give you good monitoring for PostgreSQL.

I will in a future post include the script that can be used for monitoring the Hot Standby with PostgreSQL streaming replication

Replication Synchronous_Commit Options Available In Postgrsql

November 26, 2016

By postgresdba

Options available

Allowed values up to Postgres version 9.5 were – „on“, „remote_write“, „local“ and „off“ and now with 9.6 „remote_apply“ will join the party. Short descriptions of all of these in plain-text could be something like that of below. NB! Latter 3 values are effective only in synchronous streaming replication mode, and fall back to „on“ value when no replicas have been listed in the „synchronous_standby_names“ parameter.

* on – Transaction commit always waits until the data is really flushed to the transaction log (aka WAL or XLOG) making sure the transaction is really persisted. In synchronous streaming replication mode also the replica needs to do the same.

* off – Commit can be acknowledged to the calling client before the transaction is actually flushed to the transaction log, making it possible to lose some recent (<1s by default) allegedly-committed transactions in case of a server crash. Synchronous replication is ignored. This is one of the most known Postgres performance tweaks.

* local – Forces the „on“ behavior (guaranteed data flush) only on the master node. Usually used as user set session parameter for speeding up non-critical data imports on master for example.

* remote_write – Provides a weaker guarantee than „on“, transaction counts as commited if master does a guaranteed flush and the replica just gets a write confirmation from the operating system (prone to replica corruption if replica machine crashes before hitting storage).

* remote_apply – Provides the strongest replica consistency – commit doesn’t return before replica flushes and also applies the data. Clients could either query the master or the slave, they would have exactly the same view on the data.

Clearing Redo logfiles in Oracle

November 25, 2016

By postgresdba

A redo log file might become corrupted while the database is open, and ultimately stop database activity because archiving cannot continue. In this situation the ALTER DATABASE CLEAR LOGFILE statement can be used reinitialize the file without shutting down the database.

The following statement clears the log files in redo log group number 3:
ALTER DATABASE CLEAR LOGFILE GROUP 3;

This statement overcomes two situations where dropping redo logs is not possible:
If there are only two log groups
The corrupt redo log file belongs to the current group
If the corrupt redo log file has not been archived, use the UNARCHIVED keyword in the statement.
ALTER DATABASE CLEAR UNARCHIVED LOGFILE GROUP 3;

This statement clears the corrupted redo logs and avoids archiving them. The cleared redo logs are available for use even though they were not archived.
If you clear a log file that is needed for recovery of a backup, then you can no longer recover from that backup. The database writes a messa

Renaming or Relocating Logfiles in oracle

November 25, 2016

By postgresdba

suppose you want to move a logfile from ‘/u01/oracle/ica/log1.ora’ to ‘/u02/oracle/ica/log1.ora’, then do the following

1. Shutdown the database
SQL>shutdown immediate;

2. Move the logfile from Old location to new location using operating system command
$mv /u01/oracle/ica/log1.ora /u02/oracle/ica/log1.ora

3. Start and mount the database
SQL>startup mount

4. Now give the following command to change the location in controlfile
SQL>alter database rename file ‘/u01/oracle/ica/log1.ora’ to ‘/u02/oracle/ica/log2.ora’;

5. Open the database
SQL>alter database open;

Redo logfile Modification/Management in Oracle(add,delete,drop and resize redo logfile)

November 25, 2016

By postgresdba

Adding a New Redo Logfile Group
SQL>alter database add logfile group 3 ‘/u01/oracle/ica/log3.ora’ size 10M;

Adding Members to an existing group
SQL>alter database add logfile member ‘/u01/oracle/ica/log11.ora’ to group 1;

Dropping Members from a group
You can drop member from a log group only if the group is having more than one member and if it is not the current group. If you want to drop members from the current group, force a log switch or wait so that log switch occurs and another group becomes current. To force a log switch give the following command

SQL>alter system switch logfile;

the following command can be used to drop a logfile member
SQL>alter database drop logfile member ‘/u01/oracle/ica/log11.ora’;

Dropping Logfile Group
Similarly, you can also drop logfile group only if the database is having more than two groups and if it is not the current group.
SQL>alter database drop logfile group 3;

Resizing Logfiles
You cannot resize logfiles. If you want to resize a logfile create a new logfile group with the new size and subsequently drop the old logfile group.

moving and renaming datafiles –oracle

November 25, 2016

By postgresdba

Moving datafiles of a database: The datafiles reside under /home/oracle/OraHome1/databases/ora9 and have go to /home/oracle/databases/ora9.

SQL> select tablespace_name, substr(file_name,1,70) from dba_data_files;
TABLESPACE_NAME SUBSTR(FILE_NAME,1,70)
—————————— ———————————————————————-
SYSTEM /home/oracle/OraHome1/databases/ora9/system.dbf
UNDO /home/oracle/OraHome1/databases/ora9/undo.dbf
DATA /home/oracle/OraHome1/databases/ora9/data.dbf

SQL> select member from v$logfile;
MEMBER
——————————————————————————–
/home/oracle/OraHome1/databases/ora9/redo1.ora
/home/oracle/OraHome1/databases/ora9/redo2.ora
/home/oracle/OraHome1/databases/ora9/redo3.ora

SQL> select name from v$controlfile;
NAME
——————————————————————————–
/home/oracle/OraHome1/databases/ora9/ctl_1.ora
/home/oracle/OraHome1/databases/ora9/ctl_2.ora
/home/oracle/OraHome1/databases/ora9/ctl_3.ora

Now, as the files to be moved are known, the database can be shut down:
SQL> shutdown

The files can be copied to their destination:
$ cp /home/oracle/OraHome1/databases/ora9/system.dbf /home/oracle/databases/ora9/system.dbf
$ cp /home/oracle/OraHome1/databases/ora9/undo.dbf /home/oracle/databases/ora9/undo.dbf
$ cp /home/oracle/OraHome1/databases/ora9/data.dbf /home/oracle/databases/ora9/data.dbf
$
$ cp /home/oracle/OraHome1/databases/ora9/redo1.ora /home/oracle/databases/ora9/redo1.ora
$ cp /home/oracle/OraHome1/databases/ora9/redo2.ora /home/oracle/databases/ora9/redo2.ora
$ cp /home/oracle/OraHome1/databases/ora9/redo3.ora /home/oracle/databases/ora9/redo3.ora
$
$ cp /home/oracle/OraHome1/databases/ora9/ctl_1.ora /home/oracle/databases/ora9/ctl_1.ora
$ cp /home/oracle/OraHome1/databases/ora9/ctl_2.ora /home/oracle/databases/ora9/ctl_2.ora
$ cp /home/oracle/OraHome1/databases/ora9/ctl_3.ora /home/oracle/databases/ora9/ctl_3.ora

The init.ora file is also copied because it references the control files. I name the copied file just init.ora because it is not in a standard place anymore and it will have to be named explicitely anyway when the database is started up.
$ cp /home/oracle/OraHome1/dbs/initORA9.ora /home/oracle/databases/ora9/init.ora

The new location for the control files must be written into the (copied) init.ora file:
/home/oracle/databases/ora9/init.ora
control_files = (/home/oracle/databases/ora9/ctl_1.ora,
/home/oracle/databases/ora9/ctl_2.ora,
/home/oracle/databases/ora9/ctl_3.ora)

$ sqlplus “/ as sysdba”
SQL> startup exclusive mount pfile=/home/oracle/databases/ora9/init.ora
SQL> alter database rename file ‘/home/oracle/OraHome1/databases/ora9/system.dbf’ to ‘/home/oracle/databases/ora9/system.dbf’;
SQL> alter database rename file ‘/home/oracle/OraHome1/databases/ora9/undo.dbf’ to ‘/home/oracle/databases/ora9/undo.dbf’;
SQL> alter database rename file ‘/home/oracle/OraHome1/databases/ora9/data.dbf’ to ‘/home/oracle/databases/ora9/data.dbf’;

SQL> alter database rename file ‘/home/oracle/OraHome1/databases/ora9/redo1.ora’ to ‘/home/oracle/databases/ora9/redo1.ora’;
SQL> alter database rename file ‘/home/oracle/OraHome1/databases/ora9/redo2.ora’ to ‘/home/oracle/databases/ora9/redo2.ora’;
SQL> alter database rename file ‘/home/oracle/OraHome1/databases/ora9/redo3.ora’ to ‘/home/oracle/databases/ora9/redo3.ora’;

SQL> shutdown

SQL> startup pfile=/home/oracle/databases/ora9/init.ora