Postgresql Hot Standby Installation And Test The Monitor Of Standby

1- Keygen Generate And Copy
[postgres@pg01 ~] ssh-keygen -t rsa
[postgres@pg02 ~] ssh-keygen -t rsa
[postgres@pg01 ~] ssh-copy-id -i .ssh/id_rsa.pub postgres@pg02
[postgres@pg02 ~] ssh-copy-id -i .ssh/id_rsa.pub postgres@pg01
2- Create Streaming Replication User on Primary DB
psql# createuser -U postgres -p 5432 repuser -P -c 10 –replication
3- Primary DB pg_hba.conf configuration
[root@pg01 ~] vi /pgdata/data/postgresql.conf
Add below line
host replication repuser 192.168.10.12/32 md5
[root@pg01 ~] systemctl reload postgresql.service
4- Create Archive Directory on Hot Standby Server
[root@pg02 ~] mkdir -p /pgdata/ARCHIVELOG
5- Postgresql.conf configuration on Primary DB
[root@pg01 ~] vi /pgdata/data/postgresql.conf
Parameters change like below
listen_addresses = ‘192.168.10.10’
wal_level = hot_standby # minimal, archive, hot_standby, or logical
archive_mode = on # enables archiving; off, on, or always
archive_command = ‘rsync -a %p [email protected]:/pgdata/ARCHIVELOG/%f’ # command to use to archive a logfile segment
max_wal_senders = 3 # max number of walsender processes
wal_keep_segments = 1000 # in logfile segments, 16MB each; 0 disables
max_replication_slots = 4 # max number of replication slots(change requires restart)
[root@pg01 ~] systemctl restart postgresql.service
6- Base Backup
You can use pg_basebackup when primary db is start or stop. Stop status is more confident.
[root@pg01 ~] systemctl stop postgresql.service
/pgdata/data directory has to be empty on hot standby server. pg_basebackup command will fill up that directory.
[root@pg02 ~] /usr/pgsql-9.5/bin/pg_basebackup -h 192.168.10.10 -D /pgdata/data -U repuser -p 5432 -v -P –xlog-method=stream
7- Postgresql.conf configuration on Hot Standby Server
Primary DB configuration has to be disabled with # mark.
#wal_level = hot_standby # minimal, archive, hot_standby, or logical
#archive_mode = on # enables archiving; off, on, or always
#archive_command = ‘rsync -a %p [email protected]:/pgdata/ARCHIVELOG/%f’ # command to use to archive a logfile segment
#max_wal_senders = 3 # max number of walsender processes
#wal_keep_segments = 1000 # in logfile segments, 16MB each; 0 disables
#max_replication_slots = 4
listen_addresses = ‘*’
hot_standby = on
8- Create Recovery.conf file on Hot Standby Server
[root@pg02 ~] vi /pgdata/data/recovery.conf
restore_command = ‘cp /pgdata/ARCHIVELOG/%f %p’ # e.g. ‘cp /mnt/server/archivedir/%f %p’
archive_cleanup_command = ‘/usr/pgsql-9.5/bin/pg_archivecleanup /pgdata/ARCHIVELOG/%r’
standby_mode = on
primary_conninfo = ‘host=192.168.10.10 port=5432 user=repuser password=repuserpass’
trigger_file = ‘/pgdata/data/failover.uygula’
If you want to delay to apply committed values. You can use recovery_min_apply_delay parameter in recovery.conf like below.
recovery_min_apply_delay = 5min
This parameter provide 5 minutes delay. When you commit a transaction at primary side, hot standby will apply this transaction 5 minutes later.
9- Start Hot Standby PostgreSQL Service
[root@pg02 ~] systemctl start postgresql.service
10- Test
Primary DB
Replications list
psql# select * from pg_stat_replication ;
Sender process check
ps -ef | grep sender
postgres 18388 18298 0 17:04 ? 00:00:00 postgres: wal sender process repuser 192.168.10.12(33700) streaming 0/9000348
psql# create table test (name text);
psql# insert into test values(‘nijam’);
Hot Standby DB
Recovery mode check
psql# select pg_is_in_recovery();
Receiver process check
ps -ef | grep receiver
postgres 20936 20919 0 17:04 ? 00:00:00 postgres: wal receiver process streaming 0/9000268
psql# select * from test;
name
———–
nijam

Determining replication monitoring in postgreSql

Checking Log Position
replica1=# select pg_last_xlog_receive_location();
       pg_last_xlog_receive_location
      ——————————-
       41A/10808DE8
replica2=# select pg_last_xlog_receive_location();
       pg_last_xlog_receive_location
      ——————————-
       41A/FFD1560
Timestamp for the replica replay:
replica1=# select pg_last_xact_replay_timestamp();
pg_last_xact_replay_timestamp
——————————-
2012-10-05 10:35:47.527-07
compare two xlog locations to see which one is higher:
replica2=# select pg_xlog_location_diff(‘1/10808DE8′,’1/FFD1560’);
pg_xlog_location_diff
———————–
              8616072
The way you would use them is as follows:
1.The master goes down.
2.Check xlog_location_numeric() on each backend.
3.Pick the backend with the highest (or tied for highest) numeric position, and check how far behind it is in replay using replay_lag_mb(), but see below.
4.If the highest replica isn’t too far behind on replay, promote it.
5.If the highest replica is too far behind, drop to the next-highest and check replay lag.
easily done using pg_xlog_location_diff:
replica2=# SELECT pg_xlog_location_diff(pg_xlog_last_receive_location(), pg_xlog_last_replay_location());
pg_xlog_location_diff
—————————-
      16192
replay lag in megabytes, regardless of PostgreSQL version.
bench=# select replay_lag_mb();
replay_lag_mb
—————
         93.7
all_replayed() and replay_lag_mb() are designed to be run only on replicas.  They will return NULL on standalone or master servers.
these functions will return the same results regardless of which database they’re installed in.  However, they can only be called from the database in which they are installed.   So you might want to install them in the “postgres” scratch database.
xlog_location_numeric() returns a numeric value which can exceed a 64-bit integer in size.  So make sure your calling and sorting code is prepared to handle a larger-than-eight-byte number.

Time-Based Replication Monitoring In The Hot_Standby_Delay

This was something that had been a long-standing item on my personal TODO list, and happened to scratch the itch of a couple of clients at the time.
Previously it would only take an integer representing how many bytes of WAL data the master could be ahead of a replica before the threshold is crossed:
check_hot_standby_delay –dbhost=master,replica1 –critical=16777594
This is certainly useful for, say, keeping an eye on whether you’re getting close to running over your wal_keep_segments value. Of course it can also be used to indicate whether the replica is still processing WAL, or has become stuck for some reason. But for the (arguably more common) problem of determining whether a replica is falling too far behind determining what byte thresholds to use, beyond simply guessing, isn’t easy to figure out.
Postgres 9.1 introduced a handy function to help solve this problem: pg_last_xact_replay_timestamp(). It measures a slightly different thing than the pg_last_xlog_* functions the action previously used. And it’s for that reason that the action now has a more complex format for its thresholds:
check_hot_standby_delay –dbhost=master,replica1 –critical=”16777594 and 5 min”
For backward compatibility, of course, it’ll still take an integer and work the same as it did before. Or alternatively if you only want to watch the chronological lag, you could even give it just a time interval, ‘5 min’, and the threshold only takes the transaction replay timestamp into account. But if you specify both, as above, then both conditions must be met before the threshold activates.

Monitoring Streaming Replication with script also in postgreSql

One of the easiest ways to monitor slave lag when using streaming replication is to turn hot standby on your slave and use pg_last_xact_replay_timestamp() and/or the other recovery information functions. Here’s an example query to run on the slave systems to get the number of seconds behind it is:
SELECT extract(epoch from now() – pg_last_xact_replay_timestamp()) AS slave_lag
The issue with this query is that while your slave(s) may be 100% caught up, the time interval being returned is always increasing until new write activity occurs on the master that the slave can replay. This can cause your monitoring to give false positives that your slave is falling behind if you have things set up to ensure your slaves are no more than a few minutes behind. A side affect of this monitoring query can also give you an indication that writes to your master have stopped for some reason.
One of our clients has a smaller sized database that doesn’t get quite as much write traffic as our typical clients do. But it still has failover slaves and still needs to be monitored just like our other larger clients to ensure it doesn’t fall too far behind. So, my coworker introduced me to the pg_stat_replication view that was added in PostgreSQL 9.1. Querying this from the master returns information about streaming replication slaves connected to it.
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]—-+——————————
pid              | 16649
usesysid         | 16388
usename          | replication
application_name | walreceiver
client_addr      | xxx.xxx.xxx.xxx
client_hostname  | db1-prod-ca
client_port      | 58085
backend_start    | 2013-10-29 19:57:51.48142+00
state            | streaming
sent_location    | 147/11000000
write_location   | 147/11000000
flush_location   | 147/11000000
replay_location  | 147/11000000
sync_priority    | 0
sync_state       | async
-[ RECORD 2 ]—-+——————————
pid              | 7999
usesysid         | 16388
usename          | replication
application_name | walreceiver
client_addr      | yyy.yyy.yyy.yyy
client_hostname  | db2-prod
client_port      | 54932
backend_start    | 2013-10-29 15:32:47.256794+00
state            | streaming
sent_location    | 147/11000000
write_location   | 147/11000000
flush_location   | 147/11000000
replay_location  | 147/11000000
sync_priority    | 0
sync_state       | async
He also provided a handy query to get back a simple, easy to understand numeric value to indicate slave lag. The issue I ran into using the query is that this view uses pg_stat_activity as one of its sources. If you’re not a superuser, you’re not going to get any statistics on sessions that aren’t your own (and hopefully you’re not using a superuser role as the role for your monitoring solution). So, instead I made a function with SECURITY DEFINER set, made a superuser role the owner, and gave my monitoring role EXECUTE privileges on the function.
CREATE OR REPLACE FUNCTION streaming_slave_check() RETURNS TABLE (client_hostname text, client_addr inet, byte_lag float)
LANGUAGE SQL SECURITY DEFINER
AS $$
  SELECT
      client_hostname,
      client_addr,
      sent_offset – (replay_offset – (sent_xlog – replay_xlog) * 255 * 16 ^ 6 ) AS byte_lag
  FROM (
      SELECT
          client_hostname,
          client_addr,
          (‘x’ || lpad(split_part(sent_location::text,   ‘/’, 1), 8, ‘0’))::bit(32)::bigint AS sent_xlog,
          (‘x’ || lpad(split_part(replay_location::text, ‘/’, 1), 8, ‘0’))::bit(32)::bigint AS replay_xlog,
          (‘x’ || lpad(split_part(sent_location::text,   ‘/’, 2), 8, ‘0’))::bit(32)::bigint AS sent_offset,
          (‘x’ || lpad(split_part(replay_location::text, ‘/’, 2), 8, ‘0’))::bit(32)::bigint AS replay_offset
      FROM pg_stat_replication
  ) AS s;
$$;
Running this query gives back a few handy columns that should be good enough for most monitoring tools. You can easily add more columns from pg_stat_replication or any other tables you need to join against for more info.
postgres=# select * from streaming_slave_check();
client_hostname |   client_addr   | byte_lag
—————–+—————–+———-
db1-prod-ca     | xxx.xxx.xxx.xxx |      160
db2-prod        | yyy.yyy.yyy.yyy |      160
UPDATE: If you’re running PostgreSQL 9.2+, there is a new, built-in function that avoids needing the above function all together and can just query pg_stat_replication directly.
postgres=# SELECT client_hostname
  , client_addr
  , pg_xlog_location_diff(pg_stat_replication.sent_location, pg_stat_replication.replay_location) AS byte_lag
FROM pg_stat_replication;
client_hostname |  client_addr  | byte_lag
—————–+—————+———-
db1-prod-ca     | xxx.xxx.xx.xx |        0
db2-prod        | yy.yyy.yyy.yy |        0
Unfortunately, this function still requires superuser privileges to obtain all relevant data and most monitoring tools do not use a superuser role (I hope). So, in that case you do still need a SECURITY DEFINER function, but it can be a much much simpler one
CREATE OR REPLACE FUNCTION streaming_slave_check() RETURNS TABLE (client_hostname text, client_addr inet, byte_lag numeric)
LANGUAGE SQL SECURITY DEFINER
AS $$
  SELECT client_hostname
  , client_addr
  , pg_xlog_location_diff(pg_stat_replication.sent_location, pg_stat_replication.replay_location) AS byte_lag
  FROM pg_stat_replication;
$$;
This can also be useful to monitor slave lag when you don’t have hot standby turned on for your slaves to allow read-only queries.
Combining both of the replication monitoring solutions mentioned in this post should give you a much better overall picture of the status of your master/slave systems.

monitor replication delay -four solution in postgreSql

Looking at the documentation and all the blog posts about how to monitor
replication delay I don’t think there is one good and most importantly safe
solution which works all the time.
Solution 1:
I used to check replication delay/lag by running the following query on the
slave:
SELECT EXTRACT(EPOCH FROM (now() – pg_last_xact_replay_timestamp()))::INT;
This query works great and it is a very good query to give you the lag in
seconds. The problem is if the master is not active, it doesn’t mean a
thing. So you need to first check if two servers are in sync and if they
are, return 0.
Solution 2:
This can be achieved by comparing pg_last_xlog_receive_location()  and
pg_last_xlog_replay_location() on the slave, and if they are the same it
returns 0, otherwise it runs the above query again:
SELECT
CASE
WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location() THEN 0
ELSE EXTRACT (EPOCH FROM now() – pg_last_xact_replay_timestamp())::INTEGER
END
AS replication_lag;
This query is all good, but the problem is that it is not safe. If for some
reason the master stops sending transaction logs, this query will continue
to return 0 and you will think the replication is working, when it is not.
Solution 3:
Master:
SELECT pg_current_xlog_location();
Slave:
SELECT pg_last_xlog_receive_location();
and by comparing these two values you could see if the servers are in sync.
The problem yet again is that if streaming replication fails, both of these
functions will continue to return same values and you could still end up
thinking the replication is working. But also you need to query both the
master and slave to be able to monitor this, which is not that easy on
monitoring systems, and you still don’t have the information about the
actual lag in seconds, so you would still need to run the first query.
Solution 4:
You could query pg_stat_replication on the master, compare sent_location
and replay_location, and if they are the same, the replication is in sync.
One more good thing about pg_stat_replication is that if streaming
replication fails it will return an empty result, so you will know it
failed. But the biggest problem with this system view is that only the postgres
user can read it, so it’s not that monitoring friendly since you don’t want
to give your monitoring system super user privileges, and you still don’t
have the delay in seconds.
I think the best one would be 2 combined
with a check if the wal receiver process is running before running that
query with something like:
$ ps aux | egrep ‘wal\sreceiver’
postgres  3858  0.0  0.0 2100112 3312 ?        Ss   19:35   0:01 postgres:
wal receiver process   streaming 36/900A738
This solution would only be run on the slave and it is pretty easy to setup.

Check for PostgreSQL replication delay/lag -part3

I think the best one would be 2 combined with a check if the wal receiver process is running before running that
query with something like:
$ ps aux | egrep ‘wal\sreceiver’
postgres  3858  0.0  0.0 2100112 3312 ?        Ss   19:35   0:01 postgres:
wal receiver process   streaming 36/900A738
This solution would only be run on the slave and it is pretty easy to setup.

Check For Postgresql Replication Delay/Lag

This can be achieved by comparing pg_last_xlog_receive_location()  and
pg_last_xlog_replay_location() on the slave, and if they are the same it
returns 0, otherwise it runs the above query again:
SELECT
CASE
WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location() THEN 0
ELSE EXTRACT (EPOCH FROM now() – pg_last_xact_replay_timestamp())::INTEGER
END
AS replication_lag;
This query is all good, but the problem is that it is not safe. If for some
reason the master stops sending transaction logs, this query will continue
to return 0 and you will think the replication is working, when it is not.

Check Replication Delay/Lag In Postgressql -Part1

I used to check replication delay/lag by running the following query on the
slave:
SELECT EXTRACT(EPOCH FROM (now() – pg_last_xact_replay_timestamp()))::INT;
This query works great and it is a very good query to give you the lag in
seconds. The problem is if the master is not active, it doesn’t mean a
thing. So you need to first check if two servers are in sync and if they
are, return 0.

Monitoring Approach For Streaming Replication With Hot Standby In Postgresql 9.3

The people using PostgreSQL and the Streaming Replication feature seem to ask many of the same questions:
1. How best to monitor Streaming Replication?
2. What is the best way to do that?
3. Are there alternatives, when monitoring on Standby, to using the pg_stat_replication view on Master?
4. How should I calculate replication lag-time, in seconds, minutes, etc.?
In light of these commonly asked questions, I thought a blog would help. The following are some methods I’ve found to be useful.
Monitoring is critical for large infrastructure deployments where you have Streaming Replication for:
1. Disaster recovery
2. Streaming Replication is for High Availability
3. Load balancing, when using Streaming Replication with Hot Standby
PostgreSQL has some building blocks for replication monitoring, and the following are some important
functions and views which can be use for monitoring the replication:
1. pg_stat_replication view on master/primary server.
This view helps in monitoring the standby on Master. It gives you the following details:
pid:              Process id of walsender process
usesysid:         OID of user which is used for Streaming replication.
usename:          Name of user which is used for Streaming replication
application_name: Application name connected to master
client_addr:      Address of standby/streaming replication
client_hostname:  Hostname of standby.
client_port:      TCP port number on which standby communicating with WAL sender
backend_start:    Start time when SR connected to Master.
state:            Current WAL sender state i.e streaming
sent_location:    Last transaction location sent to standby.
write_location:   Last transaction written on disk at standby
flush_location:   Last transaction flush on disk at standby.
replay_location:  Last transaction flush on disk at standby.
sync_priority:    Priority of standby server being chosen as synchronous standby
sync_state:       Sync State of standby (is it async or synchronous).
e.g.:
postgres=# select * from pg_stat_replication ;
-[ RECORD 1 ]—-+———————————
pid              | 1114
usesysid         | 16384
usename          | repuser
application_name | walreceiver
client_addr      | 172.17.0.3
client_hostname  |
client_port      | 52444
backend_start    | 15-MAY-14 19:54:05.535695 -04:00
state            | streaming
sent_location    | 0/290044C0
write_location   | 0/290044C0
flush_location   | 0/290044C0
replay_location  | 0/290044C0
sync_priority    | 0
sync_state       | async
2. pg_is_in_recovery() : Function which tells whether standby is still in recovery mode or not.
e.g.postgres=# select pg_is_in_recovery();
pg_is_in_recovery
——————-
t
(1 row)
3. pg_last_xlog_receive_location: Function which tells location of last transaction log which was streamed by Standby and also written on standby disk.
e.g.postgres=# select pg_last_xlog_receive_location();
pg_last_xlog_receive_location
——————————-
0/29004560
(1 row)
4. pg_last_xlog_replay_location: Function which tells last transaction replayed during recovery process. e.g is given below:
postgres=# select pg_last_xlog_replay_location();
pg_last_xlog_replay_location
——————————
0/29004560
(1 row)
5. pg_last_xact_replay_timestamp: This function tells about the time stamp of last transaction which was replayed during recovery. Below is an example:
postgres=# select pg_last_xact_replay_timestamp();
 pg_last_xact_replay_timestamp
———————————-
15-MAY-14 20:54:27.635591 -04:00
(1 row)
Above are some important functions/views, which are already available in PostgreSQL for monitoring the streaming replication.
So, the logical next question is, “What’s the right way to monitor the Hot Standby with Streaming Replication on Standby Server?”
If you have Hot Standby with Streaming Replication, the following are the points you should monitor:
1. Check if your Hot Standby is in recovery mode or not:
For this you can use pg_is_in_recovery() function.
2.Check whether Streaming Replication is working or not.
And easy way of doing this is checking the pg_stat_replication view on Master/Primary. This view gives information only on master if Streaming Replication is working.
3. Check If Streaming Replication is not working and Hot standby is recovering from archived WAL file.
For this, either the DBA can use the PostgreSQL Log file to monitor it or utilize the following functions provided in PostgreSQL 9.3:
pg_last_xlog_replay_location();
pg_last_xact_replay_timestamp();
4. Check how far off is the Standby from Master.
There are two ways to monitor lag for Standby.
  i. Lags in Bytes: For calculating lags in bytes, users can use the pg_stat_replication view on the master with the function pg_xlog_location_diff function. Below is an example:
pg_xlog_location_diff(pg_stat_replication.sent_location, pg_stat_replication.replay_location)
which gives the lag in bytes.
ii. Calculating lags in Seconds. The following is SQL, which most people uses to find the lag in seconds:
SELECT CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location()
             THEN 0
           ELSE EXTRACT (EPOCH FROM now() – pg_last_xact_replay_timestamp())
      END AS log_delay;
Including the above into your repertoire can give you good monitoring for PostgreSQL.
I will in a future post include the script that can be used for monitoring the Hot Standby with PostgreSQL streaming replication

find out table size,database size,tablespace size,

Database Object Size Functions:-
The functions shows to calculate the disk space usage of database objects.
pg_column_size(any):- Number of bytes used to store a particular value (possibly compressed)
pg_database_size(oid):- Disk space used by the database with the specified OID
pg_database_size(name):- Disk space used by the database with the specified name
pg_indexes_size(regclass):- Total disk space used by indexes attached to the specified table
pg_relation_size(relation regclass, fork text):- Disk space used by the specified fork (‘main’, ‘fsm’, ‘vm’, or                                                                          ‘init’) of the specified table or index
pg_relation_size(relation regclass):- Shorthand for pg_relation_size(…, ‘main’)
pg_size_pretty(bigint):- Converts a size in bytes expressed as a 64-bit integer into a human-readable format                                        with size units
pg_size_pretty(numeric):- Converts a size in bytes expressed as a numeric value into a human-readable                                                   format with size units
pg_table_size(regclass):- Disk space used by the specified table, excluding indexes (but including TOAST,                                            free space map, and visibility map)
pg_tablespace_size(oid):- Disk space used by the tablespace with the specified OID
pg_tablespace_size(name):- Disk space used by the tablespace with the specified name
pg_total_relation_size(regclass):- Total disk space used by the specified table, including all indexes and                                                            TOAST data

The storage space (in bytes) for one specific table:
ads=# select pg_relation_size(‘size_test’);
pg_relation_size
——————
5668864
(1 row)

Same query, but with the result in a human-readable format:
ads=# select pg_size_pretty(pg_relation_size(‘size_test’));
pg_size_pretty
—————-
5536 kB(1 row)

Remember that this is only the size of the table, not included an index or additional stuff.
The size including any index can be found out with:
ads=# select pg_size_pretty(pg_total_relation_size(‘size_test’));
pg_size_pretty
—————-
7656 kB
(1 row)

The size of a complete database:
ads=# select pg_size_pretty(pg_database_size(‘ads’));
pg_size_pretty
—————-
11 MB
(1 row)

You can even find out, how much space a specific value needs:
ads=# select pg_column_size(5::smallint);
pg_column_size
—————-
2
(1 row)


ads=# select pg_column_size(5::int);
pg_column_size
—————-
4
(1 row)


ads=# select pg_column_size(5::bigint);
pg_column_size
—————-
8
(1 row)


ads=# select pg_column_size(‘This is a string’::varchar);
pg_column_size
—————-
20
(1 row)


ads=# select length(‘This is a string’::varchar);
length
——–
16
(1 row)

Viewing the path of a relation
SELECT pg_relation_filepath(‘tablename’);

size of a postgresql tablespace
postgres=# SELECT pg_size_pretty(pg_tablespace_size(‘my_tablespace’));
 pg_size_pretty
—————-
 118 MB
(1 row)