
Sunday, January 5, 2020

Database migration Oracle to Postgres: Data types


A few tips for database migration from Oracle to Postgres: data types

Oracle vs PostgreSQL (Oracle construct -> Postgres equivalent)

Data Types

  clob        ->  text
  number      ->  numeric
  varchar2    ->  character varying
  sysdate     ->  current_date
  Value()     ->  COALESCE()
XML / CLOB Synthesis

  xmlTypeVariable.getClobVal()
      ->  cast(xmlTypeVariable AS TEXT)
          or
          xmlTypeVariable::TEXT

  xml.extract('/Val1/Val2/@Val3').getnumberval()
      ->  tempNum := (xpath('/Val1/Val2/@Val3/text()', xml))[1];
          cast(tempNum AS NUMERIC);

  xml.extract('/Val1/Val2/@Val3').getstringval()
      ->  (xpath('/Val1/Val2/@Val3/text()', xml))[1]

  xml.extract('/Val1/').getstringval()
      ->  (xpath('/Val1/node()', xml))[1]

  dbms_output.put_line('text' || variablename)
      ->  RAISE NOTICE 'text %', variablename;
          RAISE DEBUG 'text %', variablename;
TYPE Conversion and Functions

  to_char(var1)
      ->  cast(var1 AS TEXT)  or  var1::character varying

  replace('string', 'text')
      ->  replace('string', 'text', '')   (the third argument is mandatory in PostgreSQL)

  DECODE(PT.ISNUMERIC, 0, 'No', 'Yes')
      ->  (CASE WHEN PT.ISNUMERIC = 0 THEN 'No' ELSE 'Yes' END)

  to_number(stuff)
      ->  stuff::numeric

  instr(varchar, varchar)
      ->  "user"."INSTR"(character varying, character varying)
  instr(varchar, varchar, integer)
      ->  "user"."INSTR"(character varying, character varying, integer)
  instr(varchar, varchar, integer, integer)
      ->  "user"."INSTR"(character varying, character varying, integer, integer)
Sequence Related

  SOME_SEQ.nextval
      ->  nextval('"user"."SOME_SEQ"')

  select SOME_SEQ.nextval INTO v_variable from DUAL;
      ->  v_variable := nextval('"user"."SOME_SEQ"');

  sequence_name.currval
      ->  currval('"user".sequence_name')
PL/SQL and Exception Handling

  Exception
    When Others Then
      Dbms_Output.Put_line(Sqlcode || Sqlerrm);
      ->  exception
            when others then
              raise notice 'EXCEPTION: (%)', SQLSTATE;

  Dbms_Output.put_line(Sqlcode || Sqlerrm);
      ->  RAISE NOTICE 'Truncating table test_table: % %', SQLSTATE, SQLERRM;

  IF SQL%ROWCOUNT = 0 THEN
    -- stuff
      ->  IF NOT FOUND THEN
          or
          GET DIAGNOSTICS integer_var = ROW_COUNT;
          IF integer_var = 0 THEN

  MINUS
      ->  EXCEPT

  REGEXP_LIKE(string, pattern)
  e.g. REGEXP_LIKE(SUBSTR(ag.agreementname, 1, 1), '[0123456789ABCDEF]')
      ->  string ~ pattern   (POSIX regex match; plain LIKE does not support character classes)
          e.g. SUBSTR(ag.agreementname, 1, 1) ~ '[STUVWXYZ]'
NULL Processing

  nvl(stuff, 0)
      ->  COALESCE(stuff, '0')::numeric

  nvl(val1, val2)
      ->  coalesce(val1, val2)

  nvl2(PEM.CONTRACT_FROM, 'DIRECT', 'CONTRACT')
      ->  CASE WHEN PEM.CONTRACT_FROM IS NOT NULL THEN 'DIRECT' ELSE 'CONTRACT' END

How to check for null:

  IF var1 IS NULL THEN
    -- stuff
  END IF;
Working with ROWNUM

  where rownum <= 25 order by [column_name];
      ->  order by [column_name] limit 25;

  rownum = 5
      ->  order by [column_name] limit 1 offset 5

  SELECT var1 FROM table1
  WHERE thing1 = thing2 AND thing3 = thing4 AND rownum = 1
      ->  SELECT var1 FROM table1
          WHERE thing1 = thing2 AND thing3 = thing4
          ORDER BY var1 LIMIT 1
Working with DATEs

  TRUNC(SYSDATE)
      ->  DATE_TRUNC('day', CURRENT_DATE)

  LAST_DAY(date)
      ->  "user"."LAST_DAY"(date)

  ADD_MONTHS(date, months)
      ->  "user"."ADD_MONTHS"(date, months)

  months_between(d1, d2)
      ->  date_part('year', age(d1, d2)) * 12 + date_part('month', age(d1, d2))
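
LAST_DAY and ADD_MONTHS likewise do not exist in PostgreSQL; the "user" qualifier above means user-defined replacements. Minimal sketches (note that Oracle's ADD_MONTHS also snaps to the last day of the month when the input is a month-end date, which this simple version does not reproduce):

CREATE OR REPLACE FUNCTION last_day(d date)
RETURNS date AS $$
    -- first day of the month, plus one month, minus one day
    SELECT (date_trunc('month', d) + interval '1 month - 1 day')::date;
$$ LANGUAGE sql IMMUTABLE;

CREATE OR REPLACE FUNCTION add_months(d date, n integer)
RETURNS date AS $$
    SELECT (d + make_interval(months => n))::date;
$$ LANGUAGE sql IMMUTABLE;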
JOINS

  select a.field1, b.field2
  from a, b
  where a.item_id = b.item_id(+)
      ->  select a.field1, b.field2 from a, b
          where a.item_id = b.item_id
          UNION
          select a.field1, NULL as field2 from a
          where 0 = (select count(*) from b where b.item_id = a.item_id)
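
The UNION rewrite above mirrors the Oracle (+) semantics literally, but both databases also accept the ANSI outer-join syntax, which is simpler and usually optimizes better:

  select a.field1, b.field2
  from a
  left join b on a.item_id = b.item_id;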

Friday, January 3, 2020

Replication in PostgreSQL

Knowledge of high availability is a must for anybody managing PostgreSQL. It is a topic that we have seen over and over, but one that never gets old.
When talking about replication, we will be talking a lot about WALs, so let's review briefly what those are.

Write Ahead Log (WAL):

Write-ahead logging is a standard method for ensuring data integrity, and it is enabled by default.

The WALs are the REDO logs in PostgreSQL. But, what are the REDO logs?

REDO logs contain all changes that were made in the database and they are used by replication, recovery, online backup and point in time recovery (PITR). Any changes that have not been applied to the data pages can be redone from the REDO logs.

Using WAL results in a significantly reduced number of disk writes, because only the log file needs to be flushed to disk to guarantee that a transaction is committed, rather than every data file changed by the transaction.

A WAL record will specify, bit by bit, the changes made to the data. Each WAL record will be appended into a WAL file. The insert position is a Log Sequence Number (LSN) that is a byte offset into the logs, increasing with each new record.

The WALs are stored in the pg_xlog directory (pg_wal from PostgreSQL 10) under the data directory. These files have a default size of 16MB (the size can be changed with the --with-wal-segsize configure option when building the server). Each has a unique incremental name built from three 8-hex-digit parts, timeline / log / segment, in the format "00000001 00000000 00000000" (concatenated into one 24-character file name).

The number of WAL files contained in pg_xlog (or pg_wal) will depend on the value assigned to the parameter checkpoint_segments (or min_wal_size and max_wal_size, depending on the version) in the postgresql.conf configuration file.

One parameter that we need to set up in every PostgreSQL installation is wal_level. It determines how much information is written to the WAL. The default value used to be minimal, which writes only the information needed to recover from a crash or immediate shutdown; archive added the logging required for WAL archiving, and hot_standby further added the information required to run read-only queries on a standby server. Since 9.6 those two are merged into the replica level (the default from version 10 on); finally, logical adds the information necessary to support logical decoding. This parameter requires a restart, so it can be hard to change on a running production database if we have forgotten to set it.
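
For example, checking and changing it (the change still requires a restart to take effect):

postgres=# SHOW wal_level;
postgres=# ALTER SYSTEM SET wal_level = 'replica';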

History of Replication in PostgreSQL:

The first replication method (warm standby) that PostgreSQL implemented (version 8.2, back in 2006) was based on the log shipping method.

This means that the WAL records are directly moved from one database server to another to be applied. We can say that is a continuous PITR.

PostgreSQL implements file-based log shipping by transferring WAL records one file (WAL segment) at a time.

This replication implementation has the downside that, if there is a major failure on the primary server, transactions not yet shipped will be lost. So there is a window for data loss (you can tune this using the archive_timeout parameter, which can be set to as low as a few seconds, but such a low setting will substantially increase the bandwidth required for file shipping).

Replication types:
1. Warm standby replication (up to 8.4)
2. Streaming replication (from 9.0 onwards)
3. Cascading replication
4. Logical replication

Sync modes:
- async (default)
- sync

Pictorial representation of replication in PostgreSQL:

[diagram omitted]

1) Warm standby replication:

  • A warm standby is simply a replica that is not open for read-only SQL statements. It is active-passive: you can access the master server, but not the slave server.
  • If the master server crashes or shuts down completely, we can promote the slave to master; but afterwards, adding the old master back as a slave, or failing back, does not work.
  • The only option is to rebuild the replication freshly.
  • This is suitable for disaster-recovery purposes.

In older versions this used wal_level = archive (WAL files are backed up from the master and restored on the slave). Since it works one file at a time it is a lagging process, so most of the time we will see lag between master and slave.

2) Streaming Replication:

  • Streaming replication is the improved technique that sends modified data without waiting for a WAL file to be closed by the primary.
  • Records are sent as soon as they are written to the WAL file, so the lag between primary and secondary nodes is reduced.
  • Here we set wal_level = replica.
  • Complete, continuous streaming happens, so there is very little chance of lag.
  • Both servers are active, but one is in read-only mode: it can run SELECT queries only.
  • Since the slave always allows SELECT queries, we can at least split the read and write queries across the 2 servers.
  • If the master goes down we can simply promote the slave to master (see the sketch after this list), and failing back is also simple.
  • This is for high availability: at any point in time the DB should be available to users.
  • Streaming replication is asynchronous by default.
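
Promoting the standby is a single command; a minimal sketch, assuming the standby's data directory is /var/lib/pgsql/data:

$ pg_ctl -D /var/lib/pgsql/data promote
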
Multi-slave replication:

master --> slave1, slave2   -- here both slaves depend directly upon the master.

3) cascading replication:

A(M)  --> B(S)
B(M)  --> C(S)

We call this concept cascading replication: A is the master for B, so B is in read-only mode; B is the master for C, so C is also in read-only mode. There is no direct extra load on the master, and we can now split the read queries across the slave servers.

4) Logical replication:

Used for replicating a set of tables/schemas, or a complete database, from one server to another; there is no master-slave concept here. We won't discuss this replication in detail; I'll cover logical replication in a separate post.

From here on, we can discuss how to set up replication.

For Replication setup:

wal_level = replica (covers streaming replication, sync or async, plus cascading and multi-slave) or logical (for logical replication) -- master side
recovery.conf (slave side) -- connection information of the master server

Assume:
192.168.10.11   -master (example01)
192.168.10.12   -slave (example02)

Enable SSH connection:
Set up passwordless SSH connectivity between the master and slave servers.

Configuration- Master server

1) Verify port and services in master db
2) For the initial sync we need to run a base backup; a minimal pg_basebackup sketch is shown below.
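A minimal sketch, assuming the master is reachable at 192.168.10.11 and the slave's data directory is empty:

$ pg_basebackup -h 192.168.10.11 -U postgres -D /var/lib/pgsql/data -X stream -P -R

(-R writes the standby connection settings for you: into recovery.conf on PostgreSQL 11 and older, into postgresql.auto.conf plus a standby.signal file from version 12.)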
3) Enable pg_hba.conf entries on both sides:
host    replication     postgres        192.168.10.11/32       trust
host    replication     postgres        192.168.10.12/32       trust
or allow all IPs
host    replication     postgres        0.0.0.0/0        trust

4) configure master server:
listen_addresses = '*' # any IP can connect
wal_level = replica
NOTE:
 minimal: only the WAL files required for crash recovery are kept.
 replica: keeps enough WAL files in the WAL directory for a replication setup.
 logical: used as part of logical replication.
5) Set below parameters for wal files:
max_wal_senders = 10
wal_keep_segments = 1000 # (1000 WAL files kept available in the pg_wal directory; at 16MB each you can also think of this as a size, e.g. 2GB/3GB/4GB)
After changing these parameters the service needs to be restarted.
6) Turn archive mode on:
         archive_mode = on
         archive_command = 'cp %p /var/lib/pgsql/archive/%f'

7) Enable the parameters below for failover and fail-back:
wal_log_hints = on # (tracks hint-bit changes; pg_rewind needs this for fail-back)
recovery_min_apply_delay = 0 # (delay between master and slave; if you want your slave to apply WAL with a one-hour delay, set this to '1h')

Configuration - Slave server

1) Restore the backup taken with pg_basebackup, and use a different port.
2) Configure the master server's connection information in recovery.conf:
       standby_mode = 'on'
       primary_conninfo = 'user=postgres host=192.168.10.11 port=5432'
   and enable read-only queries in postgresql.conf:
       hot_standby = on
3) Give permissions to the data directory:
chmod -R 700 <data_directory>
4) Start the slave server:
pg_ctl start

Note: the slave server runs in recovery mode only.

5) For big environments, you should always fetch archived WAL files too, with a restore_command in recovery.conf like the example below:

  restore_command='scp postgres@192.168.10.11:/var/lib/pgsql/archive/%f %p'

Check the replication status

from master:

[postgres@example02 11]$ ps -ef|grep sender
postgres   2576   2529  0 19:40 ?        00:00:00 postgres: walsender postgres 192.168.10.12(16864) streaming 0/5000060

postgres=# \x
Expanded display is on.
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 2576
usesysid         | 10
usename          | postgres
application_name | walreceiver
client_addr      | 192.168.10.12
client_hostname  |
client_port      | 16864
backend_start    | 2019-09-06 19:40:49.430354-07
backend_xmin     |
state            | streaming
sent_lsn         | 0/5000140
write_lsn        | 0/5000140
flush_lsn        | 0/5000140
replay_lsn       | 0/5000140
write_lag        |
flush_lag        |
replay_lag       |
sync_priority    | 0
sync_state       | async

From slave server:

[postgres@example01 11]$ ps -ef|grep receiver
postgres  13546  13541  0 19:40 ?        00:00:00 postgres: walreceiver   streaming 0/5000140

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t

Note:
It should always be (t) true; only then is it a slave server. Likewise, seeing a recovery.conf file in a server's data directory tells us that server is a slave.

Verify lag:

select now()-pg_last_xact_replay_timestamp() as replication_lag;

SELECT
  pg_last_wal_receive_lsn() receive,
  pg_last_wal_replay_lsn() replay,
  (
   extract(epoch FROM now()) -
   extract(epoch FROM pg_last_xact_replay_timestamp())
  )::int lag;

If there is lag, it is displayed in seconds. Lag for any length of time is OK as long as the WAL/archive files still exist; the slave server then slowly applies the logs one by one.

It is recommended, for big application environments, to keep a 'restore_command' with the archive location in the recovery.conf file.

This takes effect when a required WAL file no longer exists in the master's pg_wal: the WAL receiver then simply goes to the archive location.
e.g.,
   restore_command='scp postgres@192.168.10.11:/var/lib/pgsql/archive/%f %p'

==================================
To Make it as sync replication
==================================

In the master server's postgresql.conf file, make this entry:

synchronous_standby_names = '*'

[postgres@example02 11]$ pg_ctl reload
server signaled
2019-09-06 19:48:44.581 PDT [2529] LOG:  received SIGHUP, reloading configuration files
2019-09-06 19:48:44.582 PDT [2529] LOG:  parameter "synchronous_standby_names" changed to "*"
[postgres@example02 11]$
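
The value '*' makes any connected standby a candidate for synchronous commit. To pin a specific standby, name it instead; the name is whatever application_name the standby sets in its primary_conninfo (standby1 below is a hypothetical name):

synchronous_standby_names = 'FIRST 1 (standby1)'   # master: postgresql.conf
primary_conninfo = 'user=postgres host=192.168.10.11 port=5432 application_name=standby1'   # slave: recovery.conf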


[postgres@example02 11]$ psql
psql (11.0)
Type "help" for help.

postgres=# \x
Expanded display is on.
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 2576
usesysid         | 10
usename          | postgres
application_name | walreceiver
client_addr      | 192.168.10.12
client_hostname  |
client_port      | 16864
backend_start    | 2019-09-06 19:40:49.430354-07
backend_xmin     |
state            | streaming
sent_lsn         | 0/5000140
write_lsn        | 0/5000140
flush_lsn        | 0/5000140
replay_lsn       | 0/5000140
write_lag        |
flush_lag        |
replay_lag       |
sync_priority    | 1
sync_state       | sync

If the slave server's services are stopped, the primary is affected: commits hang waiting for the synchronous standby.

Note:
If your setup has only 2 nodes, don't keep the slave in 'sync' mode.

Test sync functionality:

Stop the slave server and create a sample database on the master. With the slave in stopped mode the transaction will not commit automatically; we have to cancel the request, and only then is it committed (locally) on the master. Once the slave is started again, it automatically comes back into sync.

postgres=# create database test1db;
2020-01-03 17:57:00.479 PST [3443] LOG:  standby "walreceiver" is now a synchronous standby with priority 1
CREATE DATABASE
postgres=# create database test2db;
^CCancel request sent
2020-01-03 17:57:38.372 PST [3439] WARNING:  canceling wait for synchronous replication due to user request
2020-01-03 17:57:38.372 PST [3439] DETAIL:  The transaction has already committed locally, but might not have been replicated to the standby.
WARNING:  canceling wait for synchronous replication due to user request
DETAIL:  The transaction has already committed locally, but might not have been replicated to the standby.
CREATE DATABASE

==================
multi slave
===================
With one master and multiple slaves, the load lands directly on the master, and there are more chances of network delay.

You are now connected to database "postgres" as user "postgres".

postgres=# \x
Expanded display is on.
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 3448
usesysid         | 10
usename          | postgres
application_name | walreceiver
client_addr      | 192.168.38.133
client_hostname  |
client_port      | 5401
backend_start    | 2020-01-03 17:58:05.884697-08
backend_xmin     |
state            | streaming
sent_lsn         | 0/14000060
write_lsn        | 0/14000060
flush_lsn        | 0/14000060
replay_lsn       | 0/14000060
write_lag        |
flush_lag        |
replay_lag       |
sync_priority    | 0
sync_state       | async
-[ RECORD 2 ]----+------------------------------
pid              | 3546
usesysid         | 10
usename          | postgres
application_name | walreceiver
client_addr      | 192.168.38.133
client_hostname  |
client_port      | 5404
backend_start    | 2020-01-03 18:07:46.055172-08
backend_xmin     |
state            | streaming
sent_lsn         | 0/14000060
write_lsn        | 0/14000060
flush_lsn        | 0/14000060
replay_lsn       | 0/14000060
write_lag        |
flush_lag        |
replay_lag       |
sync_priority    | 0
sync_state       | async

Here two standbys stream from the same master (I ran both replicas on one server, on different ports).

===============================
cascading replication:
==============================
Here one of the slaves acts as master for another slave. The direct dependency of multiple slaves on a single master is avoided, while at the same time we have several high-availability/replication servers for different purposes.

A master - B slave
B master - C slave

Here the two slaves do not both depend on the one master, so the load on the master server is lower; and if you want to run reporting queries on top of the slave servers, you can use both slaves.

For cascading replication within the same server:

e.g., Taking a base backup
$ /usr/local/pgsql/bin/pg_basebackup -D /var/lib/pgsql/slave/ -X fetch -R -p5432
$

standby_mode = 'on'
primary_conninfo = 'user=postgres port=5432'

Change the port in the new slave.

$ /usr/local/pgsql/bin/pg_ctl -D /var/lib/pgsql/slave/ start
waiting for server to start....2020-01-03 18:14:43.221 PST [13805] LOG:  listening on IPv4 address "0.0.0.0", port 5433
2020-01-03 18:14:43.221 PST [13805] LOG:  listening on IPv6 address "::", port 5433
2020-01-03 18:14:43.222 PST [13805] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5433"
2020-01-03 18:14:43.238 PST [13806] LOG:  database system was interrupted while in recovery at log time 2020-01-03 18:10:06 PST
2020-01-03 18:14:43.238 PST [13806] HINT:  If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2020-01-03 18:14:43.250 PST [13806] LOG:  entering standby mode
2020-01-03 18:14:43.253 PST [13806] LOG:  redo starts at 0/14000060
2020-01-03 18:14:43.255 PST [13806] LOG:  consistent recovery state reached at 0/14000140
2020-01-03 18:14:43.255 PST [13806] LOG:  invalid record length at 0/14000140: wanted 24, got 0
2020-01-03 18:14:43.255 PST [13805] LOG:  database system is ready to accept read only connections
2020-01-03 18:14:43.262 PST [13810] LOG:  started streaming WAL from primary at 0/14000000 on timeline 1
 done
server started

-- In primary ( master)

$ /usr/local/pgsql/bin/psql -p5432
psql (11.4)
Type "help" for help.

postgres=# \x
Expanded display is on.
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 13811
usesysid         | 10
usename          | postgres
application_name | walreceiver
client_addr      |
client_hostname  |
client_port      | -1
backend_start    | 2020-01-03 18:14:43.259911-08
backend_xmin     |
state            | streaming
sent_lsn         | 0/14000140
write_lsn        | 0/14000140
flush_lsn        | 0/14000140
replay_lsn       | 0/14000140
write_lag        |
flush_lag        |
replay_lag       |
sync_priority    | 0
sync_state       | async

postgres=#

The slave is also acting as master for another slave; we now have 2 slaves, but still no direct impact on top of the master.


Sunday, December 29, 2019

Backup and Restore in PostgreSQL

=============================
Copy data only in text format
=============================

hr1dev=# select * from test.emp;
 empno |  ename   |  job  |    dob     | deptno 
-------+----------+-------+------------+--------
   100 | Gouranga | Admin | 1983-01-10 |     10
   101 | John     | User  | 1987-11-12 |     20
(2 rows)


hr1dev=# copy test.emp to '/tmp/emp.txt';
COPY 2

$ cat /tmp/emp.txt
100     Gouranga        Admin   1983-01-10      10
101     John    User    1987-11-12      20

hr1dev=# copy emp to '/tmp/empdata.txt';
COPY 1000100
hr1dev=# delete from emp;
DELETE 1000100
hr1dev=# select count(*) from emp;           
 count 
-------
     0
(1 row)

hr1dev=# copy emp from '/tmp/empdata.txt';
COPY 1000100
hr1dev=# select count(*) from emp;
  count  
---------
 1000100
(1 row)

hr1dev=# 

======================
export with delimiter:
======================

hr1dev=# copy emp to '/tmp/emp_del.txt' with delimiter '|';
COPY 1000100
hr1dev=# 

======================
export with header:
======================
hr1dev=# copy emp to '/tmp/emp_head' with CSV HEADER;
COPY 1000100
hr1dev=# 

$ more /tmp/emp_head
no,name
1,data1
2,data2
3,data3
4,data4

=======================================
export data for specific query:
=======================================
hr1dev=#  copy (select * from emp where no>10000) to '/tmp/emp_query.txt';
COPY 990000



-----------------------------------------------------------
BACKUP AND RESTORE
-----------------------------------------------------------
In PostgreSQL we have 2 levels of backups.

1. Logical backup
2. Physical backup (file system backup/hot backup/online backup/offline backup)

Logical Backup:
---------------
A dump file is a logical backup: it does not contain any physical files, i.e. configuration files, data files, or transaction logs. To perform a logical backup we can use two utilities. These are:
   a. pg_dump
   b. pg_dumpall

a. pg_dump:
Using pg_dump, the following are possible:
1. Single database backup at a time
2. Single schema or multiple schemas backup at a time
3. Single table or multiple tables backup at a time
4. The dump can be taken in custom formats, i.e. binary custom, .tar, or directory format
5. Parallel jobs can be invoked
6. To restore, we can use the pg_restore utility; with parallel jobs this can be many times faster than replaying a plain dump

Using pg_dump, we can't perform the activities below:
1. We can't take a backup of global objects (roles, tablespaces)
2. A full-server logical backup (all databases at a time) is also not possible

b. pg_dumpall

Using pg_dumpall, the following are possible:
1. Complete server backup at a time (all databases backed up at once)
2. Global objects backup (only possible with this utility)

We can't do the following using the pg_dumpall utility:
1. Single database/table/schema backup is not possible.
2. Taking the dump in a custom format is not possible.
3. We can't run parallel jobs.
4. We can't use the pg_restore utility.
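
For example, to dump only the global objects (roles and tablespace definitions):

$ pg_dumpall -g > globals.sql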

To restore, we can use 2 utilities:
-----------------------------------------
1.psql
2.pg_restore (traditional utility)

Normally we have 2 ways of taking a dump:

1. Plain format: the dump file is readable; whatever statements you want to execute, you can simply run them as normal SQL statements. While restoring, you can't use the pg_restore utility if the dump is in plain format.
2. If your dump is in plain format and contains, say, the backup of 10 tables, you can't restore a single table from the dump; to restore manually you have to copy the respective SQL statements from the dump.
3. If the dump is in a readable format, there are many chances of data misuse; this can be a security-compliance issue.
4. So always prefer to take the dump in the custom format, which is binary, not readable. At the same time, we can restore a single table even though the dump contains multiple tables, and
we can use the pg_restore utility for the restore (see the example below).
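
For instance, restoring just one table from a custom-format dump (using the database and dump file from the examples below):

$ pg_restore -d hr1dev -t emp hr1dev.bin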


Syntax: pg_dump -d db_name -t table_name

-- Taking backup in plain text format

$ /usr/local/pgsql/bin/pg_dump -t test.emp -d hr1dev > emp.sql

-bash-4.2$ pwd
/var/lib/pgsql/backups
-bash-4.2$ cat emp.sql
--
-- PostgreSQL database dump
--
-- Dumped from database version 11.0
-- Dumped by pg_dump version 11.0

SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET client_min_messages = warning;
SET row_security = off;
SET default_tablespace = hr_tbs;
SET default_with_oids = false;
--
-- Name: emp; Type: TABLE; Schema: test; Owner: postgres; Tablespace: hr_tbs
--
CREATE TABLE test.emp (
    empno integer NOT NULL,
    ename character varying NOT NULL,
    job character varying,
    dob date,
    deptno integer
);
ALTER TABLE test.emp OWNER TO postgres;
--
-- Data for Name: emp; Type: TABLE DATA; Schema: test; Owner: postgres
--
COPY test.emp (empno, ename, job, dob, deptno) FROM stdin;
100     Gouranga        Admin   1983-01-10      10
101     John    User    1987-11-12      20
\.
SET default_tablespace = '';
--
-- Name: emp emp_pkey; Type: CONSTRAINT; Schema: test; Owner: postgres
--
ALTER TABLE ONLY test.emp
    ADD CONSTRAINT emp_pkey PRIMARY KEY (empno);

--
-- Name: TABLE emp; Type: ACL; Schema: test; Owner: postgres
--
GRANT SELECT,INSERT,DELETE,UPDATE ON TABLE test.emp TO gouranga;
--
-- PostgreSQL database dump complete
--
----------------------------------------------------------------------------------------------------------

=================
Restore from dump
=================

postgres=# \c hr1dev
You are now connected to database "hr1dev" as user "postgres".
hr1dev=# drop table test.emp; 
ERROR:  cannot drop table test.emp because other objects depend on it
DETAIL:  materialized view test.emp_mv depends on table test.emp
HINT:  Use DROP ... CASCADE to drop the dependent objects too.

hr1dev=# drop table test.emp CASCADE;
NOTICE:  drop cascades to materialized view test.emp_mv
DROP TABLE
hr1dev=# 


$ psql -d hr1dev -f emp.sql -o emp_hr1dev_imp.log

-f : file to execute (the plain-format dump)
-d : database to import into
-o : output log file

hr1dev=# select * from test.emp;
 empno |  ename   |  job  |    dob     | deptno 
-------+----------+-------+------------+--------
   100 | Gouranga | Admin | 1983-01-10 |     10
   101 | John     | User  | 1987-11-12 |     20
(2 rows)


======================================
Export dump in customized format:
======================================

pg_dump -Fc -d hr1dev -f hr1dev.bin -v 2>dbbackup.log

Taking a dump with parallel jobs (parallel dumps require the directory format, -Fd, with a directory as the target):
--------------------------------------
pg_dump -Fd -d hr1dev -f hr1dev_dir -v -j 2 2>dbbackup.log

Taking a backup from a remote system:
-----------------------------------------------
pg_dump -Fd -d hr1dev -f hr1dev_dir -v -j 2 -h 192.168.10.132 -p 5433 2>dbbackup.log

For a specific target location:
----------------
$ pg_dump -Fd -d hr1dev -f /fullpath/dbbackup -v -j 2 -h 192.168.10.132 -p 5433 2>dbbackup.log

example:
---------
$ pg_dump -Fc -d hr1dev -f hr1dev.bin -v 2>dbbackup.log
$ ll
total 22264
-rw-r--r--. 1 postgres postgres     3255 Dec 29 08:48 dbbackup.log
-rw-r--r--. 1 postgres postgres 22783616 Dec 29 08:48 hr1dev.bin
$ cat dbbackup.log
pg_dump: last built-in OID is 16383
pg_dump: reading extensions
pg_dump: identifying extension members
pg_dump: reading schemas
pg_dump: reading user-defined tables
pg_dump: reading user-defined functions
pg_dump: reading user-defined types
pg_dump: reading procedural languages
pg_dump: reading user-defined aggregate functions
pg_dump: reading user-defined operators
pg_dump: reading user-defined access methods
pg_dump: reading user-defined operator classes
pg_dump: reading user-defined operator families
pg_dump: reading user-defined text search parsers
pg_dump: reading user-defined text search templates
pg_dump: reading user-defined text search dictionaries
pg_dump: reading user-defined text search configurations
pg_dump: reading user-defined foreign-data wrappers
pg_dump: reading user-defined foreign servers
pg_dump: reading default privileges
pg_dump: reading user-defined collations
pg_dump: reading user-defined conversions
pg_dump: reading type casts
pg_dump: reading transforms
pg_dump: reading table inheritance information
pg_dump: reading event triggers
pg_dump: finding extension tables
pg_dump: finding inheritance relationships
pg_dump: reading column info for interesting tables
pg_dump: finding the columns and types of table "test.sample"
pg_dump: finding the columns and types of table "test.sample2"
pg_dump: finding the columns and types of table "public.emp"
pg_dump: finding the columns and types of table "public.emp_mv"
pg_dump: finding the columns and types of table "test.emp"
pg_dump: flagging inherited columns in subtables
pg_dump: reading indexes
pg_dump: reading indexes for table "test.emp"
pg_dump: flagging indexes in partitioned tables
pg_dump: reading extended statistics
pg_dump: reading constraints
pg_dump: reading triggers
pg_dump: reading rewrite rules
pg_dump: reading policies
pg_dump: reading row security enabled for table "test.sample"
pg_dump: reading policies for table "test.sample"
pg_dump: reading row security enabled for table "test.sample2"
pg_dump: reading policies for table "test.sample2"
pg_dump: reading row security enabled for table "public.emp"
pg_dump: reading policies for table "public.emp"
pg_dump: reading row security enabled for table "public.emp_mv"
pg_dump: reading policies for table "public.emp_mv"
pg_dump: reading row security enabled for table "test.seq1"
pg_dump: reading policies for table "test.seq1"
pg_dump: reading row security enabled for table "test.emp"
pg_dump: reading policies for table "test.emp"
pg_dump: reading publications
pg_dump: reading publication membership
pg_dump: reading publication membership for table "test.sample"
pg_dump: reading publication membership for table "test.sample2"
pg_dump: reading publication membership for table "public.emp"
pg_dump: reading publication membership for table "test.emp"
pg_dump: reading subscriptions
pg_dump: reading large objects
pg_dump: reading dependency data
pg_dump: saving encoding = UTF8
pg_dump: saving standard_conforming_strings = on
pg_dump: saving search_path = 
pg_dump: saving database definition
pg_dump: dumping contents of table "public.emp"
pg_dump: dumping contents of table "test.emp"
pg_dump: dumping contents of table "test.sample"
pg_dump: dumping contents of table "test.sample2"


Restore database from backup:
=============================
pg_restore -Fc -d hr1dev hr1dev.bin -v -j 2 2>db_hr1dev_restore.log


postgres=# drop database hr1dev;
DROP DATABASE
postgres=# \c hr1dev
FATAL:  database "hr1dev" does not exist
Previous connection kept
postgres=# 

postgres=# create database hr1dev;
CREATE DATABASE

$ pg_restore -Fc -d hr1dev hr1dev.bin -v -j 2 2>db_hr1dev_restore.log

$ cat db_hr1dev_restore.log
pg_restore: connecting to database for restore
pg_restore: processing item 3107 ENCODING ENCODING
pg_restore: processing item 3108 STDSTRINGS STDSTRINGS
pg_restore: processing item 3109 SEARCHPATH SEARCHPATH
pg_restore: processing item 3110 DATABASE hr1dev
pg_restore: processing item 6 SCHEMA test
pg_restore: creating SCHEMA "test"
pg_restore: processing item 199 TABLE emp
pg_restore: creating TABLE "public.emp"
pg_restore: processing item 200 MATERIALIZED VIEW emp_mv
pg_restore: creating MATERIALIZED VIEW "public.emp_mv"
pg_restore: processing item 202 TABLE emp
pg_restore: creating TABLE "test.emp"
pg_restore: processing item 197 TABLE sample
pg_restore: creating TABLE "test.sample"
pg_restore: processing item 198 TABLE sample2
pg_restore: creating TABLE "test.sample2"
pg_restore: processing item 201 SEQUENCE seq1
pg_restore: creating SEQUENCE "test.seq1"
pg_restore: entering main parallel loop
pg_restore: launching item 3101 TABLE DATA emp
pg_restore: launching item 3104 TABLE DATA emp
pg_restore: processing data for table "test.emp"
pg_restore: processing data for table "public.emp"
pg_restore: finished item 3104 TABLE DATA emp
pg_restore: launching item 3099 TABLE DATA sample
pg_restore: processing data for table "test.sample"
pg_restore: finished item 3099 TABLE DATA sample
pg_restore: launching item 3100 TABLE DATA sample2
pg_restore: processing data for table "test.sample2"
pg_restore: finished item 3100 TABLE DATA sample2
pg_restore: launching item 3113 SEQUENCE SET seq1
pg_restore: executing SEQUENCE SET seq1
pg_restore: finished item 3113 SEQUENCE SET seq1
pg_restore: launching item 2976 CONSTRAINT emp emp_pkey
pg_restore: creating CONSTRAINT "test.emp emp_pkey"
pg_restore: finished item 2976 CONSTRAINT emp emp_pkey
pg_restore: finished item 3101 TABLE DATA emp
pg_restore: launching item 3111 ACL SCHEMA test
pg_restore: launching item 3112 ACL TABLE emp
pg_restore: creating ACL "test.TABLE emp"
pg_restore: creating ACL "SCHEMA test"
pg_restore: finished item 3112 ACL TABLE emp
pg_restore: finished item 3111 ACL SCHEMA test
pg_restore: launching item 3102 MATERIALIZED VIEW DATA emp_mv
pg_restore: creating MATERIALIZED VIEW DATA "public.emp_mv"
pg_restore: finished item 3102 MATERIALIZED VIEW DATA emp_mv
pg_restore: finished main parallel loop

Testing:
postgres=# \c hr1dev
You are now connected to database "hr1dev" as user "postgres".
hr1dev=# \dt test.*
          List of relations
 Schema |  Name   | Type  |  Owner   
--------+---------+-------+----------
 test   | emp     | table | postgres
 test   | sample  | table | gouranga
 test   | sample2 | table | postgres
(3 rows)

hr1dev=# 


Example-2:
pg_dump -Fd -d demodb -f dbbackup -v -j 2 2>dbdemo_backup.log

postgres=# drop database demodb;
DROP DATABASE
postgres=# create database demodb;
CREATE DATABASE
postgres=# \q

-bash-4.2$ pg_restore -Fd -d demodb dbbackup -v -j 2 2>db_demodb_restore.log
-bash-4.2$ 
-bash-4.2$ cat db_demodb_restore.log
pg_restore: connecting to database for restore
pg_restore: processing item 3079 ENCODING ENCODING
pg_restore: processing item 3080 STDSTRINGS STDSTRINGS
pg_restore: processing item 3081 SEARCHPATH SEARCHPATH
pg_restore: processing item 3082 DATABASE demodb
pg_restore: processing item 196 TABLE emp
pg_restore: creating TABLE "public.emp"
pg_restore: entering main parallel loop
pg_restore: launching item 3076 TABLE DATA emp
pg_restore: processing data for table "public.emp"
pg_restore: finished item 3076 TABLE DATA emp
pg_restore: finished main parallel loop
-bash-4.2$ psql
psql (11.0)
Type "help" for help.

postgres=# \c demodb
You are now connected to database "demodb" as user "postgres".
demodb=# select count(1) from emp;
 count 
-------
    50
(1 row)

demodb=# 

==============================================
To take complete server level backup:
==============================================
1. Parallel is not possible.
2. Only plain format works.
3. Use pg_dumpall.

To Take backup:
----------------
$ pg_dumpall>fullserverbackup.sql
$ ls -ltr fullserverbackup.sql
-rw-r--r--. 1 postgres postgres 17786563 Dec 29 09:44 fullserverbackup.sql


To Restore
---------------
$ psql -f fullserverbackup.sql -o fullserverrestore.log

This dump contains users, user access-rights information, and tablespace information, which is rarely captured otherwise. But note that point-in-time recovery is not possible from it, even though it is a full-server backup: only with a physical backup can we do point-in-time recovery.


PHYSICAL BACKUP:

1. File system backup: after stopping the application and the jobs related to the databases, we issue a checkpoint and copy the data directory.
2. Hot backup: another name for an online backup (see 4).
3. Offline backup: stop the services and copy the data directory to another location using any Linux command.
4. Online backup: no need to stop jobs or services; simply take the backup using the pg_basebackup utility.


pg_basebackup:
===============
It creates an exact copy of the data directory with a proper checkpoint, so it gives consistency.
When triggered, it first issues a checkpoint and then starts the actual backup.
To take this backup you always need an empty target directory.
For a plain database you can take the full backup within the same server. But if you have tablespaces, the data directory is backed up into the target directory while each tablespace tries to back itself up into its existing location;
if that location is empty the backup succeeds, and if it is not empty we have to remap it (with -T old=new).


Example:

List tablespaces:

postgres=# \db+
                                      List of tablespaces
    Name    |  Owner   |     Location      | Access privileges | Options |  Size  | Description 
------------+----------+-------------------+-------------------+---------+--------+-------------
 hr_tbs     | postgres | /usr/pgsql/hr_tbs |                   |         | 16 kB  | 
 pg_default | postgres |                   |                   |         | 207 MB | 
 pg_global  | postgres |                   |                   |         | 574 kB | 
(3 rows)

postgres=# 


$ pg_basebackup -D /var/lib/pgsql/base_backup/ -X fetch -P -T /usr/pgsql/hr_tbs=/usr/pgsql/hr_tbs_new
229205/229205 kB (100%), 2/2 tablespaces


You can also take the backup in .tar format:

$ pg_basebackup -D /var/lib/pgsql/base_backup/ -X fetch -Ft -P -T /usr/pgsql/hr_tbs=/usr/pgsql/hr_tbs_new
229206/229206 kB (100%), 2/2 tablespaces


Note: ensure 'base_backup' is empty.

$ pwd
/var/lib/pgsql
$ cd base_backup/
-bash-4.2$ ls -ltr
total 229212
-rw-------. 1 postgres postgres     19968 Dec 29 10:18 24634.tar
-rw-------. 1 postgres postgres 234689024 Dec 29 10:18 base.tar


You can see here that all folders were backed up:

$ pwd
/var/lib/pgsql/base_backup
$ ls -ltr
total 60
-rw-------. 1 postgres postgres   226 Dec 29 10:14 backup_label
drwx------. 2 postgres postgres     6 Dec 29 10:14 pg_dynshmem
drwx------. 2 postgres postgres     6 Dec 29 10:14 pg_commit_ts
drwx------. 2 postgres postgres     6 Dec 29 10:14 pg_twophase
drwx------. 2 postgres postgres     6 Dec 29 10:14 pg_subtrans
drwx------. 2 postgres postgres     6 Dec 29 10:14 pg_snapshots
drwx------. 2 postgres postgres     6 Dec 29 10:14 pg_serial
drwx------. 2 postgres postgres     6 Dec 29 10:14 pg_notify
drwx------. 4 postgres postgres    34 Dec 29 10:14 pg_multixact
drwx------. 7 postgres postgres    62 Dec 29 10:14 base
drwx------. 2 postgres postgres    18 Dec 29 10:14 pg_tblspc
drwx------. 2 postgres postgres     6 Dec 29 10:14 pg_replslot
drwx------. 2 postgres postgres     6 Dec 29 10:14 pg_stat
-rw-------. 1 postgres postgres 23806 Dec 29 10:14 postgresql.conf
-rw-------. 1 postgres postgres    88 Dec 29 10:14 postgresql.auto.conf
drwx------. 2 postgres postgres    17 Dec 29 10:14 pg_xact
-rw-------. 1 postgres postgres     3 Dec 29 10:14 PG_VERSION
drwx------. 2 postgres postgres     6 Dec 29 10:14 pg_stat_tmp
drwx------. 4 postgres postgres    65 Dec 29 10:14 pg_logical
-rw-------. 1 postgres postgres  1636 Dec 29 10:14 pg_ident.conf
-rw-------. 1 postgres postgres  4513 Dec 29 10:14 pg_hba.conf
drwx------. 2 postgres postgres  4096 Dec 29 10:14 pg_log
drwx------. 3 postgres postgres    28 Dec 29 10:14 hr_tbs_new
drwx------. 2 postgres postgres     6 Dec 29 10:14 hr_tbs
-rw-------. 1 postgres postgres    47 Dec 29 10:14 current_logfiles
drwx------. 2 postgres postgres  4096 Dec 29 10:14 global
drwx------. 3 postgres postgres    58 Dec 29 10:14 pg_wal
-bash-4.2$ 

In tar format there is no need to specify tablespace remapping; pg_basebackup creates one tar file per tablespace.

$ pwd
/var/lib/pgsql/backup
$ pg_basebackup -D /var/lib/pgsql/backup/ -X fetch -Ft -P
229206/229206 kB (100%), 2/2 tablespaces

$ ls -ltr
total 229212
-rw-------. 1 postgres postgres     19968 Dec 29 10:23 24634.tar
-rw-------. 1 postgres postgres 234689536 Dec 29 10:23 base.tar
$ pwd
/var/lib/pgsql/backup

================================
Restore from Full Backup
================================

[postgres@example02 backup]$ tar -xvf base.tar
[postgres@example02 backup]$ rm base.tar

[postgres@example02 backup]$ tar -xvf 16613.tar
PG_12_201909212/

[postgres@example02 backup]$ cat tablespace_map
16613 /var/lib/pgsql/hr_tbs_new


Update the tablespace_map file with the new tablespace location, and change the port in the postgresql.conf of the backup directory.

[postgres@example02 backup]$ chmod 0700 /var/lib/pgsql/backup/

[postgres@example02 backup]$ pg_ctl -D /var/lib/pgsql/backup start
waiting for server to start....    3372 2019-12-24 17:58:55 PST  1LOG:  starting PostgreSQL 12.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4), 64-bit
    3372 2019-12-24 17:58:55 PST  2LOG:  listening on IPv6 address "::1", port 5433
    3372 2019-12-24 17:58:55 PST  3LOG:  listening on IPv4 address "127.0.0.1", port 5433
    3372 2019-12-24 17:58:55 PST  4LOG:  listening on IPv4 address "192.168.38.132", port 5433
    3372 2019-12-24 17:58:55 PST  5LOG:  listening on Unix socket "/tmp/.s.PGSQL.5433"
    3372 2019-12-24 17:58:55 PST  6LOG:  redirecting log output to logging collector process
    3372 2019-12-24 17:58:55 PST  7HINT:  Future log output will appear in directory "log".
 done
server started
[postgres@example02 backup]$ ps -ef|grep postgres
root       2815   2784  0 16:47 pts/0    00:00:00 su - postgres
postgres   2816   2815  0 16:47 pts/0    00:00:00 -bash
postgres   2851      1  0 16:47 ?        00:00:00 /usr/local/pgsql/bin/postgres
postgres   2852   2851  0 16:47 ?        00:00:00 postgres: logger
postgres   2854   2851  0 16:47 ?        00:00:00 postgres: checkpointer
postgres   2855   2851  0 16:47 ?        00:00:00 postgres: background writer
postgres   2856   2851  0 16:47 ?        00:00:00 postgres: walwriter
postgres   2857   2851  0 16:47 ?        00:00:00 postgres: stats collector
postgres   2858   2851  0 16:47 ?        00:00:00 postgres: logical replication launcher
postgres   3372      1  0 17:58 ?        00:00:00 /usr/local/pgsql/bin/postgres -D /var/lib/pgsql/backup
postgres   3373   3372  0 17:58 ?        00:00:00 postgres: logger
postgres   3375   3372  0 17:58 ?        00:00:00 postgres: checkpointer
postgres   3376   3372  0 17:58 ?        00:00:00 postgres: background writer
postgres   3377   3372  0 17:58 ?        00:00:00 postgres: walwriter
postgres   3378   3372  0 17:58 ?        00:00:00 postgres: stats collector
postgres   3379   3372  0 17:58 ?        00:00:00 postgres: logical replication launcher
postgres   3380   2816  0 17:59 pts/0    00:00:00 ps -ef
postgres   3381   2816  0 17:59 pts/0    00:00:00 grep --color=auto postgres
[postgres@example02 backup]$


Now 2 instances are running on the same server.
With the same binaries, but different data directories and different ports, we can start any number of instances.

[postgres@example02 backup]$ psql -p5433
psql (12.0)
Type "help" for help.

postgres=# show data_directory;
    data_directory
-----------------------
 /var/lib/pgsql/backup
(1 row)

postgres=#


[postgres@example02 backup]$ pg_ctl -D /var/lib/pgsql/DATA/ status
pg_ctl: server is running (PID: 2851)
/usr/local/pgsql/bin/postgres
[postgres@example02 backup]$ pg_ctl -D /var/lib/pgsql/backup/ status
pg_ctl: server is running (PID: 3372)
/usr/local/pgsql/bin/postgres "-D" "/var/lib/pgsql/backup"
[postgres@example02 backup]$


In practice the backup strategy will be physical only: a daily full backup.
There is no incremental backup concept in PostgreSQL.
Typically a script is scheduled that creates a date-stamped folder before each backup, as in the sketch below.
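
A minimal sketch of such a script (paths are hypothetical; adjust to your environment):

#!/bin/bash
# Daily full physical backup into a date-stamped folder.
BACKUP_ROOT=/var/lib/pgsql/backups        # hypothetical backup root
TARGET="$BACKUP_ROOT/$(date +%Y-%m-%d)"
mkdir -p "$TARGET"                        # pg_basebackup needs an empty directory
pg_basebackup -D "$TARGET" -Ft -X fetch -P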