How To Backup and Restore a PostgreSQL Database
In this blog, the author will take the reader through the step-by-step process of how to backup and restore a PostgreSQL database.
Join the DZone community and get the full member experience.
Join For FreeIn this blog, you will learn how to backup and restore a PostgreSQL database. Enjoy!
1. Introduction
Some time ago, I needed to back up a PostgreSQL database from a production server in order to be able to fix a problem that was difficult to reproduce in the test environment. It appeared that I could not find the answer very quickly by means of a Google search. After a while, I managed to find out the commands I needed to use, and it seemed to me a good idea to share this knowledge.
In order to test the commands, I will be using a database I created for a previous blog. The schema is quite basic and is shown here for information purposes only. It is not relevant for the remainder of the post. The database is called myjhipsterplanet
and that is the one you will back up and restore in this post.
2. Prerequisites
Prerequisites needed for this blog:
- Ubuntu 22.04 is used;
- Basic knowledge of Docker and Docker Compose is needed;
- Basic PostgreSQL knowledge is needed. PostgreSQL 14.5 is used.
3. Preparation
In this section, a database is prepared, which you will use to backup and restore. Skip this section if you just want to know how to use the backup and restore commands.
The database backup and Docker Compose file used in this section are available at GitHub.
The Docker Compose file will start a PostgreSQL database containing a basic setup where the default admin user postgres
is replaced with a user mypostgresqldumpplanet
. The PostgreSQL data is mounted to the directory ~/docker/volumes/postgresqldata
.
version: '3.8'
services:
mypostgresqldumpplanet-postgresql:
image: postgres:14.5
volumes:
- ~/docker/volumes/postgresqldata/:/var/lib/postgresql/data/
environment:
- POSTGRES_USER=mypostgresqldumpplanet
- POSTGRES_PASSWORD=
- POSTGRES_HOST_AUTH_METHOD=trust
ports:
- 127.0.0.1:5432:5432
Execute the following command from the root of the repository in order to start a PostgreSQL Docker container.
$ docker compose -f postgresql.yml up -d
[+] Running 2/2
Network mypostgresqldumpplanet_default Created 0.1s
Container mypostgresqldumpplanet-mypostgresqldumpplanet-postgresql-1 Started 0.4s
Copy the database dump file from the root of the repository to the mounted PostgreSQL directory.
$ sudo cp 2023-04-01-original.dump ~/docker/volumes/postgresqldata/
Now you need to open a bash shell inside the container. This can be done with the docker exec
command. Inside the container, the Linux user postgres
has to be used; therefore, the -u
argument needs to be passed.
$ docker exec -it -u postgres mypostgresqldumpplanet-mypostgresqldumpplanet-postgresql-1 bash
postgres@4934dd659171:/$
Now you have access to bash
inside the container. You need to restore the database from the dump by means of pg_restore
.
$ pg_restore -U mypostgresqldumpplanet -d postgres -C /var/lib/postgresql/data/2023-04-01-original.dump
pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 3382; 1262 16384 DATABASE myjhipsterplanet myjhipsterplanet
pg_restore: error: could not execute query: ERROR: role "myjhipsterplanet" does not exist
Command was: ALTER DATABASE myjhipsterplanet OWNER TO myjhipsterplanet;
...
The parameters used are explained:
-U mypostgresqldumpplanet
: the admin database user.-d postgres
: you need to provide a database thatpg_restore
can connect to. Because this is an empty instance of PostgreSQL, you use thepostgres
database, which is always available with the-d
argument.-C
: Because you have an empty PostgreSQL database, you need to provide the argument-C
in order thatpg_restore
(the backup restore tool) will create the database for you./var/lib/postgresql/data/2023-04-01-original.dmp
: The file location of the database dump. The database dump is accessible via the mount in the directory/var/lib/postgresql/data/
.
As you can see in the output, the restore was not completely successful because of a missing role.
Enter the database, connect to the database postgres
with the user mypostgresqldumpplanet
.
$ psql -d postgres -U mypostgresqldumpplanet
Create the missing role.
postgres=# CREATE ROLE myjhipsterplanet;
CREATE ROLE
Check which databases exist with the \l
command.
postgres=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
------------------------+------------------------+----------+------------+------------+---------------------------------------------------
myjhipsterplanet | myjhipsterplanet | UTF8 | en_US.utf8 | en_US.utf8 |
mypostgresqldumpplanet | mypostgresqldumpplanet | UTF8 | en_US.utf8 | en_US.utf8 |
postgres | mypostgresqldumpplanet | UTF8 | en_US.utf8 | en_US.utf8 |
template0 | mypostgresqldumpplanet | UTF8 | en_US.utf8 | en_US.utf8 | =c/mypostgresqldumpplanet +
| | | | | mypostgresqldumpplanet=CTc/mypostgresqldumpplanet
template1 | mypostgresqldumpplanet | UTF8 | en_US.utf8 | en_US.utf8 | =c/mypostgresqldumpplanet +
| | | | | mypostgresqldumpplanet=CTc/mypostgresqldumpplanet
(5 rows)
Note that the database myjhipsterplanet
is created. However, not everything was created correctly, so drop the database.
postgres=# DROP DATABASE myjhipsterplanet;
DROP DATABASE
Use the quit
command to exit the postgres
prompt.
Try the restore command again, and this time the restore is successful.
Connect via psql
to the database.
Use the \c
command to use the myjhipsterplanet
database.
postgres=# \c myjhipsterplanet
You are now connected to database "myjhipsterplanet" as user "mypostgresqldumpplanet".
Verify that the customer
table contains data.
myjhipsterplanet=# select * from customer;
id | customer_name | company_id
----+-------------------------------+------------
1 | Assistant |
2 | Market |
3 | program payment User-friendly |
4 | Team-oriented |
5 | index |
6 | auxiliary experiences |
7 | Sahara |
8 | Kong deposit |
9 | indexing Ball |
10 | users |
(10 rows)
In order to be able to verify the database will be restored correctly, later on, you change one of the customer records.
myjhipsterplanet=# UPDATE customer SET customer_name = 'Assistent1' WHERE id = 1;
4. Backup Database
In order to backup the database, pg_dump
is used. The official documentation of pg_dump
can be found at the PostgreSQL website. Also, help information can be retrieved with the following command:
postgres@4934dd659171:/$ pg_dump --help
If you are running PostgreSQL inside a Docker container, you need to have access to a bash shell inside the container. First, you need to retrieve the name of the Docker container.
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
9dea76770a2e postgres:14.5 "docker-entrypoint.s…" 5 seconds ago Up 4 seconds 127.0.0.1:5432->5432/tcp mypostgresqldumpplanet-mypostgresqldumpplanet-postgresql-1
Open a bash shell inside the container, as described above:
$ docker exec -it -u postgres mypostgresqldumpplanet-mypostgresqldumpplanet-postgresql-1 bash
postgres@4934dd659171:/$
There are four formats for creating a backup, and each of them is described in the next sections.
4.1 Plain Text SQL Backup
By default, pg_dump
will create a plain text SQL backup. Do not use this default if you want to use the backup to restore with pg_restore
.
postgres@4934dd659171:/$ pg_dump -f /var/lib/postgresql/data/2023-04-01-plaintext.sql -U mypostgresqldumpplanet myjhipsterplanet
The parameters explained:
-f /var/lib/postgresql/data/2023-04-01-plaintext.sql
: The name of the backup file;-U mypostgresqldumpplanet
: The name of the PostgreSQL admin user, often this will bepostgres
;myjhipsterplanet
: The name of the database you want to backup.
4.2 Custom Backup
The PostgreSQL custom backup format. Compressed by default and most likely the best option to use for creating the backup.
postgres@4934dd659171:/$ pg_dump -F c -f /var/lib/postgresql/data/2023-04-01-custom.dump -U mypostgresqldumpplanet myjhipsterplanet
The arguments are similar for the plain text SQL backup, only here, you need to add the parameter -F c
in order to choose the custom backup format.
4.3 Directory Backup
The directory backup format. Compressed by default and creates a directory with one file for each table and blob being dumped. This format also supports parallel dumps.
postgres@4934dd659171:/$ pg_dump -F d -f /var/lib/postgresql/data/2023-04-01-directory -U mypostgresqldumpplanet myjhipsterplanet
The arguments are similar for the plain text SQL backup, only here, you need to add the parameter -F d
in order to choose the directory backup format.
Some websites explaining how to create PostgreSQL backups do not use the -f
parameter for specifying the output file. Instead, they redirect the output of the pg_dump
command to a file with the greater-than-sign (>
). However, this will not work with the directory backup format as you will encounter the following error:
postgres@4934dd659171:/$ pg_dump -F d -U mypostgresqldumpplanet myjhipsterplanet > /var/lib/postgresql/data/2023-04-01-directory/
bash: /var/lib/postgresql/data/2023-04-01-directory/: Is a directory
4.4 Tar Backup
The tar backup format. Does not support compression but extracting the tar leads to a valid directory format.
postgres@4934dd659171:/$ pg_dump -F t -f /var/lib/postgresql/data/2023-04-01-tar.tar -U mypostgresqldumpplanet myjhipsterplanet
The arguments are similar as for the plain text SQL backup, only here you need to add the parameter -F t
in order to choose for the tar backup format.
5. Restore Database
If you follow the steps in this blog, you first need to restore the changed customer record to its original value. Connect to the database and change the record again.
myjhipsterplanet=# UPDATE customer SET customer_name = 'Assistent' WHERE id = 1;
In order to backup the database, pg_restore
is used. The official documentation of pg_restore
can be found at the PostgreSQL website. Also, help information can be retrieved with the following command:
postgres@4934dd659171:/$ pg_restore --help
5.1 Plain Text SQL Restore
As mentioned before, this will not work with pg_restore
. The following error will be shown:
postgres@4934dd659171:/$ pg_restore -U mypostgresqldumpplanet -d myjhipsterplanet -c /var/lib/postgresql/data/2023-04-01-plaintext.sql
pg_restore: error: input file appears to be a text format dump. Please use psql.
5.2 Custom Restore
The custom restore works as a charm:
postgres@4934dd659171:/$ pg_restore -U mypostgresqldumpplanet -d myjhipsterplanet -c /var/lib/postgresql/data/2023-04-01-custom.dump
The parameters explained:
-U mypostgresqldumpplanet
: The name of the PostgreSQL admin user, often this will bepostgres
;- –
d myjhipsterplanet
: The name of the database you want to restore; -c
: In order to clean the database objects before restoring them;/var/lib/postgresql/data/2023-04-01-custom.dump
: The name of the backup file.
Connect to the database and show the customer with id 1.
postgres@4934dd659171:/$ psql -d postgres -U mypostgresqldumpplanet
postgres=# \c myjhipsterplanet
myjhipsterplanet=# select * from customer where id = 1;
id | customer_name | company_id
----+---------------+------------
1 | Assistent1 |
(1 row)
As you can see, the value is Assistent1
, the value from the backup.
5.3 Directory Restore
The directory restore can be done as follows:
postgres@4934dd659171:/$ pg_restore -U mypostgresqldumpplanet -d myjhipsterplanet -c /var/lib/postgresql/data/2023-04-01-directory/
The parameters are similar to the custom format. The pg_restore
command will figure out for itself which format is used in the backup file.
5.4 Tar Restore
And for completeness, the tar restore.
postgres@4934dd659171:/$ pg_restore -U mypostgresqldumpplanet -d myjhipsterplanet -c /var/lib/postgresql/data/2023-04-01-tar.tar
6. What Is The Mounted Docker Volume?
When you are running PostgreSQL as a Docker container, the directory /var/lib/postgresql/data
will be mounted to a host directory. How can you find this directory? First, retrieve the name of the Docker container.
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4934dd659171 postgres:14.5 "docker-entrypoint.s…" 3 hours ago Up 3 hours 127.0.0.1:5432->5432/tcp mypostgresqldumpplanet-mypostgresqldumpplanet-postgresql-1
The Docker inspect
command will dump the mounts of the container, and here you will find the mounted volumes.
$ docker inspect mypostgresqldumpplanet-mypostgresqldumpplanet-postgresql-1
...
"Mounts": [
{
"Type": "bind",
"Source": "/home/<user>/docker/volumes/postgresqldata",
"Destination": "/var/lib/postgresql/data",
"Mode": "rw",
"RW": true,
"Propagation": "rprivate"
}
],
...
7. Clean Up
If you followed the steps in this blog, you need to do some cleanup.
Stop the PostgreSQL Docker container with the following command executed from the root of the Git repository.
$ docker compose -f postgresql.yml down
[+] Running 2/2
Container mypostgresqldumpplanet-mypostgresqldumpplanet-postgresql-1 Removed
Network mypostgresqldumpplanet_default Removed
8. Conclusion
In this post, you learned how to backup and restore a PostgreSQL database. Interesting information on how to optimize backups can be found here.
Published at DZone with permission of Gunter Rotsaert, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments