Why You Need To Backup Your Postgres Database and How To Do It
This article is going to demonstrate why database backups are crucial and how to back up your PostgreSQL database using pg_dump and pg_dumpall.
Join the DZone community and get the full member experience.
Join For FreeDo you remember the last time you worked hard on an essay or paper, only to lose it all when Word or your computer suddenly crashed? Hours of work are gone because you didn't hit the save button often enough. That sort of frustration could drive even the bravest of souls to tears.
You might be able to afford this kind of data loss once in a while, but now imagine it's an entire database with Terrabytes of information—not just one document. Especially if that information is in a production environment, the damage could be catastrophic, costing millions of dollars. Not even mature organizations can afford these mistakes, and database backups prevent precisely this kind of situation.
Why Backing up Databases is Necessary
Database Backups will see that users suffer minimal losses during a media, software, or hardware failure. These failures may be caused by innocent mistakes or perpetrators with malicious intent. The loss of valuable information can not only irreparably tarnish reputations but can also potentially be grounds for legal action.
For instance, the UK Prison System lost the release dates and personal information of over eighty thousand inmates, resulting in the loss of a profitable contract in 2008. A web hosting company known as DreamHost had a bug in their routers, which deleted several records that were not restorable. They had to issue a public apology to their customers and work very hard to earn their trust again.
In case of any mishap, the backed-up data can get restored, and the database can be up and running again as quickly as possible. Keeping a backup of your databases can also prove to be helpful in the long run when you want to look back upon your old data for things like taxes, audits, and annual reports for stakeholders and investors. You can even use your data backups to improve your client relationships by leveraging the data in these backups to increase marketing and sales.
However, backing up a database is not as easy as plugging in a hard drive and transferring all your data.
The What, Where, and How of Database Backups
A database has several components in addition to its data files. First, it is necessary to back up the control file, parameter file, and log files. Only then will the database be able to replicate the structure of the restored data, initialize parameters and reflect the latest changes made in the database. It is also crucial to preserve passwords that you might need to enter during recovery.
A database backup involves backing up the operational state and stored data and the architecture of the database software. Then, in case of any unforeseen circumstance, one can create a copy or duplicate instance of the old database.
The data backed up can be stored on your USB drives or external hard drives if you need it to be portable. You can use online backup services to access them anywhere, but you need to ensure that your data is encrypted and password protected. If you have a local area network, you can back up your files to another server or device through it too.
Responsibilities of the DBA
The DBA has to arrive at a plan that includes which components to back up and the type of backup. For example, cold backups occur after the database is shut down, while hot backups happen even while the database is running. Often, these decisions depend on the kind of failure you are preparing for. For instance, you can't overcome hardware failures without restoring the system entirely and hence require the backup of the server OS.
Another way to categorize backups is as full or incremental backups. Full backups are straightforward, as they store a copy of all the existing files. However, large databases are backed up using multiple channels and use incremental backup techniques to reduce the amount of time taken for the process. Incremental backups back up only those files that get updated after the last backup was performed. Despite being complex to locate, they save storage. We can retrieve full backups quickly, but they are heavy on the disk.
These backups are scheduled for times when the database is not heavily used, ensuring its performance is not compromised. Even some tools allow you to automatically schedule these backups and send alerts to rerun if they fail.
Things To Keep in Mind
Before devising a plan to back up your databases, you must consider how much you can spend on the process, where you would like to store your backed up data, and what method you would use. You should also consider whether your data will grow in the long run and if your backup plan needs to be scalable.
In a business, you must have all stakeholders on the same page concerning your disaster recovery plan. This plan is the order in which the IT team would restore the databases. After all, it is everybody's responsibility to ensure that the more important applications must be available at the earliest.
Let us now see how to back up a PostgreSQL database.
Backing up a PostgreSQL Database
In PostgreSQL, you have a utility called the pg_dump
that can perform backups. To back up your database onto a plain-text SQL file, simply log in to your database server. Then switch to the Postgres account, and run the command.
pg_dump database_name > filename.sql
The plain-text SQL file will have the commands required to reconstruct the entire database to its state when backed up.
For any other file type, you can use the -F
format specifier. Should you need the backup to be stored as a custom archive file, append -F c
before the database name. For a directory or tar format archive, use the alphabet d
and t
, respectively. pg_restore
can restore the backed-up databases in any format. So, it shouldn't be a concern when you're deciding which format to use.
pg_dump -F c database_name > filename.dump
pg_dump -F t database_name > filename.tar
Sometimes, you may want to store the backup in a new directory and not a file. In this case, you can replace the > filename
with -f directory
. This will create a new directory with this name, which means you cannot give it an already existing name.
pg_dump -F d database_name -f directory
pg_dump
can only dump a single database. But isn't it a little tedious to run such commands for each of your databases? If you'd like to back them all up in an instant, use the pg_dumpall
tool.
pg_dumpall > alldbdump.sql
You can also use
pg_dumpall
to back up global objects like roles and tablespaces that don't belong to a single database but are commonly used across the entire system.
Restoring a Database on PostgreSQL
If you have taken your backup in a text file, the PL/SQL utility can restore these files.
plsql database_name < filename.sql
pg_restore can restore the files backed up in the custom, directory, or tar formats.
pg_restore -d database_name filename.dump
pg_restore -d database_name filename.tar
pg_restore -d database_name dumpdirectory
pg_restore has several other command-line arguments that you can use to specify certain operations. For instance, -a restores the data alone and not its definitions. Likewise, -c drops the database objects before creating them again, and you can use -t to restore a particular table.
You can check out all the command line arguments here.
Backing up Large databases
If you're backing up a huge database, chances are you can't afford the space to store all of it directly. In such cases, you can compress the dump file and then store it. One example of a tool that facilitates this is - gzip.
Pg_dump database_name | gzip > dumpname.g
With massive databases, dumping with compression can be a time taking process. PostgreSQL rids you of this concern, too, as it allows you to dump several files in parallel. You can use the -j flag to do the same.
pg_dump -F d -j 5 -f dumpdirectory
Although this process saves your time, it could potentially put a higher load on your database server.
Backing up Remote Databases
PostgreSQL's pg_dump tool makes backing up remote database servers easy with the help of command-line options. All you need to do is use -h and -p to specify the remote host and remote port that the database listener is on, respectively.
pg_dump -U role -h 20.20.20.20 -p 7777 database_name
The -U flag is used to mention the role to connect to your database as. This is important to ensure that you have the proper privileges to access the content on the database. If you still have trouble accessing it, you might want to check if the database authentication methods are configured on the database server.
You can also transfer your database from one server to another in this manner:
pg_dump -U role -h 10.10.10.10 database | plsql -U role -h 20.20.20.20 database_name
Automatic Backups
You can schedule the backups of your databases at timely intervals through cron jobs on PostgreSQL. To configure one such job, log in through the superuser and give these commands:
First, create a directory for your backups and then edit the crontab to add a new job.
mkdir -p /srv/backups/databases
crontab -e
At the end of the crontab, copy/paste this:
pg_dump -U postgres database_name > /srv/backups/postgres/database_name.sql
The dump format, however, is your choice. Once done, the cron service will begin to run it and ensure that your data will get backed up at midnight daily.
Conclusion
Database backups are essential to ensure your data is always safe and within your reach. Choosing your backup type and method depends on various factors. Whether you want to back up the data of an entire business or just one on your server, there are many options available.
Now that you have seen how to back up a PostgreSQL database, go ahead and try it on your database!
Published at DZone with permission of Felix Schildorfer. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments