13 Tips to Improve PostgreSQL Insert Performance
Get a cheatsheet full of ways to improve your database ingest (INSERT) performance and speed up your time-series queries using PostgreSQL.
Join the DZone community and get the full member experience.
Join For FreeIngest performance is critical for many common PostgreSQL use cases, including application monitoring, application analytics, IoT monitoring, and more. While databases have long had time fields, there's a key difference in the type of data these use cases collect: unlike standard relational "business" data, changes are treated as inserts, not overwrites (in other words, every new value becomes a new row in the database, instead of replacing the row's prior value with the latest one).
If you're operating in a scenario where you need to retain all data v. overwriting past values, optimizing the speed in which your database can ingest new data becomes essential.
We have a lot of experience optimizing performance for ourselves and our community members, and we’ve broken our top tips into two categories. First, we’ve outlined a few tips that are useful for improving PostgreSQL in general. After that, we’ve outlined a few that are specific to TimescaleDB.
Improve PostgreSQL Performance
Here are some best practices for improving ingest performance in vanilla PostgreSQL:
1. Use Indexes in Moderation
Having the right indexes can speed up your queries, but they’re not a silver bullet. Incrementally maintaining indexes with each new row requires additional work. Check the number of indexes you’ve defined on your table (use the psql
command \d table_name
), and determine whether their potential query benefits outweigh the storage and insert overhead. Since every system is different, there aren’t any hard and fast rules or “magic number” of indexes – just be reasonable.
2. Reconsider Foreign Key Constraints
Sometimes it's necessary to build foreign keys (FK) from one table to other relational tables. When you have an FK constraint, every INSERT will typically then need to read from your referenced table, which can degrade performance. Consider if you can denormalize your data – we sometimes see pretty extreme use of FK constraints, done from a sense of “elegance” rather than engineering tradeoffs.
3. Avoid Unnecessary UNIQUE Keys
Developers are often trained to specify primary keys in database tables, and many ORMs love them. Yet, many use cases – including common monitoring or time-series applications – don’t require them, as each event or sensor reading can simply be logged as a separate event by inserting it at the tail of a hypertable's current chunk during write time.
If a UNIQUE constraint is otherwise defined, that insert can necessitate an index lookup to determine if the row already exists, which will adversely impact the speed of your INSERT.
4. Use Separate Disks for WAL and Data
While this is a more advanced optimization that isn't always needed, if your disk becomes a bottleneck, you can further increase throughput by using a separate disk (tablespace) for the database's write-ahead log (WAL) and data.
5. Use Performant Disks
Sometimes developers deploy their database in environments with slower disks, whether due to poorly-performing HDD, remote SANs, or other types of configurations. And because when you are inserting rows, the data is durably stored to the write-ahead log (WAL) before the transaction completes, slow disks can impact insert performance. One thing to do is check your disk IOPS using the ioping
command.
Read test:
$ ioping -q -c 10 -s 8k .
--- . (hfs /dev/disk1 930.7 GiB) ioping statistics ---
9 requests completed in 208 us, 72 KiB read, 43.3 k iops, 338.0 MiB/s
generated 10 requests in 9.00 s, 80 KiB, 1 iops, 8.88 KiB/s
min/avg/max/mdev = 18 us / 23.1 us / 35 us / 6.17 us
Write test:
xxxxxxxxxx
$ ioping -q -c 10 -s 8k -W .
--- . (hfs /dev/disk1 930.7 GiB) ioping statistics ---
9 requests completed in 10.8 ms, 72 KiB written, 830 iops, 6.49 MiB/s
generated 10 requests in 9.00 s, 80 KiB, 1 iops, 8.89 KiB/s
min/avg/max/mdev = 99 us / 1.20 ms / 2.23 ms / 919.3 us
You should see at least 1000s of read IOPS and many 100s of write IOPS. If you are seeing far fewer, your INSERT performance is likely to be impacted by your disk hardware. See if alternative storage configurations are feasible.
Choose and Configure TimescaleDB for Better Ingest Performance
TimescaleDB is tuned to improve ingest performance. The most common uses for TimescaleDB involve storing massive amounts of data for cloud infrastructure metrics, product analytics, web analytics, IoT devices, and many other time-series use cases. As is typical with time-series data, these scenarios are time-centric, almost solely append-only (lots of INSERTs), and require fast ingestion of large amounts of data within small time windows.
TimescaleDB is packaged as an extension to PostgreSQL and is purpose-built for time-series use cases. So, if getting faster ingest performance out of PostgreSQL is necessary for your applications or systems, consider using TimescaleDB (available fully-managed via Timescale Cloud – our database-as-a-service offering, or self-managed via our free Community Edition).
...and here are 8 more techniques for improving ingest performance with TimescaleDB:
6. Use Parallel Writes
Each INSERT or COPY command to TimescaleDB (as in PostgreSQL) is executed as a single transaction and thus runs in a single-threaded fashion. To achieve higher ingest, you should execute multiple INSERTS or COPY commands in parallel.
For help with bulk loading large CSV files in parallel, check out TimescaleDB's parallel copy command.
Pro tip: make sure your client machine has enough cores to execute this parallelism (running 32 client workers on a 2 vCPU machine doesn’t help much – the workers won’t actually be executed in parallel).
7. Insert Rows in Batches
In order to achieve higher ingest rates, you should insert your data with many rows in each INSERT call (or else use some bulk insert command, like COPY or our parallel copy tool).
Don't insert your data row-by-row – instead, try at least hundreds (or thousands) of rows per INSERT. This allows the database to spend less time on connection management, transaction overhead, SQL parsing, etc., and more time on data processing.
8. Properly Configure shared_buffers
We typically recommend 25% of available RAM. If you install TimescaleDB via a method that runs timescaledb-tune
, it should automatically configure shared_buffers
to something well-suited to your hardware specs.
Note: in some cases, typically with virtualization and constrained cgroups memory allocation, these automatically-configured settings may not be ideal. To check that your shared_buffers
are set to within the 25% range, run SHOW shared_buffers
from your psql
connection.
9. Run our Docker Images on Linux Hosts
If you are running a TimescaleDB Docker container (which runs Linux) on top of another Linux operating system, you're in great shape. The container is basically providing process isolation, and the overhead is extremely minimal.
If you're running the container on a Mac or Windows machine, you'll see some performance hits for the OS virtualization, including for I/O.
Instead, if you need to run on Mac or Windows, we recommend installing directly instead of using a Docker image.
10. Write Data in Loose Time Order
When chunks are sized appropriately (see #11 and #12), the latest chunk(s) and their associated indexes are naturally maintained in memory. New rows inserted with recent timestamps will be written to these chunks and indexes already in memory.
If a row with a sufficiently older timestamp is inserted – i.e., it's an out-of-order or backfilled write – the disk pages corresponding to the older chunk (and its indexes) will need to be read in from disk. This will significantly increase write latency and lower insert throughput.
Particularly, when you are loading data for the first time, try to load data in sorted, increasing timestamp order.
Be careful if you're bulk loading data about many different servers, devices, and so forth:
- Do not bulk insert data sequentially by server (i.e., all data for server A, then server B, then C, and so forth). This will cause disk thrashing as loading each server will walk through all chunks before starting anew.
- Instead, arrange your bulk load so that data from all servers are inserted in loose timestamp order (e.g., day 1 across all servers in parallel, then day 2 across all servers in parallel, etc.)
11. Avoid “Too Large” Chunks
To maintain higher ingest rates, you want your latest chunk, as well as all its associated indexes, to stay in memory, so that writes to the chunk and index updates merely update memory. (The write is still durable, as inserts are written to the WAL on disk before the database pages are updated.)
If your chunks are too large, then writes to even the latest chunk will start swapping to disk.
As a rule-of-thumb, we recommend that the latest chunks and all their indexes fit comfortably within the database's shared_buffers
. You can check your chunk sizes via the chunk_relation_size_pretty
SQL command.
xxxxxxxxxx
=> SELECT chunk_table, table_size, index_size, toast_size, total_sizeFROM chunk_relation_size_pretty('hypertable_name')ORDER BY ranges DESC LIMIT 4;
chunk_table | table_size | index_size | toast_size | total_size
-----------------------------------------+------------+------------+------------+------------
_timescaledb_internal._hyper_1_96_chunk | 200 MB | 64 MB | 8192 bytes | 272 MB
_timescaledb_internal._hyper_1_95_chunk | 388 MB | 108 MB | 8192 bytes | 500 MB
_timescaledb_internal._hyper_1_94_chunk | 388 MB | 108 MB | 8192 bytes | 500 MB
_timescaledb_internal._hyper_1_93_chunk | 388 MB | 108 MB | 8192 bytes | 500 MB
If your chunks are too large, you can update the range for future chunks via the set_chunk_time_interval
command. However, this does not modify the range of existing chunks (e.g., by rewriting large chunks into multiple small chunks).
For configurations where individual chunks are much larger than your available memory, we recommend dumping and reloading your hypertable data to properly sized chunks.
Keeping the latest chunk applies to all active hypertables; if you are actively writing to two hypertables, the latest chunks from both should fit within shared_buffers
.
12. Avoid Too Many or Too Small Chunks
Unless you're running multi-node TimescaleDB, we don't currently recommend using space partitioning. And if you do, remember that this number of chunks is created for every time interval.
So, if you create 64 space partitions and daily chunks, you'll have 24,640 chunks per year. This may lead to a bigger performance hit during query time (due to planning overhead) compared to insert time, but something to consider nonetheless.
Another thing to avoid: using an incorrect integer value when you specify the time interval range in create_hypertable
.
Pro tip:
- If your time column uses a native timestamp type, then any integer value should be in terms of microseconds (so one day = 86400000000). We recommend using interval types ('1 day') to avoid potential for any confusion.
- If your time column is an integer or bigint itself, use the appropriate range: if the integer timestamp is in seconds, use 86400; if the bigint timestamp is in nanoseconds, use 86400000000000.
In both cases, you can usechunk_relation_size_pretty
to make sure your chunk sizes or partition ranges seem reasonable:
xxxxxxxxxx
=> SELECT chunk_table, ranges, total_size
FROM chunk_relation_size_pretty('hypertable_name')
ORDER BY ranges DESC LIMIT 4;
chunk_table | ranges | total_size
-----------------------------------------+---------------------------------------------------------+------------
_timescaledb_internal._hyper_1_96_chunk | {"['2020-02-13 23:00:00+00','2020-02-14 00:00:00+00')"} | 272 MB
_timescaledb_internal._hyper_1_95_chunk | {"['2020-02-13 22:00:00+00','2020-02-13 23:00:00+00')"} | 500 MB
_timescaledb_internal._hyper_1_94_chunk | {"['2020-02-13 21:30:00+00','2020-02-13 22:00:00+00')"} | 500 MB
_timescaledb_internal._hyper_1_93_chunk | {"['2020-02-13 20:00:00+00','2020-02-13 21:00:00+00')"} | 500 MB
13. Watch Row Width
The overhead from inserting a wide row (say, 50, 100, 250 columns) is going to be much higher than inserting a narrower row (more network I/O, more parsing and data processing, larger writes to WAL, etc.). Most of our published benchmarks are using TSBS, which uses 12 columns per row. So you'll correspondingly see lower insert rates if you have very wide rows.
If you are considering very wide rows because you have different types of records, and each type has a disjoint set of columns, you might want to try using multiple hypertables (one per record type) – particularly if you don't often query across these types.
Additionally, JSONB records are another good option if virtually all columns are sparse. That said, if you're using sparse wide rows, use NULLs for missing records whenever possible, not default values, for the most performance gains (NULLs are much cheaper to store and query).
Finally, the cost of wide rows are actually much less once you compress rows using TimescaleDB’s native compression. Rows are converted into more columnar compressed form, sparse columns compress extremely well, and compressed columns aren’t read from disk for queries that don’t fetch individual columns.
Summary
If ingest performance is critical to your use case, consider using TimescaleDB. You can get started with Timescale Cloud for free today, or download TimescaleDB to your own machine or cloud instance for free.
Our approach to support is to address your whole solution, so we're here to help help you achieve your desired performance results (see more details about our Customer Care team and ethos).
Lastly, our Slack community is a great place to connect with 4K+ other developers with similar use cases, as well as myself, Timescale engineers, product team members, and Developer Advocates.
Published at DZone with permission of Mike Freedman. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments