PostgreSQL Performance Metrics
PostgreSQL server configuration parameters for optimization and database performance.
Join the DZone community and get the full member experience.
Join For FreeWe can get the best performance out of their PostgreSQL database by tracking key performance metrics. Keeping these metrics on your dashboard will help. Ignoring these problems could result in a plunge in the productivity of Postgresql. Here I want to explain how to monitor PostgreSQL, I added in details what exactly you should be looking at when monitoring the performance of your database. There are several key metrics you'll definitely want to keep track of when it comes to database performance
Database Connection Parameters
A PostgreSQL database server can have multiple active connections running concurrently in the database. If the number of connections is high, you may need to analyze the state of these user sessions, and terminate idle sessions that are slowing down the server.
Replication Parameters
Replication is a process wherein data is copied from a database on master to a database on slaves. PostgreSQL offers an internal streaming replication service that creates a high-availability environment, balances the load of read-only queries across several nodes, creates a read-only database to run analytical queries on, and many other pivotal functions. Monitoring replicas is a pivotal aspect of PostgreSQL monitoring as replicas can sometimes go out of sync.
Throughput Parameters
It represent the amount of data processed in particular time. It is a composite of I/O speed, CPU speed, parallel capabilities of the machine, and the efficiency of the operating system and system software.
Locks Parameters
PostgreSQL provides various lock modes to control concurrent access to data in tables. This mechanism ensures consistency of data in the database. Looking at pg_locks shows you what locks are granted and what processes are waiting for locks to be acquired.
Resource Parameters
Manages all the resources for the Database:
- Disk and Index Usage
- Memory Parameters
- WAL Parameters
Items |
Parameters |
Table/Query |
Connections |
Maximum number of connections |
SELECT count(*) FROM pg_stat_activity; |
DB Conflict count |
pg_stat_database_conflicts |
|
Number of commits |
xact_commit in pg_stat_database View |
|
Number of sessions |
count of session from pg_stat_user_functions View |
|
Vacuums |
last_vacuum last_autovacuum vacuum_count autovacuum_count autoanalyze_count from pg_stat_all_tables View |
|
Checkpoints and bgwriter statistics |
pg_stat_bgwriter shows metrics to flush data in memory (buffers) to disk. It can do this in 3 different ways
buffers_alloc : Total Number of buffers allocated
|
|
Client related info |
SELECT current_database() datname, schemaname, relname, seq_scan, seq_tup_read, idx_scan, idx_tup_fetch, n_tup_ins, n_tup_upd, n_tup_del, n_tup_hot_upd, n_live_tup, n_dead_tup, n_mod_since_analyze, COALESCE(last_vacuum, '1970-01-01Z'), COALESCE(last_vacuum, '1970-01-01Z') as last_vacuum, COALESCE(last_autovacuum, '1970-01-01Z') as last_autovacuum, COALESCE(last_analyze, '1970-01-01Z') as last_analyze, COALESCE(last_autoanalyze, '1970-01-01Z') as last_autoanalyze, vacuum_count, autovacuum_count, analyze_count, autoanalyze_count FROM pg_stat_user_tables
Python: ("SELECT current_database(), schemaname, relname, seq_scan, seq_tup_read, idx_scan, idx_tup_fetch, n_tup_ins, n_tup_upd, n_tup_del, n_tup_hot_upd, n_live_tup, n_dead_tup, n_mod_since_analyze, COALESCE(last_vacuum, %s), COALESCE(last_vacuum, %s) as last_vacuum, COALESCE(last_autovacuum, %s) as last_autovacuum, COALESCE(last_analyze, %s) as last_analyze, COALESCE(last_autoanalyze, %s) as last_autoanalyze, vacuum_count, autovacuum_count, analyze_count, autoanalyze_count FROM pg_stat_user_tables where seq_scan > 25 order by seq_tup_read desc limit 5;", (dt_format,dt_format,dt_format,dt_format,dt_format,) ); |
|
Replication |
Hosts with replication delay |
SELECT write_location - sent_location AS write_lag, flush_location - write_location AS flush_lag, replay_location - flush_location AS replay_lag FROM pg_stat_replication; |
Replication lag in bytes |
SELECT pg_current_wal_lsn() - confirmed_flush_lsn FROM pg_replication_slots; |
|
Lag in seconds |
SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp())) as lag |
|
Checkpoints |
checkpoints_req and checkpoints_timed. The first shows the number of checkpoints requested while the latter represents the number of checkpoints scheduled |
|
Status of physical replication (pg_stat_replication) |
pg_stat_replication |
|
Inactive replication slots |
SELECT count(*) FROM pg_replication_slots WHERE NOT active; |
|
Replica info |
SELECT usename,application_name,client_hostname,state,sent_location,write_location,replay_location from pg_stat_replication |
|
Throughput |
Sequential scans vs index scans |
seq_scan seq_tup_read idx_scan idx_tup_fetch from pg_stat_all_tables View |
Top Function calls |
SELECT backend_xid FROM pg_stat_activity |
|
Number of running backend |
SELECT count(*) FROM pg_stat_activity; |
|
Locks |
Locks by lock mode |
lock from the pg_locks view |
Deadlocks/database |
Deadlocks from pg_stat_database View |
|
Backend waiting on locks |
SELECT count(*) FROM pg_stat_activity WHERE wait_event = 'Lock'; |
|
Backend idle in transactions |
SELECT count(*) FROM pg_stat_activity WHERE state = 'idle in transaction'; |
|
Session holding or awaiting each lock |
SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa ON pl.pid = psa.pid; |
|
Resource Utilization |
Tables with most disk usage |
heap_blks_read from pg_statio_all_tables View |
Tables with most live rows |
n_live_tup from pg_stat_all_tables View |
|
Most frequent scanned index |
Idx_scan from pg_stat_all_tables? |
|
dead rows |
higher number of dead rows (n_dead_tup in the pg_stat_user_tables view |
|
Temp bytes |
temp_bytes from pg_stat_database View |
|
Active user count or current activity per process (pg_stat_activity) |
pg_stat_activity view will have one row per server process |
|
DB commits |
xact_commit from pg_stat_database View |
|
Live tuples and Dead tuples |
SELECT schemaname, relname, n_live_tup, n_dead_tup, last_autovacuum FROM pg_stat_all_tables ORDER BY n_dead_tup/(n_live_tup * current_setting('autovacuum_vacuum_scale_factor')::float8 + current_setting('autovacuum_vacuum_threshold')::float8) DESC |
|
Local block info |
SELECT t2.rolname, t3.datname, queryid, calls, total_time / 1000 as total_time_seconds, min_time / 1000 as min_time_seconds, max_time / 1000 as max_time_seconds, mean_time / 1000 as mean_time_seconds, stddev_time / 1000 as stddev_time_seconds, rows, shared_blks_hit, shared_blks_read, shared_blks_dirtied, shared_blks_written, local_blks_hit, local_blks_read, local_blks_dirtied, local_blks_written, temp_blks_read, temp_blks_written, blk_read_time / 1000 as blk_read_time_seconds, blk_write_time / 1000 as blk_write_time_seconds FROM pg_stat_statements t1 JOIN pg_roles t2 ON (t1.userid=t2.oid) JOIN pg_database t3 ON (t1.dbid=t3.oid) WHERE t2.rolname != 'rdsadmin' |
|
WAL Buffers |
Shared/ WAL/ CLOG/ Checkpoint buffers |
SELECT current_database() datname, schemaname, relname, heap_blks_read, heap_blks_hit, idx_blks_read, idx_blks_hit, toast_blks_read, toast_blks_hit, tidx_blks_read, tidx_blks_hit FROM pg_statio_user_tables |
Database cache usage ratio: formula SUM (blks_hit) / SUM (blks_read). |
blks_hit from pg_stat_database View |
|
WAL count ready to be archieved |
SELECT count(*) FROM pg_ls_dir('pg_xlog/archive_status') WHERE pg_ls_dir ~ '^[0-9A-F]{24}.ready$'; |
|
Disk Space by Database |
SELECT pg_database.datname, pg_database_size(pg_database.datname) as size_bytes FROM pg_database |
All of this can be seen in the GitHub with the working python script:
https://github.com/forprem/pg-metrics/blob/master/pg_connect.py
Run this script with: python pg_connect.py
You need to change your PostgreSQL setting in connect function database, user, password, and host info.
Opinions expressed by DZone contributors are their own.
Comments