Useful PostgreSQL Commands/Queries
In this article, we will briefly discuss some queries and their uses, specifically for monitoring and debugging PostgreSQL-related issues.
Introduction
There are many types of queries, and of course all of them are important, but when it comes to monitoring and debugging we are generally in a hurry and do not remember which query to run to check for a specific problem/issue. In this blog, we will briefly discuss some queries and their uses, specifically for monitoring and debugging PostgreSQL-related issues.
In the case of monitoring and debugging, we generally run into two categories of issues: size-related and performance-related. Sometimes these issues are interlinked.
Size-Related Issues
Here are some basic Postgres commands/queries which will be very helpful in monitoring or debugging size-related issues:
- Checking the Size of the Database: There are many times when we need to check which database is the culprit of an issue, i.e., which database is consuming the most space. The following query lists databases in decreasing order of the space they consume:
SELECT pg_database.datname,
       pg_size_pretty(pg_database_size(pg_database.datname)) AS size
FROM pg_database
ORDER BY pg_database_size(pg_database.datname) DESC;
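If you already know which database you are interested in, you can check it directly. A minimal sketch, where my_db is a placeholder for your database name:

SELECT pg_size_pretty(pg_database_size('my_db'));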
- After that, we may need to find which table/index is consuming the most space; for this, we have the following query:
SELECT
nspname || '.' || relname AS "relation", pg_size_pretty(pg_relation_size(C.oid)) AS "size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_relation_size(C.oid) DESC
LIMIT 20;
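Note that pg_relation_size() counts only the main data fork of each relation. If you want the total footprint including indexes and TOAST data, pg_total_relation_size() can be swapped in; here is a sketch restricted to ordinary tables (relkind = 'r') so that indexes are not double-counted:

SELECT
nspname || '.' || relname AS "relation", pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
AND C.relkind = 'r'
ORDER BY pg_total_relation_size(C.oid) DESC
LIMIT 20;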
There are many times when a table holds few or no live records, yet dead tuples keep the table/index large. To check which tables have the most dead tuples, use the following query:
SELECT schemaname, relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
Sample Output:
 schemaname |              relname               | n_live_tup | n_dead_tup |           last_vacuum            |         last_autovacuum
------------+------------------------------------+------------+------------+----------------------------------+----------------------------------
 public     | campaign_session_denormalized_data |    1123219 |  114268349 | 2021-01-10 18:27:34.050087+05:30 | 2021-01-19 14:08:58.062574+05:30
From the above output, we can also tell whether autovacuum is running properly, i.e., when autovacuum last ran on a table whose dead-tuple count is high.
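If autovacuum does not appear to be keeping up, a quick way to inspect how it is currently configured (a sketch; filter further to the settings you care about):

SELECT name, setting
FROM pg_settings
WHERE name LIKE 'autovacuum%';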
Over time, due to MVCC, your table will grow in size (this is called table bloat), which is why regular VACUUM is needed. The query below shows the tables and indexes with the most bloat. The value represents the number of "wasted bytes," or the difference between what is actually used by the table and index and what we compute it should be.
It works by estimating the optimized size of the table/index from the average row size multiplied by the total number of rows, and comparing that against the actual table size. Do note that this is an estimate, not an actual figure.
with foo as (
SELECT
schemaname, tablename, hdr, ma, bs,
SUM((1-null_frac)*avg_width) AS datawidth,
MAX(null_frac) AS maxfracsum,
hdr+(
SELECT 1+COUNT(*)/8
FROM pg_stats s2
WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
) AS nullhdr
FROM pg_stats s, (
SELECT
(SELECT current_setting('block_size')::NUMERIC) AS bs,
CASE WHEN SUBSTRING(v,12,3) IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr,
CASE WHEN v ~ 'mingw32' THEN 8 ELSE 4 END AS ma
FROM (SELECT version() AS v) AS foo
) AS constants
GROUP BY 1,2,3,4,5
), rs as (
SELECT
ma,bs,schemaname,tablename,
(datawidth+(hdr+ma-(CASE WHEN hdr%ma=0 THEN ma ELSE hdr%ma END)))::NUMERIC AS datahdr,
(maxfracsum*(nullhdr+ma-(CASE WHEN nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
FROM foo
), sml as (
SELECT
schemaname, tablename, cc.reltuples, cc.relpages, bs,
CEIL((cc.reltuples*((datahdr+ma-
(CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::FLOAT)) AS otta,
COALESCE(c2.relname,'?') AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,
COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::FLOAT)),0) AS iotta -- very rough approximation, assumes all cols
FROM rs
JOIN pg_class cc ON cc.relname = rs.tablename
JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = rs.schemaname AND nn.nspname <> 'information_schema'
LEFT JOIN pg_index i ON indrelid = cc.oid
LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid
)
SELECT
current_database(), schemaname, tablename, /*reltuples::bigint, relpages::bigint, otta,*/
ROUND((CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages::FLOAT/otta END)::NUMERIC,1) AS tbloat,
CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::BIGINT END AS wastedbytes,
iname, /*ituples::bigint, ipages::bigint, iotta,*/
ROUND((CASE WHEN iotta=0 OR ipages=0 THEN 0.0 ELSE ipages::FLOAT/iotta END)::NUMERIC,1) AS ibloat,
CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes
FROM sml
ORDER BY wastedbytes DESC;
This query is extracted from check_postgres, but rewritten.
Sample Output:
 current_database | schemaname | tablename | tbloat | wastedbytes |                   iname                    | ibloat | wastedibytes
------------------+------------+-----------+--------+-------------+--------------------------------------------+--------+--------------
 dashboard        | public     | job_logs  |    1.1 |  4139507712 | job_logs_pkey                              |    0.2 |            0
 dashboard        | public     | job_logs  |    1.1 |  4139507712 | index_job_logs_on_job_id_and_created_at    |    0.4 |            0
 dashboard        | public     | events    |    1.1 |  3571736576 | events_pkey                                |    0.1 |            0
 dashboard        | public     | events    |    1.1 |  3571736576 | index_events_on_tenant_id                  |    0.1 |            0
 dashboard        | public     | events    |    1.1 |  3571736576 | index_events_on_event_type                 |    0.2 |            0
 dashboard        | public     | jobs      |    1.1 |  2013282304 | index_jobs_on_status                       |    0.0 |            0
 dashboard        | public     | jobs      |    1.1 |  2013282304 | index_jobs_on_tag                          |    0.3 |            0
 dashboard        | public     | jobs      |    1.1 |  2013282304 | index_jobs_on_tenant_id                    |    0.2 |            0
 dashboard        | public     | jobs      |    1.1 |  2013282304 | index_jobs_on_created_at                   |    0.2 |            0
 dashboard        | public     | jobs      |    1.1 |  2013282304 | index_jobs_on_created_at_queued_or_running |    0.0 |     21086208
You should be looking at:
- tbloat: Table bloat, the ratio between what it currently is, and what it can be optimized to.
- wastedbytes: The number of bytes wasted.
- ibloat & wastedibytes: The same as above, but for indexes.
When you see a table with high bloat, consider running VACUUM ANALYZE on it.
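As a sketch, using one of the bloated tables from the sample output above:

VACUUM (VERBOSE, ANALYZE) public.job_logs;

Note that a plain VACUUM only marks the dead space as reusable; it does not return it to the operating system. VACUUM FULL does shrink the file, but it rewrites the whole table under an exclusive lock, so use it with care.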
Performance-Related Issues
For performance-related monitoring, we need to run the following queries, though they may be more than is necessary in a day-to-day scenario.
- Get Running Queries (and Lock Statuses) in PostgreSQL
SELECT S.pid,
       age(clock_timestamp(), query_start),
       usename,
       query,
       L.mode,
       L.locktype,
       L.granted,
       S.datname
FROM pg_stat_activity S
INNER JOIN pg_locks L ON S.pid = L.pid
ORDER BY L.granted, L.pid DESC;
The above query can help when a particular query has been running for hours and holds a lock that other queries are stuck waiting on. It also shows the database each query is running against. We can also spot connections that are open with no query running, which may mean there is a session leak in your database (for more about session leaks, read the following article).
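To look for potentially leaked sessions directly, here is a minimal sketch that lists idle connections and how long each has been idle:

SELECT pid, usename, datname, state,
       age(clock_timestamp(), state_change) AS idle_for
FROM pg_stat_activity
WHERE state = 'idle'
ORDER BY state_change;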
- To kill a particular query that is causing the issue, use this query:
SELECT pg_cancel_backend(pid);
This will send a SIGINT to the backend process with that pid (taken from pg_stat_activity in the query above), cancelling the query it is currently running.
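If cancelling the query is not enough (for example, when the session itself is leaked), pg_terminate_backend() sends a SIGTERM and closes the whole connection. Here is a hedged sketch that terminates sessions idle for more than an hour; the one-hour threshold is an arbitrary choice for illustration:

SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle'
  AND age(clock_timestamp(), state_change) > interval '1 hour';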
I hope this article helps save at least some of the time you spend debugging. Please comment and follow me if you have any questions or want to discuss!