The PostgreSQL Performance Puzzle was, perhaps, too easy – it didn’t take long for someone to guess the correct answer!
But I didn’t see much discussion about why the difference or what was happening. My emphasis on the magnitude of the difference was a tip-off that there’s much more than meets the eye with this puzzle challenge – and one reason I published it is that I’d looked and I thought there were some very interesting things happening beneath the surface.
So let’s dig!
There are several layers. But – as with all SQL performance questions always – we begin simply with EXPLAIN.
[root@ip-172-31-36-129 ~]# sync; echo 1 > /proc/sys/vm/drop_caches
[root@ip-172-31-36-129 ~]# service postgresql-14 restart;
Redirecting to /bin/systemctl restart postgresql-14.service
pg-14.4 rw root@db1=# select 1;
...
The connection to the server was lost. Attempting reset: Succeeded.
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
pg-14.4 rw root@db1=# set track_io_timing=on;
SET
pg-14.4 rw root@db1=# set log_executor_stats=on;
SET
pg-14.4 rw root@db1=# set client_min_messages=log;
SET
pg-14.4 rw root@db1=# \timing on
Timing is on.
pg-14.4 rw root@db1=# explain (analyze,verbose,buffers,settings) select count(mydata) from test where mynumber1 < 500000;
LOG: 00000: EXECUTOR STATISTICS
DETAIL: ! system usage stats:
! 0.107946 s user, 0.028062 s system, 0.202054 s elapsed
! [0.113023 s user, 0.032292 s system total]
! 7728 kB max resident size
! 89984/0 [92376/360] filesystem blocks in/out
! 0/255 [1/1656] page faults/reclaims, 0 [0] swaps
! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
! 197/0 [260/0] voluntary/involuntary context switches
LOCATION: ShowUsage, postgres.c:4898
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=20080.19..20080.20 rows=1 width=8) (actual time=190.273..190.278 rows=1 loops=1)
Output: count(mydata)
Buffers: shared hit=4 read=5533
I/O Timings: read=82.114
-> Index Scan using test_mynumber1 on public.test (cost=0.57..18803.24 rows=510781 width=18) (actual time=0.389..155.537 rows=499999 loops=1)
Output: mydata, mynumber1, mynumber2
Index Cond: (test.mynumber1 < 500000)
Buffers: shared hit=4 read=5533
I/O Timings: read=82.114
Settings: effective_cache_size = '8GB', max_parallel_workers_per_gather = '0'
Query Identifier: 970248584308223886
Planning:
Buffers: shared hit=58 read=29
I/O Timings: read=9.996
Planning Time: 10.529 ms
Execution Time: 190.819 ms
(16 rows)
Time: 213.768 ms
pg-14.4 rw root@db1=#
That’s the EXPLAIN output for the first (ordered) column. It’s worth noting that even though the getrusage() data reported by log_executor_stats is gathered at a different level (by the operating system for the individual database PID), this data actually correlates very closely with the direct output of EXPLAIN ANALYZE on several dimensions:
Execution Time: 190.819 ms
and getrusage() reports 0.202054 s elapsed
(very close)I/O Timings: read=82.114
(subtracted from execution time, suggests 108.705 ms not doing I/O) and getrusage() reports 0.107946 s user
(very close, suggesting we didn’t spend time off CPU for anything other than I/O)Buffers: shared ... read=5533
(database blocks are 8k and filesystem blocks are 512 bytes, so this suggests 88528 filesystem blocks) and getrusage() reports 89984/... filesystem blocks in/...
(again very close)On a side note… this is one reason I appreciate that PostgreSQL uses a per-process model rather than the threaded model of MySQL. There are other advantages with the threaded model – but as far as I’m aware, operating systems don’t track resource usage data per thread.
Repeating the same steps with the second (random) column:
Continue reading
Recent Comments