
Postgres Per-Connection Statistics

I’ve had a wish list for a few years now of observability-related things I’d love to see someday in community/open-source Postgres. A few items from my wish list:

  • Wait event counters and cumulative time
  • Wait event arguments (like object, block, etc – specific argument depends on the wait event)
  • Comprehensive tracking of CPU time (ie. capture/track POSIX rusage data and expose the kernel’s perspective on per-connection CPU usage as a metric)
  • Stop putting “COMMIT/END” in pg_stat_activity when it’s the currently executing statement. It is endlessly frustrating because for 99% of applications you will have no way to know which transactions are committing – and what part of your application code is triggering this – when you get a pile-up of sessions running COMMIT/END at the same time. It would be more useful to just leave the previous SQL in pg_stat_activity, and expose the fact that it’s a commit elsewhere (eg. state==commit or state==active_commit instead of state==active). It’s also usually pretty clear from the current wait events when you’re committing. To troubleshoot, you usually also want to know what you’re committing – not just when you’re committing.
  • On-CPU state
    • SQL execution stage (parse/plan/execute/fetch)
    • SQL execution plan identifier in pg_stat_statements and pg_stat_activity (there are some promising patches and discussions on the hackers email list about this right now!)
    • Currently executing node of the execution plan in pg_stat_activity
  • Progress on long operations (eg. a large seqscan) – there have been improvements here in recent years
  • Better runtime visibility into procedural languages
  • Ability to better detect plan flips, probably via some kind of plan ID or plan hash in pg_stat_activity (I’ve done this by looking for big changes in per-exec logical IO, but that’s far less effective than what could be done with some improvements in core)
  • Per-connection statistics (almost all stats today are at the instance or database level) – I wrote a tool to snapshot statistics before and after a query so that you could get a report on exactly what the query did, but a lot of the stats are only useful if you run the tool on an otherwise idle system (see the sketch after this list)

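To make that last item concrete, here’s a minimal sketch of the snapshot-and-diff idea – it is not the actual tool, and pg_stat_database is just one of several views you’d want to snapshot. Capture a before-image of the stats, run the statement of interest, then subtract.

    -- Capture a before-image of the database-wide counters.
    CREATE TEMP TABLE stat_before AS
      SELECT * FROM pg_stat_database WHERE datname = current_database();

    -- ... run the statement you want to profile here ...

    -- Then diff.  On a busy system the delta includes every other session's
    -- work too, which is exactly why per-connection statistics would help.
    SELECT b.datname,
           b.blks_read    - a.blks_read    AS blks_read_delta,
           b.blks_hit     - a.blks_hit     AS blks_hit_delta,
           b.tup_returned - a.tup_returned AS tup_returned_delta
    FROM stat_before a
    JOIN pg_stat_database b USING (datname);
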
As I’ve noted in a few places, there has been slow and steady progress in Postgres over recent years. There’s also plenty of good discussion continuing on mailing lists now.

One long-running topic is better instrumentation for detecting plan changes – the thing we need is some kind of plan hash – and Lukas Fittl kept the discussion going last week with a fresh patch and proposal on the lists. It hasn’t gotten much discussion just yet – this would be a great patch for any newcomers interested in PG observability to test and give feedback on!

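In the meantime, here’s roughly what the logical-IO workaround from my wish list looks like. The snapshot tables below (stmts_before and stmts_after) are hypothetical – you’d create each one from pg_stat_statements some time apart – and a big jump in blocks-per-execution is only a hint that a plan may have flipped, not proof.

    -- Hypothetical snapshot tables; create each one some time apart, e.g.:
    --   CREATE TABLE stmts_before AS SELECT * FROM pg_stat_statements;
    --   CREATE TABLE stmts_after  AS SELECT * FROM pg_stat_statements;
    -- Then look for queries whose logical I/O per execution changed a lot.
    SELECT a.queryid,
           (a.shared_blks_hit + a.shared_blks_read)::numeric
             / NULLIF(a.calls, 0)                        AS blks_per_exec_before,
           (b.shared_blks_hit + b.shared_blks_read
              - a.shared_blks_hit - a.shared_blks_read)::numeric
             / NULLIF(b.calls - a.calls, 0)              AS blks_per_exec_after
    FROM stmts_before a
    JOIN stmts_after  b USING (userid, dbid, queryid)
    WHERE b.calls > a.calls
    ORDER BY blks_per_exec_after DESC NULLS LAST;
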
Another interesting mailing list discussion right now (which wasn’t on my list above) is about capturing page fault information. Postgres today does all IO through the page cache, so Postgres can tell us if a page wasn’t in the database cache but it can’t tell us if the page actually required a read from storage (ie. wasn’t in the OS page cache either). The most common Postgres configuration today is 25% of memory as database cache, and relying on the OS page cache beyond that, but it’s impossible to calculate a real buffer cache hit ratio with this configuration today! However the operating system does track reads from storage at a per-process level as “major page faults”. The extension pg_stat_kcache has been around for a while and partly fills the gap by tracking and exposing major page faults within Postgres. This mailing list thread started by Atsushi Torikoshi is about the idea of bringing some of this functionality into core Postgres. I personally think it’s a great idea, and likely will prove valuable in both EXPLAIN output and also in pg_stat_database.

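To see why, here’s the classic “hit ratio” query against pg_stat_database. A block counted in blks_read missed shared buffers, but it may still have been served from the OS page cache without ever touching storage – so with a 25%-of-memory shared_buffers setting this number says very little about real disk reads, which is exactly the gap that major page fault tracking would fill.

    -- Shared-buffer hit ratio only: blks_read counts reads that missed shared
    -- buffers, but many of those are still served from the OS page cache, so
    -- this is not a "real" cache hit ratio for the machine.
    SELECT datname,
           blks_hit,
           blks_read,
           round(blks_hit::numeric / NULLIF(blks_hit + blks_read, 0), 4)
             AS shared_buffer_hit_ratio
    FROM pg_stat_database
    WHERE datname = current_database();
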
But a commit last month from Bertrand Drouvot is what I’m most excited about. I think it represents a really important step forward, and it’s easy to miss its significance, so I wanted to briefly highlight it.

Bertrand’s commit adds per-connection I/O statistics, and he wrote a short blog post about it at https://bdrouvot.github.io/2025/01/07/postgres-backend-statistics-part-1/

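Based on his post, usage looks roughly like the sketch below. I’m assuming the pg_stat_get_backend_io() function and pg_stat_io-style columns described there; exact names may differ in the released version.

    -- Rough usage sketch, assuming the pg_stat_get_backend_io() function and
    -- pg_stat_io-style columns described in Bertrand's post; exact column
    -- names may differ in the released version.
    SELECT a.pid,
           a.usename,
           io.object,
           io.context,
           io.reads,
           io.hits
    FROM pg_stat_activity AS a,
         LATERAL pg_stat_get_backend_io(a.pid) AS io
    WHERE a.backend_type = 'client backend';
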
But I think the really important thing here is something that isn’t obvious at first: this commit lays the groundwork for per-connection statistics in general. An email on the lists yesterday illustrates the point.

The thing I’m most excited about is how the December commit lays important groundwork toward better observability for Postgres in general. Maybe we could get that “major page fault” information in Postgres at a per-connection level!

It looks like we’re knocking another item off my wish list. A big thank you to Bertrand, Michael, and everyone who provided reviews and feedback on the mailing lists about this patch!

And go have a look at Bertrand’s blog, if you didn’t already read it. Exciting stuff.



