I’m back home from Vancouver. What a great week – in every way. I’ll try to share a few highlights here.
Updated Happiness Hints

First and foremost: after many years, the Happiness Hints have received a major update! Before the conference, I updated the hints based on all the feedback I’ve collected over the past few years. I then reworked the hints into a poster, which we printed for the pgconf.dev poster session. Throughout the week, I continued collecting more feedback. I used a sharpie during the conference and marked up the poster with ideas. Special thanks to Laurenz Albe, David Rader, Sami Imseih, Ryan Booz and Nik Samokhvalov (Nik, you weren’t at the conference but a happiness hint resulted from other discussions we’ve had). Of course I’m forgetting others whose feedback made the happiness hints better. After coming home from the conference, I incorporated all the notes I had – and the version that’s now published here at ardentperf.com is the latest & best version I’ve assembled so far.
Physical Replication and Postgres High Availability
An extraordinary number of postgres users rely on physical replication for high availability. It’s been around for a long time and it works well. Nonetheless, there are a few rough edges and over the years there have been various mailing list threads that haven’t fully been resolved.
I proposed a Friday unconference session on this topic, and the topic received enough votes to be selected. Notes from the unconference are available on the Postgres wiki. But the discussion extended far beyond the unconference; there were also hallway discussions over coffee (thanks Thomas Munro) and then continuing discussions over dinner at Joey Burrard and beers at Steamworks (thanks Ants Aasma).
The first question that everybody asks is “should postgres have more HA capabilities in core?” A discussion along these lines consumed the first half of the unconference.
But I thought the most interesting train of thought was something more incremental – an idea that Postgres is missing a fundamental/overarching concept or first principle which could make a lot of problems easier to solve – a concept around cluster topology. There are a few ways this could look. A function or a view like pg_nodes or something? The ability on a hot standby to query for all of the replicas in the topology? How about a function that could be called on a hot standby when the primary is unreachable and return a list of potential candidates for promotion to be a new primary?
My own idea is to consider the set of nodes in synchronous_standby_names as the “cluster” or “herd” of instances. (Jeff didn’t like the name “herd” but the word “cluster” already means something else in postgres…) Maybe we can let people set the number to “0” if they want a cluster with async replication. Which brings us to another challenge – managing changes to this parameter. First, how do we know the exact moment when every single connection and session is aware of a new set of cluster members? Remember that individual connections are responsible for ensuring transactions are replicated before acknowledging commits to clients. Second, how could we ensure that all of the replicas know about changes when adding or removing cluster members?
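To make the "membership list" idea a little more concrete, here is a rough Python sketch that pulls the member names out of a synchronous_standby_names value. It follows the three documented forms of the setting (a plain list, FIRST n (...), and ANY n (...)). The names SyncConfig and parse_sync_standby_names are made up for illustration, quoting and the * wildcard are handled only superficially, and accepting a count of 0 reflects the hypothetical idea above rather than anything Postgres accepts today.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncConfig:
    """Hypothetical container for a parsed synchronous_standby_names value."""
    method: str      # "FIRST" (priority-based) or "ANY" (quorum-based)
    num_sync: int    # number of standbys that must confirm a commit
    members: tuple   # standby names: the proposed "cluster" membership


# Matches "[FIRST|ANY] num_sync ( name [, ...] )"
_WITH_COUNT = re.compile(r"^\s*(?:(FIRST|ANY)\s+)?(\d+)\s*\((.*)\)\s*$", re.IGNORECASE)


def parse_sync_standby_names(value: str) -> SyncConfig:
    value = value.strip()
    if not value:
        # An empty setting means synchronous replication is not in use.
        return SyncConfig("FIRST", 0, ())
    match = _WITH_COUNT.match(value)
    if match:
        method = (match.group(1) or "FIRST").upper()
        num_sync = int(match.group(2))  # 0 is the hypothetical "async cluster" case
        names = match.group(3)
    else:
        # A plain list "s1, s2" behaves like "FIRST 1 (s1, s2)".
        method, num_sync, names = "FIRST", 1, value
    # Simplification: strip surrounding double quotes, no full identifier parsing.
    members = tuple(n.strip().strip('"') for n in names.split(",") if n.strip())
    return SyncConfig(method, num_sync, members)


if __name__ == "__main__":
    print(parse_sync_standby_names("ANY 2 (pg1, pg2, pg3)"))
    print(parse_sync_standby_names("ANY 0 (pg1, pg2, pg3)"))
```

The parsing is the easy part. The hard questions from the discussion (when every session has seen a new value, and how replicas learn about it) are not addressed by a sketch like this.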
There are also challenges around logical replication slots (like losing them after two failovers in a row, or the inability to replicate them at all if decoding from standbys) – could a new cluster concept help? A new cluster concept also might help around managing backups of WAL across a cluster. Lots of interesting ideas!
Wait Events and Physical Reads and pg_stat_statements
I had a handful of short discussions with Sami Imseih and Lukas Fittl. First off, Sami has some patches for pg_stat_statements that I’m pretty excited about: improving concurrency around the LWLock and looking for ways to optimize the situation with the query text file.
Second, I had a few chats around physical reads. Right now I’m using the pg_stat_kcache extension to get data on physical reads. Postgres itself only reports reads that it requests from the OS, and can’t tell whether they were served from the OS page cache or from physical storage. There’s ongoing work around direct IO, and also Postgres 18 will get a new AIO feature… and I’m curious if pg_stat_kcache will be able to get data about the background IO workers in pg18. There was some concern around the overhead of calling getrusage() too frequently; some benchmarking would be good, to determine if the overhead is too high to get per-query physical reads from AIO workers. (I’m expecting io_uring to be unavailable in many containerized environments; I think that GKE servers and also Docker’s default seccomp profile disable it.) I wonder if some users will want to disable the IO workers purely so they can continue getting physical read stats.
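For anyone who wants a feel for the getrusage() question, here is a small Python sketch (Unix only) of the general pattern: snapshot the block I/O counters before and after a piece of work, and separately time how expensive the call itself is. The helper names are invented for this example, it measures the calling process only, and it says nothing about how pg_stat_kcache is implemented or how background IO workers would be attributed.

```python
import resource
import time


def block_io_counters():
    """Return (ru_inblock, ru_oublock) for the current process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_inblock, usage.ru_oublock


def measure(fn):
    """Run fn() and report how much the block I/O counters moved."""
    in_before, out_before = block_io_counters()
    result = fn()
    in_after, out_after = block_io_counters()
    return result, in_after - in_before, out_after - out_before


def getrusage_cost(calls=100_000):
    """Average wall-clock seconds per getrusage() call, as seen from Python."""
    start = time.perf_counter()
    for _ in range(calls):
        resource.getrusage(resource.RUSAGE_SELF)
    return (time.perf_counter() - start) / calls


if __name__ == "__main__":
    _, blocks_in, blocks_out = measure(lambda: sum(range(1_000_000)))
    print("block input delta:", blocks_in, "block output delta:", blocks_out)
    print("approx seconds per getrusage() call:", getrusage_cost())
```

The per-call number includes Python interpreter overhead, so it is only an upper bound on the cost of the system call. A real benchmark would need to be done in C inside a backend.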
Third were a few small side conversations on better observability around Wait Events and Locks. For wait events, I think that we should be able to add counters to keep track of the number of times every wait event is called and the total duration for each wait event. But what about LWLock Wait Events? Too much overhead? It turns out that LWLocks don’t register wait events if they can quickly acquire a lock. They only register a wait if they actually relinquish the CPU to wait on a semaphore – so I think the overhead of maintaining counters on waits might be acceptable (and essential for debugging). Separately from this, I think that we also might be able to find a way to count the total number of times each LWLock is acquired – but it would need to be very efficient to be enabled all the time. (Postgres has LWLOCK_STATS already as a build flag but it’s not typically enabled.) I suspect we might want counters that are local to each process, and only aggregate them to central stats at some conservative interval.
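The "local counters, aggregated at a conservative interval" idea can be sketched in a few lines of Python. This is only an illustration of the pattern, not a proposal for how Postgres would do it: SharedWaitStats and LocalWaitCounters are hypothetical names, a lock-protected object stands in for shared memory, and the wait event name in the example is just a label.

```python
import threading
import time
from collections import defaultdict


class SharedWaitStats:
    """Stand-in for central stats; every merge takes a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.counts = defaultdict(int)
        self.total_seconds = defaultdict(float)

    def merge(self, counts, seconds):
        with self._lock:
            for event, n in counts.items():
                self.counts[event] += n
            for event, s in seconds.items():
                self.total_seconds[event] += s


class LocalWaitCounters:
    """Per-process counters: cheap to bump, flushed only occasionally."""

    def __init__(self, shared, flush_interval=1.0):
        self._shared = shared
        self._flush_interval = flush_interval
        self._counts = defaultdict(int)
        self._seconds = defaultdict(float)
        self._last_flush = time.monotonic()

    def record(self, event, seconds):
        # Hot path: no shared lock is taken here.
        self._counts[event] += 1
        self._seconds[event] += seconds
        if time.monotonic() - self._last_flush >= self._flush_interval:
            self.flush()

    def flush(self):
        # Cold path: one locked merge per interval instead of one per wait.
        self._shared.merge(self._counts, self._seconds)
        self._counts.clear()
        self._seconds.clear()
        self._last_flush = time.monotonic()


if __name__ == "__main__":
    shared = SharedWaitStats()
    local = LocalWaitCounters(shared, flush_interval=3600)
    local.record("LWLock:WALWrite", 0.002)
    local.record("LWLock:WALWrite", 0.002)
    local.flush()
    print(dict(shared.counts), dict(shared.total_seconds))
```

The trade-off is staleness: central stats lag by up to one flush interval, and anything not yet flushed is lost if the process dies.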
Collation
It would hardly be a postgres conference if Jeff Davis and I didn’t have at least one conversation about Collation where we both insist that we’re now retired from collation work, then spend an hour debating how to best move Postgres forward.
Amazing work was done. But also, there is still more work to do.
The big problem nobody’s talking about is that language changes. ICU needs to be upgraded. Linguistic sort order is like time zones. It’s rare, but it changes – and when the sort order changes, all your indexes become invalid. Postgres does not have any good story yet for ICU upgrades.
Postgres now has a builtin stable code-point-order collation (pg_c_utf8). It’s possible to set this as the database default and do your linguistic sorting at the expression or column level. (Which you should! And it’s a happiness hint!) But let’s be real: users in non-English languages don’t want to go through their entire schema or application adding COLLATE "fr_FR.utf8" everywhere.
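If the difference between code-point order and linguistic order feels abstract, this small Python sketch shows it on a handful of French words. Python's default string sort compares code points, which is the same kind of ordering a code-point collation gives. The crude_linguistic_key function is a made-up, deliberately simplistic stand-in for a real linguistic collation (it just ignores accents and case); it is not ICU and not what fr_FR.utf8 actually does.

```python
import unicodedata

# Normalize to NFC so each accented letter is a single code point.
words = [unicodedata.normalize("NFC", w)
         for w in ["zèbre", "Émile", "eclair", "été", "Zoé"]]

# Code-point order: uppercase ASCII < lowercase ASCII < accented letters.
by_code_point = sorted(words)

# Sorting the UTF-8 encoded bytes gives the same order as sorting code points,
# which is part of why this ordering is stable and cheap to compare.
by_utf8_bytes = sorted(words, key=lambda s: s.encode("utf-8"))


def crude_linguistic_key(s):
    """Toy approximation of a linguistic sort key: drop accents, fold case."""
    decomposed = unicodedata.normalize("NFD", s)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return (base.casefold(), s)


by_language = sorted(words, key=crude_linguistic_key)

if __name__ == "__main__":
    print("code point:", by_code_point)
    print("utf-8 bytes:", by_utf8_bytes)
    print("linguistic-ish:", by_language)
```

In code-point order "Émile" and "été" land after "zèbre", which is exactly the kind of result that surprises users who expect dictionary order.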
The million-dollar question is “what do users really want?”
Can we come up with some limited “client locale” concept that gives users default behavior according to their client locale, while the database itself (and all indexes) operate with pg_c_utf8 collation? Maybe users only really care about ordering of results? The ORDER BY matters to them, but they actually might not really expect or care about the less-than operator? I think the ideal behavior is somehow that indexes are always created with pg_c_utf8 collation, while users can have a good experience that doesn’t require adding COLLATE clauses everywhere. The challenge is how to figure a way that pg_c_utf8 indexes can be used most of the time.
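Here is one way to picture the split, as a toy Python sketch: the "index" is kept in code-point order and answers lookups, and a locale-aware ordering is applied only to the final result, the way a top-level ORDER BY would. Everything here is hypothetical: CodePointIndex and order_for_client are invented names, and the sort key is a crude accent-and-case-insensitive stand-in for a real client locale.

```python
import bisect
import unicodedata


class CodePointIndex:
    """Toy index: values kept in code-point order, searched with bisect."""

    def __init__(self, values):
        self._keys = sorted(unicodedata.normalize("NFC", v) for v in values)

    def prefix_scan(self, prefix):
        # All keys sharing a prefix are contiguous in code-point order,
        # so the scan can start at the prefix position and stop early.
        start = bisect.bisect_left(self._keys, prefix)
        matches = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            matches.append(key)
        return matches


def order_for_client(rows):
    """Apply a (crude) linguistic order only to the final result set."""
    def key(s):
        decomposed = unicodedata.normalize("NFD", s)
        base = "".join(c for c in decomposed if not unicodedata.combining(c))
        return (base.casefold(), s)
    return sorted(rows, key=key)


if __name__ == "__main__":
    index = CodePointIndex(["Zoé", "Zola", "zèbre", "Zénith", "Émile"])
    found = index.prefix_scan("Z")      # served by the code-point index
    print(found)
    print(order_for_client(found))      # re-sorted for the client at the end
```

This only works cleanly for lookups whose answer does not depend on the collation, such as equality or prefix matches. Range predicates with less-than are where the two orderings disagree, which is the open question above.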
FWIW, Oracle takes a very interesting (if pragmatic) approach here – they just list all the operators, and some default to binary/codepoint collation while others default to the client locale. (Of course collation can always be explicitly specified; this is just for defaults.) Indexes are always created binary/codepoint and indexes are generally used by queries. In Postgres, could the bttextcmp() function be tweaked somehow so that it can use pg_c_utf8 indexes by default even when the user requests linguistic collation? Or could we look at query execution plans and only apply linguistic collation to top-level nodes somehow? Crazy ideas, not sure any of it works, we’re still brainstorming.
Lightning Talks and Dinner Groups
Two final brief mentions. This year, Masahiko Sawada and I organized the Lightning Talks. First time I’ve done it. We mostly just followed the same process which had been used last year – there was a very helpful google doc which we followed (and updated). From 29 total submissions, we randomly chose 12. Four people used green cards to indicate a new/inexperienced speaker and we made sure that 2 of those were included. Every speaker gets 5 minutes max!
I learned that originally, Lightning Talks at pgcon were first-come, first-served. As the conference grew, the Lightning Talks switched to random selection. Submissions were done at the conference by putting a note card into a box with your name and topic. I overheard a little discussion on Friday around whether lightning talks should move to a model of online submissions in the future, maybe ahead of time, more like a real CFP with a selection process instead of purely random selection.
I’m new here, but I do think there’s something that feels a little more authentic when it’s a physical submission at the conference and a random selection. Fits with the theme of this conference – ample time for hallway discussions and impromptu topics. Online submission feels a bit different; there are pros and cons both ways.
One final thing this year was that Paul Ramsey organized dinner groups on Tuesday and Thursday. What a fantastic idea! I was part of dinner groups on both days and really enjoyed meeting new people and having some great conversations. I forgot to get a picture on Thursday, but here’s the group from Tuesday.
I wasn’t originally planning to attend this conference – and I’m very glad that I decided to go. I hope I’m able to attend another pgconf.dev in the future!



