Copy-and-Paste A New Postgres Dev Env In 5 Min

You can cut-and-paste the following commands to quickly get a new & clean dev environment for working with PostgreSQL source code. This includes Michael Paquier’s powerful script kit for managing the PostgreSQL development environment.

Setting up from scratch takes me about 5 minutes, plus 3 minutes to configure, compile and install PostgreSQL for testing. Running the full PostgreSQL test suite (including TAP tests) took me 13 minutes.

This is a quick and very easy way for anyone to try out a patch from the pgsql-hackers mailing list, whether you want to check if it fixes a bug you’re encountering or try out a new feature that’s currently under development.

One advantage of using a reproducible fresh Ubuntu instance is that there’s a lot less risk of getting side-tracked troubleshooting random build issues from quirks that accumulate over time on your laptop or other long-term local development environment. Operating system configuration, package & dependency versions… I’ve seen lots of unexpected weird things mess with local builds of PostgreSQL.

These steps use a free-tier eligible EC2 t2.micro instance with Ubuntu LTS. For information about Microsoft Azure and Google Compute Engine, see the Notes section at the end of this article. I expect these steps to work largely unmodified there as long as it’s the same Ubuntu LTS. This assumes you have already installed the AWS CLI and created your EC2 key pair, which is pretty quick and easy. See https://aws.amazon.com/cli/ and https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html

Short Version / Quick Start

Last updated: 9-Jan-2024

1. Start an instance with Ubuntu 22.04 LTS

KEY=jeremy-mb
(use your EC2 key pair name)

aws ec2 run-instances --region us-east-1 --key-name $KEY --instance-type t2.micro --tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=pgdev}]" --image-id ami-0c7217cdde317cfec

aws ec2 describe-instances --region us-east-1 --query 'Reservations[].Instances[?Tags[?Key==`Name`].Value|[0]==`pgdev`].[PublicIpAddress,State.Name,InstanceType,InstanceId,Tags[?Key==`Name`].Value|[0]]' --output text

    18.234.108.74 running t2.micro i-02fdac0833d5465da pgdev

(Repeat the command above until you see status “running”. Make sure to cut-and-paste carefully; the quote characters around the strings “Name” and “pgdev” need to be preserved.)
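The repeat-until-running step can also be scripted. This is a minimal sketch, assuming the same region and Name tag as the commands above; describe_pgdev and wait_for_pgdev are hypothetical helper names, not part of the AWS CLI.

```shell
# Hypothetical helper: wraps the describe-instances command shown above.
describe_pgdev() {
  aws ec2 describe-instances --region us-east-1 \
    --query 'Reservations[].Instances[?Tags[?Key==`Name`].Value|[0]==`pgdev`].[PublicIpAddress,State.Name,InstanceType,InstanceId,Tags[?Key==`Name`].Value|[0]]' \
    --output text
}

# Poll until a pgdev instance reports "running", then print its public IP.
# Arguments (both optional): number of attempts, seconds between attempts.
wait_for_pgdev() {
  local tries="${1:-30}"
  local delay="${2:-5}"
  local ip
  while [ "$tries" -gt 0 ]; do
    # Field 1 is the public IP and field 2 is the state, as in the sample output.
    ip=$(describe_pgdev | awk '$2 == "running" { print $1; exit }')
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    tries=$((tries - 1))
    sleep "$delay"
  done
  echo "no pgdev instance reached the running state" >&2
  return 1
}
```

With these defined, something like ssh ubuntu@"$(wait_for_pgdev)" should connect as soon as the instance is up.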

ssh ubuntu@18.234.108.74
(use the IP address above)
8:19pm
2. Set up the Operating System

sudo apt update
sudo apt-get install -y build-essential libreadline-dev zlib1g-dev flex bison libxml2-dev libxslt-dev libssl-dev libxml2-utils xsltproc ccache pkg-config
sudo apt-get install -y meson libipc-run-perl python2.7 python-is-python3 libperl-critic-perl cpanminus
sudo apt-get install -y emacs gdb lldb sysstat linux-tools-common linux-tools-generic linux-tools-aws

sudo cpanm https://cpan.metacpan.org/authors/id/S/SH/SHANCOCK/Perl-Tidy-20230309.tar.gz

sudo sed -i 's/kernel.yama.ptrace_scope.*/kernel.yama.ptrace_scope = 0/g' /etc/sysctl.d/10-ptrace.conf
sudo sh -c "echo 0 > /proc/sys/kernel/yama/ptrace_scope"
8:20pm
3. Install Michael Paquier’s Toolkit for PG Env Mgmt

git clone https://github.com/ardentperf/home.git --no-checkout
mv -v home/.git ./.git
rmdir -v home
git checkout -f 05860649a722


exit
ssh ubuntu@18.234.108.74
(leave and restart your ssh session; same IP address as above)
byobu
(defaults to tmux; use byobu-screen if you’re old school)

(Note this is technically pulling a copy of Michael’s toolkit from my own github account. See the Notes at the end for more info.)
8:21pm
4. Get PostgreSQL Source Code

git clone https://git.postgresql.org/git/postgresql.git $HOME_GIT/postgres

(By default this will check out the HEAD commit of the main PostgreSQL development branch, where development happens for the next major version of PostgreSQL. At this point, you can of course check out a stable branch for an already-released version and you can apply any patches or code changes of interest.)
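To work on an already-released version instead of the development branch, something like the following should do. It is only a sketch: checkout_pg_branch is a hypothetical helper, and it assumes the repository was cloned to $HOME_GIT/postgres as above. Stable branches in the PostgreSQL repository are named like REL_16_STABLE.

```shell
# Hypothetical helper: switch the source tree to a stable branch.
# Argument: major version number, e.g. 16. Branches for version 10 and
# later are named REL_<major>_STABLE.
checkout_pg_branch() {
  local major="$1"
  git -C "${HOME_GIT:-$HOME/git}/postgres" checkout "REL_${major}_STABLE"
}
```

For example, checkout_pg_branch 16 followed by pg_compile would build the PostgreSQL 16 stable branch.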
8:22pm
5. Build PostgreSQL

pg_compile

(After any changes to the source code, even checking out a whole new branch, just re-run this script to update the build with your changes. Then pg_stop and pg_start your running instance to test the changes.)
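The rebuild-then-restart cycle can be wrapped into one command. This is a hedged sketch: pg_rebuild is a hypothetical name, and it assumes pg_compile exits non-zero when the build fails, which is not confirmed here.

```shell
# Hypothetical wrapper around the toolkit commands named above.
# If the build fails, the running instance is left alone.
pg_rebuild() {
  pg_compile || { echo "pg_compile failed; instance left untouched" >&2; return 1; }
  pg_stop
  pg_start
}
```

Keep in mind that pg_start re-initializes the database, so any test data from the previous run is gone after pg_rebuild.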
8:24pm
6. Test PostgreSQL

pg_start

psql
8:27pm
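aws ec2 terminate-instances --region us-east-1 --instance-ids i-02fdac0833d5465da
(run this from your own machine when you are done; use the instance ID from step 1)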
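If the instance ID has scrolled away, the instance can be found again by its Name tag. A minimal sketch, assuming the same region and tag as step 1; pgdev_instance_ids and terminate_pgdev are hypothetical helper names.

```shell
# Hypothetical helper: list the IDs of running instances tagged Name=pgdev.
pgdev_instance_ids() {
  aws ec2 describe-instances --region us-east-1 \
    --filters "Name=tag:Name,Values=pgdev" "Name=instance-state-name,Values=running" \
    --query 'Reservations[].Instances[].InstanceId' --output text
}

# Terminate whatever pgdev_instance_ids finds; do nothing if the list is empty.
terminate_pgdev() {
  local ids
  ids=$(pgdev_instance_ids)
  if [ -z "$ids" ]; then
    echo "no running pgdev instances found" >&2
    return 0
  fi
  # $ids is left unquoted on purpose so each ID becomes its own argument.
  aws ec2 terminate-instances --region us-east-1 --instance-ids $ids
}
```

Note that this terminates every running instance carrying the pgdev tag, so it is only appropriate if the tag is used for nothing else.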

Environment Details

As mentioned above, this uses Michael Paquier’s script toolkit for managing the environment. See https://paquier.xyz/projects/home/ for more information about his code. Michael’s toolkit has some nice shortcuts and it supports working with multiple copies of the PostgreSQL source code side-by-side and multiple running instances, but for this quick start we’re only going to use a single copy of the source code in the default location and a single running instance.

Here are the most important directories and commands you need to get started:

Directories

HOME: /home/ubuntu
    Welcome to your friendly neighborhood Ubuntu Linux

HOME_GIT: $HOME/git
    Base path for all PostgreSQL git repositories

HOME_POSTGRES_SRC: $HOME_GIT/postgres
    Default directory for PostgreSQL source code

HOME_POSTGRES_INSTALL: $HOME/pgsql
    Default installation directory for running/testing PostgreSQL builds

HOME_POSTGRES_DATA: $HOME/data
    Base path for all datafiles when running/testing PostgreSQL builds

(directory): $HOME/data/5432
    Default directory for datafiles of the first running instance (which listens on port 5432) when testing PostgreSQL builds

    The first instance listens on port 5432. If pg_start is run repeatedly, then it continues to start additional instances and increments the port each time. The port number is used as the sub-directory name for the datafiles.

(directory): $HOME/data/5432/log
    Default directory for log file (error/warning/info/debug/etc) when running/testing a PostgreSQL build.

    If pg_start is run repeatedly then the log file for each running instance is within the datafile sub-directory for that instance, named after the port number on which it listens. As of PostgreSQL v16, the log_directory configuration setting (aka “GUC”) defaults to the value “log” and the log_filename configuration setting defaults to the value “postgresql-%Y-%m-%d_%H%M%S.log”. Reference https://www.postgresql.org/docs/devel/runtime-config-logging.html

    Example log filename: /home/ubuntu/data/5432/log/postgresql-2024-01-08_012450.log

HOME_POSTGRES_ETC: $HOME/etc/postgres.d
    Directory with PostgreSQL configuration files for running/testing PostgreSQL builds. Default configuration file is “postgresql.dev.conf”

    Reference https://www.postgresql.org/docs/devel/runtime-config.html

(file): $HOME/.homeconfig
    Custom CFLAGS, LDFLAGS, options for configure

    Reference https://www.postgresql.org/docs/devel/installation.html
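The port-to-directory convention above makes it easy to locate the files for any instance. A small sketch under that assumption; pg_data_dir and pg_latest_log are hypothetical helpers, not part of Michael’s toolkit.

```shell
# Hypothetical helpers built on the directory layout described above.

# Data directory for the instance listening on the given port (default 5432).
pg_data_dir() {
  echo "${HOME_POSTGRES_DATA:-$HOME/data}/${1:-5432}"
}

# Newest log file for that instance. Log names embed a sortable timestamp
# (postgresql-%Y-%m-%d_%H%M%S.log), so the last name in sorted order is the newest.
pg_latest_log() {
  ls "$(pg_data_dir "${1:-}")"/log/postgresql-*.log 2>/dev/null | sort | tail -n 1
}
```

For example, tail -f "$(pg_latest_log)" follows the log of the first instance, and pg_latest_log 5433 finds the log of a second one.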

Commands

git
    Source code control. Installed and ready to rock. Start here.

    One handy command, for example… git format-patch -1 HEAD

patch
    Use this to install a patch from the mailing lists for testing. Generally you’ll change into the HOME_POSTGRES_SRC directory then type “patch -p1 <FILENAME” to apply the patch on top of the code you have checked out.

pg_compile
    Configure, compile and install PostgreSQL including core contrib extensions and documentation. Defaults to HOME_POSTGRES_SRC and HOME_POSTGRES_INSTALL but there are command line arguments to work with other targets. Try “-h” for help.

    After any changes to the source code in HOME_POSTGRES_SRC, re-run this script to update the build at HOME_POSTGRES_INSTALL with your changes. Then re-start your running instance to test the changes.

pg_check
    Run all tests for HOME_POSTGRES_SRC. Under the hood, this does “make check-world” with parallelism of 4.

pgall
    Run pg_compile and pg_check for HOME_POSTGRES_SRC. Compilation errors go to the file HOME_POSTGRES_SRC/compile.txt

pgdocs
    Compile documentation only. The pg_compile/pgall commands also compile documentation.

pg_start
    Clean and re-initialize a new database, then start running an instance with the build of PostgreSQL for testing

    The default port for listening is 5432. The listening port number is used as the datafile directory name (see above). If an instance is already running and listening on port 5432, then pg_start will not stop that instance, but will continue to start additional instances by incrementing to the next available port. After selecting a port, if a datafile directory already exists from previous testing (but is not running now), then all files are completely removed. When starting an instance for testing, a new clean database is always created with the initdb program.

    For this quick-start, I suggest running only a single instance at a time. Stop your instance before re-running pg_start. Advanced usage of Michael’s scripts can facilitate multiple instances for testing things like logical replication or physical replication between identical builds or different builds, or side-by-side testing and backpatching of a bugfix into multiple already-released stable branches.

pg_stop
    Stop all running builds & instances of PostgreSQL. If pg_start was called more than once (starting multiple instances listening on multiple ports), then a single call of pg_stop will terminate all of the running builds immediately.

pspg
    Show all running PostgreSQL processes (it’s just a fancy ps|grep under the hood)

psql
    The PostgreSQL command-line client that’s part of your test PostgreSQL build. It’s installed to HOME_POSTGRES_INSTALL/bin which is already added to your path.

pgbench
    The PostgreSQL performance testing client that’s part of your test PostgreSQL build. It’s installed to HOME_POSTGRES_INSTALL/bin which is already added to your path.

gdb
    Debugger for stepping through code, setting breakpoints, and all your other needs. Installed and ready to rock.

perf
    System performance analysis tool. Installed and ready to rock. Run as root with sudo.
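Patches from the mailing list don’t always apply cleanly to the current code, so it can help to check first. The sketch below uses the --dry-run option of GNU patch; try_patch is a hypothetical helper name and the default source directory is taken from the table above.

```shell
# Hypothetical helper: apply a patch file only if it applies cleanly.
# Arguments: patch file, then (optionally) the source directory.
try_patch() {
  local patchfile="$1"
  local srcdir="${2:-${HOME_POSTGRES_SRC:-$HOME/git/postgres}}"
  # --dry-run checks the patch without modifying any files;
  # -d makes patch change into the source directory first.
  if patch -d "$srcdir" -p1 --dry-run < "$patchfile" > /dev/null; then
    patch -d "$srcdir" -p1 < "$patchfile"
  else
    echo "patch does not apply cleanly: $patchfile" >&2
    return 1
  fi
}
```

A typical sequence would be try_patch ~/some-fix.patch followed by pg_compile, where the patch filename is just a placeholder.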

Terminal Multiplexers and Source-Code Editors

Byobu and screen and tmux are all available. If you’re working from a Mac, then iTerm2 has nice integration with tmux.

Both vim and emacs are installed and ready to use on the Ubuntu instance. VSCode on your desktop with the SSH plugin connected to this Ubuntu instance also works nicely, if you want a graphical source code editor.

Helpful Links for Beginning PostgreSQL Development

  • The Missing Manual for Hacking Postgres – Brandur’s guide might be the best starting point anywhere
  • So you want to be a developer? – Probably the canonical starting point. Official PostgreSQL wiki page with links off to the developer FAQ and a few other important things.
  • Hacking On Postgres (video) (slides) – Recent presentation (Sep ’22) by James Coleman. I noticed that his fourth slide also has links to several more good talks on this topic.
  • PostgreSQL Hacker Tips – Recent presentation (Dec ’23) by Michael Paquier with some coverage of the toolkit used here. I didn’t see a video yet, but a recording might eventually appear online.

Long Version / Notes and Customization

1. It’s possible that there might be some one-time network/vpc/security setup that I’ve done in my AWS account long ago and forgotten about. If you run into any difficulties with ssh-ing into the Ubuntu system, please let me know what your fix was so that I can make the steps here better.

Information about locating AMIs for different Ubuntu versions on EC2 is at https://ubuntu.com/server/docs/cloud-images/amazon-ec2 and https://cloud-images.ubuntu.com/locator/ec2/

The first Ubuntu link above also has instructions for Microsoft Azure and Google Compute Engine. I haven’t tested, but I expect the Ubuntu images will be the same so the steps here should work identically. The linux-tools-aws package was related to the perf utility, and probably doesn’t apply to Azure or GCE. (But do they have their own version of this?) If someone wants to test/confirm and send me an email with the steps, then I’ll gladly credit you and post them here!

I prefer LTS versions, so that my instructions last a bit longer without breaking 🙂 and 22.04 was the latest LTS as of writing (the next one will come out pretty soon).

I used this SSM command to find the AMI in this blog; feel free to customize (for example if you wanted a different region):

$ aws ssm get-parameters --names /aws/service/canonical/ubuntu/server/22.04/stable/current/amd64/hvm/ebs-gp2/ami-id

{
    "Parameters": [
        {
            "Name": "/aws/service/canonical/ubuntu/server/22.04/stable/current/amd64/hvm/ebs-gp2/ami-id",
            "Type": "String",
            "Value": "ami-0c7217cdde317cfec",
            "Version": 44,
            "LastModifiedDate": "2023-12-06T19:47:04.175000-08:00",
            "ARN": "arn:aws:ssm:us-east-1::parameter/aws/service/canonical/ubuntu/server/22.04/stable/current/amd64/hvm/ebs-gp2/ami-id",
            "DataType": "aws:ec2:image"
        }
    ],
    "InvalidParameters": []
}
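For a different region, the same lookup can print only the AMI ID. This is a sketch: ubuntu_ami_param and lookup_ubuntu_ami are hypothetical helper names, the parameter path is copied from the command above, and release or architecture values other than 22.04 and amd64 are assumptions that may not exist as parameters.

```shell
# Hypothetical helper: build the SSM parameter name for an Ubuntu AMI.
# Arguments (optional): Ubuntu release, CPU architecture.
ubuntu_ami_param() {
  local release="${1:-22.04}"
  local arch="${2:-amd64}"
  echo "/aws/service/canonical/ubuntu/server/${release}/stable/current/${arch}/hvm/ebs-gp2/ami-id"
}

# Hypothetical helper: print just the AMI ID for a region.
# Arguments: region, then the same optional arguments as ubuntu_ami_param.
lookup_ubuntu_ami() {
  local region="$1"
  aws ssm get-parameters --region "$region" \
    --names "$(ubuntu_ami_param "${2:-}" "${3:-}")" \
    --query 'Parameters[0].Value' --output text
}
```

For example, lookup_ubuntu_ami us-west-2 should print an AMI ID that can be passed to --image-id, together with a matching --region, in step 1.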


As of writing, the page https://aws.amazon.com/ec2/instance-types/t3/ says that t2.micro instances are eligible for the free tier, and t3.micro instances are eligible in regions where t2.micro is unavailable.

2. This is largely taken from https://wiki.postgresql.org/wiki/Compile_and_Install_from_source_code with some additions.

See also https://www.postgresql.org/docs/current/install-requirements.html

I would really like to be configuring and building with all available options for PostgreSQL… ICU, LLVM/JIT, all core contrib extensions. Generally I like to test with everything there. With this first pass at a quick-start dev environment I might be missing some package dependencies and configure options. Please let me know if you find any packages that are missing (along with the build options to use them), and again I’ll happily credit you and add them here. Thanks!

The sysctl ptrace_scope change is required for gdb to be allowed to attach to running processes. I would like to add BCC (eBPF) tools as well to these setup instructions, but I haven’t finished figuring out whether there’s a conflict with Python 2.7 which Michael has aliased. It’s on my TODO list. 🙂
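To see what the current ptrace_scope setting means for gdb, a small lookup like this can help. It is a sketch: describe_ptrace_scope is a hypothetical helper, and the descriptions paraphrase the Linux kernel’s Yama documentation.

```shell
# Hypothetical helper: explain a kernel.yama.ptrace_scope value.
describe_ptrace_scope() {
  case "$1" in
    0) echo "0: classic ptrace, gdb can attach to your own running processes" ;;
    1) echo "1: restricted, gdb can only attach to its own descendants" ;;
    2) echo "2: admin-only, attaching requires CAP_SYS_PTRACE" ;;
    3) echo "3: no attaching at all, and the setting cannot be changed until reboot" ;;
    *) echo "unexpected value: $1" >&2; return 1 ;;
  esac
}
```

On the instance, describe_ptrace_scope "$(cat /proc/sys/kernel/yama/ptrace_scope)" should report value 0 after the two commands in step 2.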

For heavier performance testing, you probably want to move to a non-bursting instance family but those won’t fall into the free tier so cost will be more of a consideration. It’s easy enough with the instructions here to switch the instance family and to stop or perhaps even terminate the instance when it’s not actively in use.
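Stopping the instance and switching its instance family can both be done from the AWS CLI. A hedged sketch, assuming the same region as step 1; the pgdev_* names are hypothetical helpers and the instance ID is the one printed by describe-instances.

```shell
# Hypothetical helpers; each takes the instance ID as its first argument.

# Stop the instance while it is not in use.
pgdev_stop() {
  aws ec2 stop-instances --region us-east-1 --instance-ids "$1"
}

# Change the instance type (second argument). The instance must be stopped first.
pgdev_resize() {
  aws ec2 modify-instance-attribute --region us-east-1 \
    --instance-id "$1" --instance-type "{\"Value\": \"$2\"}"
}

# Start the instance again.
pgdev_start() {
  aws ec2 start-instances --region us-east-1 --instance-ids "$1"
}
```

For example: pgdev_stop i-02fdac0833d5465da, then pgdev_resize i-02fdac0833d5465da m5.large once it has stopped, then pgdev_start i-02fdac0833d5465da. Here m5.large is only an example of a non-bursting type. The public IP address usually changes after a stop and start, so re-run the describe-instances command before connecting with ssh.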

3. This step installs Michael’s home directory. Obviously it’s optional to run byobu or tmux or screen afterwards, but I find it to be a good habit. Byobu is great!

https://paquier.xyz/projects/home/ and https://github.com/michaelpq/home

As noted above, the cut-and-paste steps here technically pull a copy of Michael’s scripts from my own github account. I made a few very minor tweaks to simplify the cut-and-paste steps here in this blog post: handling unset EDITOR variable, auto-creating HOME_GIT, default screenrc (which I plan to revert now that I remember byobu) and running TAP tests by default (I personally think it’s important to encourage people to default to running all tests). Michael might take some of these upstream to his own repo. And if he publishes any other updates to his repo, I plan to keep my repo in sync.

But separately: you might also notice that I hard-coded a specific commit in the cut-and-paste steps here instead of taking the HEAD commit. This gives me a little more control around introducing updates and enhancements for Michael’s scripts (by either him or myself) into the cut-and-paste steps here… this should avoid anything breaking unexpectedly/accidentally due to commits that someone added to their repo. Not a right/wrong thing, it’s just my style when doing cut-and-paste instructions like this. 🙂
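To confirm that a home directory really is at the pinned commit, a check along these lines should work. It is a sketch: check_toolkit_commit is a hypothetical helper, and the default value is the commit from step 3.

```shell
# Hypothetical helper: confirm the home directory is at the pinned toolkit commit.
# Argument (optional): abbreviated commit hash to expect.
check_toolkit_commit() {
  local want="${1:-05860649a722}"
  local have
  have=$(git -C "$HOME" rev-parse HEAD) || return 1
  case "$have" in
    "$want"*) echo "toolkit is at pinned commit $want" ;;
    *) echo "toolkit is at $have, expected $want" >&2; return 1 ;;
  esac
}
```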

Why not create a custom AMI with dependencies and environment mgmt scripts pre-installed? Go ahead! That approach works great too! I chose this approach because the 5 minutes to copy/paste the instructions really isn’t much and doesn’t bother me, and the long-term maintenance seems a touch easier. I’m doing this in between elementary school PTA meetings, helping kids with homework, adjudicating disagreements over whose turn on the Nintendo comes first, grocery shopping, making dinner, planning a trip to see cousins and grandparents, paying bills, planning household finances, and all the rest. I can add a new OS package to this blog post by adding one word in the fancy wordpress WYSIWYG editor and that’s just going to happen faster than if I had to build new AMIs every time there was a new package to add or an update to Michael’s scripts. I don’t expect too much difficulty even with updating to the new Ubuntu LTS later this year. But again – not a right/wrong thing, just my style, and the beauty of this is that you’re welcome to do your own thing however you want!

4-6. Nothing else comes to mind beyond everything already written in the previous sections.

Change Log

9-Jan-2024: Fixed quoting in describe-instances command (I had inadvertently lost some quotes with copy/paste to the originally published blog), changed commit used for Paquier’s env mgmt scripts (removed screenrc and cleaned up commit history to align w fixed upstream PRs – reference my github repo), add deps python2.7 and perltidy (PG-specific version) and perlcritic, added Byobu to the quick-start (I’d somehow forgotten this when I first published) and enable mouse for byobu-screen, some improvements to documentation of directories and commands (especially pg_start), pre-install lldb, removed screenrc and backup bashrc, added a better screenshot demonstrating gdb and byobu.

3-Feb-2024: Add package sysstat

About Jeremy

Building and running reliable data platforms that scale and perform. about.me/jeremy_schneider

Discussion

One thought on “Copy-and-Paste A New Postgres Dev Env In 5 Min”

  1. Ooh, that is exciting! Now I need to fight strong urge to join this into a shell script to deploy the whole thing with one command and use it as a party trick whenever some one asks :)
    Not sure if it will work out anyway, probably will be breaking every step along the way


    Posted by Viacheslav Andzhich | February 4, 2024, 6:12 am


Disclaimer

This is my personal website. The views expressed here are mine alone and may not reflect the views of my employer.
