Advisory Locks in Postgresql

PostgreSQL provides various lock modes to control concurrent access to data in tables. Advisory locks provide a convenient way to obtain a lock from PostgreSQL that is completely application enforced, and will not block writes to the table.

Imagine you have a scheduled task that runs in the background, mutates the database and sends information off to another 3rd party service. We can use PostgreSQL Advisory Locks to guarantee that the program cannot cause any unexpected behavior if ran multiple times concurrently. Concurrency is fun, but surprise concurrency can be brutal! In our case it’s beneficial to have a guarantee that the program in question can never run concurrently.

Application Enforced Locks

Throughout this post we will explore Postgres advisory locks, which are application enforced database locks. Advisory locks can be acquired at the session level and at the transaction level and release as expected when a session ends or a transaction completes. So what does it mean to have an application enforced database lock? Essentially, when your process starts up, you can acquire a lock through Postgres and then release it when the program exits. In this way, we have a guarantee that the program cannot be running concurrently, as it will not be able to acquire the lock at startup, in which case you can just exit.

The benefit of this is that the tables are never actually locked for writing, so the main application can behave as normal and users will never notice anything is happening in the background.

Here is the syntax for obtaining the lock:

SELECT pg_try_advisory_lock(1);

Now, there’s a few things happening in that statement so lets examine it.

  • pg_try_advisory_lock(key); is the Postgres function that will try to obtain the lock. If the lock is successfully obtained, Postgres will return 't', otherwise 'f' if you failed you obtain the lock.

  • pg_try_advisory_lock(key) takes an argument which will become the identifier for the lock. So, for example, if you were to pass in 1, and another session tried to call SELECT pg_try_advisory_lock(1) it would return 'f', however the other session could obtain SELECT pg_try_advisory_lock(2).

  • pg_try_advisory_lock(key) will not wait until it can obtain the lock, it will return immediately with 't' or 'f'.

Implementation

So how would we go about using this in Ruby?

def obtained_lock? connection.select_value(‘select pg_try_advisory_lock(1);’) == ‘t’ end

We can grab our ActiveRecord connection and call #select_value in order get back a 't' or 'f' value. A simple equality check let’s us know whether or not we have obtained the lock, and if we haven’t we can choose to bail and exit the program.

class LockObtainer
  def lock_it_up
    exclusive do
      # do important stuff here
    end
  end

  private

  def exclusive
    if obtained_lock?
      begin
        yield
      ensure
        release_lock
      end
    end
  end

  def obtained_lock?
    connection.select_value('select pg_try_advisory_lock(1);') == 't'
  end

  def release_lock
    connection.execute 'select pg_advisory_unlock(1);'
  end

  def connection
    ActiveRecord::Base.connection
  end
end

Additional Information

There are a few interesting things about Advisory Locks:

  • They are reference counted, so you can obtain a lock N times, but must release it N times for another process to acquire it.

  • Advisory Locks can be acquired at the session level or the transaction level, meaning if acquired within a transaction, an Advisory Lock will be automatically released for you. Outside of a transaction, you must manually release the lock, or end your session with PostgreSQL.

  • Calling SELECT pg_advisory_unlock_all() will unlock all advisory locks currently held by you in your session.

  • Calling SELECT pg_advisory_unlock(n) will release one lock reference for the id n.

Effective Rails Deployment

Capistrano has been an excellent tool for years to deploy simple or multi-stage complex applications to remote servers. However the rise of PASS like Heroku, Dokku, I got in mind to use some automatic orchestration framework.

I looked over Chef, Puppet, SaltStack and Ansible on a weekend and decided to go with ansible since it looked easiest to get hands dirty and have great documentation. After some initial hicups I was able to deploy some of our inhouse applications on EC2 and introduced it to my coworkers at codecrux.

Since then, we use ansible to configure and manage remote server, and also configure our local development environment running on top of Virtualbox with vagrant.

Ansible uses YAML for configuration and for actual commands it has less flexibility compared to capistrano which uses Ruby DSL, but at the same time helps you to keep your playbooks simple. ansible also extremely good with orchestration and rolling deployments. We were able to keep a homogenous command set and duplicate most of Capistrano’s features in very small amount of code.

Let’s take an example to deploy lobster.rs. Here is how we did it -

Configuration

First of all you need to define required variables in a yaml file for ansible to pick them up.

---
app_name: lobsters
rails_env: production

git_url: git@github.com:jcs/lobsters.git
git_version: master

app_path: '/'
shared_path: '/shared'
releases_path: '/releases'
current_release_path: '/current'
app_public_path: "/public"
app_config_path: "/config"
app_temp_path: "/tmp"
app_logs_path: "/log"

keep_releases: 5

You should also define your remote host in ansible inventory/host file like

[production]
lobsters.dev   ## replace with your domain name

Playbook

Now when we have configuration we can create the actual playbook for doing capistrano-style deployments. Create deploy.yml file:

---
- hosts: all
  tasks:
    - set_fact: this_release_ts=
    - set_fact: this_release_path=/

    - debug: msg='New release path '

    - name: Create new release dir
      file: path= state=directory

    - name: Update code
      git: repo= dest= version= accept_hostkey=yes
      register: git

    - debug: msg='Updated repo from  to '

    - name: Symlink shared files
      file: src=/ dest=/ state=link force=yes
      with_items:
        - config/database.yml
        - config/secrets.yml
        - config/unicorn.rb
        - log
        - tmp
        - vendor/bundle

    - name: Install bundle
      command: 'bundle install --deployment --without="development test"'
      args:
        chdir: ''

    - name: Precompile assets
      command: rake assets:precompile chdir=
      environment:
        RAILS_ENV: ''

    - name: Migrate database
      command: rake db:migrate chdir=
      environment:
        RAILS_ENV: ''

    - name: Symlink new release
      file: src= dest= state=link force=yes

    - name: Restart unicorn
      command: sudo restart 

    - name: Cleanup
      shell: "ls -1t |tail -n +|xargs rm -rf"
      args:
        chdir: ''

Deploy

Once everything is set up, you can run the following command to deploy:

ansible-playbook -u railsbox -i production deploy.yml