Tuesday, March 31, 2015

PostgreSQL Shutdown

PostgreSQL has three shutdown modes: smart, fast, and immediate.  For many years, the default has been "smart", but Bruce Momjian has just committed a patch to change the default to "fast" for PostgreSQL 9.5.  In my opinion, this is a good thing; I have complained about the current default, and agreed with others complaining about it, many times, at least as far back as December of 2010.  Fortunately, we now seem to have achieved consensus on this change.

For those not familiar with the existing modes, "smart" means that the database system begins shutting down at once, but does not finish shutting down until all existing sessions have terminated, which can take a very long time, perhaps forever.  While the shutdown is in progress, no new sessions can be created, so if you were planning to shut down the database quickly, and then restart it, you instead end up in a partially shut-down state for a long time, maybe forever, which can function as a sort of self-inflicted denial-of-service attack.

A "fast" shutdown terminates all existing sessions, performs a checkpoint (so that recovery will not be required on restart), and then shuts down.  This is, I believe, the best type of shutdown to use, because it will generally complete relatively quickly, which I think is what most people want.  An "immediate" shutdown terminates all existing sessions and shuts down without performing a checkpoint; on restart, recovery is required.
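For concreteness, the three modes correspond to the -m option of pg_ctl.  The sketch below shows them side by side; stop_server is a hypothetical wrapper introduced here purely for illustration, and the real commands assume a running server with PGDATA set:

```shell
# Hypothetical wrapper around pg_ctl, just to show the three -m
# values together (assumes pg_ctl is on the PATH and PGDATA is set).
stop_server() {
    # $1 is one of: smart, fast, immediate
    pg_ctl stop -m "$1"
}

# stop_server smart      # wait for all sessions to end; refuse new ones
# stop_server fast       # terminate sessions, checkpoint, shut down
# stop_server immediate  # terminate sessions, skip the checkpoint;
#                        # recovery will run on the next startup
```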

One problem with fast shutdowns is that, if there is a large amount of recently-modified data that must be written to disk, the shutdown checkpoint can take a long time to complete.  Some people try to avoid this by using the immediate shutdown mode, but that's not really a solution: instead of shutting down slowly, the database will instead start up slowly.  Either way, your database is off-line for too long.  Here's a better way:

psql -c CHECKPOINT && pg_ctl stop -m fast

The system may run more slowly during the checkpoint, since a lot of data may be getting written to disk very quickly, but it will still be up and accepting connections, which is a good thing if that checkpoint takes multiple minutes to complete.  Only once that checkpoint is done do we begin shutting down.  The fast shutdown will perform a second checkpoint, but that one will generally complete pretty quickly, since it runs right after the first one and therefore won't have much new, dirty data to write out.
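The two-step shutdown above can be wrapped in a small script.  This is only a sketch, assuming psql and pg_ctl are on the PATH and connect to the intended server; checkpoint_then_stop is a hypothetical helper name, not anything shipped with PostgreSQL:

```shell
# Hypothetical helper: checkpoint while the server is still up and
# accepting connections, then do a fast shutdown, whose own
# checkpoint should find little dirty data left to write.
checkpoint_then_stop() {
    psql -c "CHECKPOINT" || return 1   # server stays online during this
    pg_ctl stop -m fast                # second checkpoint is quick
}
```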

There's an interesting terminological difference between PostgreSQL and Oracle around shutdown modes.  In Oracle, what PostgreSQL calls a "smart" shutdown is instead called a "normal" shutdown.  Due to the fact that such a shutdown may take an exceedingly long time to complete, both terms are perhaps a misnomer; in this era of nonstop data processing, "dumb" or "neverending" might be a more apt moniker.  Still, the terms are fairly similar, and perhaps reflect the expectations of the time period in which they were invented.  More confusingly, what PostgreSQL calls a "fast" shutdown is referred to as an "immediate" shutdown in Oracle; but our "immediate" shutdown is what Oracle calls an "abort".  This can lead DBAs familiar with Oracle to issue an "immediate" shutdown against PostgreSQL when what they really intended was what we call a "fast" shutdown.

3 comments:

  1. How about "s/smart/polite/" ? Hinting at the real-life situation when two overly-polite people let each other go first.

    The choice of smart vs immediate really depends on your applications. Of course an app that can't recover from an aborted transaction is arguably broken, but in my experience most apps are like that.

    My distro's init script does 'smart with timeout' (falling back to 'immediate' if 'smart' didn't succeed within a configurable timeout, 60 seconds by default). I think that's an even safer compromise.

  2. Sounds like we need a built-in way to do what your script does: checkpoint, then terminate connections, then checkpoint again, then shut down. Sounds no worse to me than "fast", since it'll take about as long without taking the database off-line. There's probably a pathological case where it's worse, like if a client is writing as fast as you can save to disk. But it'd be a decent default.

  3. What Lawrence said. :-) Yep, it should do that internally!
