Am I preloading the app in Heroku + Unicorn correctly?
When using Unicorn on Heroku, scaling up causes problems: a newly scaled web dyno can receive requests while it is still loading the app, which mostly results in a timeout error.
I did a bit of reading at http://codelevy.com/2010/02/09/getting-started-with-unicorn.html and https://github.com/blog/517-unicorn
Both articles suggest using preload_app true, together with an after_fork and a before_fork block.
In Rails 3+, is the code in the before_fork block still required? I read somewhere that it isn't. Has anyone set this up before and would like to share?
Am I missing anything else? Am I pre-loading the app correctly?
# config/initializers/unicorn.rb
# Read from:
# http://michaelvanrooijen.com/articles/2011/06/01-more-concurrency-on-a-single-heroku-dyno-with-the-new-celadon-cedar-stack/
worker_processes 3 # amount of unicorn workers to spin up
timeout 30 # restarts workers that hang for 30 seconds
# Noted from http://codelevy.com/2010/02/09/getting-started-with-unicorn.html
# and https://github.com/blog/517-unicorn
preload_app true
after_fork do |server, worker|
  ActiveRecord::Base.establish_connection
end
before_fork do |server, worker|
  ##
  # When sent a USR2, Unicorn will suffix its pidfile with .oldbin and
  # immediately start loading up a new version of itself (loaded with a new
  # version of our app). When this new Unicorn is completely loaded
  # it will begin spawning workers. The first worker spawned will check to
  # see if an .oldbin pidfile exists. If so, this means we've just booted up
  # a new Unicorn and need to tell the old one that it can now die. To do so
  # we send it a QUIT.
  #
  # Using this method we get 0 downtime deploys.
  # Rails.root is a Pathname, so join a relative path and compare as a String.
  old_pid = Rails.root.join('tmp/pids/unicorn.pid.oldbin').to_s
  if File.exist?(old_pid) && server.pid != old_pid
    begin
      Process.kill("QUIT", File.read(old_pid).to_i)
    rescue Errno::ENOENT, Errno::ESRCH
      # someone else did our job for us
    end
  end
end
What you're seeing here is expected. The moment you scale up by a dyno, the Heroku platform will deploy that slug to a new dyno, which is entirely isolated from your other dynos (i.e. it runs another Unicorn master).
Once that dyno is deployed and running (effectively booted), the routing mesh will start sending requests to that dyno, which is when Rails will start up on Unicorn, or whatever server you've got set up.
However, once that request arrives you have a 30 second window to return your data or the request will be timed out on the routing mesh (error H12).
Therefore, to summarize, your problem isn't to do with forking; it's that your application cannot start up within 30 seconds, hence the early timeouts. Forking and PID files are not something you need to worry about on the Heroku platform.
Only a partial answer, but I was able to reduce these nasty scaling timeouts with this Unicorn configuration:
worker_processes 3 # amount of unicorn workers to spin up
timeout 30 # restarts workers that hang for 30 seconds
preload_app true
# hack: traps the TERM signal, preventing unicorn from receiving it and performing its quick shutdown.
# My signal handler then sends QUIT signal back to itself to trigger the unicorn graceful shutdown
# http://stackoverflow.com/a/9996949/235297
before_fork do |_server, _worker|
  Signal.trap 'TERM' do
    puts 'intercepting TERM and sending myself QUIT instead'
    Process.kill 'QUIT', Process.pid
  end
end
# Fix PostgreSQL SSL error
# http://stackoverflow.com/a/8513432/235297
after_fork do |server, worker|
  defined?(ActiveRecord::Base) and
    ActiveRecord::Base.establish_connection
end
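For anyone curious how the TERM-to-QUIT hack behaves, here is a minimal standalone sketch of the idea outside Unicorn. Everything in it is an assumption made for illustration: run_child_and_term is a hypothetical helper, the forked child stands in for a server process, and its QUIT handler (a plain clean exit) only stands in for Unicorn's graceful shutdown. It needs a platform that supports fork.

```ruby
# Hypothetical helper: fork a child that turns TERM into QUIT,
# send it TERM, and return the child's Process::Status.
def run_child_and_term
  ready_r, ready_w = IO.pipe

  pid = fork do
    ready_r.close
    # Stand-in for a graceful shutdown: leave with exit status 0.
    Signal.trap('QUIT') { exit!(0) }
    # The hack itself: intercept TERM and re-send QUIT to ourselves.
    Signal.trap('TERM') { Process.kill('QUIT', Process.pid) }
    # Tell the parent that both handlers are installed.
    ready_w.puts 'ready'
    ready_w.close
    sleep # stands in for serving requests
  end

  ready_w.close
  ready_r.gets # wait until the child has installed its handlers
  ready_r.close
  Process.kill('TERM', pid)
  _, status = Process.wait2(pid)
  status
end

status = run_child_and_term
puts "exited normally: #{status.exited?}, killed by signal: #{status.signaled?}"
```

Without the TERM trap the child would be terminated by the signal itself; with it, the child leaves through its own QUIT handler.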
Also, I use heroku labs:enable preboot (see https://devcenter.heroku.com/articles/labs-preboot/). Unfortunately, I still see some timeouts when scaling up the web dynos.
Here's a discussion I initiated in the HireFire support forum: http://hirefireapp.tenderapp.com/discussions/problems/205-scaling-up-and-down-too-quickly-provoking-503s
preload_app true helped for our app, so do give it a shot if you're seeing timeouts during deploy/reboot. The comments saying it doesn't help made me think it wasn't worth trying, but then we realised it was indeed the fix we needed.
Our situation was a slow-to-boot Rails app using preboot. On some deploys and restarts, we would get a lot of timeouts, to the point that the site was considered down by our uptime monitoring.
We came to realise that with preload_app false, Unicorn will bind its port first and then load the app. As soon as it binds the port, Heroku starts sending it traffic. But it then takes a long time for this slow app to load, so that traffic times out.
This is easy to verify by running Unicorn in dev, trying to access the site right after starting Unicorn, and checking whether you get a "no server on that port" type error (desirable) or a very slow request (not desirable).
When we instead set preload_app true, it takes longer until Unicorn binds the port, but once it does and Heroku sends it traffic, it's ready to respond.
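If it helps, here is a rough Ruby sketch of the port check described above. probe_port is a hypothetical helper (not part of Unicorn or Heroku), and the bare TCPServer is only a stand-in for a server that has bound its port but is not answering yet. It shows the two outcomes to look for: a connection that is accepted before anything can respond, versus an immediate refusal.

```ruby
require 'socket'

# Hypothetical helper: report whether anything is listening on host:port.
# :refused  -> nothing has bound the port (the "no server on that port" case)
# :accepted -> the TCP connection was established, even if no app can answer yet
def probe_port(host, port)
  Socket.tcp(host, port, connect_timeout: 1) { :accepted }
rescue Errno::ECONNREFUSED
  :refused
end

# Stand-in for "bind first, load later": the port is listening,
# but nothing calls accept, much like a server still loading its app.
listener = TCPServer.new('127.0.0.1', 0) # port 0 lets the OS pick a free port
port = listener.addr[1]
while_bound = probe_port('127.0.0.1', port)

# Once the listener is gone, the same probe is refused immediately.
listener.close
after_close = probe_port('127.0.0.1', port)

puts "while bound: #{while_bound}, after close: #{after_close}"
```

Against a real Unicorn in dev you would point probe_port at the configured port right after starting it: an accepted connection followed by a very slow response suggests the port was bound before the app finished loading.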