Capistrano, Ruby 1.9 and SSH Gateways
capistranoUPDATE 2011-05-05 16:17 PST: It looks like this has been resolved!
UPDATE 2011-01-16 16:26 PST: I’ve updated the data collection gist to show conversion to CSV in case that’s useful. I’ve also included the code I used to calculate the statistics for each data set.
I use Ruby at work for systems administration tasks on a daily basis. One particularly useful tool in my arsenal is Capistrano. However, as I began to transition my daily taks to Ruby 1.9 I noticed Capistrano was running slowly. Tasks that normally took on the order of a couple seconds could take tens of seconds and sometimes even minutes. I do not yet know what the root cause of the problem is, and I’m not even sure anyone else is having such a problem. After some hours of Googling, I have yet to see a single page describing anything remotely like this. I’m posting this in case anyone else has noticed something similar.
The problem appears to only occur when I am using an ssh gateway,
defined in my Capfile
like so:
set :gateway, "host.example.com"
With that line, Capistrano connects via ssh to host.example.com first
and from there connects via ssh to the target host. (Without the gateway
defined, Capistrano will connect via ssh directly to the target host.)
With an ssh gateway defined in my Capfile
tasks run under
Ruby 1.9.2 are painfully slow. I collected some statistics quantifying
the problem.
I started with the following dead simple Capistrano task:
role :all, %w[server1 server2 server3 server4 server5]
task :uptime, :roles => :all do
run "uptime"
end
This task runs the uptime
command on five different servers
running Debian Linux. I ran the task 100 times in succession on a 2.6
GHz Core 2 Duo MacBook Pro running OS X 10.6.6 using Ruby Enterprise
Edition 2010.02 and Ruby
1.9.2p136 both installed via RVM. I
ran the tests with and without an ssh gateway. In each case I had
capistrano
2.5.19, net-ssh
2.0.24 and
net-ssh-gateway
1.0.1
installed. I timed each run.
The raw data for all the runs are available as a gist. I used this code to calculate the stats. Below is the analysis of those figures. First, running Ruby Enterprise Edition through an ssh gateway:
real | user | sys | |
min | 2.50s | 0.50s | 0.15s |
max | 3.93s | 0.60s | 0.21s |
mean | 2.61s | 0.56s | 0.17s |
stddev | 0.17s | 0.02s | 0.01s |
median | 2.56s | 0.56s | 0.17s |
And running the same task under Ruby Enterprise Edition without an ssh gateway:
real | user | sys | |
min | 2.54s | 0.47s | 0.13s |
max | 3.13s | 0.62s | 0.18s |
mean | 2.61s | 0.51s | 0.14s |
stddev | 0.11s | 0.02s | 0.01s |
median | 2.58s | 0.51s | 0.14s |
And now the same task under Ruby 1.9.2 without an ssh gateway:
real | user | sys | |
min | 2.29s | 0.64s | 0.21s |
max | 3.35s | 0.72s | 0.23s |
mean | 2.41s | 0.66s | 0.21s |
stddev | 0.11s | 0.01s | 0.00s |
median | 2.42s | 0.66s | 0.21s |
And finally the same task under Ruby 1.9.2 through an ssh gateway:
real | user | sys | |
min | 3.54s | 0.70s | 0.23s |
max | 279.17s | 1.83s | 1.11s |
mean | 66.80s | 0.96s | 0.43s |
stddev | 58.61s | 0.22s | 0.18s |
median | 50.64s | 0.92s | 0.39s |
So REE with and without a gateway and 1.9.2 without a gateway each took about 2.5 seconds to complete the task. That seems reasonable to me. But cripes, the average runtime for that task under 1.9.2 with a gateway was 66.80 seconds, and the peak runtime was 279.17 seconds! The shortest runtime was an almost forgiveable 3.54 seconds. If you look at the data for that run you will see the substantial swings in runtime. The standard deviation was a moody 58.61 seconds. With a median runtime of 50.64 seconds, it would seem most of the runs are shorter than average, but there are several exceptionally slow runs. Fourteen runs took two minutes or longer.
Again, apart from knowing it has something to do with using an ssh
gateway, I haven’t identified the actual problem yet. If you know what I
might be doing wrong, I’d love to know. If you don’t know what’s going
on, but you’re having a similar problem, well, at least you’re not
alone. In the meanwhile, I’ve modified my Capfile
to let me
conditionally enable the ssh gateway:
if ENV['CAP_SSH_GATEWAY']
set :gateway, ENV['CAP_SSH_GATEWAY']
end
# Run without an ssh gateway
# $ cap <task>
# Run through an ssh gateway
# $ cap <task> CAP_SSH_GATEWAY=host.example.com
This allows me to continue using Ruby 1.9 for daily systems
administration tasks while I’m on a trusted network. When working from
an untrusted network, I can throw CAP_SSH_GATEWAY
on the
end (and switch to Ruby 1.8 – which is easy with
RVM).