What about those bloated tests?
atmos : February 13th, 2007
Once your application grows larger than the trivial twelve-model system that everyone seems to be writing in Rails, your test suite (if you’re diligently writing one) can really start to slow down. Slower tests mean that fewer people run them and fewer people have faith in them. We work on a large healthcare application where we currently have 168 models and thousands of tests.
As the app has grown our test runs have gone from chill to completely insane. We develop against Postgres on our personal machines, but our continuous integration server tests against both Oracle and Postgres. It had reached the point where it took almost 15 minutes to run all of our tests against Postgres, and nearly an hour on Oracle. The feedback lag on our CI server was really hurting us: if someone broke the build it might be an hour before anyone would be tipped off. By that point we had often shifted focus to something else, and it was always frustrating to stop what we were working on and fix that Oracle oddity. Not only was the CI run long, but the builds on our personal machines were long too, and our developers were starting to run tests less and less frequently, which would eventually end up meaning they’d be writing fewer tests. So this week we went digging to see how we could alleviate some of the pain.
Eric Hodel posted something recently about a technique he has used to speed up his tests: he is profiling by lines logged. We ran this the other day before our epiphany and it actually tipped us off to two beefy controller methods.
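For a feel of what counting lines logged can look like, here is a minimal sketch. It is not Eric Hodel's code: it assumes each test writes a marker line of the form "=== TEST name" to the log before it runs, and both that marker format and the lines_logged_per_test helper are made up for illustration.

```ruby
# Tally how many log lines each test produced.
# ASSUMPTION: every test writes a marker line "=== TEST <name>" to the log
# before it runs. The marker format is hypothetical, not a Rails convention.
MARKER = /\A=== TEST (\S+)/

def lines_logged_per_test(log_text)
  counts = Hash.new(0)
  current = nil
  log_text.each_line do |line|
    if (m = MARKER.match(line))
      current = m[1]
      counts[current] += 0 # register the test even if it logs nothing
    elsif current
      counts[current] += 1
    end
  end
  # Heaviest loggers first.
  counts.sort_by { |_name, n| -n }
end

sample_log = <<~LOG
  === TEST test_create
  SELECT * FROM parties
  INSERT INTO parties ...
  === TEST test_show
  SELECT * FROM parties
LOG

lines_logged_per_test(sample_log).each { |name, n| puts "#{n}\t#{name}" }
```

Tests near the top of that list are the ones generating the most log traffic, which is usually a decent hint about where the database chatter is.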
We all know fixtures suck, but we could find very little information out there about speeding up unit tests other than Jay Fields’s No DB Unit Testing. He bypasses all database access and uses mocks to decouple his unit tests (he’s also got some good advice on unit testing). Everyone loves their mocks these days, so we tried getting our Mock On a little more (yes, we were using mocks for various things already). The problem is we have a lot of models and even more relations, so stubbing/mocking things out was going to take a lot more time than we wanted to invest. We also noticed that the majority of the slowdown in the Oracle adapter was in the schema dumper/loader. It was spending more time dumping and loading the schema than it was actually testing our code.
We noticed a huge speedup in some of our unit tests by omitting calls to that evil ‘fixtures’ class method. Since we are using transactional tests, we only needed to populate the database before the test run. Giddy like schoolgirls we started pulling fixtures definitions left and right, but we’d set ourselves up for failure months earlier. We heavily used the dreaded fixtures accessor methods, the stuff that gives you parties(:first) when you declare ‘fixtures :parties’ in your test. This was gonna be a nightmare to overhaul because we were using it all over our unit tests, and without a “fixtures” call in the test the accessors wouldn’t work.
The approach we ended up taking was to pre-load a database with our full set of testing fixture data, then remove calls in our tests to fixture functionality. This means not only eliminating fixture accessor calls and “fixtures” declarations in tests, but updating the standard Rails rake tasks to preload our database, as well as stopping rake from wiping the database out before our test runs.
First, we needed to remove db:test:prepare from the prerequisites of all of the test tasks, otherwise db:test:prepare will wipe out our preloaded fixture data (see a discussion on this). We weren’t aware of a way to remove Rake task prerequisites, so we added this monkey patch to Rake::Task in our Rakefile that lets you prune them:

    class Rake::Task
      def detract(prerequisite)
        @prerequisites.delete(prerequisite)
      end
    end

    %w(units functionals integration recent uncommitted).each do |task|
      Rake::Task["test:#{task}"].detract('db:test:prepare')
      Rake::Task["test:#{task}"].enhance(['environment'])
    end
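If reopening Rake::Task feels heavy-handed, the same pruning can probably be done through the public prerequisites reader. The sketch below assumes a Rake version where prerequisites hands back the task's underlying array (worth checking against yours), and the demo:* task names are hypothetical stand-ins for the real Rails tasks so that it runs outside an app.

```ruby
require 'rake'

# Hypothetical stand-ins for the real Rails tasks, so the sketch runs by itself.
Rake::Task.define_task('demo:prepare')
Rake::Task.define_task('demo:environment')
Rake::Task.define_task('demo:units' => ['demo:prepare'])

units = Rake::Task['demo:units']

# ASSUMPTION: `prerequisites` returns the live array, so deleting from it
# prunes the dependency without touching @prerequisites directly.
units.prerequisites.delete('demo:prepare')

# `enhance` appends the replacement prerequisite.
units.enhance(['demo:environment'])

puts units.prerequisites.inspect
```

After this runs, demo:units should depend only on demo:environment, which mirrors what the detract/enhance pair does for the test:* tasks.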
Next, our tests were littered with calls to fixture accessor methods, and without getting rid of those we couldn’t sever our dependency on loading fixtures at test time. Turns out it’s not too hard to automatically convert those accessors to something cleaner (going from “some_model(:foo_1)” to “SomeModel.find(1)”). We wrote a script to remove fixture accessor methods. It looks for accessor calls, culling out the false positives (like “assigns(:foo)” which is not a fixture accessor) and rewriting them in place by actually loading the fixture file and mapping the fixture handle to a model id. If you place this in your Rails app’s script/ directory it will cull the accessor calls from your tests when run:

    require File.dirname(__FILE__) + '/../config/boot'
    require File.dirname(__FILE__) + '/../config/environment'
    require 'erb'
    require 'yaml'

    # methods which look like accessors, but aren't
    stopwords = ['expects', 'assert_assigns', 'to_formatted_s', 'delete', 'on', 'stubs', 'find', 'assigns']

    # load fixture files
    fixtures = {}
    Dir[File.dirname(__FILE__) + '/../test/**/*.yml'].each do |file|
      begin
        base = File.basename(file, '.yml')
        fixture = YAML.load(ERB.new(IO.read(file)).result)
        (fixture || {}).each_pair do |handle, struct|
          fixtures["#{base}(:#{handle})"] = "#{base.classify}.find(#{struct['id']})"
        end
      rescue Exception => e
        raise "ERROR converting fixture data in #{file}: #{e}"
      end
    end

    # replace accessors
    accessor = Regexp.new(/([a-z_]+)\(:[A-Za-z0-9_]+\)/)
    Dir[File.dirname(__FILE__) + '/../test/**/*_test.rb'].each do |file|
      File.open(file) do |f|
        File.open(file + '.clean', 'w') do |out|
          f.readlines.each do |l|
            l.gsub!(accessor) do |match|
              if stopwords.include?(match.sub(/\(.*$/, ''))
                match
              else
                raise "Error: no match for [#{match}] in fixture data." unless fixtures[match]
                fixtures[match]
              end
            end
            out.print l
          end
        end
      end
      File.rename(file + '.clean', file)
    end
It tries to be really anal about verifying that an accessor is really represented in the fixtures, and will throw an exception when it finds a problem. Note that, amusingly (or not), this script even works on fixture accessor calls that are commented out.
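To see the rewriting rule on its own, here is a stripped-down sketch that applies the same regular expression and stopword check to a single line. The FIXTURES table is hand-written for the example (the real script builds it from the YAML files), and rewrite_line is a name invented here.

```ruby
# Same accessor pattern as the script: a lowercase method name followed
# by a single symbol argument.
ACCESSOR  = /([a-z_]+)\(:[A-Za-z0-9_]+\)/
STOPWORDS = %w(assigns find expects stubs)

# ASSUMPTION: hand-written mapping for illustration; the real script
# derives this from test/fixtures/*.yml.
FIXTURES = { 'parties(:first)' => 'Party.find(1)' }

def rewrite_line(line, fixtures = FIXTURES, stopwords = STOPWORDS)
  line.gsub(ACCESSOR) do |match|
    name = match.sub(/\(.*$/, '')
    if stopwords.include?(name)
      match # looks like an accessor but is not one; leave it alone
    else
      fixtures.fetch(match) do
        raise KeyError, "no match for [#{match}] in fixture data"
      end
    end
  end
end

puts rewrite_line('assert_equal parties(:first), assigns(:party)')
puts rewrite_line('# parties(:first).name')
```

The first call rewrites the accessor and leaves assigns(:party) alone; the second shows why commented-out calls get converted too, since the pattern knows nothing about Ruby comments.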
If you have other custom methods that look like accessor calls, add them to the stopwords array at the top and the script won’t try to look those up. It is absolutely imperative that you have version control for your project, because if this screws up you’ll want to revert back to a clean checkout, tweak, and try again. For our Subversion repo I repeatedly reverted changes after each script run until I got the script happy with our code:
% svn st test/ | grep '^M' | awk '{print $2}' | xargs svn revert
Once we removed the accessor calls, we ran rake to verify that all our tests passed, and then we committed our changes to version control.
The next step was to remove our ‘fixtures :foo, :bar, :baz’ calls from our actual tests. We just used a quick command-line pass to trim the fixtures (yes, that’s perl in there):
% find test/ -type f -name '*_test.rb' | grep -v '\.svn' | xargs perl -n -i -e 'print unless /^\s*fixtures/'
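If perl isn’t your thing, the same filter is easy to sketch in Ruby. This version only transforms a string, so wiring it up to read and rewrite the files is left out; strip_fixtures_declarations is a name made up for the example.

```ruby
# Remove class-level `fixtures ...` declarations from test source text.
# Uses the same pattern as the perl one-liner, so any line that merely
# starts with the word "fixtures" is dropped as well.
FIXTURES_DECLARATION = /^\s*fixtures/

def strip_fixtures_declarations(source)
  source.each_line.reject { |line| line =~ FIXTURES_DECLARATION }.join
end

sample = <<~RUBY
  class PartyTest < Test::Unit::TestCase
    fixtures :parties, :addresses

    def test_truth
      assert true
    end
  end
RUBY

puts strip_fixtures_declarations(sample)
```

Like the one-liner, this is deliberately blunt; a helper method whose name happens to begin with "fixtures" would be removed too, so eyeball the diff before committing.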
Finally, we made a quick pass through any test helpers (like the one in test/test_helper.rb) that might be loading fixtures. Then, another run of the test suite. We caught a few errors here, which were simply stray accessor calls that the script couldn’t find. We had a few instances of string-constructed .send calls to accessor methods(!), as well as some with odd spacing, or line-breaks, etc. We got probably 99% of the accessors out with the script, and the rest in a single pass fixing the few broken tests that turned up at the end when we dropped all the fixtures calls.
After a clean run of our test suite, another commit brought the tree up to date. Our daily wash-rinse-repeat procedure has changed little. We still svn up and run rake. And upon seeing new migrations we still execute a few extra steps to update the database:
% svn up
% rake db:migrate
% rake db:test:prepare
% RAILS_ENV=test rake db:fixtures:load
% rake
We then took an extra step to streamline the test running ritual, and in the process managed to get Rake to detect for us (a la Autotest) when our test fixtures had changed so it could automatically call a db:fixtures:load on the test database.
As we stand now we’ve gone from a 15 minute Postgres test run to one that takes just under 5 minutes (on our admittedly underpowered CI server), and unit tests alone now run in 25% of the time they took prior to the changes (coming in around 25-40 seconds on our developer machines). The biggest surprise has been performance on Oracle. Oracle is actually running faster than Postgres for us, and all year long we’ve been talking trash about Larry, Oracle, and the licensing fees that our company purchased. Imagine that, it doesn’t suck as bad as we thought.
At this point, everything is Zen. Fast(er) tests; everyone is happy; peace in the Middle East. Well… Except when someone makes a change to fixtures. Because without reloading the database our test suite pops and burns like a faulty Firestone on a Ford Explorer. So we thought it prudent to whip up a Rake task to take care of that for us:

    FIXTURES_LAST_MOD = File.join(File.dirname(__FILE__), 'tmp', 'data', 'fixtures_last_modified')

    desc 'sets last time fixtures were loaded'
    task :touch do
      FileUtils.touch(FIXTURES_LAST_MOD)
    end

    namespace :db do
      namespace :fixtures do
        desc 'checks whether fixtures were modified before loading'
        task :lazy_loader do
          reload_fixtures = true
          if File.exist?(FIXTURES_LAST_MOD)
            last_loaded_at = File.mtime(FIXTURES_LAST_MOD)
            reload_fixtures = false if Dir["test/fixtures/**/*.yml"].entries.find {|f| File.mtime(f) > last_loaded_at}.nil?
          end

          if reload_fixtures
            original_env, RAILS_ENV = RAILS_ENV, 'test' # doubt this makes a difference, but let's be safe
            Rake::Task["db:fixtures:load"].invoke
            RAILS_ENV = original_env
          end
        end
      end
    end

    Rake::Task['db:fixtures:load'].enhance(['touch'])

    %w(units functionals integration recent uncommitted).each do |task|
      Rake::Task["test:#{task}"].detract('db:test:prepare')
      Rake::Task["test:#{task}"].enhance(['environment','db:fixtures:lazy_loader'])
    end
This task touches a file whenever db:fixtures:load fires. Before each test run, lazy_loader compares the modification times of the files under test/fixtures with the last time fixtures were loaded. If any fixture file has changed since then (or fixtures have never been loaded), it calls db:fixtures:load on the test environment; otherwise it goes straight to the test suite. This works on all test calls: rake, rake test:units, rake test:functionals, etc.
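The staleness check is the only subtle part, so here it is pulled out into a standalone sketch that can be run outside Rake. The fixtures_stale? helper and the temporary directory layout are invented for the example; the comparison itself is the same mtime test the task performs.

```ruby
require 'fileutils'
require 'tmpdir'

# True when fixtures should be reloaded: either no stamp file exists yet,
# or at least one fixture file is newer than the stamp.
def fixtures_stale?(stamp_file, fixture_glob)
  return true unless File.exist?(stamp_file)
  last_loaded_at = File.mtime(stamp_file)
  Dir[fixture_glob].any? { |f| File.mtime(f) > last_loaded_at }
end

Dir.mktmpdir do |dir|
  stamp   = File.join(dir, 'fixtures_last_modified')
  fixture = File.join(dir, 'parties.yml')
  glob    = File.join(dir, '*.yml')
  File.write(fixture, "first:\n  id: 1\n")

  puts fixtures_stale?(stamp, glob)   # no stamp yet, so a load is needed

  now = Time.now
  File.utime(now, now - 60, fixture)  # pretend the fixture was edited a minute ago
  FileUtils.touch(stamp)              # pretend fixtures were loaded just now
  puts fixtures_stale?(stamp, glob)   # nothing changed since the load

  File.utime(now, now + 60, fixture)  # pretend the fixture was edited after the load
  puts fixtures_stale?(stamp, glob)   # a reload is needed again
end
```

One thing this makes obvious: the check only notices edits to fixture files, so a schema change from a migration still needs the manual db:test:prepare and db:fixtures:load steps.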
(by atmos, rickbradley, and Kevin Barnes)
1 Response to “What about those bloated tests?”
March 5th, 2007 at 02:56 PM
Thanks for that, very interesting!
I couldn’t quite face removing all our uses of the fixture accessors (at least not yet). In particular, being able to write customer(:customer_with_bad_credit) reads rather nicer than Customer.find(12).
However, I did find another way to shave some time off our test runs. The fixtures aren’t reloaded for every test: once loaded, they are cached on a per-test-case basis. Rejig that to make it a global cache, move all fixtures :foo statements to test_helper, and hey presto: fixtures loaded exactly once. (Shaving a good third off run time in our case.)