A Lenient Capybara

Flaky specs with Capybara seem to be a law of nature. Instead of hunting them down one after the other, we learned to stop worrying and let the rodent run.

Even after years of writing frontend specs with Capybara, we face the occasional flaky expectation where none of the proven best practices help. Sure, by investing a few hours, you might come up with a solution. But as we are paid for building shiny features and not for polishing technical quirks, this soon becomes way too costly.

Still, every flaky expectation can turn on the big red light on the CI. This is what really annoyed us. We expect our CI to report actual problems in our production code, not obscure issues in the test code. So we set out to fix this.

The nature of a flaky spec is that it works in one run and fails in the next. As a consequence of this crucial insight, a CI run that failed because of a flaky spec might succeed the next time. All we have to do is run the failed specs again until they succeed!

RSpec has an option to run only the examples that failed in the previous run: --only-failures. For this to work, the first, regular run has to record the results of all examples. This is achieved with the following RSpec setting:

config.example_status_persistence_file_path = 'tmp/example_status.txt'
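
To get a feeling for what RSpec records there, it can help to peek into that file. The following is a minimal sketch, assuming the pipe-separated layout RSpec writes (columns example_id, status and run_time). The helper failed_example_ids is hypothetical and not part of RSpec, and the sample rows are made up for illustration.

```ruby
# Hypothetical helper, not part of RSpec: list the ids of all failed
# examples found in the text of a status persistence file.
def failed_example_ids(status_text)
  status_text.each_line.filter_map do |line|
    id, status = line.split('|').map(&:strip)
    id if status == 'failed'
  end
end

# Made-up sample content in the layout RSpec uses for this file.
SAMPLE_STATUS = <<~TEXT
  example_id                       | status | run_time        |
  -------------------------------- | ------ | --------------- |
  ./spec/system/login_spec.rb[1:1] | passed | 0.81234 seconds |
  ./spec/system/login_spec.rb[1:2] | failed | 5.12345 seconds |
  ./spec/system/cart_spec.rb[1:1]  | failed | 3.00012 seconds |
TEXT

puts failed_example_ids(SAMPLE_STATUS)
```

These are the examples a run with --only-failures would pick up again.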

If there are failures, do not exit directly with an error, but start a second run only for the failed specs. To handle tenaciously flaky specs, repeat for a third time and then exit with an error if failures persist.
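
Stripped of Rake and RSpec, the procedure boils down to a small loop. The sketch below is only an illustration: run_leniently is a hypothetical helper, and the lambdas are stand-ins for real spec runs that return true on success.

```ruby
# Hypothetical helper: call the given runs in order and stop at the first
# one that reports success. Returns true if any attempt succeeded.
def run_leniently(attempts)
  attempts.each_with_index do |attempt, index|
    puts "ATTEMPT #{index + 1}"
    return true if attempt.call
  end
  false
end

# Stand-ins for real spec runs: the full suite fails, the first retry of
# the failures fails as well, the last retry passes.
full_run    = -> { false }
first_retry = -> { false }
last_retry  = -> { true }

success = run_leniently([full_run, first_retry, last_retry])
puts(success ? 'green' : 'red')
```

Only the result of the last attempt decides whether the build goes red. Earlier failures merely trigger another attempt.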

Cast in Rake tasks, this is what our implementation looks like:

require 'English' # provides $CHILD_STATUS as a readable alias for $?
require 'rspec/core/rake_task'

namespace :spec do
  namespace :system do
    desc 'Run system tests at most three times to gracefully handle flaky specs'
    task :lenient do
      # Start without recorded results from earlier runs
      sh 'rm -f tmp/example_status.txt'

      puts "\nFIRST ATTEMPT\n"
      Rake::Task['spec:system:start'].invoke
      next if $CHILD_STATUS.success?

      puts "\nSECOND ATTEMPT\n"
      Rake::Task['spec:system:retry'].invoke
      next if $CHILD_STATUS.success?

      puts "\nLAST ATTEMPT\n"
      Rake::Task['spec:system:last'].invoke
    end

    # Full run of all system specs
    RSpec::Core::RakeTask.new('start') do |t|
      t.pattern = './spec/system/**/*_spec.rb'
      t.fail_on_error = false # don't stop the whole run
    end

    # Rerun only the examples that failed in the previous run
    RSpec::Core::RakeTask.new('retry') do |t|
      t.fail_on_error = false # don't stop the whole run
      t.rspec_opts = '--only-failures'
    end

    # Final rerun of the remaining failures; this one may fail the build
    RSpec::Core::RakeTask.new('last') do |t|
      t.fail_on_error = true # do fail the run
      t.rspec_opts = '--only-failures'
    end
  end
end

While this procedure adds some time to our test runs, flaky specs bother us a lot less now. We still hunt them down once in a while, but it has become far less urgent and we can use our time to focus on the good parts.
