Capybara, synchronize, and monkeys that climb ivory towers

Sun, Feb 17, 2013

I like Capybara. It’s an incredibly cool tool. Gone are the days when you needed an army of bleary-eyed QA team members following Word document scripts for click testing. Now it is possible to grab Capybara, and optionally something like Cucumber, and develop your QA team into a true automation team without also needing an engineer on-site to build a whole toolchain.

But there is a certain ivory-tower idealism to Capybara that can make it difficult to use with some applications. The authors believe that pretty much all browser testing should be via the DOM. If you click a button or load a page and can’t assert something about its content, selectors, appearance or other properties via XPath- or CSS-driven finders, there is probably something wrong with your application.

And they’re absolutely right. If clicking on a button doesn’t fire an event that comes back and modifies your DOM, how is your feature remotely usable? If Capybara can’t detect the success of an Ajax call as reflected in your DOM, how the heck is your user getting feedback? The DOM should have mutated, a class should now be present on that button that turned it green, or the word “Success” should now appear at the top. The Capybara authors are dead on: this is how it should be.

But this isn’t how it is. Sometimes life gets in the way of ideology.

In one particular case I encountered, it was simply more efficient to do a little page tweaking with jQuery rather than insisting the new feature’s MVP adhere to Capybara’s ivory-tower prescriptions. Normally that wouldn’t be a big deal, but we have an additional wrinkle: all our JavaScript is loaded in a non-blocking, asynchronous manner to improve page render time. That means the much-beloved $ may not exist yet when your test code runs.

We’ve already artfully dealt with the complexities of asynchronous JS loading in our application, but it still meant intermittent failures in our black box suite whenever the network was sluggish enough that Capybara called execute_script before jQuery had finished loading. If you poke around the Internet for this category of problem with asynchronous JS or Ajax, you’ll end up in a mire of nearly identical Stack Exchange answers citing how people “fixed” their problems with wait_until. I’d argue most of these people made things worse for their app without realizing it.

Noting the trend, the Capybara authors removed wait_until in the 2.0.x releases and replaced it with a harder-to-reach synchronize. Rightly, they’ve asserted that you shouldn’t ever really need to use it, since Capybara has a system of waiting and retrying built into all its query code; if you’re wrapping a DOM search in your own timer, you’ve probably failed to understand something fundamental about Capybara.

In our case, however, I need to loop back to the fact that we’re not dealing with the DOM here: the presence of jQuery is not at all a DOM event. Either $ is defined or it isn’t. Try as I might, I couldn’t play nice with Capybara’s testing philosophy and still get work done.

So, I reached for my least favorite tool: the monkey patch. Ugh, I feel dirty just typing the name…

Essentially, these two new methods on Capybara::Session bring the characteristic Capybara zen-like calm to my jQuery execution. Just as Capybara will patiently let your Ajax calls complete while watching for “Success” to appear, calls to execute_jquery will run little jQuery scripts after waiting patiently for $ to be defined.
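For illustration, here is a minimal sketch of what a patch along these lines could look like. It is not the original code. Only execute_jquery (named above) and Capybara’s own evaluate_script and execute_script are real names; the JQueryWaiting module, the wait_for_jquery helper, the JQueryNotLoaded error, and the timeout and interval defaults are all assumptions made for the example.

```ruby
# Hypothetical sketch of a "wait for $" patch; module, helper and error
# names are invented for illustration and are not part of Capybara.
module JQueryWaiting
  # Raised when $ never becomes available within the timeout.
  class JQueryNotLoaded < StandardError; end

  # Poll the page until $ is defined as a function, or give up.
  # Relies only on the session's existing evaluate_script.
  def wait_for_jquery(timeout: 5, interval: 0.05)
    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
    loop do
      return true if evaluate_script("typeof window.$ === 'function'")
      if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
        raise JQueryNotLoaded, "$ was not defined after #{timeout}s"
      end
      sleep interval
    end
  end

  # Run a jQuery snippet, but only once $ actually exists.
  def execute_jquery(script, timeout: 5, interval: 0.05)
    wait_for_jquery(timeout: timeout, interval: interval)
    execute_script(script)
  end
end

# The monkey patch proper: mix the methods into Capybara::Session,
# but only when Capybara has been loaded.
Capybara::Session.include(JQueryWaiting) if defined?(Capybara::Session)
```

With something like this mixed in, a step could call page.execute_jquery("$('#banner').hide()") (the selector is made up) and the script would run only after $ is defined, or fail with a clear error once the timeout expires.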

It really is grotesque and I don’t like it. But it works. Perhaps in the near future I can modify our JS loader to indicate various states via hints in the DOM. Or I can take the time to rewrite the problem assertions in pure JavaScript instead of jQuery (though I’m not inclined to take the “don’t use that useful tool” route). But maybe it is time to have a bit of a tussle with the Capybara folks and find out if a facility like this might be a worthwhile addition to the framework.

They’ve built a timeout-driven system for DOM testing. Why isn’t a similar timeout system available to evaluate_script?