The case of 'case/when' vs 'if/include'

Mon, 17.01.2011 rails 3, splat, include

I am not a native English speaker (and probably not a native developer either), so when I came across the word splat the other day I had to look it up. It turned out to be the name of that funny operator which looks like an asterisk and is typed with the asterisk key of your keyboard. But don’t get confused: it is not an asterisk, it is a splat, like so: *

As a result, I ended up reviewing my knowledge of that somewhat neglected Ruby operator. Most of us know it from code like this:

def some_method(important_parameter, *args)
  do_something_with(important_parameter)
  super(*args)  # splat again, so the parent receives the original arguments rather than one array
end

Kind of a Mary Poppins bag, where anything can pop in and out. In this respect, I found this very detailed post about the possibilities of the splat, which motivated mine. Go through it to improve your understanding of this array operator, and afterward go back to its section 4, Case/When. The code in that example is actually taken from production code in the current Rails 3.0.3:

def find(*args)
  ...
  case args.first
  when :first, :last, :all
    send(args.first)
  else
    find_with_ids(*args)
  end
  ...
end

That code caught my attention: a case statement with a single when. I immediately thought of an obvious alternative:

def find(*args)
  ...
  if [:first, :last, :all].include?(args.first)
    send(args.first)
  else
    find_with_ids(*args)
  end
  ...
end

Both versions give the same result. The Rails version makes it easy to add more alternatives later (such as following different paths for :first, :last and :all), but for now let us follow YAGNI and ask ourselves: if both do the same thing, which one is faster? Let’s refactor and use that one!
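To convince myself that the two dispatch styles really agree, here is a minimal sketch. FakeFinder and its stub methods (first, last, all, find_with_ids) are hypothetical stand-ins I made up for illustration. They are not the Rails implementations, they only return markers so the two paths can be compared.

```ruby
# Hypothetical stand-in for the finder; the stubs only return markers
# and are NOT the Rails implementations.
class FakeFinder
  def first; :first_record; end
  def last;  :last_record;  end
  def all;   :all_records;  end

  def find_with_ids(*ids)
    [:by_ids, ids]
  end

  # Dispatch in the style of the Rails snippet.
  def find_with_case(*args)
    case args.first
    when :first, :last, :all
      send(args.first)
    else
      find_with_ids(*args)
    end
  end

  # Dispatch in the style of the proposed alternative.
  def find_with_include(*args)
    if [:first, :last, :all].include?(args.first)
      send(args.first)
    else
      find_with_ids(*args)
    end
  end
end

finder = FakeFinder.new
inputs = [[:first], [:last], [:all], [1], [1, 2, 3], []]

# Both styles should agree for every argument list we try.
same = inputs.all? do |args|
  finder.find_with_case(*args) == finder.find_with_include(*args)
end
puts same
```

One subtlety worth keeping in mind: case/when compares with ===, while include? compares with ==. For symbols the two behave the same, which is why the styles are interchangeable here; for ranges, classes or regular expressions they would not be.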

So I did my homework and wrote the following file:

class BM
  def self.a
    [1,2,3]
  end

  # rails benchmarker 'BM.casewhen'
  def self.casewhen
    9999.times do
      case 4
      when *a
        b=1
      else
        b=2
      end
    end
  end
  
  # rails benchmarker 'BM.ifinclude'
  def self.ifinclude
    9999.times do
      if a.include? 4
        b=1
      else
        b=2
      end
    end
  end
end

The condition always evaluates to false, since 4 is not in the array, so Ruby takes the else branch. This ensures the worst case, in which the whole array is always traversed. I measured the difference in running time on a standard Ruby 1.9.2 installation on Debian Squeeze.
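The worst-case reasoning can be made visible with a small sketch. Probe is a hypothetical helper class of my own (not part of Ruby or Rails) that records every === comparison a when clause performs, so we can see that a miss walks the whole splatted list while a hit stops early.

```ruby
# Hypothetical probe: wraps a value and logs each === call made by `when`.
class Probe
  def initialize(value, log)
    @value = value
    @log = log
  end

  def ===(other)
    @log << @value      # remember that this candidate was tested
    @value == other
  end
end

# `when *candidates` expands the array into one comparison per element.
def branch_for(target, candidates)
  case target
  when *candidates then :when_branch
  else :else_branch
  end
end

log = []
probes = [1, 2, 3].map { |v| Probe.new(v, log) }

miss = branch_for(4, probes)   # 4 matches nothing
miss_log = log.dup
log.clear

hit = branch_for(1, probes)    # 1 matches the first candidate
hit_log = log.dup

puts "miss: #{miss}, compared against #{miss_log.inspect}"
puts "hit:  #{hit}, compared against #{hit_log.inspect}"
```

So testing with a value that is not in the array, as the benchmark does with 4, forces every candidate to be compared on each iteration.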

To keep it fast and simple, I put this file into a Rails project at app/models/bm.rb and ran both methods manually 7 times each (which I assume is enough to average out random delays caused by other processes on my system), with the following results:

  • rails benchmarker ‘BM.casewhen’
    => real time: 14489 13849 14862 16344 16667 15016 15047
    => Mean: 15182, Standard Deviation: 996.73

  • rails benchmarker ‘BM.ifinclude’
    => real time: 14266 13886 12886 14044 13973 14484 12896
    => Mean: 13776.43, Standard Deviation: 636.47
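For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the summary figures from the seven timings listed above. It assumes the sample standard deviation (n - 1 in the denominator), which is what reproduces the figures reported; the helper names mean and sample_std_dev are my own.

```ruby
# Arithmetic mean of a list of numbers.
def mean(values)
  values.inject(:+).to_f / values.size
end

# Sample standard deviation: divides by n - 1, not n.
def sample_std_dev(values)
  m = mean(values)
  squared_diffs = values.map { |v| (v - m)**2 }
  Math.sqrt(squared_diffs.inject(:+) / (values.size - 1))
end

casewhen  = [14489, 13849, 14862, 16344, 16667, 15016, 15047]
ifinclude = [14266, 13886, 12886, 14044, 13973, 14484, 12896]

# Relative saving of if/include over case/when, in percent.
saving = (mean(casewhen) - mean(ifinclude)) / mean(casewhen) * 100

puts "case/when:  mean=#{mean(casewhen).round(2)} sd=#{sample_std_dev(casewhen).round(2)}"
puts "if/include: mean=#{mean(ifinclude).round(2)} sd=#{sample_std_dev(ifinclude).round(2)}"
puts "if/include saves #{saving.round(2)}%"
```

Note that the two means differ by roughly one and a half standard deviations, so seven manual runs give a hint rather than a proof.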

As you can see, the second method takes about 9% less time to execute. I have to say that it doesn’t surprise me: intuitively, a case statement should take more time to evaluate than an if statement. You can run the test with other configurations (JRuby, Windows, you name it). If you do so, please comment here on your results.
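If you want to try this without a Rails project, a rough standalone sketch using Ruby’s standard Benchmark module could look like the following. SplatBench is a hypothetical name of mine, the loop bodies mirror the BM class above, and the timings will of course depend on your machine and Ruby version, so I am not quoting any expected output.

```ruby
require 'benchmark'

# Standalone variant of the BM class; SplatBench is a made-up name.
module SplatBench
  ITERATIONS = 9999

  # Returns a fresh array on every call, like BM.a in the original file.
  def self.a
    [1, 2, 3]
  end

  def self.casewhen
    result = nil
    ITERATIONS.times do
      result = case 4
               when *a then 1
               else 2
               end
    end
    result
  end

  def self.ifinclude
    result = nil
    ITERATIONS.times do
      result = a.include?(4) ? 1 : 2
    end
    result
  end
end

# Benchmark.bm prints user, system, total and real time for each report.
Benchmark.bm(10) do |x|
  x.report('case/when') { SplatBench.casewhen }
  x.report('include?')  { SplatBench.ifinclude }
end
```

Running it several times, and with a larger ITERATIONS value, should give steadier numbers than a single pass.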

And we all love our servers spending less time doing finds, don’t we? I know what you’re thinking: “these three lines don’t change the world”, but I bet that ‘find’ is in the top ten of executed statements in our applications. So why not get this little boost for free?

I am working on getting it into the next Rails version.

Update 1:
With the help of rafmagana I realized that my code was incomplete. He had much clearer code, which I used to obtain new, discouraging results (real time reported):

  • first with when: 0.923457
  • first with include?: 1.201857
  • last with when: 0.926521
  • last with include?: 1.301077
  • all with when: 0.945277
  • all with include?: 1.354739
  • find_with_ids with when: 3.035178
  • find_with_ids with include?: 1.816336

As we can see, :first, :last and :all prefer when (roughly 23-30% faster), whereas find_with_ids prefers include? (40% faster). How logical is that?
I mean, find_with_ids has to (or should have to) go through each item in the list and check whether the argument is there before finally falling through to the else. So it has more work to do, and yet it does it faster?
I hope I have learned a lesson: check every possibility, even if it seems illogical. You never know…

Update 2:
The next time I want to do something about performance, remind me to watch again this incredible video from Aaron Patterson at RubyConf 2010. I already downloaded a copy for my own future reference.
