May 2009

Excel WTF, have I been wasting my powers?

There are 2 things that strike me about this post over at Daily WTF. Well there are 3 things, but 1 of them (…my god, computer programmers are soooo anally retentive it’s untrue..) always strikes me, I should work on my duck and weave.

1. The formula sucks

2. Have I been miss an opportunity to “pick up chicks” by debugging excel formulas*?- Check out the comments?

*It seems unlikely…

A comparison of worksheet functions

The response to my blog post “Request an Excel blog post” was not stellar! But 2 people did request things. XL Dennis (aka Dennis Wallentin)  asked about Tables in Excel 2007, and Sam had some interest in function performance.
I did a quick video about tables in Excel 2007, which I need to compliment with one about how to use the new referencing system too, but that’s for another day.
Sam was kind enough to send me some work he’s done around function execution speeds. Sam used the MIE calculation tool to help him time these functions – good stuff!
The workbook is here, and the findings might make more sense when you see them in the worksheet ;-)
In Sam’s own words, here’s what he found out:

The — version of sumproduct is faster than the other two
Dsum – Sum is faster than Sumproduct and for Excel 2003 and below should be the most preferred way of summarising data
For 2007 – Sumifs beats everything else hands down
—————————————–
A Match/ Index combo is slightly slower than a Vlookup but more flexible.
But a Single Match column and multiple index columns is much faster than Vlookup. Index is super fast
—————————————–
Dynamic Names(with Index / Counta)  are faster than Table References both of which are faster than full column references
Dynamic Names with UDF is faster than Index/Counta
However  Fixed Range references are faster than Dynamic names of any kind

Here are my thoughts:
Sumproduct and DSum do slightly different things and the fact that DSun is faster kinda makes sense. However it’s easy enough to fall into the trap of relying on SP when other functions are better suited, I hardly every use Dsum, in fact I can’t think of a time I did, so this in a handy reminder!  It also makes me think if there are other instances which I’m missing out on, I bet there are loads!
The Sumif functions seems to trump everything that has come before it (see the vlookup times too) we should try and use these as much as possible.  Sumif’s are arguably easier to write and read the sumproducts too.  If the data fits try and use sumifs in 2007 plus. The performance gains are vast!
Having said that there are some instances when only sumprodct will achieve what you need, and for small data sets or number of functions then the benefits might not be noticeable, but for large data sets and high numbers of function calls it’s worthwhile thinking hard about which function to use.
Indexing/match lookups are about twice as fast as the equivalent vlookups in this example, but be careful because often only one vlookup is needed, and the performance gains might not be as good. I really should test this, but I just want to make the post right now!

References to ranges act as you would expect. Tables in 2007 might be a bit slower than dynamic ranges – not sure there’s that much in it though, but there are significantly different – see t-test on wks. Again I would like to see how these results are effect by relative data set size and number of function calls.

So there you have it a fairly comprehensive look at various “lookup” functions in Excel and Excel 2007. Thanks to Sam for his hard work, putting the tests together and shearing his findings with us, good work Sam!
If you have a topic you’d like to be discussed on this blog, then feel free to leave a comment on the Request post and mail me any supporting data you have.

Download Workbook (.xlsm)