Functions

The MIE Podcasts – Charles Williams

Charles Williams is probably best known for his work in Excel calculation speed and VBA performance, not to mention the worlds most famous Excel Addin Name Manager. In truth, he’s a expert in many areas of Excel and on top of all this he’s a great bloke. In this podcast we got down and dirty with Excel performance, uncovering some interesting insights along the way. It was a true pleasure talking with Charles.

Charles the First - King of Speed!

Charles in the founder of Decisions Models, a consulting firm specialising in Excel development, and the farther of the FastExcel addin, we talked about this addin and how it works and also his addin loader tool. This is one for the archives!

Thanks to Charles for making the time and sharing his knowledge with us all.

Spreadsheet functional programming

In his recent paper Spreadsheet functional programming, Dave Wakeling discussed the idea of using a functional programming language from within a spread sheet.

It’s an interesting approach, because the implementation effetely allows you to write code (in this case Haskell) in cell comments, in such a way that the results are computed on the fly – i.e. not complied.

As you might expect, there are a number of issues with the implementation from a practical stand point, but that’s not what I’m interested in here – I want to talk about the concept of functional programming in relation to spread sheets, which this paper also focuses on.

It’s my view that a spreadsheet is a functional programming language. This means that I have an issue with the goal of the paper, which is:

“Our hope is that by doing so, we might get spreadsheet programmers to give functional programming a try”

Nothing wrong with that, but my views that spread sheet users already are functional programmers!!! So why them might we want to use Haskell with Excel? Towards the end of the paper Wakeling suggest some possible strengths to this approach.

1. It uses an ordinary spreadsheet. The functionality of, experience with, and Support for a familiar product all carry over.

2.  It uses an ordinary functional language. The benefits of such languages also carry over. (Backus, 1978; Turner, 1982; Hughes, 1989).

3. It can accommodate any pace of change. The old way of doing things can coexist with the new one.

Unfortunately I don’t agree that these are real benefits. Oh dear! – Why?

1. This is a case for anything that  integrates with Excel; the benefit is independent of what it is that’s being integrated.  This is the benefit of using a spread sheet and a programming language over using a programming language on it own.  It’s tenuous to claim it as a strength.

2. The benefits are true of Functional v’s Imperative programming languages (lasy, high order functions). Since Excel functions operate in a very similar way to functional programming these benefits don’t really exists between Excel worksheet functions and a functional programming language.

3. This statement is true of any situation where the 2 technologies can co-exist you could say the same about using C, VSTO etc, etc, again it’s really tenuous to claim it as a strength of any like for like approach.

So if these strengths are not in fact all that real, then why would we want to use a functional programming langue with spreadsheets? Perhaps a clue is given in this passage:

“On the positive side, many everyday programming errors are detected that would otherwise lead to incorrect results, including dangling cell references, undefined functions, and functions with the wrong type of arguments.”

I can see how using a more “strongly typed” language could possibly help reduced type errors, not sure how undefined functions would be helped or how “dangling cell references” (in fact, I don’t know what this means at all) would be effected.  Generally further discussion and justification of the benefits of using a functional language with a spreadsheet are needed to build a case for doing so.

The paper did not convince me that using a functional language has any really benefits to using worksheet functions or any other programming language. Having said that, the paper does show that its possible to link technologies like this and it’s another interesting approach to programming spreadsheets. The prospect of F# and XLDNA and F# in VSTA/O are also interesting things to start thinking about!  The application of functional programming languages within spreadsheets is interesting  and Wakeling should be congratulated for opening up the debate for future discussion– as hopefully this post will do too!

A comparison of worksheet functions

The response to my blog post “Request an Excel blog post” was not stellar! But 2 people did request things. XL Dennis (aka Dennis Wallentin)  asked about Tables in Excel 2007, and Sam had some interest in function performance.
I did a quick video about tables in Excel 2007, which I need to compliment with one about how to use the new referencing system too, but that’s for another day.
Sam was kind enough to send me some work he’s done around function execution speeds. Sam used the MIE calculation tool to help him time these functions – good stuff!
The workbook is here, and the findings might make more sense when you see them in the worksheet ;-)
In Sam’s own words, here’s what he found out:

The — version of sumproduct is faster than the other two
Dsum – Sum is faster than Sumproduct and for Excel 2003 and below should be the most preferred way of summarising data
For 2007 – Sumifs beats everything else hands down
—————————————–
A Match/ Index combo is slightly slower than a Vlookup but more flexible.
But a Single Match column and multiple index columns is much faster than Vlookup. Index is super fast
—————————————–
Dynamic Names(with Index / Counta)  are faster than Table References both of which are faster than full column references
Dynamic Names with UDF is faster than Index/Counta
However  Fixed Range references are faster than Dynamic names of any kind

Here are my thoughts:
Sumproduct and DSum do slightly different things and the fact that DSun is faster kinda makes sense. However it’s easy enough to fall into the trap of relying on SP when other functions are better suited, I hardly every use Dsum, in fact I can’t think of a time I did, so this in a handy reminder!  It also makes me think if there are other instances which I’m missing out on, I bet there are loads!
The Sumif functions seems to trump everything that has come before it (see the vlookup times too) we should try and use these as much as possible.  Sumif’s are arguably easier to write and read the sumproducts too.  If the data fits try and use sumifs in 2007 plus. The performance gains are vast!
Having said that there are some instances when only sumprodct will achieve what you need, and for small data sets or number of functions then the benefits might not be noticeable, but for large data sets and high numbers of function calls it’s worthwhile thinking hard about which function to use.
Indexing/match lookups are about twice as fast as the equivalent vlookups in this example, but be careful because often only one vlookup is needed, and the performance gains might not be as good. I really should test this, but I just want to make the post right now!

References to ranges act as you would expect. Tables in 2007 might be a bit slower than dynamic ranges – not sure there’s that much in it though, but there are significantly different – see t-test on wks. Again I would like to see how these results are effect by relative data set size and number of function calls.

So there you have it a fairly comprehensive look at various “lookup” functions in Excel and Excel 2007. Thanks to Sam for his hard work, putting the tests together and shearing his findings with us, good work Sam!
If you have a topic you’d like to be discussed on this blog, then feel free to leave a comment on the Request post and mail me any supporting data you have.

Download Workbook (.xlsm)

Fill in empty cells

You know the score, you've done a pivot table and there are a load of empty cells. You want to fill them up:
[NB: you have to copy the PT as values or the code will fail!]

I use this almost every day:

Sub fill_in()
Dim cell As Object
    Application.ScreenUpdating = False
    For Each cell In Selection
        If cell.Value = "" Then
            cell.Value = cell.Offset(-1, 0).Value
        Else: End If
    Next cell
    Application.ScreenUpdating = True
End Sub

I think this may be a repost - I did look back but i could not find anything that i had done before, but i remember that Zack Barresse suggesting a none vba way that involvde a special paste of some sort.

Well there you are...

Writing Formulas IS Computer Programming.

I want to qualify this post. It's 11.20 and I've seen away a few.

Having said that this is a idea that was suggested to me by a member of www.eusprig.org

Lets do a little role play easy Simon ;-).

My name is Dave. I went to Manchester Uni. for 3 years, I've work as a VB dev for 5 years since I graduated. At the mo I work of a ISP that produces a manufacturing scheduling package. I follow all the best practices and am up to date with all the latest knowledge. The software I write is exceptional; my product can give a 20% performance increase to most customers.

Then there's Paul,

The model needs to be in by 10. It's this years spend forcaste; I only started to work on this last night at about 8. I need to get all the data from last year and apply this years growth, and the new Asia figures and the effects of the new IT systems. I don't have the IT figures, but I should get them, if not I'll wack in some figures.

So there you go.

Now let me tell you that this.

Dave's customers have a annual turn over of 5 million pounds.
Paul's net spend is 30 million.

Here's my question, where would you want the skilled computer programmer?

My point?

Excel more than any other application, is the one where programming skills are required writing SS formulas IS computer programming.

Your opinions are welcomed