XL DNA

Converting a VBA UDF to an Excel DNA UDF

In the last post we saw how to create a UDF in Excel DNA, now let’s look at how to convert an existing UDF into DNA one. Obviously every UDF is different, but hopefully this example will show the process needed, and help you get started in converting your own.

Having worked through this UDF to write the post, I’ve come to the conclusion that its not a brilliant example, so if you have a UDF you’d like to convert why not post the code in the comments and if it looks like a good candidate for conversion I’ll do a post on it.

The UDF we will use is taken from J-walks site [here], I’m using this function because although its not very useful it was a) around at the time (I used this at the UK Excel Dev Con earlier this year), and B) it has some aspects that require a change from VBA to dot Net.

This is the code that John used:

Sub TimeTest()
'100 million random numbers, tests, and math operations
Dim x As Long
Dim StartTime As Single
Dim i As Long
x = 0 StartTime = Timer For i = 1 To 100000000
If Rnd <= 0.5 Then x = x + 1 Else x = x - 1
Next i
MsgBox Timer - StartTime & " seconds"
End Sub

As you can see this is not actually a UDF, so we’re off to a good start! No matter lets convert the above to a VBA UDF:

Public Function RandomWalkVBA()
'100 million random numbers, tests, and math operations
Dim x As Long Dim i As Long
x = 0
For i = 1 To 100000000
If Rnd <= 0.5 Then x = x + 1 Else x = x - 1
Next i
RandomWalkVBA = x
End Function

So now this function will spit out some number after a given amount of time – that is the time it takes to do 100,000000 loops!

Now let’s make this a DNA function, building on the last blog post lets swap out the “My First Function” code and drop in the VBA code and see what happens, the DNA file should contain the following:

<DnaLibrary>
 <![CDATA[
 Public Module Mod1
 Public Function RandomWalkVBA()
 '100 million random numbers, tests, and math operations
 Dim x As Long
 Dim i As Long
 x = 0
 For i = 1 To 100000000
 If Rnd <= 0.5 Then x = x + 1 Else x = x - 1
 Next i
 RandomWalkVBA = x
 End Function
 End Module
 ]]>
 </DnaLibrary>

When the function is done it returns a #VALUE! error, why is this? First thing is that the code has compiled OK, and loaded in to Excel, so it’s unlikely to be any of the XML tags or module declarations. Likewise the function is in the wizard so the function declaration is OK. “Dim X as Long” is fairly straight forward, as are loops; so possible causes of the error, but unlikely. That only really leaves:

If Rnd <= 0.5 Then x = x + 1 Else x = x – 1

Looking at this it is conceivable that this could be a problem. If we look at the code we see the Rnd function, which is a VBA method.  Also the Then, Else, on one line might not work in VB.Net?  Lets comment out that line and get the function to return a string to test our theory.

Close Excel, open the text file, comment out the “IF Rnd” line and change “RandomWalkVBA = x” to “ RandomWalkVBA = “The Function Worked” ”, the DNA file should look like this:

<DnaLibrary>
 <![CDATA[
 Public Module Mod1
 Public Function RandomWalkVBA()
 '100 million random numbers, tests, and math operations
 Dim x As Long
 Dim i As Long
 x = 0
 For i = 1 To 100000000
 'If Rnd <= 0.5 Then x = x + 1 Else x = x - 1
 Next i
 RandomWalkVBA = "The Function Worked"
 End Function
 End Module    ]]>
 </DnaLibrary>

Close the DNA file and load the XLL again, then enter the function. Bingo, the text is displayed, we have found the error. So what to do about it? We can test to see if “Then If” will work, just by changing Rnd to a number, if we do then the function should return the text again, you can test it it you like, but I can tell you it will work ;-). So that just leave the Rnd Function. Lets try it with the rnd function in and the text as the result to check that it’s causing the error, – hun, it worked! So the problem is not in fact the Rnd function, it must be the returning call.

Why do return calls fail? Our experience would tell us its either some bug in the code, or the thing you’re returning at the end of the code is not correct. The variable X is declared as a Long, that seems to be working when the code is run, but not when it gets returned. Strings are returned OK, could it be that the issue is with the long type?

I’m going to save you the heart ache, but when VB went to dot Net, the types changed a bit, this was to make all the common languages (C++ dot Net, C#, F#, VB.Net etc.) work on the same level. Consequently a Long in VB.Net* is no longer the same as a Long it VBA. (Also Variant is gone, use Object instead), so now a long is an Int, but an Int is still and Int… see here for a list of types: http://www.thescarms.com/vbasic/VB6vsVBNet.aspx. Lets convert that Long into an Integer and see what happens!

<DnaLibrary>
 <![CDATA[
 Public Module Mod1
 Public Function RandomWalkVBA()
 '100 million random numbers, tests, and math operations
 Dim x As Integer
 Dim i As Long
 x = 0
 For i = 1 To 100000000
 If rnd <= 0.5 Then x = x + 1 Else x = x - 1
 Next i
 RandomWalkVBA = x
 End Function
 End Module    ]]>
 </DnaLibrary>

Bingo (again) it’s worked, we now have a fully function DNA port of the original VBA function and all we really needed to do was change the longs to Integers – amazing. But I didn’t waste your time, your functions may well be different and you might have to debug them along the way. In later instalments we’ll look at how to use the IDE to debug things, which if you have a large and complex function might be a batter way to go.

Get coding!

*VB.Net still has Long as a type but its 64 bits big, in VB6 a Long is 32 bits big, in dot Net an Integer is 32 bits big – that’s progress!

Excel DNA Hello World Function

OK, enough with the back story, lets get down to business.

In this post we are going to look at the simplest way to use Excel DNA. This methods is ideal if you want to write User Defined Functions (UDF), or want to copy a UDF from VBA to DNA for a quick and easy* performance improvement.

In this example we are going to use the XML file to hold our code, so there is no need to have an IDE installed. In following instalments we will look at how to use a IDE to write a compliled DLL and use that with Excel DNA.

First things first, we need to get hold of the Excel DNA files. Download the most up to date version of Excel DNA from the Excel DNA codeplex site here:

http://exceldna.codeplex.com/

Once you have down loaded the files, extract them to a new folder, called something like “XL DNA Tests”, there should be 2 folders in there, like this:

Open the distribution folder, find and copy the XLL file called “ExcelDNA.xll”, now paste that back in to your root folder, it should look like this:

Next create a new text file (use the short cut menu, right click>>New>>Text Document). Rename that text file to “MyFirstDNAAddin.DNA”, then remane the XLL file to the same name, but keep the .xll extention, i.e “MyFirstDNAAddin.xll”, it should all look like this:

Note that the 2 files have the same name. This is very important as it ties the code we are going to write in the DNA file with the XLL. When we load the XLL, it will be looking for code in a file with the same name as itself . Notice that for this to work the XLL and the DNA file must be on the same path – i.e. in the same folder, at the same level, like they are in this example.

Now time for the code. Open the DNA file (double click the icon it will open with notepad) and copy and paste the following code into the DNA file.

<DnaLibrary>
  <![CDATA[
     Public Module Mod1
        Function MyFirstFunction(byVal YourText As string)as string
          MyFirstFunction = YourText & ",...Hello World!"
        End Function
      End Module
   ]]>
</DnaLibrary>

Close the file, saving the changes and then double click the XLL file. This will open Excel, enable macros if required. With a new workbook open, open the function wizard, and find the category called “MyFirstDNAAddin”…

Select that category, and you will see your new function in the functions list, like this:

Click OK, and enter some text in the argument box, then hit OK, your function will appear in the active cell, something like this:

And that’s really all there is to it. To share this new function all you need to do is send out the XLL file and the DNA file to the end user, get them to save the file to their PC, then get them to add the addin to Excel via the addins manager in the Tools menu, henceforth the function will always be available to that user on that PC.

OK lets take a step back and look at what we just did.

The first 2 lines of code in the text file are:

<DnaLibrary>
  <![CDATA[

These are just XML element tags, they are used by Excel DNA to help it read in your code, they are not part of the function.

The next line opens a new Module and gives it a name “Mod1”, later this module is closed with “End Module”

Function MyFirstFunction(byVal YourText As string)as string
      'Other code here
End Module

The actual code that does the work are these three lines…

Function MyFirstFunction(byVal YourText As string)as string
      MyFirstFunction = YourText & ",...Hello World!"
End Function

“Function” Declares a new function, which we call “MyFirstFunction”, a bracket, then an argument is defined, another bracket, then the return type of the function. The next line returns the result to the function, which is the text you passed in the Argument “YourText” and the string, “,…Hello World!”. Finally "End Function" closes the function.

Then

]]>
</DnaLibrary>

Are just the closing parts for the XML trees.

Very simple stuff if you’ve written VBA I’m sure you’ll agree. In the next post will look at taking a VBA function and porting it over to Excel DNA, in real live so to speak…

*Generally VBA code transfers quite well in to VB.Net code, some times you need to change the syntax a  little, some times you need to use a dot net library etc., start with something small and simple and work up, you’ll soon get the hang of dot Net.

How does Excel DNA Work?

We’ve covered a lot of ground in the last two blog posts, now its time to start looking at Excel DNA in more detail. In this post we are going to look at how Excel DNA works. It’s important to have an idea of how the product works as this will help when we start writing solutions based on the technology.

Solutions written with Excel DNA work in the following way. The dot Net code is called by a standard C XLL, which is called by Excel. That’s all there is to it. It looks like this:

There are two ways that the bridge addin (the XLL) can use your dot Net code. In the simplest implementation you write your code in a basic XML file, this is why you can use Excel DNA with out even having an IDE installed. The DNA XLL will then pick up the XML file read out the code and compile it. It does this when the addin is first loaded, so there is a small amount of overhead here. The complier is part of the dot Net framework, so as long as you have the framework installed on the PC the code will run on, it will always work. The other method is where you provide a complied assembly* to the XLL. This methods requires you to write the code in some sort of IDE, compile it to a DLL and then you set a reference to that DLL in the XLM/DNA file, so there is one more file with this method. One of the many nice features about Excel DNA is that it provides a packing tool, so you can budel all these bits, and any other support file you need into a single XLL – which is of course super easy to deploy!

There one other very important component with Excel DNA and that’s the integration library (it’s called ExcelDna.Integration.dll). This is the brains of Excel DNA. It about 30 or so C# classes that deal with loading you managed code, implementing any attributes you have specified, creates RTD services if you have coded that, handles errors, sets up the ribbon/menu interfaces, gives up an Excel Application object/Instance etc… When you write a complied library you need to reference this library in that code, this step is not needed for the text file method. There’s a lot going on in the integration library, and its worth having a poke around in the source code once you get up and running with the product.

Well that it for now, next week, we’ll actually write some code and make an addin – exciting times!

*An assembly in dot Net is either a complied EXE or DLL, it basically another name for a library.

Excel DNA for dot Net 4 relased…

News just in, the lasted version of Excel DNA (0.27) has been released by Govert, and can be had here.

From the site:

Release Notes

This release adds support for the .NET Framework version 4: add a RuntimeVersion=”v4.0″ attribute to the DnaLibrary tag to load the .NET 4 runtime.
Additional changes:
  • XML-based hierarchical CommandBars updated (thanks to Bertrand).
  • Shadow copy support: add ShadowCopyFiles=”true” to the DnaLibrary tag.

This might inspire a change to C#!

Why Excel DNA?

So if you read the first post hopefully you’ll be convinced that using dot Net is a viable option for an Excel developer. The next logical question you face is what method should you use to deploy your dot Net code. Let’s start by looking at what options are available:

A Managed dll called from VBA
It’s easy enough to write a plain old dll, mark it COM callable and call it into Excel from VBA. In this way you can use of all of the dot Net stuff and don’t have to do to much in the way of interfaces or set up. There are different methods that can be used, Marshall Object is probably the best if you want to work with the Excel object model directly – see here for more details.

Managed COM addins
This is a like for like managed* equivalent of a classic COM addin written in VB6. It uses IDTExtensibility2 as the interface into Excel. It can do all the things that an unmanaged COM adding can do. You should note that COM addins need to be isolated, more details can be found in Dennis superb tutorial, here.

VTSO addins
This is the way Microsoft would like us to go. It’s a managed com addin, which has been specifically design to work with a single Office application type. A lot of the plumbing and set up stuff that you have to do with a Managed COM addin is done for you in VSTO. This comes at a price of making the addin not wholly compatible across different Office versions. (Although at the time of writing, VTSO in Visual Studio 2010 and dot Net 4 should change this.) Microsoft has been working hard to make VSTO much slicker in the last few years, and these days it’s a pretty good option. It won’t do UDF (easily) though, and you will need the professional version of Visual studio, or better.
You can find a fuller discussion on the differences between VSTO addin and Managed COM addin in PED2.

Addin Express
Addin Express is a company, they make a few different products to wrap up managed COM addins, and VSTO type addins and have an excellent reputation for making the developers life easier. Prices typically start from around £220 for a life time licence.

Other dot Net wrappers.
This includes Excel DNA and other products like XLW, Excel4Net, Managed XLL and the like**. I’ve grouped these together here, but there could be a separate entry of each one just as easily. Excel DNA, XLW and Excel4Net are free, Managed XLL is quite expensive, £1000 plus, I believe.

OK, so there the main options, but which one? Well to be honest they all have there pros and cons, and any of them would be a good way to start using dot Net code with Excel, but here are some things to consider.

Cost.
Most of these options are expensive. Managed Com addins can be written in SharpDevlop which is a free IDE, but shimming*** them is hard, if you use Visual studio you get the shim for free. VSTO also needs Visual Studio, Addin express cost some cash as does Managed XLL. XLW, Net4Excel, and Excel DNA are free, and as a result they might not be quite as fully featured as the paid for options.

Deployment.
VSTO and managed COM addin used to be extremely hard to deploy, VSTO is now much better. Certainly the environment and the control you have on that environment plays a part here. If you know all your target PC’s are set up in a particular way, with a particular Office version, the headaches reduce somewhat. Excel DNA is simple to deploy, it’s a XLL. Calling a managed COM dll isn’t to hard either. I can’t talk too much about all the others, so let assume they’ve all got good deployment stories.

Development.
As far as I know Excel DNA is the only one that will allow you to write code into a text file, and then compile that code at run time – this means you don’t even need an IDE to write a Excel DNA addin – how cool is that? So if you can’t get anything installed on your PC, you’re not beaten yet! (Although you will need the dot Net framework v2 or higher). Since all of these solutions can make use of some sort of IDE, then you get all the benefits of that. Addin Express, VSTO and Managed XLL also come with many other tools and wizards that make development much simpler and quicker.

Support.
All of the options are fairly well supported. Excel DNA has an active forum on Google groups, VSTO forums are common. There is some support on the internet for XLW, mostly from a quant community, and Addin Express and managed XLL have good support web pages, and offer extended support directly from the ISV.

Open source.
If you consider this to be important then you could factor it in. I’m not sure I would count it against something – it not being open – but it is really handy to be able to dig into the source code if you hit a problem, or need to expand/hack something.

Longevity.
The last thing you want to do is learn something new, then it dies 2 years down the line. Open source in theory is good from this perspective, as other developers can pick up the project if the original ones stop working on the code base. This doesn’t mean they will! Likewise if you buy into a “big” company then they will support the product for X years… but what does that actually mean? I think we could get ourselves in to knots on this one. The good news is that what ever you use you will be using the dot Net framework, so these skills are transferable.

Architecture.
I’m not going to go into the ins and outs of how each of these solutions work, but what I will say is that Excel DNA is one of the more flexible and agile ways to implement managed code into unmanaged applications. COM Interop**** isn’t. VSTO, Managed COM addins and indeed managed dll all use interop and a COM interfaces. Managed XLL uses the same methods to interface with Excel as Excel DNA, LXW is slightly different, but its still spits out a normal XLL at the end. Net4Excel is more of a manager of dot Net code, and works is a completely different way to the other products. Addin express does a few different things depending on what type of solution you are using, sometimes COM some times via the C interface similar to Excel DNA.

There might be other things to think about too, but I’ll leave it here for now, the important thing is to try and have at least some understanding of what the options are, and the pros and cons of each. Of course your situation might require something that only one or two products can offer, so your hand might be forced, but for most of us we have some flexibility.

Here’s my summary. If you want to get started, you might like to do that for free, and also create some solutions that can be used in the wild. In my view the easiest way to do that is with Excel DNA. There are some down sides to be aware of; some of the tooling is not as good in Excel DNA as in VSTO or Addin Express for example. Some of the features are missing, or not yet 100% stable, like they might be in Manged XLL for instance, but over all, pound for pound it’s a pretty good bet!

Now, that’s only my view, it might be wrong, I often am, you’re welcome to your own views, why not share them in the comments?

Thanks
Ross

* In this context – when talking about dot Net – Managed means that the code you write is “controlled” by the dot Net frame work. In practice what this means is there’s some part of the framework responsible for making sure your code runs smoothly, it catches small bugs and handles them in a way that won’t kill your app, it cleans up memory so you don’t have to, and lot of other stuff that I don’t know about!

** JetXLL and ex hale, don’t seem to be quite there yet, but I might be wrong…

***Shimming is where a managed component is interfaced by an unmanaged one. This is needed when working with Office applications because if the managed code in your addin errors, it can cause all the other managed code to stop working (if its not shimmed). VSTO has this build in, managed COM addins don’t, but there is a free template file for VS that adds this extra layer.

****COM Interop lets managed dot Net code be called and used by COM type code; in essence it converts types between native COM types and Managed types.