Writing efficient VBA UDFs (Part 1) – It ain’t what you do it’s the way that you do it

Its fairly easy to write a User Defined Excel function using VBA:

Suppose you want to write a function that calculates the average of a range of cells, but exclude from the average anything that is not a number or is less than a tolerance.
Lets call the Function AverageTol

Start Excel
Alt-F11 gets you to the Visual Basic Editor (VBE)
Insert–>Module
Enter the following VBA Code

Function AverageTol(theRange, dTol)
For Each Thing In theRange
If IsNumeric(Thing) Then
If Abs(Thing) > dTol Then
AverageTol = AverageTol + Thing
lCount = lCount + 1
End If
End If
Next Thing
AverageTol = AverageTol / lCount
End Function

The function loops through every cell in the range and, if the cell is a number greater than the tolerance, adds it to the total and increments a count. Finally it divides the total by the count and returns the result.
Now go back to the Excel worksheet, enter some data in cells A1:A10 and in B1 enter

=AverageTol(A1:A10 , 5)

That was pretty easy, and works well for 10 cells.
But if you have a lot of data, say 32000 cells, then 10 formulas using this UDF takes over 5 seconds to calculate on my fast PC (Intel I7 870 2.9 GHZ).

One major reason this is so slow is that I used all the defaults: I was lazy and did not declare any of the variables so they all defaulted to Variants.
Thats SLOW … but I can easily improve it: here is version A of AverageTol:

Function AverageTolA(theRange As Range, dTol As Double)
Dim oCell As Range
Dim lCount As Long
For Each oCell In theRange
If IsNumeric(oCell) Then
If Abs(oCell) > dTol Then
AverageTolA = AverageTolA + oCell
lCount = lCount + 1
End If
End If
Next oCell
AverageTolA = AverageTolA / lCount
End Function

This is the same function but with each variable declared as a sensible Type. This is good programming practice, and considerably faster.
10 formulas using this UDF on 32000 cells now calculates in 1.4 seconds, thats an improvement factor of 3.5 but still SLOW.
One reason its slow is that there is a large overhead each time a VBA program transfers data from an Excel cell to a VBA variable
And this function does that lots of times (3 times 32000).
If you transfer the data in one large block you can avoid much of this overhead:

Function AverageTolC(theRange As Range, dTol As Double)
Dim vArr As Variant
Dim v As Variant
Dim lCount As Long
'
On Error GoTo FuncFail
'
' get Range into a variant array
'
vArr = theRange
'
For Each v In vArr
If IsNumeric(v) Then
If Abs(v) > dTol Then
AverageTolC = AverageTolC + v
lCount = lCount + 1
End If
End If
Next v
AverageTolC = AverageTolC / lCount
Exit Function
FuncFail:
AverageTolC = CVErr(xlErrNA)
End Function

The statement vArr = theRange takes the values from all the cells in the Range and transfers it to a 2-dimensional Array of Variants. Then the UDF loops on each element of the Variant array. I also added an error handling trap that makes the UDF return #N/A if any unexpected error occurs.

Now the 10 formulas calculate in less than 0.1 seconds: thats an additional improvement factor of 14.

But we have’nt finished yet! Another speedup trick is to replace

vArr = theRange
with
vArr = theRange.Value2

That reduces the calculation time from 98 milliseconds (thousandths of a second) to 62 milliseconds. Using .Value2 rather than the default property (.Value) makes Excel do less processing (.Value checks to see if cells are formatted as Currency or Date, whereas .Value2 just treats all numbers including dates and currency as Doubles).

We can also make another small speedup by using Doubles rather than Variants wherever possible. Change the For Each v … Next v loop to:

Dim d as Double
Dim r as Double
On Error GoTo skip
For Each v In vArr
d = CDbl(v)
If Abs(d) > dTol Then
r = r + d
lCount = lCount + 1
End If
skip:
Next v

Now the calculation time has come down to 47 milliseconds.

So a series of small changes has improved the calculation speed of this simple UDF from 5.4 seconds to 0.047 seconds, 115 times faster!

About these ads
This entry was posted in UDF, VBA. Bookmark the permalink.

24 Responses to Writing efficient VBA UDFs (Part 1) – It ain’t what you do it’s the way that you do it

  1. Roger Govier says:

    Hi Charles

    As ever, absolutely fascinating to see the massive speed differences that can be achieved by just doing things “right”.

    Excellent Blog – thanks for sharing it with us all.

  2. Pingback: Writing efficient VBA UDFs (Part 2) « Excel and UDF Performance Stuff

  3. Jon says:

    So how does the value2 work with strings or mixed date/currency/double/strings? Is it still faster?

    Would value2 always be the best way to grab data from a range?

    • fastexcel says:

      Value2 works fine with all data types.
      The only reason I have ever found for using .Value (apart from the fact that its the default and sometimes I get lazy) is if you want to use the VBA IsDate() function and you are reading the cell value into a variant. In this case .Value will create a variant with a sub-type of Date and the IsDate() will work, whereas if you use .Value2 you would get a Variant with a sub-type of double and IsDate() will always return false..

  4. Jon says:

    Cool,

    So when you don’t specify .value or .value2 does it default to .value internally (as your example above shows). Just want to be clear on that.

    Your blog is a real help on understanding Excel better. I’ve read many Excel books but I haven’t read anything like this on fine tuning (value vs value2) like appears on your blog. Thanks for blogging.

  5. Oliver says:

    Hi all

    I am a dunce with UDFs and would like to make my code run faster. Is there anything I can add to the below formula to achieve that?

    Thank you

    Function BB(RngA As Range, RngB As Range)

    Dim Avg As Double
    Dim Stdev As Double
    Dim ND As Double
    Dim RngC As Variant

    Avg = WorksheetFunction.Average(RngB)
    Stdev = WorksheetFunction.Stdev(RngB)
    ND = WorksheetFunction.NormDist(RngA, Avg, Stdev, 1)

    If ND 0.995 Then BB = 1
    If ND >= 0.005 And ND <= 0.995 Then BB = 0

    End Function

  6. fastexcel says:

    Looks OK to me (assuming you aren’t using whole column refs when you call it). Probably most of the time is being used in NormDist.

  7. ryank says:

    Man! This one tutorial helped me cut a function from taking 98 seconds to run to only 6 seconds! Thank you!

  8. Johny Why says:

    hi, awesome articles! But your
    >
    symbol is displayed as
    >
    in my browser. Any way i can make it render correctly?
    thx!

  9. fastexcel says:

    HTML is wonderful …
    Thanks: I fixed it.

  10. fastexcel says:

    You can’t Transpose an entire column of rows: that would exceed Excel’s max number of columns.

    • Johny Why says:

      you mean max rows? this works without error:
      aryTranspose = WorksheetFunction.Transpose(Range(“$C:$C”))
      it only captures the first 65536 rows, which is fine for my purpose.
      this also works:
      aryTranspose = WorksheetFunction.Transpose(Range(“C1:C65535″).Value2)

  11. fastexcel says:

    I think Range(C:C) is a range object but Range(C:C).Value2 is an array of values; different error handling for different things.
    As a matter of interest why do you want to use Transpose? About the only time I use it is with dynamic arrays when you want to redim preserve the first dimension.

  12. Johny Why says:

    The answer would be pretty off-topic :/ Plus i’m exploring a different solution now– which is even more off-topic! The current issue is: Using Data Validation, list-style, which points to a Defined Name, which points to a VBA UDF. It basically works (and it’s a nifty trick to point Validation to a UDF :) Problem is, my UDF contains .SpecialCells(xlCellTypeConstants), and THAT’S not working. Passing an entire column in returns the same entire column, instead of just the constants…… :(

  13. fastexcel says:

    Yes, SpecialCells is not good in UDFs.
    Might be worth looking at these posts for ideas on whole-column refs and UDFs

    http://fastexcel.wordpress.com/2012/12/02/writing-efficient-udfs-part-11-full-column-references-in-udfs-used-range-is-slow/

    and the following post

  14. fastexcel says:

    SpecialCells is one of several things that does not work in UDFs (and its not very reliable in Subs with large amounts of data in Excel 2007 and later)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s