Its fairly easy to write a User Defined Excel function using VBA:
Suppose you want to write a function that calculates the average of a range of cells, but exclude from the average anything that is not a number or is less than a tolerance.
Lets call the Function AverageTol
Start Excel
Alt-F11 gets you to the Visual Basic Editor (VBE)
Insert–>Module
Enter the following VBA Code
Function AverageTol(theRange, dTol)
For Each Thing In theRange
If IsNumeric(Thing) Then
If Abs(Thing) > dTol Then
AverageTol = AverageTol + Thing
lCount = lCount + 1
End If
End If
Next Thing
AverageTol = AverageTol / lCount
End Function
The function loops through every cell in the range and, if the cell is a number greater than the tolerance, adds it to the total and increments a count. Finally it divides the total by the count and returns the result.
Now go back to the Excel worksheet, enter some data in cells A1:A10 and in B1 enter
=AverageTol(A1:A10 , 5)
That was pretty easy, and works well for 10 cells.
But if you have a lot of data, say 32000 cells, then 10 formulas using this UDF takes over 5 seconds to calculate on my fast PC (Intel I7 870 2.9 GHZ).
One major reason this is so slow is that I used all the defaults: I was lazy and did not declare any of the variables so they all defaulted to Variants.
Thats SLOW … but I can easily improve it: here is version A of AverageTol:
Function AverageTolA(theRange As Range, dTol As Double)
Dim oCell As Range
Dim lCount As Long
For Each oCell In theRange
If IsNumeric(oCell) Then
If Abs(oCell) > dTol Then
AverageTolA = AverageTolA + oCell
lCount = lCount + 1
End If
End If
Next oCell
AverageTolA = AverageTolA / lCount
End Function
This is the same function but with each variable declared as a sensible Type. This is good programming practice, and considerably faster.
10 formulas using this UDF on 32000 cells now calculates in 1.4 seconds, thats an improvement factor of 3.5 but still SLOW.
One reason its slow is that there is a large overhead each time a VBA program transfers data from an Excel cell to a VBA variable
And this function does that lots of times (3 times 32000).
If you transfer the data in one large block you can avoid much of this overhead:
Function AverageTolC(theRange As Range, dTol As Double)
Dim vArr As Variant
Dim v As Variant
Dim lCount As Long
'
On Error GoTo FuncFail
'
' get Range into a variant array
'
vArr = theRange
'
For Each v In vArr
If IsNumeric(v) Then
If Abs(v) > dTol Then
AverageTolC = AverageTolC + v
lCount = lCount + 1
End If
End If
Next v
AverageTolC = AverageTolC / lCount
Exit Function
FuncFail:
AverageTolC = CVErr(xlErrNA)
End Function
The statement vArr = theRange takes the values from all the cells in the Range and transfers it to a 2-dimensional Array of Variants. Then the UDF loops on each element of the Variant array. I also added an error handling trap that makes the UDF return #N/A if any unexpected error occurs.
Now the 10 formulas calculate in less than 0.1 seconds: thats an additional improvement factor of 14.
But we have’nt finished yet! Another speedup trick is to replace
vArr = theRange
with
vArr = theRange.Value2
That reduces the calculation time from 98 milliseconds (thousandths of a second) to 62 milliseconds. Using .Value2 rather than the default property (.Value) makes Excel do less processing (.Value checks to see if cells are formatted as Currency or Date, whereas .Value2 just treats all numbers including dates and currency as Doubles).
We can also make another small speedup by using Doubles rather than Variants wherever possible. Change the For Each v … Next v loop to:
Dim d as Double
Dim r as Double
On Error GoTo skip
For Each v In vArr
d = CDbl(v)
If Abs(d) > dTol Then
r = r + d
lCount = lCount + 1
End If
skip:
Next v
Now the calculation time has come down to 47 milliseconds.
So a series of small changes has improved the calculation speed of this simple UDF from 5.4 seconds to 0.047 seconds, 115 times faster!
June 1, 2011 at 11:48 am |
Hi Charles
As ever, absolutely fascinating to see the massive speed differences that can be achieved by just doing things “right”.
Excellent Blog – thanks for sharing it with us all.
June 6, 2011 at 2:12 pm |
Hi Roger,
Thanks, the plan is to do a number of posts on VBA UDFs, interspersed by anything else that takes my fancy.
June 6, 2011 at 2:34 pm |
[...] writing a UDF the same way as in Writing VBA UDFs Efficiently Part 1 you get this (ignoring error handling etc. for the sake of [...]
June 6, 2011 at 6:18 pm |
So how does the value2 work with strings or mixed date/currency/double/strings? Is it still faster?
Would value2 always be the best way to grab data from a range?
June 6, 2011 at 7:35 pm |
Value2 works fine with all data types.
The only reason I have ever found for using .Value (apart from the fact that its the default and sometimes I get lazy) is if you want to use the VBA IsDate() function and you are reading the cell value into a variant. In this case .Value will create a variant with a sub-type of Date and the IsDate() will work, whereas if you use .Value2 you would get a Variant with a sub-type of double and IsDate() will always return false..
June 6, 2011 at 8:37 pm |
Cool,
So when you don’t specify .value or .value2 does it default to .value internally (as your example above shows). Just want to be clear on that.
Your blog is a real help on understanding Excel better. I’ve read many Excel books but I haven’t read anything like this on fine tuning (value vs value2) like appears on your blog. Thanks for blogging.
June 6, 2011 at 9:54 pm |
Yes it defaults to .Value
September 12, 2011 at 7:30 am |
Hi all
I am a dunce with UDFs and would like to make my code run faster. Is there anything I can add to the below formula to achieve that?
Thank you
Function BB(RngA As Range, RngB As Range)
Dim Avg As Double
Dim Stdev As Double
Dim ND As Double
Dim RngC As Variant
Avg = WorksheetFunction.Average(RngB)
Stdev = WorksheetFunction.Stdev(RngB)
ND = WorksheetFunction.NormDist(RngA, Avg, Stdev, 1)
If ND 0.995 Then BB = 1
If ND >= 0.005 And ND <= 0.995 Then BB = 0
End Function
September 12, 2011 at 9:34 am |
Looks OK to me (assuming you aren’t using whole column refs when you call it). Probably most of the time is being used in NormDist.
September 12, 2011 at 12:35 pm |
Thank you so much Charles.
August 28, 2012 at 2:33 pm |
Man! This one tutorial helped me cut a function from taking 98 seconds to run to only 6 seconds! Thank you!