Its fairly easy to write a User Defined Excel function using VBA:
Suppose you want to write a function that calculates the average of a range of cells, but exclude from the average anything that is not a number or is less than a tolerance.
Lets call the Function AverageTol
Start Excel
Alt-F11 gets you to the Visual Basic Editor (VBE)
Insert–>Module
Enter the following VBA Code
Function AverageTol(theRange, dTol) For Each Thing In theRange If IsNumeric(Thing) Then If Abs(Thing) > dTol Then AverageTol = AverageTol + Thing lCount = lCount + 1 End If End If Next Thing AverageTol = AverageTol / lCount End Function
The function loops through every cell in the range and, if the cell is a number greater than the tolerance, adds it to the total and increments a count. Finally it divides the total by the count and returns the result.
Now go back to the Excel worksheet, enter some data in cells A1:A10 and in B1 enter
=AverageTol(A1:A10 , 5)
That was pretty easy, and works well for 10 cells.
But if you have a lot of data, say 32000 cells, then 10 formulas using this UDF takes over 5 seconds to calculate on my fast PC (Intel I7 870 2.9 GHZ).
One major reason this is so slow is that I used all the defaults: I was lazy and did not declare any of the variables so they all defaulted to Variants.
Thats SLOW … but I can easily improve it: here is version A of AverageTol:
Function AverageTolA(theRange As Range, dTol As Double) Dim oCell As Range Dim lCount As Long For Each oCell In theRange If IsNumeric(oCell) Then If Abs(oCell) > dTol Then AverageTolA = AverageTolA + oCell lCount = lCount + 1 End If End If Next oCell AverageTolA = AverageTolA / lCount End Function
This is the same function but with each variable declared as a sensible Type. This is good programming practice, and considerably faster.
10 formulas using this UDF on 32000 cells now calculates in 1.4 seconds, thats an improvement factor of 3.5 but still SLOW.
One reason its slow is that there is a large overhead each time a VBA program transfers data from an Excel cell to a VBA variable
And this function does that lots of times (3 times 32000).
If you transfer the data in one large block you can avoid much of this overhead:
Function AverageTolC(theRange As Range, dTol As Double) Dim vArr As Variant Dim v As Variant Dim lCount As Long ' On Error GoTo FuncFail ' ' get Range into a variant array ' vArr = theRange ' For Each v In vArr If IsNumeric(v) Then If Abs(v) > dTol Then AverageTolC = AverageTolC + v lCount = lCount + 1 End If End If Next v AverageTolC = AverageTolC / lCount Exit Function FuncFail: AverageTolC = CVErr(xlErrNA) End Function
The statement vArr = theRange takes the values from all the cells in the Range and transfers it to a 2-dimensional Array of Variants. Then the UDF loops on each element of the Variant array. I also added an error handling trap that makes the UDF return #N/A if any unexpected error occurs.
Now the 10 formulas calculate in less than 0.1 seconds: thats an additional improvement factor of 14.
But we have’nt finished yet! Another speedup trick is to replace
vArr = theRange
with
vArr = theRange.Value2
That reduces the calculation time from 98 milliseconds (thousandths of a second) to 62 milliseconds. Using .Value2 rather than the default property (.Value) makes Excel do less processing (.Value checks to see if cells are formatted as Currency or Date, whereas .Value2 just treats all numbers including dates and currency as Doubles).
We can also make another small speedup by using Doubles rather than Variants wherever possible. Change the For Each v … Next v loop to:
Dim d as Double Dim r as Double On Error GoTo skip For Each v In vArr d = CDbl(v) If Abs(d) > dTol Then r = r + d lCount = lCount + 1 End If skip: Next v
Now the calculation time has come down to 47 milliseconds.
So a series of small changes has improved the calculation speed of this simple UDF from 5.4 seconds to 0.047 seconds, 115 times faster!
Hi Charles
As ever, absolutely fascinating to see the massive speed differences that can be achieved by just doing things “right”.
Excellent Blog – thanks for sharing it with us all.
Hi Roger,
Thanks, the plan is to do a number of posts on VBA UDFs, interspersed by anything else that takes my fancy.
Pingback: Writing efficient VBA UDFs (Part 2) « Excel and UDF Performance Stuff
So how does the value2 work with strings or mixed date/currency/double/strings? Is it still faster?
Would value2 always be the best way to grab data from a range?
Value2 works fine with all data types.
The only reason I have ever found for using .Value (apart from the fact that its the default and sometimes I get lazy) is if you want to use the VBA IsDate() function and you are reading the cell value into a variant. In this case .Value will create a variant with a sub-type of Date and the IsDate() will work, whereas if you use .Value2 you would get a Variant with a sub-type of double and IsDate() will always return false..
Cool,
So when you don’t specify .value or .value2 does it default to .value internally (as your example above shows). Just want to be clear on that.
Your blog is a real help on understanding Excel better. I’ve read many Excel books but I haven’t read anything like this on fine tuning (value vs value2) like appears on your blog. Thanks for blogging.
Yes it defaults to .Value
Hi all
I am a dunce with UDFs and would like to make my code run faster. Is there anything I can add to the below formula to achieve that?
Thank you
Function BB(RngA As Range, RngB As Range)
Dim Avg As Double
Dim Stdev As Double
Dim ND As Double
Dim RngC As Variant
Avg = WorksheetFunction.Average(RngB)
Stdev = WorksheetFunction.Stdev(RngB)
ND = WorksheetFunction.NormDist(RngA, Avg, Stdev, 1)
If ND 0.995 Then BB = 1
If ND >= 0.005 And ND <= 0.995 Then BB = 0
End Function
Looks OK to me (assuming you aren’t using whole column refs when you call it). Probably most of the time is being used in NormDist.
Thank you so much Charles.
Man! This one tutorial helped me cut a function from taking 98 seconds to run to only 6 seconds! Thank you!
hi, awesome articles! But your
>
symbol is displayed as
>
in my browser. Any way i can make it render correctly?
thx!
…displayed as
& g t ;
(without spaces)
HTML is wonderful …
Thanks: I fixed it.
thx!
can Value2 be used with Transpose? I’m getting an error on:
aryTranspose = WorksheetFunction.Transpose(Range(“$C:$C”))
i meant:
aryTranspose = WorksheetFunction.Transpose(Range(“$C:$C”).Value2)
You can’t Transpose an entire column of rows: that would exceed Excel’s max number of columns.
you mean max rows? this works without error:
aryTranspose = WorksheetFunction.Transpose(Range(“$C:$C”))
it only captures the first 65536 rows, which is fine for my purpose.
this also works:
aryTranspose = WorksheetFunction.Transpose(Range(“C1:C65535”).Value2)
I think Range(C:C) is a range object but Range(C:C).Value2 is an array of values; different error handling for different things.
As a matter of interest why do you want to use Transpose? About the only time I use it is with dynamic arrays when you want to redim preserve the first dimension.
The answer would be pretty off-topic Plus i’m exploring a different solution now– which is even more off-topic! The current issue is: Using Data Validation, list-style, which points to a Defined Name, which points to a VBA UDF. It basically works (and it’s a nifty trick to point Validation to a UDF 🙂 Problem is, my UDF contains .SpecialCells(xlCellTypeConstants), and THAT’S not working. Passing an entire column in returns the same entire column, instead of just the constants…… 😦
Yes, SpecialCells is not good in UDFs.
Might be worth looking at these posts for ideas on whole-column refs and UDFs
and the following post
thx, i posted my current solution over on Part 11.
When you say “not good”, i think you’re referring to performance issues (as described in Part 11). But what i’m seeing just plain wrong results from SpecialCells, ignoring performance.
SpecialCells is one of several things that does not work in UDFs (and its not very reliable in Subs with large amounts of data in Excel 2007 and later)