Its fairly easy to write a User Defined Excel function using VBA:
Suppose you want to write a function that calculates the average of a range of cells, but exclude from the average anything that is not a number or is less than a tolerance.
Lets call the Function AverageTol
Alt-F11 gets you to the Visual Basic Editor (VBE)
Enter the following VBA Code
Function AverageTol(theRange, dTol) For Each Thing In theRange If IsNumeric(Thing) Then If Abs(Thing) > dTol Then AverageTol = AverageTol + Thing lCount = lCount + 1 End If End If Next Thing AverageTol = AverageTol / lCount End Function
The function loops through every cell in the range and, if the cell is a number greater than the tolerance, adds it to the total and increments a count. Finally it divides the total by the count and returns the result.
Now go back to the Excel worksheet, enter some data in cells A1:A10 and in B1 enter
=AverageTol(A1:A10 , 5)
That was pretty easy, and works well for 10 cells.
But if you have a lot of data, say 32000 cells, then 10 formulas using this UDF takes over 5 seconds to calculate on my fast PC (Intel I7 870 2.9 GHZ).
One major reason this is so slow is that I used all the defaults: I was lazy and did not declare any of the variables so they all defaulted to Variants.
Thats SLOW … but I can easily improve it: here is version A of AverageTol:
Function AverageTolA(theRange As Range, dTol As Double) Dim oCell As Range Dim lCount As Long For Each oCell In theRange If IsNumeric(oCell) Then If Abs(oCell) > dTol Then AverageTolA = AverageTolA + oCell lCount = lCount + 1 End If End If Next oCell AverageTolA = AverageTolA / lCount End Function
This is the same function but with each variable declared as a sensible Type. This is good programming practice, and considerably faster.
10 formulas using this UDF on 32000 cells now calculates in 1.4 seconds, thats an improvement factor of 3.5 but still SLOW.
One reason its slow is that there is a large overhead each time a VBA program transfers data from an Excel cell to a VBA variable
And this function does that lots of times (3 times 32000).
If you transfer the data in one large block you can avoid much of this overhead:
Function AverageTolC(theRange As Range, dTol As Double) Dim vArr As Variant Dim v As Variant Dim lCount As Long ' On Error GoTo FuncFail ' ' get Range into a variant array ' vArr = theRange ' For Each v In vArr If IsNumeric(v) Then If Abs(v) > dTol Then AverageTolC = AverageTolC + v lCount = lCount + 1 End If End If Next v AverageTolC = AverageTolC / lCount Exit Function FuncFail: AverageTolC = CVErr(xlErrNA) End Function
The statement vArr = theRange takes the values from all the cells in the Range and transfers it to a 2-dimensional Array of Variants. Then the UDF loops on each element of the Variant array. I also added an error handling trap that makes the UDF return #N/A if any unexpected error occurs.
Now the 10 formulas calculate in less than 0.1 seconds: thats an additional improvement factor of 14.
But we have’nt finished yet! Another speedup trick is to replace
vArr = theRange
vArr = theRange.Value2
That reduces the calculation time from 98 milliseconds (thousandths of a second) to 62 milliseconds. Using .Value2 rather than the default property (.Value) makes Excel do less processing (.Value checks to see if cells are formatted as Currency or Date, whereas .Value2 just treats all numbers including dates and currency as Doubles).
We can also make another small speedup by using Doubles rather than Variants wherever possible. Change the For Each v … Next v loop to:
Dim d as Double Dim r as Double On Error GoTo skip For Each v In vArr d = CDbl(v) If Abs(d) > dTol Then r = r + d lCount = lCount + 1 End If skip: Next v
Now the calculation time has come down to 47 milliseconds.
So a series of small changes has improved the calculation speed of this simple UDF from 5.4 seconds to 0.047 seconds, 115 times faster!