Pedro wants to know how to speed up his UDF, which needs to calculate results for 35040 cells the minimum difference between the cell and a column of values of unknown length.
Pedro’s UDF
Function MinofDiff(r1 As Long) As Variant
Dim r2 As Range
Dim TempDif As Variant
Dim TempDif1 As Variant
Dim j As Long
Dim LastRow As Long
On Error GoTo FuncFail
If r1 = 0 Then GoTo skip
With Sheets("Dados")
LastRow = .Cells(.Rows.Count, "P").End(xlUp).Row
Set r2 = .Range("P8", "P" & LastRow)
End With
TempDif1 = Application.Max(r2)
For j = 1 To LastRow – 7
If r1 >= r2(j) Then
TempDif = r1 – r2(j)
Else
TempDif = r1
End If
MinofDiff = Application.Min(TempDif, TempDif1)
TempDif1 = MinofDiff
Next j
skip:
Exit Function
FuncFail:
MinofDiff = CVErr(xlErrNA)
End Function
There is a fundamental problem with Pedro’s UDF: it is referencing a range in column P without passing it in as a parameter, so if anything changes in column P the UDF could give the wrong answer because Excel will not recalculate it. Pedro has done this so that the UDF can dynamically adjust to the number of entries in column P.
On test data with 60000 entries in column P 20 calls to the UDF take 18.5 seconds on my laptop, so 34K calls would take about 9 hours to calculate! So why is it so slow?
- Every time the function is called (35K times) it finds the last row and the MAX value in column P: but this only needs to be done once.
- 35040 calls will hit the VBE refresh slowdown bug: so we need to bypass that.
- The For loop is referencing each cell value in column P (using R2(j) ) twice. Each reference to a cell is slow because there is a large overhead for each call out to the Excel object model.
- The UDF uses Worksheetfunction.Min to find out which of 2 values is smaller: its much quicker to compare the values using VBA If than invoking a worksheet function.
The revised UDF
To solve the fundamental problem with the UDF I will pass it an additional parameter: a whole column reference to column P. Then the UDF can resize the range to the last cell containing data. (Another alternative would be to create a Dynamic Named Range for column P and pass that as a parameter.
To solve the first 2 slowdown problems the UDF will be made into an array formula UDF that returns an array of 35040 results.
To avoid referencing each cell in column P twice inside the loop, the UDF will get all the values from column P once, into a variant array and then loop on the variant array.
Function MinofDiff2(R1 As Range, R2 As Range) As Variant
Dim R2Used As Range
Dim vArr2 As Variant
Dim vArr1 As Variant
Dim vOut() As Double
Dim TempDif As Double
Dim TempDif1 As Double
Dim D1 As Double
Dim D2 As Double
Dim TMax As Double
Dim j1 As Long
Dim j2 As Long
Dim LastRow As Long
'
On Error GoTo FuncFail
'
' handle full column
'
LastRow = R2.Cells(R2.Rows.Count, 1).End(xlUp).Row
Set R2Used = R2.Resize(LastRow - 7, 1).Offset(7, 0)
'
' get values into arrays
'
vArr2 = R2Used.Value2
vArr1 = R1.Value2
'
' find max
'
TMax = Application.Max(R2Used)
'
' set output array to same size as R1
'
ReDim vOut(1 To UBound(vArr1), 1)
'
' loop on R1
'
For j1 = 1 To UBound(vArr1)
TempDif1 = TMax
D1 = vArr1(j1, 1)
'
' loop on R2
'
For j2 = 1 To (LastRow - 7)
D2 = vArr2(j2, 1)
If D1 >= D2 Then
TempDif = D1 - D2
Else
TempDif = D1
End If
If TempDif < TempDif1 Then
vOut(j1, 1) = TempDif
Else
vOut(j1, 1) = TempDif1
End If
TempDif1 = vOut(j1, 1)
Next j2
Next j1
MinofDiff2 = vOut
skip:
Exit Function
FuncFail:
MinofDiff2 = CVErr(xlErrNA)
End Function
Because this is an array function you need to select the 35040 cells that you want to contain the answer, then type the formula into the formula bar =MinofDiff2(A1:A35040,P:P) and then press Ctrl/Shift/Enter to enter the formula as an array formula into the 35040 cells.
This revised UDF takes .222 seconds for 20 values, and completes the 35040 UDF calculations in 6.25 minutes, a speedup factor of over 80.
Updated with Harlan Grove’s suggestions
Harlan Grove has pointed out several ways of speeding up the UDF. Here is a revised version implementing most of his suggestions. It is about 17% faster than my original version.
Function MinofDiff3(R1 As Range, R2 As Range) As Variant
Dim R2Used As Range
Dim vArr2 As Variant
Dim vArr1 As Variant
Dim vOut() As Double
Dim TempDif As Double
Dim TempDif1 As Double
Dim D1 As Double
Dim D2 As Double
Dim TMax As Double
Dim TMin As Double
Dim j1 As Long
Dim j2 As Long
Dim LastRow As Long
'
On Error GoTo FuncFail
'
' handle full column
'
LastRow = R2.Cells(R2.Rows.Count, 1).End(xlUp).Row - 7
Set R2Used = R2.Resize(LastRow, 1).Offset(7, 0)
'
' get values into arrays
'
vArr2 = R2Used.Value2
vArr1 = R1.Value2
'
' find max & Min
'
TMax = Application.Max(R2Used)
TMin = Application.Min(R2Used)
'
' set output array to same size as R1
'
ReDim vOut(1 To UBound(vArr1), 1)
'
' loop on R1
'
For j1 = 1 To UBound(vArr1)
TempDif1 = TMax
D1 = vArr1(j1, 1)
TempDif = D1 - TMax
If D1 > TMax Then
If TempDif < TMax Then
vOut(j1, 1) = TempDif
Else
vOut(j1, 1) = TMax
End If
Else
If D1 < TMin Then
vOut(j1, 1) = D1
Else
'
' loop on R2
'
For j2 = 1 To LastRow
D2 = vArr2(j2, 1)
If D1 >= D2 Then
TempDif = D1 - D2
Else
TempDif = D1
End If
If TempDif < TempDif1 Then TempDif1 = TempDif
vOut(j1, 1) = TempDif1
Next j2
End If
End If
Next j1
MinofDiff3 = vOut
skip:
Exit Function
FuncFail:
MinofDiff3 = CVErr(xlErrNA)
End Function
Harlan also points out that a version using QuickSort to sort R2 and Binary Search instead of the loop would be an order of magnitude faster!