Writing efficient VBA UDFs (Part5) – UDF Array Formulas go faster!

Just in case you thought the previous posts on writing efficient VBA UDfs (Part1, Part2, Part3, Part 4) meant we had finished making UDFs run faster, think again – its time to explore UDF Array Formulas.

Single and Multi-Cell Array Formulas

Excel array formulas can do amazing things. They are like ordinary formulas except that you enter them with Control/Shift/Enter rather than just enter.

There are two kinds of array formulae:

  • Single cell array formulae are entered into a single cell, loop through their arguments (which are often calculated arguments) and return a single answer.
  • Multi-cell array formulae are entered into multiple cells and return an answer to each of the cells.

With this power comes a cost: because array formulae are doing a lot of work they can be slow to calculate (particularly single-cell array formulas).

UDF Multi-cell Array Formulas go Faster!

You can break down the time taken by a VBA UDF into these components:

  • Overhead time to call the UDF.
  • Time to fetch the data thats going to be used by the UDF.
  • Time to do the calculations.
  • Overhead time to return the answer(s).

In the post on Excel VBA Read/Write timeings you could see that there was quite a significant overhead on each VBA read and write call, so that its usually much faster to read and write large blocks of data at a time.

So it sounds like a good idea to make your VBA UDF read as much data as possible in a single block and return data to Excel in as large a block as possible.
Enter the Multi-cell array formula – it does exactly that – and also minimises the calling overhead – and often it can read the data once and re-use it lots of times.

So how do you make a Multi-Cell Array formula?

Lets create an array version of the AverageTolE function shown in the first Writing Efficient VBA UDFs post.
The scenario is that you want to find the Averages of the data excluding a number of different tolerances rather just one tolerance.
To keep things simple I am assuming that

  • the tolerances are all in one row
  • both the data and the tolerances will be supplied as ranges
  • error-handling is largely ignored
  • the function returns a row of answers that correspond to the row of tolerances.

Public Function AverageTolM(theRange As Range, theTols As Range) As Variant
Dim vArr As Variant
Dim vArrTols As Variant
Dim v As Variant
Dim d As Double
Dim r As Double
Dim k As Long
Dim vOut() As Variant
Dim dTol As Double
Dim lCount As Long
On Error GoTo FuncFail
vArr = theRange.Value2
vArrTols = theTols.Value2
ReDim vOut(1 To 1, 1 To UBound(vArrTols, 2))
On Error GoTo skip
For k = 1 To UBound(vArrTols, 2)
dTol = CDbl(vArrTols(1, k))
r = 0#
lCount = 0
For Each v In vArr
d = CDbl(v)
If Abs(d) > dTol Then
r = r + d
lCount = lCount + 1
End If
skip:
Next v
vOut(1, k) = r / lCount
Next k
AverageTolM = vOut
Exit Function
FuncFail:
AverageTolM = CVErr(xlErrNA)
End Function

The changes to the UDF are quite simple:

  • theTols range is coerced into a variant array: vArrTols = theTols.Value2
  • an output array of the same size is created: ReDim vOut(1 To 1, 1 To UBound(vArrTols, 2))
  • The UDF loops on the tolerance array and populates the output array
  • The output array is assigned to the function variable: AverageTolM = vOut

Note that the Function is declared as returning a variant (which will contain an array) rather than being declared as returning an array of variants.

Assuming that the data is in H27:AA27 then enter the array function with Ctrl/Shift/Enter as

{=AVERAGETOLM(Data!$A$1:$A$32000,$H$27:$AA$27)} into 20 rows (so we will get 20 x 20 = 400 cells of answers).

Calculating this 20 formulas takes 975 milliseconds.

Using the original AVERAGETOLE formula for the 400 cells takes 1660 milliseconds, an improvement factor of 1.7

Summary

  • In many real-life cases using multi-cell array UDFs can be the fastest way to calculate.
  • Converting a conventional UDF to a multi-cell array UDf is straightforward.
About these ads

13 Responses to “Writing efficient VBA UDFs (Part5) – UDF Array Formulas go faster!”

  1. Jon Says:

    So I did an array UDF and I’m getting different results, time wise, when I do Application.Calculation=manual vs automatic.

    When I test the time for automatic my array one goes faster, but when I don’t it goes slower than the non arrayed formulas. Also, when I compare it to the SumProduct function (the one I’m replacing) it goes twice as fast as it in automatic mode, but goes slower in manual mode.

    I’m thinking it might be because one of the dependents is a volatile function (Today()). Have you experienced this? Using my own function does seem like it is faster than SumProduct but (the delay doesn’t seem quite as long now when calculating) just wondering why the difference.

  2. fastexcel Says:

    There is a small bug with UDF array formulas that I have not yetmentioned: as it says on my website:
    http://www.decisionmodels.com/calcsecretsj.htm

    Note that Excel behaves unexpectedly when a multi-cell UDF is entered or modified and depends on volatile formulae: the UDF is evaluated once for each cell it occupies. This does not happen when the UDF is recalculated, only when it is entered or changed.

    Maybe thats whats happening to you?

  3. Jon Says:

    Thanks, that page was very useful.

  4. Pedro Says:

    Hi! First of all congratulations for your great job in this page!
    I did an UDF array formula but I don’t know how to improve this in order to enhance time in calculation.
    Basically, I need to compare one specific cell with an array that I don’t know previously its dimensions (I just know where it starts and that just have only one column). The minimum difference between these values (the cell and all array values) will be the final answer for me. The problem here is that I need to use this UDF array formula in 35040 cells (every 15 minutes of a year) and the calculation time in a normal pc is too slow.

    The code of that UDF that I am talking is following:

    Function MinofDiff(r1 As Long) As Variant

    Dim r2 As Range
    Dim TempDif As Variant
    Dim TempDif1 As Variant
    Dim j As Long
    Dim LastRow As Long

    On Error GoTo FuncFail
    If r1 = 0 Then GoTo skip

    With Sheets(“Dados”)
    LastRow = .Cells(.Rows.Count, “P”).End(xlUp).Row
    Set r2 = .Range(“P8″, “P” & LastRow)
    End With

    TempDif1 = Application.Max(r2)
    For j = 1 To LastRow – 7

    If r1 >= r2(j) Then
    TempDif = r1 – r2(j)
    Else
    TempDif = r1
    End If

    MinofDiff = Application.Min(TempDif, TempDif1)
    TempDif1 = MinofDiff
    Next j

    skip:
    Exit Function
    FuncFail:
    AverageTolM = CVErr(xlErrNA)

    End Function

    Please, could you help me to find some problems in writing of this short code and how can I gain time with that? I will really appreciate that. Thanks in advance.

    Best Regards,
    Pedro

    • Jon Says:

      Pedro, Good question for StackOverflow. You need to use arrays instead of ranges to loop through the data. Using ranges goes really slow. Also, you should look at more material written by Charles he has a ton of good information and if you follow his advice (i.e., read what he has written) you’ll see.

      Here’s an answer I gave that shows you how to use arrays that you grab from excel:
      http://stackoverflow.com/a/7263655/632495

    • fastexcel Says:

      Hi Pedro,
      I thought this made a nice example using the techniques outlined in a number of previous posts, so I made another post for you!

  5. Jim Cone Says:

    Try this…
    ‘—
    Function MinofDiff_R(r1 As Long) As Variant
    On Error GoTo FuncFail

    Dim r2 As Excel.Range
    Dim TempDif As Long
    Dim TempDif1 As Long
    Dim j As Long
    Dim LastRow As Long
    Dim vRngValues As Variant

    If r1 = 0 Then GoTo skip

    With ActiveSheet
    LastRow = .Cells(.Rows.Count, “P”).End(xlUp).Row
    Set r2 = .Range(“P8″, .Cells(LastRow, 16))
    End With
    vRngValues = r2.Value2
    TempDif1 = Excel.WorksheetFunction.Max(r2)

    For j = 1 To LastRow – 7
    If r1 >= vRngValues(j, 1) Then
    TempDif = r1 – vRngValues(j, 1)
    Else
    TempDif = r1
    End If
    If TempDif < TempDif1 Then MinofDiff_R = TempDif Else MinofDiff_R = TempDif1
    TempDif1 = MinofDiff_R
    Next 'j

    skip:
    Exit Function
    FuncFail:
    MinofDiff_R = CVErr(xlErrNA)
    End Function
    '—

    • fastexcel Says:

      Hi Jim,
      Looks good to me: I made an array formula version for my latest post as an example.

    • Pedro Says:

      Thank you so much Jim! It works fine as the array formula version that “fastexcel” did in his latest post. At least I solved the overflow problem that I had in my version.

  6. Ram Says:

    Can we import an access data table/query through EXCEL UDF

  7. Jon Says:

    Do you why array formulas for Excel functions are slower than regular formulas for Excel? Is this always the case? Chandoo.org was doing a fast Excel week and that is one of the suggestions, don’t use array formulas for Excel functions. I tested one formula and came up with it being slower for built in Excel functions, of course, for UDFs it was much faster to do the arrays.

  8. » Why UDF Array Functions are Faster Spreadsheet Budget and Consulting Says:

    [...] at FastExcel the answer was given why UDF arrays go faster to the question I had on why UDF array formulas are faster than regular formulas. I had forgotten [...]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Connecting to %s


Follow

Get every new post delivered to your Inbox.

Join 34 other followers

%d bloggers like this: