Excel users often find it convenient to use full-column references in formulas to avoid having to adjust the formulas every time new data is added. So when you write a User Defined Function (UDF) you can expect that sooner or later someone will try to use it with a full-column reference:
When Excel 2007 introduced the “Big Grid” with just over 1 million rows it became even more important to handle these full-column references efficiently. The standard way to handle this in a VBA UDF is to get the INTERSECT of the full-column reference and the used-range so that the UDF only has to process the part of the full-column that has actually been used. The example VBA code below does this intersection and then returns the smaller of the number of rows in the input range and the number of rows in the used range.
Public Function GetUsedRows(theRng As Range) Dim oRng As Range Set oRng = Intersect(theRng, theRng.Parent.UsedRange) GetUsedRows = oRng.Rows.Count End Function
The parent of theRng is the worksheet that contains it, so
theRng.Parent.UsedRange gets the used range of the worksheet you want.
Two problems with this technique are:
- Getting the Used Range can be slow.
- The XLL interface does not have a direct way to access the Used Range, so you have to get it via a single-thread-locked COM call. (More on this later).
So just how slow is it to get the used Range?
I created a very simple UDF and timed the calculation of 1000 calls to this UDF for filled used ranges of between 10K rows and 640K rows.
Public Function CountUsedRows() CountUsedRows = ActiveSheet.UsedRange.Rows.Count End Function
It turns out that the time taken to execute this UDF is a linear function of the number of used rows in the used range.
And its quite slow, 1000 calls to this UDF with 640K rows of data takes 33 seconds!
When the used range is small you won’t notice the time taken, but for large used ranges with the big grid you certainly will. And the problem is that your UDF will do this check on every range that is passed to the UDF, even if its not really needed.
Colin points out that what affects the time is actually the number of cells containing data or formatting (or that previously contained data or formatting) rather than the last cell in the used range.
Speeding up finding the used range.
So you could start by only doing the used-range check when theRng parameter has a large number of rows:
Public Function GetUsedRows2(theRng As Range) Dim oRng As Range If theRng.Rows.Count > 500000 Then Set oRng = Intersect(theRng, theRng.Parent.UsedRange) GetUsedRows = oRng.Rows.Count Else GetUsedRows = theRng.Rows.Count End If End Function
This example only does the check if the user gives the UDF a range referring to more than half a million rows.
Another, more complicated, way of minimising the time is to store the number of rows in the used range in a cache somewhere and retrieve it from the cache when needed. The tricky part of this is to make sure that the used-range row cache always is either empty (in which case go and get the number) or contains an up-to-date number.
One way of doing this would be to use the Application AfterCalculate event (which was introduced in Excel 2007) to empty the cache. Then only the first UDF that requested the used range for each worksheet would use time to find the used range, and (assuming that the calculation itself did nothing to alter the used range) the correct number would always be retrieved.
The equivalent for Excel versions before Excel 2007 would be to use the Application SheetCalculate event to empty the cache for that particular worksheet. This technique would be less efficient since a worsheet may well be calculated several times in each calculation cycle.
As Colin points out, if you want to find the last row containing data it is faster to use Range.Find when you have many cells containing data.
Note that you can only use Range.Find in UDFS in Excel 2002 and later, and you cannot use the Find method at all from an XLL except in a command macro or via COM.
Public Function CountUsedRows2() CountUsedRows2 = ActiveSheet.Cells.Find(What:="*", LookIn:=xlFormulas, SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row End Function