Writing Efficient UDFs Part 11 – Full-Column References in UDFs: Used Range is Slow

December 2, 2012

Excel users often find it convenient to use full-column references in formulas to avoid having to adjust the formulas every time new data is added. So when you write a User Defined Function (UDF) you can expect that sooner or later someone will try to use it with a full-column reference:

=MyUDF(A:A,42)

When Excel 2007 introduced the “Big Grid” with just over 1 million rows it became even more important to handle these full-column references efficiently. The standard way to handle this in a VBA UDF is to get the INTERSECT of the full-column reference and the used-range so that the UDF only has to process the part of the full-column that has actually been used. The example VBA code below does this intersection and then returns the smaller of the number of rows in the input range and the number of rows in the used range.

Public Function GetUsedRows(theRng As Range)
Dim oRng As Range
Set oRng = Intersect(theRng, theRng.Parent.UsedRange)
GetUsedRows = oRng.Rows.Count
End Function

The parent of theRng is the worksheet that contains it, so theRng.Parent.UsedRange gets the used range of the worksheet you want.

Two problems with this technique are:

  • Getting the Used Range can be slow.
  • The XLL interface does not have a direct way to access the Used Range, so you have to get it via a single-thread-locked COM call. (More on this later).

So just how slow is it to get the used Range?

I created a very simple UDF and timed the calculation of 1000 calls to this UDF for filled used ranges of between 10K rows and 640K rows.

Public Function CountUsedRows()
CountUsedRows = ActiveSheet.UsedRange.Rows.Count
End Function

It turns out that the time taken to execute this UDF is a linear function of the number of used rows in the used range.

Used_Range_Times

And its quite slow, 1000 calls to this UDF with 640K rows of data takes 33 seconds!

When the used range is small you won’t notice the time taken, but for large used ranges with the big grid you certainly will. And the problem is that your UDF will do this check on every range that is passed to the UDF, even if its not really needed.

Colin points out that what affects the time is actually the number of cells containing data or formatting (or that previously contained data or formatting) rather than the last cell in the used range.

Speeding up finding the used range.

So you could start by only doing the used-range check when theRng parameter has a large number of rows:

Public Function GetUsedRows2(theRng As Range)
Dim oRng As Range
If theRng.Rows.Count > 500000 Then
Set oRng = Intersect(theRng, theRng.Parent.UsedRange)
GetUsedRows = oRng.Rows.Count
Else
GetUsedRows = theRng.Rows.Count
End If
End Function

This example only does the check if the user gives the UDF a range referring to more than half a million rows.

Another, more complicated, way of minimising the time is to store the number of rows in the used range in a cache somewhere and retrieve it from the cache when needed. The tricky part of this is to make sure that the used-range row cache always is either empty (in which case go and get the number) or contains an up-to-date number.

One way of doing this would be to use the Application AfterCalculate event (which was introduced in Excel 2007) to empty the cache. Then only the first UDF that requested the used range for each worksheet would use time to find the used range, and (assuming that the calculation itself did nothing to alter the used range) the correct number would always be retrieved.

The equivalent for Excel versions before Excel 2007 would be to use the Application SheetCalculate event to empty the cache for that particular worksheet. This technique would be less efficient since a worsheet may well be calculated several times in each calculation cycle.

As Colin points out, if you want to find the last row containing data it is faster to use Range.Find when you have many cells containing data.
Note that you can only use Range.Find in UDFS in Excel 2002 and later, and you cannot use the Find method at all from an XLL except in a command macro or via COM.

Public Function CountUsedRows2()
CountUsedRows2 = ActiveSheet.Cells.Find(What:="*", LookIn:=xlFormulas, SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row
End Function

So have you got any better ideas on how to process full-column references efficiently?

Getting Cell data with VBA and C++: Benchmarking Excel 2013

November 9, 2012

There is a new kid on the block: Excel 2013. So its time to see how it shapes up for VBA performance against its predecessors. Along the way I will try to answer these questions:.

How does Excel 2013 VBA performance compare with previous versions?

Is 64-bit VBA faster or slower than 32-bit VBA?

Is there an optimum block size (number of cells) for getting cell data with VBA or C++?

How much faster is C++ than VBA when reading data from Cells?

Excel 2013 VBA Benchmark

Frequently the bottleneck in VBA processing time is reading and writing to Excel cells. So my Variant_Benchmark times this for a variety of cell sizes.

It turns out the relationship between time taken and the number of cells you get in a single block is pretty much linear over small ranges:
Time = overhead + number_of_cells * Time_per_cell

Running this test for on my Dell Studio XPS desktop (2.9 MHZ) for all the versions of Excel I currently have installed gives this table, (slightly overcoloured by the Excel 2013 quick analysis tool – but Quick Analysis is great for charting and formatting idiots like me).

The first 2 columns give times in milliseconds for reading and writing a single block of 512 cells, and the second 2 columns give times in milliseconds for writing a single cell 512 times.

Two things are immediately obvious:

  • Reading and writing cell-by-cell is at least an order of magnitude slower than reading and writing a large block of cells in  a single call.
  • Writing data back to Excel is much slower than reading data from Excel.

Looking at the variations by Excel versions it is interesting to see that Excel 97 is the fastest version for cell-by-cell but the slowest version for block by block.

Here is a bar chart (again produced by Excel 2013 Quick Analysis) for reading and writing a block of 512 cells:

Read times for a 512 block have been decreasing since XL 2003: and the current champion reader is Excel 2013 64-bit.
But Write times for a 512 block have been increasing since  XL 2000 – the current champion writer.

Looking at cell-by-cell you get this bar chart:

Reading and writing cell-by-cell just goes from bad (Excel 97) to worse (Excel 2010 32).
But its interesting to note that the 64-bit versions are faster than the 32-bit versions for 512 cell blocks, but slower for cell-by-cell.

Is there an optimum Block size for VBA data transfer?

Many excel versions ago (Excel 97?) I did an analysis which showed that if you increased the number of cells being read into the variant array too much the time started to increase. So I thought I would look at this again with the latest Excel versions.

So far I have not detected any decrease in efficiency: you can use as large a block of cells as you like (subject to memory limits of course).

What about C++ XLLs?

The XLL equivalent to transferring data from a range of excel cells to a variant array is the COERCE function. This has the additional benefit of letting you know if the attempt failed because of uncalculated cells.

And XLL Coerce is faster than VBA, by a factor of 2 or more.

Coerce does seem to have an optimum block size. Beyond about 1000 cells the time taken suddenly jumps by 25 to 30% in the 32-bit versions, except for XL 2013 64-bit where this does not happen.

Conclusions

  • Read/write using large Ranges is much more efficient than cell-by-cell
  • Excel 2013 VBA read/write speed is comparable to Excel 2010
  • The 64-bit versions are faster for large ranges than the 32-bit versions
  • VBA does not currently appear to have an optimum block size: the larger the range the better.
  • XLL Coerce is a lot faster than VBA
  • XLL Coerce 32-bit has an optimum block size just under 1000 cells

I have not tested .NET Interop with Excel 2013, but I expect its still the same depressing performance story. If you know anything different please let me know!

Multi-threading XLL functions – Evaluate fails

November 1, 2012

I have just about finished converting the first part of the FILTER.IFS function from VB6 to C++.
This first part uses modified binary search routines to handle multiple kinds of criteria (EQ, GT, LT, GE, LE, NE with AND and OR etc) on sorted columns.

The second part handles unsorted columns and additional criteria types such as Regex and Like. Some of this is done in the VB6 version of the function using EVALUATE to get the results of array formulas on subsets of the data. So the C++ XLL version uses the same technique but using the XLL equivalent xlfEvaluate.

The explanation here http://msdn.microsoft.com/en-us/library/office/bb687899.aspx seems to say that using xlfEvaluate is threadsafe as long as the expression being evaluated does not contain any non-threadsafe components.

But in practice using xlfEvaluate on even a simple formula string like =2+3 fails with a return code of 128 if the UDF function is flagged as multi-threaded, but works OK if the function is flagged as single-threaded.

At the moment this leaves me with a choice of either making FILTER.IFS single-threaded or using some alternative to xlfEvaluate. Both of these choices look bad.

So has anyone found a way of using xlfEvaluate inside a multi-threaded XLL function?

Table Extension recalculation bug in Excel 2010 but not 2007

October 11, 2012

gizzzzmo has found an unexpected recalculation bug in Excel 2010.

You can duplicate it using the following steps.

Start Excel 2010 with a new blank workbook in Automatic Calculation mode.

  • in A1 enter =MAX(A4:A5)
  • in A2 enter =A1
  • in A3:A5 create a Table (Insert tab->Table, check My Table has Headers)
  • in A4 enter 1
  • in A5 enter 2

It should now look like this:

Now enter 3 into cell A6 and the screen changes to this:

The Table has been extended to include cell A6 and the formula in cell A1 has automagically been changed to reflect this (  =MAX(A4:A6)  ) , so cell A1 correctly shows 3.

But cell A2 still shows 2, even though its formula references cell A1, which has changed, and we are in automatic calculation mode so it should show 3. Recalc BUG!

If you now change cell A6 to 4 BOTH cells A1 and A2 correctly get recalculated, so the dependency tree is still valid.

This bug does not exists in Excel 2007, but is still there in my installation of Excel 2013 Tech Preview.

If you create the Table before entering the =MAX() formula then Excel creates a structured Table reference =MAX(Table1[Column1])  and recalculation works correctly.

 

Mouse Eats Sail: Mouse Walks Plank

October 10, 2012

On Sunday, as a break from rewriting my mega VBA FilterIFs function into C, I went to MWYC to sail my Comet Xtra in the Sunday pursuit race. When I took the cover off and the sail out of the sail bag, I discovered that a mouse had built a nest in the sail, and eaten some holes in the luff and dismantled chunks of the zip that runs up the luff.

After inspecting the damage I managed to get rid of the mouse nest, hoist the sail and manouvre the zip fastener past the missing parts of the zip. It looked a bit ragged but since the wind was light I thought no more damage would be done by sailing, so launched the boat.

I was just about to leave the pontoon when I discovered that the mouse was hiding in the bottom of the boat.
After some encouragement mouse decided that walking the plank was preferable to staying in the boat with me, so hopped over the edge and boldly swam for the bank of the reservoir. I did not know mice could swim but this one was quite fast.

Needless to say I did not win the race, but I definitely have the best excuse: “Mouse eat my sail” ranks up there with “Dog eat my homework” as unbelievable but (very occasionally) true excuses.

Postscript:

No, I have not seen the mouse since it scrambled ashore.

It turns out that Hyde Sails no longer have a repair loft in the UK, and that the sails are actually made in Malaysia. But Comet Dinghies were very helpful and arranged for a new luff sleeve and zip to be sent to me, and hopefully next week I can persuade Jon Clarke at Edge Sails Coventry to replace the mouse-bitten part with the new one.

The SpeedTools Function library – Licensing, Pricing, Piracy and Ecommerce

September 13, 2012

We had a great time spending August in Norfolk (a 30-year tradition) with lots of sailing and walking.

When I got back in September I started wrestling with how to license, price, sell and fulfill the sale of the SpeedTools Function library.
(This is displacement activity from tackling coding the FILTER functions …)

Fight Scope Creep by Splitting!

The first decision was that scope creep (rapidly heading towards 100 additional Excel functions!) had got to the point where I needed to split SpeedTools into a number of products:

  • SpeedTools Calc – The extended calculation methods and options
  • SpeedTools Lookups – The fast Lookup and Comparison functions
  • SpeedTools Filters – The Filtering, Sorting and Distinct/Unique functions
  • SpeedTools Extra – the Math, Logical, Array, Text and Information functions
  • SpeedTools Premium – a bundle containing all the other 4 products

SpeedTools Calc contains all the User Interface stuff and is mostly a VBA addin, and the other 3 products are mostly contained in an XLL.

Choosing an E-Commerce provider

When I started selling FastExcel in 2001 we custom built the website and licensing system that took the orders, priced them, maintained the License database and linked to Worldpay for the credit card processing in multiple currencies, then automatically emailed out the License codes etc. The system has worked pretty well for 11 years but has some serious limitations and would need rewriting to handle multiple products.

So I started looking at the many available E-Commerce providers who handle software sales. I found a useful starting point here.
More research gave me a shortlist of Avangate and FastSpring who both seemed to tick most of my boxes:

  • Reasonable costs
  • Multiple currencies
  • Integration with licensing software
  • Multiple products
  • Volume Licensing
  • Credit Cards (including AMEX), PayPal, Purchase Orders, Money Transfer supported
  • Coupons, promotions, time-limited pricing

Licensing and Piracy

As we all know, its pretty easy to hack a VBA addin, and difficult to prevent casual copying of your XLA/XLAM product. For FastExcel I used an installer that required required an installation password, but I wanted a better defence against casual copying for SpeedTools. I decided to use a License activation system, and so looked at the systems that were already integrated (see FastSpring DRM) with my chosen E-Commerce provider FastSpring.

The system I chose can be integrated with Excel VBA and VB6 addins as well as .NET and C++ stuff. You can embed a time-limited trial license inside your code and then upgrade with one or more licenses. It uses Public/Private key encryption and you can choose what kind of machine binding you want to use.

To make this work you also need a web server for the license database that can dynamically link both to your E-Commerce provider and to your application. I did not want the cost and hassle of hosting and maintaining my own license server and so opted for a system that provides a hosted web license server for you.

If no valid license to any of the products is found then the products do not load. If there is at least one licensed product then the User Interface and Help for all the products is shown, but using for example a MEMLOOKUP function without a Lookups license will make the function return a message saying “No valid license found for this function” to the calling cells.

Is this hacker-proof? Probably not – nothing really is, but I think its more than enough to stop casual copying.

Implementing all this is quite complex because you have to integrate a number of things:

  • Your own Website to sell and link to the Ecommerce provider
  • setting up the E-commerce provider and linking it to the License activation server
  • creating an install package that delivers and installs all the requisite software and files and products
  • integrating the licensing software into your products, including the UI for managing the licenses.

Pricing

Deciding on the right price for your software product is impossible.

I started by looking at what prices other people charge for Excel addins, then considered how useful each product would be to someone who needed the function provided, and how much time and effort had been used to create them. For most of these products I could not find any real competitive products.

This gave me a range of base prices for the different products (currently $29 (Calc) to $69 (Lookups), but this might change!).

The next step was to consider volume discounts. There are 2 main scenarios to consider here for a function library:

  • An individual who wants a license that covers work and home (so 2 PCs)
  • Workbooks that will get used by a number of different people on different Machines.

My current thinking is to start with a very steep discount for 2 licenses (so that buying 2 costs only 30% more than buying 1) and then increase the discount progressively for larger volumes.

Conclusion

It has taken me about 2 weeks work so far, and there is probably another week needed to finish, but thats not too bad considering the complexity of the task and that (hopefully) the system will be in use for the next ten years or so.

So what level do you think I should price at?

And how do you feel about license activation?

 

From VBA to C Part 9 – the additional Types of UDF you can create in an XLL

July 20, 2012

In VBA you can create a UDF that returns any VBA datatype, but there are really only 2 kinds of UDFs: Volatile and Non-Volatile. In the C XLL UDF world there are several other types that you can create.

Features in the XLL Plus Function Wizard

The features tab in the XLL Plus function wizard shows you many of the function types you can create just by clicking a checkbox or an option button:

Availability

A function can either be a worksheet function (UDF) or a Command Macro (think VBA Sub or XLM command macro, cannot be called as a UDF) or a Hidden Function (not shown in the Excel Function Wizard but still callable as a UDF).

Hidden functions are useful for testing purposes.

Behaviour

These checkboxes give you several interesting options:

Cache Results

One peculiarity of the Excel smart recalc engine is that it flags formulas whose precedents have been recalculated as needing recalculation even when the values of the arguments have not actually changed.
If you have a UDF that takes a long time to calculate this can be very inefficient.

Checking Cache Results stores both the values of the functions arguments and the results of the function in an application-level cache. Each time the function is called the values of the rguments are checked against the cache and if they have not changed then the previous results are returned.

This sounds wonderful, and it is, but there is a tradeoff: it takes time to store the values of the arguments and to check them. So if your function arguments contain large ranges of data (for example Lookups) then it may take longer to check the argument values than to calculate the result!

Create Asynchronous Version

This option creates an additional add-in function  which runs the core function asynchronously, in a worker thread. The function will be have the same name as the core function, plus the suffix “Async”.

Create Queued Version

This option is only available in Excel 2010 and later. When you use the option this Excel calculation thread does not get blocked but can continue whilst the queued function(s) are executing. When the queued functions complete they call back and Excel 2010 has function to handle this callback within the calculation process.

Do not call in Formula Wizard

When you are entering a function and its arguments using the Excel Function Wizard the function gets executed many times. If your function takes a long time to execute this can be annoying, so this option stops the function being executed (it just returns #Value to the Function Wizard). In VBA your function can do a similar thing by using

If (Not Application.CommandBars("Standard").Controls(1).Enabled) Then Exit Function

Thread safe

This registers the function as a thread-safe function. In Excel 2007 and later on a multi-core PC this can speed up execution significantly. VBA functions cannot be made thread-safe.

There are certain rules you have to follow when writing thread-safe functions. The most obvious ones are:

  • You can’t call other functions that are not thread-safe.
  • You can’t store data that is static or is accessible from outside the function itself (unless you use a thread lock)

Debugging thread-safe functions can be disconcerting since you get jumped from one thread to another: its best to start by debugging on a single thread (switch Excel’s multi-threaded calculation feature off).

Volatile

This registers the function as a volatile function, in a somewhat similar way to using Application.Volatile in a VBA UDF.

Other Function Types

In the Details tab of the XLL Function wizard you can also use some more function types:

Can Defer Recalculation

This option flags the UDF as a Macro Command Equivalent function. This means that function has slightly greater permissions than a standard UDF. In this example the PREVIOUS function returns the value that the calling cell had at the previous recalculation.
This enhanced capability comes at a price: the function cannot be Multi-threaded and will be volatile if using Reference type arguments.

Cluster Safe

Using this option you can offload calculation of your functions to a High Performance Computing (HPC) clusters.

Conclusion

XLL Functions can be flagged as a number of different types, mainly to handle special situations such as the need for queued functions or to support HPC clusters.

By far the most useful type is Multi-Threaded, which can produce calculation speed improvements that often scale almost linearly with the number of cores in the PC.

From VBA to C Part 8 – Using Containers from the Standard Template Library

July 11, 2012

In the VBA world we make a lot of use of arrays and Collections. And we can also use the Dictionary object from the VBScript library by adding a reference to the VBScripting runtime. All of these are types of containers.

The C++ world has a much richer set of containers available from the Standard Template Library (STL) and the Boost Template Library. This post wil show how to use one of these containers – the STL set container, by creating a simple function for comparing 2 lists. You can compare this with the VBA version in the previous post.

Templates

The STL library makes use of the c++ Templates concept.

In VBA we tend to use Variants as arguments for functions when we don’t know what data type will be used at run-time. An example is writing a single QuickSort routine that accepts Variants rather than a set of QuickSort routines that handle the individual data types (string, double …).

The Templates approach is to write a function or object that takes a data type that is determined at compile time. So the C++ code calls up the Template code and tells it what type to use by enclosing the Type in angle brackets:
std::widget<long> Fred; would define a widget object from the standard C++ library with a data type of long.

STL Containers

Some of the STL Containers I use are:

  • pair – a tuple of 2 elements which can be of different types
  • vector – a dynamic 1-dimensional array with automatic resizing capablility for insert and delete. Has methods such as SORT
  • set – an indexed vector of unique elements. Has methods such as union, difference, intersection.
  • map – an associative array of key-value pairs. Similar to Collections and Dictionaries

Iterators

Because Containers are templates and can contain a wide variety of data types its useful to define Iterators (generalised pointers/indexes) that allow you to move through them. And Containers all have some standard iterators like begin() which points to the first element in the container and end() that tells you you are one past the end rather than giving you the last item.

The COMPARE.LISTS Function

This is an array function that compares 2 lists and returns an array of True/False for each element in the first list (LookFor) showing if the element exists anywhere in the second list (LookIn).

I have defined 2 arguments, both of type value to simplify the code.
LookFor gives the list of items to look for in the LookIn list.

The chunk of code below does some generic error checking and gets the dimensions

RW12 nRLookFor=0,nRLookIn=0,nROut=0;    // number of rows
COL12 nCLookFor=0,nCLookIn=0,nCOut=0;    // number of columns


// default error result
xloResult=xlerrValue;
// check that both parameters are present
if (LookFor->IsMissing() || LookIn->IsMissing()) return xloResult.Ret();


// get caller dims
if (!CXllApp::GetCallerDims(nROut,nCOut))
return CXlOper::RetError(xlerrNA);


// get input array dims
LookFor->GetDims(nRLookFor,nCLookFor);
LookIn->GetDims(nRLookIn,nCLookIn);
// more than 1 column -> error
if (nCLookFor!=1 || nCLookIn!=1) return xloResult.Ret();

Then I create the output array as the smaller of the number of calling rows and the the number of LookFor rows, and initialise to False:

// allocate output array & initialise to false
if (nRLookFor<nROut) nROut=nRLookFor;
xloResult.AllocArray(nROut,1,false);

Now I create a set container called InList and an iterator of the same datatype caled it.
The type of data that the set will contain is CXlConstCell.
That is a read-only xlOper data structure (think of it as a variant variable, but read-only).

// create set for LookIn
std::set<CXlConstCell> InList;
std::set<CXlConstCell>::iterator it;

Next loop through the LookIn list and insert each element into the set.
Remember that the set is a vector of unique keys and will ignore duplicates.

// populate the set
for (RW12 j=0;j<nRLookIn;j++) {
InList.insert(LookIn->Cell(j,0));
}

Now loop through the LookFor list checking if each item exists in LookFor.
If it does NOT exist then the Find method will return end(), which means one past the last item.
Remember that != means not equal

// check lookfor against inlist
for (RW12 j=0;j<nROut;j++) {
it=InList.find(LookFor->Cell(j,0));
if (it!=InList.end()) xloResult.Cell(j,0)=true;
}

You don’t really need the iterator it in this case, you could do it all in one line of code:

if (InList.find(LookFor->Cell(j,0))!=InList.end()) xloResult.Cell(j,0)=true;

But I think its simpler to read with the iterator.

Finally return the output:

return xloResult.Ret();
}

Performance compared to VBA Dictionary UDF

In the previous post using the VBScript Dictionary was generally the fastest, but started running out of puff after about 500K rows.
For 1 Million rows the VBA Collection was fastest taking 6720 milliseconds for 50000 LookFor rows and 1 million LookIn rows.
And the VBA array function is limited to returning 64K rows.

The COMPARE.LISTS XLL UDF is considerably faster: 1923 milliseconds versus 6720 milliseconds (and of course its multi-threaded so there would be an even bigger speed advantage for multiple formulas).
The XLL array function is NOT limited to 64K rows. Comparing 1 million LookFor rows against 1 million LookIn rows takes just 2.5 seconds

Conclusion

Using C++ XLL’s you are not limited to just arrays, Collections and Dictionaries.
The STL and BOOST libraries can add significant power, performance and pre-built code to your projects.

Comparing Two Lists – VBA UDF shootout between Linear Search, Binary Search, Collection and Dictionary

July 10, 2012

I often run into the problem of having to compare two lists in Excel, to see what items are in the list to look for that can’t be found in the list to look in.

So when I saw Dick Kusleika’s post on Daily Dose of Excel I thought it was time I got my act together and worked out the best way of tackling this problem.

I decided to write a VBA UDF called IsInList2 that called 6 methods:

  • Sorting the the LookIn list and using Binary Search to compare the items in the LookFor list
  • Using linear search of the LookIn list for each item in the LookFor list
  • Create a Collection containing the LookIn list and check it for each item in the LookFor list
  • Create a Dictionary containing the LookIn list and check it for each item in the LookIn list
  • Use an already sorted LookIn List and Binary Search
  • Look for partial matches using InStr

Overview of the IsInList 2 UDF

The IsInList2 UDF is written as an array function returning an array of True/False. Its designed to be called as a multi-cell array function entered in the column next to the LookFor list, so that you can Filter for False to find all the items in the LookFor list that don’t exits in the LookIn list.

For simplicity the UDF is written assuming that both lists are Ranges with at least 2 items: so the first task is to get the values from the Ranges into variant arrays.
(I have ignored the optimisation of using MATCH directly on the Range).

Then the output array is created as the smaller of the calling cells and the the LookFor list

Then if its an exact match the data is either Sorted, added to a Collection or to a Dictionary

Then the function iterates through the LookFor list using the appropriate method subroutine and stores the result in the output array

The QuickSort Sub

Here is the code for a QuickSort of a variant containing an array. The Quicksort would be substantially faster if it was strongly typed and handled a vector rather than a 1-column matrix.

Sub QSortVar(InputValues As Variant, jStart As Long, jEnd As Long)
Dim jStart2 As Long
Dim jEnd2 As Long
Dim v1 As Variant
Dim v2 As Variant
jStart2 = jStart
jEnd2 = jEnd

‘ choose random pivot

v1 = InputValues((jStart + (jEnd – jStart) * Rnd()), 1)
While jStart2 < jEnd2
While InputValues(jStart2, 1) < v1 And jStart2 < jEnd
jStart2 = jStart2 + 1
Wend
While InputValues(jEnd2, 1) > v1 And jEnd2 > jStart
jEnd2 = jEnd2 – 1
Wend
If jStart2 < jEnd2 Then
v2 = InputValues(jStart2, 1)
InputValues(jStart2, 1) = InputValues(jEnd2, 1)
InputValues(jEnd2, 1) = v2
End If
If jStart2 <= jEnd2 Then
jStart2 = jStart2 + 1
jEnd2 = jEnd2 – 1
End If
Wend
If jEnd2 > jStart Then QSortVar InputValues, jStart, jEnd2
If jStart2 < jEnd Then QSortVar InputValues, jStart2, jEnd
End Sub

The Binary Search Function

For simplicity this function is also written to handle a variant containing an array, and would be faster if more strongly typed.
Its not designed as a UDF: its called from IsInList2.

Function BSearchMatch(LookupValue As Variant, LookupArray As Variant) As Boolean

‘ use binary search to find if lookupvalue exists in lookuparray

Dim low As Long
Dim high As Long
Dim middle As Long
Dim varMiddle As Variant
Dim jRow As Long

jRow = 1
BSearchMatch = False
low = 0
high = UBound(LookupArray)
Do While high – low > 1
middle = (low + high) \ 2
varMiddle = LookupArray(middle, 1)
If varMiddle >= LookupValue Then
high = middle
Else
low = middle
End If
Loop

If (low > 0 And high <= UBound(LookupArray)) Then
If LookupArray(high, 1) > LookupValue Then
jRow = low
Else
jRow = high
End If
End If
If LookupValue = LookupArray(jRow, 1) Then BSearchMatch = True
End Function

The Exact Linear Search Function

LMatchExactV is not designed as a UDF: its called from IsInList2.

Function LMatchExactV(LookupValue As Variant, LookupArray As Variant) As Boolean

‘ use linear search to find if LookupValue exists in LookupArray
‘ LookupArray must be a a 2-dimensional variant array of N rows and 1 column

Dim j As Long

LMatchExactV = False
For j = 1 To UBound(LookupArray)
If LookupValue = LookupArray(j, 1) Then
LMatchExactV = True
Exit For
End If
Next j
End Function

The Partial Match Linear Search Function

This function use InStr to check for partial match. It is not designed as a UDF: its called from IsInList2.

Function LMatchInV(LookupValue As Variant, LookupArray As Variant) As Boolean

‘ use linear search and Instr to find if LookupValue is within any value in LookupArray
‘ LookupArray must be a a 2-dimensional variant array of N rows and 1 column

Dim j As Long
Dim strLook As String

LMatchInV = False
strLook = CStr(LookupValue)
For j = 1 To UBound(LookupArray)
If InStr(LookupArray(j, 1), strLook) > 0 Then
LMatchInV = True
Exit For
End If
Next j
End Function

The IsInList2 Array Function

Because the function uses the Dictionary Object from VBScript you need to add a reference to the Microsoft Scripting Runtime.

The Function takes 2 optional parameters which control the method used:

jSorted – which Sort/Lookup?Match method to use

FindExact – True for an exact match, False for a partial match

Public Function IsInList2(LookFor As Variant, LookIn As Variant, Optional jSorted As Long = 0, Optional FindExact As Boolean = True)

‘ jSorted
‘   =0 data not sorted but sort it
‘   =1 data already sorted – use binary search
‘   =-1 use linear search
‘   =2 use collection
‘   =3 use dictionary

Dim nLookFor As Long
Dim nLookIn As Long
Dim nOut As Long
Dim vOut() As Variant
Dim j As Long
Dim coll As New Collection
Dim dict As New dictionary

‘ coerce ranges to values

LookFor = LookFor.Value2
LookIn = LookIn.Value2

‘ get row counts

nLookFor = UBound(LookFor)
nLookIn = UBound(LookIn)
nOut = Application.Caller.Rows.Count

If nLookFor < nOut Then nOut = nLookFor
ReDim vOut(nOut, 1)  ‘ create output array

If FindExact Then
If jSorted = 0 Then
QSortVar LookIn, LBound(LookIn), UBound(LookIn)    ‘ quicksort
jSorted = 1
ElseIf jSorted = 2 Then
On Error Resume Next
For j = 1 To nLookIn
coll.Add LookIn(j, 1), CStr(LookIn(j, 1))   ‘ collection
Next j
ElseIf jSorted = 3 Then
On Error Resume Next
For j = 1 To nLookIn
dict.Add LookIn(j, 1), LookIn(j, 1)    ‘ dictionary
Next j
End If
On Error GoTo 0
End If

For j = 1 To nOut
If Not FindExact Then
vOut(j, 1) = LMatchInV(LookFor(j, 1), LookIn)    ‘instr linear search
ElseIf jSorted = 1 Then
vOut(j, 1) = BSearchMatch(LookFor(j, 1), LookIn)    ‘ binary search
ElseIf jSorted = 2 Then

‘ use collection

On Error Resume Next
Err.Clear
vOut(j, 1) = coll.Item(CStr(LookFor(j, 1)))
If CLng(Err.Number) = 5 Then
vOut(j, 1) = False
Else
vOut(j, 1) = True
End If
ElseIf jSorted = 3 Then
vOut(j, 1) = dict.Exists(LookFor(j, 1))    ‘ Dictionary
ElseIf jSorted = -1 Then
vOut(j, 1) = LMatchExactV(LookFor(j, 1), LookIn)    ‘ exact linear
Else
vOut(j, 1) = CVErr(xlErrValue)
End If
Next j

IsInList2 = vOut    ‘ return output
End Function

Performance Comparison

The performance tests are all done using ranges containing a character followed by 5 digit random integer numbers.

The timings are all in Milliseconds.
LookFor and LookIn give the number of rows.

Linear Search timings on smaller numbers of rows are slightly pessimistic because of a high % of miss-matches
BSearchOnly gives timings for Binary Search on already sorted data.
Partial gives timings for Linear Search with partial matching using InStr.

Because Excel VBA array formulas cannot return more than 64K rows you cannot use the function directly for more than 64K rows of LookFor without using more than one array formula.

LookFor

LookIn

Linear

QSortBsearch

Collection

Dict

BsearchOnly

Partial

2 2

0.29

0.29

0.29

0.46

0.29

0.29

50 50

4.42

4.14

4.05

4.06

3.91

4.43

200 200

12.2

5.59

5.08

4.40

4.49

14.2

500 500

55.5

8.9

7.3

4.9

5.66

68.0

2000 2000

824

27.0

19.9

8.03

12.2

995

50000 50000

-

800

515

252

279

-

50000 200000

-

2583

1290

1044

350

-

50000 1048756

-

14204

6720

23650

-

-

Overall the sequence from fastest method to slowest method is:

  • Dictionary
  • Binary Search Only (but the data has to already be sorted)
  • Collection
  • Quick Sort plus Binary Search
  • Linear Search
  • Partial Linear Search

Conclusion

The best method is to use the VBScript Dictionary object, unless you have more than 500K rows after which the Collection object starts to win.

The coding for Dictionary is very simple and it has an Exists method which Collection lacks.

Below about 2000 rows you probably won’t notice the difference between any of the methods except for Linear search.

So now I just have to try the C++ XLL version … I reckon that will beat Dictionary! (Yes, looks like XLL wins: 1923 millisecs for 50000 in 1048756)

Makeing the most of your XIPS Part2 – when 40 MXIPS for AVERAGEIFS is too slow

July 9, 2012

Peter wants to calculate a rolling average over 600K rows. His data consists of a Timestamp in Column A and a Value in column B:

+-Timestamp-+-value-+
| 1340816430|  .02  |
---------------------

He is using an array formula in each of the 600K rows (although it would work as an ordinary non-array formula):

{=AVERAGEIFS(B:B,A:A,"<"&A1+1000,A:A,">"&A1-1000)}
This calculates the average of the values starting at -1000 timestamp units and ending at +1000 timestamp units.

I setup some test data for the full 1048576 rows and found that 500 of the AVERAGEIFS formulas took 25 seconds to calculate on multithreaded 4 core system.

How many MXIPS can AVERAGEIFS do?

AVERAGEIFS is a fast function – its calculating 500 x (2000000 comparisons and averages of matching results) – say 1000 million operations in 25 seconds = 40 MXIPS (Million eXcel Instructions Per Second). The only problem is that Peter wants to do 600K of these, not 500, and it looks like that will take over 8 hours!

A more complicated but faster solution

Lets assume that the data is sorted ascending on Timestamp (anyway Excel is very fast at sorting data). Start by making sure that Excel is in Manual calculation mode!

Then in column C put

=IFERROR(MATCH(A1-1000,$A:$A,1),1)

and copy down. This finds the row that is 1000 timestamp units before the current row. The MATCH function is using its binary search of sorted data algorithm, and thats lightning fast. The IFERROR is to handle the starting condition for the first rows where there are no Timestamps at -1000 from the current Timestamp.

In Column D put

=IFERROR(MATCH(A1+1000,$A:$A,1),1048576)

and copy down. This finds the row that is 1000 timestamp units after the current row, in a similar way to column C

Now we know that start and end row numbers of the block of rows that Peter wants to average. So we can use OFFSET to get just that subset of rows, and feed it to AVERAGE.
So in Column E put

=AVERAGE(OFFSET(B1,C1-ROW(),0,D1-C1+1,1))

and copy down.

Now press Ctrl/Alt/F9 for a full calculation – on my system that takes just 20 seconds – several thousand times faster than AVERAGEIFS!


Follow

Get every new post delivered to your Inbox.

Join 34 other followers