In part 1 and part 2 of Developing Faster Lookups I side-stepped the question of whether its faster to use your own VBA Binary Search routine than to use Excel’s Worksheet Functions.

In this post I will look at 2 Binary Search UDFs. Both UDFs are the equivalent of MATCH with **sorted ascending** data, except that when there are multiple identical values that could be the answer:

- MATCH returns the
**LAST**of the identical values - BSearchFirstV and BSearchFirstR return the
**FIRST**of the identical values

Binary search works by looking at the midpoint of the range of lookup values and comparing it to the value being looked up.

If the lookup value is less than the midpoint then the upper half of the range is discarded and Binary search looks at the midpoint of the lower half of the range. So on each successive iteration half the data gets discarded. This makes the Binary Search algorithm extremely efficient with anything more than 10-20 rows of data.

To find the first of multiple sorted matches we want to find the consecutive pair of values where the first is less than the lookupvalue and the second is >= the lookupvalue.

When the pair has been found we have to check whether its the first value we want or the second. And then we have to handle the boundary conditions:

- The lookup value is less than the first value in the lookup table
- The lookup value is equal to the first value in the lookup table
- The lookup value is greater then the last value in the lookup table

The algorithm is very simple and concise, but surprisingly difficult to program correctly. Many variations exist and its often useful to create your own variation for particular circumstances.

I have coded 2 variations: **(these are revised versions – the first versions I posted had bugs!)**

BSearchFirstR only works when you pass the LookupTable as a Range. BSearchFirstV is designed to work withe the LookupTable being passed as a 2-Dimensional 1 column N rows variant array (although if you give it a Range it converts it to a variant array anyway).

### Here is the code:

Option Explicit Option Base 1 Option Compare Text ' ' COPYRIGHT © DECISION MODELS LIMITED 2006, 2007, 2011. All rights reserved ' May be redistributed for free but ' may not be sold without the author's explicit permission. ' Public Function BSearchFirstR(LookupValue As Variant, LookupArray As Range) As Variant ' ' find the position of the largest value in LookupArray <= LookupValue ' LookupArray must be a Range ' LookupArray must be sorted ascending ' if multiple LookupValues exits in LookupArray, the position of the first will be returned ' Dim low As Long Dim high As Long Dim middle As Long Dim varMiddle As Variant BSearchFirstR = CVErr(xlErrNA) With LookupArray low = 0 high = .Rows.Count Do While high - low > 1 middle = (low + high) \ 2 varMiddle = .Cells(middle, 1).Value2 If varMiddle >= LookupValue Then high = middle Else low = middle End If Loop If (low > 0 And high <= .Rows.Count) Then If .Cells(high, 1) > LookupValue Then BSearchFirstR = low Else BSearchFirstR = high End If ElseIf low = 0 Then If .Cells(1, 1).Value2 <= LookupValue Then BSearchFirstR = 1 End If End If End With End Function ' ' ' Public Function BSearchFirstV(LookupValue As Variant, LookupArray As Variant) As Variant ' ' find the position of the largest value in LookupArray <= LookupValue ' LookupArray must be a Range or a 2-dimensional variant array of N rows and 1 column ' LookupArray must be sorted ascending ' if multiple LookupValues exits in LookupArray, the position of the first will be returned ' Dim low As Long Dim high As Long Dim middle As Long Dim varMiddle As Variant BSearchFirstV = CVErr(xlErrNA) If TypeName(LookupArray) = "Range" Then LookupArray = LookupArray.Value2 low = 0 high = UBound(LookupArray) Do While high - low > 1 middle = (low + high) \ 2 varMiddle = LookupArray(middle, 1) If varMiddle >= LookupValue Then high = middle Else low = middle End If Loop If (low > 0 And high <= UBound(LookupArray)) Then If LookupArray(high, 1) > LookupValue Then BSearchFirstV = low Else BSearchFirstV = high End If ElseIf low = 0 Then If LookupArray(1, 1).Value2 <= LookupValue Then BSearchFirstV = 1 End If End If End Function

### Notes:

- Most error checking has been omitted to simplify the code
- There is no checking for uncalculated cells
- There is no allowance for whole-column references or ranges containing empty cells.
- The UDFs assume the data is sorted ascending
- The UDFs do not allow for wild-card text matching

Timing Results

- BSearchFirst V is faster than BSearchFirstR when the Lookup Table contains less than about 500 rows.
- For large ranges BSearchFirstR is considerable faster because the whole of the Lookup Table data is not transferred to VBA.
- Both variations are slower than using approximate MATCH, but are faster than using Linear Search (unsorted) MATCH to find the first exact match for large ranges, and are faster than Array formulae for finding the first of multiple matches

### Conclusion

Using a VBA coded Binary Search routine will not improve the speed of the FLOOKUP UDF.

However when you want a variation of MATCH that is not in the built-in Excel MATCH VBA Binary Search is a useful tool.

Your 2nd function is missing the ‘With’ statement

Thanks Steve: actually it had an End With it did not need.

@ Charles

“MATCH returns the LAST of the identical values”

Well this not entirely true.

If your data is sorted in Ascending order and you use the 1 type it returns the position of the last of the identical values

If you data is sorted in Descending order and you use the -1 type it returns the position of the 1st of the identical values

Sam,

Yes: the whole post is written for sorted ascending (“the UDFs are the equiovalent of MATCH with sorted ascending”). For sorted descending you need another small variation of Binary search to get the Last value: I will leave that as an exercise for the readers!

Hi

Thanks for the functions.

I’m experiencing a problem with BSearchFirstR though.

I’m looking up a value which I can find in the LookupArray by doing a manual search but the function returns an #NA.

I have duplicate values in the lookuparray. Could this throw the function off?

Oh hang on…

Now it returns the wrong line number.

My value is sitting in line 338 but the function returns 343

If you can send me (Charles at DecisionModels.com) your data I will take a look

Ludgy’s problem was that the data contained a mixture of upper and lower case keys sorted in a case-insensitive way, but the binary search was coded in a case-sensitive way: I have added Option Compare Text to the code to fix this.

Thanks for the speedy reply and for the code.

It’s made my life a whole lot easier.

Ludgy