2014 MVP Summit

In the first week of November I was privileged to attend the MVP Summit at Microsoft in Redmond. I think it was the best Summit I have been to, with much more open interchange with the Excel team about things that might or might not make it into Excel one day.

Anyway I can’t tell you anything more about that, but I can say that all the Excel MVPs are a really nice bunch of people, who like to sit around in Rock Bottom, drink beer and talk Excel geek stuff (which I also like to do to!). Here are some photos of my fellow MVP geeks (you will have met many of these on various Excel forums):

Kevin1 - CopyFrom left to right: Jon Peltier, Roger Govier, Kevin “Zorvek” Jones, Brad Yundt, Jon Acampora, Mynda Treacy

Jan_Karel - Copy

Jan-Karel Pieterse pours the beer (of course he never drinks any himself!) for new Excel team member Joe McDaid

Claire - Copy

UK MVP Lead Claire (what does this button do?)

Mynda_Jon

Mynda Treacy and Jon Peltier ham it up for the camera with Ingeborg “Teylyn” Hawighorst in the background.

Zak_Smitty

Zack “Tables go this way” Barresse (aka FireFytr), Chris “Smitty” Smith and Frédéric (if its not Power Query I’m bored) Le Guen

Ron_Wim_Mandy

Ron (Wrong Bus) Coderre, Wim Gielis, Mandy Govier

XL_Beer

Excel beer (courtest of Frederic)

And now for this year’s mystery photos:

Mystery1_2014

Mystery Photo 1: Where is this?

Mystery_2

Mystery Photo 2: What does this have to do with Excel?

 

Posted in Uncategorized | Tagged , | 6 Comments

Timing the Ins and Outs of User Defined Functions: Multi-Cell array formulas can be slow

I was looking at some multi-cell array formula UDFs with John and Rich and could not understand why they seemed so much slower than I expected.

Each UDF (written as C++ XLLs) read data from around 200 cells using around 40 parameters, did some very simple logic and arithmetic and then returned values to a different 200 cells. Each of these UDF was taking around 16-20 milliseconds and the workbook contained several thousand calls to these UDFs.

So I started to investigate:

First Hypothesis: Maybe marshalling input data from multiple parameters is slow?

We know that the way to speed up reading data from Excel Cells is to read data in as large a block as possible rather than in many small chunks. So maybe the number of parameters was the problem. To test this I wrote a couple of XLL functions.
MarshallAll takes a single parameter of a range of cells and returns the number of columns. The parameter is type Q so arrives in the function as values rather than a reference.
MarshallMany takes 40 parameters (also type Q) and returns a constant value of 88.


CXlOper* MarshallAll_Impl(CXlOper& xloResult, const CXlOper* Arg1)
{
long nCols=0;
nCols=Arg1->GetWidth2();
xloResult=(double)nCols;
return xloResult.Ret();
}

CXlOper* MarshallMany_Impl(CXlOper& xloResult, const CXlOper* Arg1, const
CXlOper* Arg2, const CXlOper* Arg3, const CXlOper* Arg4, const CXlOper*
Arg5, const CXlOper* Arg6, const CXlOper* Arg7, const CXlOper* Arg8, const
CXlOper* Arg9, const CXlOper* Arg10, const CXlOper* Arg11, const CXlOper*
Arg12, const CXlOper* Arg13, const CXlOper* Arg14, const CXlOper* Arg15,
const CXlOper* Arg16, const CXlOper* Arg17, const CXlOper* Arg18, const
CXlOper* Arg19, const CXlOper* Arg20, const CXlOper* Arg21, const CXlOper*
Arg22, const CXlOper* Arg23, const CXlOper* Arg24, const CXlOper* Arg25,
const CXlOper* Arg26, const CXlOper* Arg27, const CXlOper* Arg28, const
CXlOper* Arg29, const CXlOper* Arg30, const CXlOper* Arg31, const CXlOper*
Arg32, const CXlOper* Arg33, const CXlOper* Arg34, const CXlOper* Arg35,
const CXlOper* Arg36, const CXlOper* Arg37, const CXlOper* Arg38, const
CXlOper* Arg39, const CXlOper* Arg40)
{
xloResult=88.0;
return xloResult.Ret();
}

But when I compared the execution times of these functions they were both fast and there was not much difference in timings, although it was faster to read as many cells as possible with each parameter.

So hypothesis 1 failed.

I checked how the time taken varied with the number of cells read and their datatype.

Marshall3 Marshall4

As expected large strings take longer than small strings which take more time than numbers. And it is more efficient to read as many cells as possible for each parameter.

Second Hypothesis: Maybe returning results to multiple cells is slow?

We know that writing data back to Excel cells from VBA is significantly slower than reading data from cells.

(see VBA read-write speed and Getting cell data with VBA and C++)

I wrote another simple XLL UDF: MarshallOut. This took a single argument of a range and returned it.


CXlOper* MarshallOut_Impl(CXlOper& xloResult, const CXlOper* Arg1)
{
xloResult=*Arg1;
return xloResult.Ret();
}

Bingo: returning data to multiple cells is comparatively slow.

I used FastExcel Profile Formulas to time the tests:

Marshall6Reading and returning a 255 character string to each of 100 cells takes 13 milliseconds.

Marshall1Marshall2Notice that the time taken is NOT linear with respect to the number of cells.

Multi-Threading Effects

I also noticed that FastExcel Workbook Profiler showed that these functions were making inefficient use of multi-threaded recalculation. Presumably this is because they need an exclusive lock on Excel’s results table whilst writing out the results, and most of the time used is doing exactly that.

By contrast, the first set of “read-many cells but write one cell”  functions made efficient use of multi-threading.

Comparing XLL functions with VBA Functions.

I did a small number of comparison with VBA equivalents of these XLL functions. The VBA functions showed the same kind of performance behaviour as the XLL functions and were slightly slower. Of course VBA functions cannot multi-thread.

Conclusions

  • Large numbers of array formulas that return results to multiple cells can be slow.
  • Multi-cell array formulas do not multi-thread efficiently whilst writing their results back.
  • It is more efficient to have larger ranges and fewer parameters than many parameters referencing small ranges.

 

Posted in Calculation, UDF, Uncategorized, VBA, XLL | Tagged , , , , , | Leave a comment

Automatic or Manual Calculation?: Grouped Sheets causes Calculation Confusion.

Thanks to Simon Hurst and Paul Wakefield for telling me about this calculation weirdo, and credit to Chatman at Accounting Web for discovering it.

Grouped Sheets and Calculation Mode

Try these experiments:

Experiment 1

  1. Start Excel
  2. Open a workbook in automatic calculation mode and with only 1 sheet selected (so there are no grouped sheets).
  3. Add  =NOW() to a couple of sheets and format the cells to show seconds so that you can see when it calculates.
  4. Check that Calculation shows Automatic both in File-> Options->Formulas and in the Formulas Tab->Calculation Options.
  5. Set Calculation to Manual (does not matter how you do this)
  6. Select 2 or more Sheets (hold down Control and select 2 sheet tabs) so that the sheets are grouped (The Workbook title bar should show [Grouped] ).
  7. Calculation is still Manual
  8. Set the Calculation to automatic using File->Options->Formulas
  9. Notice that the workbook calculates, but Calculation immediately returns to manual (check both Formulas Tab->Calculation Options and File->Options->Formulas)
  10. Now set the Calculation to automatic using Formulas Tab->Calculation Options
  11. Now the workbook calculates and stays in Automatic Calculation Mode, but File->Options->Formulas says its Manual and Formulas Tab->Calculation Options correctly says its Automatic!
  12. Now ungroup the sheets and everything works normally

Experiment 2

  1. Start Excel
  2. Open a workbook in automatic calculation mode and with only 1 sheet selected (so there are no grouped sheets).
  3. Add  =NOW() to a couple of sheets and format the cells to show seconds so that you can see when it calculates.
  4. Check that Calculation shows Automatic both in File-> Options->Formulas and in the Formulas Tab->Calculation Options.
  5. Select 2 or more Sheets (hold down Control and select 2 sheet tabs) so that the sheets are grouped (The Workbook title bar should show [Grouped] ).
  6. Excel should still be in Automatic calculation mode, but File->Options->Formulas shows Manual and Formulas->Calculation Options shows Automatic.
  7. Calculation is actually Automatic.

Confused? Well so is Excel!

Conclusions

  • Looks like it is safer to use Formulas->Calculation Options
  • If you are using Grouped Sheets be careful about your Calculation Mode!
Posted in Uncategorized | Leave a comment

Excel 2007 multi-threading bug with Range.SubTotal and XLL

OK: this is a fairly obscure Excel 2007 only bug but I thought I should place it on record.

The conditions required for the bug are:

  • Excel 2007 (tested with 12.0.6683.5002 SP3)
  • Automatic calculation Mode
  • Multi-threaded calculation is on and the system has multiple cores
  • XLL multi-threaded worksheet functions are being used
  • A VBA routine uses Range.SubTotal to create subtotals

The bug symptoms are

  • Either Range.subtotal fails and Excel becomes unstable
  • Or Range.subtotal seems to work but Excel becomes unstable

This bug seems to have been fixed in Excel 2010 and later versions.

Bypassing the bug

If your VBA switches off Multi-threaded calculation (Application.MultiThreadedCalculation.Enabled = False) just before doing the Range.SubTotal and then switches it back on again the bug is bypassed.

Whats causing the problem

Using Range.subtotal is triggering a calculation event. VBA always runs on the main thread, but the XLL multi-threaded functions can run on any thread.

So presumably the problem happens when the XLL function is not calculated on the main thread and tries to return a result to Excel, but Excel is not ready to accept it, thus a portion of memory gets overwritten.

Posted in Uncategorized | Leave a comment

Extracting Digits from Text: Using Formulas and Designing a Missing Excel Function: GROUPS

An Excel problem that crops up quite often is how to extract digits (0-9) from text. The text might be part numbers, or web addresses, or currency values or …

Some cases are easy to handle with formulas:

  • A fixed number of digits at the start or end of the text-string (Use LEFT or RIGHT)
  • A fixed number of digits starting at a fixed point within the string (Use MID)
  • Groups of n digits separated by a single separator character (Use MID)

But in real life things are often not so simple:

  • No fixed position for the start
  • A variable number of digits
  • A variable number of separators
  • Extract all the digits or only one group
  • Need to locate a particular text string as separator
  • Extract the nth group of digits
  • Work from left to right or right to left
  • Extract only the first or the last n digits from a group

Most of these more complex cases can still be solved using formulas, but the required formulas are often long, complicated, hard to understand and do not adapt well to changes in the data.

Some Formula Examples

The first example is from an article by MVP Ashish Mathur

The Data and results look like this:

Digits1

Ashish’s formula, entered using Control/Shift/Enter as an array formula, looks like this (see his article for an explanation).{=MID(A2,MATCH(TRUE,ISNUMBER(1*MID(A2,ROW($1:$9),1)),0), COUNT(1*MID(A2,ROW($1:$9),1)))}

The next example is from Chandoo

Digits2

This one is probably impossible for all the entries just using a formula, but here is Chandoo’s best attempt.{=MID(B4,MIN(IFERROR(FIND(lstNumbers,B4),””)), SUMPRODUCT(COUNTIF(lstDigits,MID(B4,ROW($A$1:$A$200),1))))+0}

Where lstNumbers is a range containing the digits 0-9 and listDigits is a range containing 0-9 comma and decimal point.

The final example is from a Bill Jelen/Mike Girvin Dueling Excel Podcast, and its really tricky. Some example data looks like this:

Digits4The challenge is to extract the last 3 of the consecutive digits after the first POV_ and before any non-numeric character. As Bill Jelen points out, this is much easier to do with a VBA UDF than with formulas. But if you watch the podcast you can see how Mike Girvin develops some incredible formulas to do the job.

Bill’s VBA UDF looks like this:


Function Nums(MV)
x = Application.WorksheetFunction.Find("POV_", MV) + 4
Nums = ""
For i = x To Len(MV)
ThisChar = Mid(MV, i, 1)
Select Case ThisChar
Case ".", "_"
GoTo FoundIt
Case "0", "1", "2", "3", "4", "5", "6", "7", "8", "9"
Nums = Nums & ThisChar
End Select
Next i

FoundIt:
Nums = Right(Nums, 3)
End Function

The Missing Excel Function: GROUPS

Well, after watching Bill and Mike solving this really tricky problem I started wondering why does Excel not have a function to do this kind of stuff? And what would it look like if it did have one?

So (of course) I decided to build one! After a few iterations and the inevitable scope creep the requirements looked like this:

  • Extract groups of characters from a text-string
  • Allow the user to define what constitutes a group of characters
  • Extract the Nth group from the start, or the Nth group working backwards from the end
  • Option to specify the maximum number of characters to extract from the front or the back of the group
  • Option to give the start and/or end position within the string for the search for groups.

GROUPS(Text, GroupNumber, MaxChars, GroupType, StartPos, EndPos)

GroupNumber can be zero (all Groups) or a positive or negative number to get the Nth group from the start or end.

MaxChars can be zero (all characters) or a positive or negative number to restrict the number of characters from the start or end of the group.

GroupType can be either of

  • a Regex Pattern string, for example [0-9,.] would define a group as consecutive characters consisting of 0 to 9 comma and decimal point
  • A number from 0 to 4 for the most common group types (0-9 , a-z , not 0-9 , 0-9 and . , not 0-9 and .)

StartPos and EndPos default to the first character (1) and the last character (0)

GROUPS Examples

To get the result from Ashish’s data shown above use

=GROUPS(A2)

The defaults are: get the first group of numbers starting at the left.

For Chandoo’s example you need to define the group of digits as being 0-9 comma and decimal point, and you could append the other characters as groups of everything except 0-9 comma and decimal point and space. A bit of experimentation shows that using -1 as the group number gives better results for this data!

=GROUPS(B4,-1,,”[0-9,.]“) & ” ” & GROUPS(B4,-1,,”[^0-9,. ]“)

Digits5Which is pretty good apart from maybe from having to choose between the INR18lacs or USD$36000!

For Bill Jelen and Mike Girvin’s podcast problem there are a couple of approaches:
a slight cheat notices that the numbers we want are always the second group of numbers so this works: find the last 3 digits from the second group of digits.

=GROUPS(A2,2,-3)

But for the original problem as stated we need to look for POV_

=GROUPS(A2,1,-3,0,SEARCH(“Pov_”,A2)+4)

Find the last 3 digits from first group of digits found starting after POV_.
I am using SEARCH rather than FIND because SEARCH is not case-sensitive.

Conclusion

  • It would be much easier to solve these kind of problems if Excel had a function like GROUPS.
  • You can try out the GROUPS function for yourself by downloading the 15-day trial of FastExcel Version 3 from my website.
  • The GROUPS function is implemented as a multi-threaded XLL function, so performance is quite good.

If you have a real-life extraction problem that cannot be solved by the GROUPS function please let me know!

Posted in UDF, VBA, XLL | Tagged , | Leave a comment

Getting Used Range in an XLL UDF: Multi-threading and COM

In two previous blog posts I discussed why handling whole-column references efficiently in VBA UDFs meant that you had to find the used range for the worksheet containing the whole-column reference. The posts also discussed how using a cache for the used ranges could give a significant performance improvement.

Full Column References in UDFs
Getting used Range efficiently with a cache

But how do you do this for multi-threaded C++ XLLs?

The XLL API cannot directly get the Used Range

The first problem is that the XLL API is based on the old XLM language, and that does not have a method for finding the used range from a worksheet function.

So you have to make a callback from the XLL into Excel to the COM interface, and that can very easily go wrong. Excel does not generally expect call-backs of this type, so the call-back may be ignored, or access the wrong part of memory or even crash Excel.

And with multi-threaded UDFs you certainly only want one UDF calling back at a time!

When you combine this with the need to access and maintain a cache for efficiency reasons the logic looks something like this:

  • For each reference referring to a large number of rows (I am currently using 100000 rows as large)
    • nRows=number of rows referenced
    • Try to get a shared read lock on the cache: if fails exit
    • If there is anything in the cache for this worksheet then nRows=MIN(cached rows, nRows)
    • Else try for an exclusive lock on the cache (write lock): if fails exit
      • Try to callback to Excel COM to get the used range rows.
      • If succeeded store in the cache for this worksheet & nRows=MIN(cached rows, nRows)
      • Unlock the cache
    • If failed to get a lock or COM failed then exit
  • Next large reference

Note that the logic is fail-safe: if the UDF cannot get the lock it needs on the cache or the COM call does not succeed it just uses the number of rows in the reference.
This worked well most of the time but in some situations it always failed.

Excel COM callbacks must be on the main thread.

Trying to figure out exactly what the problem was was tricky, but eventually Govert van Drimmelen, the author of the wonderful Excel DNA, pointed out that calls to COM have to be executed on the main thread.

So I grab the thread ID of the main thread in the XLL On Open event, and then only call the exclusive lock and COM if the UDF is being executed on the main thread.

And it works: thanks Govert!

Posted in Calculation, UDF, XLL | Tagged , | Leave a comment

Excel Modeling World Championships 2014

Excel Financial Modelling is not my thing – but if its yours you may be interested in this:

Invitation to the Excel Modeling World Championships 2014

ModelOff 2014Members of the ​FastExcel community are invited to the annual Excel and Financial Modeling World Championship 2014 Event (ModelOff, www.modeloff.com). The Advanced Excel educational competition helps celebrate Excel in Financial Services. The fun, challenging and innovative competition has a mission to inspire skill development with Microsoft Excel, Financial Modeling and Financial Analysis which is central to global businesses and communities. The competition showcases some of the fastest, hard-working and talented Excel minds from 100+ countries. Round 1 starts on 25th October 2014 (8 weeks away)

Summary of Event

Over 3,000 participants competed in the ModelOff 2013 event. Major Global Partners and Sponsors are: Microsoft, Intralinks, S&P Capital IQ, Kaplan Education, Bloomberg, AMT Training, Corality and Ernst & Young.  Participants come from diverse companies and jobs – such as Analysts, Associates and Managers at Investment Firms and Accounting Firms, CFOs, Analytics Professionals, In-House Excel Gurus and Consultants with a shared passion for Microsoft Excel and Finance.  Students comprise ~35% of all entrants worldwide – most studying Commerce, Accounting, Finance and Masters university qualifications. The countries most represented have typically been: United States, UK, Poland, Russia, Canada, Australia, India and Hong Kong.  Female participation is ~20% of all competitors – hopefully higher this year (the reigning champion is Hilary Smart 26yo from London).

How It Works

The ModelOff competition involves two online qualification rounds (2 hours each) conducted simultaneously around the world. The Top 16 performers are flown to New York for the Live Finals at the offices of Microsoft and Bloomberg in early December 2014.  Questions are mostly case study and multiple choice format – with some ranging from a basic understanding of discounted cash flow (DCF) analysis, 3-way integrated cash flow models to more complex project finance and simulation techniques.  Some basic Accounting, Finance and Excel knowledge is likely needed to progress to Round 2.

Free Training and Preparation

All past questions and tests from ModelOff 2012 and 2013 are free and available on the ModelOff website. The organizers believe in accessible excellence and this can be a great starting point for anyone looking to become involved for the first time and improve their Advanced Excel skills in Financial Services and Financial Modeling.  We actively encourage all participants to visit all our community partners, bloggers, our major global sponsors during and following the event for their own learning, mentoring and professional development. Competitors in the Top 10% of ModelOff 2014 will be eligible for exciting local and international opportunities, offers from community partners and fun learning experiences (e.g. Trips to Microsoft Excel in Redmond). We’re also hosting free networking events in major Financial Centres including Hong Kong, London, Sydney, New York and Regional Meetups in the coming months for anyone interested in networking and mentoring with Excel-users in their local cities. Entry to the competition is $20 for students and $30 for professionals.

 

Posted in Uncategorized | 2 Comments

VBA searching shootout between WorksheetFunction.Match: Performance Pros and Cons

It is a rainy day here in Norfolk so …

Prompted by a claim that searching using a variant array was much faster than calling MATCH from VBA I thought it was time for a detailed performance analysis.
I shall try to discover when it is better to use the array method rather than MATCH and whether the performance penalty of a wrong choice is significant.

An earlier post match-vs-find-vs-variant-array-vba-performance-shootout looked at doing a slightly more complex search. This time I will use a simpler test case and concentrate on determining the influence of data size, the length of the search within the data and the overhead of each method.

Test Setup

The test setup is very simple: column A1:A100000 contains a sequence of numbers from 1 to 100000.
Each different method is tested using a variety of sizes of rows and a variety of search depth percentages.

Match_Array1So for example 25% search depth of 400 rows means that the search will look 100 rows deep in the 400.

I am testing with Excel 2010 32-bit mainly (with some Excel 2013 32-bit comparisons) under Windows 7 Pro 64 bit.
My PC is a Dell Latitude E6530 with 8GB memory and a quad core i7-3720QM 2.60 GHZ. However since all the tests are VBA based only one core will be used.

Methods Tested

I am testing 6 different methods.

  1. Linear search using a For loop on a variant array created from the resized range in Column A
  2. Linear search using WorksheetFunction.MATCH with the unsorted option directly on a range created from Column A.
  3. Linear search using Application.MATCH with the unsorted option directly on a range created from Column A.
  4. Linear search using WorksheetFunction.MATCH with the unsorted option on a variant array created from the resized range in Column A
  5. Binary search using WorksheetFunction.MATCH with the sorted option directly on a range created from Column A.
  6. Cell by Cell loop directly on a range created from Column A.

The VBA Code

The VBA code is designed as a main subroutine returning a variant array of results to a driver sub.

Each search method is embedded in 3 loops:

  • Loop on Range Size (Number of Rows)
    • Loop on % Search Depth (How far to traverse within the range)
      • Loop on Trails (each test is timed 5 times and the median of the 5 times is used)

Timing is done using the MicroTimer Windows high-resolution timer.


Sub DoSearchTests()
Dim j As Long
Dim vResults() As Variant
'
' call each method in turn
'
For j = 1 To 6
SearchTimer vResults(), j
Worksheets("Sheet1").Range("E4").Offset((j - 1) * 14, 0).Resize(UBound(vResults), UBound(vResults, 2)) = vResults
Next j
End Sub
Sub SearchTimer(vResults() As Variant, jMethod As Long)
'
' 6 Search Methods:
'
' 1 Linear search of variant array
' 2 Linear WorksheetFunction.MATCH on Range
' 3 Linear Application.MATCH on Range
' 4 Linear WorkSheetFunction.Match on Array
' 5 Binary Search Match on Range
' 6 Cell by Cell search
'
Dim vArr As Variant
Dim j As Long
Dim i As Long
Dim jF As Long
Dim jRow As Long
Dim jTrial As Long
Dim dTime As Double
Dim NumRows As Variant
Dim SearchDepth As Variant
Dim vTrials() As Variant
Dim rng As Range
Dim SearchVal As Variant
''' initialise
NumRows = Names("nRows").RefersToRange.Value2
SearchDepth = Names("SearchDepth").RefersToRange.Value2
ReDim vResults(LBound(NumRows) To UBound(NumRows), LBound(SearchDepth, 2) To UBound(SearchDepth, 2))
dTime = MicroTimer
With Worksheets("Sheet1")
''' loop on number of rows
For i = LBound(NumRows) To UBound(NumRows)
''' loop on % search depth
For jF = LBound(SearchDepth, 2) To UBound(SearchDepth, 2)
''' derive search value as a % of the number of rows
SearchVal = Int(SearchDepth(1, jF) * NumRows(i, 1))
If SearchVal < 1 Then SearchVal = 1
''' find the median time of 5 trials
ReDim vTrials(1 To 5)
For jTrial = 1 To 5
''' timing loop
If jMethod = 1 Then
''' get array and loop for search
dTime = MicroTimer
Set rng = .Range("a1:A" & CStr(NumRows(i, 1)))
vArr = rng.Value2
For j = LBound(vArr) To UBound(vArr)
If vArr(j, 1) = SearchVal Then
jRow = j
Exit For
End If
Next j
dTime = MicroTimer - dTime
ElseIf jMethod = 2 Then
''' use linear WorksheetFunction.Match on the range
dTime = MicroTimer
Set rng = .Range("a1:A" & CStr(NumRows(i, 1)))
jRow = WorksheetFunction.Match(SearchVal, rng, 0)
dTime = MicroTimer - dTime
ElseIf jMethod = 3 Then
''' use linear Application.Match on the range
dTime = MicroTimer
Set rng = .Range("a1:A" & CStr(NumRows(i, 1)))
jRow = Application.Match(SearchVal, rng, 0)
dTime = MicroTimer - dTime
ElseIf jMethod = 4 Then
''' use linear WorksheetFunction.Match on an array from the range
dTime = 0#
If NumRows(i, 1) <= 65536 Then
dTime = MicroTimer
Set rng = .Range("a1:A" & CStr(NumRows(i, 1)))
vArr = rng.Value2
jRow = WorksheetFunction.Match(SearchVal, vArr, 0)
dTime = MicroTimer - dTime
End If
ElseIf jMethod = 5 Then
''' use binary search Match on the range
dTime = MicroTimer
Set rng = .Range("a1:A" & CStr(NumRows(i, 1)))
jRow = WorksheetFunction.Match(SearchVal, rng, 1)
dTime = MicroTimer - dTime
ElseIf jMethod = 6 Then
''' get cell value and loop
dTime = MicroTimer
Set rng = .Range("a1:A" & CStr(NumRows(i, 1)))
For Each vArr In rng
If vArr = SearchVal Then
jRow = j
Exit For
End If
Next vArr
dTime = MicroTimer - dTime
End If
''' store timings for trials
vTrials(jTrial) = dTime * 1000000
Next jTrial
''' get median of the trials
vResults(i, jF) = WorksheetFunction.Median(vTrials)
Next jF
Next i
End With
End Sub

 Test Results

All timings are in Microseconds (millionths of a second).

Looping the Variant Array from a Range (Method 1)

XL2010_Method1The timings for this method show:

  • An overhead of 10 Microseconds per call
  • The 0% column is a good approximation to the time taken to read the range into a variant array. This increases with the size of the range.
  • Search time could be calculated by subtracting the 0% column from the other columns and as expected increases with the number of cells  being traversed.

Using WorksheetFunction.Match on Ranges (Method 2)

XL2010_Method2

The timings for this method show:

  • An overhead of 13 Microseconds per call (larger than the array method)
  • The 0% column is constant because no data is transferred from Excel to VBA..
  • Search time less overhead increases with the number of cells  being traversed but is lower than the array method.

Using Application.Match on Ranges (Method 3)

XL2010_Method3

I added this method to compare using Application.MATCH with WorksheetFunction.MATCH.
The timings for this method show:

  • An overhead of 16 Microseconds per call (larger than the WorkSheetFunction method)
  • The 0% column is constant because no time is taken to transfer the data from Excel to VBA
  • Search time less overhead increases with the number of cells  being traversed and is very similar to the WorkSheetFunction method.

Using WorksheetFunction.Match on an array derived from a Range (Method 4)

XL2010_Method4

The timings for this method show:

  • An overhead of 15 Microseconds per call (larger than both the array and Match methods)
  • The 0% column increases sharply with data size and is larger than Method 1 because the data is transferred from Excel to VBA and back to the MATCH function.
  • Search time less overhead increases with the number of cells  being traversed but is lower than the array method.
  • The 100000 row is missing because this method only allows a maximum of 65536 rows before the MATCH function  fails.

Using the Binary Search (sorted) option of WorksheetFunction.Match on Ranges (Method 5)

XL2010_Method5WOW!!! That is seriously fast.

  • The overhead is comparable to Linear Search WorksheetFunction.MATCH (Method 2)
  • The 0% column is constant because no data is transferred from Excel to VBA.
  • Search time is close to zero even over 100000 rows because of the efficiency of the Binary Search method being used.
  • The data has to be sorted for this to work
  • If there is missing data it would need to be handles with the double VLOOKUP/MATCH trick.

Using Cell by Cell search (Method 6)

XL2010_Method6The timings for this method show:

  • The overhead is comparable to Method 1 but the 0% column does not increase with increasing data size because only the minimum amount of data is being transferred from Excel to VBA.
  • For large amounts of data this method is extremely inefficient, but for small (20-30 roes) volumes you will not usually notice it.

Breakeven between the Array Method (1) and the WorksheetFunction.MATCH method (2)

Excel2010Compare

This table shows the ratio of method 1 timings to method timings, so anything >1 shows that method 2 is faster.

  • For large amounts of data MATCH is significantly faster than the array method.
  • The breakeven point for Excel 2010 32-bit is around 40 rows.
  • The traversing search speed of MATCH is faster than the array method and data does not need to be transferred from Excel to VBA.
  • But the overhead of the Array method is lower than that of Worksheetfunction.Match

Comparing Excel 2013 with Excel 2010

Excel2013CompareWith Excel 2013 the breakeven point is slightly more in favour of MATCH than with Excel 2010.

Excel 2013 Method 1 (Variant array search)

XL2013_Method1

Excel 2013 Method 2 (WorksheetFunction.Match)

XL2013_Method2

 Conclusions

  •  Both the Array method and WorksheetFunction method are fast.
  • When you just want to do a search the MATCH method is the way to go.
  • When you want to do a search combined with processing the data the Array method is often more efficient. (See match-vs-find-vs-variant-array-vba-performance-shootout  )
  • For small amounts of data either method is good because the timing differences will not be noticeable unless you are iterating the method thousands of times.
  • There does not seem to be a major difference in these tests between Excel 2010 and Excel 2013.
Posted in Uncategorized | Tagged | 5 Comments

FastExcel V3 Released with Introductory Offer

FastExcel has been used successfully by thousands of users since it was first launched in 2001. The last version 2.4 was introduced in 2008 and since that time there have been major changes to Excel with Excel 2007, 2010 and 2013, including 64-bit Excel and Multi-threaded calculation.

FastExcel Version 3 is a major upgrade of the successful V2.4 product and has been under development for several years.

Special Introductory Offer – 50% off the FastExcel V3 Bundle

You can get a special launch offer of a 50% discount on the FastExcel V3 Bundle (all the FastExcel V3 products) as long as you buy a license before the end of July 2014.

The FastExcel V3 Family of Products

There are 3 major products in the FastExcel V3 family which are targeted at different types of useage. The aim is to allow you to only buy the tools you need.

FastExcelV3

FastExcel V3 Profiler

The Profiler gives you a comprehensive set of tools focussed on finding and prioritising calculation bottlenecks. If your spreadsheet takes more than a few seconds to calculate you need FastExcel profiler to find out and prioritize the reasons for the slow calculation

  • Profiling Drill-Down Wizard – the easy way to find bottlenecks
  • Profile Workbook – profiles worksheet calculation times, volatility, multi-threaded efficiency and memory useage.
  • Profile Worksheet – profiles worksheet areas, columns and row including conditional formats
  • Profile Formulas and Functions – profiles each unique formula’s calculation time and function types
  • Map Cross-References – shows and counts the links between worksheets.

FastExcel V3 Manager

FastExcel Manager contains tools to help you build, debug and maintain Excel workbooks.

  • Name Manager Pro – an invaluable tool for managing Defined Names and Tables
  • Formula Viewer/Editer – a better way of editing and debugging more complex formulas.
  • Sheet Manager – Easily manage and manipulate worksheets.
  • Workbook – Cleaner – Trim down any excess bloat in your workbooks
  • Where-Used Maps – See where your Defined Names, Number Formats and Styles are being used

FastExcel SpeedTools

SpeedTools provides you with a state-of-the-art tool-kit to help you speed up your Excel Calculations

  • Calculation timing tools for workbooks, worksheets, and ranges
  • Additional calculations modes to enhance control of calculation so that you only calculate what needs to be calculated.
  • 90 superfast multi-threaded functions
  • Faster and more powerful Lookups and List comparisons
  • Multi-condition filtering and Distinct formulas to eliminate many slow SUMPRODUCT and Array formulas
  • Enhanced functions for Array handling, text, mathematics, sorting, information and logic

Try it out for yourself:

Download the 15-day full-featured trial of FastExcel V3 build 215.642.755.317

Note: the trial version of FastExcel V3 profiler does not enable Profile Workbook, Profile Worksheet and Profile Formulas, and the Drill down wizard will only profile a single worksheet.

You can convert the trial version of FastExcel V3 to a fully licensed version at any time by purchasing one of the FastExcel V3 licensing options.

Want to know more?

Download the FastExcel V3 Help file or the FastExcel V3 User Guide (PDF)
(you may need to unblock the downloaded .CHM file – Right-Click->Properties->Unblock)

Posted in Calculation, Lookups, Memory, UDF, VBA, XLL | Leave a comment

Making sense of complex Formulas: an Indenting Viewer-Editer

Some time ago I was working with a client to speed up one of their workbooks.
I was using the FastExcel V3 formula profiler and it showed that one formula was taking a significant proprtion of the calculation time.

And the formula was too complicated to easily understand. So I decided that FastExcel V3 really needed a better way of understanding, creating and modifying formulas and started developing one. It currently looks like this:

Indent1Of course the trouble with creating a formula indenter is:

  • Nobody agrees what the “correct” indentation style is!
  • And anyway what works well for one formula does not neccessarily work well for another formula.

So I added the ability to dynamically switch indentation styles: for me splitting the OR(…) section by commas makes it easier to read -

Indent2The Viewer-Editer also helps you debug the formula by showing you the result of the selected portion of the formula and by making it easy and fast to jump to and select different parts of the formula.

Indent3The Select options work in conjunction with the navigate arrows (Next left, This, Next right, Expand selection, Contract selection).
So if you click the right arrow with Functions selected the selection jumps to the next function on the right and shows you the result in the evaluate box.

Indent4Modifying the Formula

You can modify the formula by directly editing the formula text and there are also many of the familiar Excel tools built-in:

  • Function Wizard
  • Insert a Reference or a Defined Name
  • Change a reference from Relative to Absolute (F4)
  • Build up a Mega-Formula by inserting a copy of another formula

Indent6

Clicking the Function Wizard button when a function is selected brings up the function wizard for that function so that its easy to change:

Indent5but if nothing is selected then the Function Wizard is called allowing you to choose a function, enter its parameters, and have it inserted at the current insertion point.

Conclusion

I have added quite a lot of function to the viewer-editer since the original concept, but I am sure it can be improved further.

So please download FastExcel V3 Beta 3, try it out and let me have your comments.

 

 

Posted in Uncategorized | 6 Comments