Zack (Firefytr) Barresse (who wrote the definitive guide to Excel Tables with Kevin (Zorvek) Jones) recommends a limit of around 10K rows for tables if you want to keep performance reasonable.

Prompted by a thread on the Excel-L forum I thought I should spend some time researching this.

Eric Lacroix kindly posted a test problem on Dropbox. I have simplified it further to make it clearer what is going on.

The workbook has a Table with 15000 rows and 2 calculated columns.

Columns C and D contain COUNTIFS formulas referencing columns :

A full calculation on this workbook with Excel 2013 takes **3.6 seconds** on my 4.5 GHz i7 6700K.

**In Manual Calculation mode, if you copy cells B2:B15000 (which do NOT contain any formulas), then doing a Paste Special Values back onto column B takes 1.9 seconds!**

There is no calculation time involved in this operation and none of the formulas in columns C and D are recalculated or re-evaluated. It is just the paste values operation that takes the time.
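If you want to reproduce this kind of measurement yourself, a minimal sketch along the following lines may help. It assumes the table data sits in B2:B15000 on the active sheet, and it uses the built-in `Timer` function, which is only accurate to roughly a hundredth of a second: good enough to see a 1.9 second paste, but too coarse to time a 2 millisecond one. The procedure name is hypothetical.

```vba
Sub TimePasteValues()
    ' Sketch only: times a Paste Special Values of a range onto itself
    ' in Manual calculation mode. The range address is an assumption.
    Dim rng As Range
    Dim tStart As Double
    Dim oldCalc As XlCalculation

    Set rng = ActiveSheet.Range("B2:B15000")

    ' remember the current calculation mode so it can be restored
    oldCalc = Application.Calculation
    Application.Calculation = xlCalculationManual

    rng.Copy
    tStart = Timer
    rng.PasteSpecial Paste:=xlPasteValues
    Debug.Print "Paste values took " & Format$(Timer - tStart, "0.000") & " seconds"

    ' tidy up: clear the clipboard marquee and restore calculation mode
    Application.CutCopyMode = False
    Application.Calculation = oldCalc
End Sub
```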

If you convert the table to a normal range, which converts the structured references to normal range references, then

- Full Calculation still takes 3.6 seconds
- **But the Paste operation takes about 2 milliseconds! About 1000 times faster.**

So the slowdown is:

- Not caused by Calculation
- Caused by Structured References

After doing some more research I discovered that the problem is caused by Excel 2013 being extremely slow to flag all the formulas containing structured references to the data in Column B for recalculation (make them dirty).

And it’s only slow if the formulas are not already dirty (but note that doing a recalculation automatically “cleans” all the formulas).

**If you set the ForceFullCalculation property of the workbook to true** then Excel does not bother to dirty formulas. The downside is that Excel then always does a full calculation of all the formulas in the workbook rather than a smart recalculation of only the dirtied formulas.

**So it’s a trade-off: faster editing but slower calculation.**
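As a rough sketch of how the property can be switched from VBA (the procedure name is hypothetical, and the choice of `ActiveWorkbook` is an assumption about which workbook you want to change):

```vba
Sub SetForceFullCalc(ByVal enable As Boolean)
    ' Sketch only: toggles the ForceFullCalculation property of the
    ' active workbook. When True, Excel skips dirtying formulas but
    ' recalculates every formula in the workbook at each calculation.
    ActiveWorkbook.ForceFullCalculation = enable
End Sub
```

Calling `SetForceFullCalc True` before a large batch of edits and `SetForceFullCalc False` afterwards is one way to experiment with the trade-off on your own workbooks.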

I was surprised to find that when I tried to duplicate the problem with Excel 2016 I could not!

**The Excel team have fixed the slowdown! (But don’t seem to have told anyone).**


Looking at his benchmark and its VBA code you can see that it has ScreenUpdating ON and repeatedly calculates in Automatic mode. Each calculation triggers some RANDBETWEEN functions with a number of dependent cells.

Gurs does not want to speed up his benchmark because that would destroy his historic speed comparisons.

But the problem is that a large portion of the time in his benchmark is taken by screen updating, and so his benchmark results vary significantly depending what part of the worksheet is actually visible on the screen, and hence **how many visible cells get refreshed at each calculation**.

I ran Gurs’ benchmark on my desktop PC with Excel 2003 to Excel 2013, with a constant screen area visible (rows 1:118 and columns A:BF). Since I cannot install Excel 2016 on the same PC as previous versions without causing unwanted problems, I used a VM on my desktop and also ran the benchmark on my Surface Pro 3.

This shows loops per second (higher is better) with Screen Updating On and Off, the ratio of OFF to ON and the ratio to Excel 2003.

The results for Excel 2003, 2007 and 2010 show that:

- Excel 2007 and 2010 are slower than Excel 2003
- Screen Updating On is 15-20 times slower than Screen Updating Off

But something changed with Excel 2013!

- Screen Updating ON gets significantly faster
- Screen Updating OFF gets significantly slower

My visual impression is that Excel 2013 does not try to update the screen on every iteration when the update frequency is high, and this is the reason for the change.

And although I don’t have an exact comparison for Excel 2016 it looks comparable to Excel 2013.

So my next step was to try to create a pure screen updating benchmark.

In column A I put 28 =RAND() formulas, and then in columns B:V I put very simple formulas that linked back to column A.

This gives me 616 cells that will change on each calc, and it’s easy to keep all 616 cells visible on the screen. The VBA code times 10000 calculations with Screen Updating Off and again with Screen Updating On. The difference between these is the time taken by the screen updating.

    Sub ScreenTest()
        '
        ' time screenrefresh
        '
        Dim i As Long
        Dim tStart As Double
        Dim tEnd As Double
        Dim tScreenOn As Double
        Dim tScreenOff As Double
        With Application
            .Calculation = xlCalculationManual
            .ScreenUpdating = False
            tStart = MicroTimer
            For i = 1 To 10000
                .Calculate
            Next i
            tEnd = MicroTimer
            tScreenOff = (tEnd - tStart)
            .ScreenUpdating = True
            tStart = MicroTimer
            For i = 1 To 10000
                .Calculate
            Next i
            tEnd = MicroTimer
            tScreenOn = (tEnd - tStart)
        End With
        MsgBox "Off " & Int((tScreenOff) * 1000) & _
               " On " & Int(tScreenOn * 1000) & _
               " Diff " & Int((tScreenOn - tScreenOff) * 1000) & " Millisecs"
    End Sub
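ScreenTest relies on a `MicroTimer` function that is not listed in this post. A minimal sketch of such a high-resolution timer, assuming VBA7 (Excel 2010 or later) on Windows and using the Windows `QueryPerformanceCounter` and `QueryPerformanceFrequency` API calls, could look like this. The alias names `getFrequency` and `getTickCount` are just illustrative, and the declarations must go at the top of a standard module.

```vba
Option Explicit

' Windows high-resolution counter API calls (VBA7 / Office 2010+ syntax).
' Currency is used as a convenient 64-bit container for the tick values.
Private Declare PtrSafe Function getFrequency Lib "kernel32" _
    Alias "QueryPerformanceFrequency" (cyFrequency As Currency) As Long
Private Declare PtrSafe Function getTickCount Lib "kernel32" _
    Alias "QueryPerformanceCounter" (cyTickCount As Currency) As Long

Function MicroTimer() As Double
    ' Returns elapsed seconds from the high-resolution counter,
    ' or 0 if the counter frequency is not available.
    Dim cyTicks As Currency
    Static cyFrequency As Currency

    MicroTimer = 0
    ' the frequency is fixed for the session, so only fetch it once
    If cyFrequency = 0 Then getFrequency cyFrequency
    getTickCount cyTicks
    ' both values carry the same Currency scaling, so it cancels out
    If cyFrequency <> 0 Then MicroTimer = cyTicks / cyFrequency
End Function
```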

The timing results for this test of screen updating are:

**For this (very extreme) benchmark Excel 2013 screen updating is about 180 times faster than previous versions.**

**But with screen updating turned off Excel 2013 runs this benchmark 5 times slower!**

**I think what has happened is that Excel 2013 is still doing all the work to update and format the values, but has added a check to limit the frequency of requests to Windows to actually repaint the screen.**


Excel’s usable memory has been increasing steadily with each version:

- Excel 2003: 1 Gigabyte of working set memory
- Excel 2007: 2 Gigabytes of virtual memory
- Excel 2010, Excel 2013 and Excel 2016 32-bit: 2 Gigabytes of virtual memory
- Excel 2010, Excel 2013 and Excel 2016 64-bit: 131072 Gigabytes of virtual memory

Although the introduction of the 64-bit versions of Excel in theory removed any real limitation, many people were not able to switch to 64-bit Excel because:

- Most OCX controls are only available in 32-bit
- Many third party addins are only available in 32-bit

But the need for larger usable memory has also been increasing:

- Excel models seem to get larger every year
- Each successive Excel version uses more memory than the previous version
- PowerPivot and other BI tools need a lot of memory

So earlier this year the Excel team announced and made available a change to 32-bit Excel 2013 and 2016:

**If you are using a 64-bit version of Windows this change doubles available virtual memory for 32-bit Excel 2013 and 2016 to 4 Gigabytes.**

**If you are using a 32-bit version of Windows then this change can increase virtual memory for Excel 2013 and 2016 to 3 Gigabytes, BUT:**

- With 32-bit Windows you need to enable the /3GB boot switch
- This switch halves the amount of memory (from 2GB to 1 GB) available to 32-bit Windows.

This Large Address Aware (LAA) change was introduced in updates in May and June 2016:

- For Excel 2013 you need to be using Build 15.0.4833 or later.
- For Excel 2016 Office 365 you need to be using Build 16.0.6868.2060 or later
- For Excel 2016 MSI you need to be using Build 16.0.4394.1000 or later

For more details on the LAA change see this Knowledge Base article

Finding out how much virtual memory Excel is actually using, and what the current maximum limit is for your installation, is surprisingly difficult.

- Task Manager only shows working set memory, which is not the same thing as virtual memory.
- Process Explorer can show virtual memory used, but you have to add an additional column.

And I have not found a readily available tool that tells you what Excel’s maximum usable memory is. **So I decided to create one using Windows API calls and VBA.**

**Here are a few examples of the output:**

**Excel 2013 32-bit with 64-bit Windows: 4GB**

**Excel 2016 32-bit with 32-bit Windows without the /3GB boot switch: 2GB**

**Excel 2016 64-bit with 64-bit Windows: 131072 GB**
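The tool itself is not listed here, but as a rough idea of the approach, the sketch below asks Windows for the total and available virtual address space of the current process using the `GlobalMemoryStatusEx` API call. It is an illustration rather than the actual tool: the procedure name is hypothetical, it assumes VBA7 (Excel 2010 or later), and it uses Currency fields to hold the 64-bit values, so each one has to be multiplied by 10,000 to get bytes.

```vba
Option Explicit

' Layout of the Windows MEMORYSTATUSEX structure.
' Currency (a 64-bit integer scaled by 10,000) stands in for ULONGLONG.
Private Type MEMORYSTATUSEX
    dwLength As Long
    dwMemoryLoad As Long
    ullTotalPhys As Currency
    ullAvailPhys As Currency
    ullTotalPageFile As Currency
    ullAvailPageFile As Currency
    ullTotalVirtual As Currency
    ullAvailVirtual As Currency
    ullAvailExtendedVirtual As Currency
End Type

Private Declare PtrSafe Function GlobalMemoryStatusEx Lib "kernel32" _
    (ByRef lpBuffer As MEMORYSTATUSEX) As Long

Sub ShowExcelVirtualMemory()
    ' Sketch only: reports the virtual memory limit and usage of the
    ' Excel process this code is running in.
    Const BYTES_PER_GB As Double = 1073741824#
    Dim ms As MEMORYSTATUSEX
    Dim totalGB As Double
    Dim usedGB As Double

    ' the API requires the structure size to be filled in first
    ms.dwLength = LenB(ms)
    If GlobalMemoryStatusEx(ms) = 0 Then
        MsgBox "GlobalMemoryStatusEx failed"
        Exit Sub
    End If

    ' undo the Currency scaling (x 10,000) and convert bytes to GB
    totalGB = CDbl(ms.ullTotalVirtual) * 10000# / BYTES_PER_GB
    usedGB = CDbl(ms.ullTotalVirtual - ms.ullAvailVirtual) * 10000# / BYTES_PER_GB

    MsgBox "Virtual memory limit: " & Format$(totalGB, "#,##0.00") & " GB" & vbCrLf & _
           "Virtual memory used: " & Format$(usedGB, "#,##0.00") & " GB"
End Sub
```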

Let me know of any problems!


**It was also the day of the USA presidential election. This was my flashbulb moment, observed from a European perspective:**

A small group of us Excel MVPs (Roger Govier, Liam Bastick & me) were working late at MSoft with a bunch of Excel Dev Team Microsofties (including Ben who was with us at Excel Summit South in OZ/NZ earlier this year, and Joe who was at the London GTC last December).

So when we left at 18:45 we had to get a lift back to Bellevue from the Microsoft Redmond campus with Joe because all the MVP transport had finished. There was a bad crash on the freeway so we were stuck in traffic and Joe called his girlfriend to say he was going to be late. The very first thing she said was “it’s awful – I am very distressed – CNN says 75% probability for Trump”.

Stunned silence and disbelief in the car then we start listening to the radio.

It becomes clear that he really will win.

We eventually get to the UK MVP get-together just after 8 and manage to scrounge a glass of wine but no food left, then at 9 migrate to the Billiard Parlour (this year’s Excel haunt) and watch TV over a beer, trying to come to terms with this disaster. The Canadian Immigration website crashes because too many people try to apply to become Canadian citizens. USA MVP Jon keeps apologizing to us “ I’m sorry. I’m sorry – I did all I could …”

I wake up at 5:30 although my alarm is set for 6:30. The full scale of the disaster becomes apparent. Not only has Trump won but the Republicans have majorities in both the House and the Senate and there is a Supreme Court judge to be appointed who will hold the balance there.

There are no checks and balances left. The Donald has the keys to the nuclear codes.


We had some great sessions with the Excel and the Office Extensibility Product teams, and I have to say that IMHO some of the things they are working on (a few years out) are fairly revolutionary. Much as I would love to tell you all about it, I cannot – strict NDA applies.

Of course the Summit is also an opportunity to meet up with many old and new MVP friends (Thanks to Boriana for sharing this photo):

From left to right by row in reverse ragged row order:

Ingeborg Hawighorst, Brad Yundt, Heidi Enho.

Charles Williams, Jacob Hildebrand, Boriana Petrova, Mynda Treacy.

Jon Acampora, Zack Barresse, Bob Umlas.

Ken Puls, Jon Peltier, Frederic Le Guin, Roger Govier.

Henk Vlootman, Jan Karel Pieterse, Bill Manville, Kevin Jones.

This year’s mystery picture is the well-known Excel Jedi Master with added beer:

I don’t think this needs many guesses …


Let’s walk through an example of using a function that has had Intellisense enabled by Govert’s method. As you start typing the name of the function you get a list of functions and an explanation of the function:

Double-clicking the selected function starts entering the function in the formula bar and gives you an additional explanation of the first parameter:

Selecting the name of the function in the Intellisense popup shows a blue link if Help has been enabled for the function:

Clicking the link shows you help:

Pressing **Control-A** invokes the Function Wizard:

Or pressing **Control-Shift-A** fills the function in the formula bar and you can double-click each parameter to get text describing the parameter.

It is really simple to implement this: see Govert’s Excel-DNA Intellisense GitHub page.

For VBA workbooks or add-ins:

- Download and load the latest ExcelDna.IntelliSense.xll or ExcelDna.IntelliSense64.xll from the Releases page.
- Either add a sheet with the IntelliSense function descriptions, or a separate xml file

For my example I added a worksheet called _IntelliSense_ with the descriptions:

**Note:** Excel-DNA Intellisense does not itself enable the descriptions in the Function Wizard or build the Help text for you.

At the moment Excel-DNA Intellisense works with Excel 2010 and later versions, on Windows 7 and later versions.

You can log issues on the GitHub site and Govert is very responsive.


When Excel **expects to get a single cell reference** but you give it a range of cells instead, Excel automagically works out the result of intersecting the range of cells with the row or column of the current cell and uses that. For example:

Entering =A:A in cell B7 does **not** return the whole of column A: it returns the intersection of row 7 and column A. Similarly if A1:A20 is named TwentyCells then entering =TwentyCells in B10 does not return all of A1:A20: it returns the intersection of TwentyCells with row 10.

If you enter =TwentyCells in row 30 there is no intersection, so Excel returns #VALUE!.

If you array-enter (the Control-Shift-Enter keys all at the same time) the formula you are telling Excel that you want all the values in the range, not just one. So that is what you get. If you only array-enter the formula into a single cell, (for example array enter {=A:A} in cell B5) then you only get the first of the result values (**a** is the result of {=A:A} in cell B5).

If you array enter into more than one cell you get more than one result: for example select cells B2:B5, enter =A:A into the formula bar and hit Control-Shift-Enter and B2:B5 will show a b c d.

Usually you give VLOOKUP a single value or reference to use for the lookup value, and a range to use for the lookup table: =VLOOKUP(A4,$A:$C,3,false).

If you give VLOOKUP a range for the lookup value (=VLOOKUP($A:$A,$A:$C,3,false)) and do NOT array-enter the formula, Excel will do the implicit intersection on the lookup value but not on the lookup table.

Excel has implemented implicit intersection very efficiently: it only passes the single cell reference to the formula or function rather than the whole range.

And only that single cell is treated as a precedent, so the formula/function only gets recalculated when that single cell gets changed/dirtied instead of when any cell in the range gets changed/dirtied.

**Unfortunately ever since Excel 95 implicit intersection does not automagically work for VBA, Automation or XLL UDFs.**

**But you can still make it happen in 2 different ways:**

- Put a plus sign in front of the function parameter
- Use VBA to do the implicit intersection for you

    Function ImplicitV(theParam As Variant) As Variant
        ImplicitV = theParam
    End Function

When you enter this very simple UDF with a whole column reference Excel passes a reference to the entire column and the UDF has to handle it all: this is slow – on my fast machine it takes 83 milliseconds.

If you add a + sign Excel only passes the UDF the single cell that is the intersect – this is extremely fast (0.02 milliseconds, over 4000 times faster!).

**And the +sign works (very surprisingly) with both text and numbers!**

**(Thanks to MVP Rory Archibald for pointing this out to me!)**

As you can see when you use +$A:$A Excel treats it as an expression and therefore evaluates the expression before passing it to the UDF:

- Evaluating the expression invokes implicit intersection
- Excel does not pass a range to the UDF – it passes the result of the expression

**Adding a plus sign works well but you and your users have to remember to do it!**

Here is a general purpose VBA function you can call from inside your VBA UDF to do the implicit intersection for you.

**Note: fixed 6/October/2016 to handle implicit intersection with a range on a different sheet.**

    '
    ' example UDF
    '
    Function Implicit2V(theParam As Variant) As Variant
        Implicit2V = fImplicit(theParam, Application.Caller)
    End Function
    '
    ' helper function to handle implicit intersect
    '
    Function fImplicit(theInput As Variant, CalledFrom As Range) As Variant
        '
        ' handle implicit intersection of an input with a calledfrom range
        '
        ' Charles Williams - Decision Models - 3 october 2016
        '
        ' check for input range
        '
        If TypeOf theInput Is Range Then
            If TypeOf CalledFrom Is Range Then
                '
                ' both input and called from are ranges
                '
                If Not CalledFrom.HasArray And theInput.CountLarge > 1 Then
                    '
                    ' called from is not an array formula and the input has more than 1 cell so do implicit
                    '
                    ' try intersect with row first
                    '
                    Set fImplicit = Intersect(theInput, theInput.Parent.Cells(CalledFrom.Row, 1).EntireRow)
                    '
                    ' if no intersect try column
                    '
                    If fImplicit Is Nothing Then Set fImplicit = Intersect(theInput, theInput.Parent.Cells(1, CalledFrom.Column).EntireColumn)
                    '
                    ' if still nothing return #Value to mimic XL standard behaviour
                    '
                    If fImplicit Is Nothing Then fImplicit = CVErr(xlErrValue)
                Else
                    '
                    ' both are ranges but implicit intersect not applicable
                    '
                    Set fImplicit = theInput
                End If
            Else
                '
                ' calledfrom is not a range but input is a range so return a range
                '
                Set fImplicit = theInput
            End If
        Else
            '
            ' input is not a range so return it in a variant
            '
            fImplicit = theInput
        End If
    End Function

**This is nearly as efficient as using the plus sign (0.04 milliseconds compared to 0.02 milliseconds) – and has the major advantage that you can build it into your UDFs.**

The disadvantage compared to the + sign trick is that the whole range is treated as a precedent, so the UDF will be recalculated whenever ANY cell in the input range gets dirtied or recalculated.

It still works even when array-entered or when you add the plus sign, but of course that is going to be slow.

If you use the + sign trick then the UDF parameter has to either be a Variant or Double/String/Boolean type that matches the data type: Range and Object don’t work because Excel always passes the result value rather than a reference.

If you use the fImplicit helper function without the + sign and pass a range then you can use a parameter data type of Variant or Range or Object.
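As an illustration of the Range parameter case, here is a minimal sketch of a UDF that declares its parameter as Range and hands it to the fImplicit helper. The name `Implicit2R` is hypothetical, and the sketch assumes fImplicit is available in the same VBA project.

```vba
Function Implicit2R(theParam As Range) As Variant
    ' Sketch only: the parameter is declared As Range, so this UDF must
    ' be called with a reference (no + sign in front of the argument).
    ' fImplicit narrows a multi-cell range to the implicit intersection
    ' with the calling cell; assigning without Set returns its value.
    Implicit2R = fImplicit(theParam, Application.Caller)
End Function
```

Entering `=Implicit2R($A:$A)` in a cell should then return the value from column A on the same row, whereas `=Implicit2R(+$A:$A)` would be expected to fail with #VALUE! because Excel passes a value rather than a reference.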

- **Using Implicit Intersection with functions can be very efficient**
- **The + sign trick works well but needs training and remembering to use it!**
- **A general purpose helper function like fImplicit is fast and more user friendly than + sign**


Well I did not see that coming – the reasons people do not like it are:

- I feel like I have to run my head through a meat grinder to understand the syntax for referencing pivotfields
- So bad
- GETPIVOTDATA is evil because the slightest change in a pivottable’s layout breaks the function and it is extremely hard to pin-point what element of the PivotTable the function is getting.
- It demand text, not cell references. That makes it unuseful.
- It always breaks, eventually, and more frequently than any other formula. It’s not dynamic whatsoever. It’s unbelievably slow.
- -Pain in the ass to use- can’t fill down formulas easily -Hard to use when manupulating pivot tables -Hard to debug
- I loath this function. You can not disable it. To get a value from a pivot table that is relative you have to manually key it in. I also dislike indirect, but getpivotdata is the worst.
- can’t copy
- It’s just a d@mned nuisance!
- It has limited use and I am always changing it to a direct reference. Less of an issue in the newer Excels
- Weird referencing protocol, and formula traceability doesn’t work.
- Defaults are absolute. Editing for relative references are cumbersome.
- Too many parameters
- Might actually be a good function but have not read very much on it to fully understand what it does or how to use it.
- It’s never what I want it to do and not as smart as I assume it should be..
- pivot tables are lovely for quick data review, but a pain in the ass for long term analysis set ups. far better to create some combined index fields and filter them as a pseudo pivot for greater control. IfERROR() is also pretty evil, but in a subtle way, it hides failures in your working so tthey are difficult to debug!
- It always appears when I don’t want it to

**The reasons people gave for hating VLOOKUP were:**

- Slow
- Wrong default (sorted) mostly gives wrong answer
- Column number not understandable and breaks when columns inserted
- Inflexible

**The reasons people loved it were:**

- It solves so many data cleansing situations.
- Because for most people it’s the first function that gives them that ‘wow, Excel is powerful’ moment. It makes them want to learn more. Including better functions like INDEX & MATCH!
- Because I hate Access and VLOOKUP lets me avoid it! Matching and classifying without all that Access nonsense.
- Have used it the most. Need to learn how to use Index and Match in its place however.
- Easy to understand the syntax
- It lets me get stuff done.
- Because I use it multiple times a day, and it allows you to re-create relational database functionality within Excel.
- Easy to use. Sometimes however I use INDEX(MATCH instead of VLOOKUP

Easily the most loved function is INDEX/MATCH. Although it is more cumbersome to use than VLOOKUP, people love its flexibility, ability to construct robust solutions and potential for optimisation. Many people start by using VLOOKUP and then migrate to INDEX/MATCH.

If you look at the combined positive votes for VLOOKUP and INDEX/MATCH it’s probable that the people surveyed consider LOOKUPs the most important thing in Excel.

I think the reason for the love-hate relationship with SUMPRODUCT is that it’s being used to fill the hole of a function that does not exist in Excel (I call it FILTER.IFS).

Its power allows it to be used for a purpose it was never designed for, but that same power comes with a significant cost in performance terms.

Undoubtedly there is a need for a GETPIVOTDATA-like function.

If it worked directly from the pivot cache in a multi-threaded way it would be a lot more robust and performant.

It would also need a wizard of some kind to simplify picking the field names.

**So how would you improve GETPIVOTDATA?**


**If you have a different choice you can cast your vote for your most evil function (and your favourite function) here.**

In no particular order here are my reasons: you may have more!

The INDIRECT function is volatile, which makes any formula that contains it volatile so that they defeat Excel’s smart recalc and recalc at every calculation. And of course this ripples down the dependency chains to make all dependent formulas also recalculate: **SLOW!!!**

The INDIRECT function is single-threaded and so defeats Excel’s multi-threaded calculation engine:

**SLOW!!**

If the argument you give INDIRECT cannot be resolved to a usable reference, INDIRECT returns #REF!. The problem is that the process of attempting to resolve the reference involves looking in a very large number of places, which consumes a lot of time:

**Very SLOW!!!**

(Colin Legg has a more detailed post about this problem here)

If you use INDIRECT to refer to external workbooks then they have to be open or else INDIRECT won’t work.

**Error-prone & Fragile**

Because the argument to INDIRECT is text rather than a cell reference it does not automagically adjust when rows or columns are added/deleted/moved. OK, it’s possible to build more complex INDIRECT formulas in some cases that do adjust, but they tend to get complex and error-prone.

**Error-prone & Fragile**

It can be very difficult to understand & debug formulas containing INDIRECT because they are often complex, and because the trace precedents tool gets blocked by a textual reference.

**Error-prone & Difficult**

INDIRECT is an extremely powerful function that is often used to create workbooks that can dynamically adjust to structural changes such as changing the ranges or worksheets or the external workbooks that are being used in formulas. Useful alternatives can be:

The CHOOSE function is not volatile and is multi-threaded and is easy to debug and maintain.

CHOOSE(Index_Num,Arg1,Arg2, … Arg254)

The first argument must resolve to a number between 1 and 254 that determines which of the following arguments is returned. The first argument could be a MATCH function that looks up a parameter in a list to get a number. The arguments to be chosen from could be defined names referring to ranges, references, formulas or values.

The drawback of the CHOOSE function is that the formula gets unwieldy when there are a large number of choices.

In addition to the standard form of INDEX (=INDEX(Range, Row_Index, Column_Index) ) there is a reference form which can be used to select from multiple ranges.

INDEX((references),Row_Index,Column_Index,Reference_Index)

The multiple ranges must be:

- Enclosed in ( )
- On the same worksheet

The drawback of this form of INDEX is that the ranges must be on the same worksheet.

If INDIRECT is being used to insulate the workbook against structural changes you could consider using VBA to modify the relevant formulas. Using Defined Names to hold frequently used formulas, and modifying the defined names may prove easier and more efficient than modifying every formula. Remember that Defined Names can also hold formulas containing relative references. For relative references I recommend using R1C1 mode and notation when creating the named relative formulas.
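A minimal sketch of the Defined Names approach is shown below. All the names and addresses in it (`SourceData`, `Sheet2`, `CellToLeft`) are hypothetical placeholders: it assumes a workbook-level name called SourceData already exists and that the workbook has a sheet called Sheet2.

```vba
Sub RepointDefinedNames()
    ' Sketch only: formulas that use the name SourceData pick up the
    ' new range without any of the formulas themselves being edited.
    ThisWorkbook.Names("SourceData").RefersTo = "=Sheet2!$A$1:$C$100"

    ' A named relative formula is easiest to define in R1C1 notation:
    ' CellToLeft always refers to the cell to the left of whichever
    ' cell the name is used in.
    ThisWorkbook.Names.Add Name:="CellToLeft", RefersToR1C1:="=RC[-1]"
End Sub
```

Because the formulas only ever mention the names, a structural change becomes a matter of updating a handful of names rather than rewriting every formula, and nothing volatile is introduced.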

INDIRECT is evil because:

- It’s slow
- It’s fragile and easily broken
- It’s hard to debug
- It’s hard to understand

If you have better ideas for eliminating INDIRECT please help us stamp out use of INDIRECT!


Just use coupon code LIMITED – the first 15 registrations using that coupon code will get a 10% discount.

Also joining us will be Yigal Edery and Ben Rampson from the Excel Dev Team:
