2016 Microsoft MVP Summit, Part 1

I was privileged to attend the 2016 Microsoft MVP Summit in Redmond November 6-10 along with many other Excel MVPs.

We had some great sessions with the Excel and the Office Extensibility Product teams, and I have to say that IMHO some of the things they are working on (a few years out) are fairly revolutionary. Much as I would love to tell you all about it, I cannot – strict NDA applies.

Of course the Summit is also an opportunity to meet up with many old and new MVP friends (Thanks to Boriana for sharing this photo):


From Left -to-right by row in reverse ragged row order:

Ingeborg Hawighorst, Brad Yundt, Heidi Enho.
Charles Williams, Jacob Hildebrand, Boriana Petrova, Mynda Treacy.
Jon Acampora, Zack Barresse,  Bob Umlas.
Ken Puls, Jon Peltier, Frederic Le Guin, Roger Govier.
Henk Vlootman, Jan Karel Pieterse, Bill Manville, Kevin Jones.

This years mystery picture is the well-known  Excel Jedi Master with added beer:


I don’t think this needs many guesses …

Posted in Uncategorized | Tagged , , | Leave a comment

Writing Efficient VBA UDFs Part 15: Adding Intellisense to your UDFs

For several years people have been asking the Excel Dev team to add the capability to create Intellisense for UDFs to work the same way as native Excel functions. Well, now Govert van Drimmelen, author of the wonderful Excel DNA, has made a solution available for both VBA and XLL UDFs. It is now available in public Beta test.

Entering a Function

Lets walk through an example of using a function that has had Intellisense enabled by Govert’s method. As you start typing the name of the function you get a list of functions and an explanation of the function:


Double-clicking the selected function starts entering the function in the formula bar and gives you an additional explanation of the first parameter:


Selecting the name of the function in the Intellisense popup shows a blue link if Help has been enabled for the function:


Clicking the link shows you help:


Pressing Control-A invokes the Function Wizard:


Or pressing Control-Shift-A fills the function in the formula bar and you can double-click each parameter to get text describing the parameter.


Implementing IntelliSense using Govert’s Method

It is really simple to implement this: see Govert’s Excel-DNA Intellisense GitHub page.

For VBA workbooks or add-ins:

  • Download and load the latest ExcelDna.IntelliSense.xll or ExcelDna.IntelliSense64.xll from the Releases page.
  • Either add a sheet with the IntelliSense function descriptions, or a separate xml file

For my example I added a worksheet called _IntelliSense_ with the descriptions:


Note: DNA Intellisense does not itself enable the descriptions in the Function Wizard or build the Help text for you.


At the moment ExcelDNA Intellisense works with Excel 2010 and later versions, Windows 7 and later versions.

You can log issues on the Github site and Govert is very responsive.


I think Govert has done a fantastic job with this.

Please help by testing this solution so that it can progress beyond Beta test.










Posted in .NET, UDF, VBA, XLL | Tagged , , , , | 10 Comments

Writing Efficient VBA UDFs Part 14: Handling whole columns using Implicit Intersection – Updated

Excel has an interesting and very efficient trick called Implicit Intersection which allows you to use large Named Ranges and whole column references efficiently.

What is Implicit Intersection?

When Excel expects to get a single cell reference but you give it a range of cells instead, Excel automagically works out the result of intersecting the the range of cells with the row or column of the current cell and uses that. For example:


Entering =A:A in cell B7 does not return the whole of columns A: it returns the intersection of row 7 and column A. Similarly if A1:A20 is named TwentyCells then entering =TwentyCells in B10 does not return all of A1:A20: it returns the intersection of TwentyCells with row 10.

What happens if there is no intersection?

If you enter =TwentyCells in row 30 there is no intersection, so Excel returns #Value.

What happens if you array-enter the formula?

If you array-enter (the Control-Shift-Enter keys all at the same time) the formula you are telling Excel that you want all the values in the range, not just one. So that is what you get. If you only array-enter the formula into a single cell, (for example array enter {=A:A} in cell B5)  then you only get the first of the result values (a is the result of {=A:A} in cell B5).
If you array enter into more than one cell you get more than one result: for example select cells B2:B5, enter =A:A into the formula bar and hit Control-Shift-Enter and B2:B5 will show a b c d.

So how does this work with functions like VLOOKUP?

Usually you give VLOOKUP a single value or reference to use for the lookup value, and a range to use for the lookup table: =VLOOKUP(A4,$A:$C,3,false).
If you give VLOOKUP a range for the lookup value (=VLOOKUP($A:$A,$A:C,3,false) and do NOT array-enter the formula Excel will do the implicit intersection on the lookup value but not on the lookup table.


Implicit Intersection is Amazingly Fast!

Excel has implemented implicit intersection very efficiently: it only passes the single cell reference to the formula or function rather than the whole range.

And only that single cell is treated as a precedent, so the formula/function only gets recalculated when that single cell gets changed/dirtied instead of when any cell in the range gets changed/dirtied.

But UDFs require Special Tricks

Unfortunately ever since Excel 95 implicit intersection does not automagically work for VBA, Automation or XLL UDFs.

But you can still make it happen in 2 different ways:

  • Put a plus sign in front of the function parameter
  • Use VBA to do the implicit intersection for you

A VBA UDF Example

Function ImplicitV(theParam As Variant) As Variant
ImplicitV = theParam
End Function


When you enter this very simple UDF with a whole column reference Excel pass a reference to the entire column and the UDF has to handle it all: this is slow – on my fast machine it takes 83 milliseconds.


If you add a + sign  Excel only passes the UDF the single cell that is the intersect – this is extremely fast (0.02 milliseconds, over 4000 times faster!).

And the +sign works (very surprisingly) with both text and numbers!

(Thanks to MVP Rory Archibald for pointing this out to me!)


As you can see when you use +$A:$A Excel treats it as an expression and therefore evaluates the expression before passing it to the UDF:

  • Evaluating the expression invokes implicit intersection
  • Excel does not pass a range to the UDF – it passes the result of the expression

Adding a plus sign works well but you and your users have to remember to do it!

 Using VBA to do the Implicit Intersection

Here is a general purpose VBA function you can call from inside your VBA UDF to do the implicit intersection for you..

Note: fixed 6/October/2016 to handle implicit intersection with a range on a different sheet.

' example UDF
Function Implicit2V(theParam As Variant) As Variant
Implicit2V = fImplicit(theParam, Application.Caller)
End Function
' helper function to hande implicit intersect
Function fImplicit(theInput As Variant, CalledFrom As Range) As Variant
' handle implicit intersection of an input with a calledfrom range
' Charles Williams - Decision Models - 3 october 2016
' check for input range
If TypeOf theInput Is Range Then
If TypeOf CalledFrom Is Range Then
' both input and called from are ranges
If Not CalledFrom.HasArray And theInput.CountLarge > 1 Then
' called from is not an array formula and the input has more than 1 cell so do implicit
' try intersect with row first
Set fImplicit = Intersect(theInput, theInput.Parent.Cells(CalledFrom.Row, 1).EntireRow)
' if no intersect try column
If fImplicit Is Nothing Then Set fImplicit = Intersect(theInput, theInput.Parent.Cells(1, CalledFrom.Column, 1).EntireColumn)
' if still nothing return #Value to mimic XL standard behaviour
If fImplicit Is Nothing Then fImplicit = CVErr(xlErrValue)
' both are ranges but implicit intersect not applicable
Set fImplicit = theInput
End If
' calledfrom is not a range but input is a range so return a range
Set fImplicit = theInput
End If
' input is not a range so return it in a variant
fImplicit = theInput
End If
End Function

This is nearly as efficient as using the plus sign (0.04 milliseconds compared to 0.02 milliseconds) – and has the major advantage that you can build it into your UDFs.

The disadvantage compared to the +sign trick is that the whole range is treated as a precent so the UDF will be recalculated whenever ANY cell in the input range gets dirtied or recalculated.

It still works even when when array-entered or when you add the plus sign, but of course that is going to be slow.

Using Range or Object or Double or String as the parameter datatype.

If you use the + sign trick then the UDF parameter has to either be a Variant or Double/String/Boolean type that matches the data type: Range and Object don’t work because Excel always passes the result value rather than a reference.

If you use the fImplicit helper function without the + sign and pass a range then you can use a parameter data type of Variant or Range or Object.


  • Using Implicit Intersection with functions can be very efficient
  • The + sign trick works well but needs training and remembering to use it!
  • A general purpose helper function like fImplicit is fast and more user friendly than + sign



Posted in arrays, Calculation, UDF, VBA, XLL | Tagged , , , , | 14 Comments

Excel: Evil and Hero Functions

Some time back I setup a (not too serious) survey (based on Excel online of course) to see what people thought were the best (most heroic) and worst (most evil) Excel worksheet functions.

And the most evil function is:



Well I did not see that coming – the reasons people do not like it are:

I feel like I have to run my head through a meat grinder to understand the syntax for referencing pivotfields
So bad
GETPIVOTDATA is evil because the slightest change in a pivottable’s layout breaks the function and it is extremely hard to pin-point what element of the PivotTable the function is getting.
It demand text, not cell references. That makes it unuseful.
It always breaks, eventually, and more frequently than any other formula. It’s not dynamic whatsoever. It’s unbelievably slow.
-Pain in the ass to use- can’t fill down formulas easily
-Hard to use when manupulating pivot tables
-Hard to debug
I loath this function. You can not disable it. To get a value from a pivot table that is relative you have to manually key it in.
I also dislike indirect, but getpivotdata is the worst.
can’t copy
It’s just a d@mned nuisance!
It has limited use and I am always changing it to a direct reference. Less of an issue in the newer Excels
Weird referencing protocol, and formula traceability doesn’t work.
Defaults are absolute. Editing for relative references are cumbersome.
Too many parameters
Might actually be a good function but have not read very much on it to fully understand what it does or how to use it.
It’s never what I want it to do and not as smart as I assume it should be..
pivot tables are lovely for quick data review, but a pain in the ass for long term analysis set ups. far better to create some combined index fields and filter them as a pseudo pivot for greater control.

IfERROR() is also pretty evil, but in a subtle way, it hides failures in your working so tthey are difficult to debug!

It always appears when I don’t want it to

VLOOKUP – the love-hate relationship

The reasons people gave for hating VLOOKUP were:

  • Slow
  • Wrong default (sorted) mostly gives wrong answer
  • Column number not understandable and breaks when columns inserted
  • Inflexible

The reasons people loved it were:

It solves so many data cleansing situations.
Because for most people it’s the first function that gives them that ‘wow, Excel is powerful’ moment. It makes them want to learn more. Including better functions like INDEX & MATCH!
Because I hate Access and VLOOKUP lets me avoid it! Matching and classifying without all that Access nonsense.
Have used it the most.  Need to learn how to use Index and Match in its place however.
Easy to understand the syntax
It lets me get stuff done.
Because I use it multiple times a day, and it allows you to re-create relational database functionality within Excel.
Easy to use. Sometimes however I use INDEX(MATCH instead of VLOOKUP


Easily the most loved function is INDEX/MATCH. Although it is more cumbersome to use than VLOOKUP people love it’s flexibility, ability to construct robust solutions and potential for optimisation. Many people start by using VLOOKUP and then migrate to INDEX/MATCH.

If you look at the combined positive votes for VLOOKUP and INDEX/MATCH its probable that the survey people consider LOOKUPs the most important thing in Excel.


I think the reason for the love-hate relationship with SUMPRODUCT is because its being used to fill the hole of a function that does not exist in Excel (I call it FILTER.IFS),=.
It’s power allows it to be used for a purpose it was never designed for, but that same power comes with a significant cost in performance terms.

Could GETPIVOTDATA be improved?

Undoubtedly there is a need for a GetPivotData like function.
If it worked directly from the pivot cache in a multi-threaded way it would be a lot more robust and performant.

It would also need a wizard of some kind to simplify picking the field names.

So how would you improve GETPIVOTDATA?

Posted in Uncategorized | 5 Comments

INDIRECT – Excel’s Most Evil Function

My choice for Excel’s most evil worksheet function is INDIRECT, for reasons outlined below.

If you have a different choice you can cast your vote for your most evil function (and your favourite function) here.

So what makes INDIRECT so evil?

In no particular order here my reasons: you may have more!

Its Volatile

The INDIRECT function is volatile, which makes any formula that contains it volatile so that they defeat Excel’s smart recalc and recalc at every calculation. And of course this ripples down the dependency chains to make all dependent formulas also recalculate: SLOW!!!

Its Single-threaded

The INDIRECT function is single-threaded and so defeats Excel’s multi-threaded calculation engine:

#Ref Problem

If the argument you give INDIRECT cannot be resolved to a usable reference INDIRECT returns #Ref. The problem is that the process of attempting to resolve the reference involves looking in a very large number of places which consumes a lot of time:
Very SLOW!!!
(Colin Legg has a more detailed post about this problem here)

External workbooks

If you use INDIRECT to refer to external workbooks then they have to be open or else INDIRECT won’t work.
Error-prone & Fragile

Does not adjust when rows/columns added/deleted/moved

Because the argument to INDIRECT is text rather than a cell reference it does not automagically adjust when rows or columns are added/deleted/moved. OK its possible to build more complex INDIRECT formulas in some cases that do adjust, but they tend to get complex and error-prone.
Error-prone & Fragile

 Difficult to debug

It can be very difficult to understand & debug formulas containing INDIRECT because they are often complex, and because the trace precedents tool gets blocked by a textual reference.
Error-prone & Difficult

So are there any alternatives?

INDIRECT is an extremely powerful function that is often used to create workbooks that can dynamically adjust to structural changes such as changing the ranges or worksheets or the external workbooks that are being used in formulas. Useful alternatives can be:


The CHOOSE function is not volatile and is multi-threaded and is easy to debug and maintain.

CHOOSE(Index_Num,Arg1,Arg2, … Arg254)

The first argument must resolve to a number between 1 and 254 that determines which of the following arguments is returned. The first argument could be a MATCH function that looks up a parameter in a list to get a number. The arguments to be chosen from could be defined names referring to ranges, references, formulas or values.

The drawback of the CHOOSE function is that the formula gets unwieldy when there are a large number of choices.

The Reference form of INDEX

In addition to the standard form of INDEX (=INDEX(Range, Row_Index, Column_Index) ) there is a reference form which can be used to select from multiple ranges.


The multiple ranges must be:

  • Enclosed in ( )
  • On the same worksheet

The drawback of this form of INDEX are that the ranges must be on the same worksheet.


If INDIRECT is being used to insulate the workbook against structural changes you could consider using VBA to modify the relevant formulas. Using Defined Names to hold frequently used formulas, and modifying the defined names may prove easier and more efficient than modifying every formula. Remember that Defined Names can also hold formulas containing relative references. For relative references I recommend using R1C1 mode and notation when creating the named relative formulas.


INDIRECT is evil because:

  • Its slow
  • Its fragile and easily broken
  • Its hard to debug
  • Its hard to understand

If you have better ideas for eliminating INDIRECT please help stamp us stamp out use of INDIRECT!



Posted in Uncategorized | 14 Comments

Special Offer for Excel Summit South

We_HeaderSpeakersIf you were thinking about joining us at Excel Summit South for what promises to be a great 2 days of Excel then register now to take advantage of a 10% discount offer.

Just use coupon code LIMITED – the first 15 registrations using that coupon code will get a 10% discount.

Also joining us will be Yigal Edery and Ben Rampson from the Excel Dev Team:




Posted in Uncategorized | Tagged , , , | Leave a comment

2015 in review

The WordPress.com stats helper monkeys prepared a 2015 annual report for this blog.

Here’s an excerpt:

The Louvre Museum has 8.5 million visitors per year. This blog was viewed about 170,000 times in 2015. If it were an exhibit at the Louvre Museum, it would take about 7 days for that many people to see it.

Click here to see the complete report.

Posted in Uncategorized | Leave a comment

Excel What-If Data Tables: Faster calculation with VBA

For some reason this year I keep on meeting Excel’s What-If Data Tables. These are used (often in real estate and financial valuation models) to simplify doing sensitivity analysis against a range of values for certain inputs such as interest rates. What-If Data Tables are a great tool but they tend to calculate very slowly, in fact more slowly than I think they should.

Using What-If Data Tables

What-If Data Tables can be either 1-dimensional or 2-dimensional. You create them using the What-If pull-down on the Data tab.

WhatIf1First you need to create the outline for the Data Table:

WhatIf2This is a 2-dimensional data table. The idea is that you have a single formula (in cell E7) that shows the results of a set of calculations that are dependent on 2 constants (in F3 and G3). We want to recalculate the workbook for every combination of the values in the range (row) F7:H7 and the column E8:E12, and place the corresponding result in F87:H12.

Start by selecting the range E7:H12 and click the What-If Data Table button:

WhatIf3You are asked to pick the Row Input Cell and Column Input Cell, and when you click OK the range F8:H12 gets filled with a TABLE array function formula {=TABLE(F3,G3)}

Note that for some strange reason Excel requires the row input and column input cells to be on the same worksheet as the data table.

What-If data tables calculate slowly because Excel has to recalculate many times (in this case 3 x 5 = 15 times. So you need to be in Manual calculation mode, or possibly in Automatic except Data Tables mode (this makes calculation automatic but does not do the iterative calculation of the data table(s)).

Speeding Up What-If Data Table Calculation.

I did some experiments to see if the data table calculation could be speeded up and found:

  • Each iteration is calculated using single-threaded calculation.
  • The process starts with a multi-threaded recalculation, then iterates using single-threaded recalculation, then finishes with an additional recalculation.
  • There seems to be quite a large overhead to this overall process.
  • If the one combination of values in the table is the same as the initial set of values it gets recalculated anyway even if not needed.

The VBA Solution

So I wrote some VBA to see if by eliminating these problems I could make it run faster.
Note that the VBA is written as a proof-of-concept demonstration rather than production level code.

Sub IterateTables()
' Proof-of-Concept code
' for faster calculation of a 2-D What-If data table
' Copyright Charles Williams, Decision Models 16-December-2015
Dim rngTable As Range
Dim rngRowCell As Range
Dim rngColCell As Range
Dim varRowSet As Variant
Dim varColSet As Variant
Dim varResults() As Variant
Dim rngFormula As Range
Dim nRows As Long
Dim nCols As Long
Dim lCalcMode As Long
Dim j As Long
Dim k As Long
Dim varStartRowVal As Variant
Dim varStartColVal As Variant
Dim varFirstVal As Variant
Dim blCalculated As Boolean
Dim dTime As Double
' get the what-if table and its formula
Set rngTable = ActiveCell.CurrentRegion    ' expand active cell to the current region
Set rngFormula = rngTable.Cells(1, 1)   ' Top-Left corner
nRows = rngTable.Rows.Count - 1         ' number of rows in the Column of what-if values
nCols = rngTable.Columns.Count - 1      ' number of columns in the row of what-if values
' get the row and column input cells using RefEdit
With ufIterTable
.RefEditRow.Value = ""
.RefEditCol.Value = ""
If ufIterTable.RefEditRow.Value <> "" Then Set rngRowCell = Range(.RefEditRow.Value)
If ufIterTable.RefEditCol.Value <> "" Then Set rngColCell = Range(.RefEditCol.Value)
End With
' if 2-D and we have got the row and column cells then proceed
If nRows > 0 And nCols > 0 And Not rngRowCell Is Nothing And Not rngColCell Is Nothing Then
dTime = Microtimer
' create output results array
ReDim varResults(1 To nRows, 1 To nCols)
' get row and column arrays of what-if values
varRowSet = rngFormula.Offset(0, 1).Resize(1, nCols).Value2
varColSet = rngFormula.Offset(1, 0).Resize(nRows, 1).Value2
' set environment
Application.ScreenUpdating = False
lCalcMode = Application.Calculation
If Application.Calculation <> xlCalculationManual Then Application.Calculation = xlCalculationManual
' can only skip initial values if workbook is calculated at start
If Application.CalculationState = xlDone Then
blCalculated = True
blCalculated = False
End If
' initial start values
varStartRowVal = rngRowCell.Value2
varStartColVal = rngColCell.Value2
varFirstVal = rngFormula.Value2
' calculate result for each what-if values pair
For j = 1 To nRows
For k = 1 To nCols
If blCalculated And varRowSet(1, k) = varStartRowVal And varColSet(j, 1) = varStartColVal Then
' if whatis value pair is the same as the start values then skip recalc
varResults(j, k) = varFirstVal
' show calc status in the status bar
Application.StatusBar = "What-If Table Row " & j & " Column " & k
' set values for this iteration, recalc, store result
rngRowCell.Value2 = varRowSet(1, k)
rngColCell.Value2 = varColSet(j, 1)
varResults(j, k) = rngFormula.Value2
End If
Next k
Next j
' reset status bar
Application.StatusBar = False
' put results back
rngFormula.Offset(1, 1).Resize(nRows, nCols) = varResults
' reset back to initial values & recalc
rngRowCell.Value2 = varStartRowVal
rngColCell.Value2 = varStartColVal
Application.Calculation = lCalcMode
' timer message
dTime = Int((Microtimer - dTime) * 1000) / 1000
MsgBox "Time for " & nRows * nCols & " Iterations: " & dTime & " Seconds"
End If
End Sub

Timing Results

I ran some timing tests using a 4-core I7 with hyper-threading so that Excel tries to use 8 cores for multi-threaded calculation.

WhatIf4The first row gives the time for a single calculation in multi-threaded and single-threaded mode.
The second row gives the time to calculate using Excel’s What-If data table method. You can see that difference between single-threaded and multi-threaded is equal to 2*(Single-thread time – multi-thread time).
The third row gives the time to calculate using my VBA routine: a lot faster in multi-threading mode and slightly faster in single-threading mode.

A timing breakdown for XL Data Tables could be:
2 * multi-threaded Calc + 15 * single-threaded Calc + overhead
2*0.65 + 15 * 2.8 + 0.3 = 43.6

And for the VBA:
15 * multi-threaded calc + overhead
15 * 0.65 + 0.15 = 9.9


You can download the test workbook from


I have made a suggestion in Excel User Voice to use these findings to speed up Excel’s What-If Data Tables. Please vote for it if you think it has merit:


Also it would be great if existing data table users could try out my VBA code to see how it works for them.

I plan to build a more generalised solution into FastExcel Calc.




Posted in arrays, Calculation, VBA | 3 Comments

Excel Full Column References and Used Range: Good Idea or Bad Idea?

Some people love using full column references such as A:A, other people say they slow down your workbook to a crawl and lead to workbook and memory bloat.

The attraction of using whole column references is that you don’t have to adjust your formulas when additional rows of data are added or deleted.

So who is right?

Let’s run some simple tests to find out.

We need to test a variety of functions and formulas. I picked

  • SUM(A:A)
  • MATCH(99,A:A,0)
  • COUNTIF(A:A,99)
  • {IF(A:A=1,1,0)} (array entered into each cell)
  • {IF(A:A=1,1,0)} (one formula array entered into D1:D100)
  • IF(A:A=1,1,0) (not array entered)

All the test cases have cell A1 set to a value of 1, and the formula is entered into D1:D1000.

To explore the effect of changing the used range and the number of cells containing formatting or data I used a variety of test cases:

  • Test Case 1 – Base case: no other data or formatting in the worksheet.

All the other test cases are derived from the base case, making only the changes mentioned for each case.

Formatting test cases:

  • Test Case 2 – entire column A is colored yellow
  • Test Case 3 – cells A2000:A1048576 are colored yellow
  • Test Case 4 – cell XFD1048576 empty but colored yellow
  • Test Case 5 – every other row in column A colored yellow

Cells with data test cases:

  • Test Case 6 – cell A1048576 contains 98
  • Test Case 7 – cells A2000:A1048576 contain 1
  • Test Case 8 – every other row in column A contains 1

The calculation times are given in Milli-seconds for the median of 3 single-threaded Range.Calculates on the 1000 formulas in D1:D1000. Screen updating time is excluded.

Test Results

FullColInterpretation of Test Results

1. Base Case: No Additional Formatting or Data

The most efficient formula is =IF(A:A=0,1,0)
This formula uses Excel’s implicit intersection method so that each cell containing the formula in D1:D1000 only looks at a single cell in the same row in column A.

The SUM, MATCH and COUNTIF formulas are efficient because each formula only looks at the 1000 cells in column A bounded by the used range (row 1000).

The SUMPRODUCT and single-cell array formula version of =IF(A:A=0,1,0) force Excel to look at every single one of the million cells in column A, so there are 1000*10000000 cells looked at. The IF formula is slower than the SUMPRODUCT because it has to create additional virtual columns.

The multi-cell array version of the IF is a single formula array-entered into the 1000 cells in column D. This single formula looks at 1000000 cells and then returns 1000 results. Because it looks at 1000 times fewer cells it is a lot faster.

2. Column A:A formatted Yellow

Formatting the entire column does not increase the used range and so has virtually no effect on the timings.

3. Format over a Million cells from from A2000:A1048576 Yellow

Formatting a large but contiguous block of cells does increase the used range if the start cell address is beyond the previous last used cell. But (surprisingly) it does not extend the used range to the last formatted row!
The timing results show a small increase for those formulas that are dependent on the used range, and the file size increases slightly.

4. Format the Last Cell in the worksheet (XFD1048576) Yellow.

This does dramatically extend the used range: the last used row is now the last row in the worksheet.
The timing results that depend on the size of the used range increase significantly, but the file size does not change.

The results are the same if you color the entire row 1048576 or just the single cell A1048576.

5. Format every other row in column A Yellow.

This creates a much more complex used range, even though the last used row is actually 1 less than test case 4. Excel has to store formatting information for each of half a million rows!
All the timings increase significantly
(not just the formulas that are dependent on the used range) because of the additional work need to traverse the much larger cell table, and the file size shoots up.

6. Set the last cell in column A to 98.

This extends the used range in the same way as formatting the last cell, and the timings and file size are virtually identical to test case 4.

7. Set cells A2000:A1048576 to 1.

You might expect this to behave the same way as test case 3 which just formats this block of cells rather than inserting a constant value.

But the time taken by the formulas dependent on the used range increases by a factor of around 1000, and even the timings for SUMPRODUCT and the array formulas about double. The file size also increases significantly.

Clearly adding data to a contiguous block of cells has a much bigger impact than just formatting a contiguous block of otherwise empty cells.

8. Set every other cell in column A to 1.

This results in about half as many cells containing data as test case 7, but the last used row is only 1 less.
The timings for formulas dependent on the used range increases by about a factor of 2 compared to test case 5, which only formatted the same cells, but decreases by about  a factor of 2 compared to test case 7 (which had about double the number of cells containing data). The file size is not much larger than test case 5.

This seems to show that formula calculation time is dependent on both the complexity of the used range that formula has to traverse and the number of cells containing data.


Excel’s sparse storage methods are efficient and allow for the use of whole column references and contiguous blocks of identical formats. But you need to be careful to minimize both the extent and complexity of the used range.

  • Formulas other than array formulas and SUMPRODUCT handle whole column references efficiently.
  • But avoid array formula and SUMPRODUCT usage of whole column references.
  • Non-array formulas using whole column references are much slower with large used ranges, and even slower with complex used ranges.
  • Formulas handle empty cells more efficiently than cells containing data.
  • Excel is optimized to handle contiguous blocks of identical formatting efficiently.
  • File size is not dependent on the size of the used range.
  • File size is largely dependent on the number of cells containing data/formulas and also on the number of non-contiguous formatted cells.




Posted in arrays, Calculation, Formatting, Memory | 5 Comments

Timing Calculation Consistently: Windows Update/Background Process Woes

Last Friday  4 December 2015 I was presenting a session at the Modeloff Global Training Camp in London. The session was “How to make your Excel workbooks calculate faster”, and so it involved demonstrations of timing formula calculation. No problem I thought, I will use my Surface Pro 3, recently loaded with Windows 10, Excel 2016 and FastExcel V3.

Why do my calculations take twice as long as they did yesterday?

But doing my final practice run-through on Thursday the formulas were calculating more than twice as slowly as earlier in the week. And my Surface Pro 3 was getting very hot. And the battery was running down very fast.

Of course when the Surface Pro 3 gets too hot it starts throttling down the I7 processor, but that would not explain a factor of 2, or why it was getting hot in the first place.

So I started up Task Manager and clicked More Details:

Task2In this example you can see that the CPU is running at 79% with Windows Search and Carbonite Backup chewing up most of the processor time.

But on Thursday it was one of the Service Host processes. Expanding it showed one of the services was Windows Update, and stopping that service moved the cpu % down to a small percentage.

Of course I had to remember to switch it back on next day.


If you want to get more consistent calculation timings:

  • Don’t let your Surface Pro 3 get too hot!
  • Use Task Manager to see what processes and services are running and chewing up the processors.
  • Temporarily stop any processes and services that are making heavy use of the processors.



Posted in Calculation | Leave a comment