Prompted by a comment from Sam on Match vs Find I thought I would take a look at Excel’s rather quirky Evaluate method with Excel 2010 to see how it performed.
The Evaluate method internally uses Excel’s formula parser and calculator, and this makes it surprisingly powerful. You can use it on virtually any kind of formula, range reference or Defined Name. But, as we will see, it does have a number of strange “quirks” that you have to navigate around.
Depending on the context Evaluate will either return an object (for example a Range) or values.
I will be using exactly the same test setup of 100000 rows of randomly generated XY pairs and timing routine as in Match vs Find, so you can directly compare the results.
Using the square brackets [ ] shortcut for Evaluate
Sam’s comment suggested using [COUNTIFS] to see how the timing compared with MATCH and FIND. Of course its not quite the same thing because the loop on Match and Find allows the VBA to do something for each XY pair found. Sam’s VBA looks like this:
Sub FindXY_COUNTIFS1()
Dim j As Long
Dim dTime As Double
dTime = Microtimer
j = [COUNTIFS(A1:A100000,"x",B1:B100000,"y")]
Debug.Print "COUNTIFS1 " & j & " " & (Microtimer - dTime) * 1000
End Sub
It takes 11.6 millisecs to find the 25135 XY pairs generated using a constant of 0.5 in the test data generator.
[ ] is a shortcut for Application.Evaluate. The advantage of using the [ ] brackets is that it is concise and the formula inside the [ ] is exactly the same as when used in a worksheet cell. The disadvantage is that you cannot generate the formula as a string. I tend to only use this notation when evaluating my hidden workbook-scoped defined names, because they are not likely to change. (Of course sometimes I get lazy …)
Using Application.Evaluate instead of [ ]
You can use Evaluate or Application.Evaluate with a string instead of the [ ]
j = Evaluate("COUNTIFS(A1:A100000," & Chr(34) & "x" & Chr(34) & ",B1:B100000," & Chr(34) & "y" & Chr(34) & ")")
The timing is virtually identical to the [ ] shortcut method.
Application.Evaluate and the Activesheet
One trap for the unwary with [ ] , Evaluate and Application.Evaluate is that all references that do not contain a worksheet (unqualified references like A1:A100000) are assumed to refer to whatever the Active sheet currently happens to be.
So if you are going to use Application.Evaluate you should always use a qualified reference (Sheet1!A1:A100000) unless you like your code to live dangerously.
Worksheet.Evaluate
Worksheets and Charts also have an Evaluate Method. When you use these methods unqualified references are assumed to refer to the worksheet or chart.
Sub FindXY_COUNTIFS3()
Dim j As Long
Dim dTime As Double
dTime = Microtimer
j = Worksheets("Sheet1").Evaluate("COUNTIFS(A1:A100000," & Chr(34) & "x" & Chr(34) & ",B1:B100000," & Chr(34) & "y" & Chr(34) & ")")
Debug.Print "COUNTIFS3 " & j & " " & (Microtimer - dTime) * 1000
End Sub
Now for the surprise: Worksheet.Evaluate is twice as fast as Application.Evaluate!
(actually 6.1 millisecs as opposed to 11.6 millisecs)
I am fairly sure that there is a bug in Application.Evaluate that actually does the evaluation twice.
Certainly if you use Application.evaluate on a UDF the UDF will be executed twice.
Chart.Evaluate
In the real world I have never actually used Chart.Evaluate (probably because I hate programming Chart objects), but according to Help it seems you can do interesting things with it:
“Chart Objects. You can specify any chart object name, such as “Legend”, “Plot Area”, or “Series 1″, to access the properties and methods of that object. For example, Charts("Chart1").Evaluate("Legend").Font.Name returns the name of the font used in the legend.”
Evaluating Array Formulas
Amazingly if you give Evaluate an array formula it evaluates it as an array formula:
Sub FindXY_COUNTIFS4()
Dim j As Long
Dim dTime As Double
dTime = Microtimer
j = ActiveSheet.Evaluate("SUM((A1:A100000=" & Chr(34) & "x" & Chr(34) & ")*(B1:B100000=" & Chr(34) & "y" & Chr(34) & "))")
Debug.Print "COUNTIFS4 " & j & " " & (Microtimer - dTime) * 1000
End Sub
This is quite a lot slower than COUNTIFS: it takes nearly 19 milliseconds.
If you are a SUMPRODUCT fan you could use
j = ActiveSheet.Evaluate("SUMPRODUCT(--(A1:A100000=" & Chr(34) & "x" & Chr(34) & "),--(B1:B100000=" & Chr(34) & "y" & Chr(34) & "))")
But its not significantly faster.
Evaluate speed compared to a formula in a cell
You would expect Evaluate to be slower than Excel natively calculating the formula in a cell. And indeed it is, but its quite close for a single formula;
Countifs Formula 6.0
Evaluate Countifs 6.1
Array Sum Formula 16.9
Evaluate Array Sum 18.9
Evaluate speed compared to using Application.Worksheetfunction
It looks like there is a higher overhead to using Evaluate, which is what you would expect.
Here is a version of the FindXY sub using Evaluate with MATCH instead of Worksheetfunction.Match.
Sub FindXYEval()
Dim oRng As Range
Dim oLastRng As Range
Dim j As Long
Dim jRow As Long
Dim n As Long
Dim Rw As Long
Dim dTime As Double
dTime = Microtimer
Set oRng = Range("a1:A100000")
Set oLastRng = oRng(oRng.Rows.Count)
Rw = oLastRng.Row
On Error GoTo finish
With Application.WorksheetFunction
Do
Set oRng = Range(oRng(j + 1), oLastRng) '<<<<= Rw
j = ActiveSheet.Evaluate("Match(" & Chr(34) & "X" & Chr(34) & "," & oRng.Address & ", 0)")
jRow = jRow + j
If oRng(j, 2).Value2 = "Y" Then n = n + 1
Loop Until jRow + 1 > Rw
End With
finish:
Debug.Print "MatchEval " & n & " " & (Microtimer - dTime) * 1000
End Sub
This takes 3720 milliseconds compared to 478 milliseconds using Worksheetfunction.Match. There are just over 50000 calls to Evaluate or Match so I reckon the additional overhead of using Evaluate is about 65 Microseconds per call.
More Evaluate Limitations: Updated
- The string being evaluated must be less than 256 characters, even in Excel 2010.
- A1 style references can be evaluated in both A1 and R1C1 reference mode (Application.ReferenceStyle), but R1C1 style references can only be evaluated in R1C1 mode.
- Relative references in the string are treated as absolute, unless they are contained in defined names in which case the defined name is evaluated with respect to cell A1.
- Dates should be in USA format (Month-Day-Year).
- Evaluate will return an error value if the string formulae contains external references to closed workbooks or XLM functions.
- Using Evaluate INDEX(rng,rownum,COLUMN()) gives incorrect answers except for the first column. Evaluate 0+INDEX(rng,rownum,COLUMN()) works
- If the string formula contains a reference to both a UDF and a name it will fail with an error 2029 if the name reference occurs AFTER the UDF reference:
- If fred is a named range and xyz() is a user defined VBA function then this statement returns error 2029: application.Evaluate(“=xyz(b1)+fred”)
- This statement returns the correct value: application.Evaluate(“=fred+xyz(b1)”)
- Microsoft KB article 823604 identifies this problem but does not correctly diagnose the circumstances that cause it.
You can bypass many of these limitations (at the cost of performance) by inserting the formula string into a worksheet cell and then reading the resulting cell value back into a VBA variable.
Error Handling
If Evalaute cannot evaluate the formula string it returns an error rather than raising an error, so the result of Evaluate should always be assigned to a Variant.
Conclusion
- The Evaluate method adds a lot of power to VBA
- Always use Worksheet.Evaluate rather than Application.Evaluate: its twice as fast and less error-prone
- Using Worksheet.Evaluate has comparable speed to a cell formula but a higher overhead
- Worksheetfunction has a lower overhead than Evaluate
- Beware the Quirks of Evaluate!
So whats your experience of Evaluate?
November 2, 2011 at 5:26 pm |
xlfEvaluate from the C SDK also has it’s quirks. It doesn’t work unless you prepend an excalmation point in front of all references.
November 2, 2011 at 5:42 pm |
Interesting. I’ve used the Evaluate function in a user form to return the result of a formula written by a the user and I cropped off the parts that weren’t finished being written. For example,
User Input: “10+20+”
Crop to: “10+20″
Evaluated: 30
I guess you would call that, live evaluations.
November 2, 2011 at 6:56 pm |
Jon, that sounds very similar to the way the function wizard works, presumably the FuncWiz is using Evaluate under the covers.
November 2, 2011 at 6:55 pm |
Keith,
I have not got to xlfEvaluate yet: it will be interesting to see how many of the VBA Evaluate limitations also apply, and what the performance is like.
November 3, 2011 at 12:48 pm |
Another truly weird Evaluate quirk has been found here
http://www.mrexcel.com/forum/showthread.php?p=2917624#post2917624
Seems that Evaluate cannot correctly handle Column() when embedded in INDEX: the example that fails is INDEX($A$19:$I$39,11,column())
The COLUMN() part always returns 1.
It works if you make the INDEX part of an expression
0+INDEX($A$19:$I$39,11,column())
November 6, 2011 at 6:32 am |
@Charles
[Row()] or [Column()] are returning a single dimensional single element array
Try this in the immediate window
?[Row()] and it gives us a type mismatch error
Modify it to
?[Index(Row(),1,1)]
And it works fine
So in the Index problem above
The below should work fine
Sub TestEval()
Dim cPos As Variant
cPos = [Column()]
Debug.Print Sheet2.Evaluate(“Index(A1:C12,11,” & cPos(1) & “)”)
End Sub
November 7, 2011 at 10:47 am |
Sam, You could bypass the problem that way, but it does not explain the bug
part of Evaluates weird behaviour is that you can use Evaluate(“Column()”) and it works correctly (gives you the column number of the active column).
The part that does not work correctly is when you embed the Column() inside INDEX.
In that case COLUMN() always returns 1 regardless of the active column. Even weirder is that if you make the INDEX statement part of an expression it starts working correctly.
November 8, 2011 at 5:39 am
Charles, I tried this in the immediate window
?Evaluate(“Column()”) is gave me a Type mismatch error.
Now try
?Evaluate(“0+Column()”) also gives the Type mismatch error.
I think inside Evaluate the Column() and Row() are getting treated as an array formula and are returning a single element array like so {Number}
In the MATCH function the 2nd Parameter is LookupArray and hence should accept a formula that returns an array
Try this in the immediate window
With your cursor in any cell of Column B
?[Match(2,column(),0)] give 1 which is correct
as it must be doing
Match(2,{2},0)
Also
?[Match(2,Column(A1:C1),0)] gives 2 which is correct
as it now must be doing
Match(2,{1,2,3},0) and giving 2
If you change the cursor to cell c1 and try the above it gives 3
Another example
?[Small({1,2,3},Column()] will give 1 or 2 or 3 depending on the position of the cursor being in a or b or c
Like wise
?[Small(Column(A1:C1),2) will give 2
Small will accept Arrays for both the parameters
Also
?[SUMPRODUCT(Column(),Column())] will give 9 if the active cell is in column c
Now to try and explain INDEX
In the syntax INDEX(Array,RowPos,ColPos) the RowPos and ColPos parameters are not designed to accept arrays natively
So Index(A1:C10,5,{1,2}) when array entered in a cell will only return one element but when array entered across 2 cells will return both elements
So in the immediate window
?[Index(column(A1:C1),1,2)] gives 2 as
Index({1,2,3},1,2) is 2
and
?Index(Row(A1:A10),2,1)] also gives 2 as
Index({1;2;3….10},2,1) is 2
Conclusion
In functions that accept arrays natively as their parameters
The row/column functions work as expected inside evaluate
and do not need to be part of an expression
However when specified as parameters of functions that dont accept arrays natively they seem to return the Lbound value of the array
Why the 0+ works but it reminds me of a bug / feature in Subtotal
“If the subtotal function is present in the last row of a filtered table, excel treats that row as a “total” row and excludes it from the filter range. But if you make the subtotal as a part of an expression eg. 1*Subtotal then things work fine”
November 8, 2011 at 1:12 pm
@Sam,
Thanks for the research: I think you may be correct about why certain functions such as INDEX don’t work with COLUMN() but others do.
The 0+ explanation might be something to do with the way Excel handles function arguments – if they are expressions then they get evaluated before being passed to the function as arrays (equivalent to the distinction between P and R types in XLLs). So maybe makeing it an expression means that it gets passed to a different bit of Excel before being returned to Evaluate.
November 8, 2011 at 7:48 am |
Seems that references get lost in translation as this doesn’t happen with XLM Evaluate. Looks to be a result of a deeper issue with evaluating relative references in formulas when the return value is a reference, eg [Indirect("rc",0)] always returns A1 but ExecuteExcel4Macro(“!rc”) always returns B2.
November 8, 2011 at 12:39 pm |
@Lori, Yes I wondered if it had something to do with the way Evaluate treats relative references, but that didnt explain the fact that it works with 0+ on the front.
November 9, 2011 at 7:41 am |
Yeah it is odd, formulas appear to be evaluated relative to A1 if a reference is returned but relative to the activecell in other cases. One could speculate why this is but it doesn’t seem right to me. Defining the formula as a name and using the Referstorange property does evaluate relative to the activecell though.
November 9, 2011 at 2:20 pm |
@Charles,
“So maybe makeing it an expression means that it gets passed to a different bit of Excel before being returned to Evaluate.”
That certainly seems to be the case…
Lets assume A1 and A2 have 1 and 2 but stored as text (single quote before them or cells formatted as text)
=SUM(A1:A2) return an 0
but =A1+A2 gives 3.
When made part of expressions A1 and A2 seem to get passed to a “different bit of Excel”
Maybe A1+A2 is giving us the equivalent of A1.Value2 + A2.Value2 (unformated/underlying value of the cell)
where as =SUM(A1:A2) must be working on just .Value or something like that
Thanks for these post…got some nice insights !
March 17, 2012 at 10:50 am |
It strikes me that another use of the EVALUATE worksheet function is as a non-volatile version of INDIRECT. This could be quite handy if you want to have a user be able to select a range from say a validation list dropdown, and then have a formula calculate something based on the range they selected.
For instance, you might have a whole lot of different scenarios that you want to model – or different divisions of a company that you want to be able to crunch some data for.Say you have different sets of input data for the model that you want to use to generate scenarios 1 through n. If you assign named ranges to the input data, such as Scenario_1, Scenario_2, …, Scenario_n then you can set up a named range called Current_Scenario in the Name Manager with the formula =Evaluate(input_1), where input_1 is a cell where you type the particular Scenario_n you want to model. Then you just use the output of Current_Scenario to feed your model, and hey presto, you can change your model input at will.
June 17, 2012 at 4:36 am |
Charles – All very interesting. I copied your examples onto my machine and got much the same result, although there seems to be an enormous variation in the time from run to run.
However when I tried it on my own UDF I only got an improvement of 10-20% (with Excel 2010). The difference seems to be that I am using Evaluate on a VBA array of strings, calling the Evaluate command for each item in the array, rather than a single call of an array function on spreadsheet data.
Incidentally I tried modifying your routine by converting the string formula to a string before calling the Evaluate function:
FString = “COUNTIFS(A1:A100000,” & Chr(34) & “x” & _
Chr(34) & “,B1:B100000,” & Chr(34) & “y” & Chr(34) & “)”
j = Worksheets(“Sheet1″).Evaluate(FString)
It didn’t make any significant difference.
July 11, 2012 at 3:29 pm |
Thanks for doing the research and posting findings.
re:Evaluating Array Formulas
This works: ?[AVERAGEIFS(DETOdo[Odometer],DETOdo[Unit],”176″)]
But this fails: ?[MEDIAN(IF(DETOdo[Unit]=”176″,DETOdo[Odometer]))]
Error is: Item with the specified name wasn’t found
Also – I guess I’m not very smart, but I couldn’t follow the code in: Evaluate speed compared to using Application.Worksheetfunction
July 11, 2012 at 5:25 pm |
Craig, I am not surprised you could not follow the code since it was severely mangled by the HTML. I think I have fixed it now …
July 11, 2012 at 5:53 pm |
Thanks
Any guess as to why MEDIAN fails? Did I mss something about array formulas?