Excel JavaScript API Part 3: Benchmark of Processing Arrays Performance – Loop vs Reduce vs COUNTIFS vs VBA

In a previous post I compared several different ways of processing data with VBA. The fastest way was to get the data into a variant array and loop on it.

JavaScript has some powerful array methods so I wanted to see how they performed compared to VBA. I looked at four different ways of processing data using the JS API:

  • Get data into a JavaScript Array and loop
  • Get data into a JavaScript proxy Range object and loop
  • Get data into a JavaScript array and use the JS Reduce method
  • Call the worksheet function COUNTIFS from JavaScript

The Test Data

I am using the same test data as the previous post: 100000 rows with 2 columns.
Column A is randomly 50% populated with “X” and Column B is randomly 50% populated with “Y”.

The task is to count how many rows have an “X” in column A and a “Y” in column B in the same row.

Getting the data from Excel into a JavaScript array

This task is much slower than VBA: JavaScript takes around 290 milliseconds to read 100000 rows and 2 columns, but VBA takes only around 16 milliseconds.

Let’s ignore that and just focus on how efficient JavaScript is at processing arrays.

Benchmark Timings

JSArrayBench

If you ignore the time taken to get the data into the array then JavaScript and VBA are equally efficient – 19 milliseconds each.

I expected it to be faster to process the proxy range object directly rather than copying the data into an array and using that – but I was wrong. Looping directly on the proxy object is a lot slower than looping on an array derived from the proxy object.

Similarly I thought that the array Reduce method would be fast: but Reduce is slower than directly looping the array!

The overall winner in this case is COUNTIFS because the data never gets transferred to VBA or JavaScript so the extra 290 (JS) or 16 (VBA) milliseconds don’t get incurred. Calling a worksheet function from VBA also has a much smaller overhead than calling it from JS.

The JavaScript code

Here are the four different examples of the JavaScript code.

Looping the Array

The complete JavaScript code I am using for the looping the array method looks like this:


var t0;
var t1;
var t2;
$('#run').click(function () {
t0 = performance.now();
invokeRun()
.then(() => {
t2 = performance.now();
console.log("Time to read data from XL " + Math.round(t1 - t0) + " milliseconds.")
console.log("Time to process array " + Math.round(t2 - t1) + " milliseconds.")
})
.catch(OfficeHelpers.logError);
});

function invokeRun() {
return Excel.run(function (context) {
var range1 = context.workbook.worksheets.getItem("Sheet1").getRange("A1:B100000").load('values');
return context.sync()
.then(() => {
var arr = range1.values;
t1 = performance.now();
var nFound = 0;
for (var j = 0; j < arr.length; j++) {
if (arr[j][0] === 'X') {
if (arr[j][1] === 'Y') nFound++;
};
};
console.log("XY pairs found " + nFound);
})
});
}

The invokeRun function loads the values from the range into a proxy range object using context.sync() , then creates a JavaScript array (actually an array of arrays) from the proxy object and loops down the array looking for an X and a Y in the same row.

Looping the Proxy Range Object


function invokeRun() {
return Excel.run(function (context) {
var range1 = context.workbook.worksheets.getItem("Sheet1").getRange("A1:B100000").load('values');
return context.sync()
.then(() => {
t1=performance.now();
var nFound = 0;
for (var j = 0; j < range1.values.length; j++) {
if (range1.values[j][0] === 'X') {
if (range1.values[j][1] === 'Y') nFound++;
};
};
console.log(nFound);
})
});
}

This version of InvokeRun loops directly on the proxy object values rather than on an array derived from the proxy object.

Using the REDUCE array method

This version of InvokeRun uses JavaScript’s REDUCE array method, where you supply an aggregating function to be applied to each row of the array.


function invokeRun() {
return Excel.run(function (context) {
var range1 = context.workbook.worksheets.getItem("Sheet1").getRange("A1:B100000").load('values');
return context.sync()
.then(() => {
t1 = performance.now();
var arr = range1.values;
var nFound = arr.reduce(function (ct, currval, ix, arr) {
if (arr[ix][0] === 'X') {

if (arr[ix][1] === 'Y') {
ct++;
}
}
return ct;
}, 0);
console.log(nFound);
})
});
}

Using Worksheet Function COUNTIFS

This version demonstrates calling a worksheet function on the range. Since the range values never get passed across to JavaScript it is by far the fastest solution for this case!


function invokeRun() {
return Excel.run(function (context) {
var rangeA = context.workbook.worksheets.getItem("Sheet1").getRange("A1:A100000");
var rangeB = context.workbook.worksheets.getItem("Sheet1").getRange("B1:B100000");
var count = context.workbook.functions.countIfs(rangeA, "X", rangeB, "Y");
count.load();
return context.sync()
.then(() => {
t1 = performance.now();
console.log("XY pairs found " + count.value);
})
});
}

The VBA Code

For completeness sake here is the equivalent VBA code for the looping the array case.


Sub FindXY3()
Dim vArr As Variant
Dim j As Long
Dim n As Long
Dim dTime As Double
Dim dTime2 As Double
dTime2 = MicroTimer
vArr = Range("a1:B100000").Value2
dTime = MicroTimer
For j = LBound(vArr) To UBound(vArr)
If vArr(j, 1) = "X" Then
If vArr(j, 2) = "Y" Then
n = n + 1
End If
End If
Next j
Debug.Print "Var array " & n & " Get " & (dTime - dTime2) * 1000 & " Find " & (MicroTimer - dTime) * 1000
End Sub

 

 

Conclusions

JavaScript seems fairly efficient at processing arrays, and is very comparable to VBA.

The problem is the time taken to transfer data from Excel to JavaScript.

Of the JavaScript array methods bench-marked, Reduce and direct processing of the proxy object were slower than direct looping on an array.

Using a worksheet function such as COUNTIFS is considerably faster than looping an array as long as you can avoid transferring the data to VBA or JavaScript.

 

Advertisements
This entry was posted in arrays, JS-API, VBA and tagged , , . Bookmark the permalink.

4 Responses to Excel JavaScript API Part 3: Benchmark of Processing Arrays Performance – Loop vs Reduce vs COUNTIFS vs VBA

  1. Jeff Weir says:

    Very informative series, Charles. I appreciate your work and sharing on this.

  2. Yuriy says:

    In case of use VBA code, the variable dTime is set before the search cycle. This probably affected the final time?

    • fastexcel says:

      @Yuriy – Both the JS & VBA code shown give 2 times: first for getting the data and second for doing all the finds – so the times are comparable. The benchmark timings table shows only the second of these times (we already know that reading data into JavaScript is extremely slow compared to VBA)

  3. Yuriy says:

    Thank you for answer. My carelessness. The code was hidden in the right part.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s