Note
Aggregation Pipeline as Alternative to Map-Reduce
An aggregation pipeline provides better performance and usability than a map-reduce operation.
Map-reduce operations can be rewritten using aggregation pipeline
operators, such as
$group, $merge, and others.
For map-reduce operations that require custom functionality, MongoDB
provides the $accumulator and $function
aggregation operators starting in version 4.4. Use these operators to
define custom aggregation expressions in JavaScript.
For examples of aggregation pipeline alternatives to map-reduce operations, see Map-Reduce to Aggregation Pipeline and Map-Reduce Examples.
An aggregation pipeline is also easier to troubleshoot than a map-reduce operation.
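As a rough illustration of the kind of rewrite the note describes, the sketch below builds a pipeline that totals a count and a quantity per key using $unwind and $group. The collection name orders and the field names items, sku, and qty are assumptions made for this example only; they do not come from this page.

```javascript
// Hypothetical pipeline: assumes an "orders" collection whose documents hold
// an "items" array of { sku, qty } subdocuments. All names are illustrative.
var pipeline = [
  // Emit one document per element of the items array.
  { $unwind: "$items" },
  // Group by SKU, counting the elements and summing their quantities.
  {
    $group: {
      _id: "$items.sku",
      count: { $sum: 1 },
      qty: { $sum: "$items.qty" }
    }
  }
];

// In the shell, the pipeline would be run with:
//   db.orders.aggregate(pipeline)
```

Because $group computes the totals in one stage, there is no separate reduce function whose re-reduction behavior needs to be verified.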
The reduce function is a JavaScript function that “reduces” to a
single object all the values associated with a particular key during a
map-reduce operation. The reduce function
must meet various requirements. This tutorial helps verify that the
reduce function meets the following criteria:
- The reduce function must return an object whose type must be identical to the type of the value emitted by the map function.
- The order of the elements in the valuesArray should not affect the output of the reduce function.
- The reduce function must be idempotent.
For a list of all the requirements for the reduce function, see
mapReduce, or the mongo shell helper method
db.collection.mapReduce().
Note
Starting in MongoDB 4.4, mapReduce no longer supports
the deprecated BSON type JavaScript code with scope
(BSON type 15) for its functions. The
map, reduce, and finalize functions must be either BSON
type String (BSON type 2) or BSON
type JavaScript (BSON type 13). To
pass constant values which will be accessible in the map,
reduce, and finalize functions, use the scope parameter.
The use of JavaScript code with scope for the mapReduce
functions has been deprecated since version 4.2.1.
Confirm Output Type
You can test that the reduce function returns a value that is the
same type as the value emitted from the map function.
1. Define a reduceFunction1 function that takes the arguments keyCustId and valuesPrices. valuesPrices is an array of integers:

   var reduceFunction1 = function(keyCustId, valuesPrices) {
      return Array.sum(valuesPrices);
   };

2. Define a sample array of integers:

   var myTestValues = [ 5, 5, 10 ];

3. Invoke the reduceFunction1 with myTestValues:

   reduceFunction1('myKey', myTestValues);

4. Verify the reduceFunction1 returned an integer:

   20

5. Define a reduceFunction2 function that takes the arguments keySKU and valuesCountObjects. valuesCountObjects is an array of documents that contain two fields, count and qty:

   var reduceFunction2 = function(keySKU, valuesCountObjects) {
      // Declare the accumulator locally so it does not leak into the global scope.
      var reducedValue = { count: 0, qty: 0 };

      for (var idx = 0; idx < valuesCountObjects.length; idx++) {
         reducedValue.count += valuesCountObjects[idx].count;
         reducedValue.qty += valuesCountObjects[idx].qty;
      }

      return reducedValue;
   };

6. Define a sample array of documents:

   var myTestObjects = [
      { count: 1, qty: 5 },
      { count: 2, qty: 10 },
      { count: 3, qty: 15 }
   ];

7. Invoke the reduceFunction2 with myTestObjects:

   reduceFunction2('myKey', myTestObjects);

8. Verify the reduceFunction2 returned a document with exactly the count and the qty field:

   { "count" : 6, "qty" : 30 }
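The manual comparison above can also be automated. The following is a minimal sketch, not part of MongoDB: describeShape and returnsEmittedShape are hypothetical helpers, and sumPrices and sumCounts are plain JavaScript stand-ins for reduceFunction1 and reduceFunction2 (written without the shell's Array.sum helper so they run in any JavaScript environment).

```javascript
// Hypothetical helper: summarizes the type of a value. For documents it
// records each field name together with the type of that field's value.
function describeShape(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  if (typeof value === "object") {
    var fields = Object.keys(value).sort().map(function (name) {
      return name + ":" + describeShape(value[name]);
    });
    return "{" + fields.join(",") + "}";
  }
  return typeof value;
}

// Hypothetical check: compares the shape of the reduce output with the
// shape of the first emitted value.
function returnsEmittedShape(reduceFn, key, values) {
  return describeShape(reduceFn(key, values)) === describeShape(values[0]);
}

// Stand-in for reduceFunction1: sums an array of numbers.
var sumPrices = function (keyCustId, valuesPrices) {
  return valuesPrices.reduce(function (a, b) { return a + b; }, 0);
};

// Stand-in for reduceFunction2: sums the count and qty fields.
var sumCounts = function (keySKU, valuesCountObjects) {
  var reducedValue = { count: 0, qty: 0 };
  for (var idx = 0; idx < valuesCountObjects.length; idx++) {
    reducedValue.count += valuesCountObjects[idx].count;
    reducedValue.qty += valuesCountObjects[idx].qty;
  }
  return reducedValue;
};

// Counter-example: returns a string although numbers were emitted.
var labelTotal = function (key, values) {
  return "total: " + sumPrices(key, values);
};

var pricesOk = returnsEmittedShape(sumPrices, "myKey", [5, 5, 10]);   // true
var countsOk = returnsEmittedShape(sumCounts, "myKey", [
  { count: 1, qty: 5 }, { count: 2, qty: 10 }, { count: 3, qty: 15 }
]);                                                                   // true
var labelOk = returnsEmittedShape(labelTotal, "myKey", [5, 5, 10]);   // false
```

The check compares field names and JavaScript types only; it does not distinguish BSON numeric types such as integers and doubles.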
Ensure Insensitivity to the Order of Mapped Values
The reduce function takes a key and a values array as its
arguments. You can test that the result of the reduce function does
not depend on the order of the elements in the values array.
1. Define a sample values1 array and a sample values2 array that only differ in the order of the array elements:

   var values1 = [
      { count: 1, qty: 5 },
      { count: 2, qty: 10 },
      { count: 3, qty: 15 }
   ];

   var values2 = [
      { count: 3, qty: 15 },
      { count: 1, qty: 5 },
      { count: 2, qty: 10 }
   ];

2. Define a reduceFunction2 function that takes the arguments keySKU and valuesCountObjects. valuesCountObjects is an array of documents that contain two fields, count and qty:

   var reduceFunction2 = function(keySKU, valuesCountObjects) {
      // Declare the accumulator locally so it does not leak into the global scope.
      var reducedValue = { count: 0, qty: 0 };

      for (var idx = 0; idx < valuesCountObjects.length; idx++) {
         reducedValue.count += valuesCountObjects[idx].count;
         reducedValue.qty += valuesCountObjects[idx].qty;
      }

      return reducedValue;
   };

3. Invoke the reduceFunction2 first with values1 and then with values2:

   reduceFunction2('myKey', values1);
   reduceFunction2('myKey', values2);

4. Verify the reduceFunction2 returned the same result:

   { "count" : 6, "qty" : 30 }
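Comparing two orderings by hand covers only part of the space. As a sketch of a more exhaustive check for small samples, the hypothetical helpers below (permutations and isOrderInsensitive, neither of which is a MongoDB API) run a reduce function over every ordering of the values array. Results are compared with JSON.stringify, which assumes the function builds its result fields in a consistent order.

```javascript
// Hypothetical helper: returns every ordering of a small array.
// The number of orderings grows factorially, so keep samples short.
function permutations(items) {
  if (items.length <= 1) return [items.slice()];
  var result = [];
  for (var i = 0; i < items.length; i++) {
    var rest = items.slice(0, i).concat(items.slice(i + 1));
    permutations(rest).forEach(function (tail) {
      result.push([items[i]].concat(tail));
    });
  }
  return result;
}

// Hypothetical check: true if every ordering yields the same result.
function isOrderInsensitive(reduceFn, key, values) {
  var expected = JSON.stringify(reduceFn(key, values));
  return permutations(values).every(function (ordering) {
    return JSON.stringify(reduceFn(key, ordering)) === expected;
  });
}

// Stand-in for reduceFunction2: sums the count and qty fields.
var sumCounts = function (keySKU, valuesCountObjects) {
  var reducedValue = { count: 0, qty: 0 };
  for (var idx = 0; idx < valuesCountObjects.length; idx++) {
    reducedValue.count += valuesCountObjects[idx].count;
    reducedValue.qty += valuesCountObjects[idx].qty;
  }
  return reducedValue;
};

// Counter-example: depends on which value happens to come first.
var takeFirst = function (key, values) {
  return values[0];
};

var sampleValues = [
  { count: 1, qty: 5 }, { count: 2, qty: 10 }, { count: 3, qty: 15 }
];

var sumIsStable = isOrderInsensitive(sumCounts, "myKey", sampleValues);   // true
var firstIsStable = isOrderInsensitive(takeFirst, "myKey", sampleValues); // false
```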
Ensure Reduce Function Idempotence
Because the map-reduce operation may call the reduce function multiple
times for the same key, and does not call it for keys that have only a
single value in the working set, the reduce function must return a value
of the same type as the value emitted from the map function. You can test
that the reduce function processes already "reduced" values without
affecting the final value.
1. Define a reduceFunction2 function that takes the arguments keySKU and valuesCountObjects. valuesCountObjects is an array of documents that contain two fields, count and qty:

   var reduceFunction2 = function(keySKU, valuesCountObjects) {
      // Declare the accumulator locally so it does not leak into the global scope.
      var reducedValue = { count: 0, qty: 0 };

      for (var idx = 0; idx < valuesCountObjects.length; idx++) {
         reducedValue.count += valuesCountObjects[idx].count;
         reducedValue.qty += valuesCountObjects[idx].qty;
      }

      return reducedValue;
   };

2. Define a sample key:

   var myKey = 'myKey';

3. Define a sample valuesIdempotent array that contains an element that is a call to the reduceFunction2 function:

   var valuesIdempotent = [
      { count: 1, qty: 5 },
      { count: 2, qty: 10 },
      reduceFunction2(myKey, [ { count: 3, qty: 15 } ] )
   ];

4. Define a sample values1 array that combines the values passed to reduceFunction2:

   var values1 = [
      { count: 1, qty: 5 },
      { count: 2, qty: 10 },
      { count: 3, qty: 15 }
   ];

5. Invoke the reduceFunction2 first with myKey and valuesIdempotent and then with myKey and values1:

   reduceFunction2(myKey, valuesIdempotent);
   reduceFunction2(myKey, values1);

6. Verify the reduceFunction2 returned the same result:

   { "count" : 6, "qty" : 30 }
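The same idea can be checked more systematically. The sketch below is an illustration under stated assumptions, not a MongoDB API: the hypothetical isIdempotent helper reduces a leading slice of the values, feeds that partial result back in with the remaining values, and compares the outcome with a single pass over all values. countByLength is a deliberately faulty counter-example that counts array elements instead of summing the count field.

```javascript
// Hypothetical check: for every split point, reduce the head of the array,
// then reduce that partial result together with the tail, and compare the
// outcome with reducing the whole array at once.
function isIdempotent(reduceFn, key, values) {
  var expected = JSON.stringify(reduceFn(key, values));
  for (var split = 1; split <= values.length; split++) {
    var partial = reduceFn(key, values.slice(0, split));
    var rereduced = reduceFn(key, [partial].concat(values.slice(split)));
    if (JSON.stringify(rereduced) !== expected) return false;
  }
  return true;
}

// Stand-in for reduceFunction2: sums the count and qty fields.
var sumCounts = function (keySKU, valuesCountObjects) {
  var reducedValue = { count: 0, qty: 0 };
  for (var idx = 0; idx < valuesCountObjects.length; idx++) {
    reducedValue.count += valuesCountObjects[idx].count;
    reducedValue.qty += valuesCountObjects[idx].qty;
  }
  return reducedValue;
};

// Counter-example: uses the array length as the count, so an already
// reduced value is counted as a single element when it is reduced again.
var countByLength = function (keySKU, valuesCountObjects) {
  var qty = 0;
  for (var idx = 0; idx < valuesCountObjects.length; idx++) {
    qty += valuesCountObjects[idx].qty;
  }
  return { count: valuesCountObjects.length, qty: qty };
};

var sampleValues = [
  { count: 1, qty: 5 }, { count: 2, qty: 10 }, { count: 3, qty: 15 }
];

var sumIsIdempotent = isIdempotent(sumCounts, "myKey", sampleValues);        // true
var lengthIsIdempotent = isIdempotent(countByLength, "myKey", sampleValues); // false
```

The final split point, where the whole array is reduced and the single result is reduced again, corresponds to the case in which the operation calls reduce a second time on its own output for the same key.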