$sample from an internal document list

Experiment Setup

db.SAMPLE_COLLECTION.insertMany([
	{ list: [ 1, 2, 3, 4, 5, 6, 7, 8 ] },
	{ list: [ 1, 2, 3, 4, 5 ] },
	{ list: [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 ] }
])

This produces a collection that looks roughly like this (_id values simplified for readability):

{ _id: 1, list: [ 1, 2, 3, 4, 5, 6, 7, 8 ] },
{ _id: 2, list: [ 1, 2, 3, 4, 5 ] },
{ _id: 3, list: [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 ] }

I want my output to look something like this (the exact numbers in each list don’t matter, so long as there are 3 randomly sampled numbers)

{ _id: 1, list: [ 1, 4, 7 ] },
{ _id: 2, list: [ 2, 3, 5 ] },
{ _id: 3, list: [ 1, 6, 11 ] }

I want to sample 3 items from each list. I’m trying to figure out how to do this with $unwind and $sample in an aggregation pipeline.

If I try

db.SAMPLE_COLLECTION.aggregate([
	{ $unwind: '$list' },
	{ $sample: { size: 3 } }
])

I will only get 3 results back total, instead of 3 results per list. I can see how I might do this if I processed only one document at a time, but I’m not sure how to process only one document at a time in an aggregation pipeline.

Any suggestions?

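You can use $rand to generate 3 random array indexes for each document, so no $unwind is needed.

You would then use $filter on list to keep the values at those indexes. Keep in mind that independently generated indexes can repeat, so you may need to handle duplicates if you want 3 distinct items.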
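Here is a rough sketch of one way to do this per document. Note that it is a variation on the suggestion above rather than a literal implementation of it: instead of drawing indexes and filtering, it tags every element with its own $rand key, sorts the array by that key, and keeps the first 3, which avoids duplicate picks. It assumes a server recent enough to support $rand and $sortArray (5.2 or later, as far as I know). The names SAMPLE_SIZE and pipeline are made up for the example.

```javascript
// Sketch only: sample up to 3 distinct elements from each document's `list`.
// SAMPLE_SIZE and `pipeline` are illustrative names, not from the thread.
const SAMPLE_SIZE = 3;

const pipeline = [
  {
    $set: {
      list: {
        $let: {
          vars: {
            // Pair every element with its own random sort key.
            keyed: {
              $map: {
                input: '$list',
                as: 'v',
                in: { v: '$$v', r: { $rand: {} } }
              }
            }
          },
          in: {
            // Sort the pairs by the random key, keep the first SAMPLE_SIZE,
            // then drop the key so only the original values remain.
            $map: {
              input: {
                $slice: [
                  { $sortArray: { input: '$$keyed', sortBy: { r: 1 } } },
                  SAMPLE_SIZE
                ]
              },
              as: 'p',
              in: '$$p.v'
            }
          }
        }
      }
    }
  }
];

// In mongosh, where `db` is defined, run the pipeline against the collection.
if (typeof db !== 'undefined') {
  printjson(db.SAMPLE_COLLECTION.aggregate(pipeline).toArray());
}
```

Since everything happens inside a single $set stage, each document is processed on its own and comes back with its _id intact. A list with fewer than 3 elements is returned whole, just shuffled.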
