Iterable/cursor issues in Node.js

I’m using the latest MongoDB driver for Node.js (3.5.3), but I’m having issues with cursors.

I’m planning on processing a table of 450k+ rows and doing some async operations so obviously I won’t want to use toArray() first.

  1. A simple cursor.forEach isn’t working – the async function isn’t being called at all.

    const cursor = client.db().collection('properties').find().limit(15);
    cursor.forEach(
         async function(row){
             return knex('properties').insert(row).then(console.log).catch(console.error);
             }
         ,async function(err){
             if(err) console.error(err);
             await client.close();
             console.log('done');
             process.exit();
             })
    
  2. When I try simply using Bluebird’s Promise.map (which should handle async and concurrency), I get UnhandledPromiseRejectionWarning: TypeError: expecting an array or an iterable object but got [object Null]

  3. When I use npm: mongo-iterable-cursor to convert the cursor, I get the same error with Bluebird. (maybe that’s for driver v2?)

  4. When I use native iterator code found on Stack Overflow (javascript - Async Cursor Iteration with Asynchronous Sub-task - Stack Overflow)

    for await ( let row of cursor ) {
         console.log(row._id)
         return Promise.delay(2000);
         }
    

    We never get to the second row.

  5. Similar with this code, we never get to the second row: node.js - Iterating over a mongodb cursor serially (waiting for callbacks before moving to next document) - Stack Overflow

     while(await cursor.hasNext()) {
     	const row = await cursor.next();
     	console.log(row._id);
     	return Promise.delay(2000);
     	}
    

    (The Promise.delay here stands in for my knex DB calls, which show the same problem.)

Besides the cursor simply not working, I want to use Bluebird for processing the cursor since it has a concurrency setting. I’ve never seen an explanation of how cursor.forEach handles awaits.

What am I doing wrong? Suggestions? Thanks!

For your 4th and 5th examples, you never see the second result because you have a return inside the loop. A return exits the enclosing function immediately, so the loop stops after its first iteration.

For example, if I have no return:

let get_docs = async function(collection) {
  const cursor = collection.find({})
  for await (const doc of cursor) {
    console.log(doc)
  }
}

called from this main function:

let run = async function() {
  // ... connect to database ...
  await get_docs(conn.db('test').collection('test'))
  await conn.close()
  console.log('db closed')
}().catch(console.error)

it will print the whole collection as expected and then print db closed (using my test data):

{ _id: 0 }
{ _id: 1 }
{ _id: 2 }
db closed

However if I put a return like in your example:

let get_docs = async function(collection) {
  const cursor = collection.find({})
  for await (const doc of cursor) {
    console.log(doc)
    return
  }
}

it will instead print:

{ _id: 0 }
db closed

It only printed the first document and then db closed, because the return made get_docs itself finish during the first (and only) pass through the for await loop.
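
To see the difference without a database, here is a minimal sketch that swaps the cursor for an async generator. `fakeCursor` and `delay` are made-up stand-ins (not driver or Bluebird APIs); the only point is how `return` and `await` behave inside `for await`:

```javascript
// Stand-in for a MongoDB cursor: an async generator yielding three fake documents.
async function* fakeCursor() {
  yield { _id: 0 };
  yield { _id: 1 };
  yield { _id: 2 };
}

// Hypothetical helper: resolves after `ms` milliseconds.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// `return` hands the promise back to the caller and leaves the function,
// so the loop body runs only once.
async function withReturn(seen) {
  for await (const doc of fakeCursor()) {
    seen.push(doc._id);
    return delay(10);
  }
}

// `await` pauses inside the loop, then the loop moves on to the next document.
async function withAwait(seen) {
  for await (const doc of fakeCursor()) {
    seen.push(doc._id);
    await delay(10);
  }
}
```

Calling `withReturn(seen)` leaves `seen` holding only the first `_id`, while `withAwait(seen)` collects all three.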

I don’t know why you return Promise.delay(2000) in your code, but if you want to wait a specific time between loop iterations, you can await a promisified setTimeout. For example:

const util = require('util')

let get_docs = async function(collection) {
  const sleep = util.promisify(setTimeout)
  const cursor = collection.find({})
  for await (const doc of cursor) {
    console.log(doc)
    await sleep(2000)
  }
}

this will print:

{ _id: 0 }
...2 seconds wait...
{ _id: 1 }
...2 seconds wait...
{ _id: 2 }
...2 seconds wait...
db closed
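
If the goal is to run several async operations at once (the question mentions Bluebird’s concurrency setting), one simple option is to collect documents into small batches and wait for each batch with Promise.all. This is a rough sketch, not Bluebird’s Promise.map: `processInBatches` is a hypothetical helper, and it waits for a whole batch to finish before reading more, rather than keeping a sliding window full. It relies only on the cursor working with `for await`, as in the examples above:

```javascript
// Hypothetical helper (not part of the MongoDB driver or Bluebird).
// Reads items from any async iterable and runs `worker` on them,
// at most `batchSize` at a time.
async function processInBatches(iterable, batchSize, worker) {
  let batch = [];
  for await (const item of iterable) {
    batch.push(item);
    if (batch.length >= batchSize) {
      // Start the whole batch, then wait for every call to settle successfully.
      await Promise.all(batch.map((entry) => worker(entry)));
      batch = [];
    }
  }
  // Process whatever is left over after the iterable is exhausted.
  if (batch.length > 0) {
    await Promise.all(batch.map((entry) => worker(entry)));
  }
}
```

With a real cursor, the call might look like `await processInBatches(cursor, 10, (row) => knex('properties').insert(row))`, assuming the knex setup from the question. If any worker rejects, the returned promise rejects and no further documents are read.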

For more ways to iterate over a cursor, async or otherwise, see the Node.js driver manual page.

Best regards,
Kevin