MongoDB migration

Hi,

I am trying to come up with a NodeJS script to migrate data from one collection to another collection in a different database using mongoose. The schema of the target collection maybe different but the collection names are the same.

But the issue I am facing is we cannot connect to 2 different DB’s at the same time from a single script and also we cannot create 2 different models with the same name.

I have tried “mongodump” and “mongorestore”, but the issue if for some reason a document cannot be migrated to target collection I need the process to stop and show error and rollback. Also I need the flexibility to manipulate data before migrating to target collection

Can anyone please suggest a solution?

I found this link online which mentions the above limitations

Regards,
Harsha

Hello @Harsha_Buggi, Welcome to the MongoDB community forum,

There are lots of ways to do your requirement in the link that you shared.

It will be helpful if you post your try and where you fail, so I can help you with that details.

Hi Vishal,

I couldnt connect to 2 different data bases ( 2 different ATLAS accounts ) in the same script. I tried to do the following

  1. Connect to first DB
  2. Read collection
  3. Disconnect from DB
  4. Connect to second DB
  5. Update data to collection in second DB

But I found we cannot compile 2 models with same name. This is the code.

const mongoose = require('mongoose');
const { Schema } = mongoose;

async function connectToDb() {
    try {
        await mongoose.connect("mongodb+srv://<URL>/<DB_name>?retryWrites=true&w=majority", {
            useNewUrlParser: true,
            useUnifiedTopology: true
        });
        console.log('DB connection successfull');

        const Users = mongoose.model('user', Schema({
               ...
               ...
          }));

        users_list = await Users.find()
        console.log(users_list)
        mongoose.connection.close()
        await mongoose.connect("mongodb+srv://<URL>/<second_ DB_name>?retryWrites=true&w=majority", {
            useNewUrlParser: true,
            useUnifiedTopology: true
        });
        console.log('DB 2 connection successfull');
        const Users_DB2 = mongoose.model('user', Schema({
                ...
                ...
          }));
        users_second_list = (await Users_DB2.find()).forEach(function (x) {
            console.log(x)
            console.log('insert to collection in second DB')
        });
        mongoose.connection.close()
    } catch (err) {
        console.log(err);
    }
}
connectToDb();

This is the error I am getting

OverwriteModelError: Cannot overwrite `user` model once compiled.

The model for the second DB may or may not have the same schema as first DB. Also if I am able to get past this error I would like to get the code to insert data to second DB where the line I have mentioned as “console.log(‘insert to collection in second DB’)”. If the insert operation fails I need to be able to detect it and take action like roll back the copy operation.

Also it would be great if you can tell me how to be connected with both DB when I am migrating data instead of having to close connection to DB.

Regards,
Harsha

Hello Harsha,

You need to store the specific connection in a variable and create a model using that variable, means should be 2 separate variables for both connections.

Ex:

const firstConn = mongoose.connect("first connection url");
const firstSchema = firstConn.model("user", schema);

const secondConn = mongoose.connect("second connection url");
const secondSchema = secondConn.model("user", schema);

Well I have created a quick demo in GitHub and it is working, It will just connect 2 separate servers connection and insert a document, you can modify it as per your use, this is just to clear the things how you can implement it.

Hi Vishal,

I tried what you suggested and storing the connection to a variable in my script. But I still get the same error. What am I missing ?

let conn = await mongoose.connect("mongodb+srv://<URL_1>/<DB_name>?retryWrites=true&w=majority", {
            useNewUrlParser: true,
            useUnifiedTopology: true
        });
const Users = conn.model('user', Schema({
               ...
               ...
          }));

let conn_2 = await mongoose.connect("mongodb+srv://<URL_2>/<second_ DB_name>?retryWrites=true&w=majority", {
            useNewUrlParser: true,
            useUnifiedTopology: true
        });
const Users_DB2 = conn_2.model('user', Schema({
               ...
               ...
          }));

Error is

OverwriteModelError: Cannot overwrite `user` model once compiled.

I am going through your code in GitHub. But I would appreciate it if you could tell me what is wrong with my script.

In my scenario the 2 different DB’s and collections already exist. I am currently unable to read from both using a single script.

Regards,
Harsha

If mongoose stops you from doing exactly what you want to do you may try connecting to the read-from server using pure MongoDB API and you mongoose for the write-to server.

If I understand @turivishal’s example, the schema is the same for both models, while yours

If you must absolutely read using the mongoose model API, you may try to temporarily rename your write-to model to something like newuser just for the purpose of the migration. Once the migration is done you may use aggregation $out to move the newuser collection to user.

Another avenue you may look at is to use mongoose’s schema versioning. I really do not know how this work, but may be you can read with __v:0 and write with __v:1.

Issues like this is one reason why I avoid abstraction layers like mongoose. It is so much easier in pure mongo.

1 Like

Hello @Harsha_Buggi,

Can you please try the demo that I have provided on GitHub, Sorry I did not explain the code here, let me do that first,

You have to use the createConnection method to connect with the specific MongoDB server, where I have created a common connect.js file to reuse this code for the second MongoDB server connection.

This will return a connection object when you import it in the main file.

const mongoose = require("mongoose");
module.exports = (name, connString) => {
    const db = mongoose.createConnection(connString);
    db.on('connected', () => {
        console.info(`${name} MongoDB connection succeeded!`);
    });
    db.on('error', (err) => {
        console.error(`${name} MongoDB connection failed, ` + err);
        db.close();
    });
    db.on('disconnected', () => {
        console.info(`${name} MongoDB connection disconnected!`);
    });
    process.on('SIGINT', () => {
        mongoose.connection.close().then(() => {
            winston.info(`${name} Mongoose connection disconnected through app termination!`);
            process.exit(0);
        });
    });
    // EXPORT DB OBJECT
    return db;
}

Let’s move to the main index.js file, Update you constants

// UPDATE YOUR DEFAULTS
const firstDB = {
    name: "First", // whatever for display purpose in console
    connStr: "your first db connection url",
    db: "mydb",
    coll: "mycoll"
};

Let’s import connect file and connect your first db,

// CONNECT DB
const firstConn = require("./connect")(firstDB.name, firstDB.connStr);

Create model and schema, as you said you have different schemas for both collections so this will create a nonstrict schema, by providing empty {} schema property and strict: false in schema options. and the useDB method will select the database for real-time operations

// CREATE MODEL & SCHEMA
const FirstSchema = firstConn.useDb(firstDB.db).model(
    firstDB.coll, 
    new mongoose.Schema({}, { strict: false, collection: firstDB.coll })
);

Now you can do any operation by the above schema object, as I did insert the operation,

// DO TRANSACTIONS WITH SCHEMA
async function firstInsert() {
    let doc = await FirstSchema.create({ name: "test", calledAt: new Date() });
    console.log(doc);
}
firstInsert();

You can repeat the same steps that we did above for the second MongoDB server connection, you can see my code in GitHub that I have already done, and also I am adding here by combining them all together,

// UPDATE YOUR DEFAULTS
const secondDB = {
    name: "Second", // for demo
    connStr: "your second db connection url",
    db: "mydb",
    coll: "mycoll"
};
// CONNECT DB
const secondConn = require("./connect")(secondDB.name, secondDB.connStr);
// CREATE MODEL & SCHEMA
const SecondSchema = secondConn.useDb(secondDB.db).model(
    secondDB.coll, 
    new mongoose.Schema({}, { strict: false, collection: secondDB.coll })
);
// DO TRANSACTIONS WITH SCHEMA
async function secondInsert() {
    let doc = await SecondSchema.create({ name: "test", calledAt: new Date() });
    console.log(doc);
}
secondInsert();
1 Like

HI @turivishal ,

Thank you so much !!!

I tried your code and it works. I am able to connnect to 2 different DB’s in 2 different ATLAS accounts and read “users” collection from both and display data.

I just replaced the connection strings and DB and collection names in your code and instead of writing I did a .find() and its working.

let doc = await FirstSchema.find();
let doc = await SecondSchema.find();

I also tried to add an actual schema object where you are passing an empty one in this line. I will try to check if it works if I pass different schemas here. Would you be able to tell me if it will work ?

new mongoose.Schema({}, { strict: false, collection: firstDB.coll })

I just tried adding the property “{ strict: false, collection: ‘users’ }” to the “Schema” in my script but it didnt work.

I still have a long way to go (if you see my requirement from my first post) and I am going over your code to understand all the parts.

@steevej
Thank you so much for your suggestions !!!
I am sure I will use them at some point …
Is there any readymade script I can use using MongoDB API ? I am just trying to assemble something that works and meets my requirement.

Regards,
Harsha

1 Like

Hello @Harsha_Buggi,

Glad you solved the connection issues, I think now it is easy, you just need to put your logic of migration and insert/update it into a new collection or log the error.

It should work, make sure you are using the latest version of Mongoose npm, the purpose of .Schema({}, and strict: false is to make schema flexible, there is no restriction to insert new properties in the collection, but it is okay if you do fix schema properties, it will work. like this .Schema({ // your schema properties }, { collection: "collection name" })

I am not sure if is it available or not…

@turivishal, I am still understanding the parts. But this is exactly what I needed and works like a charm. This is how I am currently using it to copy data from one collection to another (I am using the function names you gave - but you get the idea … )

async function secondInsert() {
    source_data =  await firstInsert();
    source_data.forEach(async element => {
        let result = await SecondSchema.collection.insertOne(element);
        console.log(result)
    });
}

Thank you.

I will try to keep you posted on its development.

Regards,
Harsha

1 Like

@turivishal could you tell me how to get a list of all collections in the DB’s in your script ? I am unable to find how to do it.

I searched online and got this page but couldnt get anything that worked.

Regards,
Harsha

Hello Harsha,

You can use listCollections method to get collections, before that you need to update the below line because I have done it directly with a model in my script,

const firstUseDB = firstConn.useDb(firstDB.db);

Now check is connection open then get the list of collections,

firstConn.on('open', () => {
  firstUseDB.db.listCollections().toArray().then((names) => {
    console.log(names)
  })
  .catch((err)=>{
    return err;
  });
});

@turivishal thanks again !!! This works. I was earlier tried to do something like this

async function firstInsert() {
    my_db = firstConn.useDb(firstDB.db)
    console.log(my_db.listCollections())
    let doc = await FirstSchema.find();
    //console.log(doc);
    return doc
}

… but it wouldn’t work and I get error “UnhandledPromiseRejectionWarning: TypeError: my_db.listCollections is not a function”. I thought it’s enought to be connected to DB. It appears we need to wait for “open” event too. I am trying to look it up online.

2 Likes