Advantages and disadvantages of Stored Procedure equivalents

Nadav_Megnazi · November 18, 2021, 6:21pm

Hey,
I’ve been using SQL databases in the last couple of years and one of their great tools is the Stored Procedures.
One of Stored Procedures many advantages is its performance, security and ease of use.
I searched for an equivalent to it in MongoDB and came across this post.
I was wondering - in terms of performance and security, are any of the suggested solutions in the post similar to stored procedures?

Xavier_Robitaille · November 18, 2021, 11:57pm

Hi @Nadav_Megnazi,

PERFORMANCE
In terms of performance , The MongoDB Aggregation Pipeline is executed on the database and can be very fast, as mentioned by @MaBeuLux88_xxx (MongoDB employee / power user).

When I build a complex aggregation pipeline that I want to be able to re-use easily, I bundle it in a MongoDB Server-side function. Note that I share @MaBeuLux88_xxx’s concerns about stored procedures, and I apply the same guiding principles with my MongoDB Server-side functions. Namely:

they should be version controlled alongside the rest of the application’s code
one should not put business logic in the stored procedure / MongoDB Server-side function
i.e. use such procedures / functions sparingly, e.g. if the Aggregation Pipeline in your function becomes unwieldy, that’s a cue you may want to do things differently (e.g. break it down and put some of the logic in your server side code).
I would avoid using Javascript in my MongoDB Server-side functions (e.g. I keep them as close as possible to one simple Aggregation Pipeline per Server-side function). My understanding is that my Javascript code will run faster on my Node server than on my MongoDB Server (because my server is hot and Node will apply JIT optimizations, and because at this point it’s easier for me to scale my Node servers than my MongoDB server).

SECURITY
It’s interesting that you mention security, because it’s a question I’ve asked myself. I’m aware of SQL Injection Attacks (SQLIA), and how even stored procedures can be vulnerable to SQLIA.

I would be interested to know what @MaBeuLux88_xxx has to say with regards to query injection type attacks in MondoDB. In particular, I found it telling that roughly a quarter of the documentation on MongoDB Server-side function is dedicated to disabling server-side execution of JavaScript… I actually wish there was more documentation on MongoDB Server-side function, etc.

Regards,
Xavier Robitaille
Feather Finance

John_Page · November 19, 2021, 8:41am

Hi,
I’m an 8+ year MongoDB employee and have a lot of experience with Aggregations both internally and with customers.

There is a historic artifact in MongoDB that allows you to store a JS function in a collection which can then be used in Javascript running on the server. The can be used in $where, $map-reduce and the now deprecated eval() however the sandbox they run in has no access to the database, it cannot run aggregations, or find() or make any changes - they are only used in explicitly comparing a set of documents they are presented with and long ago were replaced with aggregation functions for the same reason.

Personally I do have code where I store aggregation pipelines in collections and then the client code retrieves them and runs them - pipelines are seldom long enough for this to be an inefficient option as these are usually interactive choices or loaded by the client on startup and cached.

As for Security - MongoDB does not combine any of the things sent to it - by the time it is sent to the server all evaluation and parsing is complete so the server cannot facilitate an injection however. If the developer is using Javascript on the front or back end there are lots of opportunities for client-side/application-side injection attacks for example if a node app gets posted a username and password and then doesn’t explicity sanitise them but treats them as Javascript values - imagine your JS/Node based coded does

(user,pass) = (req.query.user,req.query.pass)
user = db.appusers.findOne({ user,pass))
if (user == null) return false;

In this case - someone passing a pass in the query string of { $ne: "xxx"} will sucessfully log in - because the Javascript code interpreted something it shouldn’t - it’s not MongoDB that’s vulnerable to injection - but Node definitely is.

The above code should really be

(user,pass) = (req.query.user,req.query.pass)
user = db.appusers.findOne({ `${user}`,`${pass}`))
if (user == null) return false;

Xavier_Robitaille · November 19, 2021, 12:16pm

Hi @John_Page,

Thank you for your detailed answer! You and @MaBeuLux88_xxx are the best! You two threads will become the reference for people googling “MonoDB stored procedures”

Can I ask two clarifications which I believe could benefit the community? I will reference a concrete example below, one of my MongoDB Server-side functions (it does a simple aggregation whereby it “joins” data from two collections):

I am not sure I understand what you mean when you say MongoDB Server-side functions “cannot run aggregations”. I can call my function getUser("john.doe@unknown.com"), either from my mongo shell or from Node. Are you referring to other types of aggregations that cannot run?
When you say “the client code retrieves them and runs them” are you referring to MongoDB Server-side functions stored in system.js? Or are you referring to storing aggregation pipelines code e.g. in a non-system collection? When you say that the “client code […] runs them”, could the “client” (from the perspective of MongoDB) in fact be a node server (from the perspective of the web app)?

// collection: system.js
{
   "_id" : "getUser",
   "value" : Code("
      function (emailAddress) {\n
         return db.users.aggregate([\n
            { $match: { email: emailAddress } },\n
            {\n
                $lookup: {\n
                    from: \"activitiesbuckets\",\n
                    localField: \"_id\",\n
                    foreignField: \"user\",\n
                    as: \"activities\"\n
                }\n
            }\n
        ]);\n
    }
")
}

John_Page · November 26, 2021, 12:27pm

How are you executing the above function? As far as I am aware the only way it can be executed is by it being retrieved by the client (Shell, Node.JS app etc) and then sent executed outside the database sending the aggregtion back to the server like any other bit of code. Logically it also cannot be called by any other programming language other than JS as it’s not running on the database server but in the app code. system.js is not really any different to any other collection and MongoBD is just being used to store ‘data’ for your app - the data just happens to be pipeline defintions. No part of this is server side functions , stored procedures or server side javascript.

I could be wrong however but my understanding is all that was disabled years ago.

Xavier_Robitaille · November 26, 2021, 2:32pm

Hi @John_Page, I am indeed calling this function from either a Mongo Shell or NodeJS app after running loadServerScripts(). I agree with you that it is nothing more than a pipeline definition that’s stored in the database, but that doesn’t make them any less useful / worthy of documentation. To the extent that system.js & loadServerScripts are useful tools provided by MongoDB, they should be well documented: provide the pros and cos of .

In my view stored procedures (whether SQL or no-SQL) can be part of a good database design when used correctly, which is the reason developers like me and @Nadav_Megnazi search google for “MongoDB stored procedures”, and are disappointed by the results. In my view, system.js & loadServerScripts provide the essence of what SQL stored procedures do. E.g any client can call function getUser to load the user, without having to know the mechanics of how the user is stored.

Xavier Robitaille
Feather Finance

John_Page · December 7, 2021, 9:53am

Hi Xaiver,
I agree that a shared set of functions used by everyone that encapsulate and hide the schema (and ideally business logic) is the way to design software. We teach the importance of using DALs in code design and many if not most developers now take that further wrapping what you do with ‘stored procedures’ into micro-service APIs. If you are using a dynamic programming language like Javascript then you do have the option to effectively store run-time-loaded code on the server, you can do the same in python and system.js is simply a convention - the code is stored as a string in BSON after all. Load server scripts is just an obscure part of the MongoDB shell - and the shell was never intended to be anything but a test and debugging tool although we have seen some people use it as part of production, I wasn’t aware you could access it from node.

My experience of the nature an purpose of true stored procedures is very different - in my worldview they exist to ensure logic runs at and in the database server - this is important with and RDBMS where you may need to perform a number of operations inside a transaction and the overhead of the client/middleware making multiple TCP calls to the server to submit all operations individually is too much - a stored procedure greatly reduces the time between begin and commit - during which the database is having to lock things reducing concurrency. This is one reason I recommend avoiding MongoDB transactions in favour of correct document modelling - the document model was created to solve just this problem.

The other use of stored procedures is about correctness - if you ONLY give access via stored procedures you can enforce business logic and security server-side. This is less important since the advent of three tier architectures in the early 2000s but really mattered when we were all running Windows. executable applications talking directly to the database.

The downside of true stored procedures is that the work is being done in the database server - which limits scalability - by moving it out to app-servers as mongoDB encourages you don’t have that issue.

So if you are still with me - if you are building application servers that dynamically pull their code from inside MongoDB it’s cool that you are able to do so - it’s an very very uncommon pattern and I’m on the fence if it’s a good idea or not versus simply maintaining a library/module that encapsulates your logic . MongoDB really doesn’t have explicit support for it that system.js collection was intended to support in-database stored procedures back when someone foolishly thought we should have them but the nature of MongoDB means it can store source code and you can fetch and run it dynamically in a few languages.

Xavier_Robitaille · December 9, 2021, 8:05pm

@John_Page, thank you very much for this thorough and well written discussion on the evolution of stored procedure and DAL. I went over and beyond what I hoped to achieve by posting on this forum. I learned a great deal, and I will reference this thread for years to come.

My takeaway: system.js javascript functions where I store aggregation pipelines are not necessarily evil, however they should be limited to small “inline” functions. The main reason for this is that code execution on the DB server doesn’t scale as well as code that runs as a micro-service. I will follow this principle: store complex aggregation pipelines in system.js when I feel like they will benefit from being encapsulated, but migrate them to a micro-service if they start using up significant database server resources.

John_Page · December 14, 2021, 1:53pm

Hi Xavier - I think you may have misunderstood me- MongoDB almost never runs JS on the server, certainly not from system.js. loadServerScripts is not running them on the server - it’s running them in your shell, on your local computer, its just loading js code from the DB rather than a file on the local disk. In your function above, it runs this js on the client it then passes a BSON message to the server with the code of your pipeline in, no Javascript running on the server. Nothing wrong with that but a bit archaic .

Once - people would call these functions from mapReduce (Deprecated in 2013) or $where (Effectively deprecated about 2016) but most people 100% disable the javascript interpreter in the server anyway.

Xavier_Robitaille · January 3, 2022, 3:05pm

For reference, I discovered this eBook on Practical MongoDB Aggregations (written by a MongoDB engineer?). It explains how the Aggregation Framework fits in Mongo’s overall data query framework, and it also gives a history of Mongo’s server side Javascript, why it was implemented then deprecated to be replaced by Aggregations.

It’s an interesting reference for anyone who wants 1) deeper knowledge of Aggregations, and 2) understand why Mongo is currently the way it is. The book quotes @John_Page .

Xavier Robitaille
Feather Finance

MaBeuLux88_xxx · January 11, 2022, 6:28pm

Yes, @Paul_Done wrote that book. He is currently an Executive Solution Architect @MongoDB.