Developer feedback for ease-of-use improvements for Go Driver

TopherGopher · March 3, 2020, 5:17am

I’d like to continue the conversation we started in the Google Group about an ease-of-use feature-set - making day-to-day operations easier/more compact.
I can copy the contents over, but it sounds like that thread will be archived.
Ultimately, I’d love to work with the team and the community to help develop a really straightforward interface and I am happy to use my own hours to get it done.

TopherGopher · March 4, 2020, 5:48pm

Also adding this ticket as an example of ease-of-use features for the library: https://jira.mongodb.org/browse/GODRIVER-903
ListCollections() and Indexes().List() both return a cursor that you have to define a custom type and unmarshal into. Why don’t they just return a slice of Collection{} and a slice of IndexModel{} instead, so that I as a developer can access things WAY more efficiently.

Divjot_Arora · March 12, 2020, 2:57am

Hi Christopher,

We currently have our hands full working on the 1.4.0 driver release, which will correspond with the MongoDB 4.4 server release. Once that’s over, I plan to organize a meeting with the team to go over the feedback you outlined and get you an answer about which tickets we can re-open/accept PRs for now and which ones are already planned for cross-drivers work in the future (e.g. client side operations timeouts). Does this sound OK to you?

– Divjot

TopherGopher · March 18, 2020, 4:49pm

Actually Divjot, I was hoping we could take a slightly different tack.
I understand that the driver tries to use the API fairly exclusively and at this point we can’t introduce breaking changes, and my primary qualm is with how hard the functions themselves are to use. It takes me forever to get teammates spun up and it’s hugely error prone because there are just so many knobs you have to turn and so many gotchas you have to check for.

I would like to add another package to mongo-go-driver - mongo-for-humans or something similar which still uses mongo-go-driver, but abstracts a lot of the complexity away from the developer. That way, if you want, you can have full control, or if you are just doing something common (e.g. SelectAll with no timeout), then it’s really easy. If you support this route, I would love to just come up with an interface spec together here and I’m happy to do a majority of the work.

Divjot_Arora · April 6, 2020, 1:39pm

I understand that there are things that are more difficult to do in our API compared to mgo and many of these are because we have a set of specs to follow and have to ensure that we offer a way to work with both raw BSON and decode BSON into native Go types so an application can choose what it wants to do based on performance constraints. I’m not convinced that another package is the right way to proceed, though. Many of the BSON issues you linked in your original post have been solved by the mgocompat.Registry, released in v1.3.0 of the driver. This new BSON registry was introduced to address the issues users were facing when migrating BSON libraries. Some of the other BSON features linked would have to be solved in our BSON library, not in a helper. An example is a struct tag to treat string fields as ObjectIDs. Many of the CRUD helpers in the tickets you linked (e.g. FindAll) would be added to the main mongo package rather than having users remember that some helpers are in mongo and others are in a separate helper package.

TopherGopher · April 6, 2020, 5:52pm

Hey Divjot -

Happy Monday

I truly appreciate you working the tickets I linked - that will make life easier for most.

My fundamental issue is how verbose normal functions are to implement.

Ultimately, I’m about the least about of key strokes - I like to write lean, clean code, and unfortunately it’s not possible to be DRY with the current go library for standard tasks.

The problem is there are too many knobs exposed that have to be turned. I want to leverage all these awesome and cool features you’ve exposed to make some really simple, less error prone functions. Look at how clean and readable this line is. It’s really easy for me to teach new developers - they just check a single error object. Forget about how we would do it for a moment, and just compare:

if err = coll.Find(query).All(&sliceToUnpackTo).WithTimeout(time.Minute); err != nil {}

Here’s how I have to do that in today’s world:

   ctx, cancel := context.WithTimeout(context.Background(), time.Minute)
   defer cancel()
   cursor, err := coll.Find(ctx, query)
   defer cursor.Close(context.Background())
   for cursor.Next(context.Background()) {
       if err = cursor.Decode(&someObj); err != nil {
           break
       }
       sliceToUnpackTo = append(sliceToUnpackTo, someObj)
   }
   if err == nil {
       err = cursor.Err()
       if err == nil && len(sliceToUnpackTo) { err = ErrNotFound }
   }
   if err != nil {
       return err
   }

No need to deal with using context or cancel functions or closing the cursor. (They always forget the defers). No need to play with defering the close of the cursor. Every time they implement, I always see them forget to check for a decode error or a cursor error, or setting the error not found. Usually, day to day, this is what our developers are doing. We’ve needed to actually use the cursor to do something interesting three times across literally hundreds of functions. It’s useful to have, but man it makes for verbose and repeated code.

I want a standard mongo library to have functions to make unpacking and timeouts easy. Because of the current mongo-go-driver design though, introducing this friendly syntax there isn’t possible because of backwards breaking changes. This is why I suggest a different package to wrap mongo-go-driver.

Do you follow my logic?

TopherGopher · April 13, 2020, 10:21pm

I can use .All() as well, but my point is the number of errors you have to check when you’re doing it is still higher. It would be nice to bundle that up a little.

Divjot_Arora · April 16, 2020, 9:50pm

All is definitely a more concise way of doing this. Changing how we do timeouts isn’t really on the table. Parts of the Go language that were built before context.Context has been updated to accept contexts (e.g. net/http). I would be open to a PR to add FindAll/AggregateAll to condense two error checks into one, but adding a completely separate fluent API isn’t really possible, especially given the fact that Go doesn’t have method overloading.

TopherGopher · April 16, 2020, 11:48pm

As a heavy user of this library, I, a user, am providing feedback that the current interface isn’t user friendly in general compared to literally every other mongo golang package out there. I don’t think we need method overloading, so that works out - Ideally, I want a different interface on top of mongo-go-driver - wrapping the existing code in mongo-go-driver, not the API.
Let’s put how to do timeouts aside and realize that usually we aren’t using a context, we pass nil, background or todo, so why force the user to specify it on every call? Why not just add it as an option instead? We can’t make Distinct easier without breaking the library, We can’t make these kinds of changes without breaking the library. That’s why I want to work together to design some wrappers in a separate library that I’m willing to maintain myself.
There are just design decisions in this library that have been made with Mongo, not the community, in mind. And that’s fine, we have different target audiences for our libraries.

In the Google thread I think that you had mentioned trying to do an mgo shim. We could use that as an opportunity to team up, extend that further and create something awesome. I envision things like, rather than having to manipulate a cursor, you could instead read from a channel.

I think what we could do is find libraries that currently consume mongo-go-driver and see what kind of patterns are being used to help drive what users are usually doing.

Frederic_Marand · June 9, 2020, 3:02pm

The approach taken for the PHP driver is interesting here: while the original PHP extension was designed to be used in userland, the second-generation extension was reduced in scope, to provide a small and inconvenient lower-layer, on top of which the userland driver was written, and meant to be used by most developers.
We have a similar situation here, with the new driver being aligned in terms of API with other MongoDB drivers, but being arguably less convenient than mgo. It could be used to provide the basis for a more user-friendly and more idiomatic driver written on top of it.

Divjot_Arora · June 9, 2020, 3:31pm

Hi,

Sorry I haven’t updated this thread in a while. Our product manager is looking into this and conducting some field research regarding the API differences between the Go driver and mgo. As you can imagine, we want to make sure we don’t speed through this as we want this to be an official API that we can recommend to our users.We’ll post updates here once we have a concrete plan.

– Divjot

Dustin_Currie · September 16, 2020, 5:22am

We’re moving from node.js to golang. The driver is a real blocker. I’m a new user of the driver, but I’d like to say that @TopherGopher has done you are real service here. I’ve spent my first three days on a new project trying to box the driver out of the rest of my code. It’s requiring me and others to organize the code in ways that are discouraged by the go community.

Divjot_Arora · September 16, 2020, 2:28pm

To provide an update on this, we are continuing to meet with community members to gather feedback about the pain points of the driver and from there, we will figure out whether there are improvements we can make directly to the driver or we need to add a separate API.

@Dustin_Currie Sorry to hear the migration process has been tough. If you have any specific questions, feel free to create new threads about them and hopefully we can provide some advice to make things easier.

– Divjot

Dustin_Currie · October 13, 2020, 9:27pm

@Divjot_Arora while we wait for improvements, is it possible to get an escape hatch. I desperately need something that will translate bson.Raw into an interface{} that consists of only go default concrete types. e.g. bool, string, float64, nil []interface{} and map[string]interface{}. see the go blogs json post here:

Dustin_Currie · October 13, 2020, 9:40pm

I’m surprised that the mongo driver pushes its internal type system onto users. I don’t know the numbers, but it seems like a majority of people who are writing in Go with MongoDB are in my situation–a node app is too slow so we’re rewriting. In this situation, it’s going to be common to need to model deep json hierarchies as interface{}. In my situation, I have data that just needs to pass through to the front end. Writing a huge set of structs to model it, just so I can transform it from bson to json, is a real economic problem.

TopherGopher · November 6, 2020, 8:25pm

I thought comparing a few of the articles available might be telling of some common usage patterns.

Tutorials

Note how they create a global context.TODO, essentially a global nil placeholder so that they can ignore the context value a majority of the time.
Note the use of a global client (like us).
Note in InsertMany the necessity to cast to a slice of interface.
Note the lack of .All example for cursor

This is arguably the most complete documentation I’ve found for creating indices, so it’s obvious the writer has done their homework, yet note the use of a bson.M for creating an index (just like us). Having an enforced type on the index, not a bsonx/interface/Elem value, would completely eliminate this dangerous pitfall.
Note the lack of context consumption
Note the lack of .All example for cursor

https://vkt.sh/go-mongodb-driver-cookbook/

Note the use of bson.M for queries in general
Note the lack of .All example for cursor
Note the lack of context consumption

https://wb.id.au/computer/golang-and-mongodb-using-the-mongo-go-driver/

Again - a wonderful guide, but see how bson.M is used for index value - easy pitfall.
None of the examples consume context and they leave it out for brevity - perhaps using the same global nil style as severalnines?
Note the lack of .All example for cursor

The context is a global context.TODO() - so a global nil of ctx
Note how an updated/found result of 0 is overridden to mongo.ErrNoDocuments
Use bson.D for queries
Note the lack of .All example for cursor

Use bson.D for queries
Indices are created using bson.M
Note the lack of context consumption
Note the lack of .All example for cursor

Official documentation suggests structuring queries using bson.D - I would argue that bson.M ends up more readable in a majority of cases.
Even in the examples, for InsertMany, it’s necessary to case a slice to an array of interface - it seems like this is something the driver could handle without backwards breaking changes.
Every single context on the page is context.TODO - unused/nil context
Note the lack of .All example for cursor

Real-World Usage

Then I went to find projects that consume the mongo-go-driver and spot checked a few (some are garbage or test projects, but it gives a great idea of where people who are starting out struggle or miss something key): mongo package importedby - go.mongodb.org/mongo-driver/mongo - Go Packages

https://bitbucket.org/QuizKhalifa/mongowrapper/src/master/manipulation.go

Created ease-of-use wrappers to mirror mgo - Upsert/UpsertById
Context is not overridden - singleton wide cancel func. Note that canceling a single query would result in the cancelation of all subsequent queries
Ended connect with a Ping check

github.com

letsgo-framework/letsgo/blob/master/database/database.go

/*
|--------------------------------------------------------------------------
| Mongo Database Connection
|--------------------------------------------------------------------------
|
| We are using mongo-go-driver to connect to mongodb
| Connect is used to make connection
| TestConnect is used to make connection while running tests
|
*/

package database

import (
	"context"
	"fmt"
	"github.com/joho/godotenv"
	letslog "github.com/letsgo-framework/letsgo/log"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"

This file has been truncated. show original

Global mongo.Client and mongo.Database objects to ensure connection pooling
Ended connect with a Ping check

https://github.com/ornell/mongoDB/blob/master/mongodb.go

Starting creating ease of use wrappers to be able to ignore context
Use of cur.Next in a place where curr.All probably makes more sense
Each time a new collection is retrieved, connection pooling will not be leveraged

https://github.com/pipe-cd/pipe/blob/master/pkg/datastore/mongodb/mongodb.go

They’ve created a common interface to several DB backends, including mongo. This is a very clean implementation.
A global database object is not used - instead it is nested purposefully in a singleton that’s activated at initialization time.
Created helpers for updating by ID (e.g. makePrimaryKeyFilter())

github.com

se7entyse7en/npm-packages-deps-retrieval/blob/master/internal/store/store.go#L143


      
          func (ms *MongoStore) Save(ctx context.Context, r *Record) error {
          	_, err := ms.collection.InsertOne(
          		ctx, bson.M{
          			"_id":          r.ID,
          			"name":         r.Name,
          			"version":      r.Version,
          			"dependencies": r.Dependencies,
          		},
          	)
          
          
	if err != nil {
          		if ferr, ok := err.(mongo.WriteException); ok {
          			// ignore duplicate key error
          			if len(ferr.WriteErrors) > 0 && ferr.WriteErrors[0].Code == 11000 {
          				return nil
          			}
          		}
          
          
		return err
          	}

Handle IsDup is a somewhat inflexible fashion - only check for a single code from one error type

github.com

squzy/mongo_helper/blob/master/mongo.go#L35


      
          }
          
          
func (c *connector) UpdateOne(ctx context.Context, filter interface{}, update interface{}, opts ...*options.UpdateOptions) (*mongo.UpdateResult, error) {
          	return c.collection.UpdateOne(ctx, filter, update, opts...)
          }
          
          
func (c *connector) InsertOne(ctx context.Context, document interface{}, opts ...*options.InsertOneOptions) (*mongo.InsertOneResult, error) {
          	return c.collection.InsertOne(ctx, document, opts...)
          }
          
          
func (c *connector) FindOne(ctx context.Context, filter interface{}, structToDeserialize interface{}, opts ...*options.FindOneOptions) error {
          	raw, err := c.collection.FindOne(ctx, filter, opts...).DecodeBytes()
          	if err != nil {
          		return err
          	}
          	return bson.UnmarshalWithContext(bsoncodec.DecodeContext{
          		Registry: bson.DefaultRegistry,
          		Truncate: true,
          	}, raw, structToDeserialize)
          }

Check out FindOne - it looks like they were struggling with how to decode an object - but perhaps the use of the bsoncodec is intentional
FindAll runs into a common pitfall in that a shared Context is used not just for the database query creation, but also the subsequent unpacking using .All - this might or might not be intentional

github.com

2-72/gorilla-sessions-mongodb/blob/84372f8fd8ee/store.go#L214


      
          		}
          
          
	}
          	indexOpts := options.Index().
          		SetExpireAfterSeconds(int32(mstore.options.MaxAge)).
          		SetBackground(true).
          		SetSparse(true).
          		SetName(indexName)
          
          
	indexModel := mongo.IndexModel{
          		Keys: bson.M{
          			"modified_at": 1,
          		},
          		Options: indexOpts,
          	}
          	_, err = mstore.coll.Indexes().CreateOne(ctx, indexModel)
          	if err != nil {
          		return fmt.Errorf("mongodbstore: error ensuring TTL index. Unable to create index: %w", err)
          	}
          
          
	return nil

Use bson.M for index key creation

A client is maintained globally

Created helper wrappers around common mongo-go-driver functions to more closely approximate mgo
Check out how easy it is to forget to check one of the three errors when using a cursor and still have valid go

Check out how error decoding is handled - using strings.Contains as opposed to a const/enum comparison. Very expensive to do on each function call.
Refactored so everything is handled through a common function exec() - which allows them to completely disregard context unless a Timeout value has been specified in one of their options.
Use a common object for storing everything necessary to easily construct a query - very similar to the Query type in mgo.

Some commonalities across the examples:

context is used once across all tutorials / code examples that I’ve linked. To me, this denotes that it’s a much less used feature. The fact that it is passed into every function call seems to result in people creating workarounds with bad practice (e.g. creating a global context.TODO() nil ctx) in order to make code more concise.
It’s a common pitfall with index creation to use bson.M, which can result in indices being out of order. If there was an enforced type explicitly created (e.g.
For InsertMany developers have to take an array of a concrete type and cast it to an array of interface. It seems like InsertMany could be updated to take an interface{} (rather than interface slice) without backwards breaking changes.
In several instances, I found the use of a global connection object. This means that the connection does not need to be passed around to every function and there’s no chance of unnecessary reinitialization. There are not a lot of cases where globals make sense, but this does seem to be one of them. In the cases that globals weren’t used, it was apparent that it was easy to forget about connection pooling and re-connect each time. It requires a lot of careful thought and knowledge to be able to leverage the connection object correctly without making it a global.
In all of the Find examples, the queries themselves were simple, but look at the fact that 3 error handling blocks are necessary. All of those typically contain the same exact error handling code, e.g. log/panic/return error, so being able to check for errors in a single location would result in less verbose code.
Interestingly, I only saw one instance across all of these of cursor.All - an iterative cursor is consistently used. Does this mean that cursor.All() is less documented, harder to use or is it just that people enjoy the power of the cursor? I think it’s more lack of examples than anything given that no one seems to mention it.
Distinct() is not mentioned either - in fact, in googling for Distinct Examples, I can only find it mentioned in the golang tests themselves. It’s not in the documentation example code. The test would lead me to believe that the recommended method for unpacking an interface would be to marshal it via JSON, but really it’s way more efficient to cast it back to its expected type.
It would be nice to provide the ability to not have to call Ping manually, but instead pass an option that forces Connect to do a ping check automatically.

TopherGopher · November 6, 2020, 8:35pm

There are 3 other feature gaps that I’m hoping we can address somehow.

Provide the ability to view the raw query results before they are unpacked to an interface. Think about the use-case of developing an aggregation pipeline query. When using group/project, it’s sometimes confusing to understand the data that’s coming back, so it’s difficult to craft a receiver struct. The best thing you can do now is unpack to an interface, which has it’s own side-effects. Being able to see the same return that comes back as though you’ve run the query via the mongo CLI would be extremely beneficial.
Provide the ability to see the rendered mongo query that is going to be executed (e.g. a dry-run option or something available on the cursor. Sometimes our developers get tripped up trying to translate the golang in their head to the query - being able to see the query itself would make life easier.
Provide the ability to convert a standard mongo query to golang. This is probably some JS hosted on the mongo docs site that performs this rewrite in a similar fashion to something like JSON-to-Go: Convert JSON to Go instantly.

Isabella_Siu · November 16, 2020, 5:34pm

@TopherGopher Thank you for organizing all this feedback! As we plan out these improvements, we looking into the points that you’ve brought up.

–Isabella

Isabella_Siu · November 18, 2020, 9:12pm

Hi! I’ve made a separate post for the improved error API that we’re considering, where we would like some user feedback: Seeking Developer Feedback: Go Driver Error Improvements