How to query to get Distinct data from embedded documents

I have a test database where a 5:100 cardinality exists between the Parent and Child entities.
I have embedded the Parent document in the Child document. A child document may be associated with more than one parent document.

To search for a non-duplicated set of child documents by the parent's properties, I have used $group in my query.

The database size is approximately 1.2 GB, and the WiredTiger cache is configured at 11 GB. Even so, I am getting an error that memory is exceeded for the $group operation.

Can anyone help me resolve this error?

Is this correct? I don’t have a good test-case in front of me but I’m curious and the link doesn’t really seem to describe what you posted… Are you saying that DISTINCT would only select one row with “USA” if I had two columns with countries listed (say one for a Supplier and one for a Destination)?

What are the specifications of the machine running this? RAM, CPU, disk, dedicated to mongod or shared?

Note that $group blocks until all incoming documents are processed. It is possible that you have too many unique pairs of c_id and c_field.
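Since $group has to hold its whole result set before it emits anything, aggregation stages have their own memory limit, separate from the WiredTiger cache size. A separate option that may be worth trying is the allowDiskUse flag of aggregate, which lets a blocking stage spill to temporary files. The sketch below is only an illustration: the collection name "parents", the leading $unwind on "$cs", and grouping on the (c_id, c_field) pair are assumptions, since the original query is not shown.

```javascript
// Sketch only. Assumptions (not from the original post): the collection is
// called "parents", "cs" is an array of embedded children, and the original
// query grouped on the (c_id, c_field) pair.
const pairPipeline = [
  { $unwind: "$cs" },
  { $group: { _id: { c_id: "$cs.c_id", c_field: "$cs.c_field" } } },
];

// allowDiskUse is a documented option of aggregate(); it allows blocking
// stages such as $group to write temporary files instead of failing.
const aggregateOptions = { allowDiskUse: true };

// In mongosh this would be run as:
//   db.parents.aggregate(pairPipeline, aggregateOptions)
```

Spilling to disk trades speed for the ability to finish, so reducing the size of the grouped documents is still worthwhile.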

One idea is to reduce the total size used by the documents of the $group result set.

One way to do it could be like:

{ "$group" : {
    "_id" : "$cs.c_id" ,
    "cfield" : { "$addToSet" : "$cs.c_field" }
} }

You will of course need to do another $unwind to get one top-level document per unique pair, like you have right now.
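Put together, the idea could look roughly like the pipeline below. This is a sketch, not a tested query: the leading $unwind on "$cs" and the collection name "parents" are assumptions about your schema.

```javascript
// Sketch only. Assumptions (not from the original post): the collection is
// called "parents" and "cs" is an array of embedded child documents.
const dedupePipeline = [
  // Assumed earlier stage: one document per embedded child.
  { $unwind: "$cs" },
  // One document per c_id, collecting its distinct c_field values in a set.
  { $group: { _id: "$cs.c_id", cfield: { $addToSet: "$cs.c_field" } } },
  // Flatten the set again: one top-level document per (c_id, c_field) pair.
  { $unwind: "$cfield" },
];

// In mongosh this would be run as:
//   db.parents.aggregate(dedupePipeline)
```

Each output document then has the shape { _id: <c_id>, cfield: <c_field> }.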

Thank you, steevej, for your prompt response.
c_id is the id of the child document, and cfield is a property of the child document.
So, for every c_id, there is only one cfield associated.

Therefore, I have also tried:

{ "$group" : {
    "_id" : "$cs.c_id" ,
    "cfield" : { "$first" : "$cs.c_field" }
} }

Yes, sir. It is analogous to the supply relationship between suppliers and products, where the same supplier can supply multiple products and the same product can be supplied by multiple suppliers.
The query would be something like: find the products supplied by suppliers having a rating between 2 and 5.
So, the query would be:
collection.aggregate( [
{ "$match": { "rating": { "$gte": 2, "$lte": 5 } } },
{ "$group": { "_id": "$products.sku" } }
] )

I am already using $first to reduce size. For duplicated records, $first is sufficient.

{ "$group" : {
    "_id" : "$cs.c_id" ,
    "cfield" : { "$first" : "$cs.c_field" }
} }