Hello : )
I think you want count distinct in each field,in the first answer,it was about count only.
Because in that sql query you didnt add the distinct.
I used “$group”: {"_id": null …} because you already kept only 1 date on match stage,
so here all collection will be 1 group.
Maybe this is what you want.
Example data.
Data in
[
{
"program_name": "E",
"user_name": "B",
"zipfile_dt": "2019-01-01"
},
{
"program_name": null,
"user_name": "B",
"zipfile_dt": "2020-01-01"
},
{
"program_name": "E",
"user_name": "B",
"zipfile_dt": "2020-01-01"
},
{
"program_name": "C",
"user_name": "B",
"zipfile_dt": "2020-01-01"
},
{
"program_name": "D",
"user_name": "A",
"zipfile_dt": "2020-01-01"
},
{
"program_name": "D",
"user_name": "A",
"zipfile_dt": "2020-01-01"
},
{
"program_name": "D",
"user_name": "A",
"zipfile_dt": "2020-01-01"
}
]
Query
{
"aggregate": "testcoll",
"pipeline": [
{
"$match": {
"$expr": {
"$eq": [
"$zipfile_dt",
"2020-01-01"
]
}
}
},
{
"$group": {
"_id": null,
"program_name_array": {
"$addToSet": "$program_name"
},
"user_name_array": {
"$addToSet": "$user_name"
}
}
},
{
"$project": {
"_id": 0
}
},
{
"$project": {
"users_count": {
"$size": "$user_name_array"
},
"program_count": {
"$size": "$program_name_array"
}
}
}
],
"maxTimeMS": 0,
"cursor": {}
}
Result
{
"users_count": 2,
"program_count": 4
}
If you want to not count the null values
In the above query replace the last project with that,it filters the array and keeps
only the not nulls,you can make a function and generate the ,here is the same code 2x
{
"$project": {
"users_count": {
"$size": {
"$filter": {
"input": "$user_name_array",
"as": "m",
"cond": {
"$not": [
{
"$eq": [
"$$m",
null
]
}
]
}
}
}
},
"program_count": {
"$size": {
"$filter": {
"input": "$program_name_array",
"as": "m",
"cond": {
"$not": [
{
"$eq": [
"$$m",
null
]
}
]
}
}
}
}
}
}
Result (program_name null value wasn’t counted)
{"users_count":2,"program_count":3}
Hope this time helps