Mongocxx low throughput

Hi, I am new to MongoDB. Sorry in advance if my description is unclear.
I have been trying to benchmark query throughput (looking up objects by _id) using the C++ driver, but I am getting very low numbers: around 7k ops/sec. Neither CPU nor network is saturated. Could someone help me figure out what went wrong?
Thanks a lot!

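#include <cstddef>
#include <cstdint>
#include <string>
#include <thread>
#include <vector>

#include <bsoncxx/builder/stream/document.hpp>
#include <bsoncxx/builder/stream/helpers.hpp>
#include <bsoncxx/oid.hpp>
#include <mongocxx/client.hpp>
#include <mongocxx/instance.hpp>
#include <mongocxx/pool.hpp>
#include <mongocxx/uri.hpp>

using bsoncxx::builder::stream::document;
using bsoncxx::builder::stream::finalize;

int main() {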
    mongocxx::instance instance{}; 
    mongocxx::uri uri("mongodb://111.11.111.11:27017");
    mongocxx::pool pool{uri};

    auto client = pool.acquire();
    mongocxx::database db = (*client)["test"];
    mongocxx::collection coll = db["test"];
    mongocxx::cursor cursor_throughput = coll.find({});

    std::vector<std::string> id_vector;
    for (auto doc : cursor_throughput) {
        id_vector.push_back(doc["_id"].get_oid().value.to_string());
    }
    int partition_num = 24;

    std::vector<std::thread> threads{};
    for (int i = 0; i < partition_num; i++) {
        auto run_find = [&](std::int64_t j) {
            auto client = pool.acquire();
            auto coll = (*client)["test"]["test"];

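            // Half-open range [start, end) of ids handled by worker j.
            const std::size_t chunk = id_vector.size() / partition_num;
            const std::size_t start = chunk * j;
            // The last worker also takes the remainder so that no id is skipped.
            const std::size_t end =
                (j == partition_num - 1) ? id_vector.size() : chunk * (j + 1);

            for (std::size_t k = start; k < end; k++) {
                const auto& str = id_vector[k];
                // find_one returns an optional document, not a cursor.
                auto result = coll.find_one(
                    document{} << "_id" << bsoncxx::oid(str) << finalize);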
            }
        };
        std::thread runner{run_find, i};
        threads.push_back(std::move(runner));
    }    
    for (auto &th : threads) {
        th.join();
    }
    ...
}
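
The timing part is elided above. For reference, here is a minimal sketch of how the ops/sec figure could be measured independently of the driver. The names partition_range, BenchResult and run_partitioned are hypothetical helpers, not mongocxx APIs, and the worker callable stands in for the per-thread pool.acquire() plus find_one loop.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: half-open index range [begin, end) for worker j of `parts`.
// The first (total % parts) workers get one extra item, so no index is skipped.
inline std::pair<std::size_t, std::size_t> partition_range(
    std::size_t total, std::size_t parts, std::size_t j) {
    assert(parts > 0 && j < parts);
    const std::size_t base = total / parts;
    const std::size_t extra = total % parts;
    const std::size_t begin = j * base + (j < extra ? j : extra);
    const std::size_t end = begin + base + (j < extra ? 1 : 0);
    return {begin, end};
}

struct BenchResult {
    std::size_t ops;      // number of items handed to the workers
    double seconds;       // wall-clock time from first thread start to last join
    double ops_per_sec;   // ops / seconds (0 if the elapsed time is zero)
};

// Hypothetical helper: runs worker(begin, end) on its own thread for each of the
// `parts` ranges and times the whole run. Assumes every worker processes its
// full range, and that calling `worker` concurrently is safe.
template <typename Worker>
BenchResult run_partitioned(std::size_t total, std::size_t parts, Worker worker) {
    std::vector<std::thread> threads;
    threads.reserve(parts);
    const auto t0 = std::chrono::steady_clock::now();
    for (std::size_t j = 0; j < parts; ++j) {
        const auto range = partition_range(total, parts, j);
        threads.emplace_back(
            [&worker, range] { worker(range.first, range.second); });
    }
    for (auto& th : threads) {
        th.join();
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - t0;
    const double secs = elapsed.count();
    return {total, secs, secs > 0.0 ? static_cast<double>(total) / secs : 0.0};
}
```

In the benchmark, the worker would acquire its own client from the pool and call find_one for every id in id_vector[begin] up to (but not including) id_vector[end], e.g. run_partitioned(id_vector.size(), 24, worker).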

I am using 5 data shards. The config server and the query router (mongos) are on the same machine.
The dataset is around 14 million points, about 1.7 GB in total.
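
A back-of-the-envelope check on these numbers, as a rough sketch only: assuming each of the 24 threads keeps exactly one blocking find_one in flight, and that the 7k ops/sec figure was measured with all 24 threads running, the average time per request is the thread count divided by the throughput. The name per_request_latency_ms is a hypothetical helper for illustration.

```cpp
// Hypothetical helper: average latency per request, in milliseconds, when
// `threads` workers each keep exactly one blocking request in flight.
// By Little's law: latency = concurrency / throughput.
constexpr double per_request_latency_ms(int threads, double ops_per_sec) {
    return 1000.0 * threads / ops_per_sec;
}
```

With 24 threads and 7000 ops/sec this comes to roughly 3.4 ms per lookup.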