I’m trying to understand the behavior of reads in a MongoDB replica set. In particular, I have an environment with a high rate of reads, a low rate of writes, and a relatively small data set.
I read this document:
http://docs.mongodb.org/manual/core/read-preference/
In particular:
primary: Default mode. All operations read from the current replica set primary.
primaryPreferred: In most situations, operations read from the primary but if it is unavailable, operations read from secondary members.
secondary: All operations read from the secondary members of the replica set.
secondaryPreferred: In most situations, operations read from secondary members but if no secondary members are available, operations read from the primary.
nearest: Operations read from the nearest member of the replica set, irrespective of the member’s type.
So my understanding is that reads by default go to the primary. There are read preferences that allow reading from secondaries (secondary and secondaryPreferred). In these cases stale data may be served.
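To make the modes above concrete, here is a minimal sketch of how I understand a read preference would be chosen on the client side, through the readPreference option of the connection string. The host names, the replica set name rs0, and the build_uri helper are my own placeholders for illustration and not something taken from the documentation.

```python
from urllib.parse import urlencode

# Hypothetical host names for a three-member replica set.
HOSTS = [
    "db1.example.net:27017",
    "db2.example.net:27017",
    "db3.example.net:27017",
]

# The five read preference modes listed in the documentation quoted above.
VALID_MODES = {
    "primary",
    "primaryPreferred",
    "secondary",
    "secondaryPreferred",
    "nearest",
}


def build_uri(hosts, replica_set, read_preference="primary"):
    """Build a MongoDB connection string that requests a read preference.

    build_uri is an illustrative helper, not part of any driver.
    """
    if read_preference not in VALID_MODES:
        raise ValueError("unknown read preference: " + read_preference)
    options = urlencode(
        {"replicaSet": replica_set, "readPreference": read_preference}
    )
    return "mongodb://" + ",".join(hosts) + "/?" + options


# "rs0" is an assumed replica set name.
uri = build_uri(HOSTS, "rs0", "secondaryPreferred")
print(uri)
```

As far as I can tell, a string like this would then be handed to the driver (for example pymongo.MongoClient(uri)), and switching the mode is only a matter of changing that one option.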
It seems to me that it would be preferable to distribute the reads across both the primary and the secondary machines, so that I can make the best use of all 3 machines. But I don’t really see this as an option. The following statement in particular perplexes me:
If read operations account for a large percentage of your application’s traffic, distributing reads to secondary members can improve read throughput. However, in most cases sharding provides better support for larger scale operations, as clusters can distribute read and write operations across a group of machines.
However, in the case of a relatively small data set, sharding simply doesn’t make sense. Can someone shed some light on the right configuration?