Very casual whiteboard type chat sessions on a regular basis in SF, NYC, Redwood Shores, Mountain View, and Atlanta. Stop by!
Also check out the meetup users groups in lots of other cities.
How Journaling and Replication Interact
Version 1.8 of MongoDB supports journaling in the storage engine for crash safety and fast recovery. An interesting question arises then regarding how journaling interacts with replication. A traditional approach might be to wait for the commit (i.e., journal physical write confirmed) before replicating any data. MongoDB does not do this. Instead, it allows data to replicate even if the journal write has yet to occur or be confirmed. We then must ask “but what happens if we crash before journaling but the data replicated out?” With replica sets , it turns out this is ok. In a replica set the rule is that the freshest node will be elected primary. Thus if the crashed node comes back up, but the node which received the unjournaled data is ahead, it will be primary. We might then ask about a cascade of failures. This is ok too as replica sets have a notion of rolling back to a consistent point of view. How do we know our data won’t be rolled back? The answer is that a write is truly committed in a replica set when it has been written at a majority of set members. We can confirm this with the getLastError command. For example, if our write has made it to the journal on two out of three total set members, we know the data is committed even if nodes fail in a cascading sequence, and even if a minority of nodes are permanently lost. Why bother replicating so quickly? It lets us minimize latency between secondaries and primaries. In addition more writes will be successful than traditionally when a crash occurs. Yet the latency reduction is the biggest advantage: fsyncing to disks can be slow – replication lag (if on a LAN) can be less than the time to fsync to disk. Plus both can then be underway concurrently.
Use x509 certificate-based authentication with MongoDB and Apache Kafka
Kafka has emerged as a popular event streaming platform. The inherent "pub/sub" model can be viewed as a method for moving data between systems. As such, MongoDB offers a Kafka connector , enabling Kafka topics to be copied into a MongoDB cluster (the sink). Similarly, the connector enables data movement from a MongoDB cluster (the source) into Kafka topics. To access data securely, certificate-based X.509 authentication is a natural choice for server-to-server authentication scenarios with Kafka and MongoDB. Certificates avoid having to store or manage usernames and passwords when used with database connection strings. For example, such user credentials could be inadvertently exposed if "hard-coded" in configuration files or other uses. An X.509 certificate is a structured, binary record. This record consists of several key and value pairs. X.509 certificates use the widely accepted international X.509 public key infrastructure (PKI) standard. The use of certificates prevents user credential exposure. Authentication requests with certificates verifies that any public key presented by a client or another member of the cluster belongs to that client or member. The X.509 certificate method for authentication is more secure than conventional password-based certification because each server machine needs their own dedicated key to participate in the cluster. For use with secure TLS/SSL connections, MongoDB supports X.509 certificate authentication allowing clients to use public key infrastructure in lieu of SCRAM (username and password). The certificate encodes two very important pieces of information: the server's public key and a digital signature that can be used to confirm the certificate's authenticity. Additionally, the certificate will include metadata used by the Certificate Authority to track the certificate and provide guidelines on how the public key can be used. Using the server's public key, the client and server are able to negotiate a shared symmetric key securely, which can be used to secure communications. Users can either generate their own certificates and keys (self-managed) or use the Atlas PKI. In either case, first a project-specific CA private and public key is generated, and then a per-user private key and signed X.509 identity certificate is created. If using self-managed X.509 infrastructure , you'll need to upload your CA public key certificate into your Atlas project. If using Atlas-managed X.509 infrastructure, you'll need to download the project private key and provide that to your Kafka Connect service. This signed certificate is then pushed to each server member in your Atlas cluster. The below diagram shows the deployment of a standard 3 node replica set and client using x.509 authentication: In non-production environments, the basic SCRAM authentication method may be most suitable. However, for production environments or server-server scenarios such as a Kafka-MongoDB integration, X.509 authentication is the recommended mechanism. To use X.509 certification for server-server authentication, first confirm that you are able to authenticate to an Atlas cluster using X.509 certificates. Then follow the steps below. Prerequisites: Openssl must be installed Project-level CA & user certificates created in PEM format If using Atlas-managed certificates, user-specific client certificate (see X.509 tab: https://docs.atlas.mongodb.com/security-add-mongodb-users/#database-user-authentication ) If using self-managed X.509 auth, you will need to create & upload your CA public key to Atlas (see https://docs.atlas.mongodb.com/security-self-managed-x509/ ), and have a user-specific client certificate ready Ensure that you have installed the MongoDB Kafka Connector and understand how to use it with Kafka Connect. Then follow these steps: Obtain the client user certificate from your system administrator (or from Atlas). In this example, the user certificate is stored in PEM file kafkaclient-X509-cert.pem and will be associated with the Atlas database user kafka-svc . Convert the PEM file to a password-protected PKCS12 formatted certificate by running this command: openssl pkcs12 -export -in kafkaclient-X509-cert.pem -out kafkaclient-X509-cert.p12 -password pass:mypassword Copy PKCS12 certificate ( kafkaclient-x509-cert.p12 ) to the server where Kafka Connect is running. Note the full path of the PKCS12 certificate location. Update the Kafka Connect configuration in the KAFKA_OPTS environment variable: export KAFKA_OPTS="-Djavax.net.ssl.keyStore=<path to kafkaclient-x509-cert.p12> -Djavax.net.ssl.keyStorePassword=mypassword -Djavax.net.ssl.keyStoreType=PKCS12" Restart Kafka Connect Update the MongoDB Connector configuration to use a connection URI with the following parameter options: Connection.uri: "mongodb+srv://<mongodb-host>/test?authSource=%24external&authMechanism=MONGODB-X509&subjectName=kafka-svc" Re-deploy the MongoDB connector using the Kafka Connect REST API, with the above configuration for the connection URI. Download the latest MongoDB Connector for Apache Kafka 1.5 from the Confluent Hub ! Read the MongoDB Connector for Apache Kafka documentation . Questions/Need help with the connector? Ask the Community .