
Looking forward to Mongo 2.6: A deep dive into the new write commands

Written by Foursquare on Apr 08, 2014 - Read time: 4 min

We've been longtime Mongo users at Foursquare, and each new version has brought enhancements in performance, reliability, and features. Mongo 2.6 has a bunch of new features that we're excited about, but I'm going to focus on just one that might be easy to gloss over if you're looking at the release notes.

Write operations will be sent synchronously over the wire, allowing for some simplifications to Mongo internals that should yield performance benefits.

Background

Up until 2.6, all write operations were sent asynchronously over the wire. They were completely fire-and-forget. When a write operation packet was sent to Mongo over a TCP connection, no bytes were returned by the server. In order for a client to get feedback on success, a “getLastError” command had to be sent separately on the same TCP connection with the developer's durability preferences. The durability preference is referred to as the WriteConcern and might be something like “replicate to two servers” or “fsync to one server”. All the major Mongo client drivers abstracted that async behavior so that it looked synchronous, but there were some real negative consequences.

Because the “getLastError” command could be sent at some future point after a write operation, clients had to have their TCP connections pinned for the duration of their sessions. That meant that there were constraints on connection pooling within the drivers and within the mongoS sharded router. Additionally, both the mongoS router and the mongoD servers had to keep track of the accounting of failed writes in order to match up future requests for feedback. Everything could be much simpler if write operations just blocked the client and returned their results.

With Mongo 2.2, we hit some problems related to connection growth on our primary servers, and MongoDB engineers created a patch to allow for better connection reuse on the mongoS [https://jira.mongodb.org/browse/SERVER-9022]. That was a bit of a hack, but it has worked very well for our use case.

New write commands

In 2.6, write operations are now sent using the existing command infrastructure that query, count, getLastError, and all the operational commands utilize. Unlike the old fire-and-forget write operations, the new command-based write operations send a reply for every request. Each type of write command also supports batching for more efficient network utilization and reliability. In the case of any failures within a batch, the reply will contain error messages for each of the operations that failed.

With 2.6, we hope to see further performance and scalability enhancements in our sharded clusters. The mongoS will do a much better job of connection pooling because its connections to the mongoD servers will no longer need to be pinned to individual client connections and can be reused. That should cause the connection counts to the mongoD servers to drop and allow us to add even more clients in the future without worrying about running into limits.

How it works

The drivers will continue to abstract things in a similar way, but for those interested in how the new write operation commands actually work, read further:

There are just a few different types of packets that can be sent over the wire. All queries and commands are sent using an OP_QUERY packet. OP_INSERT, OP_UPDATE, and OP_DELETE were the way to do write operations prior to 2.6, and the big change is that writes are now sent as commands using the OP_QUERY packet.

Commands and queries are both sent using the same packet over the wire, and a simple convention is used to tell the server that the packet is a command: the “fullCollectionName” will be of the form “dbname.$cmd” and the “query” will be a BSON document with the command name and parameters.
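
To make the convention concrete, here is a rough Python sketch of how a command might be framed as an OP_QUERY packet. The function name frame_command and the EMPTY_BSON placeholder are made up for illustration, the BSON encoding of the command document is assumed to be done elsewhere, and the numberToReturn value of -1 is an assumption about what drivers typically send for commands, so treat this as a sketch rather than a reference implementation.

```python
import struct

OP_QUERY = 2004  # opcode of OP_QUERY in the legacy wire protocol


def frame_command(db_name, bson_command, request_id=1):
    """Wrap an already-encoded BSON command document in an OP_QUERY message.

    `bson_command` must be BSON bytes; encoding is out of scope here.
    """
    # The "$cmd" pseudo-collection is what marks the packet as a command.
    full_collection_name = (db_name + ".$cmd").encode("utf-8") + b"\x00"
    body = (
        struct.pack("<i", 0)             # flags
        + full_collection_name           # fullCollectionName (cstring)
        + struct.pack("<ii", 0, -1)      # numberToSkip, numberToReturn (assumed -1)
        + bson_command                   # the "query" document
    )
    # Standard 16-byte header: messageLength, requestID, responseTo, opCode.
    header = struct.pack("<iiii", 16 + len(body), request_id, 0, OP_QUERY)
    return header + body


# Placeholder payload: the five bytes of an empty BSON document.
EMPTY_BSON = b"\x05\x00\x00\x00\x00"
packet = frame_command("test", EMPTY_BSON)
```

A real driver would pass the BSON encoding of the command document instead of the empty placeholder and would read the server's reply from the same socket.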

For a new 2.6 update command, the document will look something like this:

{
 "update": "collectionName",
 "updates": [
   {
     "q": {...}, // the query
     "u": {...}, // the update
     "multi": false,
     "upsert": false
   }
 ],
 "ordered": true, // should errors stop the batch, true by default
 "writeConcern": {"w": 1}
}

Notice that the “updates” key is actually an array. That's used to support batching of writes. A single update will simply be a batch of one. The “ordered” flag determines whether an error in one item in the batch will terminate further processing or whether processing will continue despite the errors. If the flag is set to true, each operation will be executed sequentially; otherwise, Mongo can split the batch up and execute things out of order in the case where the operation affects multiple shards of a cluster.
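
As a small illustration of the shape described above, the following Python sketch assembles an update command as a plain dict. The helper names update_statement and build_update_command are hypothetical and not part of any driver; a real driver builds and BSON-encodes this document for you.

```python
def update_statement(query, update, multi=False, upsert=False):
    """One entry of the "updates" array."""
    return {"q": query, "u": update, "multi": multi, "upsert": upsert}


def build_update_command(collection, statements, ordered=True, write_concern=None):
    """Assemble the command document; the command name must be the first key."""
    # Python dicts preserve insertion order (3.7+), so "update" stays first.
    return {
        "update": collection,
        "updates": list(statements),
        "ordered": ordered,
        "writeConcern": write_concern if write_concern is not None else {"w": 1},
    }


# A single update is simply a batch of one.
single = build_update_command(
    "collectionName",
    [update_statement({"_id": 1}, {"$set": {"name": "a"}})],
)

# Several updates travel in one command, here as an unordered batch.
batch = build_update_command(
    "collectionName",
    [
        update_statement({"_id": 1}, {"$set": {"name": "a"}}),
        update_statement({"_id": 2}, {"$set": {"name": "b"}}, upsert=True),
    ],
    ordered=False,
)
```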

As I mentioned above, the command is synchronous and the reply might look like this in the case of an upsert:

{
 "ok" : 1,
 "nModified" : 0,
 "n" : 1,
 "upserted" : [{
   "index" : 0, // matches the index of the update batch
   "_id" : ObjectId("532f9406fec9bb9b1bee9290")
 }]
}

If there was an error with one or more of the operations, the response might look like this in the case of an insert:

{
 "ok" : 1,
 "n" : 0,
 "writeErrors" : [{
   "index" : 0, // matches the index of the insert batch
   "code" : 11000,
   "errmsg" : "insertDocument :: caused by :: 11000 E11000 duplicate key error index: test.c.$_id_ dup key: { : ObjectId('532f941bfec9bb9b1bee9291') }"
 }]
}
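
Note that the reply above still carries "ok": 1 even though the insert failed, so a client has to look at “writeErrors” rather than at “ok” alone. A minimal Python sketch of matching errors back to the batch by index might look like this; failed_operations is a made-up helper name, and the reply is assumed to have already been decoded into a dict.

```python
def failed_operations(batch, reply):
    """Pair each writeErrors entry with the operation at the same batch index."""
    return [
        (batch[err["index"]], err["code"], err["errmsg"])
        for err in reply.get("writeErrors", [])
    ]


# The documents that were sent in the insert batch (illustrative values).
sent = [{"_id": "a"}]

# A decoded reply shaped like the example above, with a shortened message.
error_reply = {
    "ok": 1,
    "n": 0,
    "writeErrors": [{"index": 0, "code": 11000, "errmsg": "duplicate key error"}],
}

failures = failed_operations(sent, error_reply)
no_failures = failed_operations(sent, {"ok": 1, "n": 1})
```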

If I had batched multiple inserts and the first failed but I had “ordered” set to false, the rest would be attempted. Otherwise the command would stop processing.
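
The difference between the two modes can be mimicked with a toy in-memory simulation. This is only a sketch of the stop-or-continue behavior: run_batch and make_insert are hypothetical names, a plain dict stands in for a collection with a unique _id index, and the loop always runs in order, whereas the real server may reorder an unordered batch.

```python
def run_batch(operations, ordered=True):
    """Run each operation; return (n, write_errors) like a simplified reply."""
    n = 0
    write_errors = []
    for index, op in enumerate(operations):
        try:
            op()
            n += 1
        except ValueError as exc:
            write_errors.append({"index": index, "errmsg": str(exc)})
            if ordered:
                break  # ordered batches stop at the first error
    return n, write_errors


def make_insert(store, doc_id):
    """Build an insert operation that rejects duplicate ids."""
    def op():
        if doc_id in store:
            raise ValueError("duplicate key: %r" % (doc_id,))
        store[doc_id] = True
    return op


# Same batch in both modes: the second insert collides with the first.
ordered_store = {}
ordered_result = run_batch(
    [make_insert(ordered_store, i) for i in (1, 1, 2)], ordered=True
)

unordered_store = {}
unordered_result = run_batch(
    [make_insert(unordered_store, i) for i in (1, 1, 2)], ordered=False
)
```

In the ordered run the third insert is never attempted, while in the unordered run it succeeds and only the colliding insert is reported.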

I hope this has been interesting for you. Keep in mind that the performance benefits that we're excited about are completely theoretical until tested out in production. We aren't using 2.6 in production yet, but plan on rolling it out slowly in the coming months. If you're going to try 2.6 for yourself, you'll need to upgrade your client driver to take advantage of the new command infrastructure.

- Jon Hoffman
Foursquare Engineer
