Note: Relates to v1.8.0 release.
v1.8 of MongoDB (production release hot off the press, available from here) brings single server durability to MongoDB via its support for write-ahead journaling. The following summarises the key points from the journaling documentation, administration guide and from my initial playing around.
- you have to specify the --journal switch for mongod in order to enable journaling (this was --dur before the production release)
- journal files are created within a journal subfolder of your dbpath, up to 1GB each in size
- in the event of a crash, when mongod restarts it will replay the journal files to recover automatically before the server goes online
- if mongod shuts down cleanly, the journal files are cleared down. Also note that you shouldn’t end up with a lot of journal files, because they are rotated out once they are no longer needed
- batch commits are performed approximately every 100ms (note the mention in the docs that this will be more frequent in future)
- preallocation of journal files may be done when mongod starts up. If it decides this is worth doing, it will fully provision 3 x 1GB files BEFORE accepting any connections, so there could be a delay of a couple of minutes after starting up before you can connect. These preallocated files are not cleared down when mongod is shut down, so it won’t have to create them each time.
- when running with journal files on a SATA drive, I found that it chose not to preallocate them. When I set up a junction to map the journal folder onto a separate 15K SAS drive, it did then choose to preallocate the files.
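To see what mongod has actually put in the journal subfolder (for example, whether it chose to preallocate), a small script can list the files and their sizes. This is only a sketch: the `journal_files` helper and the `/data/db` path are my own illustrative choices, and it makes no assumption about how mongod names the files.

```python
import os


def journal_files(dbpath):
    """Return (name, size_in_bytes) for each file in <dbpath>/journal, sorted by name."""
    journal_dir = os.path.join(dbpath, "journal")
    if not os.path.isdir(journal_dir):
        # No journal subfolder: journaling not enabled, or nothing written yet.
        return []
    entries = []
    for name in sorted(os.listdir(journal_dir)):
        path = os.path.join(journal_dir, name)
        if os.path.isfile(path):
            entries.append((name, os.path.getsize(path)))
    return entries


if __name__ == "__main__":
    # "/data/db" is a placeholder; substitute your own dbpath.
    for name, size in journal_files("/data/db"):
        print(f"{name}\t{size / 2**20:.1f} MB")
```

Running it before and after a clean shutdown should show the regular journal files disappearing while any preallocated ones remain.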
Durability vs. performance - what’s the cost?
I wanted to get an idea of how much this journaling costs in terms of performance. Obviously there’s going to be a hit; I’d just like to get a rough feel for how big it is. So I took a 295MB test CSV file containing 20 million rows of data with 2 columns, _id (integer) and val (random 5 character string), and loaded it into a fresh database, with and without journaling enabled.
Tests were run on a single machine, Intel Xeon W3520 Quad Core, 10GB RAM, Disk 1=7200RPM SATA, Disk 2=15K SAS, Win 7 64Bit. MongoDB data is on Disk 1 (slower but larger disk).
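If you want to try something similar, a test file of the same shape can be generated with a few lines of Python. This is a sketch rather than the script I used: the header row, the lowercase-only alphabet, ids starting at 0 and the `write_test_csv` name are all assumptions for illustration.

```python
import csv
import random
import string


def write_test_csv(path, rows, seed=None):
    """Write `rows` lines of (_id, val): a sequential integer and a random 5-letter string."""
    rng = random.Random(seed)  # pass a seed to get a repeatable file
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["_id", "val"])  # header row (assumed)
        for i in range(rows):
            val = "".join(rng.choices(string.ascii_lowercase, k=5))
            writer.writerow([i, val])


if __name__ == "__main__":
    # 20 million rows, matching the size of the test described above.
    write_test_csv("testdata.csv", 20_000_000)
```

A file like this could then be loaded with mongoimport using its --type csv and --headerline options; the exact command line for your setup may differ.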
Journaling? | Journal location | Files preallocated? | Import time (s) | Avg. import rate (docs/s) | Change vs. no journaling |
---|---|---|---|---|---|
No | n/a | n/a | 291 | 68729 | baseline |
Yes | Disk 1 (same as data) | No | 442 | 45249 | -34% |
Yes | Disk 1 (same as data) | Yes (manually*) | 422 | 47393 | -31% |
Yes | Disk 2 (separate from data) | Yes | 448 | 44643 | -35% |
(*) mongod always chose not to preallocate when the journal directory was on the slower disk (same as the data), so I had to preallocate the files manually by copying the ones that were created when the faster disk was used.
I couldn’t test the journal on disk 2 without preallocated files, because mongod always preallocated them there and you can’t delete them all while mongod is running.
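For reference, the rate and percentage columns follow directly from the import times and the 20 million row count. The sketch below reproduces that arithmetic; the `summarize` helper and the run labels are just illustrative names.

```python
ROWS = 20_000_000

# (label, import time in seconds), taken from the table above; first entry is the baseline.
RUNS = [
    ("no journaling", 291),
    ("journal on disk 1, not preallocated", 442),
    ("journal on disk 1, preallocated manually", 422),
    ("journal on disk 2, preallocated", 448),
]


def summarize(runs, rows=ROWS):
    """Return (label, docs_per_second, percent_change_vs_first_run) for each run."""
    baseline = rows / runs[0][1]
    results = []
    for label, seconds in runs:
        rate = rows / seconds
        change = (rate - baseline) / baseline * 100
        results.append((label, round(rate), round(change)))
    return results


if __name__ == "__main__":
    for label, rate, change in summarize(RUNS):
        print(f"{label}: {rate} docs/s ({change:+d}%)")
```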
Summing up
In my tests, I found:
- using --journal resulted in a drop in throughput of roughly 30-35% for my mongoimport job (from just under 70K docs/s to less than 50K docs/s)
- preallocating journal files helps (as you’d expect), as mongod doesn’t have to create the files as it goes along
- setting up a junction on Windows to map the journal directory onto a separate (and faster) disk from the data resulted in slower performance, presumably due to the overhead of the junction redirects. If there’s another/better way of doing this I’d be interested to know. I also haven’t run this on Linux, so maybe there would be less of a hit there. Personally, I’d like the journal folder to be explicitly configurable so you can point it at a separate disk.
Based on these results, for me, the decision on whether to use journaling or not would come down to how much I actually need single server durability. Is it critical for my specific use? Could I live without it and just use replica sets? What is the value / importance of it for my data vs. raw performance? Let’s not forget this is a new feature in MongoDB. As stated in the docs, there are a number of cases scheduled for 1.8.1 that target performance improvements to journaling. Definitely something to keep an eye on.