MongoDB 1.8.0-rc0 is now available; you can download it from here and find the release notes here.
The main changes over the current 1.6 production release relate to:
durability, a much talked about topic (!), with support now for write-ahead journaling (docs)
sparse and covered indexes (on my radar to try out)
support for more Map/Reduce functionality (note there are some breaking changes in this area)
The full info on the changes is in the Release Notes document.
[Read More]
MongoUK 2011 - Upcoming MongoDB event
There’s a one-day conference on MongoDB coming up, being held at the Skills Matter eXchange in London on Monday 21st March 2011. If you’re interested in MongoDB, or even just curious as to what it’s all about, then it’s worth registering based on what’s on the agenda. It’s going to be tough to choose between the different sessions, which include ones on:
schema design
storage engine internals
indexing and query optimiser
scalability
map/reduce & geo indexing
It will also be a good chance to hear from speakers outside of 10gen such as Thoughtworks, Guardian and Boxed Ice.
[Read More]
SQLSoton UserGroup Room Change from March
Due to the success of the SQL Southampton User Group and based on the feedback/opinions of those who have attended, the room the group is held in is changing. It’s still in the same venue, Avenue St Andrews United Reformed Church, but instead of being in the Upper Room as per the last few meets, it will be in St Andrew’s Hall. This gives a bigger, more flexible room.
The now famous #SQLSoton signs will be in place to direct as per usual - still round via the back carpark but a different entrance now.
[Read More]
MongoDB replication - oplogSize
Background
An oplog is a write operation log that MongoDB uses to store data modifications that need to be replicated out to other nodes in the configured set. This oplog is a capped collection, which means it will never grow in size beyond a certain point - once it reaches its max size, old data drops off the end as new data is added, so it keeps cycling round. The size of the oplog basically determines how long a secondary node can be down for and still be able to catch up when it comes back online.
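As a rough illustration of the cycling behaviour described above, here is a toy Python model of a capped log. It is not how MongoDB implements the oplog: the CappedLog class, its byte budget, the sequence numbers and the estimated_window_hours helper are all made up for this sketch, and the window estimate assumes a steady write rate.

```python
from collections import deque


class CappedLog:
    """Toy stand-in for a capped collection (hypothetical, not MongoDB code).

    It has a fixed byte budget; once that is exceeded, the oldest
    entries are dropped to make room for new ones.
    """

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self.entries = deque()  # each item is (sequence_number, size_in_bytes)
        self.next_seq = 0

    def append(self, size_in_bytes):
        self.entries.append((self.next_seq, size_in_bytes))
        self.used_bytes += size_in_bytes
        self.next_seq += 1
        # Old data drops off the end as new data is added.
        while self.used_bytes > self.max_bytes:
            _, dropped_size = self.entries.popleft()
            self.used_bytes -= dropped_size

    def can_catch_up(self, last_seq_applied):
        """True if the next entry a secondary needs is still in the log."""
        oldest_seq = self.entries[0][0] if self.entries else self.next_seq
        return last_seq_applied + 1 >= oldest_seq


def estimated_window_hours(oplog_size_mb, write_rate_mb_per_hour):
    """Rough time a secondary could be offline, assuming a steady write rate."""
    return oplog_size_mb / write_rate_mb_per_hour


log = CappedLog(max_bytes=300)
for _ in range(5):
    log.append(100)  # five 100-byte writes into a 300-byte log

print(log.can_catch_up(1))  # secondary stopped after entry 1
print(log.can_catch_up(0))  # secondary stopped after entry 0
print(estimated_window_hours(1024, 64))
```

In this model a secondary that stopped after entry 1 can still resume, because entry 2 is the oldest one left, whereas one that stopped after entry 0 has missed data that has already cycled out and would need a full resync.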
[Read More]
OS CodePoint Data Geography Load Update
Following on from my previous post on loading the Ordnance Survey Code-Point data for GB post codes to SQL Server and converting to the GEOGRAPHY data type, I’ve made a few tweaks to the importer app that is up on GitHub:
The schema of the SQL Server table generated has changed - postcodes are now split into 2 distinct columns: OutwardCode and InwardCode
The importer now calculates a basic average for each postcode district (e.
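To show the OutwardCode / InwardCode split mentioned above, here is a minimal Python sketch. It is not the importer's own code (that is a .NET app); the split_postcode function is hypothetical and simply relies on the inward code of a GB postcode always being its last three characters.

```python
def split_postcode(postcode):
    """Split a GB postcode into (outward_code, inward_code).

    Hypothetical helper: strips whitespace, upper-cases the value and
    treats the final three characters as the inward code.
    """
    compact = "".join(postcode.split()).upper()
    # Without the space, GB postcodes are 5 to 7 characters long.
    if not 5 <= len(compact) <= 7:
        raise ValueError("unexpected postcode length: %r" % postcode)
    return compact[:-3], compact[-3:]


print(split_postcode("SO17 1BJ"))
print(split_postcode("m11aa"))
```

Keeping the two parts in separate columns makes it straightforward to group or filter by the outward code alone.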
[Read More]
MongoDB - Does My Data Look Big In This?
You have an existing relational database containing x amount of data and you decide to migrate that data, for whatever reason, into MongoDB. You may have a preconceived belief that, as your relational database is x GB in size, after loading that data into MongoDB your NoSQL database size will be around about x GB too - after all, you’re loading exactly the same data from one to the other, right?
[Read More]
GB Post Code Geographic Data Load to SQL Server using .NET
Ordnance Survey now make a number of mapping/geographic datasets available to download for free, with unrestricted commercial and non-commercial use. One of these is the Code-Point Open dataset, which gives a precise geographic location for each post code in Great Britain as a CSV file.
This file contains the Easting and Northing coordinates for each postcode. You may want to convert these to Latitude/Longitude coordinates to then load into a GEOGRAPHY column in SQL Server (available as of SQL Server 2008) and do all kinds of spatial wizardry in the database.
[Read More]
Running MongoDB as a Windows Service
Following on from my opening “Getting started with MongoDB and .NET” post, one of the next logical steps was to get mongodb running as a Windows service.
From a command prompt, in the mongodb bin directory run:
mongod --install --serviceName "MongoDB" --dbpath C:\mongodb\data\db --logpath C:\mongodb\logs\mongolog.txt --logappend
Obviously you can tweak the arguments for your environment - I opted to use a different dbpath to the default \data\db
Starting the service is then just a case of running:
[Read More]
SqlBulkCopy to SQL Server in Parallel
In an earlier post last year, I blogged about high performance bulk loading to SQL Server from .NET using SqlBulkCopy. That post highlighted the performance gain that SqlBulkCopy gives over another batched insert approach using an SqlDataAdapter. But is it possible to squeeze more performance out? Oh yes.
First, a quick recap. For optimal performance:
load into a heap table (with no indexes - add any indexes you need AFTER you’ve loaded the data)
[Read More]
GROUPING SETS in SQL Server
Something I personally haven’t seen much of in the SQL Server world is the use of GROUPING SETS - an operator that can be applied in a GROUP BY clause. So what does it do? How would you use it?
Take the AdventureWorks sample database as an example playground. Suppose you want to query the sales data to find the following:
total sales for each product
total sales for each product category
total sales
There are a number of ways you could do this.
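To make the target result concrete, here is a small Python sketch that emulates the shape of output you would expect from grouping by product, by category and by nothing at all in one pass. It is not T-SQL and not the AdventureWorks schema: the grouping_sets_totals function and the sample rows are invented for illustration, with None standing in for the NULL that appears in a column that is not part of a given grouping set.

```python
from collections import defaultdict


def grouping_sets_totals(rows):
    """Emulate three grouping sets over (category, product, amount) rows.

    Returns a dict keyed by (category, product), where None marks a
    column that is not part of that grouping set.
    """
    totals = defaultdict(int)
    for category, product, amount in rows:
        totals[(None, product)] += amount   # total sales for each product
        totals[(category, None)] += amount  # total sales for each category
        totals[(None, None)] += amount      # total sales overall
    return dict(totals)


# Hypothetical sample data, not taken from AdventureWorks.
sales = [
    ("Bikes", "Road-150", 100),
    ("Bikes", "Mountain-200", 50),
    ("Accessories", "Helmet", 20),
    ("Bikes", "Road-150", 30),
]

result = grouping_sets_totals(sales)
for key, total in result.items():
    print(key, total)
```

All three levels of aggregation come back in a single result set, which is the effect the list above is asking for.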
[Read More]