First Steps at MongoDB

First good surprise: MongoDB is surprisingly easy to install. Second good suprise: MongoDB is surprisingly easy to use.

Time to create a new database: 0 sec 🙂

I decided to give it a go, with a fisheries management database. Fisheries management systems store data about fisheries, typically organized around fishing trips and giving information about their catch. These systems are normally a “nightmare” because:

– they assemble information that comes from different sources (e.g. paper sheets, computer applications, data files, etc).

– the information collected changes over time (fields are constantly removed, added, amended).

– in some cases, information may be missing or unusable.

All these factors, deal to a frequent “refactoring” of the database schema. Everyone that already refactored a database, knows that this is a big headache, and it is hardly possible without taking some time, and invariably loosing some data.

And this is where “schema-free” databases such as MongoDB become atractive.

In my example, I want to store paper sheets that refer to the definition of sampled fishing trips. For each fishing trip, I want to collect some information such as date/time, sampler, sampled catch, etc. However, in my example I have a set of sheets coming from different sources and filled with different degrees of accuracy. That is no problem for MongoDB!

First I start the MongoDB shell by typing “Mongo”.

Then, I immediately start inserting my sheets. Each sheet is an object; let’s call them by letters: “a”, “b”, “c” and “d”.

a = { ls : "aberdeen", sampler:"spock", no_prof_fish: 3, no_part_time_fish: 1, start_dt:"05/11/2012", start_time:"15:00", end_dt:"05/11/2012", end_time:"19:40", no_operations: 6, catch_weight: 23, catch_units: "kg", comments:"short trip" };
{
"ls" : "aberdeen",
"sampler" : "spock",
"no_prof_fish" : 3,
"no_part_time_fish" : 1,
"start_dt" : "05/11/2012",
"start_time" : "15:00",
"end_dt" : "05/11/2012",
"end_time" : "19:40",
"no_operations" : 6,
"catch_weight" : 23,
"catch_units" : "kg",
"comments" : "short trip"
}

b = { ls : "vigo", sampler:"piccard", no_prof_fish: 4, start_dt:"12/10/2010", end_dt:"15/10/2010", end_time:"23:00", no_operations: 11, catch_weight: 549, catch_units: "kg", comments:"good catch" };
{
"ls" : "vigo",
"sampler" : "piccard",
"no_prof_fish" : 4,
"start_dt" : "12/10/2010",
"end_dt" : "15/10/2010",
"end_time" : "23:00",
"no_operations" : 11,
"catch_weight" : 549,
"catch_units" : "kg",
"comments" : "good catch"
}

c = { ls : "Ostia", sampler:"Riker", start_dt:"23/01/2010", end_dt:"24/01/2010", catch_weight: 9, catch_units: "kg" };
{
"ls" : "Ostia",
"sampler" : "Riker",
"start_dt" : "23/01/2010",
"end_dt" : "24/01/2010",
"catch_weight" : 9,
"catch_units" : "kg"
}

d = { ls : "Portimao", start_dt:"05/11/2012" catch_weight: 11, catch_units: "kg" };
Mon Nov  5 15:57:34 SyntaxError: missing } after property list (shell):1
> d = { ls : "Portimao", start_dt:"05/11/2012", catch_weight: 11, catch_units: "kg" };
{
"ls" : "Portimao",
"start_dt" : "05/11/2012",
"catch_weight" : 11,
"catch_units" : "kg"
}

You may have noticed that each one of the objects has different fields. As I said before, that is no problem: they do not need to have common fields at all (although if you are creating a collection, the odds are that they will have some!) You may also have noticed the MongoDB reply to you insertion, showing you the object in JSON notation, which is how it internally stores the data.

Next step is to create a collection to store all these objects. A collection may be seen as the equivalent to a “relational” table, since it groups together a set of objects (which would be “records” in the relational world.

db.trips.save(a);
db.trips.save(b);
db.trips.save(c);
db.trips.save(d);

In the lines above, I have created the collection “trips” and stored all my objects inside it. To list all the elements in “Trips”, I can write:

db.trips.find();
{ "_id" : ObjectId("5097d41651dac5d23d125cf6"), "ls" : "aberdeen", "sampler" : "spock", "no_prof_fish" : 3, "no_part_time_fish" : 1, "start_dt" : "05/11/2012", "start_time" : "15:00", "end_dt" : "05/11/2012", "end_time" : "19:40", "no_operations" : 6, "catch_weight" : 23, "catch_units" : "kg", "comments" : "short trip" }
{ "_id" : ObjectId("5097d41c51dac5d23d125cf7"), "ls" : "vigo", "sampler" : "piccard", "no_prof_fish" : 4, "start_dt" : "12/10/2010", "end_dt" : "15/10/2010", "end_time" : "23:00", "no_operations" : 11, "catch_weight" : 549, "catch_units" : "kg", "comments" : "good catch" }
{ "_id" : ObjectId("5097d42051dac5d23d125cf8"), "ls" : "Ostia", "sampler" : "Riker", "start_dt" : "23/01/2010", "end_dt" : "24/01/2010", "catch_weight" : 9, "catch_units" : "kg" }
{ "_id" : ObjectId("5097d42251dac5d23d125cf9"), "ls" : "Portimao", "start_dt" : "05/11/2012", "catch_weight" : 11, "catch_units" : "kg" }

Or, if I want a “pretty” JSON notation, I can use a “cursor”.

db.trips.find().forEach(printjson);
{
"_id" : ObjectId("5097d41651dac5d23d125cf6"),
"ls" : "aberdeen",
"sampler" : "spock",
"no_prof_fish" : 3,
"no_part_time_fish" : 1,
"start_dt" : "05/11/2012",
"start_time" : "15:00",
"end_dt" : "05/11/2012",
"end_time" : "19:40",
"no_operations" : 6,
"catch_weight" : 23,
"catch_units" : "kg",
"comments" : "short trip"
}
{
"_id" : ObjectId("5097d41c51dac5d23d125cf7"),
"ls" : "vigo",
"sampler" : "piccard",
"no_prof_fish" : 4,
"start_dt" : "12/10/2010",
"end_dt" : "15/10/2010",
"end_time" : "23:00",
"no_operations" : 11,
"catch_weight" : 549,
"catch_units" : "kg",
"comments" : "good catch"
}
{
"_id" : ObjectId("5097d42051dac5d23d125cf8"),
"ls" : "Ostia",
"sampler" : "Riker",
"start_dt" : "23/01/2010",
"end_dt" : "24/01/2010",
"catch_weight" : 9,
"catch_units" : "kg"
}
{
"_id" : ObjectId("5097d42251dac5d23d125cf9"),
"ls" : "Portimao",
"start_dt" : "05/11/2012",
"catch_weight" : 11,
"catch_units" : "kg"
}

And here we have, a very efficient and flexible way of storing a collection of trip sheets.

Now, how would we do a query?

For instance the SQL query “SELECT * from Trips WHERE ls=”Ostia””, would translate into:

db.trips.find({ls:"Ostia"}).forEach(printjson);
{
"_id" : ObjectId("5097d42051dac5d23d125cf8"),
"ls" : "Ostia",
"sampler" : "Riker",
"start_dt" : "23/01/2010",
"end_dt" : "24/01/2010",
"catch_weight" : 9,
"catch_units" : "kg"
}

Easy, no? 🙂 Another example? “Select * from Trips WHERE start_dt=”05/11/2012″)”

db.trips.find({start_dt:"05/11/2012"}).forEach(printjson);
{
"_id" : ObjectId("5097d41651dac5d23d125cf6"),
"ls" : "aberdeen",
"sampler" : "spock",
"no_prof_fish" : 3,
"no_part_time_fish" : 1,
"start_dt" : "05/11/2012",
"start_time" : "15:00",
"end_dt" : "05/11/2012",
"end_time" : "19:40",
"no_operations" : 6,
"catch_weight" : 23,
"catch_units" : "kg",
"comments" : "short trip"
}

There are a few more interesting feature for queries (such as the “Syntactic Sugar” :-)), that maybe I will save from another post. Also bear in mind, that mostly you will be accessing MongoDB through a programming language (using one of their many “language drivers”) rather than from this shell.

MongoDB is using JSON, which is a programmer’s friendly format (unlike relational tables). Its concepts (objects) are closer to the object oriented paradigm that most modern languages implement, and most importantly closer to the way we think. Its adaptability makes it particularly AGILE, accompanying the evolution of the processes we want to represent. All good reasons to get interested on this technology! 🙂 Keep tuned!

Advertisements

2 thoughts on “First Steps at MongoDB

  1. Sabes que um dos meus trabalhos de UNIV foi parecido. Fazer um modelo para informatização de uma empresa de comercio de peixe. 🙂 So agora é que me lembrei!

  2. Pingback: Database Merging | heartcode

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s