Hands-on MongoDB :: Part-1


Bhaskar S 03/09/2014


Overview

MongoDB is an open-source Document-oriented NoSQL database with the following features:

Installation

Download the latest stable version (2.4.9 as of 03/09/2014) of the MongoDB database from the project site located at the URL www.mongodb.org/downloads

Also, download the latest stable version (2.11.4 as of 03/09/2014) of the MongoDB Java driver from the Maven site located at the URL central.maven.org/maven2/org/mongodb/mongo-java-driver/2.11.4/

Following are the steps to install MongoDB on a 64-bit Ubuntu Linux workstation:

Basic Concepts

This section will cover some basic concepts of MongoDB:

Hands-on with MongoDB

The best way to explore MongoDB is to use the command-line interface called mongo, which is nothing more than an interactive JavaScript shell. In the following paragraphs we will explore some basics of MongoDB.

To launch the command-line interactive MongoDB client, execute the following command:

$MONGODB_HOME/bin/mongo

The following will be the output:

Output.1

MongoDB shell version: 2.4.9
connecting to: test
>

To list all the currently available databases, execute the following command:

show dbs

The following will be the output:

Output.2

> show dbs
local	0.078125GB
> 

The above output indicates that there is one database called local. Also, by default the MongoDB client connects to the database called test, which will be physically created only when we perform some operation on that database.

For our demo, we will create our own database called mydb. To create the mydb database, execute the following command:

use mydb

The following will be the output:

Output.3

switched to db mydb
> 

Notice that we only indicated that we want to use the mydb database; we did not create one. MongoDB uses lazy initialization and defers physically creating the database until we create a collection and add a document to it.

The MongoDB client sets the global variable db to the current database in use.

To check the database currently in use, execute the following command:

db

The following will be the output:

Output.4

mydb
> 

The above output indicates that we are currently using the mydb database.

To perform any operation on a database, we will use the global variable db. For the demo, we will work with the collection contacts. To access this collection, we refer to it as db.contacts. Again, just as MongoDB did not physically create a database, MongoDB will defer the creation of the collection contacts until we add at least one document to that collection.

To list all the collection(s) in a database, execute the following command:

show collections

The following will be the output:

Output.5

> 

The above output indicates that there are no collection(s) yet.

To create the collection contacts, we need to add at least one document to the collection. To add a new document to a collection, use the insert() command. Let us add a new document by executing the following command:

db.contacts.insert({ first: "Bill", last: "Gates", email: "bill.gates@microsoft.com", mobile: "123 456 7890" })

The following will be the output:

Output.6

> 

The above output indicates that there were no errors and the document was successfully added.

This is similar to the INSERT INTO contacts VALUES(...) SQL statement from the relational world.

Let us now list all the collection(s) in a database by executing the following command:

show collections

The following will be the output:

Output.7

contacts
system.indexes
> 

As can be seen from the above output, MongoDB has created the collection contacts.
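
The lazy creation we just observed can be pictured with a toy in-memory model. This is only a sketch in plain JavaScript (it runs in Node.js without a server); createServer and its methods are hypothetical names made up for illustration, and the model ignores the system.indexes collection that the real server also creates:

```javascript
// Toy in-memory model of lazy creation. createServer and its methods are
// hypothetical names for illustration, not MongoDB APIs.
function createServer() {
  var databases = {}; // database name -> { collection name -> [documents] }
  return {
    // Like "show dbs": report only the databases that physically exist.
    listDatabases: function () {
      return Object.keys(databases);
    },
    // Like "show collections": report only the collections that exist.
    listCollections: function (dbName) {
      return Object.keys(databases[dbName] || {});
    },
    // The first insert creates the database and the collection on demand.
    insert: function (dbName, collName, doc) {
      if (!databases[dbName]) databases[dbName] = {};
      if (!databases[dbName][collName]) databases[dbName][collName] = [];
      databases[dbName][collName].push(doc);
    }
  };
}

var server = createServer();
console.log(server.listCollections("mydb")); // nothing exists yet
server.insert("mydb", "contacts", { first: "Bill", last: "Gates" });
console.log(server.listDatabases());
console.log(server.listCollections("mydb"));
```

Nothing is recorded for mydb until the first insert, which mirrors what show dbs and show collections reported before and after we added the first document.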

To display the number of documents in the collection contacts, use the count() command. Now, execute the following command:

db.contacts.count()

The following will be the output:

Output.8

1
> 

As can be seen from the above output, we have 1 document in the collection contacts.

This is similar to the SELECT COUNT(*) FROM contacts SQL statement from the relational world.

To display all the documents in the collection contacts, use the find() command. Now, execute the following command:

db.contacts.find()

The following will be the output:

Output.9

{ "_id" : ObjectId("531e555ef4448f6405eccea4"), "first" : "Bill", "last" : "Gates", "email" : "bill.gates@microsoft.com", "mobile" : "123 456 7890" }
> 

As can be seen from the above output, we see the document we inserted into the collection contacts earlier.

This is similar to the SELECT * FROM contacts SQL statement from the relational world.

But wait !!! What is with the key _id ??? We never had that in the document when we added it.

Every MongoDB document must have a unique key by which the document can be identified. This is analogous to the primary key of a table in a relational database. The key _id is the unique primary key automatically added by MongoDB.

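The document key _id is an object of type ObjectId, a 12-byte value (displayed as a 24-character hex string) that is designed to be unique across a cluster of machines. In this version of MongoDB it is generated by concatenating a 4-byte timestamp (seconds since the Unix epoch), a 3-byte machine identifier, a 2-byte process id, and a 3-byte counter.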
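
The leading 4 bytes of an ObjectId hold its creation time as seconds since the Unix epoch, so the timestamp can be recovered from the hex string alone. The following is a minimal sketch in plain JavaScript (runnable in Node.js without a server); objectIdTimestamp is a hypothetical helper name and not part of MongoDB:

```javascript
// Decode the creation time embedded in an ObjectId hex string.
// objectIdTimestamp is a hypothetical helper name, not a MongoDB API.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  // The first 8 hex characters (4 bytes) are seconds since the Unix epoch.
  return parseInt(hex.substring(0, 8), 16);
}

// The _id of the first document we inserted into contacts.
var seconds = objectIdTimestamp("531e555ef4448f6405eccea4");
console.log(seconds);
console.log(new Date(seconds * 1000).toISOString());
```

Inside the mongo shell the same information is available directly through the getTimestamp() method of an ObjectId.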

Now let us insert 4 more documents to the collection contacts by executing the following commands:

db.contacts.insert({ first: "James", last: "Gosling", email: "jgosling@oracle.com", mobile: "234 567 8901" })

db.contacts.insert({ first: "Larry", last: "Wall", email: "larryw@perl.org" })

db.contacts.insert({ first: "Guido", middle: "van", last: "Rossum", email: "gvr@python.org", mobile: "345 678 9012" })

db.contacts.insert({ first: "Anders", last: "Hejlsberg", email: "anders@microsoft.com", mobile: "456 789 0123" })

Now, let us query and display all the documents in the collection contacts by executing the following command:

db.contacts.find()

The following will be the output:

Output.10

{ "_id" : ObjectId("531e555ef4448f6405eccea4"), "first" : "Bill", "last" : "Gates", "email" : "bill.gates@microsoft.com", "mobile" : "123 456 7890" }
{ "_id" : ObjectId("531fa52ac4793a19ae72e779"), "first" : "James", "last" : "Gosling", "email" : "jgosling@oracle.com", "mobile" : "234 567 8901" }
{ "_id" : ObjectId("531fa52ac4793a19ae72e77a"), "first" : "Larry", "last" : "Wall", "email" : "larryw@perl.org" }
{ "_id" : ObjectId("531fa52ac4793a19ae72e77b"), "first" : "Guido", "middle" : "van", "last" : "Rossum", "email" : "gvr@python.org", "mobile" : "345 678 9012" }
{ "_id" : ObjectId("531fa52ac4793a19ae72e77c"), "first" : "Anders", "last" : "Hejlsberg", "email" : "anders@microsoft.com", "mobile" : "456 789 0123" }
> 

To query for all the document(s) in the collection contacts whose key first has the value Larry, execute the following command:

db.contacts.find({ first: "Larry" })

The following will be the output:

Output.11

{ "_id" : ObjectId("531fa52ac4793a19ae72e77a"), "first" : "Larry", "last" : "Wall", "email" : "larryw@perl.org" }
> 

As can be seen from the above output, we have one document from the collection contacts with the key first having a value of Larry.

This is similar to the SELECT * FROM contacts WHERE first = 'Larry' SQL statement from the relational world.

To query for all the document(s) in the collection contacts whose key first has the value Bill and whose key last has the value Gates, execute the following command:

db.contacts.find({ first: "Bill", last: "Gates" })

The following will be the output:

Output.12

{ "_id" : ObjectId("531e555ef4448f6405eccea4"), "first" : "Bill", "last" : "Gates", "email" : "bill.gates@microsoft.com", "mobile" : "123 456 7890" }
> 

As can be seen from the above output, we have one document from the collection contacts with the key first having a value of Bill and the key last having a value of Gates.

This is similar to the SELECT * FROM contacts WHERE first = 'Bill' AND last = 'Gates' SQL statement from the relational world.
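
To see why several keys in the criteria behave like an AND, here is a small sketch in plain JavaScript (runnable in Node.js without a server). The function matches is a hypothetical helper that models exact-match criteria only, and the Bill Joy entry is an extra sample record that is not in our contacts collection:

```javascript
// Simplified model of exact-match criteria: a document matches only when
// every key in the criteria has the same value in the document.
// matches() is a hypothetical helper, not a MongoDB API.
function matches(doc, criteria) {
  return Object.keys(criteria).every(function (key) {
    return doc[key] === criteria[key];
  });
}

var sample = [
  { first: "Bill", last: "Gates" },
  { first: "Bill", last: "Joy" }, // extra sample entry, not in the tutorial data
  { first: "Larry", last: "Wall" }
];

var found = sample.filter(function (doc) {
  return matches(doc, { first: "Bill", last: "Gates" });
});
console.log(JSON.stringify(found));
```

Note that in this model an empty criteria object matches every document, which lines up with find() returning all the documents when called without any criteria.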

To query for all the document(s) in the collection contacts whose key first has the value James and display them in a pretty JSON format, execute the following command:

db.contacts.find({ first: "James" }).forEach(printjson)

The following will be the output in a pretty JSON format:

Output.13

{
	"_id" : ObjectId("531fa52ac4793a19ae72e779"),
	"first" : "James",
	"last" : "Gosling",
	"email" : "jgosling@oracle.com",
	"mobile" : "234 567 8901"
}
> 

To query all the document(s) and list only the keys first and last from the collection contacts, execute the following command:

db.contacts.find({}, { first: 1, last: 1 })

The following will be the output:

Output.14

{ "_id" : ObjectId("531e555ef4448f6405eccea4"), "first" : "Bill", "last" : "Gates" }
{ "_id" : ObjectId("531fa52ac4793a19ae72e779"), "first" : "James", "last" : "Gosling" }
{ "_id" : ObjectId("531fa52ac4793a19ae72e77a"), "first" : "Larry", "last" : "Wall" }
{ "_id" : ObjectId("531fa52ac4793a19ae72e77b"), "first" : "Guido", "last" : "Rossum" }
{ "_id" : ObjectId("531fa52ac4793a19ae72e77c"), "first" : "Anders", "last" : "Hejlsberg" }
> 

As can be seen from the above output, it shows all the documents from the collection contacts with the keys first and last.

This is similar to the SELECT first, last FROM contacts SQL statement from the relational world.

But wait !!! Why is the key _id showing up ??? We never asked for it - did we ?

MongoDB by default includes the key _id in every query irrespective of whether we asked for it or not. If we do not want the key _id to show up, we need to explicitly suppress it.

To query all the document(s) and list only the keys first and last (without the key _id) from the collection contacts, execute the following command:

db.contacts.find({}, { first: 1, last: 1, _id: 0 })

The following will be the output:

Output.15

{ "first" : "Bill", "last" : "Gates" }
{ "first" : "James", "last" : "Gosling" }
{ "first" : "Larry", "last" : "Wall" }
{ "first" : "Guido", "last" : "Rossum" }
{ "first" : "Anders", "last" : "Hejlsberg" }
> 
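
The projection rule we just saw (listed keys are included, and _id is included unless it is explicitly set to 0) can be modeled in a few lines of plain JavaScript (runnable in Node.js without a server). This is a simplified sketch; project is a hypothetical helper and it handles inclusion projections only:

```javascript
// Simplified, in-memory model of an inclusion projection.
// project() is a hypothetical helper, not a MongoDB API.
function project(doc, fields) {
  var out = {};
  // _id is kept unless the projection explicitly sets it to 0.
  if (fields._id !== 0 && "_id" in doc) {
    out._id = doc._id;
  }
  // Copy every other key that the projection marks with 1.
  Object.keys(fields).forEach(function (key) {
    if (key !== "_id" && fields[key] === 1 && key in doc) {
      out[key] = doc[key];
    }
  });
  return out;
}

var larry = {
  _id: "531fa52ac4793a19ae72e77a",
  first: "Larry",
  last: "Wall",
  email: "larryw@perl.org"
};
console.log(JSON.stringify(project(larry, { first: 1, last: 1 })));
console.log(JSON.stringify(project(larry, { first: 1, last: 1, _id: 0 })));
```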

Until now we have been using the find() command on the MongoDB collection contacts and it appears to return a list of documents from that collection. In reality, the find() command returns a database cursor and not a list of documents (even if there is one entry).

Since the MongoDB client is also a JavaScript engine, we can iterate the database cursor from the command-line interface. Execute the following commands in the command-line interface:

var cur = db.contacts.find({}, { first: 1, last: 1, _id: 0 })

while (cur.hasNext()) {
    var doc = cur.next();
    print("First name: " + doc.first + ", Last name: " + doc.last);
}

The following will be the output:

Output.16

First name: Bill, Last name: Gates
First name: James, Last name: Gosling
First name: Larry, Last name: Wall
First name: Guido, Last name: Rossum
First name: Anders, Last name: Hejlsberg
> 
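
To get a feel for what the cursor is doing in the loop above, here is a tiny in-memory stand-in written in plain JavaScript (runnable in Node.js without a server). It is only a sketch: makeCursor is a hypothetical helper, and a real cursor fetches documents from the server in batches rather than from an array:

```javascript
// Minimal in-memory stand-in for a cursor over a list of documents.
// makeCursor is a hypothetical helper, not a MongoDB API.
function makeCursor(docs) {
  var index = 0;
  return {
    // True while there are documents left to hand out.
    hasNext: function () {
      return index < docs.length;
    },
    // Return the next document and advance the position.
    next: function () {
      if (index >= docs.length) {
        throw new Error("cursor exhausted");
      }
      return docs[index++];
    }
  };
}

var cursor = makeCursor([
  { first: "Bill", last: "Gates" },
  { first: "Larry", last: "Wall" }
]);

var lines = [];
while (cursor.hasNext()) {
  var entry = cursor.next();
  lines.push("First name: " + entry.first + ", Last name: " + entry.last);
}
console.log(lines.join("\n"));
```

Once the loop ends the cursor is exhausted, so iterating a second time requires running the query again.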

This is cool, ain't it !!!

Now, to query and return an actual document for the key first with a value of Larry from the collection contacts, execute the following command:

db.contacts.findOne({ first: "Larry" })

The following will be the output:

Output.17

{
	"_id" : ObjectId("531fa52ac4793a19ae72e77a"),
	"first" : "Larry",
	"last" : "Wall",
	"email" : "larryw@perl.org"
}
> 

To limit the number of documents returned by the find() query command, use the limit() function. To demonstrate this capability, execute the following command:

db.contacts.find({}, { first: 1, last: 1, _id: 0 }).limit(3)

The following will be the output:

Output.18

{ "first" : "Bill", "last" : "Gates" }
{ "first" : "James", "last" : "Gosling" }
{ "first" : "Larry", "last" : "Wall" }
> 

We will cover more advanced queries in a later part in this series.

Let us move on to updating documents now.

To update a document, use the update() function.

Let us go ahead and update the document for the key first with a value of Larry to contain the key mobile. For this, let us execute the following command:

db.contacts.update({ first: "Larry" }, { mobile: "987 654 3210" })

The following will be the output:

Output.19

> 

Now, let us query the document for the key first with a value of Larry from the collection contacts by executing the following command:

db.contacts.find({ first: "Larry" })

The following will be the output:

Output.20

> 

No document found ??? What happened here ???

When the second argument contains only plain key-value pairs and no update operators, the update() command replaces the whole document (only the key _id is preserved). If we query the document for the key mobile with a value of 987 654 3210, we will find the document. Let us execute the following command:

db.contacts.find({ mobile: "987 654 3210" })

The following will be the output:

Output.21

{ "_id" : ObjectId("531fa52ac4793a19ae72e77a"), "mobile" : "987 654 3210" }
> 
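
The difference between replacing a document and modifying it in place can be sketched in plain JavaScript (runnable in Node.js without a server). This is a simplified model and not the server's implementation; applyUpdate is a hypothetical helper that handles only whole-document replacement and the $set update operator:

```javascript
// Simplified model of how update() treats its second argument.
// applyUpdate is a hypothetical helper; only replacement and $set are modeled.
function applyUpdate(doc, update) {
  var keys = Object.keys(update);
  var usesOperators = keys.some(function (k) {
    return k.charAt(0) === "$";
  });
  var result = {};
  if (usesOperators) {
    // Operator form: start from the existing document and overlay $set fields.
    Object.keys(doc).forEach(function (k) { result[k] = doc[k]; });
    var changes = update.$set || {};
    Object.keys(changes).forEach(function (k) { result[k] = changes[k]; });
  } else {
    // Replacement form: only _id survives from the old document.
    result._id = doc._id;
    keys.forEach(function (k) { result[k] = update[k]; });
  }
  return result;
}

var original = {
  _id: "531fa52ac4793a19ae72e77a",
  first: "Larry",
  last: "Wall",
  email: "larryw@perl.org"
};
var replaced = applyUpdate(original, { mobile: "987 654 3210" });
var merged = applyUpdate(original, { $set: { mobile: "987 654 3210" } });
console.log(JSON.stringify(replaced));
console.log(JSON.stringify(merged));
```

In the mongo shell, the operator form would be written as db.contacts.update({ first: "Larry" }, { $set: { mobile: "987 654 3210" } }), which adds the key mobile and leaves the other keys untouched.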

Let us fix the document for the key mobile with a value of 987 654 3210 by restoring the missing keys first, last, and email while keeping the key mobile. For this, let us execute the following command:

db.contacts.update({ mobile: "987 654 3210" }, { first: "Larry", last: "Wall", email: "larryw@perl.org", mobile: "987 654 3210" })

Now, we should be able to query the document for the key first with a value of Larry from the collection contacts by executing the following command:

db.contacts.find({ first: "Larry" })

The following will be the output:

Output.22

{ "_id" : ObjectId("531fa52ac4793a19ae72e77a"), "first" : "Larry", "last" : "Wall", "email" : "larryw@perl.org", "mobile" : "987 654 3210" }
> 

We will cover more advanced updates in a later part in this series.

Let us move on to deleting documents now.

To delete a document, use the remove() function.

Let us go ahead and delete the document for the key first with a value of James. For this, let us execute the following command:

db.contacts.remove({ first: "James" })

The following will be the output:

Output.23

> 

This is similar to the DELETE FROM contacts WHERE first = 'James' SQL statement from the relational world.

Now, let us query all the documents from the collection contacts by executing the following command:

db.contacts.find()

The following will be the output:

Output.24

{ "_id" : ObjectId("531e555ef4448f6405eccea4"), "first" : "Bill", "last" : "Gates", "email" : "bill.gates@microsoft.com", "mobile" : "123 456 7890" }
{ "_id" : ObjectId("531fa52ac4793a19ae72e77b"), "first" : "Guido", "middle" : "van", "last" : "Rossum", "email" : "gvr@python.org", "mobile" : "345 678 9012" }
{ "_id" : ObjectId("531fa52ac4793a19ae72e77c"), "first" : "Anders", "last" : "Hejlsberg", "email" : "anders@microsoft.com", "mobile" : "456 789 0123" }
{ "_id" : ObjectId("531fa52ac4793a19ae72e77a"), "first" : "Larry", "last" : "Wall", "email" : "larryw@perl.org", "mobile" : "987 654 3210" }
> 

As can be seen from the above output, the document for the key first with a value of James is gone !!!

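To delete all the documents from a collection, use the remove() function without any criteria. This works in the 2.4 shell used here; MongoDB 2.6 and later require an explicit query document, so the equivalent call there is db.contacts.remove({}).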

Let us go ahead and delete all the documents by executing the following command:

db.contacts.remove()

This is similar to the DELETE FROM contacts SQL statement from the relational world.

Now let us display the number of documents in the collection contacts by executing the following command:

db.contacts.count()

The following will be the output:

Output.25

0
> 

As can be seen from the above output, all the documents from the collection contacts are gone !!!

To drop the collection contacts, use the drop() function.

Let us go ahead and drop the collection contacts by executing the following command:

db.contacts.drop()

The following will be the output:

Output.26

true
> 

This is similar to the DROP TABLE contacts SQL statement from the relational world.

Finally, to exit the MongoDB command-line shell, execute the following command:

exit