11 - NoSQL and MongoDB - CRUD Statements

Table of Contents

Suggested Reading Materials


Welcome to Chapter 11

Welcome to chapter 11 of DBS710, Advanced Database Systems.  This week we continue our work with MongoDB in the area of writing statements to manipulate the data in the database, the equivalent of the SQL statements known as CRUD:

Create or Drop Databases

Creating Using MongoDB Shell

MongoDB creates a database (if it does not exist) when you insert the first document into your database.

> use NewDatabaseName
> db.NewCollection.insertOne({"keyname": "value"})

Both the database, NewDatabaseName, and collection, NewCollection, will be created when the second line is executed.

Dropping Using MongoDB Shell

To drop a database, i.e. delete it permanently

> use NewDatabaseName
> db.dropDatabase()
removes the current database and all data inside it. The use statement is only required if the database to be deleted is not the current database. This will be the case when first loading the shell each day.

Creating Using Compass

Alternatively you can use Compass to create a database from the main screen by clicking on "Create Database" and entering the database name and the first collection name.

Dropping Using Compass

To Drop a database within Compass, navigate to the main screen by clicking on "Local" in the top left corner, then clicking the trashcan icon beside the database to be deleted. Note that only admins can delete the three reserved databases present. In Compass, you will be prompted to type in the database name to confirm the deletion as an added security protocol.

Create or Drop Collections

Creating Using MongoDB Shell

In addition to creating collections while creating databases shown above, you can choose to create a second collection. By setting the current database first and then using a insert statement to add a document

> db.NewCollectionName.insert({"keyname":"value", "keyname2":"value2"})
WriteResult({ "nInserted" : 1 })  

Dropping Using MongoDB Shell

With the current database set correctly, you can type

> db.CollectionName.drop();

Creating using Mongo Compass

From the main MongoDB screen where the databases are listed, you can click on the correct database and then either click the CREATE COLLECTION button or Click the + icon beside the database name. Both bring up the create collection screen where you type in the name and click CREATE COLLECTION.

Dropping Using Mongo Compass

To drop a collection and all the contained documents, click on the correct database and on the list of collections, hit the trashcan icon beside it. To verify the drop, you are prompted to type in the collection name and then click the DROP COLLECTION button.

Manipulating Documents - CRUD

INSERT

The insertOne function can be used to insert a single document into a collection.

> db.collectionName.insertOne({"key1":value1, "key2":value2, "key3":value3})

The insertion automatically creates an _id key and an auto generated value for each document. You can override this feature by including a "_id" key in the document insertion.

PRACTICE: Insert one document into the book collection, the insertion will be the book "Lord of the Rings" and include the author as "J.R.R. Tolkien". Use the books ISBN, 9780007488377, as the document key value.

> db.book.insertOne({"_id" : "9780007488377", "title" : "Lord of the Rings", "author" : "J.R.R. Tolkien"})
{ "acknowledged" : true, "insertedId" : "9780007488377" }

Bulk INSERT

In the Shell, we use the .insert function to add one or more documents to a collection.

To insert a single row with .insert

> db.collectionName.insert({"key1":value1, "key2":value2, "key3":value3})

or inserting multiple documents at the same time:

> db.collectionName.insert({"key1":value1, "key2":value2}, {"key1" : value1b, "key2" : value2b})

Example:

Insert both a Jeep Cherokee (2 doors) and a Nissan Maxima (4 doors) as documents to the car collection in one statement

> db.car.insert([{"make" : "Jeep", "model" : "Cherokee", "numDoors" : "2"},
... {"make" : "Nissan", "model" : "Maxima", "numDoors" : 4}])
BulkWriteResult({
    "writeErrors" : [ ],
    "writeConcernErrors" : [ ],
    "nInserted" : 2,
    "nUpserted" : 0,
    "nMatched" : 0,
    "nModified" : 0,
    "nRemoved" : 0,
    "upserted" : [ ]
})

The following code would fail

> db.foo.insert([{"_id" : 0}, {"_id" : 1}, {"_id" : 1}, {"_id" : 2}])
as the _id is duplicated. The first 2 would insert correctly, but the second 2 would fail.

The insertMany function can also be used for bulk inserts.

> db.blog.insertMany([
… {"title" : "My Blog Post", "content" : "Here's my blog post."},
… {"title" : "New Post", "content" : "Here's the new blog post."},
… {"title" : "Public Post", "content" : "Here's the public post."},
])

Deleting Documents (REMOVE)

The remove function deletes documents and can be used in several forms:

> db.collectionname.remove()
removes ALL documents in the given collection.

The following command removes all documents where the keyname field's value is the value provided.

> db.collectionname.remove{"keyname" : value}
for example:
> db.collectionname.remove{"make" : "Honda"}
would remove all cars where the make was set to Honda, regardless of all other values. This would be the equivalent of the SQL statement:
DELETE FROM cars WHERE make = 'Honda';
that has no filtering by primary key.

The documents cannot be recovered after they are removed.

Updating Documents (UPDATE)

There are two main functions to update data in a MongoDB: .replace() and .update(). Let us look at replace first.

Replace and ReplaceOne

To replace a document with a new one, the replaceOne() function is used.

To replace the following document (in the People collection):

{ "name" : "Joe" }
we could use something like:
> db.people.replaceOne({"name" : "Joe"}, {"name" : "Joe", "friends" : 32, "enemies" : 2})
It is possible to change Joe's name too.

Update()

The update() function is used to update the value of a key in a document and takes 2 parameters:

  1. a query document to locate the document to update (like a WHERE clause in SQL)
  2. a modifier document describing the changes to be made (like the SET clause in SQL)

The update operation is atomic, meaning that the LILO (last-in, last-out) philosophy is applied. If there are two update requests coming to the server, the one that reaches the server first will be executed, and when it is done, the second one will be executed.

A Basic Update
given an existing record : { "_id" : 2, "fname" : "Josh", "lname" : "Phan", "favNum" : 16 }
> db.students.update({"_id": 2} , {"fname" : "Josh", "lname": "Phan", "favNum" : 22, "favColor" : "Red"})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

NOTE: All the keys and values are included except the _id key. This is important as the following code will cause major problems.

given an existing record : { "_id" : 2, "fname" : "Josh", "lname" : "Phan", "favNum" : 16 }
> db.students.update({"_id": 2} , {"favColor" : "Red"})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
RESULTS IN : { "_id" : 2, "FavColor" : "Red" }

NOTE: All the other Keys and Values were lost using that statement. Be CAREFUL.

To solve this you can use a little JavaScript.

given an existing record : { "_id" : 2, "fname" : "Josh", "lname" : "Phan", "favNum" : 16 }
> var id2 = db.students.findOne({"_id" : 2});
> id2.favColor = "Red";
Red
> db.students.update({"_id" : 2}, id2);
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

to delete a key and value from a record, you can use the same code as above and add:

> delete id2.favNum;
> db.students.update({"_id" : 2}, id2);
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

This would simply remove the favNum key and value from the document with _id 2.

Duplicate Key/Value Pairs

Assume we have the following documents:

> db.people.find()
{"_id" : ObjectId("4b2b9f67a1f631733d917a7b"), "name" : "joe", "age" : 65},
{"_id" : ObjectId("4b2b9f67a1f631733d917a7c"), "name" : "joe", "age" : 20},
{"_id" : ObjectId("4b2b9f67a1f631733d917a7d"), "name" : "joe", "age" : 49}

Let’s say we want to increment the value of age for the second document.

    > joe = db.people.findOne({"name" : "joe", "age" : 20});
    {
        "_id" : ObjectId("4b2b9f67a1f631733d917a7c"),
        "name" : "joe",
        "age" : 20
    }
    > joe.age++;                                 // makes joe 21 now
    > db.people.update({"name" : "joe"}, joe);
    E11001 duplicate key on update

The update function wants to replace the updated document to the first matching {"name" : "joe"}. But when updating there is another document with the same _id which is the second document. The update will fail.

> db.people.update({"_id" : ObjectId("4b2b9f67a1f631733d917a7c")}, joe)

Update Modifiers

Now, if we want to update specific Key Values within a document we can use modifiers to alter, add or remove key values.

$inc Modifier

The $inc modifier can be used to increment values by a supplied amount.

> db.websites.findOne({"url":"www.example.com"});
{
    "_id" : ObjectId("4b253b067525f35f94b60a31"),
    "url" : "www.example.com",
    "pageviews" : 52
}
> db.analytics.update({"url" : "www.example.com"}, {"$inc" : {"pageviews" : 1}})
{
    "_id" : ObjectId("4b253b067525f35f94b60a31"),
    "url" : "www.example.com",
    "pageviews" : 53
}

$set Modifier

The $set modifier allows you to update a specific field without replacing the entire document. If the key provided does not exist, it will create it.

For Example: given the following document:

{
    "_id" : ObjectId("4b253b067525f35f94b60a31"),
    "url" : "www.example.com",
    "pageviews" : 52
}

We can use the .update() function to change a value for a specific existing key.

> db.websites.update({"url" : "www.example.com"}, {"$set" : {"pageviews" : 23}})
{
    "_id" : ObjectId("4b253b067525f35f94b60a31"),
    "url" : "www.example.com",
    "pageviews" : 23
}

We can use the .update() function to add a new key and value to an existing document.

> db.websites.update({"url" : "www.example.com"}, {"$set" : {"homepage" : "/index.html"}})
{
    "_id" : ObjectId("4b253b067525f35f94b60a31"),
    "url" : "www.example.com",
    "pageviews" : 23
    "homepage" : "/index.html"
}
Using Set for Nested (Embedded) Documents

The $set operator can be used to update an embedded document.

> db.blog.posts.findOne({"title":"A Blog Post"});
{
    "_id" : ObjectId("4b253b067525f35f94b60a31"),
    "title" : "A Blog Post",
    "content" : "...",
    "author" : {
        "name" : "joe",
        "email" : "joe@example.com"
    }
}

> db.blog.posts.update({"author.name" : "joe"},
... {"$set" :{"author.name" :"joe schmoe"}})
> db.blog.posts.findOne()
{
    "_id" : ObjectId("4b253b067525f35f94b60a31"),
    "title" : "A Blog Post",
    "content" : "...",
    "author" : {
        "name" : "joe schmoe",
        "email" : "joe@example.com"
    }
}

$unset Operator

The $unset function is used to remove a key from a document.

> db.websites.update({"url":"www.example.com")},{"$unset", "homepage":"index.html"});
{
    "_id" : ObjectId("4b253b067525f35f94b60a31"),
    "url" : "www.example.com",
    "pageviews" : 23
    "homepage" : "/index.html"
}

the homepage key and value are deleted.

MongoDB Queries

There are many similarities between queries in NoSQL and those in relational SQL, but there are also many differences. So we will go through the basics one step at a time and for the purposes of examples or demonstrations, we will use a collection of data called FoodTrolley containing the following documents.

Sample Data

Database: HarryPotter
CollectionName: FoodTrolley

[
    {"_id":1,
        "FoodItem":"Chocolate Frogs",
        "Description":"literally chocolate shaped like a frog that jumps around - comes with a wizard card",
        "Price":1.2,
        "Inventory":56},
    {"_id":2,
        "FoodItem":"Bertie Bots Every Flavour Beans",
        "Description":"Basic jelly beans made to taste like just about anything",
        "Price":12.5,
        "Flavours":["Pear","Peppermint","Rocky Road","Salt","Soap","Toothpaste","Vanilla","Watermelon"]},
    {"_id":3,
        "FoodItem":"Pumpkins Pasties",
        "Description":"Pastries made to taste like pumpkins",
        "Price":1.8,
        "Inventory":14},
    {"_id":4,
        "FoodItem":"Cauldron Cakes",
        "Description":"Cauldron shaped licorice flavoured sweets",
        "Price":1.3},
    {"_id":5,
        "FoodItem":"Licorice Wands",
        "Price":0.9,
        "Colour":"Black"}
]

Find()

The find() operation is used to find documents within a collection. This is the equivalent of SELECT in SQL. The general syntax of the find() command is:

> db.collectionname.find({filter conditions},{keys to include in the output results});

where the filter condition(s) would be the equivalent of the WHERE sql clause and the keys to include would be the list of fields included in the SELECT clause in SQL. Both the condition and the keys must be in JavaScript object notation with key:value pairs.

To list all documents and all keys in a collection, the filter and keys would be left blank:

> db.collectionname.find(); is the equivalent of: SELECT * FROM tablename 

Choosing Which Keys to Output

Like in SQL, you can select which key/value pairs are included in the output and which are not.

> db.collectionname.find({},{key1:1, key2:1});

Would show all documents and include the _id key as well as both key1 and key2. We typically use a 0 to specify not to show a key and 1 to show it.

Example 1

> db.FoodTrolley.find({},{FoodItem:1})

outputs

{ "_id" : 1, "FoodItem" : "Chocolate Frogs" }
{ "_id" : 2, "FoodItem" : "Bertie Bots Every Flavour Beans" }
{ "_id" : 3, "FoodItem" : "Pumpkins Pasties" }
{ "_id" : 4, "FoodItem" : "Cauldron Cakes" }
{ "_id" : 5, "FoodItem" : "Licorice Wands" }

Note that the _id is shown by default, you must specify if you do not want it to show.

Showing the _id or not

The _id key is shown by default. In order to not show the _id, we must specify that it is not to be shown like:

> db.FoodTrolley.find({},{FoodItem:1,_id:0})

outputs

{ "FoodItem" : "Chocolate Frogs" }
{ "FoodItem" : "Bertie Bots Every Flavour Beans" }
{ "FoodItem" : "Pumpkins Pasties" }
{ "FoodItem" : "Cauldron Cakes" }
{ "FoodItem" : "Licorice Wands" }

Filter Results

To choose a specific value to filter by we can add a key:value pair to the filter portion of the command:

> db.collectionname.find({filter conditions});

Example 1

> db.FoodTrolley.find({"FoodItem":"Chocolate Frogs"})

outputs

{ "_id" : 1, 
    "FoodItem" : "Chocolate Frogs", 
    "Description" : "literally chocolate shaped like a frog that jumps around - comes with a wizard card", 
    "Price" : 1.2, 
    "Inventory" : 56 }

and combing the filter and key:value choice portions we can do something like:

> db.FoodTrolley.find({"FoodItem":"Chocolate Frogs"}, {FoodItem: 1, inventory: 1})

outputs

{ "_id" : 1, "FoodItem" : "Chocolate Frogs" }

Wildcards

We sometimes need the ability to search for a smaller string within a value and in SQL we would use the LIKE operator with the * or % comparator. In MongoDB, we can use easy regular expressions to do the same thing.

Example

> db.FoodTrolley.find({FoodItem: /Frog/}, {FoodItem: 1})
    -- note that the expression is case sensitive (using frog rather than Frog would return no results

outputs

{ "_id" : 1, "FoodItem" : "Chocolate Frogs" }

However, if we investigate further, we will see that this string search is case sensitive!

> db.FoodTrolley.find({FoodItem: /frog/}, {FoodItem: 1})
    -- note that the expression is case sensitive (using frog rather than Frog would return no results

will return no results because of the lower case f on frog

To make the search find result regardless of case, we can use the i regex qualifier

Example - case sensitivity

> db.FoodTrolley.find({FoodItem: /frog/i}, {FoodItem: 1})
    -- note that the expression is case sensitive (using frog rather than Frog would return no results

Using a Variety of Comparators

Sometimes we need to use different comparators in our filters, such as: >, <, >=, <=, !=, etc.

For these comparators, we can use modifiers in our key value pairs. These modifiers are:

Comparator Modifier
> $gt
>= $gte
< $lt
<= $lte
!= $ne
== $eq
IN $in
NOT IN $nin

Each modifier is entered as the key and the value will be another nested key/value pair representing the comparison values.

> Example

List all food items where the inventory is more than 20!

> db.FoodTrolley.find({Inventory: {$gt: 20}}, {FoodItem: 1})

outputs

{ "_id" : 1, "FoodItem" : "Chocolate Frogs" }

Note that not all the documents contain a key called "Inventory", so only those documents that both include Inventory and the value is greater than 20 was returned.

Between Example

Find all food items whose inventory is between 12 and 60 exclusively!

db.FoodTrolley.find({Inventory: {$gt: 12}, Inventory: {$lt:60}},{_id:0, Inventory:1, FoodItem:1})

outputs

{ "FoodItem" : "Chocolate Frogs", "Inventory" : 56 }
{ "FoodItem" : "Pumpkins Pasties", "Inventory" : 14 }

IN Example

Find all food items whose name contains either the word "frog" or the word "bean".

> db.FoodTrolley.find({FoodItem: {$in: [/Frog/,/Bean/]}}, {FoodItem: 1})
    for case sensitive
-- or
> db.FoodTrolley.find({FoodItem: {$in: [/frog/i,/bean/i]}}, {FoodItem: 1})
    for case insensitive

outputs

{ "_id" : 1, "FoodItem" : "Chocolate Frogs" }
{ "_id" : 2, "FoodItem" : "Bertie Bots Every Flavour Beans" }

$Exists and NULL

Because of the flexibility that NoSQL databases to have different keys in each document and some documents do not contain some keys, we need a way to filter the results by the very existence of a key, and not just by it's values.

We can use an $exists modifier to limit results by those that contain the specified key.

> db.FoodTrolley.find({"Inventory": {$exists: true}}, {FoodItem: 1})

outputs

{ "_id" : 1, "FoodItem" : "Chocolate Frogs" }
{ "_id" : 3, "FoodItem" : "Pumpkins Pasties" }

and using NULL we can determine if the key does not exist.

> db.FoodTrolley.find({"Inventory": null}, {FoodItem: 1})

outputs

{ "_id" : 2, "FoodItem" : "Bertie Bots Every Flavour Beans" }
{ "_id" : 4, "FoodItem" : "Cauldron Cakes" }
{ "_id" : 5, "FoodItem" : "Licorice Wands" }

$IN and $NIN

Often our comparator will include a list of values that would be acceptable, or not acceptable. We can use the $IN modifier to include the values and $NIN to not include others.

Example $IN

> db.FoodTrolley.find({"Inventory": {"$in": [10,14,23]}}, {FoodItem: 1})

outputs

{ "_id" : 3, "FoodItem" : "Pumpkins Pasties" }

Example $NIN

and the opposite

> db.FoodTrolley.find({"Inventory": {"$nin": [10,14,23]}}, {FoodItem: 1})

outputs

{ "_id" : 1, "FoodItem" : "Chocolate Frogs" }
{ "_id" : 2, "FoodItem" : "Bertie Bots Every Flavour Beans" }
{ "_id" : 4, "FoodItem" : "Cauldron Cakes" }
{ "_id" : 5, "FoodItem" : "Licorice Wands" }

Note: That all records that do not include the inventory key is also included along with those that have the key but not with one of the listed values.

$AND and $OR

Similar to SQL, we also need the ability to filter by multiple criteria at the same time using AND and OR boolean operators.

AND Example

List all food trolly items where the inventory is between 40 and 60, we will need to use an $and boolean operator as well!

> db.FoodTrolley.find(
        {$and: [{Inventory: {$gt: 12}}, {Inventory: {$lt:60}}, {FoodItem: /Frogs/}]},
        {_id:0, Inventory:1, FoodItem:1})

outputs

{ "_id" : 1, "FoodItem" : "Chocolate Frogs", "Inventory" : 56 }

This statement essentially is:
show the fooditem name and inventory key/values from FoodTrolley where (inventory > 12 AND inventory < 60 AND fooditem contains the string "FROGS")

OR Example

Find all food items that contain either the word "frog" or the word "bean"

> db.FoodTrolley.find(
    {$or: [{FoodItem: /frog/i}, {FoodItem: /bean/i}]},
    {_id:0, Inventory:1, FoodItem:1})

outputs

{ "FoodItem" : "Chocolate Frogs", "Inventory" : 56 }
{ "FoodItem" : "Bertie Bots Every Flavour Beans" }

Order of Operations/Precedence in NoSQL

In mathematics and SQL we learned that things happen in a specific order to determine an answer. Order of operations still apply within calculations in NoSQL, however items such as AND and OR in order of precedence are not impacted. This means that the {} curly braces will determine the hierarchy of the operators and in which order they execute. These operate as brackets and are executed from the inside out when nested.