Welcome to chapter 11 of DBS710, Advanced Database Systems. This week we continue our work with MongoDB in the area of writing statements to manipulate the data in the database, the equivalent of the SQL statements known as CRUD:
MongoDB creates a database (if it does not exist) when you insert the first document into your database.
> use NewDatabaseName > db.NewCollection.insertOne({"keyname": "value"})
Both the database, NewDatabaseName, and collection, NewCollection, will be created when the second line is executed.
To drop a database, i.e. delete it permanently
> use NewDatabaseName > db.dropDatabase()removes the current database and all data inside it. The use statement is only required if the database to be deleted is not the current database. This will be the case when first loading the shell each day.
Alternatively you can use Compass to create a database from the main screen by clicking on "Create Database" and entering the database name and the first collection name.
To Drop a database within Compass, navigate to the main screen by clicking on "Local" in the top left corner, then clicking the trashcan icon beside the database to be deleted. Note that only admins can delete the three reserved databases present. In Compass, you will be prompted to type in the database name to confirm the deletion as an added security protocol.
In addition to creating collections while creating databases shown above, you can choose to create a second collection. By setting the current database first and then using a insert statement to add a document
> db.NewCollectionName.insert({"keyname":"value", "keyname2":"value2"})
WriteResult({ "nInserted" : 1 })
With the current database set correctly, you can type
> db.CollectionName.drop();
From the main MongoDB screen where the databases are listed, you can click on the correct database and then either click the CREATE COLLECTION
button or Click the +
icon beside the database name. Both bring up the create collection screen where you type in the name and click CREATE COLLECTION
.
To drop a collection and all the contained documents, click on the correct database and on the list of collections, hit the trashcan icon beside it. To verify the drop, you are prompted to type in the collection name and then click the DROP COLLECTION
button.
The insertOne
function can be used to insert a single document into a collection.
> db.collectionName.insertOne({"key1":value1, "key2":value2, "key3":value3})
The insertion automatically creates an _id key and an auto generated value for each document. You can override this feature by including a "_id" key in the document insertion.
PRACTICE: Insert one document into the book collection, the insertion will be the book "Lord of the Rings" and include the author as "J.R.R. Tolkien". Use the books ISBN, 9780007488377, as the document key value.
> db.book.insertOne({"_id" : "9780007488377", "title" : "Lord of the Rings", "author" : "J.R.R. Tolkien"})
{ "acknowledged" : true, "insertedId" : "9780007488377" }
In the Shell, we use the .insert
function to add one or more documents to a collection.
To insert a single row with .insert
> db.collectionName.insert({"key1":value1, "key2":value2, "key3":value3})
or inserting multiple documents at the same time:
> db.collectionName.insert({"key1":value1, "key2":value2}, {"key1" : value1b, "key2" : value2b})
Insert both a Jeep Cherokee (2 doors) and a Nissan Maxima (4 doors) as documents to the car collection in one statement
> db.car.insert([{"make" : "Jeep", "model" : "Cherokee", "numDoors" : "2"},
... {"make" : "Nissan", "model" : "Maxima", "numDoors" : 4}])
BulkWriteResult({
"writeErrors" : [ ],
"writeConcernErrors" : [ ],
"nInserted" : 2,
"nUpserted" : 0,
"nMatched" : 0,
"nModified" : 0,
"nRemoved" : 0,
"upserted" : [ ]
})
The following code would fail
> db.foo.insert([{"_id" : 0}, {"_id" : 1}, {"_id" : 1}, {"_id" : 2}])as the _id is duplicated. The first 2 would insert correctly, but the second 2 would fail.
The insertMany function can also be used for bulk inserts.
> db.blog.insertMany([ … {"title" : "My Blog Post", "content" : "Here's my blog post."}, … {"title" : "New Post", "content" : "Here's the new blog post."}, … {"title" : "Public Post", "content" : "Here's the public post."}, ])
The remove function deletes documents and can be used in several forms:
> db.collectionname.remove()removes ALL documents in the given collection.
The following command removes all documents where the keyname field's value is the value provided.
> db.collectionname.remove{"keyname" : value}for example:
> db.collectionname.remove{"make" : "Honda"}would remove all cars where the make was set to Honda, regardless of all other values. This would be the equivalent of the SQL statement:
DELETE FROM cars WHERE make = 'Honda';that has no filtering by primary key.
The documents cannot be recovered after they are removed.
There are two main functions to update data in a MongoDB: .replace() and .update(). Let us look at replace first.
To replace a document with a new one, the replaceOne() function is used.
To replace the following document (in the People collection):
{ "name" : "Joe" }we could use something like:
> db.people.replaceOne({"name" : "Joe"}, {"name" : "Joe", "friends" : 32, "enemies" : 2})It is possible to change Joe's name too.
The update() function is used to update the value of a key in a document and takes 2 parameters:
The update operation is atomic, meaning that the LILO (last-in, last-out) philosophy is applied. If there are two update requests coming to the server, the one that reaches the server first will be executed, and when it is done, the second one will be executed.
given an existing record : { "_id" : 2, "fname" : "Josh", "lname" : "Phan", "favNum" : 16 } > db.students.update({"_id": 2} , {"fname" : "Josh", "lname": "Phan", "favNum" : 22, "favColor" : "Red"}) WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
NOTE: All the keys and values are included except the _id key. This is important as the following code will cause major problems.
given an existing record : { "_id" : 2, "fname" : "Josh", "lname" : "Phan", "favNum" : 16 } > db.students.update({"_id": 2} , {"favColor" : "Red"}) WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 }) RESULTS IN : { "_id" : 2, "FavColor" : "Red" }
NOTE: All the other Keys and Values were lost using that statement. Be CAREFUL.
To solve this you can use a little JavaScript.
given an existing record : { "_id" : 2, "fname" : "Josh", "lname" : "Phan", "favNum" : 16 } > var id2 = db.students.findOne({"_id" : 2}); > id2.favColor = "Red"; Red > db.students.update({"_id" : 2}, id2); WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
to delete a key and value from a record, you can use the same code as above and add:
> delete id2.favNum;
> db.students.update({"_id" : 2}, id2);
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
This would simply remove the favNum key and value from the document with _id 2.
Assume we have the following documents:
> db.people.find() {"_id" : ObjectId("4b2b9f67a1f631733d917a7b"), "name" : "joe", "age" : 65}, {"_id" : ObjectId("4b2b9f67a1f631733d917a7c"), "name" : "joe", "age" : 20}, {"_id" : ObjectId("4b2b9f67a1f631733d917a7d"), "name" : "joe", "age" : 49}
Let’s say we want to increment the value of age for the second document.
> joe = db.people.findOne({"name" : "joe", "age" : 20}); { "_id" : ObjectId("4b2b9f67a1f631733d917a7c"), "name" : "joe", "age" : 20 } > joe.age++; // makes joe 21 now > db.people.update({"name" : "joe"}, joe); E11001 duplicate key on update
The update function wants to replace the updated document to the first matching {"name" : "joe"}. But when updating there is another document with the same _id which is the second document. The update will fail.
> db.people.update({"_id" : ObjectId("4b2b9f67a1f631733d917a7c")}, joe)
Now, if we want to update specific Key Values within a document we can use modifiers to alter, add or remove key values.
The $inc modifier can be used to increment values by a supplied amount.
> db.websites.findOne({"url":"www.example.com"}); { "_id" : ObjectId("4b253b067525f35f94b60a31"), "url" : "www.example.com", "pageviews" : 52 } > db.analytics.update({"url" : "www.example.com"}, {"$inc" : {"pageviews" : 1}}) { "_id" : ObjectId("4b253b067525f35f94b60a31"), "url" : "www.example.com", "pageviews" : 53 }
The $set modifier allows you to update a specific field without replacing the entire document. If the key provided does not exist, it will create it.
For Example: given the following document:
{
"_id" : ObjectId("4b253b067525f35f94b60a31"),
"url" : "www.example.com",
"pageviews" : 52
}
We can use the .update() function to change a value for a specific existing key.
> db.websites.update({"url" : "www.example.com"}, {"$set" : {"pageviews" : 23}})
{
"_id" : ObjectId("4b253b067525f35f94b60a31"),
"url" : "www.example.com",
"pageviews" : 23
}
We can use the .update() function to add a new key and value to an existing document.
> db.websites.update({"url" : "www.example.com"}, {"$set" : {"homepage" : "/index.html"}})
{
"_id" : ObjectId("4b253b067525f35f94b60a31"),
"url" : "www.example.com",
"pageviews" : 23
"homepage" : "/index.html"
}
The $set operator can be used to update an embedded document.
> db.blog.posts.findOne({"title":"A Blog Post"}); { "_id" : ObjectId("4b253b067525f35f94b60a31"), "title" : "A Blog Post", "content" : "...", "author" : { "name" : "joe", "email" : "joe@example.com" } } > db.blog.posts.update({"author.name" : "joe"}, ... {"$set" :{"author.name" :"joe schmoe"}}) > db.blog.posts.findOne() { "_id" : ObjectId("4b253b067525f35f94b60a31"), "title" : "A Blog Post", "content" : "...", "author" : { "name" : "joe schmoe", "email" : "joe@example.com" } }
The $unset function is used to remove a key from a document.
> db.websites.update({"url":"www.example.com")},{"$unset", "homepage":"index.html"});
{
"_id" : ObjectId("4b253b067525f35f94b60a31"),
"url" : "www.example.com",
"pageviews" : 23
"homepage" : "/index.html"
}
the homepage key and value are deleted.
There are many similarities between queries in NoSQL and those in relational SQL, but there are also many differences. So we will go through the basics one step at a time and for the purposes of examples or demonstrations, we will use a collection of data called FoodTrolley containing the following documents.
Database: HarryPotter
CollectionName: FoodTrolley
[ {"_id":1, "FoodItem":"Chocolate Frogs", "Description":"literally chocolate shaped like a frog that jumps around - comes with a wizard card", "Price":1.2, "Inventory":56}, {"_id":2, "FoodItem":"Bertie Bots Every Flavour Beans", "Description":"Basic jelly beans made to taste like just about anything", "Price":12.5, "Flavours":["Pear","Peppermint","Rocky Road","Salt","Soap","Toothpaste","Vanilla","Watermelon"]}, {"_id":3, "FoodItem":"Pumpkins Pasties", "Description":"Pastries made to taste like pumpkins", "Price":1.8, "Inventory":14}, {"_id":4, "FoodItem":"Cauldron Cakes", "Description":"Cauldron shaped licorice flavoured sweets", "Price":1.3}, {"_id":5, "FoodItem":"Licorice Wands", "Price":0.9, "Colour":"Black"} ]
The find() operation is used to find documents within a collection. This is the equivalent of SELECT in SQL. The general syntax of the find() command is:
> db.collectionname.find({filter conditions},{keys to include in the output results});
where the filter condition(s) would be the equivalent of the WHERE
sql clause and the keys to include would be the list of fields included in the SELECT
clause in SQL. Both the condition and the keys must be in JavaScript object notation with key:value pairs.
To list all documents and all keys in a collection, the filter and keys would be left blank:
> db.collectionname.find(); is the equivalent of: SELECT * FROM tablename
Like in SQL, you can select which key/value pairs are included in the output and which are not.
> db.collectionname.find({},{key1:1, key2:1});
Would show all documents and include the _id key as well as both key1 and key2. We typically use a 0 to specify not to show a key and 1 to show it.
> db.FoodTrolley.find({},{FoodItem:1})
outputs
{ "_id" : 1, "FoodItem" : "Chocolate Frogs" }
{ "_id" : 2, "FoodItem" : "Bertie Bots Every Flavour Beans" }
{ "_id" : 3, "FoodItem" : "Pumpkins Pasties" }
{ "_id" : 4, "FoodItem" : "Cauldron Cakes" }
{ "_id" : 5, "FoodItem" : "Licorice Wands" }
Note that the _id is shown by default, you must specify if you do not want it to show.
The _id key is shown by default. In order to not show the _id, we must specify that it is not to be shown like:
> db.FoodTrolley.find({},{FoodItem:1,_id:0})
outputs
{ "FoodItem" : "Chocolate Frogs" }
{ "FoodItem" : "Bertie Bots Every Flavour Beans" }
{ "FoodItem" : "Pumpkins Pasties" }
{ "FoodItem" : "Cauldron Cakes" }
{ "FoodItem" : "Licorice Wands" }
To choose a specific value to filter by we can add a key:value pair to the filter portion of the command:
> db.collectionname.find({filter conditions});
> db.FoodTrolley.find({"FoodItem":"Chocolate Frogs"})
outputs
{ "_id" : 1,
"FoodItem" : "Chocolate Frogs",
"Description" : "literally chocolate shaped like a frog that jumps around - comes with a wizard card",
"Price" : 1.2,
"Inventory" : 56 }
and combing the filter and key:value choice portions we can do something like:
> db.FoodTrolley.find({"FoodItem":"Chocolate Frogs"}, {FoodItem: 1, inventory: 1})
outputs
{ "_id" : 1, "FoodItem" : "Chocolate Frogs" }
We sometimes need the ability to search for a smaller string within a value and in SQL we would use the LIKE operator with the * or % comparator. In MongoDB, we can use easy regular expressions to do the same thing.
> db.FoodTrolley.find({FoodItem: /Frog/}, {FoodItem: 1})
-- note that the expression is case sensitive (using frog rather than Frog would return no results
outputs
{ "_id" : 1, "FoodItem" : "Chocolate Frogs" }
However, if we investigate further, we will see that this string search is case sensitive!
> db.FoodTrolley.find({FoodItem: /frog/}, {FoodItem: 1})
-- note that the expression is case sensitive (using frog rather than Frog would return no results
will return no results because of the lower case f on frog
To make the search find result regardless of case, we can use the i
regex qualifier
> db.FoodTrolley.find({FoodItem: /frog/i}, {FoodItem: 1}) -- note that the expression is case sensitive (using frog rather than Frog would return no results
Sometimes we need to use different comparators in our filters, such as: >
, <
, >=
, <=
, !=
, etc.
For these comparators, we can use modifiers in our key value pairs. These modifiers are:
Comparator | Modifier |
---|---|
> | $gt |
>= | $gte |
< | $lt |
<= | $lte |
!= | $ne |
== | $eq |
IN | $in |
NOT IN | $nin |
Each modifier is entered as the key and the value will be another nested key/value pair representing the comparison values.
List all food items where the inventory is more than 20!
> db.FoodTrolley.find({Inventory: {$gt: 20}}, {FoodItem: 1})
outputs
{ "_id" : 1, "FoodItem" : "Chocolate Frogs" }
Note that not all the documents contain a key called "Inventory", so only those documents that both include Inventory and the value is greater than 20 was returned.
Find all food items whose inventory is between 12 and 60 exclusively!
db.FoodTrolley.find({Inventory: {$gt: 12}, Inventory: {$lt:60}},{_id:0, Inventory:1, FoodItem:1})
outputs
{ "FoodItem" : "Chocolate Frogs", "Inventory" : 56 } { "FoodItem" : "Pumpkins Pasties", "Inventory" : 14 }
Find all food items whose name contains either the word "frog" or the word "bean".
> db.FoodTrolley.find({FoodItem: {$in: [/Frog/,/Bean/]}}, {FoodItem: 1}) for case sensitive -- or > db.FoodTrolley.find({FoodItem: {$in: [/frog/i,/bean/i]}}, {FoodItem: 1}) for case insensitive
outputs
{ "_id" : 1, "FoodItem" : "Chocolate Frogs" } { "_id" : 2, "FoodItem" : "Bertie Bots Every Flavour Beans" }
Because of the flexibility that NoSQL databases to have different keys in each document and some documents do not contain some keys, we need a way to filter the results by the very existence of a key, and not just by it's values.
We can use an $exists
modifier to limit results by those that contain the specified key.
> db.FoodTrolley.find({"Inventory": {$exists: true}}, {FoodItem: 1})
outputs
{ "_id" : 1, "FoodItem" : "Chocolate Frogs" }
{ "_id" : 3, "FoodItem" : "Pumpkins Pasties" }
and using NULL we can determine if the key does not exist.
> db.FoodTrolley.find({"Inventory": null}, {FoodItem: 1})
outputs
{ "_id" : 2, "FoodItem" : "Bertie Bots Every Flavour Beans" }
{ "_id" : 4, "FoodItem" : "Cauldron Cakes" }
{ "_id" : 5, "FoodItem" : "Licorice Wands" }
Often our comparator will include a list of values that would be acceptable, or not acceptable. We can use the $IN modifier to include the values and $NIN to not include others.
> db.FoodTrolley.find({"Inventory": {"$in": [10,14,23]}}, {FoodItem: 1})
outputs
{ "_id" : 3, "FoodItem" : "Pumpkins Pasties" }
and the opposite
> db.FoodTrolley.find({"Inventory": {"$nin": [10,14,23]}}, {FoodItem: 1})
outputs
{ "_id" : 1, "FoodItem" : "Chocolate Frogs" }
{ "_id" : 2, "FoodItem" : "Bertie Bots Every Flavour Beans" }
{ "_id" : 4, "FoodItem" : "Cauldron Cakes" }
{ "_id" : 5, "FoodItem" : "Licorice Wands" }
Note: That all records that do not include the inventory key is also included along with those that have the key but not with one of the listed values.
Similar to SQL, we also need the ability to filter by multiple criteria at the same time using AND
and OR
boolean operators.
List all food trolly items where the inventory is between 40 and 60, we will need to use an $and boolean operator as well!
> db.FoodTrolley.find( {$and: [{Inventory: {$gt: 12}}, {Inventory: {$lt:60}}, {FoodItem: /Frogs/}]}, {_id:0, Inventory:1, FoodItem:1})
outputs
{ "_id" : 1, "FoodItem" : "Chocolate Frogs", "Inventory" : 56 }
This statement essentially is:
show the fooditem name and inventory key/values from FoodTrolley where (inventory > 12 AND inventory < 60 AND fooditem contains the string "FROGS")
Find all food items that contain either the word "frog" or the word "bean"
> db.FoodTrolley.find( {$or: [{FoodItem: /frog/i}, {FoodItem: /bean/i}]}, {_id:0, Inventory:1, FoodItem:1})
outputs
{ "FoodItem" : "Chocolate Frogs", "Inventory" : 56 } { "FoodItem" : "Bertie Bots Every Flavour Beans" }
In mathematics and SQL we learned that things happen in a specific order to determine an answer. Order of operations still apply within calculations in NoSQL, however items such as AND and OR in order of precedence are not impacted. This means that the {} curly braces will determine the hierarchy of the operators and in which order they execute. These operate as brackets and are executed from the inside out when nested.