MongoDB Indexes: Creating, Finding & Dropping Top Index Types

Indexes provide users with an efficient way of querying data. When querying without indexes, MongoDB has to scan every document in the collection to find the ones that match the query.

In MongoDB, querying without indexes is called a collection scan. A collection scan will:

  • Result in various performance bottlenecks
  • Significantly slow down your application

Fortunately, using indexes fixes both these issues. By limiting the number of documents to be scanned, indexes increase the overall performance of the application.

In this tutorial, I’ll walk you through different types of indexes and show you how to create and manage indexes in MongoDB.

(This article is part of our MongoDB Guide. Use the right-hand menu to navigate.)

What are indexes in MongoDB?

Indexes are special data structures that store a small part of the Collection’s data in a way that can be queried easily.

In simplest terms, indexes store the values of the indexed fields outside the table or collection and keep track of their location on disk. These values are used to order the indexed fields. This ordering helps to perform equality matches and range-based query operations efficiently. In MongoDB, indexes are defined at the collection level, and indexes on any field or subfield of the documents in a collection are supported.

For this tutorial, we’ll use the following data set to demonstrate the indexing functionality of MongoDB.

use students
db.createCollection("studentgrades")
db.studentgrades.insertMany(
    [
        {name: "Barry", subject: "Maths", score: 92},
        {name: "Kent", subject: "Physics", score: 87},
        {name: "Harry", subject: "Maths", score: 99, notes: "Exceptional Performance"},
        {name: "Alex", subject: "Literature", score: 78},
        {name: "Tom", subject: "History", score: 65, notes: "Adequate"}
    ]
)
db.studentgrades.find({},{_id:0})

Result:

Creating indexes

When creating documents in a collection, MongoDB creates a unique index using the _id field. MongoDB refers to this as the Default _id Index. This default index cannot be dropped from the collection.

When querying the test data set, you can see the _id field which will be utilized as the default index:

db.studentgrades.find().pretty()

Result:

Now let’s create an index. To do that, you can use the createIndex method using the following syntax:

db.<collection>.createIndex(<Key and Index Type>, <Options>)

When creating an index, you need to define the field to be indexed and the direction of the key (1 or -1) to indicate ascending or descending order.

Another thing to keep in mind is the index names. By default, MongoDB will generate index names by concatenating the indexed keys with the direction of each key in the index using an underscore as the separator. For example: {name: 1} will be created as name_1.

The best option is to use the name option to define a custom index name when creating an index. Indexes cannot be renamed after creation. (The only way to rename an index is to first drop that index, which we show below, and recreate it using the desired name.)
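For instance, here is a hypothetical sketch of that drop-and-recreate pattern, assuming an auto-generated index named name_1 already exists on the collection:

// "Rename" an index by dropping the auto-named version and recreating it with an explicit name.
db.studentgrades.dropIndex("name_1")
db.studentgrades.createIndex({name: 1}, {name: "name_ascending"})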

Let’s create an index using the name field in the studentgrades collection and name it as student name index.

db.studentgrades.createIndex(
{name: 1},
{name: "student name index"}
)

Result:

Finding indexes

You can find all the available indexes in a MongoDB collection by using the getIndexes method. This will return all the indexes in a specific collection.

db.<collection>.getIndexes()

Let’s view all the indexes in the studentgrades collection using the following command:

db.studentgrades.getIndexes()

Result:

The output contains the default _id index and the user-created index student name index.

Dropping indexes

To delete an index from a collection, use the dropIndex method while specifying the index name to be dropped.

db.<collection>.dropIndex(<Index Name / Field Name>)

Let’s remove the user-created index with the index name student name index, as shown below.

db.studentgrades.dropIndex("student name index")

Result:

You can also drop an index by passing its key specification document instead of a name, which is useful for indexes created without a custom name:

db.studentgrades.dropIndex({name:1})

Result:

The dropIndexes command drops all the indexes in a collection except the default _id index.

db.studentgrades.dropIndexes()

Result:

Common MongoDB index types

MongoDB provides different types of indexes that can be utilized according to user needs. Here are the most common ones:

  • Single field index
  • Compound index
  • Multikey index

Single field index

These user-defined indexes use a single field in a document to create an index in an ascending or descending sort order (1 or -1). In a single field index, the sort order of the index key does not have an impact because MongoDB can traverse the index in either direction.

db.studentgrades.createIndex({name: 1})

Result:

The above index will sort the data in ascending order using the name field. You can use the sort() method to see how the data will be represented in the index.

db.studentgrades.find({},{_id:0}).sort({name:1})

Result:
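If you want to confirm that a query actually uses an index rather than a collection scan, you can inspect the query plan. A minimal sketch, assuming the {name: 1} index created above (the exact output fields vary by MongoDB version):

// The winning plan should show an IXSCAN stage on the name index instead of a COLLSCAN.
db.studentgrades.find({name: "Harry"}).explain("executionStats")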

Compound index

You can use multiple fields in a MongoDB document to create a compound index. This type of index uses the first field for the initial sort and then sorts documents with matching values by each subsequent field.

db.studentgrades.createIndex({subject: 1, score: -1})

In the above compound index, MongoDB will:

  • First sort by the subject field
  • Then, within each subject value, sort by score in descending order

The index would create a data structure similar to the following:

db.studentgrades.find({},{_id:0}).sort({subject:1, score:-1})

Result:

Multikey index

MongoDB supports indexing array fields. When you create an index for a field containing an array, MongoDB will create separate index entries for every element in the array. These multikey indexes enable users to query documents using the elements within the array.

MongoDB automatically creates a multikey index when it encounters an array field; you do not need to explicitly define the multikey type.

Let’s create a new data set containing an array field to demonstrate the creation of a multikey index.

db.createCollection("studentperformance")
db.studentperformance.insertMany(
[
{name: "Barry", school: "ABC Academy", grades: [85, 75, 90, 99] },
{name: "Kent", school: "FX High School", grades: [74, 66, 45, 67]},
{name: "Alex", school: "XYZ High", grades: [80, 78, 71, 89]},
]
)
db.studentperformance.find({},{_id:0}).pretty()

Result:

Now let’s create an index using the grades field.

db.studentperformance.createIndex({grades:1})

Result:

The above code automatically creates a multikey index in MongoDB. When you query documents using the array field (grades), MongoDB first looks up the index using the first element of the array specified in the find() method and then matches the rest of the query against those candidate documents.

For instance, let’s consider the following find query:

db.studentperformance.find({grades: [80, 78, 71, 89]}, {_id: 0})

Initially, MongoDB will use the multikey index for searching documents where the grades array contains the first element (80) in any position. Then, within those selected documents, the documents with all the matching elements will be selected.
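You can also match on a single element of the indexed array. A small sketch using the same collection:

// Matches any document whose grades array contains 90 in any position, using the multikey index.
db.studentperformance.find({grades: 90}, {_id: 0})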

Other MongoDB index types

In addition to the popular Index types mentioned above, MongoDB also offers some special index types for targeted use cases:

  • Geospatial index
  • Text index
  • Hashed index

Geospatial Index

MongoDB provides two types of indexes to increase the efficiency of database queries when dealing with geospatial coordinate data:

  • 2d indexes, which use planar geometry and are intended for the legacy coordinate pairs used in MongoDB 2.2 and earlier.
  • 2dsphere indexes that use spherical geometry.
db.<collection>.createIndex( { <location Field> : "2dsphere" } )
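As a hypothetical example (the places collection and the coordinates below are illustrative, not part of the data set used in this tutorial), a 2dsphere index supports proximity queries on GeoJSON data:

// Index a GeoJSON point field and find documents within 1,000 meters of a given point.
db.places.createIndex({ location: "2dsphere" })
db.places.find({
    location: {
        $near: {
            $geometry: { type: "Point", coordinates: [ -73.97, 40.77 ] },
            $maxDistance: 1000
        }
    }
})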

Text index

The text index type enables you to search the string content in a collection.

db.<collection>.createIndex( { <Index Field>: "text" } )

Hashed index

The MongoDB hashed index type supports hash-based sharding. It indexes the hash of the specified field's value.

db.<collection>.createIndex( { <Index Field> : "hashed" } )

MongoDB index properties

You can enhance the functionality of an index further by utilizing index properties. In this section, you will get to know these commonly used index properties:

  • Sparse index
  • Partial index
  • Unique index

Sparse index

The MongoDB sparse property tells an index to skip any document that does not contain the indexed field, so the resulting index only includes documents in which the field is present.

db.studentgrades.createIndex({notes:1},{sparse: true})

Result:

In the previous studentgrades collection, if you create an index using the notes field, it will index only two documents as the notes field is present only in two documents.

Partial index

The partial index functionality allows users to create indexes that match a certain filter condition. Partial indexes use the partialFilterExpression option to specify the filter condition.

db.studentgrades.createIndex(
{name:1},
{partialFilterExpression: {score: { $gte: 90}}}
)

Result:

The above code will create an index for the name field but will only include documents in which the value of the score field is greater than or equal to 90.

Unique index

The unique property enables users to create a MongoDB index that only includes unique values. This will:

  • Reject any duplicate values in the indexed field
  • Limit the index to documents containing unique values
db.studentgrades.createIndex({name:1},{unique: true})

Result:

The above-created index will limit the indexing to documents with unique values in the name field.

Indexes recap

That concludes this MongoDB indexes tutorial. You learned how to create, find, and drop indexes, use different index types, and create complex indexes. These indexes can further enhance your MongoDB databases by increasing the performance of applications that depend on fast database queries.

Related reading

MongoDB Replication: A Complete Introduction

Replication using replica sets is a robust method to increase the data resilience of your entire MongoDB database.

In this article, I’ll walk you through:

  • The replication concept
  • Main components of replication
  • Configuration requirements (vs clusters)
  • Ways to configure and manage a replica set in a MongoDB environment

Let’s get started.

(This article is part of our MongoDB Guide. Use the right-hand menu to navigate.)

What is MongoDB Replication?

In simple terms, MongoDB replication is the process of maintaining a copy of the same data set on more than one MongoDB server. This is achieved by using a replica set: a group of mongod instances that maintain the same data set.

Replication enables database administrators to provide:

  • Data redundancy
  • High availability of data

Maintaining multiple MongoDB servers with the same data provides distributed access to the data while increasing the fault tolerance of the database by providing backups.

Additionally, replication can also be used as a part of load balancing, where read and write operations can be distributed across all the instances depending on the use case.

How MongoDB replication works

MongoDB handles replication through a Replica Set, which consists of multiple MongoDB nodes that are grouped together as a unit.

A Replica Set requires a minimum of three MongoDB nodes:

  • One of the nodes will be considered the primary node that receives all the write operations.
  • The others are considered secondary nodes. These secondary nodes will replicate the data from the primary node.
(Image: a MongoDB replica set with one primary node and two secondary nodes)

Basic replication methodology

While the primary node is the only instance that accepts write operations, any other node within a replica set can accept read operations. These can be configured through a supported MongoDB client.

In an event where the primary node is unavailable or inoperable, a secondary node will take the primary node’s role to provide continuous availability of data. In such a case, the primary node selection is made through a process called Replica Set Elections, where the most suitable secondary node is selected as the new primary node.

The Heartbeat process

Heartbeat is the process that identifies the current status of a MongoDB node in a replica set. The replica set members send pings to each other every two seconds (hence the name). If a node doesn’t respond within 10 seconds, the other nodes in the replica set mark it as inaccessible.

This functionality is vital for the automatic failover process where the primary node is unreachable and the secondary nodes do not receive a heartbeat from it within the allocated time frame. Then, MongoDB will automatically assign a secondary server to act as the primary server.
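You can observe the member states reported through these heartbeats from the shell. A minimal sketch (output fields vary slightly by MongoDB version):

// Print each member's name, replica set state, and health flag as seen by the current node.
rs.status().members.forEach(function (m) { print(m.name, m.stateStr, m.health) })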

Replica set elections

The elections in replica sets are used to determine which MongoDB node should become the primary node. These elections can occur in the following instances:

  • Loss of connectivity to the primary node (detected by heartbeats)
  • Initializing a replica set
  • Adding a new node to an existing replica set
  • Maintenance of a Replica set using stepDown or rs.reconfig methods

In the process of an election, first, one of the nodes will raise a flag requesting an election, and all the other nodes will vote to elect that node as the primary node. The average time for an election process to complete is 12 seconds, assuming that replica configuration settings are in their default values. A major factor that may affect the time for an election to complete is the network latency, and it can cause delays in getting your replica set back to operation with the new primary node.
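If you need to trigger an election deliberately, for example during planned maintenance as noted in the list above, you can ask the primary to step down. A minimal sketch:

// Ask the current primary to step down and not seek re-election for 120 seconds, forcing an election.
rs.stepDown(120)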

The replica set cannot process any write operations until the election is completed. However, read operations can still be served if read queries are configured to run on secondary nodes. Starting with MongoDB 3.6, compatible drivers can be configured to automatically retry eligible write operations (retryable writes).
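Retryable writes are typically enabled through the driver connection string. A sketch, assuming the three-node replica set configured later in this article:

mongodb://mongodb-node-01:27017,mongodb-node-02:27017,mongodb-node-03:27017/?replicaSet=replicasetMain&retryWrites=true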

MongoDB Replica Set vs MongoDB Cluster

A replica set creates multiple copies of the same data set across the replica set nodes. The basic objective of a replica set is to:

  • Increase data availability
  • Provide a built-in backup solution

Clusters work differently. The MongoDB cluster distributes the data across multiple nodes using a shard key. This process will break down the data into multiple pieces called shards and then copy each shard to a separate node.

The main purpose of a cluster is to support extremely large data sets and high throughput operations by horizontally scaling the workload.

The major difference between a replica set and a cluster is:

  • A replica set copies the data set as a whole.
  • A cluster distributes the workload and stores pieces of data (shards) across multiple servers.

MongoDB allows users to combine these two functionalities by creating a sharded cluster, where each shard is replicated to a secondary server in order to provide high data availability and redundancy.
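As a rough sketch of what that looks like, sharding a collection is done through a mongos router (the supermarket.inventory namespace below is illustrative):

// Enable sharding for a database, then shard one of its collections on a hashed key.
sh.enableSharding("supermarket")
sh.shardCollection("supermarket.inventory", { _id: "hashed" })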

Dealing with replication delay

A major concern when it comes to configuring replication is the replication delay (lag). This refers to the delay in the replication process to a secondary node after an update to the primary node in the replica set.

A certain replication lag while replicating large data sets is normal. Still, the following factors can increase the replication delay, negating the benefits of an up-to-date replication:

  • Network latency. As you are dealing with multiple MongoDB instances residing in different servers during replication, the primary communication method will be the network. If the network is insufficient to cater to the needs of the replication process, there will be delays in replicating data throughout the replica set. Therefore, it is better to always route your traffic in a stable network with sufficient bandwidth.
  • Disk throughput. If the replication nodes use different disk types (e.g., the primary node using SSD while secondary nodes using HDD as disks), there will be a delay in replication since the secondary nodes will process the write queries slower compared to the primary node. This is a common issue in multi-tenant and large-scale deployments.
  • Heavy workloads. Executing heavy and long-running write operations on the primary node will also lead to delays in the replication process. So, it’s best to configure the MongoDB Write Concern correctly so that the replication process will be able to keep up with the workload without affecting the overall performance of the replica set.
  • Background tasks. Another important step is to identify the background tasks such as server updates, cron jobs, and security checkups that might have unexpected effects on the network or disk usage, causing delays in the replication process.
  • Database operations. Some database queries are inherently slow, while others are long-running operations that hold resources. Using a database profiler, you can identify such queries and try to optimize them accordingly.
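To see how far behind the secondaries currently are, you can print the replication lag from the shell. A minimal sketch (older shells expose the same report as rs.printSlaveReplicationInfo()):

// Prints, for each secondary, how many seconds it is behind the primary's oplog.
rs.printSecondaryReplicationInfo()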

Configuring the Replica Set

The above sections have covered all the important theories related to replication. Next, let’s configure a replica set using MongoDB instances installed on three Ubuntu servers.

Setting up the environment

Each Ubuntu server will have its own MongoDB instance with the standard MongoDB port 27017 accessible through the firewall. MongoDB recommends using logical DNS hostnames instead of IP addresses when configuring replica sets in production environments. That is to avoid disruptions in communication within the replica set due to changes in IP address.

In a test environment, you can update the /etc/hosts file to assign hostnames to each server. Add the following hostname entries on every node, then reboot each server to load the new configuration.

/etc/hosts

10.10.10.56 mongodb-node-01
10.10.10.57 mongodb-node-02
10.10.10.58 mongodb-node-03

Starting the MongoDB Instance

Before starting the MongoDB instance, you need to modify the config file in each server to reflect the IP address and indicate the replica set. Let’s make the following modifications in each mongod.conf file.

mongodb-node-01

# network interfaces
net:
  port: 27017
  bindIp: 127.0.0.1,mongodb-node-01

# replica set
replication:
  replSetName: replicasetMain

mongodb-node-02

# network interfaces
net:
  port: 27017
  bindIp: 127.0.0.1,mongodb-node-02

# replica set
replication:
  replSetName: replicasetMain

mongodb-node-03

# network interfaces
net:
  port: 27017
  bindIp: 127.0.0.1,mongodb-node-03

# replica set
replication:
  replSetName: replicasetMain

After the updates are completed, restart the mongod service in each instance to reload the configurations.

sudo systemctl restart mongod

Initializing the Replica Set

You can initialize a replica set using the rs.initiate() method. This method only needs to be executed on a single MongoDB instance in the replica set. Within the initiate method, you specify the replica set name and its members. These details must match the configuration you made in each config file in the previous step.

rs.initiate( {
_id : "replicasetMain",
members: [
{ _id: 0, host: "mongodb-node-01:27017" },
{ _id: 1, host: "mongodb-node-02:27017" },
{ _id: 2, host: "mongodb-node-03:27017" }
]
})

Result:

Using the rs.conf() command, you can view the replica set configuration as shown below.

rs.conf()

Result:

Validate Data Replication

Now that you’ve configured the replica set, the next step is to validate the replication process. To do that, first log in to the primary MongoDB node in the replica set.

Then you need to create a collection with some sample data using the following commands:

use replicatestdata
db.createCollection("replicatestCollection01")
db.replicatestCollection01.insertMany([
{name: "test_record_one", description: "testing replica set", record: 1},
{name: "test_record_two", description: "testing replica set", record: 2},
{name: "test_record_three", description: "testing replica set", record: 3}
])

Result:

Next, log in to a secondary node and check if the data is replicated. An important thing to note here is that, by default, read queries are disabled in secondary nodes. So, you need to enable them using the following command.

db.getMongo().setSecondaryOk()

After that, you can query the data and verify that it was replicated correctly to the secondary node.

show dbs
use replicatestdata
show collections
db.replicatestCollection01.find().sort({record: 1}).pretty()

Result:

The above result indicates that the data of the primary node was successfully replicated to the secondary instances.

Adding a new node to the Replica Set

Using the rs.add() command, you can add a new node to an existing replica set.

Before adding a new node, you need to configure it. To do that, modify the mongod.conf file to indicate the replica set and restart the mongod service.

mongodb-node-04

# network interfaces
net:
  port: 27017
  bindIp: 127.0.0.1,mongodb-node-04

# replica set
replication:
  replSetName: replicasetMain

Then, go to the primary node of the replica set and run the rs.add() command with the parameters listed below.

  • host refers to the IP address or the hostname of the new node.
  • priority indicates the relative eligibility of the new node to become the primary node. (priority 0 means that the node cannot become the primary node under any circumstance.)
  • votes indicates whether the node is capable of voting in elections to select the primary node.
rs.add( { host: "mongodb-node-04:27017", priority: 0, votes: 1 } )

Result:

The above command will add a new node to the replica set. You can verify if the new node has been added by using the rs.status() command, which will display the details of the new node.

rs.status()

Result:

Finally, the data in the other nodes will be automatically replicated to the new node.

Removing a node from the Replica Set

The rs.remove() command can be used to remove a node from the replica set. You need to shut down the server instance before attempting to remove a node. When removing, you can specify which node should be removed using the name of that node.

rs.remove("mongodb-node-04:27017")

Result:

This result indicates that the node was successfully removed from the replica set.

Related reading

MongoDB Role-Based Access Control (RBAC) Explained

MongoDB access control enables database administrators to secure MongoDB instances by enforcing user authentication. MongoDB supports multiple authentication methods and grants access through role-based authorization.

Roles are the foundational building blocks of MongoDB access control, providing user isolation for a greater degree of security and manageability.

In this article, we’ll take a look at the most common roles in MongoDB. Then, we’ll explore role management by illustrating many common role-related functions.

(This article is part of our MongoDB Guide. Use the right-hand menu to navigate.)

How MongoDB RBAC works

A user can be assigned one or more roles, and the scope of user access to the database system is determined by those assigned roles. Users have no access to the system outside the designated roles.

Importantly, MongoDB access control is not enabled by default; you have to enable it through the security.authorization setting.
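For example, a minimal mongod.conf fragment that turns access control on (restart the mongod service afterwards for it to take effect):

security:
  authorization: enabled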

When that setting is enabled, users need to authorize themselves before interacting with the database. User privileges can include a specific resource (database, collection, cluster) and actions permitted on that resource.

A role grants permission to perform particular actions on a specific resource. A single user account can consist of multiple roles. Roles can be assigned:

  • At the time of user creation
  • When updating the roles of existing users

There are two types of Roles in MongoDB:

  • Built-In Roles. MongoDB provides built-in roles to offer a set of privileges that are commonly needed in a database system.
  • User-Defined Roles. If built-in roles do not provide all the expected privileges, database administrators can define custom roles using the createRole method. Those roles are called User-Defined roles.

Next, let’s look at the variety of roles in more detail.

(Image: MongoDB built-in role categories)

Built-in Roles

In this section, we’ll go through the most common built-in roles of MongoDB.

Database user roles

Database user roles are normal user roles that are useful in regular database interactions.

Role Description
read Read all non-system collections and the system.js collection
readWrite Both Read and Write functionality on non-system collections and the system.js collection

Database administration roles

These are roles that are used to carry out administrative operations on databases.

Role Description
dbAdmin Perform administrative tasks such as indexing and gathering statistics, but cannot manage users or roles
userAdmin Provides the ability to create and modify roles and users of a specific database
dbOwner This is the owner of the database who can perform any action. It is equal to combining all the roles mentioned above: readWrite, dbAdmin, and userAdmin roles

Cluster admin roles

These roles enable users to interact and administrate MongoDB clusters.

Role Description
clusterManager Enables management and monitoring functionality on the cluster. Provides access to config and local databases used in sharding and replication
clusterMonitor Provide read-only access to MongoDB monitoring tools such as Cloud Manager or Ops Manager monitoring agent
hostManager Provides the ability to monitor and manage individual servers
clusterAdmin This role includes the highest number of cluster administrative privileges allowing a user to do virtually anything. This functionality is equal to the combination of clusterManager, clusterMonitor, hostManager roles, and dropDatabase action.

Backup & restoration roles

These are the roles required for backing up and restoring data in a MongoDB instance. They can only be assigned on the admin database.

Role Description
backup Provides the necessary privileges to back up data. This role is required for the MongoDB Cloud Manager and Ops Manager backup agents, and the mongodump utility.
restore Provides the privileges to carry out restoration functions

All database roles

These are database roles that provide privileges to interact with all databases, excluding local and config databases.

Role Description
readAnyDatabase Read any database
readWriteAnyDatabase Provides read and write privileges to all databases
userAdminAnyDatabase Create and Modify users and roles across all databases
dbAdminAnyDatabase Perform database administrative functions on all databases

Superuser roles

MongoDB can provide either direct or indirect system-wide superuser access. The following roles grant superuser privileges scoped to a specified database or databases.

  • dbOwner
  • userAdmin
  • userAdminAnyDatabase

The true superuser role is the root role, which provides systemwide privileges for all functions and resources.
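As an illustration, creating such a systemwide superuser typically looks like the sketch below; the username and password here are placeholders.

use admin
db.createUser(
{
user: "rootadmin",
pwd: "SuperSecret123",
roles: [ { role: "root", db: "admin" } ]
}
)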

For a detailed explanation of all the user roles and privileges, please refer to the official MongoDB documentation.

User-Defined Roles

In situations where the built-in roles cannot provide the necessary privileges, or where you need to restrict the scope and actions of a user further, you can define custom user-defined roles.

MongoDB provides role management methods to create and manage user-defined roles. The most commonly used methods for user-defined role creation are shown in the following table. (A complete list of methods can be found in the MongoDB documentation.)

Method Description
db.createRole() Create a role and its privileges
db.updateRole() Update the user-defined role
db.dropRole() Delete a user-defined role
db.grantPrivilegesToRole() Assigns new privileges to a role
db.revokePrivilegesFromRole() Removes privileges from a role

MongoDB role management

In this section, let’s discuss how to create and modify roles within a MongoDB instance.

Roles are defined using the following syntax:

roles: [
{
role: "<Role>", db: "<Database>"
}
]

Assigning user roles at user creation

First, create a user with read and write access to a specific database (supermarket) using the createUser method with the roles parameter.

db.createUser(
{
user: "harry",
pwd: "test123",
roles: [
{
role: "readWrite",
db: "supermarket"
}
]
}
)

Result:

Then, create an administrative user for the supermarket database with roles granting read access to all other databases and the backup and restore functions.

db.createUser(
{
user: "harryadmin",
pwd: "admin12345",
roles: [
{
role: "dbOwner", db: "supermarket"
},
{
role: "readAnyDatabase", db: "admin"
},
{
role: "backup", db: "admin"
},
{
role: "restore", db: "admin"
}
]
}
)

Result:

As shown in the above code-block, you can create users with multiple roles providing access to a wide range of functionality within the database.

Retrieving role information

Using the getRole method, users can obtain information about a specific role. The below code shows how to get the details of the readWriteAnyDatabase role from the admin database.

db.getRole("readWriteAnyDatabase")

Result:

To obtain the privileges associated with that role, set the showPrivileges option to true. This option can also be used with the getUser method.

db.getRole("readWriteAnyDatabase", { showPrivileges: true})

Result:

Identifying assigned user roles

The getUser method enables you to identify the roles assigned to a specific user by using the following syntax:

db.getUser("<Username>")

In the below example, you can retrieve the user details of the users, “harry” and “harryadmin” with their roles using the getUser method.

db.getUser("harry")
db.getUser("harryadmin")

Result:

Granting & revoking user roles

Using the grantRolesToUser and revokeRolesFromUser methods, you can modify the roles assigned to existing users. These methods use the following syntax:

db.<grantRolesToUser | revokeRolesFromUser> (
"<Username>",
[
{ role: "<Role>", db: "<Database>" }
]
)

The below example shows how to grant user roles. You will add a new role to the user “harry”, providing the necessary privileges to act as the database admin (dbAdmin) of the “vehicles” database in order to gather statistics.

db.grantRolesToUser(
"harry",
[
{ role: "dbAdmin", db: "vehicles" }
]
)

Result:

Now you have added a new role to the user, “harry”. So, let’s remove that newly granted role using the revokeRolesFromUser method.

db.revokeRolesFromUser(
"harry",
[
{ role: "dbAdmin", db: "vehicles" }
]
)

Result:

Creating user-defined roles

Using the createRole method, you can create a new role according to your needs. The basic structure of the createRole method is shown below.

Note: Using the roles parameter within the createRole method allows you to inherit the privileges of other roles within the user-defined role. If no other privileges are required, you need to define an empty roles parameter when creating a user-defined role.

db.createRole(
{
role: "<RoleName>",
privileges: [
{
resource: { db: "<Database>", collection: "<Collection>"},
actions: [ "<Actions>" ]
}
],
roles: [
{ role: "<Role>", db: "<Database>" }
]
}
)

To associate a user-defined role to all databases or collections, you can specify the resources with empty double quotes, as shown below.

resource: { db: "", collection: ""}

In the next code block, you will create a role that limits a user’s access to a specific database collection (inventory collection of supermarket database). It limits the user actions to find and update commands without inheriting any other privileges.

db.createRole(
{
role: "inventoryeditor",
privileges: [
{
resource: { db: "supermarket", collection: "inventory"},
actions: [ "find", "update" ]
}
],
roles: [ ]
}
)

Result:

You can identify user-defined roles using the isBuiltin field. The inventoryeditor role has false for that field, indicating that it is not a built-in role.

Now you know how to create a standalone user-defined role. Next, you will create a user-defined role that inherits some privileges from another built-in role.

In the below code block, you will create an inventorymanager role with all the CRUD privileges and inherit the privileges from the userAdmin role.

db.createRole(
{
role: "inventorymanager",
privileges: [
{
resource: { db: "supermarket", collection: "inventory"},
actions: [ "find", "update", "insert", "remove" ]
}
],
roles: [
{ role: "userAdmin", db: "supermarket" }
]
}
)

Result:

If you check the inventorymanager role using the getRole method, it will display the details of the role, including the inherited permissions.

Assigning user-defined roles to users

You can assign user-defined roles to a new user or update the roles of an existing user in the same way you do it with a built-in role. Let’s create a new user with the inventorymanager role assigned and update an existing user with the inventoryeditor role in the following examples.

Creating a new user:

db.createUser(
{
user: "managerjerry",
pwd: "manager123",
roles: [
{
role: "inventorymanager",
db: "admin"
}
]
}
)

Result:

Granting new roles:

db.grantRolesToUser(
"repairmanager",
[
{ role: "inventoryeditor", db: "admin" }
]
)

Result:

Updating & deleting user-defined roles

You can update the user-defined roles using the updateRole method and delete the roles using the dropRole method.

In the below example, let’s update the inventoryeditor role with an additional privilege to insert documents into the specified collection.

db.updateRole(
"inventoryeditor",
{
privileges: [
{
resource: { db: "supermarket", collection: "inventory"},
actions: [ "find", "update", "insert" ]
}
],
roles: [ ]
}
)

Result:

The most important thing to keep in mind when updating roles is that it will completely replace old values in the privileges and roles arrays. Therefore, you need to provide the complete arrays with the modifications when updating a user-defined role.

The dropRole method has a single functionality to remove a user-defined role. You can remove the inventoryeditor role from the database, as shown below.

db.dropRole("inventoryeditor")

Result:

The above screenshot indicates that the role deletion was successful, and if we try to retrieve the deleted role, it will return a null value. When a role is deleted, it will affect all the users associated with the deleted role, revoking all the privileges granted by that user-defined role.

Earlier, you assigned the inventoryeditor role to the user repairmanager. Now, if you check that user details, you can notice that the “inventoryeditor” role is no longer assigned to the “repairmanager” user.

db.getUser("repairmanager")

Result:

If a user has only a single user-defined role assigned and that role is deleted, the user is left with an empty roles array, effectively revoking all of their privileges.

You can check this scenario by deleting the inventorymanager role and getting the managerjerry user details. This will result in an empty roles array for the user managerjerry, demonstrated below by comparing the user details before and after the role deletion.

db.getUser("managerjerry")
db.dropRole("inventorymanager")
db.getUser("managerjerry")

Result:

Thus, before deleting any user-defined role, it is paramount to check whether the role is assigned to any users and how removing it will affect their functional scope.

As a final remark, the dropRole method can only delete user-defined roles. If you try to delete a built-in role using that method, it will result in an error, as shown below.

That concludes this comprehensive look at MongoDB roles and role management.

Related reading

Using mongorestore for Restoring MongoDB Backups

An efficient and reliable data restoration method is an essential part of any backup and restore process. Consider the differences:

  • A properly configured restoration method means users can successfully restore the data to the previous state.
  • A poor restoration method makes the whole backup process ineffective, by preventing users from accessing and restoring the backed-up data.

The mongorestore command is the sister command of the mongodump command. You can restore the dumps (backups) created by the mongodump command into a MongoDB instance using the mongorestore command.

In this article, you will learn how to utilize the mongorestore command to restore database backups effectively.

(This article is part of our MongoDB Guide. Use the right-hand menu to navigate.)

What is mongorestore?

mongorestore is a simple utility that is used to restore backups. It can load data from either:

  • A database dump file created by the mongodump command
  • The standard input to a mongod or mongos instance

Starting with MongoDB 4.4, the mongorestore utility is not included in the base MongoDB server installation package. Instead, it’s distributed as a separate package within the MongoDB Database Tools package. This allows the utility to have a separate versioning scheme starting with 100.0.0.

The mongorestore utility offers support for MongoDB versions 4.4, 4.2, 4.0, and 3.6. It may also work with earlier versions of MongoDB, but compatibility is not guaranteed.

Additionally, mongorestore supports a multitude of platforms and operating systems ranging from x86 to s390x; you can see the full compatibility list in the official documentation.

Mongorestore behavior

Here is a list of things you need to know about the expected behaviors of the mongorestore utility.

  • The mongorestore utility enables users to restore data to an existing database or create a new database. When restoring data into an existing database, mongorestore will only use insert commands and does not perform any kind of updates. Because of that, existing documents with a matching value for the _id field of the documents in the backup will not be overwritten by the restoration process.

This will lead to a duplicate key error during the restoration process, as shown here. (One way to avoid such conflicts is shown after this list.)

  • Although mongodump does not back up index data, it records the index definitions, and the mongorestore command will recreate those indexes after restoring the documents.
  • The best practice when backing up and restoring a database is to use the corresponding versions of both mongodump and mongorestore. If a backup is created using a specific version of the mongodump utility, it is advisable to use its corresponding version of the mongorestore utility to perform the restore operation.
  • Mongorestore will not restore the “system.profile” collection data.
  • The mongorestore utility is fully compliant with FIPS (Federal Information Processing Standard) connections to perform restore operations.
  • When restoring data to a MongoDB instance with access control, you need to be aware of the following scenarios:
    • If the restoring data set does not include the “system.profile“ collection and you run the mongorestore command without the --oplogReplay option, the “restore” role will provide the necessary permissions to carry out the restoration process.
    • When restoring backups that include the “system.profile” collection, even though mongorestore would not restore the said collection, it will try to create a fresh “system.profile” collection. In that case, you can use the “dbAdmin” and “dbAdminAnyDatabase” roles to provide the necessary permissions.
    • To run the mongorestore command with the --oplogReplay option, the user needs a custom user-defined role with anyAction on anyResource permissions.
  • You can’t use the mongorestore utility in a backup strategy when backing up and restoring sharded clusters with sharded transactions in progress. This change was introduced in MongoDB 4.2. When dealing with sharded clusters, the recommended solutions are MongoDB Cloud Manager or Ops Manager, as they can maintain the atomicity of the transactions across shards.
  • The mongorestore command should be executed from the system command shell (not the mongo shell) because it is a separate database utility.
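Regarding the insert-only behavior described in the first point above: if you want restored collections to replace existing ones instead of skipping conflicting documents, mongorestore offers a --drop flag. A sketch (destructive, since each target collection is dropped before it is restored):

mongorestore --drop ./dump/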

Using MongoDB mongorestore

In this section, you will find out the basic usage of the mongorestore utility in a standalone MongoDB instance.

Basic mongorestore syntax

mongorestore <options> <connection-string> <directory or file to restore>

The basic way to restore a database is to run the mongorestore command with just the backup directory (dump directory) and no other options. This default is suitable for databases located on localhost (127.0.0.1) using port 27017. The restore process will create new databases and collections as needed and will also log its progress.

mongorestore ./dump/

Result:

In the above example, you can see how to successfully restore the “vehicles” database with all its collections and documents. This will create a new database named vehicles in the MongoDB instance containing all the collections and documents of the backup. You can verify the restoration process by logging into the MongoDB instance.

use vehicles
show collections
db.vehicleinformation.find()

Result:

Restoring data into a remote MongoDB instance

In order to restore data into a remote MongoDB instance, you need to establish the connection. The connection to a database can be specified using either:

  • The URI connection string
  • The host option
  • The host and port option

Connecting using the URI option:

mongorestore [additional options] --uri="mongodb://<host URL/IP>:<Port>" [restore directory/file]

Connecting using the host option:

mongorestore [additional options] --host="<host URL/IP>:<Port>"  [restore directory/file]

Connecting using host and port options:

mongorestore [additional options] --host="<host URL/IP>" --port=<Port>  [restore directory/file]

The example below shows how to restore a backup to the remote MongoDB instance. The verbose option will provide a detailed breakdown of the restoration process.

mongorestore --verbose --host="10.10.10.59" --port=27017 ./dump/

Result:

Restoring a secure MongoDB instance

When connecting to an access-controlled MongoDB instance, you need to provide:

  • Username
  • Password
  • Authentication database options

Additionally, mongorestore supports key-based authentication. You must ensure that the authenticated user has the permissions/roles required to carry out the restoration process.

Authentication syntax:

mongorestore [additional options] --authenticationDatabase=<Database> -u=<Username> -p=<Password> [restore directory/file]

The following restoration command shows how to connect to a remote MongoDB server using the username and password for authentication.

mongorestore --host=10.10.10.59 --port=27017 --authenticationDatabase="admin" -u="barryadmin" -p="testpassword" ./dump/

Result:

Selecting Databases and Collections

Using the --nsInclude option, users can specify which database or collection needs to be restored. With --nsInclude, you provide a namespace pattern (e.g., "vehicles.*", "vehicles.vehicleInformation") to define which database or collection should be included.

To specify multiple namespaces, you can repeat the --nsInclude option multiple times in a single command. The --nsInclude option also supports wildcards in the specified namespace.

The --db and --collection options are deprecated and will result in the following error.

To exclude a database or a collection, you can use the --nsExclude option.

Selecting a Database/Collection:

mongorestore [additional options] --nsInclude=<namespace> (${DATABASE}.${COLLECTION}) [restore directory/file]

Excluding a Database/Collection:

mongorestore [additional options] --nsExclude=<namespace> (${DATABASE}.${COLLECTION}) [restore directory/file]

In the following example, you will see how to restore the complete “persons” database. You can include the whole database by specifying the “persons” namespace with the asterisk as a wildcard pattern. This will restore all the data within the database.

mongorestore --nsInclude=persons.* --host=10.10.10.59 --port=27017 --authenticationDatabase="admin" -u="barryadmin" -p="testpassword" ./dump/

Result:
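As mentioned earlier, --nsInclude can be repeated to restore several namespaces in a single run. A sketch reusing the same host and credentials:

mongorestore --nsInclude=persons.* --nsInclude=vehicles.vehicleinformation --host=10.10.10.59 --port=27017 --authenticationDatabase="admin" -u="barryadmin" -p="testpassword" ./dump/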

Restore data from an Archive File

The mongorestore utility supports restorations from an archive file. The --archive option can be used to select the archive file, and the --nsInclude and --nsExclude options can be used in conjunction with the archive option.

mongorestore [additional options] --archive=<file>

The below example illustrates how to define an archive file when restoring data. The –nsInclude option is used to specify which collection is to be restored to the database from the archive file.

mongorestore -v --nsInclude=vehicles.vehicleinformation --host=10.10.10.59 --port=27017 --authenticationDatabase="admin" -u="barryadmin" -p="testpassword" --archive=db.archive

Result:

Restoring data from a Compressed File

The mongodump utility uses the --gzip option to compress the individual JSON and BSON files. These compressed backups can also be used to restore the database. The compressed file can also be filtered using the --nsInclude and --nsExclude options.

mongorestore --gzip [additional options] [restore directory/file]

You can restore a compressed MongoDB backup using the following commands. The compressed backup is stored in the “backupzip” directory.

mongorestore --gzip -v --nsInclude=vehicles.vehicleinformation --host=10.10.10.59 --port=27017 --authenticationDatabase="admin" -u="barryadmin" -p="testpassword" ./backupzip/

Result:

The same process can be applied to a compressed archive file. The below example shows how to restore data from a compressed archive file.

mongorestore --gzip -v --nsInclude=vehicles.vehicleinformation --host=10.10.10.59 --port=27017 --authenticationDatabase="admin" -u="barryadmin" -p="testpassword" --archive=db.archive

Result:

Restoring data from Standard Input

The mongorestore command enables users to read data from standard input and use that data in the restoration process. You can read the data by providing the --archive option without a filename.

mongodump [additional options] --archive | mongorestore [additional options] --archive

The following example shows how to create a backup from a secure MongoDB database using mongodump and pass it as standard input to the mongorestore command to be restored in an insecure remote MongoDB instance.

mongodump --host=10.10.10.59 --port=27017 --authenticationDatabase="admin" -u="barryadmin" -p="testpassword" --db=vehicles --archive | mongorestore --host=10.10.10.58 --port=27018 --archive

Result:

You can verify the restoration process by checking the databases on the remote server. This can be done by executing JavaScript code with the --eval flag. Using the “listDatabases” admin command, you can list all the databases within the remote MongoDB instance.

mongo --host=10.10.10.58 --port=27018 --quiet --eval 'printjson(db.adminCommand( { listDatabases: 1 } ))'

Result:

mongorestore & mongodump

This tutorial offers in-depth knowledge about the MongoDB mongorestore utility and how it can be used to restore backups created by mongodump. The mongorestore utility offers a convenient and efficient way of restoring database backups.

The combination of mongodump with mongorestore provides small scale database administrators with a complete backup strategy.

Related reading

23 Common MongoDB Operators & How To Use Them

In this article, we will take a look at the most commonly used query operators. We’ll explain what they do, then share examples so you can see how they work.

(This article is part of our MongoDB Guide. Use the right-hand menu to navigate.)

What are MongoDB operators?

MongoDB offers different types of operators that can be used to interact with the database. Operators are special symbols or keywords that inform a compiler or an interpreter to carry out mathematical or logical operations.

The query operators enhance the functionality of MongoDB by allowing developers to create complex queries to interact with data sets that match their applications.

MongoDB offers the following query operator types:

  • Comparison
  • Logical
  • Element
  • Evaluation
  • Geospatial
  • Array
  • Bitwise
  • Comments

MongoDB operators can be used with any supported MongoDB command.

Now, let’s look at commonly used operators. (We won’t touch on them all; there are too many.) We’ll use the following dataset with the find() function to demonstrate each operator’s functionality.

  • Database: supermarket
  • Collections: employees, inventory, payments, promo
use supermarket
db.employees.find()
db.inventory.find()
db.payments.find()
db.promo.find()

Dataset:

Comparison Operators

MongoDB comparison operators can be used to compare values in a document. The following table contains the common comparison operators.

Operator Description
$eq Matches values that are equal to the given value.
$gt Matches if values are greater than the given value.
$lt Matches if values are less than the given value.
$gte Matches if values are greater or equal to the given value.
$lte Matches if values are less or equal to the given value.
$in Matches any of the values in an array.
$ne Matches values that are not equal to the given value.
$nin Matches none of the values specified in an array.

$eq Operator

In this example, we retrieve the document with the exact _id value “LS0009100”.

db.inventory.find({"_id": { $eq: "LS0009100"}}).pretty()

Result:

$gt and $lt Operators

In this example, we retrieve the documents where the `quantity` is greater than 5000.

db.inventory.find({"quantity": { $gt: 5000}}).pretty()

Result:

Let’s find the documents with the ‘quantity’ less than 5000.

db.inventory.find({"quantity": { $lt: 5000}}).pretty()

Result:

$gte and $lte Operators

Find documents with ‘quantity’ greater than or equal to 12000.

db.inventory.find({"quantity": { $gte: 12000}}).pretty()

Result:

The following query returns documents where the quantity is less than or equal to 1000.

db.inventory.find({"quantity": { $lte: 1000}}).pretty()

Result:

$in and $nin Operators

The following query returns documents where the price field contains the given values.

db.inventory.find({"price": { $in: [3, 6]}}).pretty()

Result:

If you want to find documents where the price fields do not contain the given values, use the following query.

db.inventory.find({"price": { $nin: [5.23, 3, 6, 3.59, 4.95]}}).pretty()

Result:

$ne Operator

Find documents where the value of the price field is not equal to 5.23 in the inventory collection.

db.inventory.find({"price": { $ne: 5.23}})

Result:

Logical Operators

MongoDB logical operators can be used to filter data based on given conditions. These operators provide a way to combine multiple conditions. Each operator equates the given condition to a true or false value.

Here are the MongoDB logical operators:

Operator Description
$and Joins two or more queries with a logical AND and returns the documents that match all the conditions.
$or Join two or more queries with a logical OR and return the documents that match either query.
$nor The opposite of the OR operator. The logical NOR operator will join two or more queries and return documents that do not match the given query conditions.
$not Returns the documents that do not match the given query expression.

$and Operator

Find documents that match both the following conditions

  • job_role is equal to “Store Associate”
  • emp_age is between 20 and 30
db.employees.find({ $and: [{"job_role": "Store Associate"}, {"emp_age": {$gte: 20, $lte: 30}}]}).pretty()

Result:

$or and $nor Operators

Find documents that match either of the following conditions.

  • job_role is equal to “Senior Cashier” or “Store Manager”
db.employees.find({ $or: [{"job_role": "Senior Cashier"}, {"job_role": "Store Manager"}]}).pretty()

Result:

Find documents that do not match either of the following conditions.

  • job_role is equal to “Senior Cashier” or “Store Manager”
db.employees.find({ $nor: [{"job_role": "Senior Cashier"}, {"job_role": "Store Manager"}]}).pretty()

Result:

$not Operator

Find documents where they do not match the given condition.

  • emp_age is not greater than or equal to 40
db.employees.find({ "emp_age": { $not: { $gte: 40}}})

Result:

Element Operators

The element query operators are used to identify documents using the fields of the document. The table given below lists the current element operators.

Operator Description
$exists Matches documents that have the specified field.
$type Matches documents according to the specified field type. These field types are specified BSON types and can be defined either by type number or alias.

$exists Operator

Find documents where the emp_age field exists and is greater than or equal to 30.

db.employees.find({ "emp_age": { $exists: true, $gte: 30}}).pretty()

Result:

Find documents with an address field. (As the current dataset does not contain an address field, the query returns no documents.)

db.employees.find({ "address": { $exists: true}}).pretty()

Result:

$type Operator

The following query returns documents if the emp_age field is a double type. If we specify a different data type, no documents will be returned even though the field exists as it does not correspond to the correct field type.

db.employees.find({ "emp_age": { $type: "double"}})

Result:

db.employees.find({ "emp_age": { $type: "bool"}})

Result:

Evaluation Operators

The MongoDB evaluation operators can evaluate the overall data structure or an individual field in a document. We are only looking at the basic functionality of these operators, as each of them could merit an advanced discussion of its own. Here is a list of common evaluation operators in MongoDB.

Operator Description
$jsonSchema Validate the document according to the given JSON schema.
$mod Matches documents where the value of a field divided by a specified divisor has the specified remainder.
$regex Select documents that match the given regular expression.
$text Perform a text search on the indicated field. The search can only be performed if the field is indexed with a text index.
$where Matches documents that satisfy a JavaScript expression.

$jsonSchema Operator

Find documents that match the following JSON schema in the promo collection.

The schema is assigned to a shell variable (promoschema) using let so the query stays readable. In the JSON schema, we have specified the minimum value for the "period" field as 7, which will filter out any document with a lower value.

let promoschema = {
    bsonType: "object",
    required: [ "name", "period", "daily_sales" ],
    properties: {
        "name": {
            bsonType: "string",
            description: "promotion name"
        },
        "period": {
            bsonType: "double",
            description: "promotion period",
            minimum: 7,
            maximum: 30
        },
        "daily_sales": {
            bsonType: "array"
        }
    }
}
db.promo.find({ $jsonSchema: promoschema }).pretty()

Result:

$mod Operator

Find documents in the inventory collection where dividing the quantity field by 3000 leaves a remainder of 1000.

Note that the document "Milk Non-Fat – 1lt" is included in the output because its quantity is 1000: dividing 1000 by 3000 gives a quotient of 0, leaving a remainder of 1000.

db.inventory.find({"quantity": {$mod: [3000, 1000]}}).pretty()

Result:

$regex Operator

Find documents that contain the word “Packed” in the name field in the inventory collection.

db.inventory.find({"name": {$regex: '.Packed.'}}).pretty()

Result:
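
Regular expression matching is case-sensitive by default. If the casing of the stored values is uncertain, the $options operator can be combined with $regex to make the match case-insensitive; here is a small sketch against the same inventory collection:

db.inventory.find({"name": {$regex: 'packed', $options: 'i'}}).pretty()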

$text Operator

Find documents by running a text search for "Non-Fat" on the name field. If the field is not indexed, you must create a text index before searching.

db.inventory.createIndex({ "name": "text"})

Result:

db.inventory.find({ $text: { $search: "Non-Fat"}}).pretty()

Result:
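
When a text search matches several documents, the relevance score computed by the text index can be projected with $meta and used for sorting. The following is a minimal sketch against the same collection and index:

db.inventory.find(
    { $text: { $search: "Non-Fat" } },
    { score: { $meta: "textScore" } }
).sort({ score: { $meta: "textScore" } }).pretty()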

$where Operator

Find documents in the "payments" collection where the _id field is a string whose MD5 hash equals the given value, evaluated with a JavaScript function.

db.payments.find({ $where: function() { var value =  isString(this._id) && hex_md5(this._id) == '57fee1331906c3a8f0fa583d37ebbea9'; return value; }}).pretty()

Result:

Array Operators

MongoDB array operators are designed to query documents with arrays. Here are the array operators provided by MongoDB.

Operator Description
$all Matches arrays that contain all the specified values in the query condition.
$size Matches the documents if the array size is equal to the specified size in a query.
$elemMatch Matches documents that match specified $elemMatch conditions within each array element.

$all Operator

Find documents where the category array field contains “healthy” and “organic” values.

db.inventory.find({ "category": { $all: ["healthy", "organic"]}}).pretty()

Result:

$size Operator

Find documents where the category array field has two elements.

db.inventory.find({ "category": { $size: 2}}).pretty()

Result:

$elemMatch Operator

Find documents where at least one element in the "daily_sales" array is greater than 100 and less than 200.

db.promo.find({ "daily_sales": { $elemMatch: {$gt: 100, $lt: 200}}}).pretty()

Result:

Comment Operator

The MongoDB $comment query operator attaches a comment to any expression that takes a query predicate. Adding comments to queries makes it easier for database administrators to trace and interpret operations in the MongoDB logs.

$comment Operator

Find documents in the promo collection where the period is equal to 7, while adding a comment to the find operation.

db.promo.find({ "period": { $eq: 7}, $comment: "Find Weeklong Promos"}).pretty()

Result:

Adding comments lets users easily identify commands in MongoDB logs. The above operation will be logged as follows.

db.adminCommand( { getLog:'global'} ).log.forEach(x => {print(x)})

Result:

Conclusion

In this article, we have only scratched the surface of MongoDB operators. We can further extend the overall functionality of the database using projection and aggregations.

Related reading

]]>
How To Use mongodump for MongoDB Backups https://www.bmc.com/blogs/mongodb-mongodump/ Tue, 12 Jan 2021 14:04:01 +0000 https://www.bmc.com/blogs/?p=19924 Maintaining backups is vital for every organization. Data backups act as a safety measure where data can be recovered or restored in case of an emergency. Typically, you create database backups by replicating the database, using either: Built-in tools Specialized external backup services MongoDB offers multiple inbuilt backup options depending on the MongoDB deployment method […]]]>

Maintaining backups is vital for every organization. Data backups act as a safety measure where data can be recovered or restored in case of an emergency. Typically, you create database backups by replicating the database, using either:

  • Built-in tools
  • Specialized external backup services

MongoDB offers multiple inbuilt backup options depending on the MongoDB deployment method you use. We’ll look briefly at the options, but then we’ll show you how to utilize one particular option—MongoDB mongodump—for the backup process.

(This article is part of our MongoDB Guide. Use the right-hand menu to navigate.)

Built-in backups in MongoDB

Here are the several options you have for backing up your data in MongoDB:

mongodb backup

MongoDB Atlas Backups

If MongoDB is hosted in the MongoDB Atlas cloud database service, Atlas provides automated continuous incremental backups.

Additionally, Atlas can be used to create cloud provider snapshots, where local database snapshots are created using the underlying cloud provider's snapshot functionality.

MongoDB Cloud Manager or Ops Manager

Cloud Manager is a hosted backup, monitoring, and automation service for MongoDB. Cloud Manager enables easy backup and restore functionality while providing offsite backups.

Ops Manager provides the same functionality as Cloud Manager but it can be deployed as an on-premise solution.

MongoDB mongodump

Mongodump is a simple MongoDB backup utility that creates high fidelity BSON files from an underlying database. These files can be restored using the mongorestore utility.

Mongodump is an ideal backup solution for small MongoDB instances due to its ease of use and portability.

File system backups

In this method, you merely keep copies of the underlying data files of a MongoDB installation. We can utilize file system snapshots if the file system supports them.

Another way is to use a tool like rsync where we can directly copy the data files to a backup directory.

What is MongoDB mongodump?

Mongodump is a utility for creating database backups. It creates a binary export of the database contents. Mongodump can export data from both mongod and mongos instances, allowing you to create backups from:

  • A standalone deployment or a replica set
  • A sharded cluster of MongoDB deployments

Before MongoDB 4.4, mongodump was released alongside the MongoDB server and used matched versioning. Newer iterations of mongodump are released as a separate utility in the MongoDB Database Tools package. Mongodump guarantees compatibility with MongoDB 4.4, 4.2, 4.0, and 3.6.

The mongodump utility is supported on most x86_64 platforms and on select ARM64, PPC64LE, and s390x platforms. You can find the full list of platforms that mongodump is compatible with in the MongoDB documentation.

mongodump actions

The following list breaks down the expected behaviors and limitations of the mongodump utility.

  • The mongodump utility directs its read operations to the primary member of a replica set; the default read preference is primary.
  • The backup operation excludes the "local" database and captures only the documents, not the index data. The indexes must be rebuilt after a restoration process.
  • When backing up read-only views, mongodump only captures the metadata of the views. If you want to capture the documents within a view, use the "--viewsAsCollections" flag.
  • To ensure maximum compatibility, mongodump metadata files use Extended JSON v2.0 (Canonical). It is recommended to use corresponding versions of mongodump and mongorestore in backup and restore operations.
  • The mongodump command will overwrite the existing files within the given backup folder. The default location for backups is the dump/ folder.
  • When the WiredTiger storage engine is used in a MongoDB instance, the output will be uncompressed data.
  • Backup operations using mongodump depend on the available system memory. If the data set is larger than the system memory, the mongodump utility will push the working set out of memory.
  • If access control is configured on the MongoDB database, users must have sufficient privileges on each database to make backups. MongoDB has a built-in backup role with the privileges required to back up any database.
  • MongoDB allows mongodump to be a part of the backup strategy for standalone deployments and replica sets.
  • Starting with MongoDB 4.2, mongodump cannot be used as a part of the backup strategy when backing up sharded clusters that have sharded transactions in progress. In these instances, it is recommended to use a solution like MongoDB Cloud Manager or Ops Manager, which maintain the atomicity of transactions across shards.
  • The mongodump command must be executed from the system command shell as it is a separate utility.
  • There is no option for incremental backups. All backups make a full copy of the database.

MongoDB Database Tools

MongoDB Database Tools are a collection of command-line utilities that help with the maintenance and administration of a MongoDB instance. The MongoDB Database tools are compatible in these environments:

  • Windows
  • Linux
  • macOS

In this section, we will take a look at how we can install the Database Tools on a Linux server.

Checking for Database Tools

To check if the database tools are already installed on the system, we can use the following command.

sudo dpkg -l mongodb-database-tools

Result for Database Tools installed:

Result for Database Tools unavailable:

Installing Database Tools

If your system doesn’t have Database Tools, here’s how to install it.

The MongoDB download center provides the latest version of MongoDB Database Tools. Download the latest version according to your platform and package type. In a CLI environment, we can copy the download link and use wget or curl to download the package.

In the example below, we will be using Database Tools version 100.2.1 for Ubuntu as a deb package and then install it using the downloaded file.

curl -o mongodb-database-tools-ubuntu2004-x86_64-100.2.1.deb https://fastdl.mongodb.org/tools/db/mongodb-database-tools-ubuntu2004-x86_64-100.2.1.deb
sudo apt install ./mongodb-database-tools-ubuntu2004-x86_64-100.2.1.deb

Result:

Using MongoDB mongodump

In this section, we will cover the basic usage of mongodump utility in a standalone MongoDB instance.

Basic mongodump Syntax

mongodump <options> <connection-string>

The most basic method to create a backup is to use the mongodump command without any options. This assumes the database is located on localhost (127.0.0.1) and uses port 27017 with no authentication requirements. The backup process will create a dump folder in the current directory.

mongodump

Result:

We can navigate to the dump folder to verify the created backups.

Backing up a remote MongoDB instance

We can specify a host and a port using either the --uri connection string or the --host and --port options.

Connect using the uri option:

mongodump --uri="mongodb://<host URL/IP>:<Port>" [additional options]

Connect using the host option:

mongodump --host="<host URL/IP>:<Port>"  [additional options]

Connect using host and port options:

mongodump --host="<host URL/IP>" --port=<Port> [additional options]

The following example demonstrates how to create a backup of the remote MongoDB instance:

mongodump --host="10.10.10.59" --port=27017

Result:

Backing up a secure MongoDB instance

If we want to connect to a MongoDB instance with access control enabled, we need to provide:

  • Username
  • Password
  • Authentication database options

Authentication Syntax

mongodump --authenticationDatabase=<Database> -u=<Username> -p=<Password> [additional options]

Let’s see how we can connect to a remote MongoDB instance using a username and password.

mongodump --host=10.10.10.59 --port=27017 --authenticationDatabase="admin" -u="barryadmin" -p="testpassword"

Result:

Selecting databases & collections

Using the --db and --collection options, we can indicate a database and a collection to be backed up. The --db option can be used on its own, but to select a collection, a database must be specified. To exclude a collection from the backup process, we can use the --excludeCollection option.

Selecting a database:

mongodump  --db=<Backup Target - Database> [additional options]

Selecting a collection:

mongodump  --db=<Backup Target - Database> --collection=<Collection Name> [additional options]

Excluding a collection:

mongodump  --db=<Backup Target - Database> --excludeCollection=<Collection Name> [additional options]

In the following example, we define the “vehicleinformation” collection as the only backup target.

mongodump --host=10.10.10.59 --port=27017 --authenticationDatabase="admin" -u="barryadmin" -p="testpassword" --db=vehicles --collection=vehicleinformation

Result:

Changing the backup directory

The --out option can be used to specify the location of the backup folder.

mongodump --out=<Directory Location> [additional options]

Let us change the backup directory to the “dbbackup” folder.

mongodump --host=10.10.10.59 --port=27017 --authenticationDatabase="admin" -u="barryadmin" -p="testpassword" --out=dbbackup

Result:

Creating an archive file

The mongodump utility allows us to create an archive file. The --archive option can be used to specify the file. If no file is specified, the output will be written to standard output (stdout).

The --archive option cannot be used in conjunction with the --out option.

mongodump --archive=<file> [additional options]

The below example demonstrates how we can define an archive file.

mongodump --host=10.10.10.59 --port=27017 --authenticationDatabase="admin" -u="barryadmin" -p="testpassword" --archive=db.archive

Result:

Compressing the backup

The backup files can be compressed using the --gzip option. This option will compress the individual JSON and BSON files.

mongodump --gzip [additional options]

Let’s compress the complete MongoDB database.

mongodump --host=10.10.10.59 --port=27017 --authenticationDatabase="admin" -u="barryadmin" -p="testpassword" --gzip

Result:

In this article, we learned about MongoDB mongodump and how it can be used to create and manage database backups.

Related reading

]]>
Top MongoDB Commands You Need to Know https://www.bmc.com/blogs/mongodb-commands/ Mon, 04 Jan 2021 08:33:36 +0000 https://www.bmc.com/blogs/?p=19837 This article discusses the most useful commands for MongoDB database administration. We’ll first get familiar with the basic concepts of MongoDB. Then, I’ll show you how to carry out a variety of basic administrative functions, with commands for: Connecting Viewing databases, collections, roles & users Managing users Checking logs Managing the database Gathering collection details […]]]>

This article discusses the most useful commands for MongoDB database administration.

We’ll first get familiar with the basic concepts of MongoDB. Then, I’ll show you how to carry out a variety of basic administrative functions, with commands for:

Let’s get started!

(This article is part of our MongoDB Guide. Use the right-hand menu to navigate.)

MongoDB overview

MongoDB is a high performance, highly scalable cross-platform NoSQL database. MongoDB relies on concepts like Documents, Collections, and Databases:

  • Collections and documents are analogous to the traditional tables and rows in an RDBMS database.
  • A single MongoDB instance can contain multiple databases. The database is a physical container for collections with a dedicated file structure in the system.

A Collection contains a group of MongoDB documents. Collections are created within a database and do not enforce a schema like a traditional database. Therefore different documents in the same collection can have different fields. MongoDB Documents are based on key-value pairs.

Let’s take a quick look at how traditional RDBMS terminology relates to MongoDB structure.

 

Relational Database Management System MongoDB
Database Database
Table Collection
Row BSON document
Column BSON field
Index Index
Primary key _id field (Primary key)
Group Aggregation
Join Embedding and linking

By default, MongoDB auto-generates a 12-byte ObjectId value as the primary key (_id field) of each document.

Like any database, MongoDB needs administration. That’s where administrative commands come in—let’s take a look.

Commands for connecting to MongoDB

First, we need to know how to connect to a MongoDB database. You can use the mongo command to connect with a MongoDB database and use parameters like host and port if needed.

  • mongo Run this command in the localhost shell to connect to the local database on the default port 27017.
  • mongo <host>/<database> Specify the host and database as parameters to connect to a specific database.
  • mongo --host <hostname/IP> --port <port no> [options] You can use this format to specify different options while connecting to the database. Refer to the mongo man pages or help for detailed information about all available options.
mongo --host 10.10.10.59 --port 27017 --verbose

mongo --host <hostname/IP> --port <port no> --authenticationDatabase <database> -u <user> -p <password>

If authentication is enabled in the MongoDB installation, we can specify the user details and the authentication database. The authentication database is where the user details reside; it can be any database that was used to create the users. If you don't give the password in the command, the shell will prompt for it later.

mongo --port 27017 --authenticationDatabase "admin" -u "barryadmin" -p

MongoDB show command

Let us see how to view objects in a MongoDB database. You can get the existing databases, collections, roles, and users with the show command.

View all databases

show dbs

View collections inside a database

show collections / db.getCollectionNames()

View roles in a database

show roles

View users in a database

show users / db.getUsers()

User management commands

One of the most important administrative tasks is to manage permissions for users. MongoDB provides this functionality using users and roles, and it has built-in roles for easy access controls.

You have to enable the authorization option in the MongoDB config file to use the access control feature. Add the following lines to the mongod.conf file and restart the MongoDB service to apply the changes.

/etc/mongod.conf

security:
  authorization: "enabled"

Creating a user

The createUser command allows us to create users. Let’s create a user for the vehicles database with only read and write permissions.

Syntax:

db.createUser(
    {
        user: <username>,
        pwd: <passwordPrompt() / Clear Text Password>,
        roles: [
            {role: <role>, db: <database>}
        ]
    }
)

Example:

use vehicles
db.createUser(
    {
        user: "barryvehicles",
        pwd: passwordPrompt(),
        roles: [
            {role: 'readWrite', db: "vehicles"}
        ]
    }
)

Result:

The passwordPrompt() function will ask for the password when running the createUser command. The user is created in the "vehicles" database.

So, when we authenticate using this user, we must specify the “vehicles” database as the “authenticationDatabase”. A database user can be created in any database while defining permissions for other databases.

mongo --port 27017 --authenticationDatabase "vehicles" -u "barryvehicles" -p

Result:

Updating user details

We can update the details of the user using the updateUser() command. When updating user roles, we need to specify all the desired roles, because updateUser() will overwrite any existing roles.

In this example, we will update the "barryvehicles" user with a custom field and give read permission to the admin database. The "customData" section allows us to add arbitrary custom key-value pairs. This has no effect on user roles; custom fields are more of an informative section where we can store additional details about the user.

Syntax:

db.updateUser(
    <"username">,
    {
        customData : { <custom fields> },
        roles: [
            {role: <role>, db: <database>}
        ]
    }
)

Example:

db.updateUser(
    "barryvehicles",
    {
        customData : { usertype: 'dbadmin' },
        roles: [
            {role: 'readWrite', db: "vehicles"},
            {role: 'read', db: "admin"}
        ]
    }
)

Result:
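
To confirm that the new role set and custom data were applied, the user document can be inspected with the db.getUser() shell helper against the database where the user was created (shown here as a quick verification step, assuming we are still connected to the same instance):

db.getSiblingDB("vehicles").getUser("barryvehicles")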

Deleting a user

We can delete a user using the dropUser() command. In the following example, we will delete the user “barryvehicles”.

Syntax:

db.dropUser(<username>)

Example:

db.dropUser("barryvehicles")

Result:

Checking logs

We have two methods for checking logs in MongoDB. We can:

  • Check the mongod log file
  • Use the getLog() command

getLog() returns the most recent logged events. This command reads the most recent 1024 MongoDB log events held in the in-memory cache. In earlier versions of MongoDB, logs were returned in plaintext format. However, in MongoDB 4.4, the logs are formatted in Extended JSON v2.0.

Syntax:

db.adminCommand( { getLog: <value> } )

There are three possible values for the getLog() command:

  • * returns the list of available values for getLog() command.
  • global returns all the recent log entries.
  • startupWarnings returns log entries that may contain errors or warnings since the start of the current process.

Example:

db.adminCommand( { getLog: "*" } )
db.adminCommand( { getLog : "global" } )

Result:
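
The remaining value, startupWarnings, is queried in exactly the same way:

db.adminCommand( { getLog: "startupWarnings" } )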

Database management commands

In this section, we will cover basic database management commands. These can help determine the server stats, collection stats, collection size, etc.

Help

help is an essential command in any administrator’s toolbox. The help command will give you a list of help options available in MongoDB.

Normal help:

help

Result:

Here, you can see all the help options available in MongoDB. If you want to get all the help commands needed to work with databases, execute the db.help() command.

db.help()

Result:

Get database details

The stats() command provides statistics about the database. The information provided ranges from the number of collections and objects (documents) to database sizes and indexes.

The scaleFactor reflects how data sizes are represented. The default scaleFactor is set to 1, which shows data in bytes. For example, we can change the scaleFactor to 1024 to show the sizes in kilobytes.

db.stats()

Result:
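
As mentioned above, stats() accepts a scale argument; for example, passing 1024 reports the same statistics in kilobytes:

db.stats(1024)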

If you want to get the server details, use the db.serverStats() command.

db.serverStats()

Result:

To get a list of collection names, use the getCollectionNames() command.

db.getCollectionNames()

Result:

Obtaining and returning collection details

Get collection statistics

The stats() function will provide a comprehensive overview of the collection.

db.vehicledetails.stats()

Result:

Get collection latency

Use the latencyStats() command to obtain the average latency of read and write operations, as well as the number of read and write operations performed.

db.vehicledetails.latencyStats()

Result:

Get collection sizes

The following commands are used to find out the sizes of Collections in various ways:

  • dataSize() shows the size of data within the collection.
  • storageSize() indicates the total amount of storage allocated to the documents in the collection.
  • totalSize() indicates the total size of the collection, including documents and the indexes.
  • totalIndexSize() provides the indexed size of the collection.
db.vehicledetails.dataSize()
db.vehicledetails.storageSize()
db.vehicledetails.totalSize()
db.vehicledetails.totalIndexSize()

Result:

Because "vehicledetails" is a small collection, its storageSize is equal to its totalIndexSize: the indexed data is as large as the raw data in the collection, and further compression is not feasible.

Renaming collections

We can rename an existing collection with the renameCollection function. (This function is not compatible with sharded collections.)

When renaming a collection, we need to specify the source namespace and the destination namespace correctly. In MongoDB, a namespace is the unique name by which a database object is identified.

In the example below, we rename the "vehicledetails" collection in the vehicles database. We have defined the namespaces as <database>.<collectionname> to differentiate between the old and new collections.

db.adminCommand( { renameCollection: "vehicles.vehicledetails", to: "vehicles.vehicleinformation" } )

Result:

Terminating the server

If we want to completely terminate the MongoDB instance, we can use the built-in shutdownServer() command. shutdownServer() will clean up all the resources used by databases before terminating the MongoDB process.

The command must be issued against the admin database to be executed. We can achieve this by using the getSiblingDB function to indicate the admin database.

Syntax:

db.shutdownServer({
    force: <boolean>,
    timeoutSecs: <int>
})

The force option forces a shutdown operation, interrupting any ongoing operations to terminate the MongoDB instance. The timeoutSecs option can be used to set the time in seconds to wait before the shutdown occurs. In an authenticated environment, the user must have the shutdown privilege to run this command.

In the following example, we will force a shutdown of the MongoDB instance in 10 seconds. The getSiblingDB function allows us to point the shutdown function to the admin database.

db.getSiblingDB("admin").shutdownServer({ "force": true, "timeoutSecs": 10 })

Result:

That concludes this MongoDB commands tutorial. All the above-mentioned commands can be further explored using the official MongoDB documentation.

Related reading

]]>
MongoDB Atlas: Setting Up & Using Managed MongoDB https://www.bmc.com/blogs/mongodb-atlas/ Thu, 24 Dec 2020 14:07:55 +0000 https://www.bmc.com/blogs/?p=19787 In this article, I’ll show you how to set up a NoSQL cloud database using MongoDB Atlas. The MongoDB Atlas free tiers provide developers a turnkey solution to prototype and test applications using MongoDB as the backend database. Database as a service (DBaaS) aims to eliminate the tedious configuration process of a database while providing […]]]>

In this article, I’ll show you how to set up a NoSQL cloud database using MongoDB Atlas.

The MongoDB Atlas free tiers provide developers a turnkey solution to prototype and test applications using MongoDB as the backend database. Database as a service (DBaaS) aims to eliminate the tedious configuration process of a database while providing a scalable, highly available, and high-performance database.

Let’s take a look.

(This article is part of our MongoDB Guide. Use the right-hand menu to navigate.)

What is MongoDB Atlas?

MongoDB Atlas is a cloud-based NoSQL database service developed by MongoDB Inc. It was developed to offer a flexible, scalable, and on-demand platform to eliminate the need for costly infrastructure, configurations, and maintenance.

MongoDB Atlas provides all the features of MongoDB—without the need to worry about database administration tasks such as:

  • Infrastructure provisioning
  • Database configurations
  • Patches
  • Scaling
  • Backups

Features of MongoDB Atlas

Key features of Atlas include:

  • Cloud provider agnostic. MongoDB Atlas is a cloud provider agnostic service that allows users to run the database service on a cloud provider of their choice, including, of course, AWS, Azure, and GCP.
  • Up-to-date features. MongoDB Atlas provides support to the two latest versions of MongoDB service with automatic patching and one-click upgrades.
  • Scalability & high availability. Atlas can scale out and scale up to meet the database’s needs effortlessly while having a minimum of three data nodes per replica set deployed across availability zones, providing continuous database functionality.
  • High performance. The MongoDB WiredTiger storage engine, along with compression and fine-grained concurrency control, gives the required performance for any database need.

Additionally, Atlas provides monitoring and alerts, strong security, workload isolation, and disaster recovery functions.

Setting up MongoDB Atlas

MongoDB Atlas provides a free tier that can be used for learning and prototyping databases. This free tier is called M0 Sandbox; it is limited to 512 MB of storage, shared vCPU and RAM, and a maximum of 100 connections on a single M0 cluster.

The MongoDB Atlas paid services are billed hourly based on your usage.

This section will guide you step by step through creating a MongoDB database cluster on Atlas using the free tier account.

Creating a MongoDB Atlas account

First, we need to create an account in MongoDB Atlas. There are two methods to create an account:

  • You can use your pre-existing Google account to log in to the service.
  • You can use your email to create a new account by providing an email address, name, password, and company name.

Review and accept the Terms of Service and Privacy Policy before clicking Sign Up. In this tutorial, we will use an email to create a MongoDB Atlas account.

MongoDB Atlas website

MongoDB Atlas registration page

Configuring your Atlas account

After creating the account, enter the organization name and project name, and select your preferred language. For this tutorial, we will select Python.

Select cluster type

We will select the Shared Clusters option because we are creating a free tier cluster.

Create a starter cluster

After going through the above steps, you’ll be presented with the Create a Starter Cluster page. Here, you’ll select:

  • Cloud provider
  • Region
  • Cluster tier
  • MongoDB settings, like version and backup frequency

You can improve the connection latency between the application and the database by selecting a region closest to the location where your application is deployed. We will select AWS as our cloud provider and the N. Virginia (us-east-1) as the location.

Importantly, there is no option to create backups in the free tier. Finally, we name the cluster and click on Create Cluster to deploy the cluster.

Creating a starter cluster

Admin interface

It will take a couple of minutes after clicking the Create Cluster button for the cluster to be created with all the specified options. Then we will be redirected to the MongoDB Atlas admin interface. We have named the cluster MainDBCluster.

Admin Interface

Configuring your Atlas cluster

We have now successfully created a MongoDB cluster within MongoDB Atlas. Next, we must configure access and security before we can use the database. This section will cover the basic configurations of the cluster.

Whitelist IP address

The first thing to do is to whitelist the IP addresses that can be used to access the database. By limiting database access to specific IP addresses, we reduce the security risk of unwanted connection attempts.

To whitelist an IP address, go to the Network Access section. Then click Add IP Address and enter the details. We have the option to add the current IP address and configure an expiry time.

IP Whitelist page

We have selected a single IP with an expiry of six hours.

Create users

We can create users using the Database Access section in the MongoDB Atlas admin interface.

To add a user, click Add New Database User in Database Access, and enter the details of the user. Let's create a simple user account with the Password Authentication method and give it read and write access to any database within the cluster.

Database access page

Connect to the cluster

Within the Clusters section, click the Connect button in the MainDBCluster to connect to the database. MongoDB Atlas provides three methods to connect to the cluster.

We will select MongoDB Compass as the connection method, as it provides a graphical user interface (GUI) to interact with the database. After selecting the MongoDB Compass option, we have the option to either:

  • Download the MongoDB Compass client
  • Use an existing MongoDB Compass installation

Connection methods

MongoDB Compass connection method

Access the database

Using MongoDB Compass, we will connect to the MongoDB Atlas cluster to access the database. We start the MongoDB Compass application, enter the connection string, and click Connect.

MongoDB Compass Connection Interface

After a successful connection, you’ll be presented with the MongoDB Atlas cluster (MainDBCluster).

Interacting with the database

Let’s see how we can interact with the database using the MongoDB Atlas admin interface. MongoDB Atlas provides a sample dataset that can be added to a cluster for testing purposes. To load the sample dataset, select Load Sample Dataset in the Clusters section in the admin interface.

MongoDB Atlas admin interface

Load sample dataset

After successfully loading the sample dataset, we can interact with the data using either:

  • The MongoDB Atlas admin interface
  • MongoDB Compass

Using admin interface

Within the admin interface, we can click the Collections button in the MainDBCluster, and we'll be redirected to the Collections section.

MainDBCluster Collections (sample_airbnb database)

Using MongoDB Compass

From the MongoDB Compass interface, we can simply select the necessary databases and collections and interact with documents as needed.

Collection view (sample_mflix database)

Both methods allow you to carry out all the necessary database functions without a command-line interface.

Monitoring the cluster

MongoDB Atlas provides metrics to monitor the cluster performance from the admin interface. Simply click on the Metrics button in the MainDBCluster, and you’ll be redirected to the metrics page:

Atlas cluster metrics

That concludes this tutorial on MongoDB Atlas.

Related reading

]]>
How To Run MongoDB as a Docker Container https://www.bmc.com/blogs/mongodb-docker-container/ Tue, 15 Dec 2020 07:45:26 +0000 https://www.bmc.com/blogs/?p=19683 MongoDB is among the most popular NoSQL databases today. And containers offer easy app usage and scalability. In this article, I’ll show you how to: Configure MongoDB as a container in Docker Set up the Docker platform with docker-compose Create a docker-compose file to create the MongoDB container And more The last part of this […]]]>

MongoDB is among the most popular NoSQL databases today. And containers offer easy app usage and scalability. In this article, I’ll show you how to:

  • Configure MongoDB as a container in Docker
  • Set up the Docker platform with docker-compose
  • Create a docker-compose file to create the MongoDB container
  • And more

The last part of this tutorial will look at advanced configurations. These can give you a glimpse of the extensibility of a containerized project. So, we'll create a self-contained project with a MongoDB instance and a Mongo Express web interface on a dedicated network and Docker volume to maximize the portability of the project.

Let’s get started.

(This article is part of our MongoDB Guide. Use the right-hand menu to navigate.)

Docker containers & MongoDB

Docker is a tool to create, deploy, and run applications easily using containers. A container is a standard unit of software that packages an application and all of its dependencies into a single unit. These containers can be run on any server platform regardless of the underlying configuration or hardware structure.

Docker can be used to run MongoDB instances. Setting up MongoDB as a container allows the user to create a portable and extensible NoSQL database. A containerized MongoDB instance behaves exactly like a non-containerized MongoDB instance without having to worry about the underlying configuration.



Installing Docker

In this section, we’ll set up a simple Docker installation to run containers on a Ubuntu-based server. We can get the Docker installation packages from the official Docker repository. Here are the installation steps:

  1. Update existing packages.
sudo apt update && sudo apt upgrade -y

  2. Install prerequisite packages.
sudo apt install apt-transport-https ca-certificates curl software-properties-common

  3. Add the GPG key from the official Docker repository.
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

  4. Add the official Docker repository to APT sources.
sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"

  5. Update the Ubuntu package list.
sudo apt update

  6. Verify the Docker repository.
apt-cache policy docker-ce

  7. Install the Docker community edition.
sudo apt install docker-ce

  8. Check the status of the installation with the following command. If the service status returns active (running), Docker is successfully installed and active on the system.
sudo systemctl status docker

Installing Docker Compose

We can use the command line interface (CLI) to create and manage Docker containers. However, the CLI can be tedious when dealing with multiple containers and configurations.

Docker Compose allows users to take multiple containers and integrate them into a single application. Compose files are written in YAML and can be executed with the docker-compose up and docker-compose down commands, which create or remove all the containers and configurations defined in the compose file, respectively.

Let’s install Docker Compose on the Ubuntu server.

  1. Install the current stable release of Docker Compose.
sudo curl -L "https://github.com/docker/compose/releases/download/1.27.4/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose

  2. Apply executable permissions for the downloaded binary.
sudo chmod +x /usr/local/bin/docker-compose

  3. Verify the Docker Compose installation.
docker-compose --version

Setting up a MongoDB container

This section will cover how to set up a MongoDB container using a Docker Compose file.

Before creating the compose file, let’s search for the official MongoDB container image using the search command.

sudo docker search mongodb

The search results show us that an official MongoDB container image called mongo exists in the docker container registry.

By default, the MongoDB container stores the databases within the /data/db directory within the container.

Next, we need to create a directory called "mongodb" to hold the docker-compose file. We will create another directory called "database" inside the "mongodb" directory to map to the database location of the container. This will enable local access to the database. We use the -p and -v flags of mkdir to create the parent folders and print them as they are created.

mkdir -pv mongodb/database

The following docker-compose.yml file will be created within the “mongodb” directory to construct the MongoDB container.

docker-compose.yml

version: "3.8"
services:
mongodb:
image : mongo
container_name: mongodb
environment:
- PUID=1000
- PGID=1000
volumes:
- /home/barry/mongodb/database:/data/db
ports:
- 27017:27017
restart: unless-stopped

We used version 3.8 to create the above compose file. The compose file version directly correlates to:

  • Which options are available within the compose file
  • The minimum supported Docker engine version

In this case, it's Docker Engine 19.03.0 or newer.

In the compose file, we have created a service called mongodb using the Docker image mongo. We have named the container “mongodb” and mapped the database folder within the container to the local database folder (/home/barry/mongodb/database). These kinds of mappings are known as bind-mount volumes.

The environment variables are used to define the “user” and “group” of the container. Finally, we mapped the local port 27017 to internal port 27017. Then the restart policy is set to restart unless stopped by the user.

Here’s the file structure of the project:

tree mongodb

Go to the "mongodb" folder and run the docker-compose up command to start the MongoDB container. The -d option runs the container in detached mode, as a background process.

sudo docker-compose up -d

The up command will pull the mongo image from the docker registry and create the container using the given parameters in the docker-compose.yml file.

Let's verify that the container is running and that the local folder is populated using the following commands. The -a option displays all the containers on the system regardless of their status.

sudo docker ps -a

sudo tree mongodb

Interacting with the MongoDB container

Using the docker exec command, we can access the terminal of the MongoDB container. As the container runs in a detached mode, we will use the Docker interactive terminal to establish the connection.

sudo docker exec -it mongodb bash

In the bash terminal of the container, we call the mongo command to access MongoDB. We will create a database called “food” and a collection called “fruits”, along with three documents.

  1. Switch the database.
use food
  2. Create the collection.
db.createCollection("fruits")
  3. Insert documents.
db.fruits.insertMany([ {name: "apple", origin: "usa", price: 5}, {name: "orange", origin: "italy", price: 3}, {name: "mango", origin: "malaysia", price: 3} ])

Search for the documents using the find command:

db.fruits.find().pretty()

The MongoDB container will act like any normal MongoDB installation without any concerns about the underlying software and hardware configuration. Using the exit command, we can exit both the MongoDB shell and container shell.

External connections to MongoDB container

While creating the MongoDB container, we mapped the internal MongoDB port to the corresponding port in the server, exposing the MongoDB container to external networks.

The following example demonstrates how we can connect to the container from an external endpoint by simply pointing the mongo command to the appropriate server and port.

mongo 10.10.10.60:27017

The find command will search for the fruits collection and its documents to verify that we are connected to the MongoDB container.

show databases
use food
show collections
db.fruits.find().pretty()


Data resilience

We've mapped the database to a local folder. As a result, even if the container is removed, the saved data in the local folder can be used to recreate a new MongoDB container.

Let’s test that. We’ll:

  • Remove the container using the docker-compose down command.
  • Delete the associated images.
  • Recreate a new MongoDB database using the compose file and the local database files.

Remove the MongoDB container.

sudo docker-compose down


Remove the local mongo image.

sudo docker rmi mongo

Verify the local database files.

From the output below, we can identify that even though we removed the containers, the data mapped to a local directory did not get removed.

sudo tree mongodb

Recreate a new MongoDB container. Now, we will recreate the container using the original docker-compose.yml file. We execute the following command in the mongodb folder.

sudo docker-compose up -d

Verify the Data in the MongoDB container. Let’s now access the bash shell in the container and check for the “fruits” collections.

sudo docker exec -it mongodb bash

show databases
use food
db.fruits.find().pretty()

The result indicates that the new container was created using the database information saved in the local folder.

Additionally, we can simply move the container by moving the local folder structure to a new server and creating a container using the docker-compose.yml file. Docker volumes can be used instead of locally saving the data to increase the portability of the database.



Container log files

Every container creates logs that can be used to monitor and debug itself. We can access the container logs using the docker logs command with the container name to be monitored.

sudo docker logs mongodb

Advanced container usage

In this section, we will create a secure MongoDB container that requires a username and password to access the database.

In earlier examples, we mapped the database data to a local folder. However, this is tedious and requires manual intervention when moving the Docker container. Using Docker volumes, we can create Docker native persistent volumes that can be easily transferred between Docker installations.

Although we can use the CLI to manipulate the MongoDB instance, a GUI would be a more convenient option to do that. Mongo Express is a web-based MongoDB administration interface that also can be run as a containerized application.

The docker-compose file comes in handy as a single YAML file that captures all the requirements.

docker-compose.yml

version: "3.8"
services:
  mongodb:
    image: mongo
    container_name: mongodb
    environment:
      - MONGO_INITDB_ROOT_USERNAME=root
      - MONGO_INITDB_ROOT_PASSWORD=pass12345
    volumes:
      - mongodb-data:/data/db
    networks:
      - mongodb_network
    ports:
      - 27017:27017
    healthcheck:
      test: echo 'db.runCommand("ping").ok' | mongo 10.10.10.60:27017/test --quiet
      interval: 30s
      timeout: 10s
      retries: 3
    restart: unless-stopped
  mongo-express:
    image: mongo-express
    container_name: mongo-express
    environment:
      - ME_CONFIG_MONGODB_SERVER=mongodb
      - ME_CONFIG_MONGODB_ENABLE_ADMIN=true
      - ME_CONFIG_MONGODB_ADMINUSERNAME=root
      - ME_CONFIG_MONGODB_ADMINPASSWORD=pass12345
      - ME_CONFIG_BASICAUTH_USERNAME=admin
      - ME_CONFIG_BASICAUTH_PASSWORD=admin123
    volumes:
      - mongodb-data
    depends_on:
      - mongodb
    networks:
      - mongodb_network
    ports:
      - 8081:8081
    healthcheck:
      test: wget --quiet --tries=3 --spider http://admin:admin123@10.10.10.60:8081 || exit 1
      interval: 30s
      timeout: 10s
      retries: 3
    restart: unless-stopped
volumes:
  mongodb-data:
    name: mongodb-data
networks:
  mongodb_network:
    name: mongodb_network

Now, let’s break down the compose file given above. First, we have created two services:

  • mongodb
  • mongo-express

mongodb service

The root username and password of the mongodb container are configured using the following environment variables.

  • MONGO_INITDB_ROOT_USERNAME
  • MONGO_INITDB_ROOT_PASSWORD

The data volume is mapped to mongodb-data docker volume, and the network is defined as mongodb_network while opening port 27017.

mongo-express service

The environment variables of the mongo-express container are:

  • ME_CONFIG_MONGODB_SERVER – MongoDB service (mongodb)
  • ME_CONFIG_MONGODB_ENABLE_ADMIN – Enable access to all databases as admin
  • ME_CONFIG_MONGODB_ADMINUSERNAME – Admin username of the MongoDB database
  • ME_CONFIG_MONGODB_ADMINPASSWORD – Admin password of the MongoDB database
  • ME_CONFIG_BASICAUTH_USERNAME – Mongo-Express web interface access username
  • ME_CONFIG_BASICAUTH_PASSWORD – Mongo-Express web interface access password

Additionally, we have configured the mongo-express service to depend on the mongodb service. The network is assigned the same mongodb_network, and the volumes are mapped to mongodb-data volume. Then the port 8081 is exposed to allow access to the web interface.

Both services are monitored using Docker health checks. The mongodb service will ping the MongoDB database, while the mongo-express service will try to access the web page using the given credentials.

Finally, we have defined a volume called mongodb-data and a network called mongodb_network for the project.

Start the Docker compose file.

sudo docker-compose up -d

The above output contains no errors, so we can assume that all the services were created successfully. As we have added health checks for both services, we can verify their status using the docker ps command.

sudo docker ps -a

The docker ps command prints the health status of the container. This health status is only available if you have defined a health check for the container.

Mongo Express

Now, let’s go to the Mongo Express web interface using the server IP (http://10.10.10.60:8081).

The Mongo Express interface provides a convenient way to interact with the MongoDB database. It also shows an overview of the status of the MongoDB server instance, providing simple monitoring functionality.

That concludes this tutorial.

Related reading

]]>
How To Use PyMongo https://www.bmc.com/blogs/mongodb-pymongo/ Wed, 09 Dec 2020 07:51:06 +0000 https://www.bmc.com/blogs/?p=19615 In this tutorial, I’ll walk you through how to use PyMongo in MongoDB. (This tutorial is part of our MongoDB Guide. Use the right-hand menu to navigate.) What is PyMongo? PyMongo is the official Python driver that connects to and interacts with MongoDB databases. The PyMongo library is being actively developed by the MongoDB team. […]]]>

In this tutorial, I’ll walk you through how to use PyMongo in MongoDB.

(This tutorial is part of our MongoDB Guide. Use the right-hand menu to navigate.)

What is PyMongo?

PyMongo is the official Python driver that connects to and interacts with MongoDB databases. The PyMongo library is being actively developed by the MongoDB team.

Installing PyMongo

PyMongo can be easily installed via pip on any platform.

pip install pymongo

You can verify the PyMongo installation via pip or by trying to import the module to a Python file.

Checking via pip:

pip list --local

Let us create a simple Python file where we import the PyMongo module. The import statement is wrapped in a try-except block that writes a message to the output based on whether the import succeeds or fails.

try:
    import pymongo
    print("Module Import Successful")
except ImportError as error:
    print("Module Import Error")
    print(error)

How to use PyMongo

In this section, we will cover the basic usage of the PyMongo module and how it can be used to interact with a MongoDB installation. The MongoDB server details used in this article are:

  • IP: 10.10.10.59
  • Port: 27017

/etc/mongod.conf

# network interfaces
net:
  port: 27017
  bindIp: 127.0.0.1,10.10.10.59

To allow access to the database, you need to add the external IP of the server to the bindIp parameter. You can find it in the network section of the “mongod.conf” file.

Here are the actions that I’ll show you how to do:

Connecting to the MongoDB install

There are two ways to define the connection string:

  • Using IP and Port
  • Using a MongoURL
from pymongo import MongoClient
client = MongoClient('10.10.10.59', 27017)
# Alternative Connection Method - MongoURL Format
# client = MongoClient('mongodb://10.10.10.59:27017')
print(client.server_info())

Result:

In this example, we have established the connection by creating a new instance of the MongoClient class with the MongoDB server IP and port. Then we call the server_info() function to obtain data about the MongoDB server instance.

In the next example, we call the function list_database_names() to get a list of all the databases.

from pymongo import MongoClient
client = MongoClient('10.10.10.59', 27017)
print(client.list_database_names())

Result:

Obtaining collections from a database

Using the list_collection_names method, we can obtain a list of collections from a specified database object. The include_system_collections parameter is set to False to exclude any collections created by the system itself.

from pymongo import MongoClient
# Create Connection
client = MongoClient('10.10.10.59', 27017)
# Select the Database
database = client.students
# Alternative Declaration Method
# database = client['students']
# Get List of Collections
collection = database.list_collection_names(include_system_collections=False)
print(collection)

Result:

Finding documents in a collection

You can use the find() method to search for any document in a Collection. Let’s do an example to understand it better.

from pymongo import MongoClient
# Create Connection
client = MongoClient('10.10.10.59', 27017)
# Select the Database
database = client.students
# Get Details of the specified Collection
collection = database['student_grades']
# Print each Document
for studentinfo in collection.find():
    print(studentinfo)

Result:

You can use several other MongoDB operators, like projection, push and pull, and sort, within the find() method to filter the output.

In our next example, we use the MongoDB projection operator to filter out the “_id” field and sort the resulting data set in descending order.

from pymongo import MongoClient

# Create Connection
client = MongoClient('10.10.10.59', 27017)
# Select the Database
database = client.students
# Get Details of the specified Collection
collection = database['student_grades']
# Print each Document
for studentinfo in collection.find({},{'_id':0}).sort('name',-1):
    print(studentinfo)

Result:

Finding one document from a collection

PyMongo provides the find_one() and find() methods to find a single document from a collection:

  • The find_one() method will return the first document according to the given conditions.
  • The find() method will return a Cursor object, which is an iterable object that contains additional helper methods to transform the data.

The find_one() Method

from pymongo import MongoClient
# Create Connection
client = MongoClient('10.10.10.59', 27017)
# Select the Database
database = client.students
# Get Details of the specified Collection
collection = database['student_grades']
studentinfo = collection.find_one({'name':'Mike'})
print(studentinfo)

Result:

The find() method

from pymongo import MongoClient
# Create Connection
client = MongoClient('10.10.10.59', 27017)
# Select the Database
database = client.students
# Get Details of the specified Collection
collection = database['student_grades']
studentinfo = collection.find({'name':'Mike'})
print(studentinfo)

Result:

To obtain a readable output, we can use a loop to iterate the object.

from pymongo import MongoClient

# Create Connection
client = MongoClient('10.10.10.59', 27017)
# Select the Database
database = client.students
# Get Details of the specified Collection
collection = database['student_grades']
studentinfo = collection.find({'name':'Mike'})
# print(studentinfo)
# Print each Document
for student in studentinfo:
    print(student)

Result:

Creating a database, collection & document

In this section, we’ll look at how to create a database and a collection, and how to insert documents into the new collection. MongoDB will not create a database until there is at least one document in the database.

The following example demonstrates how to create a database called "foods" with a "desserts" collection containing a single document, using the insert_one() method. PyMongo automatically creates a new database if the selected database is not available.

from pymongo import MongoClient
# Create Connection
client = MongoClient('10.10.10.59', 27017)
print(f"Existing Databases")
print(client.list_database_names())
# Create a New Database - dictionary style
database_foods = client['foods']
# Alternative method for simple strings #
# database = client.foods
# Create a Collection - dictionary style
col_desserts = database_foods['desserts']
# Insert a Single Document
document = {'name': 'chocolate cake', 'price': 20, 'ingredients':['chocolate', 'flour', 'eggs']}
col_desserts.insert_one(document)
print(f"\nVerify the New Database")
print(client.list_database_names())
print(f"\nCollections in the Database foods")
print(database_foods.list_collection_names())
print(f"\nDocuments in the desserts Collection")
dessert = col_desserts.find()
# Print each Document
for des in dessert:
    print(des)

Result:

Inserting documents to a collection

PyMongo provides two methods to insert documents: insert_one() and insert_many():

  • insert_one() is used to insert a single document.
  • insert_many() inserts multiple documents.

Let’s see a couple of examples of that with the “desserts” collection.

Inserting a single document

from pymongo import MongoClient
# Create Connection
client = MongoClient('10.10.10.59', 27017)
database_foods = client['foods']
col_desserts = database_foods['desserts']
print(f"\nAll Documents in the desserts Collection")
dessert = col_desserts.find()
# Print each Document
for des in dessert:
    print(des)
# Insert a Single Document
document = {'name': 'ice cream', 'price': 10, 'ingredients':['milk', 'vanilla extract', 'eggs', 'heavy cream', 'sugar']}
col_desserts.insert_one(document)
print(f"\nAfter Insert")
dessert = col_desserts.find()
# Print each Document
for des in dessert:
    print(des)

Result:

Inserting multiple documents

from pymongo import MongoClient
# Create Connection
client = MongoClient('10.10.10.59', 27017)
database_foods = client['foods']
col_desserts = database_foods['desserts']
documents = [
    {'name': 'pumpkin pie', 'price': 15, 'ingredients':['pumpkin', 'condensed milk', 'eggs', 'cinnamon']},
    {'name': 'chocolate chip cookies', 'price': 3, 'ingredients':['chocolate chips', 'butter', 'white sugar', 'flour', 'vanilla extract']},
    {'name': 'banana pudding', 'price': 10, 'ingredients':['banana', 'white sugar', 'milk', 'flour', 'salt']}
]
col_desserts.insert_many(documents)
dessert = col_desserts.find({},{'_id':0})
# Print each Document
for des in dessert:
    print(des)

Result:

Updating documents

The methods update_one() and update_many() are used to update a single document or multiple documents. The following examples demonstrate how each of these methods can be used.

Updating a single document

In this example, we update the “price” field of a document identified by its “_id” field. To do this, you need to import the ObjectId class from the “bson” module.

Here, the price of the chocolate cookie document is updated to 5.

from pymongo import MongoClient
from bson.objectid import ObjectId
# Create Connection
client = MongoClient('10.10.10.59', 27017)
database_foods = client['foods']
col_desserts = database_foods['desserts']
document = {'_id': ObjectId('5fc5ebe4abb99571485ce8f0')}
updated_values = {'$set': {'price': 5}}
# Update the Document
col_desserts.update_one(document, updated_values)
# Find the Updated Document
updated_doc = col_desserts.find_one({'_id': ObjectId('5fc5ebe4abb99571485ce8f0')})
print(updated_doc)

Result:
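
Incidentally, update_one() returns an UpdateResult object; capturing it lets you confirm how many documents were matched and modified. A minimal sketch, reusing the document and updated_values variables from the example above:

result = col_desserts.update_one(document, updated_values)
# Number of documents matched by the filter, and number actually modified
print(result.matched_count, result.modified_count)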

Updating multiple documents

In this example, we find all the documents whose “name” starts with the string “chocolate” using the $regex operator. Then we increase their price by 5 using the $inc operator.

from pymongo import MongoClient
# Create Connection
client = MongoClient('10.10.10.59', 27017)
database_foods = client['foods']
col_desserts = database_foods['desserts']
# Find Documents whose name starts with chocolate
document = {"name": { "$regex": "^chocolate"}}
# Increase the existing value by five
updated_values = {'$inc': {'price': 5}}
# Update the Documents
col_desserts.update_many(document, updated_values)
# Find the Updated Documents
for des in col_desserts.find({},{'_id':0}):
    print(des)

Result:
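
Both update_one() and update_many() also accept an upsert option: when it is set to True, MongoDB inserts a new document if the filter matches nothing. This is a minimal sketch; the “carrot cake” document is made up purely for illustration:

from pymongo import MongoClient
# Create Connection
client = MongoClient('10.10.10.59', 27017)
database_foods = client['foods']
col_desserts = database_foods['desserts']
# If no dessert named "carrot cake" exists, this inserts one with price 12
col_desserts.update_one(
    {'name': 'carrot cake'},
    {'$set': {'price': 12}},
    upsert=True
)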

Deleting documents, collections & databases

Let’s look at how deleting works in PyMongo:

  • The delete_one() method deletes a single document.
  • The delete_many() method deletes multiple documents.
  • The drop() method deletes a collection.
  • The drop_database() method deletes a database.

Let’s look at an example for each use case.

Deleting a single document

We will delete a document using the “_id” field as the identifier.

from pymongo import MongoClient
from bson.objectid import ObjectId
# Create Connection
client = MongoClient('10.10.10.59', 27017)
database_foods = client['foods']
col_desserts = database_foods['desserts']
document = {'_id': ObjectId('5fc5ebe4abb99571485ce8f0')}
# Delete the Document
col_desserts.delete_one(document)
# Find the Remaining Documents
for des in col_desserts.find():
    print(des)

Result:

Deleting multiple documents

We will delete all documents where the price is equal to or less than 10 from the “desserts” collection.

from pymongo import MongoClient
# Create Connection
client = MongoClient('10.10.10.59', 27017)
database_foods = client['foods']
col_desserts = database_foods['desserts']
# Select only documents where price is less than or equal to 10
document = {'price': { '$lte': 10 }}
# Delete the Document
col_desserts.delete_many(document)
# Find the Remaining Documents
for des in col_desserts.find():
    print(des)

Result:
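
Both delete_one() and delete_many() return a DeleteResult object, so you can confirm how many documents were actually removed. A minimal sketch, reusing the document filter from the example above:

result = col_desserts.delete_many(document)
# Number of documents removed by the operation
print(result.deleted_count)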

Deleting a collection

A MongoDB collection can be deleted using the drop() method. Here’s how to drop the complete “desserts” collection from our “foods” database.

from pymongo import MongoClient
# Create Connection
client = MongoClient('10.10.10.59', 27017)
database_foods = client['foods']
# Check currently available collections
print(database_foods.list_collection_names())
# Delete the collection
col_desserts = database_foods['desserts']
col_desserts.drop()
# Print the remaining collections
print(database_foods.list_collection_names())

Result:

The second print statement outputs an empty list because we have deleted the only collection in the “foods” database.
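
As an alternative to calling drop() on the Collection object, the same collection can be dropped through the Database object with drop_collection(). This is a minimal sketch; the call is simply ignored if the collection no longer exists:

from pymongo import MongoClient
# Create Connection
client = MongoClient('10.10.10.59', 27017)
database_foods = client['foods']
# Equivalent to col_desserts.drop()
database_foods.drop_collection('desserts')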

Deleting the database

The example below shows how to delete a complete database along with all its collections and documents. We will delete the “students” database using the drop_database() method.

from pymongo import MongoClient
# Create Connection
client = MongoClient('10.10.10.59', 27017)
# List all databases
print(client.list_database_names())
# Check for Data within the database
database_students = client['students']
print('')
print(database_students.list_collection_names())
# Delete the complete database
client.drop_database('students')
# Verify the Deletion
print('')
print(client.list_database_names())

Result:

As you can see, the “students” database is completely removed.

That’s it for this tutorial. If you want to learn more about PyMongo, the best resource is the official PyMongo documentation.
