MongoDB – Interview Questions | Codewindow.in

1. What is MongoDB?

MongoDB is a document-oriented NoSQL database used for high-volume data storage. Instead of the tables and rows of traditional relational databases, MongoDB uses collections and documents. Documents consist of key-value pairs and are the basic unit of data in MongoDB. Collections contain sets of documents and are the equivalent of relational database tables.

Key Features

Document Oriented and NoSQL database.

Supports Aggregation

Uses BSON format

Sharding (Helps in Horizontal Scalability)

Supports Ad Hoc Queries

Schema Less

Capped Collection

Indexing (Any field in MongoDB can be indexed)

MongoDB Replica Set (Provides high availability)

Supports Multiple Storage Engines

Key Components

a. _id: The _id field represents a unique value in the MongoDB document. The _id field is like the document’s primary key. If you create a new document without an _id field, MongoDB will automatically create the field.

b. Collection: This is a grouping of MongoDB documents. A collection is the equivalent of a table in an RDBMS such as Oracle.

c. Cursor: This is a pointer to the result set of a query. Clients can iterate through a cursor to retrieve results.

d. Database: This is a container for collections, just as a database in an RDBMS is a container for tables. Each database gets its own set of files on the file system. A MongoDB server can store multiple databases.

e. Document: A record in a MongoDB collection is basically called a document. The document, in turn, will consist of field name and values.

f. Field: A name-value pair in a document. A document has zero or more fields. Fields are analogous to columns in relational databases.

2. What are Indexes in MongoDB?

Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must perform a collection scan, i.e. scan every document in a collection, to select those documents that match the query statement. If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must inspect.

Indexes are special data structures that store a small portion of the collection’s data set in an easy to traverse form. The index stores the value of a specific field or set of fields, ordered by the value of the field. The ordering of the index entries supports efficient equality matches and range-based query operations. In addition, MongoDB can return sorted results by using the ordering in the index.

Example

The createIndex() method only creates an index if an index of the same specification does not already exist. The following example (using Node.js) creates a single key descending index on the name field:

// Callback style, as accepted by older versions of the Node.js driver;
// newer versions return a Promise instead.
collection.createIndex( { name : -1 }, function(err, result) {
   if (err) throw err;
   console.log(result); // name of the created index
});
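
To make the idea of an ordered index concrete, here is a minimal sketch in plain JavaScript (no database involved). It assumes a toy people array with made-up values and uses a sorted list of [value, _id] pairs as a stand-in for the real index structure; the helper names buildIndex, lowerBound and rangeQuery are hypothetical and are not part of any MongoDB API.

```javascript
// Toy collection; the field values are made up for illustration.
const people = [
  { _id: 1, name: "Dana", age: 41 },
  { _id: 2, name: "Alex", age: 29 },
  { _id: 3, name: "Chen", age: 35 },
  { _id: 4, name: "Bree", age: 29 },
];

// Build a sorted list of [fieldValue, _id] pairs: a stand-in for an index.
function buildIndex(docs, field) {
  return docs
    .map((doc) => [doc[field], doc._id])
    .sort((a, b) => a[0] - b[0]);
}

// Binary search for the first entry whose key is >= value.
function lowerBound(index, value) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid][0] < value) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Range query: return the _ids of documents with min <= field <= max,
// touching only the index entries inside the range.
function rangeQuery(index, min, max) {
  const ids = [];
  for (let i = lowerBound(index, min); i < index.length && index[i][0] <= max; i++) {
    ids.push(index[i][1]);
  }
  return ids;
}

const ageIndex = buildIndex(people, "age");
console.log(rangeQuery(ageIndex, 29, 35)); // _ids of people aged 29 to 35
```

Because the entries are ordered, the range query jumps to the first matching entry and stops at the first non-matching one, instead of inspecting every document as a collection scan would.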

3. What are the types of Indexes available in MongoDB?

MongoDB supports the following index types for running queries.

a. Single Field Index

MongoDB supports user-defined indexes like single field index. A single field index is used to create an index on the single field of a document. With single field index, MongoDB can traverse in ascending and descending order. By default, each collection has a single field index automatically created on the _id field, the primary key.

Example

{
  "_id": 1,
  "person": { name: "Alex", surname: "K" },
  "age": 29,
  "city": "New York"
}

We can define a single field index on the age field.

db.people.createIndex( {age : 1} ) // creates an ascending index

db.people.createIndex( {age : -1} ) // creates a descending index

With this kind of index we can speed up any query whose conditions include the age field, like the following:

db.people.find( { age : 20 } )

db.people.find( { "person.name" : "Alex", age : 30 } )

db.people.find( { age : { $gt : 25 } } )

b. Compound Index

A compound index is an index on multiple fields. Using the same people collection we can create a compound index combining the city, age, and person.surname fields.

db.people.createIndex( { city: 1, age: 1, "person.surname": 1 } )

In this case, we have created a compound index where the first entry is the value of the city field, the second is the value of the age field, and the third is the value of person.surname. All the fields here are defined in ascending order.

Queries such as the following can benefit from the index:

db.people.find( { city: "Miami", age: { $gt: 50 } } )

db.people.find( { city: "Boston" } )

db.people.find( { city: "Atlanta", age: { $lt: 25 }, "person.surname": "Green" } )
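
All three queries above filter on a leading part (a prefix) of the index key pattern: city alone, city and age, or all three fields. The following plain JavaScript sketch illustrates that rule under a deliberately simplified model. The function usablePrefixLength is hypothetical, and the real query planner weighs more factors (sort order, selectivity, other indexes) than this check does.

```javascript
// Key pattern of the compound index, in order.
const indexKeys = ["city", "age", "person.surname"];

// Returns how many leading index fields the query filters on.
// 0 means the query does not constrain the first index field, so in this
// simplified model the index cannot narrow the search.
function usablePrefixLength(keys, query) {
  let n = 0;
  for (const key of keys) {
    if (!(key in query)) break;
    n++;
  }
  return n;
}

console.log(usablePrefixLength(indexKeys, { city: "Boston" }));                    // 1
console.log(usablePrefixLength(indexKeys, { city: "Miami", age: { $gt: 50 } }));   // 2
console.log(usablePrefixLength(indexKeys, { age: 30 }));                           // 0
```

In this model a query on age alone gets no help from the index, which is why the order of fields in a compound index matters.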

c. Multikey Index

This is the index type for arrays. When creating an index on an array, MongoDB will create an index entry for every element.

Example

{
   "_id": 1,
   "person": { name: "John", surname: "Brown" },
   "age": 34,
   "city": "New York",
   "hobbies": [ "music", "gardening", "skiing" ]
 }

The multikey index can be created as:

db.people.createIndex( { hobbies: 1} )

Queries such as these next examples will use the index:

db.people.find( { hobbies: "music" } )

db.people.find( { hobbies: { $all: [ "music", "gardening" ] } } )
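
The "one index entry per array element" idea can be sketched in plain JavaScript as follows. This is only an illustration: the documents with _id 2 and 3 are made up, buildMultikeyIndex is a hypothetical helper, and a Map from value to _ids stands in for the real index structure.

```javascript
const docs = [
  { _id: 1, hobbies: ["music", "gardening", "skiing"] },
  { _id: 2, hobbies: ["music", "chess"] },
  { _id: 3, hobbies: ["hiking"] },
];

// One index entry per array element: value -> list of _ids.
function buildMultikeyIndex(documents, field) {
  const index = new Map();
  for (const doc of documents) {
    // new Set() avoids duplicate entries when an array repeats a value
    for (const value of new Set(doc[field])) {
      if (!index.has(value)) index.set(value, []);
      index.get(value).push(doc._id);
    }
  }
  return index;
}

const hobbiesIndex = buildMultikeyIndex(docs, "hobbies");
console.log(hobbiesIndex.get("music")); // [ 1, 2 ]
```

A lookup for a single hobby then goes straight to the matching entry instead of scanning every array in the collection.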

d. Geospatial Index

Geospatial indexes are a special index type that allows searches based on location, distance from a point, and many other features. To query geospatial data, MongoDB supports two types of indexes: 2d indexes and 2dsphere indexes. 2d indexes use planar geometry when returning results and 2dsphere indexes use spherical geometry to return results.

e. Text Index

A text index supports searching for string content in a collection. Text indexes do not store language-specific stop words (e.g. “the”, “a”, “or”), and they stem the remaining words so that only root words are stored.

Example

Let’s insert some sample documents.

// "blogs" is the database and "entries" is the collection.
var entries = db.getSiblingDB("blogs").entries;
entries.insertOne( {
  title : "my blog post",
  text : "i am writing a blog. yay",
  site: "home",
  language: "english" });
entries.insertOne( {
  title : "my 2nd post",
  text : "this is a new blog i am typing. yay",
  site: "work",
  language: "english" });
entries.insertOne( {
  title : "knives are Fun",
  text : "this is a new blog i am writing. yay",
  site: "home",
  language: "english" });

Now let's create the text index.

entries.createIndex({title: "text", text: "text"}, {
  weights: {
    title: 10,
    text: 5
  },
  name: "TextIndex",
  default_language: "english",
  language_override: "language" });

Queries such as these next examples will use the index:

var entries = db.getSiblingDB("blogs").entries;

entries.find({$text: {$search: "blog"}, site: "home"})

f. Hashed Index

MongoDB supports hash-based sharding and provides hashed indexes. These indexes are the hashes of the field value. Shards use hashed indexes and create a hash according to the field value to spread the writes across the sharded instances.
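
The effect of hashing a field value to spread writes can be sketched in plain Node.js. This is an illustration only: the function shardFor is hypothetical, and the use of MD5 with a simple modulo is an assumption made for brevity. A real cluster assigns ranges of hash values to shards rather than taking a modulo.

```javascript
const crypto = require("crypto");

// Map a shard key value to one of `shardCount` shards by hashing it.
// Assumption: MD5 and a simple modulo are used only for illustration.
function shardFor(value, shardCount) {
  const digest = crypto.createHash("md5").update(String(value)).digest();
  // Use the first 4 bytes of the digest as an unsigned integer.
  return digest.readUInt32BE(0) % shardCount;
}

// Sequential keys (made-up user ids) end up scattered over the shards.
const perShard = [0, 0, 0];
for (let i = 0; i < 1000; i++) {
  perShard[shardFor("user" + i, 3)]++;
}
console.log(perShard); // three counts that add up to 1000
```

Even though the keys are consecutive, their hashes are not, so inserts do not pile up on a single shard the way they can with a monotonically increasing range-based key.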

4. How many indexes does MongoDB create by default for a new collection?

By default MongoDB creates a unique index on the _id field during the creation of a collection. The _id index prevents clients from inserting two documents with the same value for the _id field.

5. Can you create an index in an array field in MongoDB?

Yes. To index a field that holds an array value, MongoDB creates an index key for each element in the array. Multikey indexes can be constructed over arrays that hold both scalar values (e.g. strings, numbers) and nested documents. MongoDB automatically creates a multikey index if any indexed field is an array.

Syntax

db.collection.createIndex( { <field>: < 1 or -1 > } )

For example, consider an inventory collection that contains the following documents:

{ _id: 10, type: "food", item: "aaa", ratings: [ 5, 8, 9 ] }

{ _id: 11, type: "food", item: "bbb", ratings: [ 5, 9 ] }

{ _id: 12, type: "food", item: "ccc", ratings: [ 9, 5, 8, 4, 7 ] }

The collection has a multikey index on the ratings field:

db.inventory.createIndex( { ratings: 1 } )

The following query looks for documents where the ratings field is the array [ 5, 9 ]:

db.inventory.find( { ratings: [ 5, 9 ] } )

MongoDB can use the multikey index to find documents that have 5 at any position in the ratings array. Then, MongoDB retrieves these documents and filters for documents whose ratings array equals the query array [ 5, 9 ].
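
The two steps described above (find candidates through the index, then filter for an exact array match) can be sketched in plain JavaScript. The helper names are hypothetical, and a simple scan plays the role of the index lookup here.

```javascript
const inventory = [
  { _id: 10, type: "food", item: "aaa", ratings: [5, 8, 9] },
  { _id: 11, type: "food", item: "bbb", ratings: [5, 9] },
  { _id: 12, type: "food", item: "ccc", ratings: [9, 5, 8, 4, 7] },
];

// Phase 1: stand-in for the multikey index lookup on the first element
// of the query array (here a plain scan plays the role of the index).
function candidatesContaining(docs, value) {
  return docs.filter((doc) => doc.ratings.includes(value));
}

// Phase 2: keep only documents whose array equals the query array exactly.
function arraysEqual(a, b) {
  return a.length === b.length && a.every((x, i) => x === b[i]);
}

function findExactRatings(docs, queryArray) {
  return candidatesContaining(docs, queryArray[0])
    .filter((doc) => arraysEqual(doc.ratings, queryArray));
}

console.log(findExactRatings(inventory, [5, 9]).map((d) => d._id)); // [ 11 ]
```

All three documents contain 5, so all three are candidates after the first phase; only the document with _id 11 survives the exact comparison.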

6. Why is the Profiler used in MongoDB?

The database profiler captures information about read and write operations, cursor operations, and database commands. The database profiler writes data to the system.profile collection, which is a capped collection.

The database profiler collects detailed information about Database Commands executed against a running mongod instance. This includes CRUD operations as well as configuration and administration commands.

Profiler has 3 profiling levels.

Level 0 – Profiler will not log any data

Level 1 – Profiler will log only slow operations above some threshold

Level 2 – Profiler will log all the operations

a. To get current profiling level.

db.getProfilingLevel()  

// Output

0

b. To check current profiling status

db.getProfilingStatus()

// Output

{ "was" : 0, "slowms" : 100 }

c. To set profiling level

db.setProfilingLevel(1, 40)

// Output

{ "was" : 0, "slowms" : 100, "ok" : 1 }
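
The three profiling levels boil down to a small decision rule, sketched here in plain JavaScript. The function shouldProfile is hypothetical and is not part of MongoDB. It assumes that level 1 records operations that take longer than the slowms threshold.

```javascript
// Decide whether an operation would be recorded, given the profiling level
// and the slow-operation threshold in milliseconds.
function shouldProfile(level, slowms, opMillis) {
  if (![0, 1, 2].includes(level)) {
    throw new RangeError("profiling level must be 0, 1 or 2");
  }
  if (level === 0) return false; // profiler off
  if (level === 2) return true;  // record every operation
  return opMillis > slowms;      // level 1: slow operations only
}

console.log(shouldProfile(1, 40, 25));  // false
console.log(shouldProfile(1, 40, 120)); // true
console.log(shouldProfile(2, 40, 1));   // true
```

With level 1 and a threshold of 40 ms, as set in the example above, only the 120 ms operation would be written to system.profile.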

7. How do you remove a field from a MongoDB document?

$unset

The $unset operator deletes a particular field. If the field does not exist, then $unset does nothing. When used with $ to match an array element, $unset replaces the matching element with null rather than removing the matching element from the array. This behavior keeps consistent the array size and element positions.

/*
Example:
delete the properties.service field from all documents in this collection.
*/

db.collection.updateMany(
    {},
    {
        $unset : {
            "properties.service" : 1
        }
    }
);

To verify that the field has been removed, count the documents that still contain it (the result should be 0):

db.collection.countDocuments(
    {
        "properties.service" : {
            $exists : true
        }
    }
);
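
To see what removing a nested field by its dotted path amounts to, here is a small plain JavaScript sketch that mimics the effect on an ordinary object. The helper unsetPath is hypothetical, the sample record is made up, and array handling is deliberately left out.

```javascript
// Remove the field at a dotted path (e.g. "properties.service") from a
// plain object, in place. Does nothing if any part of the path is missing.
function unsetPath(doc, path) {
  const parts = path.split(".");
  let target = doc;
  for (let i = 0; i < parts.length - 1; i++) {
    target = target[parts[i]];
    if (target === null || typeof target !== "object") return doc;
  }
  delete target[parts[parts.length - 1]];
  return doc;
}

const record = { _id: 1, properties: { service: "email", plan: "basic" } };
unsetPath(record, "properties.service");
console.log(record); // { _id: 1, properties: { plan: 'basic' } }
```

As with $unset, asking to remove a field that does not exist leaves the object unchanged.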

8. What is “Namespace” in MongoDB?

MongoDB stores BSON (Binary JSON) objects in collections. The concatenation of the database name and the collection name, separated by a dot (for example, blogs.entries), is called a namespace.

9. What is Replica Set in MongoDB?

A replica set is a group of mongod processes that maintain the same data set. Replica sets provide redundancy and high availability, and are the basis for all production deployments. A replica set contains a primary node and multiple secondary nodes.

The primary node receives all write operations. A replica set can have only one primary capable of confirming writes with { w: “majority” } write concern; although in some circumstances, another mongod instance may transiently believe itself to also be primary.

The secondaries replicate the primary’s oplog and apply the operations to their data sets such that the secondaries’ data sets reflect the primary’s data set. If the primary is unavailable, an eligible secondary will hold an election to elect itself the new primary.

10. How can you achieve primary key – foreign key relationships in MongoDB?

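MongoDB does not enforce primary key and foreign key constraints the way a relational database does. A similar relationship can be modeled either by embedding one document inside another, or by storing the _id of one document as a field in another document (a reference) and resolving it in the application or with the $lookup aggregation stage.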
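
One common way to model a primary key and foreign key style relationship in MongoDB is to store the _id of one document as a field in another and resolve the reference when reading. The plain JavaScript sketch below shows that resolution step in application code. The customers and orders data, the customer_id field name, and the withOrders helper are all hypothetical. On the server, the $lookup aggregation stage performs a comparable join.

```javascript
// Hypothetical collections: each order stores the _id of its customer.
const customers = [
  { _id: 1, name: "Alex" },
  { _id: 2, name: "John" },
];
const customerOrders = [
  { _id: 100, customer_id: 1, item: "notebook" },
  { _id: 101, customer_id: 2, item: "pen" },
  { _id: 102, customer_id: 1, item: "stapler" },
];

// Resolve the reference in application code: attach each customer's orders.
function withOrders(customerDocs, orderDocs) {
  return customerDocs.map((c) => ({
    ...c,
    orders: orderDocs.filter((o) => o.customer_id === c._id),
  }));
}

const joined = withOrders(customers, customerOrders);
console.log(joined[0].orders.map((o) => o.item)); // [ 'notebook', 'stapler' ]
```

Nothing stops an order from pointing at a customer_id that does not exist, which is the practical difference from an enforced foreign key.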

11. When should we embed one document within another in MongoDB?

You should consider embedding documents for:

contains relationships between entities

One-to-many relationships

Performance reasons

12. How is data stored in MongoDB?

In MongoDB, data is stored as BSON documents (short for Binary JSON), a binary representation of JSON (JavaScript Object Notation) documents. JSON documents support embedded fields, so related data and lists of data can be stored with the document instead of an external table. Documents contain one or more fields, and each field contains a value of a specific data type, including arrays, binary data and sub-documents. Documents that tend to share a similar structure are organized as collections.

JSON is formatted as name/value pairs. In JSON documents, field names and values are separated by a colon, field name and value pairs are separated by commas, and sets of fields are encapsulated in “curly braces” ({}).

Example:

{
  "name": "notebook",
  "qty": 50,
  "rating": [ { "score": 8 }, { "score": 9 } ],
  "size": { "height": 11, "width": 8.5, "unit": "in" },
  "status": "A",
  "tags": [ "college-ruled", "perforated"]
}

13. What are the differences between MongoDB and SQL-SERVER?

MongoDB stores data in JSON-like documents, whereas SQL Server stores data in tables.

MongoDB is designed for high performance, high availability, and easy horizontal scalability.

In MongoDB, we can change the structure simply by adding or removing fields in existing documents.

MongoDB and SQL Server Comparison Table

Base of Comparison | MS SQL Server  | MongoDB
Storage Model      | RDBMS          | Document-Oriented
Joins              | Yes            | Limited ($lookup aggregation stage)
Transaction        | ACID           | Multi-document ACID Transactions with snapshot isolation
Agile practices    | No             | Yes
Data Schema        | Fixed          | Dynamic
Scalability        | Vertical       | Horizontal
Map Reduce         | No             | Yes
Language           | SQL query lang | JSON Query Language
Secondary index    | Yes            | Yes
Triggers           | Yes            | Yes
Foreign Keys       | Yes            | No
Concurrency        | Yes            | Yes
XML Support        | Yes            | No

14. How does MongoDB support ACID transactions and locking functionalities?

ACID means that any update is:

Atomic: it either fully completes or it does not

Consistent: no reader will see a “partially applied” update

Isolated: no reader will see a “dirty” read

Durable: (with the appropriate write concern)

MongoDB has always supported ACID transactions in a single document and, when leveraging the document model appropriately, many applications don’t need ACID guarantees across multiple documents.

MongoDB is a document based NoSQL database with a flexible schema. Transactions are not operations that should be executed for every write operation since they incur a greater performance cost over a single document writes. With a document based structure and denormalized data model, there will be a minimized need for transactions. Since MongoDB allows document embedding, you don’t necessarily need to use a transaction to meet a write operation.

MongoDB version 4.0 provides multi-document transaction support for replica set deployments only; version 4.2 extends that support to sharded deployments.

Example: Multi-Document ACID Transactions in MongoDB

These are multi-statement operations that need to be executed sequentially without affecting each other. For example, below we run one transaction with two operations: one adds a user and the other updates a user with an age field.

// mongo shell; the database name "test" is an assumption for illustration
var session = db.getMongo().startSession()

session.startTransaction()

   var users = session.getDatabase("test").users

   users.insertOne({_id: 6, name: "John"})

   users.updateOne({_id: 3}, {$set: {age: 26}})

session.commitTransaction()

Transactions can be applied to operations against multiple documents contained in one or many collections or databases. Transaction support does not affect the performance of workloads that do not use transactions. Until the transaction is committed, uncommitted writes are neither replicated to the secondary nodes nor readable outside the transaction.
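
The all-or-nothing behaviour, and the fact that uncommitted writes stay invisible to other readers, can be illustrated with a toy in-memory store in plain JavaScript. This is a conceptual sketch only: ToyStore is a hypothetical class, the user named "Ann" is made up, and copying the whole data set is not how MongoDB implements transactions.

```javascript
// Toy in-memory "collection" keyed by _id.
class ToyStore {
  constructor(docs) {
    this.docs = new Map(docs.map((d) => [d._id, d]));
  }

  // Run fn against a private copy; publish the copy only if fn succeeds.
  transaction(fn) {
    const staged = new Map(
      [...this.docs].map(([id, doc]) => [id, { ...doc }])
    );
    fn(staged);         // may throw: nothing is published in that case
    this.docs = staged; // commit: all staged writes become visible at once
  }
}

const users = new ToyStore([{ _id: 3, name: "Ann" }]);

// Committed transaction: both writes become visible together.
users.transaction((docs) => {
  docs.set(6, { _id: 6, name: "John" });
  docs.get(3).age = 26;
});

// Aborted transaction: the insert of _id 7 is discarded.
try {
  users.transaction((docs) => {
    docs.set(7, { _id: 7, name: "Temp" });
    throw new Error("abort");
  });
} catch (e) {
  // the store is unchanged by the failed transaction
}

console.log([...users.docs.keys()]); // [ 3, 6 ]
```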

15. Explain limitations of MongoDB Transactions?

MongoDB transactions can exist only for relatively short time periods. By default, a transaction must span no more than one minute of clock time. This limitation results from the underlying MongoDB implementation. MongoDB uses MVCC, but unlike databases such as Oracle, the “older” versions of data are kept only in memory.

You cannot create or drop a collection inside a transaction.

Transactions cannot make writes to a capped collection

Transactions take longer to execute than single-document writes and can slow the performance of the database.

In MongoDB 4.0, transaction size is limited to 16MB, so larger transactions must be split into smaller ones; version 4.2 removes this limit by writing as many oplog entries as needed.

Subjecting a large number of documents to a transaction may exert excessive pressure on the WiredTiger engine and since it relies on the snapshot capability, there will be a retention of large unflushed operations in memory. This renders some performance cost on the database.

16. Should I normalize my data before storing it in MongoDB?

Data used by multiple documents can either be embedded (denormalized) or referenced (normalized). Normalization means splitting tables into multiple smaller ones to reduce data redundancy (1NF, 2NF, 3NF), at the cost of a more complex schema.

MongoDB often takes the opposite approach to SQL. In MongoDB, data normalization is not required. Instead, we frequently denormalize the data and fit related records into a single document with embedded sub-documents.

Example: Let’s say we have three tables

Table – 1 : ColumnA, ColumnB (primary key)

Table – 2 : ColumnC (Foreign key), ColumnD (primary key)

Table – 3 : ColumnE (foreign key), ColumnF

In this case, mongoDB document structure should be as follows.

{
    ColumnA : ValueA,
    ColumnB : ValueB,
    Subset1 : [{
       ColumnC : ValueC,
       ColumnD : ValueD,
       Subset2 : [{
           ColumnE : ValueE,
           ColumnF : ValueF
       }]
    }]
}
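
The nesting above can be produced mechanically from the three tables. The plain JavaScript sketch below does so under two assumptions that the example does not state explicitly: that ColumnC in Table 2 refers to ColumnB in Table 1, and that ColumnE in Table 3 refers to ColumnD in Table 2. The sample rows and the denormalize helper are hypothetical.

```javascript
// Hypothetical rows from the three tables above.
const table1 = [{ ColumnA: "a1", ColumnB: 1 }];
const table2 = [
  { ColumnC: 1, ColumnD: 10 },
  { ColumnC: 1, ColumnD: 11 },
];
const table3 = [
  { ColumnE: 10, ColumnF: "f1" },
  { ColumnE: 11, ColumnF: "f2" },
  { ColumnE: 11, ColumnF: "f3" },
];

// Nest child rows under their parent by following the assumed foreign keys:
// Table 2's ColumnC -> Table 1's ColumnB, Table 3's ColumnE -> Table 2's ColumnD.
function denormalize(t1, t2, t3) {
  return t1.map((r1) => ({
    ...r1,
    Subset1: t2
      .filter((r2) => r2.ColumnC === r1.ColumnB)
      .map((r2) => ({
        ...r2,
        Subset2: t3.filter((r3) => r3.ColumnE === r2.ColumnD),
      })),
  }));
}

const nested = denormalize(table1, table2, table3);
console.log(JSON.stringify(nested, null, 2));
```

Each Table 1 row becomes one document that carries its Table 2 rows in Subset1, and each of those carries its Table 3 rows in Subset2, so a single read returns the whole hierarchy.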

17. Does MongoDB push the writes to disk immediately or lazily?

MongoDB pushes the data to disk lazily. Each write is recorded in the journal first, but writing the data from the journal to the data files on disk happens lazily.

18. If you remove a document from the database, does MongoDB remove it from disk?

Yes. If you remove a document from the database, MongoDB will remove it from disk too.

19. What is Sharding in MongoDB?

Sharding is a method for distributing data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high throughput operations.

Database systems with large data sets or high throughput applications can challenge the capacity of a single server. For example, high query rates can exhaust the CPU capacity of the server. Working set sizes larger than the system’s RAM stress the I/O capacity of disk drives. There are two methods for addressing system growth: vertical and horizontal scaling.

a. Vertical Scaling

Vertical Scaling involves increasing the capacity of a single server, such as using a more powerful CPU, adding more RAM, or increasing the amount of storage space.

b. Horizontal Scaling

Horizontal Scaling involves dividing the system dataset and load over multiple servers, adding additional servers to increase capacity as required. While the overall speed or capacity of a single machine may not be high, each machine handles a subset of the overall workload, potentially providing better efficiency than a single high-speed high-capacity server.

MongoDB supports horizontal scaling through sharding. A MongoDB sharded cluster consists of the following components:

Shards: Each shard contains a subset of the sharded data. Each shard can be deployed as a replica set.

Mongos: The mongos acts as a query router, providing an interface between client applications and the sharded cluster. Starting in MongoDB 4.4, mongos can support hedged reads to minimize latencies.

Config Servers: Config servers store metadata and configuration settings for the cluster.
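
How the three components cooperate on a single request can be sketched in plain JavaScript. The sketch assumes range-based sharding on a numeric shard key. The chunk boundaries, shard names, and the routeTo function are hypothetical, and the array below only imitates the kind of metadata that config servers hold.

```javascript
// Hypothetical chunk metadata, as a config server might describe it:
// each chunk covers shard key values in [min, max) and lives on one shard.
const chunks = [
  { min: -Infinity, max: 100, shard: "shardA" },
  { min: 100, max: 200, shard: "shardB" },
  { min: 200, max: Infinity, shard: "shardC" },
];

// What a query router does conceptually: find the chunk whose range
// contains the shard key value and forward the operation to that shard.
function routeTo(chunkList, shardKeyValue) {
  const chunk = chunkList.find(
    (c) => shardKeyValue >= c.min && shardKeyValue < c.max
  );
  if (!chunk) throw new Error("no chunk covers " + shardKeyValue);
  return chunk.shard;
}

console.log(routeTo(chunks, 50));  // shardA
console.log(routeTo(chunks, 100)); // shardB
console.log(routeTo(chunks, 250)); // shardC
```

A query that includes the shard key can be sent to one shard this way; a query without it would have to be broadcast to every shard.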

20. What is Aggregation in MongoDB?

Aggregation in MongoDB is an operation used to process the data that returns the computed results. Aggregation basically groups the data from multiple documents and operates in many ways on those grouped data in order to return one combined result.

The aggregate function groups the records in a collection and can be used to compute the total (sum), average, minimum, maximum, etc. of the selected group. To perform aggregation in MongoDB, use the aggregate() function.

Syntax

db.collection_name.aggregate(aggregate_operation)

MongoDB provides three ways to perform aggregation:

the aggregation pipeline,

the map-reduce function,

and single purpose aggregation methods and commands.

MongoDB’s aggregation framework is modeled on the concept of data processing pipelines. Documents enter a multi-stage pipeline that transforms the documents into an aggregated result.

Example

db.orders.aggregate([
   { $match: { status: "A" } },
   { $group: { _id: "$cust_id", total: { $sum: "$amount" } } }
]);

The $match stage filters the documents by the status field and passes to the next stage those documents that have status equal to “A”. The $group stage groups the documents by the cust_id field to calculate the sum of the amount for each unique cust_id.
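
For readers who think in terms of array operations, the same two stages can be imitated in plain JavaScript. This is a sketch of the data flow only, with made-up orders and hypothetical helper names; it does not reflect how the server executes a pipeline.

```javascript
// Hypothetical orders; only status, cust_id, and amount matter here.
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" },
];

// Stage 1, like { $match: { status: "A" } }
function matchStatus(docs, status) {
  return docs.filter((d) => d.status === status);
}

// Stage 2, like { $group: { _id: "$cust_id", total: { $sum: "$amount" } } }
function groupTotals(docs) {
  const totals = new Map();
  for (const d of docs) {
    totals.set(d.cust_id, (totals.get(d.cust_id) || 0) + d.amount);
  }
  return [...totals].map(([_id, total]) => ({ _id, total }));
}

console.log(groupTotals(matchStatus(orders, "A")));
// [ { _id: 'A123', total: 750 }, { _id: 'B212', total: 200 } ]
```

The output of the first stage is the input of the second, which is the essence of the pipeline model.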

Expressions used by Aggregate function

Expression | Description
$sum       | Sums the defined values from all the documents in a group
$avg       | Calculates the average of the values from all the documents in a group
$min       | Returns the minimum of the values from all the documents in a group
$max       | Returns the maximum of the values from all the documents in a group
$addToSet  | Inserts values into an array in the resulting document, without duplicates
$push      | Inserts values into an array in the resulting document
$first     | Returns the value from the first document in each group
$last      | Returns the value from the last document in each group
