Relational Modelling in NoSQL: Embracing the Power of Single Collection Design

Paul Allies
5 min read · Aug 10, 2023
Photo by Brett Sayles

Traditionally, we’ve designed relational databases by normalising data across multiple tables. This approach aims to reduce data repetition while preserving data integrity through relationships between those tables. Normalisation was first proposed by British computer scientist Edgar F. Codd as part of his relational model.

While the relational model is powerful for representing structured data, it’s important to consider the trade-offs involved in performing table joins. Joining tables in an RDBMS can introduce significant processing overhead, which can slow queries down, and the effect becomes more pronounced as datasets grow. In fact, Edgar Codd touches on this in one of his papers.

In this blog post we’ll delve into some of the concepts and ideas presented by Rick Houlihan on Single Collection Design and how they might help to solve this problem. Have a look at some of the presentations here and here.

RDBMSs such as MySQL, Oracle, and PostgreSQL give us the freedom to design databases that support ad hoc queries. This lets us ask the database a wide range of questions by joining tables at query time.

ERD

While optimisations like indexing can alleviate some of these challenges, they might not fully address the issue.

Denormalisation, which trades away some of the benefits of normalisation, is another option that may improve query performance.

NoSQL

This is where NoSQL databases may come to the fore, offering a more adaptable approach to handling vast and interconnected datasets. Let’s explore how the characteristics of NoSQL databases address these shortcomings and provide solutions tailored to the demands of modern data management.

We should leverage the advantages of a schema-less design to address these challenges. This entails adopting a fresh perspective on data and moving away from the practice of separating data into distinct collections. If we don’t, we’ll encounter the same issues we faced with an RDBMS. The key challenge lies in eliminating the need to join collections.

Single Collection Design

In a NoSQL database like MongoDB, a single collection design (SCD) refers to the practice of storing different types of related data in a single collection. This is in contrast to a multi-collection design, where data is spread across multiple collections. In SCD, documents with varying structures and attributes can coexist within the same collection.

To effectively connect and query this data:

Use Descriptive Attributes: Include attributes in your documents that help differentiate between different types of data. For example, you might have a “type” field that indicates the data’s category or purpose.

Utilise Indexing for relationships: Create indexes on the attributes that connect documents to one another.

Utilise Indexing for search: Create appropriate indexes on frequently queried attributes to improve query performance. This helps optimise searches across different document types within the collection.

While this flexible approach allows you to store diverse data types together, it’s important to carefully plan your schema, indexing, and queries to ensure efficient data retrieval and manipulation.
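As a minimal sketch of the first point, the snippet below models a collection as a plain JavaScript array and uses the “type” attribute to decide how each document should be read. The describe helper is hypothetical and exists only for illustration; the sample values are borrowed from the example later in this post.

```javascript
// A single collection, modelled as an in-memory array for illustration.
// Documents have different shapes but all carry a "type" discriminator.
const collection = [
  { _id: "customer_1", type: "customer", name: "TechSolutions Inc." },
  { _id: "employee_1", type: "employee", firstName: "Emily", lastName: "Parker" },
  { _id: "order_1", type: "order", orderDate: "2023-08-10" },
];

// Hypothetical helper: interpret a document according to its type.
function describe(doc) {
  switch (doc.type) {
    case "customer":
      return `Customer ${doc.name}`;
    case "employee":
      return `Employee ${doc.firstName} ${doc.lastName}`;
    case "order":
      return `Order placed on ${doc.orderDate}`;
    default:
      return `Unknown document type: ${doc.type}`;
  }
}

const descriptions = collection.map(describe);
console.log(descriptions);
```

Without the “type” attribute, the reading code would have to guess a document’s meaning from whichever fields happen to be present.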

It’s important to note that the decision to use the SCD should be based on a thorough understanding of your application’s data access patterns (how data is queried) and expected query performance.

The SCD approach represents a significant shift in our traditional perspective on relational data. Embrace this new approach and take on a small project to experiment with it. Let’s do that now.

Let’s consider a simple example of an order system using a NoSQL database like MongoDB. In this project, we’ll store the data in a single collection.

Example

Let’s take the following order system ERD and show how it can be modelled using the single collection design:

Order System: ERD

Using MongoDB’s index modeller, we can visually represent various document types within one collection:

Data in a Single Collection

Let’s start by creating an index on the “type” attribute. The aggregated view of this index looks like this:

TYPE-index

This supports the access pattern “get all documents of a specific type, optionally one page at a time”.
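The index and the paged read could be sketched as follows for mongosh. The helper names createTypeIndex and findPageByType are hypothetical, and the collection name “data” is taken from the queries later in this post. Each function takes the database handle as a parameter, so in mongosh you would pass the built-in db object.

```javascript
// Sketch for mongosh, where `db` is the current database handle.
// The helper names are hypothetical; "data" is the collection name
// used by the queries in this post.

// Create an ascending index on the "type" attribute.
function createTypeIndex(db) {
  return db.getCollection("data").createIndex({ type: 1 });
}

// Return one page of documents of a given type.
// pageNumber is 1-based; sorting on _id keeps the page order stable.
function findPageByType(db, type, pageNumber, pageSize) {
  return db
    .getCollection("data")
    .find({ type: type })
    .sort({ _id: 1 })
    .skip((pageNumber - 1) * pageSize)
    .limit(pageSize);
}
```

For example, findPageByType(db, "customer", 1, 20) would return a cursor over the first 20 customers. Paging with skip is the simplest option, but it tends to slow down for pages deep into a large result set, so treat this as a starting point only.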

Related Data

Let’s link data by adding another attribute called “connectedTo”. This attribute holds the keys of the documents that the document is connected to:

// Connect the order to its customer, its employee, and itself
{
  "_id": "order_1",
  "type": "order",
  "orderDate": "2023-08-10",
  "connectedTo": [
    "order_1",
    "employee_1",
    "customer_1"
  ]
}

// Link a customer to an order
{
  "_id": "customer_1",
  "type": "customer",
  "name": "TechSolutions Inc.",
  "connectedTo": [
    "order_1"
  ]
}

// Link an employee to an order and to itself
{
  "_id": "employee_1",
  "type": "employee",
  "firstName": "Emily",
  "lastName": "Parker",
  "connectedTo": [
    "order_1",
    "employee_1"
  ]
}

// Link each order detail to its order
{
  "_id": "order_1_detail_1",
  "type": "orderDetail",
  "unitPrice": 100,
  "qty": 2,
  "description": "NexusTech SmartWatch",
  "connectedTo": [
    "order_1"
  ]
}

{
  "_id": "order_1_detail_2",
  "type": "orderDetail",
  "unitPrice": 2000,
  "qty": 1,
  "description": "RoboGuard Home Security System",
  "connectedTo": [
    "order_1"
  ]
}

{
  "_id": "order_1_detail_3",
  "type": "orderDetail",
  "unitPrice": 40,
  "qty": 10,
  "description": "FusionDrive Gaming Keyboard",
  "connectedTo": [
    "order_1"
  ]
}

Let’s now create another index on the “connectedTo” attribute.

CONNECT-To-index
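In mongosh, creating this index might look like the sketch below. The helper name createConnectedToIndex is hypothetical, and the collection name “data” is assumed from the queries in this post. Because “connectedTo” holds an array, MongoDB builds a multikey index, with one index entry per array element.

```javascript
// Sketch for mongosh, where `db` is the current database handle.
// Hypothetical helper: index the "connectedTo" array attribute.
// MongoDB makes this a multikey index because the field holds an array,
// so each key in the array gets its own index entry.
function createConnectedToIndex(db) {
  return db.getCollection("data").createIndex({ connectedTo: 1 });
}
```

Calling createConnectedToIndex(db) once is enough; every key stored in a document’s “connectedTo” array can then be looked up through the index.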

With these two indexes we can serve the following access patterns:

Get all documents of a specific type

db.getCollection("data").find({"type": "customer"})
db.getCollection("data").find({"type": "supplier"})
...

Get a single order and all related data

db.getCollection("data").find({"connectedTo": "order_1"})
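The query above works because an equality filter on an array field matches any document whose array contains the value. That is also why order_1 lists its own _id in “connectedTo”: the order itself comes back together with its related documents. The sketch below imitates this in memory. The helpers findConnectedTo and groupIdsByType are hypothetical, and order_2 is a made-up extra document that shows what is left out.

```javascript
// In-memory stand-in for the "data" collection, using documents from this post.
// "order_2" is a hypothetical extra order, added to show what is NOT returned.
const data = [
  { _id: "order_1", type: "order", connectedTo: ["order_1", "employee_1", "customer_1"] },
  { _id: "customer_1", type: "customer", connectedTo: ["order_1"] },
  { _id: "employee_1", type: "employee", connectedTo: ["order_1", "employee_1"] },
  { _id: "order_1_detail_1", type: "orderDetail", connectedTo: ["order_1"] },
  { _id: "order_2", type: "order", connectedTo: ["order_2"] },
];

// Imitates an equality match on an array field:
// a document matches when any element of "connectedTo" equals the key.
function findConnectedTo(docs, key) {
  return docs.filter(
    (doc) => Array.isArray(doc.connectedTo) && doc.connectedTo.includes(key)
  );
}

// Group the matched document ids by type to assemble one view of the order.
function groupIdsByType(docs) {
  const groups = {};
  for (const doc of docs) {
    if (!groups[doc.type]) groups[doc.type] = [];
    groups[doc.type].push(doc._id);
  }
  return groups;
}

const orderView = groupIdsByType(findConnectedTo(data, "order_1"));
console.log(orderView);
```

A single read returns the order, its customer, its employee, and its order details, and the application groups them by “type” instead of joining collections.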

List all products supplied by a supplier

db.getCollection("data").find({"connectedTo": "supplier_1"})

List all orders an employee is responsible for

db.getCollection("data").find({"connectedTo": "employee_1"})

List all orders placed by a customer

db.getCollection("data").find({"connectedTo": "customer_1"})
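With the sample documents above, a filter on “connectedTo” alone returns every document that references the key, whatever its type. For instance, employee_1 lists its own _id, so the employee query also returns the employee document. If only orders are wanted, one option is to add the “type” attribute to the filter. The sketch below shows the difference in memory; matches is a hypothetical helper that imitates equality matching on scalar and array fields, not a MongoDB API.

```javascript
// In-memory sample taken from the documents in this post.
const sample = [
  { _id: "order_1", type: "order", connectedTo: ["order_1", "employee_1", "customer_1"] },
  { _id: "customer_1", type: "customer", connectedTo: ["order_1"] },
  { _id: "employee_1", type: "employee", connectedTo: ["order_1", "employee_1"] },
];

// Hypothetical helper: every field in the query must equal the document's
// value or, when the document's value is an array, be contained in it.
function matches(doc, query) {
  return Object.entries(query).every(([field, value]) => {
    const actual = doc[field];
    return Array.isArray(actual) ? actual.includes(value) : actual === value;
  });
}

// Return the ids of all sample documents that satisfy the query.
const ids = (query) =>
  sample.filter((doc) => matches(doc, query)).map((doc) => doc._id);

const everything = ids({ connectedTo: "employee_1" });
const ordersOnly = ids({ type: "order", connectedTo: "employee_1" });

console.log(everything); // order_1 and employee_1
console.log(ordersOnly); // order_1 only

// Equivalent mongosh filter (assuming the "data" collection used above):
// db.getCollection("data").find({ "type": "order", "connectedTo": "employee_1" })
```

Whether the extra documents are wanted depends on the access pattern: returning the employee together with their orders can save a second read, while the narrower filter returns orders only.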

Conclusion

While Relational Database Management Systems (RDBMS) have long been the cornerstone of data management, they come with limitations, particularly evident in their handling of complex data relationships through joins. The traditional approach of joining tables can lead to performance bottlenecks and increased complexity as data scales.

This is where NoSQL databases shine with their schema-less design. By accommodating diverse data structures within a single collection, NoSQL databases eliminate the need for complex joins and rigid schemas.

In the context of single collection design, NoSQL databases encourage a mindset shift, enabling us to break free from the constraints of predefined tables (or collections) and relationships. While the SCD approach might not be suitable for all scenarios, it presents an alternative option for projects where large, diverse, and related data types need to coexist harmoniously.
