
How to Get Started with MongoDB Atlas Stream Processing and the HashiCorp Terraform MongoDB Atlas Provider

Rutuja Rajwade, Zuhair Ahmed • 5 min read • Published May 20, 2024 • Updated May 20, 2024
Terraform • Connectors
An event-driven architecture is a highly effective framework for creating applications that are responsive, scalable, and maintainable with loosely coupled components. This approach has become essential in developing applications that can rapidly respond to customer demands and dynamic market conditions.
For example, in healthcare, event-driven systems process data emitted by medical devices (like ECGs and blood glucose monitors) to provide real-time insights and alerts to healthcare providers, enabling timely interventions. Financial institutions utilize event-driven architectures to process and analyze transactions in real time. This allows for immediate fraud detection, risk assessment, and dynamic pricing based on current market conditions.
In this tutorial, we cover how to get started with MongoDB Atlas Stream Processing for near real-time data processing, using HashiCorp Terraform to provision and manage the underlying resources. By integrating MongoDB Atlas Stream Processing with the HashiCorp Terraform MongoDB Atlas Provider, developers can efficiently manage real-time data workflows and automate resource provisioning.
Let’s get started!

Prerequisites

  • A MongoDB Atlas account
  • MongoDB Atlas programmatic API keys to authenticate the Terraform provider
  • The Terraform CLI (version 0.13 or later) installed locally

Deploy foundational Atlas Stream Processing resources using the Terraform provider

There are three main components to consider with Atlas Stream Processing:
  • Stream Processing Instance (SPI) — This acts as the primary environment where your streaming data processes are executed. It's responsible for managing and executing the stream processing tasks, maintaining performance, and ensuring data integrity throughout the streaming lifecycle.
  • Connection registry — This serves as the conduit between your data sources/sinks and the processing instance. It's tasked with the secure and efficient transport of streaming data, facilitating continuous data ingestion from various sources into the SPI.
  • Stream processor — This is where the business logic for data transformation and analysis is defined and executed. Using MongoDB's expressive querying and aggregation capabilities, this component processes incoming data streams continuously, enabling immediate data manipulation and enrichment before forwarding it for storage or further processing.
At this time, the Terraform Atlas Provider supports management of the SPI (mongodbatlas_stream_instance) and connections in the Connection Registry (mongodbatlas_stream_connection), with programmatic support for stream processors coming in the future.
First, we need to deploy basic Atlas resources to get started. As part of this tutorial, this includes an Atlas project, M10 dedicated Atlas cluster, database user, and IP access list entry.
In addition, we will be deploying one Stream Processing Instance and two connections in the connection registry. See our Atlas Admin Open API Specification documentation for a full list of supported cloud providers, regions, and all other parameter inputs for these resources:
  • In this example, our Stream Processing Instance will be hosted on AWS us-east-1 using an SP30 tier. More information on Stream Processing tiers is available in the documentation.
  • For the connection registry, we will point our data source to the Sample type (solar data that we provide as a sample source to use when getting started), which is great for testing purposes. (The Kafka type is also supported.)
  • For a connection to the built-in sample_stream_solar, provide a configuration file with the following syntax:
{
  "name": "sample_stream_solar",
  "type": "Sample"
}
  • Our data sink will point to the M10 dedicated Atlas cluster we mentioned earlier.
  • In the sample Terraform script provided below, replace the “org_id” value with your own organization’s ID, and enter your own username and password in the “username” and “password” fields.
  • Ensure that the IP address added to the script is the address of the machine you will run Terraform from. If you are unsure of the IP address that you are running Terraform on (and you are performing this step from that machine), you can simply click “Use Current IP Address and Save” in the Atlas UI. Another option is to open up your IP access list to all, but this comes with significant potential risks. To do this, you can add the following two CIDRs: 0.0.0.0/1 and 128.0.0.0/1. Together, these entries open your IP access list to, at most, 4,294,967,296 (2^32) IPv4 addresses and should be used with caution.
The Terraform script below makes up our main.tf file.
terraform {
  required_providers {
    mongodbatlas = {
      source = "mongodb/mongodbatlas"
    }
  }
  required_version = ">= 0.13"
}

resource "mongodbatlas_project" "testProject" {
  name   = "testProject16"
  org_id = "[YOUR ORG_ID]"
}

resource "mongodbatlas_project_ip_access_list" "testIPAccessList" {
  project_id = mongodbatlas_project.testProject.id
  ip_address = "[YOUR IP ADDRESS]"
}

resource "mongodbatlas_database_user" "testUser" {
  project_id         = mongodbatlas_project.testProject.id
  username           = "[YOUR USERNAME]"
  password           = "[YOUR PASSWORD]"
  auth_database_name = "admin"
  roles {
    role_name     = "readWrite"
    database_name = "dbforApp"
  }
}

resource "mongodbatlas_advanced_cluster" "testCluster" {
  project_id   = mongodbatlas_project.testProject.id
  name         = "testCluster"
  cluster_type = "REPLICASET"
  replication_specs {
    region_configs {
      electable_specs {
        instance_size = "M10"
        node_count    = 3
      }
      provider_name = "AWS"
      priority      = 7
      region_name   = "US_EAST_1"
    }
  }
}

resource "mongodbatlas_stream_instance" "myFirstStreamInstance" {
  project_id    = mongodbatlas_project.testProject.id
  instance_name = "myFirstStreamInstance"
  data_process_region = {
    region         = "VIRGINIA_USA"
    cloud_provider = "AWS"
  }
  stream_config = {
    tier = "SP30"
  }
}

resource "mongodbatlas_stream_connection" "dataSource" {
  project_id      = mongodbatlas_project.testProject.id
  instance_name   = mongodbatlas_stream_instance.myFirstStreamInstance.instance_name
  connection_name = "sample_stream_solar"
  type            = "Sample"
}

resource "mongodbatlas_stream_connection" "dataSink" {
  project_id         = mongodbatlas_project.testProject.id
  instance_name      = mongodbatlas_stream_instance.myFirstStreamInstance.instance_name
  connection_name    = "dataSink"
  type               = "Cluster"
  cluster_name       = mongodbatlas_advanced_cluster.testCluster.name
  db_role_to_execute = {
    role = "atlasAdmin"
    type = "BUILT_IN"
  }
}
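The provider authenticates to Atlas using the programmatic API keys created in the prerequisites. As a minimal sketch (assuming the provider's standard MONGODB_ATLAS_PUBLIC_KEY and MONGODB_ATLAS_PRIVATE_KEY environment variables, with placeholder values), you could export the keys in the terminal session you will run Terraform from:

# Placeholder values: substitute your own Atlas programmatic API key pair.
export MONGODB_ATLAS_PUBLIC_KEY="<your-public-key>"
export MONGODB_ATLAS_PRIVATE_KEY="<your-private-key>"
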
Note: Before deploying, be sure to store your MongoDB Atlas programmatic API keys, created as part of the prerequisites, as environment variables so the provider can authenticate. To deploy, you can use the below commands from the terminal:
terraform init
terraform plan
terraform apply
If your deployment was successful, you should be greeted with “Apply complete!”
Screenshot of completed deployment
To confirm, you should be able to see your newly created SPI in the Atlas UI. (To navigate from the home screen, click on “Stream Processing” on the left-hand menu.)
Screenshot of Atlas UI after creation of stream processing instance
By going into the Connection Registry, we can also verify that our connections were successfully created.
Screenshot of Atlas UI showcasing connection registry
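You can also confirm this from the terminal. Running terraform state list from the directory containing main.tf lists every resource Terraform now tracks, including the Stream Processing Instance and both stream connections:

# Lists all resources in the Terraform state; expect to see entries such as
# mongodbatlas_stream_instance.myFirstStreamInstance and
# mongodbatlas_stream_connection.dataSource / mongodbatlas_stream_connection.dataSink.
terraform state list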

Get connection string and create stream processor

Next, you’ll need to retrieve the Atlas Stream Processing connection string and create your first stream processor.
While still in the Atlas UI, click “Connect” and select the tool you will use to access your data, such as the MongoDB Shell or MongoDB for VS Code.
Screenshot of Atlas UI showcasing connection to MongoDB Shell and MongoDB VSCode
This will generate your connection string which you can use to connect to your Stream Processing Instance.
At this point, you can create your first stream processors to define how your data will be processed in your SPI. As mentioned earlier, there is currently no Terraform resource for this configuration. To learn more about how stream processors can be configured, refer to Manage Stream Processors.
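As a rough sketch of what this step can look like (using the sample_stream_solar source and dataSink cluster connections defined above, with a hypothetical processor name and placeholder database/collection names), you might run something like the following in mongosh once connected to the SPI:

// Read every event from the built-in solar sample source connection.
let sourceStage = { $source: { connectionName: "sample_stream_solar" } };

// Write each event into the Atlas cluster behind the "dataSink" connection.
// "dbforApp" and "solarData" are placeholder database/collection names.
let mergeStage = {
  $merge: { into: { connectionName: "dataSink", db: "dbforApp", coll: "solarData" } }
};

// Create the stream processor and start it running continuously.
sp.createStreamProcessor("solarProcessor", [sourceStage, mergeStage]);
sp.solarProcessor.start();

Once it is running, sp.listStreamProcessors() should show its status, and sp.solarProcessor.sample() should return a live sample of the processed output.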
For more detailed information, also see our Atlas Stream Processing example in our Terraform repo.

All done

Congratulations! You have everything that you need now to run your first Atlas Stream Processing-based workflow.
With the above steps, teams can manage Atlas Stream Processing resources with the Terraform MongoDB Atlas Provider. Whether you are developing complex event processing systems, real-time analytics, or interactive user engagement features, Atlas Stream Processing integrated with Terraform provides a robust infrastructure to support scalable and resilient applications.
The HashiCorp Terraform Atlas Provider is open-sourced under the Mozilla Public License v2.0 and we welcome community contributions. To learn more, see our contributing guidelines.
The fastest way to get started is to create a MongoDB Atlas account from the AWS Marketplace. To learn more about the Terraform Provider, check out our documentation, tutorials, solution brief, or get started today.
Go build with MongoDB Atlas and the HashiCorp Terraform Atlas Provider today!
