Managing a Smart Product Catalog Metadata Architecture With MongoDB Atlas and Google Cloud

Pedro Bereilh, Diego Canales, Angie Guemes Estrada • 9 min read • Published Dec 11, 2024 • Updated Dec 16, 2024
Digital transformation in retail has accelerated as customers adopt e-commerce platforms as their preferred purchase channels, with transactions expected to reach $8.1 trillion by 2026. To capture this market potential, retailers need to build a modern product catalog that can handle multiple types of information and modernize e-commerce portals, creating the next generation of online shopping.
This tutorial demonstrates how to leverage MongoDB Atlas and Google Cloud services to implement a metadata storage architecture, which will be referred to as Product Catalog Metadata Manager. This architecture stores the product catalog’s metadata in MongoDB and links to the images in Google Cloud Storage (GCS) buckets, allowing retailers to store data securely and provide easy access for modifications.
In a nutshell, the process works as follows:
  1. Images stored in GCS buckets are made securely accessible to the application through signed URLs. This metadata is stored in a MongoDB collection.
  2. A scheduled Atlas trigger calls an API deployed on a Google Cloud virtual machine (VM), which runs a microservice that signs the URL of each image, demonstrating how seamlessly MongoDB and Google Cloud can be integrated. For simplicity, this demo uses a single process to create the signed URLs; a production deployment would add resiliency measures.
  3. Then, each product document in the MongoDB collection is updated with the new signed URLs of the images stored in the GCS bucket. For simplicity, this happens every 12 hours. In practice, a retailer would design the update strategy around the size and freshness requirements of their catalog—e.g., smaller incremental updates rather than bulk updates to prevent database spikes, or caching techniques.
  4. Finally, the documents are available to be consumed by the client's application. In this case, the product catalog is queried by a Next.js application that displays the images on an e-commerce demo site.
This demo showcases key concepts for integrating MongoDB with Google Cloud. While providing a strong architecture foundation, additional integrations and enhancements might be necessary for production environments.

Considerations for storing product catalog data with MongoDB and Google Cloud

Product catalogs are crucial for retailers’ online success, as they are the connection point with clients. Moreover, catalogs must surface product information across multiple channels—for example, mobile, a website, an app—since customers today expect to transact through several routes. In this scenario, companies often struggle with the complexity of integrating a modern catalog system due to a myriad of obstacles, including rigid and expensive data infrastructure, uncontrolled technology sprawl, the proliferation of siloed data, compliance with data protection regulations, and duplication of information across multiple sites.
The Product Catalog Metadata Manager allows retailers to address the prior challenges by enhancing their catalog with the following capabilities:
  • Consolidating a single view with the adoption of the document model and Atlas as the operational layer for the product data.
  • Diminishing information duplication as images are stored only once in Cloud Storage buckets and linked through URLs to the MongoDB database.
  • Increasing security standards by protecting images against unauthorized users and image piracy. While this does not stop users or scraping bots from downloading an image while its URL is active, it adds an extra layer of security by expiring the URL. Retailers might also consider watermarking the images or placing them behind an authentication gateway to strengthen this further.
  • Slashing expenses through the adoption of cost-effective Cloud Storage buckets for image storage and URL string transfers to the database instead of heavy image files.
  • Automating data workflow through Atlas triggers.
Additionally, with the product metadata in MongoDB Atlas, enterprises can easily deliver an enhanced user experience on top of their catalog. For example, with Atlas Search and Vector Search, customers can effortlessly find relevant products. Moreover, with embedding models and large language models, clients can receive customized product recommendations based on their preferences or purchase patterns.
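As a hedged sketch of what such a Vector Search query could look like, the snippet below builds an aggregation pipeline with the $vectorSearch stage. The index name, embedding field, and limits are illustrative assumptions—they are not part of this demo.

```javascript
// Hypothetical Atlas Vector Search pipeline for product recommendations.
function buildVectorSearchPipeline(queryVector) {
  return [
    {
      $vectorSearch: {
        index: "product_vector_index", // assumed Atlas Vector Search index name
        path: "embedding",             // assumed field holding the product embedding
        queryVector,                   // embedding of the user's query or preferences
        numCandidates: 100,            // candidates considered before ranking
        limit: 10,                     // top matches returned
      },
    },
    // Return only the fields the storefront needs
    { $project: { name: 1, brand: 1, "image.url": 1, _id: 0 } },
  ];
}

// Usage, given an embedding produced by a model of your choice:
// const hits = await collection.aggregate(buildVectorSearchPipeline(embedding)).toArray();
```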

Building the architecture of the Product Catalog Metadata Manager

Let’s explore the technical aspects through a simulated e-commerce platform named Leafy Pop-up Store. Figure 1 shows the final output: how the images are displayed in the UI. In addition to the Product Catalog Metadata Manager, this demo integrates other features, such as dynamic pricing powered by Google Cloud Vertex AI and MongoDB Atlas.
Figure 1. Leafy Pop-up Store, an e-commerce demo showing a list of MongoDB merchandise products.
For the Product Catalog Metadata Manager implementation, retailers can refer to the diagram in Figure 2, divided into seven parts.
Figure 2. Architecture diagram of the Product Catalog Metadata Manager: a seven-step, high-level overview of the data flow between MongoDB Atlas and Google Cloud services.
Points 1 and 2 of Figure 2 explain the data flow process, while points 3 through 7 provide insights about individual elements of the architecture.

1. Periodically update signed URLs

The first half of the journey starts with the use of MongoDB Atlas automated built-in services and image retrieval from GCS buckets. In this architecture, images must be handled according to business rules and they must be protected from unauthorized access.
Following this logic, an Atlas trigger is scheduled to run every 12 hours. This trigger sends a GET request to a microservice hosted on a VM within Google Cloud Compute Engine, over the VM's external IP. Upon receiving a GET request at the /signURLs endpoint, the microservice retrieves all images stored in the Cloud Storage bucket and generates a new signed URL for each file. Signed URLs are time-limited addresses that grant temporary public access to specific Cloud Storage resources, such as images; the microservice stores them all in an array.
During this process, the microservice securely accesses the private Cloud Storage bucket through authentication, which ensures that only a service account with the necessary authorization privileges can examine this bucket, preventing unauthorized users from retrieving the product catalog’s files.

2. Update MongoDB product catalog

The second half of the journey refers to catalog management. It consists of CRUD operations performed on the product data—in this context, updating the database documents, followed by the consumption of the product catalog by the client’s application.
At this point, the microservice holds an array with the new URL values for each image. It updates the product catalog by looping through the array and modifying each MongoDB document in the products collection with its new signed URL—specifically, the image.url field that is sent to the client. The client application, built with Next.js, then requests the list of products from the MongoDB database and displays them on the Leafy Pop-up Store.
This completes the data flow of the Product Catalog Metadata Manager: the returned URLs stay accessible because they are refreshed every 12 hours. Next, let’s proceed to its main components.
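As an illustrative sketch of the bulk-update alternative mentioned earlier, the per-image updates could be batched into a single bulkWrite call. The helper name and input shape below are assumptions, not from the demo's code.

```javascript
// Build a bulkWrite payload from an array of { id, signedUrl } pairs,
// instead of issuing one updateOne call per image.
function buildUrlUpdates(signedUrls) {
  return signedUrls.map(({ id, signedUrl }) => ({
    updateOne: {
      filter: { id },                               // match the product by its numeric id
      update: { $set: { "image.url": signedUrl } }, // swap in the fresh signed URL
    },
  }));
}

// Usage: await collection.bulkWrite(buildUrlUpdates(urls));
```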

3. Configure the Atlas trigger

First, the architecture uses a trigger named “bucketSignerTrigger,” which is created inside Atlas Data Services. It has a basic Schedule Type, set to run once every 12 hours. See the configuration details in Figure 3 below.
Figure 3. Scheduled trigger configuration inside MongoDB Atlas: “Trigger Type” set to “Scheduled,” with a basic schedule repeating every 12 hours.
The trigger requires an event type. Choose the option “Function,” and inside the code block, call the microservice by making a GET request to the API.
// Making a GET request to the microservice
exports = async function(payload, response) {
  const url = "http://127.0.0.1:3000/signURLs"; // This is an example URL
  try {
    const httpResult = await context.http.get({ url: url });
    if (httpResult.statusCode === 200) {
      return httpResult.body.text();
    } else {
      return { error: "Failed to fetch data", statusCode: httpResult.statusCode };
    }
  } catch (error) {
    return { error: error.message };
  }
};
Notice that the URL consists of a VM external IP, a listening port, and an endpoint.
Figure 4. Microservice URL composition: the VM instance's external IP, the server's listening port, and the endpoint.

4. Host the microservice in Google Cloud Compute Engine

The next element of the Product Catalog Metadata Manager is the VM called “retail-store.” This instance is always running and is responsible for hosting the microservice and making it reachable to the MongoDB Atlas trigger through its external IP address. The microservice updates the products’ metadata as follows.
First, it connects to MongoDB and Google Cloud. For MongoDB, the connection is established through the Node.js Driver. For Google Cloud, a few steps are involved: A service account is created inside the project, roles are assigned, and a JSON key is generated and downloaded.
Follow the step-by-step tutorial if you wish to replicate it.
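The MongoDB side of that connection might look like the minimal sketch below. The environment-variable and database names are illustrative assumptions, not from the demo, and the mongodb npm package is required.

```javascript
// Connect with the Node.js driver and return the products collection.
async function connectProductsCollection(uri = process.env.MONGODB_URI) {
  const { MongoClient } = require("mongodb"); // MongoDB Node.js driver
  const client = new MongoClient(uri);
  await client.connect();
  return client.db("leafy_store").collection("products"); // assumed database name
}

// Usage: const collection = await connectProductsCollection();
```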
Next, it retrieves the resources from the private bucket by assigning the downloaded JSON key to the serviceAccountKey variable and passing it to the storage object. It then accesses all the files within the bucket and stores them in the [files] array, as elements of type File.
// Getting bucket resources
const serviceAccountKey = await loadJson(jsonFilePath);
const storage = new Storage({
  credentials: serviceAccountKey,
});

const bucketName = process.env.GC_STORAGE_BUCKET;
const folderName = process.env.GC_BUCKET_FOLDER;

const bucket = storage.bucket(bucketName);
const [files] = await bucket.getFiles({ prefix: folderName });
Next, it handles each file’s metadata with the File.getSignedUrl() method from the Cloud Storage Node.js client library to generate a signed URL. (Other solutions can use additional operations, such as downloading, deleting, and updating.) Note that the 14-hour expiration below exceeds the 12-hour trigger interval, leaving a buffer so URLs do not lapse between runs.
// Updating URL expiration time
const [signedUrl] = await file.getSignedUrl({
  action: "read",
  expires: Date.now() + 14 * 60 * 60 * 1000, // 14 hours
});
Finally, it updates the MongoDB database by assigning the signedUrl value to the image.url field.
// Updating MongoDB collection
const updateResult = await collection.updateOne(
  { id: id },
  { $set: { "image.url": signedUrl } }
);

5. Store the product catalog images in Cloud Storage

Inside Cloud Storage, the architecture contains a private bucket called “retail-product-images” (Figure 5) that stores all the products' images. Unlike public buckets, where objects can be accessed by anyone with the URL, private buckets require authentication and authorization to access the stored files.
Figure 5. Cloud Storage private bucket named retail-product-images.
In retail, having access control over the catalog images helps prevent image piracy. Moreover, each image file generates metadata that is used to handle the object. Figure 6 displays the bucket object’s metadata.
Figure 6. Metadata from a bucket’s object: type, size, creation date, and more.
Out of this metadata, two key fields stand out. First, the public URL, which would let anyone access the object—not applicable here, since the bucket is private. Second, public access, which defines the file’s permissions—in this case, the files are not public.

6. Inspect the product catalog schema in MongoDB

The product catalog is stored as a collection inside MongoDB, with each article saved as a document. For example, Figure 7 shows the document model for women's black footwear.
Figure 7. Product catalog document model (left) and the same product displayed in the e-commerce portal (right).
Within the document, two fields matter to the microservice. The id field is the numeric identifier for the product’s image; this number matches the image file name inside the Cloud Storage bucket after removing its extension (e.g., .png, .jpg, .jpeg). Likewise, the image.url field holds the product’s image URL—the signed address generated by the microservice.
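The filename-to-id convention could be implemented with a helper like the one below. It is illustrative only—not from the demo's code—and assumes files live under a folder prefix, as shown in the earlier getFiles() call.

```javascript
// Map a bucket object name such as "products/1042.png" to the numeric
// product id 1042, per the convention described above.
function productIdFromFileName(fileName) {
  const base = fileName.split("/").pop();                   // drop the folder prefix
  return parseInt(base.replace(/\.(png|jpe?g)$/i, ""), 10); // strip the extension
}

// productIdFromFileName("products/1042.png") returns 1042
```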

7. Connect the client to MongoDB

Finally, on the client's side, the Leafy Pop-up Store is built with the Next.js framework. All products are retrieved by sending a POST request to a getProducts API route, which queries the MongoDB collection; the response is then stored in a state variable.
// Retrieving the products catalog from the API.
const response = await axios.post("/api/getProducts", filters);
const transformedProducts = response.data.products.map((product) => ({
  id: product.id,
  photo: product.image.url,
  name: product.name,
  brand: product.brand,
  price: `${product.price.amount.toFixed(2)}`,
  pred_price: `${product.pred_price.toFixed(2)}`,
  items: product.items,
}));

setProducts(transformedProducts);
// getProducts API
import { NextResponse } from "next/server";
import { connectToDatabase } from "../../_db/connect";

export async function POST(request) {
  const filters = await request.json();
  const db = await connectToDatabase();
  const collection = db.collection("products");

  let queryBrand = {};
  let queryCategory = {};

  if (filters.selectedBrands && filters.selectedBrands.length > 0) {
    queryBrand = { brand: { $in: filters.selectedBrands } };
  }
  if (filters.selectedCategories && filters.selectedCategories.length > 0) {
    queryCategory = { masterCategory: { $in: filters.selectedCategories } };
  }

  const products = await collection
    .find(
      { $and: [queryBrand, queryCategory] },
      {
        projection: {
          name: 1,
          price: 1,
          brand: 1,
          image: 1,
          id: 1,
          _id: 0,
          pred_price: 1,
          items: 1,
        },
      }
    )
    .toArray();

  return NextResponse.json({ products }, { status: 200 });
}
Then, the client’s app loops through the products to display each one on the Leafy Pop-up Store.
// Displaying products in the UI
{products.map((product, index) => (
  <div key={index}>
    <ProductCard
      id={product.id}
      photo={product.photo}
      name={product.name}
      brand={product.brand}
      price={product.price}
      items={product.items}
    />
  </div>
))}
The client here is a Next.js application, but it could be any application that connects to a MongoDB Atlas database using a different framework, server, or programming language.
Refer to the official documentation to learn more about the connection methods.

Conclusion

In this architecture, MongoDB Atlas and Google Cloud were seamlessly integrated to create a modern product catalog for retailers. Embracing automated processes and having a strategy to store, access, and modify product data is a game-changer: it frees developers to focus on creating new, differentiating features that bring a competitive advantage to their businesses. To learn more about MongoDB’s solutions for retail, get started today with our developer data platform.
Feel free to look at the microservice inside the GitHub repository to explore the code or run the service by yourself.