MongoDB & Node.js: Aggregation & Data Analysis (Part 2 of 4)
00:00:00 Introduction to Aggregation Framework
The video begins with an introduction to MongoDB's aggregation framework, explaining its purpose and capabilities in data analysis.
00:01:33 CRUD Operations and Resources
The presenter reviews the previously covered CRUD operations and directs viewers to a corresponding blog series for those who prefer reading.
00:01:26 Aggregation Framework Capabilities
The capabilities of the aggregation framework are discussed, including the ability to filter documents, join collections, group documents, and perform calculations like averages and sorting.
00:03:09 Practical Example: Airbnb Data in Sydney
A practical example is introduced, where the presenter aims to find the cheapest one-bedroom Airbnb listings in Sydney using the aggregation framework.
00:07:43 Building the Aggregation Pipeline
The presenter demonstrates how to build an aggregation pipeline using MongoDB Atlas, including stages like `$match`, `$group`, `$sort`, and `$limit`.
00:10:26 Executing the Pipeline in Node.js
The final section shows how to execute the aggregation pipeline within a Node.js script, including exporting the pipeline from MongoDB Atlas and running the script to get the desired output.
00:13:21 Conclusion and Further Learning
The video concludes with a summary of the aggregation framework's benefits and encourages viewers to take a free course on MongoDB University for a deeper understanding. Links to additional resources and the MongoDB community are provided.
The main theme of the video is to teach viewers how to use MongoDB's aggregation framework to analyze and process data efficiently, with a focus on constructing and executing aggregation pipelines in a Node.js environment.
🔑 Key Points
- Introduction to MongoDB's aggregation framework.
- Overview of creating and using aggregation pipelines.
- Practical example: Finding the cheapest Airbnb listings in Sydney.
- Execution of an aggregation pipeline in a Node.js script.
- Benefits of using MongoDB's aggregation framework for data analysis.
🔗 Related Links
- https://mdb.link/free-iz37fDe1XoM
- https://mdb.link/community-iz37fDe1XoM
- https://youtu.be/fbYExfeFsI0
- https://developer.mongodb.com/quickstart/node-aggregation-framework
- https://github.com/mongodb-developer/nodejs-quickstart/blob/master/aggregation.js
- https://github.com/mongodb-developer/nodejs-quickstart/blob/master/template.js
- https://university.mongodb.com/courses/M121/about
- https://twitter.com/lauren_schaefer
- https://tiktok.com/@lauren_schaefer
- https://www.linkedin.com/in/laurenjan...
- https://developer.mongodb.com/community/forums/t/hey-friends-im-lauren/168/
- https://bit.ly/3bpg1Z1
- https://bit.ly/2LjtNBZ
- https://bit.ly/3fH87gR
- https://bit.ly/3fEaIsd
- https://bit.ly/2SY9w90
- https://bit.ly/3bn9bDv
- https://bit.ly/2I8VCi5
- https://bit.ly/3fHoqdJ
Full Video Transcript
If you're just joining me in this Quick Start with MongoDB and Node.js video series, welcome! I'm glad to have you here. When you want to analyze data stored in MongoDB, you can use MongoDB's powerful aggregation framework to do so. Today I'll give you a high-level overview of the aggregation framework and show you how to use it. So far we've covered how to connect to MongoDB and perform each of the CRUD (that's create, read, update, and delete) operations. Prefer reading along instead? No worries. Check out the corresponding blog series that I wrote, which covers the exact same content that you'll see me show here today. The link is right there for you in the description. And with that, let's dive into the aggregation framework.
The aggregation framework allows you to analyze data in real time. Using the framework, you can create an aggregation pipeline that consists of one or more stages. Each stage transforms the documents and passes the output to the next stage. You can create multi-stage pipelines in order to do things like filter the documents, join them with documents from another collection, group documents together, calculate an average, and sort the results. The aggregation framework has a variety of stages available for you to use. Today we'll discuss the basics of how to use `$match`, `$group`, `$sort`, and `$limit`. Note that the aggregation framework has many other powerful stages, including `$count`, `$geoNear`, `$graphLookup`, `$project`, `$unwind`, and others.
I'm hoping to visit the beautiful city of Sydney, Australia soon. Sydney is a huge city with many suburbs, and I'm not sure where to start looking for a cheap rental. I want to know which Sydney suburbs have, on average, the cheapest one-bedroom Airbnb listings. I could write a query to pull all the one-bedroom listings in the city area, and then write a script to group the listings by suburb and calculate the average price per suburb. Or I could write a single command using the aggregation pipeline. Let's use the aggregation pipeline.
There are a variety of ways you can create aggregation pipelines. You can write them manually in a code editor or create them visually inside of MongoDB Atlas or MongoDB Compass. In general, I don't recommend writing pipelines manually, as it's much easier to understand what your pipeline is doing and spot errors when you use a visual editor. Since I've been using MongoDB Atlas for this video series, I'm just going to continue working there. I'm viewing the Aggregation Pipeline Builder in Atlas. The Aggregation Pipeline Builder provides you with a visual representation of your aggregation pipeline.
Let's begin by narrowing down the documents in the pipeline to one-bedroom listings in the Sydney, Australia market where the room type is Entire home/apartment. I can do so by using the `$match` stage. Now I can input a query in the code box. The query syntax for `$match` is the same as the `findOne` syntax that we used in the last video. For `bedrooms`, I'll choose 1. For `address.country`, I'll say Australia. For `address.market`, I'll say Sydney. I'm going to be using the `address.suburb` field later in the pipeline, so I'm going to filter out documents where `address.suburb` does not exist or is represented by an empty string. So for `address.suburb` I'll say `$exists` 1, meaning the field exists, and that it's not equal to an empty string. And last up, I'm going to set the room type to Entire home/apartment. As you can see, the Aggregation Pipeline Builder automatically updates the output on the right side of the row to show a sample of 20 documents that will be included in the results after the `$match` stage is executed.
Now that I've narrowed down the documents to one-bedroom listings in the Sydney, Australia market, I'm ready to group them by suburb. I can do so by using the `$group` stage. Atlas provides some sample code, which is helpful as I always forget the syntax. For `_id`, I'll input `$address.suburb`. This means that the documents will be grouped by the suburb field.
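Written out as a plain JavaScript object, the `$match` stage described above might look like the sketch below. The field names follow what is spoken in the video; the `room_type` field name and its stored value `"Entire home/apt"` are assumptions about how the sample Airbnb data spells them, so check them against your own collection before relying on this.

```javascript
// Sketch of the $match stage from the walkthrough, hard-coded for Sydney.
const matchStage = {
  $match: {
    bedrooms: 1,
    "address.country": "Australia",
    "address.market": "Sydney",
    // Keep only documents where the suburb field exists and is not empty.
    "address.suburb": { $exists: 1, $ne: "" },
    // Assumed field name and stored value for "Entire home/apartment".
    room_type: "Entire home/apt",
  },
};

// Print the stage so it can be compared with what the Atlas builder shows.
console.log(JSON.stringify(matchStage, null, 2));
```

Dotted keys such as `"address.suburb"` must be quoted in JavaScript because they reach into the embedded `address` document.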
When I group the documents, I want to calculate the average price for each group. I'll set `fieldN` to `averagePrice`. The accumulator I want to use is `$avg`, which means average. For the expression, I'll input `$price`, which means I'll be calculating an average on the price field. Here we can see a sample of documents after the group stage. Note that the documents have been transformed from those in the previous stage. They look a little different: instead of having a document for each listing, we now have a document for each suburb. The suburb documents have only two fields: `_id`, which is the name of the suburb, and `averagePrice`.
Now that we have the average prices for suburbs in the Sydney, Australia market, we are ready to sort them to discover which are the least expensive. We can do so by using the `$sort` stage. The field I want to sort on is `averagePrice`. I want to sort in ascending order, so I'll input 1. If I wanted to sort in descending order, I could input -1. Okay, now the documents are sorted from least to most expensive.
I don't want to work with all of the suburb documents in my application. Instead, I want to limit the results to the 10 least expensive suburbs. I can do so by using the `$limit` stage. I'm going to input 10.
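The three stages that follow `$match` can be sketched as plain JavaScript objects like this. The output field name `averagePrice` is the one chosen in the video; the constant name `groupSortLimitStages` is only illustrative.

```javascript
// Sketch of the stages that run after $match, in pipeline order.
const groupSortLimitStages = [
  // One output document per suburb, carrying the mean of the price field.
  { $group: { _id: "$address.suburb", averagePrice: { $avg: "$price" } } },
  // 1 sorts ascending (cheapest first); -1 would sort descending.
  { $sort: { averagePrice: 1 } },
  // Keep only the first 10 documents of the sorted output.
  { $limit: 10 },
];

// List the stage names to confirm the order.
console.log(groupSortLimitStages.map((stage) => Object.keys(stage)[0]));
```

Order matters here: `$limit` must come after `$sort`, otherwise the pipeline would keep an arbitrary 10 suburbs instead of the 10 cheapest.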
Now I have the 10 least expensive suburbs in Sydney. I've done so with just four stages of an aggregation pipeline.
Now that I've built an aggregation pipeline, let me show you how to execute it from inside of a Node.js script. I'm going to work from a starter template. This template is available in my GitHub repo if you'd like to follow along; a link is available in the description below. This template code is based on code I wrote in the first video of this series, so if you have any questions about the code or what the template is doing, head back to that first video.
I'm going to start by creating a function whose job is to print the 10 cheapest suburbs for a given market. Let's make this an asynchronous function named `printCheapestSuburbs`. It's going to need several parameters. First up, it's going to need a connected MongoClient. Then we need to know the country and market to search. And finally, we need to know the maximum number of results to print.
All right, the first thing I want to do in this function is create a pipeline. So I'll say `const pipeline =`, and we'll just make an empty array for now. I already created a pipeline in Atlas, so I don't need to hand-code the pipeline here. I'm going to head back to Atlas. I'll click the Export Pipeline Code to Language button. Let me select Node as the language, and then I'm going to copy this pipeline code. I'll jump back to my code editor and paste the pipeline code here. Now, this pipeline code would work fine as written. However, it's hard-coded to search for 10 results in the Sydney, Australia market. I'm going to update this pipeline to be a bit more generic, so I'll replace Australia with the `country` variable and Sydney with `market`. Then I'll replace the number 10 with `maxNumberToPrint`. All right, this pipeline looks good.
Now I need to actually execute this pipeline. I can execute a pipeline in Node.js by calling `aggregate` on a collection. I want to aggregate on the listings and reviews collection, so I'll say `client.db("sample_airbnb").collection("listingsAndReviews").aggregate`. Let's pass the pipeline to `aggregate`. `aggregate` will return an aggregation cursor, so let's assign the results to a constant named `aggCursor`. An aggregation cursor allows traversal over the aggregation pipeline results. We can use the aggregation cursor's `forEach` function to iterate over those results, so I'll say `await aggCursor.forEach`. I'm going to create an arrow function, and I'm going to name the parameter `airbnbListing`. For each listing, I'm just going to print the average price, so I'll say `console.log`, `airbnbListing._id`, colon, `airbnbListing.averagePrice`.
Okay, my function is complete. Let's call it. I want to print the 10 cheapest suburbs in the Sydney, Australia market, so I'll say `await printCheapestSuburbs`. I'll pass the MongoClient. For country, I'll say Australia. For market, I'll say Sydney. And for max number to print, I'll say 10. All right, let's try it out. I'm going to save my file, and let's run it. Excellent! Here we can see the 10 cheapest suburbs in the Sydney market, sorted from least to most expensive. Now I know what suburbs to begin searching as I prepare for my trip to Sydney, Australia.
Not too difficult, right? The nice thing here is that I didn't have to write multiple queries to make this happen, and I didn't have to process the data in my script. All of the calculations happened in MongoDB. The aggregation framework is an incredibly powerful way to analyze your data. Learning to create pipelines may seem a little intimidating at first, but it is worth the investment. The aggregation framework can get results to your end users faster and save you from a lot of coding.
Today I only scratched the surface of the aggregation framework. I highly recommend MongoDB University's free course specifically on the aggregation framework. It's called M121: The MongoDB Aggregation Framework, and did I mention it's free? The course has a more thorough explanation of how the aggregation framework works and provides detail on how to use the various pipeline stages. If you want to try out the code you saw me write today, check out my blog series that covers the exact same content. I also have a GitHub repo so you can see the code at a glance. Links to both of those are available in the description below.
Now that you have the basics of the aggregation pipeline, you are ready to move on to the next tutorial, all about change streams and triggers. In that video, you'll learn how to automatically react to changes in your database, so be sure to click subscribe so you do not miss that video. If you have any questions about what you saw in today's video, or really any question about MongoDB, I encourage you to ask them in the MongoDB community. My teammates and I are there every day answering questions and chatting with members of our community, and I would love to see you there. Hope to see you soon!
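The function described in the Node.js part of the walkthrough can be sketched roughly as follows. This is a reconstruction from the spoken description, not the author's exact file: it assumes `client` is an already-connected `MongoClient` from the official `mongodb` driver, that the database and collection are named `sample_airbnb` and `listingsAndReviews`, and that the room type is stored as `"Entire home/apt"`.

```javascript
// Sketch: print the cheapest suburbs for a given market.
// `client` is assumed to be a connected MongoClient supplied by the caller.
async function printCheapestSuburbs(client, country, market, maxNumberToPrint) {
  // Same four stages as built in Atlas, with the hard-coded values
  // replaced by the function parameters.
  const pipeline = [
    {
      $match: {
        bedrooms: 1,
        "address.country": country,
        "address.market": market,
        "address.suburb": { $exists: 1, $ne: "" },
        room_type: "Entire home/apt", // assumed spelling in the sample data
      },
    },
    { $group: { _id: "$address.suburb", averagePrice: { $avg: "$price" } } },
    { $sort: { averagePrice: 1 } },
    { $limit: maxNumberToPrint },
  ];

  // aggregate() returns an aggregation cursor over the pipeline results.
  const aggCursor = client
    .db("sample_airbnb")
    .collection("listingsAndReviews")
    .aggregate(pipeline);

  // Print one line per suburb: "<suburb>: <average price>".
  await aggCursor.forEach((airbnbListing) => {
    console.log(`${airbnbListing._id}: ${airbnbListing.averagePrice}`);
  });
}
```

Inside an async function that has already connected the client, the call from the video would be `await printCheapestSuburbs(client, "Australia", "Sydney", 10)`. Because the function only uses the client it is given, connecting and closing stay the caller's responsibility, as in the starter template.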