Set Up a Federated Database Instance for Your Dataset - Preview
This page guides you through the steps for creating a federated database instance for your Data Lake dataset.
Prerequisites
Before you begin, you must have the following:
An Atlas Data Lake dataset in the same project where you intend to create the federated database instance.
Project Owner role for the project where you want to create the federated database instance.
Procedure
Log in to MongoDB Atlas.
Go to Atlas Data Federation in the Atlas UI.
If it's not already displayed, select the organization that contains your project from the Organizations menu in the navigation bar.
If it's not already displayed, select your project from the Projects menu in the navigation bar.
In the sidebar, click Data Federation under the Services heading.
Create virtual databases, collections, and views and map them to your Data Lake dataset.
Follow the steps in the tab below for your preferred Editor view in the UI.
(Optional) Click the icon for the:
Federated Database Instance to specify a name for your federated database instance. Defaults to FederatedDatabaseInstance[n].
Database to edit the database name. Defaults to Database[n]. Corresponds to the databases.[n].name JSON configuration setting.
Collection to edit the collection name. Defaults to Collection[n]. Corresponds to the databases.[n].collections.[n].name JSON configuration setting.
View to edit the view name.
You can click:
Add Database to add databases and collections.
The icon associated with the database to add collections to the database.
The icon associated with the collection to add views on the collection. To create a view, you must specify:
The name of the view.
The pipeline to apply to the view.
Note
The view definition pipeline can't include the $out or the $merge stage. If the view definition includes nested pipeline stages such as $lookup or $facet, this restriction applies to those nested pipelines as well.
To learn more about views, see the MongoDB documentation on views.
The icon associated with the database, collection, or view to remove it.
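The restriction on view definition pipelines noted above can be checked before you paste a pipeline into the UI. The following is a minimal sketch, not part of Atlas: the function name find_forbidden_stages and the sample pipelines are hypothetical, and the sketch covers only the two nested cases named in the note ($lookup with a pipeline field and $facet).

```python
from typing import Any

# Stages that a view definition pipeline can't include, per the note above.
FORBIDDEN_STAGES = {"$out", "$merge"}


def find_forbidden_stages(pipeline: list[dict[str, Any]]) -> list[str]:
    """Return forbidden stage names found in a pipeline, including nested ones."""
    found: list[str] = []
    for stage in pipeline:
        for name, spec in stage.items():
            if name in FORBIDDEN_STAGES:
                found.append(name)
            elif name == "$lookup" and isinstance(spec, dict):
                # $lookup can carry its own sub-pipeline under "pipeline".
                found.extend(find_forbidden_stages(spec.get("pipeline", [])))
            elif name == "$facet" and isinstance(spec, dict):
                # $facet maps each output field to a sub-pipeline.
                for sub_pipeline in spec.values():
                    found.extend(find_forbidden_stages(sub_pipeline))
    return found


# Hypothetical pipelines for illustration only.
ok_pipeline = [{"$match": {"year": {"$gte": 2000}}}, {"$project": {"title": 1}}]
bad_pipeline = [
    {"$facet": {"recent": [{"$match": {"year": 2015}}, {"$out": "recent_movies"}]}}
]

print(find_forbidden_stages(ok_pipeline))   # []
print(find_forbidden_stages(bad_pipeline))  # ['$out']
```

An empty result means the sketch found none of the restricted stages. It does not guarantee that Atlas accepts the pipeline.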
Note
The sample queries that you can run later in this tutorial use the names Database0 for the virtual database name and Collection0 for the virtual collection name. If you modify the names here, make sure to also modify the names in the sample queries before you run them.
Drag and drop the Data Lake Dataset to map it to the collection.
Example
If you are creating a Federated Database Instance for the Atlas Data Lake dataset that you created for the sample data using the examples in Create an Atlas Data Lake Pipeline - Preview:
Under Datasets, select Ingestion Pipeline from the dropdown if it isn't already selected.
Under the Data Lake Dataset section, drag the dataset named sample_mflix.movies and drop it under the collection.
Corresponds to the databases.[n].collections.[n].dataSources JSON configuration setting.
Define your dataset as a data store in your Federated Database Instance storage configuration.
Edit the JSON configuration settings shown in the UI for stores. Your stores configuration setting should resemble the following:
{
  "stores": [
    {
      "name": "<store-name>",
      "provider": "<cloud-storage-provider-name>",
      "region": "<cloud-storage-provider-region>"
    }
  ]
}
To learn more about these settings, see Storage Configuration For Atlas Data Lake Datasets.
Example
If you are creating a Federated Database Instance for the Atlas Data Lake pipeline that you created for the sample data using the examples in Create an Atlas Data Lake Pipeline - Preview, replace the stores in the JSON configuration settings shown in the UI with the following:
{
  "stores": [
    {
      "name": "dls-store-us-east-1",
      "provider": "dls:aws",
      "region": "US_EAST_1"
    }
  ]
}
Define virtual databases, collections, and views for your dataset in the Atlas Data Federation storage configuration.
{
  "databases": [
    {
      "name": "<database-name>",
      "collections": [
        {
          "name": "<collection-name>",
          "dataSources": [
            {
              "storeName": "<store-name>",
              "datasetName": "<snapshot-name>"
            }
          ]
        }
      ],
      "views": []
    }
  ]
}
To learn more about these settings, see Storage Configuration For Atlas Data Lake Datasets.
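To see how the placeholders in the stores and databases settings fit together, the following sketch assembles both fragments into one document. It assumes that stores and databases sit side by side as top-level keys of a single storage configuration. The helper name build_storage_config is hypothetical, and the placeholder values are the ones shown above, not real settings.

```python
import json


def build_storage_config(store_name, provider, region,
                         database, collection, dataset_name):
    """Assemble a configuration dict with one store and one mapped collection."""
    return {
        "stores": [
            {"name": store_name, "provider": provider, "region": region}
        ],
        "databases": [
            {
                "name": database,
                "collections": [
                    {
                        "name": collection,
                        # storeName reuses the store's name so the two
                        # fragments refer to the same data store.
                        "dataSources": [
                            {"storeName": store_name, "datasetName": dataset_name}
                        ],
                    }
                ],
                "views": [],
            }
        ],
    }


# Placeholder values copied from the templates above; replace before use.
config = build_storage_config(
    store_name="<store-name>",
    provider="<cloud-storage-provider-name>",
    region="<cloud-storage-provider-region>",
    database="<database-name>",
    collection="<collection-name>",
    dataset_name="<snapshot-name>",
)
print(json.dumps(config, indent=2))
```

Passing the store name once and reusing it for storeName avoids a mismatch between the two fragments when you later paste the output into the JSON editor.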
Example
If you are creating a Federated Database Instance for the Atlas Data Lake dataset that you created for the sample data using the examples in Create an Atlas Data Lake Pipeline - Preview, replace the databases in the JSON configuration settings shown in the UI with the following:
{
  "databases": [
    {
      "name": "Database0",
      "collections": [
        {
          "name": "Collection0",
          "dataSources": [
            {
              "storeName": "dls-store-us-east-1",
              "datasetName": "v1$atlas$snapshot$dlsTest$sample_mflix$movies$$.<snapshot-id>"
            }
          ]
        }
      ],
      "views": []
    }
  ]
}
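Before saving, it can help to confirm that each storeName under dataSources matches a name defined under stores. The following sketch runs that check against the sample values from this page. The function undefined_store_refs is a hypothetical helper, not an Atlas API, and the check reflects the assumption that a data source must reference a defined store.

```python
import copy


def undefined_store_refs(config):
    """List (database, collection, storeName) entries whose store is not defined."""
    defined = {store["name"] for store in config.get("stores", [])}
    missing = []
    for db in config.get("databases", []):
        for coll in db.get("collections", []):
            for source in coll.get("dataSources", []):
                if source.get("storeName") not in defined:
                    missing.append((db["name"], coll["name"], source.get("storeName")))
    return missing


# Sample values from this page; <snapshot-id> is left as a placeholder.
example_config = {
    "stores": [
        {"name": "dls-store-us-east-1", "provider": "dls:aws", "region": "US_EAST_1"}
    ],
    "databases": [
        {
            "name": "Database0",
            "collections": [
                {
                    "name": "Collection0",
                    "dataSources": [
                        {
                            "storeName": "dls-store-us-east-1",
                            "datasetName": "v1$atlas$snapshot$dlsTest$sample_mflix$movies$$.<snapshot-id>",
                        }
                    ],
                }
            ],
            "views": [],
        }
    ],
}

print(undefined_store_refs(example_config))  # []

# Simulate a mistyped store name to show what the check reports.
broken = copy.deepcopy(example_config)
broken["databases"][0]["collections"][0]["dataSources"][0]["storeName"] = "dls-store-us-east-2"
print(undefined_store_refs(broken))  # [('Database0', 'Collection0', 'dls-store-us-east-2')]
```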
Next Steps
Now that you've created a Federated Database Instance for your Data Lake dataset, proceed to Connect to Your Federated Database Instance - Preview.