Cached Sampling

New in version 2.3.

By default, mongosqld samples each collection on the connected MongoDB instance, generates a relational representation of the schema, and caches that schema in memory.

Note

If you have authentication enabled, ensure that your MongoDB user has the correct permissions. See User Permissions below.

By default, mongosqld does not automatically resample data after generating the schema. Specify the --sampleRefreshIntervalSecs option to direct mongosqld to automatically resample the data and regenerate the schema on a fixed schedule.

If the schema that mongosqld creates does not meet your BI workload needs, you can manually generate a schema file and edit it as necessary.

See Sampling Mode Reference Chart for more information on sampling modes.

User Permissions for Cached Sampling

If your MongoDB instance uses authentication and you wish to use cached sampling, your BI Connector instance must also use authentication. The admin user that connects to MongoDB via the mongosqld program must have permission to read from all the namespaces from which you want to sample data.

Sample All Namespaces

If you wish to sample all namespaces, the admin user requires the following privileges:

  • listDatabases on the cluster

  • listCollections on each database

  • find on each database
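The three privileges listed above could also be bundled into a custom role instead of a built-in one. The following is a minimal sketch, not taken from the BI Connector documentation: the role name sampleAllReader is hypothetical, and the sketch assumes the role is created in the admin database. The createRole call only runs inside the mongo shell, where the global db object exists.

```javascript
// Hypothetical role document covering the privileges needed to sample all namespaces.
var sampleAllRole = {
  role: "sampleAllReader", // assumed name, chosen for illustration only
  privileges: [
    // listDatabases is granted on the cluster resource.
    { resource: { cluster: true }, actions: ["listDatabases"] },
    // Empty db and collection strings match all non-system collections in all databases.
    { resource: { db: "", collection: "" }, actions: ["find", "listCollections"] }
  ],
  roles: []
};

// Create the role only when running inside the mongo shell.
if (typeof db !== "undefined") {
  db.getSiblingDB("admin").createRole(sampleAllRole);
}
```

A user created with roles: [ "sampleAllReader" ] in the admin database would then hold only these sampling privileges.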

Alternatively, create a user with the built-in readAnyDatabase role:

use admin
db.createUser(
  {
    user: "<username>",
    pwd: "<password>",
    roles: [
      { "role": "readAnyDatabase", "db": "admin" }
    ]
  }
)

Note

Be aware of all privileges included with the readAnyDatabase role before granting it to a user.

To sample all namespaces, start mongosqld without the --sampleNamespaces option.

mongosqld --auth --mongo-username <username> --mongo-password <password>

Sample Specific Namespaces

If you wish to sample specific namespaces, the admin user requires the following privileges:

  • listCollections for each database where all collections are sampled

  • find on each collection or each database where all collections are sampled
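The rules above can be sketched as a small helper that maps namespaces to privilege documents suitable for db.createRole(). This is an illustrative sketch only: the function name samplingPrivileges is hypothetical, and it assumes just two namespace forms, `<database>.*` (every collection in a database) and `<database>.<collection>` (a single collection). Other wildcard patterns are rejected rather than guessed at.

```javascript
// Hypothetical helper: build privilege documents for a list of namespaces.
function samplingPrivileges(namespaces) {
  return namespaces.map(function (ns) {
    // Split on the first dot only, because collection names may contain dots.
    var dot = ns.indexOf(".");
    if (dot <= 0 || dot === ns.length - 1) {
      throw new Error("expected <database>.<collection>, got: " + ns);
    }
    var dbName = ns.slice(0, dot);
    var collName = ns.slice(dot + 1);
    if (dbName.indexOf("*") !== -1) {
      throw new Error("database wildcards are not handled in this sketch: " + ns);
    }
    if (collName === "*") {
      // Whole database: find plus listCollections on every collection.
      return {
        resource: { db: dbName, collection: "" },
        actions: ["find", "listCollections"]
      };
    }
    if (collName.indexOf("*") !== -1) {
      throw new Error("partial wildcards are not handled in this sketch: " + ns);
    }
    // Single collection: find only.
    return { resource: { db: dbName, collection: collName }, actions: ["find"] };
  });
}

// Example input; "sales.orders" is a made-up namespace.
var examplePrivileges = samplingPrivileges(["test.*", "sales.orders"]);
```

The resulting array can be passed as the privileges field of a role document.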

Alternatively, create a user with the built-in readAnyDatabase role. For an example of creating a user with this role, see the Sample All Namespaces section.

Note

Be aware of all privileges included with the readAnyDatabase role before granting it to a user.

The following example creates a custom role in the mongo shell with the minimum required privileges to sample every collection in the test database:

1. Create the custom role:

use admin
db.createRole(
  {
    role: "samplingReader",
    privileges: [
      {
        resource: {
          db: "test",
          collection: ""
        },
        actions: [ "find", "listCollections" ]
      }
    ],
    roles: []
  }
)

2. Create a user with the custom role:

db.createUser(
  {
    user: "<username>",
    pwd: "<password>",
    roles: [ "samplingReader" ]
  }
)

Note

The user in the example above does not have the listDatabases privilege, so you must specify a database to sample data from with the --sampleNamespaces option when running mongosqld.

3. Run mongosqld with authentication enabled and use the --sampleNamespaces option to sample data from all collections in the test database:

mongosqld --auth --mongo-username <username> --mongo-password <password> \
--sampleNamespaces 'test.*'
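
If sampling fails with an authorization error, it may help to confirm what the custom role actually grants. The sketch below is one possible check, not an official procedure: the helper name canSampleDatabase is hypothetical, and it inspects only the privileges granted directly on the named database (it ignores inherited privileges and grants made through an empty database name). It reads the document returned by db.getRole() with showPrivileges enabled.

```javascript
// Hypothetical helper: does this role document grant find and listCollections
// on every collection of dbName?
function canSampleDatabase(roleInfo, dbName) {
  if (!roleInfo) {
    return false; // db.getRole() returns null when the role does not exist
  }
  var needed = ["find", "listCollections"];
  var granted = {};
  (roleInfo.privileges || []).forEach(function (privilege) {
    var resource = privilege.resource || {};
    // Only count privileges scoped to the whole database.
    if (resource.db === dbName && resource.collection === "") {
      (privilege.actions || []).forEach(function (action) {
        granted[action] = true;
      });
    }
  });
  return needed.every(function (action) {
    return granted[action] === true;
  });
}

// Query the server only when running inside the mongo shell.
if (typeof db !== "undefined") {
  var roleInfo = db.getSiblingDB("admin").getRole("samplingReader", { showPrivileges: true });
  print(canSampleDatabase(roleInfo, "test"));
}
```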
