Cached Sampling
Overview
New in version 2.3:
By default, mongosqld samples each collection on the connected MongoDB instance and generates a relational representation of the schema which it then caches in memory.
Note
If you have authentication enabled, ensure that your MongoDB user has the correct permissions. See User Permissions below.
By default, mongosqld does not automatically resample data after generating the schema. Specify the --sampleRefreshIntervalSecs option to direct mongosqld to automatically resample the data and regenerate the schema on a fixed schedule.
If the schema that mongosqld creates does not meet your BI workload needs, you can manually generate a schema file and edit it as necessary.
See Sampling Mode Reference Chart for more information on sampling modes.
User Permissions for Cached Sampling
If your MongoDB instance uses authentication and you wish to use cached sampling, your BI Connector instance must also use authentication. The admin user that connects to MongoDB via the mongosqld program must have permission to read from all the namespaces from which you want to sample data.
Sample All Namespaces
If you wish to sample all namespaces, the admin user requires the following privileges:
listDatabases on the cluster
listCollections on each database
find on each database
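The three privileges above could be bundled into a custom role, sketched here for the mongo shell. The role name samplingAllReader is hypothetical, and creating the role in the admin database is an assumption rather than something this page prescribes. The script builds the role document first and calls createRole only when the shell's db object is available.

```javascript
// Hypothetical role name; the privileges mirror the three items listed above.
var samplingAllReaderRole = {
  role: "samplingAllReader",
  privileges: [
    // listDatabases is granted on the cluster resource.
    { resource: { cluster: true }, actions: [ "listDatabases" ] },
    // Empty db and collection strings match all non-system collections
    // in every database.
    { resource: { db: "", collection: "" }, actions: [ "listCollections", "find" ] }
  ],
  roles: []
};

// `db` exists only inside the mongo shell, so the role is created only there.
// Assumption: the role is stored in the admin database.
if (typeof db !== "undefined") {
  db.getSiblingDB("admin").createRole(samplingAllReaderRole);
}
```

A user assigned this role should be able to sample all namespaces without holding the broader readAnyDatabase role.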
Alternatively, create a user with the built-in readAnyDatabase role:
use admin
db.createUser(
  {
    user: "<username>",
    pwd: "<password>",
    roles: [ { "role": "readAnyDatabase", "db": "admin" } ]
  }
)
Note
Be aware of all privileges included with the readAnyDatabase role before granting it to a user.
To sample all namespaces, start mongosqld without the --sampleNamespaces option.
mongosqld --auth --mongo-username <username> --mongo-password <password>
Sample Specific Namespaces
If you wish to sample specific namespaces, the admin user requires the following privileges:
listCollections for each database where all collections are sampled
find on each collection or each database where all collections are sampled
Alternatively, create a user with the built-in readAnyDatabase role. For an example of creating a user with this role, see the Sample All Namespaces section.
Note
Be aware of all privileges included with the readAnyDatabase role before granting it to a user.
The following example creates a custom role in the mongo shell with the minimum required privileges to sample every collection in the test database:
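A minimal sketch of such a role, for the mongo shell. The role name samplingReader is taken from the createUser call that follows. Creating the role in the admin database, and using an empty collection string to cover every collection in test, are assumptions about the intended setup.

```javascript
// Role name taken from the createUser call that follows.
var samplingReaderRole = {
  role: "samplingReader",
  privileges: [
    {
      // An empty collection string matches every non-system collection in "test".
      resource: { db: "test", collection: "" },
      actions: [ "find", "listCollections" ]
    }
  ],
  roles: []
};

// `db` exists only inside the mongo shell, so the role is created only there.
// Assumption: the role is stored in the admin database.
if (typeof db !== "undefined") {
  db.getSiblingDB("admin").createRole(samplingReaderRole);
}
```

Because the createUser call refers to the role by name only, it needs to run against the same database in which the role was created.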
Create a new user and assign the newly created role to them
db.createUser( { user: "<username>", pwd: "<password>", roles: [ "samplingReader" ] } )
Note
The user in the example above does not have the listDatabases privilege, so you must specify a database to sample data from with the --sampleNamespaces option when running mongosqld.
Start mongosqld with authentication enabled
Run mongosqld with authentication enabled and use the --sampleNamespaces option to sample data from all collections in the test database:
mongosqld --auth --mongo-username <username> --mongo-password <password> \
          --sampleNamespaces 'test.*'