
Configuring Spark


You can configure read and write operations in both batch and streaming mode. To learn more about the available configuration options, see the following pages:

  • Batch Read Configuration Options

  • Batch Write Configuration Options

  • Streaming Read Configuration Options

  • Streaming Write Configuration Options

You can specify configuration options with SparkConf using any of the following approaches:

  • The SparkConf constructor in your application. To learn more, see the Java SparkConf documentation.

  • The --conf flag at runtime. To learn more, see Dynamically Loading Spark Properties in the Spark documentation.

  • The $SPARK_HOME/conf/spark-defaults.conf file.

The MongoDB Spark Connector uses the settings in SparkConf as defaults.
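For example, a minimal sketch in Java of setting connector defaults through the SparkConf constructor. The connection URIs and the test.myCollection namespace are placeholders; substitute your own deployment details:

    import org.apache.spark.SparkConf;
    import org.apache.spark.sql.SparkSession;

    public class MongoSparkDefaults {
        public static void main(String[] args) {
            // Connector settings placed in SparkConf act as defaults
            // for all subsequent read and write operations.
            SparkConf conf = new SparkConf()
                .setAppName("mongo-config-example")
                .set("spark.mongodb.read.connection.uri", "mongodb://localhost:27017/test.myCollection")
                .set("spark.mongodb.write.connection.uri", "mongodb://localhost:27017/test.myCollection");

            SparkSession spark = SparkSession.builder()
                .config(conf)
                .getOrCreate();

            // ... read and write operations here use these defaults ...

            spark.stop();
        }
    }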

In the Spark API, the DataFrameReader, DataFrameWriter, DataStreamReader, and DataStreamWriter classes each contain an option() method. You can use this method to specify options for the underlying read or write operation.

Note

Options specified in this way override any corresponding settings in SparkConf.
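For example, a minimal Java sketch of passing per-operation options through DataFrameReader.option(). The database and collection names are placeholders:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class MongoReadOption {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder().getOrCreate();

            // Options passed to option() apply to this read only and
            // override any corresponding settings in SparkConf.
            Dataset<Row> df = spark.read()
                .format("mongodb")
                .option("database", "test")           // placeholder database name
                .option("collection", "myCollection") // placeholder collection name
                .load();

            df.show();
            spark.stop();
        }
    }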

Options maps support short-form syntax: you can omit all or part of the key prefix when specifying an option.

Example

The following syntaxes are equivalent to one another:

  • dfw.option("spark.mongodb.write.collection", "myCollection").save()

  • dfw.option("spark.mongodb.collection", "myCollection").save()

  • dfw.option("collection", "myCollection").save()

To learn more about the option() method, see the Spark documentation for the DataFrameReader, DataFrameWriter, DataStreamReader, and DataStreamWriter classes.

The Spark Connector reads some configuration settings before SparkConf is available. You must specify these settings by using a JVM system property.

For more information on Java system properties, see the Java documentation.
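For example, a minimal Java sketch of setting a JVM system property before the connector initializes. The property name "mongodb.keep_alive_ms" is illustrative; confirm the exact settings that must be supplied this way against the connector documentation:

    public class ConnectorSystemProperty {
        public static void main(String[] args) {
            // Set the property programmatically before any connector
            // code runs. The property name here is illustrative; check
            // the connector docs for the settings it reads this way.
            System.setProperty("mongodb.keep_alive_ms", "5000");

            // ... build the SparkSession and run the application ...
        }
    }

You can also pass the property on the command line, for example through spark-submit's --driver-java-options flag (-Dmongodb.keep_alive_ms=5000).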

Tip

Configuration Exceptions

If the Spark Connector throws a ConfigException, confirm that your SparkConf or options map uses correct syntax and contains only valid configuration options.
