Configuring Spark
Overview
You can configure read and write operations in both batch and streaming mode. To learn more about the available configuration options, see the following pages:
Specify Configuration
Using SparkConf
You can specify configuration options with SparkConf using any of the following approaches:
The --conf flag at runtime. To learn more, see Dynamically Loading Spark Properties in the Spark documentation.
The $SPARK_HOME/conf/spark-defaults.conf file.
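As a rough illustration of the first approach, the sketch below builds the argument list for a spark-submit invocation, with one --conf key=value pair per setting. The helper name conf_args, the application file name my_app.py, and the collection name are placeholders chosen for this example and are not part of the connector.

```python
import shlex

def conf_args(settings):
    """Return a flat list with one '--conf key=value' pair per setting."""
    args = []
    for key, value in settings.items():
        args += ["--conf", f"{key}={value}"]
    return args

# Placeholder setting; the key is the long-form write option for the collection.
settings = {"spark.mongodb.write.collection": "myCollection"}

# my_app.py is a hypothetical application file.
command = ["spark-submit", *conf_args(settings), "my_app.py"]
print(shlex.join(command))
```

The same key and value could instead go on a single line of spark-defaults.conf, separated by whitespace.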
The MongoDB Spark Connector uses the settings in SparkConf as defaults.
Using an Options Map
In the Spark API, the DataFrameReader, DataFrameWriter, DataStreamReader, and DataStreamWriter classes each contain an option() method. You can use this method to specify options for the underlying read or write operation.
Note
Options specified in this way override any corresponding settings in SparkConf.
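The precedence rule in the note above can be pictured with a toy model. This is only a sketch in plain Python, not the connector's implementation: spark_conf and options are ordinary dictionaries standing in for SparkConf and for values passed through option(), and effective_settings is a hypothetical helper.

```python
def effective_settings(spark_conf, options):
    """Merge two settings dictionaries; values in `options` win on conflict."""
    merged = dict(spark_conf)  # start from the SparkConf-style defaults
    merged.update(options)     # per-operation options override the defaults
    return merged

# Stand-in for a default supplied through SparkConf.
spark_conf = {"spark.mongodb.write.collection": "defaultCollection"}

# Stand-in for a value passed through option() on a single write.
options = {"spark.mongodb.write.collection": "myCollection"}

settings = effective_settings(spark_conf, options)
print(settings["spark.mongodb.write.collection"])
```

When options is empty, the model falls back to the value from spark_conf, which mirrors the role of SparkConf as the source of defaults.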
Short-Form Syntax
Options maps support short-form syntax. You may omit the prefix (for example, spark.mongodb.write. or spark.mongodb.) when specifying an option key string.
Example
The following syntaxes are equivalent to one another:
dfw.option("spark.mongodb.write.collection", "myCollection").save()
dfw.option("spark.mongodb.collection", "myCollection").save()
dfw.option("collection", "myCollection").save()
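One way to see why the three forms are equivalent is to normalize each of them to the long form. The sketch below does this for write options only. It is an illustration and not the connector's own resolution logic; to_long_form and the two prefix constants are names invented for this example, and the function assumes the key is already in one of the three forms shown above.

```python
WRITE_PREFIX = "spark.mongodb.write."
COMMON_PREFIX = "spark.mongodb."

def to_long_form(key):
    """Expand a write option key to its fully prefixed form."""
    if key.startswith(WRITE_PREFIX):
        return key  # already in long form
    if key.startswith(COMMON_PREFIX):
        # Insert the "write." segment after the common prefix.
        return WRITE_PREFIX + key[len(COMMON_PREFIX):]
    return WRITE_PREFIX + key  # bare option name

keys = [
    "spark.mongodb.write.collection",
    "spark.mongodb.collection",
    "collection",
]
long_forms = [to_long_form(k) for k in keys]
print(long_forms)
```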
To learn more about the option() method, see the following Spark documentation pages:
Using a System Property
The Spark Connector reads some configuration settings before SparkConf is available. You must specify these settings by using a JVM system property.
For more information on Java system properties, see the Java documentation.
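A JVM system property is typically passed on the command line as -Dname=value. As a hedged sketch, the helper below builds a spark-submit argument list that forwards such flags to the driver JVM through the --driver-java-options option. The property name example.setting is a placeholder and not a real connector setting, my_app.py is a hypothetical application file, and whether the driver is the JVM that needs the property depends on your deployment.

```python
import shlex

def system_property_flag(name, value):
    """Format one JVM system property as a -Dname=value flag."""
    return f"-D{name}={value}"

def submit_with_properties(app, properties):
    """Build a spark-submit argument list that passes the flags to the driver JVM."""
    # Values containing spaces would need extra quoting; this sketch ignores that.
    java_opts = " ".join(system_property_flag(k, v) for k, v in properties.items())
    return ["spark-submit", "--driver-java-options", java_opts, app]

# "example.setting" is a placeholder name, not a documented connector property.
cmd = submit_with_properties("my_app.py", {"example.setting": "123"})
print(shlex.join(cmd))
```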
Tip
Configuration Exceptions
If the Spark Connector throws a ConfigException, confirm that your SparkConf or options map uses correct syntax and contains only valid configuration options.