MongoDB Connector for Spark
The MongoDB Connector for Spark provides integration between MongoDB and Apache Spark.
Note
Version 10.x of the MongoDB Spark Connector is an all-new connector based on the latest Spark API. Install and migrate to version 10.x to take advantage of new capabilities, such as tighter integration with Spark Structured Streaming.
Version 10.x uses the new namespace com.mongodb.spark.sql.connector.MongoTableProvider. This allows you to use old versions of the connector (versions 3.x and earlier) in parallel with version 10.x.
To learn more about the new connector and its advantages, see the MongoDB announcement blog post.
With the connector, you have access to all Spark libraries for use with MongoDB datasets: Dataset for analysis with SQL (benefiting from automatic schema inference), streaming, machine learning, and graph APIs. You can also use the connector with the Spark Shell.
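As a rough illustration of the SQL workflow, the following PySpark sketch loads a MongoDB collection into a DataFrame so that Spark can infer its schema. It assumes version 10.x of the connector is on the Spark classpath and uses the short data source name mongodb with the connection.uri, database, and collection read options. The helper names mongo_read_options and load_collection are hypothetical and are not part of the connector.

```python
from typing import Dict

# Short data source name for connector 10.x (assumed here; the text above
# names the provider class com.mongodb.spark.sql.connector.MongoTableProvider).
MONGO_FORMAT = "mongodb"


def mongo_read_options(uri: str, database: str, collection: str) -> Dict[str, str]:
    """Build the reader options that identify one MongoDB collection."""
    return {
        "connection.uri": uri,
        "database": database,
        "collection": collection,
    }


def load_collection(spark, uri: str, database: str, collection: str):
    """Load a collection through an existing SparkSession.

    `spark` is expected to be a pyspark.sql.SparkSession that was started
    with the connector available. No explicit schema is passed, so the
    connector infers one from the collection's documents.
    """
    reader = spark.read.format(MONGO_FORMAT)
    for key, value in mongo_read_options(uri, database, collection).items():
        reader = reader.option(key, value)
    return reader.load()
```

With a running SparkSession named spark, a call such as load_collection(spark, "mongodb://localhost:27017", "test", "people") would return a DataFrame. You could then register it with createOrReplaceTempView and query it through spark.sql. The URI, database, and collection names here are placeholders.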
The MongoDB Spark Connector is compatible with the following versions of Apache Spark and MongoDB:
MongoDB Connector for Spark | Spark Version | MongoDB Version |
---|---|---|
10.4.0 | 3.1 through 3.5 | 4.0 or later |
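The compatibility row above can also be expressed as a small preflight check. The sketch below is only an illustration of the 10.4.0 row. The function names major_minor and is_supported are hypothetical, and the version strings would come from your own environment (for example, the version attribute of a SparkSession).

```python
import re
from typing import Tuple

# Bounds taken from the 10.4.0 row of the compatibility table.
SUPPORTED_SPARK_MIN = (3, 1)
SUPPORTED_SPARK_MAX = (3, 5)
SUPPORTED_MONGODB_MIN = (4, 0)


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '3.5.1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported(spark_version: str, mongodb_version: str) -> bool:
    """Return True if both versions fall inside the 10.4.0 ranges."""
    spark_ok = SUPPORTED_SPARK_MIN <= major_minor(spark_version) <= SUPPORTED_SPARK_MAX
    mongodb_ok = major_minor(mongodb_version) >= SUPPORTED_MONGODB_MIN
    return spark_ok and mongodb_ok
```

The check compares only the major and minor components, because the table lists ranges at that granularity.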