Does Spark connector support "INSERT IGNORE"?

Hi.
I'm looking for an "INSERT IGNORE"-style feature in mongodb-spark-connector.
There's a unique key spanning multiple columns in MongoDB, and I wrote a daily batch job that runs on Spark. The batch should be retriable, i.e. idempotent, so when writing to the DB I want to ignore duplicate-key errors.
I've seen the SaveMode.Overwrite implementation, but it just drops the collection, which is not what I'm looking for.

Is there a way to insert while ignoring duplicate-key errors?
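
For reference, the setup is roughly this. Below is a minimal sketch with pymongo, where key_a and key_b are placeholders for the actual unique-key columns; a compound unique index like this is what makes plain inserts fail with E11000 duplicate key errors when the batch is re-run:

import pymongo

client = pymongo.MongoClient("mongodb://localhost:27017")  # placeholder URI
coll = client["mydb"]["daily_batch"]                       # placeholder names

# Re-running the batch and inserting the same rows again violates this index.
coll.create_index([("key_a", 1), ("key_b", 1)], unique=True)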

I recently went through this same scenario. To solve it, we can use mode("append") and add two options (operationType and upsertDocument).
If the item already exists, we replace it instead of creating a duplicate.
Example of this solution:

# Upsert instead of plain insert: replace matching documents, insert the rest
(df_test.write
    .format("mongodb")
    .mode("append")
    .option("connection.uri", "")
    .option("database", "")
    .option("collection", "")
    .option("ignoreNullValues", True)
    .option("operationType", "replace")  # replace the matched document
    .option("upsertDocument", True)      # insert when no match exists
    .save())
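
One caveat worth checking: with operationType "replace", the connector matches documents by _id by default. Since the unique key here spans multiple columns, the idFieldList write option (in connector 10.x, if I'm reading the write configuration docs correctly) can point the match at those columns instead. Here is a sketch, again with key_a/key_b as placeholder field names:

# Match on the composite unique key instead of _id (field names are placeholders)
(df_test.write
    .format("mongodb")
    .mode("append")
    .option("connection.uri", "")
    .option("database", "")
    .option("collection", "")
    .option("operationType", "replace")
    .option("upsertDocument", True)
    .option("idFieldList", "key_a,key_b")  # comma-separated list of match fields
    .save())

With that in place, re-running the same batch rewrites the same documents instead of raising duplicate-key errors.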

See the connector's write configuration documentation to learn more about these options.
