Batch Write Configuration Options

Overview

You can configure the following properties when writing data to MongoDB in batch mode.

Note

If you use SparkConf to set the connector's write configurations, prefix each property with spark.mongodb.write.
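
As a brief illustration of the prefix rule, the unprefixed property names maxBatchSize and ordered from the table below would be written as follows in a SparkConf-style property list. The values shown are simply the documented defaults.

```properties
spark.mongodb.write.maxBatchSize=512
spark.mongodb.write.ordered=true
```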

Property name

Description

connection.uri

Required.
The connection string configuration key.

Default: mongodb://localhost:27017/

database

Required.

The database name configuration.

collection

Required.

The collection name configuration.

comment

The comment to append to the write operation. Comments appear in the
output of the Database Profiler.

Default: None

mongoClientFactory

MongoClientFactory configuration key.
You can specify a custom implementation that must implement the
com.mongodb.spark.sql.connector.connection.MongoClientFactory
interface.

Default: com.mongodb.spark.sql.connector.connection.DefaultMongoClientFactory

convertJson

Specifies if the connector parses string values and converts extended JSON
into BSON.

This setting accepts the following values:

any: The connector converts all JSON values to BSON.
- "{a: 1}" becomes {a: 1}.
- "[1, 2, 3]" becomes [1, 2, 3].
- "true" becomes true.
- "01234" becomes 1234.
- "{a:b:c}" remains unchanged.
objectOrArrayOnly: The connector converts only JSON objects and arrays to BSON.
- "{a: 1}" becomes {a: 1}.
- "[1, 2, 3]" becomes [1, 2, 3].
- "true" remains unchanged.
- "01234" remains unchanged.
- "{a:b:c}" remains unchanged.
false: The connector leaves all values as strings.

Default: false
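
A minimal sketch of enabling the more conservative conversion mode, assuming the string columns hold JSON objects or arrays that should be stored as BSON while scalar-looking strings such as "true" or "01234" should stay strings:

```properties
spark.mongodb.write.convertJson=objectOrArrayOnly
```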

idFieldList

Specifies a field or list of fields by which to split the collection data. To specify more than one field, separate them using a comma as shown in the following example:

"fieldName1,fieldName2"

Default: _id

ignoreNullValues

When true, the connector ignores any null values when writing,
including null values in arrays and nested documents.

Default: false

maxBatchSize

Specifies the maximum number of operations to batch in bulk
operations.

Default: 512

operationType

Specifies the type of write operation to perform. You can set this to one of the following values:

insert: Inserts the data.
replace: Replaces an existing document that matches the idFieldList value with the new data. If no match exists, the value of upsertDocument determines whether the connector inserts a new document.
update: Updates an existing document that matches the idFieldList value with the new data. If no match exists, the value of upsertDocument determines whether the connector inserts a new document.

Default: replace
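
The following fragment sketches an update keyed on a compound identifier. The field names customerId and orderId are hypothetical placeholders, not names defined by the connector. Setting upsertDocument to false means that rows with no matching document are not inserted.

```properties
spark.mongodb.write.operationType=update
spark.mongodb.write.idFieldList=customerId,orderId
spark.mongodb.write.upsertDocument=false
```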

ordered

Specifies whether to perform ordered bulk operations.

Default: true
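
As an illustrative sketch, a job that favors throughput over strict ordering might raise maxBatchSize and disable ordered bulk operations. The value 1024 is an arbitrary example, not a recommendation. In MongoDB, an unordered bulk operation continues with the remaining writes after an individual write fails, whereas an ordered one stops at the first error.

```properties
spark.mongodb.write.maxBatchSize=1024
spark.mongodb.write.ordered=false
```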

upsertDocument

When true, replace and update operations insert the data
if no match exists.

For time series collections, you must set upsertDocument to
false.

Default: true
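
For example, a write to a time series collection could be configured as in the sketch below. The collection name weatherReadings is a hypothetical placeholder, and choosing insert as the operation type is an assumption of this sketch rather than a documented requirement; only the upsertDocument setting is required by the rule above.

```properties
spark.mongodb.write.collection=weatherReadings
spark.mongodb.write.operationType=insert
spark.mongodb.write.upsertDocument=false
```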

writeConcern.w

Specifies w, a write-concern option requesting acknowledgment that
the write operation has propagated to a specified number of MongoDB
nodes.

For a list of allowed values for this option, see WriteConcern
w Option in the MongoDB Server
manual.

Default: Acknowledged

writeConcern.journal

Specifies j, a write-concern option requesting acknowledgment that
the data has been written to the on-disk journal for the criteria
specified in the w option. You can specify either true or
false.

For more information on j values, see WriteConcern j
Option in the MongoDB Server
manual.

writeConcern.wTimeoutMS

Specifies wTimeoutMS, a write-concern option to return an error
when a write operation exceeds the specified number of milliseconds. If you
use this optional setting, you must specify a nonnegative integer.

For more information on wTimeoutMS values, see
WriteConcern wtimeout in
the MongoDB Server manual.
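
Taken together, the three write-concern settings might be combined as in the following sketch. The value majority is one of the allowed w values listed in the MongoDB Server manual; the 5000 millisecond timeout is an arbitrary illustrative figure.

```properties
spark.mongodb.write.writeConcern.w=majority
spark.mongodb.write.writeConcern.journal=true
spark.mongodb.write.writeConcern.wTimeoutMS=5000
```

With these settings, a write is acknowledged only after a majority of nodes have recorded it in their on-disk journals, and an error is returned if that acknowledgment takes longer than five seconds.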

Specifying Properties in `connection.uri`

If you use SparkConf to specify any of the previous settings, you can either include them in the connection.uri setting or list them individually.

The following code example shows how to specify the database, collection, and convertJson setting as part of the connection.uri setting:

spark.mongodb.write.connection.uri=mongodb://127.0.0.1/myDB.myCollection?convertJson=any

To keep the connection.uri shorter and make the settings easier to read, you can specify them individually instead:

spark.mongodb.write.connection.uri=mongodb://127.0.0.1/
spark.mongodb.write.database=myDB
spark.mongodb.write.collection=myCollection
spark.mongodb.write.convertJson=any

Important

If you specify a setting both in connection.uri and on its own line, the connection.uri setting takes precedence. For example, in the following configuration, the connection database is foobar:

spark.mongodb.write.connection.uri=mongodb://127.0.0.1/foobar
spark.mongodb.write.database=bar
