Monitoring a Self-Managed MongoDB Deployment

Monitoring is a critical component of all database administration. A firm grasp of MongoDB's reporting will allow you to assess the state of your database and maintain your deployment without crisis. Additionally, a sense of MongoDB's normal operational parameters will allow you to diagnose problems before they escalate to failures.

This document presents an overview of the available monitoring utilities and the reporting statistics available in MongoDB. It also introduces diagnostic strategies and suggestions for monitoring replica sets and sharded clusters.

MongoDB provides various methods for collecting data about the state of a running MongoDB instance:

  • MongoDB distributes a set of utilities that provides real-time reporting of database activities.

  • MongoDB provides various database commands that return statistics regarding the current database state with greater fidelity.

  • MongoDB Atlas is a cloud-hosted database-as-a-service for running, monitoring, and maintaining MongoDB deployments.

  • MongoDB Cloud Manager is a hosted service that monitors running MongoDB deployments to collect data and provide visualization and alerts based on that data.

  • MongoDB Ops Manager is an on-premises solution available in MongoDB Enterprise Advanced that monitors running MongoDB deployments to collect data and provide visualization and alerts based on that data.

Each strategy can help answer different questions and is useful in different contexts. These methods are complementary.

This section provides an overview of the reporting methods distributed with MongoDB. It also offers examples of the kinds of questions that each method is best suited to help you address.

The MongoDB distribution includes a number of utilities that quickly return statistics about instances' performance and activity. Typically, these are most useful for diagnosing issues and assessing normal operation.

mongostat captures and returns the counts of database operations by type (for example, insert, query, update, and delete). These counts report on the load distribution on the server.

Use mongostat to understand the distribution of operation types and to inform capacity planning. See the mongostat reference page for details.

mongotop tracks and reports the current read and write activity of a MongoDB instance, and reports these statistics on a per collection basis.

Use mongotop to check whether your database activity and use match your expectations. See the mongotop reference page for details.

MongoDB includes a number of commands that report on the state of the database.

These data may provide a finer level of granularity than the utilities discussed above. Consider using their output in scripts and programs to develop custom alerts, or to modify the behavior of your application in response to the activity of your instance. The db.currentOp() method is another useful tool for identifying the database instance's in-progress operations.

The serverStatus command, or db.serverStatus() from the shell, returns a general overview of the status of the database, detailing disk usage, memory use, connection, journaling, and index access. The command returns quickly and does not impact MongoDB performance.

serverStatus outputs an account of the state of a MongoDB instance. This command is rarely run directly. In most cases, the data is more meaningful when aggregated, as one would see with monitoring tools including MongoDB Cloud Manager and Ops Manager. Nevertheless, all administrators should be familiar with the data provided by serverStatus.
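As a rough illustration of using serverStatus output in a script, the sketch below reduces a status document to a few headline numbers. The function name and the sample values are hypothetical. The fields it reads (connections.current, connections.available, opcounters, and mem.resident) are part of the serverStatus output, but check the reference for your version before relying on them.

```javascript
// Summarize a serverStatus document into a few headline numbers.
// `status` is assumed to be the document returned by db.serverStatus().
function summarizeServerStatus(status) {
  // Number() accepts plain numbers and also converts 64-bit Long values,
  // which mongosh may return for some counters.
  const current = Number(status.connections.current);
  const available = Number(status.connections.available);
  const ops = status.opcounters;
  return {
    // Fraction of the connection limit currently in use (0 to 1).
    connectionUse: current / (current + available),
    // Operations counted since the process started, for four basic types.
    totalOps:
      Number(ops.insert) + Number(ops.query) +
      Number(ops.update) + Number(ops.delete),
    residentMemoryMiB: Number(status.mem.resident),
  };
}

// Hypothetical sample with the same shape as real output (values invented).
const sampleStatus = {
  connections: { current: 25, available: 75 },
  opcounters: { insert: 100, query: 300, update: 50, delete: 50 },
  mem: { resident: 512 },
};
console.log(summarizeServerStatus(sampleStatus));
```

In mongosh, the same function could be called as summarizeServerStatus(db.serverStatus()). Sampling it at intervals and comparing successive totals gives a rate, which is usually more meaningful than a single reading.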

The dbStats command, or db.stats() from the shell, returns a document that addresses storage use and data volumes. The dbStats output reflects the amount of storage used, the quantity of data contained in the database, and object, collection, and index counters.

Use this data to monitor the state and storage capacity of a specific database. This output also allows you to compare use between databases and to determine the average document size in a database.
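The average document size can be derived from two dbStats fields, dataSize and objects. The following is a minimal sketch, assuming `stats` is the document returned by db.stats(). The function names and the sample values are hypothetical.

```javascript
// Average document size in bytes, derived from a dbStats document.
function averageDocumentSize(stats) {
  const objects = Number(stats.objects);
  if (objects === 0) return 0; // empty database: avoid dividing by zero
  return Number(stats.dataSize) / objects;
}

// Compare several databases and return the one holding the most data.
function largestDatabase(statsList) {
  return statsList.reduce((largest, stats) =>
    Number(stats.dataSize) > Number(largest.dataSize) ? stats : largest
  );
}

// Hypothetical samples shaped like dbStats output (values invented).
const sampleDbStats = [
  { db: "sales", objects: 4, dataSize: 1024 },
  { db: "audit", objects: 10, dataSize: 500 },
];
console.log(averageDocumentSize(sampleDbStats[0]));
console.log(largestDatabase(sampleDbStats).db);
```

dbStats also reports avgObjSize directly. Computing the value from dataSize and objects only makes the relationship explicit.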

The collStats command, or db.collection.stats() from the shell, provides statistics that resemble dbStats at the collection level, including a count of the objects in the collection, the size of the collection, the amount of disk space used by the collection, and information about its indexes.

The replSetGetStatus command (rs.status() from the shell) returns an overview of your replica set's status. The replSetGetStatus document details the state and configuration of the replica set and statistics about its members.

Use this data to ensure that replication is properly configured, and to check the connections between the current host and the other members of the replica set.

These are monitoring tools provided as a hosted service, usually through a paid subscription.

Name
Notes

MongoDB Cloud Manager is a cloud-based suite of services for managing MongoDB deployments. MongoDB Cloud Manager provides monitoring, backup, and automation functionality. For an on-premises solution, see also Ops Manager, available in MongoDB Enterprise Advanced.

VividCortex provides deep insights into MongoDB production database workload and query performance -- in one-second resolution. Track latency, throughput, errors, and more to ensure scalability and exceptional performance of your application on MongoDB.

Dashboard for MongoDB, MongoDB specific alerts, replication failover timeline and iPhone, iPad and Android mobile apps.

IBM has an Application Performance Management SaaS offering that includes a monitor for MongoDB and other applications and middleware.

New Relic offers full support for application performance management. In addition, New Relic Plugins and Insights enable you to view monitoring metrics from Cloud Manager in New Relic.

Infrastructure monitoring to visualize the performance of your MongoDB deployments.

Monitoring, Anomaly Detection and Alerting. SPM monitors all key MongoDB metrics together with infrastructure metrics, including Docker, and other application metrics, e.g. Node.js, Java, NGINX, Apache, HAProxy or Elasticsearch. SPM provides correlation of metrics and logs.

Pandora FMS provides the PandoraFMS-mongodb-monitoring plugin to monitor MongoDB.

During normal operation, mongod and mongos instances report a live account of all server activity and operations to either standard output or a log file. The following runtime settings control these options.

  • quiet. Limits the amount of information written to the log or output.

  • verbosity. Increases the amount of information written to the log or output. You can also modify the logging verbosity during runtime with the logLevel parameter or the db.setLogLevel() method in the shell.

  • path. Enables logging to a file, rather than the standard output. You must specify the full path to the log file when adjusting this setting.

  • logAppend. Adds information to a log file instead of overwriting the file.

Note

You can specify these configuration options as command-line arguments to mongod or mongos.

For example:

mongod -v --logpath /var/log/mongodb/server1.log --logappend

Starts a mongod instance in verbose mode, appending data to the log file at /var/log/mongodb/server1.log.

The following database commands also affect logging:

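Available in MongoDB Enterprise only.

A mongod or mongos running with redactClientLogData redacts any message accompanying a given log event before logging, leaving only metadata, source files, or line numbers related to the event. redactClientLogData prevents potentially sensitive information from entering the system log at the cost of diagnostic detail.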

For example, the following operation inserts a document into a mongod running without log redaction. The mongod has the log verbosity level set to 1:

db.clients.insertOne( { "name" : "Joe", "PII" : "Sensitive Information" } )

This operation produces the following log event:

{
  "t": { "$date": "2024-07-19T15:36:55.024-07:00" },
  "s": "I",
  "c": "COMMAND",
  ...
  "attr": {
    "type": "command",
    ...
    "appName": "mongosh 2.2.10",
    "command": {
      "insert": "clients",
      "documents": [
        {
          "name": "Joe",
          "PII": "Sensitive Information",
          "_id": { "$oid": "669aea8792c7fd822d3e1d8c" }
        }
      ],
      "ordered": true,
      ...
    }
    ...
  }
}

When mongod runs with redactClientLogData and performs the same insert operation, it produces the following log event:

{
  "t": { "$date": "2024-07-19T15:36:55.024-07:00" },
  "s": "I",
  "c": "COMMAND",
  ...
  "attr": {
    "type": "command",
    ...
    "appName": "mongosh 2.2.10",
    "command": {
      "insert": "###",
      "documents": [
        {
          "name": "###",
          "PII": "###",
          "_id": "###"
        }
      ],
      "ordered": "###",
      ...
    }
    ...
  }
}

Use redactClientLogData in conjunction with Encryption at Rest and TLS/SSL (Transport Encryption) to assist compliance with regulatory requirements.
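To spot-check that redaction is in effect, a script can parse a log line and confirm that every value under attr.command is the "###" marker seen in the redacted sample. The sketch below is based only on the two sample events in this section. The function names are hypothetical, and real log entries may contain fields that the samples do not show.

```javascript
// Return true when every leaf value under `value` is the redaction marker.
function isFullyRedacted(value) {
  if (Array.isArray(value)) {
    return value.every((item) => isFullyRedacted(item));
  }
  if (value !== null && typeof value === "object") {
    return Object.values(value).every((item) => isFullyRedacted(item));
  }
  return value === "###";
}

// Parse one JSON log line and check the command it recorded.
function commandIsRedacted(logLine) {
  const entry = JSON.parse(logLine);
  return isFullyRedacted(entry.attr.command);
}

// Hypothetical log lines reduced to the fields shown in the samples.
const plainLine = JSON.stringify({
  c: "COMMAND",
  attr: { command: { insert: "clients", documents: [{ name: "Joe" }] } },
});
const redactedLine = JSON.stringify({
  c: "COMMAND",
  attr: { command: { insert: "###", documents: [{ name: "###" }] } },
});
console.log(commandIsRedacted(plainLine), commandIsRedacted(redactedLine));
```

A check like this is only a sanity test on sampled lines. It does not prove that no sensitive value reaches the log elsewhere.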

As you develop and operate applications with MongoDB, you may want to analyze the performance of the application and its database. MongoDB Performance discusses some of the operational factors that can influence performance.

Beyond the basic monitoring requirements for any MongoDB instance, administrators of replica sets must monitor replication lag. "Replication lag" refers to the amount of time that it takes to copy (i.e. replicate) a write operation on the primary to a secondary. Some small delay period may be acceptable, but significant problems emerge as replication lag grows, including:

  • Growing cache pressure on the primary.

  • Operations that occurred during the period of lag are not replicated to one or more secondaries. If you're using replication to ensure data persistence, exceptionally long delays may impact the integrity of your data set.

  • If the replication lag exceeds the length of the operation log (oplog) then MongoDB will have to perform an initial sync on the secondary, copying all data from the primary and rebuilding all indexes. [1] This is uncommon under normal circumstances, but if you configure the oplog to be smaller than the default, the issue can arise.

    Note

    The size of the oplog is only configurable during the first run using the --oplogSize argument to the mongod command, or preferably, the oplogSizeMB setting in the MongoDB configuration file. If you do not specify this on the command line before running with the --replSet option, mongod will create a default sized oplog.

    By default, the oplog is 5 percent of total available disk space on 64-bit systems. For more information about changing the oplog size, see Change the Oplog Size of Self-Managed Replica Set Members.

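Administrators can limit the rate at which the primary applies its writes, with the goal of keeping the majority committed lag under a configurable maximum value, flowControlTargetLagSeconds.

By default, flow control is enabled.

Note

For flow control to engage, the replica set or sharded cluster must have a featureCompatibilityVersion (fCV) of 4.2 and read concern majority enabled. That is, enabled flow control has no effect if the fCV is not 4.2 or if read concern majority is disabled.

See also: Check the Replication Lag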

Replication issues are most often the result of network connectivity issues between members, or the result of a primary that does not have the resources to support application and replication traffic. To check the status of a replica set, use the replSetGetStatus command or the following helper in the shell:

rs.status()

The replSetGetStatus reference provides a more in-depth overview of this output. In general, watch the value of optimeDate, and pay particular attention to the time difference between the primary and the secondary members.
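The optimeDate comparison described above can be scripted. The following is a minimal sketch, assuming `status` is the document returned by rs.status() and that each entry in its members array carries name, stateStr, and optimeDate. The function name and the sample hosts are hypothetical.

```javascript
// Estimate replication lag, in seconds, for each secondary.
function replicationLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) return null; // no primary visible, so lag is undefined
  const lags = {};
  for (const member of status.members) {
    if (member.stateStr !== "SECONDARY") continue;
    // Subtracting two Date objects yields milliseconds.
    lags[member.name] = (primary.optimeDate - member.optimeDate) / 1000;
  }
  return lags;
}

// Hypothetical sample shaped like rs.status() output (values invented).
const sampleReplStatus = {
  members: [
    { name: "db0.example.net:27017", stateStr: "PRIMARY",
      optimeDate: new Date("2024-07-19T15:36:55Z") },
    { name: "db1.example.net:27017", stateStr: "SECONDARY",
      optimeDate: new Date("2024-07-19T15:36:53Z") },
    { name: "db2.example.net:27017", stateStr: "SECONDARY",
      optimeDate: new Date("2024-07-19T15:36:55Z") },
  ],
};
console.log(replicationLagSeconds(sampleReplStatus));
```

On an idle primary the most recent write may be old, so a small lag value from a single snapshot is more trustworthy than a large one. Repeated samples give a better picture.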

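[1] The oplog can grow past its configured size limit to avoid deleting the majority commit point.

Secondary members of a replica set now log oplog entries that take longer than the slow operation threshold to apply. These slow oplog messages:

  • Are logged for the secondaries in the diagnostic log.

  • Are logged under the REPL component with the text applied op: <oplog entry> took <num>ms.

  • Do not depend on the log levels (either at the system or component level).

  • Do not depend on the profiling level.

  • Are affected by slowOpSampleRate.

The profiler does not capture slow oplog entries.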
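If you collect these messages for alerting, the duration can be pulled out of the message text. The sketch below assumes the text has exactly the form applied op: <oplog entry> took <num>ms. The function name and the sample message are hypothetical, and structured JSON logs may carry the duration in a separate field instead.

```javascript
// Extract the oplog entry text and the duration from a slow oplog message.
// Returns null when the message does not match the expected form.
function parseSlowOplogMessage(message) {
  const match = /applied op: (.*) took (\d+)ms/.exec(message);
  if (match === null) return null;
  return { entry: match[1], durationMs: Number(match[2]) };
}

// Hypothetical message text (the oplog entry content is invented).
const sampleMessage = "applied op: { op: 'i' } took 250ms";
console.log(parseSlowOplogMessage(sampleMessage));
```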

In most cases, the components of sharded clusters benefit from the same monitoring and analysis as all other MongoDB instances. In addition, clusters require further monitoring to ensure that data is effectively distributed among nodes and that sharding operations are functioning appropriately.

See also:

See the Sharding documentation for more information.

The config database maintains a map identifying which documents are on which shards. The cluster updates this map as chunks move between shards. When a configuration server becomes inaccessible, certain sharding operations become unavailable, such as moving chunks and starting mongos instances. However, clusters remain accessible from already-running mongos instances.

Because inaccessible configuration servers can seriously impact the availability of a sharded cluster, you should monitor your configuration servers to ensure that the cluster remains well balanced and that mongos instances can restart.

MongoDB Cloud Manager and Ops Manager monitor config servers and can create notifications if a config server becomes inaccessible. See the MongoDB Cloud Manager documentation and Ops Manager documentation for more information.

The most effective sharded cluster deployments evenly balance chunks among the shards. To facilitate this, MongoDB has a background balancer process that distributes data to ensure that chunks are always optimally distributed among the shards.

Issue the db.printShardingStatus() or sh.status() command to the mongos from within mongosh. This returns an overview of the entire cluster including the database name, and a list of the chunks.
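For a quick numeric view of balance, you can compare per-shard chunk counts. The sketch below assumes you already have those counts as an object keyed by shard name, for example by grouping the documents in the config database's chunks collection on their shard field. The function name and the sample counts are hypothetical, and chunk count is only a rough proxy for how evenly data is distributed.

```javascript
// Difference between the most and least loaded shards, in chunks.
function chunkSpread(chunksPerShard) {
  const counts = Object.values(chunksPerShard);
  if (counts.length === 0) return 0; // no shards reported
  return Math.max(...counts) - Math.min(...counts);
}

// Hypothetical per-shard chunk counts (values invented).
const sampleChunkCounts = { shard0000: 12, shard0001: 9, shard0002: 15 };
console.log(chunkSpread(sampleChunkCounts));
```

A spread that keeps growing over successive samples suggests the balancer is not keeping up or is not running.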

To check the lock status of the database, connect to a mongos instance using mongosh. Issue the following command sequence to switch to the config database and display all outstanding locks on the shard database:

use config
db.locks.find()

The balancing process takes a special "balancer" lock that prevents other balancing activity from transpiring. In the config database, use the following command to view the "balancer" lock.

db.locks.find( { _id : "balancer" } )

The primary of the CSRS config server holds the "balancer" lock, using a process ID named "ConfigServer". This lock is never released. To determine if the balancer is running, see Check if Balancer is Running.

Note

The Storage Node Watchdog is available in both the Community and MongoDB Enterprise editions.

The Storage Node Watchdog monitors the following MongoDB directories to detect filesystem unresponsiveness:

Note

Starting in MongoDB 6.1, journaling is always enabled. As a result, MongoDB removes the storage.journal.enabled option and the corresponding --journal and --nojournal command-line options.

By default, the Storage Node Watchdog is disabled. You can only enable the Storage Node Watchdog on a mongod at startup time by setting the watchdogPeriodSeconds parameter to an integer greater than or equal to 60. However, once enabled, you can pause the Storage Node Watchdog and restart during runtime. See watchdogPeriodSeconds parameter for details.

If any of the filesystems containing the monitored directories become unresponsive, the Storage Node Watchdog terminates the mongod and exits with a status code of 61. If the mongod is the primary of a replica set, the termination initiates a failover, allowing another member to become primary.

Once a mongod has terminated, it may not be possible to cleanly restart it on the same machine.

Note

Symlinks

If any of its monitored directories is a symlink to other volumes, the Storage Node Watchdog does not monitor the symlink target.

For example, if the mongod uses storage.directoryPerDB: true (or --directoryperdb) and symlinks a database directory to another volume, the Storage Node Watchdog does not follow the symlink to monitor the target.

The maximum time the Storage Node Watchdog can take to detect an unresponsive filesystem and terminate is nearly twice the value of watchdogPeriodSeconds.
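As a worked example of that bound, the sketch below turns a watchdogPeriodSeconds value into the approximate worst-case detection time. It treats "nearly twice" as an upper bound of exactly twice the period and enforces the minimum of 60 stated above. The function name is hypothetical.

```javascript
// Upper bound, in seconds, on the time the Storage Node Watchdog may take
// to detect an unresponsive filesystem and terminate the mongod.
function maxWatchdogDetectionSeconds(periodSeconds) {
  if (!Number.isInteger(periodSeconds) || periodSeconds < 60) {
    throw new RangeError("watchdogPeriodSeconds must be an integer >= 60");
  }
  return 2 * periodSeconds;
}

console.log(maxWatchdogDetectionSeconds(60));
```

With the minimum period of 60 seconds, plan for up to roughly two minutes before an unresponsive filesystem leads to termination and, on a primary, failover.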