Query Cache

On this page

Usage
Interactions With Fibers
Query Matching
Limits
Cache Invalidation
Manual Cache Invalidation
Transactions
Aggregations
System Collections
Query Cache Middleware
Rack Middleware
Active Job Middleware

The MongoDB Ruby driver provides a built-in query cache. When enabled, the query cache saves the results of previously-executed find and aggregation queries. When those same queries are performed again, the driver returns the cached results to prevent unnecessary roundtrips to the database.

Usage

The query cache is disabled by default. It can be enabled on the global scope as well as within the context of a specific block. The driver also provides a Rack middleware to enable the query cache automatically for each web request.

To enable the query cache globally:

Mongo::QueryCache.enabled = true

Similarly, to disable it globally:

Mongo::QueryCache.enabled = false

To enable the query cache within the context of a block:

Mongo::QueryCache.cache do
  Mongo::Client.new([ '127.0.0.1:27017' ], database: 'music') do |client|
    client['artists'].find(name: 'Flying Lotus').first
    #=> Queries the database and caches the result
    client['artists'].find(name: 'Flying Lotus').first
    #=> Returns the previously cached result
  end
end

And to disable the query cache in the context of a block:

Mongo::QueryCache.uncached do
  Mongo::Client.new([ '127.0.0.1:27017' ], database: 'music') do |client|
    client['artists'].find(name: 'Flying Lotus').first
    #=> Sends the query to the database; does NOT cache the result
    client['artists'].find(name: 'Flying Lotus').first
    #=> Queries the database again
  end
end

You may check whether the query cache is enabled at any time by calling Mongo::QueryCache.enabled?, which will return true or false.

Interactions With Fibers

The Query cache enablement flag is stored in fiber-local storage (using Thread.current. This, in principle, permits query cache state to be per fiber, although this is not currently tested.

There are methods in the Ruby standard library, like Enumerable#next, that utilize fibers in their implementation. These methods would not see the query cache enablement flag when it is set by the applications, and subsequently would not use the query cache. For example, the following code does not utilize the query cache despite requesting it:

Mongo::QueryCache.enabled = true
client['artists'].find({}, limit: 1).to_enum.next
# Issues the query again.
client['artists'].find({}, limit: 1).to_enum.next

Rewriting this code to use first instead of next would make it use the query cache:

Mongo::QueryCache.enabled = true
client['artists'].find({}, limit: 1).first
# Utilizes the cached result from the first query.
client['artists'].find({}, limit: 1).first

Query Matching

A query is eligible to use cached results if it matches the original query that produced the cached results. Two queries are considered matching if they are identical in the following values:

Namespace (the database and collection on which the query was performed)
Selector (for aggregations, the aggregation pipeline stages)
Skip
Sort
Projection
Collation
Read Concern
Read Preference

For example, if you perform one query, and then perform a mostly identical query with a different sort order, those queries will not be considered matching, and the second query will not use the cached results of the first.

Limits

When performing a query with a limit, the query cache will reuse an existing cached query with a larger limit if one exists. For example:

Mongo::QueryCache.cache do
  Mongo::Client.new([ '127.0.0.1:27017' ], database: 'music') do |client|
    client['artists'].find(genre: 'Rock', limit: 10)
    #=> Queries the database and caches the result
    client['artists'].find(genre: 'Rock', limit: 5)
    #=> Returns the first 5 results from the cached query
    client['artists'].find(genre: 'Rock', limit: 20)
    #=> Queries the database again and replaces the previously cached query results
  end
end

Cache Invalidation

The query cache is cleared in part or in full on every write operation. Most write operations will clear the results of any queries were performed on the same collection that is being written to. Some operations will clear the entire query cache.

The following operations will clear cached query results on the same database and collection (including during bulk writes):

insert_one
update_one
replace_one
update_many
delete_one
delete_many
find_one_and_delete
find_one_and_update
find_one_and_replace

The following operations will clear the entire query cache:

aggregation with $merge or $out pipeline stages
commit_transaction
abort_transaction

Manual Cache Invalidation

You may clear the query cache at any time with the following method:

Mongo::QueryCache.clear

This will remove all cached query results.

Transactions

Queries are cached within the context of a transaction, but the entire cache will be cleared when the transaction is committed or aborted.

Mongo::QueryCache.cache do
  Mongo::Client.new([ '127.0.0.1:27017' ], database: 'music') do |client|
    session = client.start_session
    session.with_transaction do
      client['artists'].insert_one({ name: 'Fleet Foxes' }, session: session)
      client['artists'].find({}, session: session).first
      #=> { name: 'Fleet Foxes' }
      #=> Queries the database and caches the result
      client['artists'].find({}, session: session).first
      #=> { name: 'Fleet Foxes' }
      #=> Returns the previously cached result
      session.abort_transaction
    end
    client['artists'].find.first
    #=> nil
    # The query cache was cleared on abort_transaction
  end
end

Note

Transactions are often performed with a "snapshot" read concern level. Keep in mind that a query with a "snapshot" read concern cannot return cached results from a query without the "snapshot" read concern, so it is possible that a transaction may not use previously cached queries.

To understand when a query will use a cached result, see the Query Matching section.

Aggregations

The query cache also caches the results of aggregation pipelines. For example:

Mongo::QueryCache.cache do
  Mongo::Client.new([ '127.0.0.1:27017' ], database: 'music') do |client|
    client['artists'].aggregate([ { '$match' => { name: 'Fleet Foxes' } } ]).first
    #=> Queries the database and caches the result
    client['artists'].aggregate([ { '$match' => { name: 'Fleet Foxes' } } ]).first
    #=> Returns the previously cached result
  end
end

Note

Aggregation results are cleared from the cache during every write operation, with no exceptions.

System Collections

MongoDB stores system information in collections that use the database.system.* namespace pattern. These are called system collections.

Data in system collections can change due to activity not triggered by the application (such as internal server processes) and as a result of a variety of database commands issued by the application. Because of the difficulty of determining when the cached results for system collections should be expired, queries on system collections bypass the query cache.

You may read more about system collections in the MongoDB documentation.

Note

Even when the query cache is enabled, query results from system collections will not be cached.

Query Cache Middleware

Rack Middleware

The driver provides a Rack middleware which enables the query cache for the duration of each web request. Below is an example of how to enable the query cache middleware in a Ruby on Rails application:

# config/application.rb
# Add Mongo::QueryCache::Middleware at the bottom of the middleware stack
# or before other middleware that queries MongoDB.
config.middleware.use Mongo::QueryCache::Middleware

Please refer to the Rails on Rack guide for more information about using Rack middleware in Rails applications.

Active Job Middleware

The driver provides an Active Job middleware which enables the query cache for each job. Below is an example of how to enable the query cache Active Job middleware in a Ruby on Rails application:

# config/application.rb
ActiveSupport.on_load(:active_job) do
  include Mongo::QueryCache::Middleware::ActiveJob
end

Back

Geospatial Search

GridFS