$topN (aggregation accumulator)
On this page
Definition
Syntax
{ $topN: { n: <expression>, sortBy: { <field1>: <sort order>, <field2>: <sort order> ... }, output: <expression> } }
n
limits the number of results per group and has to be a positive integral expression that is either a constant or depends on the_id
value for$group
.sortBy specifies the order of results, with syntax similar to
$sort
.output
represents the output for each element in the group and can be any expression.
Behavior
Null and Missing Values
$topN
does not filter out null values.$topN
converts missing values to null which are preserved in the output.
db.aggregate( [ { $documents: [ { playerId: "PlayerA", gameId: "G1", score: 1 }, { playerId: "PlayerB", gameId: "G1", score: 2 }, { playerId: "PlayerC", gameId: "G1", score: 3 }, { playerId: "PlayerD", gameId: "G1"}, { playerId: "PlayerE", gameId: "G1", score: null } ] }, { $group: { _id: "$gameId", playerId: { $topN: { output: [ "$playerId", "$score" ], sortBy: { "score": 1 }, n: 3 } } } } ] )
In this example:
$documents
creates the literal documents that contain player scores.$group
groups the documents bygameId
. This example has only onegameId
,G1
.PlayerD
has a missing score andPlayerE
has a nullscore
. These values are both considered as null.The
playerId
andscore
fields are specified asoutput : ["$playerId"," $score"]
and returned as array values.Because of the
sortBy: { "score" : 1 }
, the null values are sorted to the front of the returnedplayerId
array.
[ { _id: 'G1', playerId: [ [ 'PlayerD', null ], [ 'PlayerE', null ], [ 'PlayerA', 1 ] ] } ]
BSON Data Type Sort Ordering
When sorting different types, the order of BSON data types is used to determine ordering. As an example, consider a collection whose values consist of strings and numbers.
In an ascending sort, string values are sorted after numeric values.
In a descending sort, string values are sorted before numeric values.
db.aggregate( [ { $documents: [ { playerId: "PlayerA", gameId: "G1", score: 1 }, { playerId: "PlayerB", gameId: "G1", score: "2" }, { playerId: "PlayerC", gameId: "G1", score: "" } ] }, { $group: { _id: "$gameId", playerId: { $topN: { output: ["$playerId","$score"], sortBy: {"score": -1}, n: 3 } } } } ] )
In this example:
PlayerA
has an integer score.PlayerB
has a string"2"
score.PlayerC
has an empty string score.
Because the sort is in descending { "score" : -1 }
, the string
literal values are sorted before PlayerA
's numeric score:
[ { _id: "G1", playerId: [ [ "PlayerB", "2" ], [ "PlayerC", "" ], [ "PlayerA", 1 ] ] } ]
Restrictions
Window Function and Aggregation Expression Support
$topN
is not supported as a
aggregation expression.
$topN
is supported as a
window operator
.
Memory Limit Considerations
Groups within the $topN
aggregation pipeline are subject to the
100 MB limit pipeline limit. If this
limit is exceeded for an individual group, the aggregation fails
with an error.
Examples
Consider a gamescores
collection with the following documents:
db.gamescores.insertMany([ { playerId: "PlayerA", gameId: "G1", score: 31 }, { playerId: "PlayerB", gameId: "G1", score: 33 }, { playerId: "PlayerC", gameId: "G1", score: 99 }, { playerId: "PlayerD", gameId: "G1", score: 1 }, { playerId: "PlayerA", gameId: "G2", score: 10 }, { playerId: "PlayerB", gameId: "G2", score: 14 }, { playerId: "PlayerC", gameId: "G2", score: 66 }, { playerId: "PlayerD", gameId: "G2", score: 80 } ])
Find the Three Highest Scores
You can use the $topN
accumulator to find the highest scoring
players in a single game.
db.gamescores.aggregate( [ { $match : { gameId : "G1" } }, { $group: { _id: "$gameId", playerId: { $topN: { output: ["$playerId", "$score"], sortBy: { "score": -1 }, n:3 } } } } ] )
The example pipeline:
Uses
$match
to filter the results on a singlegameId
. In this case,G1
.Uses
$group
to group the results bygameId
. In this case,G1
.Uses sort by
{ "score": -1 }
to sort the results in descending order.Specifies the fields that are output from
$topN
withoutput : ["$playerId"," $score"]
.Uses
$topN
to return the top three documents with the highestscore
for theG1
game withn : 3
.
The operation returns the following results:
[ { _id: 'G1', playerId: [ [ 'PlayerC', 99 ], [ 'PlayerB', 33 ], [ 'PlayerA', 31 ] ] } ]
The SQL equivalent to this query is:
SELECT T3.GAMEID,T3.PLAYERID,T3.SCORE FROM GAMESCORES AS GS JOIN (SELECT TOP 3 GAMEID,PLAYERID,SCORE FROM GAMESCORES WHERE GAMEID = 'G1' ORDER BY SCORE DESC) AS T3 ON GS.GAMEID = T3.GAMEID GROUP BY T3.GAMEID,T3.PLAYERID,T3.SCORE ORDER BY T3.SCORE DESC
Finding the Three Highest Score Documents Across Multiple Games
You can use the $topN
accumulator to find the highest scoring
players in each game.
db.gamescores.aggregate( [ { $group: { _id: "$gameId", playerId: { $topN: { output: [ "$playerId","$score" ], sortBy: { "score": -1 }, n: 3 } } } } ] )
The example pipeline:
Uses
$group
to group the results bygameId
.Specifies the fields that are output from
$topN
withoutput : ["$playerId", "$score"]
.Uses sort by
{ "score": -1 }
to sort the results in descending order.Uses
$topN
to return the top three documents with the highestscore
for each game withn: 3
.
The operation returns the following results:
[ { _id: 'G1', playerId: [ [ 'PlayerC', 99 ], [ 'PlayerB', 33 ], [ 'PlayerA', 31 ] ] }, { _id: 'G2', playerId: [ [ 'PlayerD', 80 ], [ 'PlayerC', 66 ], [ 'PlayerB', 14 ] ] } ]
The SQL equivalent to this query is:
SELECT PLAYERID,GAMEID,SCORE FROM( SELECT ROW_NUMBER() OVER (PARTITION BY GAMEID ORDER BY SCORE DESC) AS GAMERANK, GAMEID,PLAYERID,SCORE FROM GAMESCORES ) AS T WHERE GAMERANK <= 3 ORDER BY GAMEID
Computing n
Based on the Group Key for $group
You can also assign the value of n
dynamically. In this example,
the $cond
expression is used on the gameId
field.
db.gamescores.aggregate([ { $group: { _id: {"gameId": "$gameId"}, gamescores: { $topN: { output: "$score", n: { $cond: { if: {$eq: ["$gameId","G2"] }, then: 1, else: 3 } }, sortBy: { "score": -1 } } } } } ] )
The example pipeline:
Uses
$group
to group the results bygameId
.Specifies the fields that are output from
$topN
withoutput : "$score"
.If the
gameId
isG2
thenn
is 1, otherwisen
is 3.Uses sort by
{ "score": -1 }
to sort the results in descending order.
The operation returns the following results:
[ { _id: { gameId: 'G1' }, gamescores: [ 99, 33, 31 ] }, { _id: { gameId: 'G2' }, gamescores: [ 80 ] } ]