Docs Menu
Docs Home
/
MongoDB Atlas
/ /

Atlas SQL Language Reference

On this page

  • Compatability and Limitations
  • Type System
  • Data Types
  • Type Conversions
  • Schemas
  • Clauses
  • SELECT
  • FROM
  • WHERE
  • GROUP BY
  • HAVING
  • ORDER BY
  • LIMIT and OFFSET
  • Set operations
  • Expressions
  • Identifiers
  • Aliases
  • Keywords
  • Literals
  • Parenthesized Expressions
  • Operators
  • Subquery Expressions
  • Document and Field-Access Expressions
  • Null and Missing
  • Comments

This page outlines the syntax and semantics of Atlas SQL, a SQL dialect designed to provide first-class support for working with MongoDB's flexible document model. This page lists and describes supported clauses, operators, expressions, and functions.

  • Atlas SQL is based on the SQL-92 standard. However Atlas SQL is not fully SQL-92 compatible due to the following limitations:

    • The UNION function is not supported. However, UNION ALL is supported.

    • The date data type is not supported. Use timestamp instead.

    • SELECT DISTINCT is not supported.

    • Interval and date interval arithmetic are not supported.

  • Atlas SQL does not support Atlas Vector Search and Atlas Search.

Atlas SQL data types are the set of BSON types. All of these types can be queried in Atlas SQL. They are:

  • String (STRING)

  • Document (DOCUMENT)

  • Array (ARRAY)

  • BinData (BINDATA)

  • ObjectId (OBJECTID)

  • Boolean (BOOL)

  • Date (TIMESTAMP)

  • Null (NULL)

  • Regex (REGEX)

  • 32-bit integer (INT)

  • Double (DOUBLE)

  • Long (LONG)

  • Timestamp (BSON_TIMESTAMP)

  • Decimal (DECIMAL)

  • MinKey (MINKEY)

  • MaxKey (MAXKEY)

  • DBPointer (DBPOINTER)

  • Symbol (SYMBOL)

  • Javascript with scope (JAVASCRIPTWITHSCOPE)

  • JavaScript (JAVASCRIPT)

Each type in Atlas SQL has a name (in parentheses above), which is a keyword that can be used to reference the type when necessary (e.g. in an expression like CAST).

Explicit type conversions are expressed via the CAST function or the :: operator. Numeric types are all mutually comparable; Atlas SQL allows operations between the various numeric types without casting the operands to be the same numeric type.

Atlas SQL converts MongoDB's flexible document values into types using a schema. An Atlas SQL schema is a collection of facts about an expression or collection that are known to be true at compile time.

For example, an Atlas SQL schema might dictate that an expression is either a boolean or a document with subfields, or that an expression is either an array with a length of one or a positive integer.

If a static type constraint is not satisfied, the query will fail to compile.

Atlas SQL queries support a basic set of SQL clauses. The available clauses are:

SELECT
FROM
WHERE
GROUP BY
HAVING
ORDER BY
OFFSET
LIMIT

SELECT begins every Atlas SQL query. Atlas SQL allows SELECT VALUE and SELECT VALUES to be used interchangeably.

Atlas SQL requires nested SELECT statements to have an alias.

SELECT foo FROM (SELECT bar FROM baz) as subSelect

Atlas SQL supports the CAST() funtion, which allows you to dynamically convert values in your query to a given data type.

SELECT * FROM table WHERE period_start_utc >= CAST('2023-01-01T00:00:00.000Z' AS TIMESTAMP)

FROM is the first clause evaluated in every Atlas SQL query.

FROM can pull data from various sources including collections (SELECT * FROM foo), arrays (SELECT * FROM [{'a': 1}]), joins (SELECT * FROM a JOIN b), derived tables (SELECT * FROM (SELECT a FROM foo) d), and FLATTEN and UNWIND.

The WHERE clause is a filter on incoming data. Its expression must statically have type BOOL or NULL and may evaluate to MISSING.

GROUP BY provides a means for grouping and aggregating data.

If using GROUP BY on a field that has been flattened using FLATTEN or UNWIND, an alias is required.

SELECT customerAge, COUNT(*) FROM Sales GROUP BY customer.age AS customerAge

Atlas SQL supports the following aggregation functions.

Name
Description
Notes

ADD_TO_ARRAY

Pushes the argument to the end of an array. The total output of this function will be an array.

The argument to ADD_TO_ARRAY can have any type.

ADD_TO_SET

Pushes the argument to the end of an array removing duplicates. The total output of this function will be an array with all duplicate items removed. Duplicates are determined using the = operator.

The argument to ADD_TO_SET can have any type.

AVG

Returns the average of all the arguments.

The argument must be statically typed to a numeric type.

COUNT

Counts the number of elements. COUNT(*) counts all values unconditionally. COUNT(<expression>) counts all values for which the expression does not result in NULL or MISSING.

The argument to COUNT can have any type.

FIRST

Returns the first element in the group.

Deterministic only when the input has deterministic order, otherwise undefined.

LAST

Returns the first element in the group. Deterministic only when the input has deterministic order, otherwise undefined.

The argument to LAST can have any type.

MAX

Returns the maximum element as ordered by the Atlas SQL > operator.

The argument must be statically typed to be comparable via the > operator.

MERGE_DOCUMENTS

Returns a document formed by successively merging documents, with the previous element used as the left hand side. In the case of duplicate keys, the value of the key in the new element is kept. As with FIRST and LAST, the output is only deterministic when the input has deterministic ordering.

The argument must be statically typed as DOCUMENT, and thus MERGE_DOCUMENTS(DISTINCT x) is not supported.

MIN

Returns the minimum element as ordered by the Atlas SQL < operator.

The argument must be statically typed to be comparable via the < operator.

STDDEV_POP

Returns the standard deviation of all elements over the entire group population.

The argument must be statically typed to a numeric type. See stdDevPop.

STDDEV_SAMP

Returns the standard deviation of a sample of all elements in the group. See stdDevPop.

The argument must be statically typed to a numeric type.

SUM

Returns the sum of all the arguments.

The argument must be statically typed to a numeric type.

The HAVING clause operates in the same way as a WHERE clause, but after the GROUP BY clause. Like the WHERE clause, the HAVING clause takes an expression that must statically have type BOOL or NULL and may evaluate to MISSING. It can reference aliases defined in the GROUP BY and can contain expressions with aggregation functions. Only aliases defined in the GROUP BY are available to the HAVING clause.

The ORDER BY clause provides a way to order a result set by one or more sort keys. Each sort key can be either a column reference, or an integer literal referring to a SELECT expression by its position in the select expression list. Sort keys that are column references can be compound identifiers. These compound identifiers can be qualified with datasource names or refer to document subfields.

Atlas SQL sorts MISSING before NULL, and NULL before all other values. The ORDER BY clause requires that all possible values in a sort key expression can be statically verified to be comparable via the > (greater than) and < (less than) operators.

The LIMIT and OFFSET clauses allow users to retrieve only some of the rows returned by a query. If a LIMIT number is provided, no more than that number of rows will be returned. If an OFFSET number is provided, that number of rows is skipped before returning rows.

Both LIMIT and OFFSET numbers must be positive integers. Using LIMIT or OFFSET without ORDER BY does not guarantee the same result.

When LIMIT and OFFSET are both set, the OFFSET rows will be skipped before returning the rest of the results, which should contain no more than the LIMIT number rows.

LIMIT i, j is a shorter form of LIMIT i OFFSET j.

LIMIT and OFFSET can be used in subqueries.

The UNION ALL set operator returns a single result set for two SELECT queries. The UNION ALL operator does not remove duplicate rows from the result set. The result set returned by the UNION ALL operator does not have a defined order.

Atlas SQL does not support distinct UNION, INTERSECT, or EXCEPT set operations.

Identifiers in Atlas SQL refer to databases, tables, and columns. Atlas SQL identifiers support all UTF-8 characters except the null character \x00.

In Atlas SQL, some identifiers are restricted to avoid conflicts with characters that have other semantic meaning; for an identifier to include such a character, it must be delimited, that is, surrounded by double quotes or backticks. For example, an identifier must be delimited if it begins with a digit or if it conflicts with a reserved keyword (e.g. "10cent"). Identifiers are case-sensitive, whether delimited or not.

Identifiers are used for all aliases in Atlas SQL. In most cases, Atlas SQL returns an error if an alias is used more than once in the same clause. The exception to this is that aliases can be repeated on both sides of a UNION ALL. This also applies to automatically generated aliases.

Atlas SQL keywords (such as SELECT, FROM, JOIN, etc.) cannot be used as undelimited identifiers.

Atlas SQL supports literals for booleans, null, numbers, and strings. Strings are enclosed in single quotes. To include a single quote character in a string literal, double it ('o''clock').

Literal integers are typed as INT when within the 32-bit signed integer range and LONG otherwise. Literal floating point numbers or scientific notation numbers have type DOUBLE.

Atlas SQL does not support literals for every type. For example OBJECTID, BSON_DATE, and DECIMAL have no literal syntax. For such types, pseudo-literal values can be obtained by using the CAST operator to get from a string or numeric representation of those types to their respective type. Some types may also have "constructor" functions which alias the relevant CAST invocations.

A parenthesized expression is an expression grouped by parentheses. Any time infix operators are present, the need for parentheses (or a similar mechanism) to distinguish order of operations may be necessary. Atlas SQL has several infix operators, such as + and ::. For example, the value of 1 + 2 * 3 is 7, while the value of (1 + 2) * 3 is 9.

Atlas SQL supports the following basic operators:

  • +

  • -

  • *

  • /

  • ||

  • <

  • <=

  • !=

  • ==

  • >

  • >=

  • BETWEEN

  • AND

  • OR

  • NOT

A subquery is a SQL query within a query. You can use a subquery anywhere an expression can be used.

Atlas SQL supports scalar subquery and table subquery. A scalar subquery returns a result set with zero or one row and one column. It can be used in most places a literal or single column value is valid. A table subquery returns zero or more rows and one or more columns.

Documents can be represented with a syntax similar to JSON objects. Keys must be strings and values can have any of the supported types. To access document fields, Atlas SQL supports two options: "dot" notation and "bracket" notation.

Dot notation is similar to field access in MongoDB aggregation. For example, if a document doc contains a field f, then the expression doc.f is used to access the value of that field. Bracket notation uses square brackets ([ and ]) around a field name to access the field with that name. For example, consider the same document described before: doc["f"] is used to access the value of that field.

BSON distinguishes between NULL and MISSING. In the case of NULL there is a field with the literal value NULL, whereas in the case of MISSING, the field is gone.

Comments are sequences of characters within queries that do not impact query execution. Atlas SQL supports both standard SQL comments and C-style block comments.

Standard SQL comments begin with double dashes and end with a new line:

\-- This is a standard SQL comment

Block comments begin with \* and end at the matching occurrence of */.

\* This is a
multiline comment
*/

Back

Errors