Function score results in an inaccurate score

Hello,

I am running a compound search query with a function score on embedded documents in my collection. However, the score returned does not exactly match the value in the mapped field.

data structure:

[
  {
    "_id": 1,
    "items": [
      { 
        "remarks": "test",  
        "ordering": 900000201000  
      }
    ]
  }
]

query:

[
  {
    $search: {
      index: "default",
      "embeddedDocument": {
        "path": "items", 
        "operator": {
          "compound": {
            "must": [
              {
                "text": {
                  "path": "items.remarks",
                  "query": "test"
                }
              }
            ],
            "score": {
              "function": {
                "path": {
                  "value": "items.ordering"
                }
              } 
            }
          }
        }
      },
      "scoreDetails": true
    }
  },
  {
   "$project": {
     "key": 1,
     "items": 1,
     "score": { "$meta": "searchScore" },
     "scoreDetails": { "$meta": "searchScoreDetails"}
   }
  }
]

Index mappings:

{
  "mappings": {
    "dynamic": false,
    "fields": {
      "items": {
        "dynamic": true,  
        "fields": {
          "remarks": {
            "type": "string"  
          },
          "ordering": {
            "type": "number"
          }
        },
        "type": "embeddedDocuments"
      }
    }
  }
}

output:

[
  {
    "_id": 1,
    "items": [
      {
        "remarks": "test",
        "ordering": 900000201000
      }
    ],
    "score": 900000186368,
    "scoreDetails": {
      "value": 900000186368,
      "description": "Score based on child docs, best match:",
      "details": [
        {
          "value": 900000186368,
          "description": "FunctionScoreQuery($embedded:5/items/$type:string/items.remarks:test, scored by items.ordering) [BM25Similarity], result of:",
          "details": [
            {
              "value": 900000186368,
              "description": "items.ordering",
              "details": []
            }
          ]
        }
      ]
    }
  }
]

Could you please help explain why the function score is not accurately reflecting the mapped field value? Any suggestions on how to properly score based on the embedded field?

Thank you in advance for any guidance or troubleshooting suggestions.

It looks like you’re encountering an issue with the returned score in your MongoDB $search query not matching the exact value from the items.ordering field.

Here’s a potential explanation for why this is happening:

Problem Explanation:

The discrepancy between the items.ordering value and the returned score could come from how the function score is applied within the compound query and how it interacts with the BM25Similarity model used for text queries. The score may be influenced by normalization or other internal handling in the scoring path rather than directly reflecting the raw value of the ordering field.

In your case, the scoreDetails description mentions BM25Similarity, so the result may be a computed score shaped by factors such as document frequency, term frequency, and length normalization. Such a score is not guaranteed to be identical to the raw ordering value.
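
A quick numeric check is also worth running here. The sketch below assumes, and nothing in this thread confirms it, that the score passes through a 32-bit float somewhere in the scoring path, which is how Lucene represents scores. It rounds the stored ordering value to the nearest 32-bit float with the standard JavaScript Math.fround, so it can be pasted into mongosh or Node.js. The variable names are only illustrative.

```javascript
// Value stored in items.ordering in the sample document.
const ordering = 900000201000;

// Math.fround rounds a number to the nearest 32-bit float.
// Assumption: the search score is held as a 32-bit float at some point.
const asFloat32 = Math.fround(ordering);

// Score reported by { $meta: "searchScore" } in the output above.
const reportedScore = 900000186368;

console.log(asFloat32);                   // 900000186368
console.log(asFloat32 === reportedScore); // true

// Between 2^39 and 2^40, neighbouring 32-bit floats are 2^(39 - 23) = 2^16 apart.
const spacing = 2 ** 16;
console.log(reportedScore % spacing === 0); // true
```

Since the comparison prints true, the gap may be a precision limit rather than a relevance adjustment: at this magnitude, neighbouring 32-bit floats are 65536 apart, so 900000201000 cannot be represented exactly as a score.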

Troubleshooting & Suggestions:

1. Using a Custom Scoring Formula: If you want the score to directly reflect the items.ordering field without any adjustment, consider using a more straightforward scoring mechanism that bypasses BM25 or similarity-based adjustments. One approach is to explicitly define a linear function or another mathematical transformation that directly maps the ordering field to the score. You can achieve this by using a constant function in the score part of the compound query. Example:

{
  $search: {
    index: "default",
    "embeddedDocument": {
      "path": "items",
      "operator": {
        "compound": {
          "must": [
            {
              "text": {
                "path": "items.remarks",
                "query": "test"
              }
            }
          ],
          "score": {
            "function": {
              "constant": {
                "value": 1
              },
              "boost": {
                "value": { "path": "items.ordering" }
              }
            }
          }
        }
      }
    },
    "scoreDetails": true
  }
}

This way, you can directly control how the score is calculated by setting a constant function and applying a boost with the field value. This could give you more control over the result.
2. Check the Scoring Model: Since MongoDB uses BM25Similarity for text-based queries, which introduces normalization and scoring weights, you could also try different scoring models if MongoDB provides alternatives (like constant, log, or linear).
3. Score Normalization: Atlas Search and other Lucene-based full-text search engines may normalize scores to fit into a specific range (like 0-1 or 0-1000). It’s worth investigating whether any normalization or scaling is applied to the ordering field during the scoring process. You could check MongoDB’s documentation or further tune the query for your use case.
4. Review the Field Type: Ensure the ordering field is correctly indexed and typed as a number in the search index mapping. Sometimes, type mismatches between the index mapping and document field can result in unexpected scoring behavior.
5. Use a Single Field for Scoring: If your query aims to score based only on items.ordering, simplifying the search query to prioritize this field (e.g., removing the text-based search if not necessary) can also help to reflect the exact value.

Conclusion:

The difference between the score and the ordering field likely stems from the BM25 similarity scoring mechanism, which adjusts raw field values. To ensure the score reflects the ordering field accurately, consider using a simpler scoring function or boosting method in the query.

It doesn’t seem to work: the MongoDB documentation on modifying scores in Atlas Search shows that the boost and constant options cannot be used together.

Regarding "Use a Single Field for Scoring": is it possible to sort on a field from a matched child document of an embeddedDocument, for example sorting directly on items.ordering?

Thanks a lot