Oxy Saves $4 Million with Native MongoDB Solution That Extracts 1.5 Million Documents

INDUSTRY

Energy and Environmental

PRODUCT

MongoDB Atlas

USE CASE

Content Management,
Modernization,
Single View

CUSTOMER SINCE

2018
At MongoDB .local Austin 2024, Occidental Petroleum’s Alexander Lach and Andrew Pruet discussed how their team saved $4 million and 12 months by building a solution on MongoDB Atlas.
THEIR CHALLENGE

Alleviating manual document review using MongoDB Atlas for Oxy

Since its founding in 1920, Occidental Petroleum (Oxy) has accumulated more than 12 million land-lease agreements, which govern the use of oil, gas, and minerals. “There are hundreds of people who use these documents every day,” said Alexander Lach, Artificial Intelligence (AI) Development Manager at Oxy. “It takes a lot of effort to find the information you need.”

Oxy had planned to hire 30 contractors to manually review a batch of 1.5 million documents, classify them, and extract pertinent information. But the company’s engineers thought they could significantly cut the project allocation of $4 million and 18 months by using a mix of cloud-based solutions that integrate with MongoDB Atlas.

OUR SOLUTION

Building a multi-cloud solution to accommodate thousands of requests per second

With a data layer built on MongoDB, Oxy developed an automated, event-driven approach that combines serverless computing from Amazon Web Services (AWS) and off-the-shelf large language models (LLMs). “We went into this with zero experience using MongoDB,” said Lach. “What was amazing for us was that we were able to get something up and running in a matter of days.”

Oxy’s automated document review and classification system runs entirely in the cloud, with MongoDB Atlas as its data layer. Documents are scanned into Oxy’s enterprise content management system as PDFs and moved automatically to Amazon Simple Storage Service (Amazon S3). An event-driven architecture built around AWS Lambda, Amazon’s serverless compute service, processes files concurrently and sends them to Microsoft Azure AI Document Intelligence. Using advanced machine learning techniques for optical character recognition, the solution extracts text so that the LLM can classify each document into one of 140 categories. The solution also extracts key metadata, such as dates or complex legal terms.
Automated document review and classification system diagram.
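The event-driven step described above can be illustrated with a minimal Python sketch. It is not Oxy’s code: it only shows how a Lambda-style handler might read the bucket and object key out of a standard Amazon S3 event notification and hand each file to a processing step. The `process_pdf` callback and the `make_handler` helper are hypothetical names; in a real pipeline the callback would be where the file is sent on for text extraction.

```python
import urllib.parse
from typing import Any, Callable, Dict, List


def extract_s3_objects(event: Dict[str, Any]) -> List[Dict[str, str]]:
    """Return the bucket and key of every object named in an S3 event notification."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        objects.append({
            "bucket": s3_info["bucket"]["name"],
            # S3 URL-encodes object keys in notifications (spaces arrive as "+").
            "key": urllib.parse.unquote_plus(s3_info["object"]["key"]),
        })
    return objects


def make_handler(process_pdf: Callable[[str, str], None]):
    """Build a Lambda-style handler around a hypothetical per-file processing step."""
    def handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
        objects = extract_s3_objects(event)
        for obj in objects:
            # Assumption: process_pdf forwards the file for OCR and classification.
            process_pdf(obj["bucket"], obj["key"])
        return {"processed": len(objects)}
    return handler
```

Because each uploaded file triggers its own invocation, many files can be processed at the same time without the handler managing any concurrency itself.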
At first, Oxy intended to use MongoDB Atlas solely as a data layer to link physical documents to their corresponding digital files. But certain uncommon scenarios, such as a gigabyte-sized PDF, were clogging the pipeline. Instead of refactoring the architecture, the engineers simply changed the data model in their code. They shifted to stateful documents so that the system could pick up where it left off in case of failure. “We went from making a few requests to MongoDB to making thousands of requests a second to MongoDB, and it never really caused an issue,” said Andrew Pruet, Production Engineer Adviser at Oxy.
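One way to picture a stateful document is as a record that tracks which processing stages have finished for each PDF, so a restarted worker can resume at the first unfinished stage. The Python sketch below is an illustration under assumptions, not Oxy’s data model: the stage names, field names, and helper functions are all hypothetical. The filter and update dictionaries it builds use standard MongoDB update operators and could be passed to pymongo’s `Collection.update_one`.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical pipeline stages, in order; the source does not name Oxy's actual stages.
STAGES = ["ocr", "classify", "extract_metadata"]


def new_job(s3_key: str) -> Dict[str, Any]:
    """Initial state document for one PDF: nothing completed yet."""
    return {"_id": s3_key, "completed": [], "status": "pending"}


def next_stage(job: Dict[str, Any]) -> Optional[str]:
    """First stage not yet recorded as completed, or None when the job is finished."""
    for stage in STAGES:
        if stage not in job["completed"]:
            return stage
    return None


def completion_update(job_id: str, stage: str) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Filter and update documents that record a finished stage.

    $addToSet keeps the write idempotent if a retried worker reports
    the same stage twice.
    """
    is_last = stage == STAGES[-1]
    filter_doc = {"_id": job_id}
    update_doc = {
        "$addToSet": {"completed": stage},
        "$set": {"status": "done" if is_last else "in_progress"},
    }
    return filter_doc, update_doc
```

Writing state after every stage is also a plausible reason for the jump in request volume that Pruet describes, since each file now produces several small writes instead of one.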
OUTCOME

Saving more than $4 million and 12 months through automated processes in the cloud

The automated system that the company built using MongoDB Atlas saved Oxy $4 million and 12 months. The company plans to apply the architecture to other areas of the business, anticipating saving a further $3 million.

Engineers are also experimenting with additional MongoDB features. In a proof of concept with MongoDB Atlas Vector Search, they achieved retrieval times of less than 200 milliseconds while searching more than 100 GB of text across millions of documents.
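For readers unfamiliar with Atlas Vector Search, a query is expressed as an aggregation pipeline whose first stage is `$vectorSearch`. The Python sketch below builds such a pipeline as plain dictionaries. It is illustrative only: the index name, the `embedding` and `category` field names, and the candidate multiplier are assumptions, not details from Oxy’s proof of concept.

```python
from typing import Any, Dict, List


def build_vector_search_pipeline(
    query_vector: List[float],
    limit: int = 5,
    index: str = "lease_text_index",  # hypothetical index name
    path: str = "embedding",          # hypothetical field holding the vectors
) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline for an approximate nearest-neighbor search."""
    # numCandidates must be at least `limit` and at most 10000; the 20x
    # multiplier is an illustrative choice, not a recommendation from the source.
    num_candidates = min(limit * 20, 10000)
    return [
        {
            "$vectorSearch": {
                "index": index,
                "path": path,
                "queryVector": query_vector,
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        # Return only the fields of interest plus the similarity score.
        {
            "$project": {
                "_id": 1,
                "category": 1,  # hypothetical field
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]
```

The resulting list could be passed to pymongo’s `Collection.aggregate` against a collection that has a matching vector search index defined in Atlas.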

“I was just blown away,” said Pruet. “We’re excited for what MongoDB has to offer in the future.”

Andrew Pruet, Production Operations Engineer Adviser at Oxy
