Retrieval algorithms such as the one used by the Orbiting Carbon Observatory-2 (OCO-2) mission generate massive quantities of data of varying quality and reliability. Because only 6% of the retrieved data may be operationally processed, basic operation requires a simple, computationally efficient method of labeling problematic data points or predicting which soundings will fail. This method automatically obtains a filter, based on a small number of input features, that is designed to reduce scatter.
Most machine-learning filter construction algorithms attempt to predict the error in the CO2 value. This method instead adopts a surrogate goal, the Mean Monthly Standard Deviation: it aims to reduce the scatter of the retrieved CO2 rather than solve the harder problem of reducing CO2 error. This lends itself to improved interpretability and performance.
This software reduces the scatter of retrieved CO2 values globally based on a minimal number of input features. It can be used as a prefilter to reduce the number of soundings requested, or as a post-filter to label data quality. Using the Mean Monthly Standard Deviation (MMS) provides a much cleaner, clearer filter than the standard ABS(CO2 - truth) metrics previously employed by competing methods.
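As an illustration of the metric, the sketch below computes a mean monthly standard deviation with and without a filter. The tuple layout, the month keys, the `albedo` feature, and its 0.5 cut-off are all hypothetical, and pooling every sounding globally within each month is an assumption; the Technical Support Package defines the actual procedure.

```python
from collections import defaultdict
from statistics import mean, pstdev


def mean_monthly_std(soundings, keep=lambda features: True):
    """Average, over months, of the std dev of CO2 values that pass `keep`.

    `soundings` is an iterable of (month_key, co2_ppm, features) tuples.
    This layout is an assumption made for the example.
    """
    by_month = defaultdict(list)
    for month, co2, features in soundings:
        if keep(features):
            by_month[month].append(co2)
    # Months with fewer than two surviving soundings carry no scatter information.
    stds = [pstdev(values) for values in by_month.values() if len(values) >= 2]
    return mean(stds) if stds else float("nan")


# Tiny synthetic data set; "albedo" is a hypothetical input feature.
soundings = [
    ("2010-01", 388.0, {"albedo": 0.2}),
    ("2010-01", 390.0, {"albedo": 0.2}),
    ("2010-01", 396.0, {"albedo": 0.9}),
    ("2010-02", 389.0, {"albedo": 0.3}),
    ("2010-02", 391.0, {"albedo": 0.1}),
    ("2010-02", 383.0, {"albedo": 0.8}),
]

unfiltered = mean_monthly_std(soundings)
filtered = mean_monthly_std(soundings, keep=lambda f: f["albedo"] < 0.5)
print(f"MMS unfiltered: {unfiltered:.3f} ppm, filtered: {filtered:.3f} ppm")
```

Note that the metric needs no truth value for CO2: it scores a filter only by how tightly the retained retrievals cluster within each month.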
The software’s main strength lies in a clearer (i.e., fewer features required) filter that more efficiently reduces scatter in retrieved CO2 rather than focusing on the more complex (and easily removed) bias issues.
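A minimal sketch of how a genetic algorithm might search for such a filter follows. It is not the author's implementation: the encoding (one upper threshold per feature, features normalized to [0, 1]), the truncation selection, the uniform crossover, the Gaussian mutation, the `min_keep` constraint, and the synthetic data are all assumptions chosen to keep the example short.

```python
import random
from collections import defaultdict
from statistics import mean, pstdev


def mms(rows, thresholds):
    """Return (mean monthly std dev, kept fraction) for rows passing the filter.

    A row passes when every feature is at or below its threshold.
    """
    by_month = defaultdict(list)
    kept = 0
    for month, co2, feats in rows:
        if all(f <= t for f, t in zip(feats, thresholds)):
            by_month[month].append(co2)
            kept += 1
    stds = [pstdev(v) for v in by_month.values() if len(v) >= 2]
    return (mean(stds) if stds else float("inf")), kept / len(rows)


def fitness(rows, thresholds, min_keep=0.5):
    """Lower is better. Filters that discard too much data are rejected (assumed rule)."""
    score, frac = mms(rows, thresholds)
    return score if frac >= min_keep else float("inf")


def evolve(rows, n_features, rng, pop_size=30, generations=40, sigma=0.1):
    pop = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]
    pop[0] = [1.0] * n_features  # "keep everything" guarantees a feasible start
    for _ in range(generations):
        pop.sort(key=lambda t: fitness(rows, t))
        parents = pop[: pop_size // 2]  # truncation selection, best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            # Gaussian mutation, clipped to the normalized feature range
            child = [min(1.0, max(0.0, g + rng.gauss(0, sigma))) for g in child]
            children.append(child)
        pop = parents + children
    return min(pop, key=lambda t: fitness(rows, t))


# Synthetic soundings: scatter grows with the first feature; the second is irrelevant.
rng = random.Random(0)
rows = []
for month in range(6):
    for _ in range(50):
        noisy_feat = rng.random()
        dummy_feat = rng.random()
        co2 = 390.0 + 0.2 * month + rng.gauss(0, 0.5 + 3.0 * noisy_feat)
        rows.append((month, co2, (noisy_feat, dummy_feat)))

best = evolve(rows, n_features=2, rng=rng)
baseline, _ = mms(rows, [1.0, 1.0])
filtered, kept = mms(rows, best)
print(f"thresholds={best}, MMS {baseline:.2f} -> {filtered:.2f}, kept {kept:.0%}")
```

Because the best half of the population always survives and the unfiltered baseline is seeded into it, the returned filter can never score worse than applying no filter at all.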
This work was done by Lukas Mandrake of Caltech for NASA’s Jet Propulsion Laboratory.
This software is available for commercial licensing. Please contact Dan Broderick at
This Brief includes a Technical Support Package (TSP).

Scatter-Reducing Sounding Filtration Using a Genetic Algorithm and Mean Monthly Standard Deviation (reference NPO-48255) is currently available for download from the TSP library.
Overview
The document is a Technical Support Package associated with NASA Tech Brief NPO-48255, focusing on scatter-reducing sounding filtration using a genetic algorithm and mean monthly standard deviation. It is produced by the Jet Propulsion Laboratory (JPL) at the California Institute of Technology and aims to disseminate results from aerospace-related developments that have broader technological, scientific, or commercial applications.
The primary objective of the document is to provide insights into the methodologies and technologies developed for improving data retrieval quality in atmospheric measurements. One of the key components discussed is the Retrieval Quality Estimation (RQE), which is designed to perform comparably to expert systems, such as those created by Chris O’Dell. The RQE allows for adjustable transparency, enabling users to select varying amounts of data for analysis. It identifies critical parameters that correlate with the quality of data retrieval and facilitates the sorting of soundings based on their likely utility. Importantly, the RQE is designed to be unbiased, not favoring specific geographic regions or timespans, and incorporates both Total Carbon Column Observing Network (TCCON) and Satellite-based High-resolution Atmospheric (SHA) data as truth metrics.
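The idea of sorting soundings by likely utility and then selecting a chosen amount of data can be sketched as follows. The function name, the per-sounding score, and the convention that a lower score means a better sounding are assumptions for illustration; the document does not specify how the RQE score is represented.

```python
def select_top_fraction(soundings, scores, fraction):
    """Return the best-scoring soundings, keeping the requested fraction.

    Lower score is treated as better (an assumed convention).
    """
    if len(soundings) != len(scores):
        raise ValueError("soundings and scores must have the same length")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    # Pair each score with its index so ties fall back to the original order.
    ranked = sorted(zip(scores, range(len(soundings))))
    n_keep = int(fraction * len(soundings))
    return [soundings[i] for _, i in ranked[:n_keep]]


# Hypothetical sounding identifiers and quality scores.
ids = ["a", "b", "c", "d"]
scores = [0.9, 0.1, 0.5, 0.3]
half = select_top_fraction(ids, scores, 0.5)
print(half)
```

Changing `fraction` trades data volume against expected quality without recomputing the scores.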
The document notes that the RQE has been completed for certain data types, specifically Land H-gain and Glint, but not for M-gain due to insufficient data. This highlights the ongoing challenges in data collection and analysis within the field.
Additionally, the document emphasizes the importance of collaboration and support from NASA's Innovative Partnerships Office, which provides further assistance and resources for those interested in the research and technology discussed. The contact information for the office is provided, encouraging engagement with the broader scientific community.
Overall, this Technical Support Package serves as a valuable resource for understanding the advancements in data retrieval techniques and their implications for atmospheric science and related fields. It underscores NASA's commitment to sharing knowledge and fostering innovation in technology that can have wide-ranging applications beyond aerospace.

