In many fields of spectroscopy, classifying a measured multivariate response is not straightforward. Accuracy is limited by factors that depend strongly on the spectroscopic method in use. In all cases, however, classification is complicated by the complexity of the multivariate response, the quality and size of the reference database, and the ever-present contamination by noise and undesired background. A new classification scheme addresses these complications with a fast, simple, and easily implemented algorithm that accurately identifies individual features of spectroscopic mixtures. The method is applied to Raman spectroscopy, using the RRUFF database to classify mineralogical mixtures measured with a time-gated Raman spectrometer.
By leveraging widely used constrained regression techniques and modifying the internals of their implementations, a multiple-stage classification approach was developed to identify a linear mixture of Raman spectra (or other multivariate spectroscopic responses) with the aid of a large database that contains many groups of correlated predictors. The power of the method rests on its first stage, a constrained regression called the elastic net, which retains the advantages of both the least absolute shrinkage and selection operator (LASSO) and ridge regression. The elastic net is implemented with a modified coordinate descent algorithm, and its power and simplicity of implementation make it more computationally attractive for mineral classification than other successfully applied classification methods. The modifications force the coefficients to be non-negative and significantly improve convergence speed and performance simply by optimizing the weights in a permuted order. The second stage is a regression on the retained groups of predictors, performed with a non-negative reduced gradient descent algorithm with weight thresholding, which yields a more parsimonious, mathematically accurate, and unbiased model.
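The two stages described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the author's implementation: the function names, the penalty parameterization (`lam`, `alpha`), the threshold value, and the synthetic Gaussian-peak library standing in for RRUFF are all hypothetical. "Permuted order" is interpreted here as a fresh random visiting order on every sweep, and the second stage substitutes a plain projected gradient descent for the reduced gradient method named in the text.

```python
import numpy as np


def nonneg_elastic_net(X, y, lam=1e-3, alpha=0.5, n_sweeps=500, tol=1e-10, seed=0):
    """Stage 1: non-negative elastic net by coordinate descent in permuted order.

    Minimizes (1/2n)||y - Xw||^2 + lam*(alpha*||w||_1 + (1-alpha)/2*||w||^2), w >= 0.
    """
    n, p = X.shape
    rng = np.random.default_rng(seed)
    w = np.zeros(p)
    residual = np.array(y, dtype=float)        # equals y - X @ w while w is all zeros
    col_sq = (X ** 2).sum(axis=0) / n          # (1/n) * x_j . x_j for every column
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in rng.permutation(p):           # assumed: new random order each sweep
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes its own term.
            rho = X[:, j] @ residual / n + col_sq[j] * w[j]
            # Soft-threshold (L1), clip at zero (non-negativity), shrink (ridge).
            new_wj = max(0.0, rho - lam * alpha) / (col_sq[j] + lam * (1.0 - alpha))
            delta = new_wj - w[j]
            if delta != 0.0:
                residual -= delta * X[:, j]
                w[j] = new_wj
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return w


def nonneg_refit(X, y, support, n_iter=2000, threshold=0.02):
    """Stage 2: unpenalized non-negative least squares on the retained columns,
    solved by projected gradient descent, followed by weight thresholding."""
    full = np.zeros(X.shape[1])
    if len(support) == 0:
        return full
    Xs = X[:, support]
    step = 1.0 / np.linalg.norm(Xs, 2) ** 2    # 1 / largest eigenvalue of Xs.T @ Xs
    w = np.zeros(Xs.shape[1])
    for _ in range(n_iter):
        grad = Xs.T @ (Xs @ w - y)
        w = np.maximum(0.0, w - step * grad)   # gradient step, then project onto w >= 0
    w[w < threshold] = 0.0                     # drop negligible contributions
    full[support] = w
    return full


# Hypothetical stand-in for a spectral library: 20 entries, 300 channels.
channels = np.arange(300)
centers = 10 + 14 * np.arange(20)


def peak(center):
    return np.exp(-0.5 * ((channels - center) / 3.0) ** 2)


# Each entry has its own main peak plus a weaker peak shared with another entry,
# so the library contains correlated predictors.
library = np.column_stack(
    [peak(centers[j]) + 0.5 * peak(centers[(j + 7) % 20]) for j in range(20)]
)
noise = np.random.default_rng(1).normal(scale=0.005, size=300)
mixture = 0.7 * library[:, 3] + 0.3 * library[:, 11] + noise

stage1 = nonneg_elastic_net(library, mixture)
support = np.flatnonzero(stage1 > 0)
stage2 = nonneg_refit(library, mixture, support)
print("retained after stage 1:", support)
print("final weights:", {int(j): round(float(stage2[j]), 3) for j in np.flatnonzero(stage2)})
```

In this sketch the first stage only has to select candidate library entries; its weights are shrunk by the penalty. The unpenalized refit on the retained columns removes that shrinkage, which is one way to read the "unbiased model" of the second stage.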
Unlike LASSO and independent component analysis, the method retains groups of correlated predictors while preserving sparsity, which is essential when rock samples contain multiple mineral phases that share similar Raman features. Furthermore, because the algorithm is easily implemented, has very few tuning parameters, requires no extensive parameter training, and requires no dimensionality reduction of the data before classification, it is very attractive for real-time classification in portable Raman spectrometer systems. The method is useful not only for classifying Raman spectra but also for any other spectroscopic classification scheme that is accompanied by a large database.
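The grouping behavior can be seen in a toy case with two identical library spectra. The sketch below is an assumption-laden illustration, not the author's code: the function name, the penalty parameterization (`lam`, `alpha`), and the synthetic peaks are hypothetical, and the coordinates are visited in plain cyclic order to keep the example short. Setting `alpha=1.0` gives a non-negative LASSO, while `alpha=0.5` gives a non-negative elastic net.

```python
import numpy as np


def nonneg_coordinate_descent(X, y, lam, alpha, n_sweeps=500):
    """Non-negative elastic net by cyclic coordinate descent (alpha=1 gives LASSO)."""
    n, p = X.shape
    w = np.zeros(p)
    residual = np.array(y, dtype=float)        # equals y - X @ w while w is all zeros
    col_sq = (X ** 2).sum(axis=0) / n          # (1/n) * x_j . x_j for every column
    for _ in range(n_sweeps):
        for j in range(p):
            rho = X[:, j] @ residual / n + col_sq[j] * w[j]
            new_wj = max(0.0, rho - lam * alpha) / (col_sq[j] + lam * (1.0 - alpha))
            residual -= (new_wj - w[j]) * X[:, j]
            w[j] = new_wj
    return w


channels = np.arange(200)


def unit_peak(center):
    g = np.exp(-0.5 * ((channels - center) / 4.0) ** 2)
    return g * np.sqrt(len(channels) / (g @ g))    # scale so (1/n) * g.g == 1


# Columns 0 and 1 are identical (a perfectly correlated group); column 2 is unrelated.
X = np.column_stack([unit_peak(60), unit_peak(60), unit_peak(140)])
y = X[:, 0] + X[:, 1]

lasso = nonneg_coordinate_descent(X, y, lam=0.2, alpha=1.0)
enet = nonneg_coordinate_descent(X, y, lam=0.2, alpha=0.5)
print("LASSO      :", np.round(lasso, 3))
print("elastic net:", np.round(enet, 3))
```

Under these assumptions the LASSO fit places essentially all of the weight on one copy of the duplicated spectrum, while the elastic net splits the weight equally between the two copies and still leaves the unrelated column at zero.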
This work was done by Corey J. Cochrane of Caltech for NASA’s Jet Propulsion Laboratory.