Fast Query-Optimized Kernel-Machine Classification

NASA's Jet Propulsion Laboratory, Pasadena, California

A recently developed algorithm performs kernel-machine classification via incremental approximate nearest support vectors. The algorithm implements support-vector machines (SVMs) at speeds 10 to 100 times those attainable by use of conventional SVM algorithms. The algorithm offers potential benefits for classification of images, recognition of speech, recognition of handwriting, and diverse other applications in which there are requirements to discern patterns in large sets of data.

SVMs constitute a subset of kernel machines (KMs), which have become popular as models for machine learning and, more specifically, for automated classification of input data on the basis of labeled training data. While similar in many ways to k-nearest-neighbors (k-NN) models and artificial neural networks (ANNs), SVMs tend to be more accurate. Using representations that scale only linearly in the numbers of training examples, while exploring nonlinear (kernelized) feature spaces that are exponentially larger than the original input dimensionality, KMs elegantly and practically overcome the classic "curse of dimensionality." However, the price that one must pay for the power of KMs is that query-time complexity scales linearly with the number of training examples, making KMs often orders of magnitude more computationally expensive than are ANNs, decision trees, and other popular machine learning alternatives.

The present algorithm treats an SVM classifier as a special form of a k-NN. The algorithm is based partly on an empirical observation that one can often achieve the same classification as that of an exact KM by using only small fraction of the nearest support vectors (SVs) of a query.

The exact KM output is a weighted sum over the kernel values between the query and the SVs. In this algorithm, the KM output is approximated with a k-NN classifier, the output of which is a weighted sum only over the kernel values involving k selected SVs. Before query time, there are gathered statistics about how misleading the output of the k-NN model can be, relative to the outputs of the exact KM for a representative set of examples, for each possible k from 1 to the total number of SVs. From these statistics, there are derived upper and lower thresholds for each step k. These thresholds identify output levels for which the particular variant of the k-NN model already leans so strongly positively or negatively that a reversal in sign is unlikely, given the weaker SV neighbors still remaining.

At query time, the partial output of each query is incrementally updated, stopping as soon as it exceeds the predetermined statistical thresholds of the current step. For an easy query, stopping can occur as early as step k = 1. For more difficult queries, stopping might not occur until nearly all SVs are touched. A key empirical observation is that this approach can tolerate very approximate nearest-neighbor orderings. In experiments, SVs and queries were projected to a subspace comprising the top few principal- component dimensions and neighbor orderings were computed in that subspace. This approach ensured that the overhead of the nearest-neighbor computations was insignificant, relative to that of the exact KM computation.

This work was done by Dominic Mazzoni and Dennis DeCoste of Caltech for NASA's jet Propulsion Laboratory. For further information, access the Technical Support Package (TSP) free on-line at www.techbriefs.com/tsp under the Information Sciences category.

The software used in this innovation is available for commercial licensing. Please contact Don Hart of the California Institute of Technology at (818) 393-3425. Refer to NPO-40441.

This Brief includes a Technical Support Package (TSP).

Fast-Query-Optimized Kernal-Machine Classification

(reference NPO-40441) is currently available for download from the TSP library.

Don't have an account?

Overview

The document is a Technical Support Package for the Fast Query-Optimized Kernel-Machine Classification, identified as NPO-40441, published by NASA Tech Briefs. It serves as a comprehensive resource detailing advancements in aerospace-related technologies that have potential applications beyond their original context. The package is part of NASA's Commercial Technology Program, which aims to promote the dissemination of innovative technologies developed through NASA's research efforts.

The document emphasizes the importance of making these technological advancements accessible to a wider audience, including industries and researchers who can benefit from them. It highlights the collaborative nature of NASA's work, encouraging partnerships and the sharing of knowledge to foster further innovation.

Included in the package are various technical details and methodologies related to the kernel-machine classification technique, which is designed to optimize query performance in data processing tasks. This technique is particularly relevant in fields that require efficient data analysis and machine learning applications, such as aerospace engineering, robotics, and artificial intelligence.

The document also provides contact information for the NASA Scientific and Technical Information (STI) Program Office, which serves as a resource for additional information and support. The STI Program Office can be reached through their website or directly via phone and email, ensuring that users have access to further assistance regarding the technologies discussed in the package.

Moreover, the document includes a disclaimer stating that the United States Government and its representatives do not assume liability for the use of the information contained within. It clarifies that any mention of trade names or manufacturers is for identification purposes only and does not imply endorsement by NASA.

In summary, this Technical Support Package is a valuable resource for understanding the advancements in kernel-machine classification technology, its applications, and the broader implications for various industries. It reflects NASA's commitment to sharing knowledge and fostering innovation in aerospace and related fields, while also providing avenues for further inquiry and support.

October 1, 2004 | Information Technology

Fast Query-Optimized Kernel-Machine Classification

Computation is accelerated by an order of magnitude, without loss of accuracy.

This Brief includes a Technical Support Package (TSP).

Fast-Query-Optimized Kernal-Machine Classification

Overview

Top Stories

Blog: Lighting

Blog: Energy

INSIDER: Energy

Quiz: Energy

Blog: Physical Sciences

Blog: Materials

Webcasts

Upcoming Webinars: Test & Measurement

Upcoming Webinars: Aerospace

Upcoming Webinars: Test & Measurement

Upcoming Webinars: Power

Upcoming Webinars: Automotive

Upcoming Webinars: Test & Measurement

October 1, 2004
| Information Technology