A compact symbolic formulation enables mapping of an arbitrarily complex decision tree of a certain type into a highly computationally efficient multidimensional software object. The type of decision trees to which this formulation applies is that known in the art as the Boolean class of balanced decision trees. Parallel lateral slices of an object created by means of this formulation can be executed in constant time — considerably less time than would otherwise be required.
Decision trees of various forms are incorporated into almost all large software systems. A decision tree is a way of hierarchically solving a problem, proceeding through a set of true/false responses to a conclusion. By definition, a decision tree has a treelike structure, wherein each internal node denotes a test on an attribute, each branch from an internal node represents an outcome of a test, and leaf nodes represent classes or class distributions that, in turn represent possible conclusions. The drawback of decision trees is that execution of them can be computationally expensive (and, hence, time-consuming) because each non-leaf node must be examined to determine whether to progress deeper into a tree structure or to examine an alternative. The present formulation was conceived as an efficient means of representing a decision tree and executing it in as little time as possible.
The formulation involves the use of a set of symbolic algorithms to transform a decision tree into a multi-dimensional object, the rank of which equals the number of lateral non-leaf nodes. The tree can then be executed in constant time by means of an order-one table lookup. The sequence of operations performed by the algorithms is summarized as follows:
- Determination of whether the tree under consideration can be encoded by means of this formulation.
- Extraction of decision variables.
- Symbolic optimization of the decision tree to minimize its form.
- Expansion and transformation of all nested conjunctive-disjunctive paths to a flattened conjunctive form composed only of equality checks when possible.
If each reduced conjunctive form contains only equality checks and all of these forms use the same variables, then the decision tree can be reduced to an order-one operation through a table lookup. The speedup to order one is accomplished by distributing each decision variable over a surface of a multidimensional object by mapping the equality constant to an index.
The disadvantage of this formulation is that it requires mapping of each equality constant to a small range in order to keep the multidimensional object small. However, if each constant is reduced to a small range through preprocessing, then the result will be optimal in both time and space.
This work was done by Mark James of Caltech for NASA's Jet Propulsion Laboratory. For further information, access the Technical Support Package (TSP) free on-line at www.techbriefs.com/tsp under the Information Sciences category.
The software used in this innovation is available for commercial licensing. Please contact Karina Edmonds of the California Institute of Technology at (626) 395-2322. Refer to NPO-42004.