
Towards Improved Approaches to Logic Minimizers for Machine Learning

Although the AFD approach of FLASH gives very robust results, it is slow, which essentially restricts its practical use to functions of about 20 or fewer variables. It is therefore important to be able to compare the decompositional logic approach of FLASH to other logic approaches that use the same DFC measure but introduce a restricted bias resulting from an assumed network structure. Such approaches are faster and can be applied to functions of more variables. In this respect, the well-known circuit minimizer Espresso and the standard machine-learning program C4.5 were tested together with EXORCISM. These programs have the following structure/gate-type biases: Espresso assumes a two-level AND-OR network, EXORCISM assumes a two-level AND-EXOR network, and C4.5 assumes an ordered decision tree (the input variables can be multiple-valued).
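
To make the AND-OR versus AND-EXOR bias concrete, consider the parity function, a classical worst case for SOP minimizers. The sketch below is our own illustration (not output of any of the tested programs): it evaluates the same 4-variable parity both as an AND-OR cover, which needs 2^(n-1) = 8 product terms, and as an AND-EXOR cover, which needs only n = 4 single-literal terms.

    from itertools import product

    def eval_sop(cubes, assignment):
        """OR of ANDs: each cube maps a variable index to its required literal value."""
        return any(all(assignment[v] == lit for v, lit in cube.items()) for cube in cubes)

    def eval_esop(cubes, assignment):
        """XOR of ANDs: the same cube format, combined with exclusive-or instead of or."""
        result = False
        for cube in cubes:
            result ^= all(assignment[v] == lit for v, lit in cube.items())
        return result

    n = 4
    # SOP cover of parity: one cube per odd-weight minterm -> 2**(n-1) = 8 cubes.
    sop = [dict(enumerate(bits)) for bits in product([0, 1], repeat=n) if sum(bits) % 2]
    # ESOP cover of parity: n single-literal cubes, i.e. x1 xor x2 xor x3 xor x4.
    esop = [{i: 1} for i in range(n)]

    for bits in product([0, 1], repeat=n):
        a = dict(enumerate(bits))
        assert eval_sop(sop, a) == eval_esop(esop, a) == bool(sum(bits) % 2)
    print(f"parity on {n} vars: SOP cubes = {len(sop)}, ESOP cubes = {len(esop)}")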

Several questions arise: how much robustness (DFC) is lost due to these structural biases, and how much speed is gained in return?

Another important question that must be faced while developing improved minimizers for machine learning applications is the following: can the performance of the FLASH system be improved by orders of magnitude without sacrificing much of its robustness (DFC)?

This question is very important practically. Such a speedup is required to make FLASH useful for important military applications such as High Resolution Radar.
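
For concreteness, the following sketch tallies a DFC-style cost for a decomposed network. It assumes the usual convention that a block with k inputs and m outputs contributes m * 2^k, and that the network cost is the sum over blocks; this convention should be checked against the DFC definition given earlier in this report, and the block sizes below are hypothetical.

    def block_dfc(num_inputs: int, num_outputs: int = 1) -> int:
        # Assumed convention: a k-input, m-output block costs m * 2**k.
        return num_outputs * 2 ** num_inputs

    def network_dfc(blocks):
        """blocks: iterable of (num_inputs, num_outputs) pairs."""
        return sum(block_dfc(k, m) for k, m in blocks)

    # An undecomposed 10-input function versus a (hypothetical)
    # decomposition into three 4-input blocks:
    print(network_dfc([(10, 1)]))        # 1024
    print(network_dfc([(4, 1)] * 3))     # 48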

The data used in learning and algorithm-design applications by the PT group are arbitrary switching functions. Thus, the standard and generally applicable minimization procedures of ``logic synthesis'' can be applied. An extremely important observation is that these functions have quite different properties from the data taken from industrial companies on which programs are tested in the logic-synthesis community (the MCNC benchmarks). In theory, an algorithm should work well on any type of data. However, since all practical network minimization problems are NP-hard, all practical algorithms are by necessity heuristic in nature and are therefore very dependent on the type of data. Taking the data characteristics into account (such as closeness to unate Boolean functions) was, in principle, the main reason for the commercial success of two-level logic minimizers in circuit-design applications.
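
As a toy illustration of such a data characteristic, the sketch below (our own example, not benchmark data) tests whether a completely specified function, given as a full truth table, is unate, i.e., whether flipping each variable from 0 to 1 can move the output in only one direction everywhere.

    from itertools import product

    def is_unate(truth, n):
        """truth: dict mapping each n-tuple of 0/1 inputs to an output 0/1."""
        for v in range(n):
            rising = falling = False
            for bits in product((0, 1), repeat=n):
                if bits[v] == 0:
                    flipped = bits[:v] + (1,) + bits[v + 1:]
                    if truth[bits] < truth[flipped]:
                        rising = True
                    elif truth[bits] > truth[flipped]:
                        falling = True
            if rising and falling:
                return False  # binate in variable v
        return True

    n = 3
    and_or = {b: (b[0] and b[1]) or b[2] for b in product((0, 1), repeat=n)}  # unate
    parity = {b: sum(b) % 2 for b in product((0, 1), repeat=n)}               # binate
    print(is_unate(and_or, n), is_unate(parity, n))  # True False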

What is it that distinguishes the machine learning data from the circuit design data? Our preliminary answer is the following:

  1. ML problems have an extremely high percentage of don't cares (don't cares are combinations of argument values for which the function value is not specified); missing data can be represented as don't cares. A rough illustration is sketched after this list.
  2. Arguments (variables) in ML problems are naturally multiple-valued or continuous.
  3. ML problems involve missing values in inputs (missing fields) and conflicting data, for discrete as well as continuous fields.
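
The following back-of-the-envelope sketch (toy numbers, not measured data) shows why point 1 dominates: a training set of s distinct samples over n binary attributes specifies only s of the 2^n truth-table rows; every other row is a don't care.

    def dont_care_fraction(num_vars: int, num_samples: int) -> float:
        # Each distinct sample pins down one row of the 2**num_vars-row table.
        total = 2 ** num_vars
        return (total - num_samples) / total

    # e.g. 200 training examples over 20 binary attributes:
    print(f"{dont_care_fraction(20, 200):.6%} don't cares")  # 99.980927% don't cares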
