
Summary of DFC Measurements and Applications

A number of experiments have been conducted to assess the generality of DFC across different problem domains (see [18, 8, 19]). The DFC of over 800 non-randomly generated functions was measured, spanning many classes of functions (numeric, symbolic, chaotic, string-based, graph-based, images, and files). Roughly 98 percent of these non-randomly generated functions had low DFC, versus less than 1 percent of random functions. The 2 percent that did not decompose were the more complex of the non-randomly generated functions, not some class of low-complexity functions that AFD could not handle. It is important to note that when AFD reports a low DFC, as it did some 800 times, it also provides an algorithm (or a description of the pattern or features found). AFD rediscovered the classical algorithms for a number of functions. Each of these algorithms is represented in a combinational form that includes intermediate states; these intermediate states are features in the sense that they concentrate information.
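
To make the DFC bookkeeping concrete, here is a minimal sketch (ours, not from the cited work) that assumes the usual convention: a block with k inputs and m outputs has cardinality m * 2^k (the size of its lookup table), and the DFC of a decomposition is the sum of its block cardinalities.

    # Minimal sketch of DFC tallying for a fixed decomposition.
    # Assumption: a k-input, m-output block costs m * 2**k, and the
    # DFC of the whole function is the sum over its blocks.

    def block_dfc(num_inputs: int, num_outputs: int = 1) -> int:
        """Cardinality (lookup-table size) of one block."""
        return num_outputs * 2 ** num_inputs

    def total_dfc(blocks) -> int:
        """DFC of a decomposition: sum of the block cardinalities."""
        return sum(block_dfc(k, m) for k, m in blocks)

    # Example: an 8-input function left undecomposed costs 2**8 = 256,
    # but split into two 4-input blocks feeding a 2-input block it
    # costs 16 + 16 + 4 = 36, i.e. it has low DFC.
    print(total_dfc([(8, 1)]))                  # 256
    print(total_dfc([(4, 1), (4, 1), (2, 1)]))  # 36

The gap between the undecomposed cost and the best decomposition found is exactly what makes DFC usable as a complexity measure: patterned functions decompose cheaply, random ones do not.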

There is also a high correlation between DFC and a person's judgment of the strength of a pattern in an image, the degree of compression achieved by a data compression program, and the Lyapunov exponents of logistic functions. This shows that DFC reflects patterns within each of these domains, despite their different natures.
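
As an illustration of the last of these quantities, the following sketch (ours, not the paper's) estimates the Lyapunov exponent of the logistic map x_{n+1} = r x_n (1 - x_n) by averaging ln|f'(x)| = ln|r(1 - 2x)| along an orbit; positive values indicate chaos, the regime in which the reported correlation predicts high DFC.

    import math

    def logistic_lyapunov(r: float, x0: float = 0.4, warmup: int = 1000,
                          samples: int = 100_000) -> float:
        """Average of ln|r*(1-2x)| along a logistic-map orbit."""
        x = x0
        for _ in range(warmup):          # discard transient behaviour
            x = r * x * (1.0 - x)
        total = 0.0
        for _ in range(samples):
            x = r * x * (1.0 - x)
            total += math.log(abs(r * (1.0 - 2.0 * x)))
        return total / samples

    print(logistic_lyapunov(2.9))   # negative: periodic, low complexity
    print(logistic_lyapunov(3.9))   # positive: chaotic, high complexity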

In learning experiments, AFD has been compared to back-propagation-trained neural networks (NN), AIM, C4.5, SMOG [15], and standard logic minimization tools. These comparisons used a broad range of function types (drawn from the 800 functions mentioned above). On each of the test functions, AFD performed near the best of all methods, while every other method generalized well on some functions but poorly on others.

In the context of ML, ``noise" usually refers to the situation where a training example is classified correctly as a ``1" or a ``0" but noise then flips that bit to the incorrect value. A different situation is that the value of some minterm is, or becomes, unknown -- a ``don't care" -- which the ML community refers to as ``unknown". ML techniques can be used to restore noisy images [4]. Of course, we do not know which pixels (or feature values) are noisy, so we treat those that are most suspicious as don't cares, as the sketch below illustrates.
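
The following sketch (hypothetical, not the authors' code) illustrates the distinction on a toy truth table: noise flips a recorded output bit, while demoting a suspicious entry to a don't care records it as unknown ('-') instead of trusting a possibly corrupted value.

    import random

    # Toy truth table: minterm -> output bit.
    truth_table = {'000': '1', '001': '0', '010': '1', '011': '0'}

    def add_noise(table, p=0.3, seed=0):
        """Flip each output bit with probability p (classification noise)."""
        rng = random.Random(seed)
        return {m: ('1' if v == '0' else '0') if rng.random() < p else v
                for m, v in table.items()}

    def to_dont_cares(table, suspicious):
        """Demote suspicious entries to don't cares rather than trust them."""
        return {m: '-' if m in suspicious else v for m, v in table.items()}

    noisy = add_noise(truth_table)        # with this seed, '011' gets flipped
    cleaned = to_dont_cares(noisy, suspicious={'011'})
    print(noisy)    # the flipped bit looks like a valid (wrong) label
    print(cleaned)  # the don't care makes the uncertainty explicit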

The more robust generalization performance of AFD, including its handling of noise and unknown data, reflects its robust measure of complexity, DFC. It also indicates that the useful inductive bias of these various standard methods results from their parsimonious tendencies rather than from their particular representation schemes (be they thresholded weighted sums, polynomials, decision trees, or Boolean operators).

