DATA MINING AND PATTERN RECOGNITION.



Link to the main class syllabus.

CAPSTONE PROJECT 1999: EASY INTERFACE TO A MEDICAL DATA MINING SYSTEM


People:
Students:
  1. Tim Brandis - timothy.brandis@orcad.com
  2. Michael Levy - levym@ee.pdx.edu
  3. Thang Ta - tat@ee.pdx.edu
  4. Tu Dinh - tux.dinh@intel.com

Advisors:
  1. Karen Steingart
  2. Alan Mishchenko - alanmi@ee.pdx.edu
  3. Marek Perkowski - mperkows@ee.pdx.edu


Meetings:

Wednesday 10:00-11:00. If this meeting time does not work for you, we will change it, but let us meet this week, before Thanksgiving.


NEW
  • MINUTES OF THE MEETING NUMBER 1.
  • MINUTES OF THE MEETING NUMBER 2.

    Supervisors:


    1. Dr. Karen Steingart, Medical Doctor. Dr. Steingart will help you to formulate good interfacing ideas and will answer medical questions.
    2. Alan Mishchenko, Visiting Assistant Professor of Electrical Engineering. Dr. Mishchenko will help you with programming and with integrating software from previous students. He is our Visual C++ expert.
    3. Marek A. Perkowski, Professor of Electrical Engineering. Dr. Perkowski will teach you the principles of the software and help you with integrating software from previous students.

    MAIN READING


    You have to read at least the first of the following papers. The more you read, the better. Read them in the order given below.
    1. M. Perkowski, M. Marek-Sadowska, L. Jozwiak, T. Luba, S. Grygiel, M. Nowicka, R. Malvi, Z. Wang, and J. S. Zhang, "Decomposition of Multiple-Valued Relations," Proc. ISMVL'97, Halifax, Nova Scotia, Canada, May 1997, pp. 13 - 18.
      POSTSCRIPT OF THIS PAPER

    2. C. Files, R. Drechsler, and M. Perkowski, "Functional Decomposition of MVL Functions using Multi-Valued Decision Diagrams," Proc. ISMVL'97, Halifax, Nova Scotia, Canada, May 1997, pp. 27 - 32.
      Slides of Craig Files' presentation at ISMVL'97.
      Postscript of the paper.

    3. M. A. Perkowski, L. Jozwiak, and D. Foote, "Architecture of a Programmable FPGA Coprocessor for Constructive Induction Approach to Machine Learning and other Discrete Optimization Problems," in Reiner W. Hartenstein and Victor K. Prasanna (eds.), "Reconfigurable Architectures. High Performance by Configware," IT Press Verlag, Bruchsal, Germany, 1997, pp. 33 - 40.
      postscript

    4. M. A. Perkowski, "A New Representation of Strongly Unspecified Switching Functions and Its Application to Multi-Level AND/OR/EXOR Synthesis," Proc. of the Second Workshop on Applications of Reed-Muller Expansion in Circuit Design, Chiba City, Japan, 27-29 August 1995, pp. 143-151.
      postscript of the paper
      HTML of the paper
      SLIDE 1. MVCDB representation of functions.
      SLIDE 2. MVCDB representation of functions.
      SLIDE 3. MVCDB representation of functions.
    5. S. Grygiel, M. Perkowski, M. Marek-Sadowska, T. Luba, and L. Jozwiak, "Cube Diagram Bundles, A New Representation of Strongly Unspecified Multiple-Valued Functions and Relations," Proc. ISMVL'97, Halifax, Nova Scotia, Canada, May 1997, pp. 287 - 292.
      POSTSCRIPT OF THIS PAPER


    6. M.A. Perkowski, S. Grygiel, and the Functional Decomposition Group, Department of Electrical Engineering, "A Survey of Literature on Function Decomposition," Version IV, PSU Electrical Engineering Department Report, November 20, 1995.

      postscript



    7. M.A. Perkowski, T. Luba, S. Grygiel, P. Burkey, M. Burns, N. Iliev, M. Kolsteren, R. Lisanke, R. Malvi, Z. Wang, H. Wu, F. Yang, S. Zhou, and J.S. Zhang, "Unified Approach to Functional Decompositions of Switching Functions," PSU Electrical Engineering Department Report, December 29, 1995.
      postscript



    8. M. Perkowski, M. Burns, T. Luba, S. Grygiel, C. Stanley, R. Price, Z. Wang, J. Lu, P. Burkey, D. Manoharan, and S. Mohammad, "Development of Search Strategies for MULTIS," PSU Electrical Engineering Department Report, December 29, 1995.
      postscript


    9. M. Perkowski, M. Burns, R. Almeria, and N. Iliev, "Approaches to the Input-Output Encoding Problem in Boolean Decomposition," PSU Electrical Engineering Department Report, January 9, 1996.
      postscript.
    10. M. Perkowski, L. Jozwiak, and S. Mohamed, "New Approach to Learning Noisy Boolean Functions," Proc. ICCIMA'98 Conference, February 1998, Australia, published by World Scientific, pp. 693 - 706.
      Postscript


    11. C. Files and M. Perkowski, "Multi-Valued Functional Decomposition as a Machine Learning Method," Proc. ISMVL'98, May 1998.


    12. M. A. Perkowski, T. Ross, D. Gadd, J.A. Goldman, and N. Song, "Application of ESOP Minimization in Machine Learning and Knowledge Discovery," Proc. of the Second Workshop on Applications of Reed-Muller Expansion in Circuit Design, Chiba City, Japan, 27-29 August 1995, pp. 102-109.

      HTML of the paper
      postscript of the paper
    13. C. Files and M. Perkowski, "An Error Reducing Approach to Machine Learning Using Multi-Valued Functional Decomposition," Proc. ISMVL'98, May 1998.


    14. S. Grygiel and M. Perkowski, "New Compact Representation of Multiple-Valued Functions, Relations, and Non-deterministic State Machines," Proc. ICCD 1998, October 1998.
      Postscript of slides.

      Postscript of paper.
    15. B. Steinbach, M. Perkowski, and Ch. Lang, "Bi-Decomposition in Multi-Valued Logic for Data Mining," Proc. ISMVL'99, May 1999.
      Postscript.
    16. M. Perkowski, R. Malvi, S. Grygiel, M. Burns, and A. Mishchenko, "Graph Coloring Algorithms for Fast Evaluation of Curtis Decompositions," Proc. DAC'99, New Orleans, LA, USA, June 21-25, 1999.
      PowerPoint presentation
      University booth poster
    17. M. A. Perkowski, A. N. Chebotarev, and A. A. Mishchenko, "Evolvable Hardware or Learning Hardware? Induction of State Machines from Temporal Logic Constraints," The First NASA/DOD Workshop on Evolvable Hardware (NASA/DOD-EH 99), Jet Propulsion Laboratory, Pasadena, California, USA, July 19-21, 1999.
    18. C. Files and M. Perkowski, "Multi-Valued Decision Diagrams," submitted to the ICCAD'99 conference.


    19. C. Files and M. Perkowski, "Multiple-Valued Decision Diagrams," submitted to IEEE Trans. on CAD, March 1999.

    QUESTIONS FOR SELF-EVALUATION


    Try to answer the following questions.
    1. What is Data Mining?
    2. What is Machine Learning?
    3. What are the well-known applications of Machine Learning and Data Mining?
    4. What is the Occam's Razor principle, and how is it used in practical methods?
    5. What is a Boolean Function? What is a multi-output Boolean Relation? What is an incompletely specified Boolean Function? What is a strongly unspecified Boolean Function?
    6. What is a Multi-Valued Function? What is a multi-output Multi-Valued Relation? Why are they important for this project?
    7. What is Ashenhurst Decomposition?
    8. What is Curtis Decomposition?
    9. How are multi-valued functions decomposed?
    10. How are multi-valued relations decomposed?
    11. How and why is graph coloring used in decomposition?
    12. Describe the principles of the C4.5 program for Data Mining.
    13. Compare C4.5 and our approaches.
    14. Other approaches based on Constructive Induction.
    15. Variable Partitioning and Variable Ordering problems.
    16. Input Data formats that we will use.
    17. Output Data formats that we will use.
    18. Internal Data formats that we will use.
    19. The role of unknowns and don't cares.
    20. The role of noise.
    21. The role of discretization of continuous variables.

    IT IS RECOMMENDED FOR THE STUDENTS TO FAMILIARIZE THEMSELVES WITH THE FOLLOWING ADDITIONAL LITERATURE:

    1. U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, "Advances in Knowledge Discovery and Data Mining."
    2. W. Ziarko, "Rough Sets, Fuzzy Sets and Knowledge Discovery," Springer Verlag, 1994.
    3. Zdzislaw Pawlak, "Rough Sets," Kluwer Academic Publishers, 1991.
    4. Marek Perkowski, "Data Mining and Pattern Recognition using Inductive Methods."
      Textbook in preparation.
      Will be available from the professor on a chapter-by-chapter basis.