Tuesday, March 30, 2010

Paper Summary - Data Mining: An Overview from a Database Perspective

M. Chen, J. Han, and P. Yu, "Data Mining: An Overview from a Database Perspective", IEEE Transactions on Knowledge and Data Engineering, 8(6): 866-883, 1996.

This is a seminal paper on mining information from large databases. It surveys data mining techniques from a database researcher's perspective.

The paper discusses key features and challenges:


  • Different types of data
  • Efficiency and Scalability of algorithms
  • Accuracy and usefulness of results
  • How results are conveyed
  • Multiple Abstraction Levels
  • Mining different sources
  • Privacy and security



The authors go on to classify data mining schemes. Schemes can be classified according to the data they examine, the kind of knowledge they mine, and the technique they employ. The paper organizes its survey by the kind of knowledge being mined:


  • Association rules
  • Data Generalization and Summarization
  • Classification of huge amounts of data
  • Data Clustering
  • Pattern-based similarity search
  • Path traversal patterns
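As a rough illustration of the first item, an association rule X => Y is commonly scored by its support (the fraction of transactions containing both X and Y) and its confidence (the fraction of transactions containing X that also contain Y). The sketch below is only a toy example: the `support` and `confidence` helpers and the `baskets` data are made up for illustration and are not code from the paper.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent => consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never fires, so report zero confidence
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / base


# Hypothetical market-basket data, invented for this example.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
print(f"bread => milk: support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

A rule is typically kept only when both numbers clear user-chosen minimum thresholds. Scanning every transaction for every candidate itemset, as this sketch does, is the naive approach; making that step scale to large databases is the kind of efficiency concern listed among the challenges above.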


The paper describes each of these items in great detail. It is long but thorough, and a good way to build a foundation on this topic. I don't see any mention of a confidence approach.
