Wednesday, March 3, 2010

Paper Summary - Semantics for the Semantic Web: The Implicit, the Formal and the Powerful

A. Sheth, C. Ramakrishnan, and C. Thomas. Semantics for the Semantic Web: The Implicit, the Formal and the Powerful. International Journal on Semantic Web and Information Systems, 1(1), 1-18, Idea Group, 2005.

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.83.9929&rep=rep1&type=pdf

This paper discusses both the limitations of relying on description logics alone and the need to use different kinds of semantics to handle the complexity of exploiting Semantic Web data. In particular, it organizes these semantics into three categories:
  • Implicit - derived from patterns in the data; examples include co-occurrence and links
  • Formal - expressed in a formal language with well-defined syntactic rules; description logics fall under this category
  • Powerful - derived from statistical analysis of patterns in the data
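
As a rough illustration of the implicit category, co-occurrence can be counted directly from raw text. The sketch below is a toy example and not something taken from the paper: the three sample documents and the function name `cooccurrence_counts` are made up for illustration, and real systems would use proper tokenization and much larger corpora.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(documents):
    """Count how often each unordered pair of terms appears in the same document."""
    counts = Counter()
    for doc in documents:
        # Deduplicate and sort so each pair has one canonical (alphabetical) form.
        terms = sorted(set(doc.lower().split()))
        for pair in combinations(terms, 2):
            counts[pair] += 1
    return counts

# Hypothetical mini-corpus, for illustration only.
docs = [
    "aspirin treats headache",
    "aspirin relieves headache",
    "ibuprofen treats fever",
]

counts = cooccurrence_counts(docs)
for pair, n in counts.most_common(3):
    print(pair, n)
```

Nothing in the data states that aspirin and headache are related, yet the pair stands out simply because it appears together more often than the others. That is the sense in which such semantics are implicit.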

The following statements from the paper are particularly interesting:

"Even though it is desirable to have a consistent knowledge base, it becomes impractical as the size of the knowledge base increases or as knowledge from many sources is added. It is rare that human experts in most scientific domains have a full and complete agreement. In these cases it becomes more desirable that the system can deal with inconsistencies."

"Sometimes it is useful to look at a knowledge base as a map. This map can be partitioned according to different criteria, e.g. the source of the facts or their domain. While on such a map the knowledge is usually locally consistent, it is almost impossible and practically infeasible to maintain a global consistency. Experience in developing the Cyc ontology demonstrated this challenge. Hence, a system must be able to identify sources of inconsistency and deal with contradicting statements in such a way that it can still produce derivations that are reliable."
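
The "map" idea in the quote above can be sketched in a few lines. This is only a toy illustration under strong assumptions, not the authors' method: facts are hypothetical (source, subject, predicate, value) tuples, every predicate is assumed to allow a single value, and the names `partition_by_source` and `conflicts` are invented here.

```python
from collections import defaultdict

# Hypothetical facts: (source, subject, predicate, value).
facts = [
    ("source_a", "pluto", "is_planet", True),
    ("source_a", "mars", "is_planet", True),
    ("source_b", "pluto", "is_planet", False),
    ("source_b", "mars", "is_planet", True),
]

def partition_by_source(fact_list):
    """Split the knowledge base into one partition per source."""
    parts = defaultdict(list)
    for fact in fact_list:
        parts[fact[0]].append(fact)
    return dict(parts)

def conflicts(fact_list):
    """Map each (subject, predicate) with more than one distinct value
    to the set of sources that made a statement about it."""
    values = defaultdict(set)
    sources = defaultdict(set)
    for source, subject, predicate, value in fact_list:
        key = (subject, predicate)
        values[key].add(value)
        sources[key].add(source)
    return {key: sources[key] for key in values if len(values[key]) > 1}

# Each partition is checked on its own, then the whole base is checked.
local_conflicts = {src: conflicts(part)
                   for src, part in partition_by_source(facts).items()}
global_conflicts = conflicts(facts)

print(local_conflicts)
print(global_conflicts)
```

In this toy data each source is consistent with itself, while the combined base contains one contradiction, and the check also reports which sources are involved. A system could use that information to decide which derivations to trust.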

The authors then discuss current approaches for dealing with this inconsistency:

  • Probabilistic reasoning
  • Possibilistic reasoning
  • Fuzzy reasoning
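
To make the difference between these three approaches a little more concrete, here is a minimal sketch of how each one treats uncertain statements. It uses standard textbook definitions rather than anything specific to the paper: the product rule for independent events, possibility and necessity measures over a possibility distribution, and Zadeh's min operator for fuzzy conjunction. All function names and numbers are made up for illustration.

```python
def prob_and_independent(p_a, p_b):
    """Probabilistic: probability of A and B, assuming A and B are independent."""
    return p_a * p_b

def possibility(pi, event):
    """Possibilistic: an event is as possible as its most plausible alternative."""
    return max((pi[w] for w in event), default=0.0)

def necessity(pi, event):
    """Possibilistic: an event is necessary to the degree its complement is impossible."""
    complement = set(pi) - set(event)
    return 1.0 - possibility(pi, complement)

def fuzzy_and(mu_a, mu_b):
    """Fuzzy: truth degree of 'A and B' using the min operator."""
    return min(mu_a, mu_b)

# Hypothetical possibility distribution over mutually exclusive classifications.
pi = {"planet": 1.0, "dwarf_planet": 0.5, "asteroid": 0.0}

print(prob_and_independent(0.5, 0.5))   # two independent 50% events
print(possibility(pi, {"planet"}))      # how possible "planet" is
print(necessity(pi, {"planet"}))        # how certain "planet" is
print(fuzzy_and(0.5, 0.5))              # two statements that are each half true
```

The same input degrees give different results depending on the framework, which hints at why combining knowledge annotated under different uncertainty models is not straightforward.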

The paper highlights drawbacks of these methods and argues that standardization is needed in this area.

The paper then correlates semantic capabilities with the types of semantics, in relation to the bootstrapping and utilization phases.

The last part of this paper discusses information integration, information retrieval and extraction, data mining and analytical applications.

Some of the interesting papers it references:

R. Agrawal, T. Imielinski, and A. N. Swami. Mining association rules between sets of items in large databases. In P. Buneman and S. Jajodia, editors, Proceedings of the 1993.

D. Barbará, H. Garcia-Molina, and D. Porter. The Management of Probabilistic Data. IEEE Transactions on Knowledge and Data Engineering, 4(5): 487-502, October 1992.

Jochen Heinsohn: Probabilistic Description Logics. UAI 1994: 311-318.
Int’l Journal on Semantic Web & Information Systems, 1(1), 1-18, Jan-March 2005

Hui Han, C. Lee Giles, Hongyuan Zha, Cheng Li, and Kostas Tsioutsiouliklis. "Two Supervised Learning Approaches for Name Disambiguation in Author Citations", in Proceedings of ACM/IEEE Joint Conference on Digital Libraries (JCDL 2004), pages 296-305, 2004.

Birger Hjorland. Information retrieval, text composition, and semantics. Knowledge Organization 25(1/2): 16-31, 1998.

Vipul Kashyap, Amit Sheth: Semantic Heterogeneity in Global Information Systems: The Role of Metadata, Context and Ontologies, Cooperative Information Systems 1996

M. Kuramochi and G. Karypis. Finding frequent patterns in a large sparse graph. In SIAM International Conference on Data Mining (SDM-04), 2004.

Alexander Maedche, Steffen Staab: Ontology Learning for the Semantic Web. IEEE Intelligent Systems 16(2): 72-79 (2001)

B. Omelayenko. Learning of Ontologies for the Web: the Analysis of Existent approaches. In Proceedings of the International Workshop on Web Dynamics, 2001.

Erhard Rahm, Philip A. Bernstein. A Survey of Approaches to Automatic Schema Matching. In VLDB Journal 10: 4, 2001

Amit P. Sheth, Sanjeev Thacker, Shuchi Patel: Complex relationships and knowledge discovery support in the InfoQuilt system. VLDB J. 12(1): 2-27 (2003)

J. Townley, The Streaming Search Engine That Reads Your Mind, August 10, 2000. http://smw.internet.com/gen/reviews/searchassociation/

William A. Woods: “Meaning and Links: A Semantic Odyssey”. Principles of Knowledge Representation and Reasoning: Proceedings of the Ninth International Conference (KR2004), June 2-5, 2004. pp. 740-742

Lotfi A. Zadeh. Toward a perception-based theory of probabilistic reasoning with imprecise probabilities. In Journal of Statistical Planning and Inference 105 (2002) 233-264.
