Friday, July 2, 2010

Linux on Windows

If you aren't aware of Wubi yet, it is a great way to run the Ubuntu flavor of Linux on Windows. The installation is easy and you can control how much disk space you want to use. I am slowly rebuilding my laptops to run Linux, but in the meantime this is a nice way to keep Windows and run Linux too.

Get it

Saturday, May 29, 2010

Hadoop + SPARQL

I began working with cloud computing about 2 years ago because of my interest in using this environment for semantic web applications. It seems to be of interest to others too.

https://opencirrus.org/content/sparql-query-over-hadoop-very-large-rdf-dataset
http://portal.acm.org/citation.cfm?id=1779599.1779605
http://cs264.org/projects/web/Rohloff_Kurt/rohloff/index.html

Sunday, April 25, 2010

Paper Summary - A Framework for Combining Ontology and Schema Matchers with Dempster-Shafer

"Paper Summary - A Framework for Combining Ontology and Schema Matchers with Dempster-Shafer", P. Besana


This is a short paper about using Dempster-Shafer for ontology mapping. They have a tool based on their work: PyontoMap. This paper is really relevant, so the full summary will be deferred.

Paper Summary - BeliefOWL: An Evidential Representation in OWL Ontology

Amira Essaid and Boutheina Ben Yaghlane, BeliefOWL: An Evidential Representation in OWL Ontology, pages 77-80, International Semantic Web Conference, International Workshop on Uncertainty Reasoning for the Semantic Web, Washington DC, USA, 2009.


This paper is very short, only 4 pages. It starts with a discussion of how uncertainty is currently represented in ontologies using either probabilistic or fuzzy approaches. It then proposes the Dempster-Shafer approach as another option. They discuss the work by:
Ben Yaghlane, B.: Uncertainty representation and reasoning in directed evidential networks, PhD thesis, Institut Supérieur de Gestion de Tunis, Tunisia, 2002.

which is work related to representing uncertainty using a DAG. I haven't read this paper yet but it certainly seems like a good read.

This paper then goes on with a presentation of BeliefOWL, their uncertainty extension to OWL.


They define two classes to represent prior evidence:


- enumerates the different masses and has an object property which specifies the relation between itself and .


- expresses prior evidence and has property



They define two classes which represent conditional evidence:


- has an object property of


- conditional evidence with property



They construct an evidential network by translating the OWL ontology into a DAG. They then assign masses to nodes in the DAG. Details of this work are mentioned, but only briefly.

Overall I don't know if this helps my work in any way except to see some uses of DS with ontologies.

Paper Summary - Uncertainty in Ontologies: Dempster-Shafer Theory for Data Fusion Applications

"Uncertainty in Ontologies: Dempster-Shafer Theory for Data Fusion Applications", A.
Bellenger1 and S. Gatepaille, Defence and Security Information Processing, Control and Cognition department, France

This paper is relevant to my work because they use DS to represent uncertainty in ontologies. They do this by creating an upper ontology that contains the calculated DS measures, i.e. mass, belief, plausibility, etc.


The paper starts with a background in data fusion and gives some examples of how uncertainty is captured in ontologies and why it is important to represent uncertainty. They define uncertainty as "incomplete knowledge, including incompleteness, vagueness, ambiguity, and others". In addition to the natural occurrence of uncertainty in data, it is also a product of fusing data which may be acquired from different sources.

This is an interesting statement: "If the user/application is not able to decide in favor of a single alternative (due to insufficient trust in the respective information sources), the aggregated statement resulting from the fusion of multiple statements is typically uncertain. The result needs to reflect and weight the different information inputs appropriately, which typically leads to uncertainty."

This is common in military applications and in general knowledge bases, because one attempts to acquire supporting data for entities in the knowledge base from various sources that can be unreliable.

They briefly discuss the shortcomings of current traditional methods for handling uncertainty in ontologies. They state that since ontologies are designed to contain only concepts and relations that describe asserted facts about the world, they are not designed to handle uncertainty. The facts asserted are assumed to be 'true'. Therefore even information that is not certain to be 'true' is stored as if it were, which leads to errors or inaccurate information. There is currently no standard way to handle uncertainty (can read more on this).

They discuss how probability is used as a way to represent uncertainty in ontologies. They discuss some existing work in this area, including BayesOWL. The problem with this approach in particular is that it does not account for OWL properties, instances of the ontologies, or the data types. There are also extensions to DL (Pronto is one of them); however, performance is a problem. Fuzzy approaches also exist.

They then discuss using DS. DS is presented as a generalized probability theory, although books on this topic are not entirely in agreement with that characterization. Masses are calculated and the sum of these masses makes up the beliefs. It is also noted that DS supports combining evidence from different sources, which makes it especially useful for fusing data from different sources. They note work that actually uses DS to handle the inconsistencies produced by mapping ontologies. The paper also highlights a relevant work that translates an OWL taxonomy into a directed evidential network.

The rest of the paper discusses their approach and how they use DS for modeling and reasoning. The point they make about uncertainty, and why probabilistic methods can't represent it accurately, is that probabilistic methods do not represent the absence of information very well. One needs to specify prior and conditional probabilities, and they argue this introduces error because of the symmetric prior probability assignment (0.5) that must be made when information is not available. With DS, missing information is not applied unless obtained indirectly. It allows one to specify a degree of ignorance (some define this as an upper and lower bound). They find this property appealing.


Probabilistic approaches use singletons only, whereas DS allows one to use composites in addition to singletons. This is powerful. With probability theory there is a forced relationship between an event and its negation; DS does not imply such a relationship, it only models the beliefs associated with a class.
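To make the degree-of-ignorance point concrete, here is a small worked example of my own (the numbers are not from the paper). A source can assign mass to the composite $\{A, B\}$ instead of being forced to split it between $A$ and $\neg A$:

$m(\{A\}) = 0.6, \quad m(\{A, B\}) = 0.4$

$\mathrm{Bel}(\{A\}) = 0.6, \quad \mathrm{Pl}(\{A\}) = 0.6 + 0.4 = 1.0$

So belief in $A$ is bounded by the interval $[0.6, 1.0]$, and the 0.4 on the composite explicitly represents ignorance rather than being pushed onto $A$ or its negation.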

They mention an additional point that I find makes this approach appealing. DS provides a way to combine evidence from different sources. This makes it especially useful for fusion.

They state, "the evidence theory is much more flexible than the probability theory". This is a strong statement and I'm not sure if it is completely true based on other papers that show how both Bayesian and DS can produce similar results.

The paper ends with their approach. They discuss their proposed model, which is an upper ontology representing the uncertainty. A DS_Concept, which is a subclass of owl:Thing, has a DS_Mass, DS_Belief, DS_Plausibility, and a DS_Source. The Uncertain_Concept represents a concept that is part of the set. There is an object property is_either which has a range of owl:Thing so that all instances can be used.
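To see how such an upper ontology could be asserted programmatically, here is a minimal Jena sketch built only from the class and property names mentioned above. Treating DS_Mass, DS_Belief, DS_Plausibility, and DS_Source as datatype properties is my assumption (the paper may model them differently), and the namespace URI is made up.

import com.hp.hpl.jena.ontology.DatatypeProperty;
import com.hp.hpl.jena.ontology.ObjectProperty;
import com.hp.hpl.jena.ontology.OntClass;
import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.vocabulary.OWL;

public class DSUpperOntologySketch {
    public static void main(String[] args) {
        String ns = "http://example.org/ds-fusion#"; // made-up namespace
        OntModel model = ModelFactory.createOntologyModel();

        // DS_Concept is a subclass of owl:Thing and carries the DS measures
        OntClass dsConcept = model.createClass(ns + "DS_Concept");
        dsConcept.addSuperClass(OWL.Thing);

        DatatypeProperty mass = model.createDatatypeProperty(ns + "DS_Mass");
        DatatypeProperty belief = model.createDatatypeProperty(ns + "DS_Belief");
        DatatypeProperty plausibility = model.createDatatypeProperty(ns + "DS_Plausibility");
        DatatypeProperty source = model.createDatatypeProperty(ns + "DS_Source");
        mass.addDomain(dsConcept);
        belief.addDomain(dsConcept);
        plausibility.addDomain(dsConcept);
        source.addDomain(dsConcept);

        // Uncertain_Concept represents a concept that is part of the uncertain set,
        // linked via is_either, whose range is owl:Thing so any instance can be used
        OntClass uncertain = model.createClass(ns + "Uncertain_Concept");
        ObjectProperty isEither = model.createObjectProperty(ns + "is_either");
        isEither.addDomain(uncertain);
        isEither.addRange(OWL.Thing);

        model.write(System.out, "RDF/XML-ABBREV");
    }
}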

This paper isn't cited by anyone else but I think there are good ideas proposed here and I am using this paper in my 601 work.

Saturday, April 24, 2010

Papers To Read

http://www.dsto.defence.gov.au/publications/2563/DSTO-TR-1436.pdf
http://www.isif.org/fusion/proceedings/fusion03CD/special/s31.pdf
http://uima.apache.org/downloads/releaseDocs/2.3.0-incubating/docs/pdf/tutorials_and_users_guides.pdf
http://www.autonlab.org/tutorials/bayesnet09.pdf
http://spiedl.aip.org/getabs/servlet/GetabsServlet?prog=normal&id=PSISDG004051000001000255000001&idtype=cvips&gifs=yes&ref=no
http://www2.research.att.com/~lunadong/publication/fusion_vldbTutorial.pdf
http://www.aaai.org/aitopics/pmwiki/pmwiki.php/AITopics/Uncertainty
http://www.cs.cmu.edu/afs/cs/academic/class/15381-s07/www/slides/032207probAndUncertainty.pdf
cox's theorem
http://www.britannica.com/bps/additionalcontent/18/35136768/Data-Fusion-for-Traffic-Incident-Detection-Using-DS-Evidence-Theory-with-Probabilistic-SVMs
http://data.semanticweb.org/workshop/ursw/2009/paper/main/5/html
http://portal.acm.org/citation.cfm?id=1698790.1698821
http://volgenau.gmu.edu/~klaskey/papers/LaskeyCostaJanssen_POFusion.pdf
http://www.slideshare.net/rommelnc/ursw-2009-probabilistic-ontology-and-knowledge-fusion-for-procurement-fraud-detection-in-brazil
http://www.eurecom.fr/~troncy/Publications/Troncy_Straccia-eswc06.pdf
http://www.glennshafer.com/assets/downloads/articles/article48.pdf
http://www.fusion2008.org/tutorials/tutorial05.pdf
http://www.google.com/url?sa=t&source=web&ct=res&cd=5&ved=0CCcQFjAE&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.62.9835%26rep%3Drep1%26type%3Dpdf&rct=j&q=dempster+shafer+tutorial&ei=UprNS4m9K4P88Ab5v-GVAQ&usg=AFQjCNFUUt_xSeT2QOkrqYvsLySWOllqCw&sig2=RLqpJoODSs1kgLFK869Ikw
http://www.cs.cf.ac.uk/Dave/AI2/node87.html
http://sinbad2.ujaen.es/sinbad2/files/publicaciones/186.pdf
http://www.ensieta.fr/belief2010/papers/p133.pdf
http://www.gimac.uma.es/ipmu08/proceedings/papers/057-MerigoCasanovas.pdf
http://www.sas.upenn.edu/~baron/journal/jdm7803.pdf
http://classifier4j.sourceforge.net/usage.html
http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-527/paper1.pdf

Research Paper Summary - A General Data Fusion Architecture

H Carvalho, W Heinzelman, A Murphy, and C Coelho. A
general data fusion architecture. In Int. Conf. on Info. Fusion,
pages 1465–1472, 2003.


This is a short paper that describes an architecture for data fusion. What they are proposing is a taxonomy that defines 3 types of fusion: data oriented, variable oriented, and a mixture of the two. They are making a clear distinction between data as a measurement of the environment and variable as determined by feature extraction.

They describe examples of sensor data and state that the data needs to be pre-processed before being fused. The pre-processing can involve conversion of a signal, filtering, or handling noise. After pre-processing the data can be fused, and they propose a 3-level data fusion framework. They begin by classifying the data as defined by the taxonomy. Basically, when the fusion occurs defines what type of fusion we are dealing with (data, variable, or mixture).

They go into a few examples of using this architecture. In general, the paper is not detailed enough to understand if the approach is viable. It is high level and short. It does provide additional information about the formalities of data fusion which is useful.

Paper Summary - A New Technique for Combining Multiple Classifiers using The Dempster-Shafer Theory of Evidence

Al-Ani, A. & Deriche, M. (2002). A new technique for combining multiple classifiers using the Dempster-Shafer theory of evidence. Journal of Artificial Intelligence Research, 17, 333–361.

This paper describes a new technique based on Dempster-Shafer to combine classifiers. The basic premise is that different types of features may be used depending upon the application. With different features, the same classifier may not always be best; based on the features, a different classifier may outperform the others. They propose that combining classifiers is an efficient way to achieve the best classification results.

There are two problems defined by others: how to determine which classifiers to use, and how to combine the classifier results to get the best outcome. They address the second question in this paper.


They categorize the output of classification algorithms into 3 levels:

  • the abstract level - outputs a unique label

  • the rank level - ranks all labels, with the label at the top being the first choice

  • the measurement level - attributes to each class a value reflecting the degree of confidence that the input belongs to that class



They state the measurement level contains the 'highest amount of information' and they use this level for their work.

Two combination scenarios mentioned:

  • all use the same representation of input pattern

  • each uses its own representation



Relating to the second scenario, they found from another study that using a joint probability distribution with the sum rule gave the best results. They also cite a study that used weighted sums, and another that used a cost function to minimize MSE in conjunction with a neural network. In that study, a number of NNs were used to produce a linear combination, and Dempster-Shafer theory was used to combine the NN results. They give a few other approaches, and then the rest of the paper discusses their approach.

They combine classifier results using a number of different feature sets. Each feature set is used to train a classifier. For some input x, each classifier will produce a vector that conveys the degree of confidence that the classifier has for each class given the input.

They then discuss DS. DS is said to represent uncertainties better than probabilistic techniques such as Bayesian. For classifier combination, they stress this is important since there usually exists "a certain level of uncertainty associated with the performance of each classifier". They state that other classifier combination methods that use DS theory do not accurately estimate the evidence of the classifiers, and believe that their approach, which uses gradient descent learning, minimizes the MSE between the combined output and the target output of the training set.

They then go into detail about the math behind DS and about their approach.

Note the DS belief and plausibility formulas (from Wikipedia):

Belief: $\mathrm{Bel}(A) = \sum_{B \subseteq A} m(B)$

Plausibility: $\mathrm{Pl}(A) = \sum_{B \cap A \neq \emptyset} m(B)$

Note the DS rule of combination:

$m_{1,2}(A) = \frac{1}{1 - K} \sum_{B \cap C = A \neq \emptyset} m_1(B)\, m_2(C)$

where:

$K = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C)$
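To keep the rule of combination straight, here is a minimal Java sketch of Dempster's rule over a toy frame {A, B}. The class name, method names, and the example masses are mine, not from the paper.

import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class DempsterRuleSketch {

    // Combine two mass functions m1 and m2 with Dempster's rule of combination.
    // Each focal element is a set of hypotheses, mapped to its mass.
    static Map<Set<String>, Double> combine(Map<Set<String>, Double> m1,
                                            Map<Set<String>, Double> m2) {
        Map<Set<String>, Double> combined = new HashMap<Set<String>, Double>();
        double conflict = 0.0; // K: total mass falling on empty intersections
        for (Map.Entry<Set<String>, Double> e1 : m1.entrySet()) {
            for (Map.Entry<Set<String>, Double> e2 : m2.entrySet()) {
                Set<String> inter = new HashSet<String>(e1.getKey());
                inter.retainAll(e2.getKey());
                double product = e1.getValue() * e2.getValue();
                if (inter.isEmpty()) {
                    conflict += product;
                } else {
                    Double prev = combined.get(inter);
                    combined.put(inter, (prev == null ? 0.0 : prev) + product);
                }
            }
        }
        // Normalize by 1 - K so the combined masses sum to 1 again.
        for (Map.Entry<Set<String>, Double> e : combined.entrySet()) {
            e.setValue(e.getValue() / (1.0 - conflict));
        }
        return combined;
    }

    public static void main(String[] args) {
        // Source 1: mostly believes A, keeps some ignorance on {A, B}
        Map<Set<String>, Double> m1 = new HashMap<Set<String>, Double>();
        m1.put(new HashSet<String>(Arrays.asList("A")), 0.6);
        m1.put(new HashSet<String>(Arrays.asList("A", "B")), 0.4);

        // Source 2: weakly believes B
        Map<Set<String>, Double> m2 = new HashMap<Set<String>, Double>();
        m2.put(new HashSet<String>(Arrays.asList("B")), 0.3);
        m2.put(new HashSet<String>(Arrays.asList("A", "B")), 0.7);

        // Prints roughly {[A]=0.51, [B]=0.15, [A, B]=0.34}
        System.out.println(combine(m1, m2));
    }
}

The conflict term is exactly the K in the rule of combination above; the toy run fuses a source that mostly believes A with one that weakly believes B.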
I need to return to this to describe their method. It is detailed and involves a lot of math.

Why am I reviewing this paper? Well, it is a little off topic, but I thought any exposure to methods that use DS would help me understand it better.

Paper Summary - An Introduction to Multisensor Data Fusion

D. L. Hall and J. Llinas, editors. Handbook of Multisensor
Data Fusion. CRC Press, 2001.

This paper is based on the book and gives a general background in multisensor data fusion. It gives basic definitions, a bit of history, and highlights the types of applications that use multisensor data fusion techniques. This is most prevalent in military applications, but commercial applications are also making use of fusing data from multiple sources. There are advantages to using a 'multi-sensor' approach: accuracy improves, estimates are better, and in general there is a statistical advantage.

It goes on to provide some basic definitions and discusses examples of sensors (more related to military domain). What is interesting is the following:

"The most fundamental characterization of data fusion involves a hierarchical transformation between observed energy or parameters (provided by multiple sources as input) and a decision or inference (produced by fusion estimation and/or inference processes) regarding the location, characteristics, and identity of an entity, and an interpretation of the observed entity in the context of a
surrounding environment and relationships to other entities....
The transformation between observed energy or parameters and a decision or
inference proceeds from an observed signal to progressively more abstract concepts."

They go on to discuss methods used to make identity estimations, including Dempster-Shafer and Bayesian.

"Observational data may be combined, or fused, at a variety of levels from the raw data (or observation) level to a state vector level, or at the decision level."

It then talks in detail about examples in military and non-military applications, and then about the Joint Data Fusion Process Model, which was established in 1986.

The rest of the paper goes into detail about the architecture.

Why is this important to my work?

There are aspects of true multi-sensor data fusion that can be adapted and used in fusing semantic web data. Very similar issues are involved. We get data about entities from different sources. This data can be complementary: certain sources can offer facts that other sources are not aware of, and fusing this information together presents a more comprehensive picture of an entity. This is applicable when smushing FOAF instances (part of earlier work) and when simply merging facts retrieved from different sources. One example in particular is news sources. Different facts can be exposed by different news sources; when you bring these facts together you get a more complete story.

This brings us to another issue with data fusion, and that is conflict resolution. When we combine sources, sometimes the pieces of information conflict with each other. This is an interesting problem.

This paper is a great way to get a good background in multi-sensor data fusion, and one can take its definitions, techniques, and architectures and apply them to fusing Semantic Web data.

Tuesday, April 20, 2010

Peers

"Travel only with thy equals or thy betters; if there are none, travel alone." --The Dhammapada

Sunday, April 18, 2010

Commitment

"Until one is committed, there is hesitancy, the chance to draw back, always ineffectiveness. Concerning all acts of initiative and creation, there is one elementary truth, the ignorance of which kills countless ideas and splendid plans: that the moment one definitely commits oneself, then providence moves too. All sorts of things occur to help one that would never otherwise have occurred. A whole stream of events issues from the decision, raising in one’s favor all manner of unforeseen incidents and meetings and material assistance, which no man could have dreamed would have come his way. Whatever you do, or dream you can, begin it. Boldness has genius, power and magic in it. Begin it now." --Johann Wolfgang von Goethe (1749- 1832)

Saturday, April 17, 2010

Paper Summary - An Introduction to Bayesian and Dempster-Shafer Data Fusion

D. Koks and S. Challa, DSTO Systems Sciences Laboratory, November 2005
http://www.dsto.defence.gov.au/publications/2563/DSTO-TR-1436.pdf

This paper is about data fusion and using techniques such as Bayesian and DS. It starts out with an introduction about data fusion and how it is defined in multiple domains. It then highlights work by others that implemented different methods to perform data fusion. It then gives nice detailed examples of using Bayesian and Dempster-Shafer to perform data fusion. It ends with a comparison summary of these two techniques.

The paper is very good. It shows all of the equations step by step and gives clear examples. It very clearly shows the shortcomings of both methods.

Notes on XMPP Prototype

I've recently been investigating XMPP for cloud computing. After reading the specifications and learning basically what it is all about, I've begun working on a prototype that would accomplish tasks similar to those I accomplish with mesh4x. Basically the goal is offline/online synchronization. So far, I've played around with Jabber, tigase, and a couple of other servers. I basically need server and client support. So far, I like OpenFire and Smack. Right now I have code written in Java that basically creates a chat between two users. I created two accounts at jabber.iitsp.com and wrote some basic code. Now, my two clients are communicating by chatting with one another.

My next steps:
  1. Use OpenFire locally as the server
  2. Write code that will do what I do with mesh4x
  3. Hook it into Google Maps



Basic code:

// Smack (XMPP client library) imports
import org.jivesoftware.smack.Chat;
import org.jivesoftware.smack.MessageListener;
import org.jivesoftware.smack.XMPPConnection;
import org.jivesoftware.smack.packet.Message;

// Connect and log in to the XMPP server
XMPPConnection connection = new XMPPConnection("jabber.iitsp.com");
connection.connect();
connection.login(userName, password);

// Create a chat with the other user and listen for incoming messages
Chat chat = connection.getChatManager().createChat(person, new MessageListener() {
    public void processMessage(Chat chat, Message message) {
        System.out.println("Received message: " + message.getBody());
    }
});

// Send a message to the other user
chat.sendMessage("Hi there!");

I do a bit more in my code to keep the thread running and to recognize who the user is and respond with a customized message, but this is the code that actually sends and receives the message. This is taken from the Smack documentation as a basic test. If you want to just test it one way, create an account in Trillian using XMPP as the protocol with one of your user accounts: user1@jabber.iitsp.com. Then set up your Java code as user2 who wants to send to user1. Your Trillian client will pop up your message.

Monday, April 5, 2010

Paper Summary - Ontology matching: A machine learning approach

A. Doan, J. Madhavan, P. Domingos, and A. Halevy, "Ontology matching: A machine learning approach", Handbook on Ontologies in Information Systems, 2004, 397-416, Springer-Verlag


This paper is about finding mappings between ontologies and discusses the GLUE system. Using learning techniques, it semi-automatically generates mappings. This work attempts to address the issue of matching concept nodes.

It begins by discussing the meaning of similarity and the use of the joint probability distribution of concepts.
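For my own notes, one similarity measure that can be defined from the JPD (I believe the paper uses the Jaccard coefficient as an example, though other measures fit the same framework) is:

$\mathrm{sim}(A, B) = \frac{P(A \cap B)}{P(A \cup B)} = \frac{P(A, B)}{P(A, B) + P(A, \bar{B}) + P(\bar{A}, B)}$

where $P(A, B)$, $P(A, \bar{B})$, $P(\bar{A}, B)$, and $P(\bar{A}, \bar{B})$ are exactly the joint probabilities the paper is talking about estimating.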

It then discusses the complexities of computing the JPD for two different concepts.

It then discusses using machine learning to use instances of one concept to learn a classifier for that concept and the same for the second concept.

Rather than a single algorithm, it uses multiple algorithms and then combines their predictions.

This paper is long. Their techniques are novel and interesting. This is a good paper to use for machine learning techniques for instance matching.

Saturday, April 3, 2010

Paper Summary - Probabilistic relational models

D. Koller. Probabilistic relational models. In S. Dzeroski and P. Flach, editors, Proceedings of the Ninth International Workshop on Inductive Logic Programming (ILP-1999). Springer, 1999.


This paper highlights deficiencies that exist with Bayesian networks; mainly that BNs cannot represent large, complex domains because they require the domain's variables to be known in advance. They present probabilistic relational models as a language for describing probabilistic models. Entities, their properties, and relations are represented with the language.

The key points in this paper are:


  • BNs are very useful and have been successful as a way to perform probabilistic reasoning
  • BNs are inadequate for representing large complex domains

  • BNs lack the concept of an object and therefore there is no concept of similarity among objects across contexts

  • Probabilistic relational models extend BNs by adding concepts of individuals, properties and relations




Objects are the 'basic entities' in a PRM and are partitioned into disjoint classes, each with a set of associated attributes. In addition to objects, relations also make up the vocabulary. It states that the goal of the PRM is to define a probability distribution over a set of instances for a schema.

The key distinctions between PRMs and Bayesian networks are that PRMs define the dependency model at the class level and use the relational structure of the model. They are more expressive than Bayesian networks.

Regarding inference, they show how this expressiveness helps rather than further complicates the processing.

The paper is dense but interesting.

Tuesday, March 30, 2010

Paper Summary - Data Mining: An Overview from a Database Perspective

M. Chen and J. Han and P. Yu, "Data Mining: An Overview from a Database Perspective", IEEE Transactions on Knowledge and Data Engineering, 8(6): 866-883, 1996

This is a seminal paper about mining information from large databases. It is a survey of data mining techniques from a database researcher perspective.

The paper discusses key features and challenges:


  • Different types of data
  • Efficiency and Scalability of algorithms
  • Accuracy and usefulness of results
  • How results are conveyed
  • Multiple Abstraction Levels
  • Mining different sources
  • Privacy and security



They go on to classify different types of data mining schemes. They can be classified according to the data they are examining, according to the kind of knowledge they are mining, and according to the technique they employ. This paper focuses on the kind of knowledge being mined:


  • Association rules
  • Data Generalization and Summarization
  • Classification of huge amounts of data
  • Data Clustering
  • Pattern based similarity
  • Path traversal patterns


It describes each one of these items in great detail. The paper is a great way to get a good foundation on this topic. It is quite long but detailed. I don't see mention of a confidence approach.

Paper Summary - Link-based text classification

Q. Lu and L. Getoor, "Link-based text classification", IJCAI workshop on text-mining and link-analysis, 2003

This paper examines machine learning when objects are linked and using the links as additional information for the classifier. They present a framework which models link distributions using a logistic regression model for both content and links. They found that using links actually improved the accuracy of the classifier.

They use an iterative classification algorithm since the attributes can be correlated. There is a joint distribution between links and content attributes.

The main points in the paper are:

* The statistical framework models link distributions
* They show through results how this improved accuracy of the classifier
* They show an evaluation of the iterative categorization algorithm

Paper Summary - Unsupervised named-entity extraction from the web: An experimental study

"Unsupervised named-entity extraction from the web: An experimental study", O. Etzioni and M. Cafarella and D. Downey and S. Kok and A. Popescu and T. Shaked and S. Soderland and D. Weld and A. Yates, Artificial Intelligence,91-134,2005

This paper describes a system that uses an unsupervised approach to entity extraction. It describes the architecture of the system and defines general principles for extraction. They present 3 ways to improve recall and extraction without compromising precision:
Pattern Learning - domain-specific
Subclass Extraction - identifies sub-classes
List Extraction - locates lists of class instances

The paper starts with a motivation for using an unsupervised approach. It provides background in information extraction and some of the complexities involved. KNOWITALL uses extraction patterns and pointwise mutual information statistics, calculated from the web using hit counts, that measure the degree of correlation between pairs of words. The PMIs are used as features for a classifier. It consumes information from search engines.
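If I recall the PMI formulation correctly (worth double-checking against the paper), the score for an extracted instance $I$ and a discriminator phrase $D$ is computed from search-engine hit counts roughly as:

$\mathrm{PMI}(I, D) = \frac{|\mathrm{Hits}(D + I)|}{|\mathrm{Hits}(I)|}$

i.e. the fraction of pages mentioning the instance that also match the discriminator phrase; these scores are then used as features for the classifier.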

It uses a bootstrapping method that starts from a set of predicates. Labels and symbolic names are given for each class. The Bootstrapper "uses the labels to instantiate extraction rules". The Extractor formulates queries to send to the search engine, and the Assessor evaluates the extractions.

The paper is quite extensive and describes the system well. Overall this paper is a good read and provides a number of tidbits for new ideas.

Thursday, March 18, 2010

Paper Summary - Performing Object Consolidation on the Semantic Web Data Graph

Hogan, A.; Harth, A.; and Decker, S. 2007. Performing object
consolidation on the semantic web data graph. In In Proceedings
of I3: Identity, Identifiers, Identification. Workshop at 16th
International World Wide Web Conference (WWW2007).

This paper describes identities and the integration of data. They present a method for merging instances across multiple data sources (**large scale**). They describe how they determine two instances represent the same entity using inverse functional properties. Their dataset includes over 72 million instances (wow).

Key points:

  • There isn't much agreement on the use of common URIs to identify entities (optional in RDF), so the same entity is often represented by multiple instances
  • There is a lack of formal specification for determining equivalences among entities
  • Existing methods that perform object consolidation rely upon probabilistic methods
  • Inverse functional properties are widely used in Semantic Web data (see the example below)
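A quick illustration of the inverse functional property idea (my own example, using foaf:mbox, which the FOAF vocabulary declares inverse functional):

$\mathtt{mbox}(x, m) \wedge \mathtt{mbox}(y, m) \Rightarrow x = y$

So two instances that share the same mailbox value are consolidated into a single entity.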





Paper Summary - On Searching and Displaying RDF Data from the Web

A., H., and H., G. 2005. On searching and displaying RDF data
from the web. In Proceedings of Demos and Posters of the 2nd
European Semantic Web Conference.

This document describes an application built to show RDF in a user interface. It describes how it gathered the data and also a smushing technique it used. The goal was to determine if it was feasible to integrate data without an underlying ontology. They describe their challenges with the data in this paper.

The key points of this paper:


  1. To determine if it is possible to integrate data without using an underlying ontology to do so
  2. They describe their data retrieval process
  3. They describe how they smush data
  4. They describe their application and how they make this data available for a user interface


The paper is short and missing a lot of details. Though I referenced this in the "A Machine Learning Approach to Linking FOAF Instances" document, it really was just to give an example of a smushing process.

Wednesday, March 3, 2010

Paper Summary - Semantics for the Semantic Web: The Implicit, the Formal and the Powerful

A. Sheth, C. Ramakrishnan, and C. Thomas. Semantics for the semantic web: The implicit, the
formal, and the powerful. Semantic Web and Information Systems, 1(1), Idea Group, 2005.

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.83.9929&rep=rep1&type=pdf

This paper discusses both the limitations of relying upon just description logics and the need to utilize different kinds of semantics to handle the complexity of exploiting semantic web data. In particular it organizes these semantics into three categories:
  • Implicit - drawn from patterns in the data; examples include co-occurrence and links
  • Formal - expressed in a formal language with syntactic rules; Description Logic falls under this category
  • Powerful - statistical analysis that uses patterns in the data

What is interesting about this paper are the following statements:

"Even though it is desirable to have a consistent knowledge base, it becomes impractical as the size of the knowledge base increases or as knowledge from many sources is added. It is rare that human experts in most scientific domains have a full and complete agreement. In these cases it becomes more desirable that the system can deal with inconsistencies."

"Sometimes it is useful to look at a knowledge base as a map. This map can be partitioned according to different criteria, e.g. the source of the facts or their domain. While on such a map the knowledge is usually locally consistent, it is almost impossible and practically infeasible to maintain a global consistency. Experience in developing the Cyc ontology demonstrated this challenge. Hence, a system must be able to identify sources of inconsistency and deal with contradicting statements in such a way that it can still produce derivations that are reliable."

They then go on to discuss current approaches to deal with this inconsistency.

  • Probabilistic reasoning
  • Possibilistic reasoning
  • Fuzzy reasoning

It highlights drawbacks with these methods and proposes the need for a standardization in this area.

The paper then discusses correlating semantic capabilities with types of semantics in relation to the bootstrapping and utilization phases.

The last part of this paper discusses information integration, information retrieval and extraction, data mining and analytical applications.

Some of the interesting papers it references:

R. Agrawal, T. Imielinski, and A. N. Swami. Mining association rules between sets of items in large databases. In P. Buneman and S. Jajodia, editors, Proceedings of the 1993.

D. Barbará, H. Garcia-Molina and D. Porter. The Management of Probabilistic Data. IEEE Transactions on Knowledge and Data Engineering, Volume 4 , Issue 5 (October 1992), Pages: 487 - 502

Jochen Heinsohn: Probabilistic Description Logics. UAI 1994: 311-318.

Hui Han, C. Lee Giles, Hongyuan Zha, Cheng Li, and Kostas Tsioutsiouliklis. "Two Supervised Learning Approaches for Name Disambiguation in Author Citations" , in Proceedings of ACM/IEEE Joint Conference on Digital Libraries (JCDL 2004), pages 296-305, 2004.

Birger Hjorland Information retrieval, text composition, and semantics. Knowledge Organization 25(1/2):16-31, 1998

Vipul Kashyap, Amit Sheth: Semantic Heterogeneity in Global Information Systems: The Role of Metadata, Context and Ontologies, Cooperative Information Systems 1996

M. Kuramochi and G. Karypis. Finding frequent patterns in a large sparse graph. In SIAM International Conference on Data Mining (SDM-04), 2004.

Alexander Maedche, Steffen Staab: Ontology Learning for the Semantic Web. IEEE Intelligent Systems 16(2): 72-79 (2001)

B. Omelayenko. Learning of Ontologies for the Web: the Analysis of Existent approaches. In Proceedings of the International Workshop on Web Dynamics, 2001.

Erhard Rahm, Philip A. Bernstein. A Survey of Approaches to Automatic Schema Matching. In VLDB Journal 10: 4, 2001

Amit P. Sheth, Sanjeev Thacker, Shuchi Patel: Complex relationships and knowledge discovery support in the InfoQuilt system. VLDB J. 12(1): 2-27 (2003)

J. Townley, The Streaming Search Engine That Reads Your Mind, August 10, 2000. http://smw.internet.com/gen/reviews/searchassociation/

William A. Woods: “Meaning and Links: A Semantic Odyssey”. Principles of Knowledge Representation and Reasoning: Proceedings of the Ninth International Conference (KR2004), June 2-5, 2004. pp. 740-742

Lotfi A. Zadeh. Toward a perception-based theory of probabilistic reasoning with imprecise probabilities. In Journal of Statistical Planning and Inference 105 (2002) 233-264.