Last update: 2010-07-16



An object on the Semantic Web is likely to be denoted with several URIs by different parties. Object coreference is the process of identifying "equivalent" URIs that denote the same object, a prerequisite for a better Data Web. In this paper, we propose a new approach to bootstrapping object coreference on the Semantic Web. For a given URI, our approach first establishes a kernel of semantically equivalent URIs using same-as relations, (inverse) functional properties and (max-)cardinalities, and then extends the kernel with respect to the textual descriptions (e.g. labels, local names) of URIs. In addition, a trustworthiness measurement is employed to rank the coreferent URIs in the kernel, and a similarity-based measure is used to rank the URIs in the extension of the kernel. We implement the proposed approach on a large-scale dataset that involves 76 million URIs from the Falcons search engine. Evaluation on precision, relative recall and response time demonstrates the feasibility of our approach. Furthermore, we apply the approach to investigate how widespread the URI alias phenomenon is on the Semantic Web. Preliminary results show that the URI alias phenomenon is prevalent, but that we can detect only a small part of it so far.
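
To make the two stages concrete, the following is a minimal sketch in Python and not the implementation evaluated in the paper. It assumes triples are given as (subject, property, object) tuples of strings and that every object of a functional property is a URI. The function names `build_kernels` and `extend`, the example URIs, and the threshold of 0.8 are all hypothetical. The string similarity from `difflib` is only a stand-in for the similarity measure, and (max-)cardinalities and the trustworthiness ranking are omitted.

```python
from collections import defaultdict
from difflib import SequenceMatcher

SAME_AS = "owl:sameAs"


def build_kernels(triples, ifps=(), fps=()):
    """Map each URI to its kernel: the set of URIs inferred to be equivalent.

    Equivalences come from owl:sameAs, inverse functional properties (ifps)
    and functional properties (fps), merged with a union-find structure and
    repeated until no new equivalence is found.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    uris = set()
    changed = True
    while changed:
        changed = False
        by_value, by_subject = {}, {}
        for s, p, o in triples:
            uris.add(s)
            if p == SAME_AS:
                uris.add(o)
                pair = (s, o)
            elif p in ifps:
                # Subjects sharing a value of an IFP denote the same object.
                pair = (by_value.setdefault((p, find(o)), s), s)
            elif p in fps:
                # Values of an FP on equivalent subjects are equivalent.
                # Assumption: these values are URIs, not literals.
                uris.add(o)
                pair = (by_subject.setdefault((p, find(s)), o), o)
            else:
                continue
            if find(pair[0]) != find(pair[1]):
                union(*pair)
                changed = True

    classes = defaultdict(set)
    for uri in uris:
        classes[find(uri)].add(uri)
    return {uri: classes[find(uri)] for uri in uris}


def extend(uri, kernels, labels, threshold=0.8):
    """Rank URIs outside the kernel by label similarity to the kernel."""
    kernel = kernels.get(uri, {uri})
    targets = [labels[u].lower() for u in kernel if u in labels]
    scored = []
    for cand, label in labels.items():
        if cand in kernel:
            continue
        score = max(
            (SequenceMatcher(None, label.lower(), t).ratio() for t in targets),
            default=0.0,
        )
        if score >= threshold:
            scored.append((score, cand))
    # Highest similarity first; ties broken by URI for a stable order.
    return [c for _, c in sorted(scored, key=lambda sc: (-sc[0], sc[1]))]
```

For example, given the triples `("ex:a", "owl:sameAs", "ex:b")`, `("ex:b", "foaf:mbox", "mailto:x@y")` and `("ex:c", "foaf:mbox", "mailto:x@y")` with `foaf:mbox` declared inverse functional, the kernel of `ex:a` contains `ex:a`, `ex:b` and `ex:c`. A URI that shares no such statement but carries a near-identical label would appear only in the extension.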

