BipRank: ranking and summarizing RDF vocabulary descriptions
|Authors:||Gong Cheng, Feng Ji, Shengmei Luo, Weiyi Ge, Yuzhong Qu|
|Publication:||In Proc. of the 1st Joint International Semantic Technology Conference (JIST)|
When searching for RDF vocabularies, users often struggle to judge the relevance of a retrieved vocabulary because its description is lengthy. A natural strategy for dealing with this issue is to generate a summary of the vocabulary description that compactly conveys its main theme and reveals its relevance to the user's information need. In this paper, we present a new solution to this vocabulary summarization problem, which our previous work defined as ranking and selecting RDF sentences. First, we propose a novel bipartite graph representation of a vocabulary description and carry out a stochastic analysis of a random surfer's behavior on it, from which we derive a new centrality measure for RDF sentences called BipRank. We then improve the measure by investigating the patterns of RDF sentences and employing their statistical features. Finally, we combine BipRank with query relevance and cohesion metrics into an aggregate objective function that is optimized to select RDF sentences. Our experiments on real-world vocabularies demonstrate that our approach outperforms the baseline and validate its scalability in practice.
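To make the idea of a random surfer on a bipartite graph concrete, here is a minimal sketch. It is not the BipRank formula from the paper, which the abstract does not give. It assumes a generic PageRank-style walk in which the surfer alternates between RDF sentences and the terms they mention, with uniform transition probabilities and a teleport step. The function name `bipartite_rank`, the `damping` and `iterations` parameters, and the toy sentence ids and terms are all hypothetical.

```python
from collections import defaultdict


def bipartite_rank(sentence_terms, damping=0.85, iterations=50):
    """Score sentences by a two-step random walk: sentence -> term -> sentence.

    sentence_terms maps a sentence id to the set of terms it mentions.
    Assumption: uniform transitions and uniform teleport, as in plain PageRank.
    """
    sentences = list(sentence_terms)
    n = len(sentences)

    # Build the reverse side of the bipartite graph: term -> sentences.
    term_sentences = defaultdict(set)
    for s, terms in sentence_terms.items():
        for t in terms:
            term_sentences[t].add(s)

    score = {s: 1.0 / n for s in sentences}
    for _ in range(iterations):
        # Step 1: each sentence spreads its score evenly over its terms.
        term_mass = defaultdict(float)
        dangling = 0.0  # score held by sentences that mention no term
        for s in sentences:
            terms = sentence_terms[s]
            if not terms:
                dangling += score[s]
                continue
            share = score[s] / len(terms)
            for t in terms:
                term_mass[t] += share

        # Step 2: each term spreads its mass evenly over its sentences.
        walked = defaultdict(float)
        for t, mass in term_mass.items():
            share = mass / len(term_sentences[t])
            for s in term_sentences[t]:
                walked[s] += share

        # Teleport step; dangling mass is redistributed uniformly.
        score = {
            s: (1 - damping) / n + damping * (walked[s] + dangling / n)
            for s in sentences
        }
    return score


# Toy vocabulary description: three RDF sentences and the terms they use.
toy = {
    "s1": {"Person", "name", "knows"},
    "s2": {"Person"},
    "s3": {"name", "knows"},
}
scores = bipartite_rank(toy)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy input, the sentence that shares terms with the most other sentences ends up ranked first. A summarizer in the spirit of the abstract would then combine such a centrality score with query relevance and cohesion when selecting sentences, which this sketch does not attempt.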