Relaxation of keyword pattern graphs on RDF data

Ananya Dass, Cem Aksoy, Aggeliki Dimitriou, Dimitrios Theodoratos

Research output: Contribution to journalArticle

2 Citations (Scopus)

Abstract

One of the facets of the data explosion in recent years is the growing of the repositories of RDF Data on the Web. Keyword search is a popular technique for querying repositories of RDF graph data. Recently, a number of approaches leverage a structural summary of the graph data to address the typical keyword search related problems of: (a) identifying relevant results among a multitude of candidates, and (b) performance scalability. These approaches compute queries (pattern graphs) corresponding to alternative interpretations of the keyword query and the user selects one that matches her intention to be evaluated against the data. Though promising, these approaches suffer from a drawback: because summaries are approximate representations of the data, they might return empty answers or miss results which are relevant to the user intent. In this paper, we present a novel approach which combines the use of the structural summary and the user feedback with a relaxation technique for pattern graphs. We leverage pattern graph homomorphisms to define relaxed pattern graphs that are able to extract more results potentially of interest to the user. We introduce an operation on pattern graphs and we prove that it is complete, that is, it can produce all relaxed pattern graphs. To guarantee that the result pattern graphs are as close to the initial pattern graph as possible, we devise different metrics to measure the degree of relaxation of a pattern graph. We design an algorithm that computes relaxed pattern graphs with non-empty answers in relaxation order. To improve the successive computation of relaxed pattern graphs, we suggest subquery caching and multiquery optimization techniques adapted to the context of this computation. Finally, we run experiments on different real datasets which demonstrate the effectiveness of our ranking of relaxed pattern graphs, and the efficiency of our system and optimization techniques in computing relaxed pattern graphs and their answers.

Original languageEnglish (US)
Pages (from-to)363-398
Number of pages36
JournalJournal of Web Engineering
Volume16
Issue number5-6
StatePublished - Sep 1 2017

Fingerprint

Explosions
Scalability
Feedback
Experiments

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems
  • Computer Networks and Communications

Cite this

Dass, Ananya ; Aksoy, Cem ; Dimitriou, Aggeliki ; Theodoratos, Dimitrios. / Relaxation of keyword pattern graphs on RDF data. In: Journal of Web Engineering. 2017 ; Vol. 16, No. 5-6. pp. 363-398.
@article{9d2060a2fb9b4f169038d04f1017ebc2,
title = "Relaxation of keyword pattern graphs on RDF data",
abstract = "One of the facets of the data explosion in recent years is the growing of the repositories of RDF Data on the Web. Keyword search is a popular technique for querying repositories of RDF graph data. Recently, a number of approaches leverage a structural summary of the graph data to address the typical keyword search related problems of: (a) identifying relevant results among a multitude of candidates, and (b) performance scalability. These approaches compute queries (pattern graphs) corresponding to alternative interpretations of the keyword query and the user selects one that matches her intention to be evaluated against the data. Though promising, these approaches suffer from a drawback: because summaries are approximate representations of the data, they might return empty answers or miss results which are relevant to the user intent. In this paper, we present a novel approach which combines the use of the structural summary and the user feedback with a relaxation technique for pattern graphs. We leverage pattern graph homomorphisms to define relaxed pattern graphs that are able to extract more results potentially of interest to the user. We introduce an operation on pattern graphs and we prove that it is complete, that is, it can produce all relaxed pattern graphs. To guarantee that the result pattern graphs are as close to the initial pattern graph as possible, we devise different metrics to measure the degree of relaxation of a pattern graph. We design an algorithm that computes relaxed pattern graphs with non-empty answers in relaxation order. To improve the successive computation of relaxed pattern graphs, we suggest subquery caching and multiquery optimization techniques adapted to the context of this computation. Finally, we run experiments on different real datasets which demonstrate the effectiveness of our ranking of relaxed pattern graphs, and the efficiency of our system and optimization techniques in computing relaxed pattern graphs and their answers.",
author = "Ananya Dass and Cem Aksoy and Aggeliki Dimitriou and Dimitrios Theodoratos",
year = "2017",
month = "9",
day = "1",
language = "English (US)",
volume = "16",
pages = "363--398",
journal = "Journal of Web Engineering",
issn = "1540-9589",
publisher = "Rinton Press Inc.",
number = "5-6",

}

Dass, A, Aksoy, C, Dimitriou, A & Theodoratos, D 2017, 'Relaxation of keyword pattern graphs on RDF data', Journal of Web Engineering, vol. 16, no. 5-6, pp. 363-398.

Relaxation of keyword pattern graphs on RDF data. / Dass, Ananya; Aksoy, Cem; Dimitriou, Aggeliki; Theodoratos, Dimitrios.

In: Journal of Web Engineering, Vol. 16, No. 5-6, 01.09.2017, p. 363-398.

Research output: Contribution to journalArticle

TY - JOUR

T1 - Relaxation of keyword pattern graphs on RDF data

AU - Dass, Ananya

AU - Aksoy, Cem

AU - Dimitriou, Aggeliki

AU - Theodoratos, Dimitrios

PY - 2017/9/1

Y1 - 2017/9/1

N2 - One of the facets of the data explosion in recent years is the growing of the repositories of RDF Data on the Web. Keyword search is a popular technique for querying repositories of RDF graph data. Recently, a number of approaches leverage a structural summary of the graph data to address the typical keyword search related problems of: (a) identifying relevant results among a multitude of candidates, and (b) performance scalability. These approaches compute queries (pattern graphs) corresponding to alternative interpretations of the keyword query and the user selects one that matches her intention to be evaluated against the data. Though promising, these approaches suffer from a drawback: because summaries are approximate representations of the data, they might return empty answers or miss results which are relevant to the user intent. In this paper, we present a novel approach which combines the use of the structural summary and the user feedback with a relaxation technique for pattern graphs. We leverage pattern graph homomorphisms to define relaxed pattern graphs that are able to extract more results potentially of interest to the user. We introduce an operation on pattern graphs and we prove that it is complete, that is, it can produce all relaxed pattern graphs. To guarantee that the result pattern graphs are as close to the initial pattern graph as possible, we devise different metrics to measure the degree of relaxation of a pattern graph. We design an algorithm that computes relaxed pattern graphs with non-empty answers in relaxation order. To improve the successive computation of relaxed pattern graphs, we suggest subquery caching and multiquery optimization techniques adapted to the context of this computation. Finally, we run experiments on different real datasets which demonstrate the effectiveness of our ranking of relaxed pattern graphs, and the efficiency of our system and optimization techniques in computing relaxed pattern graphs and their answers.

AB - One of the facets of the data explosion in recent years is the growing of the repositories of RDF Data on the Web. Keyword search is a popular technique for querying repositories of RDF graph data. Recently, a number of approaches leverage a structural summary of the graph data to address the typical keyword search related problems of: (a) identifying relevant results among a multitude of candidates, and (b) performance scalability. These approaches compute queries (pattern graphs) corresponding to alternative interpretations of the keyword query and the user selects one that matches her intention to be evaluated against the data. Though promising, these approaches suffer from a drawback: because summaries are approximate representations of the data, they might return empty answers or miss results which are relevant to the user intent. In this paper, we present a novel approach which combines the use of the structural summary and the user feedback with a relaxation technique for pattern graphs. We leverage pattern graph homomorphisms to define relaxed pattern graphs that are able to extract more results potentially of interest to the user. We introduce an operation on pattern graphs and we prove that it is complete, that is, it can produce all relaxed pattern graphs. To guarantee that the result pattern graphs are as close to the initial pattern graph as possible, we devise different metrics to measure the degree of relaxation of a pattern graph. We design an algorithm that computes relaxed pattern graphs with non-empty answers in relaxation order. To improve the successive computation of relaxed pattern graphs, we suggest subquery caching and multiquery optimization techniques adapted to the context of this computation. Finally, we run experiments on different real datasets which demonstrate the effectiveness of our ranking of relaxed pattern graphs, and the efficiency of our system and optimization techniques in computing relaxed pattern graphs and their answers.

UR - http://www.scopus.com/inward/record.url?scp=85021726902&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85021726902&partnerID=8YFLogxK

M3 - Article

VL - 16

SP - 363

EP - 398

JO - Journal of Web Engineering

JF - Journal of Web Engineering

SN - 1540-9589

IS - 5-6

ER -