Assigning semantics to partial tree-pattern queries

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

The wide adoption of XML has increased interest in data models based on tree-structured data. Querying capabilities are provided through tree-pattern queries (TPQs). The need to query tree-structured data sources whose structure is not fully known, and the need to integrate multiple data sources with different tree structures, have recently led to proposals for query languages that relax the complete specification of a tree pattern. Assigning semantics to the queries of these languages so that they return meaningful answers is a challenging issue. In this paper, we introduce a query language which allows the specification of partial tree-pattern queries (PTPQs). The structure in a PTPQ can be flexibly specified fully, partially or not at all. We define index graphs which summarize the structural information of data trees. Using index graphs, we show that PTPQs can be evaluated through the generation of an equivalent set of "complete" TPQs. We suggest an original approach that exploits the set of complete TPQs of a PTPQ to assign meaningful semantics to the PTPQ language. In contrast to previous approaches that operate locally on the data to compute meaningful answers (usually by computing lowest common ancestors), our approach operates globally on index graphs to detect meaningful complete TPQs. We implemented and experimentally evaluated our approach on DBLP-based data sets with irregularities. A comparison with previous approaches shows that it succeeds in finding all the meaningful answers when the others fail (perfect recall). Further, it outperforms approaches with similar recall in excluding meaningless answers (better precision). Finally, it is superior to and scales better than the only previous approach that allows for structural constraints in the queries. Because our approach generates TPQs, it can be easily implemented on top of an XQuery engine.

Original language: English (US)
Pages (from-to): 242-265
Number of pages: 24
Journal: Data and Knowledge Engineering
Volume: 64
Issue number: 1
DOI: https://doi.org/10.1016/j.datak.2007.07.002
State: Published - Jan 1 2008

Fingerprint

Query languages
Semantics
Specifications
XML
Data structures
Engines
Query

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence

Cite this

@article{7c123f36d0414507b7fd92f44ae76393,
title = "Assigning semantics to partial tree-pattern queries",
abstract = "The wide adoption of XML has increased the interest on data models that are based on tree-structured data. Querying capabilities are provided through tree-pattern queries (TPQs). The need for querying tree-structured data sources when their structure is not fully known, and the need to integrate multiple data sources with different tree structures have driven, recently, the suggestion of query languages that relax the complete specification of a tree pattern. Assigning semantics to the queries of these languages so that they return meaningful answers is a challenging issue. In this paper, we introduce a query language which allows the specification of partial tree-pattern queries (PTPQs). The structure in a PTPQ can be flexibly specified fully, partially or not at all. We define index graphs which summarize the structural information of data trees. Using index graphs, we show that PTPQs can be evaluated through the generation of an equivalent set of {"}complete{"} TPQs. We suggest an original approach that exploits the set of complete TPQs of a PTPQ to assign meaningful semantics to the PTPQ language. In contrast to previous approaches that operate locally on the data to compute meaningful answers (usually by computing lowest common ancestors), our approach operates globally on index graphs to detect meaningful complete TPQs. We implemented and experimentally evaluated our approach on DBLP-based data sets with irregularities. Its comparison to previous ones shows that it succeeds in finding all the meaningful answers when the others fail (perfect recall). Further, it outperforms approaches with similar recall in excluding meaningless answers (better precision). Finally, it is superior to and scales better than the only previous approach that allows for structural constraints in the queries. Our approach generates TPQs and therefore, it can be easily implemented on top of an XQuery engine.",
author = "Dimitrios Theodoratos and Xiaoying Wu",
year = "2008",
month = "1",
day = "1",
doi = "10.1016/j.datak.2007.07.002",
language = "English (US)",
volume = "64",
pages = "242--265",
journal = "Data and Knowledge Engineering",
issn = "0169-023X",
publisher = "Elsevier",
number = "1",

}

Assigning semantics to partial tree-pattern queries. / Theodoratos, Dimitrios; Wu, Xiaoying.

In: Data and Knowledge Engineering, Vol. 64, No. 1, 01.01.2008, p. 242-265.

TY - JOUR

T1 - Assigning semantics to partial tree-pattern queries

AU - Theodoratos, Dimitrios

AU - Wu, Xiaoying

PY - 2008/1/1

Y1 - 2008/1/1

AB - The wide adoption of XML has increased the interest on data models that are based on tree-structured data. Querying capabilities are provided through tree-pattern queries (TPQs). The need for querying tree-structured data sources when their structure is not fully known, and the need to integrate multiple data sources with different tree structures have driven, recently, the suggestion of query languages that relax the complete specification of a tree pattern. Assigning semantics to the queries of these languages so that they return meaningful answers is a challenging issue. In this paper, we introduce a query language which allows the specification of partial tree-pattern queries (PTPQs). The structure in a PTPQ can be flexibly specified fully, partially or not at all. We define index graphs which summarize the structural information of data trees. Using index graphs, we show that PTPQs can be evaluated through the generation of an equivalent set of "complete" TPQs. We suggest an original approach that exploits the set of complete TPQs of a PTPQ to assign meaningful semantics to the PTPQ language. In contrast to previous approaches that operate locally on the data to compute meaningful answers (usually by computing lowest common ancestors), our approach operates globally on index graphs to detect meaningful complete TPQs. We implemented and experimentally evaluated our approach on DBLP-based data sets with irregularities. Its comparison to previous ones shows that it succeeds in finding all the meaningful answers when the others fail (perfect recall). Further, it outperforms approaches with similar recall in excluding meaningless answers (better precision). Finally, it is superior to and scales better than the only previous approach that allows for structural constraints in the queries. Our approach generates TPQs and therefore, it can be easily implemented on top of an XQuery engine.

UR - http://www.scopus.com/inward/record.url?scp=36048951735&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=36048951735&partnerID=8YFLogxK

U2 - https://doi.org/10.1016/j.datak.2007.07.002

DO - 10.1016/j.datak.2007.07.002

M3 - Article

VL - 64

SP - 242

EP - 265

JO - Data and Knowledge Engineering

JF - Data and Knowledge Engineering

SN - 0169-023X

IS - 1

ER -