Abstract:
Domain experts make decisions based on data analysis every day. As data grows, getting access to the relevant data becomes a challenge. In an approach known as ontology-based data access (OBDA), ontologies are advocated as a suitable formal tool to address complex data access. This technique combines a domain ontology with a data source by means of a declarative mapping specification, so that data can be accessed using a domain vocabulary. We investigate this approach by studying the theoretical background; conducting a literature review on the implementation of OBDA in production systems; implementing OBDA on a relational dataset using an OBDA tool; and providing results and analysis of query answering. We selected Ontop, an open-source OBDA tool for relational databases, to illustrate how this technique enhances the data usage of the GitHub community. The implementation consists of the GHTorrent dataset and an extended SemanGit ontology. We perform a set of queries to highlight a subset of the features of this data access approach. The results are positive and can support various use cases related to GitHub data with a semantic approach. OBDA does provide benefits in practice, such as querying in the domain vocabulary and reasoning over the axioms in the ontology. However, the practical impediments we observe lie in the "manual" development of a domain ontology and the creation of a mapping specification, both of which require deep knowledge of the domain and the data. Also, implementing OBDA within the practical context of an information system requires careful consideration of a suitable user interface to facilitate query construction from the ontology vocabulary. Finally, we conclude with a summary of the paper and directions for future research.
Abstract:
We propose a method for generating and evaluating faceted queries over ontology-enhanced distributed graph databases. A user who only vaguely knows the domain ontology starts with a set of keywords. An initial faceted query is then generated automatically, and the user is guided through interactive modification and refinement of successively created faceted queries. We provide the theoretical foundation for this way of constructing faceted queries and translating them into first-order monadic positive existential queries.
Abstract:
To deliver business value, most data-driven enterprises and applications require data to be extracted and merged from otherwise siloed data storage platforms. FDC Cache has been designed and developed to enable the fusion and caching of data drawn from multiple small and/or Big Data stores. This capability executes a sequence of queries, wherein the results from one query may be used to constrain subsequent queries. The results of each query are linked with results from previous queries, incrementally building a cache of semantically linked data that can be used to support multiple independent data requests. FDC Cache uses Semantic Web technologies, and knowledge graphs in particular, to describe the relevant data and relationships in a computable model. This enables applications to reason over the graph, for example to dynamically retrieve targeted subsets of data comprised of previously disparate information. We have successfully applied FDC Cache to two distinct industrial use cases: (i) merging data across multiple sources to assemble information about current parts in a gas turbine, and (ii) dynamically aligning siloed data from electric grid transmission and distribution networks to an industry-standard common model, in which the cache creation time has been shown to scale sub-linearly with the number of data elements. FDC Cache has been open-sourced as part of the GE-developed open source Semantics Toolkit.
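The chained-query idea described in this abstract can be illustrated with a minimal sketch. It is not FDC Cache's actual API: the two in-memory stores, the `query` and `build_cache` helpers, and the `hasPart` / `hasStatus` predicates are all hypothetical names chosen for illustration. The sketch only shows how the results of one query can constrain the next, and how linked results accumulate in a small triple cache.

```python
# Hypothetical in-memory "stores"; names and data are illustrative only.
PARTS_STORE = [
    {"turbine": "T1", "part": "P10"},
    {"turbine": "T1", "part": "P11"},
    {"turbine": "T2", "part": "P20"},
]
MAINTENANCE_STORE = [
    {"part": "P10", "status": "inspected"},
    {"part": "P11", "status": "replaced"},
    {"part": "P20", "status": "inspected"},
]


def query(store, field, allowed):
    """Return the rows of `store` whose `field` value is in `allowed`."""
    return [row for row in store if row[field] in allowed]


def build_cache(turbine_id):
    """Run two chained queries and link their results as triples."""
    cache = set()
    # Step 1: find the parts installed in the given turbine.
    parts = query(PARTS_STORE, "turbine", {turbine_id})
    for row in parts:
        cache.add((row["turbine"], "hasPart", row["part"]))
    # Step 2: constrain the second query by the parts found in step 1.
    part_ids = {row["part"] for row in parts}
    for row in query(MAINTENANCE_STORE, "part", part_ids):
        cache.add((row["part"], "hasStatus", row["status"]))
    return cache


cache = build_cache("T1")
```

Because the second query is restricted to the part identifiers returned by the first, nothing about turbine `T2` enters the cache, and later requests about `T1` can be answered from the linked triples alone.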
Abstract:
Description Logics (DLs) play a central role as formalisms for representing ontologies and reasoning about them. This lecture introduces the basics of DLs. We discuss the knowledge modeling capabilities of some of the most prominent DLs, including expressive ones, and present some DL reasoning services. Particular attention is devoted to the query answering problem, and to the increasingly popular framework in which data repositories are queried through DL ontologies. We give an overview of the main challenges that arise in this setting, survey some query answering techniques for both lightweight and expressive DLs, and give an overview of the computational complexity landscape.
Abstract:
Ontology-Based Data Access (OBDA) is considered a promising semantic approach for querying complex datasets in weakly formalized activities such as energy technology forecasting. OBDA uses an ontology to work with complex energy technology data while abstracting away from technical schema-level details. A dedicated mapping is required to connect the underlying data to ontology entities. The OBDA approach automatically translates queries posed over the ontology into data-level queries that can be executed by the underlying database management system. The paper focuses on the main principles of OBDA as applied to an Energy Technology Database within a technology forecasting information system.
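As a rough illustration of the translation step mentioned in this abstract, the sketch below unfolds a tiny ontology-level query into SQL through a mapping. Everything in it is an assumption made for the example: the `tech` table, the `Technology` class, the `hasEfficiency` property, and the `unfold` helper are hypothetical and do not come from the paper or from any OBDA tool.

```python
import sqlite3

# Hypothetical mapping: each ontology term is defined by a SQL query
# over the relational source.
MAPPING = {
    "Technology": "SELECT id AS x FROM tech",
    "hasEfficiency": "SELECT id AS x, efficiency AS y FROM tech",
}


def unfold(cls, prop):
    """Translate the ontology-level query cls(x) AND prop(x, y) into SQL."""
    return (
        f"SELECT c.x, p.y FROM ({MAPPING[cls]}) AS c "
        f"JOIN ({MAPPING[prop]}) AS p ON c.x = p.x ORDER BY c.x"
    )


# Toy relational source standing in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tech (id TEXT, efficiency REAL)")
conn.executemany(
    "INSERT INTO tech VALUES (?, ?)",
    [("solar_pv", 0.2), ("wind", 0.4)],
)

sql = unfold("Technology", "hasEfficiency")
rows = conn.execute(sql).fetchall()
```

The user only names ontology terms; the table and column names appear solely in the mapping, which is the sense in which the ontology abstracts away from schema-level details.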
Abstract:
Ontology-based Data Access (OBDA) is a by now well-established paradigm that relies on conceptually representing a domain of interest to provide access to relational data sources. The conceptual representation is given in terms of a domain schema (also called an ontology), which is linked to the data sources by means of declarative mapping specifications, and queries posed over the conceptual schema are automatically rewritten into queries over the sources. We consider the interesting setting where users would like to access the same data sources through a new conceptual schema, which we call the upper schema. This is particularly relevant when the upper schema is a reference model for the domain, or captures the data format used by data analysis tools. We propose a solution to this problem that is based on using transformation rules to map the upper schema to the domain schema, building upon the knowledge contained therein. We show how this enriched framework can be automatically transformed into a standard OBDA specification, which directly links the original relational data sources to the upper schema. This allows us to access data directly from the data sources while leveraging the domain schema and upper schema as a lens. We have realized the framework in a tool-chain that provides modeling of the conceptual schemas, a concrete annotation-based mechanism to specify transformation rules, and the automated generation of the final OBDA specification.
Abstract:
In this paper we study query answering and rewriting in ontology-based data access. Specifically, we present an algorithm for computing a perfect rewriting of unions of conjunctive queries posed over ontologies expressed in the description logic ELHIO, which covers the OWL 2 QL and OWL 2 EL profiles. The novelty of our algorithm is the use of a set of ABox dependencies, which are compiled into a so-called EBox, to limit the expansion of the rewriting. So far, EBoxes have only been used in query rewriting in the case of DL-Lite, which is less expressive than ELHIO. We have extensively evaluated our new query rewriting technique, and in this paper we discuss the tradeoff between the reduction of the size of the rewriting and the computational cost of our approach.
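The role of ABox dependencies in limiting a rewriting can be shown on a deliberately small case. The sketch below is not the paper's algorithm: it handles only atomic class queries over plain subclass axioms, and the `TBOX`, `EBOX`, `rewrite`, and `prune` names are hypothetical. It also assumes the dependencies are acyclic.

```python
# Hypothetical TBox: (sub, sup) means every instance of `sub` is a `sup`.
TBOX = [("Professor", "Teacher"), ("Teacher", "Person"), ("Student", "Person")]
# Hypothetical EBox: (a, b) means the stored extension of `a` is already
# contained in the stored extension of `b`. Assumed acyclic.
EBOX = [("Professor", "Teacher")]


def rewrite(cls, tbox):
    """Return all classes whose stored instances answer the query cls(x)."""
    result, frontier = {cls}, [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in tbox:
            if sup == current and sub not in result:
                result.add(sub)
                frontier.append(sub)
    return result


def prune(disjuncts, ebox):
    """Drop disjuncts whose stored data is covered by another disjunct."""
    return {
        d for d in disjuncts
        if not any(a == d and b in disjuncts for a, b in ebox)
    }


ucq = rewrite("Person", TBOX)   # union of atomic queries, one per class
pruned = prune(ucq, EBOX)       # same answers over data satisfying EBOX
```

Here the query `Person(x)` expands into a union over four classes, and the dependency lets the `Professor` disjunct be dropped because every stored professor is already stored as a teacher. Checking such dependencies has its own cost, which is the tradeoff the abstract refers to.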