Our technical approach will exploit recent ground-breaking European research on semantic technologies, in particular on query rewriting, and combine it with techniques for scaling up query evaluation, notably massive parallelism. These will be integrated in a comprehensive and extensible platform that builds on open standards and protocols such as RDF, OWL, SPARQL, the OWL API, and the openRDF project. This will facilitate the integration and reuse of existing software and tools developed in previous initiatives and projects, and allow us to (i) focus our efforts on the development of novel and performance-critical key modules; (ii) provide a first prototype implementation at an early stage of the project; and (iii) maximise the reusability of the components developed in Optique.
A key feature of Optique is that the ontology provides a user-oriented conceptual model of the domain against which queries can be posed. This allows the user to formulate “natural” queries using familiar terms and without having to understand the structure of the underlying data sources. Tool support will be provided that exploits the ontology to help users to formulate coherent queries, and allows users with differing levels of expertise to cooperate on the same query and to extend the ontology on the fly as needed for the query being formulated.
Ontology & Mapping Management
The architecture proposed in this project crucially depends on the existence of suitable ontologies and mappings. In this context, the ontology provides a user-oriented conceptual model of the domain that makes it easier for users to formulate queries and understand answers; at the same time, the ontology acts as a “global schema” onto which the schemas of various data sources can be mapped. Developing suitable ontologies from scratch is likely to be expensive. Optique will address this issue by developing tools and methodologies for semi-automatically “bootstrapping” the system with a suitable initial ontology and for extending the ontology “on the fly” as needed by a given application.
Regarding the ontology/data-source mappings, many of these will, like the ontology, be generated automatically from either database schemata and other available metadata or formal installation models. However, these initial mappings are unlikely to be sufficient in all cases, and they will certainly need to evolve along with the ontology. Moreover, new data sources may be added, and this would again require extension and adjustment of the mappings.
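As an illustration of how initial mappings might be derived from a database schema, the following minimal Python sketch generates one class mapping per table and one property mapping per non-key column. The `Mapping` type, the `bootstrap_mappings` function, the naming scheme, and the example schema are all hypothetical and are not part of the Optique platform; real bootstrapping would also use keys, foreign keys, and other metadata.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Mapping:
    target: str  # ontology term (class or property); hypothetical naming scheme
    source: str  # SQL query over the data source that populates the term


def bootstrap_mappings(schema: Dict[str, List[str]], key: str = "id") -> List[Mapping]:
    """Derive one class mapping per table and one property mapping per non-key column."""
    mappings = []
    for table, columns in schema.items():
        # Assumption: the class name is the table name in CamelCase.
        cls = table.title().replace("_", "")
        mappings.append(Mapping(cls, f"SELECT {key} FROM {table}"))
        for col in columns:
            if col == key:
                continue
            mappings.append(Mapping(f"{cls}.{col}", f"SELECT {key}, {col} FROM {table}"))
    return mappings


# Hypothetical schema: table name -> column names.
schema = {"turbine": ["id", "model", "site"]}
mappings = bootstrap_mappings(schema)
for m in mappings:
    print(m.target, "<-", m.source)
```

Mappings produced this way are only a starting point; as noted above, they would still need manual extension and adjustment as the ontology evolves.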
Time and Streams
Time plays an important role in many industrial applications, and this is even more true in the context of ontology-based data modeling. Temporal queries usually have to be combined with queries over static data.
The central idea of Optique for scalable temporal query answering and stream-based answering of continuous queries using the OBDA approach is to apply query transformation w.r.t. ontologies to queries with temporal or window-based semantics as well. With an ontology and a set of mappings from the notions defined in the ontology to the nomenclature used in particular relational database schemas, query formulation becomes much easier, and costly materialization is not required.
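To make the notion of window-based semantics concrete, here is a minimal Python sketch of a continuous query that reports, for each arriving reading, the average over a sliding time window. It illustrates window semantics only, not ontology-based query transformation; the function name `windowed_average`, the `(timestamp, value)` event format, and the sample readings are assumptions made for this example.

```python
from collections import deque


def windowed_average(stream, width):
    """For each event at time t, yield (t, mean of values with timestamp in (t - width, t])."""
    window = deque()
    total = 0.0
    for t, value in stream:
        window.append((t, value))
        total += value
        # Evict events that have fallen out of the window (t - width, t].
        while window[0][0] <= t - width:
            _, old_value = window.popleft()
            total -= old_value
        yield t, total / len(window)


# Hypothetical sensor readings as (timestamp, value) pairs, ordered by time.
readings = [(0, 10.0), (1, 20.0), (2, 30.0), (5, 40.0)]
averages = list(windowed_average(readings, width=3))
print(averages)
```

Because only the events currently inside the window are kept in memory, the result is computed incrementally and the stream is never materialized as a whole.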
The Optique query answering infrastructure will rely heavily on query rewriting. However, given the large number of sources, their size, and the complexity of the use cases, we will need to go far beyond traditional approaches. In particular, our platform will put an emphasis not only on the performance of the rewriting process but also on the performance of query evaluation.
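The basic idea of query rewriting can be sketched in a few lines of Python: a query for a class is first expanded with all classes that the ontology declares as its subclasses, and the result is then unfolded through the mappings into a SQL union over the sources. This is a deliberately simplified sketch restricted to subclass axioms and single-class queries; the class names, table names, and the functions `subclasses` and `rewrite_and_unfold` are hypothetical and do not reflect the actual Optique rewriting algorithms.

```python
def subclasses(cls, subclass_of):
    """Return cls together with every class the axioms make a (transitive) subclass of it."""
    result, frontier = {cls}, [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_of:
            if sup == current and sub not in result:
                result.add(sub)
                frontier.append(sub)
    return result


def rewrite_and_unfold(cls, subclass_of, mappings):
    """Rewrite a single-class query w.r.t. the ontology, then unfold it into SQL."""
    # Only classes that have a mapping contribute a SQL branch.
    parts = [mappings[c] for c in sorted(subclasses(cls, subclass_of)) if c in mappings]
    return "\nUNION\n".join(parts)


# Hypothetical ontology axioms (subclass, superclass) and mappings (class -> SQL).
subclass_of = [("GasTurbine", "Turbine"), ("SteamTurbine", "Turbine")]
mappings = {
    "GasTurbine": "SELECT id FROM gas_turbines",
    "SteamTurbine": "SELECT id FROM steam_turbines",
}
sql = rewrite_and_unfold("Turbine", subclass_of, mappings)
print(sql)
```

Even in this toy setting the rewritten query grows with the number of subclasses and mappings, which hints at why the performance of evaluating the rewritten queries matters as much as the performance of the rewriting step itself.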
Distributed Query Execution
The Optique work on distributed query execution concentrates on the efficient evaluation of queries produced by the query translation component. These queries target the EPDS database and the project databases in the Statoil use case or the collection of Turbine databases in the Siemens use case. We address one-time, continuous/streaming, and temporal query scenarios as they arise in the two use cases.
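For the one-time scenario, the following Python sketch shows the general pattern of sending the same translated query to several source databases in parallel and merging the partial results. Each source is simulated by an in-memory SQLite database; the `reading` table, the source names, and the functions `run_on_source` and `evaluate_everywhere` are assumptions made for illustration and say nothing about the actual EPDS or turbine database schemas.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def run_on_source(task):
    """Open one simulated source database, load its rows, and evaluate the query there."""
    name, rows, query = task
    # The connection is created inside the worker so it is used by a single thread only.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE reading (sensor TEXT, value REAL)")
        conn.executemany("INSERT INTO reading VALUES (?, ?)", rows)
        # Tag every answer with the source it came from.
        return [(name, *row) for row in conn.execute(query)]
    finally:
        conn.close()


def evaluate_everywhere(sources, query):
    """Run the query on all sources in parallel and merge the partial results."""
    tasks = [(name, rows, query) for name, rows in sources.items()]
    with ThreadPoolExecutor() as pool:
        partial_results = list(pool.map(run_on_source, tasks))
    return sorted(row for part in partial_results for row in part)


# Hypothetical sources: database name -> rows of (sensor, value).
sources = {
    "turbine_a": [("t1", 410.0), ("t2", 95.5)],
    "turbine_b": [("t1", 388.0)],
}
answers = evaluate_everywhere(sources, "SELECT sensor, value FROM reading WHERE value > 100")
print(answers)
```

The merge step here is a plain concatenation followed by a sort; queries involving joins or aggregation across sources would need a more careful combination of the partial results.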