Aspects of Stream Processing
Stream processing has long been, and remains, a highly relevant research topic in computer science. Several research paper titles point concisely at important aspects of stream processing: the ubiquity of streams due to the temporality of most data (“It’s a streaming world!”), their potential infinity in contrast to data stored in a static database system (“Streams are forever”), and the importance of the order in which data are streamed (“Order matters”). These aspects are relevant at all levels of stream processing, in particular for classical stream processing at the sensor-data level, e.g., within sensor networks, and at the relational data level, e.g., within data stream management systems.
Recent interest in high-level declarative stream processing w.r.t. a terminological knowledge base, also known as an ontology, has led to additional aspects becoming relevant: the end-user accesses all, possibly heterogeneous, data sources (static and streaming) via a declarative query language using the vocabulary, also known as the signature, of the ontology. Some recent projects, such as BOEMIE and CASAM, demonstrated how such a uniform ontology interface could be used to realize (abductive) interpretation of multimedia streaming data, which combine video streams, audio streams and text streams with annotations. This kind of convenience and flexibility for the end-user leads to challenging aspects for the streaming engine, which has to provide the homogeneous ontology view over the data and has to guarantee that all (and only) answers w.r.t. the ontology are captured.
The stream-temporal sub-module of the Optique platform implements high-level stream processing along the ontology-based data access (OBDA) paradigm, providing end-users with homogeneous declarative access to relational static data and relational streaming data. Mappings lift relational data (streams) stored in the distributed relational data stream management system EXAREME to the ontology level so that the details of the original data and streaming sources are hidden from the end-user. Regarding the ubiquity aspect, the query language STARQL underlying the Optique streaming engine [4, 5] offers the same query interface to both historical (= timestamped) data stored in a (temporal) DBMS and streaming data; it thus provides the same means for the engineer or end-user to conduct reactive diagnosis of data coming from a historical database and predictive diagnostics by monitoring real-time streams. Regarding the aspect of potential infinity, STARQL uses the common sliding window constructor, but adapts it to cope with the semantic constraints of the ontology (such as a constraint saying that, at every time point, a sensor may show at most one value). And regarding the ordering aspect, STARQL offers various sequencing strategies within the sliding window, thereby providing a local grouping and ordering.
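The interplay of sliding windows, sequencing and ontology constraints described above can be illustrated with a small Python sketch. It is not STARQL syntax and not the Optique implementation; all names (`Reading`, `window_contents`, `sequence_by_timestamp`, `violates_functionality`, `sliding_windows`) are hypothetical. The sketch assumes integer timestamps, a sequencing strategy that groups readings with the same timestamp into one state, and, purely as an illustrative choice, that states violating the "at most one value per sensor and time point" constraint are dropped from the window.

```python
from collections import defaultdict
from typing import NamedTuple


class Reading(NamedTuple):
    """A timestamped sensor reading (hypothetical record layout)."""
    timestamp: int  # assumed integer time unit, e.g. seconds
    sensor: str
    value: float


def window_contents(stream, now, window_range):
    """Return the readings with now - window_range <= timestamp <= now."""
    return [r for r in stream if now - window_range <= r.timestamp <= now]


def sequence_by_timestamp(readings):
    """Group readings with identical timestamps into one state and
    return the states ordered by time (one possible sequencing strategy)."""
    states = defaultdict(list)
    for r in readings:
        states[r.timestamp].append(r)
    return [(t, states[t]) for t in sorted(states)]


def violates_functionality(state):
    """True if some sensor shows two different values within one state."""
    seen = {}
    for r in state:
        if r.sensor in seen and seen[r.sensor] != r.value:
            return True
        seen[r.sensor] = r.value
    return False


def sliding_windows(stream, start, end, slide, window_range):
    """Yield (now, sequence) for each window position from start to end.

    Assumption of this sketch: inconsistent states are simply dropped.
    """
    stream = list(stream)
    for now in range(start, end + 1, slide):
        seq = sequence_by_timestamp(window_contents(stream, now, window_range))
        yield now, [(t, s) for t, s in seq if not violates_functionality(s)]


if __name__ == "__main__":
    readings = [
        Reading(0, "s1", 90.0),
        Reading(1, "s1", 91.0),
        Reading(1, "s1", 95.0),  # conflicts with the previous reading
        Reading(2, "s1", 93.0),
        Reading(3, "s2", 10.0),
    ]
    for now, seq in sliding_windows(readings, start=2, end=3, slide=1,
                                    window_range=2):
        print(now, [t for t, _ in seq])
```

In this example the state at time point 1 contains two different values for sensor `s1`, so it is excluded from both windows. How an actual engine treats such inconsistencies (rejecting, repairing or reporting them) is a design decision that this sketch does not settle.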
Current work on the Optique stream-temporal sub-module concerns optimization aspects, in particular optimizations for multiple query answering as needed for parallel monitoring of up to thousands of sensor value streams.
Prof. Dr. Ralf Möller
Full professor and director of the Institute of Information Systems (IFIS), University of Lübeck. Leader of the Optique work package on temporal and stream processing (WP5).
Dr. Özgür L. Özçep
Research Assistant, Institute of Information Systems (IFIS), University of Lübeck. Working in WP5 on theoretical aspects of temporal and stream processing.