Data stream management systems: a response to large scale scientific data requirements

Sabina Surdu

Abstract


Research communities in the exact sciences are dealing with data sets that are expected to reach exascale sizes in the coming years. Analysing such data sets with classical data processing paradigms becomes a tedious task. An increasing number of domains in the exact sciences, such as radio astronomy and finance, handle data streams, which are sequences of values produced over time by data sources. Data streams require new processing approaches, and a number of prototypes have already been implemented. As an important application of informatics in the exact sciences, we consider Data Stream Management Systems as a response to increasingly large-scale data in a great number of fields. We aim to highlight the main challenges in data stream processing. Based on the difficulties identified, we present a preliminary set of five principles, SCIPE, that we plan to use in further research. The ultimate goal is to design and develop a dedicated Scientific Data Stream Management System.
