The DBMS - your Big Data Sommelier

Y. Kargın; M. Kersten; S. Manegold; H. Pirk

doi:https://doi.org/10.1109/ICDE.2015.7113361

Authors	Y. Kargın; M. Kersten; S. Manegold; H. Pirk
Publication date	2015
Book title	31st IEEE International Conference on Data Engineering: Seoul, Korea, April 13-17, 2015
ISBN	9781479979653
Event	31st IEEE International Conference on Data Engineering
Pages (from-to)	1119-1130
Publisher	[Piscataway, NJ]: IEEE
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract	When addressing the problem of "big" data volume, preparation costs are one of the key challenges: the high costs of loading, aggregating and indexing data lead to a long data-to-insight time. In addition to being a nuisance to the end-user, this latency prevents real-time analytics on "big" data. Fortunately, data often comes in semantic chunks, such as files, whose data items share characteristics such as acquisition time or location. A data management system that exploits this trait can significantly lower the data preparation costs and the associated data-to-insight time by investing only in the preparation of the relevant chunks. In this paper, we develop such a system as an extension of an existing relational DBMS (MonetDB). To this end, we develop a query processing paradigm and data storage model that are partial-loading aware. The result is a system that can make a 1.2 TB dataset (consisting of 4000 chunks) ready for querying in less than 3 minutes on a single server-class machine while maintaining good query processing performance.
Document type	Conference contribution
Language	English
Published at	https://doi.org/10.1109/ICDE.2015.7113361 (Final published version)
Downloads	KarginICDE2015 (Submitted manuscript)
