Lazy ETL in Action: ETL Technology Dates Scientific Data
| Authors | |
|---|---|
| Publication date | 08-2013 |
| Journal | Proceedings of the VLDB Endowment |
| Event | 39th International Conference on Very Large Data Bases |
| Volume / Issue number | 6 / 12 |
| Pages (from-to) | 1286-1289 |
| Organisations | |
| Abstract |
Both scientific data and business data have analytical needs. Analysis takes place after a scientific data warehouse is eagerly filled with all data from external data sources (repositories). This is similar to the initial loading stage of Extract, Transform, and Load (ETL) processes that drive business intelligence. ETL can also help scientific data analysis. However, the initial loading is a time- and resource-consuming operation. It might not be entirely necessary, e.g., if the user is interested in only a subset of the data.
We propose to demonstrate Lazy ETL, a technique to lower costs for initial loading. With it, ETL is integrated into the query processing of the scientific data warehouse. For a query, only the required data items are extracted, transformed, and loaded transparently on the fly. The demo is built around concrete implementations of Lazy ETL for seismic data analysis. The seismic data warehouse is ready for query processing, without waiting for long initial loading. The audience fires analytical queries to observe the internal mechanisms and modifications that realize each of the steps: lazy extraction, transformation, and loading. |
| Document type | Article |
| Note | Proceedings title: Proceedings of the 39th International Conference on Very Large Data Bases, Riva del Garda, Trento, Italy Editors: M. Böhlen, C. Koch |
| Language | English |
| Published at | https://doi.org/10.14778/2536274.2536297 |
| Published at | http://www.vldb.org/pvldb/vol6/p1286-kargin.pdf |
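
The abstract's central idea, that a query triggers extraction, transformation, and loading only for the data items it needs, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The `LazyWarehouse` class, the `RAW_REPOSITORY` dictionary, the station names, and the comma-separated sample format are all hypothetical stand-ins chosen for illustration.

```python
from typing import Dict, List

# Hypothetical stand-in for an external repository: one raw text "file" per station.
RAW_REPOSITORY: Dict[str, str] = {
    "STA1": "0.1,0.4,-0.2",
    "STA2": "1.5,1.7",
    "STA3": "-0.3,0.0,0.3,0.6",
}


class LazyWarehouse:
    """Toy warehouse that runs extract/transform/load only when a query needs it."""

    def __init__(self, repository: Dict[str, str]) -> None:
        self._repository = repository
        self._loaded: Dict[str, List[float]] = {}  # items already in the warehouse
        self.extract_calls = 0  # how many source items were actually read

    def _extract(self, key: str) -> str:
        # Lazy extraction: read a single item from the external source.
        self.extract_calls += 1
        return self._repository[key]

    @staticmethod
    def _transform(raw: str) -> List[float]:
        # Lazy transformation: parse the raw text into numeric samples.
        return [float(value) for value in raw.split(",")]

    def _ensure_loaded(self, key: str) -> List[float]:
        # Lazy loading: populate the warehouse on first use, reuse afterwards.
        if key not in self._loaded:
            self._loaded[key] = self._transform(self._extract(key))
        return self._loaded[key]

    def max_amplitude(self, station: str) -> float:
        # An analytical query that touches only one station's data.
        samples = self._ensure_loaded(station)
        return max(abs(sample) for sample in samples)


# No initial loading step: the warehouse is ready for queries immediately.
warehouse = LazyWarehouse(RAW_REPOSITORY)
first = warehouse.max_amplitude("STA2")
second = warehouse.max_amplitude("STA2")  # served from already loaded data
```

In this sketch only the queried station is ever read from the source, and a repeated query reuses the loaded data. An eager design would instead parse every entry of `RAW_REPOSITORY` before answering the first query.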