Unsupervised Anomaly Detection in Data Quality Control

Open Access
Authors
Publication date 2021
Host editors
  • Y. Chen
  • H. Ludwig
  • Y. Tu
  • U. Fayyad
  • X. Zhu
  • X. Hu
  • S. Byna
  • X. Liu
  • J. Zhang
  • S. Pan
  • V. Papalexakis
  • J. Wang
  • A. Cuzzocrea
  • C. Ordonez
Book title 2021 IEEE International Conference on Big Data
Book subtitle proceedings : Dec 15-Dec 18, 2021 : virtual event
ISBN
  • 9781665445993
ISBN (electronic)
  • 9781665439022
Event 2021 IEEE International Conference on Big Data
Pages (from-to) 2327-2336
Number of pages 10
Publisher Piscataway, NJ: IEEE
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Data is one of the most valuable assets of an organization and has a tremendous impact on its long-term success and decision-making processes. Typically, organizational data error and outlier detection processes are performed manually and reactively, making them time-consuming and prone to human error. Additionally, rich data types, unlabeled data, and increased volume have made such data more complex. Accordingly, an automated anomaly detection approach is required to improve data management and quality control processes. This study introduces an unsupervised anomaly detection approach based on model comparison, consensus learning, and a combination of rules of thumb with iterative hyper-parameter tuning to increase data quality. Furthermore, a domain expert acts as a human in the loop to evaluate and check the data quality and to judge the output of the unsupervised model. An experiment has been conducted to assess the proposed approach in the context of a case study. The experimental results confirm that the proposed approach can improve the quality of organizational data and facilitate anomaly detection processes.
Document type Conference contribution
Note In print proceedings: p. 1564-1573.
Language English
Published at https://doi.org/10.1109/BigData52589.2021.9671672
Published at https://zenodo.org/record/5872438
Other links https://www.proceedings.com/62200.html
Downloads
2021.workshop.bigdata.midp21.camera (Accepted author manuscript)