A quantitative comparison of automated cleaning techniques for web scraped image data of 'Smart Cities'

Open Access
Authors
Publication date 2022
Book title IPMV 2022
Book subtitle 2022 4th International Conference on Image Processing and Machine Vision : Virtual Conference, March 25-27, 2022
ISBN (electronic)
  • 9781450395823
Series ICPS
Event 4th International Conference on Image Processing and Machine Vision, IPMV 2022
Pages (from-to) 64-71
Number of pages 8
Publisher New York, New York: The Association for Computing Machinery
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract

This paper implements and compares four automated image cleaning techniques using the ResNet-34 Convolutional Neural Network, motivated by the need to reduce the manual effort of cleaning large image datasets. For each technique, its relation to the literature on automated image cleaning is identified. Each of the four techniques uses a specific criterion to identify and remove unwanted images from a dataset. The criteria range from identifying images that contain text, through identifying images with a specific size or tonal distribution, to identifying images with a specific training loss value. To evaluate the four cleaning techniques, ResNet-34 was trained on web-scraped images corresponding to 15 object classes of 'Smart Cities', and accuracy results were obtained by testing on a subset of the CalTech 256 dataset. The results show that manual cleaning outperforms the automated cleaning techniques on all four criteria. However, analysis reveals that the individual automated techniques, or a combination thereof, can be applied to large datasets as a first pass before manual verification, to reduce workload and increase dataset stability.
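As an illustration of how criterion-based cleaning of this kind might be organised, the following is a minimal sketch and not the authors' implementation. It covers three of the four criteria named above (size, tonal distribution, training loss); the text criterion is omitted because it would require an OCR or text-detection component. The `ImageRecord` type, the function names, and all threshold values are hypothetical and chosen only for the example. The abstract does not state the thresholds used in the paper.

```python
from dataclasses import dataclass
from statistics import pstdev


@dataclass
class ImageRecord:
    """Hypothetical per-image summary; field names are assumptions."""
    name: str
    width: int
    height: int
    gray_pixels: list   # flattened grayscale values in 0..255
    train_loss: float   # per-sample loss reported by a trained classifier


def too_small(rec, min_side=64):
    # Size criterion: flag images whose shorter side is below a threshold.
    return min(rec.width, rec.height) < min_side


def flat_tonal_distribution(rec, min_std=10.0):
    # Tonal criterion (simplified): a very low spread of gray values
    # suggests a blank or near-uniform image.
    return pstdev(rec.gray_pixels) < min_std


def high_loss(rec, max_loss=2.0):
    # Loss criterion: flag samples the classifier fits poorly.
    return rec.train_loss > max_loss


def clean(records, criteria):
    """Split records into (kept, removed); an image is removed if any criterion flags it."""
    kept, removed = [], []
    for rec in records:
        (removed if any(c(rec) for c in criteria) else kept).append(rec)
    return kept, removed


# Toy data: a varied-tone image, a tiny one, a blank one, and a high-loss one.
varied = [0, 64, 128, 192, 255] * 20
records = [
    ImageRecord("ok.jpg", 200, 150, varied, 0.3),
    ImageRecord("tiny.jpg", 32, 32, varied, 0.3),
    ImageRecord("blank.jpg", 200, 150, [250] * 100, 0.3),
    ImageRecord("hard.jpg", 200, 150, varied, 3.5),
]
kept, removed = clean(records, [too_small, flat_tonal_distribution, high_loss])
print([r.name for r in kept], [r.name for r in removed])
```

Passing the criteria as a list makes it straightforward to run a single technique or a combination of them as a first pass before manual verification, which is the deployment the abstract suggests.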

Document type Conference contribution
Note With supplementary slides.
Language English
Published at https://doi.org/10.1145/3529446.3529457
Other links https://www.scopus.com/pages/publications/85134877620
Downloads
3529446.3529457 (Final published version)
Supplementary materials