A quantitative comparison of automated cleaning techniques for web scraped image data of 'Smart Cities'
| Authors | |
|---|---|
| Publication date | 2022 |
| Book title | IPMV 2022 |
| Book subtitle | 2022 4th International Conference on Image Processing and Machine Vision : Virtual Conference, March 25-27, 2022 |
| ISBN (electronic) | |
| Series | ICPS |
| Event | 4th International Conference on Image Processing and Machine Vision, IPMV 2022 |
| Pages (from-to) | 64-71 |
| Number of pages | 8 |
| Publisher | New York, New York: The Association for Computing Machinery |
| Organisations | |
| Abstract | This paper implements and compares four automated image cleaning techniques through the ResNet-34 Convolutional Neural Network, motivated by the need to reduce manual cleaning efforts of large image datasets. For each of these techniques, the relation with the literature on automated image cleaning is identified. Each of the four techniques uses a specific criterion to identify and remove unwanted images from datasets. The criteria range from identifying images with text, through identifying images with a specific size or tonal distribution, up to identifying images with a specific training loss value. In order to evaluate the four cleaning techniques, ResNet-34 was trained with web scraped images corresponding to 15 object classes of 'Smart Cities', and accuracy results were obtained through testing on the CalTech 256 dataset subset. The results show that manual cleaning outperforms automated cleaning techniques on all four criteria. However, analysis reveals that the individual automated techniques or a combination thereof can initially be deployed on large datasets before manual verification to reduce workload and increase dataset stability. |
| Document type | Conference contribution |
| Note | With supplementary slides. |
| Language | English |
| Published at | https://doi.org/10.1145/3529446.3529457 |
| Other links | https://www.scopus.com/pages/publications/85134877620 |
| Downloads | 3529446.3529457 (Final published version) |
| Supplementary materials | |
| Permalink to this page | |
