Unsupervised Detection of Anomalous Commits in Software Repositories
| Authors | |
|---|---|
| Publication date | 2025 |
| Host editors | |
| Book title | 2025 IEEE International Conference on Software Services Engineering : IEEE SSE 2025 |
| Book subtitle | Helsinki, Finland, 7-12 July 2025 : proceedings |
| ISBN | |
| ISBN (electronic) | |
| Event | 2025 IEEE International Conference on Software Services Engineering, SSE 2025 |
| Pages (from-to) | 31-38 |
| Number of pages | 8 |
| Publisher | Los Alamitos, California: IEEE Computer Society |
| Organisations | |
| Abstract | Identifying anomalous commits is essential for maintaining software quality and reliability, as these anomalies can indicate potential issues in code, development practices, or repository management. Current anomaly detection methods typically rely on predefined rules or supervised learning, which suffer from limitations such as dependence on labeled datasets, rigid rule definitions, and high maintenance overhead in rapidly evolving repositories. This paper introduces a novel unsupervised framework for effectively detecting anomalous commits without requiring labeled data or rigid rules, providing a scalable and adaptable solution to enhance code quality in modern version control systems. To address the high-dimensional and multifaceted nature of commit data, our approach combines dimensionality reduction techniques with targeted feature engineering, enhancing both precision and adaptability in anomaly detection. We systematically evaluate three state-of-the-art unsupervised techniques, namely Local Outlier Factor (LOF), Isolation Forest (IF), and Histogram-Based Outlier Score (HBOS), across five diverse open-source repositories. Our results demonstrate that Isolation Forest achieves the highest detection accuracy, effectively balancing precision and recall while capturing both global and local anomalies. Additionally, expert validation confirms the practical relevance of our approach, providing insights into frequent and high-impact anomalies encountered in real-world repositories. |
| Document type | Conference contribution |
| Language | English |
| Published at | https://doi.org/10.1109/SSE67621.2025.00013 |
| Other links | https://www.proceedings.com/82141.html https://www.scopus.com/pages/publications/105016017456 |
| Downloads | Unsupervised_Detection_of_Anomalous_Commits_in_Software_Repositories (Final published version) |
| Permalink to this page | |
