Identification of Security related Bug Reports via Text Mining using Supervised and Unsupervised Classification

Abstract: This paper focuses on the automated classification of software bug reports as security related or non-security related, using both supervised and unsupervised approaches. For both approaches, three types of feature vectors are used. For supervised learning, we experiment with multiple learning algorithms and training sets of different sizes. Furthermore, we propose a novel unsupervised approach based on anomaly detection. The evaluation is based on three NASA datasets. The results show that supervis…