Jackson State University
Faculty Sponsor's Department(s):
Detecting Spam Email Using Machine Learning Algorithms
Spam has become a major problem for the internet and its users. It ranges from fraudulent reviews of products and services to fake social media profiles created to reach unsuspecting victims. Everyone with an email address has received a fair share of spam emails. Opening these messages has exposed users to viruses, phishing scams, and Trojan horses, which put users, computer systems, and the internet at risk. By February 2014, the estimated worldwide email spam rate had decreased to 64 percent due to improved spam filters and practical spam prevention measures. We focused on email spam, analyzing the features that mark messages as spam and three supervised machine learning algorithms used in spam filters to detect them: Naive Bayes, Logistic Regression, and the J48 Decision Tree. When applied to two datasets using the same set of features, the J48 algorithm returned the highest percentage of correctly classified spam and non-spam messages. Further study will seek to identify unique features of the J48 algorithm that can be incorporated into other algorithms. This information will be useful for improving the efficiency and spam-detection ability of machine learning algorithms.
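To make the evaluation metric concrete, the following is a minimal sketch of one of the three algorithms named above, Naive Bayes, scored by percentage of correctly classified messages. It is not the study's implementation. The toy messages, the word-count features, the add-one (Laplace) smoothing, and all function names are assumptions made for illustration; the abstract does not specify the datasets or the feature set.

```python
import math
from collections import Counter

# Hypothetical toy messages, invented for illustration only.
# They are NOT the two datasets used in the study.
TRAIN = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]
TEST = [
    ("claim your free money", "spam"),
    ("agenda for the team meeting", "ham"),
]


def train_naive_bayes(examples):
    """Count words per class and messages per class."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, class_counts, vocab


def classify(text, model):
    """Return the class with the highest log-probability score."""
    word_counts, class_counts, vocab = model
    total_messages = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Log prior: fraction of training messages in this class.
        score = math.log(class_counts[label] / total_messages)
        words_in_class = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocab:
                continue  # assumption: ignore words never seen in training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            likelihood = (word_counts[label][word] + 1) / (words_in_class + len(vocab))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


def accuracy(examples, model):
    """Percentage of messages (spam and non-spam) classified correctly."""
    correct = sum(classify(text, model) == label for text, label in examples)
    return 100.0 * correct / len(examples)


model = train_naive_bayes(TRAIN)
print(f"Correctly classified: {accuracy(TEST, model):.1f}%")
```

A comparison like the one described would compute the same accuracy figure for each algorithm on each dataset, holding the feature set fixed. J48 is Weka's implementation of the C4.5 decision tree; it is not reproduced here.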