Accomplishments
PhishNet: a comprehensive system for detecting phishing websites using machine learning
- Abstract
Phishing remains a persistent and difficult cybersecurity threat that relies on counterfeit Uniform Resource Locators (URLs) to appear legitimate and lure unsuspecting users. This article presents a multi-layered phishing detection framework that combines machine learning with external threat intelligence to improve detection robustness and accuracy. A Random Forest classifier trained on 651,192 URLs performs the initial classification using lexical patterns and domain-level attributes. Secondary validation pipelines, comprising WHOIS metadata checks, VirusTotal threat intelligence, and typosquatting detection through string-similarity algorithms, are then applied to compensate for the limitations of any single machine learning method. A composite risk-scoring algorithm aggregates these indicators to identify probable phishing with high confidence. Continuous logging provides auditability and supports ongoing improvement, while automated abuse-reporting and response modules enable fast threat mitigation. Through this multi-layered approach, the system can proactively detect and counter evolving phishing threats in real time.
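To make the aggregation step concrete, the following is a minimal sketch of how a typosquatting check and a composite risk score could be combined. It is not the system's actual implementation: the brand list `KNOWN_BRANDS`, the `WEIGHTS` values, the 30-day domain-age cutoff, the 0.8 similarity threshold, and the function names are all hypothetical, and `difflib.SequenceMatcher` stands in for whichever string-similarity algorithm the framework uses. The WHOIS and VirusTotal inputs are passed in as plain numbers rather than fetched from those services.

```python
from difflib import SequenceMatcher
from urllib.parse import urlparse

# Hypothetical watch list of brand domains (assumption, not from the article).
KNOWN_BRANDS = ["paypal.com", "google.com", "microsoft.com"]

# Hypothetical weights (assumption); the article does not state how
# the individual indicators are weighted.
WEIGHTS = {"ml": 0.5, "whois": 0.15, "virustotal": 0.2, "typosquat": 0.15}


def typosquat_similarity(url: str) -> float:
    """Highest similarity between the URL's host and a watched brand.

    Returns 0.0 when the host is exactly a watched brand, since an
    exact match is the legitimate domain rather than a look-alike.
    """
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host in KNOWN_BRANDS:
        return 0.0
    return max(SequenceMatcher(None, host, brand).ratio() for brand in KNOWN_BRANDS)


def composite_risk(ml_prob: float, domain_age_days: int,
                   vt_detections: int, vt_engines: int, url: str) -> float:
    """Weighted sum of the four indicators, in the range [0, 1]."""
    # Very young domains are treated as suspicious (assumed 30-day cutoff).
    whois_signal = 1.0 if domain_age_days < 30 else 0.0
    # Fraction of threat-intelligence engines that flagged the URL.
    vt_signal = vt_detections / vt_engines if vt_engines else 0.0
    # Only near-matches count as typosquatting (assumed 0.8 threshold).
    similarity = typosquat_similarity(url)
    typo_signal = similarity if similarity >= 0.8 else 0.0
    score = (WEIGHTS["ml"] * ml_prob
             + WEIGHTS["whois"] * whois_signal
             + WEIGHTS["virustotal"] * vt_signal
             + WEIGHTS["typosquat"] * typo_signal)
    return round(score, 3)
```

In this sketch, a look-alike host such as `paypa1.com` on a newly registered domain scores well above an established, unflagged domain, even when the classifier's own probability is only moderate. This is the sense in which secondary checks can compensate for a single model.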