Accomplishments
Summarizing Dark Web Services with TF-IDF and LSA
- Abstract
Identifying what an individual dark web service offers is challenging, but several methods exist to address the problem. Automated labeling of dark web services can draw on NLP, sentiment analysis, computer vision, and deep learning, with approaches ranging from keyword-based to deep learning-based labeling. The proposed work uses TF-IDF and Latent Semantic Analysis (LSA) to summarize dark web services. LSA uncovers latent relationships between words and concepts in a corpus of dark web services. By summarizing the services in this way, the authors identify the most important keywords and concepts, which provide insight into the types of services available on the dark web. The identified key terms can potentially serve as labels for individual dark web services. The authors demonstrate the proposed approach with word cloud visualizations. The method can be particularly valuable for large volumes of unlabeled dark web text, where it can surface patterns that would otherwise be hard to detect.
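To make the pipeline described in the abstract concrete, the following is a minimal sketch of TF-IDF weighting followed by LSA, computed as a singular value decomposition of the TF-IDF matrix. It is an illustration under stated assumptions, not the authors' implementation: the toy corpus, the whitespace tokenizer, the smoothed IDF formula, and the helper names `tfidf_matrix` and `lsa_top_terms` are all hypothetical.

```python
import math
from collections import Counter

import numpy as np

# Hypothetical toy corpus standing in for text scraped from dark web services.
docs = [
    "buy bitcoin wallet escrow market",
    "market escrow vendor bitcoin payment",
    "anonymous email hosting privacy server",
    "secure hosting server anonymous email",
]


def tfidf_matrix(texts):
    """Return (vocabulary, documents-by-terms TF-IDF matrix)."""
    # Assumption: lowercase whitespace tokenization is enough for this sketch.
    tokenized = [t.lower().split() for t in texts]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: j for j, w in enumerate(vocab)}
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(w for toks in tokenized for w in set(toks))
    X = np.zeros((n_docs, len(vocab)))
    for i, toks in enumerate(tokenized):
        for w, count in Counter(toks).items():
            tf = count / len(toks)
            # Assumption: a smoothed IDF variant; other variants would also work.
            idf = math.log((1 + n_docs) / (1 + df[w])) + 1
            X[i, index[w]] = tf * idf
    return vocab, X


def lsa_top_terms(texts, n_topics=2, n_terms=3):
    """Return the highest-weighted terms for each of the first latent topics."""
    vocab, X = tfidf_matrix(texts)
    # Rows of Vt are latent topics expressed as weights over the vocabulary.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    topics = []
    for row in Vt[:n_topics]:
        # Rank terms by absolute weight, since the sign of a singular vector is arbitrary.
        top = np.argsort(-np.abs(row))[:n_terms]
        topics.append([vocab[j] for j in top])
    return topics


topics = lsa_top_terms(docs)
for k, terms in enumerate(topics):
    print(f"topic {k}: {', '.join(terms)}")
```

The terms returned for each latent topic are the kind of key terms that could serve as candidate labels, and their weights could feed a word cloud. On a real corpus, stop-word removal and a larger number of topics would likely be needed.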