Accomplishments
Keyword-Based Information Retrieval Model for the Dark Web
- Abstract
Searching for specific information on the dark web poses unique challenges, because its content sits on encrypted networks and on websites that traditional search engines cannot find. The lack of an indexing mechanism calls for an improvised search mechanism. This paper presents a method for retrieving information on the dark web. The proposed method uses a keyword-driven ranking system to surface relevant dark web content. Within a crawled dataset of dark web websites, a given set of websites is ranked using keyword analysis, cosine similarity, and GloVe embeddings. First, keywords are selected so that they pertain to the content of the dataset. Choosing appropriate keywords determines how accurately the various topics in the crawled dataset are represented. A comprehensive keyword analysis is then performed on each website's content, scoring each keyword by its importance. A retrieval model combines these scores into a composite score for each website, which indicates how relevant that website is to a search query. The websites are then sorted in descending order of composite score. Given user-supplied keywords, the presented work identifies the most relevant results from dark web onion services.
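The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how keyword scores and embedding similarity are combined, so the weighted sum and the `alpha` parameter are assumptions, as are all function names. The tiny `TOY_EMBEDDINGS` table stands in for pretrained GloVe vectors, and the `.onion` names are placeholders.

```python
import math
from collections import Counter

# Hypothetical 3-dimensional vectors standing in for pretrained GloVe embeddings.
TOY_EMBEDDINGS = {
    "market": [0.9, 0.1, 0.0],
    "vendor": [0.8, 0.2, 0.1],
    "escrow": [0.7, 0.3, 0.0],
    "forum": [0.1, 0.9, 0.1],
    "thread": [0.0, 0.8, 0.2],
    "wallet": [0.2, 0.1, 0.9],
    "bitcoin": [0.1, 0.0, 1.0],
}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def mean_vector(tokens):
    """Average the embeddings of all known tokens; None if no token is known."""
    vectors = [TOY_EMBEDDINGS[t] for t in tokens if t in TOY_EMBEDDINGS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either is missing or zero."""
    if u is None or v is None:
        return 0.0
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def keyword_score(tokens, keywords):
    """Fraction of page tokens that match a query keyword (assumed importance proxy)."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    return sum(counts[k] for k in set(keywords)) / len(tokens)


def rank_sites(sites, query, alpha=0.5):
    """Return (site, composite_score) pairs sorted in descending order of score.

    The composite score is an assumed weighted sum of the keyword score and
    the cosine similarity between query and page embeddings.
    """
    keywords = tokenize(query)
    query_vec = mean_vector(keywords)
    scored = []
    for site, text in sites.items():
        tokens = tokenize(text)
        similarity = cosine(query_vec, mean_vector(tokens))
        composite = alpha * keyword_score(tokens, keywords) + (1 - alpha) * similarity
        scored.append((site, composite))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder crawled pages; names and text are invented for illustration.
SITES = {
    "site-a.onion": "vendor market escrow market",
    "site-b.onion": "forum thread forum",
    "site-c.onion": "bitcoin wallet market",
}

for site, score in rank_sites(SITES, "market vendor"):
    print(f"{site}\t{score:.3f}")
```

In a real setting the toy table would be replaced by vectors loaded from a pretrained GloVe file, and the keyword score could use a weighting such as TF-IDF instead of raw frequency; both choices are left open by the abstract.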