Computers facilitate many kinds of crime, including hacking, drug trafficking, and child pornography. Over 75% of criminals store plans on their PCs or laptops. Investigating officers extract evidence from suspects’ machines, but the rise in computer-related crime calls for specialized forensic tools that make the process more efficient than a manual search. Inspired by the forensic process of the Sûreté du Québec (SQ), we introduce a subject-based semantic document clustering model. It helps investigators group the documents on a suspect’s computer into overlapping clusters, each corresponding to a subject of interest defined by the investigator. This subject-based grouping is what distinguishes our system from other forensic tools.
Our ‘Subject Based Forensic Investigation’ project comprises four modules. ‘Registration’ is the first module and works like standard user registration. Several users can act as investigators, and each must register to access the system’s features.
After registering, the investigator logs in to the system with the registered user ID and password. All further processing is available only after login.
Subject Vector Expansion
The primary goal of this module is to expand the input vectors by suggesting nouns and verbs related to the subject. The ‘Subject Vector Expansion’ algorithm looks up synonyms of the input terms in WordNet and builds expansion vectors from them. These vectors are stored in a database and serve as the synonym lists for document clustering. They drive the data mining on the suspect’s machine and make it possible to inspect the files found there.
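The expansion step can be illustrated with a minimal sketch. It is not the project's actual algorithm: the function names, the table layout, and the small `TOY_THESAURUS` dictionary are assumptions made for illustration, with the dictionary standing in for WordNet so that the example runs without external data. In a real setup the `lookup` argument could be backed by a WordNet interface such as the one shipped with NLTK.

```python
import sqlite3

# Hypothetical stand-in for WordNet: maps a term to related nouns and verbs.
TOY_THESAURUS = {
    "drug": ["narcotic", "medication", "dose"],
    "hack": ["hacker", "cyberpunk", "chop"],
}


def expand_subject(terms, lookup):
    """Return the input terms plus their synonyms, lower-cased and de-duplicated.

    `lookup` is any callable that maps a term to a list of related words.
    """
    expanded = []
    seen = set()
    for term in terms:
        for word in [term, *lookup(term)]:
            # WordNet-style lemma names use underscores for multi-word terms.
            word = word.lower().replace("_", " ")
            if word not in seen:
                seen.add(word)
                expanded.append(word)
    return expanded


def store_expansion(conn, subject, words):
    """Persist an expansion vector (assumed schema: one row per subject/word)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS expansion ("
        "subject TEXT, word TEXT, PRIMARY KEY (subject, word))"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO expansion VALUES (?, ?)",
        [(subject, word) for word in words],
    )
    conn.commit()


def load_expansion(conn, subject):
    """Read the stored expansion vector back in insertion order."""
    rows = conn.execute(
        "SELECT word FROM expansion WHERE subject = ? ORDER BY rowid",
        (subject,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    vector = expand_subject(["drug"], lambda t: TOY_THESAURUS.get(t.lower(), []))
    store_expansion(conn, "drugs", vector)
    print(load_expansion(conn, "drugs"))
```

Passing the synonym source in as a callable keeps the expansion logic independent of where the synonyms come from, so the toy dictionary can be swapped for a real lexical database without touching the rest of the code.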
The clustering module applies a ‘Clustering Algorithm’ to the suspect’s files. The keywords in each file are compared with the synonym lists, and every matching file is moved to a new folder. This module is central to our investigation strategy, because the evidence is gathered from the clustered folder.
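The matching step described above can be sketched roughly as follows. This is an illustrative approximation rather than the project's clustering algorithm: the function names, the restriction to `.txt` files, and the single-word tokenizer are all assumptions. The sketch also copies files instead of moving them, on the assumption that a file may belong to several subjects and that the originals should stay untouched.

```python
import re
import shutil
from pathlib import Path


def tokenize(text):
    """Split text into a set of lower-case alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def cluster_files(source_dir, output_dir, subjects):
    """Copy each text file into the folder of every subject it matches.

    `subjects` maps a subject name to its list of expansion words.
    Returns a dict mapping each subject name to the matched file names.
    Only single-word terms are matched in this simplified version.
    """
    source_dir, output_dir = Path(source_dir), Path(output_dir)
    clusters = {name: [] for name in subjects}
    for path in sorted(source_dir.rglob("*.txt")):
        words = tokenize(path.read_text(encoding="utf-8", errors="ignore"))
        for name, vocabulary in subjects.items():
            # A file joins a cluster if it shares at least one word with it.
            if words & set(vocabulary):
                target = output_dir / name
                target.mkdir(parents=True, exist_ok=True)
                shutil.copy2(path, target / path.name)
                clusters[name].append(path.name)
    return clusters
```

A file that mentions words from two subjects ends up in both subject folders, which gives the overlapping clusters the model calls for. The output folder should lie outside the scanned folder so that copied files are not scanned again.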