As online activity has grown with the development of the surface web and social networking services (SNS), the exposure of users' private information on websites has also increased. In addition, because users increasingly share images alongside text, private information contained in images can now be collected as well. OSINT (Open Source Intelligence) therefore appears useful in criminal investigations that require collecting information about a suspect. Although OSINT is already in use in the United States, existing approaches are not well suited to the Korean environment, so this study proposes an approach that reflects it.
In this paper, we propose a workflow for collecting private information that takes the Korean situation into account. The workflow collects as much information as possible along two tracks, a personal workflow and an image workflow. This manual workflow has the advantage of covering more diverse areas of information, but the disadvantage that the collected material is difficult to analyze quickly. Therefore, to analyze the collected images automatically, we classify them with various artificial intelligence models according to the type of private information they contain, and we propose a method for performing further automatic analysis on the classified images. Images are classified into five types: images with fixed information, images with movement information, images with unique information, images with private information, and images with text. Each type is then analyzed by applying a different artificial intelligence model suited to its characteristics. In addition, we design a NoSQL database schema to store and manage the collected and analyzed information.
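As a rough illustration of the classify-then-analyze step described above, the following sketch routes an already-classified image to an analyzer chosen by its type and returns a document-style record of the kind a NoSQL store could hold. The five type names come from the text. The analyzer functions, field names, and record layout are hypothetical placeholders, not the models or schema used in this work.

```python
from enum import Enum
from typing import Callable, Dict


class ImageType(Enum):
    # The five image types named in the text.
    FIXED = "fixed information"
    MOVEMENT = "movement information"
    UNIQUE = "unique information"
    PRIVATE = "private information"
    TEXT = "text"


def _placeholder(name: str) -> Callable[[str], dict]:
    # Hypothetical stand-in for a type-specific AI model.
    def analyze(image_path: str) -> dict:
        return {"analyzer": name, "findings": []}
    return analyze


# One analyzer per image type (assumed names, for illustration only).
ANALYZERS: Dict[ImageType, Callable[[str], dict]] = {
    t: _placeholder(f"{t.name.lower()}_analyzer") for t in ImageType
}


def analyze_image(image_path: str, image_type: ImageType) -> dict:
    """Route a classified image to its analyzer and build a document-style record."""
    result = ANALYZERS[image_type](image_path)
    return {
        "image_path": image_path,
        "image_type": image_type.value,
        "analysis": result,
    }


record = analyze_image("post_001.jpg", ImageType.TEXT)
```

A dictionary keyed by image type keeps the routing explicit, so a real model could replace any single placeholder without touching the others.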
The proposed method can serve as an auxiliary profiling tool in the work of the prosecutors' office, and the collected and analyzed private information can be used as digital evidence. We also hope that this thesis will encourage research on protecting private information contained in publicly available images, and that artificial intelligence will in the future be applied to the entire process, contributing to research on automating the extraction and analysis of private information.