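Abstract

Recently, SNS has become established not only as a means of personal communication but also as an important marketing channel. However, cybercrime has evolved along with advances in information and communication technology, and illegal advertisements are being distributed on SNS in large volumes. As a result, theft of personal information and financial losses occur frequently. This study proposes a methodology that analyzes unstructured data, namely promotional posts delivered through SNS, to determine which posts are related to financial fraud (e.g., illegal lending and illegal door-to-door sales). The main components of the framework are the process of building training data from illegal promotional posts, the construction of input data in light of the characteristics of the data, the choice of discrimination algorithm, and the selection of the information to be extracted. The method was used in a pilot test of a local government's financial fraud prevention program, and analysis of real data verified that its accuracy in identifying financial fraud posts was far superior to that of human judgment, keyword extraction (Term Frequency), MLE, and other methods.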

Recently, SNS has become an important channel for marketing as well as personal communication. However, cybercrime has also evolved with the development of information and communication technology, and illegal advertising is now distributed on SNS in large quantities. As a result, personal information is stolen and monetary damage occurs more and more frequently. In this study, we propose a method for determining which sentences and documents posted to SNS are related to financial fraud.

First, as a conceptual framework, we developed a matrix of the conceptual characteristics of cybercriminality on SNS and of emergency management. We also suggested an emergency management process consisting of Pre-Cybercriminality steps (e.g., risk identification) and Post-Cybercriminality steps. Of these, this paper focuses on risk identification.

The main process consists of data collection, preprocessing, and analysis. First, we selected two seed words, ‘daechul’ (loan) and ‘sachae’ (private loan), and collected data containing these words from SNS such as Twitter. The collected data were given to two researchers, who decided whether each item was related to cybercriminality, particularly financial fraud. We then selected some of the vocabulary as keywords, provided the items were nominals or symbols. Using the selected keywords, we searched web sources such as Twitter, news, and blogs, and collected more than 820,000 articles.

The collected articles were refined through preprocessing and turned into learning data. Preprocessing is divided into three steps: morphological analysis, stop-word removal, and selection of valid parts of speech. In the morphological analysis step, a complex sentence is broken into morpheme units to enable mechanical analysis. In the stop-word removal step, non-lexical elements such as numbers, punctuation marks, and double spaces are removed from the text.
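The removal of non-lexical elements described above can be sketched as follows. This is only an illustration, not the authors' implementation: the abstract does not say which characters count as punctuation marks, so the `PUNCTUATION` set and the function name `remove_non_lexical` are assumptions, and the morphological analysis step is left out because it needs a language-specific analyzer.

```python
import re

# Assumption: the abstract does not list the punctuation marks that were
# removed; this small set is purely illustrative.
PUNCTUATION = ".,!?;:"


def remove_non_lexical(text: str) -> str:
    """Drop digits and punctuation, then collapse repeated whitespace."""
    # Replace every run of digits with a space.
    text = re.sub(r"\d+", " ", text)
    # Replace each assumed punctuation mark with a space.
    text = re.sub("[" + re.escape(PUNCTUATION) + "]", " ", text)
    # Collapse double (or longer) spaces and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


cleaned = remove_non_lexical("loan  100%, call now!")
```

Characters outside the assumed punctuation set, such as `%` here, pass through unchanged.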

In the step of selecting valid parts of speech, only two kinds are retained: nouns and symbols. Because nouns refer to things, they express the intent of a message better than other parts of speech. Moreover, the more illegal the text is, the more frequently symbols are used.

To turn the preprocessed data into learning data, each item must be classified as legitimate or not, so each selected item is labeled ‘legal’ or ‘illegal’. The processed data are then converted into a corpus and a Document-Term Matrix. Finally, the ‘legal’ and ‘illegal’ files were mixed and randomly divided into a learning data set and a test data set. In this study, 70% of the data were used for learning and 30% for testing.
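As a rough illustration of the last two steps, the sketch below builds a document-term matrix of raw counts from already-tokenized documents and splits a shuffled copy of the data 70/30. It assumes each document is a list of retained tokens; the function names, the fixed `seed`, and the toy tokens are hypothetical and do not come from the paper.

```python
import random
from collections import Counter


def document_term_matrix(docs):
    """Return (vocab, rows), where rows[i][j] counts term vocab[j] in docs[i]."""
    vocab = sorted({token for doc in docs for token in doc})
    rows = []
    for doc in docs:
        counts = Counter(doc)
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def split_70_30(items, seed=0):
    """Shuffle a copy of items and split it into 70% learning / 30% test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


# Toy documents: tokens kept after preprocessing (nouns and symbols).
docs = [["loan", "loan", "$"], ["news"]]
vocab, rows = document_term_matrix(docs)
learning, test = split_70_30(range(10))
```

Fixing the seed makes the split reproducible. The abstract does not say whether the authors did so.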

SVM was used as the discrimination algorithm. Since SVM requires gamma and cost values as its main parameters, we set gamma to 0.5 and cost to 10, based on the optimal value function. The cost is set higher than in general cases. To show the feasibility of the proposed idea, we compared the proposed method with MLE (Maximum Likelihood Estimation), Term Frequency, and a Collective Intelligence method. Overall accuracy was used as the metric. The overall accuracy of the proposed method was 92.41% for illegal loan advertisements and 77.75% for illegal door-to-door sales, which is superior to that of Term Frequency, MLE, and the other methods. Hence, the results suggest that the proposed method is valid and practically usable.
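To make the two quantities in this paragraph concrete, the sketch below computes a kernel value with gamma = 0.5 and an overall accuracy score. The abstract does not name the kernel; the radial basis function (RBF) kernel is assumed here only because gamma is most often its parameter. The function names and the toy labels are hypothetical, and the labels are not results from the paper.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel exp(-gamma * ||x - y||^2); the kernel choice is an assumption."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def overall_accuracy(actual, predicted):
    """Fraction of items whose predicted label matches the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Toy document-term rows and labels, for illustration only.
similarity = rbf_kernel([1, 0], [0, 0])
accuracy = overall_accuracy(
    ["illegal", "legal", "legal", "illegal"],
    ["illegal", "legal", "illegal", "illegal"],
)
```

In a soft-margin SVM, the cost parameter sets the penalty for misclassified training items, so a higher cost such as 10 fits the learning data more tightly.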

In this paper, we propose a framework for managing crises caused by abnormalities in unstructured data sources such as SNS. We hope this study will contribute to academia by identifying what to consider when applying an SVM-like discrimination algorithm to text analysis. The study should also be of use to practitioners in the fields of brand management and opinion mining.

Articles in this issue

Title, Authors, Pages
뉴스기사를 이용한 소비자의 경기심리지수 생성 = Construction of Consumer Confidence index based on Sentiment analysis using News articles 송민채, 신경식 pp.1-27

지자체 사이버 공간 안전을 위한 금융사기 탐지 텍스트 마이닝 방법 = Financial Fraud Detection using Text Mining Analysis against Municipal Cybercriminality 최석재, 이중원, 권오병 pp.119-138

공간정보기반 클러스터링을 이용한 초고속인터넷 결합유형별 해지의 지역별 특성연구 = A Study on the Regional Characteristics of Broadband Internet Termination by Coupling Type using Spatial Information based Clustering 박장혁, 박상언, 김우주 pp.45-67

웹검색 트래픽 정보를 활용한 유커 인바운드 여행 수요 예측 모형 및 유커마이닝 시스템 개발 = Development of Yóukè Mining System with Yóukè's Travel Demand and Insight Based on Web Search Traffic Information 최유지, 박도형 pp.155-175

온라인 상품평의 내용적 특성이 소비자의 인지된 유용성에 미치는 영향 = Impact of Semantic Characteristics on Perceived Helpfulness of Online Reviews 박윤주, 김경재 pp.29-44

전역 토픽의 지역 매핑을 통한 효율적 토픽 모델링 방안 = Efficient Topic Modeling by Mapping Global and Local Topics 최호창, 김남규 pp.69-94

K-Means Clustering 알고리즘과 헤도닉 모형을 활용한 서울시 연립·다세대 군집분류 방법에 관한 연구 = A Study on the Clustering Method of Row and Multiplex Housing in Seoul Using K-Means Clustering Algorithm and Hedonic Model 권순재, 김성현, 탁온식, 정현희 pp.95-118

RNN(Recurrent Neural Network)을 이용한 기업부도예측모형에서 회계정보의 동적 변화 연구 = Dynamic forecasts of bankruptcy with Recurrent Neural Network model 권혁건, 이동규, 신민수 pp.139-153
