Abstract

Users create many posts on technical support forums, but developers and manufacturers can work on only a limited set of issues at any given time, so they must decide which issues to prioritize. Typically, the posts with the most comments or views are prioritized; however, these counts do not always reflect which issues are adversely affecting users. We therefore want a computational methodology for identifying the issues that affect users.

Previous work on this topic used lexicon-based sentiment analysis, in which the sentiment polarity of the text is analyzed to gauge the severity of an issue. However, that approach has its limitations. In this research, we use aspect-based sentiment analysis (ABSA), in which the sentiment toward each aspect of a product or service is captured. For this purpose, the SemEval 2015 Task 5 dataset is used; data is also scraped from Samsung forums and stored in a NoSQL database to improve out-of-domain ABSA results. The text is preprocessed by normalization, contraction expansion, and noise removal. Named entity recognition is also performed, using a keyword-based bootstrapping approach to capture entity names such as 'Samsung Galaxy S7'. Using Stanford CoreNLP, a dependency parse tree is generated, and rule-based ABSA (RB-ABSA) is applied to extract all aspect-entity pairs. A context resolution vector is used to accurately predict the entity associated with an aspect. Sentiment values are predicted using the SentiWordNet lexicon.
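
As a rough illustration of the preprocessing and rule-based steps described above, the following Python sketch cleans a post and then pairs aspects with opinion words from dependency triples. It is a minimal sketch under stated assumptions, not the implementation used in this work: the contraction map, the two dependency rules, the toy polarity lexicon (standing in for SentiWordNet scores), and the function names `preprocess` and `aspect_opinions` are all hypothetical, and the dependency triples are written by hand in the form a parser such as Stanford CoreNLP would be expected to produce.

```python
import re

# Hypothetical, deliberately tiny contraction map; a real one would be much larger.
CONTRACTIONS = {
    "can't": "cannot",
    "won't": "will not",
    "doesn't": "does not",
    "it's": "it is",
}

# Toy polarity lexicon standing in for SentiWordNet scores (values are made up).
POLARITY = {"great": 0.75, "slow": -0.5, "terrible": -0.875}


def preprocess(text):
    """Normalize case, expand contractions, and strip URLs and stray symbols."""
    text = text.lower().replace("\u2019", "'")      # normalize case and curly apostrophes
    text = re.sub(r"https?://\S+", " ", text)        # noise removal: URLs
    for short, full in CONTRACTIONS.items():
        text = re.sub(r"\b" + re.escape(short) + r"\b", full, text)
    text = re.sub(r"[^a-z0-9\s.,!?']", " ", text)    # noise removal: other symbols
    return re.sub(r"\s+", " ", text).strip()         # collapse whitespace


def aspect_opinions(dependencies):
    """Apply two illustrative rules to (relation, head, dependent) triples.

    amod(aspect, opinion)   e.g. "terrible battery"
    nsubj(opinion, aspect)  e.g. "the screen is great"
    Returns (aspect, opinion word, polarity) tuples.
    """
    pairs = []
    for relation, head, dependent in dependencies:
        if relation == "amod" and dependent in POLARITY:
            pairs.append((head, dependent, POLARITY[dependent]))
        elif relation == "nsubj" and head in POLARITY:
            pairs.append((dependent, head, POLARITY[head]))
    return pairs


# Hand-written triples for "terrible battery" and "the screen is great".
example_dependencies = [
    ("amod", "battery", "terrible"),
    ("nsubj", "great", "screen"),
    ("det", "screen", "the"),
]
cleaned = preprocess("My S7 won't charge!! see http://x.y/z")
pairs = aspect_opinions(example_dependencies)
```

Aspects whose paired opinion word has negative polarity (here, `battery`) are the ones that would mark a post as a candidate for prioritization.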

Using RB-ABSA, we obtained results comparable to those of modern CNN architectures. We also created a microservice that extracts aspect-entity pairs and their associated sentiment, for testing on out-of-domain data. The negative posts revealed issues or accidents involving a product, and such posts should be prioritized by developers and manufacturers. We also suggest some ranking functions that can be applied to these negative posts.