Sentiment strength detection in short informal text
|Authors:||Mike Thelwall, Kevan Buckley, Georgios Paltoglou, Di Cai, Arvid Kappas|
|Citation:||Journal of the American Society for Information Science and Technology 61 (12): 2544-2558. 2010 December|
Sentiment in short strength detection informal text (the title should have been "Sentiment strength detection in short informal text"; an erratum to the article corrects it) reports the development of a sentiment analysis system for estimating the strength of sentiment in short texts (MySpace comments). The authors call the system SentiStrength.
The researchers had 5 human coders label MySpace comments for sentiment. Positive and negative sentiment were labeled independently on two 5-point scales.
SentiStrength uses these features:
- Word list
- Score adjustment
- "miss" word
- Spelling correction
- Booster words
- Spelling boosting
Furthermore these features were considered:
- Phrase identification
- Semantic disambiguation
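The word list, booster word, and spelling features listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny lexicon, the booster values, the rule that three or more repeated letters add one point of strength, and the choice to report the strongest positive and strongest negative word on scales of 1 to 5 and -1 to -5 are all assumptions made for illustration.

```python
import re

# Hypothetical toy lexicon (assumption): word -> sentiment strength.
WORD_STRENGTH = {"love": 3, "good": 2, "hate": -4, "bad": -2}
# Hypothetical booster words (assumption): adjustment to the next sentiment word.
BOOSTERS = {"very": 1, "really": 1, "somewhat": -1}


def normalise(token):
    """Map a token with repeated letters (e.g. 'loooove') to a lexicon word.

    Returns (word, emphasised), where emphasised is True if repeated
    letters had to be removed to find the word in the lexicon.
    """
    if token in WORD_STRENGTH or token in BOOSTERS:
        return token, False
    # Try squeezing runs of 3+ identical letters down to two, then to one.
    for replacement in (r"\1\1", r"\1"):
        candidate = re.sub(r"([a-z])\1{2,}", replacement, token)
        if candidate in WORD_STRENGTH:
            return candidate, candidate != token
    return token, False


def score(text):
    """Return (positive, negative) strengths in 1..5 and -1..-5."""
    positive, negative = 1, -1  # neutral baseline on both scales
    boost = 0
    for raw in re.findall(r"[a-z]+", text.lower()):
        if raw in BOOSTERS:
            boost = BOOSTERS[raw]  # applies to the next word only
            continue
        word, emphasised = normalise(raw)
        strength = WORD_STRENGTH.get(word)
        if strength is not None:
            magnitude = abs(strength) + boost + (1 if emphasised else 0)
            magnitude = max(1, min(5, magnitude))  # clamp to the 5-point scale
            if strength > 0:
                positive = max(positive, magnitude)
            else:
                negative = min(negative, -magnitude)
        boost = 0
    return positive, negative


print(score("I really loooove this"))
print(score("good but I hate Mondays"))
```

Keeping the two scales separate lets a single comment carry both a positive and a negative strength, which matches the independent labeling described above.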
The SentiStrength algorithm was compared with "a range of standard machine-learning classification algorithms in Weka (Witten & Frank, 2005) using the frequencies of each word in the sentiment word list as the feature set." (page 2550).
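The feature set used for the machine-learning baselines can be sketched as a vector of word counts, one entry per word in the sentiment word list. The study used Weka; the Python below only illustrates how such a feature vector might be built, and the four-word vocabulary and the function name are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stand-in for the sentiment word list (assumption).
SENTIMENT_WORDS = ["love", "good", "hate", "bad"]


def word_frequency_features(text, vocabulary=SENTIMENT_WORDS):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    # One feature per vocabulary word, in a fixed order.
    return [counts[word] for word in vocabulary]


print(word_frequency_features("Good good day, I love it"))
```

Each comment then becomes one row of counts that a standard classifier can consume together with the human-coded strength as the target.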
- They found Pearson correlation coefficients of 0.639-0.664 for the agreement between 3 human coders of sentiment strength on 1,041 MySpace comments.
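As an illustration of the agreement measure, the Pearson correlation between two coders' strength ratings can be computed as in the sketch below. The ratings are invented for the example and are not data from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return covariance / sqrt(var_x * var_y)


# Invented ratings on a 5-point scale (illustration only).
coder_a = [1, 2, 3, 4, 5]
coder_b = [1, 3, 3, 4, 4]
print(round(pearson(coder_a, coder_b), 3))
```

With three coders, the measure would be computed for each pair of coders, which gives a range of values like the one reported above.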
Related studies
- A new ANEW: evaluation of a word list for sentiment analysis in microblogs
- Micro-blogging sentiment detection by collaborative online learning
- Recognizing contextual polarity: an exploration of features for phrase-level sentiment analysis also considers short text sentiment analysis
- Robust sentiment detection on Twitter from biased and noisy data
- Sentiment in Twitter events is a newer study by the first author, where SentiStrength is used for Twitter sentiment analysis.