A new ANEW: evaluation of a word list for sentiment analysis in microblogs
|Conference paper|
|A new ANEW: evaluation of a word list for sentiment analysis in microblogs|
|Authors:||Finn Årup Nielsen|
|Citation:||Proceedings of the ESWC2011 Workshop on 'Making Sense of Microposts': Big things come in small packages, CEUR Workshop Proceedings 718: 93-98. May 2011|
|Editors:||Matthew Rowe, Milan Stankovic, Aba-Sah Dadzie, Mariann Hardey|
|Meeting:||Making Sense of Microposts|
|Database(s):||arXiv (arxiv/1103.2903), CiteULike, Google Scholar, Microsoft Academic Search|
|Restricted:||DTU Digital Library|
The word list is available from:
It is also part of the "afinn" Python module (a minimal usage sketch is given below):
The Python software used for the paper is available from:
Slides from the Making Sense of Microposts workshop presentation:
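As a pointer for the "afinn" module mentioned above, here is a minimal usage sketch; the Afinn class and its score method follow the package's documented interface, but the example sentences are invented for illustration.

```python
# Minimal usage sketch of the "afinn" package (install with: pip install afinn).
# The Afinn class and score() follow the package's documented interface;
# the example sentences below are invented for illustration.
from afinn import Afinn

afinn = Afinn()

print(afinn.score("This is utterly excellent!"))   # positive valence sum
print(afinn.score("This is utterly horrible..."))  # negative valence sum
```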
Sentiment analysis of microblogs such as Twitter has recently gained a fair amount of attention. One of the simplest sentiment analysis approaches compares the words of a posting against a labeled word list in which each word has been scored for valence, a so-called "sentiment lexicon" or "affective word list". Several affective word lists exist, e.g., ANEW (Affective Norms for English Words), developed before the advent of microblogging and sentiment analysis. I wanted to examine how well ANEW and other word lists perform for the detection of sentiment strength in microblog posts, in comparison with a new word list specifically constructed for microblogs. I used manually labeled postings from Twitter scored for sentiment. Using simple word matching, I show that the new word list may perform better than ANEW, though not as well as the more elaborate approach found in SentiStrength.
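To make the word-matching approach described above concrete, the sketch below sums the valence scores of posting words found in a small lexicon. The toy lexicon and the sum-of-valences scoring convention are illustrative assumptions, not the actual AFINN word list or the paper's exact scoring.

```python
import re

# Toy AFINN-style lexicon: word -> valence in the range -5..+5.
# These few entries are invented; the real list contains thousands of
# manually scored words.
LEXICON = {
    "good": 3, "great": 3, "love": 3, "nice": 3,
    "bad": -3, "awful": -3, "hate": -3, "sad": -2,
}

def sentiment_score(text: str) -> int:
    """Sum the valence of every lexicon word found in the posting."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)

print(sentiment_score("I love this, it is so nice"))     # positive score
print(sentiment_score("Awful service, I hate waiting"))  # negative score
```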
 Related papers
- Analyzing customer sentiments in microblogs - a topic-model-based approach for Twitter datasets
- Building lexicon for sentiment analysis from massive collection of HTML documents
- Classifying sentiment in microblogs: is brevity an advantage?
- Feature sentiment diversification of user generated reviews: the FREuD approach
- Good friends, bad news - affect and virality in Twitter
- Sentence level sentiment analysis in the presence of conjuncts using linguistic analysis
- SentiFul: generating a reliable lexicon for sentiment analysis
- Sentiment analysis of Twitter data
- Sentiment strength detection in short informal text
- Sentiment-based text segmentation
- SentiSense: an easily scalable concept-based affective lexicon for sentiment analysis
 Use of word list
For further papers, see AFINN; that list is more complete.
- Aesthetic considerations for automated platformer design (2012)
- Crowd sentiment detection during disasters and crises
- Good friends, bad news - affect and virality in Twitter (2011)
- Networks and language in the 2010 election
- Retweets--but not just retweets: quantifying and predicting influence on Twitter (2012)
- Semi-automated argumentative analysis of online product reviews (2012)
- Summarization of yes/no questions using a feature function model (2011)
- The QWERTY effect: how typing shapes the meanings of words (2012)
- Tracking US Sentiments Over Time In Wikileaks (also posted separately under the same title)
- Painting a Novel
 Other mentions
- Increasing the willingness to collaborate online: an analysis of sentiment-driven interactions in peer content production
 Critique
- Performance is only reported as correlation, not with more standard performance metrics such as accuracy, F1, precision, and recall (a small illustrative sketch follows this list).
- There are other, larger word lists than those tested, and these might perform better.
- It is not compared with the state of the art, which would probably entail some machine learning and better features, e.g., emoticons, negativity detection, and so on.
- There is no in-depth examination of why the sentiment analyzer fails on specific posts.
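To illustrate the first critique point, the sketch below contrasts a correlation-based evaluation of sentiment-strength predictions with accuracy, precision, recall, and F1 computed after thresholding to positive/negative classes. The gold and predicted values and the sign-based thresholding are invented for illustration and are not taken from the paper.

```python
# Correlation vs. classification metrics for sentiment-strength evaluation.
# The gold/predicted strengths below are invented for illustration.
from scipy.stats import pearsonr
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

gold = [3, -2, 1, -4, 2, -1]   # manually labeled sentiment strengths
pred = [2, -1, 0, -3, 3, 1]    # scores from a word-list matcher

# Correlation-style evaluation (the kind of figure the critique says the paper reports).
r, _ = pearsonr(gold, pred)
print(f"Pearson r = {r:.2f}")

# Classification-style evaluation: collapse strengths to positive vs. non-positive.
gold_cls = [1 if g > 0 else 0 for g in gold]
pred_cls = [1 if p > 0 else 0 for p in pred]
print("accuracy :", accuracy_score(gold_cls, pred_cls))
print("precision:", precision_score(gold_cls, pred_cls))
print("recall   :", recall_score(gold_cls, pred_cls))
print("F1       :", f1_score(gold_cls, pred_cls))
```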