From Brede Wiki
Text sentiment analysis (usually just sentiment analysis) is a text mining technique for analyzing the sentiment of the writer or the sentiment expressed toward the topic written about.
Sentiment analysis may employ machine learning techniques. One often applied method is the naïve Bayes classifier, where the algorithm is trained on a labeled data set. The Python package NLTK includes a classic sentiment analysis data set (movie reviews) as well as general machine learning methods for sentiment classification. Some of the earliest papers on this approach are probably
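The naïve Bayes approach can be illustrated with a minimal sketch. The code below is not NLTK's implementation; it is a small self-contained multinomial naïve Bayes classifier with add-one smoothing. The four training sentences, the labels "pos" and "neg", and the function names train_naive_bayes and classify are all invented for this illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_docs):
    """Train a multinomial naive Bayes model from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    doc_counts = Counter()              # label -> number of documents
    vocabulary = set()
    for text, label in labeled_docs:
        words = text.lower().split()
        word_counts[label].update(words)
        doc_counts[label] += 1
        vocabulary.update(words)
    return word_counts, doc_counts, vocabulary


def classify(text, model):
    """Return the label with the highest posterior log-probability."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, float("-inf")
    for label in doc_counts:
        # Log prior of the class.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data invented for this sketch (not a real corpus).
TRAINING = [
    ("a wonderful and moving film", "pos"),
    ("great acting and a great story", "pos"),
    ("a dull and boring film", "neg"),
    ("terrible acting and a boring story", "neg"),
]

model = train_naive_bayes(TRAINING)
print(classify("a great and moving story", model))
print(classify("boring and terrible", model))
```

With real data the training pairs would come from a labeled corpus such as the movie reviews, and tokenization would usually be more careful than splitting on whitespace.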
Another approach is to use a word list where each word has been scored for positivity/negativity or sentiment strength. Several word lists exist: ANEW is the oldest and has around 1,000 words, AFINN is newer and has around 2,500, while labMT has over 10,000 scored words.
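The word list approach can be sketched in a few lines. The entries in WORD_VALENCE below are made up for illustration and only mimic the style of a list with integer valences between -5 and +5; they are not copied from any of the lists named above. The function name sentiment_score is likewise hypothetical.

```python
import re

# Made-up entries in the style of a valence word list (-5 to +5).
# These are illustrative values, not taken from a published list.
WORD_VALENCE = {
    "good": 3,
    "great": 3,
    "love": 3,
    "bad": -3,
    "terrible": -3,
    "hate": -3,
}


def sentiment_score(text, word_valence=WORD_VALENCE):
    """Sum the valence of every word in the text that is on the list."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(word_valence.get(word, 0) for word in words)


print(sentiment_score("What a great movie, I love it!"))
print(sentiment_score("Terrible plot and bad acting"))
print(sentiment_score("The film runs for two hours"))
```

Words missing from the list contribute zero, so a text with no listed words scores as neutral. A plain sum like this ignores negation, so "not good" still scores as positive.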
- Affective Text
- "Affective Text: Data Annotated for Emotions and Polarity" Rada Mihalcea 
- Darmstadt Service Review Corpus
- http://www.ukp.tu-darmstadt.de/data/sentiment-analysis/darmstadt-service-review-corpus/ "consumer reviews annotated with opinion related information at the sentence and expression levels."
- Movie reviews
- A classic data set in sentiment analysis by Bo Pang and Lillian Lee. http://www.cs.cornell.edu/People/pabo/movie-review-data/ It is included in the NLTK Python package in
- Multi-Domain Sentiment Dataset
- Manually-labeled Twitter posts. The data is described here, but it is unclear if the data is publicly available.
-  In 2013 there was a Twitter sentiment analysis task with several thousand labeled postings and SMS text messages.
- Sentiment140 corpora
- Two data sets from Twitter, one of them with 498 labeled tweets.
- Twitter Sentiment Corpus
- By Niek Sanders. It "consists of 5513 hand-classified tweets".
- TASS Corpus
- Consists of 70,000 tweets in Spanish, annotated with global polarity.
- UMICH SI650 - Sentiment Classification
- A Twitter corpus described as "the training data contains 7086 sentences, already labeled with 1 (positive sentiment) or 0 (negative sentiment). The test data contains 33052 sentences that are unlabeled."
Several researchers have crawled IMDb and downloaded the text and star ratings of movie reviews.
 Affective word lists
Sentiment analysis may use word lists in which the words are annotated for arousal and for valence, i.e., whether they are positive or negative. Some word lists are listed and commented on in section 7.3 of the Pang/Lee monograph. Some of the word lists are:
- Affective Norms for English Words (ANEW)
- An English word list constructed by Bradley and Lang and available from the University of Florida. There are 1034 words rated for valence, arousal and dominance. It is "solely for use in academic, not-for-profit research at recognized educational institutions". (It is associated with a program by Greg Siegle, http://www.sci.sdsu.edu/CAL/wordlist/ ). There are also SPANEW, a Spanish ANEW, and DANEW, a Dutch ANEW.
- AFINN
- An English word list with 2477 words (previously 1468 words) constructed by Finn Årup Nielsen for sentiment analysis of Twitter messages, though it has also been used for other texts. It is available with a share-alike license. Each word is rated with a valence value from -5 to +5. An evaluation of the word list was described in A new ANEW: evaluation of a word list for sentiment analysis in microblogs, and the word list was used in Good friends, bad news - affect and virality in Twitter.
- Balanced Affective Word List ("original")
- An older version of the Balanced Affective Word List with 277 English words, associated with the program of Greg Siegle, http://www.sci.sdsu.edu/CAL/wordlist/origwordlist.html (the original URL is gone, but an Internet Archive version exists). The valence is coded as 1=positive, 2=negative, 3=anxious, 4=neutral. The words were aggregated from two lists: one list collected by Greg Siegle and Mark Shibley and another list of 240 words by Carolyn H. John from the publication Emotionality ratings and free-association norms of 240 emotional and non-emotional words.
- Berlin Affective Word List (BAWL)
- A word list of 2,200 German words with ratings of emotional valence and imageability. A research project took some of these words as part of the basis for an annotated word list of 300 English words.
- Berlin Affective Word List Reloaded (BAWL-R)
- A newer version of BAWL that adds arousal ratings for the words.
- Bilingual Finnish Affective Norms
- 210 British English and Finnish nouns, including taboo words. 
- Compass DeRose Guide to Emotion Words
- English emotional words collected by Steven J. DeRose and categorized but without valence or arousal. http://www.derose.net/steve/resources/emotionwords/ewords.html
- Dictionary of Affect in Language (DAL)
- Constructed by Cynthia M. Whissell. A description of it seems to be available as a chapter in the book Emotion: theory, research, and experience (pp. 113-131), edited by Robert Plutchik and Henry Kellerman and published by Academic Press. One Web service uses DAL. The list has also been called "Whissell's Dictionary of Affect in Language" (WDAL).
- General Inquirer
- Has several dictionaries, e.g., a "positive" list with 1,915 words and a "negative" list with 2,291 words. http://www.wjh.harvard.edu/~inquirer/homecat.htm
- Hu-Liu opinion lexicon (HL)
- Around 6800 words in a negative and a positive list, collected over the years starting with the paper Mining and summarizing customer reviews.
- A large word list
- Leipzig Affective Norms for German (LANG)
- "A list of 1,000 German nouns that have been rated for emotional valence, arousal, and concreteness" http://www.springerlink.com/content/m244118283586754/supplementals/ .
- Linguistic Inquiry and Word Count
- Commercial ($90) word lists with a computer program to extract basic counts and ratios. Contains dictionaries for English, German, Spanish, Dutch, and Italian. Extracts around 60 different word categories, including "positive emotions" and "negative emotions". The program can be purchased; their site also allows you to analyze texts one by one.
- Loughran and McDonald Financial Sentiment Dictionaries
- Dictionaries with negative, positive, uncertainty, litigious and modal words, especially for financial texts, by Tim Loughran and Bill McDonald. The lists are "Not for commercial use without authorization". Described in When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks.
- NRC Emotion Lexicon
- (EmoLex) A large word list constructed by Saif M. Mohammad through Amazon Mechanical Turk.
- NRC Hashtag Sentiment Lexicon
- A large list of words created from 775,310 tweets with a positive or negative hashtag.
- NTU Sentiment Dictionary
- (Listed by Pang and Lee)
- Luis von Ahn's Offensive/Profane Word List
- "1,300+ English terms that could be found offensive."
- OpinionFinder's Subjectivity Lexicon
- http://mpqa.cs.pitt.edu/lexicons/subj_lexicon/ 8221 words scored for polarity (positive or negative) and subjectivity. It distinguishes between part-of-speech tags. It is sometimes referred to as MPQA.
- The Pattern Python package includes the file sentiment.xml, which has 2888 words scored for polarity, subjectivity, intensity and reliability. The words are mostly adjectives. There are no nouns.
- Subjectivity and Sentiment Analysis of Social Media Arabic by Muhammad Abdul-Mageed and Mona T. Diab. Not clear whether it is available. See also Toward building a large-scale Arabic sentiment lexicon.
-  It "consists of 5,496 words and 2,190 synsets labeled with an emotion from a set of 14 emotional categories"
- SentiWordNet
- Assigns three sentiment scores to each WordNet synset: positivity, negativity and objectivity. The license used to be "only for research, non-profit purposes", but has now changed to CC-BY-SA. The 3.0 version was described in 2010. http://sentiwordnet.isti.cnr.it/ See also the Python interface at https://bitbucket.org/jaganadhg/pysentiwn/wiki/Home
- Taboada and Grieve's Turney adjective list
- (listed in Pang and Lee) available through Yahoo! sentimentAI group.
-  13,915 English words with valence, arousal and dominance collected with Amazon Mechanical Turk. The word list is licensed under CC-BY-NC-ND.
- WordNet-Affect
- An English list. Originally "freely available, for research purposes". Now part of WordNet Domains, which is distributed under CC-BY. See http://wndomains.fbk.eu/wnaffect.html and http://wndomains.fbk.eu/download.html
For comparison of the different word lists see Enhancing lexicon-based review classification by merging and revising sentiment dictionaries and A new ANEW: evaluation of a word list for sentiment analysis in microblogs.
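One simple way to compare two word lists is to bring their scores onto a common scale and check how often they agree on the sign for the words they share. The sketch below is only an illustration of that idea and is not the method of the papers cited above. It assumes one list scored from 1 to 9 and one scored from -5 to +5; the entries in LIST_A and LIST_B and the function names rescale and polarity_agreement are invented.

```python
def rescale(scores, low, high):
    """Map scores from the range [low, high] linearly onto [-1, 1]."""
    mid = (low + high) / 2
    half = (high - low) / 2
    return {word: (value - mid) / half for word, value in scores.items()}


def polarity_agreement(list_a, list_b):
    """Fraction of shared words to which both lists give the same sign."""
    shared = [word for word in list_a if word in list_b]
    if not shared:
        return None  # nothing to compare
    same = sum(1 for word in shared if list_a[word] * list_b[word] > 0)
    return same / len(shared)


# Made-up entries; the scales (1..9 and -5..+5) are assumptions
# chosen for this sketch.
LIST_A = {"happy": 8.0, "sad": 2.0, "calm": 6.0, "fight": 4.0}
LIST_B = {"happy": 3, "sad": -2, "fight": 1, "love": 3}

a = rescale(LIST_A, 1, 9)
b = rescale(LIST_B, -5, 5)
print(sorted(set(a) & set(b)))   # words present in both lists
print(polarity_agreement(a, b))  # share of words with matching sign
```

Here the two toy lists share three words and disagree on the sign of one of them. Words that sit exactly at the neutral midpoint of either list count as disagreements in this sketch.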
- AFINN, an affective word list. Code exists in several programming languages.
- Pattern, Python library.
- sasa-tool, USC SAIL/AIL sentiment analysis tool.
- Senti by CrowdFlower, commercial crowd-based service
See also list by Seth Grimes in What are the most powerful open-source sentiment-analysis tools?
 Online services
- http://sentimentalytics.com - a browser plug-in that automatically analyzes social media content (including sentiment)
- http://neuro.imm.dtu.dk/cgi-bin/brede_str_nmf Sentiment-topic mining
- http://www.sentigem.com — does this work?
- ConveyAPI. As of June 2013 it seemed to be vaporware: "currently offering free a evaluation of the ConveyAPI to select companies."
- Bitext, demo available at http://svc8.bitext.com/API-demo/
- Workshop on sentiment and subjectivity in text COLING ACL 2006
- Extracting opinions, opinion holders, and topics expressed in online news media text
- First International CIKM Workshop on Topic-Sentiment Analysis for Mass Opinion Measurement, 2009.
- 1st Workshop on Opinion Mining and Sentiment Analysis, 2009.
- ICDM11 workshop on opinion mining and sentiment analysis
- Bing Liu
- Finn Årup Nielsen, AFINN
- Mike Thelwall, SentiStrength
- Peter D. Turney, unsupervised sentiment analysis
- Saif M. Mohammad, NRC Emotion Lexicon, SemEval winner.
- A new ANEW: evaluation of a word list for sentiment analysis in microblogs
- Building lexicon for sentiment analysis from massive collection of HTML documents
- Combining social network analysis and sentiment analysis to explore the potential for online radicalisation
- Crowd sentiment detection during disasters and crises
- Determining the sentiment of opinions
- Domain specific affective classification of documents
- Good friends, bad news - affect and virality in Twitter
- Large-scale sentiment analysis for news and blogs
- Leveraging textual sentiment analysis with social network modeling
- Micro-blogging sentiment detection by collaborative online learning
- Mining the peanut gallery: opinion extraction and semantic classification of product reviews
- Negative emotions accelerating users activity in BBC Forum
- Pattern for Python
- Quantitative analysis of bloggers collective behavior powered by emotions
- Robust sentiment detection on Twitter from biased and noisy data
- Sentiment analysis with global topics and local dependency
- Sentiment strength detection in short informal text
- Tweetin' in the rain: exploring societal-scale effects of weather on mood
- Using emoticons to reduce dependency in machine learning techniques for sentiment classification
- Using verbs and adjectives to automatically classify blog sentiment
 See also
 External link
- ↑ Bo Pang, Lillian Lee (2008). "Opinion mining and sentiment analysis". Foundations and Trends in Information Retrieval 2(1-2): 1-135.
- ↑ Qiaozhu Mei, Xu Ling, Matthew Wondra, Hang Su, ChengXiang Zhai (2007). "Topic sentiment mixture: modeling facets and opinions in weblogs".
- ↑ Words with attitude
- ↑ The measurement of meaning
- ↑ I. Ounis, C. MacDonald, I. Soboroff. "Overview of the TREC-2008 blog track". The TREC 2008 Proceedings.
- ↑ Sentence and expression level annotation of opinions in user-generated discourse
- ↑ Recursive deep models for semantic compositionality over a sentiment treebank
- ↑ Domain specific affective classification of documents
- ↑ Margaret M. Bradley, Peter J. Lang. (1999). Affective norms for English words (ANEW). Gainesville, FL. The NIMH Center for the Study of Emotion and Attention, University of Florida.
- ↑ The Spanish adaptation of ANEW (affective norms for English words)
- ↑ The QWERTY effect: how typing shapes the meanings of words
- ↑ Carolyn H. John (1988). "Emotionality ratings and free-association norms of 240 emotional and non-emotional words". Cognition & Emotion 2(1): 49-70. doi: 10.1080/02699938808415229.
- ↑ Melissa L.-H. Võ, Arthur M. Jacobs, Markus Conrad (2006). "Cross-validating the Berlin Affective Word List". Behavior Research Methods 38(4): 606-609.
- ↑ Evaluation of lexical and semantic features for English emotion words
- ↑ Melissa L.-H. Võ, Markus Conrad, Lars Kuchinke, Karolina Urton, Markus J. Hofmann, Arthur M. Jacobs (2009). "The Berlin Affective Word List Reloaded (BAWL-R)". Behavior Research Methods 41: 534-538. doi: 10.3758/BRM.41.2.534.
- ↑ Tiina M. Eilola, Jelena Havelka (2010). "Affective norms for 210 British English and Finnish nouns". Behavior Research Methods 42(1): 134-140. PMID: 20160293.
- ↑ Let me listen to poetry, let me see emotions
- ↑ P. Kanske, S. A. Kotz (2010). "Leipzig Affective Norms for German: A reliability study". Behav Res Methods 42(4): 987-991. PMID: 21139165.
- ↑ Norms of valence, arousal, dominance, and age of acquisition for 4300 Dutch words
- ↑ NRC-Canada: building the state-of-the-art in sentiment analysis of tweets
- ↑ Theresa Wilson, Janyce Wiebe, Paul Hoffmann (2005). "Recognizing contextual polarity in phrase-level sentiment analysis". Proc. of HLT-EMNLP-2005.
- ↑ NRC-Canada: building the state-of-the-art in sentiment analysis of Tweets
- ↑ SentiSense: an easily scalable concept-based affective lexicon for sentiment analysis
- ↑ Andrea Esuli, Fabrizio Sebastiani. "SentiWordNet: a publicly available lexical resource for opinion mining".
- ↑ http://sentiwordnet.isti.cnr.it/
- ↑ Stefano Baccianella, Andrea Esuli, Fabrizio Sebastiani (2010). "SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining". Pages 2200-2204 in Proceedings of LREC-10, 7th Conference on Language Resources and Evaluation.
- ↑ Gökçay D., Smith MA., "TÜDADEN:Türkçede Duygusal ve Anlamsal Değerlendirmeli Norm Veri Tabanı", Proceedings of Brain-Computer Workshop 4, 2008, Istanbul.
- ↑ Norms of valence, arousal, and dominance for 13,915 English lemmas
- ↑ WordNet-Affect: an affective extension of WordNet
- ↑ A. Valitutti, C. Strapparava, O. Stock (2004). "Developing affective lexical resources". PsychNology Journal 2(1): 61-83.
- ↑ http://wndomains.fbk.eu/download.html
- Carlo Strapparava, Rada Mihalcea (2008). "Learning to identify emotions in text". Pages 1556-1560 in PSAC '08: Proceedings of the 2008 ACM symposium on Applied computing. doi: http://doi.acm.org/10.1145/1363686.1364052.
- Understanding sentiment of people from news articles: temporal sentiment analysis of social events