Pattern (software)

From Brede Wiki
Pattern
Description: Python package for text mining with web documents
Developer: Tom De Smedt
Language: Python
License: BSD
Link: http://www.clips.ua.ac.be/pages/pattern
Feature(s): Sentiment analysis

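Pattern is a Python package for web and text mining. It is organized into modules for web data retrieval (pattern.web), natural language processing (pattern.en), pattern matching on parsed text (pattern.search), and machine learning (pattern.vector).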

The package can be installed with pip from the Python Package Index:

http://pypi.python.org/pypi/Pattern/2.4

The software is briefly described in the paper Pattern for Python.

Example from paper

from pattern.web import Twitter
from pattern.en import Sentence, parse
from pattern.search import search
from pattern.vector import Document, Corpus, KNN
 
corpus = Corpus()
for i in range(1, 15): # result pages 1 to 14
    for tweet in Twitter().search('#win OR #fail', start=i, count=100):
        s = tweet.description.lower()
        p = 'WIN' if '#win' in s else 'FAIL' # class label taken from the hashtag
        s = Sentence(parse(s)) # part-of-speech tag the tweet
        s = search('JJ', s) # JJ = adjective
        s = [match[0].string for match in s]
        s = ' '.join(s)
        if len(s) > 0: # skip tweets without adjectives
            corpus.append(Document(s, type=p))
 
classifier = KNN()
for document in corpus:
    classifier.train(document)
 
print(classifier.classify('sweet')) # yields 'WIN'
print(classifier.classify('stupid')) # yields 'FAIL'
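
The example above trains a k-nearest-neighbour classifier on the adjectives found in each tweet. The following is a minimal standalone sketch of that general idea, using only the Python standard library. It is not Pattern's implementation: the class TinyKNN, the helper functions, the cosine similarity measure, and the four training strings are all assumptions made up for illustration.

```python
# Standalone illustration of k-nearest-neighbour text classification.
# Everything here is hypothetical and independent of the Pattern package.
from collections import Counter
from math import sqrt


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count each word."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyKNN:
    """Hypothetical k-NN classifier over bag-of-words vectors."""

    def __init__(self, k=3):
        self.k = k
        self.examples = []  # list of (vector, label) pairs

    def train(self, text, label):
        self.examples.append((bag_of_words(text), label))

    def classify(self, text):
        query = bag_of_words(text)
        scored = [(cosine_similarity(query, vec), label)
                  for vec, label in self.examples]
        # Take the k most similar training examples.
        nearest = sorted(scored, key=lambda pair: pair[0], reverse=True)[:self.k]
        # Majority vote among neighbours that share at least one word.
        votes = Counter(label for score, label in nearest if score > 0)
        return votes.most_common(1)[0][0] if votes else None


# Invented training data standing in for the adjectives extracted from tweets.
classifier = TinyKNN(k=3)
classifier.train('sweet awesome great', 'WIN')
classifier.train('sweet nice', 'WIN')
classifier.train('stupid bad', 'FAIL')
classifier.train('stupid awful broken', 'FAIL')

print(classifier.classify('sweet'))
print(classifier.classify('stupid'))
```

In this sketch a query that shares no word with any training example is left unclassified (None) rather than assigned an arbitrary label. Pattern's own KNN may use a different distance measure and tie-breaking rule.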

Papers

  1. Creative Web Services with Pattern
  2. Pattern for Python

Related software
