Rule-Based Sentiment Analysis (Beginner)

Rule-based sentiment analysis uses predefined lexicons (dictionaries of words with associated sentiment scores) and grammatical rules to determine sentiment. These methods require no training data, which makes them well suited to quick prototyping and to short, informal text such as social media posts.
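To make the idea of "lexicon plus rules" concrete, here is a minimal sketch in plain Python. It is not how VADER or TextBlob are implemented. The tiny `LEXICON`, the `NEGATIONS` set, the `INTENSIFIERS` multipliers, and the `toy_score` function are all hypothetical names and values chosen for illustration.

```python
import re

# Hypothetical mini-lexicon: word -> valence. Real lexicons hold thousands of entries.
LEXICON = {"love": 3.0, "amazing": 3.0, "good": 2.0, "bad": -2.0, "worst": -3.0}
NEGATIONS = {"not", "never", "no"}
INTENSIFIERS = {"very": 1.5, "really": 1.5}  # illustrative multipliers


def toy_score(text: str) -> float:
    """Sum word valences, scaling after an intensifier and flipping after a negation."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    for i, token in enumerate(tokens):
        if token not in LEXICON:
            continue
        valence = LEXICON[token]
        previous = tokens[i - 1] if i > 0 else ""
        if previous in INTENSIFIERS:
            valence *= INTENSIFIERS[previous]
        # Look back up to two tokens for a negation word.
        if NEGATIONS & set(tokens[max(0, i - 2):i]):
            valence = -valence
        total += valence
    return total


print(toy_score("I love this product"))  # 3.0
print(toy_score("This is not good"))     # -2.0
print(toy_score("This is very bad"))     # -3.0
```

Everything the scorer knows lives in the lexicon and the two rule sets, so no training step is involved. That is also its limit: a word missing from the lexicon contributes nothing.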

VADER (Valence Aware Dictionary and sEntiment Reasoner)

VADER is specifically designed for social media text. It handles emojis, slang, capitalization, and punctuation-based emphasis:

Python

from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

# Analyze sentiment
texts = [
    "I love this product! It's amazing!",
    "This is the worst experience ever.",
    "The movie was okay, nothing special.",
    "ABSOLUTELY FANTASTIC!!! 🎉🎉🎉",
    "Not bad at all, I'm pleasantly surprised.",
]

for text in texts:
    scores = analyzer.polarity_scores(text)
    print(f"Text: {text}")
    print(f"  Compound: {scores['compound']:.3f}")
    print(f"  Pos: {scores['pos']:.3f}, Neu: {scores['neu']:.3f}, Neg: {scores['neg']:.3f}")
    print()

Understanding VADER Scores

Score	Range	Meaning
compound	-1 to +1	Overall normalized sentiment. >= 0.05 positive, <= -0.05 negative, anything in between neutral
pos	0 to 1	Proportion of text that is positive
neu	0 to 1	Proportion of text that is neutral
neg	0 to 1	Proportion of text that is negative
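The compound score stays within -1 to +1 because the summed word valences are squashed by a normalization step. The sketch below assumes the form x / sqrt(x² + α) with α = 15, which is the default in VADER's reference implementation as far as we know; the function names `normalize` and `label` are our own, and the ±0.05 cutoffs are the conventional ones from the table above.

```python
import math


def normalize(raw_score: float, alpha: float = 15.0) -> float:
    """Squash an unbounded valence sum into the range [-1, 1]."""
    norm = raw_score / math.sqrt(raw_score * raw_score + alpha)
    return max(-1.0, min(1.0, norm))


def label(compound: float, threshold: float = 0.05) -> str:
    """Map a compound score to a sentiment label using symmetric cutoffs."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


print(normalize(1.0))          # 0.25, since 1 / sqrt(1 + 15) = 1 / 4
print(label(normalize(1.0)))   # positive
print(label(0.0))              # neutral
```

Because the curve flattens as the raw sum grows, a text with many strongly positive words approaches +1 without ever exceeding it.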

Why VADER excels at social media: VADER understands that "GREAT" is more intense than "great," that "!!!" amplifies sentiment, that emojis carry sentiment, and that "but" shifts the focus to the clause that follows it.
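The heuristics listed above can be sketched as small adjustments to a word's valence. This is a toy illustration, not VADER's code: the constants are assumptions chosen to resemble its published weights, the helper names `boost` and `weight_but` are hypothetical, and VADER itself applies punctuation emphasis to the sentence total rather than to each word.

```python
CAPS_BOOST = 0.733     # assumed increment for an ALL-CAPS sentiment word
EXCLAIM_BOOST = 0.292  # assumed increment per "!" (capped at four)


def boost(valence: float, token: str, text: str) -> float:
    """Push a word's valence further from zero for caps and exclamation marks."""
    sign = 1.0 if valence >= 0 else -1.0
    # Caps only signal emphasis when the rest of the text is not all caps too.
    if token.isupper() and not text.isupper():
        valence += sign * CAPS_BOOST
    valence += sign * EXCLAIM_BOOST * min(text.count("!"), 4)
    return valence


def weight_but(tokens: list[str], valences: list[float]) -> list[float]:
    """Dampen valences before the first "but" and amplify those after it."""
    lowered = [t.lower() for t in tokens]
    if "but" not in lowered:
        return valences
    pivot = lowered.index("but")
    return [v * 0.5 if i < pivot else v * 1.5 if i > pivot else v
            for i, v in enumerate(valences)]


print(round(boost(3.1, "great", "The food is great"), 3))     # 3.1
print(round(boost(3.1, "GREAT", "The food is GREAT"), 3))     # 3.833
print(round(boost(3.1, "great", "The food is great!!!"), 3))  # 3.976
print(weight_but(["good", "but", "slow"], [2.0, 0.0, -1.0]))  # [1.0, 0.0, -1.5]
```

In the last line the negative word after "but" ends up outweighing the positive word before it, which is the shift in focus the paragraph describes.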

TextBlob

TextBlob provides a simple API for common NLP tasks, including sentiment analysis:

Python

from textblob import TextBlob

texts = [
    "I love this product! It's amazing!",
    "This is the worst experience ever.",
    "The weather is nice today.",
]

for text in texts:
    blob = TextBlob(text)
    print(f"Text: {text}")
    print(f"  Polarity: {blob.sentiment.polarity:.3f}")          # -1 (negative) to 1 (positive)
    print(f"  Subjectivity: {blob.sentiment.subjectivity:.3f}")  # 0 (objective) to 1 (subjective)
    print()

Analyzing a Dataset with VADER

Python

import pandas as pd
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

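# Load product reviews (assumes a CSV file with a "text" column)
df = pd.read_csv("reviews.csv")

# Apply VADER to each review; missing values are replaced with empty
# strings so the analyzer always receives a string
df["compound"] = df["text"].fillna("").astype(str).apply(
    lambda x: analyzer.polarity_scores(x)["compound"]
)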

# Classify sentiment
df["sentiment"] = df["compound"].apply(
    lambda x: "positive" if x >= 0.05
    else ("negative" if x <= -0.05 else "neutral")
)

# Summary statistics
print(df["sentiment"].value_counts())
print(f"Average sentiment: {df['compound'].mean():.3f}")

VADER vs. TextBlob

Feature	VADER	TextBlob
Social media	Excellent (emojis, slang, caps)	Basic
Formal text	Good	Good
Subjectivity	Not available	Yes (0-1 scale)
Speed	Very fast	Fast
Customizable	Can update lexicon	Can train custom classifier

When to use rule-based methods: Rule-based approaches are best when you need quick results without training data, when analyzing social media text (VADER excels here), or when building a prototype before investing in ML approaches. For higher accuracy on domain-specific text, move to ML or deep learning methods.

Try It Yourself

Install VADER and TextBlob, then analyze 20 product reviews from Amazon. Compare the sentiment scores from both tools. Where do they agree and disagree?
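One way to start the comparison is to turn both tools' scores into labels and count how often they match. The sketch below is plain Python and makes two assumptions: that you have already collected VADER compound scores and TextBlob polarity scores into two lists of equal length, and that the same ±0.05 cutoff is reasonable for TextBlob polarity (TextBlob does not prescribe one). The helper names and the sample numbers are placeholders, not real tool output.

```python
def to_label(score: float, threshold: float = 0.05) -> str:
    """Map a score in [-1, 1] to a sentiment label."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def compare(vader_scores: list[float], textblob_scores: list[float]):
    """Return the agreement rate and the indices where the labels differ."""
    if not vader_scores or len(vader_scores) != len(textblob_scores):
        raise ValueError("score lists must be non-empty and the same length")
    pairs = list(zip(vader_scores, textblob_scores))
    disagreements = [i for i, (v, t) in enumerate(pairs)
                     if to_label(v) != to_label(t)]
    rate = 1 - len(disagreements) / len(pairs)
    return rate, disagreements


# Placeholder scores for four reviews (not real output from either tool).
vader = [0.8, -0.6, 0.0, 0.4]
textblob = [0.5, -0.9, 0.3, -0.1]

rate, differing = compare(vader, textblob)
print(f"Agreement: {rate:.0%}")                # Agreement: 50%
print(f"Reviews to inspect: {differing}")      # Reviews to inspect: [2, 3]
```

The indices in `differing` point at the reviews worth reading by hand, since those are where the two lexicons or rule sets diverge.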

Next: ML-Based Methods →
