COLING’14: Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts

Arthur Lee
May 9, 2020


Natural Language Processing paper challenge (1/30)

paper link

What problem do they solve?

Given a short text such as a sentence or a tweet, the task is to classify its sentiment.

What model do they propose?

CharSCNN: a convolutional neural network that uses convolutions to extract local features at both the word level and the character level, then combines them into a sentence-level representation used for the final prediction.

SCNN: CharSCNN without the character-level embeddings.

word-level embedding:

They learn the word-level embeddings with unsupervised word2vec training on an English Wikipedia corpus.
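The post doesn't show what this pre-training looks like in practice; below is a minimal sketch of the same idea using gensim's Word2Vec on a toy corpus. The corpus and hyperparameters here are illustrative stand-ins, not the paper's actual setup.

```python
from gensim.models import Word2Vec

# Toy corpus standing in for tokenized Wikipedia sentences (illustrative only).
corpus = [
    ["i", "like", "riding", "a", "bike"],
    ["the", "movie", "was", "clearly", "great"],
]

# Train skip-gram word2vec; the vector size and window here are arbitrary
# choices for the sketch, not the paper's exact hyperparameters.
model = Word2Vec(corpus, vector_size=30, window=5, min_count=1, sg=1)

word_vec = model.wv["clearly"]  # word-level embedding for "clearly"
print(word_vec.shape)           # (30,)
```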

char-level embedding:

The goal is to obtain a character-level embedding for each word. The challenge is that words contain different numbers of characters, so the representation must be reduced to a fixed size. The natural options are max-pooling or average-pooling; here they employ max-pooling.

For example, take the word clearly.

With a fixed character window of size 4, a 7-character word yields (7 - 4 + 1) = 4 window vectors z_m.

Namely clea, lear, earl, arly -> 4 windows.

Each z_m is fed through a linear layer [W, b], the character-level convolution.

That gives 4 transformed vectors; max-pooling then takes the element-wise maximum over them, keeping the highest value in each dimension.

Here, the j-th word is clearly.

M, the number of character windows, is 4 in this case.

Eventually we obtain r_wch, the fixed-size character-level embedding of the word clearly.
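Putting these steps together, here is a minimal NumPy sketch of the window extraction, the linear layer [W, b], and the element-wise max-pooling. The embedding size, window size, filter count, and random weights are illustrative assumptions; in the paper these are all learned by backpropagation.

```python
import numpy as np

rng = np.random.default_rng(0)

d_chr, k_chr, clu = 5, 4, 10   # char embedding size, window size, filters (illustrative)
word = "clearly"

# Look up a d_chr-dimensional embedding for each character (random here;
# learned jointly with the network in the paper).
char_emb = {c: rng.normal(size=d_chr) for c in set(word)}

# Convolution weights: each window of k_chr characters is flattened and
# passed through a linear layer [W0, b0].
W0 = rng.normal(size=(clu, d_chr * k_chr))
b0 = rng.normal(size=clu)

# Build the M = len(word) - k_chr + 1 window vectors z_m.
Z = [np.concatenate([char_emb[c] for c in word[m:m + k_chr]])
     for m in range(len(word) - k_chr + 1)]      # 4 windows: clea, lear, earl, arly

# Apply [W0, b0] to every window, then take the element-wise maximum over
# windows (max-pooling) to get a fixed-size vector r_wch.
r_wch = np.max(np.stack([W0 @ z + b0 for z in Z]), axis=0)
print(r_wch.shape)  # (10,) -- same size regardless of word length
```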

sentence-level embedding:

As with the character-level embedding, each sentence contains a different number of words, so they again apply max-pooling.

Each word in the sentence is represented as the concatenation of its word-level embedding and its character-level embedding.

Take the sentence I like riding a bike as an example.

There are 5 words in the sentence; with a fixed window size of 4 we get (5 - 4 + 1) = 2 word windows (I like riding a, like riding a bike).

Writing z_n for these word-window embeddings, we have 2 vectors z_n in this case.

z_1: the embedding of I like riding a.

z_2: the embedding of like riding a bike.

Each z_n is fed through a convolution layer, and max-pooling takes the element-wise maximum over the windows.

The pooled sentence vector is then fed into a couple of fully connected (feed-forward) layers, as in the sketch below.
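Continuing the NumPy sketch, the sentence-level stage applies the same convolve-then-max-pool pattern over word windows, followed by the fully connected layers. All sizes and the random stand-in embeddings below are assumptions for illustration; the tanh hidden layer before the final scores follows the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(1)

d_word, d_wch, k_wrd, clu2 = 30, 10, 4, 20  # illustrative sizes
sentence = ["I", "like", "riding", "a", "bike"]

# Each word -> concatenation of its word-level and char-level embeddings
# (random stand-ins here for the learned / pre-trained vectors).
u = {w: np.concatenate([rng.normal(size=d_word), rng.normal(size=d_wch)])
     for w in sentence}

# Convolution layer [W1, b1] over windows of k_wrd consecutive words.
W1 = rng.normal(size=(clu2, (d_word + d_wch) * k_wrd))
b1 = rng.normal(size=clu2)

# N = 5 - 4 + 1 = 2 windows: "I like riding a" and "like riding a bike".
Z = [np.concatenate([u[w] for w in sentence[n:n + k_wrd]])
     for n in range(len(sentence) - k_wrd + 1)]

# Element-wise max over windows -> fixed-size sentence vector r_sent.
r_sent = np.max(np.stack([W1 @ z + b1 for z in Z]), axis=0)

# Two fully connected layers + softmax produce class scores.
W2, b2 = rng.normal(size=(15, clu2)), rng.normal(size=15)
W3, b3 = rng.normal(size=(2, 15)), rng.normal(size=2)   # 2 classes, e.g. pos/neg
scores = W3 @ np.tanh(W2 @ r_sent + b2) + b3
probs = np.exp(scores) / np.exp(scores).sum()
print(probs)  # predicted sentiment distribution
```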

Network Training

Sentiment classification here is a multi-class problem, so the network is trained by minimizing the cross-entropy (negative log-likelihood) loss over the softmax output.
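For concreteness, a small sketch of that loss on a single example; the scores and label below are made up.

```python
import numpy as np

def cross_entropy(scores, y):
    """Negative log-likelihood of the true class under a softmax
    over the network's output scores (one training example)."""
    log_probs = scores - np.log(np.sum(np.exp(scores)))  # log-softmax
    return -log_probs[y]

scores = np.array([1.2, -0.3])     # made-up scores for classes (neg, pos)
print(cross_entropy(scores, y=1))  # loss when the true label is "positive"
```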

DataSet

The paper evaluates on two sentiment corpora: SSTb (the Stanford Sentiment Treebank, movie review sentences) and STS (the Stanford Twitter Sentiment corpus).

Result

CharSCNN with pre-trained word embeddings outperforms the other models.

CharSCNN captures negation automatically.

Other related blogs:

Trust-aware recommender systems

Performance of recommender algorithms on top-n recommendation tasks

Multiverse Recommendation: N-dimensional Tensor Factorization for Context-aware Collaborative Filtering

A Matrix Factorization Technique with Trust Propagation for Recommendation in Social Networks

Hidden Factors and Hidden Topics: Understanding Rating Dimensions with Review Text

Beyond Clicks: Dwell Time for Personalization

RecSys’15: Context-Aware Event Recommendation in Event-based Social Networks

RecSys’11: Utilizing related products for post-purchase recommendation in e-commerce

RecSys'11: OrdRec: an ordinal model for predicting personalized item rating distributions

RecSys'16: Adaptive, Personalized Diversity for Visual Discovery

RecSys'16: Local Item-Item Models for Top-N Recommendation

Best paper in RecSys:

https://recsys.acm.org/best-papers/

My Website:

https://light0617.github.io/#/


Written by Arthur Lee

A machine learning engineer in the Bay Area, United States.
