COLING’14: Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts

Arthur Lee
May 9, 2020


Natural Language Processing paper challenge (1/30)

paper link

What problem do they solve?

Given a short text such as a sentence or a tweet, the task is to classify its sentiment.

What model do they propose?

CharSCNN: a convolutional neural network that uses convolutions to extract local features at both the word level and the character level, then combines them into a sentence-level representation used for the final prediction.

SCNN: CharSCNN without the character-level embeddings.

word-level embedding:

They learn the word-level embeddings with unsupervised word2vec training on an English Wikipedia corpus.
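The post doesn't show what this pre-training looks like in practice; below is a minimal sketch of the same idea using gensim's Word2Vec on a toy corpus. The corpus and hyperparameters here are illustrative stand-ins, not the paper's actual setup.

```python
from gensim.models import Word2Vec

# Toy corpus standing in for tokenized Wikipedia sentences (illustrative only).
corpus = [
    ["i", "like", "riding", "a", "bike"],
    ["the", "movie", "was", "clearly", "great"],
]

# Train skip-gram word2vec; the vector size and window here are arbitrary
# choices for the sketch, not the paper's exact hyperparameters.
model = Word2Vec(corpus, vector_size=30, window=5, min_count=1, sg=1)

word_vec = model.wv["clearly"]  # word-level embedding for "clearly"
print(word_vec.shape)           # (30,)
```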

char-level embedding:

The goal is to obtain a character-level embedding for each word. The challenge is that words contain different numbers of characters, so the representation must be reduced to a fixed size. The natural options are max-pooling or average-pooling; here they employ max-pooling.

For example, take the word clearly.

With a fixed character window of size 4, a 7-character word yields (7 - 4 + 1) = 4 window vectors z_m.

Namely clea, lear, earl, arly -> 4 windows.

Each z_m is fed through a linear layer [W, b], the character-level convolution.

That gives 4 transformed vectors; max-pooling then takes the element-wise maximum over them, keeping the highest value in each dimension.

Here, the j-th word is clearly.

M, the number of character windows, is 4 in this case.

Eventually we obtain r_wch, the fixed-size character-level embedding of the word clearly.
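Putting these steps together, here is a minimal NumPy sketch of the window extraction, the linear layer [W, b], and the element-wise max-pooling. The embedding size, window size, filter count, and random weights are illustrative assumptions; in the paper these are all learned by backpropagation.

```python
import numpy as np

rng = np.random.default_rng(0)

d_chr, k_chr, clu = 5, 4, 10   # char embedding size, window size, filters (illustrative)
word = "clearly"

# Look up a d_chr-dimensional embedding for each character (random here;
# learned jointly with the network in the paper).
char_emb = {c: rng.normal(size=d_chr) for c in set(word)}

# Convolution weights: each window of k_chr characters is flattened and
# passed through a linear layer [W0, b0].
W0 = rng.normal(size=(clu, d_chr * k_chr))
b0 = rng.normal(size=clu)

# Build the M = len(word) - k_chr + 1 window vectors z_m.
Z = [np.concatenate([char_emb[c] for c in word[m:m + k_chr]])
     for m in range(len(word) - k_chr + 1)]      # 4 windows: clea, lear, earl, arly

# Apply [W0, b0] to every window, then take the element-wise maximum over
# windows (max-pooling) to get a fixed-size vector r_wch.
r_wch = np.max(np.stack([W0 @ z + b0 for z in Z]), axis=0)
print(r_wch.shape)  # (10,) -- same size regardless of word length
```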

sentence-level embedding:

As with the character-level embedding, each sentence contains a different number of words, so they again apply max-pooling.

Each word in the sentence is represented as the concatenation of its word-level embedding and its character-level embedding.

Take the sentence I like riding a bike as an example.

There are 5 words in the sentence; with a fixed window size of 4 we get (5 - 4 + 1) = 2 word windows (I like riding a, like riding a bike).

Writing z_n for these word-window embeddings, we have 2 vectors z_n in this case.

z_1: the embedding of I like riding a.

z_2: the embedding of like riding a bike.

Each z_n is fed through a convolution layer, and max-pooling takes the element-wise maximum over the windows.

The pooled sentence vector is then fed into a couple of fully connected (feed-forward) layers, as in the sketch below.
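Continuing the NumPy sketch, the sentence-level stage applies the same convolve-then-max-pool pattern over word windows, followed by the fully connected layers. All sizes and the random stand-in embeddings below are assumptions for illustration; the tanh hidden layer before the final scores follows the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(1)

d_word, d_wch, k_wrd, clu2 = 30, 10, 4, 20  # illustrative sizes
sentence = ["I", "like", "riding", "a", "bike"]

# Each word -> concatenation of its word-level and char-level embeddings
# (random stand-ins here for the learned / pre-trained vectors).
u = {w: np.concatenate([rng.normal(size=d_word), rng.normal(size=d_wch)])
     for w in sentence}

# Convolution layer [W1, b1] over windows of k_wrd consecutive words.
W1 = rng.normal(size=(clu2, (d_word + d_wch) * k_wrd))
b1 = rng.normal(size=clu2)

# N = 5 - 4 + 1 = 2 windows: "I like riding a" and "like riding a bike".
Z = [np.concatenate([u[w] for w in sentence[n:n + k_wrd]])
     for n in range(len(sentence) - k_wrd + 1)]

# Element-wise max over windows -> fixed-size sentence vector r_sent.
r_sent = np.max(np.stack([W1 @ z + b1 for z in Z]), axis=0)

# Two fully connected layers + softmax produce class scores.
W2, b2 = rng.normal(size=(15, clu2)), rng.normal(size=15)
W3, b3 = rng.normal(size=(2, 15)), rng.normal(size=2)   # 2 classes, e.g. pos/neg
scores = W3 @ np.tanh(W2 @ r_sent + b2) + b3
probs = np.exp(scores) / np.exp(scores).sum()
print(probs)  # predicted sentiment distribution
```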

Network Training

Sentiment classification here is a multi-class problem, so the network is trained by minimizing the cross-entropy (negative log-likelihood) loss over the softmax output.
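For concreteness, a small sketch of that loss on a single example; the scores and label below are made up.

```python
import numpy as np

def cross_entropy(scores, y):
    """Negative log-likelihood of the true class under a softmax
    over the network's output scores (one training example)."""
    log_probs = scores - np.log(np.sum(np.exp(scores)))  # log-softmax
    return -log_probs[y]

scores = np.array([1.2, -0.3])     # made-up scores for classes (neg, pos)
print(cross_entropy(scores, y=1))  # loss when the true label is "positive"
```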

DataSet

The paper evaluates on two sentiment corpora: SSTb (the Stanford Sentiment Treebank, movie review sentences) and STS (the Stanford Twitter Sentiment corpus).

Result

CharSCNN with pre-trained word embeddings outperforms the other models.

CharSCNN captures negation automatically.

Other related blogs:

Trust-aware recommender systems

Performance of recommender algorithms on top-n recommendation tasks

Multiverse Recommendation: N-dimensional Tensor Factorization for Context-aware Collaborative Filtering

A Matrix Factorization Technique with Trust Propagation for Recommendation in Social Networks

Hidden Factors and Hidden Topics: Understanding Rating Dimensions with Review Text

Beyond Clicks: Dwell Time for Personalization

RecSys’15: Context-Aware Event Recommendation in Event-based Social Networks

RecSys’11: Utilizing related products for post-purchase recommendation in e-commerce

RecSys'11: OrdRec: an ordinal model for predicting personalized item rating distributions

RecSys'16: Adaptive, Personalized Diversity for Visual Discovery

RecSys'16: Local Item-Item Models for Top-N Recommendation

Best paper in RecSys:

https://recsys.acm.org/best-papers/

My Website:

https://light0617.github.io/#/


Written by Arthur Lee

A machine learning engineer in the Bay Area, United States.
