COLING’14: Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts
Natural Language Processing paper challenge (1/30)
What problem do they solve?
Given a short text such as a single sentence, classify its sentiment.
What model do they propose?
CharSCNN: They apply convolutional layers to extract local features at the character level and at the word level, then combine them into a sentence-level representation that is used for prediction.
SCNN: CharSCNN without the character-level embeddings.
word-level embedding:
They pre-train the word embeddings with word2vec, an unsupervised method, on an English Wikipedia corpus.
char-level embedding:
The goal is to get a char-level embedding for each word. However, there is a challenge: words contain different numbers of characters, so a variable number of vectors has to be reduced to one fixed-size vector. The simplest approaches are MAX-POOLing and AVG-POOLing. Here, they employ MAX-POOLing.
For example, take the word "clearly".
If we set a fixed window size of 4, we get (7 - 4 + 1) = 4 vectors z_m.
That is: clea, lear, earl, arly -> 4 windows
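The window count above can be checked with a few lines of Python. This is only a sketch of the sliding-window step as described in this post; the helper name char_windows is made up for illustration, and it applies no padding, so a word shorter than the window yields no windows at all.

```python
def char_windows(word: str, size: int) -> list[str]:
    """Return every contiguous character window of length `size` in `word`."""
    # A word of length L yields L - size + 1 windows (none if L < size).
    return [word[i:i + size] for i in range(len(word) - size + 1)]


windows = char_windows("clearly", 4)
print(windows)       # ['clea', 'lear', 'earl', 'arly']
print(len(windows))  # 4, i.e. 7 - 4 + 1
```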
For each z_m, we feed it into a feed-forward layer with parameters [W, b].
After that we have 4 transformed vectors z_m_after, and we take the maximum over them in each dimension.
Here, the j-th word is "clearly".
M is 4 in this case.
Eventually, we get r_wch, the char-level embedding of the word "clearly".
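To make the pooling step concrete, here is a minimal pure-Python sketch. It reads MAX-POOLing as the element-wise maximum across windows, and everything in it is an assumption for illustration: the dimensions, the random character embeddings, the random W and b, and the function names. It is not the authors' implementation, only a demonstration that words of any length map to a vector of the same size.

```python
import random
import string

rng = random.Random(0)
CHAR_DIM, WINDOW, OUT_DIM = 3, 4, 5  # hypothetical sizes, not from the paper

# Hypothetical random embedding for each lowercase character.
char_emb = {c: [rng.uniform(-1, 1) for _ in range(CHAR_DIM)]
            for c in string.ascii_lowercase}
# Hypothetical parameters [W, b] of the layer applied to every window.
W = [[rng.uniform(-1, 1) for _ in range(CHAR_DIM * WINDOW)]
     for _ in range(OUT_DIM)]
b = [rng.uniform(-1, 1) for _ in range(OUT_DIM)]


def linear(weights, bias, z):
    """Compute weights @ z + bias using plain lists."""
    return [sum(w * x for w, x in zip(row, z)) + b_j
            for row, b_j in zip(weights, bias)]


def char_level_embedding(word):
    """Fixed-size r_wch for a word of any length >= WINDOW."""
    windows = [word[i:i + WINDOW] for i in range(len(word) - WINDOW + 1)]
    # z_m: concatenation of the character embeddings inside window m.
    z = [[x for c in win for x in char_emb[c]] for win in windows]
    projected = [linear(W, b, z_m) for z_m in z]
    # Element-wise max over the M windows gives one OUT_DIM-sized vector.
    return [max(column) for column in zip(*projected)]


r_wch = char_level_embedding("clearly")
print(len(r_wch))                              # 5
print(len(char_level_embedding("sentiment")))  # 5
```

Both words produce a vector of length OUT_DIM even though "clearly" has 4 windows and "sentiment" has 6, which is the point of pooling over windows.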
sentence-level embedding:
As with the character-level embedding, sentences contain different numbers of words, so they apply MAX-POOLing here as well.
Each word in the sentence is represented as the concatenation of its word-level embedding and its char-level embedding.
Take the sentence "I like riding a bike" as an example.
There are 5 words in the sentence. If we set a fixed window size of 4, we get 2 word windows (I like riding a, like riding a bike).
Let z_n be the embedding of each word window, so we have 2 z_n in this case.
z_1: the embedding of I like riding a.
z_2: the embedding of like riding a bike.
We then transform each z_n and take the maximum over the windows in each dimension.
The result is fed into a few feed-forward layers.
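The same idea at the sentence level can be sketched as follows. Again this is only an illustration under stated assumptions: the per-word vectors are random placeholders standing in for the real word-level and char-level embeddings, the sizes and the parameters W and b are invented, and pooling is taken to be the element-wise maximum across word windows.

```python
import random

rng = random.Random(0)
WORD_DIM, CHAR_DIM, WINDOW, OUT_DIM = 4, 2, 4, 3  # hypothetical sizes


def rand_vec(n):
    """Random placeholder vector of length n."""
    return [rng.uniform(-1, 1) for _ in range(n)]


sentence = "I like riding a bike".split()
windows = [" ".join(sentence[i:i + WINDOW])
           for i in range(len(sentence) - WINDOW + 1)]
print(windows)  # ['I like riding a', 'like riding a bike']

# Each word: word-level vector concatenated with char-level vector
# (both are random stand-ins here).
u = [rand_vec(WORD_DIM) + rand_vec(CHAR_DIM) for _ in sentence]

# Hypothetical parameters of the layer applied to every word window.
W = [rand_vec((WORD_DIM + CHAR_DIM) * WINDOW) for _ in range(OUT_DIM)]
b = rand_vec(OUT_DIM)

# z_n: concatenation of the word vectors inside window n.
z = [[x for vec in u[i:i + WINDOW] for x in vec]
     for i in range(len(u) - WINDOW + 1)]
projected = [[sum(w * x for w, x in zip(row, z_n)) + b_j
              for row, b_j in zip(W, b)]
             for z_n in z]
# Element-wise max over the windows gives the sentence-level vector.
r_sent = [max(column) for column in zip(*projected)]
print(len(z), len(r_sent))  # 2 3
```

A longer sentence would give more windows but the same OUT_DIM-sized r_sent, so the layers that follow can have a fixed input size.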
Network Training
It is a multi-class classification problem, so we can apply the cross-entropy loss.
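For reference, here is a small sketch of the cross-entropy loss for a single example, computed from raw class scores through a softmax. The scores and the three-class setup are made-up values for illustration, not numbers from the paper.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(scores, target):
    """Negative log-probability assigned to the correct class index."""
    return -math.log(softmax(scores)[target])


scores = [2.0, 0.5, -1.0]  # hypothetical scores for 3 sentiment classes
print(cross_entropy(scores, 0) < cross_entropy(scores, 2))  # True
print(round(cross_entropy([0.0, 0.0], 0), 3))               # 0.693 (= ln 2)
```

The loss is small when the highest score belongs to the correct class and grows as the model puts probability on the wrong classes.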
DataSet
Result
CharSCNN with pre-trained word embeddings outperforms the other models.
CharSCNN can capture negation automatically.