KDD 19': Heterogeneous Graph Neural Network

🤗 Recommendation system paper challenge (28/50)

paper link

Github: https://github.com/chuxuzhang/KDD2019_HetGNN

🤔 What problem do they solve?

They would like to generate Heterogeneous Graph embedding consisting of graph structure information and node content information.

😮 What are the challenges?

Few existing methods can jointly and effectively consider heterogeneous structural (graph) information together with the heterogeneous content information of each node.

  1. Many nodes cannot connect to all types of neighbors.
  2. A node can carry unstructured content.
  3. Different types of neighbors contribute differently to a node's embedding.

😎 Overview of the models: HetGNN

They propose a heterogeneous graph neural network model to resolve this issue.

Specifically, they first introduce a random walk with restart strategy to sample a fixed size of strongly correlated heterogeneous neighbors for each node and group them based upon node types.

Next, they design a neural network architecture with two modules to aggregate the feature information of the sampled neighboring nodes. The first module encodes "deep" feature interactions of heterogeneous contents and generates a content embedding for each node.

The second module aggregates the content (attribute) embeddings of the different neighboring groups (types) and further combines them, weighting the impact of each group, to obtain the final node embedding.

Finally, they leverage a graph context loss and a mini-batch gradient descent procedure to train the model end-to-end.

Sampling Heterogeneous Neighbors (C1)

Most other GNN models have some issues:

  1. They cannot capture feature information from different types of neighbors.
  2. They are weakened by varying neighbor sizes.
  3. They are not suitable for aggregating heterogeneous neighbors that have different content features.

They propose a heterogeneous neighbors sampling strategy based on random walk with restart (RWR).

  1. RWR collects all types of neighbors for each node.
  2. The sampled neighbor size of each node is fixed, and the most frequently visited neighbors are selected.
  3. Neighbors of the same type (having the same content features) are grouped so that type-based aggregation can be deployed.
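The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the function name `sample_neighbors_rwr` and the hyper-parameters (restart probability, walk length, per-type budget) are all illustrative choices.

```python
import random
from collections import Counter, defaultdict

def sample_neighbors_rwr(adj, node_type, start, restart_p=0.5,
                         walk_len=100, top_k=2, seed=0):
    """Random walk with restart from `start`; return, per node type,
    the `top_k` most frequently visited neighbors (hypothetical helper)."""
    rng = random.Random(seed)
    visits = Counter()
    cur = start
    for _ in range(walk_len):
        if rng.random() < restart_p or not adj[cur]:
            cur = start                      # restart: jump back to the start node
        else:
            cur = rng.choice(adj[cur])       # otherwise step to a random neighbor
        if cur != start:
            visits[cur] += 1
    grouped = defaultdict(list)
    for node, _ in visits.most_common():     # most frequently visited first
        t = node_type[node]
        if len(grouped[t]) < top_k:          # fixed-size budget per type
            grouped[t].append(node)
    return dict(grouped)

# Toy academic graph with two node types: authors (a*) and papers (p*).
adj = {"a1": ["p1", "p2"], "p1": ["a1", "a2"], "p2": ["a1"], "a2": ["p1"]}
node_type = {"a1": "author", "a2": "author", "p1": "paper", "p2": "paper"}
grouped = sample_neighbors_rwr(adj, node_type, "a1")
print(grouped)
```

Because the walk restarts at the start node with fixed probability, visit counts concentrate on strongly correlated neighbors, and the per-type cap gives every node the same number of sampled neighbors per type.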

Encoding Heterogeneous Contents (C2)

Given a node, it has neighbors of different types.

Input: 1 neighboring node

Output: 1 embedding

For each neighboring node, we would like to get its encoding, but different types of nodes have different contents. How do we aggregate this information together? Bi-LSTM!

We can apply pre-trained models to get an embedding for each piece of content, feed these embeddings into a Bi-LSTM (to capture deep interactions), and then apply mean pooling to aggregate them.

Note that the Bi-LSTM operates on an unordered content set.
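A minimal numpy sketch of this encoder is below. It is illustrative only: the weight shapes and sizes are arbitrary, and for brevity both directions share one set of LSTM weights, whereas a real Bi-LSTM learns separate forward and backward parameters.

```python
import numpy as np

def lstm_pass(X, Wx, Wh, b):
    """Run one LSTM direction over the rows of X, shape (seq_len, d_in)."""
    d_h = Wh.shape[1]
    h = np.zeros(d_h)
    c = np.zeros(d_h)
    states = []
    for x in X:
        z = Wx @ x + Wh @ h + b                        # stacked gates: i, f, o, g
        i, f, o = (1.0 / (1.0 + np.exp(-z[k * d_h:(k + 1) * d_h]))
                   for k in range(3))                  # sigmoid gates
        g = np.tanh(z[3 * d_h:])                       # candidate cell state
        c = f * c + i * g
        h = o * np.tanh(c)
        states.append(h)
    return np.stack(states)

def encode_content(features, params):
    """Bi-LSTM over the (unordered) content-feature set, then mean pooling."""
    fwd = lstm_pass(features, *params)
    bwd = lstm_pass(features[::-1], *params)[::-1]     # shared weights, for brevity
    return np.concatenate([fwd, bwd], axis=1).mean(axis=0)

rng = np.random.default_rng(0)
d_in, d_h = 8, 4                                       # illustrative sizes
params = (rng.normal(scale=0.1, size=(4 * d_h, d_in)),
          rng.normal(scale=0.1, size=(4 * d_h, d_h)),
          np.zeros(4 * d_h))
content = rng.normal(size=(3, d_in))  # e.g. text, image, attribute embeddings
z = encode_content(content, params)
print(z.shape)                                         # (2 * d_h,)
```

Each row of `content` stands for one pre-trained feature embedding (text, image, attribute); the final node content embedding is the mean over the concatenated forward/backward hidden states.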

The advantages:

(1) it has a concise structure with relatively low complexity (fewer parameters), making the model implementation and tuning relatively easy;

(2) it is capable of fusing heterogeneous content information, leading to strong expressive capability;

(3) it is flexible enough to add extra content features, making model extension convenient.

Aggregating Heterogeneous Neighbors (C3)

As in C1, given a node, we have its sampled neighboring nodes. Thanks to C2, each neighboring node can be encoded into one embedding.

But how do we aggregate these neighboring nodes together?

Group by type, and for each type run a Bi-LSTM!

Same Type Neighbors Aggregation

After grouping the neighboring nodes, each group contains several nodes of the same type, so we can run same-type neighbor aggregation.

We employ Bi-LSTM to aggregate content embeddings of all t-type neighbors and use the average over all hidden states to represent the general aggregated embedding.

We use a different Bi-LSTM for each node type to distinguish types during neighbor aggregation. Note that the Bi-LSTM operates on an unordered neighbor set, which is inspired by GraphSAGE.
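The shape of this step can be sketched as follows. To keep it short, each per-type Bi-LSTM is replaced by the identity map, so the aggregation reduces to a per-type mean of the content embeddings; `aggregate_same_type` is a hypothetical helper name.

```python
import numpy as np

def aggregate_same_type(neighbor_embs):
    """neighbor_embs: dict mapping node type -> (n_t, d) array of content
    embeddings from the C2 encoder. The paper runs one Bi-LSTM per type
    over the unordered neighbor set and averages its hidden states; here
    the Bi-LSTM is dropped, so we simply mean-pool per type."""
    return {t: embs.mean(axis=0) for t, embs in neighbor_embs.items()}

rng = np.random.default_rng(1)
neighbors = {"author": rng.normal(size=(2, 8)),   # 2 author neighbors, d = 8
             "paper": rng.normal(size=(3, 8))}    # 3 paper neighbors
type_embs = aggregate_same_type(neighbors)
print(sorted(type_embs), type_embs["paper"].shape)
</imports>```

The output is one aggregated embedding per neighbor type, which is exactly what the next step (type combination) consumes.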

Types Combination

Again, given a node, we have its neighboring nodes grouped by type, so for each type we have one embedding.

How do we aggregate these embeddings? An attention layer!

To combine these type-based neighbor embeddings with v’s content embedding, we employ the attention mechanism.

The motivation is that different types of neighbors will make different contributions to the final representation of v.
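A sketch of this attention step, assuming a learnable attention vector `u` (random here) scored with LeakyReLU over the concatenation of each candidate embedding with v's content embedding; `combine_types` is an illustrative helper, not the paper's code.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def combine_types(content_emb, type_embs, u, slope=0.01):
    """Attention over v's own content embedding and its type-based
    neighbor embeddings; returns the final node embedding and the
    attention weights."""
    cands = [content_emb] + list(type_embs.values())
    s = np.array([u @ np.concatenate([f, content_emb]) for f in cands])
    s = np.where(s > 0, s, slope * s)               # LeakyReLU activation
    alpha = softmax(s)                              # one weight per candidate
    final = sum(a * f for a, f in zip(alpha, cands))
    return final, alpha

rng = np.random.default_rng(2)
d = 8
content = rng.normal(size=d)
type_embs = {"author": rng.normal(size=d), "paper": rng.normal(size=d)}
u = rng.normal(size=2 * d)                          # scores [candidate ; content]
final, alpha = combine_types(content, type_embs, u)
print(final.shape, round(float(alpha.sum()), 6))
```

The attention weights `alpha` sum to 1, so the final embedding is a convex combination in which more influential neighbor types receive larger weights.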

Objective and Model Training

A graph context loss and a mini-batch gradient descent.

They apply negative sampling (one negative per positive pair) and, similar to DeepWalk, use random walks to generate the training context pairs.
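The loss for one training triple can be written as a small skip-gram-style cross-entropy, sketched below. In the paper the positive (node, context) pairs come from random walks and the embeddings are HetGNN outputs; here they are plain arrays for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def graph_context_loss(e_v, e_ctx, e_neg):
    """Cross-entropy for one (node, context, negative) triple: push the
    node embedding toward its walk context and away from the sampled
    negative node (one negative per positive, as in negative sampling)."""
    return -(np.log(sigmoid(e_v @ e_ctx)) + np.log(sigmoid(-e_v @ e_neg)))

rng = np.random.default_rng(3)
e_v, e_ctx, e_neg = rng.normal(size=(3, 8))   # toy 8-dim embeddings
loss = graph_context_loss(e_v, e_ctx, e_neg)
print(loss)
```

Averaging this loss over mini-batches of walk triples and running gradient descent trains all modules (content encoder, per-type aggregators, attention) end-to-end.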

Finally, we have this whole picture.

🥴 What else in this paper?

In this paper, they also discuss several experiments.

(RQ1) How does HetGNN perform vs. state-of-the-art baselines for various graph mining tasks, such as link prediction (RQ1–1), personalized recommendation (RQ1–2), and node classification & clustering (RQ1–3)?


(RQ2) How does HetGNN perform vs. state-of-the-art baselines for inductive graph mining tasks, such as inductive node classification & clustering?


(RQ3) How do different components, e.g., the node heterogeneous contents encoder or the heterogeneous neighbors aggregator, affect the model performance?

  • HetGNN has better performance than No-Neigh in most cases, demonstrating that aggregating neighbor information is effective for generating better node embeddings.
  • HetGNN outperforms Content-FC, indicating that the Bi-LSTM based content encoding is better than "shallow" encoding like FC for capturing "deep" content feature interactions.
  • HetGNN achieves better results than Type-FC, showing that self-attention is better than FC for capturing node type impact.

(RQ4) How do various hyper-parameters, e.g., the embedding dimension or the size of the sampled heterogeneous neighbor set, impact the model performance?

🙃 Other related blogs:

KDD 19': Applying Deep Learning To Airbnb Search

KDD 17': Visual Search at eBay

KDD 18': Real-time Personalization using Embeddings for Search Ranking at Airbnb

KDD 18': Notification Volume Control and Optimization System at Pinterest

KDD 19': PinText: A Multitask Text Embedding System in Pinterest

CVPR19' Complete the Look: Scene-based Complementary Product Recommendation

COLING’14: Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts

NAACL’19: Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence

NIPS’2017: Attention Is All You Need (Transformer)

KDD’19: Learning a Unified Embedding for Visual Search at Pinterest

BMVC19' Classification is a Strong Baseline for Deep Metric Learning

KDD’18: Graph Convolutional Neural Networks for Web-Scale Recommender Systems

WWW’17: Visual Discovery at Pinterest

🤩 Conference

ICCV: International Conference on Computer Vision


CVPR: Conference on Computer Vision and Pattern Recognition


KDD 2020


Top Conference Paper Challenge:


My Website:


A machine learning engineer in the Bay Area in the United States
