Mercari Engineering Blog

We're the software engineers behind Mercari. Check out our blog to see the tech that powers our marketplace.

KGRec: An interpretable recommendation system using a knowledge graph

The Japanese version is available here. 日本語の記事はこちらになります。

Hello everyone. I am @joisino. I am doing an internship in Mercari from August 1st. I built a recommendation system using a knowledge graph. I introduce the results in this blog post.


As the number of items in Mercari increases, it becomes more and more difficult for users to find their favorite items. Thus the recommendation system plays a central role in finding items. However, in Mercari, each item is unique, and the same item will not be sold again. In other words, all items are suffered from the cold-start problem. This fact makes implementing an effective recommendation system difficult. In this internship, I built a catalog-based recommendation system using a knowledge graph, named KGRec, based on KGCN [Wang et al., 2019]. This system (1) alleviates the cold-start problem by using the catalog and knowledge graph, (2) is interpretable by figuring out which attributes (e.g., an author of a book and a genre) affect the result, (3) can measure similarities between any attributes (e.g., item-item similarity, author-author similarity), and (4) is scalable because it can recommend items just by running the nearest neighbor search in the 32-dimension Euclidean space, which works fast in practice. Especially, KGRec can perform zero-shot recommendation: KGRec can recommend items that have never been bought by any user thanks to the information of the knowledge graph. The table below contrasts KGRec against other recommendation systems.

QuerySolves cold-startZero-shotInterpretableAttribute SimilarityScalable
KGRec, KGCN [Wang+ 19]
RippleNet [Wang+ 18]
KPRN [Wang+ 19]
Catalog-based MF
Item-based MF

KGRec inherits the strong points of KGCN. KGRec further improves its performance and interpretability by the user-attribute loss and tuning for Mercari dataset. Existing KG-based recommendation systems such as RippleNet and KPRN also have some of the characteristics of KGRec, but not all of them. For example, KPRN is not scalable because it uses LSTMs to compute user-item scores.


In this section, I introduce what a knowledge graph is. A knowledge graph stores complex relations of information in a single graph. There are a number of knowledge graphs available online such as DBPedia and YAGO. However, they do not suit our purpose because most of the items in Mercari do not appear in such knowledge graphs. Therefore, I build an original knowledge graph based on the catalog data in Mercari. The table below shows examples of catalog data.

Title Authors Publisher Target Category
ぼくは王さま 和歌山静子/寺村輝夫 理論社 Infants
あいうえおうさま 和歌山静子/寺村輝夫/杉浦範茂 理論社 Infants 仮名

The figures below show a part of the knowledge graph I constructed for a book recommendation.


This knowledge graph stores which author wrote a book, which company published a book, what a book is about, and so on. Books that have common attributes (e.g., authors, genres) are connected indirectly via attribute nodes (e.g., ぼくは王さま -- written by --> 寺町輝夫 <-- written by -- あいうえおうさま). This information enhances the performance of the recommendation. For example, thanks to the knowledge graph, we can recommend a book that was written by the same author as a book the user bought before.

Related work

There are many KG-based recommendation systems in the literature. In this section, I review the existing work briefly. HeteRec [Yu et al., 2014] is one of the first methods that utilize a knowledge graph for a recommendation. This enumerates paths between a user and an item in the knowledge graph to measure the similarity between the user and item. CKE [Zhang et al., 2016] utilizes embedding of entities of a knowledge graph for a recommendation as well as text and visual information extracted by autoencoders. RippleNet [Wang et al., 2018] learns how to aggregate information in a knowledge graph by the attention mechanism, which is utilized for the recommendation task. Recently, many methods have combined graph neural networks with KG-based recommendation systems. KGCN [Wang et al., 2019] is one of such methods. This aggregates information from neighboring nodes of a knowledge graph based on user-relation scores, which model how much important a user thinks each relation is. For example, thanks to the user-relation scores, KGCN can recommend a book based on its author to users that choose books based on the author, and this can also recommend a book based on its topic (e.g., self-development, family, and physics) to users that think topics are important to choose a book. KGCN suits best for our purpose because it is catalog-based, interpretable, and scalable. Therefore, I built KGRec based on KGCN.


In this section, I explain the method in detail. In KGRec, each node  v in the knowledge graph, including item nodes and attribute nodes, has an embedding vector  \mathbf{h}_v \in \mathbb{R}^{d}. Each user  u and each relation  r (e.g., "is written by" and "is about") also have embeddings  \mathbf{h}_u, \mathbf{h}_r \in \mathbb{R}^{d}, respectively. First, KGRec calculates an embedding of item  i customized for user  u by aggregating attributes from neighboring nodes:

 \displaystyle \hat{\mathbf{h}}_{iu} = \text{tanh}(\sum_{(v, r) \in \mathcal{N}(i)} \alpha_{iur} \mathbf{W} \mathbf{h}_v),

where  \mathcal{N}(i) is the set of incident edges of  i,  v is a neighboring attribute,  r is a relation,  \mathbf{W} \in \mathbb{R}^{d \times d} is a paramter matrix,  \alpha_{iur} is the user-relation score, which I describe next. User-relation scores model how much important a user thinks each relation is. This is modeled by the attention mechanism:

 \displaystyle \alpha_{iur} = \frac{\exp(\mathbf{h}_u^{\top} \mathbf{h}_r)}{\sum_{(v, r') \in \mathcal{N}(i)} \exp(\mathbf{h}_u^{\top} \mathbf{h}_{r'})}.

Finally, KGRec calculates the recommendation scores of items by inner product:

 \text{score}(u, i) = \mathbf{h}_u^{\top} \hat{\mathbf{h}}_{iu},

and recommends top-K items in terms of the recommendation scores. In the training, I used BPR-loss [Randle et al., 2009]:

 \displaystyle \mathcal{L}_{\text{BPR}} = \sum_{(u, p, n) \in \mathcal{T}} \sigma(\mathbf{h}_u^{\top} \hat{\mathbf{h}}_{pu} - \mathbf{h}_u^{\top} \hat{\mathbf{h}}_{nu})

where  \mathcal{T} is training data, which contains triplets of a user, positive item, and negative item, and  \sigma is the sigmoid function. In addition to this loss, I found that user-attribute BPR loss improves performance and stability:

 \displaystyle \mathcal{L}_{\text{UA-BPR}} = \sum_{(u, p_{\text{att}}, n_{\text{att}}) \in \mathcal{T}_{\text{att}}} \sigma(\mathbf{h}_u^{\top} \mathbf{h}_{p_{\text{att}}} - \mathbf{h}_u^{\top} \mathbf{h}_{n_{\text{att}}})

where  \mathcal{T}_{\text{att}} is training data, which contains triplets of a user, attribute of a positive item, and attribute of a negative item. This loss explicitly associates user embeddings with their favorite attributes, which improves interpretability. The loss function for KGRec is:

 \mathcal{L} = \mathcal{L}_{\text{BPR}} + \gamma \mathcal{L}_{\text{UA-BPR}} + \eta \|\theta\|_2^2,

where  \theta denotes all parameters of KGRec (i.e.,  \mathbf{h}_v, \mathbf{h}_u, \mathbf{h}_r, and  \mathbf{W}) and  \gamma and  \eta are hyperparamters.


I aim to answer the following questions through extensive experiments:

  • Q1: Does KGRec perform better than ordinary catalog-based recommendations especially in the cold-start setting?
  • Q2: Is the result of KGRec interpretable?
  • Q3: Can KGRec recommend attributes as well as items?
  • Q4: Is KGRec scalable?

Experimental Setup:

I use Mercari book data through experiments. I set the dimension of embeddings as 32 through experiments. I train models using Adam optimizer [Kingma and Ba, 2015] with learning rate = 0.001.

Experiments for Q1:

To show the effectiveness of KGRec, I measure the recall of KGRec and the catalog-based matrix factorization [Koren et al., 2009] trained by the BPR-loss for Mercari book data. I conduct two experiments. In the first experiment, I use items that are purchased at least 5 times, which is a common setting in the recommendation research. In the second experiment, I use all items, including items that are not bought by any user in the training dataset, which is extremely difficult to predict. This is a more realistic setting because most of the items are purchased less than 5 times.

Since there are as many as 262661 and 774716 items in the subsampled and all item sets, respectively, it is too time-consuming to exactly compute the exact ranking for all users. Therefore, I resort to the sampling scheme. Namely, For each user, I sample 199 items randomly (therefore 200 items with the ground truth item in total) and measure recall@5, recall@10, recall@20, and MRR of the ground truth item in the test data.

The tables below report the result. The result of the first experiment indicates that KGRec outperforms the catalog-based matrix factorization method. More importantly, KGRec further outperforms the matrix factorization in the second experiment. This shows that KGRec effectively alleviates the cold-start problem. Note that the scores in the second experiment tend to be higher than those in the first experiment because the number of irrelevant items increases in the negative sample in the latter experiment.

Subsampled Items
Matrix Factorization
Chance Rate
All Items
Matrix Factorization
Chance Rate

Experiments for Q2:

We can infer why KGRec outputs the result by examining user-relation scores and vector similarities between user embedding  \mathbf{h}_u and attribute embeddings  \mathbf{h}_i. The table below shows examples of recommendations generated by KGRec1. For example, User #1 bought many novels written by Leo Tolstoy and Fyodor Dostoevsky. Therefore, KGRec recommends novels written by them. User #2 bought many books about dressmaking for infants. Therefore, KGRec recommends similar books about dressmaking to this user. User #3 bought many textbooks for high school students. Therefore, KGRec recommends similar textbooks to this user. Furthermore, KGRec succeeds in providing reasons for the recommendation. For example, in the second example, KGRec recommends textbooks because these books target at high school students. It should be noted that these reasons are also provided by KGRec automatically. Note also that KGRec does not use the attribution of users (e.g., a user is a high school student) but implicitly learns them from buying history data.

Buying History of User #1Recommended Items for User #1Reason for the Recommendation
Anna KareninaWar and PeaceThis book is written by Leo Tolstoy
ResurrectionSnow CountryThis book is written by Yasunari Kawabata
Notes from UndergroundCrime and PunishmentThis book is written by Fyodor Dostoevsky
DemonsThe DoubleThis book is written by Fyodor Dostoevsky
Walk in the Light While There Is Light女であることThis book is written by Yasunari Kawabata
One Day in the Life of Ivan DenisovichThe IdiotThis book is written by Fyodor Dostoevsky
The Raw YouthPoor FolkThis book is written by Fyodor Dostoevsky
White NightsThe Death of Ivan Ilyich/The Kreutzer SonataThis book is written by Leo Tolstoy
Ivan the FoolCall If You Need MeThis book is translated by Haruki Murakami
The Overcoat/The NoseWill You Please Be Quiet, Please?This book is translated by Haruki Murakami
Buying History of User #2Recommended Items for User #1Reason for the Recommendation
おうちで給食ごはん小さくてもきちんとした服This book is about dressmaking for children
愛情いっぱい手作りの赤ちゃん服かんたん!かわいい!はじめての赤ちゃん服と小物This book is about dressmaking for children
ハンドメイドベビー服きれいに縫うための基礎の基礎This book is about dressmaking
いちばんよくわかる赤ちゃんと小さな子の服女の子のまいにちの服This book is about dressmaking for children
3歳からのおべんとう女の子のシンプルでかわいい服This book is about dressmaking for children
ぼく、アンパンマン!簡単に作れて、とことん使える日常着This book is about dressmaking for women
ちくちくはじめて赤ちゃんスタイ初めての育児This book is published by ひよこクラブ
ワンポイント刺しゅう図案365女の子のおしゃれ服This book is about dressmaking for children
1色刺繍と小さな雑貨おしゃれが好きな女の子の服This book is about dressmaking for children
あかちゃんのために作るもの子供服ソーイングLESSON BOOKThis book is about dressmaking for children
Buying History of User #3Recommended Items for User #2Reason for the Recommendation
漢文ヤマのヤマ漢文道場This book is published by Z会
得点奪取現代文古文上達This book is written by 仲光雄
パラグラフリーディングのストラテジー 1大学入試全レベル問題集現代文 3This book targets at high school student
大学入試全レベル問題集現代文 1中堅私大古文演習This book targets at high school student
大学入試全レベル問題集現代文 2実力をつける日本史100題This book is published by Z会出版編集部
実力をつける世界史100題パラグラフリーディングのストラテジー 2This book targets at high school student
はじめる日本史 要点&演習古文This book targets at high school student
生物〈生物基礎・生物〉基礎問題精講センター試験現代社会集中講義This book targets at high school student
センター古文満点のコツ「源氏」でわかる古典常識This book targets at high school student
金の漢字最強編 大学受験大学入試全レベル問題集現代文 4This book targets at high school student

Experiments for Q3:

KGRec learns to embed attributes such as authors, publishers, and categories as well as items. Therefore, KGRec can recommend attributes based on other attributes via the nearest neighbor search of  \mathbf{h}_i. In this section, I demonstrate an author recommendation based on other authors. The table below shows examples of author recommendations generated by KGRec. For example, 東野圭吾 (Keigo Higashino) and 伊坂幸太郎 (Kotaro Isaka) are famous novelists, known for their mystery novels. 辻村深月 (Mizuki Tsujimura) and 湊かなえ (Kanae Minato) are also famous novelists known for their mystery novels. 池上彰 (Akira Ikegami) is a journalist and writes many books for business people about economics and politics. 斎藤孝 (Takashi Saito) and 茂木健一郎 (Ken Mogi) also write many books for business people. 荒木飛呂彦 (Hirohiko Araki) is a manga artist known for JoJo's Bizarre Adventure. 諫山創 (Hajime Isayama) and 荒川 弘 (Hiromu Arakawa) are also manga artists known for Attack on Titan and Fullmetal Alchemist, respectively. 有元葉子 (Yoko Arimoto) writes many books about home cooking for homemakers. ワタナベマキ (Maki Watanabe) and 梶晶子 (Akiko Kaji) also write many books about home cooking. 柳沢小実 (Konomi Yanagisawa) and 石村由起子 (Yukiko Ishimura) write books about interiors for homemakers. These results demonstrate that KGRec can make not only for user-based item recommendation but also author-based author recommendation. This result can be naturally extended to other attributes such as user-based author recommendation and item-based item recommendation.


Experiments for Q4:

In this section, I confirm the scalability of KGRec. Since the dimension of embeddings in KGRec is 32, it is expected that KGRec is highly scalable. I conduct item-item recommendation, where similarity is calculated by cosine similarity of  \mathbf{h}_i. I conduct this experiment on an instance of Google Compute Engine with single vCPU without GPU. There are 262661 items in total, and I calculate scores of all items with exact computation one by one. In average, it takes only 0.09 second to calculate the full ranking for each item. Considering the computation is done with only a single CPU, and this is exact computation, KGRec is highly efficient. KGRec can be further accelerated by multiple CPUs or GPUs, or efficient search libraries such as faiss, FLANN, and annoy. I leave this direction for future work.


In this internship, I built KGRec, a catalog-based recommendation system using a knowledge graph. This alleviates the cold-start problem, is interpretable, and scalable. I demonstrated the effectiveness of KGRec through extensive experiments.

There are many existing knowledge graphs such as YAGO, DBpedia, Google’s knowledge graph, and Satori. Utilizing such off-the-shelf knowledge graphs for the recommendation in Mercari is an interesting future direction. Most importantly, deploying KGRec to the Mercari C2C marketplace app to measure the effectiveness of this method in the real-world environment is promising future work.


Wenqi Fan, Yao Ma, Qing Li, Yuan He, Eric Zhao, Jiliang Tang, Dawei Yin. Graph Neural Networks for Social Recommendation. WWW, 2019.

Diederik P. Kingma, Jimmy Ba. Adam: A Method for Stochastic Optimization. ICLR, 2015.

Yehuda Koren, Robert M. Bell, Chris Volinsky. Matrix Factorization Techniques for Recommender Systems. IEEE Computer, 2009.

Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, Lars Schmidt-Thieme. BPR: Bayesian Personalized Ranking from Implicit Feedback. UAI, 2009.

Hongwei Wang, Fuzheng Zhang, Jialin Wang, Miao Zhao, Wenjie Li, Xing Xie, Minyi Guo. RippleNet: Propagating User Preferences on the Knowledge Graph for Recommender Systems. CIKM, 2018.

Hongwei Wang, Miao Zhao, Xing Xie, Wenjie Li, Minyi Guo. Knowledge Graph Convolutional Networks for Recommender Systems. WWW, 2019.

Hongwei Wang, Fuzheng Zhang, Mengdi Zhang, Jure Leskovec, Miao Zhao, Wenjie Li, Zhongyuan Wang. Knowledge-aware Graph Neural Networks with Label Smoothness Regularization for Recommender Systems. KDD, 2019.

Xiang Wang, Xiangnan He, Yixin Cao, Meng Liu, Tat-Seng Chua. KGAT: Knowledge Graph Attention Network for Recommendation. KDD, 2019.

Xiang Wang, Dingxian Wang, Canran Xu, Xiangnan He, Yixin Cao, Tat-Seng Chua. Explainable Reasoning over Knowledge Graphs for Recommendation. AAAI, 2019.

Xiao Yu, Xiang Ren, Yizhou Sun, Quanquan Gu, Bradley Sturt, Urvashi Khandelwal, Brandon Norick, Jiawei Han. Personalized entity recommendation: a heterogeneous information network approach. WSDM, 2014.

Fuzheng Zhang, Nicholas Jing Yuan, Defu Lian, Xing Xie, Wei-Ying Ma. Collaborative knowledge base embedding for recommender systems. KDD, 2016.

  1. In this experiment, I do not show buying histories of real users but virtual users to protect the privacy of Mercari users.