Collaborative Deep Learning for Recommender Systems
H. Wang et al. Collaborative Deep Learning for Recommender Systems. arXiv:1409.2944, 2014. https://arxiv.org/abs/1409.2944
Contents
code
Author's homepage: http://www.wanghao.in
Official project page: http://www.wanghao.in/CDL.htm
Official code: https://github.com/js05212/CDL
Other papers by the author (2016):
- Relational deep learning: A deep latent variable model for link prediction
- Natural parameter networks: a class of probabilistic neural networks
- Collaborative recurrent autoencoder: recommend while learning to fill in the blanks
Other implementation: https://github.com/akash13singh/mxnet-for-cdl
Code analysis and hands-on experiments: Collaborative Deep Learning for Recommender Systems : Practice
intro
- CF is weak when the rating matrix is sparse, and it suffers from the cold-start problem.
- CTR (collaborative topic regression) appeared to address this.
- This paper proposes CDL (Collaborative Deep Learning), a hierarchical Bayesian model:
  - deep learning handles representation learning of the content,
  - CF handles the ratings.
- Because of CF's weaknesses, many methods exploit auxiliary information:
  - loosely coupled
    - the auxiliary information is processed first and then fed into CF, so information flows only one way
  - tightly coupled
    - the two components interact in both directions and are trained jointly (CDL's approach)
- We first present a Bayesian formulation of a deep learning model called stacked denoising autoencoder (SDAE) [4]. With this, we then present our CDL model which tightly couples deep representation learning for the content information and collaborative filtering for the ratings (feedback) matrix, allowing two-way interaction between the two.
- Other models such as RBM, CNN, or RNN can be substituted for the SDAE.
- contributions
- Unlike previous deep learning models which use simple target like classification [5] and reconstruction [6], we propose to use CF as a more complex target in a probabilistic framework.
- Besides the algorithm for attaining maximum a posteriori (MAP) estimates, we also derive a sampling-based algorithm for the Bayesian treatment of CDL, which, interestingly, turns out to be a Bayesian generalized version of back-propagation.
- etc.
notation
- \(\mathbf{X}_c\) is a \(J\times S\) matrix: there are \(J\) items, each represented by an \(S\)-dimensional (bag-of-words) vector; the subscript c stands for "clean". Its noise-corrupted version \(\mathbf{X}_0\) is fed into the SDAE, which is trained to reconstruct \(\mathbf{X}_c\).
- \(\mathbf{R}\) is the \(I\times J\) rating matrix, where \(I\) is the number of users.
- Note that an L/2-layer SDAE corresponds to an L-layer network.
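As a concrete illustration of the notation, here is a small sketch of building the corrupted input \(\mathbf{X}_0\) from the clean matrix \(\mathbf{X}_c\); the masking-noise scheme and the matrix sizes are assumptions for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

J, S = 5, 8  # J items, each an S-dimensional bag-of-words vector
X_c = rng.integers(0, 3, size=(J, S)).astype(float)  # clean matrix X_c

# Masking noise: zero out a random fraction of the entries, as is standard
# for denoising autoencoders (the exact corruption scheme is an assumption).
corruption_level = 0.3
mask = rng.random((J, S)) >= corruption_level
X_0 = X_c * mask  # corrupted matrix X_0, the actual SDAE input

print(X_0.shape)  # (5, 8)
```

Entries that survive the mask keep their clean values; the rest are zeroed, so the SDAE must learn to fill them back in.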
CDL
Stacked Denoising Autoencoders
Autoencoders, simply stacked: the network is trained to reconstruct the clean input \(\mathbf{X}_c\) from the corrupted input \(\mathbf{X}_0\).
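A minimal numpy sketch of the SDAE forward pass; the layer sizes, sigmoid activation, and initialization here are illustrative choices, not the authors' configuration. The middle activation \(\mathbf{X}_{L/2}\) is the learned item representation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sdae_forward(X0, weights, biases):
    """Forward pass of an SDAE: the first half of the layers encode the
    corrupted input, the second half decode it back. Returns all layer
    activations, including the input itself."""
    X = X0
    activations = [X]
    for W, b in zip(weights, biases):
        X = sigmoid(X @ W + b)
        activations.append(X)
    return activations

rng = np.random.default_rng(0)
J, S, K = 5, 8, 2                      # 5 items, 8-dim input, 2-dim code
sizes = [S, 4, K, 4, S]                # encoder S->4->K, decoder K->4->S
weights = [rng.normal(0, 0.1, (a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]

acts = sdae_forward(rng.random((J, S)), weights, biases)
middle = acts[len(acts) // 2]          # X_{L/2}: the item representation
recon = acts[-1]                       # X_L: reconstruction of X_c
print(middle.shape, recon.shape)       # (5, 2) (5, 8)
```

Training would minimize the reconstruction error between `recon` and the clean input, layer by layer or end to end.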
Generalized Bayesian SDAE
Note that while generation of the clean input \(\mathbf{X}_c\) from \(\mathbf{X}_L\) is part of the generative process of the Bayesian SDAE, generation of the noise-corrupted input \(\mathbf{X}_0\) from \(\mathbf{X}_c\) is an artificial noise injection process to help the SDAE learn a more robust feature representation.
Collaborative Deep Learning
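For reference, a summary of CDL's generative process as described in the paper (reconstructed from memory, so the exact index sets and identity-matrix dimensions may differ slightly; \(\lambda_w,\lambda_s,\lambda_n,\lambda_v,\lambda_u\) are precision hyperparameters and \(C_{ij}\) is the confidence of rating \(R_{ij}\)):

```latex
\begin{aligned}
&\text{For each layer } l \text{ of the SDAE:}\\
&\quad \mathbf{W}_l \sim \mathcal{N}(0,\lambda_w^{-1}\mathbf{I}),\qquad
       \mathbf{b}_l \sim \mathcal{N}(0,\lambda_w^{-1}\mathbf{I})\\
&\quad \mathbf{X}_{l,j*} \sim \mathcal{N}\big(\sigma(\mathbf{X}_{l-1,j*}\mathbf{W}_l+\mathbf{b}_l),\ \lambda_s^{-1}\mathbf{I}\big)\\
&\text{For each item } j:\\
&\quad \mathbf{X}_{c,j*} \sim \mathcal{N}(\mathbf{X}_{L,j*},\ \lambda_n^{-1}\mathbf{I}),\qquad
       \mathbf{v}_j \sim \mathcal{N}(\mathbf{X}_{L/2,j*}^\top,\ \lambda_v^{-1}\mathbf{I})\\
&\text{For each user } i:\quad \mathbf{u}_i \sim \mathcal{N}(0,\lambda_u^{-1}\mathbf{I})\\
&\text{Each rating:}\quad R_{ij} \sim \mathcal{N}(\mathbf{u}_i^\top\mathbf{v}_j,\ C_{ij}^{-1})
\end{aligned}
```

The key coupling is that the item latent vector \(\mathbf{v}_j\) is centered at the SDAE middle-layer representation \(\mathbf{X}_{L/2,j*}\), which is what lets ratings and content inform each other.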
Maximum A Posteriori Estimates
The parameters are updated iteratively following the update rules (see the paper).
Note the difference from Collaborative Topic Modeling for Recommending Scientific Articles: there, \(\theta\), which corrects \(v_j\), can be precomputed, so only \(\mathbf{U}\) and \(\mathbf{V}\) need updating; here, because the neural network is trained jointly, each iteration cycles through \(\mathbf{U}, \mathbf{V}, \mathbf{W}, \mathbf{b}\) in turn.
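A simplified numpy sketch of one round of the alternating \(\mathbf{U}, \mathbf{V}\) updates with the SDAE weights held fixed. This is not the authors' implementation; the confidence values, hyperparameters, and warm-start of \(\mathbf{V}\) from the content encodings (its prior mean in CDL) are illustrative.

```python
import numpy as np

def map_update_U_V(R, C, E, lam_u=0.1, lam_v=10.0):
    """One round of block-coordinate MAP updates for U and V.
    R: I x J ratings, C: I x J confidences (a for observed, b otherwise),
    E: J x K middle-layer encodings f_e(X_0, W+)."""
    I, J = R.shape
    K = E.shape[1]
    U = np.zeros((I, K))
    V = E.copy()  # warm-start V at its prior mean, the content encoding
    for i in range(I):
        Ci = np.diag(C[i])
        U[i] = np.linalg.solve(V.T @ Ci @ V + lam_u * np.eye(K),
                               V.T @ Ci @ R[i])
    for j in range(J):
        Cj = np.diag(C[:, j])
        V[j] = np.linalg.solve(U.T @ Cj @ U + lam_v * np.eye(K),
                               U.T @ Cj @ R[:, j] + lam_v * E[j])
    return U, V

rng = np.random.default_rng(0)
I, J, K = 4, 5, 2
R = (rng.random((I, J)) > 0.5).astype(float)   # toy implicit feedback
C = np.where(R > 0, 1.0, 0.01)                 # confidences a=1, b=0.01
E = rng.normal(0, 0.1, (J, K))                 # stand-in SDAE encodings
U, V = map_update_U_V(R, C, E)
print(U.shape, V.shape)  # (4, 2) (5, 2)
```

In the full algorithm this round would alternate with a backpropagation step that updates the network weights \(\mathbf{W}, \mathbf{b}\) given the current \(\mathbf{V}\).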
Prediction
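A minimal sketch of how prediction works, assuming learned \(\mathbf{U}, \mathbf{V}\) and a trained encoder; all names and values here are illustrative stand-ins.

```python
import numpy as np

def predict_in_matrix(U, V):
    # For items seen during training: R_hat = U V^T, i.e. r_hat_ij = u_i^T v_j.
    return U @ V.T

def predict_cold_start(U, encoder, X0_new):
    # For a brand-new item with no ratings, only the content is available,
    # so the score falls back on the encoder output: u_i^T f_e(x_0, W+).
    # `encoder` stands in for the trained SDAE encoder (an assumption here).
    return U @ encoder(X0_new).T

rng = np.random.default_rng(0)
U = rng.normal(size=(4, 2))           # 4 users, K = 2
V = rng.normal(size=(5, 2))           # 5 in-matrix items
scores = predict_in_matrix(U, V)      # one score per (user, item) pair

encoder = lambda X: X @ rng.normal(size=(8, 2))  # toy stand-in encoder
new_items = rng.random((3, 8))                   # 3 unseen items, S = 8
cold = predict_cold_start(U, encoder, new_items)
print(scores.shape, cold.shape)
```

The cold-start path is what the joint training buys: content encodings live in the same latent space as the rating-derived item vectors.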
EXPERIMENTS
Datasets
Evaluation Scheme
Baselines and Experimental Settings
Quantitative Comparison
Qualitative Comparison
COMPLEXITY ANALYSIS AND IMPLEMENTATION
CONCLUSION AND FUTURE WORK
ACKNOWLEDGMENTS
REFERENCES
[1] D. Agarwal and B.-C. Chen. Regression-based latent factor models. In KDD, pages 19–28, 2009.
[2] P. Baldi and P. J. Sadowski. Understanding dropout. In NIPS, pages 2814–2822, 2013.
[3] Y. Bengio, L. Yao, G. Alain, and P. Vincent. Generalized denoising auto-encoders as generative models. In NIPS, pages 899–907, 2013.
[4] C. M. Bishop. Pattern Recognition and Machine Learning. Springer-Verlag New York, Inc., Secaucus, NJ, USA, 2006.
[5] D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. JMLR, 3:993–1022, 2003.
[6] J. Bobadilla, F. Ortega, A. Hernando, and A. Gutiérrez. Recommender systems survey. Knowledge Based Systems, 46:109–132, 2013.
[7] M. Chen, Z. E. Xu, K. Q. Weinberger, and F. Sha. Marginalized denoising autoencoders for domain adaptation. In ICML, pages 767–774, 2012.
[8] T. Chen, W. Zhang, Q. Lu, K. Chen, Z. Zheng, and Y. Yu. SVDFeature: a toolkit for feature-based collaborative filtering. JMLR, 13:3619–3622, 2012.
[9] K. Georgiev and P. Nakov. A non-iid framework for collaborative filtering with restricted Boltzmann machines. In ICML, pages 1148–1156, 2013.
[10] A. Graves, S. Fernández, F. J. Gomez, and J. Schmidhuber. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks. In ICML, pages 369–376, 2006.
[11] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Improving neural networks by preventing co-adaptation of feature detectors. CoRR, abs/1207.0580, 2012.
[12] L. Hu, J. Cao, G. Xu, L. Cao, Z. Gu, and C. Zhu. Personalized recommendation via cross-domain triadic factorization. In WWW, pages 595–606, 2013.
[13] Y. Hu, Y. Koren, and C. Volinsky. Collaborative filtering for implicit feedback datasets. In ICDM, pages 263–272, 2008.
[14] M. I. Jordan, Z. Ghahramani, T. Jaakkola, and L. K. Saul. An introduction to variational methods for graphical models. Machine Learning, 37(2):183–233, 1999.
[15] N. Kalchbrenner, E. Grefenstette, and P. Blunsom. A convolutional neural network for modelling sentences. In ACL, pages 655–665, 2014.
[16] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, pages 1106–1114, 2012.
[17] K. Lang. Newsweeder: Learning to filter netnews. In ICML, pages 331–339, 1995.
[18] W.-J. Li, D.-Y. Yeung, and Z. Zhang. Generalized latent factor models for social network analysis. In IJCAI, pages 1705–1710, 2011.
[19] D. J. C. MacKay. A practical Bayesian framework for backpropagation networks. Neural Computation, 4(3):448–472, 1992.
[20] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In NIPS, pages 3111–3119, 2013.
[21] A. V. D. Oord, S. Dieleman, and B. Schrauwen. Deep content-based music recommendation. In NIPS, pages 2643–2651, 2013.
[22] S. Purushotham, Y. Liu, and C.-C. J. Kuo. Collaborative topic regression with social matrix factorization for recommendation systems. In ICML, pages 759–766, 2012.
[23] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme. BPR: Bayesian personalized ranking from implicit feedback. In UAI, pages 452–461, 2009.
[24] T. N. Sainath, B. Kingsbury, V. Sindhwani, E. Arisoy, and B. Ramabhadran. Low-rank matrix factorization for deep neural network training with high-dimensional output targets. In ICASSP, pages 6655–6659, 2013.
[25] R. Salakhutdinov and G. E. Hinton. Deep Boltzmann machines. In AISTATS, pages 448–455, 2009.
[26] R. Salakhutdinov and G. E. Hinton. Semantic hashing. Int. J. Approx. Reasoning, 50(7):969–978, 2009.
[27] R. Salakhutdinov and A. Mnih. Probabilistic matrix factorization. In NIPS, pages 1257–1264, 2007.
[28] R. Salakhutdinov, A. Mnih, and G. E. Hinton. Restricted Boltzmann machines for collaborative filtering. In ICML, pages 791–798, 2007.
[29] S. G. Sevil, O. Kucuktunc, P. Duygulu, and F. Can. Automatic tag expansion using visual similarity for photo sharing websites. Multimedia Tools Appl., 49(1):81–99, 2010.
[30] A. P. Singh and G. J. Gordon. Relational learning via collective matrix factorization. In KDD, pages 650–658, 2008.
[31] R. S. Strichartz. A Guide to Distribution Theory and Fourier Transforms. World Scientific, 2003.
[32] P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. JMLR, 11:3371–3408, 2010.
[33] S. Wager, S. Wang, and P. Liang. Dropout training as adaptive regularization. In NIPS, pages 351–359, 2013.
[34] C. Wang and D. M. Blei. Collaborative topic modeling for recommending scientific articles. In KDD, pages 448–456, 2011.
[35] H. Wang, B. Chen, and W.-J. Li. Collaborative topic regression with social regularization for tag recommendation. In IJCAI, pages 2719–2725, 2013.
[36] H. Wang and W. Li. Relational collaborative topic regression for recommender systems. TKDE, 27(5):1343–1355, 2015.
[37] H. Wang, X. Shi, and D. Yeung. Relational stacked denoising autoencoder for tag recommendation. In AAAI, pages 3052–3058, 2015.
[38] N. Wang and D.-Y. Yeung. Learning a deep compact image representation for visual tracking. In NIPS, pages 809–817, 2013.
[39] X. Wang and Y. Wang. Improving content-based and hybrid music recommendation using deep learning. In ACM MM, pages 627–636, 2014.
[40] W. Zhang, H. Sun, X. Liu, and X. Guo. Temporal qos-aware web service recommendation via non-negative tensor factorization. In WWW, pages 585–596, 2014.
[41] K. Zhou and H. Zha. Learning binary codes for collaborative filtering. In KDD, pages 498–506, 2012.