Collaborative Deep Learning for Recommender Systems

Paper: Collaborative Deep Learning for Recommender Systems (H. Wang et al., 2014), arXiv:1409.2944: https://arxiv.org/abs/1409.2944

code

Author's homepage: http://www.wanghao.in

Official project page: http://www.wanghao.in/CDL.htm

Official code: https://github.com/js05212/CDL

Other papers by the author (2016):

  • Relational deep learning: A deep latent variable model for link prediction
  • Natural parameter networks: a class of probabilistic neural networks
  • Collaborative recurrent autoencoder: recommend while learning to fill in the blanks

Other implementations: https://github.com/akash13singh/mxnet-for-cdl (MXNet port)

Code walkthrough and hands-on experiments: Collaborative Deep Learning for Recommender Systems : Practice

intro

  • Collaborative filtering (CF) is weak when the rating matrix is sparse, and it suffers from the cold-start problem.
    • Collaborative topic regression (CTR) was introduced to address this.
  • The paper proposes Collaborative Deep Learning (CDL), a hierarchical Bayesian model:
    • deep learning handles representation learning for the content,
    • CF handles the ratings.
  • Because of CF's weaknesses, many methods exploit auxiliary information. Two styles:
    1. loosely coupled
      • The auxiliary information is processed first, and the resulting features are fed into CF.
    2. tightly coupled
      • The rating information can guide the learning of features, and vice versa.
      • Collaborative topic regression (CTR) [1]
        = Latent Dirichlet allocation (LDA) [2](as a topic model) + probabilistic matrix factorization (PMF) [3](as a model-based CF).
      • However, CTR does not work well when the auxiliary information is itself sparse.
  • We first present a Bayesian formulation of a deep learning model called stacked denoising autoencoder (SDAE) [4]. With this, we then present our CDL model which tightly couples deep representation learning for the content information and collaborative filtering for the ratings (feedback) matrix, allowing two-way interaction between the two.
    • The SDAE is not essential; other models such as RBMs, CNNs, or RNNs could be used instead.
  • contributions
    • Unlike previous deep learning models that use simple targets such as classification [5] and reconstruction [4], we propose to use CF as a more complex target in a probabilistic framework.
    • Besides the algorithm for attaining maximum a posteriori (MAP) estimates, we also derive a sampling-based algorithm for the Bayesian treatment of CDL, which, interestingly, turns out to be a Bayesian generalized version of back-propagation.
    • etc.

notation

  • \(\mathbf{X}_c\) is a \(J\times S\) matrix: there are \(J\) items, each represented by an \(S\)-dimensional content vector (e.g., bag-of-words). The subscript c stands for "clean". The SDAE is fed the noise-corrupted matrix \(\mathbf{X}_0\) and trained to reconstruct \(\mathbf{X}_c\); the snippet after this list makes the shapes concrete.
  • \(\mathbf{R}\) is the \(I\times J\) rating matrix, where \(I\) is the number of users.
  • Note that an L/2-layer SDAE corresponds to an L-layer network (L/2 encoder layers plus L/2 decoder layers).
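
To make the shapes concrete, a tiny numpy sketch (all sizes and the 30% noise level here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

I, J, S = 100, 500, 8000            # users, items, content dimension (made up)

# X_c: clean item content matrix (J x S), e.g., a bag-of-words representation.
X_c = rng.random((J, S))

# X_0: the noise-corrupted SDAE input, here via random masking noise.
mask = rng.random((J, S)) < 0.3     # corrupt roughly 30% of the entries
X_0 = np.where(mask, 0.0, X_c)

# R: (I x J) rating matrix; with implicit feedback the entries are 0/1.
R = (rng.random((I, J)) < 0.01).astype(float)
```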

CDL

Stacked Denoising Autoencoders

Denoising autoencoders stacked into a deep network: each layer feeds the next, and the whole network is trained to reconstruct the clean input from a corrupted version. A minimal sketch is given below.
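
A minimal PyTorch sketch of the idea, assuming made-up layer sizes and masking noise; this is not the paper's implementation (the official code is linked above):

```python
import torch
import torch.nn as nn

class SDAE(nn.Module):
    """Minimal stacked denoising autoencoder: a symmetric MLP trained to
    reconstruct the clean input from a corrupted version of it."""
    def __init__(self, dims=(8000, 200, 50)):   # S -> hidden -> K; so L/2 = 2, L = 4
        super().__init__()
        enc = []
        for d_in, d_out in zip(dims, dims[1:]):
            enc += [nn.Linear(d_in, d_out), nn.Sigmoid()]
        rev = dims[::-1]
        dec = []
        for d_in, d_out in zip(rev, rev[1:]):
            dec += [nn.Linear(d_in, d_out), nn.Sigmoid()]
        self.encoder = nn.Sequential(*enc)
        self.decoder = nn.Sequential(*dec)

    def forward(self, x_corrupted):
        z = self.encoder(x_corrupted)            # middle-layer code X_{L/2}
        return self.decoder(z), z

model = SDAE()
x_clean = torch.rand(32, 8000)
x_noisy = x_clean * (torch.rand_like(x_clean) > 0.3)   # masking noise
x_hat, z = model(x_noisy)
loss = nn.functional.mse_loss(x_hat, x_clean)  # the target is the *clean* input
loss.backward()
```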

Generalized Bayesian SDAE

Note that while generation of the clean input \(\mathbf{X}_c\) from \(\mathbf{X}_L\) is part of the generative process of the Bayesian SDAE, generation of the noise-corrupted input \(\mathbf{X}_0\) from \(\mathbf{X}_c\) is an artificial noise injection process to help the SDAE learn a more robust feature representation.
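
For reference, the generative process reads (up to notation; \(\lambda_w, \lambda_s, \lambda_n\) are precision hyperparameters, and letting \(\lambda_s \to \infty\) makes the hidden layers effectively deterministic, which the paper exploits for efficiency):

\[
\begin{aligned}
\mathbf{W}_l &\sim \mathcal{N}(0,\ \lambda_w^{-1}\mathbf{I}), \qquad \mathbf{b}_l \sim \mathcal{N}(0,\ \lambda_w^{-1}\mathbf{I})\\
\mathbf{X}_{l,j*} &\sim \mathcal{N}\big(\sigma(\mathbf{X}_{l-1,j*}\mathbf{W}_l + \mathbf{b}_l),\ \lambda_s^{-1}\mathbf{I}\big)\\
\mathbf{X}_{c,j*} &\sim \mathcal{N}(\mathbf{X}_{L,j*},\ \lambda_n^{-1}\mathbf{I})
\end{aligned}
\]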

Collaborative Deep Learning
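
In brief (recapped from the paper): CDL ties each item's latent vector to the SDAE's middle layer through a Gaussian offset, and places a PMF-style model on the ratings:

\[
\begin{aligned}
\boldsymbol{\epsilon}_j &\sim \mathcal{N}(0,\ \lambda_v^{-1}\mathbf{I}_K), \qquad \mathbf{v}_j = \mathbf{X}_{L/2,j*}^{T} + \boldsymbol{\epsilon}_j\\
\mathbf{u}_i &\sim \mathcal{N}(0,\ \lambda_u^{-1}\mathbf{I}_K)\\
\mathbf{R}_{ij} &\sim \mathcal{N}(\mathbf{u}_i^{T}\mathbf{v}_j,\ \mathbf{C}_{ij}^{-1})
\end{aligned}
\]

Here \(\mathbf{C}_{ij}\) is a confidence parameter, larger for observed entries than for unobserved ones as in implicit-feedback CF. Anchoring \(\mathbf{v}_j\) to the middle layer \(\mathbf{X}_{L/2,j*}\) is what allows the two-way interaction between ratings and content representation.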

Maximum A Posteriori Estimates

The parameters are updated iteratively following the update rules (see the paper for the exact formulas).

Note the contrast with Collaborative Topic Modeling for Recommending Scientific Articles: there the topic proportions \(\theta\) that offset \(v_j\) can be precomputed, so only \(\mathbf{U}\) and \(\mathbf{V}\) are updated. In CDL the neural network is learned jointly, so each iteration cycles through \(\mathbf{U}, \mathbf{V}, \mathbf{W}, \mathbf{b}\) in turn; a sketch of the \(\mathbf{U}, \mathbf{V}\) part follows.
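
A rough numpy sketch of the \(\mathbf{U}, \mathbf{V}\) part of one iteration, following the CTR-style closed-form updates; `Z` stands for the SDAE middle-layer output \(f_e(\mathbf{X}_0)\) per item, and the gradient step on \(\mathbf{W}, \mathbf{b}\) is omitted:

```python
import numpy as np

def update_U_V(R, C, Z, U, V, lam_u, lam_v):
    """One alternating-least-squares pass over the MAP updates for U and V.

    R: (I, J) ratings; C: (I, J) confidences;
    Z: (J, K) SDAE middle-layer outputs f_e(X_0) per item;
    U: (I, K) user factors; V: (J, K) item factors.
    """
    K = U.shape[1]
    for i in range(R.shape[0]):
        Ci = C[i][:, None]                        # (J, 1) confidences of user i
        A = (V * Ci).T @ V + lam_u * np.eye(K)    # V^T C_i V + lambda_u I
        b = (V * Ci).T @ R[i]                     # V^T C_i R_i
        U[i] = np.linalg.solve(A, b)
    for j in range(R.shape[1]):
        Cj = C[:, j][:, None]                     # (I, 1) confidences for item j
        A = (U * Cj).T @ U + lam_v * np.eye(K)    # U^T C_j U + lambda_v I
        # item factors are pulled toward the SDAE representation Z[j]
        b = (U * Cj).T @ R[:, j] + lam_v * Z[j]
        V[j] = np.linalg.solve(A, b)
    return U, V
```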

Prediction
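
In brief, from the paper: for an item with ratings, \(\hat{R}_{ij} \approx \mathbf{u}_i^{T}\mathbf{v}_j\). For a new item with no ratings the offset \(\boldsymbol{\epsilon}_j\) cannot be estimated, so \(\hat{R}_{ij} \approx \mathbf{u}_i^{T} f_e(\mathbf{X}_{0,j*}, \mathbf{W}^+)^{T}\), i.e., the encoder output alone; this is how CDL handles the cold-start case.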

EXPERIMENTS

Datasets

Evaluation Scheme

Baselines and Experimental Settings

Quantitative Comparison

Qualitative Comparison

COMPLEXITY ANALYSIS AND IMPLEMENTATION

CONCLUSION AND FUTURE WORK

ACKNOWLEDGMENTS

REFERENCES

  1. C. Wang and D. M. Blei. Collaborative topic modeling for recommending scientific articles. In KDD, pages 448–456, 2011.
  2. D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. JMLR, 3:993–1022, 2003.
  3. R. Salakhutdinov and A. Mnih. Probabilistic matrix factorization. In NIPS, pages 1257–1264, 2007.
  4. P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. JMLR, 11:3371–3408, 2010.
  5. N. Kalchbrenner, E. Grefenstette, and P. Blunsom. A convolutional neural network for modelling sentences. In ACL, pages 655–665, 2014.
