Collaborative Deep Learning for Recommender Systems
H. Wang et al. Collaborative Deep Learning for Recommender Systems. arXiv:1409.2944, 2014. https://arxiv.org/abs/1409.2944
Contents
code
Author's homepage: http://www.wanghao.in
Official project page: http://www.wanghao.in/CDL.htm
Official code: https://github.com/js05212/CDL
Other papers by the author (2016):
- Relational deep learning: A deep latent variable model for link prediction
- Natural parameter networks: a class of probabilistic neural networks
- Collaborative recurrent autoencoder: recommend while learning to fill in the blanks
Other implementation: https://github.com/akash13singh/mxnet-for-cdl
Code analysis and hands-on experiments: Collaborative Deep Learning for Recommender Systems : Practice
intro
- CF is weak when the rating matrix is sparse, and it suffers from the cold-start problem.
- CTR (collaborative topic regression) appeared to address this.
- This paper proposes CDL (Collaborative Deep Learning), a hierarchical Bayesian model:
  - deep learning handles representation learning of the content,
  - CF handles the ratings.
- Because of CF's weaknesses, many methods exploit auxiliary information:
  - loosely coupled
    - the auxiliary information is processed first and then fed into CF, so information flows only one way
  - tightly coupled
    - the two components interact in both directions and are trained jointly (CDL's approach)
- We first present a Bayesian formulation of a deep learning model called stacked denoising autoencoder (SDAE) [4]. With this, we then present our CDL model which tightly couples deep representation learning for the content information and collaborative filtering for the ratings (feedback) matrix, allowing two-way interaction between the two.
- Other models such as RBM, CNN, or RNN can be substituted for the SDAE.
- contributions
- Unlike previous deep learning models which use simple target like classification [5] and reconstruction [6], we propose to use CF as a more complex target in a probabilistic framework.
- Besides the algorithm for attaining maximum a posteriori (MAP) estimates, we also derive a sampling-based algorithm for the Bayesian treatment of CDL, which, interestingly, turns out to be a Bayesian generalized version of back-propagation.
- etc.
notation
- \(\mathbf{X}_c\) is a \(J\times S\) matrix: there are \(J\) items, each represented by an \(S\)-dimensional (bag-of-words) vector; the subscript c stands for "clean". Its noise-corrupted version \(\mathbf{X}_0\) is fed into the SDAE, which is trained to reconstruct \(\mathbf{X}_c\).
- \(\mathbf{R}\) is the \(I\times J\) rating matrix, where \(I\) is the number of users.
- Note that an L/2-layer SDAE corresponds to an L-layer network.
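As a concrete illustration of the notation, here is a small sketch of building the corrupted input \(\mathbf{X}_0\) from the clean matrix \(\mathbf{X}_c\); the masking-noise scheme and the matrix sizes are assumptions for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

J, S = 5, 8  # J items, each an S-dimensional bag-of-words vector
X_c = rng.integers(0, 3, size=(J, S)).astype(float)  # clean matrix X_c

# Masking noise: zero out a random fraction of the entries, as is standard
# for denoising autoencoders (the exact corruption scheme is an assumption).
corruption_level = 0.3
mask = rng.random((J, S)) >= corruption_level
X_0 = X_c * mask  # corrupted matrix X_0, the actual SDAE input

print(X_0.shape)  # (5, 8)
```

Entries that survive the mask keep their clean values; the rest are zeroed, so the SDAE must learn to fill them back in.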
CDL
Stacked Denoising Autoencoders
Autoencoders, simply stacked: the network is trained to reconstruct the clean input \(\mathbf{X}_c\) from the corrupted input \(\mathbf{X}_0\).
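A minimal numpy sketch of the SDAE forward pass; the layer sizes, sigmoid activation, and initialization here are illustrative choices, not the authors' configuration. The middle activation \(\mathbf{X}_{L/2}\) is the learned item representation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sdae_forward(X0, weights, biases):
    """Forward pass of an SDAE: the first half of the layers encode the
    corrupted input, the second half decode it back. Returns all layer
    activations, including the input itself."""
    X = X0
    activations = [X]
    for W, b in zip(weights, biases):
        X = sigmoid(X @ W + b)
        activations.append(X)
    return activations

rng = np.random.default_rng(0)
J, S, K = 5, 8, 2                      # 5 items, 8-dim input, 2-dim code
sizes = [S, 4, K, 4, S]                # encoder S->4->K, decoder K->4->S
weights = [rng.normal(0, 0.1, (a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]

acts = sdae_forward(rng.random((J, S)), weights, biases)
middle = acts[len(acts) // 2]          # X_{L/2}: the item representation
recon = acts[-1]                       # X_L: reconstruction of X_c
print(middle.shape, recon.shape)       # (5, 2) (5, 8)
```

Training would minimize the reconstruction error between `recon` and the clean input, layer by layer or end to end.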
Generalized Bayesian SDAE
Note that while generation of the clean input \(\mathbf{X}_c\) from \(\mathbf{X}_L\) is part of the generative process of the Bayesian SDAE, generation of the noise-corrupted input \(\mathbf{X}_0\) from \(\mathbf{X}_c\) is an artificial noise injection process to help the SDAE learn a more robust feature representation.
Collaborative Deep Learning
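For reference, a summary of CDL's generative process as described in the paper (reconstructed from memory, so the exact index sets and identity-matrix dimensions may differ slightly; \(\lambda_w,\lambda_s,\lambda_n,\lambda_v,\lambda_u\) are precision hyperparameters and \(C_{ij}\) is the confidence of rating \(R_{ij}\)):

```latex
\begin{aligned}
&\text{For each layer } l \text{ of the SDAE:}\\
&\quad \mathbf{W}_l \sim \mathcal{N}(0,\lambda_w^{-1}\mathbf{I}),\qquad
       \mathbf{b}_l \sim \mathcal{N}(0,\lambda_w^{-1}\mathbf{I})\\
&\quad \mathbf{X}_{l,j*} \sim \mathcal{N}\big(\sigma(\mathbf{X}_{l-1,j*}\mathbf{W}_l+\mathbf{b}_l),\ \lambda_s^{-1}\mathbf{I}\big)\\
&\text{For each item } j:\\
&\quad \mathbf{X}_{c,j*} \sim \mathcal{N}(\mathbf{X}_{L,j*},\ \lambda_n^{-1}\mathbf{I}),\qquad
       \mathbf{v}_j \sim \mathcal{N}(\mathbf{X}_{L/2,j*}^\top,\ \lambda_v^{-1}\mathbf{I})\\
&\text{For each user } i:\quad \mathbf{u}_i \sim \mathcal{N}(0,\lambda_u^{-1}\mathbf{I})\\
&\text{Each rating:}\quad R_{ij} \sim \mathcal{N}(\mathbf{u}_i^\top\mathbf{v}_j,\ C_{ij}^{-1})
\end{aligned}
```

The key coupling is that the item latent vector \(\mathbf{v}_j\) is centered at the SDAE middle-layer representation \(\mathbf{X}_{L/2,j*}\), which is what lets ratings and content inform each other.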
Maximum A Posteriori Estimates
The parameters are updated iteratively following the update rules (see the paper).
Note the difference from Collaborative Topic Modeling for Recommending Scientific Articles: there, \(\theta\), which corrects \(v_j\), can be precomputed, so only \(\mathbf{U}\) and \(\mathbf{V}\) need updating; here, because the neural network is trained jointly, each iteration cycles through \(\mathbf{U}, \mathbf{V}, \mathbf{W}, \mathbf{b}\) in turn.
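A simplified numpy sketch of one round of the alternating \(\mathbf{U}, \mathbf{V}\) updates with the SDAE weights held fixed. This is not the authors' implementation; the confidence values, hyperparameters, and warm-start of \(\mathbf{V}\) from the content encodings (its prior mean in CDL) are illustrative.

```python
import numpy as np

def map_update_U_V(R, C, E, lam_u=0.1, lam_v=10.0):
    """One round of block-coordinate MAP updates for U and V.
    R: I x J ratings, C: I x J confidences (a for observed, b otherwise),
    E: J x K middle-layer encodings f_e(X_0, W+)."""
    I, J = R.shape
    K = E.shape[1]
    U = np.zeros((I, K))
    V = E.copy()  # warm-start V at its prior mean, the content encoding
    for i in range(I):
        Ci = np.diag(C[i])
        U[i] = np.linalg.solve(V.T @ Ci @ V + lam_u * np.eye(K),
                               V.T @ Ci @ R[i])
    for j in range(J):
        Cj = np.diag(C[:, j])
        V[j] = np.linalg.solve(U.T @ Cj @ U + lam_v * np.eye(K),
                               U.T @ Cj @ R[:, j] + lam_v * E[j])
    return U, V

rng = np.random.default_rng(0)
I, J, K = 4, 5, 2
R = (rng.random((I, J)) > 0.5).astype(float)   # toy implicit feedback
C = np.where(R > 0, 1.0, 0.01)                 # confidences a=1, b=0.01
E = rng.normal(0, 0.1, (J, K))                 # stand-in SDAE encodings
U, V = map_update_U_V(R, C, E)
print(U.shape, V.shape)  # (4, 2) (5, 2)
```

In the full algorithm this round would alternate with a backpropagation step that updates the network weights \(\mathbf{W}, \mathbf{b}\) given the current \(\mathbf{V}\).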
Prediction
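A minimal sketch of how prediction works, assuming learned \(\mathbf{U}, \mathbf{V}\) and a trained encoder; all names and values here are illustrative stand-ins.

```python
import numpy as np

def predict_in_matrix(U, V):
    # For items seen during training: R_hat = U V^T, i.e. r_hat_ij = u_i^T v_j.
    return U @ V.T

def predict_cold_start(U, encoder, X0_new):
    # For a brand-new item with no ratings, only the content is available,
    # so the score falls back on the encoder output: u_i^T f_e(x_0, W+).
    # `encoder` stands in for the trained SDAE encoder (an assumption here).
    return U @ encoder(X0_new).T

rng = np.random.default_rng(0)
U = rng.normal(size=(4, 2))           # 4 users, K = 2
V = rng.normal(size=(5, 2))           # 5 in-matrix items
scores = predict_in_matrix(U, V)      # one score per (user, item) pair

encoder = lambda X: X @ rng.normal(size=(8, 2))  # toy stand-in encoder
new_items = rng.random((3, 8))                   # 3 unseen items, S = 8
cold = predict_cold_start(U, encoder, new_items)
print(scores.shape, cold.shape)
```

The cold-start path is what the joint training buys: content encodings live in the same latent space as the rating-derived item vectors.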
EXPERIMENTS
Datasets
Evaluation Scheme
Baselines and Experimental Settings
Quantitative Comparison
Qualitative Comparison
COMPLEXITY ANALYSIS AND IMPLEMENTATION
CONCLUSION AND FUTURE WORK
ACKNOWLEDGMENTS
REFERENCES
[1] D. Agarwal and B.-C. Chen. Regression-based latent factor models. In KDD, pages 19–28, 2009.
[2] P. Baldi and P. J. Sadowski. Understanding dropout. In NIPS, pages 2814–2822, 2013.
[3] Y. Bengio, L. Yao, G. Alain, and P. Vincent. Generalized denoising auto-encoders as generative models. In NIPS, pages 899–907, 2013.
[4] C. M. Bishop. Pattern Recognition and Machine Learning. Springer-Verlag New York, Inc., Secaucus, NJ, USA, 2006.
[5] D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. JMLR, 3:993–1022, 2003.
[6] J. Bobadilla, F. Ortega, A. Hernando, and A. Gutiérrez. Recommender systems survey. Knowledge Based Systems, 46:109–132, 2013.
[7] M. Chen, Z. E. Xu, K. Q. Weinberger, and F. Sha. Marginalized denoising autoencoders for domain adaptation. In ICML, pages 767–774, 2012.
[8] T. Chen, W. Zhang, Q. Lu, K. Chen, Z. Zheng, and Y. Yu. SVDFeature: a toolkit for feature-based collaborative filtering. JMLR, 13:3619–3622, 2012.
[9] K. Georgiev and P. Nakov. A non-iid framework for collaborative filtering with restricted Boltzmann machines. In ICML, pages 1148–1156, 2013.
[10] A. Graves, S. Fernández, F. J. Gomez, and J. Schmidhuber. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks. In ICML, pages 369–376, 2006.
[11] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Improving neural networks by preventing co-adaptation of feature detectors. CoRR, abs/1207.0580, 2012.
[12] L. Hu, J. Cao, G. Xu, L. Cao, Z. Gu, and C. Zhu. Personalized recommendation via cross-domain triadic factorization. In WWW, pages 595–606, 2013.
[13] Y. Hu, Y. Koren, and C. Volinsky. Collaborative filtering for implicit feedback datasets. In ICDM, pages 263–272, 2008.
[14] M. I. Jordan, Z. Ghahramani, T. Jaakkola, and L. K. Saul. An introduction to variational methods for graphical models. Machine Learning, 37(2):183–233, 1999.
[15] N. Kalchbrenner, E. Grefenstette, and P. Blunsom. A convolutional neural network for modelling sentences. In ACL, pages 655–665, 2014.
[16] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, pages 1106–1114, 2012.
[17] K. Lang. Newsweeder: Learning to filter netnews. In ICML, pages 331–339, 1995.
[18] W.-J. Li, D.-Y. Yeung, and Z. Zhang. Generalized latent factor models for social network analysis. In IJCAI, pages 1705–1710, 2011.
[19] D. J. C. MacKay. A practical Bayesian framework for backpropagation networks. Neural Computation, 4(3):448–472, 1992.
[20] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In NIPS, pages 3111–3119, 2013.
[21] A. V. D. Oord, S. Dieleman, and B. Schrauwen. Deep content-based music recommendation. In NIPS, pages 2643–2651, 2013.
[22] S. Purushotham, Y. Liu, and C.-C. J. Kuo. Collaborative topic regression with social matrix factorization for recommendation systems. In ICML, pages 759–766, 2012.
[23] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme. BPR: Bayesian personalized ranking from implicit feedback. In UAI, pages 452–461, 2009.
[24] T. N. Sainath, B. Kingsbury, V. Sindhwani, E. Arisoy, and B. Ramabhadran. Low-rank matrix factorization for deep neural network training with high-dimensional output targets. In ICASSP, pages 6655–6659, 2013.
[25] R. Salakhutdinov and G. E. Hinton. Deep Boltzmann machines. In AISTATS, pages 448–455, 2009.
[26] R. Salakhutdinov and G. E. Hinton. Semantic hashing. Int. J. Approx. Reasoning, 50(7):969–978, 2009.
[27] R. Salakhutdinov and A. Mnih. Probabilistic matrix factorization. In NIPS, pages 1257–1264, 2007.
[28] R. Salakhutdinov, A. Mnih, and G. E. Hinton. Restricted Boltzmann machines for collaborative filtering. In ICML, pages 791–798, 2007.
[29] S. G. Sevil, O. Kucuktunc, P. Duygulu, and F. Can. Automatic tag expansion using visual similarity for photo sharing websites. Multimedia Tools Appl., 49(1):81–99, 2010.
[30] A. P. Singh and G. J. Gordon. Relational learning via collective matrix factorization. In KDD, pages 650–658, 2008.
[31] R. S. Strichartz. A Guide to Distribution Theory and Fourier Transforms. World Scientific, 2003.
[32] P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. JMLR, 11:3371–3408, 2010.
[33] S. Wager, S. Wang, and P. Liang. Dropout training as adaptive regularization. In NIPS, pages 351–359, 2013.
[34] C. Wang and D. M. Blei. Collaborative topic modeling for recommending scientific articles. In KDD, pages 448–456, 2011.
[35] H. Wang, B. Chen, and W.-J. Li. Collaborative topic regression with social regularization for tag recommendation. In IJCAI, pages 2719–2725, 2013.
[36] H. Wang and W. Li. Relational collaborative topic regression for recommender systems. TKDE, 27(5):1343–1355, 2015.
[37] H. Wang, X. Shi, and D. Yeung. Relational stacked denoising autoencoder for tag recommendation. In AAAI, pages 3052–3058, 2015.
[38] N. Wang and D.-Y. Yeung. Learning a deep compact image representation for visual tracking. In NIPS, pages 809–817, 2013.
[39] X. Wang and Y. Wang. Improving content-based and hybrid music recommendation using deep learning. In ACM MM, pages 627–636, 2014.
[40] W. Zhang, H. Sun, X. Liu, and X. Guo. Temporal qos-aware web service recommendation via non-negative tensor factorization. In WWW, pages 585–596, 2014.
[41] K. Zhou and H. Zha. Learning binary codes for collaborative filtering. In KDD, pages 498–506, 2012.