"Collaborative Topic Modeling for Recommending Scientific Articles"의 두 판 사이의 차이
36번째 줄: | 36번째 줄: | ||
(LDA는 [https://en.wikipedia.org/wiki/Latent_Dirichlet_allocation 위키]봐도 무슨소린지.. 한글로 된 참고자료 [http://www.4four.us/article/2010/11/latent-dirichlet-allocation-simply]와 [http://khanrc.tistory.com/entry/Latent-Dirichlet-Allocation-LDA]를 보면 설명이 formal해서 그렇지 그렇게 어려운 내용은 아닌것도 같다. [http://parkcu.com/blog/latent-dirichlet-allocation/]에서는 LDA를 이용한 CF를 다루고 있다. 위키에 보면 처음 발표된 논문은 인용이 10k가 넘는 굉장히 기념비적인 논문이라고 함.) | (LDA는 [https://en.wikipedia.org/wiki/Latent_Dirichlet_allocation 위키]봐도 무슨소린지.. 한글로 된 참고자료 [http://www.4four.us/article/2010/11/latent-dirichlet-allocation-simply]와 [http://khanrc.tistory.com/entry/Latent-Dirichlet-Allocation-LDA]를 보면 설명이 formal해서 그렇지 그렇게 어려운 내용은 아닌것도 같다. [http://parkcu.com/blog/latent-dirichlet-allocation/]에서는 LDA를 이용한 CF를 다루고 있다. 위키에 보면 처음 발표된 논문은 인용이 10k가 넘는 굉장히 기념비적인 논문이라고 함.) | ||
− | ==COLLABORATIVE TOPIC REGRESSION== | + | ==COLLABORATIVE TOPIC REGRESSION (CTR)== |
[12]를 읽어야 이해할 수 있는 부분이 많은듯.(특히 parameter 학습과정) | [12]를 읽어야 이해할 수 있는 부분이 많은듯.(특히 parameter 학습과정) | ||
2017년 6월 12일 (월) 18:09 판
Collaborative topic modeling for recommending scientific articles. C. Wang and D. M. Blei. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '11), 2011.
source code(?): https://github.com/blei-lab/ctr
abstract
The goal is recommending scientific articles.
Our approach combines the merits of traditional collaborative filtering and probabilistic topic modeling.
intro
The classic ways to find related papers are 1) following citations, 2) keyword search, and 3) letting people share their personal reference libraries.
Collaborative filtering based on latent factor models [17, 18, 13, 1, 22] and content analysis based on probabilistic topic modeling [7, 8, 20, 2].
This paper uses both.
We combine these approaches in a probabilistic model, where making a recommendation for a particular user is akin to computing a conditional expectation of hidden variables. We will show how the algorithm for computing these expectations naturally balances the influence of the content of the articles and the libraries of the other users. An article that has not been seen by many will be recommended based more on its content; an article that has been widely seen will be recommended based more on the other users.
background
recommendation tasks
I users and J items
our task is to recommend articles that are not in her library but are potentially interesting.
Recommendation by Matrix Factorization
Most successful recommendation methods are latent factor models [17, 18, 13, 1, 22], which provide better recommendation results than the neighborhood methods [11, 13]
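The latent-factor idea can be sketched with a tiny numerically trained factorization (an illustrative toy, not the paper's exact PMF model; the rating matrix, rank, and hyperparameters below are made up):

```python
# Minimal latent-factor sketch: factorize a rating matrix R ≈ U V^T with
# L2 regularization, trained by simple gradient steps on observed entries.
import numpy as np

def factorize(R, k=2, lam=0.1, lr=0.01, steps=2000, seed=0):
    rng = np.random.default_rng(seed)
    I, J = R.shape
    U = 0.1 * rng.standard_normal((I, k))  # user latent vectors u_i
    V = 0.1 * rng.standard_normal((J, k))  # item latent vectors v_j
    mask = R > 0                           # treat zeros as unobserved
    for _ in range(steps):
        E = mask * (R - U @ V.T)           # errors on observed entries only
        U += lr * (E @ V - lam * U)        # gradient step with shrinkage
        V += lr * (E.T @ U - lam * V)
    return U, V

R = np.array([[5., 3., 0.],
              [4., 0., 1.],
              [0., 1., 5.]])
U, V = factorize(R)
mask = R > 0
pred = U @ V.T  # predicted ratings, including the unobserved cells
```

The zeros in `R` get filled in by the learned low-rank structure, which is exactly how such models recommend unseen items.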
Probabilistic Topic Models
The simplest topic model is latent Dirichlet allocation (LDA) [7]. (The [https://en.wikipedia.org/wiki/Latent_Dirichlet_allocation Wikipedia article] on LDA is hard to follow at first, but judging from the Korean references [http://www.4four.us/article/2010/11/latent-dirichlet-allocation-simply] and [http://khanrc.tistory.com/entry/Latent-Dirichlet-Allocation-LDA], the material is not actually that difficult; the standard explanations are just very formal. [http://parkcu.com/blog/latent-dirichlet-allocation/] covers CF using LDA. According to Wikipedia, the original LDA paper is a landmark with over 10k citations.)
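As a quick look at what LDA actually produces, here is a sketch using scikit-learn's implementation (an assumption for illustration; the paper runs its own variational LDA code). Each document j gets a topic-proportion vector θ_j:

```python
# Fit a 2-topic LDA on four toy documents and read off the per-document
# topic proportions θ_j. The corpus below is made up for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "neural network gradient training loss",
    "gradient descent loss optimization",
    "social network graph community edges",
    "graph community detection edges",
]
X = CountVectorizer().fit_transform(docs)          # word-count matrix
lda = LatentDirichletAllocation(n_components=2, random_state=0)
theta = lda.fit_transform(X)                       # rows ≈ θ_j, one per doc
# Each row of theta is a distribution over topics (non-negative, sums to 1).
```

In CTR these θ_j vectors become the content part of the item representation.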
COLLABORATIVE TOPIC REGRESSION (CTR)
Much of this section seems to require reading [12] first (especially the parameter-learning procedure).
CTR additionally includes a latent variable \(ε_j\) that offsets the topic proportions \(θ_j\) when modeling the user ratings. As more users rate articles, we have a better idea of what this offset is. This offset variable can explain, for example, that article A is more interesting to machine learning researchers than it is to social network analysis researchers. How much of the prediction relies on content and how much it relies on other users depends on how many users have rated the article.
$$v_j = ε_j + θ_j$$
Note that the expectation of \(r_{ij}\) is a linear function of \(θ_j\): $$ \mathbf{E}[r_{ij}|u_i,θ_j,ε_j] = u_i^T (θ_j + ε_j) $$ This is why we call the model collaborative topic regression.
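A minimal sketch of this prediction rule, with made-up latent vectors (all numbers below are illustrative assumptions):

```python
# CTR prediction rule: E[r_ij] = u_i^T (θ_j + ε_j), so the item vector
# v_j = θ_j + ε_j blends content (θ_j) with the collaborative offset (ε_j).
import numpy as np

u_i     = np.array([0.8, 0.1, 0.4])   # user latent vector
theta_j = np.array([0.6, 0.3, 0.1])   # topic proportions from LDA (sum to 1)
eps_j   = np.array([0.2, -0.1, 0.0])  # offset learned from other users

v_j = theta_j + eps_j                 # item latent vector
r_hat = u_i @ v_j                     # E[r_ij | u_i, θ_j, ε_j]
# For a rarely rated article, ε_j stays near its prior mean of 0, so the
# prediction is driven mostly by content: r_hat ≈ u_i^T θ_j.
```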
Training reportedly uses projection gradient [3].
The term "coordinate ascent" also comes up a lot; how does it differ from gradient descent? (Coordinate ascent optimizes one block of variables at a time, holding the others fixed, usually with an exact closed-form update per block, whereas gradient descent moves all parameters simultaneously along the gradient.)
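One coordinate-ascent step in the style of CTR and [12] can be sketched as follows: with all item vectors fixed, each u_i has an exact ridge-regression-style update (and vice versa). The confidence weights c_ij and the λ_u value below are illustrative assumptions:

```python
# Closed-form coordinate-ascent update for one user vector u_i, holding
# the item matrix V fixed: solve (V^T C_i V + λ_u I) u_i = V^T C_i r_i,
# where C_i = diag(c_i) holds per-item confidence weights.
import numpy as np

def update_user(V, r_i, c_i, lam_u):
    K = V.shape[1]
    C = np.diag(c_i)
    A = V.T @ C @ V + lam_u * np.eye(K)
    b = V.T @ C @ r_i
    return np.linalg.solve(A, b)       # exact maximizer in u_i

V   = np.array([[0.6, 0.4],
                [0.1, 0.9],
                [0.5, 0.5]])           # 3 items, K = 2 (made-up values)
r_i = np.array([1.0, 0.0, 1.0])        # implicit feedback: in library or not
c_i = np.array([1.0, 0.01, 1.0])       # high confidence on observed items
u_i = update_user(V, r_i, c_i, lam_u=0.1)
```

Each such block update solves its subproblem exactly, which is what distinguishes this from a single gradient-descent step over all parameters at once.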
references
[1] D. Agarwal and B.-C. Chen. Regression-based latent factor models. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 19–28, New York, NY, USA, 2009. ACM.
[2] D. Agarwal and B.-C. Chen. flda: matrix factorization through latent Dirichlet allocation. In Proceedings of the third ACM international conference on Web search and data mining, WSDM ’10, pages 91–100, New York, NY, USA, 2010. ACM.
[3] D. Bertsekas. Nonlinear Programming. Athena Scientific, 1999.
[4] D. Blei and J. Lafferty. A correlated topic model of Science. Annals of Applied Statistics, 1(1):17–35, 2007.
[5] D. Blei and J. Lafferty. Topic models. In A. Srivastava and M. Sahami, editors, Text Mining: Theory and Applications. Taylor and Francis, 2009.
[6] D. Blei and J. McAuliffe. Supervised topic models. In Neural Information Processing Systems, 2007.
[7] D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022, January 2003.
[8] J. Chang, J. Boyd-Graber, S. Gerrish, C. Wang, and D. Blei. Reading tea leaves: How humans interpret topic models. In Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, and A. Culotta, editors, Advances in Neural Information Processing Systems 22, pages 288–296, 2009.
[9] A. Dempster, N. Laird, and D. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39:1–38, 1977.
[10] S. M. Gerrish and D. M. Blei. Predicting legislative roll calls from text. In Proceedings of the 28th Annual International Conference on Machine Learning, ICML ’11, 2011.
[11] J. L. Herlocker, J. A. Konstan, A. Borchers, and J. Riedl. An algorithmic framework for performing collaborative filtering. In Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR ’99, pages 230–237, New York, NY, USA, 1999. ACM.
[12] Y. Hu, Y. Koren, and C. Volinsky. Collaborative filtering for implicit feedback datasets. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, pages 263–272, Washington, DC, USA, 2008. IEEE Computer Society.
[13] Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. IEEE Computer, 42(8):30–37, 2009.
[14] P. Melville, R. J. Mooney, and R. Nagarajan. Content-boosted collaborative filtering for improved recommendations. In American Association for Artificial Intelligence, pages 187–192, 2002.
[15] R. J. Mooney and L. Roy. Content-based book recommending using learning for text categorization. In Proceedings of the fifth ACM conference on Digital libraries, pages 195–204, New York, NY, USA, 2000. ACM.
[16] R. Pan, Y. Zhou, B. Cao, N. N. Liu, R. Lukose, M. Scholz, and Q. Yang. One-class collaborative filtering. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, pages 502–511, Washington, DC, USA, 2008. IEEE Computer Society.
[17] R. Salakhutdinov and A. Mnih. Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. In Proceedings of the 25th International Conference on Machine learning, pages 880–887. ACM, 2008.
[18] R. Salakhutdinov and A. Mnih. Probabilistic matrix factorization. Advances in Neural Information Processing Systems, 20:1257–1264, 2008.
[19] H. Shan and A. Banerjee. Generalized probabilistic matrix factorizations for collaborative filtering. In Proceedings of the 2010 IEEE International Conference on Data Mining, pages 1025–1030, Washington, DC, USA, 2010. IEEE Computer Society.
[20] Y. Teh, M. Jordan, M. Beal, and D. Blei. Hierarchical Dirichlet processes. Journal of the American Statistical Association, 101(476):1566–1581, 2007.
[21] E. Wang, D. Liu, J. Silva, D. Dunson, and L. Carin. Joint analysis of time-evolving binary matrices and associated documents. In J. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R. Zemel, and A. Culotta, editors, Advances in Neural Information Processing Systems 23, pages 2370–2378. 2010.
[22] K. Yu, J. Lafferty, S. Zhu, and Y. Gong. Large-scale collaborative prediction using a nonparametric random effects model. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 1185–1192, New York, NY, USA, 2009. ACM.