Collaborative Topic Modeling for Recommending Scientific Articles

C. Wang and D. M. Blei. Collaborative topic modeling for recommending scientific articles. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2011. dl.acm.org

pdf: [1] [2]

source code(?): https://github.com/blei-lab/ctr

abstract

The goal is recommending scientific articles to users.

Our approach combines the merits of traditional collaborative filtering and probabilistic topic modeling.

intro

Classic ways to find related papers: 1) following citations, 2) keyword search, 3) letting users share the reference libraries they have each built up.

The two modern approaches are collaborative filtering based on latent factor models [17, 18, 13, 1, 22] and content analysis based on probabilistic topic modeling [7, 8, 20, 2].

The paper uses both.

We combine these approaches in a probabilistic model, where making a recommendation for a particular user is akin to computing a conditional expectation of hidden variables. We will show how the algorithm for computing these expectations naturally balances the influence of the content of the articles and the libraries of the other users. An article that has not been seen by many will be recommended based more on its content; an article that has been widely seen will be recommended based more on the other users.

background

recommendation tasks

There are I users and J items (articles). The rating is binary: \(r_{ij} = 1\) if user i has article j in her library, and \(r_{ij} = 0\) otherwise.

our task is to recommend articles that are not in her library but are potentially interesting.

Recommendation by Matrix Factorization

Most successful recommendation methods are latent factor models [17, 18, 13, 1, 22], which provide better recommendation results than the neighborhood methods [11, 13].

The basic idea is that each user and each item can be represented by a latent vector of K factors, and a rating is predicted by their inner product: $$\hat{r}_{ij} = u^T_i v_j.$$ This can also be cast as a probabilistic model: $$u_i \sim \mathcal{N} (0, \lambda_u^{-1} I_K) \\ v_j \sim \mathcal{N} (0, \lambda_v^{-1} I_K) \\ r_{ij} \sim \mathcal N (u_i^T v_j, c^{-1}_{ij} ) \\ \text{confidence parameter} \; c_{ij} = \begin{cases} a, & \text{if}\; r_{ij} = 1, \\ b, & \text{if}\; r_{ij} = 0, \end{cases} \; \text{where} \; a>b>0 $$
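MAP estimation for this model reduces to alternating ridge regressions, as in Hu et al. [12]. Below is a minimal numpy sketch of those updates; the function name `weighted_mf`, the hyperparameter defaults, and the dense-matrix formulation are my own simplifications, not the paper's implementation.

```python
import numpy as np

def weighted_mf(R, K=10, a=1.0, b=0.01, lam_u=0.01, lam_v=0.01, iters=20, seed=0):
    """ALS for the model above: r_ij ~ N(u_i^T v_j, c_ij^{-1}),
    with confidence c_ij = a if r_ij = 1 else b (a > b > 0)."""
    rng = np.random.default_rng(seed)
    I, J = R.shape
    U = 0.1 * rng.standard_normal((I, K))  # user latent vectors u_i
    V = 0.1 * rng.standard_normal((J, K))  # item latent vectors v_j
    C = np.where(R == 1, a, b)             # confidence weights c_ij
    for _ in range(iters):
        for i in range(I):  # each u_i: ridge regression on V, weighted by C[i]
            A = V.T @ (C[i][:, None] * V) + lam_u * np.eye(K)
            U[i] = np.linalg.solve(A, V.T @ (C[i] * R[i]))
        for j in range(J):  # each v_j: the symmetric update
            A = U.T @ (C[:, j][:, None] * U) + lam_v * np.eye(K)
            V[j] = np.linalg.solve(A, U.T @ (C[:, j] * R[:, j]))
    return U, V
```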

Probabilistic Topic Models

The simplest topic model is latent Dirichlet allocation (LDA) [7].

LDA yields, for each document j, a vector of topic proportions \(\theta_j\); these are the \(\theta\) values used below.
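As a quick illustration of where \(\theta\) comes from, here is a minimal sketch that fits LDA with scikit-learn and reads off the per-document topic proportions. The scikit-learn estimator and the toy corpus are my stand-ins; the paper uses its own variational inference implementation.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "topic models discover latent themes in scientific text",
    "collaborative filtering predicts user preferences from ratings",
    "latent factor models power modern recommender systems",
]

counts = CountVectorizer().fit_transform(docs)  # bag-of-words counts
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
theta = lda.transform(counts)  # row j holds topic proportions theta_j (sums to 1)
```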

COLLABORATIVE TOPIC REGRESSION (CTR)

Much of this section (especially the parameter-learning procedure) seems to require reading [12] first.

The \(\theta\) obtained above (Probabilistic Topic Models) is plugged in here: the item latent vector becomes $$v_j = \epsilon_j + \theta_j,$$ where the offset is drawn as \(\epsilon_j \sim \mathcal{N}(0, \lambda_v^{-1} I_K)\).

Note that the expectation of \(r_{ij}\) is a linear function of \(\theta_j\): $$ \mathbf{E}[r_{ij}|u_i,\theta_j,\epsilon_j] = u_i^T (\theta_j + \epsilon_j) $$ (So it is not \(u_i\) that gets adjusted; the idea is to make the item's latent vector \(v_j\) reflect not only the users' choices but also the document's topics.)
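A tiny numpy sketch of this decomposition (toy numbers are my own, not from the paper): the prediction splits into a content part \(u_i^T \theta_j\) and a collaborative correction \(u_i^T \epsilon_j\), which is exactly the balance described in the intro.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 5
u_i = rng.standard_normal(K)           # user latent vector
theta_j = rng.dirichlet(np.ones(K))    # topic proportions from LDA (sums to 1)
eps_j = 0.1 * rng.standard_normal(K)   # offset learned from users' feedback

v_j = theta_j + eps_j                  # CTR item vector
r_hat = u_i @ v_j                      # E[r_ij | u_i, theta_j, eps_j]

content_part = u_i @ theta_j  # driven by the article's topics
cf_part = u_i @ eps_j         # correction learned from other users' libraries
assert np.isclose(r_hat, content_part + cf_part)
# For a rarely-seen article, eps_j stays near its prior mean 0, so the
# prediction is dominated by the content part.
```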

For learning, projection gradient [3] is said to be used (for the \(\theta_j\) update; a sketch of the remaining updates follows under the next heading).

Learning the parameters
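The paper fits CTR by coordinate ascent on the MAP objective: \(u_i\) and \(v_j\) get closed-form ridge-regression updates (the same shape as in the MF baseline, except the Gaussian prior on \(v_j\) is centered at \(\theta_j\) instead of 0), while \(\theta_j\) is optimized by projection gradient. A sketch of one u/v sweep, reusing the conventions of the MF sketch above; the function name and defaults are mine, and the \(\theta_j\) update is omitted.

```python
import numpy as np

def ctr_uv_step(R, Theta, U, V, a=1.0, b=0.01, lam_u=0.01, lam_v=100.0):
    """One coordinate-ascent sweep over u_i and v_j for CTR. Same shape as
    the weighted-MF updates, except the prior on v_j is centered at theta_j,
    so a lam_v * theta_j term enters the v_j update."""
    I, J = R.shape
    K = U.shape[1]
    C = np.where(R == 1, a, b)
    for i in range(I):
        A = V.T @ (C[i][:, None] * V) + lam_u * np.eye(K)
        U[i] = np.linalg.solve(A, V.T @ (C[i] * R[i]))
    for j in range(J):
        A = U.T @ (C[:, j][:, None] * U) + lam_v * np.eye(K)
        V[j] = np.linalg.solve(A, U.T @ (C[:, j] * R[:, j]) + lam_v * Theta[j])
    return U, V
```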

Prediction
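For in-matrix prediction (articles that already have feedback) the paper scores with the full CTR vector, \(r^*_{ij} \approx u_i^T(\theta_j + \epsilon_j)\); for out-of-matrix prediction (brand-new articles), \(\epsilon_j\) has not been learned, so the score falls back to content only, \(r^*_{ij} \approx u_i^T \theta_j\). A hypothetical helper making the two cases explicit:

```python
import numpy as np

def predict(U, V, Theta, i, j, in_matrix=True):
    # In-matrix: article j has feedback, so use its full CTR vector V[j].
    # Out-of-matrix: no feedback yet, so only the topic proportions help.
    return U[i] @ (V[j] if in_matrix else Theta[j])
```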

references

[1] D. Agarwal and B.-C. Chen. Regression-based latent factor models. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 19–28, New York, NY, USA, 2009. ACM.

[2] D. Agarwal and B.-C. Chen. fLDA: matrix factorization through latent Dirichlet allocation. In Proceedings of the third ACM international conference on Web search and data mining, WSDM ’10, pages 91–100, New York, NY, USA, 2010. ACM.

[3] D. Bertsekas. Nonlinear Programming. Athena Scientific, 1999.

[4] D. Blei and J. Lafferty. A correlated topic model of Science. Annals of Applied Statistics, 1(1):17–35, 2007.

[5] D. Blei and J. Lafferty. Topic models. In A. Srivastava and M. Sahami, editors, Text Mining: Theory and Applications. Taylor and Francis, 2009.

[6] D. Blei and J. McAuliffe. Supervised topic models. In Neural Information Processing Systems, 2007.

[7] D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022, January 2003.

[8] J. Chang, J. Boyd-Graber, S. Gerrish, C. Wang, and D. Blei. Reading tea leaves: How humans interpret topic models. In Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, and A. Culotta, editors, Advances in Neural Information Processing Systems 22, pages 288–296, 2009.

[9] A. Dempster, N. Laird, and D. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39:1–38, 1977.

[10] S. M. Gerrish and D. M. Blei. Predicting legislative roll calls from text. In Proceedings of the 28th Annual International Conference on Machine Learning, ICML ’11, 2011.

[11] J. L. Herlocker, J. A. Konstan, A. Borchers, and J. Riedl. An algorithmic framework for performing collaborative filtering. In Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR ’99, pages 230–237, New York, NY, USA, 1999. ACM.

[12] Y. Hu, Y. Koren, and C. Volinsky. Collaborative filtering for implicit feedback datasets. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, pages 263–272, Washington, DC, USA, 2008. IEEE Computer Society.

[13] Y. Koren, R. Bell, and C. Volinsky. Matrix factorization techniques for recommender systems. IEEE Computer, 42(8):30–37, 2009.

[14] P. Melville, R. J. Mooney, and R. Nagarajan. Content-boosted collaborative filtering for improved recommendations. In American Association for Artificial Intelligence, pages 187–192, 2002.

[15] R. J. Mooney and L. Roy. Content-based book recommending using learning for text categorization. In Proceedings of the fifth ACM conference on Digital libraries, pages 195–204, New York, NY, USA, 2000. ACM.

[16] R. Pan, Y. Zhou, B. Cao, N. N. Liu, R. Lukose, M. Scholz, and Q. Yang. One-class collaborative filtering. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, pages 502–511, Washington, DC, USA, 2008. IEEE Computer Society.

[17] R. Salakhutdinov and A. Mnih. Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. In Proceedings of the 25th International Conference on Machine learning, pages 880–887. ACM, 2008.

[18] R. Salakhutdinov and A. Mnih. Probabilistic matrix factorization. Advances in Neural Information Processing Systems, 20:1257–1264, 2008.

[19] H. Shan and A. Banerjee. Generalized probabilistic matrix factorizations for collaborative filtering. In Proceedings of the 2010 IEEE International Conference on Data Mining, pages 1025–1030, Washington, DC, USA, 2010. IEEE Computer Society.

[20] Y. Teh, M. Jordan, M. Beal, and D. Blei. Hierarchical Dirichlet processes. Journal of the American Statistical Association, 101(476):1566–1581, 2007.

[21] E. Wang, D. Liu, J. Silva, D. Dunson, and L. Carin. Joint analysis of time-evolving binary matrices and associated documents. In J. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R. Zemel, and A. Culotta, editors, Advances in Neural Information Processing Systems 23, pages 2370–2378. 2010.

[22] K. Yu, J. Lafferty, S. Zhu, and Y. Gong. Large-scale collaborative prediction using a nonparametric random effects model. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 1185–1192, New York, NY, USA, 2009. ACM.