TY - GEN
T1 - Collective classification via discriminative matrix factorization on sparsely labeled networks
AU - Zhang, Daokun
AU - Yin, Jie
AU - Zhu, Xingquan
AU - Zhang, Chengqi
N1 - Publisher Copyright:
© 2016 Copyright held by the owner/author(s).
PY - 2016/10/24
Y1 - 2016/10/24
N2 - We address the problem of classifying sparsely labeled networks, where labeled nodes in the network are extremely scarce. Existing algorithms, such as collective classification, have been shown to be effective for jointly deriving labels of related nodes, by exploiting label dependencies among neighboring nodes. However, when the network is sparsely labeled, most nodes have too few or even no connections to labeled nodes. This makes it very difficult to leverage supervised knowledge from labeled nodes to accurately estimate label dependencies, thereby largely degrading classification accuracy. In this paper, we propose a novel discriminative matrix factorization (DMF) based algorithm that effectively learns a latent network representation by exploiting topo-logical paths between labeled and unlabeled nodes, in addition to nodes' content information. The main idea is to use matrix factorization to obtain a compact representation of the network that fully encodes nodes' content information and network structure, and unleash discriminative power inferred from labeled nodes to directly benefit collective classification. We formulate a new matrix factorization objective function that integrates network representation learning with an empirical loss minimization for classifying node labels. An efficient optimization algorithm based on conjugate gradient methods is proposed to solve the new objective function. Experimental results on real-world networks show that DMF yields superior performance gain over the state-of-the-art baselines on sparsely labeled networks.
AB - We address the problem of classifying sparsely labeled networks, where labeled nodes in the network are extremely scarce. Existing algorithms, such as collective classification, have been shown to be effective for jointly deriving labels of related nodes, by exploiting label dependencies among neighboring nodes. However, when the network is sparsely labeled, most nodes have too few or even no connections to labeled nodes. This makes it very difficult to leverage supervised knowledge from labeled nodes to accurately estimate label dependencies, thereby largely degrading classification accuracy. In this paper, we propose a novel discriminative matrix factorization (DMF) based algorithm that effectively learns a latent network representation by exploiting topo-logical paths between labeled and unlabeled nodes, in addition to nodes' content information. The main idea is to use matrix factorization to obtain a compact representation of the network that fully encodes nodes' content information and network structure, and unleash discriminative power inferred from labeled nodes to directly benefit collective classification. We formulate a new matrix factorization objective function that integrates network representation learning with an empirical loss minimization for classifying node labels. An efficient optimization algorithm based on conjugate gradient methods is proposed to solve the new objective function. Experimental results on real-world networks show that DMF yields superior performance gain over the state-of-the-art baselines on sparsely labeled networks.
KW - Collective classification
KW - Matrix factorization
KW - Network representation learning
KW - Sparsely labeled networks
UR - http://www.scopus.com/inward/record.url?scp=84996484081&partnerID=8YFLogxK
U2 - 10.1145/2983323.2983754
DO - 10.1145/2983323.2983754
M3 - Conference contribution
AN - SCOPUS:84996484081
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 1563
EP - 1572
BT - CIKM 2016 - Proceedings of the 2016 ACM Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - 25th ACM International Conference on Information and Knowledge Management, CIKM 2016
Y2 - 24 October 2016 through 28 October 2016
ER -