WWW 2007 / Poster Paper Topic: Social Networks Measuring Credibility of Users in an E-learning Environment Wei Wei The Royal Institute of Technology, Sweden Jimmy Lee The Chinese University of Hong Kong Irwin King The Chinese University of Hong Kong weiwe@kth.se ABSTRACT jlee@cse.cuhk.edu.hk king@cse.cuhk.edu.hk 2. CITATION MODELS DEFINITION Basic Article Citation Model. Articles in the LV system form a citation relation set. We use a graph to model the citation relationship. We define a Basic Article Citation Model (BACM) graph as: Definition 1 (BACM Graph). BACM graph is a weighted directed graph G(V , E , W ). V is a set of vertices where each vertex represents an article in the system. E is a set of edges between the vertices: E = {(p, q)|p, q V and the corresponding article p citing the article q}. W is a set of weighted values corresponding to edges belong to E . The weight value of an edge E (p, q) is wpq .1 Learning Villages (LV) is an E-learning platform for p eople's online discussions and frequently citing p ostings of one another. In this pap er, we prop ose a novel method to rank credit authors in the LV system. We first prop ose a kEACM graph to describ e the article citation structure in the LV system. And then we build a weighted graph model k-UCM graph to reveal the implicit relationship b etween authors hidden b ehind the citations among their articles. Furthermore, we design a graph-based ranking algorithm, the Credit Author Ranking (CAR) algorithm, which can b e applied to rank nodes in a graph with negative edges. Finally, we p erform exp erimental evaluations by simulations. The results of evaluations illustrate that the prop osed method works pretty well on ranking the credibility of users in the LV system. The k-Extended Article Citation Model. To take into account indirect impact from nodes, we extend the BACM graph to an k-Extended Article Citation Model (kEACM). Definition 2 (k-distance link). In a BACM graph, if there is a shortest path without considering weight from node x to node y in k hops, we say that there is a k-distance link between the corresponding article x to the corresponding article y . ( Categories and Subject Descriptors J.1 [ADMINISTRATIVE DATA PROCESSING]: Education; J.4 [SOCIAL AND BEHAVIORAL SCIENCES]: Sociology Definition 3 , ) (k-EACM). A k-EACM is a weighted directed graph General Terms Algorithms, Measurement Keywords Education Platform, Author Ranking, HITS Algorithm G V , E W based on BACM graph (V , E , W ). V is a set of veri tices where each vertex represents an article in the system. E s a = set of edges between the vertices: E {(p, q)|p, q V and the corresponding articles p, q share a l-distance link (l k)}. W is a set of weighted values corresponding to edges belong to E. The weight value kwpq of an l-distance link edge (p,q) in graph k-EACM is defined as:2 8 l=1 >wpq > < k wp q = P Q > ( 1 - l -1 ) wedge l > 1 > |{R1 q }| : k p r{Rpq } edgeedges(r) (1) 1. INTRODUCTION Learning Villages (LV) is an E-learning platform in which we use a novel method to evaluate the credibility of users in the LV communities. The citation structure of articles in the community can b e rich source of information to rank authors p erformance on the condition that we can build an effective model to describ e the relationship among articles and authors. Citation analysis has b een widely used for bibliometrics ranking, but the author ranking in the bibliometrics domain is paid relatively little attention. Much research related to citation analysis focuses on publicationbased ranking algorithms [4, 1, 2]. The method prop osed in this pap er focuses on authors ranking in an E-learning system. Copyright is held by the author/owner(s). WWW 2007, May 8­12, 2007, Banff, Alberta, Canada. ACM 978-1-59593-654-7/07/0005. User Citation Model. To describ e the implicit citations among authors, we map k-EACM graph to the corresp onding User Citation Model namely k-UCM graph. Definition 4 (Author Citation Link). Al l the articles in our system form a k-EACM graph, we say the author A has an author citation link to author B if and only if at least one article of author A has an edge to one article of author B in the k-EACM. Definition 5 ( (k-UCM Graph). An k-UCM graph is a weighted , , ) ( derived from k-EACM graph G V , E directed graph G V E W i i where V s a set of al l the users in the system. E s a set of edges = a between the vertices: E {(u, v )|u, v V nd the corresponding wpq = 1,if article p supp orts article q;wpq = -1,if article p is against artic l e q . 2 Rpq represents one shortest route from no de p to no de q; {Rpq } represents a set of all the shortest routes existing b etween p and q; edges(r ) represents a set of all edges in a route r (r Rpq ); weg represents the weight value of an edge eg in ACM graph; |{Rpq }| represents the numb er of elements in the set {Rpq }. 1 , W ) 1279 WWW 2007 / Poster Paper user u, v have an Author Citation Link}. W . i Topic: Social Networks s a set of weighted indicate supp ort probabilities to users of G typ e, B typ e and A typ e. The other two typ e of users have the same supp ort probability parameters. Therefore a 3 × 3 matrix GBA is defined to control the parameters of supp ort probabilities T among users. The evaluation process is to examine the p erpAtk(u) q{C by(p) Atk(v)} centage of G typ e users will b e ranked in top 50 by the 3. CREDIT AUTHOR RANKING ALGORITHM proposed method. We p erform three simulations with different parameters setCredit Author Ranking (CAR) algorithm is designed for ting. In each simulation, 50 G typ e users, 50 B typ e users ranking authors in k-UCM graph. In the CAR algorithm, and 200 A typ e users are defined a d the simulation are run n we associate each user u in the k-UCM graph with a credible GG GB GA attribute weight x and a trouble attribute weight y . for 100 cycles. The GBA matrix BG BB BA used AG AB AA and y for all every users in K-UCM graph The x in the three exp eriments are resp ectively as follows: resp ectively form a vector {x } and a vector {y }. Two op erations denoted by and are defined as: 4 values corresponding to edges in E The weight value awuv of an edge (u, v ) in graph k-UCM is defined as:3 X X awuv = awpq · The op eration up dates the x-weights: x Table 1: GBA matrix for 3 simulations |awuv |×y X v:(u,v)[1] E |awuv |×x + X v:(u,v)[-1] E Simulation 1 0.9 0.1 0.8 0.1 0.7 0.2 0.5 0.5 0.5 Simulation 2 0.9 0.1 0.5 0.1 0.7 0.5 0.9 0.1 0.5 Simulation 3 0.9 0.1 0.9 0.1 0.7 0.1 0.8 0.1 0.8 · The op eration up dates the y-weights: y X -|awvu |×x + v:(v,u)[1] E X -|awvu |×y v:(v,u)[-1] E To compute the reinforcing equilibrium values for credible attribute weights and trouble attribute weights of each users in k-UCM graph, CAR algorithm does the same iterative process as HITS [3] in Algorithm 1. Considering the x representing a user u's credible attribute and y representing a user u's trouble attribute, we introduce z to represent the user u's overall p erformance and define z as: z = x - y . Algorithm 1 Iterative Process in CAR 1: 2: 3: 4: 5: 6: Iterate(V ,t) V : a set of no des in k-UCM graph t: a natural numb er Let w denote the vector (1, 1, 1, ..., 1) Rn . S e t x 0 := w . S e t y 0 := w . for i = 1, 2, ..., t do i Apply op eration to (xi-1 , yi-1 ), obtaining new x-weights x . Apply op eration to (x , yi-1 ), obtaining new y -weights y . Normalize x , obtaining xi . i Normalize y , obtaining yi . i i i We utilize the prop osed method to analyze the data from three simulations. The p ercentages of G typ e users in top 50 are also calculated for each simulation. The results are shown in Table 2. In Table 2, we can find that the prop osed Table 2: Percentage of G typ e users in top 50 Top 10 20 30 40 50 Simulation 1 100% 100% 100% 97.5% 86.0% Simulation 2 100% 100% 96.7% 90.0% 78.0% Simulation 3 100% 95.0% 90.0% 87.5% 82.0% method rank good users from G typ e as top credible users in high p ercentage, which is consistent with our exp ectation. 5. CONCLUSION Our research in this pap er focus on author ranking in the LV system. We prop ose k-EACM graph and k-UCM graph to describ e the citation relations among articles and authors and furthermore design the CAR algorithm which can b e applied to rank nodes in a graph with negative edges. By analyzing the data from three simulations, the exp erimental results demonstrate that the prop osed method works pretty well on ranking the credibility of users in the LV system. 7: end for 8: END 9: return (xt , yt ). 6. REFERENCES 4. EXPERIMENT We conduct exp erimental evaluations of the prop osed method on the data from simulations. Three typ es of users are defined in the simulation including: G typ e (good users), B typ e (bad users) and A typ e (average users). G typ e of users have three parameters GG , GB and GA which resp ectively Atk(u) represents a set of articles in k-EACM of an author u; C by (p) represents a set of no des q and (p, q) is an edge in k-EACM graph. 4 E represents a set of all edges in k-UCM graph; (u, v )[1] E means that edge (u, v ) b elongs to E and no de u has a supp ort citation link to no de v ; (u, v )[-1] E means that edge (u, v ) b elongs to E and no de u has an against citation link to no de v ; awvu represents that weight of edge (u, v ) in E . 3 [1] Y. An, J. Janssen, and E. E. Milios. Characterizing and mining the citation graph of the computer science literature. Know ledge and Information Systems, 6(6):664­678, 2004. [2] E. Garfield. Journal impact factor: a brief review. Canadian Medical Association Journal, 161(8):979­980, Octob er 1999. [3] J. M. Kleinb erg. Authoritative sources in a hyp erlinked environment. Journal of the ACM (JACM), 46(5):604­632, 1999. [4] A. Sidirop oulos and Y. Manolop oulos. A citation-based system to assist prize awarding. SIGMOD Record, 34(4):54­60, 2005. 1280