SIGIR 2007 Proceedings (Poster)

Fine-Grained Named Entity Recognition and Relation Extraction for Question Answering

Changki Lee, Yi-Gyu Hwang, Myung-Gil Jang
Electronics and Telecommunications Research Institute (ETRI)
161 Gajeong-dong, Yuseong-gu, Daejeon, 305-350, Korea
{leeck, yghwang, mgjang}@etri.re.kr

ABSTRACT
In this paper, we describe fine-grained NER and relation extraction between fine-grained named entities, and we show their effect on QA.

Categories and Subject Descriptors
H.3.1 [Information Storage and Retrieval]: Content Analysis and Indexing - Linguistic processing

General Terms: Algorithms, Experimentation.

Keywords: Question answering, fine-grained named entity recognition, relation extraction.

1. INTRODUCTION
In most QA systems, the sentences or passages regarded as most relevant to the question are extracted, and answers are then retrieved from them using NLP techniques (especially named entity recognition and syntactic patterns). Recent advances have brought some systems to within 90% accuracy when classifying named entities into broad categories such as person, organization, and location. More finely grained named entity recognition, and relation extraction between these entities, offer additional advantages for QA [1, 2, 3, 7]. While most research has gone into the coarse categorization of named entities and the extraction of syntactic relation patterns, we are not aware of much previous work that uses machine learning algorithms to perform fine-grained NER and relation extraction between fine-grained named entities.

Fleischman and Hovy describe a method for automatically classifying person instances into eight finer-grained subcategories [1]. They use a supervised learning method, but their training data is highly skewed because they use a simple bootstrapping method to generate it automatically. Moreover, they classify only persons and locations, whereas we need to classify all kinds of NE types. Mann explores the idea of a fine-grained proper noun ontology and its use in QA [2]. He builds a proper noun ontology from unrestricted text. The disadvantage of this method is its small coverage, because it relies on simple textual co-occurrence patterns. Agichtein et al. show that a relatively small number of binary relationships account for most of the queries in their sample [3]. However, they define only 12 relationships between coarse entities (9 classes) and do not present a relation extraction method. Ravichandran and Hovy explore surface text patterns for QA systems [7]. They use a pattern-learning algorithm with bootstrapping, but their patterns do not cope well with the varied surface expressions of a sentence, because of word ordering and long-distance dependencies.

2. FINE-GRAINED NER USING CRFs
We define 147 fine-grained NE types in consideration of users' asking points for finding answer candidates in a QA system. The types have 15 top levels, and each top node consists of 2 to 4 layers. The base set of types is: person, study field, theory, artifacts, organization, location, civilization, data, time, quantity, event, animal, plant, material, and term. With this many classes, Conditional Random Fields (CRFs) cannot be applied directly, because training a CRF with many classes is too time-consuming. To solve this problem, we break the NER task down into two parts: boundary detection and NE classification.

Let x = <x_1, ..., x_T> be an observed input data sequence, such as the sequence of words of the sentences in a document. Let y = <y_1, ..., y_T> be a set of NE states, each of which is associated with an NE type and a boundary (i.e., B and I). We define the conditional probability of a state sequence y given an input sequence x as follows:

  P(y|x) = P(c, b|x) = P(b|x) P(c|b, x) ≈ P(b|x) ∏_{i=1..N} P(c_i | b_i, x_i)    (1)

where b = <b_1, ..., b_T> is the sequence of boundary states and c = <c_1, ..., c_N> is the set of NE class states for the N detected entities.
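A minimal sketch may help make the two-stage decomposition of Equation (1) concrete. The Python below is an illustration under our own assumptions, not the authors' code: the toy boundary model and classifier stand in for the CRF boundary detector and ME classifier described next, and all names (two_stage_ner, toy_boundary_model, toy_classifier) are hypothetical.

# Minimal, hypothetical sketch of the two-stage factorization in
# Equation (1) -- NOT the authors' implementation. A boundary model
# proposes entity spans (stage 1, P(b|x)); a classifier then assigns
# one of the 147 fine-grained types to each span (stage 2, P(c_i|b_i, x_i)).

from typing import Callable, List, Tuple

Span = Tuple[int, int]  # [start, end) token offsets of one detected entity

def two_stage_ner(
    tokens: List[str],
    boundary_model: Callable[[List[str]], Tuple[List[Span], float]],
    classifier: Callable[[List[str], Span], Tuple[str, float]],
) -> Tuple[List[Tuple[Span, str]], float]:
    """Return typed entities and P(y|x) ~ P(b|x) * prod_i P(c_i|b_i, x_i)."""
    spans, p_boundary = boundary_model(tokens)      # stage 1: P(b|x)
    entities, p_total = [], p_boundary
    for span in spans:                              # stage 2, one NE at a time
        label, p_class = classifier(tokens, span)
        entities.append((span, label))
        p_total *= p_class
    return entities, p_total

# Toy stand-ins (a real system would use the CRF of Eq. 2 and the ME
# classifier of Eq. 3); here, capitalized runs are treated as entities.
def toy_boundary_model(tokens):
    spans, i = [], 0
    while i < len(tokens):
        if tokens[i][:1].isupper():
            j = i
            while j < len(tokens) and tokens[j][:1].isupper():
                j += 1
            spans.append((i, j))
            i = j
        else:
            i += 1
    return spans, 0.9

def toy_classifier(tokens, span):
    return ("PS_NAME", 0.8)  # always guesses "person name" with P = 0.8

entities, p = two_stage_ner("Changki Lee works at ETRI".split(),
                            toy_boundary_model, toy_classifier)
print(entities, p)  # two PS_NAME spans, combined probability ~0.576

The practical point of the factorization is that the boundary model only distinguishes a handful of B/I/O states, so its CRF stays tractable, while the 147-way type decision is pushed into a cheap per-entity classifier.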
We model the conditional probability of a boundary state sequence b given an input sequence x using CRFs as follows:

  P_CRF(b|x) = (1 / Z(x)) exp( Σ_{t=1..T} Σ_k λ_k f_k(b_{t-1}, b_t, x) )    (2)

where f_k(b_{t-1}, b_t, x) is an arbitrary feature function over its arguments, λ_k is a learned weight for each feature function, and Z(x) is a normalization factor. Higher weights make their corresponding FSM transitions more likely [4]. We calculate the conditional probability of an NE class c_i, given an NE (b_i, x_i) extracted by the boundary detector, with a maximum entropy (ME) model:

  P_ME(c_i | b_i, x) = (1 / Z(b_i, x)) exp( Σ_k λ_k f_k(c_i, b_i, x) )    (3)

The primary advantage of CRFs over HMMs is their conditional nature, which relaxes the independence assumptions required by HMMs. Additionally, CRFs avoid the label bias problem, a weakness exhibited by Maximum Entropy Markov Models (MEMMs). CRFs outperform both MEMMs and HMMs on a number of real-world sequence labeling tasks [4].

3. RELATION EXTRACTION
We define 37 binary relationships between fine-grained named entities in consideration of users' asking points. Table 1 shows the 10 most frequent relations observed in our training set.

  Relation (z_ij)   Subject (y_i)                   Object (y_j)    Freq.
  has_position      PS_NAME                         CV_POSITION      4039
  has_person        LCP_COUNTRY, OG_OTHERS, ...     PS_NAME          3208
  has_occupation    PS_NAME                         CV_OCCUPATION     760
  has_email         PS_NAME                         TMI_EMAIL         524
  has_business      LCP_COUNTRY, LCP_CITY, ...      OGG_BUSINESS      351
  has_building      LCP_COUNTRY, LCP_CITY, ...      AF_BUILDING       251
  has_county        LCP_PROVINCE, LCP_CITY, ...     LCP_COUNTY        237
  has_technology    LCP_COUNTRY, OGG_BUSINESS, ...  TR_TECHNOLOGY     210
  has_city          LCP_COUNTRY, LCP_PROVINCE       LCP_CITY          156
  has_service       LCP_COUNTRY, OGG_BUSINESS, ...  TMI_SERVICE       132

Table 1. Top 10 most frequent relations.

Let z = <..., z_ij, ...> be a set of relation states, each of which is associated with two named entities (i.e., y_i and y_j). We define the conditional probability of two state sequences z and y given an input sequence x as follows:

  P(z, y|x) = P(y|x) P(z|y, x) ≈ P(y|x) ∏_{i,j} P(z_ij | y_i, y_j, x)    (4)

where P(y|x) is the conditional probability of an NE sequence y given an input sequence x, as defined in Equation 1. We calculate the conditional probability of a relation state z_ij given two named entities (y_i, y_j) as follows:

  P_ME(z_ij | y_i, y_j, x) = (1 / Z(y_i, y_j, x)) exp( Σ_k λ_k f_k(z_ij, y_i, y_j, x) )    (5)
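Equation (5) is a standard multiclass maximum entropy (log-linear) classifier over relation labels. The sketch below shows how such a model might score one entity pair; it is a hedged illustration only, as the feature templates, weights, and the no_relation label are invented for the example, not taken from the paper.

# Minimal, hypothetical sketch of the ME relation model in Equation (5) --
# NOT the authors' implementation. A candidate relation z_ij between an
# entity pair (y_i, y_j) is scored by exp(sum_k lambda_k * f_k) and
# normalized over the relation labels; features and weights are invented.

import math
from typing import Dict, List

def me_relation_probs(
    features: List[str],                    # fired features f_k(z, y_i, y_j, x)
    weights: Dict[str, Dict[str, float]],   # weights[label][feature] = lambda_k
    labels: List[str],
) -> Dict[str, float]:
    scores = {z: sum(weights.get(z, {}).get(f, 0.0) for f in features)
              for z in labels}
    z_norm = sum(math.exp(s) for s in scores.values())  # partition Z(y_i, y_j, x)
    return {z: math.exp(s) / z_norm for z, s in scores.items()}

# Invented features for the pair (PS_NAME, CV_POSITION) in one sentence.
fired = ["subj=PS_NAME", "obj=CV_POSITION", "word_between=appointed"]
toy_weights = {
    "has_position": {"subj=PS_NAME": 1.5, "obj=CV_POSITION": 2.0,
                     "word_between=appointed": 1.0},
    "no_relation": {"word_between=appointed": -0.5},
}
probs = me_relation_probs(fired, toy_weights, ["has_position", "no_relation"])
print(max(probs, key=probs.get))  # -> has_position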
4. EXPERIMENT
The experiments for fine-grained NER were performed on our Korean fine-grained NE data set, which consists of 6,000 documents tagged by human annotators. We used 5,500 documents as a training set and 500 documents as a test set. We trained the models by L-BFGS using our C++ implementation of CRFs and ME, with a Gaussian prior of 1 (for ME) and 10 (for CRFs) and 500 training iterations.

  Task                Method                   Training time (sec/iter.)  Pre (%)  Rec (%)  F1 (%)
  Boundary detection  ME                               6.23                87.4     80.6     83.9
  Boundary detection  CRFs                            24.14                89.5     79.7     84.3
  NER                 1 stage: ME (baseline)         154.97                82.5     73.5     77.7
  NER                 1 stage: CRFs                12248.33                 -        -        -
  NER                 2 stages: ME+ME                 24.59                82.8     73.3     77.8
  NER                 2 stages: CRFs+ME               42.50                83.2     74.5     78.6

Table 2. Performance of fine-grained NER (147 classes).

Table 2 shows the performance of boundary detection and fine-grained NER using CRFs and ME. In boundary detection, we obtained 83.9% F1 and 84.3% F1 using ME and CRFs, respectively. In NER, our proposed model (2 stages: CRFs+ME) obtained 78.6% F1, while the baseline model (1 stage: ME) obtained 77.7% F1. We also reduced the training time to 27% of the baseline model's without loss of performance.

The experiment for relation extraction was performed on our Korean relation extraction data set, which consists of 485 documents tagged by human annotators. We used 455 documents (one year of IT news articles) as a training set and 30 documents (one month of IT news articles) as a test set.

  Relation        Precision  Recall   F1
  has_position      86.2      73.5   79.4
  has_person        88.2      85.7   87.0
  has_occupation    87.0      83.3   85.1
  has_email         65.2      60.0   62.5
  has_business      55.6      83.3   66.7
  Overall           80.8      66.2   72.8

Table 3. Performance of relation extraction.

Table 3 shows the performance of relation extraction for the 5 most frequent relations, together with the overall performance over all 37 relations. To show the effect of fine-grained NER and relation extraction, we performed a further experiment with a QA system. We adapted the ETRI QA Test Set [5], which consists of 402 question-answer pairs over an encyclopedia, to the IT news domain and used 50 factoid questions in the experiment.

  QA system                                                    MRAR
  Passage retrieval + coarse NER (9 classes)                   0.538
  Passage retrieval + fine-grained NER (147 classes)           0.660
  Passage retrieval + fine-grained NER + relation extraction   0.716

Table 4. Performance of question answering.

Table 4 shows the performance of the QA systems. The QA system with fine-grained NER and relation extraction achieved about a 33% improvement in MRAR over the QA system with coarse NER (0.716 vs. 0.538).

5. CONCLUSION
In this paper, we described fine-grained NER and relation extraction using CRFs and ME for QA. Using the proposed approach, we obtained 78.6% F1 for 147 fine-grained NE types and 72.8% F1 for 37 relations. In the QA system, fine-grained NER and relation extraction achieved about a 33% improvement over QA with coarse NER.

6. REFERENCES
[1] M. Fleischman and E. Hovy. Fine grained classification of named entities. COLING, 2002.
[2] G. Mann. Fine-grained proper noun ontologies for question answering. SemaNet'02: Building and Using Semantic Networks, 2002.
[3] E. Agichtein, S. Cucerzan, and E. Brill. Analysis of factoid questions for effective relation extraction. SIGIR, 2005.
[4] F. Sha and F. Pereira. Shallow parsing with conditional random fields. HLT-NAACL, 2003.
[5] H. Kim, J. Wang, C. Lee, C. Lee, and M. Jang. A LF based answer indexing method for encyclopedia question answering system. AIRS, 2004.
[6] K. Han, H. Chung, S. Kim, Y. Song, J. Lee, and H. Rim. Korea University question answering system at TREC 2004. TREC, 2004.
[7] D. Ravichandran and E. Hovy. Learning surface text patterns for a question answering system. ACL, 2002.