A Novel Lexicalized HMM-based Learning Framework for Web Opinion Mining

Wei Jin (WEI.JIN@NDSU.EDU)
Department of Computer Science, North Dakota State University, USA

Hung Hay Ho (HUNGHO@BUFFALO.EDU)
Department of Computer Science & Engineering, State University of New York at Buffalo, USA

Abstract

Merchants selling products on the Web often ask their customers to share their opinions and hands-on experiences with products they have purchased. As e-commerce becomes more and more popular, the number of customer reviews a product receives grows rapidly, which makes it difficult for a potential customer to read them all in order to make an informed purchase decision. In this research, we aim to mine customer reviews of a product and extract highly specific product-related entities on which reviewers express their opinions. Opinion expressions and sentences are also identified, and the opinion orientation for each recognized product entity is classified as positive or negative. Different from previous approaches that have mostly relied on natural language processing techniques or statistical information, we propose a novel machine learning framework using lexicalized HMMs. The approach naturally integrates linguistic features, such as part-of-speech information and the surrounding contextual clues of words, into automatic learning. The experimental results demonstrate the effectiveness of the proposed approach in web opinion mining and extraction from product reviews.

1. Introduction

With the rapid expansion of e-commerce, more and more products are sold on the web, and more and more people are buying products online. It has become a common practice for online merchants to ask their customers to share their opinions and hands-on experiences with products they have purchased. Unfortunately, reading through all customer reviews is difficult, especially for popular items, for which the number of reviews can reach hundreds or even thousands. Our goal in this research is to design a framework that is capable of extracting, learning and classifying product-related entities from product reviews. Specifically, given a particular product, the system first identifies potential product-related entities and opinion-related entities from the reviews, then extracts the opinion sentences that describe each identified product entity, and finally determines the opinion orientation (positive or negative) for each recognized product entity.

Different from previous approaches that have mostly relied on natural language processing techniques (Turney, 2002) or statistical information (Hu and Liu, 2004), we propose a novel framework that naturally integrates linguistic features (e.g., part-of-speech, phrases' internal formation patterns, and surrounding contextual clues of words/phrases) into automatic learning supported by lexicalized HMMs. The experimental results demonstrate the effectiveness of the proposed approach in web opinion mining and extraction from product reviews.

The rest of this paper is organized as follows: Section 2 discusses background knowledge and related work. Section 3 describes each component of the proposed learning framework. We report our experimental results in Section 4 and give our conclusions on this work in Section 5.

----------
Appearing in Proceedings of the 26th International Conference on Machine Learning, Montreal, Canada, 2009. Copyright 2009 by the author(s)/owner(s).

2. Related Work

Opinion mining can be divided into two categories, document level and feature level. Document-level opinion mining aims to classify the overall sentiment orientation of a document (Turney and Littman, 2003); feature-level opinion mining is interested in finding the product features being commented on and the opinion polarity for each feature. In this paper, we focus on feature-level opinion mining and propose that it involves two major tasks, recognition and classification.
Recognition is the task of recognizing sentences expressing opinions; classification is the task of classifying the elements of an opinion sentence into different categories, such as opinion words/phrases and product features. Determining the polarity of opinion words (positive or negative) is also a classification task.

There has been previous work on feature-level opinion mining. (Zhuang et al., 2006) classified and summarized movie reviews by extracting high-frequency feature keywords and high-frequency opinion keywords; feature-opinion pairs were identified using a dependency grammar graph. However, their system used a fixed list of keywords to recognize high-frequency feature words, and its capability is therefore limited. (Popescu and Etzioni, 2005) proposed a relaxation labeling approach to find the semantic orientation of words. However, their approach only extracted feature words with frequency greater than an experimentally set threshold value and ignored low-frequency feature words. (Hu and Liu, 2004) proposed a statistical approach capturing high-frequency feature words by using association rules; infrequent feature words are captured by extracting the noun phrases adjacent to known opinion words. A summary is generated using the high-frequency feature words, and infrequent features are ignored. (Ding et al., 2008) further improved Hu's system by manually adding rules to handle different kinds of sentence structures. However, their capability of recognizing phrase features is limited by the accuracy of recognizing noun-group boundaries, and their approach also lacks an effective way to address infrequent features.

In this work, we propose a novel framework that naturally integrates linguistic features into automatic learning. The system can identify complex product-specific features (which may be low-frequency phrases in the reviews).
The system can also self-learn new vocabularies based on the patterns it has seen in the training data. Therefore, the system is able to predict potential features in the test dataset even without seeing them in the training set. These capabilities were not supported by previous approaches.

Figure 1. The system framework

Table 1. Definitions of entity categories and examples

COMPONENTS   Physical objects of a camera, including the camera itself, e.g., LCD, viewfinder, battery
FUNCTIONS    Capabilities provided by a camera, e.g., movie playback, zoom, automatic fill-flash, auto focus
FEATURES     Properties of components or functions, e.g., color, speed, size, weight, clarity
OPINIONS     Ideas and thoughts expressed by reviewers on product features / components / functions

3. The Proposed Techniques

Lexicalized HMMs were previously used for the part-of-speech (POS) tagging and named entity recognition (NER) problems. POS tagging is the process of marking up the words in a text (corpus) as corresponding to a particular part of speech, such as noun or verb. NER is the task of identifying and classifying person names, location names, organization names, etc. Correlating the web opinion mining task with POS tagging and NER may well be a significant contribution of this work in itself. We have adapted the techniques proposed in (Lee et al., 2000) and (Fu and Luke, 2005), originally for Korean part-of-speech tagging and Chinese named entity tagging respectively, to better suit our task. Figure 1 gives the architectural overview of our opinion mining system. Below, we discuss each of the sub-steps in turn.

3.1 Entity Categories and Tag Sets

In our work, we have defined four entity categories as shown in Table 1 (a digital camera is used as an example). Correspondingly, we have further defined a basic tag set to identify each of the above entity categories, which is given in Table 2. In general, an entity can be a single word or a phrase.
In other words, a word may present itself as an independent entity or as a component of an entity. Therefore, a word w in an entity may take one of the following four patterns:

1. w is an independent entity;
2. w is the beginning component of an entity;
3. w is in the middle of an entity;
4. w is at the end of an entity.

We use a pattern tag set to denote the above four patterns, which is given in Table 3.

Table 2. Basic tag set and its corresponding entities: Feature Entities; Component Entities; Function Entities; Explicit Positive Opinion Entities; Explicit Negative Opinion Entities; Implicit Positive Opinion Entities; Implicit Negative Opinion Entities; Background Words.

Table 3. Pattern tag set and its corresponding patterns: Independent Entities (single words); the Beginning Component of an Entity; the Middle Component of an Entity; the End of an Entity.

Both the basic tag set and the pattern tag set are used to represent each word's entity category and pattern (referred to as a hybrid tag representation, as proposed in (Fu and Luke, 2005)). The patterns of background words are treated as independent entities. This hybrid-tag labeling method is applied to all the training data and system outputs. For example, in the opinion sentence "I love the ease of transferring the pictures to my computer.", each word receives a hybrid tag (its basic category tag combined with its pattern tag) and, correspondingly, a basic tag.

3.2 Lexicalized HMMs

Different from traditional hidden Markov models, in our work we integrate linguistic features such as part-of-speech and lexical patterns into HMMs. An observable state is represented by a pair (word_i, POS(word_i)), where POS(word_i) represents the part-of-speech of word_i. The task is then described as follows: given a sequence of words W = w_1 w_2 w_3 ... w_n and corresponding parts-of-speech S = s_1 s_2 s_3 ... s_n, find the sequence of hybrid tags T = t_1 t_2 t_3 ... t_n that maximizes the conditional probability P(T | W, S), namely

T* = argmax_T P(T | W, S) = argmax_T [ P(W, S | T) P(T) / P(W, S) ]    (1)

Since the probability P(W, S) remains unchanged for all candidate tag sequences, we can disregard it. Thus, we have a general statistical model as follows:

T* = argmax_T P(W, S | T) P(T) = argmax_T P(S | T) P(W | T, S) P(T)
   = argmax_T PROD_{i=1..n} [ P(s_i | w_1...w_{i-1}, s_1...s_{i-1}, t_1...t_i)
                            x P(w_i | w_1...w_{i-1}, s_1...s_i, t_1...t_i)
                            x P(t_i | w_1...w_{i-1}, s_1...s_{i-1}, t_1...t_{i-1}) ]    (2)

Theoretically, this general model provides the system with a powerful capacity for disambiguation. However, in practice the general model is not computable, for it involves too many parameters. Two types of approximations are made to simplify it. The first approximation is based on the independence hypothesis used in standard HMMs: first-order HMMs are used in view of data sparseness, i.e., P(t_i | t_{i-K}...t_{i-1}) is approximated by P(t_i | t_{i-1}). The second approximation combines the POS information with the lexicalization technique, where three main hypotheses are made:

1. The assignment of the current tag t_i is supposed to depend not only on its previous tag t_{i-1} but also on the previous J (1 <= J <= i-1) words w_{i-J}...w_{i-1}.
2. The appearance of the current word w_i is assumed to depend not only on the current tag t_i and current POS s_i, but also on the previous K (1 <= K <= i-1) words w_{i-K}...w_{i-1}.
3. The appearance of the current POS s_i is supposed to depend both on the current tag t_i and on the previous L (1 <= L <= i-1) words w_{i-L}...w_{i-1}.

In view of the issue of data sparseness, we set J = K = L = 1.
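For concreteness, the space of hybrid tags (a basic category tag from Table 2 paired with a pattern tag from Table 3, with background words restricted to the independent pattern) can be enumerated with a small sketch; the tag abbreviations below are illustrative stand-ins, not the exact symbols of Tables 2 and 3:

```python
from itertools import product

# Illustrative abbreviations for the eight basic-tag categories of Table 2.
CATEGORY_TAGS = ["FEATURE", "COMPONENT", "FUNCTION",
                 "OPI_POS_EXP", "OPI_NEG_EXP", "OPI_POS_IMP", "OPI_NEG_IMP",
                 "BACKGROUND"]
# The four pattern tags of Table 3: independent, beginning, middle, end.
PATTERN_TAGS = ["INDEP", "BEGIN", "MID", "END"]

def candidate_hybrid_tags():
    """Enumerate category x pattern combinations; background words
    are always independent entities (Section 3.1)."""
    tags = []
    for cat, pat in product(CATEGORY_TAGS, PATTERN_TAGS):
        if cat == "BACKGROUND" and pat != "INDEP":
            continue  # background words never begin, continue, or end an entity
        tags.append(cat + "-" + pat)
    return tags

print(len(candidate_hybrid_tags()))  # 7*4 + 1 = 29
```

In decoding, only a subset of these 29 combinations is generated as candidates for each known word.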
Based on these assumptions, the general model in equation (2) can be rewritten as:

T* = argmax_T PROD_{i=1..n} [ P(s_i | w_{i-1}, t_i) x P(w_i | w_{i-1}, s_i, t_i) x P(t_i | w_{i-1}, t_{i-1}) ]    (3)

Maximum Likelihood Estimation (MLE) is used to estimate the parameters in equation (3). For instance, P(s_i | w_{i-1}, t_i) can be estimated as:

P(s_i | w_{i-1}, t_i) = C(w_{i-1}, t_i, s_i) / C(w_{i-1}, t_i) = C(w_{i-1}, t_i, s_i) / SUM_s C(w_{i-1}, t_i, s)    (4)

Note that the sum of the counts C(w_{i-1}, t_i, s) over all s is equivalent to the count C(w_{i-1}, t_i). MLE values for the other estimations in equation (3) can be computed similarly. Combining the lexicalization technique with POS information makes it possible to exploit richer contextual information for the assignment of tags to known words, including both contextual words and contextual tags, under the framework of HMMs. Consequently, the accuracy of the recognizer can be improved without losing efficiency in training and tagging.

If a large training corpus is available, the parameters in equation (3) can be easily estimated. However, MLE will yield zero probabilities for any cases not observed in the training data. To solve this problem, we employ the linear interpolation smoothing technique to smooth the higher-order models with their relevant lower-order models, that is, to smooth the lexicalized parameters using the related non-lexicalized probabilities, namely

P'(w_i | w_{i-1}, s_i, t_i) = lambda_1 P(w_i | w_{i-1}, s_i, t_i) + (1 - lambda_1) P(w_i | s_i, t_i)
P'(t_i | w_{i-1}, t_{i-1}) = lambda_2 P(t_i | w_{i-1}, t_{i-1}) + (1 - lambda_2) P(t_i | t_{i-1})
P'(s_i | w_{i-1}, t_i) = lambda_3 P(s_i | w_{i-1}, t_i) + (1 - lambda_3) P(s_i | t_i)    (5)

where lambda_1, lambda_2 and lambda_3 denote the interpolation coefficients. (We observed that large coefficient values achieved high precision but low recall, and low values achieved low precision but high recall. In terms of F-score, setting each coefficient to 0.7 achieved the best performance in our experiments.)

3.3 Tagging

Based on the above models, the tagging algorithm aims to find the most probable sequence of hybrid tags for a given sequence of known words and corresponding parts-of-speech. The algorithm is implemented in three major steps:

1. Generation of candidate tags: this step generates candidate hybrid tags for a sequence of known words and corresponding parts-of-speech. As discussed above, the candidate hybrid tags of a known word are the combinations of its candidate category tags and its candidate pattern tags.

2. Decoding of the best tag sequence: in this step, the Viterbi algorithm is employed to score all candidate hybrid tags with the proposed model and then search for the best path, i.e., the path with the maximal score.

3. Conversion of the results: for evaluation purposes, the hybrid-tag output produced by the tagger is further converted to the basic-tag format (see the example in Section 3.1) by merging consecutive known words into entities according to their patterns.

3.4 Opinion Sentence Extraction

This step identifies opinion sentences in the reviews. Opinion sentences in our work are defined as sentences that express an opinion on product-related entities. In the pruning step, the following two types of sentences are not considered effective opinion sentences:

1. Sentences that describe product-related entities without expressing reviewers' opinions.
2. Sentences that express opinions on another product model's entities (model numbers, such as DMCLS70S, P5100 and A570IS, can be easily identified using the regular expression "[A-Z-]+\d+([A-Za-z]+)?").

3.5 Determining Opinion Orientation

This step further classifies the opinion orientation for each identified product entity. Due to the complexity of natural language, opinion orientation is not simply equal to the orientation of the opinion entity (word/phrase). For example, in "I can tell you right now that the auto mode and the program modes are not that good.", the reviewer expresses a negative comment on both "auto mode" and "program modes" even though the opinion entity (the word "good") is present in the sentence. To determine opinion orientation, we first convert the hybrid-tagged sentences to basic-tagged sentences, and then for each recognized product entity we search for its matching opinion entity, defined as the nearest opinion word/phrase identified by the tagger. The orientation of this matching opinion entity becomes the initial opinion orientation for the corresponding product entity.

Next, natural language rules reflecting the sentence context are employed to deal with specific language constructs which may change the opinion orientation, such as the presence of negation words (e.g., not). The details of the implementation are shown in Algorithm 1. Lines 8 to 23 check for the presence of any negation word (e.g., not, didn't, don't) within a five-word distance in front of an opinion word/phrase and change the opinion orientation accordingly, except when:

1. A negation word appears in front of a coordinating conjunction (e.g., and, or, but). (lines 10-13)
2. A negation word appears after the appearance of a product entity during the backward search within the five-word window. (lines 14-17)

Lines 27 to 32 handle the coordinating conjunction "but" and prepositions such as "except" and "apart from".
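The backward negation search of lines 8 to 23 can be rendered as a minimal Python sketch. This is a simplification under our own assumptions: plain token strings stand in for tagged positions, and the negation and conjunction word lists are illustrative rather than the system's actual lexicons:

```python
NEGATIONS = {"not", "didn't", "don't", "never", "no"}
COORD_CONJUNCTIONS = {"and", "or", "but"}

def resolve_orientation(tokens, entity_pos, opinion_pos, initial_orientation):
    """Simplified backward negation search (Algorithm 1, lines 8-23):
    look up to five words before the opinion word; stop at a coordinating
    conjunction or once the search passes the product entity; a negation
    word flips the initial orientation."""
    orientation = initial_orientation
    for distance in range(1, 6):
        pos = opinion_pos - distance
        if pos < 0:
            break
        if tokens[pos] in COORD_CONJUNCTIONS:
            break  # exception 1: conjunction ends the search
        if pos <= entity_pos:
            break  # exception 2: search passed the product entity
        if tokens[pos] in NEGATIONS:
            orientation = "negative" if orientation == "positive" else "positive"
            break
    return orientation

sent = "the auto mode is not that good".split()
# "auto mode" ends at index 2; "good" (initially positive) is at index 6.
print(resolve_orientation(sent, 2, 6, "positive"))  # negative
```

On the paper's own example, the positive opinion word "good" is correctly flipped to a negative orientation for "auto mode" by the negation "not" inside the five-word window.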
The purpose of this procedure is to resolve the true opinion polarity for product entities when:

1. The opinion expression in the "but" clause is unresolvable, e.g., "It takes great movies with sound, but the sound can not be played back on the camera, only when it is connected to a TV or after you upload them." Here the reviewer expresses a negative comment on the camera's soundless movie playback.
2. The opinion is expressed on a group of product entities except some, e.g., "Everything about the camera is great, except they only went with a 2 battery system."

Algorithm 1 Determining Opinion Orientation

RESOLVE_OPINION_ORI(tagged OpinionSentence)
1.  FOR each product related entity fi in OpinionSentence
2.    corresponding opinion entity oi = fi's matching
3.      opinion word/phrase
4.    fi's initial opinion orientation = oi's orientation
5.
6.    // look backwards and search for negation words
7.    done = FALSE
8.    FOR (distance = 1; distance <= 5 && !done;
9.         distance++)
10.     IF ((oi's position - distance) is a coordinating
11.         conjunction)
12.       done = TRUE
13.     END IF
14.     IF ((oi's position - distance) is in front of fi's
15.         position)
16.       done = TRUE
17.     END IF
18.     IF ((oi's position - distance) is a negation word)
19.       done = TRUE
20.       fi's opinion orientation = opposite(fi's initial
21.         opinion orientation)
22.     END IF
23.   END FOR
24.
25.   // handling the conjunctions such as "but" and
26.   //   prepositions such as "except"
27.   IF oi is in front of fi
28.     IF "but/except" appears between oi and fi
29.       fi's opinion orientation = opposite(fi's initial
30.         opinion orientation)
31.     END IF
32.   END IF
33. END FOR

4. Experiments

We used Amazon's digital camera reviews as the evaluation dataset. The reviews for the first 16 unique cameras listed on Amazon.com during November 2007 were crawled. For each review page, the individual review content, the model number and the manufacturer name were extracted from the HTML documents. Sentence segmentation was applied to the data, and the information was stored as plain text documents, which we call review documents. POS parsing was then applied to each review document; we used the part-of-speech tagger designed by the Stanford NLP Group (http://nlp.stanford.edu/software/lex-parser.shtml) with its default settings.

4.1 Training Design

After downloading and pre-processing, 1728 review documents were obtained. We separated the documents into two sets. One set (293 documents for 6 cameras) was manually tagged by experts: for each opinion sentence, product entities, opinion entities and opinion orientations were manually labeled using the tag sets described in Section 3.1. The remaining documents (1435 documents for 10 cameras) were used by the bootstrapping process to self-learn new vocabularies (described next).

Additionally, we observed two challenges in this task. One was that people use inconsistent terminologies to describe product entities; the other arose from recognizing rarely mentioned entities (i.e., infrequent entities). For example, only two reviewers mentioned "poor design on battery door" in the training data. Although this is not a frequent entity, it provides valuable information to potential customers. Again, different terms were used to describe it, such as battery cover, battery/SD door, battery/SD cover, and battery/SD card cover. To reduce the effort of manually labeling a large set of training documents and to solve the above problems, we propose a novel bootstrapping approach that enables self-directed learning, which can be employed in situations where collecting a large training set would be expensive and difficult to accomplish.

4.2 Bootstrapping

Labeling training documents manually is a labor-intensive task. Thus, it would be desirable if the system could discover new vocabularies automatically by using what it has learned. To achieve this, we have designed a bootstrapping approach which can extract high-confidence data through self-learning. The process is shown in Figure 2 and is composed of the following steps:

Figure 2. The bootstrapping process

1. First, the bootstrapping program creates two child processes. The parent process acts as master and the children act as workers. The master is responsible for coordinating the bootstrapping process and for extracting and distributing high-confidence data to each worker.
2. We split the training documents into two halves, t1 and t2, by random selection. Each half is used as the seed set for one worker's HMM.
3. At the initial stage (0th iteration), each worker trains its own HMM classifier on its training set; each worker's trained HMM is then used to tag the documents in the bootstrap document set, producing a new set of tagged review documents.
4. As the two workers' training documents differ from each other, the tagging results from step 3 may be inconsistent. Therefore, after the tagging step, the master inspects each sentence tagged by each HMM classifier and extracts only the opinion sentences agreed upon by both classifiers. In the experiments, only identical sentences with identical tags were considered to agree with one another.
5. A hash value is then calculated for each opinion sentence extracted in step 4 and compared with those of the sentences already stored in the database. (The database contains the newly discovered data from the bootstrap process and is initialized to empty in the first bootstrap cycle.)

Figure 3. Bootstrapping results for entity extraction (F-value over bootstrap iterations for feature, component, function and all entities)
6. If a sentence is newly discovered, the master stores it in the database. The master then randomly splits the newly discovered data in the database into two halves, t1' and t2', and adds t1' and t2' to the training sets of the two workers, respectively.

This bootstrap process is repeated until no more new data are discovered. Figures 3 and 4 show the experimental results obtained from each bootstrap cycle for one of the products used in our experiments.

Figure 4. Bootstrapping results for opinion sentence extraction and entity-opinion pair orientation (F-value over bootstrap iterations)

4.3 Evaluation

As mentioned above, the review documents for 6 cameras were manually labeled by experts. We chose the largest four data sets (containing 270 documents) and performed 4-fold cross-validation. The remaining review documents for 2 cameras (containing 23 documents) were used for training only. The bootstrap document set (containing 1435 documents for 10 cameras) was used by the bootstrapping process to extract high-confidence data through self-learning (the newly discovered high-confidence data were then added to the original training set in each iteration). Finally, our best classifier was trained on the accumulated truth data collected from the original training set and the bootstrap data set, applied to our test data, and evaluated against the baseline.

To evaluate the effectiveness of the proposed framework, we measured the recall, precision and F-score of the extracted entities, opinion sentences and opinion orientations, respectively. System performance is evaluated by comparing the results tagged by the system with the manually tagged truth data. Only an exact match is considered a correct recognition in our evaluation. For entity recognition, this means the exact same word/phrase is identified and classified correctly into one of the four pre-defined entity categories. Furthermore, each identified entity should occur in the same sentence, same position and same document as in the truth data. For opinion sentence extraction, an exact match means the exact same sentence from the same document is identified as in the truth data. For opinion orientation classification, an exact match means the exact same entity and entity category are identified with the correct orientation (positive or negative).

We designed and implemented a rule-based baseline system motivated by the approaches of (Turney, 2002) and (Hu and Liu, 2004). (Turney, 2002) described a document-level opinion mining system that uses a number of rules to identify opinion-bearing words. In our baseline system, the rules shown in Table 4 were used to extract product entities and opinion-bearing words. This was accomplished by searching for any nouns and adjectives matching the rules: matching nouns (considered product entities) and matching adjectives (considered opinion words) were extracted, and the corresponding sentences were identified as opinion sentences. In the next step, the semantic orientations of the identified adjectives were determined. We used twenty-five commonly used positive adjectives and twenty-five commonly used negative adjectives as seeds. Using the bootstrapping technique proposed in (Hu and Liu, 2004), we expanded these two seed lists by searching for synonyms and antonyms of each seed word; newly discovered words were added to the corresponding seed lists, and this process was repeated until no new words were discovered. As the semantic orientation of each list of adjectives is known, the orientations of the adjectives extracted by the system can be determined by checking for their presence in the lists. The detailed evaluation results are presented in Tables 5, 6 and 7.
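The baseline's POS-trigram extraction rules (Table 4) can be encoded directly; this sketch uses our own helper names (`RULES`, `matches_any_rule`) and standard Penn Treebank tags, with `None` meaning "anything" and `"not-noun"` meaning the third word must not be a noun:

```python
NOUNS = {"NN", "NNS"}

# Table 4's four rules as (first, second, third) POS constraints.
RULES = [
    ({"JJ"}, NOUNS, None),                    # rule 1: JJ + noun + anything
    ({"RB", "RBR", "RBS"}, {"JJ"}, NOUNS),    # rule 2: adverb + JJ + noun
    ({"JJ"}, {"JJ"}, NOUNS),                  # rule 3: JJ + JJ + noun
    (NOUNS, {"JJ"}, "not-noun"),              # rule 4: third word not extracted
]

def matches_any_rule(pos_trigram):
    """Return True if a POS trigram matches any baseline extraction rule."""
    p1, p2, p3 = pos_trigram
    for first, second, third in RULES:
        if p1 not in first or p2 not in second:
            continue
        if third is None:
            return True
        if third == "not-noun" and p3 not in NOUNS:
            return True
        if isinstance(third, set) and p3 in third:
            return True
    return False

print(matches_any_rule(("JJ", "NN", "DT")))   # True  (rule 1)
print(matches_any_rule(("NN", "JJ", "NNS")))  # False (rule 4's third-word exclusion)
```

In the baseline, the matched nouns become candidate product entities and the matched adjectives become candidate opinion words.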
As a post-analysis, the proposed machine learning approach performs significantly better than the rule-based baseline system in terms of entity extraction, opinion sentence recognition and opinion polarity classification. Through manual inspection, we observe that the approach effectively identifies highly specific product-related entities and opinion expressions (usually complex phrases, e.g., auto red eye correction) and discovers new vocabularies (e.g., automatic point-and-shoot mode) in the test dataset based on the patterns it has learned.

Table 4. Baseline rules for extracting product entities and opinion-bearing words

     FIRST WORD         SECOND WORD   THIRD WORD
1    JJ                 NN or NNS     anything
2    RB, RBR or RBS     JJ            NN or NNS
3    JJ                 JJ            NN or NNS
4    NN or NNS          JJ            not NN nor NNS (not extracted)

5. Conclusions

This paper proposes a novel and robust machine learning approach for web opinion mining and extraction. The model provides solutions for several problems that have not been addressed by previous approaches. Specifically, the system can self-learn new vocabularies based on the patterns it has learned, which is extremely useful in text and web mining due to the complexity and flexibility of natural language, and which was not supported by previous rule-based or statistical approaches. Complex product entities and opinion expressions, as well as infrequently mentioned entities, can be effectively and efficiently identified; these were under-analyzed or ignored by previously proposed methods. A novel bootstrapping approach is employed to handle situations in which collecting a large training set would be expensive and difficult to accomplish. Future directions include expanding the datasets from digital product reviews to other product reviews. We are also researching the role of pronoun resolution in improving the mining results.

References

Ding, X., Liu, B. and Yu, P. S. (2008). A Holistic Lexicon-based Approach to Opinion Mining. Proceedings of the International Conference on Web Search and Web Data Mining (WSDM'08), 231-239.

Fu, G. and Luke, K. K. (2005). Chinese Named Entity Recognition using Lexicalized HMMs. ACM SIGKDD Explorations Newsletter, 7(1), 19-25.

Hu, M. and Liu, B. (2004). Mining and Summarizing Customer Reviews. Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'04), 168-177.

Lee, S. Z., Tsujii, J. and Rim, H. C. (2000). Lexicalized Hidden Markov Models for Part-of-Speech Tagging. Proceedings of the 18th International Conference on Computational Linguistics (COLING'00), 481-487.

Popescu, A. and Etzioni, O. (2005). Extracting Product Features and Opinions from Reviews. Proceedings of the 2005 Conference on Empirical Methods in Natural Language Processing (EMNLP'05), 339-346.

Turney, P. D. (2002). Thumbs up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL'02), 417-424.

Turney, P. D. and Littman, M. L. (2003). Measuring praise and criticism: Inference of semantic orientation from association. ACM Transactions on Information Systems, 21(4), 315-346.

Zhuang, L., Jing, F. and Zhu, X. (2006). Movie Review Mining and Summarization. Proceedings of the 15th ACM International Conference on Information and Knowledge Management (CIKM'06), 43-50.

Table 5.
Experimental results on entity extraction for each category

                                          FEATURE ENTITY         COMPONENT ENTITY       FUNCTION ENTITY
PRODUCTS  METHODS                         R(%)   P(%)   F(%)     R(%)   P(%)   F(%)     R(%)   P(%)   F(%)
CAMERA A  L-HMM+POS+Bootstrapping         85.81  80.78  83.22    83.33  83.33  83.33    70.31  75.00  72.58
          L-HMM+POS                       82.01  77.70  79.80    73.08  73.08  73.08    65.63  70.00  67.74
          L-HMM                           80.74  75.86  78.22    70.36  70.30  70.33    60.19  67.19  63.50
CAMERA B  L-HMM+POS+Bootstrapping         89.12  75.72  81.88    77.66  79.35  78.49    60.87  82.35  70.00
          L-HMM+POS                       84.35  72.09  77.74    74.47  70.92  72.65    52.17  75.00  61.54
          L-HMM                           80.67  71.51  75.81    71.66  70.46  71.05    47.83  64.71  55.00
CAMERA C  L-HMM+POS+Bootstrapping         76.62  81.94  79.19    100.0  83.67  91.11    63.64  87.50  73.69
          L-HMM+POS                       74.03  80.28  77.03    97.56  78.43  86.96    63.64  80.50  71.08
          L-HMM                           70.32  77.33  73.66    97.56  74.07  84.21    63.64  70.00  66.67

Table 6. Experimental results on entity extraction for all categories

                                          ALL ENTITIES (TOTAL)
PRODUCTS  METHODS                         R(%)   P(%)   F(%)
CAMERA A  L-HMM+POS+Bootstrapping         83.10  80.88  81.98
          L-HMM+POS                       77.21  75.43  76.31
          L-HMM                           75.78  73.18  74.46
          Baseline                        20.43  29.97  24.30
CAMERA B  L-HMM+POS+Bootstrapping         82.58  77.30  79.85
          L-HMM+POS                       78.03  71.84  74.81
          L-HMM                           74.81  70.87  72.78
          Baseline                        15.53  24.26  18.94
CAMERA C  L-HMM+POS+Bootstrapping         82.95  82.95  82.95
          L-HMM+POS                       80.62  79.60  80.11
          L-HMM                           78.23  75.54  76.86
          Baseline                        17.05  23.66  19.82

Table 7.
Experimental results on opinion sentence identification and opinion orientation classification

                                          OPINION SENTENCE EXTRACTION    ENTITY-OPINION PAIR ORIENTATION
                                          (SENTENCE LEVEL)               (FEATURE LEVEL)
PRODUCTS  METHODS                         R(%)   P(%)   F(%)             R(%)   P(%)   F(%)
CAMERA A  L-HMM+POS+Bootstrapping         90.72  85.71  88.15            78.98  76.86  77.91
          L-HMM+POS                       87.63  83.88  85.71            74.26  72.55  73.40
          L-HMM                           86.32  82.11  84.16            73.25  69.89  71.53
          Baseline                        51.89  60.64  55.93            19.65  28.82  23.36
CAMERA B  L-HMM+POS+Bootstrapping         87.95  82.95  85.38            75.00  70.21  72.53
          L-HMM+POS                       86.75  81.82  84.21            69.70  65.95  67.77
          L-HMM                           85.14  80.29  82.64            68.45  65.02  66.69
          Baseline                        46.39  57.04  51.56            13.26  20.71  16.17
CAMERA C  L-HMM+POS+Bootstrapping         83.91  82.95  83.43            77.52  77.52  77.52
          L-HMM+POS                       80.46  80.34  80.40            72.87  72.31  72.59
          L-HMM                           79.76  78.82  79.29            72.09  66.91  69.40
          Baseline                        43.68  54.29  48.41            17.05  23.66  19.82
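The F(%) columns are consistent with the standard balanced F-score, the harmonic mean of precision and recall; for instance, in the Camera C row of Table 6 where R = P = 82.95, F is also 82.95. A quick check, assuming that formula:

```python
def f_score(precision, recall):
    """Balanced F-score: the harmonic mean of precision and recall,
    as reported in the F(%) columns of Tables 5-7."""
    return 2 * precision * recall / (precision + recall)

# Camera C, L-HMM+POS+Bootstrapping, all entities (Table 6): R = P = 82.95.
print(round(f_score(82.95, 82.95), 2))  # 82.95
```

Small discrepancies in the last decimal place elsewhere in the tables are expected, since the published R and P values are themselves rounded to two decimals.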