Lexical and Syntactic Priming and Their Impact in Deployed Spoken Dialog Systems

Svetlana Stoyanchev and Amanda Stent
Department of Computer Science, Stony Brook University, Stony Brook, NY 11794-4400, USA
svetastenchikova@gmail.com, amanda.stent@stonybrook.edu

Abstract

In this paper, we examine user adaptation to the system's lexical and syntactic choices in the context of the deployed Let's Go! dialog system. We show that in deployed dialog systems with real users, as in laboratory experiments, users adapt to the system's lexical and syntactic choices. We also show that the system's lexical and syntactic choices, and the consequent user adaptation, can have an impact on recognition of task-related concepts. This means that system prompt formulation, even in flexible input dialog systems, can be used to guide users into producing utterances conducive to task success.

1 Introduction

Numerous studies have shown that people adapt their syntactic and lexical choices in conversation to those of their conversational partners, both human (Brennan, 1996; Pickering et al., 2000; Lockridge and Brennan, 2002; Reitter et al., 2006) and computer (Branigan et al., 2003; Brennan, 1991; Brennan, 1996; Gustafson et al., 1997; Ward and Litman, 2007). User adaptation to the system's lexical and syntactic choices can be particularly useful in flexible input dialog systems. Limited input dialog systems, including most commercial systems, require the user to respond to each system prompt using only the concepts and words currently requested by the system. Flexible input dialog systems allow the user to respond to system prompts with concepts and words in addition to or other than the ones currently requested, and may even allow the user to take task initiative. Speech recognition (ASR) accuracy in limited input systems is better than in flexible input systems (Danieli and Gerbino, 1995; Smith and Gordon, 1997). However, task completion rates and times are better in flexible input systems (Chu-Carroll and Nickerson, 2000; Smith and Gordon, 1997). With user adaptation, prompts in flexible input dialog systems can be formulated to maximize ASR accuracy and reduce the number of ASR timeouts (Sheeder and Balogh, 2003).

Previous research on user adaptation to dialog systems was conducted in laboratory settings. However, the behavior of recruited subjects in a quiet laboratory may differ from that of real users in the noisy world (Ai et al., 2007). Here we present the first study, to the best of our knowledge, that investigates the adaptive behavior of real users of a live dialog system. We analyze dialogs from CMU's Let's Go! dialog system (Raux et al., 2005). We look at the effects of the system's lexical and syntactic choices on: 1) lexical and syntactic choices in user responses; and 2) concept identification rates for user responses. We confirm prior results showing that users adapt to the system's lexical and syntactic choices. We also show that particular choices for system prompts can lead to higher concept identification rates.

2 Experimental Method

We conducted our experiment using the Let's Go! telephone-based spoken dialog system, which provides information about bus routes in Pittsburgh (Raux et al., 2005). The users are naive callers from the general population seeking information about bus schedules. In order to provide the user with route information, Let's Go! elicits a departure location, a destination, a departure time, and optionally a bus route number. Each concept value provided by the user is explicitly confirmed by the system. Figure 1 shows an example dialog with the system.

  Sys (Open): Welcome to the CMU Let's Go bus information system. What can I do for you?
  Usr: 61A schedule
  Sys (Request Departure Location): Where do you wanna leave from?
  Usr: From downtown
  Sys (Confirm Departure Location): Leaving from downtown. Is this correct?
  Usr: Yes
  Sys (Request Arrival Location): Where are you going to?
  Usr: Oakland
  Sys (Confirm Arrival Location): Going to Waterfront. Is this correct?
  Usr: No, to Oakland

Figure 1: Dialog extract from Let's Go! data

Let's Go! is a flexible input dialog system. The user can respond to a system prompt using a single word or short phrase, e.g. Downtown, or a complete sentence, e.g. I am leaving from downtown. The user response can also contain concepts not requested in the prompt, e.g. specifying a departure location and a bus number in one response.
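For illustration only, the following Python sketch mimics the elicit-and-confirm flow described above; the slot vocabulary, prompts, and toy concept extractor are assumptions of the example and not the actual Let's Go! implementation.

    # A minimal, hypothetical sketch of an elicit-and-confirm loop like the one
    # described above. Slot names, prompts, and the toy keyword matcher are
    # illustrative assumptions, not the actual Let's Go! code.
    KNOWN_PLACES = {"downtown", "oakland", "waterfront"}

    def parse_concepts(text):
        """Toy concept extractor; a flexible input system may find several
        concepts (e.g. a place and a bus route) in a single response."""
        concepts = {}
        for token in text.lower().replace(",", " ").split():
            if token in KNOWN_PLACES:
                concepts.setdefault("place", token)
            elif token.rstrip("abc").isdigit():   # e.g. "61a" looks like a route
                concepts["route"] = token
        return concepts

    def elicit_and_confirm(prompt, ask, confirm):
        """Keep prompting until a place is heard and explicitly confirmed."""
        while True:
            concepts = parse_concepts(ask(prompt))
            if "place" in concepts and confirm(f"{concepts['place']}, is this correct?"):
                return concepts["place"]

    if __name__ == "__main__":
        # Scripted I/O stands in for ASR and the caller, for illustration only.
        answers = iter(["from downtown", "to oakland"])
        ask = lambda prompt: next(answers)
        confirm = lambda prompt: True
        print(elicit_and_confirm("Where are you leaving from?", ask, confirm))  # downtown
        print(elicit_and_confirm("Where are you going to?", ask, confirm))      # oakland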
We ran four experimental conditions for two months. The conditions varied in the lexical choice and syntax of the system prompts for the two request location tasks and the two confirm location tasks (see Table 1). System prompts differed in the presence of a verb (to leave, to go) or a preposition (to, from), and in the syntactic form of the verb. The request location prompt contained both a verb and a preposition in conditions 1, 2, and 4. The confirm location prompt contained both a verb and a preposition in conditions 1 and 4, only a preposition in condition 2, and neither a verb nor a preposition in condition 3. In conditions 1 and 4, both request and confirmation prompts differed in verb form (leaving/leave, going/go). 2184 dialogs were used for this analysis.

  request departure location: (1) Where are you leaving from? (2) Where are you leaving from? (3) What is the place of your departure? (4) Where do you want to leave from?
  confirm departure location: (1) Leaving from X, is this correct? (2) From X, is this correct? (3) X, is this correct? (4) You want to leave from X, is this correct?
  request arrival location: (1) Where are you going to? (2) Where are you going to? (3) What is the place of your arrival? (4) Where do you want to go to?
  confirm arrival location: (1) Going to X, is this correct? (2) To X, is this correct? (3) X, is this correct? (4) You want to go to X, is this correct?

Table 1: Experimental conditions

For each experimental condition, we counted the percentages of verbs, verb forms, prepositions, and locations in the ASR output for user responses to system request location and confirm location prompts. Although the data contains recognition errors, the only difference in system functionality between the conditions is the formulation of the system prompt, so any statistically significant difference in user responses between conditions can be attributed to the formulation of the prompt.
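The counting analysis described above can be sketched as follows; the (condition, prompt type, ASR hypothesis) record format and the example data are hypothetical, not the actual Let's Go! log format.

    # Rough sketch of the per-condition counting analysis described above,
    # assuming a hypothetical list of (condition, prompt_type, asr_hypothesis)
    # records; the field names and example data are illustrative only.
    from collections import Counter

    VERBS = {"leave", "leaving", "go", "going"}
    PREPOSITIONS = {"to", "from"}

    def tally(records):
        """For each (condition, prompt_type), return the fraction of ASR
        hypotheses containing an action verb and a preposition."""
        totals, with_verb, with_prep = Counter(), Counter(), Counter()
        for condition, prompt_type, hypothesis in records:
            key = (condition, prompt_type)
            tokens = set(hypothesis.lower().split())
            totals[key] += 1
            with_verb[key] += bool(tokens & VERBS)
            with_prep[key] += bool(tokens & PREPOSITIONS)
        return {key: (with_verb[key] / totals[key], with_prep[key] / totals[key])
                for key in totals}

    records = [
        (1, "request_location", "i am leaving from downtown"),
        (1, "request_location", "downtown"),
        (3, "request_location", "oakland"),
    ]
    print(tally(records))
    # {(1, 'request_location'): (0.5, 0.5), (3, 'request_location'): (0.0, 0.0)}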
3 Syntactic Adaptation

We analyze whether users are more likely to use action verbs (leave, leaving, go, or going) and prepositions (to, from) in response to system prompts that use a verb or a preposition. This analysis is interesting because ASR partially relies on context words, i.e., words related to a particular concept type such as place, time, or bus route. For example, the likelihood of correctly recognizing the location Oakland in the utterance "going to Oakland" is different from the likelihood of correctly recognizing the single-word utterance "Oakland".

Table 2 shows the percentages of user responses in each experimental condition that contain a verb and/or a preposition.

  Cond.   Sys uses verb   Sys uses prep   % with verb   % with prep
  Responses to request location prompt
  (1)     yes             yes             2.3%          5.6%
  (2)     yes             yes             1.9%          4.3%
  (3)     no              no              0.7%          4.5%
  (4)     yes             yes             2.4%          6.0%
  Responses to confirm location prompt
  (1)     yes             yes             15.7%         23.4%
  (2)     no              yes             3.9%          16.9%
  (3)     no              no              6.4%          12.7%
  (4)     yes             yes             10.8%         22.0%

Table 2: Percentages of user utterances containing verbs and prepositions (significant differences from the no-action-verb condition (3) and from the no-action-verb-in-confirmation condition (2) are discussed in the text)

We observe adaptation to the presence of a verb in user responses to request location prompts. The prompts in conditions 1, 2, and 4 contain a verb, while those in condition 3 do not. The differences between conditions 1 and 3, and between conditions 4 and 3, are statistically significant (p<0.01; all analyses in this section are t-tests with Bonferroni adjustment). The difference between conditions 2 and 3 is not statistically significant, perhaps due to the absence of a verb in a prior confirm location prompt.

A similar adaptation to the presence of a verb in the system prompt is seen in user responses to confirm location prompts. The prompts in conditions 1 and 4 contain a verb while those in conditions 2 and 3 do not. The differences between conditions 1 and 2, and between conditions 1 and 3, are statistically significant (p<0.01), while the difference between conditions 4 and 2 exhibits a trend. We hypothesize that the lack of statistically significant differences between conditions 4 and 2, and between conditions 4 and 3, is caused by the low relative frequency of condition 4 dialogs in our data.

We do not find statistically significant differences in the use of prepositions. However, we observe a trend toward a higher likelihood of a preposition in user responses to confirm location prompts in the conditions where the system uses a preposition. Prepositions are short closed-class context words that are more likely to be misrecognized (Goldwater et al., 2008). Hence, more data (or human transcription) may be required to see a statistically significant effect.
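The comparisons in this section are t-tests on per-utterance 0/1 indicators (e.g. "response contains an action verb") with a Bonferroni adjustment. The sketch below shows one such comparison; the per-condition counts are invented for illustration, since only the resulting percentages are reported above.

    # Sketch of a single comparison of the kind reported above: a two-sample
    # t-test on 0/1 indicators ("response contains an action verb") with a
    # Bonferroni-adjusted threshold. The counts are hypothetical; the paper
    # reports only the resulting percentages, not the per-condition sizes.
    import numpy as np
    from scipy import stats

    def indicator_sample(n_with_verb, n_total):
        """0/1 vector: 1 if the response contained an action verb."""
        return np.array([1] * n_with_verb + [0] * (n_total - n_with_verb))

    # Hypothetical counts, condition 1 vs. condition 3 (request location prompts).
    cond1 = indicator_sample(n_with_verb=14, n_total=600)  # ~2.3% with verb
    cond3 = indicator_sample(n_with_verb=4, n_total=580)   # ~0.7% with verb

    t_stat, p_value = stats.ttest_ind(cond1, cond3)

    n_comparisons = 3               # e.g. conditions 1, 2, 4 each vs. condition 3
    alpha = 0.01 / n_comparisons    # Bonferroni-adjusted threshold
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}, significant: {p_value < alpha}")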
4 Lexical Adaptation

We analyze whether the system's choice of a particular verb form affects the user's choice of verb form. For this analysis we only consider user utterances, in response to a request location or confirm location prompt, that contain a concept and at least one of the verb forms leaving, going, leave, or go; such utterances constitute 3% of all user responses to request and confirm place prompts in our data. Table 3 shows the total counts and percentages of each verb form in the progressive form condition (condition 1), the neutral condition (condition 3), and the simple form condition (condition 4). (We ignore condition 2, where the verb is used only in the request prompt.)

  Condition        LEAVING (progressive)   LEAVE (simple)   total
  (1) Progressive  74.5%                   25.5%            55
  (3) Neutral      61.3%                   38.7%            31
  (4) Simple       43%                     57%              42

  Condition        GOING (progressive)     GO (simple)      total
  (1) Progressive  84.4%                   15.6%            45
  (3) Neutral      66.6%                   33.4%            21
  (4) Simple       46.5%                   53.5%            43

Table 3: Usage of verb forms in user utterances

We find that the system's choice of verb form has a statistically significant impact on the user's choice (χ2 test, p<0.01). In the neutral condition, users are more likely to choose the progressive verb form. In the progressive form condition, this preference increases by 13.2% for the verb to leave and by 17.8% for the verb to go. By contrast, in the simple form condition, this preference decreases by 18.3% for the verb to leave and by 20.1% for the verb to go, making users slightly more likely to choose the simple verb form than the progressive verb form.
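As a rough check of this test, the sketch below runs chi-square tests on contingency tables whose counts are back-computed (and rounded) from the percentages and totals in Table 3; testing each verb separately is a choice of the example, not necessarily how the reported test was pooled.

    # Approximate chi-square test on counts reconstructed from Table 3
    # (percentage x total, rounded). The exact counts and pooling used in the
    # reported test may differ; this is an illustrative check only.
    from scipy.stats import chi2_contingency

    # Rows: progressive-prompt condition (1), neutral (3), simple (4).
    # Columns: progressive user verb form, simple user verb form.
    leave_counts = [[41, 14],   # condition 1: LEAVING vs. LEAVE
                    [19, 12],   # condition 3
                    [18, 24]]   # condition 4

    go_counts = [[38, 7],       # condition 1: GOING vs. GO
                 [14, 7],       # condition 3
                 [20, 23]]      # condition 4

    for name, table in (("leave", leave_counts), ("go", go_counts)):
        chi2, p, dof, _expected = chi2_contingency(table)
        print(f"{name}: chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")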
5 Effect of Adaptation on Speech Recognition Performance

The correct identification and recognition of task-related concepts in user utterances is an essential functionality of a dialog system. Table 4 shows the percentage of user utterances following a request location prompt that contain an automatically recognized location concept. Condition 4, where the system prompt uses the simple verb form (to leave, to go), achieves the highest concept identification rates. The differences in concept identification rates between conditions 1 and 4, and between conditions 3 and 4, are statistically significant for the request arrival location prompt (t-test, p<0.01 with Bonferroni adjustment). Other differences are not statistically significant, perhaps due to lack of data.

  System prompt       (1)      (2)      (3)      (4)
  Arrival request     72.2%    77.4%    74.5%    82.0%
  Departure request   63.8%    61.0%    61.5%    66.0%

Table 4: Concept identification rates following request location prompts (significant differences from condition 4 are discussed in the text)

6 Conclusions and Future Work

In this paper, we showed that in deployed dialog systems with real users, as in laboratory experiments, users adapt to the lexical and syntactic choices of the system. We also showed that user adaptation to system prompts can have an impact on recognition of task-related concepts. This means that the formulation of system prompts, even in flexible input dialog systems, can be used to guide users into producing utterances conducive to task success.

In future work, we plan to confirm these results using transcribed data. We also plan additional experiments on adaptation in Let's Go!, including an analysis of the time course of adaptation and further analyses of the impact of adaptation on ASR performance.

7 Acknowledgements

We would like to thank the Let's Go! researchers at CMU for making Let's Go! available. This research was supported by the NSF under grant no. 0325188.

References

H. Ai, A. Raux, D. Bohus, M. Eskenazi, and D. Litman. 2007. Comparing spoken dialog corpora collected with recruited subjects versus real users. In Proceedings of SIGDial.
H. Branigan, M. Pickering, J. Pearson, J. McLean, and C. Nass. 2003. Syntactic alignment between computers and people: the role of belief about mental states. In Proceedings of CogSci.
S. Brennan. 1991. Conversation with and through computers. User Modeling and User-Adapted Interaction, 1(1):67-86.
S. Brennan. 1996. Lexical entrainment in spontaneous dialog. In Proceedings of ISSD.
J. Chu-Carroll and J. Nickerson. 2000. Evaluating automatic dialogue strategy adaptation for a spoken dialogue system. In Proceedings of NAACL.
M. Danieli and E. Gerbino. 1995. Metrics for evaluating dialogue strategies in a spoken language system. In Proceedings of the AAAI Spring Symposium on Empirical Methods in Discourse Interpretation and Generation.
S. Goldwater, D. Jurafsky, and C. Manning. 2008. Which words are hard to recognize? Lexical, prosodic, and disfluency factors that increase ASR error rates. In Proceedings of ACL/HLT.
J. Gustafson, A. Larsson, R. Carlson, and K. Hellman. 1997. How do system questions influence lexical choices in user answers? In Proceedings of Eurospeech.
C. Lockridge and S. Brennan. 2002. Addressees' needs influence speakers' early syntactic choices. Psychonomic Bulletin and Review.
M. Pickering, H. Branigan, A. Cleland, and A. Stewart. 2000. Activation of syntactic priming during language production. Journal of Psycholinguistic Research, 29(2):205-216.
A. Raux, B. Langner, A. Black, and M. Eskenazi. 2005. Let's Go public! Taking a spoken dialog system to the real world. In Proceedings of Eurospeech.
E. Reitter, J. Moore, and F. Keller. 2006. Priming of syntactic rules in task-oriented dialogue and spontaneous conversation. In Proceedings of CogSci.
T. Sheeder and J. Balogh. 2003. Say it like you mean it: priming for structure in caller responses to a spoken dialog system. International Journal of Speech Technology, 6(2):103-111.
R. Smith and S. Gordon. 1997. Effects of variable initiative on linguistic behavior in human-computer spoken natural language dialogue. Computational Linguistics, 23(1):141-168.
A. Ward and D. Litman. 2007. Automatically measuring lexical and acoustic/prosodic convergence in tutorial dialog corpora. In Proceedings of the SLaTE Workshop on Speech and Language Technology in Education.