A Flexible Approach to Natural Language Generation for Disabled Children
Pradipta Biswas School of Information Technology Indian Institute of Technology, Kharagpur 721302 INDIA pbiswas@sit.iitkgp.ernet.in
a tree structured template organization that resembles Tag Adjoining Grammar (TAG) structure. The template-based approach that has been taken in the system, enables the basic language generation algorithms application independent and language independent. At the final stage of language generation it checks the compatibility of the sentence structure with the current context and validates the result with Chomsky's binding theory. For this reason it is claimed to be as well founded as any plan-based approach. As another practical example of NLG technique, we can consider the IBM MASTOR system (Liu et. al., 2003). It is used as speech-to-speech translator between English and Mandarin Chinese. The NLG part of this system uses trigram language model for selecting appropriate inflectional form for target language generation. When NLG (or NLP) technology is applied in assistive technology, the focus is shifted to increase communication rate rather than increasing the efficiency of input representation. As for example, CHAT (Alm, 1992) software is an attempt to develop a predictive conversation model to achieve higher communication rate during conversation. This software predicts different sentences depending on situation and mood of the user. The user is free to change the situation or mood with a few keystrokes. In "Compansion" project (McCoy, 1997), a novel approach was taken to enhance the communication rate. The system takes telegraphic message as input and automatically produces grammatically correct sentences as output based on NLP techniques. The KOMBE Project (Pasero, 1994) tries to enhance the communication rate in a different way. It predicts a sentence or a set of sentence by taking sequence of words from users. The Sanyog project (Sanyog, 2006)(Banerjee, 2005) initiates a dialog with the users to take different portions (eg. Subject, verb, predicate etc.) of a sentence and automatically constructs a grammatically correct sentence based on NLG techniques.

Abstract
Natural Language Generation (NLG) is a way to automatically realize a correct expression in response to a communicative goal. This technology is mainly explored in the fields of machine translation, report generation, dialog system etc. In this paper we have explored the NLG technique for another novel applicationassisting disabled children to take part in conversation. The limited physical ability and mental maturity of our intended users made the NLG approach different from others. We have taken a flexible approach where main emphasis is given on flexibility and usability of the system. The evaluation results show this technique can increase the communication rate of users during a conversation.

1

Introduction

`Natural Language Generation' also known as `Automated Discourse Generation' or simply `Text Generation', is a branch of computational linguistics, which deals with automatic generation of text in natural human language by the machine. It can be conceptualized as a process leading from a high level communicative goal to a sequence of communicative acts that accomplish this communicative goal (Rambow et. al., 2001). Based on input representation, any NLG technique can be broadly classified into two paradigms viz. Template based Approach and Plan based approach. The template-based approach does not need large linguistic knowledge resource but it cannot provide the expressiveness or flexibility needed for many real domains (Langkilde and Knight, 1998). In (Deemter et. al., 1999), it has been tried to prove with the example of a system (D2S: Direct to Speech) that both of the approaches are equally powerful and theoretically well founded. The D2S system uses

1
Proceedings of the COLING/ACL 2006 Student Research Workshop, pages 1­6, Sydney, July 2006. c 2006 Association for Computational Linguistics


2

The Proposed Approach

2.1

Templates fill up

The present system is intended to be used by children with severe speech and motorimpairment. It will cater those children who can understand different parts of a sentence (like subject, object, verb etc.) but do not have the competence to construct a grammatically correct sentence by properly arranging words. The intended audience offers both advantages and challenges to our NLG technique. The advantage is we can limit the extent of sentence types that have to be generated. But the challenges overwhelm this advantage. The main challenges identified so far can be summarized as follows. Simplicity in interacting with user due to limited mental maturity level of users Flexibility in taking input Generating sentences with minimum number of keystrokes due to the limited physical ability of the users Generating the most appropriate sentence in the first chance since we do not have any scope to provide users a set of sentences and ask them to choose one from the set. In the next few sections the NLG technique adopted in our system will be discussed in details. Due to limited vocabulary and education level of our intended users, our NLG technique will generate only simple active voice sentences. The challenges are also tried to be addressed in developing the NLG technique. Generally an NLG system can be divided into three modules viz. Text Planning, MicroPlanning and Realization. In (Callaway and Lester, 1995), the first two modules are squeezed into a planning module and results only two subtasks in an NLG system. Generally in all the approaches of NLG, the process starts with different parts of a sentence and each of these parts can be designated as a template. After getting values for these templates the templates are arranged in a specified order to form an intermediate representation of a sentence. Finally the intermediate representation undergoes through a process viz. Surface realization to form a grammatically correct and fluent sentence. Thus any NLG technique can be broadly divided into two parts Templates fill up Surface realization Now each of these two steps for our system will be discussed in details.

We defined templates for our system based on thematic roles and Parts of Speech of words. We tagged each sentence of our corpus (the corpus is discussed in section 4.1) and based on this tagged corpus, we have classified the templates in two classes. One class contains the high frequency templates i.e. templates that are contained in most of the sentences. Examples of this class of templates include subject, verb, object etc. The other class contains rest of the templates. Let us consider the first class of templates are designated by set A={a1,a2,a3,a4....} and other class is set B={b1,b2,b3,b4,..............}. Our intention is to offer simplicity and flexibility to user during filling up the templates. So each template is associated with an easy to understand phrase like Subject=> Who Verb=> Action Object=> What Destination=>To Where Source=>From Where...........etc. To achieve the flexibility, we show all the templates in set A to user in the first screen (the screenshot is given in fig. 1, however the screen will not look as clumsy as it is shown because some of the options remain hidden by default and appear only on users' request). The user is free to choose any template from set A to start sentence construction and is also free to choose any sequence during filling up values for set A. The system will be a free order natural language generator i.e. user can give input to the system using any order; the system will not impose any particular order on the user (as imposed by the Sanyog Project). Now if the user is to search for all the templates needed for his/her sentence, then both the number of keystrokes and cognitive load on user will increase. So with each template of set A we defined a sequence of templates taking templates from both set A and set B. Let user chooses template ak. Now after filling up template ak, user will be prompted with a sequence of templates like ak1, ak2, ak3, bk1, bk2, bk3, etc. to fill up. Again the actual sequence that will be prompted to user will depend on the input that is already given by user. So the final sequence shown to the user will be a subset of the predefined sequence. Let us clear the concept with an example. Say a user fills up the template <Destination>. Now s/he will be requested to give values for template like <Source>, <Conveyance>, <Time>, <Subject> etc, excluding those which

2


are already filled up. As the example shows, the user needs not to search for all templates as well as s/he needs not to fill up a template more than once. This strategy gives sentence composition with minimum number of keystrokes in most of the cases. 2.2 Surface Realization

· · · · · · · ·

Action Who Whom With Whom What From Where To Where Vehicle Used ......etc.

It consists of following steps Setting verb form according to the tense given by user Setting Sense Setting Mood Phrase ordering to reflect users intention Each of these steps is described next. The verb form will be modified according to the person and number of the subject and the tense choice given by the user. The sense will decide the type of the sentence i.e. whether it is affirmative, negative, interrogative or optative. For negative sense, appropriate negative word (e.g. No, not, do not) will be inserted before the verb. The relative position of the order of the subject and verb will be altered for optative and interrogative sentences. The mood choice changes the main verb of the sentence to special verbs like need, must etc. It tries to reflect the mood of the user during sentence composition. Finally the templates are grouped to constitute different phrases. These phrases are ordered according to the order of the input given by the user. This step is further elaborated in section 3.2.

After selecting a thematic role, a second form will come as shown in Fig. 2. From the form shown at Fig 2, the user can select as many words as they want. Even if they want they can type a word (e.g. his /her own name). The punctuations and conjunction will automatically be inserted.

Fig. 1: Screenshot of dialog based interface

3

A Case Study

In this section a procedural overview of the present system will be described. The automatic language generation mechanism of the present system uses the following steps Taking Input from Users The user has to give input to the system using the form shown in fig. 1. As shown in the form the user can select any property (like tense, mood or sense) or template at any order. The user can select tense, mood or sentence type by clicking on appropriate option button. The user can give input for the template by answering to the following questions

Fig. 2: Screenshot of word selection interface Template fill-up After giving all the input the user asks the system to generate the sentence by clicking on "generate sentence" Button. The system is incorporated with several template organizations and a default

3


template organization. Examples of some of these template organizations are as follows
· · · · · · · · SUBJECT VERB SUBJECT VERB INANIMATE OBJECT SUBJECT VERB ANIMATE OBJECT SUBJECT VERB WITH COAGENT SUBJECT VERB INANIMATE OBJECT WITH COAGENT SUBJECT VERB INANIMATE OBJECT WITH INSTRUMENT SUBJECT VERB SOURCE DESTINATION BY CONVEYANCE SUBJECT VERB SOURCE DESTINATION WITH COAGENT

To Where => school With Whom => father Main Action => go Tense => Present Continuous Step 2: Template Organization Selection Based on user input the following template organization will be selected
SUBJECT VERB DESTINATION WITH COAGENT

The system select one such template organization based on user input and generates the intermediate sentence representation. Verb modification according to tense The intermediate sentence is a simple present tense sentence. According to the user chosen tense, the verb of the intermediate sentence get modified at this step. If no verb is specified, appropriate auxiliary verb will be inserted. Changing Sentence Type Up to now the sentence remain as an affirmative sentence. According to the user chosen sense the sentence gets modified in this step. E.g. For question, the verb comes in front, for negative sentence not, do not, did not or does not is inserted appropriately. Inserting Modal Verbs Finally the users chosen modal verbs like must, can or need are inserted into the sentence. For some modal verbs (like can or need) the system also changes the form of the verb (like can or could). 3.1 Example of Sentence Generation using Our Approach

Step 3: Verb Modification according to tense Since the selected tense is present continuous and subject is first person singular number, so `go' will be changed to `am going'. Step 4: In this case there is no change of the sentence due to step 4. Step 5: There is no change of the sentence due to step 5. So the final output will be "I am going to school with father". It is same as the user intended sentence. Example 2 Let the user wants to tell, "You must eat it" Step 1: The user inputs will be Who => You Main Action => eat What => it Mood => must Tense => Present Simple Step 2: Template Organization Selection Based on user input the following template organization will be selected
SUBJECT VERB INANIMATE OBJECT

Step 3: Verb Modification according to tense Since the tense is present simple so there will be no change in verb. Step 4: In this case there is no change of the sentence due to step 4. Step 5: The modal verb will be inserted before the verb So the final output will be "You must eat it" Example 3 Let the user wants to tell, "How are you" Step 1: The user inputs will be Who => You Sense => Question Wh-word => How Tense => Present Simple Step 2: Template Organization Selection There is no appropriate template for this input. Hence the default template organization will be chosen. Step 3: Verb Modification according to tense

Let us consider some example of language generation using our system. Example 1 Let the user wants to tell, "I am going to school with father" Step 1: The user inputs will be Who => I

4


Since no action is specified, the auxiliary verb will be selected as the main verb. Here the subject is second person and tense is present simple, so the verb selected is `are'. Step 4: Since the selected sentence type is `Question', so the verb will come in front of the sentence. Again, since a Wh-word has been selected, it will come in front of the verb. A question mark will automatically be appended at the end of the sentence. Step 5: There is no change of the sentence due to step 5. So the final output will be "How are you?" 3.2 Phase ordering to reflect users' intention

Now before sentence realization the phrases are ordered according to their rank. Each of these phrase orders produces a separate sentence. As for example let the communication goal is `I go to school from home with my father'. If the input sequence is (my father -> I -> go -> school -> home), the generated sentence will be `With my father I go from home to School'. Again if the input sequence is (school -> home -> I -> go -> my father), then the generated sentence will be `From home to school I go with my father.' Thus for the same communicative goal, the system produces different sentences based on order of input given by user.

An important part of any NLG system is pragmatics that can be defined as the reference to the interlocutors and context in communication (Hovy, 1990). In (Hovy, 1990), a system viz. PAULINE has been described that is capable of generating different texts for the same communicative goals based on pragmatics. In PAULINE, the pragmatics has been represented by rhetorical goals. The rhetorical goals defined several situations that dictate all the phases like topic collection, topic organization and realization. Inspired from the example of PAULINE the present system has also tried to reflect users' intention during sentence realization. Here the problem is the limited amount of input for making any judicious judgment. The input to the system is only a sequence of words with correspondence to a series of questions. A common finding is that we uttered the most important concept in a sentence earlier than other parts of the sentence. So we have tried to get the users' intention from the order of input given by user based on the belief that the user will fill up the slots in order of their importance according to his/her mood at that time. We have associated a counter with each template. The counter value is taken from a global clock that is updated with each word selection by the user. Each sentence is divided into several phrases before realization. Now each phrase constitute of several templates. For example let S be a sentence. Now S can be divided into phrases like P1, P2, P3..... Again each phrase Pi can be divided into several templates like T1, T2, T3....Based on the counter value of each template, we have calculated the rank of each phrase as the minimum counter value of its constituent templates i.e. Rank(Pi)=Minimum(Counter(Tj)) for all j in Pi

4

Evaluation

The main goal of our system is to develop a communication aid for disabled children. So the performance metrics concentrated on measuring the communication rate that has little importance from NLG point of view. To evaluate our system from NLG point of view we emphasize on the expressiveness and ease of use of the system. The expressiveness is measured by the percentage of sentences that was intended by user and also successfully generated by our system. The ease of use is measured by the average number of inputs needed to generate each sentence. 4.1 Measuring Expressiveness

To know the type of sentences used by our intended users during conversation, first we analyzed the communication boards used by disabled children. Then we took part in some actual conversations with some spastic children in a Cerebral Palsy institute. Finally we interviewed their teachers and communication partners. Based on our research, we developed a list of around 1000 sentences that covers all types of sentences used during conversation. This list is used as a corpus in both development and evaluation stage of our system. During development the corpus is used to get the necessary templates and for classification of templates (refer sec. 2.1). After development, we tested the scope of our system by generating some sentences that were exactly not in our corpus, but occurred in some sample conversations of the intended users. In 96% cases, the system is successful to generate the intended sentence. After analyzing the rest 4% of sentence, we have identified following problems at the current implementation stage.

5


The system cannot handle gerunds as object to preposition. (e.g. He ruins his eyes by reading small letters). The system is yet not capable to generate correct sentence with an introductory `It'. (e.g. It is summer). In these situations the sentence is correctly generated when `It' is given as an agent, which is not intended. 4.2 Measuring ease of use

NLG Performance
Number of Inputs 20 10 0 0 10 20 30

Numbe r of Words

To calculate the performance of the system, we measured the number of inputs given by user for generating sentence. The input consists of words, tense choice, mood option and sense choice given by user. Next we plot the number of inputs w.r.t. the number of words for each sentence. Fig. 3 shows the plot. It can be observed from the plot that as the number of words increases (i.e. for longer sentences), the ratio of number of inputs to number of words decreases. So effort from users' side will not vary remarkably with sentence length. The overall communication rate is found to be 5.52 words/min (27.44 characters/min) that is better than (Stephanidis, 2003). Additionally it is also observed that the communication rate is increasing with longer conversations.

Fig. 3: Line graph for performance measurement of the system

References
Alm N., Arnott J. L., Newell A. F. 1992, Prediction and Conversational Momentum in an Augmentative Communication System, Communications of the ACM, vol. 55, No. 5, May 1992 Banerjee A. 2005, A Natural Language Generation Framework for an Interlingua-based Machine Translation System, MS Thesis, IIT Kharagpur Callaway Charles B., Lester James C. 1995, Robust Natural Language Generation from Large-Scale Knowledge Bases, Proceedings of the Fourth Bar-Ilan Symposium on Foundations Hovy E. H. 1990, Pragmatics and Natural Language Generation, Artificial Inteligence 43(1990): 153-197 Liu Fu-Hua,Liang Gao Gu,Yuqing, Picheny Michael 2003, Use of Statistical N_Gram Models in Natural Language Generation for Machine translation, Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Vol 1 : 636 639 Langkilde Irene, Knight Kevin 1998, Generation that Exploits Corpus-Based Statistical Knowledge, Annual meeting-association for computational linguistics: 704710 Deemter Kees van et. al. 1999, Plan-Based vs. templatebased NLG: a false opposition?, Becker and Busemann (1999) McCoy K. 1997 , "Simple NLP Techniques for Expanding Telegraphic Sentences" Natural Language Processing for Communication Aids,1997 Rambow Owen, Bangalore Srinivas, Walker Marilyn 2001,Natural Language Generation in Dialog System, Proceedings of the first international conference on Human language technology research HLT '01 Pasero Robert, Nathalie Richardet and Paul Sabatier; "Guided Sentences Composition for Disabled People"; Proceedings of the fourth conference on Applied natural language processing October 1994 Project SANYOG Available http://www.mla.iitkgp.ernet.in/projects/sanyog.htm at:

5

Conclusion

The present paper discusses a flexible approach for natural language generation for disabled children. A user can start a sentence generation from any part of a sentence. The inherent sentence plan will guide him to realize a grammatically correct sentence with minimum number of keystrokes. The present system respects the pragmatics of a conversation by reordering different parts of a sentence following users' intention. The system is evaluated both from expressiveness and performance point of views. Initial evaluation results show this approach can increase the communication rate of intended users during conversation. Acknowledgement The author is grateful to Media Lab Asia Laboratory of IIT Kharagpur and Indian Institute of Cerebral Palsy, Kolkata for exchanging ideas and providing resources for the present work.

Stephanidis, C. et. al., "Designing Human Computer Interfaces for Quadriplegic People", ACM Transactions on Computer-Human Interaction, pp 87-118, Vol. 10, No. 2, June 2003

6