Local constraints on sentence markers and focus in Somali Katherine Hargreaves School of Informatics University of Manchester Manchester M60 1QD, UK kat@evilution.co.uk Allan Ramsay School of Informatics University of Manchester Manchester M60 1QD, UK Allan.Ramsay@manchester.ac.uk Abstract We present a computationally tractable account of the interactions between sentence markers and focus marking in Somali. Somali, as a Cushitic language, has a basic pattern wherein a small `core' clause is preceded, and in some cases followed by, a set of `topics', which provide sceneseting information against which the core is interpreted. Some topics appear to carry a `focus marker', indicating that they are particularly salient. We will outline a computationally tractable grammar for Somali in which focus marking emerges naturally from a consideration of the use of a range of sentence markers. spelling rules ­ Fig. 1 shows the format of a typical rule, which we compile into an FST to be used during the process of lexical lookup. [q/x/c/h,,v0] ==> [+, k, v0] Figure 1: Insert `k' and a morpheme boundary between `q/x/c/h' and a following vowel The rule in Fig. 1 would, for instance, say that the surface form `saca' might correspond to the underlying form `sac+ka', with a morpheme boundary and a `k' inserted after the `c'. These rules, of which we currently employ about 30, can be efficiently implemented using the standard machinery of cascaded FSTs (Koskiennemi, 1985) interwoven with the general lookup process. 2.1 Noun morphology 1 Introduction This paper presents a computationally tractable account of a number of phenomena in Somali. Somali displays a number of properties which distinguish it from most languages for which computational treatments are available, and which are potentially problematic. We therefore start with a brief introduction to the major properties of the language, together with a description of how we cover the key phenomena within a general purpose NLP framework. In general, a noun consists of a root and a single affix, which provides a combination of gender and number marking. The main complication is that there are several declension classes, with specific singular and plural suffixes for groups of classes (e.g. the plural ending for declensions 1 and 3 is `o') (Saeed, 1999; Lecarme, 2002). Some plural forms involve reduplication of some part of the word ending, e.g. declension 4 nouns form their plural by adding `aC' where `C' is the final consonant of the root, but this can easily be handled by using spelling rules. 2.2 Verb morphology 2 Morphology Somali has a fairly standard set of inflectional affixes for nouns and verbs, as outlined below. In addition, there are a substantial set of `spelling rules' which insert and delete graphemes at the boundaries between roots and suffixes (and clitics). There is not that much to be said about the 337 Verb morphology is slightly more complex. Again, a typical verb consists of a root plus a number of affixes. These include derivational affixes (Somali includes a passivising form which can only be applied to verbs which have a `causative' argument, and a causative affix which adds such Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, pages 337­344, Sydney, July 2006. c 2006 Association for Computational Linguistics an argument) and a set of inflectional affixes which mark aspect, tense and agreement (Andrzejewski, 1968). The forms of the tense and agreement markers vary depending on whether the clause containing the verb is the main clause or is a subordinate clause (either a relative clause or a sentential complement), marked by ±main and on whether it is in a context where the subject is required to be a zero item, marked by ±fullForm. Note that the situation here is fairly complicated: -fullForm versions are required in situations where the subject is forced by local syntactic constraints to be a zero. There are also situations where the subject is omitted for discourse reasons, and here the +fullForm version is used ((Lecarme, 1995) uses the terms `restrictive' and `extensive' for -fullForm and +fullForm respectively). 2.3 Cliticisation `kau' to `ku' and `kiiu' to `kii'. The definite articles, however, cliticise onto the preceding word, with consequential spelling changes. It is again important to ensure that the spelling changes are applied at the right time to ensure that we can recognise `barahu' as `bare' plus `k+a+u', with appropriate changes to the `e' at the end of the root `bare' and the `k' at the start of the determiner `ku'. 3 Syntax 3.1 Framework The syntactic description is couched in a framework which provides a skeletal version of the HPSG schemas, supplemented by a variant on the well-known distinction between internal and external syntax. 3.1.1 Lexical heads and their arguments There are a number of Somali morphemes which can appear either bound to an adjacent word (usually the preceding word) or as free-standing lexical items. The sentence marker `waa' and the pronoun `uu', for instance, combine to produce the form `wuu' when they are adjacent to one another. In several cases, there are quite dramatic morphophonemic alterations at the boundary, so that it is extremely important to ensure that the processes of applying spelling rules and inspecting the lexicon are appropriately interwoven. The definite articles, in particular, require considerable care. There are a number of forms of the definite article, as in Fig. 2: masculine ka ku kii tan tanu taas taasu feminine ta tu tii kan kanu kaas kaasu We assume that lexical items specify a (possibly empty) list of required arguments, together with a description of whether these arguments are normally expected to appear to the left or right. The direction in which the arguments are expected is language dependent, as shown in Fig. 3. Note that the description of where the arguments are to be found specifies the order of combination, very much like categorial descriptions. The description of an English transitive verb, for instance, is like the categorial description (S\NP)/NP, which corresponds to an SVO surface order. English transitive verb(SOV) {syn(nonfoot(head(cat(xbar(+v, -n)))), --- --- --- ---- subcat(args(["NP"(obj), "NP"(subj)])))} Persian transitive verb (SOV) {syn(nonfoot(head(cat(xbar(+v, -n)))), --- --- --- ---- subcat(args(["NP"(obj), "NP"(subj)])))} Arabic transitive verb (VSO) {syn(nonfoot(head(cat(xbar(+v, -n)))), ---- --- --- --- subcat(args(["NP"(subj), "NP"(obj)])))} the-acc the-nom `remote' (nom or acc) this-acc this-nom that-acc that-nom Figure 3: Subcat frames 3.1.2 Adjuncts and modifiers Figure 2: Definite articles We deal with this by assuming that determiners have the form gender-root-case, where the gender markers are `k-' (masculine) and `t-' (feminine), and the case markers are `-' (accusative) and `u' (nominative), with spelling rules that collapse 338 Items such as adjectival phrases, PPs and relative clauses which add information about some target item combine via a principle captured in Fig. 4 - R = {syntax(target= T , result=R)}, T - R = T, {syntax(target= T , result=R)} Figure 4: Modifiers and targets Then if we said that an English adjec- - tive was of type {syntax(target="NN", result="NN") the first rule in Fig. 4 would allow it to combine with an NN to its right to form an NN, and likewise saying that a - - PP was of type {syntax(target="VP", result="VP") would allow it to combine with a VP to its left to form a VP. 3.1.3 Non-canonical order gerunds, and in (Sadler, 1996)'s description of the possibility of allowing a single c-structure to map to multiple f-structures in LFG. We write `equivalence rules' of the kind given in Fig. 5 to deal with such phenomena: {syn(head(cat(xbar(-v, +n))), +specified)} <==>{syn(head(cat(xbar(+v, -n)), vform(participle,present)), subcat(args([{struct(B)}])))} The patterns and principles outlined in §3.1.1 and §3.1.2 specify the unmarked orders for the relevant phenomena. Other orders are often permitted, sometimes for discourse reasons (particularly in free word order languages such as Arabic and Persian) and sometimes for structural reasons (e.g. the left shifting of the WH-pronoun in `I distrust the man whoi she wants to marry i .'). We take the view that rather than introducing explicit rules to allow for various non-canonical orders, we will simply allow all possible orders subject to the application of penalties. This approach has affinities with optimality theory (Grimshaw, 1997), save that our penalties are treated cumulatively rather than being applied once and for all to competing local analyses. The algorithm we use for parsing withing this framework is very similar to the algorithm described by (Foth et al., 2005), though we use the scores associated with partial analyses to guide the search for a complete analysis, whereas Foth et al. use them to choose a complete but flawed analysis to be reconstructed. We have described the application of this algorithm to a variety of languages (including Greek, Spanish, German, Persian and Arabic) elsewhere (Ramsay and Schaler, 1997; Ramsay ¨ and Mansour, 2003; Ramsay et al., 2005): space precludes a detailed discussion here. 3.1.4 Internal:external syntax Figure 5: External and internal views of English verbal gerund The rule in Fig. 5 says that if you have a present participle VP (something of type +v, -n which has vform participle, present and which needs one more argument) then you can use whereever you need an NP (type -v, +n with a specifier +specified). 3.2 Somali syntax As noted earlier, the framework outlined in §3.1 has been used to provide accounts of a number of languages. In the current section we will sketch some of the major properties of Somali syntax and show how they can be captured within this framework. 3.2.1 The `core & topic' structure Every Somali sentence has a `core', or `verbal complex' (Svolacchia et al., 1995), consisting of the verb and a number of pronominal elements. The structure of the core can be fairly easily described by the rule in Fig. 6: CORE ==> SUBJ,(OBJ1),(ADP*),(OBJ2),VERB Figure 6: The structure of the core The situation is not, in fact, quite as simple as suggested by Fig. 6. The major complications are outlined below: 1. the third person object pronouns are never actually written, so that in many cases what you see has the form SUBJ, VERB, as in (1a), rather than the full form given in (1b) (we will write `(him)' to denote zero pronouns): (1) a. b. uu he uu he sugay (him) waited for i sugay me waited for In certain circumstances a phrase that looks as though it belongs to category A is used in circumstances where you would normally expect an item belonging to category B. The phrase `eating the owl' in `He concluded the banquet by eating the owl.', for instance, has the internal structure of a VP, but is being used as the complement of the preposition `by' where you would normally expect an NP. This notion has been around for too long for its origin to be easily traced, but has been used more recently in (Malouf, 1996)'s addition of `lexical rules' to HPSG for treating English nominal 339 2. The second complication arises with ditransitive verbs. The distinction between OBJ1 and OBJ2 in Fig. 6 simply corresponds to the surface order of the two pronouns, and has very little connection with their semantic roles (Saeed, 1999). Thus each of the sentences in (2a) could mean `He gave me to you', and neither of the sentences in (2b) is grammatical. uu i kaa siiyey (2) a. i. he me1 you2 gave uu ku kay siiyey ii. he you1 me2 gave uu kay ku siiyey b. i. he me2 you1 gave uu kaa i siiyey ii. he you2 me1 gave 3. The next problem is that subject pronouns are also sometimes omitted. There are two cases: (i) in certain circumstances, the subject pronoun must be omitted, and when this happens the verb takes a form which indicates that this has happened. (ii) in situations where the subject is normally present, and hence where the verb has its standard form, the subject may nonetheless be omitted (usually for discourse reasons) (Gebert, 1986). 4. There are a small number of preposition-like items, referred to as adpositions in Fig. 6, which can occur between the two objects, and which cliticise onto the preceding pronoun if there is one. The major complication here is that just like prepositions, these require an NP as a complement: but unlike prepositions, they can combine either with the preceding pronoun or the following one, or with a zero pronoun. Thus a core like (3) has two analyses, as shown in Fig. 7: (3) uu uu he sug++++aa uu agent i ka uu object mod agent 0 have space to discuss these here, and their presence or absence does not affect the discussion in §3.3 and §3.4. To capture these phenomena within the framework outlined in §3.1, we assign Somali transitive verbs a subcat frame like the one in Fig. 8 (the patterns for intransitive and ditransitive verbs differ from this in the obvious ways). {syn(nonfoot(head(cat(xbar(+v, -n)))), - - - - - -- ------- subcat(args(["NP"(obj, +clitic), - - - - - - - - - - - - - -- "NP"(subj, +clitic)])), foot(...))} Figure 8: Somali transitive verb Fig. 8 says that the core of a Somali sentence is a clause of the form S-O-V, where S and O are both clitic pronouns. The canonical position of S and O is as given. They can appear further to the left than that to allow for clitic modifiers: exactly where they can go is specified by requiring the clitic modifiers to appear adjacent to the verb (subject to further local constraints on their positions relative to one another), and requiring S and O to fall inside the scope of the `sentence markers'. 3.3 Sentence markers A core by itself cannot be uttered as a free standing sentence. At the very least, it has to include a `sentence marker'. The simplest of these is the word `waa'. (4), for instance, is a well-formed sentence, with the structure shown in Fig. 9. wuu sugaa. (4) waa uu sugaa. s-marker he (it) waits-for sug++ay++aa waa comp(waa) uu agent 0 object ika i ka me at (it) sugaa sugaa waits-for sug++++aa 0 ka object mod i Figure 9: Analysis for (4) (0.01 secs) Note that the pronoun `uu' cliticises onto the end of the sentence marker `waa', producing the written form `wuu', as discussed above. In general, however, the situation is not quite as simple as in (4). Most sentences contain NPs other than the pronouns in the core. The first such examples involve introducing `topics' in front of the sentence marker. Topics are normally definite NPs or PPs which set the scene for the interpretation of the core. A typical example is given in (5): 340 Figure 7: Analyses for (3) (0.010 secs) The second analysis in Fig. 7, `he waits for it at me', doesn't make much sense, but it is nonetheless perfectly grammatical. 5. Finally, there are a number of other minor elements that can occur in the core. We do not (5) ninka nim ka man the wuu waa uu S-marker he (him) 0 sugaa. sugaa. waits-for baabuur+ predication k+a det 0 topic waa comp(waa) somaliTopic wax+ k+a det plements, in much the same way as the English complementiser `that' is used to mark the start of a sentential clause in `I know that she like strawberry icecream.' (Lecarme, 1984). There is, however, an alternative form for main clauses, where one of the topics is marked as being particularly interesting by the sentence markers `baa' or `ayaa' (`baa' and `ayaa' seem to be virtually equivalent, with the choice between them being driven by stylistic/phonological considerations): (6) baraha baa ninka sugaa. `baa'/`ayaa' and `waa' are in complementary distribution: every main clause has to have a sentence marker, which is nearly always one of these two, and they never occur in the same sentence. The key difference is that `baa' marks the item to its left as being particularly significant. Ordinary topics introduce an item into the context, to be picked up by one of the core pronouns, without marking any of them as being more prominent than the others. The item to the left of `baa' is indeed available as an anchor for a core pronoun, but it is also marked as being more important than the other topics. We deal with this by assuming that `baa' subcategorises for an NP to its left, and then forms a sentence marker looking to modify a sentence to its right. The resulting parse tree for (6) is given in Fig. 12, with the interpretation that arises from this tree in Fig. 13. sug++++aa 0 object 0 agent somaliTopic nim+ k+a det somaliTopic focus bare+ k+a det baa comp(baa) Figure 10: Sentence with topic The analysis in Fig. 10 was obtained by exploiting an equivalence rule which says that an item which has the internal properties of a -clitic NP can be used as a `topic', which we take to be a sentence modifier. Topics set the scene for the interpretation of the core by providing potential referents for the pronominal elements in the core. There are no very strong syntactic links between the topics and the clitic pronouns ­ if a topic is +nom then it will provide the referent for the subject, but in some (focused) contexts subject referents are not explicitly marked as +nom. The situation is rather like saying `You know that man we were talking about, and you know the girl we were talking about. Well, she's waiting for him.'. topical(ref (B (N I M (B )))) &claim(C : {aspect(now, simple, C )} (C, ag ent, ref (Df emale(D))) &(C, obj ect, ref (F thing (F ))) &S U G(C )) Figure 11: Interpretation of (5) The logical form given in Fig. 11, which was constructed using using standard compositional techniques (Dowty et al., 1981), says the speaker is marking some known man r ef (B (N I M (B ))) as being topical, and is then making a claim about the existence of a waiting event S U G(C ) involving some known female as its agent and some other known entity as its object. Note that we include discourse related information ­ that the speaker is first marking something as being topical and then making a claim ­ in the logical form. This seems like a sensible thing to do, since this information is encoded by lexical and syntactic choices in the same way as the propositional content itself, and hence it makes sense to extract it compositionally at the same time and in the same way as we do the propositional content. Somali provides a number of such sentence markers. `in' is used for marking sentential com341 Figure 12: Parse tree for (6) topical(ref (C (N I M (C )))) &f ocus(ref (D(B ARE (D)))) &claim(B : {aspect(now, simple, B )} (B , obj ect, ref (E thing (E ))) &(B , ag ent, ref (Gspeaker(G))) &S U G(B )) Figure 13: Interpretation for (6) Treating `baa' as an item which looks first to its left for an NP and then acts as a sentence modifier gives us a fairly simple analysis of (6), ensuring that when we have `baa' we do indeed have a focused item, and also accounting for its complementary distribution with `waa'. The fact that the combination of `baa' and the focussed NP can be either preceded or followed by other topics means that we have to put very careful constraints on where it can appear. This is made more complex by the fact that the subject of the core sentence can cliticise onto `baa', despite the fact that there may be a subsequent topic, as in (7). (7) 0 object baraha buu ninku sugaa. sug++++aa uu agent baa comp(baa) somaliTopic bare+ k+a det k+a det i caseMarker somaliTopic nim+ Figure 14: Parse tree for (7) To ensure that we get the right analyses, we have to put the following constraints on `baa' and `waa': 1. if the subject of the core is realised as an explicit short pronoun, it cliticises onto the sentence marker 2. the sentence marker attaches to the sentence before any topics (note that this is a constraint on the order of combination, not on the leftright surface order: the tree in Fig. 14 shows that `baraha baa' was attached to the tree before `ninka', despite the fact that `ninka' is nearer to the core than `baraha baa'. Between them, these two ensure that we get unique analyses for sentences involving a sentence marker and a number of topics, despite the wide range of potential surface orders. 3.4 Relative clauses & `waxa'-clefts This is a bit awkward for any parsing algorithm which depends propagating the WH-marker up the parse tree until a complete clause has been analysed, and then using it to decide whether that clause is a relative clause or not. We do not want to introduce two versions of each pronoun, one with a WH-marker and the other without, and then produce alternative analyses for each. Doing this would produce very large numbers of alternative analyses, since each core item is can be viewed either way, so that a simple clause involving a transitive clause would produce three analyses (one with the subject WH-marked, one with the object WHmarked, and one with neither). We therefore leave the WH-marking on the clitic pronouns open until we have an analysis of the clause containing them. If we need to consider using this clause in a context where a relative clause is required, we inspect the clitic pronouns and decide which ones, if any are suitable for use as the pivot (i.e. the WH-pronoun which links to the modified analysis). Relative clauses do not require a sentence marker. We thus get analyses of relative clauses as shown in Fig. 15 for (8). (8) ninka nim ka man the wadaya wadaya is-driving wuu waa uu s-marker he shaqeeyayaa shaqeeyaa is-working The man who is driving it: he's working shaqee++ay++aa uu agent waa comp(waa) somaliTopic nim+ k+a det headless whmod wad++ay++a 0 object 0 agent Figure 15: Parse tree for (8) Note the reduced form of `wadaya' in (8). The key here is that the subject of `wadaya' is the `pivot' of the relative clause (the item linking the clause to the modified nominal). When the subject plays this role it is forced to be a zero item, and it is this that makes the verb take the -fullForm versions of the agreement and tense markers. Apart from the fact that you can't tell whether a clitic pronoun is acting as a WH-marker or not until you see the context, and the requirement for 342 We noted above that in general Somali clauses contain a sentence marker ­ generally one of `waa', `baa' and `ayaa' for main clauses, or one of `in' for subordinate clauses. There are two linked exceptions to this rule: relative clauses, and `waxa'-clefts. Somali does not possess distinct WH-pronouns (Saeed, 1999). Instead, the clitic pronouns (including the zero third-person pronoun) can act as WH-markers. reduced form verbs with zero subjects, Somali relative clauses are not all that different from relative clauses in other languages. They are, however, related to a phenomenon which is rather less common. We start by considering nominal sentences. Somali allows for scarenominal sentences consisting of just a pair of NPs. This is a fairly common phenomenon, where the overall semantic effect is as though there were an invisible copula linking them (see Arabic, malay, English `small clauses', . . . ). We deal with this by assuming that any accusative NP could be the predication in a zero sentence. The only complication is that in ordinary Somali sentences the only items which follow the sentence marker are clitic pronouns and modifiers. For nominal sentences, the predicative NP, and nothing else, follows the sentence marker. For uniformity we assume that there is in fact a zero subject, with the +nom NP that appears before the sentence marker acting as a topic. waxu waa baabuurka. (9) wax ka I waa baabuur ka thing the +NOM s-marker truck the Any normal NP can appear as the topic of such a sentence. In particular, the noun `wax', which means `thing', can appear in this position: 0 baabuur+ predication k+a det 0 topic waa comp(waa) somaliTopic wax k+a det i caseMarker marker `I'. The choice of +fullForm this time arises because the subject pronoun is not WHmarked, which means that it is not forced to be zero: remember that -fullForm is used if the local constraints require the subject to be zero, not just if it happens to be omitted for discourse or stylistic reasons. Then in the analysis in Fig. 17 `wax ka aan doonayo I' is a +nom NP functioning as the topic of a nominal sentence. 0 lacag+ predication 0 topic waa somaliTopic comp(waa) wax k+a det headless whmod doon++ay++o 0 patient aan agent I caseMarker Figure 17: (10) the thing I want: it's some money So far so simple. `waxa', however, also takes part in a rather more complex construction. In general, the items that occur as topics in Somali are definite NPs (Saeed, 1984). In all the examples above, we have used definite NPs in the topic positions, because that it is what normally happens. If you want to introduce something into the conversation it is more usual to use a `waxa-cleft', or `heralding sentence' (Andrzejewski, 1975). The typical surface form of such a construction is shown in (11): waxaan waxa aan waxa I doonayaa doonayaa want lacag. lacag money Figure 16: `the thing: it's the truck' The analysis in Fig. 16 corresponds to an interpretation something like `The thing we were talking about, well it's the truck'. Note the analysis of `waxu' here as the noun `wax' followed by the definite article `ka' and the nominative case marker `I'. There is no reason why the topic in such a sentence should not contain a relative clause. In (10), for instance, the topic is `waxaan doonayo I' ­ `the thing which I want'. (10) waxaan wax ka aan thing the I doonayaa doonayo I want +NOM waa waa s-marker lacag. lacag money (11) The key things to note about (11) are as follow: · There is no sentence marker. Or at any rate, the standard sentence markers `waa' and `baa' are missing. · The subject pronoun `aan' has cliticised onto the word `waxa' to form `waxaan'. · The verb `doonayaa' is +fullForm · The noun `lacag' follows the verb. This is unusual, since generally NPs are used as topics preceding the core and, generally, the sentence marker. 343 Note that `doonayaa' here is being read as the {+fullForm,-main} version of the verb `doonayo' followed by a cliticised nominative These facts are very suggestive: (i) the lack of any other item acting as sentence marker suggests that `waxa' is playing this role. (ii) the fact that `uu' has cliticised onto this item supports this claim, since subject pronouns typically cliticise onto sentence markers rather than onto topic NPs. We therefore suggest that `waxa' here is functioning as sentence marker. Like `baa', it focuses attention on some particular NP, but in this case the NP follows the core. doon++ay++aa 0 patient aan agent waxa comp(waxa) lacag+ focus Figure 18: Parse tree for (11) Thus `waxa', as a sentence marker, is just like `baa' except that `baa' expects its focused NP to follow it immediately, with the core following that, whereas the order is reversed for `waxa' (Andrzejewski, 1975). It seems extremely likely that `waxa'-clefts are historically related to sentences like (10). The subtle differences in the surface forms (presence or absence of `waa' and form of the verb), however, lead to radically different analyses. How simple nominal sentences with topics including `waxa' and a relative clause turned into `waxa'-clefts is beyond the scope of this paper. The key observation here is that `waxa'-clefts can be given a straightforward analysis by assuming that `waxa' can function as a sentence-marker that focuses attention on a topical NP that `follows' the core of the sentence. References B W Andrzejewski. 1968. Inflectional characteristics of the so-called weak verbs in Somali. African Language Studies, 9:1­51. B W Andrzejewski. 1975. The role of indicator particles in Somali. Afroasiatic Linguistics, 1(6):123­ 191. D R Dowty, R E Wall, and S Peters. 1981. Introduction to Montague Semantics. D. Reidel, Dordrecht. Killian Foth, Wolfgang Menzel, and Ingo Schroder. ¨ 2005. Robust parsing with weighted constraints. Natural Language Engineering, 11(1):1­25. L Gebert. 1986. Focus and word order in Somali. Afrikanistische Arbeitspapiere, 5:43­69. J Grimshaw. 1997. Projection, heads, and optimality. Linguistic Inquiry, 28:373­422. K Koskiennemi. 1985. A general two-level computational model for word-form recognition and production. In COLING-84, pages 178­181. J Lecarme. 1984. On Somali complement constructions. In T Labahn, editor, Proceedings of the Second International Congress of Somali Studies, 1: Linguistics and Literature, pages 37­54, Hamburg. Helmut Buske. J Lecarme. 1995. L'accord restrictif en Somali. Langues Orientals Anciennes Philologie et Linguistique, 5-6:133­152. J Lecarme. 2002. Gender polarity: Theoretical aspects of Somali nominal morphology. In P Boucher, editor, Many Morphologies, pages 109­ 141, Somerville. Cascadilla Press. Robert Malouf. 1996. A constructional approach to english verbal gerunds. In Proceedings of the Twenty-second Annual Meeting of the Berkeley Linguistics Society, Marseille. A M Ramsay and H Mansour. 2003. Arabic morphosyntax for text-to-speech. In Recent advance in natural language processing, Sofia. A M Ramsay and R Schaler. 1997. Case and word or¨ der in English and German. In R Mitkov and N Nicolo, editors, Recent Advances in Natural Language Processing. John Benjamin. A M Ramsay, Najmeh Ahmed, and Vahid Mirzaiean. 2005. Persian word-order is free but not (quite) discontinuous. In 5th International Conference on Recent Advances in Natural Language Processing (RANLP-05), pages 412­418, Borovets, Bulgaria. Louisa Sadler. 1996. New developments in LFG. In Keith Brown and Jim Miller, editors, Concise Encyclopedia of Syntactic Theories. Elsevier Science, Oxford. J I Saeed. 1984. The Syntax of Focus and Topic in Somali. Helmut Buske Verlag, Hamburg. J I Saeed. 1999. Somali. John Benjamins Publishing Co, Amsterdam. M Svolacchia, L Mereu, and A. Puglielli. 1995. Aspects of discourse configurationality in Somali. In K E Kiss, editor, Discourse Configurational Languages, pages 65­98, New York. Oxford University Press. 4 Conclusions We have outlined a computational treatment of Somali that runs right through from morphology and morphographemics to logical forms. The construction of logical forms is a fairly routine activity, given that we have carried out this work within a framework that has already been used for a number of other languages, and hence the machinery for deriving logical forms from semantically annotated parse trees is already available. The most notable point about Somali semantics within this framework is the inclusion of the basic illocutionary force within the logic form, which allows us to also treat topic and focus as discourse phenomena within the logical form. 344