WWW 2007 / Track: Web Services Session: Orchestration and Choreography Towards the Theoretical Foundation of Choreography Qiu Zongyan, Zhao Xiangpeng, Cai Chao, and Yang Hongli LMAM & Depar tment of Informatics, School of Math., Peking University, Beijing, CHINA zyqiu@pku.edu.cn, {zxp,caic,yhl}@math.pku.edu.cn ABSTRACT With the growth of interest on the web services, people pay increasingly attention to the choreography, that is, to describe collaborations of participants in accomplishing a common business goal from a global viewpoint. In this paper, based on a simple choreography language and a roleoriented process language, we study some fundamental issues related to choreography, especially those related to implementation, including semantics, pro jection and natural pro jection, dominant role in choices and iterations, etc. We propose the concept of dominant role and some novel languages structures related to it. The study reveals some clues about the language, the semantics, the specification and the implementation of choreography. Categories and Subject Descriptors H.3.5 [Info. Sys.]: Misc.--Web-based services ; D.2.1 [Soft.]: SE--Specifications ; D.3.1 [Prog. Language]: Formal Definitions and Theory--Semantics General Terms Design, Languages, Theory, Verification Keywords Choreography, Semantics, Implementation, Pro jection, Dominant Role, Dominated Choice, Dominated Loop 1. INTRODUCTION Web services promise the interoperability of various applications running on heterogeneous platforms over the Internet, and are gaining more and more attention. Web service composition refers to the process of combining web services to provide value-added services, which has received much interest in supporting enterprise application integration. The description of the single services locally from cooperation of other services is called an orchestration. The de facto standard for orchestration is BPEL [2]. On another level, we need to specify protocols among different services to achieve a business goal. WS-CDL [9, 8] Supported by NNSF of China (No. 60573081). Copyright is held by the International World Wide Web Conference Committee (IW3C2). Distribution of these papers is limited to classroom use, and personal use by others. WWW 2007, May 8­12, 2007, Banff, Alberta, Canada. ACM 978-1-59593-654-7/07/0005. is a W3C candidate recommendation designed for describing the collaborative observable behavior of multiple services that interact with each other, where the behaviors are performed by the participants, and the specification written in WS-CDL, called a choreography, provides a global view. Extensive writing exists on the specification of parallel and distributive systems, for example CSP [7] and CCS [12]. However, almost all of them support intrinsically the specification from the local viewpoint. Even for the communication, the specification is still local, as it expresses when a process sends or receives a message from a specific channel. Furthermore, there is not a special concept of the participants in the computation. With the blooming of web technology, real computations are increasingly established as a kind of processes that various computing facilities take part in, which are independent entities and might reside in any place throughout the world. For accomplishing the goal of the computation, they should not only have "correct" functionalities, but also interact with each other correctly. With the interaction becoming more complex, the problems related to specify the interaction of the participants become harder and harder. Moreover, it is even harder to verify the interaction locally. This is the motivation behind the design of WS-CDL. However, as quoted from the specification, WS-CDL is "not an execution business process description language or an implementation language". A choreography designates a business task performed by multiple roles, but does not give an implementation composed of a set of participants. The specification of the individual participants is at the level of BPEL-like languages. For the successful development of the web-based applications, it is urgent and important to understand the approaches for connecting the global view of the choreography with the local view of each participant. The work presented here is motivated by this requirement. It is an effort to make a clear view of choreography at an abstract level for a better understanding of what is required by the choreography to each of the roles, and further the choreography itself and related issues involved. To explore the essence of choreography, we define a small language Chor as a model of simplified WS-CDL, and a simple process language for describing roles from a local viewpoint, both with formal syntax and semantics. With these models, we discuss the concept of projection, which maps a choreography to a set of role processes, and define the implementation of choreography based on pro jections. We propose a natural projection and two structural conditions related to it. Then we discuss another pro jection that 973 WWW 2007 / Track: Web Services remedies the relative ordering problem where a choreography violates the sequential condition. For the correct implementation of the choice structures in a choreography, we find that the dominant role of a choice is a very important concept. To ensure the roles processes take consistently the same branch in their independent version of a choice (in the choreography), when the choice is pro jected, there must be only one role ­ the dominant one ­ to make the real choice, and the other roles simply follow the dominator's decision. Otherwise, the consistency cannot be guaranteed. With this recognition, we suggest to introduce dominant role as an explicit concept on the choreography level. We extend Chor and the role language by some new structures, including dominated choice, and dominated loop, with formal syntax and semantics of these structures and relatives. Furthermore, we define a pro jection that maps, as well as other structures, the dominated choice and dominated loop in Chor to the corresponding choice and loop structures for the dominant and dominated roles in the role language. As shown in the paper, the choreography designer can select the dominant role for each choice or loop as they want. The pro jection can generate all the necessary structures and synchronizing actions, and produce a correct implementation, i.e., the parallel composition of the resulted roles will correctly show the behavior described by the choreography. The basic part of Chor and the process language are defined in Section 2 and Section 3. Section 4 is devoted to pro jection and implementation, where we provide a detailed discussion about natural pro jection and structural conditions related to this pro jection. With the concept of dominant role, we extend our framework in Section 5, and deal with the implementation of choices and loops. A number of related general issues are discussed in Section 6. Finally we list some related work and give a conclusion. Session: Orchestration and Choreography A Figure 1: Syntax & Semantics of Chor : : = BA (basic activities) | A; A (sequential composition) | A A (choice) | A A (parallel composition) B A : : = skip (no action) | ai (activity in role Ri ) [i,j ] |c (communication) Basic: [[skip]] = { } b [[a]] = { a } b [[c[i,j ] ]] = { c[i,j ] } b Sequential: Choice: Parallel: [[A1 ; A2 ]] = [[A1 ]] a [[A2 ]] b [[A1 [[A1 A2 ]] = [[A1 ]] [[A2 ]] b A2 ]] = interlv ([[A1 ]], [[A2 ]]) b A Trace Semantics of Chor We consider the meaning of a choreography as the set of all possible traces of its execution. A trace is a sequence of activities of the form 1 , 2 , . . . , n , where n is the length, and is the empty trace with length 0. We will use t, t1 , . . . to denote traces, and use T , T1 , . . . to denote trace sets. We use operator a for trace concatenation, and lift it to trace sets on either or both sides: t a T = {t a t | t T } b a t = {t a t | t T } T b T1 a T2 = {t1 a t2 | t1 T1 , t2 T2 } b 2.1 We need a function interlv to interleave activities of two traces and give the corresponding trace set. The definition is routine and is omitted here. We lift interlv to trace sets: interlv (T1 , T2 ) = {t | t1 T1 , t2 T2 · b such that t interlv (t1 , t2 )} We also use filter operator on traces and trace sets. The trace(s) t S (or T S ) retains only the elements of S in t (or T ), and keeps their orders unchanged. 2. THE LANGUAGE Chor R C = {R 1 , . . . R n } locals (Ri ) = {ai , . . . , ai i } 1 n Finite roles take part in a choreography C , where each role is associated with a number of basic local activities: We use meta-variable ai to denote an arbitrary activity of Ri , and use a, a1 , . . . for activities of any role. The communication from role Ri to Rj takes the form of c[i,j ] , where c is a channel name. We use c, c1, . . . to represent concrete communication names in examples. We do not care about the messages transferred in communications, nor the variables receiving the messages. The syntax of the choreography language Chor is given in Fig.1 (upper part). A basic activity can be a local activity of a role Ri , a communication, or skip which does nothing. The composition structures considered here are the sequential composition, choice, and parallel composition. A choreography (a "program") is simply an activity A. We call Ri the performer of local activity ai , and call Ri and Rj the performers of c[i,j ] . We use locals (C ) (locals (Ri )) to denote the set of all local activities in C (of Ri ), use comms (C ) (comms (Ri )) for the set of all communication activities in C (performed by Ri ), use acts (C ) (acts (Ri )) for the set of all activities appearing in C (performed by Ri ), and use , , . . ., possibly with subscript, to denote an arbitrary (local or communication) activity. Definition 1. (Trace Semantics of Choreography). The semantics of choreography C is the trace set [[C ]] defined by rules in Fig.1 (lower part). Here we adopt the interleaving semantics for parallel composition. W e assume that sequential composition operator ";" has higher priority, while " " and " " have the same priority. 2.2 Laws and Examples Many laws hold for choreographies in Chor. We list some of them in Fig.2, where A, A1 , A2 , A3 are arbitrary activities, and = means semantical equivalence. The proofs are straightforward, and are omitted here. Many other laws can be found, which are out of the main focus of this paper. In company with other laws, the Unit and Idemp otent laws can be used to simplify a choreography by removing all unnecessary skip and replacing A A with A, while keeping the semantics unaltered. Since the choice can always be moved outward by the Distribution laws of , any choreography can be transformed into the form A1 . . . Am , where no choice appears in each of A1 , . . . , Am . Using the laws in the other direction, we can transform a choreography to a form where each is restricted to a minimal scope. Thus we have following definition. 974 WWW 2007 / Track: Web Services Session: Orchestration and Choreography Figure 2: Laws in Chor Asso ciation and Symmetry: (A1 ; A2 ); A3 = A1 ; (A2 ; A3 ) (; assoc.) (A1 A2 ) A3 = A1 (A2 A3 ) ( assoc.) (A1 A2 ) A3 = A1 (A2 A3 ) ( assoc.) A1 A2 = A2 A1 ( sym.) A1 A2 = A2 A1 ( sym.) Unit and Idemp otent: skip; A = A; skip = A (; unit) skip A = A ( unit) A A=A ( idem.) Choice Distribution: (A1 A2 ); A = (A1 ; A) (A2 ; A) (;- distr.1) A; (A1 A2 ) = (A; A1 ) (A; A2 ) (;- distr.2) (A1 A2 ) A = (A1 A) (A2 A) ( - distr.) Figure 3: Syntax & Semantics of Role Language P : : = BP (basics) B P : : = skip (no action) | P ; P (sequential) |a (local) | P P (choice) | c! (send) | P P (parallel) | c? (receive) Skip: skip - Local: P1 - P1 a - ; P 2 P2 P1 P2 P2 - - a equential: Choice: ParalPel: l 1 P1 ; P2 - P1 ; P2 P1 1 P2 P1 - - - P c? fst (P1 ) P2 P2 P1 P1 c c! fst (P2 ) P2 /c! P2 /c? c? fst (P2 ) P1 P1 P2 - P1 P2 - P 2 P2 - P1 /c? P2 - P1 /c! c c! fst (P1 ) P2 - P1 O Definition 2. (Choice Normal Form). A choreography of the form A1 . . . Am , with no choice in A1 , . . . , Am , is called in the Distributed Choice Normal Form. A choreography in which the scope of every can not be limited further is called in the Minimal Choice Normal Form. bviously, any choreography has an equivalent Distributed Choice Normal Form, and a Minimal Choice Normal Form. One interesting fact is, although a choreography may have parallel structures, no deadlock can happen in its execution. Theorem 1 (Deadlock-freeness). Suppose that every basic activity in a choreography wil l terminate, then the choreography wil l always terminate. Proof. All traces in the trace set of a choreography can be constructed using rules in Fig.1 unconditionally. Now we present some examples. Example 1. Here is a simple choreography in Chor: C1 = (a1 1 a2 ); c1[1,2] ; a2 ; c2[2,1] 1 2 local view of interactions. A sending action and a receiving action engage in a handshake when they have the same channel name and the two roles involved are ready to perform them. We can define the sets locals (P ), comms (P ) and acts (P ) too, as in Sec.2. We define the semantics of processes as trace sets too. The semantic rules are of the form P - P , meaning that a process P transforms to P fter executing activities in , which is a sequence of activities (i.e. a trace). The rules are listed in Fig.3 (lower part), where denotes the empty process with no command. The two rules on the lower right are for synchronization. Function fst (for first ) is defined as: fst ( ) = fst (skip) = fst (P1 P2 ) = b fst () = {} b fst (P1 ; P2 ) = fst (P1 ) b fst (P1 P2 ) = fst (P1 ) fst (P2 ) b We define fst (P1 P2 ) as to force choices taking a branch in the first. Moreover, P / is the process resulted from P after executing , as defined below: when = b skip/ = b / = ( when = P1 ; P2 )/ = P1 /; P2 b (P1 P2 )/ = b 8 when fst (P1 ) < P1 / P2 P1 P2 / when fst (P2 ) (P1 P2 )/ = b : otherwise where denotes undefined, and only the defined cases are used here. In the parallel case, when both two conditions hold in the same time, we take them as two separate rules. If no rule is applicable to a non-empty process, we say it is deadlocked. A rule is introduced to represent this situation with a special symbol The trace set of C1 is as fol lows: [[C1 ]] = { a1 , a2 , c1[1,2] , a2 , c2[1,2] , a2 , a1 , c1[1,2] , a2 , c2[2,1] } 1 1 2 1 1 2 Here is another very simple choreography: C2 = ( a1 1 a2 ); a1 1 2 [[C2 ]] = { a1 , a2 , a1 1 1 2 , a2 , a1 , a1 1 1 2 } We wil l meet this example again in Sec.4.1 etc., because it shows some interesting features of choreographies. 3 : . A LANGUAGE FOR ROLES T P= and P is deadlocked P - A choreography describes the interaction among roles from a global view. It is intended to be implemented by the coordination of a set of independent processes. In order to study the relationship between the globally described choreography and the coordinative activities of each role, we define a simple language for the roles here. The role language is given in Fig.3. The only difference from Chor is that it takes a local view on communications, where sending actions and receiving actions represent roles' = o define the trace set of processes, we define: P P - P N P - P P P P = P = P ow, we can have the definition: Definition 3. (Traces of Process) If P = , then is 975 WW 2007 / Track: Web Services Session: Orchestration and Choreography Obviously, this is not the same as [[C2 ]] (see Exam.1). ometimes Equation (1) may fail with nproj. In these cases, we can blame the fault to the choreography under consideration, or to the pro jection used (nproj here), or to the semantics we choose, or even to the languages. We will investigate the problem further on some of these issues. Figure 4: The Natural Pro jection nproj(skip, k ) = skip b nproj(ai , k ) = ai b if k = i nproj(ai , k ) = skip b if k = i nproj(c[i,j ] , k) = c[i,j ] ! b nproj(c[i,j ] , k) = c[i,j ] ? b if k = i if k = j S nproj(c[i,j ] , k) = skip b if k = i k = j nproj(A1 ;A2 , k ) = nproj(A1 , k ); nproj(A2 , k) b nproj(A1 A2 , k) = nproj(A1 , k ) nproj(A2 , k) b nproj(A1 A2 , k ) = nproj(A1 , k ) nproj(A2 , k) b called a trace of P . The semantics of process P is the trace set [[P ]] ={ |P = }. b 4.2 Restricted Natural Choreography The natural projection seems intuitive. Thus, from one point of view, we can take nproj as a criterion to distinguish the "good" choreographies from the "bad" ones, and say that C1 in Exam.2 is good, while C2 is bad. Definition 4. (Restricted Natural Choreography). Suppose C is a choreography with n roles. We call C a restricted natural choreography (RN choreography for short), if [[nproj(C )]] = [[C ]] (2) Definition 4 defines a restricted level of "well-formedness". If a choreography is RN, we can efficiently partition it into a set of processes whose parallel composition shows the expected behavior. It is natural to ask a question: what structures make an RN choreography? We will propose some sufficient conditions here. I t is easy to see that all laws in Fig.2 hold still for processes. Besides, although we adopt the same notation [[·]] for two different languages, it will not cause confusion. 4. PROJECTION AND IMPLEMENTATION A choreography is a global description of a collaborative task performed by multiple partners. We think that each role in a choreography is a concrete entities taking part in the task and should be implemented by a distinguishable, independent process. A reasonable definition of implementation is based on projection and local conformance (see Sec.6). A projection is a procedure which takes a choreography C with n roles and delivers n processes in the role language. The combination of these role mimic the behavior of C . In the simplest case, for pro jection proj , we want to have1 : [[proj (C, 1) ... proj (C, n)]] = [[C ]] (1) In a sense, processes proj (C, 1), . . . , proj (C, n) make up an implementation of C . We will use proj (C ) as a shorthand for proj (C, 1) . . . proj (C, n). Please note that, if the execution of the left hand side of (1) runs into deadlock, some of its traces will end with Since no trace in [[C ]] has a part, the equation will never hold. Conversely, if we have (1), the role processes will never deadlock (Theorem 1). No standard pro jection is defined in WS-CDL. We study a simple one, and then consider the problems recognized. 4.2.1 Sequential Composition The problem related to sequential composition is to keep the relative order among the processes produced by the projection. We have seen a counterexample in Exam.2, where the processes can not keep the relative order of their activities, and an extra trace presents. To describe a condition for sequential compositions, we need to define two activity sets: lead (A) includes all activities that can be executed first in the execution of A, and end (A) includes activities that can be the last activity of A. These two sets can be defined recursively. The following condition guarantees that a sequential composition never breaks Equation (2): Condition 1. (Sequential Composition). A sequential composition A1 ; A2 is restricted natural (RN), if it satisfies: 1 end (A1 ), 2 lead (A2 )· 1 and 2 have a common performer. (3) . as The concept performer is defined in Sec.2. ere are some choreographies where all sequential compositions presented are RN: a1 ; a1 1 2 (a1 1 a2 ); c[1,2] 1 (c1[1,3] a2 ; c2[2,4] ); c3[1,2] ; a1 1 1 The last two example show that, when one side of ";" is a parallel or a choice, we need to consider each of the branches separately. We will discuss the choices further in next subsection. Example 3. A simplest counterexample is C3 = a1 ; a2 . 1 1 After projection we get nproj(C3 , 1) = a1 nproj(C3 , 2) = a2 1 1 1 2 [[nproj(C3 )]] = { a1 , a1 , a2 , a1 } 1 1 The relative order of a1 and a2 is not kept. We can add a 1 1 communication to separate the two activities, and obtain an RN choreography a1 ; c[1,2] ; a2 . Also, C2 in Exam.2 can be 1 1 modified into an RN one similarly. This remedy procedure wil l be discussed further in Sec. 4.3. H 4.1 Natural Projection The natural projection (nproj ) defined in Fig.4 is a simple partition following the structure of choreographies, where [i, j ] is taken as a part of the channel name. The pro jection may leave some skip and P P in role processes. A procedure can be introduced to simplify the results. We will omit this detail, and present the simplified results directly. Example 2. Let us apply nproj to C1 of Exam.1: nproj(C1 , 1) = nproj(C1 , 2) = a1 ; c1[1,2] !; c2[2,1] ? 1 a2 ; c1[1,2] ?; a2 ; c2[2,1] ! 1 2 It is easy to see that [[nproj(C1 )]] = [[C1 ]]. However, this is not always the case. Consider C2 in Exam.1: nproj(C2 , 1) = a1 ; a1 nproj(C2 , 2) = a2 1 2 1 2 1 1 [[nproj(C2 )]] = { a1 , a1 , a2 , a1 , a2 , a1 , a1 , a1 , a2 } 1 1 2 1 2 1 1 W W e might use other similar equations, e.g. (4) in Sec.4.3. 976 WWW 2007 / Track: Web Services Example 4. Condition 1 has a sad consequence: we have no way to enforce local activities of more than two roles to terminate at the same time, when they run in different paral lel branches. Consider C4 = (a1 a2 a3 ); a1 . No matter 1 1 1 2 what and how many communications introduced, we can not make it an RN choreography. Here is a try: C= [[nproj(C )]] = 4 Session: Orchestration and Choreography Example 6. Consider: C7 = (a1 ; c1[1,2] a1 ; c1[1,2] ) c2[2,1] ; a2 1 2 1 nproj(C7 , 1) = (a1 ; c1[1,2] ! a1 ; c1[1,2] !) c2[2,1] ? 1 2 nproj(C7 , 2) = c1[1,2] ? c2[2,1] !; a2 1 The dominant role of the choice is R1 . No matter which branch R1 chooses, it can always synchronize with R2 . It is not hard to see that [[nproj(C7 )]] = [[C7 ]]. Now consider a similar example involving paral lel composition of two choices with different dominant roles: C8 = (a1 ; c1[1,2] a1 ; c1[1,2] ) (c2[2,1] ; a2 a2 ; c2[2,1] ) 1 2 1 2 nproj(C8 , 1) = (a1 ; c1[1,2] ! a1 ; c1[1,2] !) c2[2,1] ? 1 2 nproj(C8 , 2) = c1[1,2] ? (c2[2,1] !; a2 1 a2 ; c2[2,1] !) 2 Dominant roles of the two choices are R1 and R2 , respectively. No matter how R1 and R2 choose, the synchronization wil l be fine, and [[nproj(C8 )]] = [[C8 ]]. ( a1 1 4 a2 1 Other arrangements wil l have similar results. ome researchers proposed to add the multipartite communication activities to solve this problem, e.g. [3]. S a3 ); c1[1,2] ; c2[1,3] ; a1 1 2 [[C4 ]] { a1 , a2 , c1[1,2] , a3 , c2[1,3] , a1 , 1 1 1 2 a2 , a1 , c1[1,2] , a3 , c2[1,3] , a1 } 1 1 1 2 4.2.2 Choice The nproj maps a choice in a choreography to a choice in each role process. Within the parallel composition, these processes run independently, thus are not guaranteed to take consistently the same branch in the run of their own version of the choice. Inconsistent choices can result in extra traces, or even deadlock. The simplest example showing the problem is a1 a2 , which has trace set { a1 , a2 }. After pro jec1 1 1 1 tion and parallel composition, we get (a1 skip) (a2 skip) 1 1 with trace set { , a1 , a2 , a1 , a2 , a2 , a1 }. 1 1 1 1 1 1 For a choice to bring no trouble, one possibility is that at most one role process really makes a choice in its version of the choice; for each of the other processes, all branches in its version of the choice are the same, thus it makes no real choice here. We have the following definition: Condition 2. (Choice). For a choice, if we can determine a special role, called dominant role hereafter, while for each of the other roles involved, all branches of those roles' version of this choice are the same and thus can be merged together, then this choice is an RN choice. I deally, we hope to find a set of structural conditions, that are both sufficient and necessary to distinguish RN choreographies from the others. However, we can only prove that the conditions given above are sufficient. Theorem 2 (Restricted Natural Choreography). Suppose choreography C is in its Minimal Choice Normal Form (Definition 2). If each of the sequential compositions in C satisfies Condition 1 and each of the choices in C satisfies Condition 2, then C is restricted natural. Proof. The proof is in our report [13]. A counterexample showing the conditions are not necessary is a1 ; a2 a2 ; a1 . It is RN but violates Condition 1. 1 1 1 1 N 4.3 Other Projections ow we give some examples to illustrate this condition. Example 5. Choreography C5 includes an RN choice with communication activities: C5 = a1 ; (a1 ; c[1,2] a1 ; c[1,2] ); a2 1 2 3 1 nproj(C5 , 1) = a1 ; (a1 a1 ); c[1,2] ! 1 2 3 nproj(C5 , 2) = c[1,2] ?; a2 1 R1 is the dominant role here, and the choice in R2 disappears. It is easy to see that [[nproj(C5 )]] = [[C5 ]]. Here is a negative example: C6 = a1 ; (a1 ; c1[1,2] a1 ; c2[1,3] ); a2 1 2 3 1 1 1 [1,2] 1 nproj(C6 , 1) = a1 ; (a2 ; c1 ! a3 ; c2[1,3] !) nproj(C6 , 2) = (c1[1,2] ? skip); a2 1 nproj(C6 , 3) = c2[1,3] ? skip Because none of the choices in the three role processes can shrink away, we cannot have [[nproj(C6 )]] = [[C6 ]]. Furthermore, the paral lel composition may run into dead lock. As pointed in Sec.4.1, we can also blame nproj for the failure of Equation (1) in general. Taking another pro jection may make an equation similar to (1) hold for more choreographies. In previous work [13], we proposed a projection to remedy the ordering of activities among different roles. Given a choreography C with n roles, which might violate Condition 1 (but not Condition 2), the pro jection inserts first some communications (in either direction) into C in the suitable positions to enforce the relative orderings between roles, to produce a choreography C , such that [[nproj(C )]] acts (C ) = [[C ]] Here is the filter operation described in Sec.2. Note that (4) is true for some non-RN choreographies. Therefore we have an extended class of "good" choreographies. The problem related to Condition 2 involves the consistent choice in different roles. To solve this problem, we need to extend Chor. We explore the problem related to choices and iterations in the next section. (4) 4 5. AN EXTENDED FRAMEWORK .2.3 Structural Theory of RN Choreography For parallel composition, we find no extra conditions. The interaction between parallel and sequential composition has been studied. Now we consider the interaction between parallel composition and choice. Exam.5 shows some cases for this interaction. Here are some more examples: Practically, choreographies with choices violating Condition 2 are common. It is not reasonable to classify them as non-implementable. As an example, we consider now a real case, where a customer asks two sellers for the price and other relevant information of some product. After receiving two replies, the customer makes a choice based on some criterion, and informs the sellers. Depending on the customer's 977 WWW 2007 / Track: Web Services choice, the sellers will either continue to trade or simply go to rest. If we made this scenario as a choreography, there must be a choice in which all the three participants need to make their choices in a consistent way. Thus, a choice structure violating Condition 2 is inevitable here. In the study, we find that dominant role is a very important concept. It is the key of Condition 2, where we try to determine if there is only one role who makes a real choice. Now we consider to introduce it explicitly into our languages. In fact, when a designer writes a choice involving multiple roles in a choreography, she/he has a clear idea in mind that one of the roles is special who makes the real choice, while the others should follow this role's decision. Reasonably, in the scenario above, the customer is the dominator, and the sellers are dominated by the customer's choice. We will explore this idea in this section, and introduce a dominated choice structure into Chor as a substitution of the ordinary choice. We need also the corresponding concept to express the communication of the dominance from one role to other roles in the role language, and thus introduce a guarded choice structure with communications as guards for this. We define a new pro jection, an extension of nproj, to produce correct role processes. After the study of choices, we turn to loops. Based on the concept of dominant role, we introduce another new structure called dominated loop. We will extend the role language with some iteration structures, and define the pro jection rules. From the view at the end of Sec.4.1, this part of the work is an effort related to the language and pro jection, towards correctness of Equation (1) in general. In the forthcoming discussion, we will not consider problems related to sequential composition, are they are already discussed before. The new pro jection proposed can deal with all the extensions. With it, all these structures are "implementable" in a reasonable sense. Session: Orchestration and Choreography implementation. We can consider the designation as a directive, similar to the pragma in some programming languages, which gives the compiler clues for compiling while having no effect on the semantics. i i i i Because (A1 A2 ) A3 = A1 (A2 A3 ), we can introduce multi-branch choice structures as a simple extension. Now we use the extended language to express the example discussed in the beginning of this section, a scenario where a customer wants to buy some product from two sellers. Example 7. Suppose the customer is role R1 , and the two sel lers are R2 and R3 respectively. Here is a choreography that describes the protocol of their interactions: C9 = (c1[1,2] ; a2 ; c2[2,1] 1 (c5[1,2] ; a2 ; c6 2 [2,1] 1 c3[1,3] ; a3 ; c4[3,1] ); a1 ; 1 1 c7[1,3] ; a3 ; c8[3,1] ) 2 Initial ly, the customer informs sel lers what is wanted. The sel lers do some local work (a2 and a3 respectively), and then 1 1 send responses to the customer (c2[2,1] and c4[3,1] ). After some local work a1 , the customer chooses a sel ler, that is de1 scribed by a dominated choice. In each branch of the choice, the customer continues the trade with one sel ler. Now we calculate the trace sets. To make things clear, we use T0 to represent the trace set produced by the paral lel structure. We have [[C9 ]] = T0 a a1 a{ c5[1,2] ; a2 ; c6[2,1] , c7[1,3] ; a3 ; c8[3,1] } 1 2 2 This example wil l be used throughout this section. he role language has a real extension -- the guarded choice: P : : = . . . | c1 ?P1 [] c2 ?P2 As to the semantics, we need only to extend the definition of fst and P /, but not the semantic rules, because the guarded choice can be reduced only with other parallel branches. Here are the definitions: fst (c1 ?P1 [] c2 ?P2 ) = {c1 ?, c2 ?} b 8 when = c1 ? < P1 P2 when = c2 ? (c1 ?P1 [] c2 ?P2 )/ = b : otherwise We could take a general form c1 P1 [] c2 P2 , where both c1 and c2 can be either send or receive. However, for the problems considered here, current definition is enough. On the other hand, the multi-branch guarded choice c1 ? P1 [] . . . [] cm ?Pm is a necessity. With some extension on fst and P /, it is easy to include this extended structure into our language. We omit the detailed definition here. Now we present some laws relevant to guarded choices, where = represents semantical equivalence, as in Fig.2: c1 ?P1 [] c2 ?P2 = c2 ?P2 [] c1 ?P1 c1 ?P1 ; Q [] c2 ?P2 ; Q = (c1 ?P1 [] c2 ?P2 ); Q c?P1 [] c?P2 = c?; (P1 P2 ) We assume that [] has the same priority as , thus has less grouping power than the sequential composition. We will still use [[·]] for the semantics of both Chor and the role language. It causes no confusion. T 5.1 Dominated Choice Now we consider the language extension to Chor for the reasonable implementation of choice structures. 5.1.1 Language Extensions The first modification to Chor is a new choice structure introduced to replace the ordinary one, with the form: A::=...| A i i A |... A A represents a choice with R as its dominant role. The semantics of this choice is the same as A A from the global view, i.e., the choreographic view. On the other hand, i plays a critical role in the pro jection discussed below. Here is the semantic rule for it: [[A i i A]] = [[A1 ]] [[A2 ]] b All the laws held for the ordinary choice are true when we replace it with the dominated choice. An important and interesting property is: [[A i A]] = [[A j A]] even if i = j . It is not a surprise, because in choreographies, the roles always make choice consistently. The designation of i has no effect on semantics of this level, but only on the 978 WWW 2007 / Track: Web Services Session: Orchestration and Choreography Please notice that both dproj(C9 ) and dproj(C9 ) have the intended behavior. In any run, the role processes wil l perform a course of activities described in the choreography, with some auxiliary synchronization. Furthermore, they wil l never run into dead lock. 5.1.2 Projection i Now we consider a pro jection dproj for the extended Chor, which will take care of the dominant marks on choices. Suppose A1 A2 is a structure in a choreography with n roles. For each number j = i, we should introduce two fresh channels, namely cj and cj . The pro jection of A1 role Rj (j = i) takes the form of a guarded choice: dproj(A1 i i 5 A2 on .1.3 Retrospect We list some interesting issues related to dominated choices here: · The communications inserted by dproj are inevitable for obtaining a consistent implementation. This kind of synchronization is rather tedious if written by hand. However, they can be generated automatically as shown. This benefits choreography designers. · The communications inserted by dproj serve only for synchronization. They are not substantial in the sense they do not imply any real work. After filtering them out with acts (C ), we obtain the original trace set. Thus, what we achieve is an equation similar to (4). A2 , j ) = cj ?dproj(A1 , j ) [] cj ?dproj(A2 , j ) b i For role Ri , the pro jection produces an ordinary choice: dproj(A1 where 1 = j i..nj =i cj ! 2 = j i..nj =i cj ! As a result, when the execution of the roles arrive at their versions of the choice structure, role Ri makes the real choice, and notifies all the other roles which branch it selects. Thus, those roles will take the same branch in their versions of the choice consistently. These pro jection rules involve consistent selection of the new channel names, thus can not be defined separately. However, the rules above are accurate enough, and can be implemented straightforwardly. The rules of dproj for the basic activities and other structures take the same form as what for nproj (Fig.4). We do not rewrite them again. Example 8. Now consider the customer-sel ler choreography in Exam.7. Applying dproj for each role, we have: dproj(C9 , 2) = c1[1,2] ?; a2 ; c2[2,1] !; 1 (c9[1,2] ?c5[1,2] ?; a2 ; c6[2,1] ! [] c10[1,2] ?skip) 2 dproj(C9 , 3) = c3[1,3] ?; a3 ; c4[3,1] !; 1 (c11[1,3] ?skip [] c12[1,3] ?c7[1,3] ?; a3 ; c8[3,1] !) 2 dproj(C9 , 1) = c1[1,2] ?; a2 ; c2[1,2] ?; a2 ; 1 1 ( (c9[1,2] ! c11[1,3] !); c5[1,2] !; c6[1,2] ? (c10[1,2] ! c12[1,3] !); c7[1,3] !; c8[1,2] ?) We list the process for R1 as the last one only for making the presentation clearer. When R1 takes the first branch in the choice, R2 and R3 wil l receive a notification from channels c9[1,2] and c11[1,3] respectively. Then R3 wil l end immediately, while R2 going on its work further. If R1 takes the other branch, things wil l be reversed. Now we consider an interesting modification, to see what happens if we make R2 the dominator of the choice. We cal l this modification C9 , and have the fol lowing results: dproj(C9 , 1) = c1[1,2] ?; a2 ; c2[1,2] ?; a2 ; 1 1 ( c9[2,1] ?c5[1,2] !; c6[1,2] ? [] c10[2,1] ?c7[1,3] !; c8[1,2] ?) dproj(C9 , 3) = c3[1,3] ?; a3 ; c4[3,1] !; 1 ( c11[2,3] ?skip [] c12[2,3] ?c7[1,3] ?; a3 ; c8[3,1] !) 2 dproj(C9 , 2) = c1[1,2] ?; a2 ; c2[2,1] !; 1 ( (c9[2,1] ! c11[2,3] !); c5[1,2] ?; a2 ; c6[2,1] ! 2 (c10[2,1] ! c12[2,3] !)skip) Although this implementation may be thought unfair because R2 might tend to make the choice self-beneficial ly, the implementation works perfectly. A2 , i) = 1 ; dproj(A1 , i) b 2 ; dproj(A2 , i) · From the [[A A]] = [[A A]] and an equation similar to (4), we know that with any set of the designations of dominant roles for each choice in the choreography, the implementation obtained will be equivalent on an abstract level, (i.e. the choreography level). In this case, choreography designers can choose any role as the dominator for a choice, thus they can choose the most reasonable one. The pro jection will take care of all details of the implementation. A formal proof of such a theorem is an aspect of our ongoing work. · The pro jection rules given are for the most general cases. We have many opportunities to reduce the unnecessary communications for a more efficient implementation. This can be seen as optimization, and is an interesting future work. Thinking about the problem related to the choice structure further, we can find that the dominated choice A A is not a necessity theoretically. In fact, we can make an arbitrary choreography C "implementable" by randomly selecting one role as the dominator for each choice in C , and use the technique presented above to produce the role processes. The result will work fine and is deadlock-free. However, it makes a real sense for the choreography designer to determine the dominant roles, because the random selection might produce a result that is unacceptable practically. As shown in the customer-seller example, taking one seller as the dominant role of the choice, we can also get a consistent implementation in the role process level. But this designation is unacceptable practically, because the dominant seller will make the choice in favor of her/himself generally, resulting in an unfair implementation. Anyway, dominant role is the most important concept in the implementation, no matter whether explicitly specified or randomly selected. i i j 5.2 Dominated Loop One important weakness of the framework developed yet is that the choreographies can only have behavior with finite traces, because there is no iterative or recursive structure in the languages. Now we extend Chor further to deal with 979 WWW 2007 / Track: Web Services this problem. With the help of the concept dominant role, we have a clear approach already. Session: Orchestration and Choreography Here are the semantical rules for the simple loop: P - 5.2.1 Language Extensions P - P ; P People usually use P to denote an abstract loop, which causes its body P to execute zero of more times. Theoretically, P is in fact a choice with infinite branches P = skip P (P ; P ) (P ; P ; P ) ... Thus we can apply the same technique for choices to introduce loop structures into our languages. Because all roles should go consistently on either iterating further or exiting, we need a dominant role to make the decision and notify the others, to enforce them going always the same way. We introduce a new structure dominated loop into Chor, with the syntax: A ::= ...| A It means that activity A will be executed zero or more times, and Ri is the dominant role of this loop who decides if the next iteration is necessary. To define the semantics of A, we introduce the star operator · on traces. For any trace t, we have: S t0 = b tn+1 = tn a t b t = iN {ti } b where is the set of natural numbers. We lift the star operator to trace sets: S T = tT t b Now we can define the semantics of a loop: [[ A]] = [[A]] b We assume that has the highest priority. Thus, it applies only to the minimal structure after it. Here are some laws for the loop structure: A = skip (A1 ; A2 ) = skip skip A A = A i A = A i A = A The last law holds even if i = j . Thus, the designation of the dominant role is also a directive with no effect on the (globally-viewed) semantics of the choreography. We need to introduce two new structures into the role language, to express the behavior of the dominant and dominated roles. The dominant role iterates on its own, while the dominated role should continue or stop its iteration according to the notification from the dominator. We call the structure for the dominated roles as dominated loop in the role language, which is a kind of guarded command: P : : = . . . | P | c1 ? [] c2 ?P The process P repeats the behavior of P by zero or more times. For the dominated loop c1 ? [] c2 ?P , if a handshake happens on c1 , the loop terminates immediately; otherwise, if a handshake happens on c2 , then the loop waits for the next handshake after the execution of P . The loop can be seen as a kind of guarded loop proposed by Edsgar W. Dijsktra, with a breaking mechanism. Because it is not symmetric, we adopt a non-symmetric symbol [] . i j i i i i i i i For the dominated loop, we need not to define new rules, but only to extend the definition of fst and P /, as what we did for the guarded choice. Here are the relevant definitions: fst (P ) = b fst (c1 ? [] c2 ?P ) = {c1 ?, c2 ?} b 8 when = c1 ? < P when = c2 ? (c1 ? [] c2 ?P )/ = b : otherwise We assume that has the highest priority too, and [] has the same priority as and , thus has less grouping power than the sequential composition. 5.2.2 Projection i Now we consider the part of dproj related to dominated loops. Suppose C is a choreography with n roles including A as a part. When this dominated loop is pro jected into the role language, there will be an ordinary loop in the process corresponding to Ri , and a dominated loop in each of the rest role processes. For each role Rj where j = i, we need to introduce two fresh channels cj and cj , with the pro jection rule: dproj( A, j ) = cj ? [] cj ?dproj(A, j ) b For role Ri , we have: dproj( A, i) = (1 ; dproj(A, i)); 2 b where we have also 1 = j i..nj =i cj ! Let us see an example: Example 9. Now consider an extension of the scenario described in Exam.7. Suppose in this time, the customer wants to buy a number of products from the two sel lers: C10 = C9 There is a simple dominated loop, where each separate trade is described by choreography C9 . For implementation, we project C10 to the three roles: dproj(C10 , 2) = c13? [] c14?dproj(C9 , 2) dproj(C10 , 3) = c15? [] c16?dproj(C9 , 3) dproj(C10 , 1) = ((c14! c16!); dproj(C9 , 1)); (c13! 1 i i N 2 = j i..nj =i cj ! A; A A1 ; (A2 ; A1 ); A2 i i c15!) The results of projecting C9 have been given in Exam.8, and are omitted here. It is easy to verify that the paral lel composition of these three processes shows the intended behavior. We can also designate a sel ler, e.g., R2 , as the dominant role of the loop, and also as the dominator of the choice embedded in the loop: C10 = C9 In this case, we have dproj(C10 , 1) = c13? [] c14?dproj(C9 , 1) dproj(C10 , 3) = c15? [] c16?dproj(C9 , 3) dproj(C10 , 2) = ((c14! c16!); dproj(C9 , 2)); (c13! 2 c15!) 980 WWW 2007 / Track: Web Services Now sel ler R2 can sel l as many products as she/he wants. Although hard to accept, the implementation works fine. Session: Orchestration and Choreography Is the distinction important in the choreography world? All related work we read regards the relative speed of roles, but without a discussion about its rationality. However, from our view of point, this is questionable. Although the semantics used here is useful in exploring properties and problems of choreographies, it may not be the best choice, because the intention here is to describe collaborations between independent web services. Neglecting the independency of roles, we may obtain some results which are not important practically. Furthermore, to ensure the relative order, we need to insert many synchronizing actions (Sec.4.3) which may be "unnecessary", and produce a slow implementation. If we do no care about the relative speed of different roles, we can get rid of Condition 1, and may obtain more effective implementations. Now we are looking for other reasonable semantics that can reflect the situation better. We hope that with the semantics, the correctness of our implementation approach can be proved formally. In fact, here we have a dilemma: if we allow roles to run freely except observing the explicit synchronization written in the choreography, the three choreographies listed above will be semantically indistinguishable. This might bring confusions to the choreography designers too. A ll the discussions in Sec.5.1.3 are applicable to the loops. Especially, we can also use ordinary loop structures and let the pro jection to choose randomly a dominator for each loop in the choreography, and make a viable implementation with the technique shown above. However, from the practical viewpoint, taking the concept of dominant role explicitly makes a real sense. 6. REMARKS In this paper we present some preliminary results of an effort towards the theoretical foundation of the choreography. Many related issues are not touched or only partially developed. We discuss some of them in this section. Implementation. WS-CDL document [9] defines a choreography as a multi-participant contract from a global perspective. It does not define clearly the "correct" implementation of a choreography, but proposes only the concept conformance. It says "each participant can then use the global definition to build and test solutions that conform to it. The global specification is in turn realized by combination of the local systems, on the basis of appropriate infrastructure support". This leaves the definition of implementation open, and creates confusion. Some researchers take a global view where a choreography is thought just a (business) process, and an implementation is any of the processes which shows the behavior required by the choreography. For example, N. Busi et al. [4] gave a definition following this vein. On the other hand, we take the role-based view or local view, and consider that a choreography describes a (business) process implemented by a set of roles. An implementation is a set of processes where each one implements one of the roles in a clear way, and the combination of them has the required behavior. The fundamental dissimilitude of these two viewpoints is whether the roles and activities described in the choreography must have their reincarnations in any admitted implementation. The local view says that these details are really important, but the global view says it is not the case. We think that the definition of implementation based on the global view is too loose. The extreme case is an implementation with only one process, which executes all activities of all roles in the choreography, and makes all the communications as local assignments. The approach proposed here takes the way of pro jection and local conformance, where each role is a real process in any valid implementation which runs independently in parallel with other processes corresponding to other roles. Thus, the pro jections is very important in the implementation, and should be studied further. We also defined and studied the concept of local conformance in previous work [13]. Semantics. The semantics used for Chor and generally, for choreographies, is also a topic to study. We take here all the relative order between activities described in choreography significant, even if they appear in different (and independent) roles, thus distinguish choreographies such as a1 ; a2 a2 ; a1 a1 a2 Language. As shown in this paper, we are still in the early stage of defining a language for choreography. We propose some novel structures for the choreography language and the process language in this paper, based mainly on the implementation view point. The most important concept is dominant role. From our view, this concept must have its position in any language designed for choreographies. There are some other clues about the design of the languages. In our previous paper [13], we gave a simple example which shows one weak point of the common structural flow structures for the description of choreography. Another example is the controversial issues about the control structure workunit of WS-CDL [3]. Aalst et al. listed some challenges including defining a "real" choreography language in their paper [16]. It seems that lot work should be done before a widely accepted result in this field. 7. CONCLUSION The rapid-developing web technology is creating an environment where increasingly computation tasks are carried out by multiple processes (services) residing over the Internet. Due to the nature of web services, to guarantee the correct interaction of independent, communicating services become even more critical [14]. This is the motivation under the development of WS-CDL [9] for specifying choreography, i.e., the global observation of business protocol among participants (roles). Because of these situations, a deep and thorough understanding of the essence of choreography and relevant problems is urgent and important. N. Busi et al. formalized choreography and orchestration using process algebra, with a global view on conformance [4] (see Sec.6). J. Mendling and M. Hafner proposed a simple pro jection from WS-CDL to BPEL [11] without a discussion about its correctness. Another interesting reference is the Pi4SOA pro ject [1]. In our previous work, we defined formal models for choreography, and proposed and studied the natural pro jection and Equation (1), in the context of verification [17] and implementation [13]. 981 WWW 2007 / Track: Web Services As the work related to ours the most, Carbone, et al. [5] studied a two-level paradigm for the description of communication behaviors, on the global message flows and endpoint behavior levels. Three principles for well-structured global description and a theory for pro jection were developed, with a condition and a framework for relating the global description of communication-centric software to its local correspondents. Fu et al. [6] proposed a framework for modeling global behaviors of electronic services composed by autonomous peers. The global protocol is specified with Buchi automata. A trace semantics recording the sending ¨ messages is given, while pro jection of protocol, pro jection of trace, and the semantics of composed peers are defined. The Sequencing Constraints language of the SSDL protocol framework [15] pays attentions on the description of protocols for multiparty collaboration using message-oriented programming abstractions. Li and He [10] discussed also the relationship between choreography and orchestration. Compared with our work, none of these work introduced the concept of dominant role explicitly into the languages, nor dominated choice and dominated loop structures. In this paper, many basic concepts related to choreography are studied. A small language Chor for choreography and a simple process language for the roles from local viewpoints are defined, both with formal syntax and semantics. Based on them, we discuss the concept of pro jections, which map a given choreography into a set of role processes. A special mapping named natural projection is discussed in detail. With this pro jection, we define a level of well-formedness, and propose two structural conditions. We discuss also a pro jection which can remedy the sequential order problem of choreographies. The most important contribution of this paper is recognition of the concept dominant role which is critical in the implementation of any choice or iteration structures in choreographies. We suggest to take dominant role as a languagelevel concept, and attach it to each choice and loop structures. With this concept, choreography designers can express their desire clearer. And we can have also a clearer model for the implementation of choreographies. Based on the concept dominant role, we extend our languages with some novel language structures related to dominant roles, including dominated choice and dominated loop structures on both choreography and role process levels. We propose a pro jection, which is an extension of the natural pro jection with rules about dominated choices and dominated loops. There are different rules for the dominant role and dominated roles. As a result, the parallel composition of the generated role processes will be an implementation of the choreography. We show that the selection of the dominant role for each choice or loop has no effect on the choreography-level semantics, but only on the implementation. Thus, the choreography designer can make their selection as desired, and the pro jection will always generate a correct implementation, where all necessary actions to synchronize the processes are generated automatically. We discuss some additional problems related to choreography in the paper, including the local viewpoint about the implementation vs. the global one, the rationality of the semantics, etc. A number of examples are presented in some sections to help the readers get better understanding of the properties of choreographies, and of the interesting phenomena appeared in this field. Session: Orchestration and Choreography 8. ACKNOWLEDGMENTS The authors would like to thank Shengchao Qin and Xueshan Feng for their helpful comments. 9. REFERENCES [1] Pi4SOA. http://www.pi4soa.org/. [2] T. Andrews, F. Curbera, H. Dholakia, et al. Business Process Execution Language for Web Services, version 1.1, 2003.5. http://ifr.sap.com/bpel4ws/. [3] A. Barros, M. Dumas, and P. Oaks. A Critical Overview of the Web Services Choreography Description Language. www.bptrends.com, 2005. [4] N. Busi, R. Gorrieri, C. Guidi, et al. Choreography and orchestration conformance for system design. In Coordination 2006, LNCS 4038. Springer, 2006. [5] M. Carbone, K. Honda, and N. Yoshida. Structured global programming for communication behaviour. http://www.pi4tech.com/xwiki/bin/view/ research/papers, 2006. [6] X. Fu, T. Bultan, and J. Su. Conversation protocols: A formalism for specification and verification of reactive electronic services. In Proceedings of CIAA 2003, LNCS 2759, pages 188­200. Springer, 2003. [7] C. A. R. Hoare. Communicating Sequential Processes. Prentice Hall, 1985. [8] N. Kavantzas. Aggregating web services: Choreography and WS-CDL. Technical report, Oracle Corporation, 2004. [9] N. Kavantzas, D. Burdett, G. Ritzinger, et al. Web Services Choreography Description Language, version 1.0, 2005. http: //www.w3.org/TR/2005/CR- ws- cdl- 10- 20051109/. [10] J. Li and J. He. Conformance validation between choreography and orchestration. Software Eng. Inst., East China Normal University, manuscript, 2006. [11] J. Mendling and M. Hafner. From inter-organizational workflows to process execution: Generating BPEL from WS-CDL. In OTM 2005, LNCS 3762. Springer, 2005. [12] R. Milner. Communication and Concurrency. Prentice Hall, 1989. [13] Z. Qiu, C. Cai, X. Zhao, and H. Yang. Exploring into the essence of choreography. Preprint 2006-63 of Institute of Mathematics, Peking University, http: //www.math.pku.edu.cn:8000/en/preindex.php. [14] G. Salaun, L. Bordeaux, and M. Schaerf. Describing and reasoning on web services using process algebra. In 2nd Inter. Conf. on Web Services. IEEE, 2004. [15] J. W. Simon Woodman, Savas Parastatidis. Sequencing Constraints SSDL Protocol Framework. http://ssdl.org. [16] W. van der Aalst, M. Dumas, A. ter Hofstede, et al. Life after BPEL? In WS-FM 2005, LNCS 3670. Springer, 2005. [17] X. Zhao, H. Yang, and Z. Qiu. Towards the formal model and verification of web services choreography description language. In WS-FM 2006, LNCS 4184. Springer, 2006. 982