Complexities for Generalized Models of Self-Assembly
Gagan Aggarwal Michael H. Goldwasser Ming-Yang Kao Robert T. Schweller§

Abstract In this paper, we extend Rothemund and Winfree's examination of the tile complexity of tile self-assembly [6]. o They provided a lower bound of ( lolg lg NN ) on the tile og complexity of assembling an N × N square for almost all N . Adleman et al. [1] gave a construction which achieves this bound. We consider whether the tile complexity for self-assembly can be reduced through several natural generalizatio of the model. One of our results ns is a tile set of size O( log N ) which assembles an N × N square in a model which allows flexible glue strength between non-equal glues (This was independently discovered in [3]). This result is matched by a lower bound dictated by Kolmogorov complexity. For three other geno eralizations, we show that the ( lolg lg NN ) lower bound og applies to N × N squares. At the same time, we demonstrate that there are some other shapes for which these generalizations allow reduced tile sets. Specifically, for thin rectangles with length N and width k , we provide a tighter lower bound of ( Nkk ) for the standard model, yet we also give a construction which achieves o O( lolg lg NN ) complexity in a model in which the temperog ature of the tile system is adjusted during assembly. We also investigate the problem of verifying whether a given tile system uniquely assembles into a given shape, and show that this problem is NP-hard. 1 Intro duction The tile assembly model extends the theory of Wang tilings of the plane [7] by adding a natural mechanism for growth. Under this model, we can consider a system of Wang tiles to be analogous to a computer proof Computer Science, Stanford University, Stanford, CA 94305. Email: gagan@cs.stanford.edu. Supp orted in part by Stanford Graduate Fellowship, NSF Grant IIS-0118173, NSF ITR Award Numb er 0331640, an SNRC grant, and grants from Microsoft and Veritas.  Department of Mathematics and Mathematical Computer Science, Saint Louis University, 221 North Grand Blvd., St. Louis, MO 63103. Email: goldwamh@slu.edu.  Department of Computer Science, Northwestern University, Evanston, IL 60201. Email: kao@cs.northwestern.edu. Supported in part by NSF grant EIA-0112934. § Department of Computer Science, Northwestern University, Evanston, IL 60201. Email: schwellerr@cs.northwestern.edu. Supported in part by NSF grant EIA-0112934.
 Department
1

gram and the shape produced by the system analogous to the output of the program. Rothemund and Winfree [6] introduced the study of the tile complexity of self-assembled shapes, defined as the minimum number of distinct Wang tiles required to assemble the shape. They studied the tile complexity of selfassembled squares. We extend that study by considering the tile complexity of self-assembled squares and rectangles for natural generalizations of the tile assembly model. In the flexible glue model, we remove the restriction imposed by Rothemund and Winfree [6] that different glue types have bonding strength of zero. In the multiple temperature model, we permit the temperature of the system to change during the assembly process. In the multiple tile or q -tile model, we allow tiles to cooperate by assembling into supertiles before being added to the growing result. In the unique shape model, we say that a system uniquely assembles a shape if the shape of its resultant supertiles is unique, though its produced supertile may not be unique. In the standard model, Kolmogorov complexity o dictates a lower bound of ( lolg lg NN ) for self-assembling og N × N squares for almost all values of N [6]. Adleman et al. [1] have shown how to reach this bound for all N . We show (in Thm. 6.1) that this lower bound also applies to each of our models except for the flexible glue model. For this model, Kolmogorov complexity only dictates  an ( log N ) lower bound (as shown in Thm. 6.2). We show how to achieve this bound for all N by encoding an arbitrary (log N )-bit binary number into the glue function (Thm. 5.1). Additionally, we provide a lower bound of ( Nkk ) (see Thm. 3.1) for assembling thin k × N rectangles (rectangles whose width k is less than log N These lower log log N -log log log N in their length N ). bounds show that it can require significantly larger tile sets to assemble thinner rectangles than thicker rectangles. With this in mind, we utilize the multiple temperature model to construct a thin rectangle by first constructing a thicker rectangle using a small tile set. We then raise the temperature of the system so that portions of the larger rectangle fall apart, leaving the desired rectangle (Thm. 4.1). Given a tile system, Adleman et al. [2] give an
1

Copyright © 2004 by the Association for Computing Machinery, Inc. and the Society for industrial and Applied Mathematics. All Rights reserved. Printed in The United States of America. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, write to the Association for Computing Machinery, 1515 Broadway, New York, NY 10036 and the Society for Industrial and Applied Mathematics, 3600 University City Science Center, Philadelphia, PA 19104-2688

880


algorithm to verify whether the tile system uniquely assembles into a given supertile. However, in the unique shape model, a tile system could uniquely assemble into a shape even if it does not produce a unique supertile. We investigate the problem of verifying whether a given tile system uniquely assembles into a given shape in the unique shape model, and show that this problem is co-NP-complete (Thm. 7.1). This result is in contrast to the polynomial-time verification algorithm for the standard model. Pap er Layout: In Section 2 we define the standard tile model as well as our generalized models. In Section 3 we introduce a construction for assembling k × N rectangles and provide a lower bound on the tile complexity of such rectangles. In Section 4 we use our construction to show how the multiple temperature model can reduce tile complexity when k N . In Section 5 we use the flexible glue model to reduce the tile-complexity of assembling N × N squares from  o ( lolg lg NN ) to ( log N ). In Section 6 we discuss og the lower bounds for assembling N -dimensional shapes for our models as dictated by Kolmogorov complexity. Finally, in Section 7, we show the co-NP-completeness result for the problem of verifying whether a given tile system uniquely assembles into a given shape. 2 Basics 2.1 The Standard Mo del. For alternate descriptions of the tile assembly model, see [1, 2, 6]. Briefly, the tiles in the model are four sided Wang tiles denoted by ordered quadruples (n , e , s , w ). The entries of the quadruples are glue types taken from an alphabet  representing the north, east, south, and west edges of the Wang tile, respectively. For a given tile t, we use the function north(t) to denote the glue on the north edge of t, and similarly define east(t), south(t) and west(t). A tile system is an ordered quadruple T, s, G,  where T is a set of tiles called the tileset of the system,  is a positive integer called the temperature of the system, s  T is a single tile called the seed tile, and G is a function from 2 to {0, 1, . . . ,  } called the glue function of the system. It is assumed that G(x, y ) = G(y , x), and  null   such that G(null, x) = 0 x  . In the standard tile assembly model [1, 2, 6], the glue function is such that G(x, y ) = 0 when x = y . When we are dealing with the standard glue model, we denote G(x, x) by G(x). We d{ fine a configuration to be a mapping from Z2 e to T empty} where empty is a special tile whose edges have no bonding strength with any glue type. For a configuration C we say a tile t  T is attachable at (i, j ) if C (i, j ) = empty and G(n , south(C (i, j + 1))) +

G(e , west(C (i + 1, j ))) + G(s , north(C (i, j - 1))) + G(w , east(C (i - 1, j )))   . Informally, for a tile t attachable to configuration C at (i, j ), we define the act of attaching t to C at (i, j ) as a way to produce a new configuration from C in which the empty tile at (i, j ) is replaced by t. We define the adjacency graph of a configuration C as follows. Let the set of vertices be the set of coordinates (i, j ) such that C (i, j ) is not empty. Let there be an edge between vertices (x1 , y1 ) and (x2 , y2 ) iff |x1 -x2 |+|y1 -y2 | = 1. We refer to a configuration whose adjacency graph is finite and connected as a supertile. For a supertile S , we denote the number of vertices w (tiles) in the supertile by size(S ). For any supertile S hose adjacency graph is a subgraph of the graph of a supertile S , we say S is a sub-supertile of S . A cut of a supertile is a cut of the adjacency graph of the supertile. In addition, for each edge ei in a cut define the edge strength si of ei to be the glue strength (from the glue function) of the glues on the abutting edges of the adjacent tiles corresponding to ei . Now s define the cut strength of a cut c to be i for each edge ei in the cut. In the standard model, assembly takes place by growing a supertile starting with tile s at position (0, 0). We permit any t  T that is attachable at some position (i, j ) to attach and thus increase the size of the supertile. For a given tile system, any supertile that can be obtained by starting with the seed and attaching arbitrary attachable tiles is said to be produced. If this process comes to a point at which no tiles in T can be added, the resultant supertile is said to be terminal ly produced. For a given shape S , we say a tile system  uniquely produces shape S if  terminally produces a unique supertile with shape S . The tile complexity of a shape S is the minimum tile set size required to uniquely assemble S under a given assembly model. 2.2 Four Generalized Mo dels

The Multiple Temp erature Mo del. In the multiple temperature model, we replace the integer temperature  in the tile system description with a sequence of integers {i }k=1 called the temperature sequence of i the system. We refer to a system with k temperatures in its temperature sequence to be a k -temperature system. In a k -temperature system, assembly takes place in k phases. In the first phase, assembly takes place as in the standard model under temperature 1 . Phase 1 continues until no tiles can be added. In phase two, tiles can be added or removed under 2 . Specifically, at any point during phase 2 if there exists a cut of the

881


Tile Complexities of Self-Assembling k × N Rectangles, k  N Thin Rectangles LB UB Standard
Nk k
1

Thick Rectangles LB UB
log N log log N

(Thm. 3.1) Flexible Glue
Nk k
1

Nk (Thm. 3.2) Nk (Thm. 3.2)
1

1

(see [6])  (Thm. 6.2)

(see [1]) log N (Thm. 5.1)

(Thm. 3.1) Multi-Temperature

log N log log N

log N log log N

(Thm. 6.1) q -Tile
log N log log N

(Thm. 4.1) Nk (Thm. 3.2) Nk (Thm. 3.2)
1 1

(Thm. 6.1)
log N log log N

(see [1])

(Thm. 6.1) Unique Shape
Nk k
1

(Thm. 6.1)
log N log log N

(see [1])

(Thm. 3.1)

(Thm. 6.1)

(see [1])

Table 1: This table gives the general flavor of our results regarding upper and lower asymptotic bounds on tile complexity under our models for k × N rectangles, but the reader should reference precise details in the stated o theorems. A rectangle is thin when k < log log N l-goN log log N and thick otherwise. lg resultant supertile with cut strength less than 2 , we can remove the portion of the supertile occurring on the side of the cut not containing the seed tile. Also, at any point in the second phase we can add any tile in T to the supertile if the tile is attachable at said position under temperature 2 . The second phase of this assembly continues until no tiles can be added or removed. We then go to phase 3 in which we can add and remove tiles under temperature 3 . We continue this process up through k . If during each phase of the assembly we must reach a point when no more tiles can be added or removed regardless of the choice of additions and removals, then we say the tile system terminal ly produces the shape assembled after phase k . If a tile system always ends with a unique terminally produced supertile R, we say the tile system uniquely assembles the shape of R under the k -temperature model. model, we redefine what we mean for a system to uniquely produce a shape S . In this model, a tile system uniquely produces a shape S if the only terminal supertiles produced by the tile system are of shape S . Thus we allow the system to produce many different supertiles as long as they all have the desired shape.

The q -Tile Mo del. In the q -tile or multiple tile model we allow tiles in the system to combine into larger supertiles of size at most q before being added to the growing seed supertile. Specifically, for a tile set T , we define a set of addable supertiles WT as follows: First, T  WT . Second, if a supertile r is produced by abutting two supertiles s, t  WT , then r is also in WT if the sum of the edge strengths of the abutting edges of s and t reach or exceed the temperature of the system and size(s) + size(r)  q . In the q -tile assembly model, we allow the addition of supertiles from the set WT as well as T . Specifically, a supertile in WT can be The Flexible Glue Mo del. In the flexible glue attached to the growing seed supertile at any position model, we remove the restriction that G(x, y ) = 0 for such that the edge strengths of the abutting edges sum x = y . This reduces the lower bound for assembling N to at least the temperature of the system. For q = 1,  o dimensional shapes from ( lolg lg NN ) to ( log N ). we have WT = T , which gives us the standard model of og assembly. The Unique Shap e Mo del. In the unique shape

882


3 The Assembly of k × N Rectangles In this section, we present both an upper and lower bound for assembling k × N rectangles for arbitrary k and N . The upper bound comes from a simple con1 struction which constitutes a k -digit, base-N k counter 1 and has tile complexity (N k + k ). We also conjecture that it has optimal time complexity (as defined in [1]) of (k + N ). In later sections, we use modifications of this counter to show how both the 2-temperature and flexible glue model can reduce tile complexity of certain shapes. Additionally, for the case of k = log N , this construction constitutes a log N -bit, base-2 counter which assembles a log N × N block. The tile set created by this special case was independently discovered by Cheng and Espanes [3]. They show that it assembles a log N × N block in optimal time complexity (N ). This permits the assembly of N × N squares in optimal tile complexity and optimal time complexity as in [1], but does so in a much simpler fashion and uses only temperature  = 2.

i from 1 to m-1 Hairpin Tiles i from 0 to m-1 Normal Tiles g ui ui r u0 g ui R Hi r g P Hi p p

Seed (S0) and Seed Column Tiles u0 ui-1

Sk-1
sk-1

hi

...
u0 um-1 c1

i
g

s2

R
r

d0 d0

S1
s1

P
p

S0

Return Probe and Probe Tiles (marked R and P respectively) Chain Tiles p g g g r

c0

Cm-1

Cm-2
cm-1

...
cm-2 ci+1

Ci

...
ci c2

C1

c1

C0

c0

Figure 1: A tile set to assemble a k × mk rectangle in Theorem 3.1. The tile complexity of self-assembling a (k + m) tile complexity under the standard assembly 1 k×N rectangle is ( Nkk ) for the standard model, the model. unique shape model, and the flexible glue model. We show this by first providing a construction for the standard model and then arguing that the construction works for the unique shape model and the q -tile model as well. Proof for the Standard Model. For a given N , let 1 m = N k . To show this bound, we give a general 1 tile set to assemble a k ×mk rectangle in O(N k + k ) tiles under the standard model. We then show how to adjust the tile set so it produces a k × N rectangle. The tile system we use constitutes a k -digit base-m counter. The system has temperature  = 2 and the tile set and glue strength function are given in Figure 1. The assembly takes place as follows. The north edge of the seed tile produces a length k seed column from the seed column tiles. The west edge of the seed produces a length m chain from the chain tiles. The 0 normal tile can then fill in all the rows and columns up until P column m - 1. In this column the H1 hairpin tile must be placed. This causes a hairpin growth which causes another length m chain of chain tiles to be placed in the first row. The next section of m columns are then filled with the 0 normal tile in all rows but the second, which will contain the 1 normal tile or a hairpin tile. In general, when the Cm-1 chain tile is placed, probe tiles are added on top of each other until a row is found that does not consist of the m - 1 normal tile. In such a row the appropriate pair of hairpin tiles are placed and

Proof. Suppose we had a tile system with fewer than 1 N ( 2(k!) ) k distinct tile types. Then there must be fewer N than 2(k!) distinct k tile columns consisting of these tiles. So in a k × N block, there must exist some k -tile column configuration which is repeated in more than 2(k !) columns. For each of these identical columns we can assign an ordering on the k tiles that corresponds to a possible relative order in which the k tiles of the given column could be placed. Since there are at most k ! orderings possible, we get that at least 3 of the identical columns must also have an identical ordering. From this we derive our contradiction as follows. Wherever the seed tile of the construction occurs, it lies to either the west or east of two of the identically placed columns. And if the seed tile occurs to the west (east) of a column, then all tiles east (west) of the column are determined by (1) the tiles and their positions in the column, and (2) the relative placement order of the tiles in the column. Thus we have a contradiction. This implies that the size
N of the tile set is at least ( 2(k!) ) k = ( Nkk ).
1 1

Remark. We also point out that the argument for Theorem 3.1 applies to any length N shape whose height at each column is bounded by a value k . Theorem 3.2. The tile complexity of self-assembling a 1 k × N rectangle is O(N k + k ) for the standard model, the unique shape model, and the q -tile model.

883


a downward growing column of return probe tiles are placed until the bottom row is encountered, at which point the C0 chain tile is placed to start the length m chain growth over again in the first row. If no such row is encountered, the assembly finishes. The idea here is that the bottom chain row represents the least significant digit of the counter, thus incrementing every column. After each block of m we need to find the most significant digit that requires incrementation, as well as rollover all the trailing digits displaying m - 1 to 0. The hairpin tiles perform the incrementation, while the probe and return probe tiles perform the rollover. It is easy to see that for a given k and m, this construction assembles into a k × mk block. In addition, we can adjust the glue types of the west edges of the seed column and seed tiles to represent any k -digit, base-m number. By doing this we can start the counter at any number between 0 and mk - 1. Thus we can designate the length of the constructed shape to be any number between 1 and mk . Therefore, to assemble a 1 a k × N block, we use the above tile set with m = N k nd the west edges of the seed and seed columns tiles set to represent the number mk - N . Thus we 1 can construct a k × N block in 4m + k = O(N k + k ) tiles. Proof for the Unique Shape Model. This follows from the construction for the standard model. Proof for the q -Tile Model. this version. 4 We omit this proof in
c0

i from 1 to m-1 Hairpin, Return Probe and Probe Tiles (marked H, R and P respectively) ui ' g' ui ' ui ' r u0 '

g'

g' R' hi

Seed (S0) and Seed Column Tiles P' ui-1 '
' u0

Hi
r

Hi
p'

Sj'-1
sj-1

'

g'

R'
r

d0 d0 ''

P'
p'

um-1 '

...
u0 ' u0 '

i'

'

p'

sk+2

'
g'

' Sk+1
sk+1

g' R* hi

ui* g' ui* ui* r u0 *

Hi
r

Hi
p'

P* p

ui*1 -

' Sk
sk

i
g

*

u0 * um-1 * u0

'

S* -1 k
sk-1

R*
r g

d0 d0 **

P*
p

i from 0 to m-1 Normal Tiles

Sk-2
sk-2

ui g ui ui r u0

Hi
r

R hi

Hi
p

P p

ui-1 u0

...
S1 S0
r c1

g

s2

i
g

R
r

d0 d0

P
p

um-1 c1

s1

Chain Tiles p g

g

g

Cm-1

Cm-2
cm-1

...
cm-2 ci+1

Ci

...
ci c2

C1

C0

c0

Reducing Tile Complexity with the 2-Temp erature Mo del Now we show how a multiple temperature model can reduce tile complexity for assembling thin rectangles. For a given k and N with k N , the idea is to use a modification of the construction from section 3 to build a j × N rectangle for optimal (bigger than k ) j such that the top j - k rows are less stable than the bottom k . We then raise the temperature of the system to cause all but the bottom k rows to fall apart. We then compare the complexity of this construction with the lower bound for the standard model given in section 3 to show that the 2-temperature model can beat a lower bound for the standard model. Minimizing the Complexity. For a given j and N , assembling a j × N rectangle using the construction of Theorem 3.2 yields an upper bound that is a function of j . If we are only interested in constructing a rectangle of length N , but do not care about the width, we can choose j as a function of N such that the tile complexity is minimized. To do this we choose a 1 value for j that balances the size of N j and j . For

Figure 2: A tile set to assemble a k × N rectangle in 1 (j + m) tile complexity for m = N j under the 2temperature model with 1 = 4 and 2 = 6.

j=

is number of tiles used in our construction for building a j × N block for such a j 1 o is j + N j = ( lolg lg NN ), which meets the lower bound og dictated by Kolmogorov complexity [6]. We use this result in the following theorem. Theorem 4.1. Under the 2-temperature model, the tile complexity of self-assembling a k × N rectangle for an o arbitrary integer k  2 is O( lolg lg NN ). og
o Proof. In the case that k exceeds log log N l-goN log log N , lg we can simply use a single temperature model to build the two perpendicular axes of the rectangle in optio mal O( lolg lg NN ) complexity. The addition of a conog stant number of tiles can then fill in the rest of the

log N log log N -log log log N log N Thus the log log N .

o = ( lolg lg NN ), the term N j og

1

884


3' 2' 0' 1* C1

3' 2' 0' H* C0

3' 2' 0' H* C4

3' 2' 0' 0* C3

3' 2' 0' 0* C2

3' 2' 0' 0* C1

3' H' R' R* C0

3' H' P' P* C4

3' 1' 4' 4* C3

3' 1' 4' 4* C2

3' 1' 4' 4* C1

0 ui

1 ui ui

bi,0 ui ui

bi-1,m-1

i0
g0 i bm-1,m-1

i1
g1 i

Hi
r

R hi

Hi

P p

ui-1

b0,0 u0

X

um-1

i from 0 to m-1 0 Normal Tiles (left) 1 Normal Tiles (right)

i from 1 to m-1 Hairpin Tiles

S1
s1

c1 Seed Column Tiles gj g1

Figure 3: Here is a typical section of a self-assembled 5 × N block from the tile set in Figure 2 that will break down to a 2 × N block under temperature 6. Edges with strength 4 are marked by 4 adjacent arrows, while edges with strength 1 are marked with 1. All remaining edges have strength 3.

p p

Stop Tile Chain Tiles gm-2

S0
r

c0

Cm-1

Cm-2
cm-1

...
cm-2 cj+1

Cj

...
cj c2

C1

c1

C0

c0

Figure 4: This tile set creates a 2 × n block whose top rectangle. Otherwise, to reach the bound we define a row represents a given n-bit binary number b. Here bij tile system that assembles a j × N block for optimal is the value of the bit in position im + j in b. The glue 1 o value j = log log N l-goN log log N under temperature 1 function for glues gi and gj for i from 0 to m - 1 and lg 1 j from 1 to m - 2 is G(gi , gj ) = bij . All other pairs of and breaks down to a smaller k × N block in the secnon-equal glues have strength 0. ond phase of the two-temperature assembly process. As 1 in the construction from Theorem 3.2, let m = N j . Consider the two-temperature tile system with 1 = 4 and 2 = 6 and the tile set and glue strength function Thus from Theorem 3.1 the lower bound for the stan1 given in Figure 2. dard model is ( Nkk ) = ((log N )(log log N )). Since Under temperature 1 , this tile system assembles a this bound only gets higher for smaller values of k , the j × N block in exactly the same fashion as our single 2-temperature model beats it for all k between 2 and log N temperature system from Theorem 3.2. However, for 2 log log N . each tile in the top j - k rows other than the seed column  tiles, the north and south edges have glue strength 1. 5 Assembling N × N Squares in O( log N ) Tiles Therefore, since no glue in the system exceeds strength log N 4, we are assured that any cut consisting of one north- Kolmogorov complexity dictates an ( log log N ) lower south edge and one east-west edge has cut strength less bound on the tile complexity of self-assembling N × N than 2 = 6. Therefore, under the two-temperature squares for the standard model in which we limit the model each tile in the top j - k rows can be removed glue function so that G(x, y ) = 0 for x = y . However, one at a time, starting at the northwest corner of the if we lift this restriction, the bound no longer applies. ti j × N block. We are then left with the bottom k rows. In this section, we show that the le complexity of self-assembling N × N squares is ( log N ) under the Since our design ensures that each edge in the bottom k rows has strength 3 or greater, no cut of the bottom flexible glue model for almost all N . k × N supertile can have cut strength less than 6. Thus Theorem 5.1. The ti complexity of self-assembling le no cut can break up the remaining k × N block. And N × N squares is O( log N ) under the flexible glue since any alternate choice of cuts would only expedite model. the process to this end, we have uniquely constructed a 1 o k × N block in 10m + j = O(N j + j ) = O( lolg lg NN ) tile Proof. The trick as introduced in [6] is to be able to og complexity under the two-temperature assembly model. initialize a fixed length binary counter to any arbitrary (log N )-bit binary number. This can be done trivially For small values of k , this beats the lower bound for in O(log N ) tile complexity. By simulating base conlN o the standard model. Consider k = 2 looglog N . For such version, it can be done in O( lolg lg NN ) [1]. Here we cong  og 1 struct a 2 × log N block in O( log N ) tiles such that the k the value of N k is top row of the block encodes a given binary number b. 2 log log N 2 log log N N log N = (2log N ) log N = (log N )2 . We accomplish this by taking advantage of the flexible

885


2

0

1

1

0 1

1

0

1 0

1 1

1 1

0 1

0

1

1 1

0 1

1 0

0 1

2
C5 C4 C3

1

2

H2

R

H2

P

1

1

1

1

H1

R

H1

P

0

0

0

0

S1 S0

C2

C1

C0

C5

C4

C3

C2

C1

C0

C5

C4

C3

C2

C1

 Figure 5: Assembling an arbitrary n-bit binary number in O( n) tiles. Here we show the construction for n = 36 and binary number b = . . . 0110101110011010.

glue function and encoding b into the glue function. Let n = log N + 1 and m = n . Let b be a given n-bit binary number. Let bij be the bit in position im + j in b. We use the tile set and glue function from Figure 4 to construct a 2 × n block such that the top row of the block represents the number b. For convenience we denote some glues by the symbols 0 and 1. To simplify the illustration, the tile set in Figure 4 assumes that m2 = n. If this is not the case, we can do as before in Theorem 3.2 and initialize our 2-digit counter to an arbitrary value c so that we assemble a block of length exactly n. At any rate, our construction uses 5 n + 1 tiles and constructs a 2 × n block in the same fashion as the general k ×n assembly. Additionally, our glue function assures us that for the (i, j ) digit of b, the corresponding position in the top row of our construction can only be tiled with either a hairpin, seed column, or normal tile with north edge glue equal to bij . To complete the N × N square we need to create a fixed length binary counter that is initialized to our given binary number b and grows north, incrementing row by row, until 2n is reached. The addition of a constant number of stairstep tiles can then finish off the square, as shown in [6]. The fixed length binary counter can also be implemented in a constant number of tiles as is done in [1, 6]. Alternatively, we can use our construction from Section 3 for building a k × N block for k = log N . This yields a binary counter consisting of eight tiles plus log N seed column tiles. However, since we already have a "seed column" via our 2 × n seed block, we can omit the seed column tiles. Thus the total number of tiles used for the assembly of an N × N  square is only a constant greater than the ( log N ) tiles used to build the 2 × n seed block.

to the multiple temperature model, the unique shape model, and the q -tile model. For the flexible glue model,  we modify their argument to get a ( log N ) lower bound from Kolmogorov complexity, making our construction in Theorem 5.1 tight. Theorem 6.1. The tile complexity of self-assembling o N × N squares is ( lolg lg NN ) for the q -tile model, og the multiple temperature model, and the unique shape model. Proof. We note that any tile system with tileset size of n in the above models can be represented in O(n log n) bits assuming the maximal temperature of the system is bounded by a constant. Additionally, for each model there exists a constant size Turing machine that takes as input a tile system and outputs the maximum length of the shape produced by the given tile system under the corresponding model. As discussed in [6], Kolmogorov o complexity thus dictates a ( lolg lg NN ) lower bound on og the tile complexity of assembling an N × N square for almost all N . We omit the details of this proof in this version. Theorem 6.2. The ti complexity of self-assembling le N × N squares is ( log N ) under the flexible glue model for almost al l N .

Proof. First we note that there exists a Turing machine of constant size that takes as input a tile system and outputs the maximal dimension of the produced terminal supertile, assuming there is one, under the flexible glue model. Thus, if a tile set that assembles an N × N square is given as input, the output is N . Therefore, by Kolmogorov complexity, the sum of the size of the Turing machine plus the length of the 6 Kolmogorov Lower Bounds corresponding binary representation of the tile set must Rothemund and Winfree [6] have shown that Kol- have the lower bound (log N ) for almost all N . And o mogorov complexity dictates a ( lolg lg NN ) lower bound since the machine's size is fixed, this is a bound on the og on the tile complexity for self-assembling N × N squares number of binary digits required to represent the tile for almost all N . We show that their proof generalizes set. We can represent a flexible glue tile set of size n

886


tiles and temperature  bounded by a constant with f (n) = 4n log(4n) + (4n)2 log  = O(n2 ) bits as follows. For each tile we have four sides for which we must denote a glue. We need at most 4n distinct glues, and for each glue we need to store the bonding strength with every other glue in the model. This can be stored in a 4n × 4n matrix. Now for a given N , let  (N ) be the cardinality of the minimum tile set for assembling an N × N square under our flexible model. Then for almost all N there exist constants C1 , C2 , and C3 such that l C1 log N  f ( (N ))  C2  (N )2  C3  (N ) og N .  So  (N ) = ( log N ) for almost all N. Shap e Verification in the Unique Shape Mo del Given a shape, Adleman et al. [2] studied the problem of finding the minimum set of tiles which uniquely produces a supertile A, such that A has the given shape. To show that the decision version of the problem is in NP, they gave an algorithm to verify whether a given set of tiles uniquely produces a supertile which has the given shape. Note that they insist that the tile system should assemble into a unique terminal supertile. Under the unique shape model, we say that a tile system T uniquely produces a shape if all terminal supertiles of T (henceforth called T erm(T)) have the given shape, and no larger supertiles are produced. We note that this definition automatically implies that if a tile system uniquely produces a shape, then T erm(T) is non-empty. Definition 7.1. Unq-Shape( T, S , G,  , W ) is the u problem of verifying whether the tile system T, S , G,  niquely assembles into the shape W under the unique shape model. Theorem 7.1. Unq-Shape is co-NP-complete. It remains NP-hard even when the glue function is restricted such that G(,  ) = 0 for ,    with  =  and the temperature  = 2. We begin by showing that Unq-Shape is in co-NP, and then show that it is NP-hard (and thus co-NPhard). Lemma 7.1. Unq-Shape is in co-NP. Proof. For this, we need to show that if an instance of Unq-Shape is false, i.e. if the tile system in the instance does not assemble uniquely into the given shape, then there is a short proof of the fact. By definition, a given tile system T = T, S , G,  does not uniquely assemble into the given shape W iff one of the following occurs: 7

1. A terminal supertile of a shape different from W can be assembled. In this case, T erm(T) contains a supertile A with a shape different from W . Then A, along with the order in which the tiles join to assemble A would suffice as a proof. In order to check this proof, we first verify that A can indeed be assembled by adding the tiles in the order specified. Then we check that A is a terminal supertile by simply testing whether any tile is attachable at any of the empty sites adjacent to it. 2. A supertile C of size one larger than the given shape can be assembled. Note that this supertile need not be terminal. We note that if any such C exists, then there exists such a C with size one larger than W . In this case, C , along with the order in which tiles are added to assemble C would suffice as a proof. We verify this proof by checking that C can indeed be assembled by adding the tiles in the order specified, and that the size of the supertile is larger than the shape. In both cases, the size of the proof is linear in the instance size, and the verification algorithm runs in time quadratic in the instance size. Lemma 7.2. Unq-Shape is NP-hard. Proof. The proof uses a construction proposed by LaBean and Lagoudakis in [4] to solve SAT by exploiting the massive parallelism possible in DNA operations to emulate a non-deterministic device that solves SAT. We reduce 3-SAT to Unq-Shape using their construction. Given any instance of 3-SAT with m clauses and n different variables, we construct an instance of Unq-Shape, ( T, S , G, 2 , W ) with |T | = O(m + n), and the size of the shape W of order O(mn). There is a one-to-one correspondence between variable assignments and the supertiles in T erm(T), i.e every possible assignment to variables is associated with a distinct terminal supertile. If the assignment satisfies the formula, the terminal supertile associated with it has the shape of an (n + 2) × (m + 3) rectangle; else the shape associated with the assignment has the shape of an (n + 2) × (m + 3) rectangle with its top-right corner missing. Thus if the tile system uniquely assembles into a rectangle with its top-right corner missing, then no assignment satisfies the 3-SAT formula; on the other hand, if the tile system does not uniquely assemble into this corner-missing rectangle, then there is at least one assignment which satisfies the 3-SAT formula. Thus a polynomial time algorithm to decide Unq-Shape will result in a polynomial time algorithm for 3-SAT. The idea of the reduction is to have the bottom row of the rectangle encode the clauses, the left column

887


encode the variables, and let the second column correspond to a possible assignment of values to the variables. The rest of the assembly evaluates the formula at that assignment, and the row below the top row represents which of the clauses get satisfied by the assignment. A tile gets added to the top-right corner iff the assignment satisfies all the clauses. Figures 6(a) and (b) show two of the terminal supertiles of a tile system corresponding to the formula (x1 + x2 + x3 )(x1 + x2 + x3 )(x1 + x2 + x3 ). Each small square represents a tile. The label of an edge corresponds to the binding glue on the edge. All glues are of strength 1, except for the glues corresponding to those edge labels which are accompanied by a black circle ­ these are of strength 2. The glues on edges without any labels do not match. The temperature  = 2. Note that since the assignment x1 = 0, x2 = 1, x3 = 1 satisfies the formula, a tile gets attached in the top-right corner in Figure 6(a). Also, since the assignment x1 = 0, x2 = 1, x3 = 0 does not satisfy the formula, the rectangle in Figure 6(b) is missing a tile in the top-right corner. Take any 3-SAT formula with n variables, x1 , x2 , . . . , xn , appearing in m clauses C1 , C2 , . . . , Cm . We define a tile system, T, S , G, 2 (the temperature is fixed at 2), corresponding to it as follows: · Seed and auxiliary tiles.
G x1 R BL G R

· Assignment tiles. Each variable can take one of two values, so there is a total of 2n tiles for assigning values. Tile A0i and A1i can attach with strength 2 to the east side of xi , and have glue 0xi and 1xi respectively on their east side to attach the computation tiles. Any supertile with a full second column has either A0i or A1i attached to the east to ith variable tile, representing the variable assignment which will be evaluated by the terminal supertile resulting from this supertile.
G xi

0

0xi

G xi

1

1xi

· Computation tiles. For any clause-variable pair, Cj and xi , three cases arise: 1. xi is present as a positive literal: In this case, tiles (a) and (b) shown below are present. 2. xi is present as a negative literal: In this case, tiles (c) and (d) shown below are present. 3. xi is not present in clause Cj : For this case, tiles (b) and (c) shown below are present.
OK 1xi Cj 1xi 0xi Cj 0xi 1xi OK 1xi 0xi OK 0xi 1xi OK 1xi OK OK 0xi OK 0xi OK

Cj
Cj (a)

Cj
Cj (b)

Cj
Cj (c)

Cj
Cj (d)

Propagating OK

*
Seed

*
GTL

TL BL G G

*

C1 BR
G G

*

0/1x i

*
R

TL G

*

Tr

Top-left tile

Auxiliary tiles

· Variable tiles. There are n tiles encoding the variables for the first column. These tiles attach on top of the seed tile to give the first column. The east side of tiles xi provides a strength-2 glue to the assignment tiles.
G TL

In all the three cases, there are 2n tiles to propagate OK upwards. All these tiles have OK on their north and south sides; n of the tiles have 1xi on their east and west sides, while the remaining n tiles have 0xi on their east and west sides. · Final check tiles.
Tr

T
OK

Tr

Tr

F
Cj

Fa

Fa

F
Cj

Fa

Fa

F
OK

Fa

Tr SAT R

xn
G xn

G xn

G x i+1

xi
G xi
1<=i<n

G xi

The tile labeled S AT attaches to the top-right corner only if the formula is satisfied by the assignment in the second column of the supertile. 8 Further Research Directions · The Kolmogorov lower bounds for the assembly of N × N squares do not apply if the temperature of the system is a large exponential function of N . If this is allowed, is it possible to reduce the tile complexity of assembling N × N squares? · Are there any shapes in which the q -tile model or the unique shape model can be used to reduce the size of the minimum tileset, or can it be shown that

· Clause tiles. There are m tiles encoding the clauses for the bottom row. These tiles attach to the right of the seed tile to give the bottom row. The north side of Cj has a strength-1 glue Cj to attach the computation tiles.
Cj Cj G Cm C j+1 G Cm G

Cj

Cm BR
G

1<=j<m

888


*
G TL

G TL G x3 G x2 G x1 BL G

T 1 1 0 *

Tr

T
OK

Tr

T
OK

Tr

T
OK

Tr SAT R

*
G TL

G TL G x3 G x2 G x1 BL G

T 0 1 0 *

Tr

T
OK

Tr

F
C2

Fa

F
OK

Fa R

x3
G x3

1x3

OK 1x3 OK 1x3 OK 1x3 *
OK C2 OK 1x 2 OK 1x2 OK R

x3
G x3

0x3

OK 0x3 C2
OK C2

0x3

OK 0x3 *
OK R

x2
G x2

1x2 OK 1x2 C1 0x1 C1 G

C2
C2

*
R

x2
G x2

1x2 OK 1x2 C1 0x1

C2
C2

1x2 OK 1x2 OK

*
R

x1
G x1

C1
C1

0x1

C2 0x1 OK 0x1
C2 C3 C3 G

*
R

x1
G x1

C1
C1

0x1

C2 0x1 OK 0x1
C2 C3 C3 G

*
R

*

C1

C2 G (a)

C2

C3

BR G

*

*

C1 G

C1

C2 G (b)

C2

C3

BR G

*

Figure 6: Two terminal supertiles for the formula (x1 + x2 + x3 )(x1 + x2 + x3 )(x1 + x2 + x3 ). these models are equivalent to the standard model with respect to tile complexity? · In the q -tile model we put a bound of size q on the size of addable supertiles. If we remove this bound, the set of supertiles that can be added is possibly infinite and our proof of the Kolmogorov lower bound therefore no longer holds. For this -tile , or two-handed assembly model, what can we say about the lower bound for assembling N × N squares? · Can it help to use more than two temperatures in the multiple temperature model? If so, does the temperature need only be monotonically increasing or can it help to raise and lower the temperature? · Our construction for the flexible glue model shows how to create rectangles with width at least logarithmic in their length, but does not work for thinner rectangles. However, for thick rectangles with less than logarithmic width we can combine the flexible glue model with the 2-temperature model  to attain the tight ( log N ) tile complexity. Is it possible to assemble such a rectangle using only the flexible glue model?
[2] L. Adleman, Q. Cheng, A. Goel, M. Huang, D. Kempe, P. Espanes, and P. Rothemund. Combinatorial optimization problems in self-assembly. In Proceedings of the 34nd Annual ACM Symposium on Theory of Computing, pages 23­32, 2002. [3] Q. Cheng and P. M. de Espanes. Resolving two open problems in the self-assembly of squares. Technical Report 793, University of Southern California, June 2003. [4] M. Lagoudakis and T. LaBean. 2D DNA Self-assembly for Satisfiability. In Proceedings of the 5th DIMACS Workshop on DNA Based Computers held at MIT, Cambridge, 1999. [5] M. Li and P. Vitanyi. An Introduction to Kolmogorov Complexity and Its Applications (Second Edition). Springer Verlag, New York, 1997. [6] P. Rothemund and E. Winfree. The program-size complexity of self-assembled squares. In Proceedings of the 32nd Annual ACM Symposium on Theory of Computing, pages 459­468, 2000. [7] H. Wang. Proving theorems by pattern recognition. Bel l Systems Technical Journal, 40:1­42, 1961. [8] E. Winfree. Algorithmic Self-Assembly of DNA. PhD thesis, California Institute of Technology, Pasadena, 1998.

References
[1] L. Adleman, Q. Cheng, A. Goel, and M. Huang. Running time and program size for self-assembled squares. In Proceedings of the 33nd Annual ACM Symposium on Theory of Computing, pages 740­748, 2001.

889