One of the enduring questions in biology is how eukaryotic cells arose from prokaryotic ancestors at least 2 billion years ago. Besides differences in genome organization, eukaryotic animals, plants, and fungi possess a much higher degree of cellular compartmentation, in the form of membrane-bound organelles, than their distant bacterial and archaeal cousins. But how did such a plethora of cellular domains, each with a discrete role in metabolism, evolve?
To the extent that science proves anything, it answered the question for two eukaryotic organelles a long time ago. Mitochondria and chloroplasts evolved from endosymbiotic associations between an ancestral host cell and smaller prokaryotic partners. In the case of chloroplasts, the symbiont was a photosynthetic cyanobacterium; for mitochondria, it was most likely an α-proteobacterium.
The cytoplasm of eukaryotic cells is like chicken soup: it's chock full of organelles suspended like chunks of assorted vegetables and noodles in cytosolic broth. The broth also contains filaments of various dimensions that collectively comprise the cell's cytoskeleton. Like the bones of a large animal, the cytoskeleton provides a structural framework lending shape to cells and against which enzymatic 'muscles' work to elicit movement. That's how amoebae migrate, algae swim, stem cells divide, and cytoplasm streams relentlessly up, down, and across plant cells.
While the cytoskeleton is as much a hallmark of eukaryoticity as any mitochondrion or chloroplast, the origin of its filaments in deep time is more mysterious. Biologists assumed that genes for cytoskeletal proteins arose from prokaryotic precursors, but evidence in favor of the hypothesis was scarce, until recently.
Tubulin First on Stage
Microtubules comprise one component of the cytoskeleton responsible for a variety of movements including mitosis and meiosis. The 25 nm tubes consist of dimeric α- and β-tubulin subunits that share about 40 percent sequence homology. Another form, γ-tubulin, functions in microtubule formation.
But where did microtubules come from? It now appears that tubulins share a common ancestor with a protein called FtsZ, a key player in bacterial cell division.1 FtsZ is also present in plants, where it functions in chloroplast division,2 and a similar protein associates with mitochondria, at least in one alga.3 FtsZ polymerizes into filaments in the test tube in a process dependent on GTP. The same nucleotide is required for tubulin assembly into microtubules.1
Tubulins and FtsZ are clearly related, judging from similarities in three-dimensional structure. And although the proteins share only about 15 percent amino acid sequence identity overall, they're much more similar at the local level, particularly at the domain responsible for binding and cleaving GTP.4,5
Actin Into the Fold
Like the tubulins, actin, another essential component of the eukaryotic cytoskeleton, is a globular protein that binds nucleotide, in this case ATP. As actin monomers polymerize into 6-nm-wide microfilaments consisting of two helically wound protofilaments, the ATP, situated in a deep enzymatic cleft between two halves of the protein, hydrolyzes to ADP and inorganic phosphate.
It turns out that actin shares its ATPase domain with a family of proteins including hexokinase, the enzymatic kick starter of glycolysis, and several bacterial proteins. One of them is called MreB, a protein essential for generating or maintaining the rod shape of many bacteria. By examining structural similarities between eukaryotic actin and MreB from Thermotoga maritima, a research team at the Medical Research Council in Cambridge, England, recently concluded that the two proteins are more closely related to each other than to other members of the family and undoubtedly share a common ancestor.6
The group showed that the three-dimensional shapes of actin and MreB are so similar they can be superimposed. The analogy with tubulin/FtsZ goes even further. Both proteins share considerable amino acid homology at several key sequences surrounding the ATP binding site, again situated deep in a cleft between two halves of the folded polypeptide chain.
Under the right conditions, MreB polymerizes into protofilaments that pair up lengthwise. The protein subunits are spaced about the same distance apart along the filaments as in polymeric actin, but MreB double filaments aren't nearly as helical.
The similarity between MreB and actin doesn't stop at structure and sequence. In a paper published earlier in 2001, a research group led by Jeffrey Errington at the University of Oxford, U.K., visualized MreB in the rod-shaped cells of Bacillus subtilis using fluorescence and electron microscopy.7 MreB forms filamentous bands that encircle the cell in low helices, like reinforcing hoops. In an essay accompanying the Cambridge group's article, Duke University cell biologist Harold Erickson calculated that each band contains 10 protofilaments.8
When Errington's team genetically deprived cells of functional MreB, they became spherical. A search of genome databases showed that MreB is present in bacteria with nonspherical shapes, including rods. It's absent in spherical cocci. In other words, MreB has a cytoskeletal function. "I think it is quite convincing that MreB is the actin progenitor," says Erickson. "A key step, still unknown, going from bacteria to vertebrates is to develop a mechanism to make the double-helical actin filament from the single MreB protofilament structure."
More Acts to Follow
The story doesn't end with MreB; there's more to find out. Scientists want to know if MreB, like FtsZ, is also present in eukaryotes in association with mitochondria and chloroplasts. According to Katherine Osteryoung, a plant biologist at Michigan State University in East Lansing who identified two FtsZ genes in the mustard plant Arabidopsis,2 "there's no obvious indication of MreB in plants that I've found or am aware of."
Actin normally functions along with the motor enzyme myosin to produce cellular motion, while microtubules utilize two other motor families called dynein and kinesin-related proteins. Researchers now wonder whether MreB and FtsZ work in conjunction with bacterial motors. According to Erickson, "none have been turned up in genetic screens for cell division (or other activities), and none have been identified by sequence gazing. My bet is that kinesin and myosin evolved in eukaryotes, after the evolution of microtubules and eukaryotic actin filaments."
Still, Osteryoung is pleased with the latest results: "To someone interested in these issues, establishment of the prokaryotic origins of two major eukaryotic cytoskeletal proteins is enormously satisfying. I look forward to the day when evolutionary intermediates... from MreB to actin and FtsZ to tubulin, perhaps awaiting discovery in some obscure and primitive eukaryote, will more fully reveal the evolutionary steps by which key components of the eukaryotic cytoskeleton acquired their present-day structures and functions."
Barry A. Palevitz is a contributing editor for The Scientist.
1. H.P. Erickson, "FtsZ, a tubulin homologue in prokaryotic cell division," Trends in Cell Biology, 7:362-7, 1997.
2. K.W. Osteryoung, "Organelle fission: Crossing the evolutionary divide," Plant Physiology, 123:1213-6, 2000.
3. P.L. Beech et al., "Mitochondrial FtsZ in a chromophyte alga," Science, 287:1276-9, 2000.
4. E. Nogales et al., "Structure of the alpha-beta tubulin dimer by electron crystallography," Nature, 391:199-203, 1998.
5. J. Lowe, L.A. Amos, "Crystal structure of the bacterial cell-division protein FtsZ," Nature, 391:203-6, 1998.
6. F. Van den Ent et al., "Prokaryotic origin of the actin cytoskeleton," Nature, 413:39-44, Sept. 2, 2001.
7. L.J.F. Jones et al., "Control of cell shape in bacteria: helical, actin-like filaments in Bacillus subtilis," Cell, 104:913-22, 2001.
8. H.P. Erickson, "Evolution in bacteria," Nature, 413:30, Sept. 6, 2001.
Dinoflagellate chloroplast genes are unique in that each gene is on a separate minicircular chromosome. To understand the origin and evolution of this exceptional genomic organization we completely sequenced chloroplast psbA and 23S rRNA gene minicircles from four dinoflagellates: three closely related Heterocapsa species (H. pygmaea, H. rotundata, and H. niei) and the very distantly related Amphidinium carterae. We also completely sequenced a Protoceratium reticulatum minicircle with a 23S rRNA gene of novel structure. Comparison of these minicircles with those previously sequenced from H. triquetra and A. operculatum shows that in addition to the single gene all have noncoding regions of approximately a kilobase, which are likely to include a replication origin, promoter, and perhaps segregation sequences. The noncoding regions always have a high potential for folding into hairpins and loops. In all six dinoflagellate strains for which multiple minicircles are fully sequenced, parts of the noncoding regions, designated cores, are almost identical between the psbA and 23S rRNA minicircles, but the remainder is very different. There are two, three, or four cores per circle, sometimes highly related in sequence, but no sequence identity is detectable between cores of different species, even within one genus. This contrast between very high core conservation within a species, but none among species, indicates that cores are diverging relatively rapidly in a concerted manner. This is the first well-established case of concerted evolution of noncoding regions on numerous separate chromosomes. It differs from concerted evolution among tandemly repeated spacers between rRNA genes, and that of inverted repeats in plant chloroplast genomes, in involving only the noncoding DNA cores. We present two models for the origin of chloroplast gene minicircles in dinoflagellates from a typical ancestral multigenic chloroplast genome. 
Both involve substantial genomic reduction and gene transfer to the nucleus. One assumes differential gene deletion within a multicopy population of the resulting oligogenic circles. The other postulates active transposition of putative replicon origins and formation of minicircles by homologous recombination between them.
The chloroplast genomes of algae and land plants are circular molecules, usually a single large circle of approximately 120–200 kbp bearing about 100–250 genes (Palmer 1985; Reith 1995; Sugiura 1995; Turmel, Otis, and Lemieux 1999). In marked contrast to this generally prevailing genomic organization, the chloroplast genes so far sequenced from the peridinean dinoflagellates Heterocapsa triquetra (Zhang, Green, and Cavalier-Smith 1999) and Amphidinium operculatum (Barbrook and Howe 2000) are all found on 2–3 kbp minicircles. Each minicircle contains a chloroplast gene (coding region) and a noncoding region in which two or three parts are highly conserved among minicircles within each species. The noncoding region of these minute chromosomes almost certainly includes a replicon origin and the promoter of the gene, though neither has been functionally characterized (Zhang, Green, and Cavalier-Smith 1999; Zhang, Cavalier-Smith, and Green 2001). It might also include sequences important for DNA segregation, but such a function might not be necessary if the minicircle copy number is as high as the 100–1,000 estimated for the analogous mitochondrial single-gene minicircles of dicyemid mesozoa (Watanabe et al. 1999).
Chloroplast and mitochondrial genomes are simplified relics of the much larger cellular genomes of their cyanobacterial and α-proteobacterial ancestors (Gray 1999 ). The origin of minicircles in dinoflagellates and dicyemids is the most radical evolutionary change in their genomic organization thus far established. The chloroplast minicircles of dinoflagellates and the mitochondrial minicircles of dicyemids are the only known cases of the fragmentation of genomes into completely separate unigenic chromosomes in nature. This makes their origin and maintenance of special evolutionary interest. In order to better understand both processes we have fully sequenced nine further chloroplast minicircular chromosomes from five diverse species of photosynthetic dinoflagellates.
Minicircular chloroplast genes are probably widely present among dinoflagellates, as was first shown by DNA hybridization using chloroplast genes psbA and 23S rRNA. This method revealed minicircle-sized bands on electrophoretic gels of native DNA from a number of dinoflagellate species in addition to those from which complete sequences were obtained: Heterocapsa pygmaea, H. rotundata, and Amphidinium carterae (Zhang, Green, and Cavalier-Smith 1999). After obtaining similar evidence for several different species, we amplified the psbA and 23S rRNA minicircles by PCR from the genomic DNA of H. niei, H. pygmaea, H. rotundata, and A. carterae, and the 23S rRNA minicircle only from Protoceratium reticulatum, and report their complete sequences here.
We show that the noncoding regions of psbA and 23S rRNA minicircles in these dinoflagellate species are very different from those of H. triquetra and A. operculatum. Sequence comparison indicates that the noncoding regions of both psbA and 23S rRNA minicircles consist of two to four core regions (or cores), very conserved in each dinoflagellate, embedded within variable regions. We discuss the evolution and possible functional significance of these organizational differences among minicircular chloroplast chromosomes for replication or segregation. Although extremely conserved within each species, the cores are very divergent among species. This is typical of concerted evolution in which evolutionary divergence is also accompanied by a molecular process homogenizing all members of a multigene family (Elder and Turner 1995 ; Liao 2000 ). Concerted evolution was first described for tandemly repeated ribosomal RNA genes in Xenopus (Brown, Wensink, and Jordan 1972 ) and has been widely studied for multiple gene families in eukaryotes (see Elder and Turner 1995 for reviews). Concerted evolution is also known for several dispersed repeated genes and their flanking noncoding sequences (e.g., Liao 2000 ; Meinersmann and Hiett 2000 ). However, the concerted evolution of the dinoflagellate core regions appears to be the first example of concerted evolution occurring directly between regions of noncoding DNA flanking nonhomologous genes. As such, it is of considerable evolutionary interest and may also be functionally significant.
The origin of chloroplast gene minicircles in dinoflagellates was a remarkable and unprecedented event in the evolution of chloroplast genomes. We shall present evidence from their broad phylogenetic distribution among dinoflagellates that minicircles originated once only, the initial fragmentation of the chloroplast genome having occurred relatively early in peridinean evolution. We discuss how it may have happened and present two alternative models for its molecular mechanism.
Materials and Methods
Total DNAs were extracted from the dinoflagellates H. pygmaea (CCMP 1490), H. niei (CCMP 447), H. rotundata (NEPCC D680), and P. reticulatum (NEPCC D535) as described for H. triquetra (Zhang, Green, and Cavalier-Smith 1999). DNA was extracted from A. carterae (CCMP 1314) by the same method but without glass-bead vortexing.
Specific dinoflagellate chloroplast 23S rRNA and psbA primers were designed based on the H. triquetra 23S rRNA and psbA sequences; degenerate primers were based on all available chloroplast 23S rRNA and psbA gene sequences, as described elsewhere (Zhang, Green, and Cavalier-Smith 2000 ). The specific primer pair 23S1-23S4 and the degenerate primers D23S1-D23S2 (fig. 1 and table 1 ) were used to amplify the noncoding region of minicircular 23S rRNA genes from H. pygmaea, H. niei, H. rotundata, A. carterae, and P. reticulatum. Primer pairs bA1-bA5 or DbA1-DbA5 (fig. 1 and table 1 ) were used to amplify the noncoding region of the psbA minicircles. PCR reactions were carried out for 35 cycles: 94°C for 30 s, 55°C for 30 s, followed by 2 min at 72°C in a GeneAmp PCR system 9600 (Perkin-Elmer). The reaction mixture (50 μl) contained 0.2 mM dNTP, 1× PCR buffer, 0.1–1.0 μg template DNA, 50–200 pmol primer, 2.0 or 2.5 mM MgCl2, and 1.5–2.5 units Taq polymerase (Sigma). Products were purified from low-melting gels or using a purification kit (Amersham-Pharmacia Biotech) and used for sequencing.
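The cycling program and primer amounts above can be turned into a quick sanity check. The following is a minimal sketch rather than part of the published protocol: it uses only the step times, cycle count, and reaction volume stated in the text, ignores ramping between temperatures, and the function names are illustrative.

```python
# Values taken from the protocol text; ramp times between steps are ignored.
STEPS_S = {"denature_94C": 30, "anneal_55C": 30, "extend_72C": 120}
CYCLES = 35
REACTION_UL = 50.0


def hold_time_minutes(steps=STEPS_S, cycles=CYCLES):
    """Total programmed hold time in minutes (excludes ramping)."""
    return cycles * sum(steps.values()) / 60


def primer_concentration_uM(pmol, volume_ul=REACTION_UL):
    """Primer concentration; pmol per microlitre is numerically equal to micromolar."""
    return pmol / volume_ul


print(hold_time_minutes())           # 105.0
print(primer_concentration_uM(50))   # 1.0
print(primer_concentration_uM(200))  # 4.0
```

Under these assumptions the 35 cycles take 105 min of hold time, and the stated 50–200 pmol of primer in 50 μl corresponds to 1–4 μM.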
Sequencing reactions were done in a Perkin-Elmer GeneAmp 9600 using the ABI cycle sequencing protocol: 94°C for 5 s, 50°C for 5 s, 60°C for 4 min for 25 cycles. Each reaction contained 2–3 μl Bigdye, 20–30 ng DNA (purified PCR product), 3–5 pmol primer, and distilled water to make up to 10 μl. The sequencing samples were precipitated by adding 1/10 volume sodium acetate (pH 5.2) and 2 volumes 95% ethanol, quenched on ice for 10 min, centrifuged for 20 min, air dried, and analyzed by an ABI 377 automatic sequencer. Sequences were assembled and edited using Staden software (http://www.mrc-lmb.cam.ac.uk/pubseq). Sequences were aligned by Clustal W (Thompson, Higgins, and Gibson 1994 ) and manually improved using GDE (Smith 1994 ).
The noncoding region of the psbA minicircle was amplified by inverse PCR from H. niei, H. pygmaea, H. rotundata, and A. carterae, using outward-directed primer pairs as shown for H. triquetra in figure 1 . This yielded one product from H. rotundata, using primers bA1-bA5 and one product from A. carterae, using degenerate primers DbA1-DbA5 (fig. 1 and table 1 ). However, amplification of H. pygmaea genomic DNA using primer pair bA1-bA5 gave two products of 1.7 and 1.9 kbp. Two products, slightly different in size, were also obtained from H. niei DNA. No PCR product was obtained from P. reticulatum using primers bA1-bA5 or DbA1-DbA5, even though DNA blots hybridized with a psbA probe had suggested that psbA minicircles are present in P. reticulatum (Zhang, Green, and Cavalier-Smith 1999 ).
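The geometry of inverse PCR on a circular template can be illustrated with a short calculation. This is a sketch under stated assumptions: the circle length and primer coordinates below are hypothetical and are not the actual positions of bA1, bA5, or any other primer in table 1.

```python
def inverse_pcr_product_length(circle_len, fwd_5prime, rev_5prime):
    """Product length for a primer pair on a circular template.

    Positions are 0-based coordinates of each primer's 5' end. The forward
    primer extends toward increasing coordinates and the reverse primer
    toward decreasing ones, so the product runs from fwd_5prime around the
    circle (wrapping past the origin if necessary) to rev_5prime.
    """
    return (rev_5prime - fwd_5prime) % circle_len + 1


# Hypothetical 2,200-bp circle with outward-directed primers inside the gene:
# forward 5' end at position 900, reverse 5' end at position 100.
print(inverse_pcr_product_length(2200, 900, 100))  # 1401

# The same pair of sites used as inward-directed primers spans the gene only.
print(inverse_pcr_product_length(2200, 100, 900))  # 801
```

With outward-directed primers the product covers the rest of the circle, including the whole noncoding region, which is why two circles that differ only in their noncoding regions give two inverse PCR products but a single coding-region product.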
Complete psbA minicircles were assembled from these sequences and the overlapping coding sequences previously determined (Zhang, Green, and Cavalier-Smith 2000). The size of the psbA minicircles ranges from 2,195 bp in H. pygmaea to 2,311 bp in A. carterae (fig. 1 and table 2). In H. pygmaea, the noncoding region of the large psbA circle is 269 bp longer than that of the small psbA circle because of insertions, mainly in the D1 and D4 segments (figs. 1 and 3, and table 2); curiously, the larger circle has shorter D2 and D3 variable regions than the smaller one. These results explain the labeling of doublet bands on genomic DNA blots hybridized with psbA probes (Zhang, Green, and Cavalier-Smith 1999). Because amplification of the coding region of H. niei gave only one product, whereas amplification of the noncoding region gave two products, H. niei probably also has two types of psbA minicircles differing only in the size of the noncoding region, consistent with the labeling of doublet bands on genomic DNA blots hybridized with psbA probes (data not shown).
23S rRNA Minicircles
The noncoding region of the 23S rRNA minicircle was amplified by inverse PCR from genomic DNA of H. niei, H. pygmaea, H. rotundata, A. carterae, and P. reticulatum, using the outward primer pair 23S1-23S4 or the degenerate primers D23S1-D23S2 (fig. 1 and table 1 ). PCR reactions yielded one product from H. pygmaea and H. rotundata, using primers 23S1-23S4 and one product from A. carterae, using primers D23S1-D23S2. However, H. niei amplified using 23S1-23S4 gave two products differing in size by about 0.3 kbp. In P. reticulatum, PCR amplification using D23S1-D23S2 also yielded two products, differing by approximately 0.5 kbp. PCR amplification of the coding region using inward-directed primer pairs gave only one product from H. niei and P. reticulatum (Zhang, Green, and Cavalier-Smith 2000 ), indicating that both species have two dissimilar-sized 23S rRNA minicircles: identical in the gene but different in the noncoding region. The 23S rRNA gene in all these dinoflagellates, except P. reticulatum (see subsequently), has the same orientation and organization as in H. triquetra, other algae, and land plants. The smaller PCR products from H. niei and P. reticulatum were cloned and sequenced.
Circular contigs were generated when the sequences of the 23S rRNA genes and associated noncoding regions of each dinoflagellate were assembled (fig. 1 and table 2 ). In general, 23S rRNA minicircles are larger than psbA minicircles (fig. 1 ). The size of the 23S rRNA minicircles also varies more among species, from 2,651 bp in A. carterae to 3,772 bp in P. reticulatum, the biggest thus far sequenced from a dinoflagellate.
Unusual Gene Organization in P. reticulatum 23S rRNA Minicircles
The RNA-specifying region of the P. reticulatum 23S rRNA minicircle is highly similar to that of H. triquetra, with >88% identity, but its gene organization differs strikingly (fig. 2 ). Sequence alignment with 23S rRNA genes of various organisms indicated that the P. reticulatum 23S rRNA gene consists of two fragments (the light gray region and the hatched region in fig. 2 ) that have interchanged their positions without changing the orientation of either part of the gene. This is surprising because all other chloroplast 23S rRNA genes from dinoflagellates and other organisms have the same organization and orientation. However, there are examples of rRNA gene fragmentation in mitochondria, e.g., Chlamydomonas (Boer and Gray 1998 ). Moreover, nuclear 28S rRNA is frequently posttranscriptionally fragmented into two or more pieces (six in the trypanosomatid Crithidia and even more in Euglena; Smallman, Schnare, and Gray 1996 ). Their large ribosomal subunits can be assembled into a functional unit using rRNA fragments, suggesting that the rearranged P. reticulatum minicircular 23S rRNA gene could also be functional.
Assembling the sequences of several clones revealed that various indels are present in the noncoding regions of the 23S rRNA minicircles in P. reticulatum. This implies that heterogeneous molecules might be present in the PCR product of the noncoding region. Heterogeneous 23S rRNA minicircles resulting from indels in the noncoding region were also observed in different clones in H. triquetra, despite PCR amplification of the noncoding region yielding only one product (Zhang, Green, and Cavalier-Smith 1999 ). Probably, each minicircular chloroplast chromosome of P. reticulatum and H. triquetra is a population of minicircles with a homogeneous coding region but heterogeneous noncoding regions.
Extremely Conserved Cores in the Noncoding Regions
In each dinoflagellate, the noncoding region is very conserved and readily alignable between the psbA and 23S rRNA minicircles. Some motifs within these regions are almost identical in both and are called core regions or cores. For convenience, they are named after whatever single nucleotide run occurs near their center, e.g., the 9G and 9A cores making up the 9G-9A-9G tripartite noncoding region of all nine H. triquetra minicircles (fig. 3 ; Zhang, Green, and Cavalier-Smith 1999 ). The sequences of the noncoding regions of different species are apparently unrelated and cannot be aligned (fig. 3b ).
The noncoding region of H. pygmaea has three identical 94-bp cores (5G) with a run of 5 G's near the center (figs. 1 and 3a). In H. rotundata, the psbA minicircle has three different cores of 111 bp (6G), 194 bp (6T), and 95 bp (6T′), the 95-bp 6T′ core being identical to the first 95 bp of the 6T core, i.e., a partial duplication (fig. 3a). However, the 23S rRNA circle has two complete 6T cores, separated by 20 bp, as well as the 95-bp 6T′ core (fig. 1). Thus, the 23S rRNA circle has a quadripartite and the psbA circle a tripartite noncoding region. In H. niei, both psbA and 23S rRNA circles have a quadripartite noncoding region, comprising one 169-bp core with a run of 6 T's at its center and three identical 90-bp cores with a central run of 7 G's (figs. 1 and 3). Interestingly, the single 6T core on the psbA circle lies between two of the 7G cores, but on the 23S rRNA circle all three 7G cores are together (fig. 1). There is no obvious sequence relationship between the cores of different species.
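Cores are defined operationally as stretches that are almost identical between the noncoding regions of two circles from one species. The sketch below illustrates that idea in its simplest form by finding the longest exactly shared block between two sequences. It is an assumption-laden simplification: the sequences are invented toy strings, exact matching stands in for the alignment-based comparison (Clustal W with manual editing) used in this study, and the function name is illustrative.

```python
def longest_shared_block(a, b):
    """Longest exact substring common to a and b (dynamic programming)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at the previous base pair.
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]


# Toy noncoding regions (hypothetical): one shared core, unrelated flanks.
core = "ACGTGGGGGTCA"
circle_1 = "TTATTC" + core + "CCTCCT"
circle_2 = "GAGAGG" + core + "ATATAT"
print(longest_shared_block(circle_1, circle_2))  # ACGTGGGGGTCA
```

Because the search ignores position, it would also recover a core whose order differs between circles, as with the 6T and 7G cores of H. niei.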
The psbA and 23S rRNA circles in A. carterae have a bipartite noncoding region, completely different from the tripartite or quadripartite noncoding regions of the four Heterocapsa species. It consists of a large core of 142 bp and a small one of 48 bp (figs. 1 and 3). Sequence comparison of the psbA minicircle of A. carterae (CCMP 1314) with the psbA minicircle of A. operculatum (Barbrook and Howe 2000) revealed that they are identical. Sequence alignments showed that the noncoding regions of the psbA and 23S rRNA circles of A. carterae are highly related to those of the five A. operculatum circles (petD, atpB, psaA, psbA, and psbB), which were assumed to have a 49-bp core (Barbrook and Howe 2000). Surprisingly, the complete psbA circle sequences (AF206672, ACA311632) of another isolate of A. carterae (CS-21, CSIRO Culture Collection, Hobart, Australia) are not identical to the psbA minicircle of A. carterae strain CCMP 1314. These results suggest that A. carterae (CCMP 1314, axenic) and A. operculatum (CCAP 1106, axenic) are probably the same species, despite having been collected from different places, whereas A. carterae (CCMP 1314) and A. carterae (CS-21) are not the same species even though they bear the same species name. This suggests that at least one of these three Amphidinium strains is misidentified.
In P. reticulatum, DNA hybridization revealed that the psbA gene may be on both minicircular chromosomes and large molecules, whereas the 23S rRNA gene is only on minicircular chromosomes (Zhang, Green, and Cavalier-Smith 1999 ). PCR amplification from genomic DNA using inwardly and outwardly directed 16S rRNA primer pairs indicated that the 16S rRNA gene is also on a minicircle (data not shown). Although the coding region of the psbA gene was successfully amplified, its noncoding region could not be. Because the 23S rRNA minicircle is the only chloroplast gene minicircle completely sequenced from P. reticulatum at the moment, it was not possible to determine its conserved cores in the noncoding region. The noncoding region of the 23S rRNA minicircle of P. reticulatum is unalignable with that of known chloroplast gene minicircles of Heterocapsa or Amphidinium species (see previously).
Short Repeated Sequences in the Noncoding Regions
In each minicircle there are variable spacers between the cores and between them and the coding region (D1–D4, fig. 3a ). The corresponding spacers in psbA and 23S rRNA circles vary in size in each dinoflagellate, and in general, are not conserved between the different minicircles of the same species. At least some of this size variation is caused by short, direct repeat sequences, as found in the D2 region of the nine H. triquetra minicircles (Zhang, Green, and Cavalier-Smith 1999 ). In H. pygmaea, the psbA circle has five 26-bp direct repeats in the D2 region, each separated by a few bases. In H. rotundata, the D1 regions of both psbA and 23S rRNA circles have two direct repeats of 20 bp, and the 23S rRNA circle also has two different direct repeats of 20 bp. In H. niei, psbA and 23S circles have two to five repeats of several different sequences (11–51 bp), some of which are shared between the two circles. The P. reticulatum 23S rRNA circle has two 66-bp tandem repeats.
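Short direct repeats of the kind described above can be located by indexing every window of a chosen length. The following is a minimal sketch, not the procedure used in this study: the sequence is an invented toy string, only exact same-strand repeats are reported, and the function name is illustrative.

```python
from collections import defaultdict


def direct_repeats(seq, k):
    """Map each k-mer occurring more than once on the same strand to its start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


# Toy spacer (hypothetical): an 8-bp unit repeated twice, separated by 2 bases.
spacer = "CC" + "GATTACAG" + "TT" + "GATTACAG" + "AA"
print(direct_repeats(spacer, 8))  # {'GATTACAG': [2, 12]}
```

The distance between successive start positions minus the repeat length gives the number of bases separating the copies (here 2), which distinguishes strictly tandem repeats from repeats "separated by a few bases".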
Inverted repeats can form hairpins suggested to have a replication function in the chloroplast genomes of Euglena (Schlunegger and Stutz 1984 ) and Chlamydomonas (Wu et al. 1986 ). Inverted repeats of 20 and 28 bp were found in the D2 region of H. niei (fig. 4 ) and in the 6G core (111 bp) of H. rotundata (fig. 3a ). Interestingly, a 19-bp inverted repeat was also found in the 9A cores (188 bp) of H. triquetra chloroplast gene minicircles (figs. 3a and 4 ). The inverted repeats are exclusively present in the cores that are not duplicated, i.e., no inverted repeats are found in the three identical cores of H. pygmaea. Two inverted repeats were found in the noncoding region of the P. reticulatum 23S rRNA circle (fig. 4 ). No repeats were found in the minicircles of A. carterae.
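An inverted repeat is a stretch followed, after a short loop, by its own reverse complement, which is what allows the single strand to fold back into a hairpin. The sketch below searches for such pairs with exact arm matching. It is illustrative only: the sequence, stem length, and loop limit are hypothetical choices, not parameters reported in this study.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def inverted_repeats(seq, stem, max_loop):
    """Return (left_start, right_start) pairs whose arms can base-pair.

    The left arm seq[left:left+stem] must be followed, after a loop of
    0..max_loop bases, by its exact reverse complement.
    """
    hits = []
    for i in range(len(seq) - 2 * stem + 1):
        arm = reverse_complement(seq[i:i + stem])
        for loop in range(max_loop + 1):
            j = i + stem + loop
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == arm:
                hits.append((i, j))
    return hits


# Toy sequence (hypothetical): 6-bp arms separated by a 4-base loop.
toy = "AA" + "GCCGTA" + "TTTT" + "TACGGC" + "AA"
print(inverted_repeats(toy, stem=6, max_loop=6))  # [(2, 12)]
```

Real hairpin prediction would also allow mismatches and weigh folding energy; exact matching is used here only to keep the example short.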
Interspecific Variability in Core Organization of the Minicircle Noncoding Region
In H. triquetra, all nine single-gene minicircles (Zhang, Green, and Cavalier-Smith 1999) share the same tripartite organization of their noncoding regions, as does a family of five selfish minicircles containing gene fragments derived from four of them (Zhang, Green, and Cavalier-Smith 2001). Thus, the three core regions are well conserved among 14 separate minicircular chromosomes in H. triquetra, whereas the intervening regions (D1–D4) are much less well conserved and in parts cannot be mutually aligned. In A. operculatum, only one core region is conserved across all five minicircles studied by Barbrook and Howe (2000), although we found two core regions on comparing the two minicircles sequenced here for an A. carterae strain (CCMP 1314), which may actually be the same species. Because H. triquetra and A. operculatum (and the very closely related A. carterae) are only distantly related on rRNA trees (Saldarriaga et al. 2001), it initially seemed possible that the presence of three core regions in one species and only two in the other might reflect an ancient divergence in minicircle organization.
However, our present results reveal that the basic organization of dinoflagellate minicircle noncoding regions is very divergent even among species belonging to the same genus. Thus, although H. pygmaea has a tripartite organization like that of H. triquetra, all three 5G cores are identical to each other, whereas in H. triquetra only the two flanking 9G cores are mutually related. Heterocapsa niei also has three almost identical cores (7G), as well as a single unrelated 6T core, but the order of the cores is not conserved between the 23S rRNA and psbA circles. The identical repeats in these three species probably arose by tandem duplication, with subsequent divergence of the variable region coupled with conservation of the identical regions (probably by gene conversion; see subsequently). The fact that in the H. niei psbA circle two of these repeats are separated by an unrelated core sequence means that rearrangements in the order of the cores also occurred. An analogous rearrangement must also have taken place in H. triquetra if the two related flanking cores arose by tandem duplication, not duplicative transposition. In H. rotundata there is evidence of one complete duplication of the 6T core in the 23S rRNA minicircles and a partial duplication of 6T to make the 6T′ core in both genes.
The virtually adjacent repetition of the central core in the H. rotundata 23S rRNA minicircle arose by a relatively recent tandem duplication. The fact that the repeats are separated by a sequence identical to a part of the D3 region, which is highly variable even among circles in the same species, confirms how recent this duplication is. It is highly probable that the three identical core repeats of H. pygmaea also evolved by tandem duplication, but the high divergence of their flanking regions suggests that this must have been long ago. Heterocapsa is a well-defined, apparently monophyletic genus (Daugbjerg et al. 2000; Saldarriaga et al. 2001). The lack of conservation of the noncoding regions among its species contrasts strikingly with the conservation within species of the core sequences and their number and relative positions. That there has been ample time for considerable divergence between the D1–D4 regions of Heterocapsa is confirmed by a crude application of a molecular clock. This would make the basal radiation of Peridinea about 2.4 times older than that of Heterocapsa (using fig. 1 of Saldarriaga et al. 2001). As the fossil record suggests that Peridinea may be only about 80 Myr old (Tappan 1980; Fensome et al. 1993), H. rotundata and H. triquetra may have diverged about 30 MYA from the common ancestor of H. pygmaea and H. niei, whereas H. pygmaea and H. niei may have been diverging from each other for about 20 Myr. Although rates of 18S rRNA evolution are dramatically heterogeneous among different dinoflagellate lineages (Saldarriaga et al. 2001), the branch lengths for Heterocapsa and the numerous clades nearest to it are relatively short and uniform, so these estimates are probably reasonable order-of-magnitude approximations; even if they are in error by several-fold, this would not alter our key point that a very substantial divergence between the species is expected for sequences not subject to strong stabilizing selection or a homogenization mechanism.
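The arithmetic behind the crude molecular-clock estimate can be written out explicitly. This sketch uses only the two figures quoted above (an age of about 80 Myr for Peridinea and a 2.4-fold ratio of radiation ages); the 3-fold error factor is a hypothetical stand-in for "several-fold", and the function names are illustrative.

```python
def heterocapsa_radiation_age(peridinea_age_myr, age_ratio):
    """Age of the Heterocapsa radiation, given the Peridinea age and the ratio of the two ages."""
    return peridinea_age_myr / age_ratio


def error_range(estimate, fold):
    """Lower and upper bounds if the estimate is wrong by the given fold factor."""
    return estimate / fold, estimate * fold


age = heterocapsa_radiation_age(80.0, 2.4)
print(round(age, 1))  # 33.3

low, high = error_range(age, 3.0)  # assumed 3-fold uncertainty
print(round(low, 1), round(high, 1))  # 11.1 100.0
```

The result of roughly 33 Myr is consistent with the "about 30 MYA" quoted for the basal Heterocapsa divergence, and even the lower bound under the assumed 3-fold error leaves more than 10 Myr for unconstrained sequences to diverge.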
The Noncoding Region Probably Includes the Replication Origin
Chloroplast replication origins generally map to noncoding regions in the neighborhood of the rRNA genes (Sears, Stoike, and Chiu 1996; Kunnimalaiyaan, Shi, and Nielsen 1997), although the core region of the Chlamydomonas origin partly overlaps a ribosomal protein gene (Chang and Wu 2000). Replication origins frequently contain predicted stem-loop structures, inverted repeats, and multiple direct repeats, not only in plastids (Sears, Stoike, and Chiu 1996; Kunnimalaiyaan, Shi, and Nielsen 1997) but also in a number of systems ranging from bacteria to animals (reviewed in Pearson et al. 1996).
We previously suggested that the 9A region of H. triquetra could contain the replication origin because it is present both in normal minicircles and in aberrant chimeric ones containing multiple fragments of different genes (Zhang, Cavalier-Smith, and Green 2001). Inverted repeats were found in the 9A core of H. triquetra, the 6G core of H. rotundata, and adjacent to the 6T core of H. niei (fig. 3a). Interestingly, inverted repeats are present only on cores without duplicates, not on those with duplicates or triplicates. The fact that inverted repeats were not found in the three identical cores of H. pygmaea or in the noncoding region of A. carterae makes it less likely that they are necessary features of replication origins. On the other hand, inverted repeats have been suggested to be hotspots for recombination (Kawata et al. 1997), consistent with our model for the recombinational origin of aberrant minicircles in H. triquetra (Zhang, Cavalier-Smith, and Green 2001).
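For readers unfamiliar with how inverted repeats are recognized in a sequence, the following is a minimal sketch of one simple approach: scan for a short stem whose reverse complement occurs a few bases downstream. It is an illustration only, not the procedure used in this study; the function names, the stem and loop parameters, and the toy sequence are all hypothetical, and only exact matches are detected.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_inverted_repeats(seq, stem=6, max_loop=10):
    """List (left_start, right_start, left_stem) for exact inverted repeats.

    A hit is a stem of length `stem` whose reverse complement begins
    between 0 and `max_loop` bases after the stem ends.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem + 1):
        left = seq[i:i + stem]
        target = reverse_complement(left)
        for loop in range(max_loop + 1):
            j = i + stem + loop
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == target:
                hits.append((i, j, left))
    return hits


# Hypothetical toy sequence: GACCGG ... CCGGTC can pair as a hairpin stem
# around a four-base TTTT loop.
toy = "AAGACCGGTTTTCCGGTCAA"
for left_start, right_start, left_stem in find_inverted_repeats(toy):
    print(left_start, right_start, left_stem)
```

A scan of this kind only flags candidate hairpins; whether a given inverted repeat has any role in replication or recombination is a separate question that sequence inspection cannot settle.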
The conserved cores in the noncoding region of the dinoflagellate minicircles are comparable with the conserved sequence blocks in the control or D-loop region at the origin of replication in animal mitochondria (Quinn and Wilson 1993). The marked interspecific divergence in Heterocapsa core regions is closely analogous to that seen in comparisons between the conserved sequence blocks of animals that diverged many millions of years ago (Quinn and Wilson 1993), in keeping with our previous arguments for relatively ancient divergence. In contrast, among Amazona parrots, which probably diverged relatively recently (scores of thousands of years), the conserved blocks are identical among different species (Eberhard, Wright, and Bermingham 2001). The analogy with vertebrate mitochondrial control regions even extends to such details as the stronger conservation in the number of conserved blocks among closer relatives; thus, the differences in conserved block number in early diverging lampreys (Lee and Kocher 1995), compared with their constancy in tetrapods (Quinn and Wilson 1993), are closely analogous to the change in core numbers between the most divergent dinoflagellates Amphidinium and Heterocapsa, compared with their greater similarity within Heterocapsa. The similar patterns of sequence conservation and divergence between dinoflagellate noncoding regions and the animal mitochondrial control region may therefore stem from an underlying similarity in replicative function.
The precise number or arrangement of conserved cores per noncoding region cannot be important for replication because in H. rotundata the 23S rRNA circle has an extra copy of one core compared with the psbA circle, and in H. niei the two types of cores are in different orders. Changes in the number of replication control regions have also been observed in animal mitochondria; in snakes both copies have been maintained identically over scores of millions of years, despite very great divergence among species (Kumazawa et al. 1996), just as we find for Heterocapsa.
Like the tripartite noncoding region of H. triquetra circles, the noncoding regions of H. pygmaea, H. niei, H. rotundata, A. carterae, and P. reticulatum minicircles can all be folded into elaborate secondary structures with various hairpins, stem-loops, and large loops using DNA fold (http://mfold.wustl.edu/~folder/dna). In all cases, each core can be a part of a hairpin or a loop, despite the sequences of the noncoding regions being completely different among species. The capacity for secondary structure might, therefore, be important in replication by serving as the replication origin or in DNA segregation (or both) by binding circles to a membrane (Zhang, Green, and Cavalier-Smith 1999; Barbrook and Howe 2000).
Concerted Evolution of Cores in the Noncoding Region of Chloroplast Gene Minicircles
Concerted evolution refers to the concerted divergence of members of multigene families, each gene of the family being highly similar or identical within each species but very divergent among different species (Graur and Li 2000). Concerted evolution was first observed in Xenopus for the spacers of tandemly repeated ribosomal RNA genes (Brown, Wensink, and Jordan 1972; Hillis et al. 1991). It is very common in tandemly repeated multigene families, e.g., those encoding histones (Coen, Strachan, and Dover 1980) and ubiquitin (Nenoi et al. 1998). It is also observed in dispersed ribosomal RNA operons in bacterial genomes, very likely driven by gene conversion (Liao 2000). Two mutational mechanisms have been proposed for the concerted evolution of repeated genes: (1) unequal sister chromatid exchange or crossing over (Smith 1976), which would be effective for generating and maintaining tandem repeats, and (2) gene conversion (Dover 1982), which can maintain identical sequences dispersed over a chromosome or on different chromosomes within a species. Unequal crossing over may be responsible for the presence of more than one copy of some cores in Heterocapsa species, as well as for the extra copies found in 23S gene circles of H. niei and H. rotundata (fig. 1), but duplication through replication error is even more likely. It is difficult to see how unequal exchange could contribute to the maintenance of core identity among different gene circles of the same species, especially because the intervening regions are divergent.
The cores in the noncoding region of chloroplast gene minicircles in dinoflagellates clearly undergo concerted evolution because they are identical or extremely conserved among minicircles in a species but are totally different between species. Does this conservation of core sequences and arrangement reflect selective constraints arising from the probable functions of these regions in replication, transcription initiation, and perhaps chromosome segregation, or is it purely the outcome of the essentially neutral dynamics of gene conversion? We shall argue that the underlying causes of this concerted divergence are likely to involve both gene conversion and selection for similarity (but not identity) between cores; neither of these alone can explain the facts.
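The signature of concerted evolution described here, near-identity of cores within a species combined with strong divergence between species, can be expressed as a comparison of mean pairwise identities. The sketch below is purely illustrative: the core sequences are invented toy strings, not data from this study, the species labels are placeholders, and the simple position-by-position identity measure assumes pre-aligned sequences of equal length.

```python
from itertools import combinations, product


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Hypothetical toy cores: one entry per minicircle, grouped by species.
cores = {
    "species_A": ["GGGGGACGT", "GGGGGACGT", "GGGGGACGA"],
    "species_B": ["TTTTTTCAG", "TTTTTTCAG", "TTTTTTCAG"],
}

# Mean identity between cores from different circles of the same species.
within = mean(
    identity(a, b)
    for seqs in cores.values()
    for a, b in combinations(seqs, 2)
)

# Mean identity between cores drawn from different species.
between = mean(
    identity(a, b)
    for a, b in product(cores["species_A"], cores["species_B"])
)

print(f"within-species identity:  {within:.2f}")
print(f"between-species identity: {between:.2f}")
```

A within-species value close to 1 together with a much lower between-species value is the pattern expected under concerted evolution; the comparison by itself does not distinguish gene conversion from selection as the cause.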
We suggest that gene conversion is the primary molecular mechanism maintaining near-identity of the related cores within a species. We previously found evidence for gene conversion in the D2 and D3 regions of H. triquetra minicircles, where there are repeats shared by two or three genic circles but not by all (Zhang, Green, and Cavalier-Smith 1999). Similar instances of shared identical repeats were found in the D2 regions of a family of five highly rearranged minicircles containing fragments of several genes (Zhang, Cavalier-Smith, and Green 2001). Furthermore, the sequences of the gene fragments are maintained almost identical to those of the corresponding normal genes, even though it is highly unlikely that any of these fragments is functional. The extra 7G and 6T cores in the 23S minicircles of H. niei and H. rotundata are also unlikely to have been maintained identical to their counterparts in the same circle and in the psbA circles by selection alone because the absence of these extra copies in the psbA circles shows that their presence is not essential.