Critical Reviews in Microbiology, 31:101–135, 2005
Copyright © Taylor & Francis Inc. ISSN: 1040-841X print / 1549-7828 online
DOI: 10.1080/10408410590922393
Protein Signatures Distinctive of Alpha Proteobacteria and Its Subgroups and a
Model for α-Proteobacterial Evolution
Radhey S. Gupta
Department of Biochemistry and Biomedical Sciences, McMaster University,
Hamilton, Ontario, Canada
Alpha (α) proteobacteria comprise a large and metabolically diverse group.
No biochemical or molecular feature is presently known that can distinguish
these bacteria from other groups. The evolutionary relationships among this group,
which includes numerous pathogens and agriculturally important microbes, are
also not understood. Shared conserved inserts and deletions (i.e., indels or
signatures) in molecular sequences provide a powerful means for identification
of different groups in clear terms, and for evolutionary studies (see
www.bacterialphylogeny.com). This review describes, for the first time, a
large number of conserved indels in broadly distributed proteins that are
distinctive and unifying characteristics of either all α-proteobacteria,
or many of its constituent subgroups (i.e., orders, families, etc.). These
signatures were identified by systematic analyses of proteins found in the
Rickettsia prowazekii (RP) genome. Conserved indels that are unique to
α-proteobacteria are present in the following proteins: cytochrome c
oxidase assembly protein CtaG, PurC, DnaB, ATP synthase α-subunit,
exonuclease VII, prolipoprotein phosphatidylglycerol transferase, RP-400, FtsK,
pyruvate phosphate dikinase, cytochrome b, MutY, and homoserine dehydrogenase.
The signatures in succinyl-CoA synthetase, cytochrome oxidase I, alanyl-tRNA
synthetase, and MutS proteins are found in all α-proteobacteria, except
the Rickettsiales, indicating that this group has diverged prior to the
introduction of these signatures. A number of proteins contain conserved indels
that are specific for Rickettsiales (XerD integrase and leucine
aminopeptidase), Rickettsiaceae (Mfd, ribosomal protein L19, FtsZ, Sigma 70 and
exonuclease VII), or Anaplasmataceae (Tgt and RP-314), and they distinguish
these groups from all others. Signatures in DnaA, RP-057, and DNA ligase A are
commonly shared by various Rhizobiales, Rhodobacterales, and Caulobacter,
suggesting that these groups shared a common ancestor exclusive of other α-proteobacteria.
A specific relationship between Rhodobacterales and Caulobacter is indicated
by a large insert in the Asn-Gln amidotransferase. The Rhizobiales group of
species are distinguished from others by a large insert in the Trp-tRNA
synthetase. Signature sequences in a number of other proteins (viz.
oxoglutarate dehydrogenase, succinyl-CoA synthase, LytB, DNA gyrase A, LepA, and
Ser-tRNA synthetase) serve to distinguish the Rhizobiaceae, Brucellaceae, and
Phyllobacteriaceae families from Bradyrhizobiaceae and Methylobacteriaceae.
Based on the distribution patterns of these signatures, it is now possible to
logically deduce a model for the branching order among α-proteobacteria,
which is as follows: Rickettsiales → Rhodospirillales-Sphingomonadales →
Rhodobacterales-Caulobacterales →
Rhizobiales (Rhizobiaceae-Brucellaceae-Phyllobacteriaceae, and
Bradyrhizobiaceae). The deduced branching order is also consistent with the
topologies in the 16S rRNA and other phylogenetic trees. Signature sequences in
a number of other proteins provide evidence that the α-proteobacteria are a
late-branching taxon within Bacteria, which branched after the δ,ε-subdivisions
but prior to the β,γ-proteobacteria. The shared presence
of many of these signatures in the mitochondrial (eukaryotic) homologs also
provides evidence of the α-proteobacterial ancestry of mitochondria.
Keywords Bacterial Phylogeny; Alpha Proteobacteria Trees; Protein Signatures;
Rickettsiales; Rhodobacterales; Branching Order; Mitochondrial Origin; Rickettsia
prowazekii; Rhizobiales
Received 20 December 2004; accepted 8 December 2005. Address correspondence to
Radhey S. Gupta, Department of Biochemistry and Biomedical Sciences, McMaster University,
Hamilton, Ontario,
Canada L8N 3Z5. E-mail: gupta@mcmaster.ca
INTRODUCTION
The alpha (α) proteobacteria comprise an important group
within Bacteria, which has contributed seminally to many aspects of the history
of life (Margulis 1970; Kersters et al. 2003). It is now established that
mitochondria, which enable eukaryotic cells to produce energy via oxidative
phosphorylation, are the result of endosymbiotic capture of an
α-proteobacterium by the primitive eukaryotic cell (Margulis 1970; Falah
& Gupta 1994; Viale & Arakaki 1994; Andersson et al. 1998; Gray et al.
1999; Karlin & Brocchieri 2000; Emelyanov 2001a; Esser et al. 2004). There
is also strong evidence indicating that the ancestral eukaryotic cell itself
may have originated via a fusion, or long-term symbiotic association, event
between one or more α-proteobacteria and an archaebacterium (or Archaea)
(Gupta et al. 1994; Lake & Rivera 1994; Gupta & Golding 1996; Margulis
1996; Gupta 1998; Martin & Muller 1998; Ribeiro & Golding 1998;
Andersson et al. 1998; Karlin et al. 1999; Lang et al. 1999; Kurland &
Andersson 2000; Emelyanov 2001a, 2003b). The symbiosis between α-proteobacteria
(viz. Rhizobiaceae species) and plant root nodules plays a central role in the
fixation of atmospheric nitrogen by plants (Sadowsky & Graham 2000; Van
Sluys et al. 2002; Kersters et al. 2003; Sawada et al. 2003). Additionally,
many α-proteobacterial species (viz. Rickettsiales,
Brucella, Bartonella) are adapted to an intracellular lifestyle and are major
human and animal pathogens (Moreno & Moriyon 2001; Kersters et al. 2003; Yu
& Walker 2003). The α-proteobacteria exhibit enormous diversity in
terms of their morphological and metabolic characteristics and they include
numerous phototrophs, chemolithotrophs and chemoorganotrophs (Stackebrandt et
al. 1988; De Ley 1992; Kersters et al. 2003). This group also harbors all known
aerobic photoheterotrophic bacteria, which contain bacteriochlorophyll a, but
are unable to grow photosynthetically under anaerobic conditions (Yurkov &
Beatty 1998). These bacteria are abundant in the upper layers of oceans (Kolber
et al. 2001). The α-proteobacterial species are presently recognized on
the basis of their branching pattern in the 16S rRNA trees, where they form a
distinct clade within the proteobacterial phylum (Woese et al. 1984; Stackebrandt
et al. 1988; Olsen et al. 1994; Gupta 2000; Kersters et al. 2003). This group
has been given the rank of a Class or subdivision within the Proteobacteria
phylum (Stackebrandt et al. 1988; Murray et al. 1990; De Ley 1992; Stackebrandt
2000; Ludwig & Klenk 2001; Garrity & Holt 2001; Kersters et al. 2003).
Other than their distinct branching in the 16S rRNA or other phylogenetic trees
(De Ley 1992; Viale et al. 1994; Eisen 1995; Gupta et al. 1997; Gupta 2000;
Stepkowski et al. 2003; Emelyanov 2003a; Battistuzzi et al. 2004), there is no
reliable phenotypic or molecular characteristic known at present that is
uniquely shared by different α-proteobacteria and distinguishes them from
all other bacteria (Kersters et al. 2003). On the basis of 16S rRNA trees the
α-proteobacteria have been divided into seven main subgroups or orders
(viz. Caulobacterales, Rhizobiales, Rhodobacterales, Rhodospirillales,
Rickettsiales, Sphingomonadales, and Parvularculales) (Maidak et al. 2001;
Garrity & Holt 2001; Kersters et al. 2003). However, the branching order
and interrelationships among these subgroups are presently not resolved and no
distinctive features that can distinguish these groups from each other are
known (Kersters et al. 2003). In our recent work, we have been utilizing a new
approach based on identification of conserved indels (also referred to as
signatures) in protein sequences that is proving very useful in identifying
different groups within Bacteria in clear molecular terms and clarifying
evolutionary relationships among them (see www.bacterialphylogeny.com) (Gupta
1998, 2003, 2004; Griffiths & Gupta 2002, 2004a; Gupta & Griffiths
2002; Gupta et al. 2003). We have previously described many protein signatures
that are distinctive characteristics of the proteobacterial phylum and which
also provided information regarding its branching position relative to other
bacterial groups (Gupta 1998, 2000; Griffiths & Gupta 2004b). This review
focuses on examining the evolutionary relationships among α-proteobacteria
using the signature sequence as well as traditional phylogenetic approaches. In
recent years, complete genomes of several α-proteobacteria (viz.
Bartonella henselae, Bart. quintana, Bradyrhizobium japonicum, Brucella
melitensis, Bru. suis, Caulobacter crescentus, Mesorhizobium loti,
Sinorhizobium meliloti, Rhodopseudomonas palustris, Agrobacterium tumefaciens,
Rickettsia conorii, Ri. prowazekii, Ri. typhi, and Wolbachia sp. (Drosophila
endosymbiont)) have become available (Andersson et al. 1998; Kaneko et al.
2000, 2002; Nierman et al. 2001; Wood et al. 2001; Ogata et al. 2001; Galibert
et al. 2001; DelVecchio et al. 2002; Paulsen et al. 2002; Larimer et al. 2004;
McLeod et al. 2004). These provide valuable resources for identifying novel
molecular features that are likely distinctive characteristics of α-proteobacteria
and its various subgroups, and which may prove helpful in clarifying the
evolutionary relationships among them. This article describes, for the first
time, a large number of conserved indels in widely distributed proteins that
are either uniquely shared by all α-proteobacteria, or which are shared by
only particular subgroups (i.e., families or orders) of this Class.
These signatures provide novel and definitive molecular means for distinguishing
α-proteobacteria and many of its subgroups from all other bacteria. The
distribution of these signatures in different α-proteobacteria also
enables one to logically deduce the relative branching orders and
interrelationships among different α-proteobacteria subgroups.
Phylogenetic studies have also been carried out based on 16S rRNA and a number
of protein sequences. Based on this information, a detailed model for the
evolutionary relationships among α-proteobacteria has been developed.
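The reasoning by which a branching order follows from signature distributions can be illustrated with a minimal Python sketch. The presence/absence table below is a hand-made simplification of four of the patterns summarized in this review (it is not the author's dataset or software), the group labels are merged for brevity, and the function name branching_order is hypothetical. The assumption is that the carrier sets are strictly nested, so that a signature shared by fewer groups was introduced later; groups that lack a signature are taken to have diverged before it arose.

```python
# Illustrative presence/absence table; a hand-made simplification of the
# distribution patterns summarized in the text, not the author's dataset.
ALL_GROUPS = {
    "Rickettsiales",
    "Rhodospirillales-Sphingomonadales",
    "Rhodobacterales-Caulobacterales",
    "Rhizobiales",
}
SIGNATURES = {
    "Cyt b 1 aa deletion": set(ALL_GROUPS),
    "Cox I 5 aa insert": ALL_GROUPS - {"Rickettsiales"},
    "DnaA 5 aa insert": {"Rhodobacterales-Caulobacterales", "Rhizobiales"},
    "Trp-tRNA synthetase insert": {"Rhizobiales"},
}


def branching_order(signatures):
    """Hypothetical helper: order groups from earliest to latest branching.

    Assumes the carrier sets are nested (each smaller set is contained in
    every larger one); raises ValueError otherwise.
    """
    if not signatures:
        raise ValueError("no signatures supplied")
    # Largest carrier set first: the oldest signature is the most widely shared.
    carrier_sets = sorted(signatures.values(), key=len, reverse=True)
    order = []
    for larger, smaller in zip(carrier_sets, carrier_sets[1:]):
        if not smaller <= larger:
            raise ValueError("signature distributions are not nested")
        # Groups that carry the older signature but lack the newer one
        # are inferred to have branched off in between.
        diverged = sorted(larger - smaller)
        if diverged:
            order.append(diverged)
    order.append(sorted(carrier_sets[-1]))
    return order


for step, groups in enumerate(branching_order(SIGNATURES), start=1):
    print(step, ", ".join(groups))
```

With real data the nesting is rarely perfect (for example, because of lateral gene transfer or gene loss), which is why the sketch raises an error rather than guessing when the carrier sets conflict.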
PHYLOGENETIC TREE FOR ALPHA PROTEOBACTERIA BASED ON 16S rRNA SEQUENCES
Although α-Proteobacteria comprise a major group within Bacteria (Garrity &
Holt 2001) with >5200 sequences in the Ribosomal Database Project II (Maidak
et al. 2001), there is no detailed review or article that discusses the
evolutionary relationships among this group (i.e. indicating the relationships
among different subgroups and orders within this Class) (Kersters et al. 2003).
Most of the articles on α-Proteobacteria are aimed at clarifying the phylogenetic
placement of particular species at either genus or family levels (Dumler et al.
2001; Gaunt et al. 2001; Young et al. 2001; Taillardat-Bisch et al. 2003; van
Berkum et al. 2003; Broughton 2003; Stepkowski et al. 2003; Sawada et al.
2003). The second edition of Bergey’s Manual (Ludwig & Klenk 2001) and the
third edition of Prokaryotes (Kersters et al. 2003) present condensed
phylogenetic trees for the α-Proteobacteria (or Proteobacteria) as a whole
to indicate presumed relationships among different subgroups comprising this
subdivision. However, most of these trees do not show any bootstrap scores or
even individual species (Ludwig & Klenk 2001; Kersters et al. 2003), making
it difficult to get a clear sense of the reliability of the observed (or indicated)
relationships. Hence, as an initial step toward understanding the evolutionary
relationships among α-Proteobacteria, a phylogenetic tree based on 16S
rRNA sequences was constructed from 65 α-proteobacterial species, covering
its major subgroups. The resulting neighbor-joining bootstrapped consensus tree
is presented in Figure 1. The tree shown was rooted using the 16S rRNA
sequences from epsilon proteobacteria, which show deeper branching than the
α-subdivision in the rRNA as well as various other trees (Olsen
FIG. 1. A neighbor-joining bootstrap consensus tree for α-proteobacteria
based on 16S rRNA sequences. The tree was bootstrapped 100 times and bootstrap
scores which were >60 are indicated on the nodes. The tree was rooted using
H. pylori. However, the tree topology was not altered on rooting with other
deep branching bacteria (e.g., Aq. aeolicus). The groups of species
corresponding to some of the main subgroups within α-proteobacteria are
marked.
indicates anomalous branching in the tree.
et al. 1994; Viale et al. 1994; Eisen 1995; Gupta 1998). The bootstrap scores
for all nodes that were >60 (out of 100) are indicated on the tree. In the
resulting tree a number of different clades are either clearly (>90%
bootstrap score) or reasonably well resolved. These included the clades
corresponding to groups of species which are recognized as major orders within
the α-Proteobacteria (Rhizobiales, Rhodospirillales, Caulobacterales,
Sphingomonadales, Rhodobacterales, and Rickettsiales) (Ludwig & Klenk 2001;
Garrity & Holt 2001; Kersters et al. 2003). Within Rhizobiales, the
Bradyrhizobiaceae family of species was clearly separated from some of the
other families within this order (viz. Rhizobiaceae, Brucellaceae, and
Phyllobacteriaceae) (Wang et al. 1998; Sadowsky & Graham 2000; Dumler et
al. 2001; van Berkum et al. 2003; Stepkowski et al. 2003). Within
α-Proteobacteria, the deepest branching was observed for the Rickettsiales
group of species. Within the Rickettsiales, the Rickettsia and Orientia
genera, which form part of the Rickettsiaceae family, were clearly resolved from
the Anaplasmataceae family comprised of Ehrlichia, Wolbachia, Anaplasma, and
Neorickettsia species (Dumler et al. 2001; Yu & Walker 2003). In contrast
to these well-resolved clades or relationships, various nodes indicating the
interrelationships among different orders had lower bootstrap scores (50% are
indicated on various nodes. All inserts and deletions were excluded from the
sequence alignment used for phylogenetic analysis. The α-proteobacteria
formed a well-defined clade in this tree; however, their branching position
relative to other groups was not resolved. The Rickettsiales order formed the
deepest branch within α-proteobacteria and they were also clearly resolved
from other α-proteobacteria. The arrows mark the suggested positions where
the identified signatures were introduced in this protein.
(Figure 10). One of these deletions is a distinctive characteristic of all
α-proteobacteria and not found in any other bacteria. The other deletion,
in addition to the α-proteobacteria, is also commonly present in the two
Desulfovibrio species (δ-proteobacteria), suggesting a distant
relationship of this group to α-proteobacteria, as also seen with the PPDK
protein (Figure 9). In addition to these deletions, the FtsK protein also
contains a 5–6 aa insert that is unique to various α-proteobacteria in
comparison to the other groups of proteobacteria (present in
position corresponding to aa 513–520 in Ri. prowazekii protein). Since the
region where this insert is found exhibits variability in other bacteria, this
signature is not shown. The FtsK protein has also been previously shown to
contain an 8–9 aa insert in a different region of the protein that is a
distinctive characteristic of various Bacteroidetes and Chlorobium species
(Gupta 2004). The FtsK homologs are not found in most eukaryotic organisms.
However, a homolog of this protein is present in Plasmodium yoelii (GenBank
accession number 23485217). The origin and
FIG. 8. Partial sequence alignment of RP-400 protein showing a 4–6 aa insert that is
specific for various α-proteobacteria, except Z. mobilis.
possible significance of this gene/protein is presently unclear. A 1 aa
deletion that is specific for various α-proteobacteria is also present in
the Cytochrome b (Cyt b; PetB) protein (Figure 11), which is a subunit of the
cytochrome reductase, which is an integral part of the electron transport chain
(Daldal et al. 1987; Stryer 1995; Emelyanov 2003a). This indel is not present
in other bacteria including that from Aquifex aeolicus, indicating that it is a
deletion in α-proteobacteria, rather than an insert in other bacteria. Cyt
b is one of the 13 proteins that are still encoded by mitochondrial DNA (Lang et
al. 1999). Sequence information for Cyt b is available from a large number
(>500) of mitochondrial genomes and phylogenetic studies based on this
protein provide evidence for the origin of mitochondria from within the
Rickettsiaceae (Sicheritz-Ponten et al. 1998; Emelyanov 2003a). Similar to the
α-proteobacteria, Cyt b from all eukaryotic mitochondrial homologs was
found to lack this 1 aa indel, providing evidence of their specific
relationship to the α-proteobacteria.
B. Signature Sequences Distinguishing Rickettsiales from Other α-Proteobacteria
In phylogenetic
trees based on 16S rRNA, as well as many protein sequences, the Rickettsiales
are found to form the deepest
branching clade within α-proteobacteria (see Figures 1 and 7) (Dumler et
al. 2001; Gaunt et al. 2001; Yu et al. 2001; Kersters et al. 2003; Yu & Walker
2003; Stepkowski et al. 2003). We have identified several signatures that are
present in various α-proteobacteria, except the Rickettsiales. These
signatures are described below. The enzyme succinyl-CoA synthetase, which is
part of the citric acid cycle, carries out cleavage of the thioester bond in
succinyl-CoA in a coupled reaction to generate succinate and GTP
(Bridger et al. 1987; Stryer 1995). It is the only step in the citric acid cycle
that directly leads to the formation of a high-energy phosphate bond. The beta
subunit of this protein contains a conserved insert of 10 aa, that is commonly
present in all other α-proteobacteria, except the Rickettsiales (Figure
12). Surprisingly, this insert is also present in Ral. metallidurans (a
β-proteobacterium), but not in any other β-proteobacteria, including
the closely related species Ral. solanacearum. This suggests that the Succ-CoA
synthetase gene in Ral. metallidurans has likely originated by non-specific
means such as LGT. A smaller unrelated insert in this region, which is
presumably of independent origin, is also present in Cytophaga and
Rhodopirellula species (not shown). It is of interest that a 7–8 aa insert is
also present in this position in various eukaryotic homologs. It is unclear at
present, whether this latter insert
FIG. 9. Excerpt from sequence alignment for pyruvate phosphate dikinase (PPDK)
protein showing a signature for α-proteobacteria. The Rickettsiales
species contain a 5 aa long insert, where all other α-proteobacteria have
a 12 aa insert in the same position. Two different homologs of PPDK are found
in Brad. japonicum, only one of which is found to contain the insert. A smaller
conserved insert of 10 aa is also present in this position in various δ-proteobacteria
suggesting that they may be specifically, but distantly, related to the
α-proteobacteria.
has originated from an α-proteobacterial ancestor or it is of independent
origin. If these inserts are of common origin, then this would suggest that the
eukaryotic homologs of Succ-CoA synthetase have originated from an
α-proteobacterial ancestor other than the Rickettsiales. This observation
would be at variance with other evidence pointing to a closer relationship of
mitochondria to the Rickettsiales species (Viale & Arakaki 1994; Gupta
1995; Andersson et al. 1998; Sicheritz-Ponten et al. 1998; Gray et al. 1999;
Lang et al. 1999; Emelyanov 2001a, 2001b, 2003a). Emelyanov (2001a, 2001b) has
observed a closer relationship of mitochondrial homologs to certain rickettsial
species (e.g. Holospora obtusa, Caedibacter caryophila), for which sequence
information for this protein is lacking at present. It is possible that
Succ-CoA synthetase from these species may contain this insert. Presently, the
possibility that the insert in eukaryotic homologs was independently introduced
also cannot be excluded.
Another signature showing a similar distribution pattern has been identified
in cytochrome oxidase polypeptide I (Cox I). In this case, a 5 aa insert in a
conserved region is commonly present in various α-proteobacterial species
except the Rickettsiales (Figure 13). It should be noted that
α-proteobacteria contain two different related proteins. One of these,
which harbors this insert seems to correspond to Cox I, whereas the other
homologs lacking the insert are mainly those from Cytochrome o ubiquinol
oxidase (Davidson & Daldal 1987). However, all Rickettsiales species
contain only a single homolog of this protein, corresponding to Cox I. The
observed inserts in both Succ-CoA synthetase and Cox I were thus likely
introduced in a common ancestor of the remainder of the α-proteobacteria
after the branching of Rickettsiales. Similar to the Cyt b, the Cox I in
eukaryotic cells is also encoded by mitochondrial DNA (Andersson et al. 1998;
Gray et al. 1999) and sequence information for this protein is available from a
large number of mitochondrial
FIG. 10. Partial sequence alignments of FtsK protein showing two different
signatures (1 aa deletions) that are informative characteristics of
α-proteobacteria. The deletion on the left is unique to various
α-proteobacteria, whereas the one on the right is also commonly shared by
two Desulfovibrio species (δ-proteobacteria) suggesting their relatedness
to the α-proteobacteria.
genomes. The eukaryotic homologs of Cox I do not contain the identified insert
(results not shown) indicating their possible derivation from Rickettsiales
(Emelyanov 2003a). Two other proteins were found to contain inserts of variable
lengths in highly conserved regions in various α-proteobacterial species,
with the exception of Rickettsiales (Figure 14). In alanyl-tRNA synthetase
(AlaRS), which is ubiquitously found in all organisms, an insert of
5–11 aa is present in a highly conserved region in various
α-proteobacteria, except the Rickettsiales (and also Mag. magnetotacticum)
(Figure 14A). Another signature showing a similar distribution pattern is found
in the MutS protein, which is involved in the DNA mismatch repair (Sixma 2001;
Martins-Pinheiro et al. 2004). In this case, a conserved insert of 2–5 aa is
present in various α-proteobacteria (Figure 14B), but not in Rickettsiales.
The simplest explanation
for these signatures is that they were introduced in an ancestral α-proteobacterial
lineage, after the branching of Rickettsiales (and also possibly Mag.
magnetotacticum). The observed variations in the lengths of these inserts have
presumably resulted from subsequent genetic changes. We have also identified a
number of α-proteobacteria-specific signatures in proteins for which no
homologs are found in the Rickettsiales. In the MutY protein, which is an A-G
specific DNA glycosylase involved in DNA repair (Parker & Eshleman 2003;
Martins-Pinheiro et al. 2004), a 4–9 aa insert in a conserved region is present
in various α-proteobacteria (Figure 15A). An insert of similar length is
also present in most eukaryotic homologs (with the exception of Anopheles
gambiae) indicating their possible derivation from α-proteobacteria.
Another signature showing similar species distribution is present in
FIG. 11. Partial sequence alignment for Cyt b protein showing a 1 aa deletion
that is specific for various α-proteobacteria. This deletion is also
present in all mitochondrial homologs (Cyt b is encoded by mitochondrial DNA)
providing strong evidence of their α-proteobacterial ancestry.
the protein homoserine dehydrogenase (Figure 15B). This indel consists of a 1
aa insert in a conserved region that is present in various
α-proteobacteria, but not any other proteobacteria. The homologs of both
these proteins were not detected in the Rickettsiales species and their absence
is very likely due to selective loss of these genes in a common ancestor of
the Rickettsiales (Martins-Pinheiro et al. 2004), presumably due to the
intracellular lifestyle of these organisms (Boussau et al. 2004). The
observed inserts in these genes could have been introduced in a common ancestor
of the α-proteobacteria, either before or after the loss of these genes in
Rickettsiales. Several proteins contain conserved inserts that are either
unique for the Rickettsiales or for the two main families, Rickettsiaceae and
Anaplasmataceae, comprising this order (Dumler et al. 2001; Yu & Walker
2003). The Rickettsiales-specific signatures are present in the proteins XerD
and leucine aminopeptidase
FIG. 12. Partial sequence alignment of Succ-CoA synthase showing a 10 aa insert
that is present in various α-proteobacteria, except the Rickettsiales.
This insert is not found in other bacteria, except Ral. metallidurans, which
has likely acquired it by non-specific means. A smaller insert is also present
in this position in various eukaryotic homologs.
(Figure 16). XerD protein (Figure 16A) is a part of the XerCD
integrase/recombinase that is involved in the cell division process and
decatenation of DNA duplexes (Ip et al. 2003). A 7 aa insert is present in a
conserved region of this protein which is uniquely shared by all Rickettsiales
and not found in any other bacteria (Figure 16A). Another 2 aa insert that is
specific for Rickettsiales is present in leucine aminopeptidase (Figure 16B),
which is an exopeptidase that selectively releases N-terminal amino acids from
peptides and proteins (Gonzales & Robert-Baudouy 1996). The signatures that
are specific for Rickettsia include a 4 aa insert in a highly conserved region
of the transcription repair coupling factor (Mfd) (Martins-Pinheiro et al.
2004) (Figure 17A), a 10 aa insert in ribosomal protein L19
(Figure 17B) and a 1 aa insert in the FtsZ protein (Figure 17C). Two additional
Rickettsia-specific signatures consisting of a 1 aa insert in the major sigma
factor-70 (at position 141 in the Ri. prowazekii sequence) and a 1 aa deletion
in exonuclease VII (at position 137 in the Ri. prowazekii homolog) were also
identified, but they are not shown here. The identified signatures in these
proteins are present only in various Rickettsiaceae species and not found in
other Rickettsiales (viz. Ehrlichia, Wolbachia, Anaplasma) or other groups of
bacteria. Within eukaryotes, a homolog of the transcription repair-coupling
factor is only detected in Arabidopsis thaliana and it lacks the identified
insert (results not shown). The homologs of ribosomal protein L19 are found in
various plants and algae but not in any of the animal
FIG. 13. Partial sequence alignment of Cox I showing a 5 aa insert that is
present in various α-proteobacteria, except Rickettsiales. The other
α-proteobacteria also contain a second more distantly related homolog
that lacks this insert.
species. Of these, an 8 aa insert in the same position is present only in the
homolog from Cyanophora paradox (not shown). The signiï¬cance and possible
origin of this insert is not clear. Similar to the ribosomal protein L19, FtsZ
homologs are also found only in plants but not in animals. These homologsalso
lacked the insert that is present in Rickettsiaceae. The plant homologs of
these proteins likely correspond to those of the plastids, which because of
their cyanobacterial ancestry (Gray 1989; Morden et al. 1992; Margulis 1993;
Gupta et al. 2003) are expected to be lacking Rickettsia-speciï¬c signatures.
We have also identiï¬ed two large inserts that are commonly shared by the
Ehrlichia, Wolbachia, and Anaplasma species but not found in any of the
Rickettsia species or other bacteria. These signatures include a 15 aa insert
in the HlyD family of secretory protein (Figure 18A) and a 10–11 aa insert in
the tRNA guanine transglycosylase (Tgt) protein (Figure 18B), involved in the
synthesis of hypermodified nucleoside queuosine (Reuter & Ficner 1995).
The eukaryotic homologs of Tgt do not contain this insert providing evidence
against their origin from Anaplasmataceae family of species (results not
shown). The homologs of HlyD are not found in eukaryotes. These signatures
point to a close relationship between Ehrlichia, Wolbachia, and Anaplasma
species, which is also seen in phylogenetic trees based on many other sequences
(Dumler et al. 2001; Gaunt et al. 2001; Yu et al. 2001; Taillardat-Bisch et al.
2003; Yu & Walker 2003; Stepkowski
et al. 2003; Emelyanov 2003a). These signatures were likely introduced in a
common ancestor of the Anaplasmataceae family, which now includes all
Ehrlichia, Anaplasma, Cowdria, Wolbachia, and Neorickettsia species (Dumler et
al. 2001; Yu & Walker 2003).
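How a group-specific insert of this kind might be located in a sequence alignment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the aligned sequences below are invented toy strings (not taken from HlyD, Tgt, or any real protein), gaps are written as "-", and the function name group_specific_inserts is hypothetical. The sketch flags alignment columns in which every member of the chosen group has a residue while every other sequence has a gap, and merges adjacent columns into ranges.

```python
def group_specific_inserts(alignment, ingroup):
    """Return half-open (start, end) column ranges where every ingroup
    sequence has a residue and every other sequence has a gap ('-')."""
    names = list(alignment)
    inside = [n for n in names if n in ingroup]
    outside = [n for n in names if n not in ingroup]
    if not inside or not outside:
        raise ValueError("need at least one ingroup and one outgroup sequence")
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("aligned sequences must have equal length")

    ranges = []
    start = None
    for col in range(length + 1):
        hit = (
            col < length
            and all(alignment[n][col] != "-" for n in inside)
            and all(alignment[n][col] == "-" for n in outside)
        )
        if hit and start is None:
            start = col                    # a candidate insert begins here
        elif not hit and start is not None:
            ranges.append((start, col))    # the insert ended at the previous column
            start = None
    return ranges


# Invented toy alignment (15 columns); the names are labels only.
TOY_ALIGNMENT = {
    "Ehrlichia":   "MKVLAGTSPQDRLIE",
    "Wolbachia":   "MKVLAGSSPQDRLIE",
    "Anaplasma":   "MKILAGTAPQDRLIE",
    "Rickettsia":  "MKVL------DRLIE",
    "Escherichia": "MRVL------DKLIE",
}

print(group_specific_inserts(TOY_ALIGNMENT, {"Ehrlichia", "Wolbachia", "Anaplasma"}))
```

A column-by-column test of this sort says nothing about whether the flanking regions are conserved, which the signature approach also requires; in practice that would need a separate check on the neighboring columns.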
C. Signature Sequences for Other Subgroups of α-Proteobacteria and Providing Information Regarding Their Interrelationships
Signature sequences in
a number of other proteins are useful in distinguishing other subgroups of
α-proteobacteria and they also provide information clarifying the
interrelationships among them. In the DnaA protein involved in chromosomal
replication (Messer 2002), a 5 aa insert is present in various Rhizobiales and
Caulobacter/Rhodobacter species (Figure 19A). However, this insert is not found
in any of the Rickettsiales, as well as most α-proteobacterial species
belonging to the orders Sphingomonadales and Rhodospirillales. The species Mag.
magnetotacticum contains two different homologs of this protein, only one of
which is found to contain the insert. Another insert showing a similar
distribution pattern is present in the protein RP-057, which is a homolog of the
glucose-inhibited division protein B (Romanowski et al. 2002). This protein
contains a 3 aa insert that is common to the same subgroups of
α-proteobacteria
FIG. 14. Signature sequences in alanyl-tRNA synthetase (AlaRS) and MutS
proteins that are informative for the α-proteobacteria. In AlaRS (upper
panel) an insert of variable length in a highly conserved region is present in
various α-proteobacteria, except the Rickettsiales and Mag. magnetotacticum.
The DNA mismatch repair protein MutS (lower panel) also contains a 3–5 aa
insert in various α-proteobacteria, except Rickettsiales. The insert
lengths in this case also serve to differentiate Rhodospirillales and
Sphingomonadales species from the Rickettsiales, Rhodobacterales, and
Caulobacterales.
FIG. 15. Partial sequence alignments of MutY (upper panel) and homoserine
dehydrogenase (lower panel) proteins showing inserts (boxed) in conserved
regions that are specific for α-proteobacteria. The homologs of both
these proteins are not found in the Rickettsiales. For MutY, an insert of
approximately similar length is also present in various eukaryotic homologs,
with the exception of Anopheles gambiae.
as the insert in the DnaA protein, but which is not found in the Rickettsiales
or Rhodospirillales/Sphingomonadales species (Figure 19B). The variable length
inserts are also present in this position in other bacteria (not shown).
However, within proteobacteria this insert is limited to the above subgroups of
α-proteobacteria. Based on the distribution patterns of these signatures,
these inserts were likely introduced in a common ancestor of the Rhizobiales
and Caulobacter/Rhodobacter after the branching of Rickettsiales and
Rhodospirillales/Sphingomonadales orders (Figure 19C).
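The inference that an insert was introduced in a common ancestor of the groups that carry it can be made concrete with a small Python sketch. The nested-tuple tree below encodes the branching order proposed in this review purely as an assumption for illustration (in particular, pairing Rhodospirillales with Sphingomonadales as a clade is a simplification), and the helper names leaves and smallest_clade are hypothetical. The sketch finds the smallest clade containing all carriers of a signature; if that clade contains no non-carriers, a single introduction on its stem branch explains the distribution.

```python
# Assumed tree (nested tuples), following the branching order proposed in
# the text; used here only to illustrate the mapping of one signature.
TREE = (
    "Rickettsiales",
    (
        ("Rhodospirillales", "Sphingomonadales"),
        (("Rhodobacterales", "Caulobacterales"), "Rhizobiales"),
    ),
)


def leaves(node):
    """Return the set of taxon names below a node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node))


def smallest_clade(tree, taxa):
    """Return the smallest subtree whose leaves include every taxon in taxa."""
    if not taxa <= leaves(tree):
        raise ValueError("some taxa are not in the tree")
    if not isinstance(tree, str):
        for child in tree:
            if taxa <= leaves(child):
                # All carriers sit inside one child, so descend into it.
                return smallest_clade(child, taxa)
    return tree


# Carriers of the DnaA and RP-057 inserts, as described in the text.
carriers = {"Rhizobiales", "Rhodobacterales", "Caulobacterales"}
clade = smallest_clade(TREE, carriers)
exclusive = leaves(clade) == carriers
print(sorted(leaves(clade)), "single introduction suffices:", exclusive)
```

When the smallest clade also contains taxa that lack the signature, a single introduction no longer suffices and additional events (secondary loss, lateral transfer, or independent origin) have to be invoked, as discussed for several proteins in this section.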
FIG. 16. Signature sequences in XerD integrase (upper panel) and leucine
aminopeptidase (lower panel) that are distinctive of the Rickettsiales order and
not found in other α-proteobacteria or other bacteria.
FIG. 17. Signature sequences in transcription repair coupling factor Mfd (A),
Ribosomal protein L19 (B), and FtsZ (C) proteins that are distinctive of
Rickettsia species and not found in other α-proteobacteria, including
species of the Anaplasmataceae family (e.g., Wolbachia, Ehrlichia, Anaplasma). Two
additional signatures showing similar distribution are found in the sigma
factor-70 and exonuclease VII proteins.
FIG. 18. Signature sequences in RP-314 (A) and tRNA guanine transglycosylase
(Tgt) (B) proteins that are distinctive of the Anaplasmataceae family of
species and not found in Rickettsia or various other bacteria.
The protein DNA ligase (NAD dependent; LigA) contains a 12 aa insert in a
highly conserved region that is commonly shared by various Rhizobiales as well
as Rhodobacterales species (Figure 20A), but which is not found in C.
crescentus, Rhodospirillales (Rhodo. rubrum, Mag. magnetotacticum), and
Sphingomonadales (Z. mobilis, Novo. aromaticivorans). The absence of this insert
in Mesorhizobium sp. BNC1 is somewhat surprising, but it could result from
non-specific mechanisms. This signature suggests that Rhizobiales species may
be more closely related to Rhodobacterales in comparison to
FIG. 19. Partial sequence alignments of DnaA (panel A) and RP-057 (panel B)
proteins showing inserts in conserved regions (boxed) that are only present in
various Rhizobiales, Rhodobacterales, and Caulobacter, but not found in other
α-proteobacteria or bacteria. These inserts were likely introduced in a
common ancestor of the above groups after the branching of Rickettsiales,
Rhodospirillales, and Sphingomonadales as indicated in panel C.
Caulobacter and other α-proteobacteria. However, another prominent insert
(11 aa) in a highly conserved region of the protein asparagine-glutamine
amidotransferase points to a specific relationship between Rhodobacterales and
Caulobacter species (Figure 20B), to the exclusion of all other
α-proteobacteria. Martins-Pinheiro et al. (2004) have reported a
phylogenetic analysis based on LigA sequences. The α-proteobacteria formed
a distinct clade in the tree, but they consisted of only certain Rhizobiaceae
and Caulobacter species (Martins-Pinheiro et al. 2004). To fully understand the
evolutionary significance of these signatures, it would be necessary to obtain
sequence information for these proteins from additional Caulobacterales. We
have also identified many conserved inserts that are specific for species
belonging to the Rhizobiales order. The
FIG. 20. Signature sequences in DNA ligase A (upper panel) and Asn-Gln
amidotransferase (lower panel) that are informative for α-proteobacteria.
The signature in DNA ligase is commonly shared by various Rhizobiales as well
as C. crescentus species, while that in the Asn-Gln amidotransferase is
uniquely shared by Rhodobacterales and Caulobacter, indicating a specific
relationship between these subgroups.
Trp-tRNA synthetase (TrpRS) contains a large insert in a highly conserved
region which is uniquely shared by various Rhizobiales species (Figure 21A),
but not found in any of the other α-proteobacteria or other groups of
bacteria (results for other groups of bacteria not shown). The absence of this
insert in various Rickettsiales, Rhodospirillales, Sphingomonadales, and
Rhodobacterales as well as Caulobacter provides evidence that these groups
have branched off prior to the introduction of this insert (Figure 21A). The
length of the insert in TrpRS also serves to distinguish the Rhizobiaceae,
Brucellaceae, and Phyllobacteriaceae families of species from those belonging to
Bradyrhizobiaceae and Methylobacteriaceae. The insert in the former group of
species is 19 aa long, whereas the latter species contain only a 9–10 aa
insert. Because the insert sequence in all of these species is conserved, it is
likely that the insert was introduced only once
in a common ancestor of the Rhizobiales and subsequent modification has led to
the observed length variation. The distinctness of Bradyrhizobium and
Rhodopseudomonas from other Rhizobiales is also supported by a signature (3 aa
insert) in seryl-tRNA synthetase (SerRS), which is uniquely present in these
species (Figure 21B) and serves to distinguish them from other Rhizobiales
as well as other α-proteobacteria. A schematic diagram indicating the
suggested positions where signatures described in Figures 20 to 23 have been
introduced is presented in Figure 21C. We have also identified several
signatures that are uniquely present in the Rhizobiaceae, Brucellaceae, and
Phyllobacteriaceae families of species, but not found in other
α-proteobacteria including Bradyrhizobium and Rhodopseudomonas. These
signatures include a 7 aa insert in oxoglutarate dehydrogenase (Figure 22A), a
5 aa insert in Succ-CoA synthase (Figure 22B),
FIG. 21. Signature sequences in Trp-tRNA synthetase (upper panel) and Ser-tRNA
synthetase (lower panel) that are informative for α-proteobacteria. The first
of these signatures is specific for Rhizobiales. The insert length in this
signature also distinguishes Bradyrhizobiaceae and Methylobacteriaceae species
from other Rhizobiales. The insert in the Ser-tRNA synthetase is specific for
the Bradyrhizobiaceae species and distinguishes this family from other
Rhizobiales.
a 3 aa insert in LytB metalloproteinase (Figure 23A) and a 2 aa insert in DNA
gyrase A subunit (Figure 23B). A smaller insert in oxoglutarate dehydrogenase
is also present in Novosphingobacteria, but since its sequence is unrelated, it
is either of independent origin or could have resulted from LGT. In addition to
these proteins, a 1–2 aa insert that is specific for Rhizobiaceae is also
found in a conserved region of the LepA protein (Figure 23C). The evolutionary
positions where these signatures have been introduced are indicated in Figure
21C. It is of interest that in contrast to other Rhizobiaceae species, which
contain only 1 aa inserts, Sinorhizobium meliloti and Agrobacterium
tumefaciens are found to contain 2 aa inserts in the LepA protein (Figure
23C). This observation points to a specific relationship between these two
Rhizobiaceae species, as has been suggested based on other lines of evidence
(Young et al. 2001). A 2 aa insert in the DnaK protein, which is commonly
shared by species belonging to the Rhizobium and Sinorhizobium genera, as
well as Ehrlichia and a few other proteobacteria, has also been described by
Stepkowski et al. (2003).

D. Signature Sequences Indicating the Phylogenetic Placement of
α-Proteobacteria

A number of signatures described in earlier
work have indicated that proteobacteria is a late branching phylum in comparison
to other main groups within Bacteria (Gupta 1998, 2000, 2003; Gupta & Griffiths
2002; Griffiths & Gupta 2004b). These signatures included a 4 aa insert in
alanyl-tRNA synthetase, an insert of >100 aa in RNA polymerase β (RpoB)
subunit, a 10 aa insert in CTP synthase, a 2 aa insert in inorganic
pyrophosphatase, and a 2 aa insert in Hsp70 protein. The identified signatures
in these proteins were present in all proteobacterial homologs, but they were
absent from most other bacterial phyla (viz. Firmicutes, Actinobacteria,
Thermotogae, Deinococcus-Thermus, Cyanobacteria, Spirochetes). In a number of
cases,
FIG. 22. Signature sequences in Oxoglutarate dehydrogenase (upper panel) and
Succ-CoA synthase (lower panel) proteins that are commonly shared by only
certain Rhizobiales families (e.g., Rhizobiaceae, Brucellaceae, and
Phyllobacteriaceae), and not found in Bradyrhizobiaceae or other
α-proteobacteria.
where the corresponding proteins were present in Archaea (viz. RpoB, Hsp70,
AlaRS), the archaeal homologs also lacked the indicated inserts, indicating that
the absence of these indels constitutes the ancestral state and that these
signatures were introduced after branching of the groups lacking these indels
(Gupta & Griffiths 2002; Gupta 2003; Griffiths & Gupta 2004b). A
number of identified signatures (7 aa insert in SecA, 1 aa deletion in the Lon
protease) were uniquely shared by only the α-, β-, and γ-proteobacteria,
providing evidence of the later branching of these
subdivisions (Gupta 2000, 2001, 2003). Two additional signatures that are
helpful in understanding the phylogenetic placement of α-proteobacteria
are described in the following section.
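
The outgroup argument used above (archaeal homologs lacking the inserts, so that absence is taken as the ancestral state) can be written out as a small decision rule. The Python sketch below is only an illustration of that reasoning; the function name `ancestral_state` and the returned labels are assumptions introduced here. The example distribution is the one stated in the text for the 4 aa insert in alanyl-tRNA synthetase, recorded at the level of whole phyla.

```python
from typing import Iterable, Mapping


def ancestral_state(indel_present: Mapping[str, bool], outgroup: Iterable[str]) -> str:
    """Infer the ancestral state of an indel from the outgroup homologs.

    indel_present maps a group name to True (indel present) or False (absent);
    groups with no homolog of the protein are simply left out of the mapping.
    """
    states = {indel_present[name] for name in outgroup if name in indel_present}
    if not states:
        return "undetermined"  # no outgroup homolog is available
    if states == {False}:
        return "indel absent"  # the indel is a derived character
    if states == {True}:
        return "indel present"  # the lack of the indel is derived
    return "ambiguous"  # outgroup homologs disagree


# Distribution of the 4 aa AlaRS insert as described in the text.
alars_insert = {
    "Proteobacteria": True,
    "Firmicutes": False,
    "Actinobacteria": False,
    "Thermotogae": False,
    "Deinococcus-Thermus": False,
    "Cyanobacteria": False,
    "Spirochetes": False,
    "Archaea": False,
}

polarity = ancestral_state(alars_insert, outgroup=["Archaea"])
print(polarity)
```

For proteins with no archaeal homolog the rule returns "undetermined", which mirrors the restriction in the text to those cases where the corresponding proteins are present in Archaea.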
Figure 24 shows an excerpt from a sequence alignment for the transcription
termination factor Rho, which is an RNA-binding protein that plays a central
role in RNA chain termination (Opperman & Richardson 1994). This protein is
present in all main groups of bacteria, except cyanobacteria (Gupta & Griffiths
2002; Gupta 2003), where RNA chain termination presumably occurs via a
Rho-independent mechanism. A 3 aa insert, which is a distinctive characteristic
of all α-, β-, and γ-proteobacteria, is present in a highly conserved region of
Rho. The insert is 2–3 aa longer in various Rickettsiales species,
which suggests an additional insert in this group of bacteria. In contrast to
the α-, β-, and γ-proteobacteria, this insert is not present in
δ- and ε-proteobacteria or any other
FIG. 23. Signature sequences in LytB (A), DNA gyrase A (B), and LepA (C) proteins
that are distinctive characteristics of only certain Rhizobiales families
(e.g., Rhizobiaceae, Brucellaceae, and Phyllobacteriaceae), but not found in Bradyrhizobiaceae
or other α-proteobacteria.
FIG. 24. Partial sequence alignment of Rho
protein showing a conserved insert that is commonly shared by various α-,
β-, and γ-proteobacteria, but not found in any other groups of
bacteria including the δ- and ε-proteobacteria and all other phyla of
Gram-positive and Gram-negative bacteria. This insert was likely introduced in a
common ancestor of the α-, β-, and γ-proteobacteria after the
branching of other bacterial phyla (see Figure 26). Many other signatures
showing a similar distribution pattern and supporting the indicated branching
position of α-, β-, and γ-proteobacteria have been described in
earlier work.
groups of Gram-negative and Gram-positive bacteria. This signature provides
evidence that the group consisting of α-, β-, and γ-proteobacteria
branched off late in comparison to the other groups of
bacteria. Another novel signature that is useful in understanding the branching
position of α-proteobacteria is present in the ATP synthase α subunit.
In this case, an 11 aa insert in a highly conserved region
is present in various β- and γ-proteobacteria, but it is not found in
any α-proteobacteria or other groups of bacteria (Figure 25). The absence
of this insert in various other bacteria as well as archaeal homologs provides
evidence that it was introduced in a common ancestor of the β- and
γ-proteobacteria after the divergence of other bacteria, including
α-proteobacteria (Figure 26).
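
Taken together, the Rho and ATP synthase α signatures form a nested pair, and the branching order they imply can be derived mechanically. The Python sketch below is a minimal illustration under stated assumptions: only the five proteobacterial subdivisions are considered (other phyla are omitted), presence of the insert is treated as the derived state, and the function name `branching_order` is introduced here for illustration. The two carrier sets are those given in the text.

```python
from typing import Dict, List, Set


def branching_order(signatures: Dict[str, Set[str]], all_groups: Set[str]) -> List[Set[str]]:
    """Order groups from earliest to latest branching using nested signatures.

    Each signature maps to the set of groups carrying the (derived) insert.
    Groups outside the largest carrier set are taken to have branched first.
    """
    tiers: List[Set[str]] = []
    remaining = set(all_groups)
    for carriers in sorted(signatures.values(), key=len, reverse=True):
        if not carriers <= remaining:
            raise ValueError("carrier sets are not nested")
        earlier = remaining - carriers
        if earlier:
            tiers.append(earlier)
        remaining = set(carriers)
    tiers.append(remaining)
    return tiers


subdivisions = {"alpha", "beta", "gamma", "delta", "epsilon"}
signatures = {
    "Rho 3 aa insert": {"alpha", "beta", "gamma"},
    "ATP synthase alpha 11 aa insert": {"beta", "gamma"},
}

order = branching_order(signatures, subdivisions)
for rank, groups in enumerate(order, start=1):
    print(rank, sorted(groups))
```

The sketch resolves three tiers (δ and ε first, then α, then β and γ) and says nothing about the order within a tier, which matches the resolution that these two signatures alone can provide.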
FIG. 25. Partial sequence alignment of ATP synthase α-subunit showing a
highly conserved insert that is commonly shared by various β- and
γ-proteobacteria, but not found in any other groups of bacteria including the
α-, δ-, and ε-proteobacteria and all other phyla of Gram-positive and
Gram-negative bacteria. This insert is also not present in archaeal or
eukaryotic homologs, indicating that it was introduced in a common ancestor of
the β- and γ-proteobacteria after the branching of all other groups
including α-proteobacteria. Other signatures showing similar relationships
have been described in earlier work (Gupta 1998, 2000, 2001, 2003).
FIG. 26. Evolutionary relationships among α-proteobacteria based on
signature sequences in different proteins. The branching position of
α-proteobacteria relative to other groups of bacteria is based on
signature sequences such as those shown in Figures 24 and 25. The evolutionary
stages where these signatures have been introduced are indicated by thick
arrows. Many other signatures that are helpful in resolving the branching order
of other groups have been described in our earlier work (Gupta 1998, 2000,
2001, 2003, 2004; Gupta & Griffiths 2002; Griffiths & Gupta 2004b;
see also www.bacterialphylogeny.com). The evolutionary relationship among
α-proteobacteria shown here was deduced based on the distribution patterns
of different signatures described in this review. The long thin arrows mark the
positions where the signature sequences in various proteins have likely been
introduced.
CONCLUSIONS

The α-proteobacteria are a morphologically and metabolically
very diverse group of organisms, which are presently recognized as a distinct
group solely on the basis of their branching pattern in the 16S rRNA tree
(Woese et al. 1984; Stackebrandt et al. 1988; Murray et al. 1990; De Ley 1992;
Ludwig & Klenk 2001; Kersters et al. 2003). No biochemical, molecular, or
other features are presently known that are uniquely shared by various
α-proteobacteria and that can clearly distinguish this group from all
others. The evolutionary relationships within this group of bacteria are also
presently not understood. This review describes many novel signatures consisting
of conserved inserts and deletions in widely distributed proteins that provide
definitive means for defining the α-proteobacteria and many of its
subgroups, and for understanding evolutionary relationships among them. Because
of the rarity and highly specific nature of these genetic changes, the
possibility of their arising independently by either convergent or parallel
evolution is low (Gupta 1998; Rokas & Holland 2000). The simplest and most
parsimonious explanation for such rare genetic changes, when restricted to a
particular clade(s), is that they were introduced only once in common ancestors
of the particular group(s) and then passed on to various descendants. The
signature approach has proven very useful in the past in clarifying a number of
important evolutionary relationships, which could not be reliably resolved
based on phylogenetic trees (Rivera & Lake 1992; Baldauf & Palmer
1993). Our earlier work has identified many signatures that are either
specific for particular groups of bacteria (viz. chlamydiae, cyanobacteria,
Bacteroidetes-Chlorobi-Fibrobacter, Deinococcus-Thermus, Proteobacteria) (Gupta
2000, 2004; Griffiths & Gupta 2002, 2004a; Gupta et al. 2003), or which
are commonly shared by certain bacterial phyla providing information regarding
their interrelationships (Gupta 1998, 2003; Gupta & Griffiths 2002;
Griffiths & Gupta 2004b).

A summary of the different signatures that
were described in this review and the overall picture of α-proteobacterial
evolution that emerges based upon them is presented in Figure 26. Most of the
signatures described here were unique for either all α-proteobacteria or
certain of its subgroups, and except for a few isolated instances, they were
not found in other bacteria. These findings provide evidence that the genes
containing these signatures have not been laterally transferred from
α-proteobacteria to other bacteria, although LGT for certain other genes
has been previously reported (Wolf et al. 1999). A large number of these
signatures, present in broadly distributed proteins (cytochrome assembly
protein CtaG, SAICAR synthetase, DnaB, ATP synthase α, exonuclease VII,
PLPG transferase, RP-400, pyruvate phosphate dikinase, FtsK, and Cyt b) were
distinctive characteristics of all α-proteobacteria. Two additional
proteins, MutY and homoserine dehydrogenase, also contain signatures that were
specific for α-proteobacteria. However, the homologs of these proteins
were not found in Rickettsiales. These signatures, for the first time,
describe molecular characteristics that unify all α-proteobacteria, and
provide means to clearly distinguish
them from all other bacteria. The unique presence of these signatures in
various α-proteobacteria, which is a very diverse group (Kersters et al.
2003), strongly suggests that these indels should be functionally important for
this group of organisms. Hence, studies examining their functional effects
should be of much interest.

Signature sequences in other proteins are helpful
in defining many of the α-proteobacteria subgroups and in clarifying
evolutionary relationships among them. A number of proteins, which include
Succ-CoA synthetase, Cox I, AlaRS, and MutS, contain conserved inserts that are
shared by all other α-proteobacteria, except the Rickettsiales. The homologs
of these proteins from other bacteria also lack these indels, providing evidence
that these signatures were introduced in a common ancestor of other
α-proteobacteria after the divergence of Rickettsiales. The Rickettsiales
order also consistently forms the deepest branching lineage in 16S rRNA and
various protein trees (Dumler et al. 2001; Gaunt et al. 2001; Kersters et al.
2003; Yu & Walker 2003; Stepkowski et al. 2003). Signature sequences in a
number of proteins were found to be specific for either the Rickettsiales
order (viz. XerD integrase and leucine aminopeptidase) or the two main
families, Rickettsiaceae (viz. transcription repair coupling factor, ribosomal
protein L19, and FtsZ proteins) and Anaplasmataceae (RP-314 and Tgt proteins).
These signatures were likely introduced in the common ancestors of these
groups. These groups are also clearly distinguished in the phylogenetic trees
based on 16S rRNA (Figure 1) (Dumler et al. 2001; Yu & Walker 2003) and
various proteins (Figure 7) (Stepkowski et al. 2003; Emelyanov 2003a).
Signature sequences in a number of proteins (viz. chromosomal replication
factor DnaA, RP-057, and DNA ligase) were commonly shared by various Rhizobiales,
Rhodobacterales, and in most cases Caulobacterales (currently represented by only
C. crescentus), but they were not present in Rickettsiales, Rhodospirillales, or
Sphingomonadales species. These results provide evidence that the groups
lacking these signatures diverged prior to the introduction of these
signatures. A unique signature has also been identified for the Rhizobiales
order (viz. in TrpRS), as has one that is commonly shared by Rhodobacterales and C.
crescentus (viz. in Asn-Gln amidotransferase). The latter signature suggests a
specific relationship between the
Rhodobacterales and Caulobacter groups. The relationships indicated by these
signatures are also generally supported by the phylogenetic trees based on 16S
rRNA and various proteins (Gaunt et al. 2001; Kersters et al. 2003; Stepkowski
et al. 2003; Emelyanov 2003a). Signature sequences in a number of other
proteins (viz. oxoglutarate dehydrogenase, Succ-CoA synthase, DNA gyrase A,
LepA, and LytB) are able to distinguish the Rhizobiaceae, Brucellaceae, and
Phyllobacteriaceae families from the Bradyrhizobiaceae species. The
distinctness of Bradyrhizobiaceae from other Rhizobiales is also clearly
indicated by a signature sequence in seryl-tRNA synthetase that is specific
for this group. These signatures are also consistent with the observation that
Bradyrhizobiaceae species are only distantly related to other Rhizobiales (viz.
Rhizobiaceae,
Brucellaceae, and Phyllobacteriaceae) (Figure 1) (Sadowsky & Graham 2000;
Gaunt et al. 2001; van Berkum et al. 2003; Kersters et al. 2003; Stepkowski et al.
2003; Moulin et al. 2004). A specific relationship between Sinorhizobium and
Agrobacterium species was also indicated by the signature sequence in the LepA
protein.

On the basis of 16S rRNA or various gene/protein trees, it has proven
difficult to reliably determine the interrelationships among different
α-proteobacterial subgroups (Ludwig & Klenk 2001; Kersters et al.
2003). However, based upon the distribution patterns of various signatures, it
is now possible to logically deduce the branching order of the main
α-proteobacterial subgroups (Figure 26). The model for α-proteobacterial
evolution that has been developed here is based upon a large number of
proteins that are involved in different functions. This model is internally
highly consistent, and it is difficult to logically explain the observed
distributions of these signatures by alternate means. The model developed here
is also consistent with the relationships that are resolved in the 16S rRNA
or other phylogenetic trees (viz. deep branching and distinctness of
Rickettsiales; a closer relationship between Rhizobiaceae, Brucellaceae, and
Phyllobacteriaceae as compared to Bradyrhizobiaceae; a closer relationship
between Rhodobacterales and Caulobacterales; distinctness of Rickettsiaceae from
Anaplasmataceae species; distinctness of Rhizobiales order containing various
root nodule bacteria, etc.) (Sadowsky & Graham 2000; Dumler et al. 2001;
Kersters et al. 2003; Yu & Walker 2003; Moulin et al. 2004). A few minor
inconsistencies seen at present (e.g., phylogenetic placement of Ca.
crescentus) should be clarified when sequence information from additional
species becomes available. In this context, it is important to acknowledge that
sequence information is available at present from only a limited number
of α-proteobacterial species. Although these species include
representatives from different α-proteobacterial orders, it is necessary
to obtain sequence information for many other species from different genera and
families to test and validate this model.

Signature sequences in a number of
proteins, a few of which are described here, also provide evidence that
α-proteobacteria is a late diverging group within Bacteria (Gupta 1998,
2000, 2003; Gupta & Griffiths 2002). Within proteobacteria, the δ- and ε-subdivisions
are indicated to have branched prior to α-proteobacteria, whereas the β-
and γ-subdivisions are indicated as later branching groups (see also
www.bacterialphylogeny.com). The branching of α-proteobacteria in this
position is also supported by the 16S rRNA and various protein trees (Olsen et
al. 1994; Viale et al. 1994; Eisen 1995; Kersters et al. 2003). The
α-proteobacteria, which is a very large group within Bacteria (>5000
entries in the RDP-II database) (Maidak et al. 2001), are presently recognized
as a Class within the Proteobacteria phylum (Woese et al. 1984; Stackebrandt et
al. 1988; Murray et al. 1990; Ludwig & Schleifer 1999; Boone et al. 2001;
Kersters et al. 2003). However, presently there are no clearly defined
criteria for the higher taxa (viz. Phylum, Class, Order, etc.)
within Bacteria (Woese et al. 1985; Stackebrandt 2000; Ludwig & Klenk 2001;
Gupta & Griffiths 2002; Gupta 2002). Based on the observations that
α-proteobacteria can now be clearly distinguished from all other bacteria
based upon a large number of molecular characteristics, and that this group
also branches distinctly from all other groups of bacteria including the β-,
γ-, δ-, and ε-proteobacteria, it is suggested that α-proteobacteria
should be recognized as a main group or phylum within Bacteria, rather than as
a subdivision or class of the Proteobacteria (Gupta 2000, 2004; Gupta & Griffiths
2002). Signature sequences in a few proteins (viz. PPDK and FtsK) indicate that
α-proteobacteria might have shared a distant ancestry with the δ-proteobacteria
exclusive of other bacteria, but this relationship needs to be further
investigated and confirmed.

The α-proteobacteria have also given rise to
mitochondria (Margulis 1970; Gray & Doolittle 1982; Andersson et al. 1998;
Sicheritz-Ponten et al. 1998; Gray et al. 1999; Gupta 2000; Emelyanov 2001a,
2003a, 2003b) and very likely played a central role in the origin of the
ancestral eukaryotic cell (Gupta & Singh 1994; Gupta & Golding 1996;
Margulis 1996; Gupta 1998; Martin & Muller 1998; Lopez-Garcia & Moreira
1999; Karlin et al. 1999; Lang et al. 1999; Emelyanov 2003b; Rivera & Lake
2004). Many of the α-proteobacteria-specific signatures identified in
the present work are also present in the mitochondrial/eukaryotic homologs,
providing additional evidence of their derivation from an
α-proteobacterial ancestor. In a few cases, the α-proteobacterial
signatures are present in genes that are encoded by the mitochondrial DNA
(viz. Cox I and Cyt b). The shared presence of these signatures in the
mitochondrial homologs provides further strong evidence for the
α-proteobacterial ancestry of mitochondria, as previously shown by
phylogenetic analysis (Andersson et al. 1998; Sicheritz-Ponten et al.
1998; Emelyanov 2003a). The current evidence suggests that within α-proteobacteria,
the Rickettsiales group of species are the closest relatives of mitochondria
(Gupta 1995; Andersson et al. 1998; Sicheritz-Ponten et al. 1998; Gray et al.
1999; Lang et al. 1999; Emelyanov 2001a, 2001b). However, this view is
supported by only some of the identified signatures and further work is needed
to clarify this aspect.
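
The internal consistency claimed for the model in these Conclusions can be tested in a simple way: if each insert arose only once and absence is the ancestral state, then the carrier sets of any two signatures should be either nested or disjoint. The Python sketch below applies that pairwise test to a handful of the order-level distributions described in this review. It is an illustration only. The function names are introduced here, the distributions are simplified to whole orders, and the LigA distribution follows the main text (present in Rhizobiales and Rhodobacterales, absent from C. crescentus).

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple


def compatible(a: FrozenSet[str], b: FrozenSet[str]) -> bool:
    """Two single-origin characters fit one tree if their carrier sets
    are nested or disjoint (ancestral state assumed to be 'absent')."""
    return a <= b or b <= a or a.isdisjoint(b)


def conflicts(signatures: Dict[str, FrozenSet[str]]) -> List[Tuple[str, str]]:
    """Return every pair of signatures whose distributions clash."""
    return [
        (name1, name2)
        for (name1, set1), (name2, set2) in combinations(signatures.items(), 2)
        if not compatible(set1, set2)
    ]


# Order-level carrier sets, simplified from the distributions in the text.
signatures = {
    "MutS": frozenset({"Rhodospirillales", "Sphingomonadales", "Rhizobiales",
                       "Rhodobacterales", "Caulobacterales"}),
    "DnaA": frozenset({"Rhizobiales", "Rhodobacterales", "Caulobacterales"}),
    "LigA": frozenset({"Rhizobiales", "Rhodobacterales"}),
    "Asn-Gln amidotransferase": frozenset({"Rhodobacterales", "Caulobacterales"}),
    "TrpRS": frozenset({"Rhizobiales"}),
    "XerD": frozenset({"Rickettsiales"}),
}

clashes = conflicts(signatures)
print(clashes)
```

Under these assumptions the only clash is between the LigA and Asn-Gln amidotransferase distributions, which corresponds to the uncertainty over the placement of C. crescentus noted above; every other pair is nested or disjoint.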
LIST OF ABBREVIATIONS

AlaRS, alanyl-tRNA synthetase; CFBG,
Chlamydia-Fibrobacter-Bacteroidetes-Green sulfur bacteria; Cyt., Cytochrome; Cox
I, Cytochrome oxidase polypeptide I; LGT, lateral gene transfer; PLPG,
Prolipoprotein-phosphatidylglycerol; PPDK, pyruvate phosphate dikinase; RP,
Rickettsia prowazekii; SerRS, serine-tRNA synthetase; Succ-CoA, Succinyl-CoA;
Tgt, tRNA-guanine transglycosylase; TrpRS, tryptophanyl-tRNA synthetase.
Abbreviations in the species names are: A., Agrobacterium; Ana., Anaplasma;
Aqu., Aquifex; Azo., Azotobacter; Azospir., Azospirillum; Bac., Bacillus;
Bact., Bacteroides; Bart., Bartonella; Bdello., Bdellovibrio; Bif., Bifidobacterium;
Bor., Borrelia; Bord., Bordetella; Brad. Bradyrhizobium; Bru.,
132
R. S. GUPTA Boone, D.R., Castenholz, R.W., and
Garrity, G.M. 2001. Bergey’s Manual of Systematic Bacteriology. Springer, New York. Boussau, B.,
Karlberg, E.O., Frank, A.C., Legault, B.A., and Andersson, S.G. 2004.
Computational inference of scenarios for alpha-proteobacterial genome
evolution. Proc. Natl. Acad. Sci. USA 101, 9722–9727. Bridger, W.A., Wolodko,
W.T., Henning, W., Upton,
C., Majumdar, R., and Williams, S.P. 1987. The subunits of succinyl-coenzyme A
synthetase—function and assembly. Biochem. Soc. Symp. 103–111. Broughton, W. J.
2003. Roses by other names: Taxonomy of the Rhizobiaceae. J. Bacteriol. 185,
2975–2979. Capiaux, H., Lesterlin, C., Perals, K., Louarn, J.M., and Cornet, F.
2002. A dual role for the FtsK protein in Escherichia coli chromosome
segregation. EMBO Rep. 3, 532–536. Chase, J.W., Rabin, B.A., Murphy, J.B.,
Stone, K.L., and Williams, K.R. 1986. Escherichia coli exonuclease VII. Cloning
and sequencing of the gene encoding the large subunit (xseA). J. Biol. Chem.
261, 14929–14935. Daldal, F., Davidson, E., and Cheng, S. 1987. Isolation of
the structural genes for the Rieske Fe-S protein, cytochrome b and cytochrome
c1 all components of the ubiquinol: Cytochrome c2 oxidoreductase complex of
Rhodopseudomonas capsulata. J. Mol. Biol. 195, 1–12. Davidson, E., and Daldal,
F. 1987. Primary structure of the bc1 complex of Rhodopseudomonas capsulata.
Nucleotide sequence of the pet operon encoding the Rieske cytochrome b, and
cytochrome c1 apoproteins. J. Mol. Biol. 195, 13–24. De Ley, J. 1992. The
Proteobacteria: Ribosomal RNA cistron similarities and bacterial taxonomy. In
The Prokaryotes, eds. A. Balows, H.G. Tr¨ per, u M. Dworkin, W. Harder, and
K.H. Schleifer, 2111–2140. Springer-Verlag,
New York. DelVecchio, V.G.,
Kapatral, V., Redkar, R.J., Patra, G., Mujer, C., Los, T., Ivanova, N.,
Anderson, I., Bhattacharyya, A., Lykidis, A., Reznik, G., Jablonski, L.,
Larsen, N., D’Souza, M., Bernal, A., Mazur, M., Goltsman, E., Selkov, E.,
Elzer, P. H., Hagius, S., O’Callaghan, D., Letesson, J. J., Haselkorn, R., and
Kyrpides, N. 2002. The genome sequence ofthe facultative intracellular pathogen
Brucella melitensis. Proc. Natl. Acad. Sci. USA 99, 443–448. Dumler, J.S.,
Barbet, A.F., Bekker, C.P., Dasch, G.A., Palmer, G.H., Ray, S.C.,
Rikihisa, Y., and Rurangirwa, F.R. 2001. Reorganization of genera in the
families Rickettsiaceae and Anaplasmataceae in the order Rickettsiales: Uniï¬cation
of some species of Ehrlichia with Anaplasma, Cowdria with Ehrlichia and
Ehrlichia with Neorickettsia, descriptions of six new species combinations and
designation of Ehrlichia equi and ‘HGE agent’ as subjective synonyms of
Ehrlichia phagocytophila. Int. J. Syst. Evol. Microbiol. 51, 2145–2165. Eisen,
J.A. 1995. The RecA protein as a model molecule for molecular systematic
studies of bacteria: Comparison of trees of RecAs and 16S rRNAs from the same
species. J. Mol. Evol. 41, 1105–1123. Emelyanov, V.V. 2001a. Evolutionary
relationship of Rickettsiae and mitochondria. FEBS Letters 501, 11–18.
Emelyanov, V.V. 2001b. Rickettsiaceae, rickettsia-like endosymbionts, and the
origin of mitochondria. Biosci. Rep. 21, 1–17. Emelyanov, V.V. 2003a. Common
evolutionary origin of mitochondrial and rickettsial respiratory chains. Arch.
Biochem. Biophys. 420, 130–141. Emelyanov, V.V. 2003b. Mitochondrial connection
to the origin of eukaryotic cell. Eur. J. Biochem. 270, 1599–1618. Espeli, O.,
Lee, C., and Marians, K.J. 2003. A physical and functional interaction between
Escherichia coli FtsK and topoisomerase IV. J. Biol. Chem. 278, 44639–44644.
Esser, C., Ahmadinejad, N., Wiegand, C., Rotte, C., Sebastiani, F.,
GeliusDietrich, G., Henze, K., Kretschmann, E., Richly, E., Leister, D.,Bryant,
D., Steel, M.A., Lockhart, P.J., Penny, D., and Martin, W. 2004. A Genome
Phylogeny for Mitochondria Among -Proteobacteria and a Predominantly
Eubacterial Ancestry of Yeast Nuclear Genes. Mol. Biol. Evol. 21, 1643–1660.
Falah, M., and Gupta, R.S. 1994. Cloning of the hsp70 (dnaK) genes from
Rhizobium meliloti and Pseudomonas cepacia: Phylogenetic analyses of
mitochondrial origin based on a highly conserved protein sequence. J.
Bacteriol. 176, 7748–7753.
Brucella; Buch., Buchnera; Burk., Burkholderia; Ca., Caulobacter; Camp.,
Campylobacter; Cb., Chlorobium; Cfx., Chloroflexus; Chl., Chlamydia; Chlam,
Chlamydophila; Chromo., Chromo-bacterium; Clo., Clostridium; Cor.,
Cornyebacterium; Cox., Coxiella; Cyt., Cytophaga; Dei., Deinococcus; Dechloro.,
Dechloromonas; Des., Desulfovibrio; Desulf., Desulï¬tobacterium; Dros. endo.,
Drosophila endosymbiont; E., Escherichia; Ent., Enterococcus; Fuso.,
Fusobacterium; Geo., Geobacter; H., Haemophilus; Hel., Helicobacter; Lac.,
Lactococcus; Lactobac., Lactobacillus; Lep., Leptospira; Lis., Listeria; Leg.,
Legionella; Mag., Magnetococcus; Meso., Mesorhizobium; Methano.,
Methanobacterium; Methyl., Methylobacillus; Microbul., Microbulbifer; Myc.,
Mycobacterium; Myx., Myxococcus; Nei., Neisseria; Nit., Nitrosomonas; Nitro.,
Nitrosospira; Novo., Novosphingobacterium; Olig., Oligotropha; Para.,
Paracoccus; Pas., Pasteurella; Photobac., Photobacterium; Por., Porphyromonas;
Pse., Pseudomonas; Ral., Ralstonia; Rhi., Rhizobium; Rho., Rhodobacter; Rhodo.,
Rhodospirillum; Rhodopseud., Rhodopseudomonas; Ri., Rickettsia; Shew.,
Shewanella; Sino., Sinorhizobium; Sta.,Staphylococcus; Str., Streptomyces;
Strep., Streptococcus; Syn., Synechococcus; Sulfo., Sulfolobus; T., Thermotoga;
Thermoan., Thermoanaerobacter; Thermosyn., Thermosynechococcus; Tre.,
Treponema; Vib., Vibrio; Xan., Xanthomonas; Thiobac., Thiobacillus; Wol.,
Wolinella; Xyl., Xylella; Yer., Yersinia; Z., Zymomonas.

ACKNOWLEDGMENTS

The competent technical assistance of Pinay Kanth, Jeveon Clements, Larissa
Shamseer, and Adeel Mahmood in creating sequence alignments of proteins from
Rickettsia prowazekii and other genomes is thankfully acknowledged. I am also
thankful to Yan Li for developing certain computer programs that facilitated
the creation of signature sequence files and for help in setting up the
bacterial signatures website (www.bacterialphylogeny.com). Thanks are also due
to Emma Griffiths and Pinay Kanth for helpful comments on the manuscript. The
work on signature sequences described here was mostly completed by August 2004.
This work was supported by a research grant from the National Science and
Engineering Research Council of Canada and the Canadian Institute of Health
Research.

REFERENCES

Andersson, S.G., Zomorodipour, A., Andersson, J.O., Sicheritz-Ponten, T.,
Alsmark, U.C., Podowski, R.M., Naslund, A.K., Eriksson, A.S., Winkler, H.H.,
and Kurland, C.G. 1998. The genome sequence of
Rickettsia prowazekii and the origin of mitochondria. Nature 396, 133–140.
Baldauf, S.L., and Palmer, J.D. 1993. Animals and fungi are each other’s
closest relatives: Congruent evidence from multiple proteins. Proc. Natl. Acad.
Sci. USA 90, 11558–11562. Battistuzzi, F.U., Feijao, A., and Hedges, S.B. 2004.
A genomic timescale of prokaryote evolution: Insights into the origin of
methanogenesis, phototrophy, and the colonization of land. BMC. Evol. Biol. 4,
44. Bengtsson, J., von Wachenfeldt, C., Winstedt, L., Nygaard, P., and
Hederstedt, L. 2004. CtaG is required for formation of active cytochrome c
oxidase in Bacillus subtilis. Microbiology 150, 415–425.
Galibert, F.,
Finan, T.M., Long, S.R., Puhler, A., Abola, P., Ampe, F., Barloy-Hubler, F.,
Barnett, M. J., Becker, A., Boistard, P., Bothe, G., Boutry, M., Bowser, L.,
Buhrmester, J., Cadieu, E., Capela, D., Chain, P., Cowie, A., Davis, R. W.,
Dreano, S., Federspiel, N. A., Fisher, R. F., Gloux, S., Godrie, T., Goffeau,
A., Golding, B., Gouzy, J., Gurjal, M., Hernandez-Lucas, I., Hong, A., Huizar,
L., Hyman, R. W., Jons, T., Kahn, D., Kahn, M. L., Kalman, S., Keating, D. H.,
Kiss, E., Komp, C., Lelaure, V., Masuy, D., Palm, C., Peck, M. C., Pohl, T. M.,
Portetelle, D., Purnelle, B., Ramsperger, U., Surzycki, R., Thebault, P.,
Vandenbol, M., Vorholter, F. J., Weidner, S., Wells, D. H., Wong, K., Yeh, K.
C., and Batut, J. 2001. The composite genome of the legume symbiont
Sinorhizobium meliloti. Science 293, 668–672. Garrity, G.M., and Holt, J.G.
2001. The road map to the manual. In Bergey’s Manual of Systematic
Bacteriology, eds. D. R. Boone and R. W. Castenholz, 119–166. Springer-Verlag, Berlin.
Gaunt, M.W., Turner, S.L., Rigottier-Gois, L., Lloyd-Macgilp, S.A.,
and Young, J.P. 2001. Phylogenies of atpD and recA support the small subunit
rRNA-based classification of rhizobia. Int. J. Syst. Evol. Microbiol. 51,
2037–2048. Gonzales, T., and Robert-Baudouy, J. 1996. Bacterial aminopeptidases:
Properties and functions. FEMS Microbiol. Rev. 18, 319–344. Gray, M.W. 1989.
The evolutionary origins of organelles. Trends in Genet. 5, 294–299. Gray,
M.W., Burger, G., and Lang, B.F. 1999. Mitochondrial evolution. Science 283,
1476–1481. Gray, M.W., and Doolittle, W.F. 1982. Has the endosymbiont
hypothesis been proven? Microbiol. Rev. 46, 1–42. Griffiths, E., and Gupta,
R.S. 2002. Protein signatures distinctive of chlamydial species: Horizontal
transfer of cell wall biosynthesis genes glmU from Archaebacteria to
Chlamydiae, and murA between Chlamydiae and Streptomyces. Microbiology 148,
2541–2549. Griffiths, E., and Gupta, R.S. 2004a. Distinctive protein signatures
provide molecular markers and evidence for the monophyletic nature of the
Deinococcus-Thermus phylum. J. Bacteriol. 186, 3097–3107. Griffiths, E., and
Gupta, R.S. 2004b. Signature sequences in diverse proteins provide evidence for
the late divergence of the order Aquificales. International Microbiol. 7, 41–52.
Gupta, R.S. 1995. Evolution of the chaperonin families (Hsp60, Hsp10 and Tcp1)
of proteins and the origin of eukaryotic cells. Mol. Microbiol. 15, 1–11.
Gupta, R.S. 1998. Protein phylogenies and signature sequences: a reappraisal of
evolutionary relationships among archaebacteria, eubacteria, and eukaryotes.
Microbiol. Mol. Biol. Rev. 62, 1435–1491. Gupta, R.S. 2000. The phylogeny of
Proteobacteria: Relationships to other eubacterial phyla and eukaryotes. FEMS
Microbiol. Rev. 24, 367–402. Gupta, R.S. 2001. The branching order and
phylogenetic placement of species from completed bacterial genomes, based on
conserved indels found in various proteins. Inter. Microbiol. 4, 187–202.
Gupta, R.S. 2002. Phylogeny of Bacteria: Are we now close to understanding it?
ASM News 68, 284–291. Gupta, R.S. 2003. Evolutionary relationships among
photosynthetic bacteria. Photosynth. Res. 76, 173–183. Gupta, R.S. 2004. The
phylogeny and signature sequences characteristics of Fibrobacters, Chlorobi and
Bacteroidetes. Crit. Rev. Microbiol. 30, 123–143. Gupta, R.S., Aitken, K.,
Falah, M., and Singh, B. 1994. Cloning of Giardia lamblia heat shock protein
HSP70 homologs: Implications regarding origin of eukaryotic cells and of
endoplasmic reticulum. Proc. Natl. Acad. Sci. USA 91, 2895–2899. Gupta, R.S.,
Bustard, K., Falah, M., and Singh, D. 1997. Sequencing of heat shock protein 70
(DnaK) homologs from Deinococcus proteolyticus and Thermomicrobium roseum and
their integration in a protein-based phylogeny of prokaryotes. J. Bacteriol.
179, 345–357. Gupta, R.S., and Golding, G.B. 1996. The origin of the eukaryotic
cell. Trends Biochem. Sci. 21, 166–171. Gupta, R.S., and Griffiths, E. 2002.
Critical issues in bacterial phylogenies. Theor. Popul. Biol. 61, 423–434.
Gupta, R.S., Pereira, M., Chandrasekera, C., and Johari, V. 2003. Molecular
signatures in protein sequences that are characteristic of Cyanobacteria and
plastid homologues. Int. J. Syst. Evol. Microbiol. 53, 1833–1842. Gupta, R.S.,
and Singh, B. 1994. Phylogenetic analysis of 70 kD heat shock protein sequences
suggests a chimeric origin for the eukaryotic cell nucleus. Curr. Biol. 4,
1104–1114. Hiser, L., Di Valentin, M., Hamer, A.G., and Hosler, J.P. 2000. Cox11p
is required for stable formation of the Cu(B) and magnesium centers of
cytochrome c oxidase. J. Biol. Chem. 275, 619–623. Hui, F.M., and Morrison,
D.A. 1993. Identification of a purC gene from Streptococcus pneumoniae. J.
Bacteriol. 175, 6364–6367. Ip, S.C., Bregu, M., Barre, F.X., and Sherratt, D.J.
2003. Decatenation of DNA circles by FtsK-dependent Xer site-specific
recombination. EMBO J. 22, 6399–6407. Jeanmougin, F., Thompson, J.D., Gouy, M.,
Higgins, D.G., and Gibson, T.J. 1998. Multiple sequence alignment with Clustal
X. Trends Biochem. Sci. 23, 403–405. Kaneko, T., Nakamura, Y., Sato, S.,
Asamizu, E., Kato, T., Sasamoto, S., Watanabe, A., Idesawa, K., Ishikawa, A.,
Kawashima, K., Kimura, T., Kimura, T., Kishida, Y., Kiyokawa, C., Kohara, M.,
Matsumoto, M., Matsuno, A., Mochizuki, Y., Nakayama, S., Nakazaki, N., Shimpo,
S., Sugimoto, M., Takeuchi, C., Yamada, M., and Tabata, S. 2000. Complete genome
structure of the nitrogen-fixing symbiotic bacterium Mesorhizobium loti. DNA
Res. 7, 331–338. Kaneko, T., Nakamura, Y., Sato, S., Minamisawa, K., Uchiumi,
T., Sasamoto, S., Watanabe, A., Idesawa, K., Iriguchi, M., Kawashima, K.,
Kohara, M., Matsumoto, M., Shimpo, S., Tsuruoka, H., Wada, T., Yamada, M., and
Tabata, S. 2002. Complete genomic sequence of nitrogen-fixing symbiotic
bacterium Bradyrhizobium japonicum USDA110. DNA Res. 9, 189–197. Karlin, S.,
and Brocchieri, L. 2000. Heat shock protein 60 sequence comparisons:
Duplications, lateral transfer, and mitochondrial evolution. Proc. Natl. Acad.
Sci. USA 97, 11348–11353. Karlin, S., Brocchieri, L., Mrazek, J., Campbell,
A.M., and Spormann, A.M. 1999. A chimeric prokaryotic ancestry of mitochondria
and primitive eukaryotes. Proc. Natl. Acad. Sci. USA 96, 9190–9195. Kersters,
K., Devos, P., Gillis, M., Vandamme, P., and Stackebrandt, E. 2003.
Introduction to the Proteobacteria. In The Prokaryotes: An Evolving Electronic
Resource for the Microbiological Community, ed. M. Dworkin et al.
Springer-Verlag, New York. Kolber, Z.S., Plumley, F.G., Lang, A.S., Beatty,
J.T., Blankenship, R.E., VanDover, C.L., Vetriani, C., Koblizek, M., Rathgeber,
C., and Falkowski, P.G. 2001. Contribution of aerobic photoheterotrophic
bacteria to the carbon cycle in the ocean. Science 292, 2492–2495. Ku, M.S.,
Kano-Murakami, Y., and Matsuoka, M. 1996. Evolution and expression of C4
photosynthesis genes. Plant Physiol. 111, 949–957. Kurland, C.G., and
Andersson, S.G. 2000. Origin and evolution of the mitochondrial proteome.
Microbiol. Mol. Biol. Rev. 64, 786–820. Lake, J.A., and Rivera, M.C. 1994. Was
the nucleus the ï¬rst endosymbiont? Proc. Natl. Acad. Sci. USA 91, 2880–2881.
Lang, B.F., Gray, M.W., and Burger, G. 1999. Mitochondrial genome evolution and
the origin of eukaryotes. Annual Review of Genetics 33, 351–397. Larimer,
F.W., Chain, P., Hauser, L., Lamerdin, J., Malfatti, S., Do, L., Land, M. L.,
Pelletier, D. A., Beatty, J. T., Lang, A. S., Tabita, F. R., Gibson, J. L.,
Hanson, T. E., Bobst, C., Torres, J. L., Peres, C., Harrison, F. H., Gibson,
J., and Harwood, C. S. 2004. Complete genome sequence of the metabolically
versatile photosynthetic bacterium Rhodopseudomonas palustris. Nat.
Biotechnol. 22, 55–56. Leyva, J.A., Bianchet, M.A., and Amzel, L.M. 2003.
Understanding ATP synthesis: Structure and mechanism of the F1-ATPase (Review).
Mol. Membr. Biol. 20, 27–33. Lopez-Garcia, P., and Moreira, D. 1999. Metabolic
symbiosis at the origin of eukaryotes. Trends Biochem. Sci. 24, 88–93. Ludwig,
W., and Klenk, H.-P. 2001. Overview: A phylogenetic backbone and taxonomic
framework for prokaryotic systematics. In Bergey’s Manual of
Systematic Bacteriology, eds. D. R. Boone and R. W. Castenholz, 49–65.
Springer-Verlag, Berlin. Ludwig, W., and Schleifer, K.H. 1999. Phylogeny of
Bacteria beyond the 16S rRNA Standard. ASM News 65, 752–757. Maidak, B.L.,
Cole, J.R., Lilburn, T.G., Parker, C.T., Jr., Saxman, P.R., Farris, R.J.,
Garrity, G.M., Olsen, G.J., Schmidt, T.M., and Tiedje, J.M. 2001. The RDP-II
(Ribosomal Database Project). Nucleic Acids Res. 29, 173–174. Margulis, L.
1970. Origin of Eukaryotic cells. Yale University Press, New Haven, CT.
Margulis, L. 1993. Symbiosis in Cell Evolution. W.H. Freeman and Company, New
York. Margulis, L. 1996. Archaeal-eubacterial mergers in the origin of Eukarya:
Phylogenetic classification of life. Proc. Natl. Acad. Sci. USA 93, 1071–1076.
Martin, W., and Muller, M. 1998. The hydrogenosome hypothesis for the first
eukaryote. Nature 392, 37–41. Martins-Pinheiro, M., Galhardo, R.S., Lage, C.,
Lima-Bessa, K.M., Aires, K.A., and Menck, C.F. 2004. Different patterns of
evolution for duplicated DNA repair genes in bacteria of the Xanthomonadales
group. BMC. Evol. Biol. 4, 29. McLeod, M.P., Qin, X., Karpathy, S.E., Gioia, J.,
Highlander, S. K., Fox, G. E., McNeill, T. Z., Jiang, H., Muzny, D., Jacob, L.
S., Hawes, A. C., Sodergren, E., Gill, R., Hume, J., Morgan, M., Fan, G., Amin,
A. G., Gibbs, R. A., Hong, C., Yu, X. J., Walker, D. H., and Weinstock, G. M.
2004. Complete genome sequence of Rickettsia typhi and comparison with
sequences of other rickettsiae. J. Bacteriol. 186, 5842–5855. Messer, W. 2002.
The bacterial replication initiator DnaA. DnaA and oriC, the bacterial mode to
initiate DNA replication. FEMS Microbiol. Rev. 26, 355–374. Morden, C.W.,
Delwiche, C.F., Kuhsel, M., and Palmer, J.D. 1992. Gene phylogenies and the
endosymbiotic origin of plastids. Biosystems 28, 75–90. Moreno, E., and
Moriyon, I. 2001. The Genus Brucella. In The Prokaryotes: An Evolving Electronic
Resource for the Microbiological Community, ed. M. Dworkin et al.
Springer-Verlag, New York. Moulin, L., Bena, G., Boivin-Masson, C., and
Stepkowski, T. 2004. Phylogenetic analyses of symbiotic nodulation genes
support vertical and lateral gene cotransfer within the Bradyrhizobium genus.
Mol. Phylogenet. Evol. 30, 720–732. Murray, R.G.E., Brenner, D.J., Colwell,
R.R., De Vos, P., Goodfellow, M., Grimont, P.A.D., Pfennig, N., Stackebrandt,
E., and Zavarzin, G.A. 1990. Report of the Ad Hoc Committee on approaches to
taxonomy within the Proteobacteria. Int. J. Syst. Bacteriol. 40, 213–215.
Nierman, W.C., Feldblyum, T.V., Laub, M.T., Paulsen, I.T., Nelson, K.E., Eisen,
J., Heidelberg, J.F., Alley, M.R., Ohta, N., Maddock, J.R., Potocka, I., Nelson,
W.C., Newton, A., Stephens, C., Phadke, N.D., Ely, B., DeBoy, R.T., Dodson,
R.J., Durkin, A.S., Gwinn, M.L., Haft, D.H., Kolonay, J.F., Smit, J., Craven,
M.B., Khouri, H., Shetty, J., Berry, K., Utterback, T., Tran, K., Wolf, A.,
Vamathevan, J., Ermolaeva, M., White, O., Salzberg, S.L., Venter, J.C.,
Shapiro, L., and Fraser, C.M. 2001. Complete genome sequence of Caulobacter
crescentus. Proc. Natl. Acad. Sci. USA 98, 4136–4141. Ogata, H., Audic, S.,
Renesto-Audiffren, P., Fournier, P.E., Barbe, V., Samson, D., Roux, V.,
Cossart, P., Weissenbach, J., Claverie, J.M., and Raoult, D. 2001. Mechanisms
of evolution in Rickettsia conorii and R. prowazekii. Science 293, 2093–2098.
Olsen, G. J., Woese, C. R., and Overbeek, R. 1991. The winds of (evolutionary)
change: Breathing new life into microbiology. J. Bacteriol. 176, 1–6. Opperman,
T., and Richardson, J.P. 1994. Phylogenetic analysis of sequences from diverse
bacteria with homology to the Escherichia coli rho gene. J. Bacteriol. 176,
5033–5043. Parker, A.R., and Eshleman, J.R. 2003. Human MutY: Gene structure,
protein functions and interactions, and role in carcinogenesis. Cell Mol. Life
Sci. 60, 2064–2083. Paulsen, I.T., Seshadri, R., Nelson, K.E., Eisen, J.A.,
Heidelberg, J. F., Read, T. D., Dodson, R. J., Umayam, L., Brinkac, L. M.,
Beanan, M. J., Daugherty, S. C., DeBoy, R. T., Durkin, A. S., Kolonay, J. F.,
Madupu, R., Nelson, W. C.,
Ayodeji, B., Kraul, M., Shetty, J., Malek, J., Van Aken, S. E.,
Reidmuller, S., Tettelin, H., Gill, S. R., White, O., Salzberg, S. L., Hoover,
D. L., Lindler, L. E., Halling, S. M., Boyle, S. M., and Fraser, C. M. 2002.
The Brucella suis genome reveals fundamental similarities between animal and
plant pathogens and symbionts. Proc. Natl. Acad. Sci. USA 99, 13148–13153. Qi,
H.Y., Sankaran, K., Gan, K., and Wu, H.C. 1995. Structure-function relationship
of bacterial prolipoprotein diacylglyceryl transferase: Functionally significant
conserved regions. J. Bacteriol. 177, 6820–6824. Reuter, K., and Ficner, R.
1995. Sequence analysis and overexpression of the Zymomonas mobilis tgt gene
encoding tRNA-guanine transglycosylase: Purification and biochemical
characterization of the enzyme. J. Bacteriol. 177, 5284–5288. Ribeiro, S., and
Golding, G.B. 1998. The mosaic nature of the eukaryotic nucleus. Mol. Biol.
Evol. 15, 779–788. Rivera, M.C., and Lake, J.A. 1992. Evidence that eukaryotes
and eocyte prokaryotes are immediate relatives. Science 257, 74–76. Rivera,
M.C., and Lake, J.A. 2004. The ring of life provides evidence for a genome
fusion origin of eukaryotes. Nature 431, 152–155. Rokas, A., and Holland, P.W.
2000. Rare genomic changes as a tool for phylogenetics. Trends Ecol. Evol. 15,
454–459. Romanowski, M.J., Bonanno, J.B., and Burley, S.K. 2002. Crystal
structure of the Escherichia coli glucose-inhibited division protein B (GidB)
reveals a methyltransferase fold. Proteins 47, 563–567. Sadowsky, M.J., and
Graham, P.H. 2000. Root and Stem Nodule Bacteria of Legumes. In The Prokaryotes: An
Evolving Electronic Resource for the Microbiological Community, ed. M. Dworkin
et al. Springer-Verlag, New York. Sawada, H., Kuykendall, L.D., and Young,
J.M. 2003. Changing concepts in the systematics of bacterial nitrogen-fixing
legume symbionts. J. Gen. Appl. Microbiol. 49, 155–179. Sicheritz-Ponten, T.,
Kurland, C.G., and Andersson, S.G. 1998. A phylogenetic analysis of the
cytochrome b and cytochrome c oxidase I genes supports an origin of
mitochondria from within the Rickettsiaceae. Biochim. Biophys. Acta. 1365,
545–551. Sixma, T.K. 2001. DNA mismatch repair: MutS structures bound to
mismatches. Curr. Opin. Struct. Biol. 11, 47–52. Soni, R.K., Mehra, P.,
Choudhury, N.R., Mukhopadhyay, G., and Dhar, S.K. 2003. Functional
characterization of Helicobacter pylori DnaB helicase. Nucleic Acids Res. 31,
6828–6840. Stackebrandt, E. 2000. Defining Taxonomic Ranks. In The
Prokaryotes: An Evolving Electronic Resource for the Microbiological Community,
ed. M. Dworkin et al. Springer-Verlag, New York. Stackebrandt, E., Murray,
R.G.E., and Trüper, H.G. 1988. Proteobacteria classis nov., a name for the
phylogenetic taxon that includes the “Purple bacteria and their Relatives.” Int.
J. Syst. Bacteriol. 38, 321–325. Stepkowski, T., Czaplinska, M., Miedzinska,
K., and Moulin, L. 2003. The variable part of the dnaK gene as an alternative
marker for phylogenetic studies of rhizobia and related alpha Proteobacteria.
Syst. Appl. Microbiol. 26, 483–494. Stryer, L. 1995. Biochemistry. W.H. Freeman
and Co., New York. Taillardat-Bisch, A.V., Raoult, D., and Drancourt, M. 2003.
RNA polymerase beta-subunit-based phylogeny of Ehrlichia spp., Anaplasma spp.,
Neorickettsia spp. and Wolbachia pipientis. Int. J. Syst. Evol. Microbiol. 53,
455–458. van Berkum, P., Terefework, Z., Paulin, L., Suomalainen, S.,
Lindstrom, K., and Eardly, B.D. 2003. Discordant phylogenies within the rrn
loci of Rhizobia. J. Bacteriol. 185, 2988–2998. Van Sluys, M.A.,
Monteiro-Vitorello, C.B., Camargo, L.E., Menck, C.F., da Silva, A.C., Ferro,
J.A., Oliveira, M.C., Setubal, J.C., Kitajima, J.P., and Simpson, A.J. 2002.
Comparative genomic analysis of plant-associated bacteria. Annu. Rev.
Phytopathol. 40, 169–189. Viale, A.M., and Arakaki, A.K. 1994. The chaperone
connection to the origins of the eukaryotic organelles. FEBS Letters 341,
146–151. Viale, A.M., Arakaki, A.K., Soncini, F.C., and Ferreyra, R.G. 1994. Evolutionary
relationships among eubacterial groups as inferred from GroEL (chaperonin)
sequence comparisons. Int. J. Syst. Bacteriol. 44, 527–533.
Wang, E.T., van
Berkum, P., Beyene, D., Sui, X.H., Dorado, O., Chen, W.X., and Martinez-Romero,
E. 1998. Rhizobium huautlense sp. nov., a symbiont of Sesbania herbacea that
has a close phylogenetic relationship with Rhizobium galegae. Int. J. Syst.
Bacteriol. 48 Pt. 3, 687–699. Woese, C.R., Stackebrandt, E., Macke, R.J., and
Fox, G.E. 1985. A phylogenetic definition of the major eubacterial taxa.
System. Appl. Microbiol. 6, 143–151. Woese, C.R., Stackebrandt, E., Weisburg,
W.G., Paster, B.J., Madigan, M.T., Fowler, C.M.R., Hahn, C.M., Blanz, P.,
Gupta, R., Nealson, K.H., and Fox, G.E. 1984. The phylogeny of purple bacteria:
The alpha subdivision. System. Appl. Microbiol. 5, 315–326. Wolf, Y.I.,
Aravind, L., and Koonin, E.V. 1999. Rickettsiae and Chlamydiae: evidence of
horizontal gene transfer and gene exchange. Trends Genet. 15, 173–175. Wood,
D.W., Setubal, J.C., Kaul, R., Monks, D.E., Kitajima, J. P., Okura, V. K., Zhou,
Y., Chen, L., Wood, G. E., Almeida, N. F., Jr., Woo, L., Chen, Y., Paulsen, I.
T., Eisen, J. A., Karp, P. D., Bovee, D., Sr., Chapman, P., Clendenning, J., Deatherage,
G., Gillet, W., Grant, C., Kutyavin, T., Levy, R., Li, M. J., McClelland, E.,
Palmieri, A., Raymond, C., Rouse, G., Saenphimmachak, C., Wu, Z., Romero, P.,
Gordon, D., Zhang, S., Yoo, H., Tao, Y., Biddle,
P., Jung, M., Krespan, W., Perry, M., Gordon-Kamm, B., Liao, L., Kim, S.,
Hendrick, C., Zhao, Z. Y., Dolan, M., Chumley, F., Tingey, S. V., Tomb, J. F.,
Gordon, M. P., Olson, M. V., and Nester, E. W. 2001. The genome of the natural
genetic engineer Agrobacterium tumefaciens C58. Science 294, 2317–2323. Young,
J.M., Kuykendall, L.D., Martinez-Romero, E., Kerr, A., and Sawada, H. 2001. A
revision of Rhizobium Frank 1889, with an emended description of the genus, and
the inclusion of all species of Agrobacterium Conn 1942 and Allorhizobium undicola
de Lajudie et al. 1998 as new combinations: Rhizobium radiobacter, R.
rhizogenes, R. rubi, R. undicola and R. vitis. Int. J. Syst. Evol. Microbiol.
51, 89–103. Yu, X.J., and Walker, D.H. 2003. The Order Rickettsiales. In The
Prokaryotes: An Evolving Electronic Resource for the Microbiological Community,
ed. M. Dworkin et al. Springer-Verlag, New York. Yu, X.J., Zhang, X.F.,
McBride, J.W., Zhang, Y., and Walker, D.H. 2001. Phylogenetic relationships of
Anaplasma marginale and ‘Ehrlichia platys’ to other Ehrlichia species
determined by GroEL amino acid sequences. Int. J. Syst. Evol. Microbiol. 51,
1143–1146. Yurkov, V.V., and Beatty, J.T. 1998. Aerobic anoxygenic phototrophic
bacteria. Microbiol. Mol. Biol. Rev. 62, 695–724.