What is the significance of the evolution of Hox gene clusters during vertebrate evolution?


Figure 2: Changes in specific vertebral elements for the Hox5, Hox6, Hox9, Hox10, and Hox11 paralogous mutants.

On the left side of the panel, a diagram of the axial skeleton is shown, with the specific vertebral elements shown in the right panel marked (C, cervical; T, thoracic; L, lumbar; S, sacral). Wild-type control elements from specific vertebral positions are denoted by letter and number. The analogous segments from the paralogous mutants are shown on the right and left, with colored boxes for each paralogous mutant group.

Now examine the mouse portion of Figure 1. Vertebrates, including mice, have Hox genes that are homologous to those of the fly, and these genes are clustered in discrete locations with a 3'-to-5' order reflecting an anterior to posterior order of expression. There are several differences between the mouse and fly Hox genes, however. One obvious difference is that there are more Hox genes on the 5' side of the mouse segment; these correspond to expression in the tail, and flies do not have anything homologous to the chordate tail. Another difference is that, in the mouse, there are four banks of Hox genes: HoxA, HoxB, HoxC, and HoxD. Vertebrates have these parallel, overlapping sets of Hox genes, which suggests that morphology could be a product of a combinatorial expression of the genes in the four Hox clusters. This means that there could be a Hox code, in which identity can be defined with more gradations by mixing up the bounds of expression of each of the genes.

In the fly, the situation is much simpler. Because each segment more or less expresses only one Hox gene, mutating or knocking out a single Hox gene will have an effect on the corresponding body segment. In vertebrates, though, each segment has at least two, and in some cases four, Hox genes that may be involved in its development. As a result, there is the possibility of redundancy.

For instance, in mice, the HoxA3 gene is expressed in the anterior cervical vertebrae, near the region where the first neck vertebra articulates with the skull. Deleting HoxA3 has no detectable effects on that joint; either its influence is too subtle to measure, it affects some other aspect of cervical specification, or it has a partner gene that takes over its job in its absence. Notice in Figure 1 that HoxA3 has a paralog, or copy, called HoxD3, which is expressed in a very similar place. When HoxD3 is mutated all by itself, there is a serious abnormality; here, the first neck vertebra has a partial fusion with the base of the skull. However, knocking out both HoxA3 and HoxD3 shows that HoxA3 is important after all; without it, the first neck vertebra doesn't form. In fact, in this instance, it is thought that the initial mesodermal tissue for the bone is so thoroughly respecified that it fuses completely with the skull instead, becoming part of the base of the skull.

These results tell us that a combination of Hox genes is required for the proper development of the first cervical vertebra. They also complicate analyses by indicating that knocking out the Hox genes one at a time in the mouse will result in cases in which no phenotype or a partial phenotype will be seen, even when the gene has an important role to play in that segment. Ultimately, all of the paralogous genes need to be knocked out. That is, in order to see what the third Hox gene does in the cluster, for instance, you need to carry out a paralogous deletion that destroys the function of HoxA3, HoxB3, and HoxD3 (there is no HoxC3) to assess the phenotype.
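The bookkeeping behind a paralogous deletion can be sketched in a few lines of Python. This is only an illustration: the function name `paralog_group` and its `absent` parameter are hypothetical, and the only gap encoded is the one stated above (there is no HoxC3).

```python
# Illustrative sketch only: the names below are hypothetical, not from any library.
CLUSTERS = ("A", "B", "C", "D")  # the four mouse Hox clusters named in the text


def paralog_group(number, absent=frozenset()):
    """List the genes that a paralogous deletion of group `number` must target.

    `absent` holds the names of genes known not to exist in the genome.
    """
    candidates = [f"Hox{cluster}{number}" for cluster in CLUSTERS]
    return [gene for gene in candidates if gene not in absent]


# The text states that there is no HoxC3, so three genes must be removed together.
group3 = paralog_group(3, absent={"HoxC3"})
print(group3)  # ['HoxA3', 'HoxB3', 'HoxD3']
```

A single-gene knockout corresponds to removing just one element of this list, which is why it can leave the phenotype partly or fully masked by the remaining paralogs.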

This phenomenon is also one reason why homeotic mutations in vertebrates are so rarely seen. In flies, one gene can be mutated, resulting in a haltere being transformed into a wing, or an antenna turning into a leg; in the mouse, two to four genes must be simultaneously removed to get a similar complete transformation.

It is now well established that there were four Hox gene clusters in the genome of the last common ancestor of extant gnathostomes. To better understand the evolution of the organization and expression of these genomic regions, we have studied the Hox gene clusters of a shark (Scyliorhinus canicula). We sequenced 225,580 expressed sequence tags from several embryonic cDNA libraries. Blast searches identified corresponding transcripts to almost all the HoxA, HoxB, and HoxD cluster genes. No HoxC transcript was identified, suggesting that this cluster is absent or highly degenerate. Using Hox gene sequences as probes, we selected and sequenced seven clones from a bacterial artificial chromosome library covering the complete region of the three gene clusters. Mapping of cDNAs to these genomic sequences showed extensive alternative splicing and untranslated exon sharing between neighboring Hox genes. Homologous noncoding exons could not be identified in transcripts from other species using sequence similarity. However, by comparing conserved noncoding sequences upstream of these exons in different species, we were able to identify homology between some exons. Some alternative splicing variants are probably very ancient and were already coded for by the ancestral Hox gene cluster. We also identified several transcripts that do not code for Hox proteins, are probably not translated, and all but one are in the reverse orientation to the Hox genes. This survey of the transcriptome of the Hox gene clusters of a shark shows that the high complexity observed in mammals is a gnathostome ancestral feature.

Hox gene cluster, Scyliorhinus canicula, transcriptome, shark, gnathostomes

In all vertebrate species examined so far, Hox genes are tightly linked in duplicated clusters and several rounds of whole-genome duplications at different times and in different lineages have resulted in variable numbers of Hox clusters in different species (Kuraku and Meyer 2009; Ravi et al. 2009). It is now well established that two rounds of duplications occurred before the diversification of extant gnathostomes (jawed vertebrates), even if it is not clear yet whether these events occurred before or after the separation of extant cyclostomes (jawless fish) and gnathostomes (Robinson-Rechavi et al. 2004; Kuraku et al. 2009). For gnathostomes, four Hox clusters are the ancestral state. Additional rounds of genome duplication have subsequently increased the number of clusters in some lineages, as exemplified by some actinopterygians (ray-finned fish) (Kurosawa et al. 2006; Hoegg et al. 2007; Zou et al. 2007; Mungpakdee et al. 2008). In these genomes, some clusters contain only a few genes or have even completely disappeared. A negative correlation between the number of Hox clusters and the number of genes on each cluster has been observed (Wagner et al. 2003), supporting the idea that new rounds of duplication may release constraints on the maintenance of clustered genes, as has been observed in the ParaHox genes (Mulley et al. 2006).

Currently, our views on the evolution of the organization and expression of Hox gene clusters in vertebrates rely on studies of a few osteichthyans (“bony fish,” such as zebrafish, mouse, chicken, or Xenopus). In addition, most of these studies have been limited to the tetrapods. In osteichthyans, spatial and temporal collinearity are conserved along the anteroposterior axis in the branchial arches, hindbrain, somites, and appendages. Complex expression patterns involving partial spatial and/or temporal collinearity, or no collinearity at all, have also been observed during the development of kidneys, gut, male urogenital tract and external genitalia, müllerian tract, lungs, hematopoiesis and lymphomagenesis, and development of endometrium and hairs (Papageorgiou 2007 and references therein). These data clearly show that Hox genes have been co-opted several times during vertebrate evolution (Duboule 1998).

There are far fewer studies on the other main group of gnathostomes, the chondrichthyans (cartilaginous fish), and only by studying species from groups specifically selected for their phylogenetic position will we be able to gain a more detailed understanding of the evolutionary dynamics of Hox gene organization and functional diversity in vertebrates (Monteiro and Ferrier 2006). The chondrichthyans consist of two taxonomic groups: Holocephali (chimaeras) and Elasmobranchii (sharks and rays). These two monophyletic groups diverged about 400 Ma (Janvier 1996), whereas the chondrichthyans and osteichthyans diverged about 450 Ma. Sampling species among chimaeras, sharks, and rays is thus of prime importance in order to refine our understanding of evolutionary events at the level of the gnathostomes.

A survey of Hox genes in the genome of a chimaera, the elephant shark Callorhinchus milii, identified 45 genes belonging to four Hox clusters (Ravi et al. 2009). No description of a complete set of Hox gene clusters has ever been published for a shark. However, two Hox clusters, HoxA (complete) and HoxD (partial), have been described in the horn shark Heterodontus francisci, suggesting that its genome has between two and four Hox gene clusters (Kim et al. 2000; Chiu et al. 2002; Powers and Amemiya 2004a). No Hox gene expression pattern has been published for any chimaera. Partial data on HoxA11, HoxA13, and posterior HoxD gene (HoxD9 to HoxD13) expression have been reported for the dogfish (Scyliorhinus canicula) and HoxD14 in a closely related species, Scyliorhinus torazame. Spatial and temporal collinearity were observed along the body axis and in the fins (Freitas et al. 2006, 2007; Sakamoto et al. 2009), except for HoxD14 which showed a very restricted domain of expression in the most posterior part of the hindgut (Kuraku et al. 2008). The dogfish being a particularly good chondrichthyan model from an experimental point of view (Coolen et al. 2007), we set up a project to study both the genome and the transcriptome of this species, with particular attention to the Hox genes.

Five cDNA libraries for four different developmental stages and a bacterial artificial chromosome (BAC) library were constructed and used to study the organization of Hox gene clusters and their transcription in the dogfish. Our results show that the HoxC cluster has been lost or possibly restricted to only very few genes, a situation that has not been reported in other gnathostome species. Mapping of cDNAs to genomic sequences showed extensive alternative splicing and untranslated exon sharing between neighboring Hox genes. We also identified several transcripts that do not code for HOX proteins, are probably not translated, and all but one are in the reverse orientation to the Hox genes. The complexity of the Hox gene cluster transcriptome found in mammals was also observed in the dogfish, suggesting that it has been inherited from the last common ancestor of the gnathostomes. Moreover, taking into account conserved noncoding sequences found upstream from poorly conserved untranslated exons in different species, we identified homologous exons that could not be identified solely on the basis of sequence similarity. Based on these analyses, we propose that some alternative splicing variants are very ancient and were already encoded in the ancestral Hox gene cluster that predated the two rounds of genomic duplication in vertebrates.

Materials and Methods

Dogfish cDNA Libraries

Five cDNA libraries were constructed from embryos or tissues covering a wide range of developmental stages (Ballard et al. 1993). cDNAs were cloned in pSPORT1. The libraries were not normalized, and because we did not select full-length cDNAs, the cDNAs could be 5′ truncated. These libraries were called D (embryos from stages 9 to 15), B and E (two independent libraries, embryos from stages 19 to 24), F (from a juvenile animal, just after hatching), and C (adult brain and eyes).

Random Sequencing of Expressed Sequence Tags

For each library, clones were picked at random: 6,380, 23,881, 97,248, 88,940, and 9,131 from libraries B, C, D, E, and F, respectively. Each clone was one-shot sequenced from the 5′ end (i.e., 225,580 5′ end expressed sequence tags [ESTs] were generated).
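As a quick consistency check, the per-library clone counts can be summed to confirm the stated EST total. The snippet below is a minimal sketch; the variable names are ours, and only the numbers come from the text.

```python
# Clones picked at random per library, as listed in the text.
clones_per_library = {"B": 6_380, "C": 23_881, "D": 97_248, "E": 88_940, "F": 9_131}

# One 5' end read per clone, so the sum is the number of ESTs generated.
total_ests = sum(clones_per_library.values())
print(total_ests)  # 225580
```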

Identification and Sequencing of Hox Gene cDNA

We looked for ESTs of Hox genes using BlastX similarity search against the SWISS-PROT protein database. All the ESTs for which the description of the best hit contained the key word “hox” were retained as putative Hox transcripts. We discarded the few ESTs that in fact were not similar to Hox proteins but for which the best hit included the word hox. In order to identify further Hox ESTs that may not have been identified using this first approach, we also used the TBlastN program to search for similarity with all human and chimaera Hox protein sequences against all the ESTs. SP6 and T7 primers flanking the insertion site in the vector were used to sequence the cDNAs corresponding to the ESTs matching with Hox proteins. When necessary, internal primers were designed to get the full sequence.
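The keyword-based selection and the subsequent discarding of false positives can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the EST identifiers and hit descriptions are invented, and the regular expression is only a crude stand-in for the curation step, which the text describes without naming a rule.

```python
import re

# Illustrative sketch: EST identifiers and best-hit descriptions are invented examples.
best_hits = {
    "EST_0001": "Homeobox protein Hox-A3",
    "EST_0002": "Neutrophil cytosol factor 1 (p47-phox)",
    "EST_0003": "Actin, cytoplasmic 1",
    "EST_0004": "Homeobox protein Hox-D4",
}


def keyword_candidates(hits, keyword="hox"):
    """Return EST ids whose best-hit description contains `keyword` (case-insensitive)."""
    return [est for est, description in hits.items() if keyword in description.lower()]


def looks_like_hox_protein(description):
    """Hypothetical curation rule: accept names such as 'Hox-A3' or 'HoxD4'."""
    return re.search(r"\bhox-?[a-d]\d{1,2}\b", description.lower()) is not None


candidates = keyword_candidates(best_hits)
retained = [est for est in candidates if looks_like_hox_protein(best_hits[est])]
print(candidates)  # ['EST_0001', 'EST_0002', 'EST_0004']
print(retained)    # ['EST_0001', 'EST_0004']
```

The second entry shows why a bare keyword match is not sufficient: a description can contain the string "hox" without referring to a Hox protein.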

Construction and Screening of the Dogfish BAC Library

A 2× coverage BAC genomic library, with an average insert size of 150 kb, was constructed. A total of 140,000 clones were spotted out on five membranes (28,000 on each membrane). Dogfish Hox gene fragments were polymerase chain reaction (PCR) amplified, with primers designed from cDNA, and used to synthesize 32P-labeled probes that allowed, by Southern blot, the identification of BAC clones containing Hox genes. A second round of selection among positive clones was performed using primers for different Hox genes in order to select the minimal number of BAC clones needed to cover the complete Hox clusters. Seven BACs were retained in order to cover the entire HoxA, HoxB, and HoxD clusters.

Sequencing of BACs and Assembly of Hox Clusters

BAC clones were sequenced using standard shotgun sequencing methods. The HoxA cluster was reconstructed with three overlapping BACs and the HoxB cluster with two overlapping BACs. As the two BACs used to assemble the HoxD cluster were not overlapping, a long-range PCR with primers designed to match the ends of the BACs was performed in order to amplify the missing genomic region. The PCR product of 5.2 kb was cloned, and three independent clones were entirely sequenced using primer walking (supplementary fig. S1, Supplementary Material online). Sequences of the three Hox clusters were deposited in GenBank under accession numbers FQ032658–FQ032660.

Transcript Mapping

We performed est2genome (EMBOSS package) (Rice et al. 2000) alignments with all the ESTs against the three Hox cluster sequences. We identified eight new Hox ESTs, and the corresponding cDNAs were sequenced. Twenty-four “non-Hox” ESTs, all but one with an antisense orientation to the Hox genes, were also identified, and the corresponding cDNAs were sequenced. Mapping of cDNA sequences on genomic sequences was performed using est2genome from the EMBOSS package. The output file of est2genome was parsed into a file readable by Artemis (Rutherford et al. 2000) in order to obtain a graphical representation. Sequences of the 140 cDNAs matching with the genomic sequences of Hox clusters were deposited in the GenBank database under accession numbers FQ032661–FQ032800.
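For readers who want to retrieve the deposited cDNAs, the accession range can be expanded into individual identifiers. The helper below is a hypothetical sketch: `expand_accessions` is not part of any library, and it assumes a two-letter prefix followed by zero-padded digits, as in the accessions quoted above.

```python
def expand_accessions(first, last, prefix_len=2):
    """Expand an inclusive accession range, e.g. FQ032661 to FQ032800.

    Assumes both accessions share a `prefix_len`-letter prefix followed by
    zero-padded digits of equal width.
    """
    prefix = first[:prefix_len]
    width = len(first) - prefix_len
    start, stop = int(first[prefix_len:]), int(last[prefix_len:])
    return [f"{prefix}{number:0{width}d}" for number in range(start, stop + 1)]


cdna_accessions = expand_accessions("FQ032661", "FQ032800")
print(len(cdna_accessions))  # 140
```

The count matches the 140 cDNAs reported in the text.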

Sequence Annotation

Hox-translated exons and microRNA (miRNA) genes were predicted using a similarity-based approach. Repetitive sequences were identified using CENSOR (Kohany et al. 2006) with default settings (//www.girinst.org/censor/index.php).

Phylogenetic Analyses

In order to confirm which orthologous group of genes each dogfish Hox gene belongs to, we performed phylogenetic analyses. For each paralogous group (Hox1 to Hox14), a multiple protein sequence alignment was carried out using ClustalX (Thompson et al. 1997) and optimized manually using the MUST program (Philippe 1993). After removing regions of ambiguous homology, the edited alignment was used in subsequent analyses. We first applied ProtTest2.4 (Abascal et al. 2005) to estimate the optimal model of amino acid substitution. Using this model, a maximum likelihood (ML) tree was inferred using PHYML3.0 (Guindon and Gascuel 2003). The robustness of the ML tree was estimated by 100 bootstrap replications.

Identification and Analysis of Conserved Noncoding Elements

Multiple alignments of dogfish and other vertebrate Hox cluster sequences were generated using SLAGAN (Brudno et al. 2003) and visualized using VISTA (Frazer et al. 2004).

Results and Discussion

HoxA, HoxB, and HoxD Cluster Assembly and Annotation

Several dogfish Hox genes were identified from the embryonic cDNA library sequences (see next paragraph) and were used to design probes to screen for BACs containing Hox genes. We isolated and sequenced seven BACs encompassing the entire genomic sequence of three Hox cluster loci (A, B, and D) in the dogfish (supplementary fig. S1, Supplementary Material online). Hox gene content in these three Hox gene clusters was estimated using BlastX searches against the SWISS-PROT protein database. A total of 11, 11, and 12 Hox genes were found in the HoxA, HoxB, and HoxD genomic sequences, respectively (total: n = 34) (fig. 1 and supplementary fig. S2, Supplementary Material online). We further confirmed which orthologous group these genes belong to by phylogenetic analyses (supplementary fig. S3, Supplementary Material online). Additionally, we identified two mir-10 and two mir-196 sequences. These miRNAs have been previously identified in other vertebrate Hox gene clusters (Yekta et al. 2004; Tanzer et al. 2005). A mir-10a was found between HoxB4 and HoxB5 and a mir-10b between HoxD4 and HoxD5. Their sequences are identical to those previously identified in other gnathostomes (supplementary fig. S4A, Supplementary Material online). A mir-196a was found between HoxB9 and HoxB10 and a mir-196b between HoxA9 and HoxA10. The mir-196a sequence is fully conserved within gnathostomes, whereas one substitution was found in mir-196b (supplementary fig. S4A, Supplementary Material online). Searching for the complementary sequence of mir-196 in the dogfish Hox gene cluster sequences, we found two putative target sequences for mir-196, downstream to mir-196 in the 3′ untranslated region of the HoxB8 and the HoxD8 genes (supplementary fig. S4B, Supplementary Material online). These sequences are well conserved within gnathostomes and seem to be even more constrained in the dogfish lineage than in osteichthyans (supplementary fig. S4C, Supplementary Material online). 
Using CENSOR (Kohany et al. 2006), we also identified repetitive sequences within and around the Hox clusters. Their density is much lower inside the Hox clusters than in the flanking regions, except for the region between HoxB13 and HoxB10 (fig. 1). Low density of repetitive elements is a common feature of vertebrate Hox gene cluster organization with very few exceptions, such as in the green anole lizard (Di-Poi et al. 2009).

HoxA, HoxB, and HoxD cluster loci in dogfish. HOX-coding sequences (red boxes) and identified transcripts (dark blue boxes: Hox RNAs; light blue: noncoding RNAs) were located on each locus using Artemis (Rutherford et al. 2000). For each coding sequence, the number of transcripts is shown above the position of its coding sequence. Repetitive elements as predicted by CENSOR (Kohany et al. 2006) are plotted on the bottom line of each locus.

Transcriptomic Evidence for HoxC Gene Losses

Using Blast programs (BlastX and TBlastN), we looked for Hox transcripts in a set of 225,580 dogfish ESTs. The BlastX search against the SWISS-PROT protein database identified 105 sequences matching Hox genes with a sense orientation and three sequences matching with an antisense orientation. A TBlastN search for similarity with the complete set of human and chimaera Hox protein sequences against our ESTs identified one additional Hox gene EST, which had not been found in the previous search. A total of 106 ESTs were thus identified as Hox gene transcripts at this stage. Phylogenetic analyses showed that these 106 sense-oriented ESTs corresponded to 30 different Hox genes belonging to HoxA, HoxB, or HoxD gene clusters, but none could be assigned to the HoxC gene cluster. The three antisense-orientated ESTs matched three Hox gene-coding sequences (HoxB4, HoxB8, and HoxD4) (fig. 1). It is possible to miss some transcripts by Blast searches against databases because of low similarity with homologous sequences from other species (cDNAs truncated at their 3′ end, cDNAs with a long untranslated 5′ sequence, and noncoding transcripts are such poorly conserved sequences), so we performed est2genome (EMBOSS package) (Rice et al. 2000) alignments between all ESTs and the three Hox gene cluster sequences. We identified 8 additional Hox ESTs and 24 “non-Hox” ESTs, all but one with an antisense orientation to the Hox genes.

We thus identified 34 Hox-coding sequences organized into three clusters (HoxA, HoxB, and HoxD) in the dogfish genome. For clusters A, B, and D, which contain 11, 11, and 12 Hox genes, respectively, we identified 9, 10, and 11 corresponding transcripts using the EST data set (total: n = 30), so only four of the 34 genes lacked a corresponding transcript in the EST data. From this, we can estimate the probability of “not finding a known Hox gene in the EST set” as 4/34 (0.12). If we presume that HoxC genes are not transcribed at a much lower level or at very different stages compared with genes from other Hox clusters, the probability of finding no EST for HoxC cluster genes is 0.12^n, where n is the number of genes in this cluster. The probability of “not finding a HoxC gene” is already very low if n = 2 (0.12^2 = 0.0144). It is therefore very unlikely that there are more than two HoxC genes in this genome. This result is in accordance with previous data suggesting that there are probably only two HoxC genes (HoxC4 and HoxC10) in the horn shark genome (Powers and Amemiya 2004b). Using sequences of the elephant shark and other vertebrates, we designed degenerate primers to PCR amplify a fragment of these HoxC genes. We could not get PCR products (data not shown).
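The probability argument above can be reproduced with a short Python sketch. The variable and function names are ours. The calculation rests on the assumption implied by the formula in the text, namely that each HoxC gene would be missed independently with the same probability as genes in the other clusters.

```python
genes_total = 34        # Hox genes found in the HoxA, HoxB, and HoxD clusters
genes_without_est = 4   # genes with no matching transcript in the EST set

# Per-gene probability of being missed, rounded to two decimals as in the text.
p_miss = round(genes_without_est / genes_total, 2)


def prob_no_est(n_genes, p=p_miss):
    """Probability that none of `n_genes` genes yields an EST, assuming independence."""
    return p ** n_genes


for n in (1, 2, 3, 4):
    print(n, round(prob_no_est(n), 4))
```

For n = 2 this gives 0.0144, the value quoted above, and the probability keeps shrinking by roughly an order of magnitude for each additional gene.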

Evolution in the number of Hox genes and of Hox gene clusters is a matter of interest as an example of a positive correlation between anatomical complexification/diversification and gene duplication in vertebrates (Ohno 1970; Wagner et al. 2003). However, the finding that actinopterygians have experienced several whole-genome duplications resulting, in some cases, in higher numbers of genes than in tetrapods, in particular humans, questions this gradist and anthropocentric hypothesis. In addition, fossils and species richness in extant clades provide no obvious support for a link between gene duplication and the evolution of complexity in vertebrates (Donoghue and Purnell 2005) or speciation rate (Alfaro et al. 2009; Santini et al. 2009). Here we present transcriptomic evidence suggesting that the dogfish may have completely, or almost completely, lost its HoxC gene cluster without this event being related to a new round of whole-genome duplication. This is not a chondrichthyan ancestral feature as the chimaera has Hox gene clusters similar to the one inferred for the last common vertebrate ancestor (Ravi et al. 2009) (fig. 2). A complete lack of data on Hox gene clusters from other elasmobranchs, such as rays or distantly related sharks, hampers a better analysis of Hox gene cluster evolution in this group. It would be particularly interesting to examine the tempo of HoxC cluster loss: were these genes lost gradually or in one or a few steps implying simultaneous loss of several genes? Functional studies suggest that it could be easier to lose a HoxC cluster in one step than through the deletion of individual genes (Suemori and Noguchi 2000).

Early steps in gnathostome Hox cluster evolution. The loss of Hox genes is indicated on the branches, pseudogenes are colored gray, putative genes in the dogfish HoxC cluster (in reference to horn shark data) are shown as gray dotted squares.

Transcript Mapping on Hox Gene Cluster Genomic Sequences

The cDNAs corresponding to all 140 ESTs matching with Hox genomic sequences were fully sequenced. A graphical representation of transcript sequences mapped on the genomic sequences was created using Artemis (Rutherford et al. 2000) (fig. 1). We found cases of alternative splicing for HoxA11, HoxB9, HoxB4, HoxB3, and HoxD3 and two occurrences of untranslated 5′ exon sharing between HoxD3 and HoxD4 (fig. 1). In addition to Hox-coding cDNAs, 26 other transcripts were identified, either spliced or unspliced, all but one in an antisense orientation relative to the Hox transcripts. Among these 26 transcripts, six sequences partially overlapped with a Hox transcript (fig. 1). Alignments of all these sequences with Hox loci of the chimaera showed that some of these transcripts include conserved domains. However, all alignments involved many indels, suggesting that these transcripts are noncoding. Extensive alternative splicing variants of Hox genes and widespread antisense transcription of noncoding RNAs have also been observed in human and mouse (Mainguy et al. 2007), but the functional relevance of this transcriptional complexity is still poorly understood. In particular, the functions of most of the long noncoding RNAs are unknown, except for HOTAIR, which is coded for by the HoxC locus and known to repress in trans the transcription of a large domain of the HoxD locus in human (Rinn et al. 2007).

Structural Evidence of Homology between Poorly Conserved Exons

VISTA plots of the SLAGAN alignments between the dogfish HoxA, B, and D loci and orthologous sequences from chimaera, coelacanth, human, zebrafish, and medaka identified many conserved sequences (supplementary figs. S5, S6, and S7, Supplementary Material online). As expected, conserved noncoding elements (CNEs) were found in addition to highly conserved and translated exonic sequences. We then focused our attention on the three regions with Hox gene alternative splicing: HoxD4/HoxD3, HoxB4/HoxB3, and HoxB10/HoxB9. In order to study the homology of different transcripts in gnathostomes, dogfish exons were compared with mouse and human exons annotated by the “Human And Vertebrate Analysis aNd Annotation” (Havana) group at Sanger Institute and displayed by the Ensembl genome browser (supplementary figs. S8 and S9, Supplementary Material online). Translated exons were called E1 and E2 and upstream untranslated exons E-1 to E-6 (figs. 3 and 4 and supplementary fig. S10, Supplementary Material online).

Transcript mapping and conserved noncoding sequences at the HoxD3-HoxD4 locus. The main frame shows the dogfish locus with identified transcripts localized using Artemis (Rutherford et al. 2000) and conserved noncoding sequences identified by comparison to human sequence using VISTA (Frazer et al. 2004) (translated exons: E1 and E2; untranslated exons: E-1 to E-6 according to their distance to the exon E1 of HoxD3). Above the main frame are shown the human and the mouse homologous loci with localization of the human and mouse translated and untranslated exons, respectively (note that identical name for untranslated exons in different species does not imply homology). Proposed orthology of exons is represented as a black line between the frames.

The HoxD4/HoxD3 locus showed a particularly complex transcriptional organization. In the dogfish, six untranslated exons were identified, two of them (E-4 and E-5) were shared between HoxD4 and HoxD3. None of these exons show sequence conservation in gnathostomes (fig. 3 and supplementary fig. S7, Supplementary Material online) but the 5′ terminal exons E-2, E-5, and E-6 were localized just downstream of CNEs. As the internal exons (E-4 and E-1) were not associated with any CNE, it suggests that the 5′ terminal exon–coupled CNEs may be involved in the positioning of the origin of transcription for different messenger RNAs (mRNAs) sharing the same coding exons. The similar location of human (and mouse) exon E-3 and dogfish E-6, close to the same CNE, suggests that they are homologous even though their primary sequence is not conserved (fig. 3 and supplementary fig. S11, Supplementary Material online). This CNE (position 110 kb in the dogfish HoxD cluster, fig. 3) is located upstream of HoxD5 in the dogfish and was conserved in almost all osteichthyan HoxD clusters despite osteichthyans having lost HoxD5 early during their evolution, that is, before their last common ancestor (Kuraku and Meyer 2009). This observation suggests that this CNE is not involved in the expression of HoxD5. In contrast, the loss of HoxD3b in the medaka is associated with the loss of this CNE (supplementary fig. S7, Supplementary Material online). This CNE may therefore be involved in HoxD3 regulation but not in the regulation of the more closely linked HoxD5 and HoxD4 genes. Similarly, the CNEs (position 119 kb and 137.5 kb in the dogfish HoxD cluster, fig. 3) linked with the dogfish E-5 and E-2 exons are found in the medaka HoxDa gene cluster (with a HoxD3a gene) but not in the HoxDb gene cluster (which is missing the HoxD3b gene, see supplementary fig. S7, Supplementary Material online). 
These observations suggest that these three CNEs are involved in the transcriptional regulation of HoxD3 genes in gnathostomes. The internal exons called E-1 in both species are not conserved, but their location between the same CNEs suggests that they may also be homologous (fig. 3).

Transcription at the HoxB4/HoxB3 locus showed striking similarities with transcription at the HoxD4/HoxD3 locus. The same exon associations (E-3/E-1/E1/E2 and E-2/E-1/E1/E2) were found for both clusters (figs. 1, 3, and 4). The relative position of the untranslated exons (E-1, E-2, and E-3) between HoxD4 (HoxB4) and HoxD3 (HoxB3) suggests that they are homologous (figs. 3 and 4). HoxB4/HoxB3 and HoxD4/HoxD3 genomic sequences showed low sequence conservation except for the translated exons (supplementary fig. S12, Supplementary Material online) and the HoxB3-E-2 and HoxD3-E-2 exons, further suggesting that they are homologous exons. HoxB3-E-3 and HoxD3-E-3 exons are both downstream of the same CNE. The dogfish HoxB4-E-1 and human HoxB3-E-5 exons are located downstream of the same CNE (fig. 4). This further suggests exon sharing between HoxB4 and HoxB3 as observed for HoxD4 and HoxD3.

Comparative genomic analysis of two alternative splicing variants of the zebrafish Hoxb3a gene had previously identified putative homologous untranslated exons shared by human and zebrafish (Hadrys et al. 2004) (see also fig. 4). Human HOXB3-E6 and zebrafish Hoxb3a-E4 exons are downstream of a CNE shared among gnathostomes (fig. 4 and supplementary fig. S6, Supplementary Material online). This CNE, at position 155.8 kb on the dogfish cluster B, is homologous to the CNE at position 119.3 kb on dogfish cluster D, upstream of the dogfish HoxD3-E5 exon which is a 5′ terminal exon (supplementary fig. S12, Supplementary Material online) and therefore putatively homologous to the human 5′ terminal HOXB3-E6 exon. This observation highlights a Hox gene cluster characteristic that may be inherited from the ancestral Hox cluster and conserved in gnathostome duplicated clusters.

Three different transcripts of the dogfish HoxB9 locus were found (supplementary fig. S10, Supplementary Material online). Surprisingly, one of these transcripts combines the first coding exon of HoxB10 (HoxB10-E1, named HoxB9-E-2 in supplementary fig. S10, Supplementary Material online) and the second coding exon of HoxB9 (HoxB9-E2). Another transcript does not contain the expected HoxB9-E1 exon but an upstream alternative exon (E-1, second and third cDNAs in supplementary fig. S10, Supplementary Material online). This transcript is probably noncoding as the exon E-1 only codes for four amino acids with no methionine when translated in frame with exon E2 (supplementary fig. S13, Supplementary Material online). Sequence comparison with homologous sequence from the chimaera also supports this conclusion (supplementary fig. S14, Supplementary Material online). The intergenic sequence between HoxB10 and HoxB9 does not contain CNEs at the gnathostome level (except for mir-196a), but a comparison between dogfish and chimaera showed that the E-1 exon and the upstream flanking sequence are conserved between these species (supplementary fig. S10, Supplementary Material online). We did not find any putative orthologs of the HoxB9-E-1/E2 transcript in the other species, so it may be expressed only in chondrichthyans. A similar case has been observed at the human HOXA9 locus (Popovic et al. 2008).

In our survey, four transcripts were found to originate from the second coding exon of a Hox gene and a 5′ untranslated exon (HoxA11-E-1/E2, HoxB9-E-1/E2, HoxB9-E-2/E2, and HoxB4-E-1/E2). It is worth noting that all but one of these (HoxA11-E-1/E2) result from the splicing of an miRNA-bearing intron in the pre-mRNA (figs. 1 and 4 and supplementary fig. S10, Supplementary Material online). As miRNAs are often intron encoded (Rodriguez et al. 2004; Morlando et al. 2008), we hypothesize that these noncoding Hox transcripts are actually pre-miRNAs. Within the HoxB9, HoxB4, and HoxD4 loci, we found both canonical and noncanonical (with 5′ untranslated exons) transcripts encompassing putative miRNA-coding introns (figs. 1, 3, and 4 and supplementary fig. S10, Supplementary Material online). A dual role for the transcription of Hox genes like this could explain the high level of complexity observed at these loci.

Hox Gene Cluster Transcriptional Complexity in Gnathostomes

The increase in vertebrate complexity, in terms of the number of cell types, tissues, and/or organs, may have relied on the growing complexity of gene regulatory networks through gene duplications, an increase in the number of alternative splicing events, or an increase in the transcription of noncoding RNAs. The relative importance of these different mechanisms in building biological innovations during evolution is not yet clear. Our transcriptomic survey of the Hox genomic regions suggests that the complexity of their transcription (i.e., alternative splicing and long noncoding RNA) predated the divergence of extant gnathostomes. The association of poorly conserved exons with identified CNEs revealed a homology between untranslated exons in gnathostomes such as dogfish and humans. The same data suggest that some HoxB3 and HoxD3 alternative splicing variants may be homologous and thus may already have been encoded by the unduplicated, single ancestral Hox cluster from which they are derived.

Hox Gene Cluster Sequence Divergence in Chondrichthyans versus Sarcopterygians

It has recently been pointed out that the sequence of the HoxA cluster is more conserved in chondrichthyans than in osteichthyans (Mulley et al. 2009). In actinopterygians, many Hox genes and CNEs were lost concomitantly with genome duplications that may have released some selective constraints on the maintenance of duplicated copies. In sarcopterygians such as humans and the coelacanth, such genome duplications did not occur (Kuraku et al. 2009). However, it has been shown that the Hox gene clusters have evolved more slowly in the coelacanth lineage than in the human lineage (Amemiya et al. 2010). We compared the level of sequence conservation for the HoxA, HoxB, and HoxD clusters in two chondrichthyan species (dogfish and elephant shark, a chimaera) and two sarcopterygian species (human and coelacanth). The divergence time for each pair of species is approximately the same, around 410 My (Janvier 1996), so if the mutation rate and the selective constraints on the sequences were similar in both groups, we would expect similar levels of sequence divergence. At all three loci examined, many more CNEs were identified when comparing the two chondrichthyans than when comparing the two sarcopterygians (supplementary fig. S15, Supplementary Material online). The fraction of conserved sequences within Hox gene clusters was estimated using the Gumby algorithm (Prabhakar et al. 2006) on the VISTA server (//genome.lbl.gov/vista/index.shtml). In the HoxA cluster, CNEs made up 36.5% of the total cluster length in the dogfish versus chimaera comparison and 7.5% in the human versus coelacanth comparison, without a striking difference in cluster length. For the same two pairs of species, the CNE sequences represented, respectively, 13.7% and 3.3% of the HoxB gene clusters and 24.0% and 5.0% of the HoxD gene clusters (table 1).

Previous results have shown that the HoxA cluster evolves at a lower rate in chondrichthyans than in sarcopterygians (Mulley et al. 2009). Our data extend this result to the HoxB and HoxD clusters. As the Gumby algorithm is designed to detect CNEs evolving more slowly than the background neutral evolutionary rate of nonexonic regions, these results also indicate that chondrichthyan Hox gene clusters contain many more, and longer, CNEs than sarcopterygian Hox gene clusters (table 1). However, it is not possible to accurately assess the relative roles of variations in mutation rate and in selective constraint in shaping these different conservation profiles, as no independent estimate of the mutation rate in chondrichthyans is currently available.

Table 1.

CNE Content of Hox Gene Clusters in Chondrichthyans and Sarcopterygians.

| Cluster | Comparison | Cluster Length (bp) | CDS Total Length (bp) | Number of CNEs | CNE Total Length (bp) | CNE Mean Identity (%) | CNE Mean Size (bp) | Fraction of CNEs in a Cluster (%) |
|---|---|---|---|---|---|---|---|---|
| Hox A | Dogfish versus chimaera | 105,601 | 9,525 | 126 | 38,544 | 75.8 | 305 | 36.5 |
| | | 107,303 | 9,810 | | | | | 35.9 |
| | Human versus coelacanth | 105,638 | 10,803 | 46 | 7,871 | 76.4 | 171 | 7.5 |
| | | 130,264 | 10,419 | | | | | 6.0 |
| Hox B | Dogfish versus chimaera | 181,559 | 9,426 | 101 | 24,815 | 76.1 | 246 | 13.7 |
| | | 139,151 | 9,480 | | | | | 17.8 |
| | Human versus coelacanth | 199,047 | 8,508 | 35 | 6,620 | 77.8 | 189 | 3.3 |
| | | 210,686 | 9,391 | | | | | 3.1 |
| Hox D | Dogfish versus chimaera | 105,888 | 10,527 | 95 | 25,406 | 76.0 | 267 | 24.0 |
| | | 106,639 | 10,650 | | | | | 23.8 |
| | Human versus coelacanth | 96,539 | 9,853 | 30 | 4,843 | 75.5 | 161 | 5.0 |
| | | 115,416 | 6,957 | | | | | 4.2 |
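The percentages in the last column of Table 1 appear consistent with a simple ratio of CNE total length to cluster length. The Python sketch below recomputes them under that assumption, using the first row of each comparison. The function and variable names are illustrative and not part of the original analysis, which relied on the Gumby algorithm to identify the CNEs themselves.

```python
# Values copied from Table 1 (first row of each comparison):
# (cluster length in bp, CNE total length in bp, number of CNEs)
TABLE_1 = {
    ("HoxA", "chondrichthyans"): (105_601, 38_544, 126),
    ("HoxA", "sarcopterygians"): (105_638, 7_871, 46),
    ("HoxB", "chondrichthyans"): (181_559, 24_815, 101),
    ("HoxB", "sarcopterygians"): (199_047, 6_620, 35),
    ("HoxD", "chondrichthyans"): (105_888, 25_406, 95),
    ("HoxD", "sarcopterygians"): (96_539, 4_843, 30),
}


def cne_fraction_percent(cne_total_bp, cluster_bp):
    """Percentage of the cluster length covered by CNEs."""
    return 100.0 * cne_total_bp / cluster_bp


def fold_difference(cluster):
    """Ratio of the chondrichthyan to the sarcopterygian CNE fraction."""
    c_len, c_cne, _ = TABLE_1[(cluster, "chondrichthyans")]
    s_len, s_cne, _ = TABLE_1[(cluster, "sarcopterygians")]
    return cne_fraction_percent(c_cne, c_len) / cne_fraction_percent(s_cne, s_len)


for (cluster, group), (cluster_bp, cne_bp, n_cnes) in TABLE_1.items():
    fraction = cne_fraction_percent(cne_bp, cluster_bp)
    mean_size = cne_bp / n_cnes  # mean CNE size in bp
    print(f"{cluster} {group}: {fraction:.1f}% CNEs, mean size {mean_size:.1f} bp")

for cluster in ("HoxA", "HoxB", "HoxD"):
    print(f"{cluster}: fold difference {fold_difference(cluster):.2f}")
```

Under this reading, the chondrichthyan CNE fraction is roughly four to five times the sarcopterygian one at all three loci.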

Conclusion

Our aim was to better understand the evolution of the Hox gene cluster in gnathostomes by examining a species selected for its phylogenetic position rather than for its status as a model organism in developmental biology. Our survey of the Hox cluster transcriptome in the dogfish, a chondrichthyan, has shed light on the early evolution of the organization and transcriptional complexity of these genomic regions. Our comparison of gnathostome transcriptomic data, including structural information, identified homology between some exons that was not detectable using sequence similarity. These comparisons also suggest homology between alternative splicing variants of different Hox gene clusters and have therefore given us insights into features of the ancestral, pre-duplicated Hox cluster. We obtained good evidence that, without involving additional cluster duplication, the HoxC cluster has been lost or is restricted to only a few genes in the dogfish. Sequencing of the complete dogfish genome would ultimately validate our description of the Hox cluster number and organization. A larger dogfish EST data set would also allow us to refine and extend the analysis of how the transcriptome has evolved in different vertebrate lineages. New sequencing technologies now offer the experimental tools to reach these goals.

We thank our colleagues Patrick Laurenti and Cushla Metcalfe for improving the manuscript. This work was supported by grants from the Centre National de la Recherche Scientifique (ATIP) and the GIS génomique marine. M.D.-T. and S.O. were supported by a doctoral fellowship from the French Ministère de l'Education Nationale et de la Recherche.

References

Nine exceptional radiations plus high turnover explain species diversity in jawed vertebrates

Complete HOX cluster characterization of the coelacanth provides further evidence for slow evolution of its genome

A series of normal stages for development of Scyliorhinus canicula, the lesser spotted dogfish (Chondrichthyes: Scyliorhinidae)

Glocal alignment: finding rearrangements during alignment

Molecular evolution of the HoxA cluster in the three major gnathostome lineages

Evolution of axis specification mechanisms in jawed vertebrates: insights from a chondrichthyan

Comparative phylogenomic analyses of teleost fish Hox gene clusters: lessons from the cichlid fish Astatotilapia burtoni

Hox cluster genomics in the horn shark, Heterodontus francisci

Annotation, submission and screening of repetitive elements in Repbase: RepbaseSubmitter and Censor

Timing of genome duplications relative to the origin of the vertebrates: did cyclostomes diverge before or after?

Noncanonical role of Hox14 revealed by its expression patterns in lamprey and shark

Organization and structure of hox gene loci in medaka genome and comparison with those of pufferfish and zebrafish genomes

Primary microRNA transcripts are processed co-transcriptionally

Close sequence comparisons are sufficient to identify human cis-regulatory elements

Elephant shark (Callorhinchus milii) provides insights into the evolution of Hox gene clusters in gnathostomes

Functional demarcation of active and silent chromatin domains in human HOX loci by noncoding RNAs

Phylogenetic dating and characterization of gene duplications in vertebrates: the cartilaginous fish reference

Heterochronic shift in Hox-mediated activation of sonic hedgehog leads to morphological changes during fin development

Did genome duplication drive the origin of teleosts? A comparative study of diversification in ray-finned fishes

The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools

Hox gene clusters in blunt snout bream, Megalobrama amblycephala and comparison with those of zebrafish, fugu and medaka genomes
