The development of rigorous molecular taxonomy pioneered by Carl Woese has freed evolution science to explore numerous cellular activities that lead to genome change in evolution. These activities include symbiogenesis, inter- and intracellular horizontal DNA transfer, incorporation of DNA from infectious agents, and natural genetic engineering, especially the activity of mobile elements. This article reviews documented examples of all these processes and proposes experiments to extend our understanding of cell-mediated genome change.
From 1971 to 1985, Carl Woese and colleagues generated oligonucleotide catalogs of 16S/18S rRNAs from more than 400 organisms. Using these incomplete and imperfect data, Carl and his colleagues developed unprecedented insights into the structure, function, and evolution of the large RNA components of the translational apparatus. They recognized a third domain of life, revealed the phylogenetic backbone of bacteria (and its limitations), delineated taxa, and explored the tempo and mode of microbial evolution. For these discoveries to have stood the test of time, oligonucleotide catalogs must carry significant phylogenetic signal; they thus bear re-examination in view of the current interest in alignment-free phylogenetics based on k-mers. Here we consider the aims, successes, and limitations of this early phase of molecular phylogenetics. We computationally generate oligonucleotide sets (e-catalogs) from 16S/18S rRNA sequences, calculate pairwise distances between them based on D2 statistics, compute distance trees, and compare their performance against alignment-based and k-mer trees. Although the catalogs themselves were superseded by full-length sequences, this stage in the development of computational molecular biology remains instructive for us today.
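The e-catalog and D2 steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes that an e-catalog is built by cutting the sequence after every G (as RNase T1 does), that only fragments of at least six nucleotides are kept, and that the D2 score is converted to a distance by a cosine-style normalization. The function names `e_catalog`, `d2`, and `d2_distance` and the `min_len` parameter are hypothetical.

```python
from collections import Counter
from math import sqrt


def e_catalog(rna, min_len=6):
    """Build an in-silico oligonucleotide catalog from an RNA sequence.

    Assumption: fragments end after every G (RNase T1-like cleavage) and
    only fragments of at least min_len nucleotides are retained.
    """
    fragments = []
    start = 0
    for i, base in enumerate(rna):
        if base == "G":
            fragments.append(rna[start:i + 1])
            start = i + 1
    if start < len(rna):
        # Trailing fragment that does not end in G (3' end of the molecule)
        fragments.append(rna[start:])
    return Counter(f for f in fragments if len(f) >= min_len)


def d2(cat_a, cat_b):
    """D2 statistic: sum over shared words of the product of their counts."""
    return sum(n * cat_b[w] for w, n in cat_a.items() if w in cat_b)


def d2_distance(cat_a, cat_b):
    """Distance derived from D2.

    Assumption: 1 minus D2 normalized by the geometric mean of the
    self-scores; other D2-to-distance conversions exist.
    """
    denom = sqrt(d2(cat_a, cat_a) * d2(cat_b, cat_b))
    if denom == 0:
        return 1.0  # at least one catalog is empty
    return 1.0 - d2(cat_a, cat_b) / denom


if __name__ == "__main__":
    a = e_catalog("ACUUAGCCAUUCGG", min_len=3)
    b = e_catalog("ACUUAGUUUUUUG", min_len=3)
    print(d2_distance(a, b))
```

A matrix of such pairwise distances could then be passed to any standard distance-based tree method, such as neighbor joining.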
Science is all about making discoveries. That’s it! It was my good fortune and Carl’s good fortune to share an experiment that produced an unexpected result. In the 1960s, Carl became interested in the classification of bacteria with the ultimate goal of defining the relatedness of bacterial groups as well as events in the evolution of these organisms. He proposed to do this by studying the sequence of monomers in proteins or nucleic acids. Study of the sequence of amino acids in conserved proteins had severe limitations and could not serve Carl’s purpose. However, the publication by Sanger of a technique for analysis of RNA caught Carl’s attention. His previous experiments with the ribosome had convinced him that this organelle was of very ancient origin; it had only one role in the cell and so was “insulated” from the vast phenotypic variations of bacterial cells.
Not long after Carl Woese died, I received a message from Robin Gutell asking if I would contribute an article to this issue of RNA Biology. While my admiration for Carl’s contributions to biology knows no bounds, I did not know him well personally. For that reason I advised Robin to strike my name off the list of contributors and replace it with that of someone who is better qualified than I am, but he persisted, and here we are. I guess Robin thought it would be useful to hear from one of those who admired Carl from afar.
The mod(mdg4) locus of Drosophila melanogaster contains several transcription units encoded on both DNA strands. The mod(mdg4) pre-mRNAs are alternatively spliced, and a very significant fraction of the mature mod(mdg4) mRNAs are formed by trans-splicing. We have studied the transcripts derived from one of the anti-sense regions within the mod(mdg4) locus in order to shed light on the expression of this complex locus. We have characterized the expression of anti-sense mod(mdg4) transcripts in S2 cells, mapped their transcription start sites and cleavage sites, identified and quantified alternatively spliced transcripts, and obtained insight into the regulation of the mod(mdg4) trans-splicing. In a previous study, we had shown that the alternative splicing of some mod(mdg4) transcripts was regulated by Brahma (BRM), the ATPase subunit of the SWI/SNF chromatin-remodeling complex. Here we show, using RNA interference and overexpression of recombinant BRM proteins, that the levels of BRM affect specifically the abundance of a trans-spliced mod(mdg4) mRNA isoform in both S2 cells and larvae. This specific effect on trans-splicing is accompanied by a local increase in the density of RNA polymerase II and by a change in the phosphorylation state of the C-terminal domain of the large subunit of RNA polymerase II. Interestingly, the regulation of the mod(mdg4) splicing by BRM is independent of the ATPase activity of BRM, which suggests that the mechanism by which BRM modulates trans-splicing is independent of its chromatin-remodeling activity.
Following reports by the ENCyclopedia Of DNA Elements (ENCODE; GENCODE) Consortium and others, it is now fairly evident that the majority (70–80%) of the mammalian genome has the potential to be transcribed into non-protein-coding RNAs (ncRNAs). Critical to our understanding of genetic processes is the mechanism by which ncRNAs exert their roles. Accordingly, ncRNAs have been shown to regulate the expression of protein-coding loci (i.e., genes) at the transcriptional as well as post-transcriptional stages. We recently reported widespread transcription at DNA enhancer elements in myogenic cells. In our study, we found that certain enhancer RNAs (eRNAs) regulate chromatin accessibility of the transcriptional machinery at loci encoding master regulators of myogenesis (i.e., MyoD/MyoG), thus suggesting their significance and site-specific impact in cellular programming. Here, we examine recent discoveries pertinent to the proposed role(s) of eRNAs in regulating gene expression. We highlight consistencies, discuss confounding observations, and consider gaps in critical information in order to prioritize future objectives.
Mycobacterium tuberculosis, the causative agent of tuberculosis in humans, is a bacterium with the unique ability to persist for years or decades as a latent infection. This latent state, during which bacteria have a markedly altered physiology and are thought to be dormant, is crucial for the bacterium to survive the stressful environments it encounters in the human host. Importantly, M. tuberculosis cells in the dormant state are generally refractory to antibiotics, most of which target cellular processes occurring in actively replicating bacteria. The molecular switches that enable M. tuberculosis to slow or stop its replication and become dormant remain unknown. However, the slow growth and dormant state that are hallmarks of latent tuberculosis infection have striking parallels to the “quasi-dormant” state of Escherichia coli cells caused by the toxin components of chromosomal toxin-antitoxin (TA) modules. An unusually large number of TA modules in M. tuberculosis, including nine in the mazEF family, may contribute to initiating this latent state or to adapting to stress conditions in the host. Toward filling the gap in our understanding of the physiological role of TA modules in M. tuberculosis, we are interested in identifying their molecular mechanisms to better understand how toxins impart growth control. Our recent publication (ref. 1) uncovered a novel function of a MazF toxin in M. tuberculosis that had not been associated with any other MazF ortholog. This toxin, MazF-mt6, can disrupt protein synthesis by cleavage of 23S rRNA at a single location in an evolutionarily conserved five-base sequence in the ribosome active center.
The clustered regularly interspaced short palindromic repeats (CRISPR)-CRISPR-associated (Cas) system has recently been used to engineer genomes of various organisms, but surprisingly, not those of bacteriophages (phages). Here we present a method to genetically engineer the Escherichia coli phage T7 using the type I-E CRISPR-Cas system. The T7 phage genome is edited by homologous recombination with a DNA sequence flanked by sequences homologous to the desired location. Non-edited genomes are targeted by the CRISPR-Cas system, thus enabling isolation of the desired recombinant phages. This method broadens CRISPR-Cas-based editing to phages and uses a CRISPR-Cas type other than type II. The method may be adjusted to genetically engineer any bacteriophage genome.
Promoter-associated RNAs (pRNAs) are a family of ~90–100 nt-long divergent RNAs overlapping the promoter of the rRNA (rDNA) operon. pRNA transcripts interact with TIP5, a component of the chromatin remodeling complex NoRC, which recruits enzymes for heterochromatin formation and mediates silencing of rRNA genes. Here we present a comprehensive analysis of pRNA homologs, including different versions per species, as a result of in silico studies in available metazoan genome assemblies. Comparative sequence analysis and secondary structure prediction yielded two possible secondary structures, which suggests a possible dual function of pRNAs in the regulation of rRNA operons. Furthermore, we validated parts of our computational predictions experimentally by RT-PCR and sequencing. A representative seed alignment of the pRNA family, annotated with possible secondary structures, was released to the Rfam database.
The RIG-I-like receptors (RLRs)—RIG-I, MDA5, and LGP2—detect intracellular pathogenic RNA and elicit an antiviral immune response during viral infection. The protein architecture of the RLR family consists of multiple functional domains, including N-terminal Caspase Activation and Recruitment Domains (CARDs) for signaling initiation, a central RNA helicase core, and a C-terminal domain for RNA sensing. With these specialized sensing-and-responding modules, RLRs are able to selectively bind non-self RNA species and trigger downstream signaling events leading to interferon production. This article summarizes the recent progress toward defining the precise mechanisms of RNA recognition and subsequent signal induction by RLRs.
Epstein–Barr virus (EBV) is a tumorigenic human γ-herpesvirus, which produces several known structured RNAs with functional importance: two are implicated in latency maintenance and tumorigenic phenotypes, EBER1 and EBER2; a viral small nucleolar RNA (v-snoRNA1) that may generate a small regulatory RNA; and an internal ribosomal entry site in the EBNA1 mRNA. A recent bioinformatics and RNA-Seq study of EBV identified two novel EBV non-coding (nc)RNAs with evolutionary conservation in lymphocryptoviruses and likely functional importance. Both RNAs are transcribed from a repetitive region of the EBV genome (the W repeats) during a highly oncogenic type of viral latency. One novel ncRNA can form a massive (586 nt) hairpin, while the other RNA is generated from a short (81 nt) intron and is found in high abundance in EBV-infected cells.
As evidenced in mammalian cells, the eukaryotic translation initiation factor eIF4G has a putative role in nuclear RNA metabolism. Here we investigate whether this role is conserved in the yeast Saccharomyces cerevisiae. Using a combination of in vitro and in vivo methods, we show that, similar to mammalian eIF4G, the yeast eIF4G homologues, Tif4631p and Tif4632p, are present in both the nucleus and the cytoplasm. We show that both eIF4G proteins interact efficiently in vitro with UsnRNP components of the splicing machinery. More specifically, Tif4631p and Tif4632p interact efficiently with U1 snRNA in vitro. In addition, Tif4631p and Tif4632p associate with protein components of the splicing machinery, namely Snu71p and Prp11p. To further delineate these interactions, we map the regions of Tif4631p and Tif4632p that are important for the interaction with Prp11p and Snu71p, and we show that addition of these regions to splicing reactions in vitro has a dominant inhibitory effect. The observed interactions implicate eIF4G in aspects of pre-mRNA processing. In support of this hypothesis, deletion of one of the eIF4G isoforms results in accumulation of un-spliced precursors for a number of endogenous genes in vivo. In conclusion, these observations suggest the involvement of yeast eIF4G in pre-mRNA metabolism.
Spliceosomal snRNAs are extensively 2'-O-methylated and pseudouridylated. The modified nucleotides are relatively highly conserved across species, and are often clustered in regions of functional importance in pre-mRNA splicing. Over the past decade, the study of the mechanisms and functions of spliceosomal snRNA modifications has intensified. Two independent mechanisms behind these modifications, RNA-independent (protein-only) and RNA-dependent (RNA-guided), have been discovered. The role of spliceosomal snRNA modifications in snRNP biogenesis and spliceosome assembly has also been verified.
One of the major challenges facing researchers working with eukaryotic ribosomes lies in their lability relative to their eubacterial and archaeal counterparts. In particular, lysis of cells and purification of eukaryotic ribosomes by conventional differential ultracentrifugation methods expose them for long periods of time to a wide range of co-purifying proteases and nucleases, negatively impacting their structural integrity and functionality. A chromatographic method using a cysteine-charged Sulfolink resin was adapted to address these problems. This fast and simple method significantly reduces co-purifying proteolytic and nucleolytic activities, producing good yields of highly biochemically active yeast ribosomes with fewer nicks in their rRNAs. In particular, the chromatographic purification protocol significantly improved the quality of ribosomes isolated from mutant cells. This method is likely applicable to mammalian ribosomes as well. The simplicity of the method and the enhanced purity and activity of chromatographically purified ribosomes represent a significant technical advancement for the study of eukaryotic ribosomes.
The nonsense-mediated mRNA decay (NMD) pathway is responsible for the rapid degradation of eukaryotic mRNAs on which ribosomes fail to terminate translation properly. NMD thereby contributes to the elimination of aberrant mRNAs, improving the fidelity of gene expression, but also serves to regulate gene expression at the posttranscriptional level. Here we discuss recent evidence as to how and where mRNAs targeted to NMD are degraded in human cells. We discuss accumulating evidence that the decay step of human NMD can be initiated by two different mechanisms: either by SMG6-mediated endonucleolytic cleavage near the aberrant stop codon, or by deadenylation and decapping. While there is evidence that mRNAs targeted for NMD have the capacity to accumulate with other translationally repressed mRNAs in P-bodies, there is currently no evidence that this is required for the degradation of the NMD substrate. It therefore remains an open question whether NMD in human cells is restricted to a particular cellular location or whether it can be initiated wherever translation of the NMD substrate takes place.
The human MeCP2 gene encodes a ubiquitously expressed methyl CpG binding protein. Mutations in this gene cause a neurodevelopmental disorder called Rett Syndrome (RS). Mutations identified in the coding region of MeCP2 account for approximately 65% of all RS cases. However, 35% of all patients do not show mutations in the coding region of MeCP2, suggesting that mutations in non-coding regions likely exist that affect MeCP2 expression rather than protein function. The gene is unusual in that it has a >8.5 kb 3′ untranslated region (3′ UTR), and the size of the 3′ UTR is differentially regulated in various tissues because of distinct polyadenylation signals. We have identified putative cis-acting auxiliary regulatory elements that play a role in alternative polyadenylation of MeCP2 using an in vivo polyadenylation reporter assay and a luciferase assay. These cis-acting auxiliary elements are found both upstream and downstream of the core CPSF binding sites. Mutation of one of these cis-acting auxiliary elements, a G-rich element (GRS), significantly reduced MeCP2 polyadenylation efficiency in vivo. We further investigated what trans-acting factor(s) might be binding to this cis-acting element and found that hnRNP F protein binds specifically to the element. We next investigated the MeCP2 3′ UTRs by performing quantitative real-time PCR; the data suggest that altered RNA stability is not a major factor in differential MeCP2 3′ UTR usage. In sum, the mechanism(s) of regulated alternative 3′ UTR usage of MeCP2 are complex, and insight into these mechanisms will aid our understanding of the factors that influence MeCP2 expression.