The architecture of SARS-CoV-2 transcriptome
Authors
Dongwan Kim1,2, Joo-Yeon Lee3, Jeong-Sun Yang3, Jun Won Kim3, V. Narry Kim1,2,4,*, and Hyeshik Chang1,2,*
DOI: 10.1016/j.cell.2020.04.011
SARS-CoV-2 is a betacoronavirus responsible for the COVID-19 pandemic. Although the
SARS-CoV-2 genome was reported recently, its transcriptomic architecture is unknown.
Utilizing two complementary sequencing techniques, we here present a high-resolution map
of the SARS-CoV-2 transcriptome and epitranscriptome. DNA nanoball sequencing shows
that the transcriptome is highly complex owing to numerous discontinuous transcription
events. In addition to the canonical genomic and 9 subgenomic RNAs, SARS-CoV-2
produces transcripts encoding unknown ORFs with fusion, deletion, and/or frameshift. Using
nanopore direct RNA sequencing, we further find at least 41 RNA modification sites on viral
transcripts, with AAGAA as the most frequent motif. Modified RNAs have shorter poly(A) tails
than unmodified RNAs, suggesting a link between the modification and the 3′ tail. Functional
investigation of the unknown transcripts and RNA modifications discovered in this study will
open new directions for understanding the life cycle and pathogenicity of SARS-CoV-2.