BioRxiv, February 9: "Machine learning analysis of genomic signatures provides evidence of associations between Wuhan 2019-nCoV and bat betacoronaviruses"

  • Source topic: COVID-19 Research Trends Monitoring
  • Compiled by: zhangzx
  • Published: 2020-02-10
  • Abstract: As of February 8, 2020, the 2019 Novel Coronavirus (2019-nCoV) spread to 29 countries with 725 deaths and more than 34000 confirmed cases. 2019-nCoV is being compared to the infamous SARS coronavirus, which resulted, between November 2002 and July 2003, in 8098 confirmed cases worldwide with a 9.6% death rate and 774 deaths. Though 2019-nCoV has a death rate of 2% as of 8 February, the 34963 confirmed cases in a few weeks (December 8, 2019 to February 8, 2020) are alarming, with cases...

Related reports
  • BioRxiv, February 4: "Machine learning-based analysis of genomes suggests associations between Wuhan 2019-nCoV and bat Betacoronaviruses"

    • Source topic: COVID-19 Research Trends Monitoring
    • Compiled by: zhangmin
    • Published: 2020-02-05
    • Machine learning-based analysis of genomes suggests associations between Wuhan 2019-nCoV and bat Betacoronaviruses
      Gurjit S Randhawa, Maximillian P.M. Soltysiak, Hadi El Roz, Camila P.E. de Souza, Kathleen A. Hill, Lila Kari
      doi: https://doi.org/10.1101/2020.02.03.932350
      Abstract: As of February 3, 2020, the 2019 Novel Coronavirus (2019-nCoV) spread to 27 countries with 362 deaths and more than 17000 confirmed cases. 2019-nCoV is being compared to the infamous SARS coronavirus outbreak. Between November 2002 and July 2003, SARS resulted in 8098 confirmed cases worldwide with a 9.6% death rate and 774 deaths. Mainland China alone suffered 349 deaths and 5327 confirmed cases. Though 2019-nCoV has a death rate of 2.2% as of 3 February, the more than 17000 confirmed cases in a few weeks (December 8, 2019 to February 3, 2020) are alarming. Cases are likely under-reported given the comparatively longer incubation period. Such outbreaks demand rapid elucidation and analysis of the virus genomic sequence for timely treatment plans. We classify the 2019-nCoV using MLDSP and MLDSP-GUI, alignment-free methods that use Machine Learning (ML) and Digital Signal Processing (DSP) for genome analyses. Genomic sequences were mapped into their respective genomic signals (discrete numeric series) using a two-dimensional numerical representation (Chaos Game Representation). The magnitude spectra were computed by applying the Discrete Fourier Transform to the genomic signals. The Pearson Correlation Coefficient was used to calculate a pairwise distance matrix. The feature vectors were constructed from the distance matrix and used as input to the supervised machine learning algorithms. 10-fold cross-validation was applied to compute the average classification accuracy scores. The trained classifier models were used to predict the labels of 29 2019-nCoV sequences. The classification strategy used over 5000 genomes and tested associations at taxonomic levels from realm to species. From our machine learning-based alignment-free analyses using MLDSP-GUI, we corroborate the current hypothesis of a bat origin and classify 2019-nCoV as Sarbecovirus, within Betacoronavirus.
      *Note: this article is a preprint manuscript, a preliminary report that has not been peer reviewed. Its views are intended only for exchange among researchers and are not conclusive; please use it with caution.
  • BioRxiv, February 5: "(Version 2 update) Genomic variance of the 2019-nCoV coronavirus"

    • Source topic: COVID-19 Research Trends Monitoring
    • Compiled by: zhangmin
    • Published: 2020-02-06
    • Genomic variance of the 2019-nCoV coronavirus
      Carmine Ceraolo, Federico M Giorgi
      doi: https://doi.org/10.1101/2020.02.02.931162
      Abstract: There is rising global concern for the recently emerged novel Coronavirus (2019-nCoV). Full genomic sequences have been released by the worldwide scientific community in the last few weeks in order to understand the evolutionary origin and molecular characteristics of this virus. Taking advantage of all the genomic information currently available, we constructed a phylogenetic tree that also includes representatives of other Coronaviridae, such as Bat coronavirus (BCoV) and SARS. We confirm high sequence similarity (>99%) between all sequenced 2019-nCoV genomes available, with the closest BCoV sequence sharing 96.2% sequence identity, confirming the notion of a zoonotic origin of 2019-nCoV. Despite the low heterogeneity of the 2019-nCoV genomes, we could identify at least two hyper-variable genomic hotspots, one of which is responsible for a Serine/Leucine variation in the viral ORF8-encoded protein. Finally, we perform a full proteomic comparison with other Coronaviridae, identifying key amino acid differences to be considered for antiviral strategies deriving from previous anti-coronavirus approaches.
      *Note: this article is a preprint manuscript, a preliminary report that has not been peer reviewed. Its views are intended only for exchange among researchers and are not conclusive; please use it with caution.
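
The first related report (the MLDSP study) describes an alignment-free pipeline: genomic sequences are mapped to numeric signals with a Chaos Game Representation, magnitude spectra are computed with the discrete Fourier transform, and a Pearson-correlation-based pairwise distance matrix feeds the classifiers. The Python sketch below illustrates those steps on toy sequences. It is not the MLDSP or MLDSP-GUI implementation. The corner assignment, the encoding of each CGR point as a complex number, truncation to the shortest sequence, the (1 - r) / 2 distance scaling, and the nearest-neighbour rule standing in for the supervised classifiers are all illustrative assumptions, and every function name is hypothetical.

```python
import cmath
import math

# Corners of the unit square for the Chaos Game Representation (CGR).
# This particular corner assignment is an illustrative assumption.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_signal(seq):
    """Map a DNA string to a complex-valued signal of CGR points (x + iy)."""
    x, y = 0.5, 0.5  # start at the centre of the square
    signal = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous bases such as N
        cx, cy = CORNERS[base]
        # Move halfway from the current point towards the base's corner.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        signal.append(complex(x, y))
    return signal


def magnitude_spectrum(signal):
    """Magnitudes of the discrete Fourier transform (naive O(n^2) version)."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length real vectors."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    if su == 0.0 or sv == 0.0:
        raise ValueError("constant spectrum: correlation is undefined")
    # Clamp to [-1, 1] to absorb floating-point rounding.
    return max(-1.0, min(1.0, cov / (su * sv)))


def distance_matrix(seqs):
    """Pairwise distances (1 - r) / 2 between magnitude spectra."""
    signals = [cgr_signal(s) for s in seqs]
    n = min(len(s) for s in signals)  # assumption: truncate to shortest signal
    spectra = [magnitude_spectrum(s[:n]) for s in signals]
    return [[(1.0 - pearson(a, b)) / 2.0 for b in spectra] for a in spectra]


def nearest_label(train_seqs, labels, query):
    """Label of the training sequence closest to the query (1-nearest-neighbour)."""
    row = distance_matrix(list(train_seqs) + [query])[-1][: len(train_seqs)]
    return labels[row.index(min(row))]


if __name__ == "__main__":
    # Toy sequences and labels, invented purely for illustration.
    train = ["ACGTACGTACGTACGTACGT", "AAAAACCCCCGGGGGTTTTT", "ATATATATATGCGCGCGCGC"]
    labels = ["group-1", "group-2", "group-3"]
    for row in distance_matrix(train):
        print(["%.4f" % d for d in row])
    print(nearest_label(train, labels, "AAAAACCCCCGGGGGTTTTA"))
```

The naive transform keeps the sketch free of third-party dependencies but is only practical for short inputs; whole genomes would need a fast Fourier transform routine and a length-normalization step more careful than simple truncation.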