Machine Learning for COVID-19 needs global collaboration and data-sharing
Nathan Peiffer-Smadja, Redwan Maatoug, François-Xavier Lescure, Eric D’Ortenzio, Joëlle Pineau & Jean-Rémi King
Nature Machine Intelligence (2020)
The COVID-19 pandemic poses a historic challenge to society. The profusion of data requires machine learning to improve and accelerate COVID-19 diagnosis, prognosis and treatment. However, a global and open approach is necessary to avoid pitfalls in these applications.
On 31 December 2019, the first cases of a viral pneumonia with unknown aetiology were reported in the city of Wuhan, China. In the following weeks, the Chinese authorities and the World Health Organization (WHO) announced the discovery of a novel coronavirus and its associated disease: SARS-CoV-2 and COVID-19, respectively. On 21 April 2020, the number of cases of COVID-19 exceeded 2.4 million and the death toll exceeded 170,000 worldwide1. The outbreak of COVID-19 represents a major and urgent threat to global health. While the unprecedented speed at which COVID-19 has spread is partly rooted in our increasingly globalized society, the global sharing of scientific data also offers a promising tool to fight the disease. In the past four months, more than 12,400 articles have been published2 and scientific data collected from thousands of patients have been released3. The majority of these studies follow the standard scientific method: that is, they investigate a few hypotheses at a time on a controlled sample. While undeniably successful, this standard method suffers from two well-known limitations, both critical in the current pandemic: (1) it requires considerable expertise and human input and (2) it only considers a handful of hypotheses at a time. Machine learning (ML) has been used to meet these challenges in various pathologies4,5, including infectious diseases6. Herein, we describe two areas where ML could supplement standard statistical methods in the COVID-19 pandemic, discuss the practical challenges that such an ML approach entails, and advocate for global collaboration and data-sharing.