GENOMIC AND SUB-GENOMIC CHARACTERIZATION OF SARS-COV-2 SAMPLES IN KENYA
Mwangi, Jane Njeri
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), the etiological agent of Coronavirus Disease 2019 (COVID-19), has spread rapidly across the human population, resulting in approximately 213 million confirmed cases and 4.4 million deaths as of August 25, 2021. Despite tremendous scientific efforts, including six approved vaccines, the biology and evolutionary dynamics of SARS-CoV-2 remain to be fully described. As of August 25, 2021, there were 35,921 publicly available whole genome sequences from Africa deposited in the Global Initiative on Sharing All Influenza Data (GISAID) database, against a backdrop of approximately 7.6 million confirmed cases (Africa CDC). Owing to this paucity of genome data, understanding the genomic epidemiology of SARS-CoV-2 at fine spatial scales remains a challenge in many parts of sub-Saharan Africa. The goal of this study was to use genomic data to characterize SARS-CoV-2 at the genomic and sub-genomic levels. Whole genome sequence data, together with available epidemiological data, were used to optimize the ARTIC network bioinformatic assembly pipeline for in-house analysis and to understand the sub-genomic RNA (sgRNA) expression profile of SARS-CoV-2 in samples from Kenya. The sgRNA was quantified using the Periscope tool and visualized in R v4.1.1. The study showed that a median read count of 13,632 reads, from a bin of 1 million randomly sampled reads used for assembly, yielded better genomes. In addition, a normalization value of 5,000 produced better amplicon coverage per pool, although it did not account for amplicon drop-outs in some regions. In this study, ORF10 was observed at low abundance, supported by reads from three samples, one of which was from a deceased case. High abundance of non-canonical sgRNA was observed in the samples at envelope gene (E) position 26,442 and nucleocapsid gene (N) position 28,282.
The sgRNA abundant at position 28,282 has been shown to interfere with type I interferon signaling, which viral pathogens must overcome to replicate in host cells. Quantifying sgRNA from genomic data and detecting new variants in sgRNA has potential for in-depth study of the transcriptome and may inform understanding of virus transmissibility or pathogenicity.
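The abstract does not describe how the bin of 1 million reads was drawn or how the median read count was computed. As a minimal sketch of one way such a step could be done, the Python below draws a fixed-size uniform random sample from a stream of reads (reservoir sampling) and then takes the median of per-amplicon read counts. The function names, the seed, and the assumption that each read is already labelled with an amplicon identifier are hypothetical and are not taken from the study or from the ARTIC pipeline.

```python
import random
from statistics import median


def reservoir_sample(items, k, seed=0):
    """Draw up to k items uniformly at random from an iterable of unknown length.

    Hypothetical helper: the study's actual sampling method is not stated.
    """
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Replace an existing entry with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample


def median_amplicon_count(read_amplicons, amplicon_ids):
    """Median number of reads per amplicon, counting dropped-out amplicons as 0.

    read_amplicons: one amplicon identifier per sampled read (assumed input).
    amplicon_ids: every amplicon expected in the primer scheme.
    """
    counts = {amplicon: 0 for amplicon in amplicon_ids}
    for amplicon in read_amplicons:
        counts[amplicon] += 1
    return median(counts.values())
```

In practice the input to `reservoir_sample` would be an iterator over FASTQ records with `k=1_000_000`. Including amplicons with zero reads in the median makes drop-outs visible in the summary rather than hiding them.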