The Identification of the SARS-CoV-2 Whole Genome: Nine Cases Among Patients in Banten Province, Indonesia

© The Author(s) 2021. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License which permits unrestricted use, sharing, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. Adhiyanto et al. | J Pure Appl Microbiol | 15(2):936-948 | June 2021 Article 6958 | https://doi.org/10.22207/JPAM.15.2.52


Sample collection
Since June 2020, the Medical Research Laboratory, Faculty of Medicine, Syarif Hidayatullah State Islamic University has been part of the COVID-19 Laboratory Examination Network for Banten, West Java, Indonesia. On average, we received 400-500 samples every week, amounting to about 4000 samples by the end of October 2020, with a positivity rate of 25% (1000/4000 samples). The samples were sent to us in viral transport medium (VTM) by hospitals, clinics and public health centres in the Banten area. In this study, we collected and sequenced nine RNA samples from these patients.

Ethical clearance
Ethical clearance was issued by the Ethics Committee of the Faculty of Medicine UIN Syarif Hidayatullah Jakarta (No.B-005/F12/KPK/ TL.00/02/2021), and each specimen was submitted to our laboratory along with informed consent from the patient.

RNA extraction
Two hundred µL of VTM containing the patient sample was transferred into a tube containing 600 µL of TRI-Zol solution (TRI Reagent®) and shaken until homogeneous. The homogeneous samples were centrifuged and the supernatants transferred into RNase-free tubes. Each supernatant was mixed with 600 µL of ethanol (95-100%) and gently shaken. The mixture was transferred to a Zymo-Spin™ IC column in a collection tube and centrifuged, and the flow-through was discarded. Subsequently, 400 µL of RNA wash buffer was added to the column and centrifuged. Next, 5 µL of DNase I and 35 µL of DNA digestion buffer were added to the column matrix and incubated for 15 minutes at room temperature (RT). Then, 400 µL of Direct-Zol™ RNA prewash was added and centrifuged, and the liquid flowing into the collection tube was removed. Next, 700 µL of RNA wash buffer was added to the column and centrifuged for two minutes to ensure that the wash buffer had transferred into the collection tube and the lysate had passed through the membrane. After that, the column was transferred into an RNase-free tube and 50 µL of DNase/RNase-free water was added to elute the RNA, followed by centrifugation so that the RNA flowed into the RNase-free tube.
The RNA, eluted in 50 µL of DNase/RNase-free water, was stored at -80°C if not used immediately.

qRT-PCR analysis
We conducted qRT-PCR using the one-step reaction Biosensor Standard M SARS-CoV-2 PCR kit (Chungcheongbuk-do, Republic of Korea) on a Roche LC480 II machine. The protocol was as described in the manufacturer's kit insert, with the ORF1ab gene and the E gene as targets.

SARS-CoV-2 sequencing
Out of 6000 samples, we randomly selected nine samples from patients who were positive for COVID-19 and had a cycle threshold (Ct) value of between 15 and 25. All samples were measured for concentration and purity prior to sequencing.
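The selection step described above can be sketched in code. This is a minimal illustration only: the record fields (`id`, `positive`, `ct`), the inclusive Ct bounds, the fixed random seed, and the toy records are all assumptions introduced here, not details reported in the study.

```python
import random

def select_for_sequencing(samples, ct_min=15.0, ct_max=25.0, n=9, seed=0):
    """Pick n positive samples at random from those with ct_min <= Ct <= ct_max."""
    # Keep only positive samples whose Ct falls inside the (assumed inclusive) window.
    eligible = [s for s in samples
                if s["positive"] and ct_min <= s["ct"] <= ct_max]
    if len(eligible) < n:
        raise ValueError(f"only {len(eligible)} eligible samples, need {n}")
    # A seeded generator makes the random draw reproducible.
    return random.Random(seed).sample(eligible, n)

# Toy records for illustration only; these are not study data.
toy_samples = [
    {"id": f"S{i:03d}", "positive": i % 4 == 0, "ct": 10.0 + (i % 25)}
    for i in range(200)
]
chosen = select_for_sequencing(toy_samples)
print([s["id"] for s in chosen])
```

Sampling without replacement from the filtered list guarantees nine distinct samples, all inside the Ct window.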
All samples were prepared using ARTIC Network protocols and analytic methods 9 . The protocols and analyses were based on the ARTIC multiplex PCR sequencing protocol for COVID-19 devised by Josh Quick. The protocol generates 400 bp amplicons in a tiled fashion across the whole SARS-CoV-2 genome.
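The idea of tiling a genome with overlapping 400 bp amplicons can be illustrated with a short sketch. It does not reproduce the actual ARTIC primer scheme, in which amplicon positions are fixed by primer design. The genome length of 29,903 nt (reference NC_045512.2), the fixed 75 nt overlap, and the alternation between two primer pools are assumptions made here for illustration.

```python
def tile_amplicons(genome_len=29903, amplicon_len=400, overlap=75):
    """Return (index, start, end, pool) tuples covering the genome.

    Coordinates are 0-based and half-open. Neighbouring tiles share
    `overlap` bases and are assigned to alternating pools (1, 2, 1, ...).
    """
    if not 0 <= overlap < amplicon_len:
        raise ValueError("overlap must be smaller than the amplicon length")
    step = amplicon_len - overlap
    tiles, start, idx = [], 0, 0
    while True:
        end = min(start + amplicon_len, genome_len)
        tiles.append((idx + 1, start, end, 1 if idx % 2 == 0 else 2))
        if end == genome_len:  # the last tile may be shorter than amplicon_len
            break
        start += step
        idx += 1
    return tiles

tiles = tile_amplicons()
print(len(tiles), tiles[0], tiles[-1])
```

Placing neighbouring amplicons in different pools keeps overlapping products out of the same multiplex reaction, which is the reason tiled schemes are usually split into two pools.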
Oxford Nanopore's GridION sequencer was operated using MinKNOW version 20.06.9 and MinKNOW core version 4.0.3. High-accuracy base-calling was conducted using Guppy 10 version 4.0.11. All generated reads were assembled using the EPI2ME Labs platform employing the ARTIC workflow.
After preparation using ARTIC multiplex PCR, all samples were analysed on the GridION operated by MinKNOW software. Base-calling was performed using Guppy in high-accuracy mode. Raw reads were assembled with EPI2ME Labs software. All sequencing and bioinformatic workflows are shown in Figures 1 and 2.
In this study, we examined the whole genome of SARS-CoV-2 from nine COVID-19 patients using Nanopore's GridION sequencer. We found nucleotide changes in both the exon and intergenic regions of the virus. The clinical characteristics of these patients are summarized in Table 1. Most of them had fever, anosmia, dry cough, and fatigue and did not have comorbidities.
We found many missense and synonymous or silent mutations in the samples, as shown in Table 2. One sample was found to have a nucleotide deletion, while the other changes were nucleotide substitutions. Moreover, we detected the gene area and amino acids that changed in these nine samples (Table 2), and most of the changes were found in the ORF1ab target gene. All of these SARS-CoV-2 sequences had already been submitted to GISAID. The bioinformatic analysis using EPI2ME software revealed between 13 and 23 SNP variants, with one variant being a deletion.
As shown in Table 2, the most common nucleotide changes in our nine patients were C>T at nucleotide positions 241, 3037, 14408 and 26735, and A>G at nucleotide position 23403. The codon and amino acid changes found in most samples were CCT>CTT (P>L, proline to leucine) and GAT>GGT (D>G, aspartate to glycine).
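The two codon changes above can be checked against the standard genetic code with a short sketch. The function name `classify_codon_change` and the third example pair (TTT>TTC) are illustrative additions, not part of the study's pipeline or data.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def classify_codon_change(ref_codon, alt_codon):
    """Return (ref_aa, alt_aa, effect) for a single codon substitution."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        effect = "synonymous"
    elif alt_aa == "*":
        effect = "nonsense"
    else:
        effect = "missense"
    return ref_aa, alt_aa, effect

print(classify_codon_change("CCT", "CTT"))  # proline -> leucine
print(classify_codon_change("GAT", "GGT"))  # aspartate -> glycine
print(classify_codon_change("TTT", "TTC"))  # illustrative synonymous pair
```

Both changes reported in most samples alter the encoded amino acid, so they are missense under this classification.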

DISCUSSION
It has been reported that the SARS-CoV-2 genome is similar to that of the SARS-CoV virus that caused the epidemic in 2003. Overall, the proteins of SARS-CoV have been characterized; they consist of the polyproteins Orf1a and Orf1ab; four structural proteins, namely spike (S), envelope (E), membrane (M), and nucleocapsid (N); and eight accessory proteins: Orf3a, Orf3b, Orf6, Orf7a, Orf7b, Orf8a, Orf8b, and Orf9. Accessory proteins, besides their function in viral replication, play a role in the interaction of the virus with its host.
In SARS-CoV-2, there are 11 proteins: Orf1ab; Orf2 (the S protein); Orf3a; Orf4 (the E protein); Orf5 (the M protein); Orf6, Orf7a, Orf7b, and Orf8; Orf9 (the N protein); and finally Orf10 12-17 . In SARS-CoV-2, the Orf1ab gene expresses polyproteins consisting of 16 non-structural proteins (NSP). NSP1 is known as an inhibitor of host gene expression: it binds to the host's 40S ribosome, which results in selective degradation of the host's mRNA so that the viral mRNA can bind 18,19 . NSP2 can influence the host cell environment by binding to prohibitin proteins (PHB) 1 and 2 of host cells, which are involved in cell cycle progression, cell migration, cellular differentiation, apoptosis and mitochondrial biogenesis 20 . NSP3 is a protease that plays a role in the release of essential viral proteins. The interaction of NSP3 and NSP4 is very important for viral replication 21,22 .
Mutations allow viruses to adapt to the environment and to new hosts. The capacity of viruses to adapt to their new hosts and environments depends to a great extent on their ability to produce diversity in a short period of time. Thus, the rate of spontaneous mutation varies widely among viruses because of the diversity-producing elements encoded by the virus and its host. Viral diversity can also arise in response to certain selective stresses. RNA viruses have the ability to mutate faster than DNA viruses. Understanding the rate of virus mutation has implications for treatment, the development of drug resistance, immunity, pathogenesis and vaccination in efforts to control the disease 23,24 .
The most common types of SNPs detected in this study are missense and synonymous variants in the exon area. Missense mutations in the exon change the codon and therefore the amino acid incorporated during translation. Synonymous or silent mutations can affect post-transcriptional mRNA processes 25-28 . Such nucleotide mutations in the exon region enable changes in the amino acid sequence and alterations of the tertiary structure of the target protein.
Moreover, the intergenic region is an area where transcriptional enhancer sites related to regulatory functions are often found. Mutations in this area can affect the regulatory processes of the gene 29-33 . Therefore, all of these changes can result in the phenotypic diversity of the virus. It has been shown that mutations in the Orf area can affect the translation termination process 34 . Bali et al. demonstrated that synonymous mutations can affect protein function 35 .
Overall, missense mutations in the coding region were the most prevalent in our samples. A missense change alters the codon and ultimately the amino acid that is translated. Changes of proline (P) to leucine (L) and of aspartate (D) to glycine (G) were found in all of our samples.
Missense variants can affect the tertiary structure of a protein depending on the nature or character of the amino acids that form the polypeptide sequence. Changes from polar to non-polar amino acids, or from charged to neutral ones, will affect the properties of the protein. We found 15 missense variants in the Orf1ab gene in our samples, and any change in amino acid residues may affect the structure of the Orf1ab polyprotein, which is known to play a role in adaptation and virulence in its host. Graham et al. have described the possible role of Orf1ab in the pathogenic immune response of viruses and their hosts 36 . The Orf1ab polyprotein consists of 16 non-structural proteins (NSP) 37 . Non-structural proteins have a role in directing viral assembly after the virus invades the host cell, including viral transcription and replication, proteolytic processing, suppression of immune responses and expression of host genes 37-41 . One of our samples had a glutamate-to-lysine substitution at position 2273 of the Orf1ab gene, at amino acid position 670 (part of NSP3). Glutamate is a polar, acidic residue, and here it is replaced by a basic lysine residue. Changes in the sequence of amino acids in a polypeptide or protein can have a positive or negative effect on the organism, one of which is the ability of the virus to adapt to its host. NSP3 plays many roles in the viral life cycle, as it can act as a scaffold protein to interact with itself and to bind to other viral NSPs or host proteins. NSP3 is also very important for the formation of the replication transcription complex (RTC). The RTC is linked to the host endoplasmic reticulum membrane to produce convoluted membranes and double membrane vesicles in SARS-CoV-2. Changes in the nucleotide bases or amino acids of NSP3 are likely to influence the role of NSP3 42 .
However, comprehensive information regarding the association of Glu670Lys with the virulence of SARS-CoV-2 is needed.
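The point about amino acid character can be made concrete with a small lookup of side-chain classes. This is a simplified sketch: the four-class grouping below is one common textbook convention (the assignment of residues such as glycine, cysteine, tyrosine and histidine varies between sources), and the function name is hypothetical.

```python
# Simplified side-chain classes (one-letter codes); an assumed textbook grouping.
SIDE_CHAIN_CLASS = {}
for residues, label in [
    ("DE", "acidic"),
    ("KRH", "basic"),
    ("STNQCY", "polar uncharged"),
    ("AVLIMFWPG", "non-polar"),
]:
    for aa in residues:
        SIDE_CHAIN_CLASS[aa] = label

def describe_substitution(ref_aa, alt_aa):
    """Return (ref_class, alt_class, note) for an amino acid substitution."""
    ref_cls = SIDE_CHAIN_CLASS[ref_aa]
    alt_cls = SIDE_CHAIN_CLASS[alt_aa]
    if ref_cls == alt_cls:
        note = "conservative (same class)"
    elif {ref_cls, alt_cls} == {"acidic", "basic"}:
        note = "charge reversal"
    else:
        note = "class change"
    return ref_cls, alt_cls, note

for ref, alt in [("E", "K"), ("D", "G"), ("P", "L")]:
    print(f"{ref}>{alt}:", describe_substitution(ref, alt))
```

Under this grouping, Glu>Lys swaps an acidic residue for a basic one, Asp>Gly removes a charged side chain, and Pro>Leu stays within the non-polar class. Whether any of these changes alters protein function cannot be read from the class alone.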
The substitution of aspartate by glycine at position 614 (Asp614Gly) of the S gene was found in all of our samples. The S gene encodes the spike protein (S), which is recognized by and binds to the host receptor so that the SARS-CoV-2 virus can enter the host cell. Several studies have reported a possible link between the D614G change and the ability of the virus to infect host cells. This variant has been found across multiple geographic regions 43-47 . The aspartate residue at position 614 lies outside the receptor-binding domain (RBD) and does not change the affinity of the S protein for ACE2, but is thought to play a role in ACE2-mediated cell transduction. In addition, it appears that the S protein played a key role in the evolution of the coronavirus in circumventing the host's immune mechanism 43,47-49 .
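The link between the A>G change at nucleotide position 23403 and residue 614 of the S protein can be verified with simple coordinate arithmetic. The sketch assumes that positions are numbered on the reference genome NC_045512.2, where the S coding sequence starts at nucleotide 21563 (1-based); the source does not state which reference coordinates were used, and the helper names are hypothetical.

```python
SPIKE_CDS_START = 21563  # assumption: NC_045512.2 coordinates, 1-based

def genome_pos_to_codon(pos, cds_start=SPIKE_CDS_START):
    """Map a 1-based genome position to (codon number, base within codon), both 1-based."""
    offset = pos - cds_start
    if offset < 0:
        raise ValueError("position lies upstream of the coding sequence")
    return offset // 3 + 1, offset % 3 + 1

def mutate_codon(codon, base_in_codon, new_base):
    """Return the codon with the base at the given 1-based position replaced."""
    i = base_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]

codon_number, base_in_codon = genome_pos_to_codon(23403)
print(codon_number, base_in_codon)
print(mutate_codon("GAT", base_in_codon, "G"))
```

Under these assumptions, position 23403 falls on the second base of codon 614, and replacing that base of GAT with G gives GGT, which agrees with the GAT>GGT (D>G) change reported in the results.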
Interestingly, we also found another variant besides D614G in the S protein in two of our patients, namely Q677H 50 . This variant was first reported in Surabaya, East Java, and was also found in our patients living in Banten, West Java. Personal communication with the patients indicated possible contact with relatives or family members from East Java. This underlines the importance of lockdown measures during the SARS-CoV-2 pandemic to limit or prevent the wide spread of highly virulent virus variants.
In addition to mutations in the S protein, we observed that patients with severe conditions had missense mutations in spike D614G, in NS3 (Orf3a) Q57H, and in NSP12 (Orf1ab) P323L. Majumdar et al. report an association between mutations in Orf3a and manifestations of SARS-CoV-2 immuno-pathogenic infection 51 . Wu et al. report that the Q57H mutation causes a dramatic change in protein structure that would affect the binding affinity for antiviral proteins 52 . The ORF3a protein is one of the largest accessory proteins in SARS-CoV-2 and has been linked to the pathogenesis of COVID-19 53 . NSP12, the RNA-dependent RNA polymerase (RdRp), catalyses the replication of RNA, so a mutation in it could affect the speed of viral replication 54 . The combination of these three mutations could lead to a more severe clinical presentation and fatality of COVID-19.
Mutations are expected as natural events within the viral life cycle. Viral adaptation to the host usually results in higher transmission potential, as has been observed for SARS-CoV, MERS, and influenza. The pattern and time course of mutations in virus genomes are critical in estimating phylogenetic trees, which, in turn, depict the epidemic course effectively in real time. Mutations can provide information for understanding emerging outbreaks. The field of genomic epidemiology is presently employed in the mitigation and control of the SARS-CoV-2 outbreak 55 .

CONCLUSION AND FUTURE PROSPECTS
This study has shown several changes in the nucleotide sequence of SARS-CoV-2 resulting in different variants. The changes that occur cause the virus to survive and may lead to detrimental impact