Drug Target Identification for Listeria monocytogenes by Subtractive Genomics Approach

We identify essential enzymes that catalyze critical metabolic reactions as potential drug targets, which may help to fight Listeria infections and their associated secondary infections. A comparative metabolic pathway approach was applied to identify putative drug targets against Listeria monocytogenes. Enzymes from pathways unique to L. monocytogenes EGD-e were determined using the KEGG database. They were further refined by selecting enzymes with sequences non-homologous to the host Homo sapiens and analysing their essentiality to the pathogen's survival. We report 15 essential pathogen-host non-homologous proteins as putative targets that can be exploited for development of specific drugs or vaccines against multidrug resistant strains of L. monocytogenes. Finally, four essential enzymes from the pathogen, UDP-N-acetylglucosamine 1-carboxyvinyltransferase, acetate kinase, phosphate acetyltransferase, and aspartate kinase, are reported as novel putative targets for vaccine and drug development against L. monocytogenes infections. The novelty of the work lies in unravelling target proteins and their associated pathways through comparative metabolic pathway analysis between L. monocytogenes EGD-e and the host H. sapiens, with a view to broad spectrum putative drug targets. This research design yields critical enzymes as putative drug targets whose inhibition is expected to be fatal to the pathogen without interfering with the host machinery.


INTRODUCTION
Humans fall victim to a large number of microbial pathogens that infect the epithelium of the respiratory, gastrointestinal, or genital tract. These pathogens further migrate to secondary locations through lymph and blood to cause diseases in the liver, spleen, bones, and central nervous system (CNS). 1 Listeria monocytogenes is a Gram-positive, rod-shaped, foodborne pathogenic bacterium that is associated with listeriosis, CNS infection, meningitis, sepsis, liver infection, spleen infection and premature birth/abortions. 2,3,5,6 L. monocytogenes has been found proliferating in the colon of a patient suffering from inflammatory bowel disease (IBD) at much higher levels than in a healthy individual. Resistant biofilms of Listeria have been reported on all forms of synthetic and biological surfaces, including gut epithelium. 7,8 L. monocytogenes enters the human host through the ingestion of contaminated food and reaches the intestine. It crosses the intestinal epithelial barrier and enters the bloodstream, from where it causes secondary infection in several organs such as the liver, spleen, bones, and brain. 2 In South Africa, 978 people were found positive for listeriosis, which led to 674 hospitalizations, and 183 deaths were recorded in 2018. 9 Listeriosis outbreaks due to foodborne infections have been recorded regularly in Turkey. 10 In 2013, the European Food Safety Authority (EFSA) recorded 1,763 cases of listeriosis from 27 states, leading to 191 deaths. 11,13,14
The current treatment of such Listeria infections depends on the use of antibiotics, which can prove ineffective and carry the risk of selecting drug-resistant strains. At first, the most common treatment for Listeria infections was administration of antibiotics such as penicillin G and/or ampicillin along with an aminoglycoside such as gentamicin or kanamycin. Subsequently, a combination of trimethoprim and a sulphonamide was also used as a therapy. 3,15 Initially, the majority of the L. monocytogenes strains (isolated from various sources) were susceptible to antibiotics active against Gram-positive bacteria. The first drug resistance was observed against the antibiotic tetracycline in 1988. 16,17 A multidrug resistant strain of L. monocytogenes was also isolated in France in 1988. 16,17 Subsequently, several multidrug resistant strains of Listeria have been discovered from the environment, food sources and human listeriosis gut/stool samples. 17,18 L. monocytogenes strains isolated from several food items (including ready-to-eat food) were reported to be resistant to several antibiotics such as ampicillin, clindamycin, nalidixic acid, penicillin (100%) and oxacillin (94.1%). 19 The indiscriminate use of effective antibiotics in humans and animals has resulted in exposure to high concentrations of these antibiotics, leading to resistance development in pathogenic strains of Listeria through gene alterations and transposons. 20,21 Infection with these multidrug resistant strains of L. monocytogenes through various food sources may pose a serious threat to public health. 9 Therefore, this research attempts to find enzymes catalyzing essential metabolic reactions as potential drug targets, which may help to fight L. monocytogenes infections and their associated secondary infections.
The easy accessibility of the human genome sequence, the complete genome sequences of several human pathogens, and the availability of several computational tools for their in silico analysis help us to identify biomarkers and potential drug targets to combat these pathogens. Among these approaches, comparative metabolic pathway analysis involves understanding the organism's physiology, intracellular processes, and networks, and can identify the specific target molecules responsible for survival, vital functions, or virulence. These can further be used to develop specific and effective drugs against the pathogens. Metabolic pathways illustrate how the biomolecular units interact with each other to carry out the functions required for survival, reproduction, and other organism-specific activities. 22 Understanding the phenotype and function of organisms requires a detailed analysis of the metabolic pathways involved along with the study of single genes, as the expression of all the protein metabolites is a result of the action of enzymes catalyzing their interconversion. Proteins are the functional biomolecular elements of the cell, which convert genetic information into practical reality. They are involved in gene regulation, cell metabolism, transport of nutrients, signal transduction, and enzymatic catalysis. Hence, pathogen-specific proteins/enzymes can be used as broad-spectrum potential drug targets against these pathogens.
In the present study, comparative metabolic pathway analysis was applied between L. monocytogenes EGD-e (reference genome) and H. sapiens with the aim of selecting enzymes unique to L. monocytogenes EGD-e that are non-homologous to the host. These enzymes can serve as putative targets for the development of drugs to eradicate this pathogen without interacting with the host machinery. These enzymes were further analysed for their essential role in survival of the pathogen, as targeting such critical enzymes is expected to be fatal to the pathogen and may help in developing broad-spectrum drugs. Different comparative genomics, transcriptomics, and proteomics studies have been performed between different strains of L. monocytogenes or between different species of Listeria with diverse objectives. The novelty of this work lies in unravelling target proteins and their associated pathways through comparative metabolic pathway analysis between L. monocytogenes EGD-e and the host Homo sapiens, with a view to broad spectrum putative drug targets.

Metabolic Pathway Analysis of Pathogen and Host
Metabolic pathways of L. monocytogenes EGD-e and H. sapiens were retrieved from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (https://www.genome.jp/kegg/). 23 A manual assessment was performed to classify common and unique metabolic pathways between host and pathogen. The metabolic pathways present in the pathogen but missing in the host were categorized as unique pathways, whereas the pathways present in both the pathogen and host were categorized as common pathways. The enzymes from unique metabolic pathways that had a valid Enzyme Commission number (EC number) were mapped, and their protein sequences were extracted from the National Center for Biotechnology Information (NCBI) genome database. 24

Identification of Pathogen-Host Non-homologous Proteins
BLASTP 25 analysis (Protein-protein Basic Local Alignment Search Tool) was performed for protein sequences of enzymes from unique metabolic pathways against the H. sapiens proteome. Pathogen protein sequences with a hit at an E-value cut-off of 5e-3 were considered homologous to the host and were excluded from the study. The protein sequences without a hit under this criterion were considered to have no significant homolog in H. sapiens and were selected for further analysis. 26
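The non-homology filter described above can be illustrated with a minimal sketch. It assumes the BLASTP results are available in the default 12-column tabular layout (-outfmt 6) and that a hit with an E-value at or below 5e-3 marks a query as host-homologous; the function names and the toy records are hypothetical and are not taken from the study.

```python
from typing import Iterable, List, Set

# Column positions in the default 12-column BLAST tabular layout (-outfmt 6):
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
QSEQID, EVALUE = 0, 10


def host_homologous_ids(blast_lines: Iterable[str], evalue_cutoff: float = 5e-3) -> Set[str]:
    """Return query IDs that have at least one host hit at or below the cut-off."""
    homologous = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if float(fields[EVALUE]) <= evalue_cutoff:
            homologous.add(fields[QSEQID])
    return homologous


def host_nonhomologous_ids(query_ids: Iterable[str], blast_lines: Iterable[str],
                           evalue_cutoff: float = 5e-3) -> List[str]:
    """Return queries without a host hit passing the cut-off, in input order."""
    homologous = host_homologous_ids(blast_lines, evalue_cutoff)
    return [q for q in query_ids if q not in homologous]


# Toy records (hypothetical IDs and values, not results from the study).
hits = [
    "enzA\thostP1\t52.0\t300\t140\t4\t1\t300\t5\t304\t1e-80\t250",
    "enzB\thostP2\t28.0\t60\t40\t3\t10\t70\t15\t72\t0.8\t25.0",
]
print(host_nonhomologous_ids(["enzA", "enzB", "enzC"], hits))
```

In this toy input, enzA is discarded because of its strong host hit, while enzB (only a weak hit) and enzC (no hit at all) are retained for the essentiality step.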

Identification of Essential Pathogen-Host Nonhomologous Proteins
BLASTP analysis was performed between the obtained non-homologous protein sequences and prokaryotic essential genes from the Database of Essential Genes (DEG; http://www.essentialgene.org/) to yield essential pathogen-host non-homologous proteins. 27 The criteria for essentiality were as follows: E-value < 1e-10, bit score ≥ 100, and percentage identity ≥ 30. 26
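The three essentiality criteria can be applied together as a single filter. The following is a minimal sketch, assuming the BLASTP results against DEG are in the default 12-column tabular layout (-outfmt 6); the function name and the toy records are hypothetical.

```python
from typing import Iterable, Set

# Positions in the default 12-column BLAST tabular layout (-outfmt 6).
QSEQID, PIDENT, EVALUE, BITSCORE = 0, 2, 10, 11


def essential_ids(blast_lines: Iterable[str], max_evalue: float = 1e-10,
                  min_bitscore: float = 100.0, min_identity: float = 30.0) -> Set[str]:
    """Return query IDs with a DEG hit satisfying all three criteria at once."""
    essential = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        f = line.split("\t")
        if (float(f[EVALUE]) < max_evalue
                and float(f[BITSCORE]) >= min_bitscore
                and float(f[PIDENT]) >= min_identity):
            essential.add(f[QSEQID])
    return essential


# Toy records (hypothetical IDs and values, not results from the study).
deg_hits = [
    "enzB\tDEG_x1\t61.5\t390\t150\t2\t1\t390\t1\t388\t2e-120\t410",  # passes all three
    "enzC\tDEG_x2\t29.0\t200\t142\t5\t1\t200\t3\t199\t1e-12\t105",   # identity below 30
    "enzD\tDEG_x3\t45.0\t90\t49\t1\t1\t90\t1\t90\t1e-11\t80.5",      # bit score below 100
]
print(sorted(essential_ids(deg_hits)))
```

Requiring all three thresholds on the same hit is a design choice of this sketch; a query whose criteria are met only by different hits is not counted as essential.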

Subcellular Localization of Proteins
The subcellular position of these proteins was predicted using CELLO version 2.5 (subCELlular LOcalization predictor version 2.5). 28 CELLO uses amino acid composition and di-peptide composition, based on physicochemical parameters of amino acids, to predict the subcellular position of proteins. 28 Gram-positive bacterial proteins have the following localization sites: the cell membrane, the cytoplasm, the extracellular space, and the cell wall.

Structural Classification of the Unique Identified Target Proteins
The enzymes predicted as drug targets were further analysed and classified by their three-dimensional structure and the presence of any ligands in their binding pockets. The available structures of the predicted potential drug targets were retrieved from the RCSB Protein Data Bank and studied. 29 The research design for this study is presented in the Figure.

KEGG Pathway Analysis Interpretations
KEGG database analysis revealed a total of 107 pathways associated with L. monocytogenes EGD-e. Most of these pathways were biosynthesis and metabolic pathways such as glycolysis, TCA cycle, carbohydrate metabolism, fatty acid biosynthesis and degradation, amino acid biosynthesis and degradation, nucleotide metabolism, peptidoglycan biosynthesis, vitamin metabolism, central dogma, and nitrogen and sulfur metabolism. Other pathways associated with bacterial functions and virulence were also reported, such as secondary metabolite biosynthesis, microbial metabolism in different environments, degradation of aromatic compounds, resistance to antibiotics such as vancomycin and beta-lactam, cationic antimicrobial peptide (CAMP) resistance, quorum sensing, chemotaxis, NOD-like receptor signaling pathway, two-component system, and bacterial invasion of epithelial cells (https://www.genome.jp/kegg/).
H. sapiens has 337 reported pathways in the KEGG database. These pathways consist of the eukaryotic biosynthesis and metabolic pathways. They also include several pathways related to drug and xenobiotic resistance and metabolism, signalling pathways, and apoptosis, among others.
The comparative metabolic pathway analysis using the KEGG database resulted in 28 pathways that were unique to the pathogen L. monocytogenes EGD-e, whereas 71 pathways were common to both L. monocytogenes EGD-e and H. sapiens (Table 1 and Table 2 respectively). A total of 180 enzymes with valid EC numbers were identified from the unique pathogen pathways (Supplementary Table 1).
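The unique/common split amounts to a set difference and a set intersection over pathway identifiers. The study performed this classification manually; the sketch below only illustrates the logic, assuming KEGG organism-specific pathway IDs (prefix lmo for L. monocytogenes EGD-e, hsa for H. sapiens) that share a five-digit map number. The function names are hypothetical and the ID lists are a toy subset, not the study's pathway tables.

```python
import re
from typing import Iterable, List, Tuple


def map_number(pathway_id: str) -> str:
    """Strip the organism prefix from a KEGG pathway ID: 'lmo00010' -> '00010'."""
    match = re.search(r"(\d{5})$", pathway_id.strip())
    if match is None:
        raise ValueError(f"not a KEGG pathway ID: {pathway_id!r}")
    return match.group(1)


def classify_pathways(pathogen_ids: Iterable[str],
                      host_ids: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (unique_to_pathogen, common_to_both) as sorted map numbers."""
    pathogen = {map_number(p) for p in pathogen_ids}
    host = {map_number(h) for h in host_ids}
    unique = sorted(pathogen - host)   # present in pathogen, missing in host
    common = sorted(pathogen & host)   # present in both organisms
    return unique, common


# Toy subset: glycolysis (00010), peptidoglycan biosynthesis (00550),
# two-component system (02020), apoptosis (04210).
pathogen_pathways = ["lmo00010", "lmo00550", "lmo02020"]
host_pathways = ["hsa00010", "hsa04210"]

unique, common = classify_pathways(pathogen_pathways, host_pathways)
print("unique:", unique)
print("common:", common)
```

Host-only pathways such as apoptosis fall into neither list, which matches the definitions used in the methods: only pathways present in the pathogen are classified.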

Determination of Essential Pathogen-Host Nonhomologous Proteins
The unique pathogen enzymes were then compared with the host proteome to identify host non-homologous proteins. Out of 180 enzymes, 120 enzyme sequences showed ≤ 35% identity with the human proteome, and 52 enzyme sequences did not show any hit against the human proteome (Supplementary Table 2). Enzymes that are unique to L. monocytogenes and show no significant homology to the host proteome may act as effective drug targets, as drug/vaccine candidates directed against them have a reduced risk of unwanted interaction with host proteins. Hence, such drugs are expected to be safe and not adversely affect human host metabolism.
Essential genes are the minimal set of genes that are obligatory for the existence of an organism. 30 Essential genes determined for 48 bacterial species have been listed in DEG. This list of essential genes can be directly extracted and used for BLAST analysis. Alternatively, a common list of prokaryotic essential genes can be used for the analysis of organisms that are not listed in DEG, as performed in this study (DEG; http://www.essentialgene.org/). The knockout of any bacterial essential gene can produce lethal phenotypes, so essential genes may act as significant drug targets. 31 This can also be exploited for development of specific drugs or vaccines against multidrug resistant pathogens such as L. monocytogenes. Some essential genes may be conserved over a number of related species and are potential targets for development of broad spectrum antibiotics. 27,31 Hence, the 172 host non-homologous enzyme sequences (120 + 52) were analyzed for essentiality to L. monocytogenes using DEG. Of these, 98 enzymes were found to be essential enzymes of L. monocytogenes, with an average identity of 49% to essential protein sequences of prokaryotes (Supplementary Table 3).

Subcellular Localization of Target Enzymes/ Proteins
Subcellular localization of these enzymes was determined by CELLO v.2.5, which may provide important information about the function of a protein. The bacterial proteins/enzymes present in the Gram-positive bacterial dataset are mostly localized in the cytoplasm and the cell membrane. The rest of the proteins are localized in the extracellular space, and very few are found at the cell wall. 32 The CELLO classification of our essential host non-homologous enzymes of L. monocytogenes is presented in Supplementary Table 4.

Prediction of Potential Targets/ Enzymes for Drug and Vaccine Development
Out of the 52 non-homologous enzymes (with no identity match with the H. sapiens proteome), 15 were recorded as essential enzymes for L. monocytogenes by DEG analysis, with an average E-value ≤ 1.4e-23 and identity ≥ 47%. Further, we propose four L. monocytogenes enzymes as putative drug targets that are completely non-homologous to human proteins and critically important to survival of the pathogen (with identity ≥ 60% to the essential prokaryotic sequences), as listed in Table 3. All four enzymes were predicted to be cytoplasmic. These enzymes tend to play a central role in the pathogenesis of L. monocytogenes infections in humans.
UDP-N-acetylglucosamine 1-carboxyvinyltransferase is an enzyme of the transferase class that catalyzes the transfer of an enolpyruvate group to UDP-N-acetyl-α-D-glucosamine, which is a significant and committing reaction of cell wall formation in bacteria. 33 It belongs to the 5-enolpyruvylshikimate-3-phosphate (EPSP) synthase family, MurA subfamily. This enzyme is also substantially involved in other biological processes such as cell division, cell wall cycle, cell wall organization, and cell shape regulation, along with the peptidoglycan biosynthetic pathway 33,34 (https://www.uniprot.org/).
Acetate kinase is involved in the acetyl-CoA biosynthesis pathway by catalyzing a subpathway reaction in which acetate is phosphorylated using ATP and a divalent cation such as Mg2+ or Mn2+. The reaction is summarized as: acetate + ATP = acetyl phosphate + ADP. The reverse reaction can also be catalyzed by the same enzyme. It is also involved in the biosynthesis of some metabolic intermediates, such as in the organic acid metabolic process (https://www.genome.jp/kegg/). It has also been reported that acetyl phosphate levels in L. monocytogenes are directly involved in controlling cell motility, chemotaxis, and resistant biofilm formation. In one study, Gueriri and co-workers developed acetate kinase mutants of L. monocytogenes with blocked synthesis of acetyl phosphate. These mutants showed a decreased ability to form biofilms and diminished expression of flagellar protein biosynthesis and motility genes. 35 Also, other studies have reported that some L. monocytogenes virulence factors such as VirR/VirS can be activated by the production of acetyl phosphate in the cells. 36 The activity of acetate kinase has also been recorded in other pathogenic intestinal bacterial strains such as Desulfovibrio piger Vib-7 and Desulfomicrobium sp. Rod-9. These pathogenic bacteria have been implicated in causing IBD in the human host. 37
The third drug target enzyme, phosphate acetyltransferase, shows transferase activity. Phosphate acetyltransferase, along with the subsequent action of acetate kinase, produces acetate from acetyl-CoA (or acetyl phosphate) and generates ATP. 38
The aspartate kinase enzyme performs kinase activity, transferase activity, and binding of ATP through the reaction: ATP + L-aspartate = 4-phospho-L-aspartate + ADP. Aspartate kinase is involved in cellular amino acid biosynthetic pathways such as lysine biosynthesis via diaminopimelate (DAP) formation, homoserine biosynthesis, and the threonine biosynthetic pathway 39 (https://www.uniprot.org/). To the best of our knowledge, these targets have not been used for drug/vaccine development.
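For reference, the reactions catalyzed by the four proposed targets can be written side by side. The acetate kinase and aspartate kinase equations are the ones given in the text; the equations for UDP-N-acetylglucosamine 1-carboxyvinyltransferase (MurA) and phosphate acetyltransferase are not spelled out in the text and are added here, as an assumption, in their standard textbook form.

```latex
\begin{align*}
% UDP-N-acetylglucosamine 1-carboxyvinyltransferase (MurA); standard form, not stated in the text
\text{phosphoenolpyruvate} + \text{UDP-}N\text{-acetyl-}\alpha\text{-D-glucosamine}
  &\rightleftharpoons \text{UDP-}N\text{-acetyl-3-}O\text{-(1-carboxyvinyl)-}\alpha\text{-D-glucosamine} + \text{phosphate} \\
% Acetate kinase (as given in the text)
\text{acetate} + \text{ATP} &\rightleftharpoons \text{acetyl phosphate} + \text{ADP} \\
% Phosphate acetyltransferase; standard form, not stated in the text
\text{acetyl-CoA} + \text{phosphate} &\rightleftharpoons \text{acetyl phosphate} + \text{CoA} \\
% Aspartate kinase (as given in the text)
\text{ATP} + \text{L-aspartate} &\rightleftharpoons \text{4-phospho-L-aspartate} + \text{ADP}
\end{align*}
```

Read together, the second and third lines show how the two enzymes couple: phosphate acetyltransferase running forward and acetate kinase running in reverse convert acetyl-CoA to acetate with the net formation of one ATP, with acetyl phosphate as the shared intermediate.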
Other than these four novel drug targets, most of the other 11 enzymes have also been established as potential candidates for drug targets and have been reported in several other studies. Fructose-bisphosphate aldolase (FBA) class II is a cytoplasmic or surface-exposed bacterial enzyme catalysing the cleavage of fructose-1,6-bisphosphate to D-glyceraldehyde-3-phosphate and dihydroxyacetone phosphate, an important reversible step in glycolysis and gluconeogenesis. 40 FBA is known to perform two or more unrelated functions in several bacterial species and hence is a moonlighting protein. FBA can play a significant role in binding to the host's cells and proteins and in the subsequent generation of an immune response, and hence is involved in the physiology and pathogenesis of the bacteria. 40 The structure and sequence of FBA remain conserved within and across bacterial species, so it can be exploited to develop broad spectrum antibiotics/vaccines against a wide group of pathogenic bacteria. 41 Mendonca and coworkers reported FBA class II as a novel immunogenic surface protein and a monoclonal antibody (mAb-3F8) against this protein for detection of Listeria spp. and to distinguish Listeria from other pathogenic bacteria. 42
The primary treatment for the L. monocytogenes infected population is administration of β-lactam antibiotics (such as penicillins), which target a set of enzymes, the penicillin-binding proteins (PBPs), involved in peptidoglycan linking. 43 Our findings also report penicillin-binding protein 2A as a putative drug target. Peptidoglycan is a principal element of the bacterial cell wall and is integral to cell structure and morphology. Peptidoglycan biosynthesis involves the creation of a mesh-like structure, which is facilitated by two steps: transglycosylation (elongation of the glycan chain) and transpeptidation (peptide cross-linking of the flanking glycan chains). 44 High molecular weight Class A PBPs catalyze both of these reactions through their N-terminal glycosyltransferase and C-terminal transpeptidase domains. 44 Another study, which applied several in silico approaches such as subtractive genomics and protein-protein interaction network topology, reported PBP4 along with 10 other proteins as putative drug targets in the L. monocytogenes EGD-e proteome. 45
We have also reported several phosphotransferase system (PTS) components, such as the cellobiose-specific IIB component, mannitol-specific IIB component, fructose-specific IIB component, sugar-specific IIA component, and beta-glucoside-specific IIA component, as putative drug targets. The PTS is composed of a few soluble proteins and one membrane-spanning protein, which are involved in the uptake/transport of PTS carbohydrates by the cell. 46 It has also been reported to be actively involved in several regulatory functions such as catabolite repression, potassium transport, nitrogen and phosphate metabolism, antibiotic resistance, endotoxin production, biofilm formation, and virulence of several pathogens including L. monocytogenes. 46 PTS-mediated sugar transport of cellobiose, mannose, and glucose has been reported to regulate the activity of PrfA, which is in turn the major transcription factor regulating the virulence genes in L. monocytogenes. 47 In a study based on comparative genomics of Vibrio cholerae, several constituents of the PTS were described as drug and vaccine targets against the pathogen. 48 Similarly, six components of the PTS were reported as putative drug targets in Klebsiella pneumoniae MGH78578 using an in silico approach. 49 Hence, the PTS may act as an effective drug target for L. monocytogenes too.

Structural Classification of the Unique Identified Target Proteins
The available structures of the predicted potential drug targets were retrieved from the RCSB Protein Data Bank and studied. 29 The structures of the predicted drug targets were available in PDB either for L. monocytogenes or for other microbes. The minimum criteria for considering a three-dimensional structure were identity ≥ 70%, query coverage of 80%, and E-value ≤ 0.00. The crystal structure of PBP 4 from L. monocytogenes in the ampicillin-bound form (PDB ID: 3ZG8) 50 and of PBP D2 from L. monocytogenes in the apo form (PDB ID: 5ZQA) 51 were accessible in PDB. Both these structures were studied using Escherichia coli as the expression system. Several ligands such as (2r,4s)-2-[(1r)-1-{[(2r)-2-Amino-2-Phenylacetyl]amino}-2-Oxoethyl]-5,5-Dimethyl-1,3-Thiazolidine-4-Carboxylic acid, glycerol and Di(hydroxyethyl) ether have been reported to bind to the A and B chains of this enzyme. Structures for other drug target proteins were available for other organisms. For instance, the apo structure of fructose 1,6-bisphosphate aldolase from Bacillus anthracis str. 'Ames Ancestor' (PDB ID: 3Q94) can be studied at PDB. 52 This crystal structure was determined through an X-ray diffraction experiment, and the enzyme was found to be composed of A and B chains with two reported ligands (1,3-Dihydroxyacetonephosphate and acetate ion) interacting with the A chain of the protein. Similarly, the structures of Cytochrome BD-I ubiquinol oxidase from Escherichia coli (PDB ID: 6RX4) 53 and Homoserine Dehydrogenase from Saccharomyces cerevisiae (PDB ID: 1EBU) 54 were available. Two ligands (3-Aminomethyl-Pyridinium-Adenine-Dinucleotide and L-Homoserine) interacting with the D chain of the Homoserine Dehydrogenase enzyme and four ligands (Heme b/c, Cis-Heme d hydroxychlorin gamma-Spirolactone, 1,2-Dioleoyl-Sn-Glycero-3-Phosphoethanolamine, and Ubiquinone-8) binding to the A and B chains of Cytochrome BD-I ubiquinol oxidase have been reported so far. Also, the structures of several enzymes reported as drug targets from the PTS were retrieved for different organisms from PDB. For example, the crystal structure of PTS System Cellobiose-specific Transporter Subunit IIB from Bacillus anthracis (PDB ID: 4MGE) 55 and the structure of the IIB domain of the mannitol-specific permease enzyme II from Escherichia coli (PDB ID: 1VKR) 56 can be studied from PDB. The crystal structure of the fructose-specific IIB subunit of the PTS was available for Bacillus subtilis (PDB ID: 2R48). 57 Similarly, the closest structures available for the sugar-specific IIA component and beta-glucoside-specific IIA component of the PTS were studied from PDB. The details of all these structures, including the classification, expression system, mutations, gene names and ligand interactions, have been compiled in Supplementary Table 5.
Out of the four novel drug targets predicted, the structures of two enzymes, UDP-N-acetylglucosamine 1-carboxyvinyltransferase and phosphate acetyltransferase, were available for L. monocytogenes. 58,59 The crystal structure of UDP-N-acetylglucosamine 1-carboxyvinyltransferase (murA) from L. monocytogenes EGD-e (PDB ID: 3R38) was determined through X-ray diffraction experiments using the Escherichia coli BL21 expression system. Two ligands have been determined for this protein (sulfate ion and chloride ion), which bind to chain A of the protein. 58 Similarly, the crystal structure of phosphate acetyl/butaryl transferase (from L. monocytogenes EGD-e) in complex with CoA (PDB ID: 3U9E) has been determined by X-ray diffraction studies using Escherichia coli BL21(DE3) as the expression system. Four ligand molecules (Coenzyme A, Arginine, Glycerol, Chloride) are known to interact with the A and B chains of the protein 59 (Supplementary Table 5).
The structure for the third novel protein, acetate kinase, was found for the organism Salmonella enterica subsp. enterica serovar Typhimurium (PDB ID: 3SK3) in the RCSB Protein Data Bank. Citric acid and 1,2-Ethanediol ligands interact with the A and B chains of the enzyme. 60 Similarly, the structure for aspartate kinase was found for the organism Pseudomonas aeruginosa PAO1 (PDB ID: 5YEI), which was determined by X-ray diffraction using Escherichia coli BL21(DE3) as the expression system. Three ligand molecules, Threonine, Lysine, and Glycerol, are known to interact with the protein 61 (Supplementary Table 5). Establishing the structural classification and identifying inhibitors for these target proteins opens different avenues of research towards drug design and reverse vaccinology.

CONCLUSION
The comparative metabolic pathway analysis of L. monocytogenes and H. sapiens resulted in four novel putative target proteins: UDP-N-acetylglucosamine 1-carboxyvinyltransferase, acetate kinase, phosphate acetyltransferase, and aspartate kinase, which are highly essential to the pathogen and non-homologous to the host. The other 11 enzymes on the list are also significant putative drug targets, and some of them have been reported by prior studies as well. The predicted potential drug target enzymes from L. monocytogenes are not expected to interact with the host machinery and perform essential functions such as peptidoglycan biosynthesis, cell motility, chemotaxis, resistant biofilm formation, virulence, bacterial pathogenicity, amino acid biosynthesis, cell division, and cell wall organization. Therefore, drug development against these targets to combat L. monocytogenes infections appears promising. The novelty of the work lies in unravelling target proteins and their associated pathways through comparative metabolic pathway analysis between Listeria monocytogenes EGD-e and the host Homo sapiens, with a view to specific and broad spectrum putative drug targets. In addition, detailed further analysis of these target proteins through in vivo and in vitro approaches may yield a new generation of biomolecules against the diseases caused by L. monocytogenes.

Figure. Schematic representation of the subtractive genomics methodology for identification of potential drug and vaccine targets against L. monocytogenes. KEGG pathway analysis of metabolism for the pathogen (L. monocytogenes) and the host (H. sapiens) led to the subsequent identification of essential pathogen-host non-homologous enzymes/proteins which can be targeted as possible drug/vaccine candidates. (KEGG: Kyoto Encyclopedia of Genes and Genomes; BLASTP: Protein-protein Basic Local Alignment Search Tool; DEG: Database of Essential Genes; CELLO v2.5: subCELlular LOcalization predictor version 2.5)

Table 1. List of metabolic pathways common to both the pathogen Listeria monocytogenes EGD-e and host Homo sapiens (KEGG: Kyoto Encyclopedia of Genes and Genomes)

Table 2. List of metabolic pathways which are unique to the pathogen Listeria monocytogenes EGD-e (KEGG: Kyoto Encyclopedia of Genes and Genomes)

Table 3. Potential drug target enzymes from Listeria monocytogenes EGD-e, showing their essential role in survival of the pathogen and non-homologous nature to host Homo sapiens (KEGG: Kyoto Encyclopedia of Genes and Genomes)