Structural Analysis of Avian Encephalomyelitis Virus Polyprotein for Development of Multi Epitopes Vaccine Using Immunoinformatics Approach

Avian encephalomyelitis (AE) is a disease caused by avian encephalomyelitis virus (AEV). The disease mainly affects the nervous system of young birds worldwide, causing high morbidity and variable mortality in chicks and a drop in egg production and hatchability in mature hens. Vaccination is the only way to control AEV infection, since there is no treatment yet for avian encephalomyelitis. This study aimed to use immunoinformatics approaches to predict a multi-epitope vaccine from the AEV polyprotein that could elicit both B and T cell responses. The vaccine construct comprises 482 amino acids: epitopes predicted against B and T cells by the IEDB server, an adjuvant, linkers and a 6-His-tag. The chimeric vaccine was potentially antigenic and nonallergenic and demonstrated thermostability and hydrophilicity in the ProtParam server. The solubility of the vaccine was measured in comparison to E. coli proteins. Stability was also assessed by disulfide bond engineering to reduce the high-mobility regions in the designed vaccine. Furthermore, molecular dynamics simulation supported the stability of the predicted vaccine. The tertiary structure of the vaccine construct, after prediction and refinement, was used for molecular docking with chicken alleles BF2*2101 and BF2*0401, and the docking demonstrated favourable binding energy scores of -337.47 kcal/mol and -326.87 kcal/mol, respectively. Molecular cloning demonstrated the potential clonability of the chimeric vaccine in the pET28a(+) vector, which supports efficient translation and expression of the vaccine within a suitable expression vector.


INTRODUCTION
Avian encephalomyelitis virus (AEV) is a positive-sense, single-stranded RNA virus belonging to the family Picornaviridae, genus Tremovirus. The AEV genome is approximately 7.5 kb long and comprises a single open reading frame (ORF) encoding a polyprotein of 2134 amino acids [Marvil et al. 1999; Wei et al. 2004]. This ORF consists of three regions: P1, P2 and P3. The P1 region encodes the four viral structural proteins VP1, VP2, VP3 and VP4, whereas the P2 and P3 regions encode nonstructural proteins [Marvil et al. 1999]. VP1 is considered the most immunogenic part of the virus, and neutralizing antibodies against it have been demonstrated [Muir et al. 1998].
There is no treatment yet for avian encephalomyelitis, and the only way to control it is by vaccination of flocks [Lin et al. 2018; Calnek 1998]. Flock vaccination programs are designed to produce offspring with maternal antibodies, which supports good performance of the offspring and stops transovarian transmission of the disease during the stage of susceptibility (one to three weeks post hatching) [Lin et al. 2018; Yu et al. 2015]. In China, breeders are mainly vaccinated at 14 to 16 weeks of age by administration of live vaccine (live field virus) in drinking water or by wing-web inoculation via intracutaneous injection [Lin et al. 2018; Smyth et al. 1994]. This immunity protects hens during laying and their progeny through maternal antibodies [Hauck et al. 2017; Lin et al. 2018; Westbury and Sinkovic 1978]. In addition, inactivated vaccine is widely used in the Chinese chicken industry; however, it is not as effective as live vaccine [Lin et al. 2018]. In this study we used the reverse vaccinology approach to predict a multi-epitope vaccine from the AEV polyprotein that could elicit both B lymphocytes and T lymphocytes and be considered a safer vaccine candidate.

Viral proteome retrieval
Avian encephalomyelitis virus (AEV) has a single genomic segment encoding one polyprotein, named glycoprotein-1 or AEV polyprotein (accession number NP_653151.1). The length of this polyprotein is 2134 amino acids. The polyprotein sequence was retrieved from the National Center for Biotechnology Information (NCBI) at (https://www.ncbi.nlm.nih.gov/genome/browse/#!/proteins/5472/891046%7CTremovirus%20A/viral%20segment%20Unknown/) and used in its entirety for vaccine epitope prediction.

Strains retrieval of AEV polyproteins
A set of six AEV polyprotein sequences from different strains was retrieved from the NCBI at (https://www.ncbi.nlm.nih.gov/protein/?term=aev+polyprotein) on 15.4.2020 with the following accession numbers: NP_653151.1; ALR74730.1; CAA12416.1; sp|Q9YLS4.1; sp|Q6R325.1 and sp|Q6WQ42.1. These proteins were further used to obtain the epitope sequences conserved among the retrieved strains.

Sequence alignment of the retrieved strains and epitopes conservancy
The polyprotein sequences of the retrieved strains were aligned using the multiple sequence alignment (MSA) tool ClustalW in the BioEdit program, version 7.0.9.0 [Hall 1999]. The purpose of the MSA was to obtain 100% conserved epitopes that could elicit B and T lymphocytes.
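As a rough illustration of the 100% conservancy criterion, the sketch below keeps only those candidate peptides that occur as exact substrings in every strain sequence. The strain sequences and peptides are short made-up placeholders, not AEV data, and the function name is hypothetical; the study itself read conservancy from the ClustalW alignment in BioEdit.

```python
def conserved_epitopes(candidates, strains):
    """Return candidates found as exact substrings in every strain sequence."""
    return [pep for pep in candidates
            if all(pep in seq for seq in strains.values())]


# Toy sequences (placeholders, not real AEV polyproteins).
strains = {
    "strain_A": "MSKLFSTVGRTVDNLPKEAAL",
    "strain_B": "MSKLFSTVGRSVDNLPKEAAL",
    "strain_C": "MSRLFSTVGRTVDNLPKEAAL",
}
candidates = ["LFSTVGR", "GRTVDNL", "DNLPKEAAL"]

print(conserved_epitopes(candidates, strains))
```

Exact substring matching is stricter than many conservancy tools, which also allow a minimum identity below 100%; it is used here because the text asks for fully conserved epitopes.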

Phylogenetic tree construction
The polyprotein sequences of the six retrieved AEV strains were subjected to phylogenetic analysis using MEGA6 software [Tamura et al. 2013]. A phylogenetic tree was built to show the common ancestry of the retrieved strains.
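The text does not state which tree-building method was selected in MEGA6. As a minimal sketch of the kind of quantity that distance-based trees start from, the code below computes the pairwise p-distance (the fraction of differing positions) between aligned sequences. The aligned sequences are toy placeholders and the function name is hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of aligned, gap-free columns at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(x != y for x, y in pairs) / len(pairs)


# Toy alignment (placeholders, not the AEV strains).
aligned = {
    "s1": "MSKLFSTVGR",
    "s2": "MSKLFSTVGK",
    "s3": "MSRLF-TVGK",
}
for (n1, q1), (n2, q2) in combinations(aligned.items(), 2):
    print(n1, n2, round(p_distance(q1, q2), 3))
```

Columns containing a gap in either sequence are skipped, which is one common convention (pairwise deletion); other treatments of gaps would give slightly different distances.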

B-cell epitopes prediction
The propensity scale and hidden Markov model based tools of the Immune Epitope Database (IEDB) analysis resource (http://tools.iedb.org/bcell/) were used to predict epitopes from the AEV polyprotein that interact with B lymphocytes. Three tools in the IEDB analysis resource were used to analyze B cell epitopes. To be considered a B cell epitope, an epitope should be linear, located on the surface of the antigen, and antigenic (able to elicit an immune response). Thus linear, surface-accessible and antigenic epitopes were assessed by BepiPred linear epitope prediction [Larsen et al. 2006], Emini surface accessibility [Emini et al. 1985] and the Kolaskar and Tongaonkar antigenicity method [Kolaskar and Tongaonkar 1990], respectively.
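The requirement that an epitope pass all three tools amounts to intersecting the residue ranges reported by each predictor. The sketch below shows one way to do this; the regions are hypothetical and the helper names are not part of any IEDB tool.

```python
def positions(regions):
    """Expand inclusive (start, end) regions into a set of residue positions."""
    return {i for start, end in regions for i in range(start, end + 1)}


def merge_runs(pos):
    """Collapse a set of positions into sorted inclusive (start, end) runs."""
    runs = []
    for i in sorted(pos):
        if runs and i == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], i)
        else:
            runs.append((i, i))
    return runs


# Hypothetical regions from each predictor (1-based, inclusive).
linear = [(5, 20), (40, 52)]
surface = [(10, 18), (45, 60)]
antigenic = [(12, 25), (44, 50)]

shared = positions(linear) & positions(surface) & positions(antigenic)
print(merge_runs(shared))
```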

T cell epitopes prediction
Since data on epitope binding to chicken MHC class I and MHC class II molecules are not yet available in the IEDB, human alleles were used for the prediction of T cell epitopes.

Epitope-MHC class I binding predictions
For MHC class I, peptide binding was evaluated with the IEDB MHC-I prediction tool at (http://tools.iedb.org/mhci/) using the Artificial Neural Network (ANN) method. Epitope length was set to 9-mers, and all conserved epitopes that bound to alleles with a percentile rank less than or equal to 2 were used for further analysis [Kim et al. 2012; Nielsen et al. 2003; Lundegaard et al. 2008; Sidney et al. 2008].
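The two settings described here, a 9-mer window and a percentile rank cutoff of 2, can be sketched as follows. The sequence, alleles and rank values are made-up placeholders rather than IEDB output, and the function names are hypothetical.

```python
def kmers(sequence, k=9):
    """Yield (1-based start, peptide) for every k-residue window."""
    for i in range(len(sequence) - k + 1):
        yield i + 1, sequence[i:i + k]


def strong_binders(predictions, max_rank=2.0):
    """Keep (peptide, allele, rank) rows with percentile rank <= max_rank."""
    return [row for row in predictions if row[2] <= max_rank]


seq = "MSKLFSTVGRTVDNL"  # placeholder, not the AEV polyprotein
print(list(kmers(seq))[:2])

# Hypothetical predictor output: (peptide, allele, percentile rank).
predictions = [
    ("MSKLFSTVG", "HLA-A*02:01", 0.8),
    ("SKLFSTVGR", "HLA-A*02:01", 5.4),
    ("KLFSTVGRT", "HLA-B*07:02", 2.0),
]
print(strong_binders(predictions))
```

A lower percentile rank indicates stronger predicted binding, so the cutoff is applied as an upper bound.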

Epitope-MHC class II binding predictions
Peptide binding to MHC-II molecules was assessed with the IEDB MHC-II prediction tool at (http://tools.iedb.org/mhcii/result/) using the NN-align method to identify the binding affinity and the MHC-II binding core epitopes. All conserved epitopes that bound to alleles with a percentile rank equal to or less than 100 were selected for further analysis.

Determination of antigenicity, allergenicity and toxicity of the predicted epitopes
The VaxiJen v2.0 server at (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/VaxiJen.html) was used to predict the antigenic epitopes of B and T lymphocytes. The server threshold was set to the default threshold (0.4). Antigenic epitopes were further investigated for allergenicity with the AllerTOP server [Dimitrov et al. 2013] and for toxicity with the ToxinPred server [Gupta et al. 2013].
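The three-step screen described in this section can be expressed as a single filter once the server results have been collected. The sketch below assumes the results were copied by hand into records; the peptides, scores and flags are invented for illustration, and treating a score at or above the threshold as antigenic is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Epitope:
    peptide: str
    vaxijen_score: float  # antigenicity score reported by the server
    allergen: bool        # allergenicity call
    toxic: bool           # toxicity call


def select(epitopes, threshold=0.4):
    """Keep antigenic (score >= threshold), non-allergenic, non-toxic epitopes."""
    return [e for e in epitopes
            if e.vaxijen_score >= threshold and not e.allergen and not e.toxic]


# Invented example records, not results from the study.
candidates = [
    Epitope("PEPTIDEAA", 0.72, False, False),
    Epitope("PEPTIDEAC", 0.31, False, False),
    Epitope("PEPTIDEAD", 0.95, True, False),
    Epitope("PEPTIDEAE", 0.40, False, False),
]
print([e.peptide for e in select(candidates)])
```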

Construction of the chimeric vaccine
The antigenic, nonallergenic and nontoxic epitopes were used to generate the vaccine construct against AEV. The proposed B cell epitopes and the epitopes with high allelic interaction for cytotoxic and helper T lymphocytes from the AEV polyprotein were used to construct the chimeric vaccine. Epitopes that interacted with both MHC-I and MHC-II alleles were used only once in the chimeric vaccine, as either MHC-I or MHC-II epitopes. The GGGGS linker was used to link the B cell and MHC-II epitopes, while the KK linker was used to link the MHC-I epitopes. The 50S ribosomal protein L7/L12 of Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv, UniProt accession no. P9WHE3) was used as an adjuvant at the amino terminus of the vaccine construct to enhance the immunogenicity of the chimeric vaccine. This adjuvant was linked to the epitopes via the EAAAK linker. A six-His-tag was added at the carboxyl terminus of the vaccine construct to ease the isolation and identification of the vaccine.
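The assembly rules in this section (adjuvant, EAAAK, GGGGS-linked B cell and MHC-II epitopes, KK-linked MHC-I epitopes, His-tag) can be sketched as simple string concatenation. The order of the epitope groups and the linker placed between the GGGGS block and the KK block are not stated in the text and are assumptions here; all sequences are placeholders.

```python
def build_construct(adjuvant, b_cell, mhc2, mhc1, his_tag="HHHHHH"):
    """Assemble adjuvant + EAAAK + epitope blocks + His-tag as one sequence."""
    ggggs_block = "GGGGS".join(list(b_cell) + list(mhc2))
    kk_block = "KK".join(mhc1)
    # Assumption: the two epitope blocks are joined by a KK linker.
    return adjuvant + "EAAAK" + ggggs_block + "KK" + kk_block + his_tag


# Placeholder sequences; the real adjuvant is the L7/L12 protein (P9WHE3).
construct = build_construct(
    adjuvant="MAKLS",
    b_cell=["AAA"],
    mhc2=["CCC", "DDD"],
    mhc1=["EEE", "FFF"],
)
print(construct, len(construct))
```

Keeping the linkers as explicit strings makes it easy to check that the final length matches the expected total of adjuvant, epitope, linker and tag residues.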

Chimeric vaccine physical and chemical properties
ProtParam (https://web.expasy.org/protparam/) is a web tool used to compute the physical and chemical properties of a given protein sequence. The chimeric vaccine construct built from the predicted epitopes was analyzed for its physical and chemical properties. These properties comprise the vaccine protein molecular weight (MW), theoretical isoelectric point (pI), amino acid and atomic compositions, extinction coefficient, estimated half-life, instability index, aliphatic index and grand average of hydropathicity (GRAVY).

Journal of Pure and Applied Microbiology
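Two of the ProtParam quantities listed in this section follow simple published formulas and can be reproduced offline. The sketch below computes GRAVY as the mean Kyte-Doolittle hydropathy per residue and the aliphatic index as X(Ala) + 2.9 X(Val) + 3.9 (X(Ile) + X(Leu)), where X is the mole percent of each residue. It is an independent illustration, not the ProtParam implementation, and the test sequence is a placeholder.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    return sum(KD[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    def mole_percent(aa):
        return 100.0 * seq.count(aa) / len(seq)
    return (mole_percent("A") + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


seq = "MAKLSEAAAKGGGGSHHHHHH"  # placeholder, not the vaccine construct
print(round(gravy(seq), 3), round(aliphatic_index(seq), 2))
```

A negative GRAVY indicates an overall hydrophilic sequence, and a higher aliphatic index is generally associated with greater thermostability.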

Secondary structure prediction
The self-optimized prediction method (SOPMA) server at (https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_sopma.html) [Combet et al. 2000] was used to analyze the proportions of helices, coils and beta sheets in the secondary structure of the vaccine construct.

Tertiary structure prediction
The chimeric vaccine sequence was submitted to the PHYRE2 protein fold recognition server (http://www.sbg.bio.ic.ac.uk/~phyre2/html/page.cgi?id=index) [Kelley et al. 2015]. The output PDB file was used for refinement of the chimeric vaccine structure.

Tertiary structure refinement and validation
The GalaxyWEB server was used for the refinement process [Shin et al. 2014; Ko et al. 2012]. The server refines the protein structure by repacking side chains and then relaxing the overall structure through dynamics simulations [Heo et al. 2013]. The refinement was performed to improve the physical quality of the structure. The ProSA-web server at (https://prosa.services.came.sbg.ac.at/prosa.php) was first used for model validation; this server calculates an overall quality score for a specific input protein PDB structure. Secondly, the model was validated via a Ramachandran plot in the RAMPAGE server at (http://mordred.bioc.cam.ac.uk/~rapper/rampage.php) [Lovell et al. 2002; Al-Hakim et al. 2015].

Solubility of the chimeric vaccine
Protein-Sol (https://protein-sol.manchester.ac.uk/) is a web-based tool for the prediction of protein solubility [Hebditch et al. 2017]. The server predicted the solubility of the chimeric vaccine as a scaled solubility value (QuerySol), which is compared to an E. coli experimental dataset expressed as a population average solubility (PopAvrSol) of 0.45. A protein with a scaled solubility equal to or greater than the E. coli population average (0.45) is considered soluble.

Stability of the vaccine construct (Disulfide bonds prediction)
Disulfide by Design 2.0 (DbD2), a tool for disulfide engineering in proteins, was applied to predict potential disulfide bonds between residues of the chimeric vaccine [Craig and Dombkowski 2013]. Residue pairs are evaluated for proximity and geometry consistent with disulfide bond formation, assuming that the residues of each pair are mutated to cysteine.

Molecular dynamics simulation
To explore the collective motions of proteins or nucleic acids, the online server iMODS was used.

Molecular docking of chimeric vaccine with chicken alleles
The HDOCK server (http://hdock.phys.hust.edu.cn/), which performs protein-protein and protein-DNA/RNA docking, was used to dock the vaccine construct with chicken alleles [Yan et al. 2017]. The vaccine construct PDB file (ligand) was submitted to the server together with the chicken MHC class I molecules BF2*2101 and BF2*0401. These alleles were retrieved from the NCBI database with the following identifiers: 4D0C, CAK54661.1 for the BF2*2101 molecule and 4D0C, CAK54660.1 for the BF2*0401 molecule, and were used as receptors in the docking process.

Codon adaptation and in silico cloning
In silico cloning was performed to assess the expression of the vaccine construct in the selected host. The protein sequence of the chimeric vaccine was first converted into a DNA sequence via the Java Codon Adaptation Tool (JCat) server (http://www.prodoric.de/JCat). Rho-independent transcription terminators, prokaryotic ribosome binding sites and restriction enzyme cleavage sites were avoided [Shey et al. 2019]. In JCat, the ideal codon adaptation index (CAI) score is 1.0, but a score >0.8 is considered good [Morla et al. 2016]. The favourable GC content of a sequence ranges between 30 and 70%. The recognition sequences of the BamHI and XhoI restriction enzymes were placed at the 5′ and 3′ ends of the DNA sequence, respectively. The SnapGene restriction cloning module was used to insert the DNA sequence into the pET28a(+) vector between the BamHI and XhoI restriction sites.
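Two of the checks described in this section are easy to reproduce by hand: the GC content of the adapted sequence and the addition of BamHI (GGATCC) and XhoI (CTCGAG) recognition sites at the 5′ and 3′ ends. The sketch below is an illustration with a placeholder insert; it does not compute the CAI, does not handle the reading frame relative to the vector, and the function names are hypothetical.

```python
BAMHI = "GGATCC"  # BamHI recognition sequence
XHOI = "CTCGAG"   # XhoI recognition sequence


def gc_content(dna):
    """Percentage of G and C bases in a DNA sequence."""
    dna = dna.upper()
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


def add_cloning_sites(insert):
    """Flank the insert with BamHI (5') and XhoI (3') sites.

    Raises ValueError if either site already occurs inside the insert,
    since digestion would then cut the insert itself.
    """
    insert = insert.upper()
    for name, site in (("BamHI", BAMHI), ("XhoI", XHOI)):
        if site in insert:
            raise ValueError(f"internal {name} site found")
    return BAMHI + insert + XHOI


insert = "ATGGCTAAAGAAGCT"  # placeholder, not the vaccine gene
gc = gc_content(insert)
print(round(gc, 2), 30.0 <= gc <= 70.0)
print(add_cloning_sites(insert))
```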

Sequence alignment and epitopes conservancy
MSA of all retrieved AEV polyprotein sequences was performed using ClustalW embedded in the BioEdit software. The software was used to obtain 100% conserved epitopes from the retrieved strains. Epitope conservancy was determined via alignment of the reference sequence with the sequences of the other retrieved strains. Fig. (1-a) shows the conserved regions, by amino acid identity, among the retrieved sequences.

Phylogenetic tree construction
Although mutated regions were observed, the alignment provided conserved regions among the retrieved strains. The mutated regions resulted in evolutionary divergence among the retrieved strains. For instance, in Fig. (1-b) the strains CAA12416.1 and sp|Q9YLS4.1 were closely related to each other in their evolution. On the other hand, the strain ALR74730.1 was distantly related to all other strains and showed the greatest molecular divergence.

B-cell Epitopes Prediction
The AEV polyprotein reference sequence was analyzed by the BepiPred linear epitope prediction, Emini surface accessibility, and Kolaskar and Tongaonkar antigenicity analysis tools in IEDB with thresholds of 0.06, 1.000 and 1.049, respectively (Fig. 2). Epitopes that passed all three tools were considered potential B cell epitope determinants. The three tools predicted 76 linear conserved epitopes, 44 epitopes on the surface and 30 antigenic epitopes. However, only 14 epitopes were shared by the three tools and were further investigated for antigenicity using the VaxiJen software with the default threshold (0.4), allergenicity and toxicity. Upon investigation, only three epitopes were shown to be antigenic, nonallergenic and nontoxic. The three epitopes, their positions and their scores in the different tools are shown in Table (1).

Cytotoxic T-lymphocyte epitopes prediction:
Based on the Artificial Neural Network (ANN) analysis tool, only 22 epitopes were shown to interact with numerous MHC-I alleles. All the predicted MHC-I epitopes were further analyzed for antigenicity, allergenicity and toxicity. Ten epitopes demonstrated antigenicity and were shown to be nonallergenic and nontoxic. The ten epitopes, with their scores in the different tools, are shown in Table (2) and were selected as cytotoxic T lymphocyte epitopes.

Helper T lymphocytes epitopes prediction
Based on the NN-align analysis method, 77 epitopes were predicted to interact with MHC-II alleles. Among them, only 14 epitopes demonstrated antigenicity and were shown to be nonallergenic and nontoxic. These epitopes were therefore selected as helper T cell epitopes and are shown in Table (3).

Construction of multi-epitopes vaccine
The chimeric vaccine includes the predicted B and T cell epitopes. Three epitopes were proposed as B cell epitopes, ten as cytotoxic T cell epitopes and fourteen as helper T cell epitopes. The chimeric vaccine was composed of 482 amino acids after addition of the adjuvant, linkers and 6-His-tag (Fig. 3). The chimeric vaccine demonstrated antigenicity in the VaxiJen server with a score of 0.5857 and was a nonallergen in the AllerTOP server.

Physical and chemical properties of the vaccine construct
The MW of the chimeric vaccine was 49.80192 kDa with a pI value of 9.37. The total numbers of negatively (Asp+Glu) and positively (Arg+Lys) charged residues were 45 and 61, respectively. The extinction coefficient was 37945, assuming all pairs of Cys residues form cystines. The estimated half-life was 30 hours (mammalian reticulocytes, in vitro), >20 hours (yeast, in vivo) and >10 hours (Escherichia coli, in vivo). The instability index (II) was 28.38, demonstrating the stability of the chimeric vaccine. The aliphatic index was 75.29 and the GRAVY was -0.127, indicating the hydrophilicity of the chimeric vaccine.

Chimeric vaccine secondary structure prediction
Fig. (4) shows that, of the 482 amino acids of the predicted vaccine, 150 aa (31.12%) were involved in the formation of alpha helices, 132 aa (27.39%) were extended strands, 66 aa (13.69%) were beta turns and 134 aa (27.80%) were random coils, with no ambiguous or other states.

Tertiary structure prediction
Fig. (5) shows the 3D structure of the chimeric vaccine predicted by the PHYRE2 server. After refinement, the model was further assessed by Ramachandran plot, which showed that 399 residues were in the favoured region (83.1%), 59 residues were in the allowed region (12.3%) and 22 residues were in the outlier region (4.6%). Moreover, the ProSA server Z-score of the chimeric vaccine was -4.18, which represents good quality of the model.

Fig. (6) shows that the solubility of the chimeric vaccine (QuerySol scaled solubility value) was 0.470, compared to 0.45 for the experimental dataset (PopAvrSol) of E. coli proteins. This result showed that the chimeric vaccine is potentially soluble.
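The percentages reported for the secondary structure and the Ramachandran plot can be checked directly from the residue counts. The sketch below recomputes them; the counts are taken from the text, while the helper name is hypothetical. Note that the Ramachandran counts sum to 480 rather than 482, presumably because the two terminal residues lack a complete phi/psi pair.

```python
def percentages(counts):
    """Return each count as a percentage of the total of all counts."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


# Residue counts as reported in the text.
secondary = {"alpha helix": 150, "extended strand": 132,
             "beta turn": 66, "random coil": 134}
ramachandran = {"favoured": 399, "allowed": 59, "outlier": 22}

for label, counts in (("secondary", secondary), ("ramachandran", ramachandran)):
    print(label, sum(counts.values()))
    for name, pct in percentages(counts).items():
        print(f"  {name}: {pct:.2f}%")
```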

Stability (Disulfide bonds prediction) of the chimeric vaccine
A total of 65 amino acid residue pairs were predicted to be capable of disulfide bond formation. Among them, five residue pairs were selected based on the chi3 value (between −87 and +97), the B-factor value and an energy value less than 3 (Fig. 7). The five residue pairs most strongly implicated in disulfide bond formation, if mutated to cysteine, were LEU 67 with ASP 98, GLY 71 with GLY 124, LYS 73 with GLY 76, GLY 181 with TYR 206 and PRO 407 with VAL 410.
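The screening criteria stated here can be applied as a simple filter over the candidate pairs exported from the disulfide prediction. The sketch below uses the chi3 range and energy cutoff as written in the text; no B-factor cutoff is given, so it is left out. The candidate records are invented and the field names are hypothetical.

```python
def passes(pair, chi3_range=(-87.0, 97.0), max_energy=3.0):
    """Apply the chi3 range and energy cutoff stated in the text."""
    low, high = chi3_range
    return low <= pair["chi3"] <= high and pair["energy"] < max_energy


# Invented candidate pairs (not results from the study).
candidates = [
    {"residues": ("A10", "A42"), "chi3": 95.0, "energy": 2.1},
    {"residues": ("A15", "A60"), "chi3": 120.0, "energy": 1.5},
    {"residues": ("A22", "A30"), "chi3": -80.0, "energy": 4.2},
]
selected = [c["residues"] for c in candidates if passes(c)]
print(selected)
```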

Molecular dynamics simulation
Molecular dynamics simulation of the vaccine protein was performed by normal mode analysis (NMA) in the iMODS server and is presented in Fig. (8). The arrows indicate the direction of mobility of each residue in the chimeric vaccine construct (Fig. 8a). The deformability of the molecule, associated with the distortion of individual residues, is represented by hinges in the chain (Fig. 8b). The experimental B-factor was taken from the corresponding PDB field and the calculated B-factor was obtained from the NMA (Fig. 8c). The eigenvalue, which reflects the stiffness of the motion, was 3.145184e−06 (Fig. 8d); the lower the eigenvalue, the easier the deformation. The covariance matrix shows the coupling between pairs of residues, i.e. whether they experience correlated (red), uncorrelated (white) or anti-correlated (blue) motions (Fig. 8e). The elastic network model defines which pairs of atoms are connected by springs, with each dot representing one spring between the corresponding pair of atoms. Dots are coloured according to their stiffness: darker grey dots indicate stiffer springs and vice versa (Fig. 8f).

Molecular docking of chimeric vaccine with chicken alleles
The chimeric vaccine was used as the ligand and the chicken alleles (BF2*2101 and BF2*0401) as receptors. As shown in Fig. (9a), docking of the vaccine construct with PDB: 4D0C, CAK54661.1 gave a binding energy score of -337.47. For PDB: 4D0C, CAK54660.1 the binding energy score was -326.87 (Fig. 9b). These negative scores indicated strong binding between the vaccine protein and the chicken alleles.

In silico cloning
In silico cloning of the DNA sequence of the vaccine protein gave a CAI value of 0.9742, reflecting a high proportion of the most abundant codons, while the GC content was 49.23928%, which is within the favourable range. Fig. (10) shows the DNA sequence cloned into the pET28a(+) vector between the BamHI and XhoI restriction sites.

DISCUSSION
The prevention and control of avian encephalomyelitis virus via vaccination of young chickens is of great importance [Calnek and Jehnich 1959a; Calnek and Jehnich 1959b; Schaaf 1958]. However, some drawbacks were reported during the course of the vaccination process. For instance, the wing-web and intramuscular vaccines were more pathogenic than is desirable, although the vaccine was capable of eliciting an immune response. In this study, the reverse vaccinology approach was used to design a safe multi-epitope vaccine from the avian encephalomyelitis virus polyprotein.
In this study only the 100% conserved epitopes from the AEV polyprotein were selected as B and T lymphocyte epitopes. The predicted B cell epitopes were assessed as linear, surface accessible and antigenic using IEDB prediction tools. For T cells, large numbers of epitopes were shown to bind to different MHC-I and MHC-II alleles. The predicted B and T lymphocyte epitopes showed antigenicity and were found to be nonallergenic and nontoxic, and thus were used as vaccine candidates. The epitopes were joined together using appropriate linker sequences. The physical and chemical properties of the vaccine construct were assessed via the ProtParam server. The computed instability index (II) classified the protein as stable, and the aliphatic index reflected the content of aliphatic side chains, which is associated with thermal stability. The negative GRAVY value classified the vaccine as hydrophilic. Furthermore, the secondary and tertiary structures of the vaccine were evaluated since they are important in vaccine design [Meza et al. 2017]. The predicted model was considerably improved by the refinement software and showed desirable characteristics on the Ramachandran plot. These results indicated that the overall model quality of the vaccine protein was satisfactory.
Figure caption: The vaccine construct is shown in yellow and the chicken alleles in rainbow colours. (a) Vaccine construct docked with BF2*2101: cartoon structure (left) and ball structure (right). (b) Vaccine construct docked with BF2*0401: cartoon structure (left) and ball structure (right).
The formation of insoluble protein particles may prevent the vaccine protein from adopting its native form. In addition, purification of the vaccine construct may cause structural alterations during the solubilization and refolding steps [Silva et al. 2016; Dill and Shortle 1991]. Thus the solubility of the vaccine construct is a cornerstone in determining the nature of the vaccine. It was reported that the solubility of a recombinant protein in E. coli is important for biochemical and functional analysis. In this study the vaccine construct had a solubility index of 0.470 compared to that of E. coli (0.45), indicating favourable solubility of the vaccine construct.
Vaccines are biological products with their own stability issues, which must be considered during development. Stability problems hinder the efficacy of a vaccine construct; thus it is of great importance to assess vaccine stability. Moreover, the folding stability of a protein can directly affect the availability of B-cell epitopes [Scheiblhofer et al. 2017]. For instance, protein destabilization leads to improper folding of the protein tertiary structure, which results in loss of conformational epitopes. It can also result in proteins whose epitopes are not recognized by IgE but which maintain their capacity to stimulate T-cell responses [Swoboda et al. 2007; Thalhamer et al. 2010]. In this study the stability of the chimeric vaccine was assessed via disulfide bond prediction. The result showed that stability would be improved if five residue pairs in the vaccine structure were mutated to cysteine. Besides the disulfide bond analysis, a molecular dynamics study was performed to assess the stability of the complex as well. Previously, the stability of macromolecules was linked with correlated fluctuations of atoms [Clarage et al. 1995; Caspar 1995].
Thus we performed essential dynamics analysis based on the normal modes of the protein in the iMODS server to further assess the stability of the vaccine. The analysis showed no significant distortion of the protein atoms, indicating a low chance of deformability of the vaccine construct and thereby strengthening our prediction.
Molecular docking of the vaccine construct with chicken MHC-I molecules (BF2*2101 and BF2*0401) was performed to explore the binding affinity of the vaccine construct and its potential to elicit an immune response. In this study the attractive binding energy between the chicken MHC-I molecules and the vaccine construct demonstrated high binding affinity. This was expressed as negative binding energy values, indicating strong interaction of the vaccine with MHC-I molecules, which could elicit a protective immune response.
The clonability of the chimeric vaccine in a suitable E. coli expression vector is a significant step for recombinant protein production [Chen 2012; Rosano and Ceccarelli 2014]. The protein sequence of the designed vaccine was converted to a DNA sequence via reverse translation and adapted for E. coli strain K12 before cloning into the pET28a(+) vector. The CAI value and the GC content suggested that the cloned sequence could support a high level of protein expression in the host bacteria.

CONCLUSION
Peptide-based vaccine design via reverse vaccinology is becoming one of the important tools for designing vaccines. The predicted vaccine demonstrated favourable predicted interaction with B and T cells and is expected to induce humoral and cellular responses. The predicted vaccine showed chemical stability and solubility and demonstrated good binding with chicken alleles. Molecular cloning showed that the vaccine could be cloned and produced as a recombinant vaccine. The effectiveness and safety of the vaccine designed by this computational analysis need to be evaluated experimentally, including in clinical trials, to confirm its efficacy in inducing a protective immune response.