990 results for 270202 Genome Structure


Relevance:

30.00% 30.00%

Publisher:

Abstract:

Eukaryotic ribosomal DNA constitutes a multigene family organized in a cluster called the nucleolar organizer region (NOR); this region usually comprises hundreds to thousands of tandemly repeated units. Ribosomal genes, being repeated sequences, evolve following the typical pattern of concerted evolution. The autonomous retroelement R2 inserts into the 28S ribosomal gene, producing defective 28S rDNA genes. As a retrotransposon, R2 increases its copy number in the genome through a “copy and paste” mechanism called target-primed reverse transcription: the element’s mRNA is reverse-transcribed into DNA, which is then integrated at the target site. Because reverse transcription can be interrupted while integration still proceeds, truncated copies of the element are also present in the genome. The study of these truncated variants is a tool to examine the activity of the element. R2 phylogeny is, in general, not consistent with that of its hosts, with some exceptions (e.g. Drosophila spp. and Reticulitermes spp.); moreover, R2 is absent in some species (Fugu rubripes, human, mouse, etc.), while other species carry more than one R2 lineage in their genome (the turtle Mauremys reevesii, the Japanese beetle Popillia japonica, etc.). The R2 elements presented here were isolated from four species of notostracan branchiopods and two species of stick insects, whose reproductive strategies range from strict gonochorism to unisexuality. Sequencing data show that in Triops cancriformis (Spanish gonochoric population), Lepidurus arcticus (two putatively unisexual populations from Iceland) and Bacillus rossius (gonochoric population from Capalbio) the R2 elements are complete and encode functional proteins, reflecting the general features of this family of transposable elements. On the other hand, R2 elements from Italian and Austrian populations of T. cancriformis (respectively unisexual and hermaphroditic), Lepidurus lubbocki (two elements within the same Italian population, gonochoric but with non-functional males) and Bacillus grandii grandii (gonochoric population from Ponte Manghisi) have sequences that encode incomplete or non-functional proteins in which only part of the characteristic domains can be recognized. In Lepidurus couesii (Italian gonochoric populations) different elements were found, as in L. lubbocki, and the sequencing is still in progress. Two hypotheses have been proposed to explain the inconsistency between R2 and host phylogenies: vertical inheritance of the element followed by extinction/diversification, or horizontal transmission. My data support previous studies that indicate vertical transmission as the most likely explanation; nevertheless, horizontal transfer events cannot be excluded. I also studied the element’s activity in Spanish populations of T. cancriformis, in L. lubbocki, in L. arcticus and in gonochoric and parthenogenetic populations of B. rossius. In gonochoric populations of T. cancriformis and B. rossius I found that each individual has its own private set of truncated variants. The situation is the opposite for the remaining hermaphroditic/parthenogenetic species and populations: in the samples analyzed so far, all individuals share the majority of variants. This result is of particular interest because it does not agree with the Muller’s ratchet theory, which hypothesizes that parthenogenetic populations are either devoid of transposable elements or overloaded with them. My data suggest a possible epigenetic mechanism that blocks retrotransposon activity, so that deleterious mutations do not accumulate.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The research presented in my PhD thesis is part of a wider European project, FishPopTrace, focused on traceability of fish populations and products. My work was aimed at developing and analyzing novel genetic tools for a widely distributed marine fish species, the European hake (Merluccius merluccius), in order to investigate population genetic structure and explore potential applications to traceability scenarios. A total of 395 SNPs (Single Nucleotide Polymorphisms) were discovered from a massive collection of Expressed Sequence Tags, obtained by high-throughput sequencing, and validated on 19 geographic samples from the Atlantic and the Mediterranean. Genome-scan approaches were applied to identify polymorphisms in genes potentially under divergent selection (outlier SNPs), which show higher genetic differentiation among populations than the average observed across loci. Comparative analyses of population structure were carried out on putatively neutral and outlier loci at wide (Atlantic and Mediterranean samples) and regional (samples within each basin) spatial scales, to disentangle the effects of demographic and adaptive evolutionary forces on the genetic structure of European hake populations. Results demonstrated the potential of outlier loci to unveil fine-scale genetic structure, possibly identifying locally adapted populations, despite the weak signal shown by putatively neutral SNPs. The application of outlier SNPs within the framework of fishery resources management was also explored. A minimum panel of SNP markers showing maximum discriminatory power was selected and applied to a traceability scenario aiming at identifying the basin (and hence the stock) of origin, Atlantic or Mediterranean, of individual fish. This case study illustrates how molecular analytical technologies have operational potential in real-world contexts and, more specifically, potential to support fisheries control and enforcement and the traceability of fish and fish products.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The objective of this work is to characterize chromosome 1 of A. thaliana, a small flowering plant used as a model organism in studies of biology and genetics, on the basis of a recent mathematical model of the genetic code. I analyze and compare different portions of the genome: genes, exons, coding sequences (CDS), introns, long introns, intergenes, untranslated regions (UTR) and regulatory sequences. To accomplish this task, I transformed nucleotide sequences into binary sequences based on the definition of three different dichotomic classes. The descriptive analysis of the binary strings indicates the presence of regularities in each portion of the genome considered. In particular, there are remarkable differences between coding sequences (CDS and exons) and non-coding sequences, suggesting that the frame is important only for coding sequences and that dichotomic classes can be useful to recognize them. Then, I assessed the existence of short-range dependence between binary sequences computed on the basis of the different dichotomic classes. I used three different measures of dependence: the well-known chi-squared test and two indices derived from the concept of entropy, namely Mutual Information (MI) and Sρ, a normalized version of the “Bhattacharyya Hellinger Matusita distance”. The results show that there is a significant short-range dependence structure only for the coding sequences, whose existence is a clue to an underlying error detection and correction mechanism. Further studies are needed to assess how the information carried by dichotomic classes could discriminate between coding and non-coding sequences and, therefore, contribute to unveiling the role of the mathematical structure in error detection and correction mechanisms. Still, I have shown the potential of the approach presented for understanding the management of genetic information.
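
As a rough illustration of the Mutual Information measure mentioned in this abstract, the sketch below estimates MI (in bits) between two binary sequences from their empirical joint frequencies. The abstract does not define the three dichotomic classes, so the purine/pyrimidine mapping used here is a stand-in chosen only for illustration, and the function names are hypothetical.

```python
import math
from collections import Counter

# Stand-in binary encoding (an assumption, NOT one of the thesis's
# dichotomic classes): purines (A, G) -> 1, pyrimidines (C, T) -> 0.
PURINE_CLASS = {"A": 1, "G": 1, "C": 0, "T": 0}


def to_binary(seq, mapping=PURINE_CLASS):
    """Encode a nucleotide string as a list of 0/1 values."""
    return [mapping[base] for base in seq.upper()]


def mutual_information(x, y):
    """Empirical mutual information, in bits, of two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    n = len(x)
    joint = Counter(zip(x, y))   # counts of each (x_i, y_i) pair
    count_x = Counter(x)         # marginal counts of x
    count_y = Counter(y)         # marginal counts of y
    mi = 0.0
    for (a, b), c in joint.items():
        p_ab = c / n
        # p_ab / (p_a * p_b) written with raw counts
        mi += p_ab * math.log2(p_ab * n * n / (count_x[a] * count_y[b]))
    return mi
```

Two identical, balanced binary sequences give 1 bit, and two sequences whose symbols co-occur exactly as often as chance predicts give 0 bits; values in between indicate partial dependence.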

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Different types of proteins exist with diverse functions that are essential for living organisms. An important class of proteins is represented by transmembrane proteins, which are specifically designed to be inserted into biological membranes and perform very important functions in the cell, such as cell communication and active transport across the membrane. Transmembrane β-barrels (TMBBs) are a sub-class of membrane proteins largely under-represented in structure databases because of the extreme difficulty of experimental structure determination. For this reason, computational tools that are able to predict the structure of TMBBs are needed. In this thesis, two computational problems related to TMBBs were addressed: the detection of TMBBs in large datasets of proteins and the prediction of the topology of TMBB proteins. Firstly, a method for TMBB detection was presented based on a novel neural network framework for variable-length sequence classification. The proposed approach was validated on a non-redundant dataset of proteins. Furthermore, we carried out genome-wide detection using the entire Escherichia coli proteome. In both experiments, the method significantly outperformed other existing state-of-the-art approaches, reaching very high PPV (92%) and MCC (0.82). Secondly, a method was also introduced for TMBB topology prediction. The proposed approach is based on grammatical modelling and probabilistic discriminative models for sequence data labeling. The method was evaluated using a newly generated dataset of 38 TMBB proteins obtained from high-resolution data in the PDB. Results have shown that the model is able to correctly predict the topologies of 25 out of 38 protein chains in the dataset. When tested on previously released datasets, the performance of the proposed approach was comparable or superior to the current state of the art in TMBB topology prediction.
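
For readers unfamiliar with the two scores quoted in this abstract, the sketch below computes positive predictive value (PPV) and the Matthews correlation coefficient (MCC) from the four cells of a binary confusion matrix. The function names are illustrative, and the counts in the usage note are made up; they are not the thesis's results.

```python
import math


def ppv(tp, fp):
    """Positive predictive value: fraction of predicted positives that are true."""
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient of a binary confusion matrix.

    Returns 0.0 when any marginal total is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

With hypothetical counts of 90 true positives, 10 false positives, 890 true negatives and 10 false negatives, `ppv(90, 10)` is 0.9 and `mcc(90, 10, 890, 10)` is 8/9. MCC is often reported alongside PPV for detection tasks like this one because it remains informative when positives are rare.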

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This dissertation has three separate parts: the first part deals with general pedigree association testing incorporating continuous covariates; the second part deals with association tests under population stratification using conditional likelihood tests; the third part deals with genome-wide association studies based on the real rheumatoid arthritis (RA) disease data sets from Genetic Analysis Workshop 16 (GAW16) Problem 1. Many statistical tests have been developed to test for linkage and association in family data using either case-control status or phenotype covariates, but separately. Such univariate analyses might not use all the information coming from the family members in practical studies. Moreover, complex human diseases do not have a clear inheritance pattern; genes may interact or act independently. In part I, the newly proposed approach, MPDT, focuses on how to use both the case-control information and the phenotype covariates. This approach can be applied to detect multiple marker effects. Based on two popular existing statistics in family studies, for case-control and quantitative traits respectively, the new approach can be used with simple family structure data sets as well as general pedigree structures. The combined statistic is calculated from the two statistics; a permutation procedure is applied to assess the p-value, with Bonferroni adjustment for multiple markers. We use simulation studies to evaluate the type I error rates and the power of the proposed approach. Our results show that the combined test using both case-control information and phenotype covariates not only has the correct type I error rates but is also more powerful than the other existing methods. For multiple marker interactions, our proposed method is also very powerful. 
Selective genotyping is an economical strategy for detecting and mapping quantitative trait loci in the genetic dissection of complex disease. When the samples arise from different ethnic groups or an admixed population, all the existing selective genotyping methods may result in spurious association due to different ancestry distributions. The problem can be more serious when the sample size is large, a general requirement for obtaining sufficient power to detect modest genetic effects for most complex traits. In part II, I describe a useful strategy for selective genotyping when population stratification is present. Our procedure uses a principal-component-based approach to eliminate any effect of population stratification. The performance of the procedure is evaluated using both simulated data from an earlier study and the HapMap data sets, under a variety of population admixture models generated from empirical data. The rheumatoid arthritis dataset of Problem 1 in the Genetic Analysis Workshop 16 (GAW16) contains one binary trait and two continuous traits: RA status, AntiCCP and IgM. To allow for multiple traits, we suggest a set of SNP-level F statistics, based on the concept of multiple correlation, to measure the genetic association between multiple trait values and SNP-specific genotypic scores, and we obtain their null distributions. We then perform six genome-wide association analyses using the novel one- and two-stage approaches, which are based on single, double and triple traits. Incorporating all six analyses, we successfully validate the SNPs that have been identified in the literature as responsible for rheumatoid arthritis and detect more disease susceptibility SNPs for future follow-up studies. Except for chromosomes 13 and 18, every chromosome is found to harbour susceptibility regions for rheumatoid arthritis or related diseases, such as lupus erythematosus. This topic is discussed in part III.
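
The permutation procedure with Bonferroni adjustment mentioned for part I can be sketched generically as below. This is not the MPDT statistic, which the abstract does not specify: the absolute difference in group means is used as a placeholder statistic, labels are shuffled freely (a real family-based test would have to respect pedigree structure), and all names are hypothetical.

```python
import random


def abs_mean_difference(values, labels):
    """Placeholder test statistic: |mean(cases) - mean(controls)|."""
    cases = [v for v, lab in zip(values, labels) if lab == 1]
    controls = [v for v, lab in zip(values, labels) if lab == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def permutation_p_value(values, labels, statistic, n_perm=1000, seed=0):
    """One-sided permutation p-value for `statistic` under label shuffling."""
    rng = random.Random(seed)
    observed = statistic(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if statistic(values, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


def bonferroni(p_values):
    """Bonferroni-adjusted p-values for m tests: min(1, m * p)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

In use, one permutation p-value would be computed per marker and the resulting list passed to `bonferroni`, so that a marker is declared significant only if its adjusted p-value stays below the chosen level.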

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Humans and dogs are both affected by the allergic skin disease atopic dermatitis (AD), caused by an interaction between genetic and environmental factors. The German shepherd dog (GSD) is a high-risk breed for canine AD (CAD). In this study, we used a Swedish cohort of GSDs as a model for human AD. Serum IgA levels are known to be lower in GSDs compared to other breeds. We detected significantly lower IgA levels in the CAD cases compared to controls (p = 1.1 × 10^-5) in our study population. We also detected a separation within the GSD cohort, where dogs could be grouped into two different subpopulations. Disease prevalence differed significantly between the subpopulations, contributing to population stratification (λ = 1.3), which was successfully corrected for using a mixed model approach. A genome-wide association analysis of CAD was performed (n cases = 91, n controls = 88). IgA levels were included in the model, due to the high correlation between CAD and low IgA levels. In addition, we detected a correlation between IgA levels and the age at the time of sampling (corr = 0.42, p = 3.0 × 10^-9), thus age was included in the model. A genome-wide significant association was detected on chromosome 27 (p_raw = 3.1 × 10^-7, p_genome = 0.03). The total associated region was defined as a ~1.5-Mb-long haplotype including eight genes. Through targeted re-sequencing and additional genotyping of a subset of identified SNPs, we defined 11 smaller haplotype blocks within the associated region. Two blocks showed the strongest association to CAD. The ~209-kb region, defined by the two blocks, harbors only the PKP2 gene, encoding Plakophilin 2, which is expressed in the desmosomes and important for skin structure. Our results may yield further insight into the genetics behind both canine and human AD.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Intense selective pressures applied over short evolutionary time have resulted in homogeneity within, but substantial variation among, horse breeds. Utilizing this population structure, 744 individuals from 33 breeds, and a 54,000 SNP genotyping array, breed-specific targets of selection were identified using an F_ST-based statistic calculated in 500-kb windows across the genome. A 5.5-Mb region of ECA18, in which the myostatin (MSTN) gene was centered, contained the highest signature of selection in both the Paint and Quarter Horse. Gene sequencing and histological analysis of gluteal muscle biopsies showed a promoter variant and intronic SNP of MSTN were each significantly associated with higher Type 2B and lower Type 1 muscle fiber proportions in the Quarter Horse, demonstrating a functional consequence of selection at this locus. Signatures of selection on ECA23 in all gaited breeds in the sample led to the identification of a shared, 186-kb haplotype including two doublesex related mab transcription factor genes (DMRT2 and 3). The recent identification of a DMRT3 mutation within this haplotype, which appears necessary for the ability to perform alternative gaits, provides further evidence for selection at this locus. Finally, putative loci for the determination of size were identified in the draft breeds and the Miniature horse on ECA11, as well as when signatures of selection surrounding candidate genes at other loci were examined. This work provides further evidence of the importance of MSTN in racing breeds, provides strong evidence for selection upon gait and size, and illustrates the potential for population-based techniques to find genomic regions driving important phenotypes in the modern horse.
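
To make the idea of a window-based differentiation scan concrete, the sketch below computes a simple two-population Wright's F_ST, (H_T - H_S) / H_T, pooled over fixed 500-kb windows. It is an illustration under stated assumptions only: the abstract does not give the exact F_ST-based statistic used, and the input format, the equal weighting of the two populations, and the function name are all hypothetical.

```python
from collections import defaultdict


def windowed_fst(snps, window=500_000):
    """Per-window F_ST for biallelic SNPs in two equally weighted populations.

    `snps` is an iterable of (position_bp, p1, p2), where p1 and p2 are the
    frequencies of the same allele in each population. Each window's value is
    sum(H_T - H_S) / sum(H_T) over the SNPs falling in that window.
    """
    numerator = defaultdict(float)
    denominator = defaultdict(float)
    for pos, p1, p2 in snps:
        p_bar = (p1 + p2) / 2
        h_total = 2 * p_bar * (1 - p_bar)                   # pooled heterozygosity
        h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
        start = (pos // window) * window                    # window start coordinate
        numerator[start] += h_total - h_within
        denominator[start] += h_total
    # Windows containing only monomorphic sites (H_T = 0) are skipped.
    return {s: numerator[s] / denominator[s]
            for s in sorted(denominator) if denominator[s] > 0}
```

A SNP fixed for different alleles in the two populations yields a window value of 1, while identical frequencies yield 0; windows with unusually high values relative to the genome-wide distribution are the candidates a selection scan would flag.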

Relevance:

30.00% 30.00%

Publisher:

Abstract:

HIV-1 sequence diversity is affected by selection pressures arising from host genomic factors. Using paired human and viral data from 1071 individuals, we ran >3000 genome-wide scans, testing for associations between host DNA polymorphisms, HIV-1 sequence variation and plasma viral load (VL), while considering human and viral population structure. We observed significant human SNP associations to a total of 48 HIV-1 amino acid variants (p < 2.4 × 10^-12). All associated SNPs mapped to the HLA class I region. Clinical relevance of host and pathogen variation was assessed using VL results. We identified two critical advantages to the use of viral variation for identifying host factors: (1) association signals are much stronger for HIV-1 sequence variants than VL, reflecting the ‘intermediate phenotype’ nature of viral variation; (2) association testing can be run without any clinical data. The proposed genome-to-genome approach highlights sites of genomic conflict and is a strategy generally applicable to studies of host–pathogen interaction.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Attention has recently been drawn to Enterococcus faecium because of an increasing number of nosocomial infections caused by this species and its resistance to multiple antibacterial agents. However, relatively little is known about the pathogenic determinants of this organism. We have previously identified a cell-wall-anchored collagen adhesin, Acm, produced by some isolates of E. faecium, and a secreted antigen, SagA, exhibiting broad-spectrum binding to extracellular matrix proteins. Here, we analysed the draft genome of strain TX0016 for potential microbial surface components recognizing adhesive matrix molecules (MSCRAMMs). Genome-based bioinformatics identified 22 predicted cell-wall-anchored E. faecium surface proteins (Fms), of which 15 (including Acm) had characteristics typical of MSCRAMMs, including predicted folding into a modular architecture with multiple immunoglobulin-like domains. Functional characterization of one [Fms10; redesignated second collagen adhesin of E. faecium (Scm)] revealed that recombinant Scm(65) (A- and B-domains) and Scm(36) (A-domain) bound to collagen type V efficiently in a concentration-dependent manner, bound considerably less to collagen type I and fibrinogen, and differed from Acm in their binding specificities to collagen types IV and V. Results from far-UV circular dichroism measurements of recombinant Scm(36) and of Acm(37) indicated that these proteins were rich in beta-sheets, supporting our folding predictions. Whole-cell ELISA and FACS analyses unambiguously demonstrated surface expression of Scm in most E. faecium isolates. Strikingly, 11 of the 15 predicted MSCRAMMs clustered in four loci, each with a class C sortase gene; nine of these showed similarity to Enterococcus faecalis Ebp pilus subunits and also contained motifs essential for pilus assembly. 
Antibodies against one of the predicted major pilus proteins, Fms9 (redesignated EbpC(fm)), detected a 'ladder' pattern of high-molecular-mass protein bands in a Western blot analysis of cell surface extracts from E. faecium, suggesting that EbpC(fm) is polymerized into a pilus structure. Further analysis of the transcripts of the corresponding gene cluster indicated that fms1 (ebpA(fm)), fms5 (ebpB(fm)) and ebpC(fm) are co-transcribed, a result consistent with those for pilus-encoding gene clusters of other Gram-positive bacteria. All 15 genes occurred frequently in 30 clinically derived diverse E. faecium isolates tested. The common occurrence of MSCRAMM- and pilus-encoding genes and the presence of a second collagen-binding protein may have important implications for our understanding of this emerging pathogen.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Transcription enhancer factor 1 is essential for cardiac, skeletal, and smooth muscle development and uses its N-terminal TEA domain (TEAD) to bind M-CAT elements. Here, we present the first structure of TEAD and show that it is a three-helix bundle with a homeodomain fold. Structural data reveal how TEAD binds DNA. Using structure-function correlations, we find that the L1 loop is essential for cooperative loading of TEAD molecules onto tandemly duplicated M-CAT sites. Furthermore, using a microarray chip-based assay, we establish that known binding sites of the full-length protein are only a subset of DNA elements recognized by TEAD. Our results provide a model for understanding the regulation of genome-wide gene expression during development by the TEA/ATTS family of transcription factors.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Cytoplasmic polyhedrosis virus (CPV) is unique within the Reoviridae family in having a turreted single-layer capsid contained within polyhedrin inclusion bodies, yet being fully capable of cell entry and endogenous RNA transcription. Biochemical data have shown that the amino-terminal 79 residues of the CPV turret protein (TP) are sufficient to bring CPV or engineered proteins into the polyhedrin matrix for micro-encapsulation. Here we report the three-dimensional structure of CPV at 3.88 Å resolution using single-particle cryo-electron microscopy. Our map clearly shows the turns and deep grooves of alpha-helices, the strand separation in beta-sheets, and densities for loops and many bulky side chains, thus permitting atomic model building from cryo-electron microscopy maps. We observed a helix-to-beta-hairpin conformational change between the two conformational states of the capsid shell protein in the region directly interacting with genomic RNA. We have also discovered a messenger RNA release hole coupled with the mRNA capping machinery unique to CPV. Furthermore, we have identified the polyhedrin-binding domain, a structure that has potential in nanobiotechnology applications.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Tumor necrosis factor receptor p75/80 (TNF-R p75/80) is a 75 kDa type 1 transmembrane protein expressed predominately on cells of hematopoietic lineage. TNF-R p75/80 belongs to the TNF receptor superfamily characterized by cysteine-rich extracellular regions composed of three to six disulfide-linked domains. In the present report, we have characterized, for the first time, the complete gene structure for human TNF-R p75/80, which spans approximately 43 kbp. The gene consists of 10 exons (ranging from 34 bp to 2.5 kbp) and 9 introns (343 bp to 19 kbp). Consensus elements for transcription factors involved in T cell development and activation were noted in the 5′ flanking region, including TCF-1, Ikaros, AP-1, CK-2, IL-6RE, ISRE, GAS, NF-κB and SP1, as well as an unusually high GC content and CpG frequency that appears characteristic of some TNF-R family members. The unusual (GATA)n and (GAA)(GGA) repeats found within intron 1 may prove useful for further genome analysis within the 1p36 chromosomal locus. The human TNF-R p75/80 gene structure will permit further assessment of its involvement in normal hematopoietic cell development and function, autoimmune disease, and non-random translocations in hematopoietic malignancies. The region 1.8 kb 5′ of the ATG was able to drive luciferase expression when transfected into cell lines expressing TNF-R p75/80. Further characterization of the 5′-regulatory region will aid in determining factors and signal transduction pathways involved in regulating TNF-R p75/80 expression.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Mitochondria cannot form de novo but require mechanisms allowing their inheritance to daughter cells. In contrast to most other eukaryotes Trypanosoma brucei has a single mitochondrion whose single-unit genome is physically connected to the flagellum. Here we identify a β-barrel mitochondrial outer membrane protein, termed tripartite attachment complex 40 (TAC40), that localizes to this connection. TAC40 is essential for mitochondrial DNA inheritance and belongs to the mitochondrial porin protein family. However, it is not specifically related to any of the three subclasses of mitochondrial porins represented by the metabolite transporter voltage-dependent anion channel (VDAC), the protein translocator of the outer membrane 40 (TOM40), or the fungi-specific MDM10, a component of the endoplasmic reticulum–mitochondria encounter structure (ERMES). MDM10 and TAC40 mediate cellular architecture and participate in transmembrane complexes that are essential for mitochondrial DNA inheritance. In yeast MDM10, in the context of the ERMES, is postulated to connect the mitochondrial genomes to actin filaments, whereas in trypanosomes TAC40 mediates the linkage of the mitochondrial DNA to the basal body of the flagellum. However, TAC40 does not colocalize with trypanosomal orthologs of ERMES components and, unlike MDM10, it regulates neither mitochondrial morphology nor the assembly of the protein translocase. TAC40 therefore defines a novel subclass of mitochondrial porins that is distinct from VDAC, TOM40, and MDM10. However, whereas the architecture of the TAC40-containing complex in trypanosomes and the MDM10-containing ERMES in yeast is very different, both are organized around a β-barrel protein of the mitochondrial porin family that mediates a DNA–cytoskeleton linkage that is essential for mitochondrial DNA inheritance.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

TbRRM1 of Trypanosoma brucei is a nucleoprotein that was previously identified in a search for splicing factors in T. brucei. We show that TbRRM1 associates with mRNAs and with the auxiliary splicing factor polypyrimidine tract-binding protein 2, but not with components of the core spliceosome. TbRRM1 also interacts with several retrotransposon hot spot (RHS) proteins and histones. RNA immunoprecipitation of a tagged form of TbRRM1 from procyclic (insect) form trypanosomes identified ca. 1,500 transcripts that were enriched and 3,000 transcripts that were underrepresented compared to cellular mRNA. Enriched transcripts encoded RNA-binding proteins, including TbRRM1 itself, several RHS transcripts, mRNAs with long coding regions, and a high proportion of stage-regulated mRNAs that are more highly expressed in bloodstream forms. Transcripts encoding ribosomal proteins, other factors involved in translation, and procyclic-specific transcripts were underrepresented. Knockdown of TbRRM1 by RNA interference caused widespread changes in mRNA abundance, but these changes did not correlate with the binding of the protein to transcripts, and most splice sites were unchanged, negating a general role for TbRRM1 in splice site selection. When changes in mRNA abundance were mapped across the genome, regions with many downregulated mRNAs were identified. Two regions were analyzed by chromatin immunoprecipitation, both of which exhibited increases in nucleosome occupancy upon TbRRM1 depletion. In addition, subjecting cells to heat shock resulted in translocation of TbRRM1 to the cytoplasm and compaction of chromatin, consistent with a second role for TbRRM1 in modulating chromatin structure. IMPORTANCE: Trypanosoma brucei, the parasite that causes human sleeping sickness, is transmitted by tsetse flies. The parasite progresses through different life cycle stages in its two hosts, altering its pattern of gene expression in the process. 
In trypanosomes, protein-coding genes are organized as polycistronic units that are processed into monocistronic mRNAs. Since genes in the same unit can be regulated independently of each other, it is believed that gene regulation is essentially posttranscriptional. In this study, we investigated the role of a nuclear RNA-binding protein, TbRRM1, in the insect stage of the parasite. We found that TbRRM1 binds nuclear mRNAs and also affects chromatin status. Reduction of nuclear TbRRM1 by RNA interference or heat shock resulted in chromatin compaction. We propose that TbRRM1 regulates RNA polymerase II-driven gene expression both cotranscriptionally, by facilitating transcription and efficient splicing, and posttranscriptionally, via its interaction with nuclear mRNAs.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Classical swine fever virus (CSFV) causes a highly contagious disease in pigs that can range from a severe haemorrhagic fever to a nearly inapparent disease, depending on the virulence of the virus strain. Little is known about the viral molecular determinants of CSFV virulence. The nonstructural protein NS4B is essential for viral replication. However, the roles of CSFV NS4B in viral genome replication and pathogenesis have not yet been elucidated. NS4B of the GPE⁻ vaccine strain and of the highly virulent Eystrup strain differ by a total of seven amino acid residues, two of which are located in the predicted trans-membrane domains of NS4B and were described previously to relate to virulence, and five of which cluster in the N-terminal part. In the present study, we examined the potential role of these five amino acids in modulating genome replication and determining pathogenicity in pigs. A chimeric low-virulence GPE⁻-derived virus carrying the complete Eystrup NS4B showed enhanced pathogenicity in pigs. The in vitro replication efficiency of the NS4B chimeric GPE⁻ replicon was significantly higher than that of the replicon carrying only the two Eystrup-specific amino acids in NS4B. In silico and in vitro data suggest that the N-terminal part of NS4B forms an amphipathic α-helix structure. The N-terminal NS4B with these five amino acid residues is associated with the intracellular membranes. Taken together, this is the first gain-of-function study showing that the N-terminal domain of NS4B can determine CSFV genome replication in cell culture and viral pathogenicity in pigs.