986 results for Coding Region
Abstract:
A new RTE-like, non-long terminal repeat retrotransposon, termed SjR2, from the human blood fluke, Schistosoma japonicum, is described. SjR2 is approximately 3.9 kb in length and consists of a single open reading frame encoding a polyprotein with apurinic/apyrimidinic endonuclease and reverse transcriptase domains. The open reading frame is bounded by 5'- and 3'-terminal untranslated regions and, at its 3'-terminus, SjR2 bears a short (TGAC)3 repeat. Phylogenetic analyses based on conserved domains of reverse transcriptase or endonuclease revealed that SjR2 belonged to the RTE clade of non-long terminal repeat retrotransposons. Further, SjR2 was homologous, but probably not orthologous, to SR2 from the African blood fluke, Schistosoma mansoni; this RTE-like family of non-long terminal repeat retrotransposons appears to have arisen before the divergence of the extant schistosome species. Hybridisation analyses indicated that approximately 10,000 copies of SjR2 were dispersed throughout the S. japonicum chromosomes, accounting for up to 14% of the nuclear genome. Messenger RNAs encoding the reverse transcriptase and endonuclease domains of SjR2 were detected in several developmental stages of the schistosome, indicating that the retrotransposon was actively replicating within the genome of the parasite. Exploration of the coding and non-coding regions of SjR2 revealed two notable characteristics. First, the recombinant reverse transcriptase domain of SjR2 expressed in insect cells primed reverse transcription of SjR2 mRNA in vitro. By contrast, recombinant SjR2-endonuclease did not appear to cleave schistosome or plasmid DNA. Second, the 5'-untranslated region of SjR2 was >80% identical to the 3'-untranslated region of a schistosome heat shock protein-70 gene (hsp-70) in the antisense orientation, indicating that SjR2-like elements were probably inserted into the non-coding regions of ancestral S. japonicum HSP-70, probably after the species diverged from S. mansoni. © 2002 Australian Society for Parasitology Inc. Published by Elsevier Science Ltd. All rights reserved.
Abstract:
Here, we report the molecular analysis of two independent 5S rRNA clusters found in the intergenic region of two ubiquitin genomic clones isolated from Tetrahymena pyriformis. Each cluster contains two 120-bp-long coding regions organized in tandem with 142/145-bp-long spacers.
Abstract:
BACKGROUND: The comparison of complete genomes has revealed surprisingly large numbers of conserved non-protein-coding (CNC) DNA regions. However, the biological function of CNC remains elusive. CNC differ in two aspects from conserved protein-coding regions. They are not conserved across phylum boundaries, and they do not contain readily detectable sub-domains. Here we characterize the persistence length and time of CNC and conserved protein-coding regions in the vertebrate and insect lineages. RESULTS: The persistence length is the length of a genome region over which a certain level of sequence identity is consistently maintained. The persistence time is the evolutionary period during which a conserved region evolves under the same selective constraints. Our main findings are: (i) Insect genomes contain 1.60 times less conserved information than vertebrates; (ii) Vertebrate CNC have a higher persistence length than conserved coding regions or insect CNC; (iii) CNC have shorter persistence times as compared to conserved coding regions in both lineages. CONCLUSION: Higher persistence length of vertebrate CNC indicates that the conserved information in vertebrates and insects is organized in functional elements of different lengths. These findings might be related to the higher morphological complexity of vertebrates and give clues about the structure of active CNC elements. Shorter persistence time might explain the previously puzzling observations of highly conserved CNC within each phylum, and of a lack of conservation between phyla. It suggests that CNC divergence might be a key factor in vertebrate evolution. Further evolutionary studies will help to relate individual CNC to specific developmental processes.
Abstract:
Canine distemper virus (CDV) produces a glycosylated type I fusion protein (F) with an internal hydrophobic signal sequence beginning around 115 residues downstream of the first AUG used for translation initiation. Cleavage of the signal sequence yields the F0 molecule, which is cleaved into the F1 and F2 subunits. Surprisingly, when all in-frame AUGs located in the first third of the F gene were mutated, a protein of the same molecular size as the F0 molecule was still expressed from both the Onderstepoort (OP) and A75/17-CDV F genes. We designated this protein, which is initiated from a non-AUG codon, Fx. Site-directed mutagenesis identified codon 85, a GCC codon coding for alanine, as the most likely position from which translation initiation of Fx occurs in OP-CDV. Deletion analysis demonstrated that at least 60 nucleotides upstream of the GCC codon are required for efficient Fx translation. This sequence is GC-rich, suggesting extensive folding. Secondary structure may therefore be important for translation initiation at codon 85.
Abstract:
Scaffold or matrix attachment region (S/MAR) genetic elements have previously been proposed to insulate transgenes from repressive effects linked to their site of integration within the host cell genome. We have evaluated their use in various stable transfection settings to increase the production of recombinant proteins such as monoclonal antibodies from Chinese hamster ovary (CHO) cell lines. Using the green fluorescent protein coding sequence, we show that S/MAR elements mediate a dual effect on the population of transfected cells. First, S/MAR elements almost fully abolish the occurrence of cell clones that express little transgene that may result from transgene integration in an unfavorable chromosomal environment. Second, they increase the overall expression of the transgene over the whole range of expression levels, allowing the detection of cells with significantly higher levels of transgene expression. An optimal setting was identified as the addition of a S/MAR element both in cis (on the transgene expression vector) and in trans (co-transfected on a separate plasmid). When used to express immunoglobulins, the S/MAR element enabled cell clones with high and stable levels of expression to be isolated following the analysis of a few cell lines generated without transgene amplification procedures.
Abstract:
Long non-coding RNAs (lncRNAs) are deregulated in several tumors, although their role in acute myeloid leukemia (AML) is mostly unknown. We have examined the expression of the lncRNA HOX antisense intergenic RNA myeloid 1 (HOTAIRM1) in 241 AML patients. We have correlated HOTAIRM1 expression with a miRNA expression profile. We have also analyzed the prognostic value of HOTAIRM1 expression in 215 intermediate-risk AML (IR-AML) patients. The lowest expression level was observed in acute promyelocytic leukemia (P < 0.001) and the highest in t(6;9) AML (P = 0.005). In 215 IR-AML patients, high HOTAIRM1 expression was independently associated with shorter overall survival (OR: 2.04; P = 0.001), shorter leukemia-free survival (OR: 2.56; P < 0.001) and a higher cumulative incidence of relapse (OR: 1.67; P = 0.046). Moreover, HOTAIRM1 maintained its independent prognostic value within the favorable molecular subgroup (OR: 3.43; P = 0.009). Interestingly, HOTAIRM1 was overexpressed in NPM1-mutated AML (P < 0.001) and within this group retained its prognostic value (OR: 2.21; P = 0.01). Moreover, HOTAIRM1 expression was associated with a specific 33-microRNA signature that included miR-196b (P < 0.001). miR-196b is located in the HOX genomic region and has previously been reported to have an independent prognostic value in AML. miR-196b and HOTAIRM1 in combination as a prognostic factor can classify patients as high-, intermediate-, or low-risk (5-year OS: 24% vs 42% vs 70%; P = 0.004). Determination of HOTAIRM1 level at diagnosis provided relevant prognostic information in IR-AML and allowed refinement of risk stratification based on common molecular markers. The prognostic information provided by HOTAIRM1 was strengthened when combined with miR-196b expression. Furthermore, HOTAIRM1 correlated with a 33-miRNA signature.
Abstract:
Cell-free translation of total RNA isolated from vaccinia virus-infected cells late in infection results in a complex mixture of polypeptides. A monospecific antibody directed against one of the major structural proteins of the virus particle immunoprecipitated a single polypeptide with a molecular weight of 11,000 (11K) from this mixture. Immunoprecipitation was therefore used to identify the structural polypeptide among the in vitro translation products of RNA purified by hybridization selection to restriction fragments of the vaccinia virus genome. This allowed us to map the mRNA coding for the 11K polypeptide to the extreme left-hand end of the HindIII E fragment. Detailed transcriptional mapping of this region of the genome by nuclease S1 analysis revealed the presence of a late RNA transcribed from the rightward-reading strand. Its 5' end mapped at ca. 130 base pairs to the left of the HindIII site at the junction between the HindIII F and E fragments. The map position of this RNA coincided precisely with the map position of the late message coding for the 11K polypeptide.
Abstract:
Little is known about the relation between the genome organization and gene expression in Leishmania. Bioinformatic analysis can be used to predict genes and find homologies with known proteins. A model was proposed, in which genes are organized into large clusters and transcribed from only one strand, in the form of large polycistronic primary transcripts. To verify the validity of this model, we studied gene expression at the transcriptional, post-transcriptional and translational levels in a unique locus of 34 kb located on chr27 and represented by cosmid L979. Sequence analysis revealed 115 ORFs on either DNA strand. Using computer programs developed for Leishmania genes, only nine of these ORFs, localized on the same strand, were predicted to code for proteins, some of which show homologies with known proteins. Additionally, one pseudogene was identified. We verified the biological relevance of these predictions. mRNAs from nine predicted genes and proteins from seven were detected. Nuclear run-on analyses confirmed that the top strand is transcribed by RNA polymerase II and suggested that there is no polymerase entry site. Low levels of transcription were detected in regions of the bottom strand and stable transcripts were identified for four ORFs on this strand not predicted to be protein-coding. In conclusion, the transcriptional organization of the Leishmania genome is complex, raising the possibility that computer predictions may not be comprehensive.
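The ORF enumeration step mentioned in this abstract can be illustrated with a minimal sketch. This is a generic six-frame scan for ATG-to-stop ORFs, not the Leishmania-specific prediction programs the authors used; the function names, the `min_codons` cutoff and the coordinate convention are assumptions made for illustration only.

```python
# Generic six-frame ORF scan (illustrative; not the authors' predictor).
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """List (strand, frame, start, end) for ATG..stop ORFs on both strands.

    Coordinates are 0-based and end-exclusive, measured along the strand
    that was scanned. min_codons counts the start and stop codons
    (hypothetical cutoff).
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None  # look for the next ORF in this frame
    return orfs
```

A scan like this reports every open frame, which is why a count such as 115 ORFs can shrink to a handful of predicted genes once organism-specific criteria are applied.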
Abstract:
We have mapped the genes coding for two major structural polypeptides of the vaccinia virus core by hybrid selection and transcriptional mapping. First, RNA was selected by hybridization to restriction fragments of the vaccinia virus genome, translated in vitro and the products were immunoprecipitated with antibodies against the two polypeptides. This approach allowed us to map the genes to the left hand end of the largest Hind III restriction fragment of 50 kilobase pairs. Second, transcriptional mapping of this region of the genome revealed the presence of the two expected RNAs. Both RNAs are transcribed from the leftward reading strand and the 5'-ends of the genes are separated by about 7.5 kilobase pairs of DNA. Thus, two genes encoding structural polypeptides with a similar location in the vaccinia virus particle are clustered at approximately 105 kilobase pairs from the left hand end of the 180 kilobase pair vaccinia virus genome.
Abstract:
Background and Aims: The NS5A protein of HCV is known to be involved in viral replication and assembly and probably in resistance to interferon-based therapy. Previous studies identified insertions or deletions of 1 to 12 nucleotides in several genomic regions. In a multicenter study (17 French and 1 Swiss virology laboratories), we identified for the first time a 31-amino-acid insertion leading to a duplication of the V3 domain in the NS5A region with a high prevalence. Quasispecies of each strain with the duplication were characterized and the inserted V3 domain was identified. Methods: Between 2006 and 2008, 1067 patients chronically infected with genotype 1b HCV were consecutively included in the study. We first amplified the V3 region by RT-PCR to detect the duplication (919 samples successfully amplified). The entire NS5A region was then amplified, cloned and sequenced in strains bearing the duplication. V3 sequences (called R1 and R2) from each clone were analyzed with BioEdit and compared to a V3 consensus sequence (C) built from the Los Alamos Hepatitis C database. Entropy was determined at each position. Results: V3 duplications were identified in 25 patients, representing a prevalence of 2.72%. We sequenced 2043 clones, of which 776 had a complete coding NS5A sequence (corresponding to a mean of 30 clones per patient). At the intra-individual level, 6 to 17 variants were identified per V3 region, with a maximum of 3 different amino acids. At the inter-individual level, a difference of 7 and 2 amino acids was observed between C and the R1 and R2 sequences, respectively. Moreover, few positions presented entropy higher than 1 (4 for R1, 2 for R2 and 2 for C). Among all the sequenced clones, more than 60% were defective viruses (partial fragment of NS5A or stop codon). Conclusions: We identified a duplication of the V3 domain in genotype 1b HCV with a high prevalence. The R2 domain, which was the most similar to the C region, is probably the "original" domain, whereas R1 should be the inserted domain. Phylogenetic analyses are in progress to confirm this hypothesis.
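The per-position entropy mentioned in the Methods can be sketched as a Shannon entropy over each alignment column. This is an illustration under stated assumptions, not the BioEdit implementation: the function name is hypothetical, the input is assumed to be a list of equal-length aligned sequences, and the logarithm base (natural log by default here) is not given in the abstract.

```python
import math
from collections import Counter


def column_entropies(aligned_seqs, base=math.e):
    """Shannon entropy of each column in a list of aligned sequences.

    Assumption: every sequence has the same length (already aligned);
    gap characters, if present, are counted like any other symbol.
    """
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")
    entropies = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)
        total = len(column)
        # H = -sum(p * log(p)) over the residues observed in this column
        h = -sum((n / total) * math.log(n / total, base)
                 for n in counts.values())
        entropies.append(h)
    return entropies


# Toy alignment (made-up residues): column 0 is invariant, column 1 is split.
toy = ["AC", "AG", "AC", "AG"]
toy_entropy_bits = column_entropies(toy, base=2)
```

Whether a threshold such as "entropy higher than 1" is reached depends on the base chosen, so the base should be matched to the tool that produced the reported values.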
Abstract:
Association studies have revealed expression quantitative trait loci (eQTLs) for a large number of genes. However, the causative variants that regulate gene expression levels are generally unknown. We hypothesized that copy-number variation of sequence repeats contributes to the expression variation of some genes. Our laboratory has previously identified that the rare expansion of a repeat c.-174CGGGGCGGGGCG in the promoter region of the CSTB gene causes a silencing of the gene, resulting in progressive myoclonus epilepsy. Here, we genotyped the repeat length and quantified CSTB expression by quantitative real-time polymerase chain reaction in 173 lymphoblastoid cell lines (LCLs) and fibroblast samples from the GenCord collection. The majority of alleles contain either two or three copies of this repeat. Independent analysis revealed that the c.-174CGGGGCGGGGCG repeat length is strongly associated with CSTB expression (P = 3.14 × 10^-11) in LCLs only. Examination of both genotyped and imputed single-nucleotide polymorphisms (SNPs) within 2 Mb of CSTB revealed that the dodecamer repeat represents the strongest cis-eQTL for CSTB in LCLs. We conclude that the common two or three copy variation is likely the causative cis-eQTL for CSTB expression variation. More broadly, we propose that polymorphic tandem repeats may represent the causative variation of a fraction of cis-eQTLs in the genome.
Abstract:
BACKGROUND: Silver-Russell syndrome (SRS) is a genetically and clinically heterogeneous disease. Although no protein coding gene defects have been reported in SRS patients, approximately 50% of SRS patients carry epimutations (hypomethylation) at the IGF2/H19 imprinting control region 1 (ICR1). Proper methylation at ICR1 is crucial for the imprinted expression of IGF2, a fetal growth factor. CTCFL, a testis-specific protein, has recently been proposed to play a role in the establishment of DNA methylation at the murine equivalent of ICR1. A screen was undertaken to assess whether CTCFL is mutated in SRS patients with hypomethylation, to explore a link between the observed epimutations and a genetic cause of the disease. METHODOLOGY/PRINCIPAL FINDINGS: DNA was obtained from 36 SRS patients with hypomethylation at ICR1. All CTCFL coding exons were sequenced and analyzed for duplications/deletions using both multiplex ligation-dependent probe amplification, with a custom CTCFL probe set, and genomic qPCR. Novel SNP alleles were analyzed for potential differential splicing in vitro utilizing a splicing assay. Neither mutations of CTCFL nor duplications/deletions were observed. Five novel SNPs were identified and have been submitted to dbSNP. In silico splice prediction suggested one novel SNP, IVS2-66A>C, activated a cryptic splice site, resulting in aberrant splicing and premature termination. In vitro splicing assays did not confirm predicted aberrant splicing. CONCLUSIONS/SIGNIFICANCE: As no mutations were detected at CTCFL in the patients examined, we conclude that genetic alterations of CTCFL are not responsible for the SRS hypomethylation. We suggest that analysis of other genes involved in the establishment of DNA methylation at imprinted genes, such as DNMT3A and DNMT3L, may provide insight into the genetic cause of hypomethylation in SRS patients.
Abstract:
Two Brazilian Potato virus Y (PVY) isolates were biologically characterized as necrotic (PVY-NBR) and common (PVY-OBR) based upon symptoms on test plants. Additional characterization was performed by sequencing a cDNA corresponding to the 3' terminal region of the viral genome. The sequence consisted of 195 nucleotides (nt) coding part of the nuclear inclusion body b (NIb) gene, 804 nt of the coat protein (CP) gene, and 328 nt (PVY-OBR) or 326 nt (PVY-NBR) of the 3'-untranslated region (UTR). Translation of the sequence resulted in one single open reading frame with part of the NIb and a CP of 267 amino acids. The two isolates shared 95.1% similarity in the CP amino acid sequence. The CP and the 3'-UTR sequence of the Brazilian isolates were compared to those of other PVY isolates previously reported and unrooted phylogenetic trees were constructed. The trees revealed a separation of two distinct clusters, one comprising most of the common strains and the other comprising the necrotic strains. PVY-OBR was clustered in the common group and PVY-NBR in the necrotic one.
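The segment lengths quoted in this abstract can be cross-checked with a short calculation. The sketch below assumes that the 804 nt of the coat protein gene include its terminal stop codon, which the abstract does not state explicitly; the variable and function names are illustrative.

```python
def codon_count(nucleotides):
    """Number of whole codons in a coding segment of the given length."""
    if nucleotides % 3 != 0:
        raise ValueError("length is not a whole number of codons")
    return nucleotides // 3


NIB_PARTIAL_NT = 195                      # partial NIb coding region
CP_NT = 804                               # coat protein gene
UTR_NT = {"PVY-OBR": 328, "PVY-NBR": 326}  # 3'-UTR per isolate

cp_codons = codon_count(CP_NT)
# Assumption: the last CP codon is the stop codon, so it adds no residue.
cp_residues = cp_codons - 1

# Total sequenced length per isolate: NIb fragment + CP gene + 3'-UTR.
total_length = {isolate: NIB_PARTIAL_NT + CP_NT + utr
                for isolate, utr in UTR_NT.items()}
```

Under that assumption the 804 nt correspond to 268 codons, which agrees with the reported coat protein of 267 amino acids.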
Abstract:
The human immunoglobulin lambda variable locus (IGLV) is mapped at chromosome 22 band q11.1-q11.2. The 30 functional germline v-lambda genes sequenced until now have been subgrouped into 10 families (Vl1 to Vl10). The number of Vl genes has been estimated at approximately 70. This locus is formed by three gene clusters (VA, VB and VC) that encompass the variable coding genes (V) responsible for the synthesis of lambda-type Ig light chains, and the Jl-Cl cluster with the joining segments and the constant genes. Recently the entire variable lambda gene locus was mapped by contig methodology and its one-megabase DNA totally sequenced. All the known functional V-lambda genes and pseudogenes were located. We screened a human genomic DNA cosmid library and isolated a clone with an insert of 37 kb (cosmid 8.3) encompassing four functional genes (IGLV7S1, IGLV1S1, IGLV1S2 and IGLV5a), a pseudogene (VlA) and a vestigial sequence (vg1) to study in detail the positions of the restriction sites surrounding the Vl genes. We generated a high resolution restriction map, locating 31 restriction sites in 37 kb of the VB cluster, a region rich in functional Vl genes. This mapping information opens the perspective for further RFLP studies and sequencing.
Abstract:
In recent years, reversible logic has emerged as one of the most important approaches for power optimization, with applications in low-power CMOS, quantum computing and nanotechnology. This paper proposes low-power circuits, implemented using reversible logic, that provide single error correction and double error detection (SEC-DED). The design uses a new 4 x 4 reversible gate called 'HCG' to implement Hamming error coding and detection circuits. A parity-preserving HCG (PPHCG), which preserves the input parity at the output bits, is used to achieve fault tolerance in the Hamming error coding and detection circuits.
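The SEC-DED behaviour referred to in this abstract can be illustrated in software with an extended Hamming code. The sketch below is a conventional (non-reversible) Hamming(8,4) encoder and decoder; it does not model the HCG or PPHCG gates, and the 4-bit data width, bit layout and function names are assumptions chosen for illustration.

```python
def encode(data):
    """Encode 4 data bits as an 8-bit extended Hamming codeword.

    Layout (index = position): [p0, p1, p2, d1, p4, d2, d3, d4],
    where p0 is the overall parity bit used for double-error detection.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    word = [p1, p2, d1, p4, d2, d3, d4]
    p0 = 0
    for bit in word:
        p0 ^= bit
    return [p0] + word


def decode(codeword):
    """Return (data_bits, status); status is 'ok', 'corrected' or 'double_error'."""
    cw = list(codeword)
    syndrome = 0
    for pos in range(1, 8):
        if cw[pos]:
            syndrome ^= pos  # XOR of the positions holding a 1
    overall = 0
    for bit in cw:
        overall ^= bit
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd overall parity: one flipped bit, located at index `syndrome`
        # (index 0 means the overall parity bit itself was hit).
        cw[syndrome] ^= 1
        status = "corrected"
    else:
        # Even overall parity with a non-zero syndrome: two bits flipped.
        return None, "double_error"
    return [cw[3], cw[5], cw[6], cw[7]], status
```

A single flipped bit is located by the syndrome and repaired, while two flipped bits leave the overall parity even with a non-zero syndrome, so they are reported rather than miscorrected.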