985 results for Coding Sequences


Relevance:

100.00% 100.00%

Publisher:

Abstract:

The 3′ UTRs of eukaryotic genes participate in a variety of post-transcriptional (and some transcriptional) regulatory interactions. Some of these interactions are well characterised, but an undetermined number remain to be discovered. While some regulatory sequences in 3′ UTRs may be conserved over long evolutionary time scales, others may have only ephemeral functional significance as regulatory profiles respond to changing selective pressures. Here we propose a sensitive segmentation methodology for investigating patterns of composition and conservation in 3′ UTRs based on comparison of closely related species. We describe encodings of pairwise and three-way alignments integrating information about conservation, GC content and transition/transversion ratios and apply the method to three closely related Drosophila species: D. melanogaster, D. simulans and D. yakuba. Incorporating multiple data types greatly increased the number of segment classes identified compared to similar methods based on conservation or GC content alone. We propose that the number of segments and number of types of segment identified by the method can be used as proxies for functional complexity. Our main finding is that the number of segments and segment classes identified in 3′ UTRs is greater than in the same length of protein-coding sequence, suggesting greater functional complexity in 3′ UTRs. There is thus a need for sustained and extensive efforts by bioinformaticians to delineate functional elements in this important genomic fraction. C code, data and results are available upon request.
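
To make the idea of an alignment encoding concrete, the following is a minimal sketch of how a pairwise alignment could be recoded into a single symbol string that carries conservation, GC content and transition/transversion information at once. The symbol alphabet (`S`, `W`, `I`, `V`, `-`, `N`) and the function names are assumptions chosen for illustration; the abstract does not specify the authors' actual encoding, and their implementation is in C.

```python
# Hypothetical one-symbol-per-column encoding of a pairwise alignment.
# The alphabet below is an illustrative assumption, not the authors' scheme.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def encode_column(x: str, y: str) -> str:
    """Encode one alignment column as a single symbol.

    S: conserved G or C      W: conserved A or T
    I: transition mismatch   V: transversion mismatch
    -: gap in either row     N: ambiguous base in either row
    """
    x, y = x.upper(), y.upper()
    if x == "-" or y == "-":
        return "-"
    if x not in "ACGT" or y not in "ACGT":
        return "N"
    if x == y:
        return "S" if x in "GC" else "W"
    same_class = ({x, y} <= PURINES) or ({x, y} <= PYRIMIDINES)
    return "I" if same_class else "V"


def encode_alignment(seq1: str, seq2: str) -> str:
    """Encode two aligned sequences of equal length, column by column."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    return "".join(encode_column(a, b) for a, b in zip(seq1, seq2))


print(encode_alignment("ACGT-A", "ACAT-C"))
```

A segmentation method would then operate on the encoded string rather than on the raw nucleotides; a three-way alignment would need a larger alphabet built along the same lines.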

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Fourier spectra of 120 short coding sequences (<1,200 bp) show that not all coding sequences are characterized by 3-base periodicity. Statistical analysis suggests that whether a coding sequence has 3-base periodicity may be related to the composition and distribution of bases, the usage and order of the amino acids of the encoded protein, and synonymous codon usage. Generally, the A+U content is higher than the G+C content in non-period-3 sequences, and the reverse holds in period-3 sequences. Across the three codon positions, the base distribution in non-period-3 sequences is more uniform than in period-3 sequences. The usage biases of amino acids and codons are weaker in non-period-3 sequences than in period-3 sequences. All of these phenomena should be taken into account when predicting the genes and exons of DNA sequences by Fourier analysis.
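
As an illustration of what a 3-base periodicity test looks like, the sketch below computes the Fourier power of a sequence at frequency 1/3 and compares it with the average power over the other frequencies. It uses the common approach of summing the spectra of four binary base-indicator sequences; that choice, the function names and the peak-to-background ratio are assumptions for illustration, since the abstract does not describe the exact spectral measure used.

```python
import cmath


def power_at(seq: str, k: int) -> float:
    """Total DFT power at index k, summed over the four base-indicator sequences."""
    n = len(seq)
    total = 0.0
    for base in "ACGT":
        # DFT of the 0/1 indicator sequence for this base, evaluated at index k.
        coeff = sum(
            cmath.exp(-2j * cmath.pi * k * i / n)
            for i, ch in enumerate(seq)
            if ch == base
        )
        total += abs(coeff) ** 2
    return total


def period3_ratio(seq: str) -> float:
    """Power at frequency 1/3 divided by the mean power over indices 1..n//2."""
    seq = seq.upper().replace("U", "T")  # accept RNA as well as DNA
    n = len(seq)
    if n < 3 or n % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    peak = power_at(seq, n // 3)
    background = sum(power_at(seq, k) for k in range(1, n // 2 + 1)) / (n // 2)
    return peak / background if background > 0 else 0.0


# A perfectly periodic toy sequence gives a pronounced peak at frequency 1/3.
print(period3_ratio("ATG" * 20))
```

A ratio well above 1 indicates a period-3 peak; where to place the threshold between "period-3" and "non-period-3" sequences is a separate decision that this sketch does not make.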

Relevance:

100.00% 100.00%

Publisher:

Abstract:

During infection of a new host, the first surfaces encountered by herpes simplex viruses are the apical membranes of epithelial cells of mucosal surfaces. These cells are highly polarized, and the protein compositions of their apical and basolateral membranes are very different, so that different viral entry pathways have evolved for each surface. To determine whether the viral glycoprotein G (gG) is specifically required for efficient infection of a particular surface of polarized cells, apical and basal surfaces were infected with wild-type virus or a gG deletion mutant. After infection of polarized cells in culture, the gG− virus was deficient in infection of apical surfaces but was able to infect cells through basal membranes, replicate, and spread into surrounding cells. The gG-dependent step in apical infection was a stage beyond attachment. After in vivo infection of apical surfaces of epithelial cells of nonscarified mouse corneas, infection by glycoprotein C− or gG− virus was considerably reduced as compared with that observed after infection with wild-type virus. In contrast, when corneas were scarified, allowing virus access to other cell surfaces, the gG and glycoprotein C deletion mutants infected eyes as efficiently as wild-type viruses. A secondary mutation allowing infection of apical surfaces by gG− virus arose readily during passage of the virus in nonpolarized cells, indicating either that the gG-dependent step of apical infection can be bypassed or that another viral protein can acquire the same function.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Adenoviral vector-mediated gene transfer offers significant potential for gene therapy of many human diseases. However, progress has been slowed by several limitations. First, the insert capacity of currently available adenoviral vectors is limited to 8 kb of foreign DNA. Second, the expression of viral proteins in infected cells is believed to trigger a cellular immune response that results in inflammation and in only transient expression of the transferred gene. We report the development of a new adenoviral vector that has all viral coding sequences removed. Thus, large inserts are accommodated and expression of all viral proteins is eliminated. The first application of this vector system carries a dual expression cassette comprising 28.2 kb of nonviral DNA that includes the full-length murine dystrophin cDNA under control of a large muscle-specific promoter and a lacZ reporter construct. Using this vector, we demonstrate independent expression of both genes in primary mdx (dystrophin-deficient) muscle cells.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

The leader protease (L-pro) and capsid-coding sequences (P1) constitute approximately 3 kb of the foot-and-mouth disease virus (FMDV). We studied the phylogenetic relationship of 46 FMDV serotype A isolates of Indian origin collected during the period 1968-2005 and also eight vaccine strains using the neighbour-joining tree and Bayesian tree methods. The viruses were categorized under three major groups - Asian, Euro-South American and European. The Indian isolates formed a distinct genetic group among the Asian isolates. The Indian isolates were further classified into different genetic subgroups (<5% divergence). Post-1995 isolates were divided into two subgroups, while a few isolates which originated in the year 2005 from Andhra Pradesh formed a separate group. These isolates were closely related to the isolates of the 1970s. The FMDV isolates seem to undergo reverse mutation or convergent evolution wherein sequences identical to the ancestors are present in the isolates in circulation. The eight vaccine strains included in the study were not related to each other and belonged to different genetic groups. Recombination was detected in the L-pro region in one isolate (A IND 20/82) and in the VP1 coding 1D region in another isolate (A RAJ 21/96). Positive selection was identified at aa position 23 in the L-pro (P<0.05; 0.046*) and at aa 171 in the capsid protein VP1 (P<0.01; 0.003**).

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Transcription from morbillivirus genomes commences at a single promoter in the 3' non-coding terminus, with the six genes being transcribed sequentially. The 3' and 5' untranslated regions (UTRs) of the genes (mRNA sense), together with the intergenic trinucleotide spacer, comprise the non-coding sequences (NCS) of the virus and contain the conserved gene end and gene start signals, respectively. Bicistronic minigenomes containing transcription units (TUs) encoding autofluorescent reporter proteins separated by measles virus (MV) NCS were used to give a direct estimation of gene expression in single, living cells by assessing the relative amounts of each fluorescent protein in each cell. Initially, five minigenomes containing each of the MV NCS were generated. Assays were developed to determine the amount of each fluorescent protein in cells at both cell population and single-cell levels. This revealed significant variations in gene expression between cells expressing the same NCS-containing minigenome. The minigenome containing the M/F NCS produced significantly lower amounts of fluorescent protein from the second TU (TU2), compared with the other minigenomes. A minigenome with a truncated F 5' UTR had increased expression from TU2. This UTR is 524 nt longer than the other MV 5' UTRs. Insertions into the 5' UTR of the enhanced green fluorescent protein gene in the minigenome containing the N/P NCS showed that specific sequences, rather than just the additional length of F 5' UTR, govern this decreased expression from TU2.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Within about 30 years the Brazilian buffalo (Bubalus bubalis) herd will reach approximately 50 million head as a result of the great adaptive capacity of these animals to tropical climates, together with the good productive and reproductive potential which make these animals an important animal protein source for poor and developing countries. The myostatin gene (GDF8) is important in the physiology of stock animals because its product produces a direct effect on muscle development and consequently also on meat production. The myostatin sequence is known in several mammalian species and shows a high degree of amino acid sequence conservation, although the presence of non-silent and silent changes in the coding sequences and several alterations in the introns and untranslated regions have been identified. The objective of our work was to characterize the myostatin coding regions of B. bubalis (Murrah breed) and to compare them with the Bos taurus regions looking for variations in nucleotide and protein sequences. In this way, we were able to identify 12 variations at DNA level and five alterations on the presumed myostatin protein sequence as compared to non double-muscled bovine sequences.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Despite the wide distribution of transposable elements (TEs) in mammalian genomes, part of their evolutionary significance remains to be discovered. Today there is a substantial amount of evidence showing that TEs are involved in the generation of new exons in different species. In the present study, we searched 22,805 genes and reported the occurrence of TE-cassettes in coding sequences of 542 cow genes using the RepeatMasker program. Despite the significant number (542) of genes with TE insertions in exons, only 14 (2.6%) of them were translated into protein, which we characterized as chimeric genes. From these chimeric genes, only the FAST kinase domains 3 (FASTKD3) gene, present on chromosome BTA 20, is a functional gene and showed evidence of the exaptation event. The genome sequence analysis showed that the last exon coding sequence of bovine FASTKD3 is ∼85% similar to the ART2A retrotransposon sequence. In addition, comparison among FASTKD3 proteins shows that the last exon is very divergent from those of Homo sapiens, Pan troglodytes and Canis familiaris. We suggest that the gene structure of bovine FASTKD3 gene could have originated by several ectopic recombinations between TE copies. Additionally, the absence of TE sequences in all other species analyzed suggests that the TE insertion is clade-specific, mainly in the ruminant lineage. ©FUNPEC-RP.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

In silico analyses of Leishmania spp. genome data are a powerful resource to improve the understanding of these pathogens' biology. Trypanosomatids such as Leishmania spp. have their protein-coding genes grouped in long polycistronic units of functionally unrelated genes. The control of gene expression happens by a variety of posttranscriptional mechanisms. The high degree of synteny among Leishmania species is accompanied by highly conserved coding sequences (CDS) and poorly conserved intercoding untranslated sequences. To identify the elements involved in the control of gene expression, we conducted an in silico investigation to find conserved intercoding sequences (CICS) in the genomes of L. major, L. infantum, and L. braziliensis. We used a combination of computational tools, such as Linux-Shell, PERL and R languages, BLAST, MSPcrunch, SSAKE, and Pred-A-Term algorithms to construct a pipeline which was able to: (i) search for conservation in target-regions, (ii) eliminate CICS redundancy and mask repeat elements, (iii) predict the mRNA's extremities, (iv) analyze the distribution of orthologous genes within the generated LeishCICS-clusters, (v) assign GO terms to the LeishCICS-clusters, and (vi) provide statistical support for the gene-enrichment annotation. We associated the LeishCICS-cluster data, generated at the end of the pipeline, with the expression profile of L. donovani genes during promastigote-amastigote differentiation, as previously evaluated by others (GEO accession: GSE21936). A Pearson's correlation coefficient greater than 0.5 was observed for 730 LeishCICS-clusters containing from 2 to 17 genes. The designed computational pipeline is a useful tool and its application identified potential regulatory cis elements and putative regulons in Leishmania. (C) 2012 Elsevier B.V. All rights reserved.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Five different clones encoding thioredoxin homologues were isolated from Arabidopsis thaliana cDNA libraries. On the basis of the sequences they encode divergent proteins, but all belong to the cytoplasmic thioredoxins h previously described in higher plants. The five proteins obtained by overexpressing the coding sequences in Escherichia coli present typical thioredoxin activities (NADP(+)-malate dehydrogenase activation and reduction by Arabidopsis thioredoxin reductase) despite the presence of a variant active site, Trp-Cys-Pro-Pro-Cys, in three proteins in place of the canonical Trp-Cys-Gly-Pro-Cys sequence described for thioredoxins in prokaryotes and eukaryotes. Southern blots show that each cDNA is encoded by a single gene but suggest the presence of additional related sequences in the Arabidopsis genome. This very complex diversity of thioredoxins h is probably common to all higher plants, since the Arabidopsis sequences appear to have diverged very early, at the beginning of plant speciation. This diversity allows the transduction of a redox signal into multiple pathways.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Recent advancements in the area of nanotechnology have brought us into a new age of pervasive computing devices. These computing devices grow ever smaller and are being used in ways which were unimaginable before. Recent interest in developing a precise indoor positioning system, as opposed to existing outdoor systems, has given rise to much research in the area. The use of these small computing devices offers many conveniences for indoor positioning systems. This thesis deals with using small computing devices, Raspberry Pis, to enable and improve position estimation of mobile devices within closed spaces. The newly patented Orthogonal Perfect DFT Golay coding sequences will be used in this scenario, and their positioning properties will be tested. After that, testing and comparisons with other coding sequences will be done.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Studies of molecular evolutionary rates have yielded a wide range of rate estimates for various genes and taxa. Recent studies based on population-level and pedigree data have produced remarkably high estimates of mutation rate, which strongly contrast with substitution rates inferred in phylogenetic (species-level) studies. Using Bayesian analysis with a relaxed-clock model, we estimated rates for three groups of mitochondrial data: avian protein-coding genes, primate protein-coding genes, and primate d-loop sequences. In all three cases, we found a measurable transition between the high, short-term (<1–2 Myr) mutation rate and the low, long-term substitution rate. The relationship between the age of the calibration and the rate of change can be described by a vertically translated exponential decay curve, which may be used for correcting molecular date estimates. The phylogenetic substitution rates in mitochondria are approximately 0.5% per million years for avian protein-coding sequences and 1.5% per million years for primate protein-coding and d-loop sequences. Further analyses showed that purifying selection offers the most convincing explanation for the observed relationship between the estimated rate and the depth of the calibration. We rule out the possibility that it is a spurious result arising from sequence errors, and find it unlikely that the apparent decline in rates over time is caused by mutational saturation. Using a rate curve estimated from the d-loop data, several dates for last common ancestors were calculated: modern humans and Neandertals (354 ka; 222–705 ka), Neandertals (108 ka; 70–156 ka), and modern humans (76 ka; 47–110 ka). If the rate curve for a particular taxonomic group can be accurately estimated, it can be a useful tool for correcting divergence date estimates by taking the rate decay into account. Our results show that it is invalid to extrapolate molecular rates of change across different evolutionary timescales, which has important consequences for studies of populations, domestication, conservation genetics, and human evolution.
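
The "vertically translated exponential decay curve" mentioned in this abstract can be sketched as a function that starts at a high short-term rate and decays toward a long-term plateau. The parameterization below (rate at age zero, plateau, decay constant) and all numeric values in the usage line are assumptions for illustration only; the abstract does not give the fitted form or parameter values.

```python
import math


def rate_at(age_myr: float, long_term_rate: float,
            short_term_rate: float, decay: float) -> float:
    """Apparent rate of change for a calibration of the given age.

    Implements r(t) = k + (mu - k) * exp(-lam * t), where k is the long-term
    substitution rate (the vertical translation), mu is the short-term rate
    at t = 0, and lam > 0 is the decay constant. The parameter names are
    hypothetical, not taken from the source.
    """
    if age_myr < 0:
        raise ValueError("calibration age must be non-negative")
    if decay <= 0:
        raise ValueError("decay constant must be positive")
    return long_term_rate + (short_term_rate - long_term_rate) * math.exp(-decay * age_myr)


# Illustrative values only: a plateau of 1.5 (% per Myr) and an arbitrary
# short-term rate and decay constant.
for age in (0.0, 0.5, 1.0, 2.0, 10.0):
    print(age, rate_at(age, long_term_rate=1.5, short_term_rate=30.0, decay=3.0))
```

Under this form the curve equals the short-term rate at age zero and approaches the long-term rate as the calibration gets older, which is why a single rate cannot be extrapolated across timescales.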