65 results for coding
Abstract:
A biplot, which is the multivariate generalization of the two-variable scatterplot, can be used to visualize the results of many multivariate techniques, especially those that are based on the singular value decomposition. We consider data sets consisting of continuous-scale measurements, their fuzzy coding and the biplots that visualize them, using a fuzzy version of multiple correspondence analysis. Of special interest is the way quality of fit of the biplot is measured, since it is well-known that regular (i.e., crisp) multiple correspondence analysis seriously under-estimates this measure. We show how the results of fuzzy multiple correspondence analysis can be defuzzified to obtain estimated values of the original data, and prove that this implies an orthogonal decomposition of variance. This permits a measure of fit to be calculated in the familiar form of a percentage of explained variance, which is directly comparable to the corresponding fit measure used in principal component analysis of the original data. The approach is motivated initially by its application to a simulated data set, showing how the fuzzy approach can lead to diagnosing nonlinear relationships, and finally it is applied to a real set of meteorological data.
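As an illustration of the fuzzy coding and defuzzification mentioned in this abstract, here is a minimal sketch that assumes triangular membership functions over three hinge points. The function names, the hinge values and the choice of triangular memberships are assumptions made for the example; the abstract does not specify the coding scheme used.

```python
def fuzzy_code(x, hinges):
    """Triangular fuzzy coding of a scalar x against strictly increasing hinge points.

    Returns one membership value per hinge; the values are non-negative and sum to 1.
    """
    memberships = [0.0] * len(hinges)
    # Values at or beyond the outer hinges belong fully to the end category.
    if x <= hinges[0]:
        memberships[0] = 1.0
        return memberships
    if x >= hinges[-1]:
        memberships[-1] = 1.0
        return memberships
    # Otherwise x lies between two neighbouring hinges and is shared linearly.
    for j in range(len(hinges) - 1):
        lo, hi = hinges[j], hinges[j + 1]
        if lo <= x <= hi:
            w = (x - lo) / (hi - lo)
            memberships[j] = 1.0 - w
            memberships[j + 1] = w
            return memberships


def defuzzify(memberships, hinges):
    """Recover an estimated value as the membership-weighted average of the hinges."""
    return sum(m * h for m, h in zip(memberships, hinges))


hinges = [0.0, 5.0, 10.0]          # hypothetical hinge points (e.g. min, median, max)
coded = fuzzy_code(2.5, hinges)    # shared equally between the first two categories
recovered = defuzzify(coded, hinges)
```

With triangular memberships, defuzzifying the exact coded values returns the original measurement for any value within the hinge range; in the setting described above, defuzzification is applied to fitted fuzzy values, so the result is an estimate rather than an exact recovery.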
Abstract:
We investigate on-line prediction of individual sequences. Given a class of predictors, the goal is to predict as well as the best predictor in the class, where the loss is measured by the self-information (logarithmic) loss function. The excess loss (regret) is closely related to the redundancy of the associated lossless universal code. Using Shtarkov's theorem and tools from empirical process theory, we prove a general upper bound on the best possible (minimax) regret. The bound depends on certain metric properties of the class of predictors. We apply the bound to both parametric and nonparametric classes of predictors. Finally, we point out a suboptimal behavior of the popular Bayesian weighted average algorithm.
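To make the notion of regret under logarithmic loss concrete, the following sketch computes it for a uniform-prior Bayesian mixture over a small, finite class of constant Bernoulli predictors. The finite class, the parameter grid and the example sequence are all assumptions chosen for illustration; the abstract concerns much more general classes. For a finite class of N predictors, the regret of this mixture is at most ln N.

```python
import math


def sequence_log_loss(theta, bits):
    """Cumulative log loss (in nats) of a constant Bernoulli(theta) predictor."""
    return -sum(math.log(theta if b == 1 else 1.0 - theta) for b in bits)


def mixture_log_loss(thetas, bits):
    """Cumulative log loss of the uniform-prior Bayesian mixture over the class.

    By the chain rule, the sequential losses of the mixture add up to minus the
    log of the average probability the experts assign to the whole sequence.
    """
    probs = [math.exp(-sequence_log_loss(t, bits)) for t in thetas]
    return -math.log(sum(probs) / len(probs))


thetas = [0.1, 0.3, 0.5, 0.7, 0.9]   # hypothetical finite class of predictors
bits = [1, 1, 0, 1, 1, 1, 0, 1]      # hypothetical individual sequence

best_loss = min(sequence_log_loss(t, bits) for t in thetas)
regret = mixture_log_loss(thetas, bits) - best_loss
```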
Abstract:
The turbot (Scophthalmus maximus) is a commercially valuable flatfish and one of the most promising aquaculture species in Europe. Two transcriptome 454-pyrosequencing runs were used to detect Single Nucleotide Polymorphisms (SNPs) in genes related to immune response and gonad differentiation. A total of 866 true SNPs were detected in 140 different contigs representing 262,093 bp as a whole. Only one true SNP was analyzed in each contig. One hundred and thirteen SNPs out of the 140 analyzed were feasible (genotyped), while Ш were polymorphic in a wild population. Transition/transversion ratio (1.354) was similar to that observed in other fish studies. Unbiased gene diversity (He) estimates ranged from 0.060 to 0.510 (mean = 0.351), minimum allele frequency (MAF) from 0.030 to 0.500 (mean = 0.259) and all loci were in Hardy-Weinberg equilibrium after Bonferroni correction. A large number of SNPs (49) were located in the coding region, 33 representing synonymous and 16 non-synonymous changes. Most SNP-containing genes were related to immune response and gonad differentiation processes, and could be candidates for functional changes leading to phenotypic changes. These markers will be useful for population screening to look for adaptive variation in wild and domestic turbot.
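The transition/transversion ratio reported in this abstract can be computed from a list of biallelic SNPs as sketched below. The function name and the example allele pairs are hypothetical and are not taken from the study's data.

```python
# Transitions are purine<->purine (A/G) or pyrimidine<->pyrimidine (C/T) changes;
# every other substitution between two different bases is a transversion.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def ts_tv_ratio(snps):
    """Return the transition/transversion ratio for (allele1, allele2) pairs.

    Assumes each pair holds two different bases drawn from A, C, G, T.
    """
    ts = sum(1 for a, b in snps if frozenset((a, b)) in TRANSITIONS)
    tv = len(snps) - ts
    if tv == 0:
        raise ValueError("no transversions: ratio is undefined")
    return ts / tv


example_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "T")]
ratio = ts_tv_ratio(example_snps)   # 3 transitions, 2 transversions
```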
Abstract:
Hepatitis A virus (HAV), the prototype of genus Hepatovirus, has several unique biological characteristics that distinguish it from other members of the Picornaviridae family. Among these, the need for an intact eIF4G factor for the initiation of translation results in an inability to shut down host protein synthesis by a mechanism similar to that of other picornaviruses. Consequently, HAV must inefficiently compete for the cellular translational machinery and this may explain its poor growth in cell culture. In this context of virus/cell competition, HAV has strategically adopted a naturally highly deoptimized codon usage with respect to that of its cellular host. With the aim to optimize its codon usage the virus was adapted to propagate in cells with impaired protein synthesis, in order to make tRNA pools more available for the virus. A significant loss of fitness was the immediate response to the adaptation process that was, however, later on recovered and more associated to a re-deoptimization rather than to an optimization of the codon usage specifically in the capsid coding region. These results exclude translation selection and instead suggest fine-tuning translation kinetics selection as the underlying mechanism of the codon usage bias in this specific genome region. Additionally, the results provide clear evidence of the Red Queen dynamics of evolution since the virus has very much evolved to re-adapt its codon usage to the environmental cellular changing conditions in order to recover the original fitness.
Abstract:
Real-world images are complex objects, difficult to describe but at the same time possessing a high degree of redundancy. A very recent study [1] on the statistical properties of natural images reveals that natural images can be viewed through different partitions which are essentially fractal in nature. One particular fractal component, related to the most singular (sharpest) transitions in the image, seems to be highly informative about the whole scene. In this paper we will show how to decompose the image into its fractal components. We will see that the most singular component is related to (but not coincident with) the edges of the objects present in the scenes. We will propose a new, simple method to reconstruct the image with information contained in that most informative component. We will see that the quality of the reconstruction is strongly dependent on the capability to extract the relevant edges in the determination of the most singular set. We will discuss the results from the perspective of coding, proposing this method as a starting point for future developments.
Abstract:
Background: Non-long terminal repeat (non-LTR) retrotransposons have contributed to shaping the structure and function of genomes. In silico and experimental approaches have been used to identify the non-LTR elements of the urochordate Ciona intestinalis. Knowledge of the types and abundance of non-LTR elements in urochordates is a key step in understanding their contribution to the structure and function of vertebrate genomes. Results: Consensus elements phylogenetically related to the I, LINE1, LINE2, LOA and R2 elements of the 14 eukaryotic non-LTR clades are described from C. intestinalis. The ascidian elements showed conservation of both the reverse transcriptase coding sequence and the overall structural organization seen in each clade. The apurinic/apyrimidinic endonuclease and nucleic-acid-binding domains encoded upstream of the reverse transcriptase, and the RNase H and the restriction enzyme-like endonuclease motifs encoded downstream of the reverse transcriptase were identified in the corresponding Ciona families. Conclusions: The genome of C. intestinalis harbors representatives of at least five clades of non-LTR retrotransposons. The copy number per haploid genome of each element is low, less than 100, far below the values reported for vertebrate counterparts but within the range for protostomes. Genomic and sequence analysis shows that the ascidian non-LTR elements are unmethylated and flanked by genomic segments with a gene density lower than average for the genome. The analysis provides valuable data for understanding the evolution of early chordate genomes and enlarges the view on the distribution of the non-LTR retrotransposons in eukaryotes.
Abstract:
MicroRNAs (miRNAs) are short non-coding RNA molecules playing regulatory roles by repressing translation or cleaving RNA transcripts. Although the number of verified human miRNAs is still expanding, only a few have been functionally described. However, emerging evidence suggests the potential involvement of altered regulation of miRNAs in the pathogenesis of cancers, and these genes are thought to function as both tumour suppressors and oncogenes. In our study, we examined by Real-Time PCR the expression of 156 mature miRNAs in colorectal cancer. The analysis by several bioinformatics algorithms of colorectal tumours and adjacent non-neoplastic tissues from patients and colorectal cancer cell lines allowed us to identify a group of 13 miRNAs whose expression is significantly altered in this tumour. The most significantly deregulated miRNAs were miR-31, miR-96, miR-133b, miR-135b, miR-145, and miR-183. In addition, the expression level of miR-31 was correlated with the stage of the CRC tumour. Our results suggest that the miRNA expression profile could have relevance to the biological and clinical behavior of colorectal neoplasia.
Abstract:
This research analyses the actual use and conception of ICT mobility held by a group of lifelong-learning students. The students participated in a Mobile Learning experience during an online postgraduate course, which was designed from a traditional e-learning perspective. The students received a tablet PC (iPad) to work on the course and also to use in their personal and professional lives. A complete and original pre-test/post-test questionnaire was applied before and after the course. This instrument was scientifically validated. Through the questionnaire, usage tendencies and student perceptions were studied. Frequencies, purposes, habits of use and valuation, as well as the device's integration into their personal, social and professional lives, were studied. The analysis attempts to apply the 'Social Technographics Profile' by Bernoff (2010) to classify, by profile groups, the users of the current Internet. Finally, a reflection on the reasons for and limits of the theory in this study, and on its relation to reality, is presented. The inter-coding reliability and validity show the possibility of applying the instrument to wider samples in order to get a closer look at the uses and current conceptions of ubiquitous ICTs.
Abstract:
Data banks on the flora of the Catalan Countries. Two data banks on the flora of the Catalan Countries (NE and E of the Iberian Peninsula and Balearic Islands) have been set up by the Secció de Ciències of the Institut d'Estudis Catalans (I.E.C., Institute of Catalan Studies). One is devoted to bibliography and the other to floristics, in this first phase concerned only with vascular plants. The sources are all types of publications containing concrete information on the vascular flora of the Catalan Countries. The information is transcribed on precoded forms so as to standardize the data and thus permit homogeneous input for subsequent storage in the computer. The coding schemes and characteristics of the forms are described for each of the data banks. Only the phytocoenological inventories receive special treatment, which is discussed with reference to the floristic data bank. A first issue is annexed: the bibliography concerning the vascular flora of the Catalan Countries (years 1983 and 1984).
Differences in the evolutionary history of disease genes affected by dominant or recessive mutations
Abstract:
Background: Global analyses of human disease genes by computational methods have yielded important advances in the understanding of human diseases. Generally these studies have treated the group of disease genes uniformly, thus ignoring the type of disease-causing mutations (dominant or recessive). In this report we present a comprehensive study of the evolutionary history of autosomal disease genes separated by mode of inheritance. Results: We examine differences in protein and coding sequence conservation between dominant and recessive human disease genes. Our analysis shows that disease genes affected by dominant mutations are more conserved than those affected by recessive mutations. This could be a consequence of the fact that recessive mutations remain hidden from selection while heterozygous. Furthermore, we employ functional annotation analysis and investigations into disease severity to support this hypothesis. Conclusion: This study elucidates important differences between dominantly- and recessively-acting disease genes in terms of protein and DNA sequence conservation, paralogy and essentiality. We propose that the division of disease genes by mode of inheritance will enhance both understanding of the disease process and prediction of candidate disease genes in the future.
Abstract:
Amino acid tandem repeats, also called homopolymeric tracts, are extremely abundant in eukaryotic proteins. To gain insight into the genome-wide evolution of these regions in mammals, we analyzed the repeat content in a large data set of rat-mouse-human orthologs. Our results show that human proteins contain more amino acid repeats than rodent proteins and that trinucleotide repeats are also more abundant in human coding sequences. Using the human species as an outgroup, we were able to address differences in repeat loss and repeat gain in the rat and mouse lineages. In this data set, mouse proteins contain substantially more repeats than rat proteins, which can be at least partly attributed to a higher repeat loss in the rat lineage. The data are consistent with a role for trinucleotide slippage in the generation of novel amino acid repeats. We confirm the previously observed functional bias of proteins with repeats, with overrepresentation of transcription factors and DNA-binding proteins. We show that genes encoding amino acid repeats tend to have an unusually high GC content, and that differences in coding GC content among orthologs are directly related to the presence/absence of repeats. We propose that the different GC content isochore structure in rodents and humans may result in an increased amino acid repeat prevalence in the human lineage.
Abstract:
One of the most striking results of the human (and mammalian) genomes is the low number of protein-coding genes. To date, the main molecular mechanism to increase the number of different protein isoforms and functions is alternative splicing. However, a less-known way to increase the number of protein functions is the existence of multifunctional, multitask, or "moonlighting" proteins. By and large, moonlighting proteins are experimentally disclosed by serendipity. Proteomics is becoming one of the very active areas of biomedical research, which permits researchers to identify previously unseen connections among proteins and pathways. In principle, protein–protein interaction (PPI) databases should contain information on moonlighting proteins and could provide suggestions for further analysis in order to prove the multifunctionality. As far as we know, nobody has verified whether PPI databases actually disclose moonlighting proteins. In the present work we check whether well-established moonlighting proteins present in PPI databases connect with their known partners and, therefore, whether a careful inspection of these databases could help to suggest their different functions. The results of our research suggest that PPI databases could be a valuable tool to suggest multifunctionality.
Abstract:
Assessing the contribution of promoters and coding sequences to gene evolution is an important step toward discovering the major genetic determinants of human evolution. Many specific examples have revealed the evolutionary importance of cis-regulatory regions. However, the relative contribution of regulatory and coding regions to the evolutionary process and whether systemic factors differentially influence their evolution remains unclear. To address these questions, we carried out an analysis at the genome scale to identify signatures of positive selection in human proximal promoters. Next, we examined whether genes with positively selected promoters (Prom+ genes) show systemic differences with respect to a set of genes with positively selected protein-coding regions (Cod+ genes). We found that the number of genes in each set was not significantly different (8.1% and 8.5%, respectively). Furthermore, a functional analysis showed that, in both cases, positive selection affects almost all biological processes and only a few genes of each group are located in enriched categories, indicating that promoters and coding regions are not evolutionarily specialized with respect to gene function. On the other hand, we show that the topology of the human protein network has a different influence on the molecular evolution of proximal promoters and coding regions. Notably, Prom+ genes have an unexpectedly high centrality when compared with a reference distribution (P = 0.008, for Eigenvalue centrality). Moreover, the frequency of Prom+ genes increases from the periphery to the center of the protein network (P = 0.02, for the logistic regression coefficient). This means that gene centrality does not constrain the evolution of proximal promoters, unlike the case with coding regions, and further indicates that the evolution of proximal promoters is more efficient in the center of the protein network than in the periphery. 
These results show that proximal promoters have had a systemic contribution to human evolution by increasing the participation of central genes in the evolutionary process.