56 resultados para event sequences


Relevância:

20.00% 20.00%

Publicador:

Resumo:

We have carried out an initial analysis of the dynamics of the recent evolution of the splice-sites sequences on a large collection of human, rodent (mouse and rat), and chicken introns. Our results indicate that the sequences of splice sites are largely homogeneous within tetrapoda. We have also found that orthologous splice signals between human and rodents and within rodents are more conserved than unrelated splice sites, but the additional conservation can be explained mostly by background intron conservation. In contrast, additional conservation over background is detectable in orthologous mammalian and chicken splice sites. Our results also indicate that the U2 and U12 intron classes seem to have evolved independently since the split of mammals and birds; we have not been able to find a convincing case of interconversion between these two classes in our collections of orthologous introns. Similarly, we have not found a single case of switching between AT-AC and GT-AG subtypes within U12 introns, suggesting that this event has been a rare occurrence in recent evolutionary times. Switching between GT-AG and the noncanonical GC-AG U2 subtypes, on the contrary, does not appear to be unusual; in particular, T to C mutations appear to be relatively well tolerated in GT-AG introns with very strong donor sites.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

One of the first useful products from the human genome will be a set of predicted genes. Besides its intrinsic scientific interest, the accuracy and completeness of this data set is of considerable importance for human health and medicine. Though progress has been made on computational gene identification in terms of both methods and accuracy evaluation measures, most of the sequence sets in which the programs are tested are short genomic sequences, and there is concern that these accuracy measures may not extrapolate well to larger, more challenging data sets. Given the absence of experimentally verified large genomic data sets, we constructed a semiartificial test set comprising a number of short single-gene genomic sequences with randomly generated intergenic regions. This test set, which should still present an easier problem than real human genomic sequence, mimics the approximately 200kb long BACs being sequenced. In our experiments with these longer genomic sequences, the accuracy of GENSCAN, one of the most accurate ab initio gene prediction programs, dropped significantly, although its sensitivity remained high. Conversely, the accuracy of similarity-based programs, such as GENEWISE, PROCRUSTES, and BLASTX was not affected significantly by the presence of random intergenic sequence, but depended on the strength of the similarity to the protein homolog. As expected, the accuracy dropped if the models were built using more distant homologs, and we were able to quantitatively estimate this decline. However, the specificities of these techniques are still rather good even when the similarity is weak, which is a desirable characteristic for driving expensive follow-up experiments. Our experiments suggest that though gene prediction will improve with every new protein that is discovered and through improvements in the current set of tools, we still have a long way to go before we can decipher the precise exonic structure of every gene in the human genome using purely computational methodology.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The goals of the human genome project did not include sequencing of the heterochromatic regions. We describe here an initial sequence of 1.1 Mb of the short arm of human chromosome 21 (HSA21p), estimated to be 10% of 21p. This region contains extensive euchromatic-like sequence and includes on average one transcript every 100 kb. These transcripts show multiple inter- and intrachromosomal copies, and extensive copy number and sequence variability. The sequencing of the "heterochromatic" regions of the human genome is likely to reveal many additional functional elements and provide important evolutionary information.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The construction of metagenomic libraries has permitted the study of microorganisms resistant to isolation and the analysis of 16S rDNA sequences has been used for over two decades to examine bacterial biodiversity. Here, we show that the analysis of random sequence reads (RSRs) instead of 16S is a suitable shortcut to estimate the biodiversity of a bacterial community from metagenomic libraries. We generated 10,010 RSRs from a metagenomic library of microorganisms found in human faecal samples. Then searched them using the program BLASTN against a prokaryotic sequence database to assign a taxon to each RSR. The results were compared with those obtained by screening and analysing the clones containing 16S rDNA sequences in the whole library. We found that the biodiversity observed by RSR analysis is consistent with that obtained by 16S rDNA. We also show that RSRs are suitable to compare the biodiversity between different metagenomic libraries. RSRs can thus provide a good estimate of the biodiversity of a metagenomic library and, as an alternative to 16S, this approach is both faster and cheaper.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The vast majority of the biology of a newly sequenced genome is inferred from the set of encoded proteins. Predicting this set is therefore invariably the first step after the completion of the genome DNA sequence. Here we review the main computational pipelines used to generate the human reference protein-coding gene sets.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background: The human FOXI1 gene codes for a transcription factor involved in the physiology of the inner ear, testis, and kidney. Using three interspecies comparisons, it has been suggested that this may be a gene underhuman-specific selection. We sought to confirm this finding by using an extended set of orthologous sequences.Additionally, we explored for signals of natural selection within humans by sequencing the gene in 20 Europeans,20 East Asians and 20 Yorubas and by analysing SNP variation in a 2 Mb region centered on FOXI1 in 39worldwide human populations from the HGDP-CEPH diversity panel.Results: The genome sequences recently available from other primate and non-primate species showed that FOXI1divergence patterns are compatible with neutral evolution. Sequence-based neutrality tests were not significant inEuropeans, East Asians or Yorubas. However, the Long Range Haplotype (LRH) test, as well as the iHS and XP-Rsbstatistics revealed significantly extended tracks of homozygosity around FOXI1 in Africa, suggesting a recentepisode of positive selection acting on this gene. A functionally relevant SNP, as well as several SNPs either on theputatively selected core haplotypes or with significant iHS or XP-Rsb values, displayed allele frequencies stronglycorrelated with the absolute geographical latitude of the populations sampled.Conclusions: We present evidence for recent positive selection in the FOXI1 gene region in Africa. Climate mightbe related to this recent adaptive event in humans. Of the multiple functions of FOXI1, its role in kidney-mediatedwater-electrolyte homeostasis is the most obvious candidate for explaining a climate-related adaptation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this work we present the results of experimental work on the development of lexical class-based lexica by automatic means. Our purpose is to assess the use of linguistic lexical-class based information as a feature selection methodology for the use of classifiers in quick lexical development. The results show that the approach can help reduce the human effort required in the development of language resources significantly.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper is aimed at exploring the determinants of female activity from a dynamic perspective. An event-history analysis of the transition form employment to housework has been made resorting to data from the European Household Panel Survey. Four countries representing different welfare regimes and, more specifically, different family policies, have been selected for the analysis: Britain, Denmark, Germany and Spain. The results confirm the importance of individual-level factors, which is consistent with an economic approach to female labour supply. Nonetheless, there are significant cross-national differences in how these factors act over the risk of abandoning the labour market. First, the number of trnasitions is much lower among Danish working women than among British, German or Spanish ones, revealing the relative importance of universal provision of childcare services, vis-à-vis other elements of the family policy, as time or money.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Sequential randomized prediction of an arbitrary binary sequence isinvestigated. No assumption is made on the mechanism of generating the bit sequence. The goal of the predictor is to minimize its relative loss, i.e., to make (almost) as few mistakes as the best ``expert'' in a fixed, possibly infinite, set of experts. We point out a surprising connection between this prediction problem and empirical process theory. First, in the special case of static (memoryless) experts, we completely characterize the minimax relative loss in terms of the maximum of an associated Rademacher process. Then we show general upper and lower bounds on the minimaxrelative loss in terms of the geometry of the class of experts. As main examples, we determine the exact order of magnitude of the minimax relative loss for the class of autoregressive linear predictors and for the class of Markov experts.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We consider adaptive sequential lossy coding of bounded individual sequences when the performance is measured by the sequentially accumulated mean squared distortion. Theencoder and the decoder are connected via a noiseless channel of capacity $R$ and both are assumed to have zero delay. No probabilistic assumptions are made on how the sequence to be encoded is generated. For any bounded sequence of length $n$, the distortion redundancy is defined as the normalized cumulative distortion of the sequential scheme minus the normalized cumulative distortion of the best scalarquantizer of rate $R$ which is matched to this particular sequence. We demonstrate the existence of a zero-delay sequential scheme which uses common randomization in the encoder and the decoder such that the normalized maximum distortion redundancy converges to zero at a rate $n^{-1/5}\log n$ as the length of the encoded sequence $n$ increases without bound.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We construct spectral sequences in the framework of Baues-Wirsching cohomology and homology for functors between small categories and analyze particular cases including Grothendieck fibrations. We also give applications to more classical cohomology and homology theories including Hochschild-Mitchell cohomology and those studied before by Watts, Roos, Quillen and others

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In the scope of the European project Hydroptimet, INTERREG IIIB-MEDOCC programme, limited area model (LAM) intercomparison of intense events that produced many damages to people and territory is performed. As the comparison is limited to single case studies, the work is not meant to provide a measure of the different models' skill, but to identify the key model factors useful to give a good forecast on such a kind of meteorological phenomena. This work focuses on the Spanish flash-flood event, also known as "Montserrat-2000" event. The study is performed using forecast data from seven operational LAMs, placed at partners' disposal via the Hydroptimet ftp site, and observed data from Catalonia rain gauge network. To improve the event analysis, satellite rainfall estimates have been also considered. For statistical evaluation of quantitative precipitation forecasts (QPFs), several non-parametric skill scores based on contingency tables have been used. Furthermore, for each model run it has been possible to identify Catalonia regions affected by misses and false alarms using contingency table elements. Moreover, the standard "eyeball" analysis of forecast and observed precipitation fields has been supported by the use of a state-of-the-art diagnostic method, the contiguous rain area (CRA) analysis. This method allows to quantify the spatial shift forecast error and to identify the error sources that affected each model forecasts. High-resolution modelling and domain size seem to have a key role for providing a skillful forecast. Further work is needed to support this statement, including verification using a wider observational data set.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The 10 June 2000 event was the largest flash flood event that occurred in the Northeast of Spain in the late 20th century, both as regards its meteorological features and its considerable social impact. This paper focuses on analysis of the structures that produced the heavy rainfalls, especially from the point of view of meteorological radar. Due to the fact that this case is a good example of a Mediterranean flash flood event, a final objective of this paper is to undertake a description of the evolution of the rainfall structure that would be sufficiently clear to be understood at an interdisciplinary forum. Then, it could be useful not only to improve conceptual meteorological models, but also for application in downscaling models. The main precipitation structure was a Mesoscale Convective System (MCS) that crossed the region and that developed as a consequence of the merging of two previous squall lines. The paper analyses the main meteorological features that led to the development and triggering of the heavy rainfalls, with special emphasis on the features of this MCS, its life cycle and its dynamic features. To this end, 2-D and 3-D algorithms were applied to the imagery recorded over the complete life cycle of the structures, which lasted approximately 18 h. Mesoscale and synoptic information were also considered. Results show that it was an NS-MCS, quasi-stationary during its stage of maturity as a consequence of the formation of a convective train, the different displacement directions of the 2-D structures and the 3-D structures, including the propagation of new cells, and the slow movement of the convergence line associated with the Mediterranean mesoscale low.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Leakage detection is an important issue in many chemical sensing applications. Leakage detection hy thresholds suffers from important drawbacks when sensors have serious drifts or they are affected by cross-sensitivities. Here we present an adaptive method based in a Dynamic Principal Component Analysis that models the relationships between the sensors in the may. In normal conditions a certain variance distribution characterizes sensor signals. However, in the presence of a new source of variance the PCA decomposition changes drastically. In order to prevent the influence of sensor drifts the model is adaptive and it is calculated in a recursive manner with minimum computational effort. The behavior of this technique is studied with synthetic signals and with real signals arising by oil vapor leakages in an air compressor. Results clearly demonstrate the efficiency of the proposed method.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background: Hox and ParaHox gene clusters are thought to have resulted from the duplication of a ProtoHox gene cluster early in metazoan evolution. However, the origin and evolution of the other genes belonging to the extended Hox group of homeobox-containing genes, that is, Mox and Evx, remains obscure. We constructed phylogenetic trees with mouse, amphioxus and Drosophila extended Hox and other related Antennapedia-type homeobox gene sequences and analyzed the linkage data available for such genes.Results: We claim that neither Mox nor Evx is a Hox or ParaHox gene. We propose a scenariothat reconciles phylogeny with linkage data, in which an Evx/Mox ancestor gene linked to aProtoHox cluster was involved in a segmental tandem duplication event that generated an arrayof all Hox-like genes, referred to as the `coupled¿ cluster. A chromosomal breakage within thiscluster explains the current composition of the extended Hox cluster (with Evx, Hox and Moxgenes) and the ParaHox cluster.Conclusions: Most studies dealing with the origin and evolution of Hox and ParaHox clustershave not included the Hox-related genes Mox and Evx. Our phylogenetic analyses and theavailable linkage data in mammalian genomes support an evolutionary scenario in which anancestor of Evx and Mox was linked to the ProtoHox cluster, and that a tandem duplication of alarge genomic region early in metazoan evolution generated the Hox and ParaHox clusters, plusthe cluster-neighbors Evx and Mox. The large `coupled¿ Hox-like cluster EvxHox/MoxParaHox wassubsequently broken, thus grouping the Mox and Evx genes to the Hox clusters, and isolating theParaHox cluster.