948 results for local sequence alignment problem


Relevance: 30.00%

Abstract:

The main feature of partition of unity methods such as the generalized or extended finite element method is their ability to utilize a priori knowledge about the solution of a problem in the form of enrichment functions. However, analytical derivation of enrichment functions with good approximation properties is mostly limited to two-dimensional linear problems. This paper presents a procedure to numerically generate proper enrichment functions for three-dimensional problems with confined plasticity where plastic evolution is gradual. This procedure involves the solution of boundary value problems around local regions exhibiting nonlinear behavior and the enrichment of the global solution space with the local solutions through the partition of unity method framework. This approach can produce accurate nonlinear solutions with a reduced computational cost compared to standard finite element methods, since computationally intensive nonlinear iterations can be performed on coarse global meshes after the creation of enrichment functions properly describing localized nonlinear behavior. Several three-dimensional nonlinear problems based on the rate-independent J2 plasticity theory with isotropic hardening are solved using the proposed procedure to demonstrate its robustness, accuracy and computational efficiency.
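
The enrichment mechanism described above can be illustrated with a minimal one-dimensional sketch: a standard hat-function basis (the partition of unity) is multiplied by an enrichment function, which in the paper's method would come from the numerical solution of a local boundary value problem. Here the hat functions, the Gaussian-shaped enrichment `psi` and the enriched nodes are hypothetical stand-ins, not the authors' three-dimensional implementation.

```python
import numpy as np

# Minimal 1D sketch of a partition-of-unity (GFEM-style) enrichment.
nodes = np.linspace(0.0, 1.0, 6)           # coarse global mesh
enriched = {2, 3}                          # nodes whose support sees nonlinearity

def hat(i, x):
    """Standard piecewise-linear shape function N_i (partition of unity)."""
    h = nodes[1] - nodes[0]
    return np.clip(1.0 - np.abs(x - nodes[i]) / h, 0.0, None)

def psi(x):
    """Hypothetical enrichment: a localized 'plastic' profile around x = 0.5."""
    return np.exp(-((x - 0.5) / 0.1) ** 2)

def u_h(x, u, a):
    """Enriched approximation: sum_i N_i u_i + sum_{i in E} N_i psi a_i."""
    out = np.zeros_like(x)
    for i in range(len(nodes)):
        out += hat(i, x) * u[i]
        if i in enriched:
            out += hat(i, x) * psi(x) * a[i]
    return out

x = np.linspace(0.0, 1.0, 201)
u = np.linspace(0.0, 1.0, len(nodes))        # placeholder standard dofs
a = np.zeros(len(nodes)); a[2] = a[3] = 0.3  # placeholder enriched dofs
print(u_h(x, u, a)[:5])
```

Because the enrichment is multiplied by the partition of unity, it acts only inside the supports of the enriched nodes, which is what allows the nonlinear iterations to stay on the coarse global mesh.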

Relevance: 30.00%

Abstract:

Using recent results on the compactness of the space of solutions of the Yamabe problem, we show that, in conformal classes of metrics near the class of a nondegenerate solution that is unique (up to scaling), the Yamabe problem also has a unique solution. This provides examples of a local extension, in the space of conformal classes, of a well-known uniqueness criterion due to Obata.
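
For context, the Yamabe problem asks for a metric of constant scalar curvature within the conformal class of a given metric \(g\) on a closed manifold of dimension \(n \ge 3\); writing the conformal metric as \(\tilde g = u^{4/(n-2)} g\), this is equivalent to solving (standard background, not a result of the paper)

\[
-\frac{4(n-1)}{n-2}\,\Delta_g u + R_g\,u = \lambda\,u^{\frac{n+2}{n-2}}, \qquad u > 0,
\]

where \(R_g\) is the scalar curvature of \(g\) and the constant \(\lambda\) is the scalar curvature of \(\tilde g\). Nondegeneracy of a solution refers to the linearization of this equation at that solution having trivial kernel.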

Relevance: 30.00%

Abstract:

Context. Lithium abundances in open clusters are a very effective probe of mixing processes, and their study can help us to understand the large depletion of lithium that occurs in the Sun. Owing to its age and metallicity, the open cluster M 67 is especially interesting in this respect. Many studies of lithium abundances in M 67 have been performed, but a homogeneous global analysis of lithium in stars ranging from subsolar masses up to the most massive members has yet to be accomplished for a large sample based on high-quality spectra. Aims. We test our non-standard models, which were calibrated using the Sun, against these observational data. Methods. We collect literature data to analyze, for the first time in a homogeneous way, the non-local thermal equilibrium lithium abundances of all observed single stars in M 67 more massive than ~0.9 M⊙. Our grid of evolutionary models is computed assuming non-standard mixing at metallicity [Fe/H] = 0.01, using the Toulouse-Geneva evolution code. Our analysis starts from the entrance onto the zero-age main sequence. Results. Lithium in M 67 is a tight function of mass for stars more massive than the Sun, apart from a few outliers. A plateau in lithium abundances is observed for turn-off stars. Both less massive (M ≤ 1.10 M⊙) and more massive (M ≥ 1.28 M⊙) stars are more depleted than those in the plateau. There is significant scatter in lithium abundances for any given mass M ≤ 1.1 M⊙. Conclusions. Our models qualitatively reproduce most of the features described above, although the predicted depletion of lithium is 0.45 dex smaller than observed for masses in the plateau region, i.e. between 1.1 and 1.28 solar masses. More work is clearly needed to accurately reproduce the observations. Despite hints that chromospheric activity and rotation play a role in lithium depletion, no firm conclusion can be drawn with the presently available data.

Relevance: 30.00%

Abstract:

The aim of solving the Optimal Power Flow problem is to determine the optimal state of an electric power transmission system, that is, the voltage magnitudes and phase angles and the transformer tap ratios that optimize the performance of a given system while satisfying its physical and operating constraints. The Optimal Power Flow problem is modeled as a large-scale mixed-discrete nonlinear programming problem. This paper proposes a method for handling the discrete variables of the Optimal Power Flow problem based on a penalty function. With the inclusion of the penalty function in the objective function, a sequence of nonlinear programming problems with only continuous variables is obtained, and the solutions of these problems converge to a solution of the mixed problem. The resulting nonlinear programming problems are solved by a primal-dual logarithmic-barrier method. Numerical tests using the IEEE 14-, 30-, 118- and 300-bus test systems indicate that the method is efficient. (C) 2012 Elsevier B.V. All rights reserved.
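
The abstract does not give the specific penalty function, but the general idea of appending a term that vanishes only when a variable sits on one of its allowed discrete steps, and then solving a sequence of purely continuous nonlinear programs with an increasing penalty weight, can be sketched as follows. The sinusoidal penalty, the toy "loss" objective and the use of SciPy's bounded solver (instead of the paper's primal-dual logarithmic-barrier method) are assumptions for illustration, not the authors' formulation.

```python
import numpy as np
from scipy.optimize import minimize

# Handle a discrete transformer tap ratio by penalizing deviation from its
# discrete steps and solving a sequence of continuous NLPs.
step, t_min = 0.0125, 0.9                  # hypothetical tap grid: 0.9, 0.9125, ...

def discrete_penalty(t):
    # Zero exactly when (t - t_min) is an integer multiple of the step size.
    return np.sin(np.pi * (t - t_min) / step) ** 2

def objective(x, c):
    v, t = x                               # toy state: one voltage, one tap ratio
    loss = (v - 1.0) ** 2 + 0.5 * (t - 0.97) ** 2   # placeholder "losses"
    return loss + c * discrete_penalty(t)

x = np.array([1.0, 0.95])
for c in [0.0, 1.0, 10.0, 100.0]:          # outer loop: tighten the penalty
    res = minimize(objective, x, args=(c,),
                   bounds=[(0.95, 1.05), (0.9, 1.1)])
    x = res.x
print("tap ratio ->", x[1], "nearest step ->",
      t_min + step * round((x[1] - t_min) / step))
```

As the weight c grows, the continuous minimizer is pushed onto one of the discrete tap positions, which mirrors the convergence of the continuous subproblems to a solution of the mixed-discrete problem.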

Relevance: 30.00%

Abstract:

Solution of structural reliability problems by the first-order method requires optimization algorithms to find the smallest distance between a limit state function and the origin of standard Gaussian space. The Hasofer-Lind-Rackwitz-Fiessler (HLRF) algorithm, developed specifically for this purpose, has been shown to be efficient but not robust, as it fails to converge for a significant number of problems. On the other hand, recent developments in general (augmented Lagrangian) optimization techniques have not been tested in application to structural reliability problems. In the present article, three new optimization algorithms for structural reliability analysis are presented. One algorithm is based on HLRF, but uses a new differentiable merit function with Wolfe conditions to select the step length in the line search. It is shown in the article that, under certain assumptions, the proposed algorithm generates a sequence that converges to the local minimizer of the problem. Two new augmented Lagrangian methods are also presented, which use quadratic penalties to solve nonlinear problems with equality constraints. The performance and robustness of the new algorithms are compared to the classical augmented Lagrangian method, to HLRF and to the improved HLRF (iHLRF) algorithm in the solution of 25 benchmark problems from the literature. The new HLRF-based algorithm is shown to be more robust than HLRF or iHLRF, and as efficient as the iHLRF algorithm. The two augmented Lagrangian methods proposed herein are shown to be more robust and more efficient than the classical augmented Lagrangian method.
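
The basic HLRF recursion that the article builds on, finding the point on the limit state surface g(u) = 0 closest to the origin in standard Gaussian space, is short enough to sketch. The example limit state function is a made-up two-variable case, and no merit function or line search (the article's actual contribution) is included.

```python
import numpy as np

def hlrf(g, grad_g, u0, tol=1e-8, max_iter=100):
    """Basic Hasofer-Lind-Rackwitz-Fiessler iteration (no line search).

    Finds the design point u* minimizing ||u|| subject to g(u) = 0 in
    standard Gaussian space; the reliability index is beta = ||u*||.
    """
    u = np.asarray(u0, dtype=float)
    for _ in range(max_iter):
        gu, dg = g(u), grad_g(u)
        u_new = dg * (dg @ u - gu) / (dg @ dg)   # classic HLRF update
        if np.linalg.norm(u_new - u) < tol:
            return u_new, np.linalg.norm(u_new)
        u = u_new
    return u, np.linalg.norm(u)

# Hypothetical limit state: g(u) = u1 + 2*u2 - 3 (linear, so one step suffices).
g = lambda u: u[0] + 2.0 * u[1] - 3.0
grad_g = lambda u: np.array([1.0, 2.0])

u_star, beta = hlrf(g, grad_g, np.zeros(2))
print("design point:", u_star, "beta:", beta)   # beta = 3/sqrt(5) here
```

For nonlinear limit states this plain recursion can oscillate or diverge, which is precisely the lack of robustness the article addresses with a merit function and Wolfe-condition step lengths.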

Relevance: 30.00%

Abstract:

The asymptotic expansion of the distribution of the gradient test statistic is derived for a composite hypothesis under a sequence of Pitman alternative hypotheses converging to the null hypothesis at rate n^(-1/2), n being the sample size. Comparisons of the local powers of the gradient, likelihood ratio, Wald and score tests reveal no uniform superiority property. The power performance of all four criteria in the one-parameter exponential family is examined.
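
For reference, the gradient statistic considered here is the one introduced by Terrell; for a composite null hypothesis with restricted maximum likelihood estimate \(\tilde\theta\) and unrestricted estimate \(\hat\theta\) it takes the form (standard definition, not specific to this paper)

\[
S_T = U(\tilde\theta)^{\top}\,(\hat\theta - \tilde\theta),
\]

where \(U(\cdot)\) is the score function. Like the likelihood ratio, Wald and score statistics, it is asymptotically chi-squared under the null hypothesis, and the four statistics differ only in higher-order terms, which is what the local power comparison under Pitman alternatives exploits.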

Relevance: 30.00%

Abstract:

We derive asymptotic expansions for the nonnull distribution functions of the likelihood ratio, Wald, score and gradient test statistics in the class of dispersion models, under a sequence of Pitman alternatives. The asymptotic distributions of these statistics are obtained for testing a subset of regression parameters and for testing the precision parameter. Based on these nonnull asymptotic expansions, the powers of all four tests, which are equivalent to first order, are compared. Furthermore, in order to compare the finite-sample performance of these tests in this class of models, Monte Carlo simulations are presented. An empirical application to a real data set is considered for illustrative purposes. (C) 2012 Elsevier B.V. All rights reserved.
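
A minimal Monte Carlo comparison of the four statistics, in the spirit of the simulations described, can be sketched for a one-parameter exponential model under a Pitman-type alternative θ = θ0 + δ/√n. The model, sample size and δ are illustrative choices, not the dispersion-model setup of the paper; all four statistics have closed forms in this toy case.

```python
import numpy as np
from scipy.stats import chi2

# Empirical power of LR, Wald, score and gradient tests for H0: theta = 1 in an
# Exponential(rate=theta) model, under the Pitman alternative theta0 + delta/sqrt(n).
rng = np.random.default_rng(0)
n, delta, reps = 50, 1.5, 20_000
theta0 = 1.0
theta1 = theta0 + delta / np.sqrt(n)
crit = chi2.ppf(0.95, df=1)

x = rng.exponential(scale=1.0 / theta1, size=(reps, n))
s = x.sum(axis=1)                 # sufficient statistic sum(x)
theta_hat = n / s                 # MLE of the rate

lr    = 2.0 * (n * np.log(theta_hat / theta0) - (theta_hat - theta0) * s)
wald  = n * (theta_hat - theta0) ** 2 / theta_hat ** 2
u0    = n / theta0 - s            # score evaluated at theta0
score = u0 ** 2 * theta0 ** 2 / n
grad  = u0 * (theta_hat - theta0)

for name, stat in [("LR", lr), ("Wald", wald), ("score", score), ("gradient", grad)]:
    print(f"{name:8s} empirical power: {np.mean(stat > crit):.3f}")
```

With first-order equivalent statistics the rejection rates come out close to each other, and the small differences are exactly what the higher-order nonnull expansions quantify analytically.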

Relevance: 30.00%

Abstract:

Abstract Background A large number of probabilistic models used in sequence analysis assign non-zero probability values to most input sequences. The most common way to decide when a given probability is sufficient is Bayesian binary classification, in which the probability of the model characterizing the sequence family of interest is compared to that of an alternative probability model. A null model can be used as the alternative model. This is the scoring technique used by sequence analysis tools such as HMMER, SAM and INFERNAL. The most prevalent null models are position-independent residue distributions, including the uniform distribution, the genomic distribution, the family-specific distribution and the target sequence distribution. This paper presents a study evaluating the impact of the choice of null model on the final result of classifications. In particular, we are interested in minimizing the number of false predictions in a classification, a crucial issue for reducing the cost of biological validation. Results For all the tests, the target null model presented the lowest number of false positives when random sequences were used as a test. The study was performed on DNA sequences using GC content as the measure of compositional bias, but the results should also be valid for protein sequences. To broaden the applicability of the results, the study was performed using randomly generated sequences. Previous studies were performed on amino acid sequences, used only one probabilistic model (HMM) and a specific benchmark, and lacked more general conclusions about the performance of null models. Finally, a benchmark test with P. falciparum confirmed these results. Conclusions Of the evaluated models, the best suited for classification are the uniform model and the target model. However, the use of the uniform model presents a GC bias that can cause more false positives for candidate sequences with extreme compositional bias, a characteristic not described in previous studies. In these cases the target model is more dependable for biological validation due to its higher specificity.
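
The scoring scheme under discussion, a log-odds score of the target model against a position-independent null model, can be sketched for a toy position-specific DNA model. The model probabilities, the add-one smoothing and the two null choices (uniform vs. the composition of the scored sequence, i.e. the "target" null) are illustrative, not the actual models used by HMMER, SAM or INFERNAL.

```python
import math
from collections import Counter

ALPHABET = "ACGT"

# Hypothetical 4-position target model (one residue distribution per position).
target_model = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]

def log_odds(seq, null):
    """Sum over positions of log2 P_model(residue) / P_null(residue)."""
    return sum(math.log2(target_model[i][r] / null[r]) for i, r in enumerate(seq))

def uniform_null():
    return {r: 1.0 / len(ALPHABET) for r in ALPHABET}

def sequence_composition_null(seq):
    # "Target" null: the composition of the scored sequence, add-one smoothed.
    counts = Counter(seq)
    return {r: (counts[r] + 1) / (len(seq) + len(ALPHABET)) for r in ALPHABET}

seq = "ACGG"                       # GC-rich candidate
print("uniform null :", round(log_odds(seq, uniform_null()), 3))
print("target null  :", round(log_odds(seq, sequence_composition_null(seq)), 3))
```

In this toy example the GC-rich candidate scores noticeably higher against the uniform null than against its own composition, which illustrates the kind of compositional bias the abstract attributes to the uniform model.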

Relevance: 30.00%

Abstract:

We investigate the nature of extremely red galaxies (ERGs), objects whose colours are redder than those found in the red sequence present in colour–magnitude diagrams of galaxies. We selected from the Sloan Digital Sky Survey Data Release 7 a volume-limited sample of such galaxies in the redshift interval 0.010 < z < 0.030, brighter than Mr = −17.8 (magnitudes dereddened, corrected for the Milky Way extinction) and with (g − r) colours larger than those of galaxies in the red sequence. This sample contains 416 ERGs, which were classified visually. Our classification was cross-checked with other classifications available in the literature. We found from our visual classification that the majority of objects in our sample are edge-on spirals (73 per cent). Other spirals correspond to 13 per cent, whereas elliptical galaxies comprise only 11 per cent of the objects. After comparing the morphological mix and the distributions of Hα/Hβ and axial ratios of ERGs and objects in the red sequence, we suggest that dust, more than stellar population effects, is the driver of the red colours found in these extremely red galaxies.
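
The sample selection described above (a volume-limited cut in redshift and absolute magnitude plus a colour cut redward of the red sequence) amounts to a simple catalogue filter. The table and column names below are hypothetical, as is the linear red-sequence ridge line used for the colour threshold; the paper's actual cut is defined relative to the observed red sequence.

```python
import numpy as np
import pandas as pd

# Hypothetical catalogue with dereddened magnitudes; column names are made up.
rng = np.random.default_rng(1)
cat = pd.DataFrame({
    "z":   rng.uniform(0.005, 0.035, 1000),
    "M_r": rng.uniform(-22.0, -16.0, 1000),
    "g_r": rng.normal(0.7, 0.15, 1000),
})

# Volume-limited sample: 0.010 < z < 0.030 and brighter than M_r = -17.8.
vol = cat[(cat.z > 0.010) & (cat.z < 0.030) & (cat.M_r < -17.8)]

# Hypothetical red-sequence ridge line (g - r) = a + b * M_r; galaxies redder
# than the ridge by more than `offset` are flagged as ERG candidates.
a, b, offset = 0.60, -0.012, 0.15
red_sequence = a + b * vol.M_r
ergs = vol[vol.g_r > red_sequence + offset]
print(len(ergs), "candidate ERGs out of", len(vol))
```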

Relevance: 30.00%

Abstract:

Cellular rheology has recently undergone rapid development, with particular attention to the mechanical properties of the cytoskeleton and its main components - actin filaments, intermediate filaments, microtubules and crosslinked proteins. However, it is not clear which cellular structural changes directly affect the cell's mechanical properties. Thus, in this work we aimed to quantify the structural rearrangement of these fibers that may manifest as changes in cell mechanics. We created an image analysis platform to study smooth muscle cells from different arteries: aorta, mammary, renal, carotid and coronary, processing respectively 31, 29, 31, 30 and 35 cell images obtained by confocal microscopy. The platform was developed in Matlab (MathWorks) and uses the Sobel operator to determine the actin fiber orientation in images of cells labeled with phalloidin. The Sobel operator is used as a filter capable of calculating the pixel brightness gradient, point by point, in the image. The operator uses vertical and horizontal convolution kernels to calculate the magnitude and angle of the pixel intensity gradient. The image analysis follows this sequence: (1) open a given set of cell images to be processed; (2) set a threshold, based on Otsu's method, to eliminate noise; (3) detect the fiber edges in the image using the Sobel operator; and (4) quantify the actin fiber orientation. Our first result is the probability distribution Π(Δθ) of finding a given fiber angle deviation (Δθ) from the main cell fiber orientation θ0. Π(Δθ) follows an exponential decay, Π(Δθ) = A exp(-αΔθ), with respect to θ0. We defined and determined a misalignment index α of the fibers for each kind of artery: coronary αCo = (1.72 ± 0.36) rad^-1; renal αRe = (1.43 ± 0.64) rad^-1; aorta αAo = (1.42 ± 0.43) rad^-1; mammary αMa = (1.12 ± 0.50) rad^-1; and carotid αCa = (1.01 ± 0.39) rad^-1. Among all analyzed cells, the α values of the coronary and carotid cells are statistically different (p < 0.05). We discuss our results by correlating the misalignment index with the experimental cell mechanical properties obtained using Optical Magnetic Twisting Cytometry on the same group of cells.
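
The orientation analysis in steps (1)-(4), Sobel gradients, an Otsu threshold and an angle-deviation histogram fitted with an exponential to obtain the misalignment index α, can be sketched in Python (the original platform was written in Matlab). The synthetic striped image, the percentile used to keep strong edges and the median-based dominant orientation are illustrative simplifications, not the authors' exact pipeline.

```python
import numpy as np
from scipy import ndimage
from skimage.filters import threshold_otsu

# Synthetic image of parallel bright stripes stands in for a confocal image of
# phalloidin-labelled actin fibres.
rng = np.random.default_rng(0)
y, x = np.mgrid[0:256, 0:256]
img = np.sin((x * np.cos(np.deg2rad(30)) + y * np.sin(np.deg2rad(30))) / 4.0) ** 2
img += 0.1 * rng.standard_normal(img.shape)

# (2) threshold based on Otsu's method, (3) Sobel gradients, (4) orientations.
mask = img > threshold_otsu(img)
gx = ndimage.sobel(img, axis=1)
gy = ndimage.sobel(img, axis=0)
mag = np.hypot(gx, gy)
theta = np.arctan2(gy, gx)                     # gradient angle per pixel
valid = mask & (mag > np.percentile(mag, 75))  # keep strong, in-fibre edges

# Fibre orientation is perpendicular to the gradient; fold into [0, pi).
fiber = (theta[valid] + np.pi / 2.0) % np.pi
theta0 = np.median(fiber)                      # dominant orientation (simplified)
dtheta = np.abs(fiber - theta0)                # wrap-around ignored for brevity

# Fit Pi(dtheta) ~ A * exp(-alpha * dtheta) to the angle-deviation histogram.
hist, edges = np.histogram(dtheta, bins=30, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
keep = hist > 0
slope, logA = np.polyfit(centers[keep], np.log(hist[keep]), 1)
alpha = -slope
print("misalignment index alpha ≈", round(alpha, 2), "rad^-1")
```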

Relevance: 30.00%

Abstract:

Hierarchical multi-label classification is a complex classification task in which the classes involved are hierarchically structured and each example may simultaneously belong to more than one class at each level of the hierarchy. In this paper, we extend our previous works, in which we investigated a new local-based classification method that incrementally trains a multi-layer perceptron for each level of the classification hierarchy. Predictions made by a neural network at a given level are used as inputs to the neural network responsible for the predictions at the next level. We compare the proposed method with one state-of-the-art decision-tree induction method and two decision-tree induction methods, using several hierarchical multi-label classification datasets. We perform a thorough experimental analysis, showing that our method obtains results competitive with a robust global method with respect to both the precision and recall evaluation measures.
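
The incremental, per-level scheme described, training one multi-layer perceptron per hierarchy level and feeding its predictions to the next level's network, can be sketched with scikit-learn. The two-level toy hierarchy, the data and the network sizes are invented for illustration and do not reproduce the paper's architecture or datasets.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Local, per-level hierarchical multi-label classifier: the MLP of level l
# receives the original features plus the predicted probabilities of level l-1.
# Toy data: 4 features, 2 labels at level 1 and 3 labels at level 2.
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 4))
Y1 = (X[:, :2] > 0).astype(int)                              # level-1 labels
Y2 = np.column_stack([Y1[:, 0], Y1[:, 1], Y1.min(axis=1)])   # level-2 labels

levels = [Y1, Y2]
models, inputs = [], X
for Y in levels:                                   # train one MLP per level
    clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
    clf.fit(inputs, Y)
    models.append(clf)
    inputs = np.hstack([X, clf.predict_proba(inputs)])   # feed predictions down

# Prediction follows the same chain, level by level.
x_new, preds, feats = X[:5], [], X[:5]
for clf in models:
    preds.append(clf.predict(feats))
    feats = np.hstack([x_new, clf.predict_proba(feats)])
print(preds[0], preds[1], sep="\n")
```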

Relevance: 30.00%

Abstract:

Since the first underground nuclear explosion, carried out in 1958, the analysis of seismic signals generated by these sources has allowed seismologists to refine the travel times of seismic waves through the Earth and to verify the accuracy of location algorithms (the ground truth for these sources was often known). Long international negotiations have been devoted to limiting the proliferation and testing of nuclear weapons. In particular, the Comprehensive Nuclear-Test-Ban Treaty (CTBT) was opened for signature in 1996; although it has been signed by 178 States, it has not yet entered into force. The Treaty underlines the fundamental role of seismological observations in verifying compliance, by detecting and locating seismic events and identifying the nature of their sources. A precise determination of the hypocentral parameters represents the first step in discriminating whether a given seismic event is natural or not. If a specific event is considered suspicious by the majority of the States Parties, the Treaty contains provisions for conducting an on-site inspection (OSI) in the area surrounding the epicenter of the event, located through the International Monitoring System (IMS) of the CTBT Organization. An OSI is supposed to include the use of passive seismic techniques in the area of the suspected clandestine underground nuclear test. In fact, high-quality seismological systems are thought to be capable of detecting and locating very weak aftershocks triggered by underground nuclear explosions in the first days or weeks following the test. This PhD thesis deals with the development of two different seismic location techniques. The first, known as the double-difference joint hypocenter determination (DDJHD) technique, is aimed at locating closely spaced events at a global scale. The locations obtained by this method are characterized by high relative accuracy, although the absolute location of the whole cluster remains uncertain; we eliminate this problem by introducing a priori information, namely the known location of a selected event. The second technique concerns reliable estimation of the back azimuth and apparent velocity of seismic waves from local events of very low magnitude recorded by a tripartite array at a very local scale. For both techniques, we have used cross-correlation among digital waveforms in order to minimize the errors linked with incorrect phase picking. The cross-correlation method relies on the similarity between the waveforms of a pair of events at the same station, at the global scale, and on the similarity between the waveforms of the same event at two different sensors of the tripartite array, at the local scale. After preliminary tests of the reliability of our location techniques based on simulations, we applied both methodologies to real seismic events. The DDJHD technique was applied to a seismic sequence that occurred in the Turkey-Iran border region, using data recorded by the IMS. Initially, the algorithm was applied to the differences among the original arrival times of the P phases, without cross-correlation. We found that the significant geometrical spreading noticeable in the standard locations (namely the locations produced by the analysts of the International Data Center (IDC) of the CTBT Organization, taken as our reference) was considerably reduced by the application of our technique.
This is what we expected, since the methodology was applied to a sequence of events whose hypocenters can be assumed to be genuinely close, belonging to the same seismic structure. Our results point out the main advantage of this methodology: the systematic errors affecting the arrival times have been removed, or at least reduced. The introduction of cross-correlation did not bring evident improvements to our results: the two sets of locations (without and with the application of the cross-correlation technique) are very similar to each other. This suggests that the use of cross-correlation did not substantially improve the precision of the manual pickings; probably the pickings reported by the IDC are good enough to make the random picking error less important than the systematic error on travel times. As a further explanation for the poor quality of the cross-correlation results, it should be noted that the events included in our data set generally do not have a good signal-to-noise ratio (SNR): the selected sequence is composed of weak events (magnitude 4 or smaller) and the signals are strongly attenuated because of the large distance between the stations and the hypocentral area. At the local scale, in addition to the cross-correlation, we performed a signal interpolation in order to improve the time resolution. The algorithm so developed was applied to data collected during an experiment carried out in Israel between 1998 and 1999. The results point to the following relevant conclusions: a) it is necessary to correlate waveform segments corresponding to the same seismic phases; b) it is not essential to select the exact first arrivals; and c) relevant information can also be obtained from the maximum-amplitude wavelet of the waveforms (particularly under poor SNR conditions). Another remarkable point of our procedure is that it does not require much time to process the data, so the user can check the results immediately. During a field survey, this feature makes a quasi real-time check possible, allowing immediate optimization of the array geometry if so suggested by the results at an early stage.
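
The core measurement used at both scales, the time lag that maximizes the cross-correlation between two digitized waveforms, optionally refined below one sample by interpolation, can be sketched as follows. The synthetic signals and the three-point parabolic sub-sample refinement are illustrative choices, not the thesis' exact processing chain.

```python
import numpy as np

def cc_lag(a, b, dt):
    """Time shift of a relative to b that maximizes their cross-correlation,
    with parabolic interpolation around the peak for sub-sample precision."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    cc = np.correlate(a, b, mode="full")
    k = int(np.argmax(cc))
    lag = k - (len(b) - 1)                      # integer-sample lag
    if 0 < k < len(cc) - 1:                     # parabolic (3-point) refinement
        y0, y1, y2 = cc[k - 1], cc[k], cc[k + 1]
        lag += 0.5 * (y0 - y2) / (y0 - 2.0 * y1 + y2)
    return lag * dt

# Synthetic test: the same wavelet recorded with a 0.023 s delay plus noise.
dt = 0.01                                       # 100 Hz sampling
t = np.arange(0, 4, dt)
wavelet = lambda t0: np.exp(-((t - t0) / 0.1) ** 2) * np.sin(2 * np.pi * 5 * (t - t0))
rng = np.random.default_rng(1)
s1 = wavelet(1.500) + 0.05 * rng.standard_normal(t.size)
s2 = wavelet(1.523) + 0.05 * rng.standard_normal(t.size)
print("estimated delay (s):", round(cc_lag(s2, s1), 4))   # expected ≈ 0.023
```

The same measurement serves both uses described above: differential arrival times between event pairs at a common station (global scale) and delays of one event across the sensors of the tripartite array (local scale), where the sub-sample refinement plays the role of the signal interpolation mentioned in the text.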

Relevance: 30.00%

Abstract:

[EN] This work presents a novel approach to solving a two-dimensional problem using an adaptive finite element approach. The most common strategy for dealing with nested adaptivity is to generate a mesh that correctly represents the geometry and the input parameters, and to refine this mesh locally to obtain the most accurate solution. In contrast to this approach, the authors propose a technique using independent meshes: one for the geometry, one for the input data and one for the unknowns. Each particular mesh is obtained by a local nested refinement of the same coarse mesh in the parametric space…

Relevance: 30.00%

Abstract:

[EN] This paper presents a finite element method for pollutant transport with several pollutant sources. An Eulerian convection–diffusion–reaction model is used to simulate the pollutant dispersion. The discretization of the different sources allows the emissions to be imposed as boundary conditions. The Eulerian description can deal with the coupling of several plumes. An adaptive stabilized finite element formulation, specifically Least-Squares, with Crank-Nicolson temporal integration is proposed to solve the problem. A splitting scheme is used to treat the transport and the reaction separately. A mass-consistent model is used to compute the wind field of the problem.
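
A minimal one-dimensional analogue of the scheme outlined, Crank-Nicolson time integration for the convection-diffusion (transport) part and a separate step for the reaction via operator splitting, can be sketched as follows. The 1D grid, the coefficients and the linear decay reaction are illustrative and unrelated to the paper's stabilized Least-Squares formulation or its mass-consistent wind model.

```python
import numpy as np

# 1D convection-diffusion-reaction, split into a Crank-Nicolson transport step
# and an exact linear-decay reaction step.  All coefficients are placeholders.
nx, L = 101, 1.0
dx = L / (nx - 1)
u, D, k = 0.5, 5e-3, 0.2          # wind speed, diffusivity, decay rate
dt, nt = 0.002, 500

x = np.linspace(0.0, L, nx)
c = np.exp(-((x - 0.2) / 0.05) ** 2)      # initial pollutant plume

# Transport operator A c = -u dc/dx + D d2c/dx2 (central differences,
# homogeneous Dirichlet boundaries).
A = np.zeros((nx, nx))
for i in range(1, nx - 1):
    A[i, i - 1] = u / (2 * dx) + D / dx**2
    A[i, i]     = -2 * D / dx**2
    A[i, i + 1] = -u / (2 * dx) + D / dx**2

I = np.eye(nx)
M_left, M_right = I - 0.5 * dt * A, I + 0.5 * dt * A   # Crank-Nicolson matrices

for _ in range(nt):
    c = np.linalg.solve(M_left, M_right @ c)   # transport half of the split
    c[1:-1] *= np.exp(-k * dt)                 # reaction half (exact decay)
    c[0] = c[-1] = 0.0

print("peak concentration:", round(c.max(), 3), "at x ≈", round(x[np.argmax(c)], 2))
```

Splitting lets the stiff or nonlinear reaction term be handled independently of the transport discretization, which is the design choice the abstract describes.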

Relevance: 30.00%

Abstract:

The vast majority of known proteins have not yet been experimentally characterized and little is known about their function. The design and implementation of computational tools can provide insight into the function of proteins based on their sequence, their structure, their evolutionary history and their association with other proteins. Knowledge of the three-dimensional (3D) structure of a protein can lead to a deep understanding of its mode of action and interaction, but currently the structures of less than 1% of sequences have been experimentally solved. For this reason, it has become urgent to develop new methods able to computationally extract relevant information from protein sequence and structure. The starting point of my work has been the study of the properties of contacts between protein residues, since they constrain protein folding and characterize different protein structures. Prediction of residue contacts in proteins is an interesting problem whose solution may be useful in protein fold recognition and de novo design. The prediction of these contacts requires the study of the protein inter-residue distances related to the specific type of amino acid pair, which are encoded in the so-called contact map. An interesting new way of analyzing those structures emerged when network studies were introduced, with pivotal papers demonstrating that protein contact networks also exhibit small-world behavior. In order to highlight constraints for the prediction of protein contact maps, and for applications in the field of protein structure prediction and/or reconstruction from experimentally determined contact maps, I studied to what extent the characteristic path length and clustering coefficient of the protein contact network reveal characteristic features of protein contact maps. Provided that residue contacts are known for a protein sequence, the major features of its 3D structure can be deduced by combining this knowledge with correctly predicted motifs of secondary structure. In the second part of my work I focused on a particular protein structural motif, the coiled-coil, known to mediate a variety of fundamental biological interactions. Coiled-coils are found in a variety of structural forms and in a wide range of proteins including, for example, small units such as leucine zippers, which drive the dimerization of many transcription factors, and more complex structures such as the family of viral proteins responsible for virus-host membrane fusion. The coiled-coil structural motif is estimated to account for 5-10% of the protein sequences in the various genomes. Given their biological importance, I introduced a Hidden Markov Model (HMM) that exploits the evolutionary information derived from multiple sequence alignments to predict coiled-coil regions and to discriminate coiled-coil sequences. The results indicate that the new HMM outperforms all the existing programs and can be adopted for coiled-coil prediction and for large-scale genome annotation. Genome annotation is a key issue in modern computational biology, being the starting point towards the understanding of the complex processes involved in biological networks. The rapid growth in the number of available protein sequences and structures poses new fundamental problems that still await an interpretation. Nevertheless, these data are at the basis of the design of new strategies for tackling problems such as the prediction of protein structure and function.
Experimental determination of the functions of all these proteins would be a hugely time-consuming and costly task and, in most instances, has not been carried out. For example, currently only approximately 20% of the annotated proteins in the Homo sapiens genome have been experimentally characterized. A commonly adopted procedure for annotating protein sequences relies on "inheritance through homology", based on the notion that similar sequences share similar functions and structures. This procedure consists of assigning sequences to a specific group of functionally related sequences, which has been formed through clustering techniques. The clustering procedure is based on suitable similarity rules, since predicting protein structure and function from sequence largely depends on the value of sequence identity. However, additional levels of complexity are due to multi-domain proteins, to proteins that share common domains but do not necessarily share the same function, and to the finding that different combinations of shared domains can lead to different biological roles. In the last part of this study I developed and validated a system that contributes to sequence annotation by taking advantage of a validated procedure for transferring, through inheritance, molecular functions and structural templates. After a cross-genome comparison with the BLAST program, clusters were built on the basis of two stringent constraints on sequence identity and coverage of the alignment. The adopted measure explicitly addresses the problem of multi-domain protein annotation and allows a fine-grained division of the whole set of proteomes used, which ensures cluster homogeneity in terms of sequence length. A high level of coverage of the structure templates over the length of the protein sequences within clusters ensures that multi-domain proteins, when present, can serve as templates for sequences of similar length. This annotation procedure includes the possibility of reliably transferring statistically validated functions and structures to sequences, considering the information available in the current databases of molecular functions and structures.
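
The first part of the work, building a residue contact map from 3D coordinates and measuring the characteristic path length and clustering coefficient of the resulting contact network, can be sketched as follows. The random "coordinates", the 8 Å cutoff and the minimum sequence separation are placeholders (real input would be, for example, Cα coordinates parsed from a PDB file).

```python
import numpy as np
import networkx as nx

# Build a residue contact map (contact = pair of residues closer than a cutoff,
# excluding near neighbours along the chain) and compute the small-world
# descriptors used in the text.
rng = np.random.default_rng(0)
n_res, cutoff, min_sep = 120, 8.0, 3          # cutoff in angstrom (placeholder)
# Placeholder "structure": a noisy 3D random walk standing in for CA coordinates.
coords = np.cumsum(rng.normal(scale=2.5, size=(n_res, 3)), axis=0)

dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
seq_sep = np.abs(np.arange(n_res)[:, None] - np.arange(n_res)[None, :])
contact_map = (dists < cutoff) & (seq_sep >= min_sep)

G = nx.from_numpy_array(contact_map.astype(int))
if nx.is_connected(G):
    L = nx.average_shortest_path_length(G)    # characteristic path length
else:                                          # fall back to the largest component
    comp = G.subgraph(max(nx.connected_components(G), key=len))
    L = nx.average_shortest_path_length(comp)
C = nx.average_clustering(G)                  # clustering coefficient

print(f"contacts: {int(contact_map.sum() // 2)}, path length L = {L:.2f}, clustering C = {C:.2f}")
```

Comparing L and C with those of random and regular graphs of the same size and degree is the usual way to quantify the small-world behavior mentioned above.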