31 results for "local sequence alignment problem"
Abstract:
This article describes a real-world production planning and scheduling problem occurring at an integrated pulp and paper (P&P) mill which manufactures paper for cardboard out of produced pulp. During the cooking of wood chips in the digester, two by-products are produced: the pulp itself (virgin fibers) and a waste stream known as black liquor. The former is mixed with recycled fibers and processed in a paper machine. Here, due to significant sequence-dependent setups in paper-type changeovers, lot sizing and sequencing have to be carried out simultaneously in order to use capacity efficiently. The latter is converted into electrical energy using a set of evaporators, recovery boilers and counter-pressure turbines. The planning challenge is then to synchronize the material flow as it moves through the pulp mill, paper mill and energy plant, maximizing satisfied customer demand (backlogging is allowed) and minimizing operating costs. Because P&P production is capital intensive, the output of the digester must be maximized. As the production bottleneck is not fixed, to tackle this problem we propose a new model that, for the first time, integrates the critical production units of the pulp mill, paper mill and energy plant. Simple stochastic local search heuristics based on mixed integer programming are developed to obtain good feasible solutions for the problem. The benefits of integrating the three stages are discussed. The proposed approaches are tested on real-world data. Our work may help P&P companies increase their competitiveness and responsiveness in dealing with oscillations in demand patterns. (C) 2012 Elsevier Ltd. All rights reserved.
Abstract:
Background: Decreasing costs of DNA sequencing have made prokaryotic draft genome sequences increasingly common. A contig scaffold is an ordering of contigs, each placed in the correct orientation. A scaffold can help genome comparisons and guide gap-closure efforts. One popular technique for obtaining contig scaffolds is to map contigs onto a reference genome. However, rearrangements that may exist between the query and reference genomes can result in incorrect scaffolds if they are not taken into account. Large-scale inversions are common rearrangement events in prokaryotic genomes, and even in draft genomes it is possible to detect their presence given sufficient sequencing coverage and a sufficiently close reference genome. Results: We present a linear-time algorithm that can generate a set of contig scaffolds for a draft genome sequence, represented as contigs, given a reference genome. The algorithm is aimed at prokaryotic genomes and relies on the presence of matching sequence patterns between the query and reference genomes that can be interpreted as the result of large-scale inversions; we call these patterns inversion signatures. Our algorithm is capable of correctly generating a scaffold if at least one member of every inversion signature pair is present in the contigs and no inversion signatures have been overwritten in evolution. The algorithm is also capable of generating scaffolds in the presence of any kind of inversion, although in this general case there is no guarantee that all scaffolds in the scaffold set will be correct. We compare the performance of SIS, the program that implements the algorithm, to that of seven other scaffold-generating programs. The results of our tests show that SIS has better overall performance. Conclusions: SIS is a new, easy-to-use tool to generate contig scaffolds, available both stand-alone and as a web server.
The good performance of SIS in our tests adds evidence that large-scale inversions are widespread in prokaryotic genomes.
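The core ordering-and-orientation step that any reference-based scaffolder performs can be sketched in a few lines. The toy Python sketch below is not the SIS algorithm itself (which additionally reasons about inversion signatures, and the function names here are our own): it simply places each contig on a reference in forward or reverse-complement orientation and orders the placements by coordinate.

```python
def revcomp(s):
    """Reverse complement of a DNA string."""
    comp = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(comp[b] for b in reversed(s))

def place_contig(contig, reference):
    """Return (position, strand) of an exact hit of the contig on the
    reference, trying both orientations; None if it does not occur."""
    pos = reference.find(contig)
    if pos != -1:
        return pos, "+"
    pos = reference.find(revcomp(contig))
    if pos != -1:
        return pos, "-"
    return None

def toy_scaffold(contigs, reference):
    """Order and orient the placeable contigs by reference coordinate."""
    placed = []
    for contig in contigs:
        hit = place_contig(contig, reference)
        if hit is not None:
            placed.append((hit[0], hit[1], contig))
    placed.sort()
    return [(contig, strand) for _, strand, contig in placed]
```

A rearrangement between query and reference is exactly what breaks this naive mapping, which is why SIS inspects the boundary patterns (inversion signatures) before trusting the ordering.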
Abstract:
In this paper we obtain asymptotic expansions, up to order n^(-1/2) and under a sequence of Pitman alternatives, for the nonnull distribution functions of the likelihood ratio, Wald, score and gradient test statistics in the class of symmetric linear regression models. This is a wide class of models which encompasses the t model and several other symmetric distributions with longer-than-normal tails. The asymptotic distributions of all four statistics are obtained for testing a subset of regression parameters. Furthermore, in order to compare the finite-sample performance of these tests in this class of models, Monte Carlo simulations are presented. An empirical application to a real data set is considered for illustrative purposes. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
The integrated production scheduling and lot-sizing problem in a flow shop environment consists of establishing production lot sizes and allocating machines to process them within a planning horizon in a production line with machines arranged in series. The problem considers that demands must be met without backlogging, the capacity of the machines must be respected, and machine setups are sequence-dependent and preserved between periods of the planning horizon. The objective is to determine a production schedule to minimise the setup, production and inventory costs. A mathematical model from the literature is presented, as well as procedures for obtaining feasible solutions. However, some of the procedures have difficulty in obtaining feasible solutions for large-sized problem instances. In addition, we address the problem using different versions of the Asynchronous Team (A-Team) approach. The procedures were compared with literature heuristics based on Mixed Integer Programming. The proposed A-Team procedures outperformed the literature heuristics, especially for large instances. The developed methodologies and the results obtained are presented.
Abstract:
Aims. Our goal is to study the circumstellar environment associated with each component of the wide, intermediate-mass pre-main-sequence binary system PDS 144 using broadband polarimetry. Methods. We present near-infrared (NIR) linear polarimetric observations of PDS 144 gathered with the IAGPOL imaging polarimeter along with the CamIV infrared camera at the Observatorio do Pico dos Dias (OPD). In addition, we re-analyzed OPD archival optical polarization data to separate the binary components and estimate the interstellar polarization using foreground stars. Results. After subtracting the interstellar component, we found that both stars of the binary system are intrinsically polarized. The polarization vectors of both components at optical and NIR bands are aligned with the local magnetic field and the jet axis. These findings indicate an interplay between the interstellar magnetic field and the formation of the binary system. We also found that PDS 144N is less polarized than its southern companion in the optical, whereas in the NIR PDS 144N is more polarized. Our polarization data can only be explained by high inclinations (i ≳ 80°) for the disks of both members. In particular, comparisons of our NIR data with disk models of young stellar objects suggest predominantly small grains in the circumstellar environment of PDS 144N. In spite of the different grain types in each component, the infrared spectral indexes indicate a coeval system. We also found evidence of coplanarity between the disks.
Abstract:
The main feature of partition of unity methods such as the generalized or extended finite element method is their ability to utilize a priori knowledge about the solution of a problem in the form of enrichment functions. However, analytical derivation of enrichment functions with good approximation properties is mostly limited to two-dimensional linear problems. This paper presents a procedure to numerically generate proper enrichment functions for three-dimensional problems with confined plasticity where plastic evolution is gradual. The procedure involves the solution of boundary value problems around local regions exhibiting nonlinear behavior and the enrichment of the global solution space with the local solutions through the partition of unity framework. This approach can produce accurate nonlinear solutions at a reduced computational cost compared to standard finite element methods, since computationally intensive nonlinear iterations can be performed on coarse global meshes after the creation of enrichment functions that properly describe the localized nonlinear behavior. Several three-dimensional nonlinear problems based on rate-independent J2 plasticity theory with isotropic hardening are solved using the proposed procedure to demonstrate its robustness, accuracy and computational efficiency.
Abstract:
Using recent results on the compactness of the space of solutions of the Yamabe problem, we show that in conformal classes of metrics near the class of a nondegenerate solution which is unique (up to scaling) the Yamabe problem has a unique solution as well. This provides examples of a local extension, in the space of conformal classes, of a well-known uniqueness criterion due to Obata.
Abstract:
Context. Lithium abundances in open clusters are a very effective probe of mixing processes, and their study can help us to understand the large depletion of lithium that occurs in the Sun. Owing to its age and metallicity, the open cluster M 67 is especially interesting in this respect. Many studies of lithium abundances in M 67 have been performed, but a homogeneous global analysis of lithium in stars from subsolar masses up to the most massive members has yet to be accomplished for a large sample based on high-quality spectra. Aims. We test our non-standard models, which were calibrated using the Sun, against observational data. Methods. We collect literature data to analyze, for the first time in a homogeneous way, the non-local thermodynamic equilibrium lithium abundances of all observed single stars in M 67 more massive than ~0.9 M☉. Our grid of evolutionary models is computed assuming non-standard mixing at metallicity [Fe/H] = 0.01, using the Toulouse-Geneva evolution code. Our analysis starts from the entrance onto the zero-age main sequence. Results. Lithium in M 67 is a tight function of mass for stars more massive than the Sun, apart from a few outliers. A plateau in lithium abundances is observed for turn-off stars. Both less massive (M <= 1.10 M☉) and more massive (M >= 1.28 M☉) stars are more depleted than those in the plateau. There is significant scatter in lithium abundances at any given mass M <= 1.1 M☉. Conclusions. Our models qualitatively reproduce most of the features described above, although the predicted depletion of lithium is 0.45 dex smaller than observed for masses in the plateau region, i.e. between 1.1 and 1.28 solar masses. More work is clearly needed to accurately reproduce the observations. Despite hints that chromospheric activity and rotation play a role in lithium depletion, no firm conclusion can be drawn with the presently available data.
Abstract:
The aim of solving the Optimal Power Flow problem is to determine the optimal state of an electric power transmission system, that is, the voltage magnitudes and phase angles and the tap ratios of the transformers that optimize the performance of a given system, while satisfying its physical and operating constraints. The Optimal Power Flow problem is modeled as a large-scale mixed-discrete nonlinear programming problem. This paper proposes a method for handling the discrete variables of the Optimal Power Flow problem, based on a penalty function. Due to the inclusion of the penalty function in the objective function, a sequence of nonlinear programming problems with only continuous variables is obtained, and the solutions of these problems converge to a solution of the mixed problem. The resulting nonlinear programming problems are solved by a primal-dual logarithmic-barrier method. Numerical tests using the IEEE 14-, 30-, 118- and 300-bus test systems indicate that the method is efficient. (C) 2012 Elsevier B.V. All rights reserved.
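The penalty idea can be illustrated with a minimal sketch. The paper's exact penalty function is not reproduced here; a common textbook choice is a sinusoidal term that vanishes exactly on the discrete grid, with the continuous subproblems solved for a growing penalty weight (a crude grid search stands in below for the primal-dual logarithmic-barrier solves used in the paper).

```python
import math

def discrete_penalty(x, step, weight=1.0):
    """Smooth penalty that is zero exactly when x lies on the discrete
    grid {k * step} and positive in between (an assumed, illustrative
    form -- the paper's penalty may differ)."""
    return weight * math.sin(math.pi * x / step) ** 2

def solve_with_penalty(obj, step, weights=(0.1, 1.0, 10.0)):
    """Solve a sequence of continuous problems with growing penalty
    weight; a fine grid search over [0, 2] replaces the NLP solver."""
    grid = [i * 1e-4 for i in range(20001)]
    x = grid[0]
    for w in weights:
        x = min(grid, key=lambda t: obj(t) + discrete_penalty(t, step, w))
    return x

# Example: a transformer tap ratio restricted to multiples of 0.05;
# the unconstrained optimum 1.03 is pushed to the nearest grid value.
tap = solve_with_penalty(lambda t: (t - 1.03) ** 2, step=0.05)
```

As the weight grows, any point off the grid incurs a large cost, so the continuous minimizers converge to a discrete-feasible point, which is the convergence behavior the abstract describes.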
Abstract:
Solving structural reliability problems by the first-order method requires optimization algorithms to find the smallest distance between the limit state surface and the origin of the standard Gaussian space. The Hasofer-Lind-Rackwitz-Fiessler (HLRF) algorithm, developed specifically for this purpose, has been shown to be efficient but not robust, as it fails to converge for a significant number of problems. On the other hand, recent developments in general (augmented Lagrangian) optimization techniques have not been tested in application to structural reliability problems. In the present article, three new optimization algorithms for structural reliability analysis are presented. One algorithm is based on HLRF, but uses a new differentiable merit function with Wolfe conditions to select the step length in the line search. It is shown in the article that, under certain assumptions, the proposed algorithm generates a sequence that converges to the local minimizer of the problem. Two new augmented Lagrangian methods are also presented, which use quadratic penalties to solve nonlinear problems with equality constraints. The performance and robustness of the new algorithms are compared to the classic augmented Lagrangian method, to HLRF and to the improved HLRF (iHLRF) algorithm, in the solution of 25 benchmark problems from the literature. The new proposed HLRF algorithm is shown to be more robust than HLRF or iHLRF, and as efficient as the iHLRF algorithm. The two augmented Lagrangian methods proposed herein are shown to be more robust and more efficient than the classical augmented Lagrangian method.
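The bare HLRF iteration that the article builds on can be sketched directly: each step projects the current point onto the linearization of the limit state. This plain version has no merit function or line search, which is exactly the fragility the article sets out to fix.

```python
def hlrf(g, grad_g, x0, tol=1e-8, max_iter=100):
    """Bare HLRF iteration in standard Gaussian space:
    x_{k+1} = [(grad_g(x_k) . x_k - g(x_k)) / ||grad_g(x_k)||^2] grad_g(x_k).
    Without a merit function and line search it can diverge on hard
    problems; this sketch only shows the basic update."""
    x = list(x0)
    for _ in range(max_iter):
        gv = g(x)
        gr = grad_g(x)
        nrm2 = sum(gi * gi for gi in gr)
        coef = (sum(gi * xi for gi, xi in zip(gr, x)) - gv) / nrm2
        x_new = [coef * gi for gi in gr]
        if max(abs(a - b) for a, b in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    return x

# For a linear limit state g(u) = 3 - 0.6*u1 - 0.8*u2 the design point
# is found in one step, and its norm is the reliability index beta = 3.
u_star = hlrf(lambda u: 3.0 - 0.6 * u[0] - 0.8 * u[1],
              lambda u: [-0.6, -0.8],
              [0.0, 0.0])
```

For nonlinear limit states the same update is applied to the local linearization at each iterate, and convergence is no longer guaranteed, motivating the merit-function variant proposed in the article.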
Abstract:
The asymptotic expansion of the distribution of the gradient test statistic is derived for a composite hypothesis under a sequence of Pitman alternative hypotheses converging to the null hypothesis at rate n^(-1/2), n being the sample size. Comparisons of the local powers of the gradient, likelihood ratio, Wald and score tests reveal no uniform superiority property. The power performance of all four criteria in the one-parameter exponential family is examined.
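For intuition, the n^(-1/2) Pitman scaling keeps the power of a consistent test bounded away from both the significance level and 1 as n grows, which is what makes local-power comparisons meaningful. A minimal Monte Carlo sketch, using a z-test for a normal mean with known variance (a toy case in which the likelihood ratio, Wald, score and gradient statistics all coincide, so this illustrates only the scaling, not the paper's comparisons):

```python
import math
import random

def local_power(delta, n=100, trials=4000, crit=1.96, seed=1):
    """Monte Carlo rejection rate of the two-sided z-test of H0: mu = 0
    for a N(mu, 1) sample, under the Pitman local alternative
    mu = delta / sqrt(n)."""
    rng = random.Random(seed)
    mu = delta / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        xbar = sum(rng.gauss(mu, 1.0) for _ in range(n)) / n
        if abs(math.sqrt(n) * xbar) > crit:
            rejections += 1
    return rejections / trials
```

With delta = 0 the rejection rate estimates the test's size (about 0.05 here); with delta > 0 it estimates the local power, which stays strictly between the size and 1 regardless of n.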
Abstract:
We derive asymptotic expansions for the nonnull distribution functions of the likelihood ratio, Wald, score and gradient test statistics in the class of dispersion models, under a sequence of Pitman alternatives. The asymptotic distributions of these statistics are obtained for testing a subset of regression parameters and for testing the precision parameter. Based on these nonnull asymptotic expansions, the power of all four tests, which are equivalent to first order, is compared. Furthermore, in order to compare the finite-sample performance of these tests in this class of models, Monte Carlo simulations are presented. An empirical application to a real data set is considered for illustrative purposes. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
Background: A large number of probabilistic models used in sequence analysis assign non-zero probability values to most input sequences. The most common way to decide whether a given probability is sufficient is Bayesian binary classification, in which the probability under the model characterizing the sequence family of interest is compared to that under an alternative probability model; a null model can be used as the alternative. This is the scoring technique used by sequence analysis tools such as HMMER, SAM and INFERNAL. The most prevalent null models are position-independent residue distributions, including the uniform distribution, the genomic distribution, the family-specific distribution and the target sequence distribution. This paper presents a study evaluating the impact of the choice of null model on the final result of classifications. In particular, we are interested in minimizing the number of false predictions in a classification, a crucial issue for reducing the costs of biological validation. Results: In all tests, the target null model presented the lowest number of false positives when using random sequences as a test. The study was performed on DNA sequences using GC content as the measure of compositional bias, but the results should also be valid for protein sequences. To broaden the applicability of the results, the study was performed using randomly generated sequences. Previous studies were performed on amino acid sequences, used only one probabilistic model (HMM) and a specific benchmark, and lacked more general conclusions about the performance of null models. Finally, a benchmark test with P. falciparum confirmed these results. Conclusions: Of the evaluated models, the best suited for classification are the uniform model and the target model. However, the uniform model presents a GC bias that can cause more false positives for candidate sequences with extreme compositional bias, a characteristic not described in previous studies.
In these cases the target model is more dependable for biological validation due to its higher specificity.
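The score being compared is the standard log-odds score: log P(seq | family model) minus log P(seq | null model). A minimal sketch of the two best-performing nulls from the study, assuming i.i.d. position-independent residue distributions (the uniform null and the "target" null built from the query's own composition; function names are our own):

```python
import math
from collections import Counter

def log_prob(seq, probs):
    """Log-probability of seq under an i.i.d. residue distribution."""
    return sum(math.log(probs[b]) for b in seq)

def uniform_null(seq):
    """Uniform null: every DNA residue has probability 0.25."""
    return log_prob(seq, {b: 0.25 for b in "ACGT"})

def target_null(seq):
    """Target null: built from the query sequence's own residue
    frequencies, so it absorbs compositional (GC) bias."""
    counts = Counter(seq)
    n = len(seq)
    return log_prob(seq, {b: counts[b] / n for b in counts})

def log_odds(seq, model_probs, null_fn):
    """Classification score: family model vs. the chosen null model."""
    return log_prob(seq, model_probs) - null_fn(seq)
```

For a GC-extreme sequence the target null assigns a higher probability than the uniform null, so the log-odds score shrinks; this is the mechanism by which the target model suppresses the uniform model's extra false positives.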
Abstract:
We investigate the nature of extremely red galaxies (ERGs), objects whose colours are redder than those found in the red sequence present in colour–magnitude diagrams of galaxies. We selected from the Sloan Digital Sky Survey Data Release 7 a volume-limited sample of such galaxies in the redshift interval 0.010 < z < 0.030, brighter than Mr = −17.8 (magnitudes dereddened, corrected for the Milky Way extinction) and with (g − r) colours larger than those of galaxies in the red sequence. This sample contains 416 ERGs, which were classified visually. Our classification was cross-checked with other classifications available in the literature. We found from our visual classification that the majority of objects in our sample are edge-on spirals (73 per cent). Other spirals correspond to 13 per cent, whereas elliptical galaxies comprise only 11 per cent of the objects. After comparing the morphological mix and the distributions of Hα/Hβ and axial ratios of ERGs and objects in the red sequence, we suggest that dust, more than stellar population effects, is the driver of the red colours found in these extremely red galaxies.
Abstract:
Cellular rheology has recently undergone rapid development, with particular attention to the mechanical properties of the cytoskeleton and its main components: actin filaments, intermediate filaments, microtubules and cross-linking proteins. However, it is not clear which cellular structural changes directly affect the cell's mechanical properties. In this work, we therefore aimed to quantify the structural rearrangement of these fibers that may underlie changes in cell mechanics. We created an image analysis platform to study smooth muscle cells from different arteries (aorta, mammary, renal, carotid and coronary) and processed, respectively, 31, 29, 31, 30 and 35 cell images obtained by confocal microscopy. The platform was developed in Matlab (MathWorks) and uses the Sobel operator to determine the orientation of the actin fibers, labeled with phalloidin, in the cell image. The Sobel operator acts as a filter that computes the pixel brightness gradient, point by point, across the image, using vertical and horizontal convolution kernels to calculate the magnitude and angle of the pixel intensity gradient. The image analysis follows this sequence: (1) open a given set of cell images to be processed; (2) set a fixed threshold to eliminate noise, based on Otsu's method; (3) detect the fiber edges in the image using the Sobel operator; and (4) quantify the actin fiber orientation. Our first result is the probability distribution Π(Δθ) of finding a given fiber angle deviation (Δθ) from the main cell fiber orientation θ0. Π(Δθ) follows an exponential decay, Π(Δθ) = A exp(-αΔθ), with respect to θ0. We defined and determined a misalignment index α of the fibers for each artery type: coronary αCo = (1.72 ± 0.36) rad^-1; renal αRe = (1.43 ± 0.64) rad^-1; aorta αAo = (1.42 ± 0.43) rad^-1; mammary αMa = (1.12 ± 0.50) rad^-1; and carotid αCa = (1.01 ± 0.39) rad^-1.
The α values of the coronary and carotid arteries are statistically different (p < 0.05) across all analyzed cells. We discuss our results by correlating the misalignment index data with the experimental cell mechanical properties obtained using Optical Magnetic Twisting Cytometry on the same group of cells.
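Steps (2)-(4) of the pipeline can be sketched in a few lines. This is a Python stand-in for the Matlab platform, assuming the image is a grayscale list of lists and replacing the Otsu-based threshold with a fixed gradient-magnitude threshold:

```python
import math

# Standard 3x3 Sobel convolution kernels.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel(img, i, j):
    """Horizontal and vertical intensity gradients at pixel (i, j)."""
    gx = sum(SOBEL_X[a][b] * img[i + a - 1][j + b - 1]
             for a in range(3) for b in range(3))
    gy = sum(SOBEL_Y[a][b] * img[i + a - 1][j + b - 1]
             for a in range(3) for b in range(3))
    return gx, gy

def orientations(img, mag_threshold=1.0):
    """Gradient angles (radians) at interior pixels whose gradient
    magnitude exceeds the threshold (a stand-in for Otsu's method)."""
    angles = []
    for i in range(1, len(img) - 1):
        for j in range(1, len(img[0]) - 1):
            gx, gy = sobel(img, i, j)
            if math.hypot(gx, gy) > mag_threshold:
                angles.append(math.atan2(gy, gx))
    return angles
```

The resulting angle list is what a histogram of deviations Δθ from the dominant orientation θ0 would be built from, to which the exponential decay Π(Δθ) = A exp(-αΔθ) is then fitted.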