60 resultados para Computational algorithm


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Binning and truncation of data are common in data analysis and machine learning. This paper addresses the problem of fitting mixture densities to multivariate binned and truncated data. The EM approach proposed by McLachlan and Jones (Biometrics, 44: 2, 571-578, 1988) for the univariate case is generalized to multivariate measurements. The multivariate solution requires the evaluation of multidimensional integrals over each bin at each iteration of the EM procedure. Naive implementation of the procedure can lead to computationally inefficient results. To reduce the computational cost a number of straightforward numerical techniques are proposed. Results on simulated data indicate that the proposed methods can achieve significant computational gains with no loss in the accuracy of the final parameter estimates. Furthermore, experimental results suggest that with a sufficient number of bins and data points it is possible to estimate the true underlying density almost as well as if the data were not binned. The paper concludes with a brief description of an application of this approach to diagnosis of iron deficiency anemia, in the context of binned and truncated bivariate measurements of volume and hemoglobin concentration from an individual's red blood cells.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Motivation: This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. Results: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A new algorithm has been developed for smoothing the surfaces in finite element formulations of contact-impact. A key feature of this method is that the smoothing is done implicitly by constructing smooth signed distance functions for the bodies. These functions are then employed for the computation of the gap and other variables needed for implementation of contact-impact. The smoothed signed distance functions are constructed by a moving least-squares approximation with a polynomial basis. Results show that when nodes are placed on a surface, the surface can be reproduced with an error of about one per cent or less with either a quadratic or a linear basis. With a quadratic basis, the method exactly reproduces a circle or a sphere even for coarse meshes. Results are presented for contact problems involving the contact of circular bodies. Copyright (C) 2002 John Wiley Sons, Ltd.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In computer simulations of smooth dynamical systems, the original phase space is replaced by machine arithmetic, which is a finite set. The resulting spatially discretized dynamical systems do not inherit all functional properties of the original systems, such as surjectivity and existence of absolutely continuous invariant measures. This can lead to computational collapse to fixed points or short cycles. The paper studies loss of such properties in spatial discretizations of dynamical systems induced by unimodal mappings of the unit interval. The problem reduces to studying set-valued negative semitrajectories of the discretized system. As the grid is refined, the asymptotic behavior of the cardinality structure of the semitrajectories follows probabilistic laws corresponding to a branching process. The transition probabilities of this process are explicitly calculated. These results are illustrated by the example of the discretized logistic mapping.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Libraries of cyclic peptides are being synthesized using combinatorial chemistry for high throughput screening in the drug discovery process. This paper describes the min_syn_steps.cpp program (available at http://www.imb.uq.edu.au/groups/smythe/tran), which after inputting a list of cyclic peptides to be synthesized, removes cyclic redundant sequences and calculates synthetic strategies which minimize the synthetic steps as well as the reagent requirements. The synthetic steps and reagent requirements could be minimized by finding common subsets within the sequences for block synthesis. Since a brute-force approach to search for optimum synthetic strategies is impractically large, a subset-orientated approach is utilized here to limit the size of the search. (C) 2002 Elsevier Science Ltd. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We consider a mixture model approach to the regression analysis of competing-risks data. Attention is focused on inference concerning the effects of factors on both the probability of occurrence and the hazard rate conditional on each of the failure types. These two quantities are specified in the mixture model using the logistic model and the proportional hazards model, respectively. We propose a semi-parametric mixture method to estimate the logistic and regression coefficients jointly, whereby the component-baseline hazard functions are completely unspecified. Estimation is based on maximum likelihood on the basis of the full likelihood, implemented via an expectation-conditional maximization (ECM) algorithm. Simulation studies are performed to compare the performance of the proposed semi-parametric method with a fully parametric mixture approach. The results show that when the component-baseline hazard is monotonic increasing, the semi-parametric and fully parametric mixture approaches are comparable for mildly and moderately censored samples. When the component-baseline hazard is not monotonic increasing, the semi-parametric method consistently provides less biased estimates than a fully parametric approach and is comparable in efficiency in the estimation of the parameters for all levels of censoring. The methods are illustrated using a real data set of prostate cancer patients treated with different dosages of the drug diethylstilbestrol. Copyright (C) 2003 John Wiley Sons, Ltd.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The Lanczos algorithm is appreciated in many situations due to its speed. and economy of storage. However, the advantage that the Lanczos basis vectors need not be kept is lost when the algorithm is used to compute the action of a matrix function on a vector. Either the basis vectors need to be kept, or the Lanczos process needs to be applied twice. In this study we describe an augmented Lanczos algorithm to compute a dot product relative to a function of a large sparse symmetric matrix, without keeping the basis vectors.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Subcycling, or the use of different timesteps at different nodes, can be an effective way of improving the computational efficiency of explicit transient dynamic structural solutions. The method that has been most widely adopted uses a nodal partition. extending the central difference method, in which small timestep updates are performed interpolating on the displacement at neighbouring large timestep nodes. This approach leads to narrow bands of unstable timesteps or statistical stability. It also can be in error due to lack of momentum conservation on the timestep interface. The author has previously proposed energy conserving algorithms that avoid the first problem of statistical stability. However, these sacrifice accuracy to achieve stability. An approach to conserve momentum on an element interface by adding partial velocities is considered here. Applied to extend the central difference method. this approach is simple. and has accuracy advantages. The method can be programmed by summing impulses of internal forces, evaluated using local element timesteps, in order to predict a velocity change at a node. However, it is still only statistically stable, so an adaptive timestep size is needed to monitor accuracy and to be adjusted if necessary. By replacing the central difference method with the explicit generalized alpha method. it is possible to gain stability by dissipating the high frequency response that leads to stability problems. However. coding the algorithm is less elegant, as the response depends on previous partial accelerations. Extension to implicit integration, is shown to be impractical due to the neglect of remote effects of internal forces acting across a timestep interface. (C) 2002 Elsevier Science B.V. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This article presents Monte Carlo techniques for estimating network reliability. For highly reliable networks, techniques based on graph evolution models provide very good performance. However, they are known to have significant simulation cost. An existing hybrid scheme (based on partitioning the time space) is available to speed up the simulations; however, there are difficulties with optimizing the important parameter associated with this scheme. To overcome these difficulties, a new hybrid scheme (based on partitioning the edge set) is proposed in this article. The proposed scheme shows orders of magnitude improvement of performance over the existing techniques in certain classes of network. It also provides reliability bounds with little overhead.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Signal peptides and transmembrane helices both contain a stretch of hydrophobic amino acids. This common feature makes it difficult for signal peptide and transmembrane helix predictors to correctly assign identity to stretches of hydrophobic residues near the N-terminal methionine of a protein sequence. The inability to reliably distinguish between N-terminal transmembrane helix and signal peptide is an error with serious consequences for the prediction of protein secretory status or transmembrane topology. In this study, we report a new method for differentiating protein N-terminal signal peptides and transmembrane helices. Based on the sequence features extracted from hydrophobic regions (amino acid frequency, hydrophobicity, and the start position), we set up discriminant functions and examined them on non-redundant datasets with jackknife tests. This method can incorporate other signal peptide prediction methods and achieve higher prediction accuracy. For Gram-negative bacterial proteins, 95.7% of N-terminal signal peptides and transmembrane helices can be correctly predicted (coefficient 0.90). Given a sensitivity of 90%, transmembrane helices can be identified from signal peptides with a precision of 99% (coefficient 0.92). For eukaryotic proteins, 94.2% of N-terminal signal peptides and transmembrane helices can be correctly predicted with coefficient 0.83. Given a sensitivity of 90%, transmembrane helices can be identified from signal peptides with a precision of 87% (coefficient 0.85). The method can be used to complement current transmembrane protein prediction and signal peptide prediction methods to improve their prediction accuracies. (C) 2003 Elsevier Inc. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A Combined Genetic Algorithm and Method of Moments design methods is presented for the design of unusual near-field antennas for use in Magnetic Resonance Imaging systems. The method is successfully applied to the design of an asymmetric coil structure for use at 190MHz and demonstrates excellent radiofrequency field homogeneity.