845 results for Sign Data LMS algorithm.
Abstract:
Current commercially available Doppler lidars provide an economical and robust solution for measuring vertical and horizontal wind velocities, together with the ability to provide co- and cross-polarised backscatter profiles. The high temporal resolution of these instruments allows turbulent properties to be obtained from studying the variation in radial velocities. However, the instrument specifications mean that certain characteristics, especially the background noise behaviour, become a limiting factor for the instrument sensitivity in regions where the aerosol load is low. Turbulent calculations require an accurate estimate of the contribution from velocity uncertainty estimates, which are directly related to the signal-to-noise ratio. Any bias in the signal-to-noise ratio will propagate through as a bias in turbulent properties. In this paper we present a method to correct for artefacts in the background noise behaviour of commercially available Doppler lidars and reduce the signal-to-noise ratio threshold used to discriminate between noise, and cloud or aerosol signals. We show that, for Doppler lidars operating continuously at a number of locations in Finland, the data availability can be increased by as much as 50 % after performing this background correction and subsequent reduction in the threshold. The reduction in bias also greatly improves subsequent calculations of turbulent properties in weak signal regimes.
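The correction procedure itself is not spelled out in the abstract, so the following is only a minimal sketch of the general idea: estimate a range-dependent background artefact from gates assumed to contain noise only, subtract it, and then apply a (lowered) SNR threshold. The polynomial background model, the gate selection and the threshold value are illustrative assumptions, not the authors' method.

```python
import numpy as np

def correct_background(snr, noise_gates=slice(-100, None), deg=1):
    """Estimate a range-dependent background artefact from far-range gates that are
    assumed to contain noise only, and subtract it from the whole profile.
    The low-order polynomial model and the gate selection are illustrative choices."""
    gates = np.arange(snr.size)
    coeffs = np.polyfit(gates[noise_gates], snr[noise_gates], deg)
    return snr - np.polyval(coeffs, gates)

def signal_mask(snr, threshold=0.005):
    """Flag gates whose background-corrected SNR exceeds a (reduced) threshold."""
    return correct_background(snr) > threshold

# Synthetic example: a sloping background, noise, and a weak aerosol layer
rng = np.random.default_rng(0)
profile = 0.002 * np.linspace(1.0, 0.0, 400) + rng.normal(0.0, 0.001, 400)
profile[50:80] += 0.01                       # aerosol/cloud return
print(signal_mask(profile)[40:90].astype(int))
```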
Abstract:
A large amount of biological data has been produced in recent years. Important knowledge can be extracted from these data by the use of data analysis techniques. Clustering plays an important role in data analysis by organizing similar objects from a dataset into meaningful groups. Several clustering algorithms have been proposed in the literature. However, each algorithm has its own bias, being more adequate for particular datasets. This paper presents a mathematical formulation to support the creation of consistent clusters for biological data. Moreover, it presents a clustering algorithm to solve this formulation that uses GRASP (Greedy Randomized Adaptive Search Procedure). We compared the proposed algorithm with three other well-known algorithms. The proposed algorithm presented the best clustering results, confirmed statistically. (C) 2009 Elsevier Ltd. All rights reserved.
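The abstract names GRASP but not the formulation details, so here is a minimal, generic GRASP sketch for a medoid-based clustering objective: greedy randomized construction of k medoids followed by a swap local search, repeated over several restarts. The objective, the restricted-candidate-list rule and all parameter values are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def grasp_cluster(X, k, iters=20, alpha=0.3, seed=0):
    """Generic GRASP sketch for medoid-based clustering: randomized greedy
    construction of k medoids, then a swap-based local search; keep the best
    solution over several restarts."""
    rng = np.random.default_rng(seed)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)   # pairwise distances
    n = len(X)

    def cost(medoids):
        return D[:, medoids].min(axis=1).sum()                   # total within-cluster distance

    best, best_cost = None, np.inf
    for _ in range(iters):
        # Greedy randomized construction: pick each medoid from a restricted candidate list
        medoids = [int(rng.integers(n))]
        while len(medoids) < k:
            gains = D[:, medoids].min(axis=1)                    # distance of each point to current medoids
            cutoff = gains.max() - alpha * (gains.max() - gains.min())
            rcl = np.flatnonzero(gains >= cutoff)
            medoids.append(int(rng.choice(rcl)))
        # Local search: swap a medoid with a non-medoid while the cost improves
        improved = True
        while improved:
            improved = False
            for i in range(k):
                for cand in range(n):
                    if cand in medoids:
                        continue
                    trial = medoids[:i] + [cand] + medoids[i + 1:]
                    if cost(trial) < cost(medoids):
                        medoids, improved = trial, True
        c = cost(medoids)
        if c < best_cost:
            best, best_cost = medoids, c
    return best, D[:, best].argmin(axis=1)                       # medoid indices and cluster labels
```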
Abstract:
Conventional procedures employed in the modeling of viscoelastic properties of polymers rely on the determination of the polymer's discrete relaxation spectrum from experimentally obtained data. In the past decades, several analytical regression techniques have been proposed to determine an explicit equation which describes the measured spectra. Taking a different approach, the procedure introduced herein constitutes a simulation-based computational optimization technique based on a non-deterministic search method arising from the field of evolutionary computation. Instead of comparing numerical results, the purpose of this paper is to highlight some subtle differences between both strategies and to focus on what properties of the exploited technique emerge as new possibilities for the field. In order to illustrate this, the essayed cases show how the employed technique can outperform conventional approaches in terms of fitting quality. Moreover, in some instances, it produces equivalent results with far fewer fitting parameters, which is convenient for computational simulation applications. The problem formulation and the rationale of the highlighted method are discussed herein and constitute the main intended contribution. (C) 2009 Wiley Periodicals, Inc. J Appl Polym Sci 113: 122-135, 2009
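As a rough illustration of fitting a discrete relaxation spectrum (a Prony series) with an evolutionary search, the sketch below uses SciPy's differential evolution as a stand-in for the unspecified evolutionary technique of the paper; the synthetic data, the log-space parameterization and the bounds are all assumptions.

```python
import numpy as np
from scipy.optimize import differential_evolution

# Synthetic relaxation-modulus data G(t); in practice these would be measurements.
t = np.logspace(-2, 3, 60)
G_data = 1e5 * np.exp(-t / 0.5) + 2e4 * np.exp(-t / 50.0)

def prony(params, t):
    """Discrete relaxation spectrum (Prony series): G(t) = sum_i g_i exp(-t / tau_i).
    Parameters are stored as [log10(g_1), log10(tau_1), log10(g_2), log10(tau_2), ...]."""
    g = 10.0 ** params[0::2]
    tau = 10.0 ** params[1::2]
    return (g[None, :] * np.exp(-t[:, None] / tau[None, :])).sum(axis=1)

def misfit(params):
    # Relative least-squares error between model and data
    return np.sum(((prony(params, t) - G_data) / G_data) ** 2)

# Two modes, so four parameters; bounds in log10 space are illustrative assumptions.
bounds = [(2, 6), (-3, 4)] * 2
result = differential_evolution(misfit, bounds, seed=1, maxiter=300, tol=1e-8)
print(result.x, result.fun)
```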
Abstract:
Motivation: Array CGH technologies enable the simultaneous measurement of DNA copy number for thousands of sites on a genome. We developed the circular binary segmentation (CBS) algorithm to divide the genome into regions of equal copy number (Olshen {\it et~al}, 2004). The algorithm tests for change-points using a maximal $t$-statistic with a permutation reference distribution to obtain the corresponding $p$-value. The number of computations required for the maximal test statistic is $O(N^2)$, where $N$ is the number of markers. This makes the full permutation approach computationally prohibitive for the newer arrays that contain tens of thousands of markers and highlights the need for a faster algorithm. Results: We present a hybrid approach to obtain the $p$-value of the test statistic in linear time. We also introduce a rule for stopping early when there is strong evidence for the presence of a change. We show through simulations that the hybrid approach provides a substantial gain in speed with only a negligible loss in accuracy and that the stopping rule further increases speed. We also present the analysis of array CGH data from a breast cancer cell line to show the impact of the new approaches on the analysis of real data. Availability: An R (R Development Core Team, 2006) version of the CBS algorithm has been implemented in the ``DNAcopy'' package of the Bioconductor project (Gentleman {\it et~al}, 2004). The proposed hybrid method for the $p$-value is available in version 1.2.1 or higher and the stopping rule for declaring a change early is available in version 1.5.1 or higher.
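For orientation, a toy version of the permutation step described above might look like the following: a maximal t-like statistic over candidate split points, a permutation reference distribution for its p-value, and a simple early exit from the permutation loop. This is not the CBS arc statistic, the hybrid linear-time approximation, or the DNAcopy stopping rule; it only illustrates why the full permutation approach is expensive and where shortcuts help.

```python
import numpy as np

def max_t_statistic(x):
    """Maximal two-sample t-like statistic over all binary splits of the sequence."""
    n = len(x)
    best = 0.0
    for i in range(2, n - 1):                       # candidate change-point positions
        a, b = x[:i], x[i:]
        pooled = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
        if pooled > 0:
            best = max(best, abs(a.mean() - b.mean()) / pooled)
    return best

def permutation_pvalue(x, n_perm=1000, alpha=0.01, seed=0):
    """Permutation p-value for a single change-point, with a simple early exit:
    stop once enough permuted maxima exceed the observed one to rule out
    significance at level alpha. The paper's hybrid approximation and its rule
    for declaring a change early differ from this."""
    rng = np.random.default_rng(seed)
    observed = max_t_statistic(x)
    exceed, needed = 0, int(np.ceil(alpha * n_perm))
    for b in range(1, n_perm + 1):
        if max_t_statistic(rng.permutation(x)) >= observed:
            exceed += 1
            if exceed > needed:                     # p-value can no longer fall below alpha
                return (exceed + 1) / (b + 1)
    return (exceed + 1) / (n_perm + 1)

x = np.repeat([0.0, 0.8], 50) + np.random.default_rng(1).normal(0, 1, 100)
print(permutation_pvalue(x))
```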
Abstract:
A tandem mass spectral database system consists of a library of reference spectra and a search program. State-of-the-art search programs show a high tolerance for variability in compound-specific fragmentation patterns produced by collision-induced decomposition and enable sensitive and specific 'identity search'. In this communication, performance characteristics of two search algorithms combined with the 'Wiley Registry of Tandem Mass Spectral Data, MSforID' (Wiley Registry MSMS, John Wiley and Sons, Hoboken, NJ, USA) were evaluated. The search algorithms tested were the MSMS search algorithm implemented in the NIST MS Search program 2.0g (NIST, Gaithersburg, MD, USA) and the MSforID algorithm (John Wiley and Sons, Hoboken, NJ, USA). Sample spectra were acquired on different instruments and, thus, covered a broad range of possible experimental conditions, or were generated in silico. For each algorithm, more than 30,000 matches were performed. Statistical evaluation of the library search results revealed that, in principle, both search algorithms can be combined with the Wiley Registry MSMS to create a reliable identification tool. It appears, however, that a higher degree of spectral similarity is necessary to obtain a correct match with the NIST MS Search program. This characteristic of the NIST MS Search program has a positive effect on specificity, as it helps to avoid false positive matches (type I errors), but reduces sensitivity. Thus, particularly with sample spectra acquired on instruments differing in their setup from tandem-in-space-type fragmentation, a comparably higher number of false negative matches (type II errors) was observed when searching the Wiley Registry MSMS.
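Neither search algorithm's scoring function is reproduced here, so the sketch below shows only the generic shape of a tandem-MS library search: pair peaks within an m/z tolerance and rank library entries by a dot-product similarity. The tolerance, weighting and example spectra are illustrative assumptions and do not reproduce the NIST or MSforID scoring.

```python
import numpy as np

def match_factor(query, reference, tol=0.2):
    """Toy dot-product match between two centroided MS/MS spectra, each given as
    an array of (m/z, intensity) rows. Peaks are paired within an m/z tolerance."""
    q_int = np.zeros(len(reference))
    for mz, inten in query:
        d = np.abs(reference[:, 0] - mz)
        j = d.argmin()
        if d[j] <= tol:
            q_int[j] += inten                       # accumulate matched query intensity
    r_int = reference[:, 1]
    denom = np.linalg.norm(q_int) * np.linalg.norm(r_int)
    return float(q_int @ r_int / denom) if denom > 0 else 0.0

def search_library(query, library):
    """Rank library entries by similarity to the query spectrum."""
    scores = {name: match_factor(query, ref) for name, ref in library.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical two-entry library and a query spectrum
library = {
    "compound_A": np.array([[91.05, 100.0], [119.08, 40.0], [163.07, 25.0]]),
    "compound_B": np.array([[105.07, 100.0], [133.10, 30.0]]),
}
query = np.array([[91.06, 80.0], [119.09, 35.0]])
print(search_library(query, library))
```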
Abstract:
We propose a general procedure for solving incomplete data estimation problems. The procedure can be used to find the maximum likelihood estimate or to solve estimating equations in difficult cases such as estimation with the censored or truncated regression model, the nonlinear structural measurement error model, and the random effects model. The procedure is based on the general principle of stochastic approximation and the Markov chain Monte Carlo method. Applying the theory of adaptive algorithms, we derive conditions under which the proposed procedure converges. Simulation studies also indicate that the proposed procedure consistently converges to the maximum likelihood estimate for the structural measurement error logistic regression model.
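A minimal toy instance of the general principle (stochastic approximation combined with Monte Carlo imputation of the incomplete data) is estimating the mean of a normal sample under right censoring: sample the censored values from their conditional (truncated normal) distribution at the current parameter, then take a decreasing-step move toward the complete-data estimate. The model, known unit variance and step-size schedule are illustrative assumptions, not the paper's examples.

```python
import numpy as np
from scipy.stats import truncnorm

# Right-censored normal data: x_i ~ N(mu, 1), observed as y_i = min(x_i, c).
rng = np.random.default_rng(0)
mu_true, c = 2.0, 2.5
x = rng.normal(mu_true, 1.0, 500)
y = np.minimum(x, c)
censored = y >= c

# Stochastic-approximation estimate of mu: impute the censored values by sampling
# from their conditional distribution (a truncated normal above c) at the current
# parameter, then move toward the complete-data MLE with Robbins-Monro step sizes.
mu = y.mean()
for k in range(1, 201):
    imputed = y.copy()
    a = (c - mu) / 1.0                               # standardized truncation point
    imputed[censored] = truncnorm.rvs(a, np.inf, loc=mu, scale=1.0,
                                      size=censored.sum(), random_state=rng)
    gamma = 1.0 / k                                  # decreasing step size
    mu = mu + gamma * (imputed.mean() - mu)
print(mu)
```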
Abstract:
Frequencies of meiotic configurations in cytogenetic stocks are dependent on chiasma frequencies in segments defined by centromeres, breakpoints, and telomeres. The expectation maximization algorithm is proposed as a general method to perform maximum likelihood estimation of the chiasma frequencies in the intervals between such locations. The estimates can be translated via mapping functions into genetic maps of cytogenetic landmarks. One set of observational data was analyzed to exemplify application of these methods, the results of which were largely concordant with other comparable data. The method was also tested by Monte Carlo simulation of frequencies of meiotic configurations from a monotelodisomic translocation heterozygote, assuming six different sample sizes. The averages of the estimates were always close to the values initially assigned to the parameters. The maximum likelihood estimation procedures can be extended readily to other kinds of cytogenetic stocks and allow the pooling of diverse cytogenetic data to collectively estimate lengths of segments, arms, and chromosomes.
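As a reminder of how the expectation-maximization principle works on multinomial count data of this kind, here is the classical genetic-linkage toy example of Dempster, Laird and Rubin (1977); it is not the paper's chiasma-frequency model, but the E-step/M-step structure is the same.

```python
import numpy as np

# Classic genetic-linkage EM example: observed multinomial counts fall in four
# categories with probabilities (1/2 + p/4, (1-p)/4, (1-p)/4, p/4), where the
# first category is a mixture of two latent subcategories.
y = np.array([125, 18, 20, 34])

p = 0.5                                  # initial guess
for _ in range(100):
    # E-step: expected count of the latent subcategory (prob p/4) within category 1
    x2 = y[0] * (p / 4) / (0.5 + p / 4)
    # M-step: complete-data maximum likelihood estimate of p
    p = (x2 + y[3]) / (x2 + y[1] + y[2] + y[3])
print(p)                                 # converges to about 0.6268
```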
Abstract:
Numerical modelling methodologies are important for their application to engineering and scientific problems, because there are processes for which analytical mathematical expressions cannot be obtained. When the only available information is a set of experimental values for the variables that determine the state of the system, the modelling problem is equivalent to determining the hyper-surface that best fits the data. This paper presents a methodology based on the Galerkin formulation of the finite element method to obtain representations of relationships that are defined a priori between a set of variables: y = z(x1, x2, ..., xd). These representations are generated from the values of the variables in the experimental data. The approximation, piecewise, is an element of a Sobolev space and has derivatives defined in a general sense in this space. Using this approach results in the need to invert a linear system with a structure that allows a fast solver algorithm. The algorithm can be used in a variety of fields, being a multidisciplinary tool. The validity of the methodology is studied by considering two real applications: a problem in hydrodynamics and an engineering problem related to fluids, heat and transport in an energy generation plant. A test of the predictive capacity of the methodology is also performed using a cross-validation method.
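A one-dimensional caricature of the approach, with least squares standing in for the Galerkin weak form, is sketched below: expand y = z(x) in piecewise-linear (hat) finite-element basis functions on a mesh and solve the resulting banded linear system for the nodal values. The mesh, the synthetic data and the dense solver are illustrative assumptions; the banded structure is what would allow a fast solver.

```python
import numpy as np

# Scattered "experimental" data for a single variable x
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.05, 200)

# Uniform 1-D mesh and piecewise-linear (hat) basis functions
nodes = np.linspace(0, 1, 21)
h = nodes[1] - nodes[0]
B = np.maximum(0.0, 1.0 - np.abs(x[:, None] - nodes[None, :]) / h)   # basis values at data points

# Least-squares nodal values; the normal matrix is tridiagonal because neighbouring
# hat functions are the only ones that overlap.
A = B.T @ B
b = B.T @ y
coeffs = np.linalg.solve(A, b)

def z(xq):
    """Evaluate the fitted piecewise-linear approximation at query points xq."""
    Bq = np.maximum(0.0, 1.0 - np.abs(np.atleast_1d(xq)[:, None] - nodes) / h)
    return Bq @ coeffs

print(z(np.array([0.25, 0.5, 0.75])))
```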
Abstract:
We investigate two numerical procedures for the Cauchy problem in linear elasticity, involving the relaxation of either the given boundary displacements (Dirichlet data) or the prescribed boundary tractions (Neumann data) on the over-specified boundary, in the alternating iterative algorithm of Kozlov et al. (1991). The two mixed direct (well-posed) problems associated with each iteration are solved using the method of fundamental solutions (MFS), in conjunction with the Tikhonov regularization method, while the optimal value of the regularization parameter is chosen via the generalized cross-validation (GCV) criterion. An efficient regularizing stopping criterion, which ceases the iterative procedure at the point where the accumulation of noise becomes dominant and the errors in predicting the exact solutions increase, is also presented. The MFS-based iterative algorithms with relaxation are tested on Cauchy problems for isotropic linear elastic materials in various geometries to confirm the numerical convergence, stability, accuracy and computational efficiency of the proposed method.
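The regularization step used inside each iteration can be sketched generically: an SVD-based Tikhonov solve with the parameter chosen by minimizing the GCV function. The matrix below is a generic ill-conditioned stand-in, not an MFS discretization of the elasticity problem, and the grid of candidate parameters is an illustrative choice.

```python
import numpy as np

def tikhonov_gcv(A, b, lambdas=np.logspace(-12, 0, 60)):
    """Tikhonov-regularized solution of A u = b with the regularization parameter
    chosen by generalized cross-validation (GCV), computed via the SVD of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    beta = U.T @ b
    m = len(b)
    best = (np.inf, None, None)
    for lam in lambdas:
        f = s**2 / (s**2 + lam**2)                 # Tikhonov filter factors
        residual = np.linalg.norm((1 - f) * beta) ** 2
        gcv = residual / (m - f.sum()) ** 2        # GCV function to minimize
        if gcv < best[0]:
            u = Vt.T @ (f * beta / s)              # regularized solution for this lambda
            best = (gcv, lam, u)
    return best[1], best[2]

# Toy ill-posed problem: a Hilbert-like smoothing matrix with noisy data
n = 40
A = 1.0 / (np.add.outer(np.arange(n), np.arange(n)) + 1.0)
u_true = np.sin(np.linspace(0, np.pi, n))
b = A @ u_true + np.random.default_rng(0).normal(0, 1e-4, n)
lam, u = tikhonov_gcv(A, b)
print(lam, np.linalg.norm(u - u_true))
```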
Abstract:
Conventional DEA models assume deterministic, precise and non-negative data for input and output observations. However, real applications may be characterized by observations that are given in the form of intervals and include negative numbers. For instance, the consumption of electricity in decentralized energy resources may be either negative or positive, depending on the heat consumption. Likewise, the heat losses in distribution networks may lie within a certain range, depending on e.g. external temperature and real-time outtake. Complementing earlier work that addressed the two problems, interval data and negative data, separately, we propose a comprehensive evaluation process for measuring the relative efficiencies of a set of DMUs in DEA. In our general formulation, the intervals may contain upper or lower bounds with different signs. The proposed method determines upper and lower bounds for the technical efficiency through the limits of the intervals after decomposition. Based on the interval scores, DMUs are then classified into three classes, namely the strictly efficient, the weakly efficient and the inefficient. An intuitive ranking approach is presented for the respective classes. The approach is demonstrated through an application to the evaluation of bank branches. © 2013.
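A sketch of the interval-decomposition idea, using the standard input-oriented CCR multiplier model solved as a linear program: evaluate each DMU once in its most favourable scenario and once in its least favourable one to obtain an efficiency interval. The paper's formulation additionally handles negative bounds, which this non-negative toy example does not; all function names and data are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(x0, y0, X, Y):
    """Input-oriented CCR efficiency (multiplier form) as a linear program:
    maximize u.y0 subject to v.x0 = 1, u.Yj - v.Xj <= 0 for every DMU j, u, v >= 0."""
    m, s = len(x0), len(y0)
    c = np.concatenate([np.zeros(m), -np.asarray(y0)])      # linprog minimizes, so negate
    A_ub = np.hstack([-X, Y])                                # u.Yj - v.Xj <= 0
    b_ub = np.zeros(len(X))
    A_eq = np.concatenate([x0, np.zeros(s)])[None, :]        # v.x0 = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0], bounds=(0, None))
    return -res.fun

def interval_efficiency(XL, XU, YL, YU, j):
    """Efficiency interval for DMU j under interval data: the optimistic bound puts
    DMU j at its best (lowest inputs, highest outputs) against the other DMUs at
    their worst, and the pessimistic bound reverses the roles."""
    best_X, best_Y = XU.copy(), YL.copy()                    # others at their worst
    best_X[j], best_Y[j] = XL[j], YU[j]                      # DMU j at its best
    worst_X, worst_Y = XL.copy(), YU.copy()
    worst_X[j], worst_Y[j] = XU[j], YL[j]
    upper = ccr_efficiency(XL[j], YU[j], best_X, best_Y)
    lower = ccr_efficiency(XU[j], YL[j], worst_X, worst_Y)
    return lower, upper

# Three DMUs, one input and one output, each observation given as an interval
XL = np.array([[2.0], [3.0], [4.0]]); XU = XL + 0.5
YL = np.array([[1.0], [2.0], [1.5]]); YU = YL + 0.3
print(interval_efficiency(XL, XU, YL, YU, j=1))
```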