845 resultados para Sign Data LMS algorithm.


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Estimation of Taylor`s power law for species abundance data may be performed by linear regression of the log empirical variances on the log means, but this method suffers from a problem of bias for sparse data. We show that the bias may be reduced by using a bias-corrected Pearson estimating function. Furthermore, we investigate a more general regression model allowing for site-specific covariates. This method may be efficiently implemented using a Newton scoring algorithm, with standard errors calculated from the inverse Godambe information matrix. The method is applied to a set of biomass data for benthic macrofauna from two Danish estuaries. (C) 2011 Elsevier B.V. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Interval-censored survival data, in which the event of interest is not observed exactly but is only known to occur within some time interval, occur very frequently. In some situations, event times might be censored into different, possibly overlapping intervals of variable widths; however, in other situations, information is available for all units at the same observed visit time. In the latter cases, interval-censored data are termed grouped survival data. Here we present alternative approaches for analyzing interval-censored data. We illustrate these techniques using a survival data set involving mango tree lifetimes. This study is an example of grouped survival data.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Motivation: Prediction methods for identifying binding peptides could minimize the number of peptides required to be synthesized and assayed, and thereby facilitate the identification of potential T-cell epitopes. We developed a bioinformatic method for the prediction of peptide binding to MHC class II molecules. Results: Experimental binding data and expert knowledge of anchor positions and binding motifs were combined with an evolutionary algorithm (EA) and an artificial neural network (ANN): binding data extraction --> peptide alignment --> ANN training and classification. This method, termed PERUN, was implemented for the prediction of peptides that bind to HLA-DR4(B1*0401). The respective positive predictive values of PERUN predictions of high-, moderate-, low- and zero-affinity binder-a were assessed as 0.8, 0.7, 0.5 and 0.8 by cross-validation, and 1.0, 0.8, 0.3 and 0.7 by experimental binding. This illustrates the synergy between experimentation and computer modeling, and its application to the identification of potential immunotheraaeutic peptides.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The use of computational fluid dynamics simulations for calibrating a flush air data system is described, In particular, the flush air data system of the HYFLEX hypersonic vehicle is used as a case study. The HYFLEX air data system consists of nine pressure ports located flush with the vehicle nose surface, connected to onboard pressure transducers, After appropriate processing, surface pressure measurements can he converted into useful air data parameters. The processing algorithm requires an accurate pressure model, which relates air data parameters to the measured pressures. In the past, such pressure models have been calibrated using combinations of flight data, ground-based experimental results, and numerical simulation. We perform a calibration of the HYFLEX flush air data system using computational fluid dynamics simulations exclusively, The simulations are used to build an empirical pressure model that accurately describes the HYFLEX nose pressure distribution ol cr a range of flight conditions. We believe that computational fluid dynamics provides a quick and inexpensive way to calibrate the air data system and is applicable to a broad range of flight conditions, When tested with HYFLEX flight data, the calibrated system is found to work well. It predicts vehicle angle of attack and angle of sideslip to accuracy levels that generally satisfy flight control requirements. Dynamic pressure is predicted to within the resolution of the onboard inertial measurement unit. We find that wind-tunnel experiments and flight data are not necessary to accurately calibrate the HYFLEX flush air data system for hypersonic flight.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We tested the effects of four data characteristics on the results of reserve selection algorithms. The data characteristics were nestedness of features (land types in this case), rarity of features, size variation of sites (potential reserves) and size of data sets (numbers of sites and features). We manipulated data sets to produce three levels, with replication, of each of these data characteristics while holding the other three characteristics constant. We then used an optimizing algorithm and three heuristic algorithms to select sites to solve several reservation problems. We measured efficiency as the number or total area of selected sites, indicating the relative cost of a reserve system. Higher nestedness increased the efficiency of all algorithms (reduced the total cost of new reserves). Higher rarity reduced the efficiency of all algorithms (increased the total cost of new reserves). More variation in site size increased the efficiency of all algorithms expressed in terms of total area of selected sites. We measured the suboptimality of heuristic algorithms as the percentage increase of their results over optimal (minimum possible) results. Suboptimality is a measure of the reliability of heuristics as indicative costing analyses. Higher rarity reduced the suboptimality of heuristics (increased their reliability) and there is some evidence that more size variation did the same for the total area of selected sites. We discuss the implications of these results for the use of reserve selection algorithms as indicative and real-world planning tools.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

To translate and transfer solution data between two totally different meshes (i.e. mesh 1 and mesh 2), a consistent point-searching algorithm for solution interpolation in unstructured meshes consisting of 4-node bilinear quadrilateral elements is presented in this paper. The proposed algorithm has the following significant advantages: (1) The use of a point-searching strategy allows a point in one mesh to be accurately related to an element (containing this point) in another mesh. Thus, to translate/transfer the solution of any particular point from mesh 2 td mesh 1, only one element in mesh 2 needs to be inversely mapped. This certainly minimizes the number of elements, to which the inverse mapping is applied. In this regard, the present algorithm is very effective and efficient. (2) Analytical solutions to the local co ordinates of any point in a four-node quadrilateral element, which are derived in a rigorous mathematical manner in the context of this paper, make it possible to carry out an inverse mapping process very effectively and efficiently. (3) The use of consistent interpolation enables the interpolated solution to be compatible with an original solution and, therefore guarantees the interpolated solution of extremely high accuracy. After the mathematical formulations of the algorithm are presented, the algorithm is tested and validated through a challenging problem. The related results from the test problem have demonstrated the generality, accuracy, effectiveness, efficiency and robustness of the proposed consistent point-searching algorithm. Copyright (C) 1999 John Wiley & Sons, Ltd.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The new technologies for Knowledge Discovery from Databases (KDD) and data mining promise to bring new insights into a voluminous growing amount of biological data. KDD technology is complementary to laboratory experimentation and helps speed up biological research. This article contains an introduction to KDD, a review of data mining tools, and their biological applications. We discuss the domain concepts related to biological data and databases, as well as current KDD and data mining developments in biology.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The identification, modeling, and analysis of interactions between nodes of neural systems in the human brain have become the aim of interest of many studies in neuroscience. The complex neural network structure and its correlations with brain functions have played a role in all areas of neuroscience, including the comprehension of cognitive and emotional processing. Indeed, understanding how information is stored, retrieved, processed, and transmitted is one of the ultimate challenges in brain research. In this context, in functional neuroimaging, connectivity analysis is a major tool for the exploration and characterization of the information flow between specialized brain regions. In most functional magnetic resonance imaging (fMRI) studies, connectivity analysis is carried out by first selecting regions of interest (ROI) and then calculating an average BOLD time series (across the voxels in each cluster). Some studies have shown that the average may not be a good choice and have suggested, as an alternative, the use of principal component analysis (PCA) to extract the principal eigen-time series from the ROI(s). In this paper, we introduce a novel approach called cluster Granger analysis (CGA) to study connectivity between ROIs. The main aim of this method was to employ multiple eigen-time series in each ROI to avoid temporal information loss during identification of Granger causality. Such information loss is inherent in averaging (e.g., to yield a single ""representative"" time series per ROI). This, in turn, may lead to a lack of power in detecting connections. The proposed approach is based on multivariate statistical analysis and integrates PCA and partial canonical correlation in a framework of Granger causality for clusters (sets) of time series. We also describe an algorithm for statistical significance testing based on bootstrapping. By using Monte Carlo simulations, we show that the proposed approach outperforms conventional Granger causality analysis (i.e., using representative time series extracted by signal averaging or first principal components estimation from ROIs). The usefulness of the CGA approach in real fMRI data is illustrated in an experiment using human faces expressing emotions. With this data set, the proposed approach suggested the presence of significantly more connections between the ROIs than were detected using a single representative time series in each ROI. (c) 2010 Elsevier Inc. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We propose a simulated-annealing-based genetic algorithm for solving model parameter estimation problems. The algorithm incorporates advantages of both genetic algorithms and simulated annealing. Tests on computer-generated synthetic data that closely resemble optical constants of a metal were performed to compare the efficiency of plain genetic algorithms against the simulated-annealing-based genetic algorithms. These tests assess the ability of the algorithms to and the global minimum and the accuracy of values obtained for model parameters. Finally, the algorithm with the best performance is used to fit the model dielectric function to data for platinum and aluminum. (C) 1997 Optical Society of America.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The cost and risk associated with mineral exploration in Australia increases significantly as companies move into deeper regolith-covered terrain. The ability to map the bedrock and the depth of weathering within an area has the potential to decrease this risk and increase the effectiveness of exploration programs. This paper is the second in a trilogy concerning the Grant's Patch area of the Eastern Goldfields. The recent development of the VPmg potential field inversion program in conjunction with the acquisition of high-resolution gravity data over an area with extensive drilling provided an opportunity to evaluate three-dimensional gravity inversion as a bedrock and regolith mapping tool. An apparent density model of the study area was constructed, with the ground represented as adjoining 200 m by 200 m vertical rectangular prisms. During inversion VPmg incrementally adjusted the density of each prism until the free-air gravity response of the model replicated the observed data. For the Grant's Patch study area, this image of the apparent density values proved easier to interpret than the Bouguer gravity image. A regolith layer was introduced into the model and realistic fresh-rock densities assigned to each basement prism according to its interpreted lithology. With the basement and regolith densities fixed, the VPmg inversion algorithm adjusted the depth to fresh basement until the misfit between the calculated and observed gravity response was minimised. The resulting geometry of the bedrock/regolith contact largely replicated the base of weathering indicated by drilling with predicted depth of weathering values from gravity inversion typically within 15% of those logged during RAB and RC drilling.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A new algorithm, PfAGSS, for predicting 3' splice sites in Plasmodium falciparum genomic sequences is described. Application of this program to the published P. falciparum chromosome 2 and 3 data suggests that existing programs result in a high error rate in assigning 3' intron boundaries. (C) 2001 Elsevier Science B.V. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Much progress has been made on inferring population history from molecular data. However, complex demographic scenarios have been considered rarely or have proved intractable. The serial introduction of the South-Central American cane Load Bufo marinas in various Caribbean and Pacific islands involves four major phases: a possible genetic admixture during the first introduction, a bottleneck associated with founding, a transitory, population boom, and finally, a demographic stabilization. A large amount of historical and demographic information is available for those introductions and can be combined profitably with molecular data. We used a Bayesian approach to combine this information With microsatellite (10 loci) and enzyme (22 loci) data and used a rejection algorithm to simultaneously estimate the demographic parameters describing the four major phases of the introduction history,. The general historical trends supported by microsatellites and enzymes were similar. However, there was a stronger support for a larger bottleneck at introductions for microsatellites than enzymes and for a more balanced genetic admixture for enzymes than for microsatellites. Verb, little information was obtained from either marker about the transitory population boom observed after each introduction. Possible explanations for differences in resolution of demographic events and discrepancies between results obtained with microsatellites and enzymes were explored. Limits Of Our model and method for the analysis of nonequilibrium populations were discussed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Binning and truncation of data are common in data analysis and machine learning. This paper addresses the problem of fitting mixture densities to multivariate binned and truncated data. The EM approach proposed by McLachlan and Jones (Biometrics, 44: 2, 571-578, 1988) for the univariate case is generalized to multivariate measurements. The multivariate solution requires the evaluation of multidimensional integrals over each bin at each iteration of the EM procedure. Naive implementation of the procedure can lead to computationally inefficient results. To reduce the computational cost a number of straightforward numerical techniques are proposed. Results on simulated data indicate that the proposed methods can achieve significant computational gains with no loss in the accuracy of the final parameter estimates. Furthermore, experimental results suggest that with a sufficient number of bins and data points it is possible to estimate the true underlying density almost as well as if the data were not binned. The paper concludes with a brief description of an application of this approach to diagnosis of iron deficiency anemia, in the context of binned and truncated bivariate measurements of volume and hemoglobin concentration from an individual's red blood cells.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Motivation: This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. Results: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We focus on mixtures of factor analyzers from the perspective of a method for model-based density estimation from high-dimensional data, and hence for the clustering of such data. This approach enables a normal mixture model to be fitted to a sample of n data points of dimension p, where p is large relative to n. The number of free parameters is controlled through the dimension of the latent factor space. By working in this reduced space, it allows a model for each component-covariance matrix with complexity lying between that of the isotropic and full covariance structure models. We shall illustrate the use of mixtures of factor analyzers in a practical example that considers the clustering of cell lines on the basis of gene expressions from microarray experiments. (C) 2002 Elsevier Science B.V. All rights reserved.