39 results for Sign Data LMS algorithm.

at University of Queensland eSpace - Australia


Relevance:

100.00%

Publisher:

Abstract:

The modelling of inpatient length of stay (LOS) has important implications in health care studies. Finite mixture distributions are usually used to model the heterogeneous LOS distribution, due to a certain proportion of patients sustaining a longer stay. However, because morbidity data are collected from hospitals, observations clustered within the same hospital are often correlated. The generalized linear mixed model approach is adopted to accommodate the inherent correlation via unobservable random effects. An EM algorithm is developed to obtain residual maximum quasi-likelihood estimation. The proposed hierarchical mixture regression approach enables the identification and assessment of factors influencing the long-stay proportion and the LOS for the long-stay patient subgroup. A neonatal LOS data set is used for illustration. (C) 2003 Elsevier Science Ltd. All rights reserved.
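
A minimal sketch of the finite-mixture idea behind this abstract, setting aside the random hospital effects and the residual maximum quasi-likelihood machinery of the paper: a two-component normal mixture fitted to log-transformed LOS by EM, where the weight pi plays the role of the long-stay proportion. The synthetic data and all parameter choices are illustrative.

```python
import numpy as np
from scipy.stats import norm

def em_two_component(x, n_iter=200, tol=1e-8):
    """EM for a two-component normal mixture on log(LOS);
    pi is the estimated long-stay proportion."""
    # Crude initialisation: split at the median.
    med = np.median(x)
    mu = np.array([x[x <= med].mean(), x[x > med].mean()])
    sigma = np.array([x.std(), x.std()])
    pi, ll_old = 0.5, -np.inf
    for _ in range(n_iter):
        # E-step: posterior probability that each stay is "long".
        p0 = (1 - pi) * norm.pdf(x, mu[0], sigma[0])
        p1 = pi * norm.pdf(x, mu[1], sigma[1])
        tau = p1 / (p0 + p1)
        # M-step: weighted re-estimation of all parameters.
        pi = tau.mean()
        mu = np.array([np.average(x, weights=1 - tau), np.average(x, weights=tau)])
        sigma = np.sqrt([np.average((x - mu[0]) ** 2, weights=1 - tau),
                         np.average((x - mu[1]) ** 2, weights=tau)])
        ll = np.log(p0 + p1).sum()
        if ll - ll_old < tol:
            break
        ll_old = ll
    return pi, mu, sigma

# Synthetic log-LOS: 80% short stays, 20% long stays.
rng = np.random.default_rng(0)
log_los = np.concatenate([rng.normal(1.0, 0.3, 800), rng.normal(2.5, 0.5, 200)])
print(em_two_component(log_los))
```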

Relevance:

100.00%

Publisher:

Abstract:

Hannenhalli and Pevzner developed the first polynomial-time algorithm for the combinatorial problem of sorting signed genomic data. Their algorithm computes the minimum number of reversals required to rearrange one genome into another when no gene duplication is present. In this paper, we show how to extend the Hannenhalli-Pevzner approach to genomes with multigene families. We propose a new heuristic algorithm to compute the reversal distance between two genomes with multigene families via the concept of binary integer programming without removing gene duplicates. The experimental results on simulated and real biological data demonstrate that the proposed algorithm is able to find the reversal distance accurately. ©2005 IEEE
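
The Hannenhalli-Pevzner theory and the paper's binary-integer-programming heuristic are well beyond a short snippet, but the quantity they target can be illustrated by brute force: a breadth-first search that computes the exact signed reversal distance for tiny, duplicate-free genomes. The example genomes are hypothetical.

```python
from collections import deque

def reversal_distance(source, target):
    """Exact signed reversal distance by breadth-first search.

    Only practical for very short genomes; the Hannenhalli-Pevzner
    theory obtains this in polynomial time without any search."""
    source, target = tuple(source), tuple(target)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        perm, d = queue.popleft()
        if perm == target:
            return d
        n = len(perm)
        for i in range(n):
            for j in range(i + 1, n + 1):
                # A reversal flips both the order and the signs of a segment.
                nxt = perm[:i] + tuple(-g for g in reversed(perm[i:j])) + perm[j:]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, d + 1))
    return None

print(reversal_distance((+3, -1, +2), (+1, +2, +3)))  # minimum number of reversals
```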

Relevance:

100.00%

Publisher:

Abstract:

Data refinements are refinement steps in which a program’s local data structures are changed. Data refinement proof obligations require the software designer to find an abstraction relation that relates the states of the original and new program. In this paper we describe an algorithm that helps a designer find an abstraction relation for a proposed refinement. Given sufficient time and space, the algorithm can find a minimal abstraction relation, and thus show that the refinement holds. As it executes, the algorithm displays mappings that cannot be in any abstraction relation. When the algorithm is not given sufficient resources to terminate, these mappings can help the designer find a suitable abstraction relation. The same algorithm can be used to test an abstraction relation supplied by the designer.
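
A hedged sketch of the pruning idea described here, not the paper's algorithm: on finite toy state spaces, start from all (abstract, concrete) state pairs that agree on an observable and iteratively discard mappings that cannot be in any abstraction relation because their transitions cannot be matched. All state spaces and step functions below are invented for illustration.

```python
def greatest_candidate_relation(abs_states, conc_states, abs_step, conc_step,
                                abs_obs, conc_obs):
    """Start from all (abstract, concrete) pairs that agree on the
    observable output, then iteratively prune pairs whose transitions
    cannot be matched; pruned pairs are exactly the mappings that
    cannot lie in any abstraction relation."""
    rel = {(a, c) for a in abs_states for c in conc_states
           if abs_obs(a) == conc_obs(c)}
    changed = True
    while changed:
        changed = False
        for (a, c) in list(rel):
            # Every concrete step must be matched by some abstract step
            # that lands in a still-related pair.
            ok = all(any((a2, c2) in rel for a2 in abs_step(a))
                     for c2 in conc_step(c))
            if not ok:
                rel.discard((a, c))
                changed = True
    return rel

# Toy refinement: a parity flag refined by a counter mod 4.
rel = greatest_candidate_relation(
    abs_states={0, 1}, conc_states={0, 1, 2, 3},
    abs_step=lambda a: {(a + 1) % 2}, conc_step=lambda c: {(c + 1) % 4},
    abs_obs=lambda a: a, conc_obs=lambda c: c % 2)
print(sorted(rel))  # [(0, 0), (0, 2), (1, 1), (1, 3)]
```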

Relevance:

100.00%

Publisher:

Abstract:

Objective: Inpatient length of stay (LOS) is an important measure of hospital activity, health care resource consumption, and patient acuity. This research aims to develop an incremental expectation maximization (EM) based learning approach on a mixture of experts (ME) system for on-line prediction of LOS. The use of a batch-mode learning process in most existing artificial neural networks to predict LOS is unrealistic, as the data become available over time and their patterns change dynamically. In contrast, an on-line process is capable of providing an output whenever a new datum becomes available. This on-the-spot information is therefore more useful and practical for making decisions, especially when one deals with a tremendous amount of data. Methods and material: The proposed approach is illustrated using a real example of gastroenteritis LOS data. The data set was extracted from a retrospective cohort study on all infants born in 1995-1997 and their subsequent admissions for gastroenteritis. The total number of admissions in this data set was n = 692. Linked hospitalization records of the cohort were retrieved retrospectively to derive the outcome measure, patient demographics, and associated co-morbidities information. A comparative study of the incremental learning and the batch-mode learning algorithms is considered. The performances of the learning algorithms are compared based on the mean absolute difference (MAD) between the predictions and the actual LOS, and the proportion of predictions with MAD < 1 day (Prop(MAD < 1)). The significance of the comparison is assessed through a regression analysis. Results: The incremental learning algorithm provides better on-line prediction of LOS when the system has gained sufficient training from more examples (MAD = 1.77 days and Prop(MAD < 1) = 54.3%), compared to that using the batch-mode learning. The regression analysis indicates a significant decrease of MAD (p-value = 0.063) and a significant (p-value = 0.044) increase of Prop(MAD < 1).
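
A rough sketch of the incremental (on-line) EM idea on a plain two-component Gaussian mixture rather than the paper's mixture-of-experts system: each new observation updates running sufficient statistics with a decaying step size, so the parameters are revised on the spot instead of by repeated passes over a stored batch. The stream, step-size schedule, and initial values are all synthetic choices.

```python
import numpy as np
from scipy.stats import norm

def online_em(stream, mu, sigma, pi, decay=0.7, offset=20):
    """Incremental EM for a two-component normal mixture: each new
    observation is folded into running sufficient statistics with a
    decaying step size, then parameters are re-estimated. A batch
    algorithm would instead re-scan the whole data set per iteration."""
    # Running statistics per component: [weight, weighted x, weighted x^2].
    s = np.array([[pi[k], pi[k] * mu[k], pi[k] * (sigma[k] ** 2 + mu[k] ** 2)]
                  for k in range(2)])
    for t, x in enumerate(stream, start=1):
        # E-step for the single new datum.
        p = np.array([pi[k] * norm.pdf(x, mu[k], sigma[k]) for k in range(2)])
        tau = p / p.sum()
        gamma = (t + offset) ** -decay  # decaying step size
        for k in range(2):
            s[k] = (1 - gamma) * s[k] + gamma * tau[k] * np.array([1.0, x, x * x])
        # M-step from the running statistics.
        pi = s[:, 0] / s[:, 0].sum()
        mu = s[:, 1] / s[:, 0]
        sigma = np.sqrt(np.maximum(s[:, 2] / s[:, 0] - mu ** 2, 1e-6))
    return pi, mu, sigma

rng = np.random.default_rng(1)
stream = np.concatenate([rng.normal(2, 1, 500), rng.normal(7, 1, 500)])
rng.shuffle(stream)
print(online_em(stream, mu=np.array([1.0, 8.0]),
                sigma=np.array([2.0, 2.0]), pi=np.array([0.5, 0.5])))
```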

Relevance:

30.00%

Publisher:

Abstract:

Motivation: Prediction methods for identifying binding peptides could minimize the number of peptides required to be synthesized and assayed, and thereby facilitate the identification of potential T-cell epitopes. We developed a bioinformatic method for the prediction of peptide binding to MHC class II molecules. Results: Experimental binding data and expert knowledge of anchor positions and binding motifs were combined with an evolutionary algorithm (EA) and an artificial neural network (ANN): binding data extraction --> peptide alignment --> ANN training and classification. This method, termed PERUN, was implemented for the prediction of peptides that bind to HLA-DR4(B1*0401). The respective positive predictive values of PERUN predictions of high-, moderate-, low- and zero-affinity binders were assessed as 0.8, 0.7, 0.5 and 0.8 by cross-validation, and 1.0, 0.8, 0.3 and 0.7 by experimental binding. This illustrates the synergy between experimentation and computer modeling, and its application to the identification of potential immunotherapeutic peptides.
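
The full EA-plus-ANN pipeline is beyond a snippet, but a hedged stand-in for the scoring step is a log-odds position-specific scoring matrix built from aligned binding cores, which illustrates how anchor-position preferences translate into a peptide score. The peptide sequences below are invented toy examples, not real HLA-DR4 binders.

```python
import numpy as np

AMINO = "ACDEFGHIKLMNPQRSTVWY"
IDX = {a: i for i, a in enumerate(AMINO)}

def build_pssm(aligned_binders, pseudocount=1.0):
    """Log-odds position-specific scoring matrix from aligned 9-mer
    binding cores (a crude stand-in for the EA alignment step)."""
    counts = np.full((9, 20), pseudocount)
    for pep in aligned_binders:
        for pos, aa in enumerate(pep):
            counts[pos, IDX[aa]] += 1
    freqs = counts / counts.sum(axis=1, keepdims=True)
    return np.log(freqs / 0.05)  # background: uniform over 20 residues

def score(pssm, peptide):
    """Best 9-mer window score within a longer peptide."""
    windows = [peptide[i:i + 9] for i in range(len(peptide) - 8)]
    return max(sum(pssm[pos, IDX[aa]] for pos, aa in enumerate(w)) for w in windows)

binders = ["FVKQNAAAL", "YVKQNTLKL", "FVDQNLAVL"]  # toy aligned cores
pssm = build_pssm(binders)
print(score(pssm, "GGFVKQNAAALGG"))   # contains a binder-like core
print(score(pssm, "PPPPPPPPPPPPP"))   # does not
```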

Relevance:

30.00%

Publisher:

Abstract:

The use of computational fluid dynamics simulations for calibrating a flush air data system is described. In particular, the flush air data system of the HYFLEX hypersonic vehicle is used as a case study. The HYFLEX air data system consists of nine pressure ports located flush with the vehicle nose surface, connected to onboard pressure transducers. After appropriate processing, surface pressure measurements can be converted into useful air data parameters. The processing algorithm requires an accurate pressure model, which relates air data parameters to the measured pressures. In the past, such pressure models have been calibrated using combinations of flight data, ground-based experimental results, and numerical simulation. We perform a calibration of the HYFLEX flush air data system using computational fluid dynamics simulations exclusively. The simulations are used to build an empirical pressure model that accurately describes the HYFLEX nose pressure distribution over a range of flight conditions. We believe that computational fluid dynamics provides a quick and inexpensive way to calibrate the air data system and is applicable to a broad range of flight conditions. When tested with HYFLEX flight data, the calibrated system is found to work well. It predicts vehicle angle of attack and angle of sideslip to accuracy levels that generally satisfy flight control requirements. Dynamic pressure is predicted to within the resolution of the onboard inertial measurement unit. We find that wind-tunnel experiments and flight data are not necessary to accurately calibrate the HYFLEX flush air data system for hypersonic flight.
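
A sketch of the calibration-and-inversion idea under strong simplifying assumptions: a hypothetical Newtonian-type pressure model for a 2-D slice of ports (not the HYFLEX pressure model), with angle of attack and dynamic pressure recovered from port pressures by nonlinear least squares. All geometry and numbers are invented.

```python
import numpy as np
from scipy.optimize import least_squares

# Port locations as surface angles (rad) from the nose axis, 2-D slice.
PORT_ANGLES = np.radians([-30.0, -15.0, 0.0, 15.0, 30.0])

def model_pressures(alpha, q, p_inf=1200.0):
    """Hypothetical Newtonian-type pressure model: each port sees the
    freestream at incidence (port angle - angle of attack)."""
    return p_inf + q * np.cos(PORT_ANGLES - alpha) ** 2

def infer_air_data(measured):
    """Recover angle of attack and dynamic pressure from port
    pressures by nonlinear least squares against the model."""
    res = least_squares(lambda x: model_pressures(x[0], x[1]) - measured,
                        x0=[0.0, 1e4])
    alpha, q = res.x
    return np.degrees(alpha), q

# Simulate a flight point (alpha = 5 deg, q = 25 kPa) with sensor noise.
rng = np.random.default_rng(2)
truth = model_pressures(np.radians(5.0), 25e3)
print(infer_air_data(truth + rng.normal(0, 50.0, truth.shape)))
```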

Relevance:

30.00%

Publisher:

Abstract:

We tested the effects of four data characteristics on the results of reserve selection algorithms. The data characteristics were nestedness of features (land types in this case), rarity of features, size variation of sites (potential reserves) and size of data sets (numbers of sites and features). We manipulated data sets to produce three levels, with replication, of each of these data characteristics while holding the other three characteristics constant. We then used an optimizing algorithm and three heuristic algorithms to select sites to solve several reservation problems. We measured efficiency as the number or total area of selected sites, indicating the relative cost of a reserve system. Higher nestedness increased the efficiency of all algorithms (reduced the total cost of new reserves). Higher rarity reduced the efficiency of all algorithms (increased the total cost of new reserves). More variation in site size increased the efficiency of all algorithms expressed in terms of total area of selected sites. We measured the suboptimality of heuristic algorithms as the percentage increase of their results over optimal (minimum possible) results. Suboptimality is a measure of the reliability of heuristics as indicative costing analyses. Higher rarity reduced the suboptimality of heuristics (increased their reliability) and there is some evidence that more size variation did the same for the total area of selected sites. We discuss the implications of these results for the use of reserve selection algorithms as indicative and real-world planning tools.
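
A common heuristic in this literature is greedy "richness" selection; the minimal sketch below, on invented data, repeatedly picks whichever site covers the most still-unrepresented land types until every feature is represented. Efficiency in the abstract's sense would be the number (or total area) of chosen sites.

```python
def greedy_reserve_selection(sites):
    """Greedy heuristic: repeatedly pick the site covering the most
    still-unrepresented features. Heuristics like this are fast but
    can be suboptimal relative to an exact (optimising) algorithm."""
    uncovered = set().union(*sites.values())
    chosen = []
    while uncovered:
        best = max(sites, key=lambda s: len(sites[s] & uncovered))
        if not sites[best] & uncovered:
            break  # remaining features occur in no site
        chosen.append(best)
        uncovered -= sites[best]
    return chosen

# Toy data: sites mapped to the land types (features) they contain.
sites = {
    "A": {"wetland", "heath"},
    "B": {"heath", "mallee", "forest"},
    "C": {"forest"},
    "D": {"wetland", "mallee"},
}
print(greedy_reserve_selection(sites))  # e.g. ['B', 'A']
```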

Relevance:

30.00%

Publisher:

Abstract:

To translate and transfer solution data between two totally different meshes (i.e. mesh 1 and mesh 2), a consistent point-searching algorithm for solution interpolation in unstructured meshes consisting of 4-node bilinear quadrilateral elements is presented in this paper. The proposed algorithm has the following significant advantages: (1) The use of a point-searching strategy allows a point in one mesh to be accurately related to an element (containing this point) in another mesh. Thus, to translate/transfer the solution of any particular point from mesh 2 to mesh 1, only one element in mesh 2 needs to be inversely mapped. This certainly minimizes the number of elements to which the inverse mapping is applied. In this regard, the present algorithm is very effective and efficient. (2) Analytical solutions to the local coordinates of any point in a four-node quadrilateral element, which are derived in a rigorous mathematical manner in the context of this paper, make it possible to carry out the inverse mapping process very effectively and efficiently. (3) The use of consistent interpolation enables the interpolated solution to be compatible with the original solution and therefore guarantees an interpolated solution of extremely high accuracy. After the mathematical formulations of the algorithm are presented, the algorithm is tested and validated through a challenging problem. The related results from the test problem have demonstrated the generality, accuracy, effectiveness, efficiency and robustness of the proposed consistent point-searching algorithm. Copyright (C) 1999 John Wiley & Sons, Ltd.
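
The paper derives closed-form analytical solutions for the local coordinates; a hedged alternative sketch of the same inverse-mapping step uses Newton iteration on the bilinear map of a 4-node quadrilateral. The element geometry below is illustrative.

```python
import numpy as np

def shape_functions(xi, eta):
    """Bilinear shape functions for a 4-node quadrilateral."""
    return 0.25 * np.array([(1 - xi) * (1 - eta), (1 + xi) * (1 - eta),
                            (1 + xi) * (1 + eta), (1 - xi) * (1 + eta)])

def inverse_map(nodes, point, tol=1e-12, max_iter=50):
    """Find the local coordinates (xi, eta) of a global point inside a
    4-node quadrilateral by Newton iteration (the paper instead gives
    analytical solutions for this step)."""
    xi = eta = 0.0
    for _ in range(max_iter):
        N = shape_functions(xi, eta)
        r = N @ nodes - point  # residual in global coordinates
        # Jacobian columns: d(global point)/d(xi) and d(global point)/d(eta).
        dN_dxi = 0.25 * np.array([-(1 - eta), (1 - eta), (1 + eta), -(1 + eta)])
        dN_deta = 0.25 * np.array([-(1 - xi), -(1 + xi), (1 + xi), (1 - xi)])
        J = np.column_stack([dN_dxi @ nodes, dN_deta @ nodes])
        dxi, deta = np.linalg.solve(J, -r)  # Newton step
        xi, eta = xi + dxi, eta + deta
        if abs(dxi) + abs(deta) < tol:
            break
    return xi, eta

nodes = np.array([[0.0, 0.0], [2.0, 0.1], [2.2, 1.9], [0.1, 2.0]])
xi, eta = inverse_map(nodes, np.array([1.0, 1.0]))
print(xi, eta, shape_functions(xi, eta) @ nodes)  # maps back to (1, 1)
```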

Relevance:

30.00%

Publisher:

Abstract:

The new technologies for Knowledge Discovery from Databases (KDD) and data mining promise to bring new insights into a voluminous growing amount of biological data. KDD technology is complementary to laboratory experimentation and helps speed up biological research. This article contains an introduction to KDD, a review of data mining tools, and their biological applications. We discuss the domain concepts related to biological data and databases, as well as current KDD and data mining developments in biology.

Relevance:

30.00%

Publisher:

Abstract:

We propose a simulated-annealing-based genetic algorithm for solving model parameter estimation problems. The algorithm incorporates advantages of both genetic algorithms and simulated annealing. Tests on computer-generated synthetic data that closely resemble optical constants of a metal were performed to compare the efficiency of plain genetic algorithms against the simulated-annealing-based genetic algorithms. These tests assess the ability of the algorithms to find the global minimum and the accuracy of values obtained for model parameters. Finally, the algorithm with the best performance is used to fit the model dielectric function to data for platinum and aluminum. (C) 1997 Optical Society of America.
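
A minimal sketch of a simulated-annealing-based genetic algorithm, assuming a generic least-squares fitness on an invented exponential model rather than the paper's metal dielectric-function model: mutated offspring replace their parents under a Metropolis acceptance rule whose temperature is cooled each generation.

```python
import numpy as np

rng = np.random.default_rng(3)

def fitness(params, x, y):
    a, b, c = params
    return np.mean((a * np.exp(-b * x) + c - y) ** 2)  # lower is better

def sa_ga(x, y, pop_size=40, generations=300, t0=1.0):
    """Genetic algorithm with a simulated-annealing acceptance rule:
    a mutated child replaces its parent if better, or with Metropolis
    probability exp(-delta/T) if worse; T is cooled each generation."""
    pop = rng.uniform(0, 5, size=(pop_size, 3))
    for g in range(generations):
        temp = t0 * (1 - g / generations) + 1e-6  # cooling schedule
        for i in range(pop_size):
            # Uniform crossover with a random mate, then Gaussian mutation.
            mate = pop[rng.integers(pop_size)]
            mask = rng.random(3) < 0.5
            child = np.where(mask, pop[i], mate) + rng.normal(0, 0.1, 3)
            delta = fitness(child, x, y) - fitness(pop[i], x, y)
            if delta < 0 or rng.random() < np.exp(-delta / temp):
                pop[i] = child
    return min(pop, key=lambda p: fitness(p, x, y))

x = np.linspace(0, 4, 80)
y = 2.0 * np.exp(-1.3 * x) + 0.5 + rng.normal(0, 0.02, x.size)
print(sa_ga(x, y))  # should approach (2.0, 1.3, 0.5)
```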

Relevance:

30.00%

Publisher:

Abstract:

The cost and risk associated with mineral exploration in Australia increases significantly as companies move into deeper regolith-covered terrain. The ability to map the bedrock and the depth of weathering within an area has the potential to decrease this risk and increase the effectiveness of exploration programs. This paper is the second in a trilogy concerning the Grant's Patch area of the Eastern Goldfields. The recent development of the VPmg potential field inversion program in conjunction with the acquisition of high-resolution gravity data over an area with extensive drilling provided an opportunity to evaluate three-dimensional gravity inversion as a bedrock and regolith mapping tool. An apparent density model of the study area was constructed, with the ground represented as adjoining 200 m by 200 m vertical rectangular prisms. During inversion VPmg incrementally adjusted the density of each prism until the free-air gravity response of the model replicated the observed data. For the Grant's Patch study area, this image of the apparent density values proved easier to interpret than the Bouguer gravity image. A regolith layer was introduced into the model and realistic fresh-rock densities assigned to each basement prism according to its interpreted lithology. With the basement and regolith densities fixed, the VPmg inversion algorithm adjusted the depth to fresh basement until the misfit between the calculated and observed gravity response was minimised. The resulting geometry of the bedrock/regolith contact largely replicated the base of weathering indicated by drilling with predicted depth of weathering values from gravity inversion typically within 15% of those logged during RAB and RC drilling.
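
A heavily simplified sketch of the apparent-density step, assuming prisms can be approximated as point masses so the forward model is linear (VPmg's actual prism response and its depth-to-basement mode are not reproduced here): recover prism densities from surface gravity by least squares. All geometry and densities are invented.

```python
import numpy as np

G = 6.674e-11  # gravitational constant (SI units)

# Hypothetical geometry: a row of prisms approximated as point masses
# at 100 m depth, with gravity observed at the surface above them.
prism_x = np.arange(0.0, 2000.0, 200.0)   # prism centres (m)
obs_x = np.arange(0.0, 2000.0, 100.0)     # observation points (m)
depth, volume = 100.0, 200.0 ** 3

# Linear forward operator: vertical attraction per unit density.
dx = obs_x[:, None] - prism_x[None, :]
r3 = (dx ** 2 + depth ** 2) ** 1.5
A = G * volume * depth / r3  # shape: (observations, prisms)

# Synthetic "observed" data from a known density model, then inversion.
rho_true = np.full(prism_x.size, 2670.0)
rho_true[4:7] = 3000.0  # a dense body under prisms 4-6
g_obs = A @ rho_true + np.random.default_rng(4).normal(0, 1e-8, obs_x.size)

rho_est, *_ = np.linalg.lstsq(A, g_obs, rcond=None)
print(np.round(rho_est))  # approximately recovers the dense body
```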

Relevance:

30.00%

Publisher:

Abstract:

A new algorithm, PfAGSS, for predicting 3' splice sites in Plasmodium falciparum genomic sequences is described. Application of this program to the published P. falciparum chromosome 2 and 3 data suggests that existing programs result in a high error rate in assigning 3' intron boundaries. (C) 2001 Elsevier Science B.V. All rights reserved.

Relevance:

30.00%

Publisher:

Abstract:

Much progress has been made on inferring population history from molecular data. However, complex demographic scenarios have been considered rarely or have proved intractable. The serial introduction of the South-Central American cane toad Bufo marinus to various Caribbean and Pacific islands involves four major phases: a possible genetic admixture during the first introduction, a bottleneck associated with founding, a transitory population boom, and finally, a demographic stabilization. A large amount of historical and demographic information is available for those introductions and can be combined profitably with molecular data. We used a Bayesian approach to combine this information with microsatellite (10 loci) and enzyme (22 loci) data and used a rejection algorithm to simultaneously estimate the demographic parameters describing the four major phases of the introduction history. The general historical trends supported by microsatellites and enzymes were similar. However, there was stronger support for a larger bottleneck at introductions for microsatellites than enzymes, and for a more balanced genetic admixture for enzymes than for microsatellites. Very little information was obtained from either marker about the transitory population boom observed after each introduction. Possible explanations for differences in resolution of demographic events and discrepancies between results obtained with microsatellites and enzymes are explored. Limits of our model and method for the analysis of nonequilibrium populations are discussed.
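
A one-parameter toy version of the rejection algorithm (the paper estimates parameters for all four introduction phases jointly, against microsatellite and enzyme summaries): draw founder numbers from a prior, simulate a summary statistic under an invented bottleneck model, and keep the draws that reproduce the observed value within a tolerance.

```python
import numpy as np

rng = np.random.default_rng(5)

def abc_rejection(observed_het, n_draws=200_000, tolerance=0.01,
                  source_het=0.8, generations=20):
    """Rejection algorithm: draw founder numbers from the prior,
    simulate the summary statistic under a toy bottleneck model
    (heterozygosity decays by 1 - 1/(2N) per generation), and keep
    draws whose simulated statistic is within tolerance of the data."""
    founders = rng.integers(2, 500, size=n_draws)           # prior draws
    het = source_het * (1 - 1 / (2 * founders)) ** generations
    het += rng.normal(0, 0.01, n_draws)                     # sampling noise
    return founders[np.abs(het - observed_het) < tolerance]

posterior = abc_rejection(observed_het=0.55)
print(posterior.mean(), np.percentile(posterior, [2.5, 97.5]))
```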

Relevance:

30.00%

Publisher:

Abstract:

Binning and truncation of data are common in data analysis and machine learning. This paper addresses the problem of fitting mixture densities to multivariate binned and truncated data. The EM approach proposed by McLachlan and Jones (Biometrics, 44: 2, 571-578, 1988) for the univariate case is generalized to multivariate measurements. The multivariate solution requires the evaluation of multidimensional integrals over each bin at each iteration of the EM procedure. Naive implementation of the procedure can lead to computationally inefficient results. To reduce the computational cost a number of straightforward numerical techniques are proposed. Results on simulated data indicate that the proposed methods can achieve significant computational gains with no loss in the accuracy of the final parameter estimates. Furthermore, experimental results suggest that with a sufficient number of bins and data points it is possible to estimate the true underlying density almost as well as if the data were not binned. The paper concludes with a brief description of an application of this approach to diagnosis of iron deficiency anemia, in the context of binned and truncated bivariate measurements of volume and hemoglobin concentration from an individual's red blood cells.
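
A hedged univariate sketch of EM for binned data on a two-component normal mixture: the E-step uses exact bin probabilities from cdf differences, while, as a shortcut, the M-step approximates within-bin conditional means by bin midpoints instead of the exact integrals the paper works with (which in the multivariate case are the computationally expensive part).

```python
import numpy as np
from scipy.stats import norm

def em_binned(edges, counts, mu, sigma, pi, n_iter=200):
    """EM for a two-component normal mixture from binned counts.
    E-step: exact bin probabilities via cdf differences.
    M-step: bin midpoints stand in for within-bin integrals."""
    mids = 0.5 * (edges[:-1] + edges[1:])
    for _ in range(n_iter):
        # Probability of each bin under each component.
        P = np.array([norm.cdf(edges[1:], mu[k], sigma[k])
                      - norm.cdf(edges[:-1], mu[k], sigma[k]) for k in range(2)])
        # Posterior component weights per bin, then expected counts.
        tau = pi[:, None] * P
        tau /= tau.sum(axis=0, keepdims=True)
        w = tau * counts
        n_k = w.sum(axis=1)
        pi = n_k / n_k.sum()
        mu = (w @ mids) / n_k
        sigma = np.sqrt(((mids[None, :] - mu[:, None]) ** 2 * w).sum(axis=1) / n_k)
    return pi, mu, sigma

# Bin (and implicitly truncate) draws from a known mixture, then fit.
rng = np.random.default_rng(6)
data = np.concatenate([rng.normal(0, 1, 3000), rng.normal(5, 1.5, 2000)])
edges = np.linspace(-4, 11, 31)
counts, _ = np.histogram(data, edges)
print(em_binned(edges, counts, mu=np.array([-1.0, 6.0]),
                sigma=np.array([2.0, 2.0]), pi=np.array([0.5, 0.5])))
```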

Relevance:

30.00%

Publisher:

Abstract:

Motivation: This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic, used in conjunction with a threshold on the size of a cluster, allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. Results: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes can be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets.
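
A rough sketch of the gene-selection step, assuming normal rather than t mixtures and scikit-learn's fitting machinery: each gene is scored by the likelihood ratio statistic for one versus two components across the tissues, and genes exceeding a threshold are retained. The expression matrix and threshold are synthetic.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def gene_lrt(expression):
    """-2 log-likelihood-ratio of a two- versus one-component normal
    mixture over the tissue samples for one gene (the paper fits
    mixtures of t distributions instead)."""
    x = expression.reshape(-1, 1)
    ll1 = GaussianMixture(1, random_state=0).fit(x).score(x) * len(x)
    ll2 = GaussianMixture(2, n_init=3, random_state=0).fit(x).score(x) * len(x)
    return 2 * (ll2 - ll1)

def select_genes(data, threshold=8.0):
    """Keep genes whose likelihood-ratio statistic exceeds the
    threshold, i.e. genes showing evidence of clustering structure."""
    stats = np.array([gene_lrt(g) for g in data])
    return np.where(stats > threshold)[0], stats

# Toy matrix: 50 genes x 40 tissues; genes 0-4 separate two tissue groups.
rng = np.random.default_rng(7)
data = rng.normal(0, 1, size=(50, 40))
data[:5, 20:] += 3.0
selected, stats = select_genes(data)
print(selected)  # predominantly the first five genes
```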