Digital Library

40 results for regression algorithm

in University of Queensland eSpace - Australia

Better prediction of protein contact number using a support vector regression analysis of amino acid sequence

Relevance:

70.00%

Publisher:

Abstract:

Background: Protein tertiary structure can be partly characterized via each amino acid's contact number measuring how residues are spatially arranged. The contact number of a residue in a folded protein is a measure of its exposure to the local environment, and is defined as the number of C-beta atoms in other residues within a sphere around the C-beta atom of the residue of interest. Contact number is partly conserved between protein folds and thus is useful for protein fold and structure prediction. In turn, each residue's contact number can be partially predicted from primary amino acid sequence, assisting tertiary fold analysis from sequence data. In this study, we provide a more accurate contact number prediction method from protein primary sequence. Results: We predict contact number from protein sequence using a novel support vector regression algorithm. Using protein local sequences with multiple sequence alignments (PSI-BLAST profiles), we demonstrate a correlation coefficient between predicted and observed contact numbers of 0.70, which outperforms previously achieved accuracies. Including additional information about sequence weight and amino acid composition further improves prediction accuracies significantly with the correlation coefficient reaching 0.73. If residues are classified as being either contacted or non-contacted, the prediction accuracies are all greater than 77%, regardless of the choice of classification thresholds. Conclusion: The successful application of support vector regression to the prediction of protein contact number reported here, together with previous applications of this approach to the prediction of protein accessible surface area and B-factor profile, suggests that a support vector regression approach may be very useful for determining the structure-function relation between primary sequence and higher order consecutive protein structural and functional properties.
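The contact number defined in the abstract above can be computed directly from C-beta coordinates. The following is a minimal sketch of that definition only, not of the authors' support vector regression predictor; the function name `contact_numbers` and the default sphere radius are assumptions for illustration, since the abstract does not state the radius used.

```python
import math

def contact_numbers(cb_coords, radius=10.0):
    """For each residue, count the C-beta atoms of other residues
    lying within `radius` of its own C-beta atom.

    `radius` is an assumed value; the abstract does not give one.
    """
    counts = []
    for i, a in enumerate(cb_coords):
        n = 0
        for j, b in enumerate(cb_coords):
            # Skip the residue itself; count neighbours inside the sphere.
            if i != j and math.dist(a, b) <= radius:
                n += 1
        counts.append(n)
    return counts

# Hypothetical coordinates: three residues spaced 3.8 units apart on a line,
# plus one distant residue.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
print(contact_numbers(coords, radius=5.0))
```

A regression model such as the one described would then be trained to predict these counts from sequence-derived features.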

Genetic analysis of complex demographic scenarios: Spatially expanding populations of the cane toad, Bufo marinus

Relevance:

60.00%

Publisher:

Abstract:

Inferring the spatial expansion dynamics of invading species from molecular data is notoriously difficult due to the complexity of the processes involved. For these demographic scenarios, genetic data obtained from highly variable markers may be profitably combined with specific sampling schemes and information from other sources using a Bayesian approach. The geographic range of the introduced toad Bufo marinus is still expanding in eastern and northern Australia, in each case from isolates established around 1960. A large amount of demographic and historical information is available on both expansion areas. In each area, samples were collected along a transect representing populations of different ages and genotyped at 10 microsatellite loci. Five demographic models of expansion, differing in the dispersal pattern for migrants and founders and in the number of founders, were considered. Because the demographic history is complex, we used an approximate Bayesian method, based on a rejection-regression algorithm, to formally test the relative likelihoods of the five models of expansion and to infer demographic parameters. A stepwise migration-foundation model with founder events was statistically better supported than the other four models in both expansion areas. Posterior distributions supported different dynamics of expansion in the studied areas. Populations in the eastern expansion area have a lower stable effective population size and have been founded by a smaller number of individuals than those in the northern expansion area. Once demographically stabilized, populations exchange a substantial number of effective migrants per generation in both expansion areas, and such exchanges are larger in northern than in eastern Australia. The effective number of migrants appears to be considerably lower than that of founders in both expansion areas. We found our inferences to be relatively robust to various assumptions on marker, demographic, and historical features. The method presented here is the only robust, model-based method available so far that allows inferring complex population dynamics over a short time scale. It also provides the basis for investigating the interplay between population dynamics, drift, and selection in invasive species.
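To illustrate the rejection step of an approximate Bayesian method like the one mentioned above, here is a minimal sketch on a toy problem (estimating the mean of a normal distribution). It is not the authors' demographic model, and it omits the regression adjustment that a rejection-regression algorithm applies to the accepted draws. All names (`abc_rejection`, `sample_prior`, `simulate_stat`) and the toy prior are assumptions.

```python
import random
import statistics

def abc_rejection(observed_stat, sample_prior, simulate_stat, n_draws, n_keep, rng):
    """Keep the `n_keep` prior draws whose simulated summary statistic
    is closest to the observed one (rejection step only)."""
    draws = []
    for _ in range(n_draws):
        theta = sample_prior(rng)            # draw a parameter from the prior
        stat = simulate_stat(theta, rng)     # simulate data, reduce to a summary
        draws.append((abs(stat - observed_stat), theta))
    draws.sort(key=lambda d: d[0])           # closest simulations first
    return [theta for _, theta in draws[:n_keep]]

# Toy setup (hypothetical): uniform prior on the mean, summary = sample mean.
def sample_prior(rng):
    return rng.uniform(0.0, 10.0)

def simulate_stat(theta, rng):
    return statistics.mean([rng.gauss(theta, 1.0) for _ in range(20)])

accepted = abc_rejection(4.0, sample_prior, simulate_stat,
                         n_draws=5000, n_keep=100, rng=random.Random(0))
print(len(accepted), round(statistics.mean(accepted), 2))
```

The accepted values approximate a sample from the posterior; comparing acceptance rates across competing simulators is one common way such methods rank alternative models.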

Modelling inpatient length of stay by a hierarchical mixture regression via the EM algorithm

Relevance:

40.00%

Publisher:

Abstract:

The modelling of inpatient length of stay (LOS) has important implications in health care studies. Finite mixture distributions are usually used to model the heterogeneous LOS distribution, due to a certain proportion of patients sustaining a longer stay. However, because the morbidity data are collected from hospitals, observations clustered within the same hospital are often correlated. The generalized linear mixed model approach is adopted to accommodate the inherent correlation via unobservable random effects. An EM algorithm is developed to obtain residual maximum quasi-likelihood estimation. The proposed hierarchical mixture regression approach enables the identification and assessment of factors influencing the long-stay proportion and the LOS for the long-stay patient subgroup. A neonatal LOS data set is used for illustration. (C) 2003 Elsevier Science Ltd. All rights reserved.

An EM-based Semi-Parametric Mixture Model Approach to the Regression Analysis of Competing-Risks Data

Relevance:

30.00%

Publisher:

Abstract:

We consider a mixture model approach to the regression analysis of competing-risks data. Attention is focused on inference concerning the effects of factors on both the probability of occurrence and the hazard rate conditional on each of the failure types. These two quantities are specified in the mixture model using the logistic model and the proportional hazards model, respectively. We propose a semi-parametric mixture method to estimate the logistic and regression coefficients jointly, whereby the component-baseline hazard functions are completely unspecified. Estimation is based on maximum likelihood on the basis of the full likelihood, implemented via an expectation-conditional maximization (ECM) algorithm. Simulation studies are performed to compare the performance of the proposed semi-parametric method with a fully parametric mixture approach. The results show that when the component-baseline hazard is monotonic increasing, the semi-parametric and fully parametric mixture approaches are comparable for mildly and moderately censored samples. When the component-baseline hazard is not monotonic increasing, the semi-parametric method consistently provides less biased estimates than a fully parametric approach and is comparable in efficiency in the estimation of the parameters for all levels of censoring. The methods are illustrated using a real data set of prostate cancer patients treated with different dosages of the drug diethylstilbestrol. Copyright (C) 2003 John Wiley & Sons, Ltd.

Finite mixture regression model with random effects: application to neonatal hospital length of stay

Relevance:

30.00%

Publisher:

Abstract:

A two-component mixture regression model that allows simultaneously for heterogeneity and dependency among observations is proposed. By specifying random effects explicitly in the linear predictor of the mixture probability and the mixture components, parameter estimation is achieved by maximising the corresponding best linear unbiased prediction type log-likelihood. Approximate residual maximum likelihood estimates are obtained via an EM algorithm in the manner of a generalised linear mixed model (GLMM). The method can be extended to a g-component mixture regression model with the component density from the exponential family, leading to the development of the class of finite mixture GLMMs. For illustration, the method is applied to analyse neonatal length of stay (LOS). It is shown that identification of pertinent factors that influence hospital LOS can provide important information for health care planning and resource allocation. (C) 2002 Elsevier Science B.V. All rights reserved.

Multi-level zero-inflated Poisson regression modelling of correlated count data with excess zeros

Relevance:

30.00%

Publisher:

Abstract:

Count data with excess zeros relative to a Poisson distribution are common in many biomedical applications. A popular approach to the analysis of such data is to use a zero-inflated Poisson (ZIP) regression model. Often, because of the hierarchical study design or the data collection procedure, zero-inflation and lack of independence may occur simultaneously, which render the standard ZIP model inadequate. To account for the preponderance of zero counts and the inherent correlation of observations, a class of multi-level ZIP regression models with random effects is presented. Model fitting is facilitated using an expectation-maximization algorithm, whereas variance components are estimated via residual maximum likelihood estimating equations. A score test for zero-inflation is also presented. The multi-level ZIP model is then generalized to cope with a more complex correlation structure. Application to the analysis of correlated count data from a longitudinal infant feeding study illustrates the usefulness of the approach.
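As a brief illustration of what "excess zeros relative to a Poisson distribution" means, the sketch below evaluates the standard single-level ZIP probability mass function: a point mass at zero mixed with an ordinary Poisson count. It does not include the random effects or multi-level structure described in the abstract; the function name `zip_pmf` and the parameter names `lam` (Poisson mean) and `pi` (zero-inflation probability) are assumptions.

```python
import math

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson probability of observing count k.

    With probability `pi` the count is a structural zero; otherwise it
    is drawn from a Poisson distribution with mean `lam`.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

# Compare the zero probability with and without inflation (hypothetical values).
print(zip_pmf(0, lam=2.0, pi=0.0), zip_pmf(0, lam=2.0, pi=0.3))
```

In a regression setting, `lam` and `pi` would each be linked to covariates, typically through log and logit links respectively.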

Tuning pattern classifier parameters using a genetic algorithm with an application in mobile robotics

Relevance:

30.00%

Publisher:

Abstract:

Support vector machines (SVMs) have recently emerged as a powerful technique for solving problems in pattern classification and regression. Best performance is obtained from the SVM when its parameters have their values optimally set. In practice, good parameter settings are usually obtained by a lengthy process of trial and error. This paper describes the use of a genetic algorithm to evolve these parameter settings for an application in mobile robotics.

An improved seeded region growing algorithm

Relevance:

20.00%

Publisher:

Abstract:

Recently Adams and Bischof (1994) proposed a novel region growing algorithm for segmenting intensity images. The inputs to the algorithm are the intensity image and a set of seeds - individual points or connected components - that identify the individual regions to be segmented. The algorithm grows these seed regions until all of the image pixels have been assimilated. Unfortunately the algorithm is inherently dependent on the order of pixel processing. This means, for example, that raster order processing and anti-raster order processing do not, in general, lead to the same tessellation. In this paper we propose an improved seeded region growing algorithm that retains the advantages of the Adams and Bischof algorithm - fast execution, robust segmentation, and no tuning parameters - but is pixel order independent. (C) 1997 Elsevier Science B.V.

Predicting sprint running times from isokinetic and squat lift tests: A regression analysis

Relevance:

20.00%

Publisher:

Abstract:

This study examined the relationship between isokinetic hip extensor/hip flexor strength, 1-RM squat strength, and sprint running performance for both a sprint-trained and non-sprint-trained group. Eleven male sprinters and 8 male controls volunteered for the study. On the same day subjects ran 20-m sprints from both a stationary start and with a 50-m acceleration distance, completed isokinetic hip extension/flexion exercises at 1.05, 4.74, and 8.42 rad·s^-1, and had their squat strength estimated. Stepwise multiple regression analysis showed that equations for predicting both 20-m maximum velocity run time and 20-m acceleration time may be calculated with an error of less than 0.05 sec using only isokinetic and squat strength data. However, a single regression equation for predicting both 20-m acceleration and maximum velocity run times from isokinetic or squat tests was not found. The regression analysis indicated that hip flexor strength at all test velocities was a better predictor of sprint running performance than hip extensor strength.

Prediction of MHC class II-binding peptides using an evolutionary algorithm and artificial neural network

Relevance:

20.00%

Publisher:

Abstract:

Motivation: Prediction methods for identifying binding peptides could minimize the number of peptides required to be synthesized and assayed, and thereby facilitate the identification of potential T-cell epitopes. We developed a bioinformatic method for the prediction of peptide binding to MHC class II molecules. Results: Experimental binding data and expert knowledge of anchor positions and binding motifs were combined with an evolutionary algorithm (EA) and an artificial neural network (ANN): binding data extraction --> peptide alignment --> ANN training and classification. This method, termed PERUN, was implemented for the prediction of peptides that bind to HLA-DR4(B1*0401). The respective positive predictive values of PERUN predictions of high-, moderate-, low- and zero-affinity binders were assessed as 0.8, 0.7, 0.5 and 0.8 by cross-validation, and 1.0, 0.8, 0.3 and 0.7 by experimental binding. This illustrates the synergy between experimentation and computer modeling, and its application to the identification of potential immunotherapeutic peptides.

Likelihood-based estimation of the regression model with scrambled responses

Relevance:

20.00%

Publisher:

Abstract:

A significant problem in the collection of responses to potentially sensitive questions, such as those relating to illegal, immoral or embarrassing activities, is non-sampling error due to refusal to respond or false responses. Eichhorn & Hayre (1983) suggested the use of scrambled responses to reduce this form of bias. This paper considers a linear regression model in which the dependent variable is unobserved but for which the sum or product with a scrambling random variable of known distribution is known. The performance of two likelihood-based estimators is investigated, namely a Bayesian estimator achieved through a Markov chain Monte Carlo (MCMC) sampling scheme, and a classical maximum-likelihood estimator. These two estimators and an estimator suggested by Singh, Joarder & King (1996) are compared. Monte Carlo results show that the Bayesian estimator outperforms the classical estimators in almost all cases, and the relative performance of the Bayesian estimator improves as the responses become more scrambled.

A consistent point-searching algorithm for solution interpolation in unstructured meshes consisting of 4-node bilinear quadrilateral elements

Relevance:

20.00%

Publisher:

Abstract:

To translate and transfer solution data between two totally different meshes (i.e. mesh 1 and mesh 2), a consistent point-searching algorithm for solution interpolation in unstructured meshes consisting of 4-node bilinear quadrilateral elements is presented in this paper. The proposed algorithm has the following significant advantages: (1) The use of a point-searching strategy allows a point in one mesh to be accurately related to an element (containing this point) in another mesh. Thus, to translate/transfer the solution of any particular point from mesh 2 to mesh 1, only one element in mesh 2 needs to be inversely mapped. This certainly minimizes the number of elements to which the inverse mapping is applied. In this regard, the present algorithm is very effective and efficient. (2) Analytical solutions to the local coordinates of any point in a four-node quadrilateral element, which are derived in a rigorous mathematical manner in the context of this paper, make it possible to carry out an inverse mapping process very effectively and efficiently. (3) The use of consistent interpolation enables the interpolated solution to be compatible with an original solution and, therefore, guarantees the interpolated solution of extremely high accuracy. After the mathematical formulations of the algorithm are presented, the algorithm is tested and validated through a challenging problem. The related results from the test problem have demonstrated the generality, accuracy, effectiveness, efficiency and robustness of the proposed consistent point-searching algorithm. Copyright (C) 1999 John Wiley & Sons, Ltd.
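The inverse mapping and interpolation steps described above can be sketched for a single 4-node bilinear quadrilateral. Note that this sketch swaps in a Newton iteration to recover the local coordinates, whereas the paper derives analytical solutions that are not given in the abstract; the point-searching step (finding which element contains the point) is also omitted. All function names and the sample element are assumptions.

```python
def shape_functions(xi, eta):
    """Bilinear shape functions on the reference square [-1, 1] x [-1, 1],
    with nodes ordered counter-clockwise."""
    return [0.25 * (1 - xi) * (1 - eta), 0.25 * (1 + xi) * (1 - eta),
            0.25 * (1 + xi) * (1 + eta), 0.25 * (1 - xi) * (1 + eta)]

def forward_map(nodes, xi, eta):
    """Map local coordinates (xi, eta) to global (x, y)."""
    n = shape_functions(xi, eta)
    return (sum(ni * p[0] for ni, p in zip(n, nodes)),
            sum(ni * p[1] for ni, p in zip(n, nodes)))

def inverse_map(nodes, x, y, tol=1e-12, max_iter=50):
    """Recover (xi, eta) for a global point by Newton iteration
    (a stand-in for the paper's analytical inverse)."""
    xi = eta = 0.0
    for _ in range(max_iter):
        fx, fy = forward_map(nodes, xi, eta)
        rx, ry = x - fx, y - fy
        if abs(rx) + abs(ry) < tol:
            break
        d_xi = [-0.25 * (1 - eta), 0.25 * (1 - eta), 0.25 * (1 + eta), -0.25 * (1 + eta)]
        d_eta = [-0.25 * (1 - xi), -0.25 * (1 + xi), 0.25 * (1 + xi), 0.25 * (1 - xi)]
        j11 = sum(d * p[0] for d, p in zip(d_xi, nodes))   # dx/dxi
        j12 = sum(d * p[0] for d, p in zip(d_eta, nodes))  # dx/deta
        j21 = sum(d * p[1] for d, p in zip(d_xi, nodes))   # dy/dxi
        j22 = sum(d * p[1] for d, p in zip(d_eta, nodes))  # dy/deta
        det = j11 * j22 - j12 * j21
        xi += (j22 * rx - j12 * ry) / det
        eta += (-j21 * rx + j11 * ry) / det
    return xi, eta

def interpolate(nodes, values, x, y):
    """Interpolate nodal `values` at global point (x, y) inside the element."""
    xi, eta = inverse_map(nodes, x, y)
    return sum(ni * v for ni, v in zip(shape_functions(xi, eta), values))

# Hypothetical convex element carrying a linear field f(x, y) = 2x + 3y + 1.
nodes = [(0.0, 0.0), (2.0, 0.0), (3.0, 2.0), (0.0, 1.0)]
values = [2 * px + 3 * py + 1 for px, py in nodes]
print(interpolate(nodes, values, 1.0, 0.5))
```

Using the same shape functions for both geometry and solution is what lets the interpolated value stay compatible with the original finite element solution.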

Trial-of-antibiotic algorithm for the diagnosis of tuberculosis in a district hospital in a developing country with high HIV prevalence

Relevance:

20.00%

Publisher:

Abstract:

OBJECTIVE: To evaluate a diagnostic algorithm for pulmonary tuberculosis based on smear microscopy and objective response to trial of antibiotics. SETTING: Adult medical wards, Hlabisa Hospital, South Africa, 1996-1997. METHODS: Adults with chronic chest symptoms and abnormal chest X-ray had sputum examined for Ziehl-Neelsen stained acid-fast bacilli by light microscopy. Those with negative smears were treated with amoxycillin for 5 days and assessed. Those who had not improved were treated with erythromycin for 5 days and reassessed. Response was compared with mycobacterial culture. RESULTS: Of 280 suspects who completed the diagnostic pathway, 160 (57%) had a positive smear, 46 (17%) responded to amoxycillin, 34 (12%) responded to erythromycin and 40 (14%) were treated as smear-negative tuberculosis. The sensitivity (89%) and specificity (84%) of the full algorithm for culture-positive tuberculosis were high. However, 11 patients (positive predictive value [PPV] 95%) were incorrectly diagnosed with tuberculosis, and 24 cases of tuberculosis (negative predictive value [NPV] 70%) were not identified. NPV improved to 75% when anaemia was included as a predictor. Algorithm performance was independent of human immunodeficiency virus status. CONCLUSION: Sputum smear microscopy plus trial of antibiotic algorithm among a selected group of tuberculosis suspects may increase diagnostic accuracy in district hospitals in developing countries.
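The accuracy figures quoted above can be checked with a short calculation. The sketch below rebuilds a 2x2 table from the counts in the abstract, under the assumption (not stated explicitly there) that the 160 smear-positive and 40 smear-negative-treated patients count as algorithm-positive, the 80 antibiotic responders as algorithm-negative, and culture as the reference standard. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 diagnostic table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Counts taken from the abstract; the grouping is an assumed reading of it.
diagnosed_tb = 160 + 40      # smear-positive + treated as smear-negative TB
not_diagnosed = 46 + 34      # responded to amoxycillin or erythromycin
false_pos, false_neg = 11, 24

metrics = diagnostic_metrics(
    tp=diagnosed_tb - false_pos,
    fp=false_pos,
    fn=false_neg,
    tn=not_diagnosed - false_neg,
)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under this reading, the computed values agree with the reported sensitivity, specificity, PPV and NPV after rounding.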

Minimum-order stable recursive filter design via the genetic algorithm approach

Relevance:

20.00%

Publisher:

Abstract:

In this paper, the minimum-order stable recursive filter design problem is proposed and investigated. This problem plays an important role in pipeline implementations in signal processing. Here, the existence of a high-order stable recursive filter is proved theoretically, in which the upper bound for the highest order of stable filters is given. Then the minimum-order stable linear predictor is obtained via solving an optimization problem. In this paper, the popular genetic algorithm approach is adopted since it is a heuristic probabilistic optimization technique and has been widely used in engineering designs. Finally, an illustrative example is used to show the effectiveness of the proposed algorithm.

An equivalent algorithm for simulating thermal effects of magma intrusion problems in porous rocks

Relevance:

20.00%

Publisher:

Abstract:

An equivalent algorithm is proposed to simulate thermal effects of the magma intrusion in geological systems, which are composed of porous rocks. Based on the physical and mathematical equivalence, the original magma solidification problem with a moving boundary between the rock and intruded magma is transformed into a new problem without the moving boundary but with a physically equivalent heat source. From the analysis of an ideal solidification model, the physically equivalent heat source has been determined in this paper. The major advantage in using the proposed equivalent algorithm is that the fixed finite element mesh with a variable integration time step can be employed to simulate the thermal effect of the intruded magma solidification using the conventional finite element method. The related numerical results have demonstrated the correctness and usefulness of the proposed equivalent algorithm for simulating the thermal effect of the intruded magma solidification in geological systems. (C) 2003 Elsevier B.V. All rights reserved.
