142 results for Regression methods
in University of Queensland eSpace - Australia
Abstract:
Little consensus exists in the literature regarding methods for determination of the onset of electromyographic (EMG) activity. The aim of this study was to compare the relative accuracy of a range of computer-based techniques with respect to EMG onset determined visually by an experienced examiner. Twenty-seven methods were compared which varied in terms of EMG processing (low pass filtering at 10, 50 and 500 Hz), threshold value (1, 2 and 3 SD beyond mean of baseline activity) and the number of samples for which the mean must exceed the defined threshold (20, 50 and 100 ms). Three hundred randomly selected trials of a postural task were evaluated using each technique. The visual determination of EMG onset was found to be highly repeatable between days. Linear regression equations were calculated for the values selected by each computer method which indicated that the onset values selected by the majority of the parameter combinations deviated significantly from the visually derived onset values. Several methods accurately selected the time of onset of EMG activity and are recommended for future use. Copyright (C) 1996 Elsevier Science Ireland Ltd.
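The threshold-based onset rule described in this abstract can be sketched as follows. This is a minimal illustration, not the study's implementation: the function name `detect_onset`, its parameters, and the choice to report the start of the first qualifying window are all assumptions, and the input is assumed to be already rectified and low-pass filtered.

```python
from statistics import mean, stdev


def detect_onset(emg, fs_hz, baseline_n, n_sd=2.0, window_ms=50.0):
    """Return the index of the first window whose mean exceeds the threshold.

    emg        : rectified, low-pass filtered EMG samples (assumed preprocessed)
    fs_hz      : sampling rate in Hz
    baseline_n : number of leading samples treated as baseline activity
    n_sd       : threshold in standard deviations beyond the baseline mean
    window_ms  : window length over which the mean must exceed the threshold
    Returns None when no window qualifies.
    """
    baseline = emg[:baseline_n]
    threshold = mean(baseline) + n_sd * stdev(baseline)
    # Convert the window length from milliseconds to a sample count.
    win = max(1, int(round(window_ms * fs_hz / 1000.0)))
    for start in range(baseline_n, len(emg) - win + 1):
        if mean(emg[start:start + win]) > threshold:
            return start  # hypothetical convention: onset = window start
    return None


if __name__ == "__main__":
    quiet = [0.9 if i % 2 == 0 else 1.1 for i in range(200)]
    active = [5.0] * 100
    print(detect_onset(quiet + active, fs_hz=1000, baseline_n=100))
```

Because the window looks forward, the reported index can precede the true onset by up to one window length, which is one way parameter choices of this kind may bias a method relative to visual determination.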
Abstract:
Purpose: To determine whether constriction of proximal arterial vessels precedes involution of the distal hyaloid vasculature in the mouse, under normal conditions, and whether this vasoconstriction is less pronounced when the distal hyaloid network persists, as it does in oxygen-induced retinopathy (OIR). Methods: Photomicrographs of the vasa hyaloidea propria were analysed from preterm pups (1-2 days prior to birth), and on Days 1-11 post-birth. The OIR model involved exposing pups to ~90% O2 from D1-5, followed by return to ambient air. At sampling times pups were anaesthetised and perfused with india ink. Retinal flatmounts were also incubated with FITC-lectin (BS-1, G. simplicifolia); this labels all vessels, allowing identification of vessels not patent to the perfusate. Results: Mean diameter of proximal hyaloid vessels in preterm pups was 25.44 ± 1.98 µm (± 1 SEM). Within 3-12 hrs of birth, significant vasoconstriction was evident (diameter: 12.45 ± 0.88 µm), and normal hyaloid regression subsequently occurred. Similar vasoconstriction occurred in the O2-treated group, but this was reversed upon return to room air, with significant dilation of proximal vessels by D7 (diameter: 31.75 ± 11.99 µm), and distal hyaloid vessels subsequently became enlarged and tortuous. Conclusions: Under normal conditions, vasoconstriction of proximal hyaloid vessels occurs at birth, preceding attenuation of distal hyaloid vessels. Vasoconstriction also occurs in O2-treated pups during treatment, but upon return to room air, the remaining hyaloid vessels dilate proximally, and the distal vessels become dilated and tortuous. These observations support the contention that regression of the hyaloid network is dependent, in the first instance, on proximal arterial vasoconstriction.
Abstract:
Improvements to peroxide oxidation methods for analysing acid sulfate soils (ASS) are introduced. The soil:solution ratio has been increased to 1:40, titrations are performed in suspension, and the duration of the peroxide digest stage is substantially shortened. For 9 acid sulfate soils, the peroxide oxidisable sulfur value obtained using the improved method was compared with the reduced inorganic sulfur result obtained using the chromium reducible sulfur method. Their regression was highly significant, the slope of the regression line was not significantly different (P = 0.05) from unity, and the intercept was not significantly different from zero. A complete sulfur budget for the improved method showed there was no loss of sulfur as has been reported for earlier peroxide oxidation techniques. When soils were very finely ground, efficient oxidation of sulfides was achieved, despite the milder digestion conditions. Highly sulfidic and organic soils were shown to be the most difficult to analyse using either the improved method or the chromium method. No single analytical method can be universally applied to all ASS; rather, a suite of methods is necessary for a thorough understanding of many ASS. The improved peroxide method, in combination with the chromium method and the 4 M HCl extraction, forms a sound platform for informed decision making on the management of acid sulfate soils.
Abstract:
We consider a mixture model approach to the regression analysis of competing-risks data. Attention is focused on inference concerning the effects of factors on both the probability of occurrence and the hazard rate conditional on each of the failure types. These two quantities are specified in the mixture model using the logistic model and the proportional hazards model, respectively. We propose a semi-parametric mixture method to estimate the logistic and regression coefficients jointly, whereby the component-baseline hazard functions are completely unspecified. Estimation is by maximum likelihood based on the full likelihood, implemented via an expectation-conditional maximization (ECM) algorithm. Simulation studies are performed to compare the performance of the proposed semi-parametric method with a fully parametric mixture approach. The results show that when the component-baseline hazard is monotonic increasing, the semi-parametric and fully parametric mixture approaches are comparable for mildly and moderately censored samples. When the component-baseline hazard is not monotonic increasing, the semi-parametric method consistently provides less biased estimates than a fully parametric approach and is comparable in efficiency in the estimation of the parameters for all levels of censoring. The methods are illustrated using a real data set of prostate cancer patients treated with different dosages of the drug diethylstilbestrol. Copyright (C) 2003 John Wiley & Sons, Ltd.
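One common way to write down a mixture model of this kind is sketched below. The notation is assumed for illustration and is not taken from the paper: g failure types, covariate vector x, logistic coefficients (a_i, b_i), hazard regression coefficients beta_i, and unspecified component-baseline hazards h_{0i}(t).

```latex
% Assumed notation: g failure types, covariates x, baseline hazards h_{0i}(t)
\begin{align}
\pi_i(x) &= \frac{\exp(a_i + b_i^{\top} x)}{1 + \sum_{l=1}^{g-1} \exp(a_l + b_l^{\top} x)},
  \quad i = 1, \dots, g-1, \\
\pi_g(x) &= 1 - \sum_{l=1}^{g-1} \pi_l(x), \\
h_i(t \mid x) &= h_{0i}(t) \exp(\beta_i^{\top} x), \\
S(t \mid x) &= \sum_{i=1}^{g} \pi_i(x)\, S_i(t \mid x),
  \qquad S_i(t \mid x) = \exp\!\left( -\int_0^t h_i(u \mid x)\, du \right).
\end{align}
```

Here the logistic part gives the probability that failure is of type i, and the proportional hazards part gives the hazard conditional on that type. Leaving each h_{0i}(t) unspecified is what would make such a formulation semi-parametric.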
Abstract:
Background and Objective: To examine whether commonly recommended assumptions for multivariable logistic regression are addressed in two major epidemiological journals. Methods: Ninety-nine articles from the Journal of Clinical Epidemiology and the American Journal of Epidemiology were surveyed for 10 criteria: six dealing with computation and four with reporting multivariable logistic regression results. Results: Three of the 10 criteria were addressed in 50% or more of the articles. Statistical significance testing or confidence intervals were reported in all articles. Methods for selecting independent variables were described in 82%, and specific procedures used to generate the models were discussed in 65%. Fewer than 50% of the articles indicated whether interactions were tested or met the recommended events per independent variable ratio of 10:1. Fewer than 20% of the articles described conformity to a linear gradient, examined collinearity, reported information on validation procedures, goodness-of-fit, discrimination statistics, or provided complete information on variable coding. There was no significant difference (P > .05) in the proportion of articles meeting the criteria across the two journals. Conclusion: Articles reviewed frequently did not report commonly recommended assumptions for using multivariable logistic regression. (C) 2004 Elsevier Inc. All rights reserved.
Abstract:
Background: Protein tertiary structure can be partly characterized via each amino acid's contact number measuring how residues are spatially arranged. The contact number of a residue in a folded protein is a measure of its exposure to the local environment, and is defined as the number of C-beta atoms in other residues within a sphere around the C-beta atom of the residue of interest. Contact number is partly conserved between protein folds and thus is useful for protein fold and structure prediction. In turn, each residue's contact number can be partially predicted from primary amino acid sequence, assisting tertiary fold analysis from sequence data. In this study, we provide a more accurate contact number prediction method from protein primary sequence. Results: We predict contact number from protein sequence using a novel support vector regression algorithm. Using protein local sequences with multiple sequence alignments (PSI-BLAST profiles), we demonstrate a correlation coefficient between predicted and observed contact numbers of 0.70, which outperforms previously achieved accuracies. Including additional information about sequence weight and amino acid composition further improves prediction accuracies significantly with the correlation coefficient reaching 0.73. If residues are classified as being either contacted or non-contacted, the prediction accuracies are all greater than 77%, regardless of the choice of classification thresholds. Conclusion: The successful application of support vector regression to the prediction of protein contact number reported here, together with previous applications of this approach to the prediction of protein accessible surface area and B-factor profile, suggests that a support vector regression approach may be very useful for determining the structure-function relation between primary sequence and higher order consecutive protein structural and functional properties.
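The contact number definition given in this abstract (count of C-beta atoms of other residues within a sphere around a residue's own C-beta atom) can be illustrated with a short sketch. The function name `contact_numbers` is hypothetical, and the sphere radius is left as a required parameter because the abstract does not state the value used.

```python
from math import dist


def contact_numbers(cb_coords, radius):
    """Count, for each residue, the C-beta atoms of other residues within `radius`.

    cb_coords : list of (x, y, z) C-beta coordinates, one per residue
    radius    : sphere radius in the same units as the coordinates
                (the value is an assumption left to the caller)
    """
    counts = []
    for i, centre in enumerate(cb_coords):
        n = sum(
            1
            for j, other in enumerate(cb_coords)
            if j != i and dist(centre, other) <= radius
        )
        counts.append(n)
    return counts


if __name__ == "__main__":
    # Toy coordinates on a line, purely illustrative.
    coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
    print(contact_numbers(coords, radius=4.0))
```

This brute-force loop is quadratic in the number of residues, which is adequate for single proteins. Values computed this way from known structures would serve as the observed targets that a sequence-based regression method tries to predict.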
Abstract:
Background: Regression to the mean (RTM) is a statistical phenomenon that can make natural variation in repeated data look like real change. It happens when unusually large or small measurements tend to be followed by measurements that are closer to the mean. Methods: We give some examples of the phenomenon, and discuss methods to overcome it at the design and analysis stages of a study. Results: The effect of RTM in a sample becomes more noticeable with increasing measurement error and when follow-up measurements are only examined on a sub-sample selected using a baseline value. Conclusions: RTM is a ubiquitous phenomenon in repeated data and should always be considered as a possible cause of an observed change. Its effect can be alleviated through better study design and use of suitable statistical methods.
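A small simulation can make the phenomenon concrete. The sketch below is illustrative only: the function name `simulate_rtm` and all numeric defaults (population mean, standard deviations, selection cutoff) are assumptions, not values from the paper. Each subject has a stable true value, both measurements add independent error, and only subjects whose baseline exceeds a cutoff are followed up.

```python
import random
from statistics import mean


def simulate_rtm(n=20000, true_mean=100.0, true_sd=10.0,
                 error_sd=10.0, cutoff=120.0, seed=0):
    """Return (mean baseline, mean follow-up) for subjects selected on baseline.

    No real change occurs between the two measurements; any difference
    between the returned means comes from selection plus measurement error.
    """
    rng = random.Random(seed)
    baseline_selected = []
    followup_selected = []
    for _ in range(n):
        true_value = rng.gauss(true_mean, true_sd)
        baseline = true_value + rng.gauss(0.0, error_sd)
        followup = true_value + rng.gauss(0.0, error_sd)
        if baseline > cutoff:  # sub-sample chosen using the baseline value
            baseline_selected.append(baseline)
            followup_selected.append(followup)
    return mean(baseline_selected), mean(followup_selected)


if __name__ == "__main__":
    for err in (1.0, 10.0):
        b, f = simulate_rtm(error_sd=err)
        print(f"error_sd={err}: baseline={b:.1f}, follow-up={f:.1f}, drop={b - f:.1f}")
```

In this setup the selected group's follow-up mean falls back toward the population mean although nothing has changed, and the drop grows as the measurement error increases, in line with the abstract's results.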
Abstract:
Background: The residue-wise contact order (RWCO) describes the sequence separations between the residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. Results: We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences. Moreover, by incorporating global features such as molecular weight and amino acid composition we could further improve the prediction performance, raising the CC to 0.57 with an RMSE of 0.79. In addition, combining the secondary structure predicted by PSIPRED was found to significantly improve the prediction performance and could yield the best prediction accuracy with a CC of 0.60 and RMSE of 0.78, a performance at least comparable with that of other existing methods. Conclusion: The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating the protein structural profiles from amino acid sequences.
Abstract:
Promoted-ignition testing on carbon steel rods of varying cross-sectional area and shape was performed in high pressure oxygen to assess the effect of sample geometry on the regression rate of the melting interface. Cylindrical and rectangular geometries and three different cross sections were tested and the regression rates of the cylinders were compared to the regression rates of the rectangular samples at test pressures around 6.9 MPa. Tests were recorded and video analysis used to determine the regression rate of the melting interface by a new method based on a drop cycle which was found to provide a good basis for statistical analysis and provide excellent agreement to the standard averaging methods used. Both geometries tested showed the typical trend of decreasing regression rate of the melting interface with increasing cross-sectional area; however, it was shown that the effect of geometry is more significant as the sample's cross sections become larger. Discussion is provided regarding the use of 3.2-mm square rods rather than 3.2-mm cylindrical rods within the standard ASTM test and any effect this may have on the observed regression rate of the melting interface.
Abstract:
This paper critically assesses several loss allocation methods based on the type of competition each method promotes. This understanding assists in determining which method will promote more efficient network operations when implemented in deregulated electricity industries. The methods addressed in this paper include the pro rata [1], proportional sharing [2], loss formula [3], incremental [4], and a new method proposed by the authors of this paper, which is loop-based [5]. These methods are tested on a modified Nordic 32-bus network, where different case studies of different operating points are investigated. The varying results obtained for each allocation method at different operating points make it possible to distinguish methods that promote unhealthy competition from those that encourage better system operation.
Abstract:
We propose quadrature rules for the approximation of line integrals possessing logarithmic singularities and show their convergence. In some instances a superconvergence rate is demonstrated.
Abstract:
Data mining is the process of identifying valid, implicit, previously unknown, potentially useful and understandable information from large databases. It is an important step in the process of knowledge discovery in databases (Olaru & Wehenkel, 1999). In a data mining process, input data can be structured, semi-structured, or unstructured. Data can be text, categorical or numerical values. One of the important characteristics of data mining is its ability to deal with data that are large in volume, distributed, time variant, noisy, and high dimensional. A large number of data mining algorithms have been developed for different applications. For example, association rules mining can be useful for market basket problems, clustering algorithms can be used to discover trends in unsupervised learning problems, classification algorithms can be applied in decision-making problems, and sequential and time series mining algorithms can be used in predicting events, fault detection, and other supervised learning problems (Vapnik, 1999). Classification is among the most important tasks in data mining, particularly for data mining applications in engineering fields. Together with regression, classification is mainly used for predictive modelling. So far, a number of classification algorithms have been used in practice. According to Sebastiani (2002), the main classification algorithms can be categorized as: decision tree and rule based approaches such as C4.5 (Quinlan, 1996); probability methods such as Bayesian classifier (Lewis, 1998); on-line methods such as Winnow (Littlestone, 1988) and CVFDT (Hulten, 2001); neural network methods (Rumelhart, Hinton & Williams, 1986); example-based methods such as k-nearest neighbors (Duda & Hart, 1973), and SVM (Cortes & Vapnik, 1995). Other important techniques for classification tasks include Associative Classification (Liu et al., 1998) and Ensemble Classification (Tumer, 1996).
Abstract:
There are many techniques for electricity market price forecasting. However, most of them are designed for expected price analysis rather than price spike forecasting. An effective method of predicting the occurrence of spikes has not yet been observed in the literature. In this paper, a data mining based approach is presented to give a reliable forecast of the occurrence of price spikes. Combined with the spike value prediction techniques developed by the same authors, the proposed approach aims at providing a comprehensive tool for price spike forecasting. Feature selection techniques are first described to identify the attributes relevant to the occurrence of spikes. A simple introduction to the classification techniques is given for completeness. Two algorithms, support vector machine and probability classifier, are chosen as the spike occurrence predictors and are discussed in detail. Realistic market data are used to test the proposed model with promising results.