945 results for multivariate regression tree
Abstract:
The multivariate skew-t distribution (J Multivar Anal 79:93-113, 2001; J R Stat Soc, Ser B 65:367-389, 2003; Statistics 37:359-363, 2003) includes the Student t, skew-Cauchy and Cauchy distributions as special cases and the normal and skew-normal ones as limiting cases. In this paper, we explore the use of Markov Chain Monte Carlo (MCMC) methods to develop a Bayesian analysis of repeated measures, pretest/post-test data, under a multivariate null intercept measurement error model (J Biopharm Stat 13(4):763-771, 2003) where the random errors and the unobserved value of the covariate (latent variable) follow a Student t and a skew-t distribution, respectively. The results and methods are numerically illustrated with an example in the field of dentistry.
Abstract:
Considering the Wald, score, and likelihood ratio asymptotic test statistics, we analyze a multivariate null intercept errors-in-variables regression model, where the explanatory and the response variables are subject to measurement errors, and a possible structure of dependency between the measurements taken within the same individual is incorporated, representing a longitudinal structure. This model was proposed by Aoki et al. (2003b) and analyzed under the Bayesian approach. In this article, considering the classical approach, we analyze asymptotic test statistics and present a simulation study to compare the behavior of the three test statistics for different sample sizes, parameter values and nominal levels of the test. Also, closed-form expressions for the score function and the Fisher information matrix are presented. We consider two real numerical illustrations: the odontological data set from Hadgu and Koch (1999) and a quality control data set.
Abstract:
The skew-normal distribution is a class of distributions that includes the normal distribution as a special case. In this paper, we explore the use of Markov Chain Monte Carlo (MCMC) methods to develop a Bayesian analysis in a multivariate, null intercept, measurement error model [R. Aoki, H. Bolfarine, J.A. Achcar, and D. Leao Pinto Jr, Bayesian analysis of a multivariate null intercept error-in-variables regression model, J. Biopharm. Stat. 13(4) (2003b), pp. 763-771] where the unobserved value of the covariate (latent variable) follows a skew-normal distribution. The results and methods are applied to a real dental clinical trial presented in [A. Hadgu and G. Koch, Application of generalized estimating equations to a dental randomized clinical trial, J. Biopharm. Stat. 9 (1999), pp. 161-178].
Abstract:
The main purpose of this work is to study the behaviour of Skovgaard's [Skovgaard, I.M., 2001. Likelihood asymptotics. Scandinavian Journal of Statistics 28, 3-32] adjusted likelihood ratio statistic in testing simple hypotheses in a new class of regression models proposed here. The proposed class of regression models considers Dirichlet distributed observations, and the parameters that index the Dirichlet distributions are related to covariates and unknown regression coefficients. This class is useful for modelling data consisting of multivariate positive observations summing to one and generalizes the beta regression model described in Vasconcellos and Cribari-Neto [Vasconcellos, K.L.P., Cribari-Neto, F., 2005. Improved maximum likelihood estimation in a new class of beta regression models. Brazilian Journal of Probability and Statistics 19, 13-31]. We show that, for our model, Skovgaard's adjusted likelihood ratio statistic has a simple compact form that can be easily implemented in standard statistical software. The adjusted statistic is approximately chi-squared distributed with a high degree of accuracy. Some numerical simulations show that the modified test is more reliable in finite samples than the usual likelihood ratio procedure. An empirical application is also presented and discussed. (C) 2009 Elsevier B.V. All rights reserved.
Abstract:
Parkinson's disease (PD) is a degenerative illness whose cardinal symptoms include rigidity, tremor, and slowness of movement. In addition to its widely recognized effects, PD can have a profound effect on speech and voice. The speech symptoms most commonly demonstrated by patients with PD are reduced vocal loudness, monopitch, disruptions of voice quality, and abnormally fast rate of speech. This cluster of speech symptoms is often termed hypokinetic dysarthria. The disease can be difficult to diagnose accurately, especially in its early stages. For this reason, automatic techniques based on artificial intelligence should increase diagnostic accuracy and help doctors make better decisions. The aim of this thesis work is to predict PD from audio files collected from various patients. The audio files are preprocessed to extract the features. The preprocessed data contain 23 attributes and 195 instances, with on average six voice recordings per person. The number of instances can be reduced using a data compression technique such as the Discrete Cosine Transform (DCT). After data compression, attribute selection is performed using several WEKA built-in methods such as ChiSquared, GainRatio, and InfoGain. After identifying the important attributes, we evaluate the attributes one by one using stepwise regression. Based on the selected attributes, we run a cost-sensitive classifier in WEKA with various algorithms such as MultiPass LVQ, Logistic Model Tree (LMT), and K-Star. The classification results show an average of 80%. Using these features, approximately 95% classification of PD is achieved. This shows that, using the audio dataset, PD could be predicted with a higher level of accuracy.
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
The horizontal and vertical structure of the tree component was investigated in a stretch of lower montane Atlantic Forest through a phytosociological survey in two sample blocks of 0.99 ha each in the Parque Estadual Intervales. All individuals with a diameter at breast height (DBH) > 5 cm were recorded. A total of 3,078 individuals distributed among 172 species were sampled. The Shannon diversity index was H' = 3.85 nats/individual. The family Myrtaceae stood out in both number of species (38) and number of individuals (745) in the survey. Euterpe edulis Mart. had the highest importance value (33.98%), accounting for 21.8% of all individuals recorded. The quantitative similarity index was higher than the qualitative one, showing little structural variation between the sample blocks, but the large number of low-abundance species resulted in pronounced floristic differences between them. A detrended correspondence analysis (DCA) generated three arbitrary vertical strata. Stratum A (> 26 m) had the lowest density and was well represented by the species Sloanea guianensis (Aubl.) Benth. and Virola bicuhyba (Schott. ex A.DC.) Warb. Stratum B (8 m < h < 26 m) showed the greatest floristic richness and diversity, and stratum C (< 8 m) the highest density. Euterpe edulis, Guapira opposita (Vell.) Reitz, Garcinia gardneriana (Planch. & Triana) Zappi and Eugenia mosenii (Kausel) Sobral were well represented in strata B and C of the forest. The existence of vertical strata in tropical forests is discussed, and the use of DCA is recommended for studies of vertical stratification in other tropical forests.
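For readers unfamiliar with the Shannon index reported in this abstract, the following is a minimal sketch of the standard formula H' = -Σ p_i ln p_i, where p_i is the share of individuals belonging to species i. The function name and the example counts are hypothetical and are not taken from the survey.

```python
import math


def shannon_index(abundances):
    """Shannon diversity H' in nats per individual from per-species counts."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must contain at least one individual")
    # Species with zero individuals are skipped: p * ln(p) tends to 0 as p -> 0.
    return -sum((n / total) * math.log(n / total) for n in abundances if n > 0)


# Hypothetical counts for four species (illustrative only, not survey data).
example_counts = [50, 30, 15, 5]
h_prime = shannon_index(example_counts)
```

With S equally abundant species the index reaches its maximum, ln S, so a value of 3.85 for 172 species (ln 172 is about 5.15) is consistent with a community in which a few species dominate.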
Abstract:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Abstract:
Several physicochemical parameters of Brazilian commercial gasoline, such as relative density, distillation curve (temperatures related to 10%, 50% and 90% of distilled volume, final boiling point and residue), octane numbers (motor and research octane number and anti-knock index), hydrocarbon composition (olefins, aromatics and saturates) and anhydrous ethanol and benzene content, were predicted from chromatographic profiles obtained by gas chromatography with flame ionization detection (GC-FID) using partial least squares (PLS) regression. GC-FID is a technique intensively used for fuel quality control due to its convenience, speed, accuracy and simplicity, and its profiles are much easier to interpret and understand than results produced by other techniques. Another advantage is that it permits association with multivariate methods of analysis, such as PLS. The chromatogram profiles were recorded and used to develop PLS models for each property. The standard error of prediction (SEP) was the main parameter considered in selecting the "best model". Most of the GC-FID-PLS results, when compared with those obtained by the Brazilian Government Petroleum, Natural Gas and Biofuels Agency - ANP Regulation 309 specification methods, were very good. In general, all PLS models developed in this work provide unbiased predictions with low standard errors of prediction and percentage average relative errors (below 11.5 and 5.0, respectively). (C) 2007 Elsevier B.V. All rights reserved.
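The two figures of merit named in this abstract can be sketched as follows. The abstract does not give the exact formulas, so the definitions here are assumptions: SEP is taken as the root mean squared difference between predicted and reference values (some authors subtract the bias or divide by n - 1 instead), and the percentage average relative error as the mean of |predicted - reference| / |reference| times 100. All function names and numbers are hypothetical.

```python
import math


def standard_error_of_prediction(reference, predicted):
    """Assumed definition: sqrt(sum((predicted - reference)^2) / n)."""
    n = len(reference)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - r) ** 2 for r, p in zip(reference, predicted))
    return math.sqrt(squared / n)


def average_relative_error_percent(reference, predicted):
    """Assumed definition: mean of |predicted - reference| / |reference|, in %."""
    n = len(reference)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = (abs(p - r) / abs(r) for r, p in zip(reference, predicted))
    return 100.0 * sum(ratios) / n


# Hypothetical octane numbers: reference method vs. model prediction.
reference_values = [90.0, 92.0, 95.0, 97.0]
predicted_values = [91.0, 91.0, 96.0, 96.0]
sep = standard_error_of_prediction(reference_values, predicted_values)
are = average_relative_error_percent(reference_values, predicted_values)
```

SEP is expressed in the units of the property, while the relative error is dimensionless, which is why the abstract quotes separate limits for the two.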
Multivariate quality control studies applied to Ca(II) and Mg(II) determination by a portable method
Abstract:
A portable or field test method for simultaneous spectrophotometric determination of calcium and magnesium in water using multivariate partial least squares (PLS) calibration methods is proposed. The method is based on the reaction between the analytes and methylthymol blue at pH 11. The spectral information was used as the X-block, and the Ca(II) and Mg(II) concentrations obtained by a reference technique (ICP-AES) were used as the Y-block. Two series of analyses were performed, with a month's difference between them. The first series was used as the calibration set and the second one as the validation set. Multivariate statistical process control (MSPC) techniques, based on statistics from principal component models, were used to study the features and evolution with time of the spectral signals. Signal standardization was used to correct the deviations between series. Method validation was performed by comparing the predictions of the PLS model with the reference Ca(II) and Mg(II) concentrations determined by ICP-AES using the joint interval test for the slope and intercept of the regression line with errors in both axes. (C) 1998 John Wiley & Sons, Ltd.
Abstract:
This paper reports the use of chromatographic profiles of volatiles to determine disease markers in plants, in this case leaves of Eucalyptus globulus contaminated by the necrotrophic fungus Teratosphaeria nubilosa. The volatile fraction was isolated by headspace solid phase microextraction (HS-SPME) and analyzed by comprehensive two-dimensional gas chromatography-fast quadrupole mass spectrometry (GC×GC-qMS). To correlate the metabolic profile described by the chromatograms with the presence of the infection, unfolded-partial least squares discriminant analysis (U-PLS-DA) with orthogonal signal correction (OSC) was employed. The proposed method was verified to be independent of factors such as the age of the harvested plants. The manipulation of the mathematical model obtained also resulted in graphic representations similar to real chromatograms, which allowed the tentative identification of more than 40 compounds potentially useful as disease biomarkers for this plant/pathogen pair. The proposed methodology can be considered highly reliable, since the diagnosis is based on the whole chromatographic profile rather than on the detection of a single analyte. © 2013 Elsevier B.V.
Abstract:
Pós-graduação em Agronomia (Energia na Agricultura) - FCA
Abstract:
Introduction: This systematic review and meta-regression analysis aimed to calculate a combined prevalence estimate and evaluate the prevalence of different Treponema species in primary and secondary endodontic infections, including symptomatic and asymptomatic cases. Methods: The MEDLINE/PubMed, Embase, SciELO, Web of Knowledge, and Scopus databases were searched without starting date restriction up to and including March 2014. Only reports in English were included. The selected literature was reviewed by 2 authors and classified as suitable or not to be included in this review. Lists were compared, and, in case of disagreements, decisions were made after a discussion based on inclusion and exclusion criteria. A pooled prevalence of Treponema species in endodontic infections was estimated. Additionally, a meta-regression analysis was performed. Results: Among the 265 articles identified in the initial search, only 51 were included in the final analysis. The studies were classified into 2 different groups according to the type of endodontic infection and whether it was an exclusively primary/secondary study (n = 36) or a primary/secondary comparison (n = 15). The pooled prevalence of Treponema species was 41.5% (95% confidence interval, 35.9-47.0). In the multivariate model of meta-regression analysis, primary endodontic infections (P < .001), acute apical abscess, symptomatic apical periodontitis (P < .001), and concomitant presence of 2 or more species (P = .028) explained the heterogeneity regarding the prevalence rates of Treponema species. Conclusions: Our findings suggest that Treponema species are important pathogens involved in endodontic infections, particularly in cases of primary and acute infections.
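As an illustration of how a pooled prevalence and its confidence interval can be obtained, here is a minimal sketch using fixed-effect inverse-variance weighting of raw proportions. This is an assumption made for illustration: the abstract does not state which pooling model the review used, and published meta-analyses often prefer random-effects models or transformed proportions. The function name and the study counts are hypothetical.

```python
import math


def pooled_prevalence(events, totals, z=1.96):
    """Fixed-effect inverse-variance pooling of study proportions.

    Returns (pooled estimate, standard error, (lower, upper) confidence limits).
    """
    if len(events) != len(totals) or not events:
        raise ValueError("events and totals must be non-empty and of equal length")
    weights = []
    proportions = []
    for x, n in zip(events, totals):
        p = x / n
        variance = p * (1.0 - p) / n
        if variance == 0.0:
            # Proportions of exactly 0 or 1 need a correction not covered here.
            raise ValueError("study with 0% or 100% prevalence is not supported")
        proportions.append(p)
        weights.append(1.0 / variance)
    total_weight = sum(weights)
    pooled = sum(w * p for w, p in zip(weights, proportions)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    return pooled, se, (pooled - z * se, pooled + z * se)


# Hypothetical studies: positive samples and samples examined (not review data).
estimate, std_err, interval = pooled_prevalence([40, 40], [100, 100])
```

Larger and more precise studies receive more weight, so the pooled estimate is pulled toward their prevalence values.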
Abstract:
The impact of petroleum contamination on macrobenthic communities in the northeast portion of Todos os Santos Bay was assessed by combining, in multivariate analyses, chemical parameters such as aliphatic and polycyclic aromatic hydrocarbon indices and concentration ratios with benthic ecological parameters. Sediment samples were taken in August 2000 with a 0.05 m² van Veen grab at 28 sampling locations. The predominance of n-alkanes with more than 24 carbons, together with CPI values close to one, and the fact that most of the stations showed UCM/resolved aliphatic hydrocarbons ratios (UCM:R) higher than two, indicated a high degree of anthropogenic contribution, the presence of terrestrial plant detritus, petroleum products and evidence of chronic oil pollution. The indices used to determine the origin of PAH indicated the occurrence of a petrogenic contribution. A pyrolytic contribution constituted mainly by fossil fuel combustion derived PAH was also observed. The results of the stepwise multiple regression analysis performed with chemical data and benthic ecological descriptors demonstrated that not only total PAH concentrations but also specific concentration ratios or indices such as >= C24:< C24, An/178 and Fl/Fl + Py, are determining the structure of benthic communities within the study area. According to the BIO-ENV results, petroleum-related variables seemed to have a main influence on macrofauna community structure. The PCA ordination performed with the chemical data resulted in the formation of three groups of stations. The decrease in macrofauna density, number of species and diversity from groups III to I seemed to be related to the occurrence of high aliphatic hydrocarbon and PAH concentrations associated with fine sediments. Our results showed that macrobenthic communities in the northeast portion of Todos os Santos Bay are subjected to the impact of chronic oil pollution, as reflected by the reduction in the number of species and diversity.
These results emphasise the importance of combining, in multivariate approaches, not only total hydrocarbon concentrations but also indices, isomer pair ratios and specific compound concentrations with biological data to improve the assessment of anthropogenic impact on marine ecosystems. (c) 2008 Elsevier Ltd. All rights reserved.
Abstract:
This paper presents a survey of evolutionary algorithms that are designed for decision-tree induction. In this context, most of the paper focuses on approaches that evolve decision trees as an alternative heuristic to the traditional top-down divide-and-conquer approach. Additionally, we present some alternative methods that make use of evolutionary algorithms to improve particular components of decision-tree classifiers. The paper's original contributions are the following. First, it provides an up-to-date overview that is fully focused on evolutionary algorithms and decision trees and does not concentrate on any specific evolutionary approach. Second, it provides a taxonomy, which addresses works that evolve decision trees and works that design decision-tree components by the use of evolutionary algorithms. Finally, a number of references are provided that describe applications of evolutionary algorithms for decision-tree induction in different domains. At the end of this paper, we address some important issues and open questions that can be the subject of future research.
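To make the idea of evolving a classifier rather than growing it top-down more concrete, the sketch below evolves a one-node decision tree (a decision stump) with mutation, truncation selection and elitism. It is a deliberately reduced illustration and not a method from the survey, which covers full trees and many operator designs. All names, parameter values and the toy data are hypothetical.

```python
import random


def stump_predict(stump, row):
    """A stump is (feature index, threshold); predict 1 above the threshold."""
    feature, threshold = stump
    return 1 if row[feature] > threshold else 0


def accuracy(stump, rows, labels):
    """Fitness: fraction of rows the stump classifies correctly."""
    hits = sum(stump_predict(stump, r) == y for r, y in zip(rows, labels))
    return hits / len(rows)


def mutate(stump, n_features, rng):
    """Occasionally switch feature; always perturb the threshold."""
    feature, threshold = stump
    if rng.random() < 0.2:
        feature = rng.randrange(n_features)
    return (feature, threshold + rng.gauss(0.0, 0.5))


def evolve_stump(rows, labels, generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    n_features = len(rows[0])
    population = [
        (rng.randrange(n_features), rng.uniform(0.0, 10.0))
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        ranked = sorted(
            population, key=lambda s: accuracy(s, rows, labels), reverse=True
        )
        # Truncation selection: the top half survives unchanged (elitism),
        # so the best fitness never decreases between generations.
        parents = ranked[: pop_size // 2]
        children = [
            mutate(rng.choice(parents), n_features, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=lambda s: accuracy(s, rows, labels))


# Hypothetical toy data: the label is 1 when the second feature exceeds 5.
rows = [(1.0, 2.0), (3.0, 8.0), (9.0, 1.0), (4.0, 7.0), (6.0, 3.0), (2.0, 9.0)]
labels = [0, 1, 0, 1, 0, 1]
best = evolve_stump(rows, labels)
best_accuracy = accuracy(best, rows, labels)
```

Because the search evaluates whole candidate solutions, it is not bound to the locally greedy split choices of top-down induction, which is the motivation the survey gives for evolutionary approaches.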