921 results for Bayesian classifier
Abstract:
A new genus and species of microteiid lizard is described based on a series of specimens obtained at Parque Nacional do Caparao (20°28'S, 41°49'W), southeastern Brazil, along the division line between the States of Minas Gerais and Espirito Santo. The new lizard occurs in isolated high-altitude, open, rocky habitats above the altitudinal limits of the Atlantic forest. It is characterized by the presence of prefrontals, frontoparietals, parietals, interparietal, and occipital scales; ear opening and eyelid distinct; three pairs of genials; absence of collar; lanceolate and mucronate dorsal scales; six regular transverse and longitudinal series of smooth ventrals that are longer than wide, with the lateral ones narrower. Maximum parsimony (MP) and partitioned Bayesian (PBA) phylogenetic analyses based on morphological and molecular characters with all known genera of Gymnophthalminae (except for Scriptosaura) plus Rhachisaurus recovered this new lizard in a clade having Colobodactylus and Heterodactylus as its closest relatives. Both analyses recovered the monophyly of Gymnophthalminae and Gymnophthalmini. The monophyly of the Heterodactylini received moderate support in MP analyses but was not recovered in PBA. To eliminate classification controversy between these results, the present concept of Heterodactylini is restricted to accommodate the new genus, Colobodactylus and Heterodactylus, and a new tribe Iphisiini is proposed to allocate Alexandresaurus, Iphisa, Colobosaura, Acratosaura, and Stenolepis. Current phylogenetic knowledge of Gymnophthalminae suggests that fossoriality and increase of body elongation arose as adaptive responses to avoid extreme surface temperatures, either cold or hot, depending on circumstances.
Abstract:
Background: Plasmodium vivax malaria is a major public health challenge in Latin America, Asia and Oceania, with 130-435 million clinical cases per year worldwide. Invasion of host blood cells by P. vivax mainly depends on a type I membrane protein called Duffy binding protein (PvDBP). The erythrocyte-binding motif of PvDBP is a 170 amino-acid stretch located in its cysteine-rich region II (PvDBP(II)), which is the most variable segment of the protein. Methods: To test whether diversifying natural selection has shaped the nucleotide diversity of PvDBP(II) in Brazilian populations, this region was sequenced in 122 isolates from six different geographic areas. A Bayesian method was applied to test for the action of natural selection under a population genetic model that incorporates recombination. The analysis was integrated with a structural model of PvDBP(II), and T- and B-cell epitopes were localized on the 3-D structure. Results: The results suggest that: (i) recombination plays an important role in determining the haplotype structure of PvDBP(II), and (ii) PvDBP(II) appears to contain neutrally evolving codons as well as codons evolving under natural selection. Diversifying selection preferentially acts on sites identified as epitopes, particularly on amino acid residues 417, 419, and 424, which show strong linkage disequilibrium. Conclusions: This study shows that some polymorphisms of PvDBP(II) are present near the erythrocyte-binding domain and might serve to elude antibodies that inhibit cell invasion. Therefore, these polymorphisms should be taken into account when designing vaccines aimed at eliciting antibodies to inhibit erythrocyte invasion.
Sensitivity to noise and ergodicity of an assembly line of cellular automata that classifies density
Abstract:
We investigate the sensitivity of the composite cellular automaton of H. Fuks [Phys. Rev. E 55, R2081 (1997)] to noise and assess the density classification performance of the resulting probabilistic cellular automaton (PCA) numerically. We conclude that the composite PCA performs the density classification task reliably only up to very small levels of noise. In particular, it cannot outperform the noisy Gacs-Kurdyumov-Levin automaton, an imperfect classifier, for any level of noise. While the original composite CA is nonergodic, analyses of relaxation times indicate that its noisy version is an ergodic automaton, with the relaxation times decaying algebraically over an extended range of parameters with an exponent very close (possibly equal) to the mean-field value.
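As a rough illustration of the density classification task under noise, the sketch below implements one synchronous update of the Gacs-Kurdyumov-Levin (GKL) rule on a ring and then flips each cell independently with probability `noise`. The noise model and the names `gkl_step` and `run` are assumptions made for this example; the abstract does not specify how noise is injected, and the composite automaton of Fuks is not reproduced here.

```python
import random


def gkl_step(cells, noise=0.0, rng=random):
    """One synchronous GKL update on a ring, followed by independent bit flips.

    A cell in state 0 takes the majority of itself and its neighbours at
    distances 1 and 3 to the left; a cell in state 1 uses the neighbours at
    distances 1 and 3 to the right. The flip step is an assumed noise model.
    """
    n = len(cells)
    nxt = []
    for i, s in enumerate(cells):
        if s == 0:
            votes = s + cells[(i - 1) % n] + cells[(i - 3) % n]
        else:
            votes = s + cells[(i + 1) % n] + cells[(i + 3) % n]
        new = 1 if votes >= 2 else 0
        if rng.random() < noise:  # assumed noise: flip with probability `noise`
            new = 1 - new
        nxt.append(new)
    return nxt


def run(cells, steps, noise=0.0, rng=random):
    """Iterate the noisy GKL rule for a fixed number of steps."""
    for _ in range(steps):
        cells = gkl_step(cells, noise, rng)
    return cells
```

With `noise=0.0` the homogeneous configurations are fixed points and an isolated minority cell is erased in one step; with `noise > 0` no configuration is absorbing, which is why relaxation times become the relevant quantity.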
Abstract:
Context tree models have been introduced by Rissanen in [25] as a parsimonious generalization of Markov models. Since then, they have been widely used in applied probability and statistics. The present paper investigates non-asymptotic properties of two popular procedures of context tree estimation: Rissanen's algorithm Context and penalized maximum likelihood. First showing how they are related, we prove finite horizon bounds for the probability of over- and under-estimation. Concerning overestimation, no boundedness or loss-of-memory conditions are required: the proof relies on new deviation inequalities for empirical probabilities of independent interest. The under-estimation properties rely on classical hypotheses for processes of infinite memory. These results improve on and generalize the bounds obtained in Duarte et al. (2006) [12], Galves et al. (2008) [18], Galves and Leonardi (2008) [17], Leonardi (2010) [22], refining asymptotic results of Buhlmann and Wyner (1999) [4] and Csiszar and Talata (2006) [9]. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
Background: The aim of this study was to estimate the prevalence of fibromyalgia, as well as to assess the major symptoms of this syndrome in an adult, low socioeconomic status population assisted by the primary health care system in a city in Brazil. Methods: We cross-sectionally sampled individuals assisted by the public primary health care system (n = 768, 35-60 years old). Participants were interviewed by phone and screened about pain. They were then invited to be clinically assessed (304 accepted). Pain was estimated using a Visual Analogue Scale (VAS). Fibromyalgia was assessed using the Fibromyalgia Impact Questionnaire (FIQ), as well as screening for tender points using dolorimetry. Statistical analyses included Bayesian statistics and the Kruskal-Wallis ANOVA test (significance level = 5%). Results: From the phone-interview screening, we divided participants (n = 768) into three groups: No Pain (NP) (n = 185); Regional Pain (RP) (n = 388) and Widespread Pain (WP) (n = 106). Among those participating in the clinical assessments (304 subjects), the prevalence of fibromyalgia was 4.4% (95% confidence interval [2.6%; 6.3%]). Symptoms of pain (VAS and FIQ), feeling well, job ability, fatigue, morning tiredness, stiffness, anxiety and depression were statistically different among the groups. In multivariate analyses we found that individuals with FM and WP had significantly higher impairment than those with RP and NP. FM and WP were similarly disabling. Similarly, RP was not significantly different from NP. Conclusion: Fibromyalgia is prevalent in the low socioeconomic status population assisted by the public primary health care system. Prevalence was similar to other studies (4.4%) in a more diverse socioeconomic population. Individuals with FM and WP experience a significant impact on their well-being.
Abstract:
Background: Several studies in the literature describe measurement error in gene expression data, and several others address regulatory network models. However, only a small fraction combine measurement error with mathematical regulatory networks and show how to identify these networks under different noise levels. Results: This article investigates the effects of measurement error on the estimation of the parameters in regulatory networks. Simulation studies indicate that, in both time series (dependent) and non-time series (independent) data, the measurement error strongly affects the estimated parameters of the regulatory network models, biasing them as predicted by the theory. Moreover, when testing the parameters of the regulatory network models, p-values computed by ignoring the measurement error are not reliable, since the rate of false positives is not controlled under the null hypothesis. In order to overcome these problems, we present an improved version of the Ordinary Least Squares estimator for independent (regression models) and dependent (autoregressive models) data when the variables are subject to noise. Moreover, measurement error estimation procedures for microarrays are also described. Simulation results also show that both corrected methods perform better than the standard ones (i.e., ignoring measurement error). The proposed methodologies are illustrated using microarray data from lung cancer patients and mouse liver time series data. Conclusions: Measurement error dangerously affects the identification of regulatory network models; thus, it must be reduced or taken into account in order to avoid erroneous conclusions. This could be one of the reasons for high biological false positive rates identified in actual regulatory network models.
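The bias mentioned in this abstract can be illustrated with the classical errors-in-variables setting. The sketch below simulates a single regressor observed with additive noise and compares the naive Ordinary Least Squares slope with a textbook method-of-moments correction that subtracts a known measurement error variance. This is not necessarily the estimator proposed in the paper; the function names, the simulated parameters, and the assumption that the error variance is known are all hypothetical choices for the example.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def slope_naive(w, y):
    """OLS slope of y on the noisy regressor w, ignoring measurement error."""
    mw, my = _mean(w), _mean(y)
    s_wy = sum((a - mw) * (b - my) for a, b in zip(w, y))
    s_ww = sum((a - mw) ** 2 for a in w)
    return s_wy / s_ww


def slope_corrected(w, y, noise_var):
    """Method-of-moments correction, assuming the error variance is known."""
    n = len(w)
    mw, my = _mean(w), _mean(y)
    cov_wy = sum((a - mw) * (b - my) for a, b in zip(w, y)) / (n - 1)
    var_w = sum((a - mw) ** 2 for a in w) / (n - 1)
    return cov_wy / (var_w - noise_var)


def simulate(n, beta, noise_sd, seed):
    """Simulate y = beta * x + e, observing w = x + u instead of x."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta * xi + rng.gauss(0.0, 0.5) for xi in x]
    w = [xi + rng.gauss(0.0, noise_sd) for xi in x]
    return slope_naive(w, y), slope_corrected(w, y, noise_sd ** 2)


if __name__ == "__main__":
    naive, corrected = simulate(n=20000, beta=2.0, noise_sd=1.0, seed=0)
    print(f"naive slope: {naive:.3f}, corrected slope: {corrected:.3f}")
```

With unit variance for both the true regressor and the measurement error, the naive slope is attenuated toward half of the true coefficient in expectation, while the corrected estimate should land near the true value.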
Abstract:
Morphological and molecular analyses have proven to be complementary tools of taxonomic information for the redescription of the ctenostome bryozoans Amathia brasiliensis Busk, 1886 and Amathia distans Busk, 1886. The two species, originally described from material collected by the 'Challenger' expedition but synonymized by later authors, now have their status fixed by means of the selection of lectotypes, morphological observations and analyses of DNA sequences described here. The morphological characters allowing the identification of living and/or preserved specimens are (1) A. brasiliensis: whitish-pale pigment spots on the frontal surface of stolons and zooids, and a wide stolon with biserial zooid clusters growing in clockwise and anti-clockwise spirals along it, the spirality direction being maintained from maternal to daughter stolons; and (2) A. distans: bright yellow pigment spots on stolonal and zooidal surfaces including lophophores, and a slender stolon, thickly cuticularized, with biserial zooid clusters growing in clockwise and anti-clockwise spirals along it and the spirality direction not maintained from maternal to daughter stolons. Pairwise comparisons of DNA sequences of the mitochondrial genes cytochrome c oxidase subunit I and large ribosomal RNA subunit revealed deep genetic divergence between A. brasiliensis and A. distans. Finally, analyses of those sequences within a Bayesian phylogenetic context recovered their genealogical species status.
Abstract:
Stream discharge-concentration relationships are indicators of terrestrial ecosystem function. Throughout the Amazon and Cerrado regions of Brazil, rapid changes in land use and land cover may be altering these hydrochemical relationships. The current analysis focuses on factors controlling the discharge-calcium (Ca) concentration relationship since previous research in these regions has demonstrated both positive and negative slopes in linear log10 discharge-log10 Ca concentration regressions. The objective of the current study was to evaluate factors controlling stream discharge-Ca concentration relationships including year, season, stream order, vegetation cover, land use, and soil classification. It was hypothesized that land use and soil class are the most critical attributes controlling discharge-Ca concentration relationships. A multilevel, linear regression approach was utilized with data from 28 streams throughout Brazil. These streams come from three distinct regions and varied broadly in watershed size (< 1 to > 10^6 ha) and discharge (10^-5.7 to 10^3.2 m^3 s^-1). Linear regressions of log10 Ca versus log10 discharge in 13 streams have a preponderance of negative slopes with only two streams having significant positive slopes. An ANOVA decomposition suggests the effect of discharge on Ca concentration is large but variable. Vegetation cover, which incorporates aspects of land use, explains the largest proportion of the variance in the effect of discharge on Ca followed by season and year. In contrast, stream order, land use, and soil class explain most of the variation in stream Ca concentration. In the current data set, soil class, which is related to lithology, has an important effect on Ca concentration but land use, likely through its effect on runoff concentration and hydrology, has a greater effect on discharge-concentration relationships.
Abstract:
A simultaneous optimization strategy based on a neuro-genetic approach is proposed for selection of laser induced breakdown spectroscopy operational conditions for the simultaneous determination of macronutrients (Ca, Mg and P), micronutrients (B, Cu, Fe, Mn and Zn), Al and Si in plant samples. A laser induced breakdown spectroscopy system equipped with a 10 Hz Q-switched Nd:YAG laser (12 ns, 532 nm, 140 mJ) and an Echelle spectrometer with an intensified charge-coupled device was used. Integration time gate, delay time, amplification gain and number of pulses were optimized. Pellets of spinach leaves (NIST 1570a) were employed as laboratory samples. In order to find a model that could correlate laser induced breakdown spectroscopy operational conditions with compromised high peak areas of all elements simultaneously, a Bayesian Regularized Artificial Neural Network approach was employed. Subsequently, a genetic algorithm was applied to find optimal conditions for the neural network model, in an approach called neuro-genetic. A single laser induced breakdown spectroscopy working condition that maximizes peak areas of all elements simultaneously was obtained with the following optimized parameters: 9.0 µs integration time gate, 1.1 µs delay time, 225 (a.u.) amplification gain and 30 accumulated laser pulses. The proposed approach is a useful and suitable tool for the optimization process of such a complex analytical problem. (C) 2009 Elsevier B.V. All rights reserved.
Abstract:
Introduction: Internet users are increasingly using the worldwide web to search for information relating to their health. This situation makes it necessary to create specialized tools capable of supporting users in their searches. Objective: To apply and compare strategies that were developed to investigate the use of the Portuguese version of Medical Subject Headings (MeSH) for constructing an automated classifier for Brazilian Portuguese-language web-based content within or outside of the field of healthcare, focusing on the lay public. Methods: 3658 Brazilian web pages were used to train the classifier and 606 Brazilian web pages were used to validate it. The strategies proposed were constructed using content-based vector methods for text classification, such that Naive Bayes was used for the task of classifying vector patterns with characteristics obtained through the proposed strategies. Results: A strategy named InDeCS was developed specifically to adapt MeSH for the problem that was put forward. This approach achieved better accuracy for this pattern classification task (0.94 sensitivity, specificity and area under the ROC curve). Conclusions: Because of the significant results achieved by InDeCS, this tool has been successfully applied to the Brazilian healthcare search portal known as Busca Saude. Furthermore, it could be shown that MeSH presents important results when used for the task of classifying web-based content focusing on the lay public. It was also possible to show from this study that MeSH was able to map out mutable non-deterministic characteristics of the web. (c) 2010 Elsevier Inc. All rights reserved.
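For readers unfamiliar with the classifier named in this abstract, the sketch below shows a generic multinomial Naive Bayes text classifier with Laplace smoothing over a plain bag of words. It does not reproduce the InDeCS strategy or any MeSH-based feature construction; the class name, the toy documents, and the labels are hypothetical and exist only to make the example runnable.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Multinomial Naive Bayes over whitespace tokens (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in doc.lower().split():
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy corpus: health-related pages versus everything else.
TRAIN_DOCS = [
    "symptoms of dengue fever and treatment",
    "vaccine schedule for children",
    "football match results tonight",
    "cheap flights and hotel deals",
]
TRAIN_LABELS = ["health", "health", "other", "other"]

model = NaiveBayesText().fit(TRAIN_DOCS, TRAIN_LABELS)
```

Calling `model.predict("dengue vaccine treatment")` scores the query against both classes and returns the label with the higher log posterior. A real system would replace the whitespace tokens with the vocabulary-driven features the abstract describes.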
Abstract:
Age-related changes in running kinematics have been reported in the literature using classical inferential statistics. However, this approach has been hampered by the increased number of biomechanical gait variables reported and subsequently the lack of differences presented in these studies. Data mining techniques have been applied in recent biomedical studies to solve this problem using a more general approach. In the present work, we re-analyzed lower extremity running kinematic data of 17 young and 17 elderly male runners using the Support Vector Machine (SVM) classification approach. In total, 31 kinematic variables were extracted to train the classification algorithm and test the generalized performance. The results revealed different accuracy rates across three different kernel methods adopted in the classifier, with the linear kernel performing the best. A subsequent forward feature selection algorithm demonstrated that with only six features, the linear kernel SVM achieved a 100% classification performance rate, showing that these features provided powerful combined information to distinguish age groups. The results of the present work demonstrate potential in applying this approach to improve knowledge about the age-related differences in running gait biomechanics and encourage the use of the SVM in other clinical contexts. (C) 2010 Elsevier Ltd. All rights reserved.
Abstract:
A hybrid system to automatically detect, locate and classify disturbances affecting power quality in an electrical power system is presented in this paper. The disturbances characterized are events from an actual power distribution system simulated by the ATP (Alternative Transients Program) software. The hybrid approach introduced consists of two stages. In the first stage, the wavelet transform (WT) is used to detect disturbances in the system and to locate the time of their occurrence. When such an event is flagged, the second stage is triggered and various artificial neural networks (ANNs) are applied to classify the data measured during the disturbance(s). A computational logic using WTs and ANNs together with a graphical user interface (GUI) between the algorithm and its end user is then implemented. The results obtained so far are promising and suggest that this approach could lead to a useful application in an actual distribution system. (C) 2009 Elsevier Ltd. All rights reserved.
Abstract:
This paper analyses the presence of financial constraint in the investment decisions of 367 Brazilian firms from 1997 to 2004, using a Bayesian econometric model with group-varying parameters. The motivation for this paper is the use of clustering techniques to group firms in a totally endogenous form. In order to classify the firms we used a hybrid clustering method, that is, hierarchical and non-hierarchical clustering techniques jointly. To estimate the parameters a Bayesian approach was considered. Prior distributions were assumed for the parameters, classifying the model as random or fixed effects. Ordinate predictive density criterion was used to select the model providing the best prediction. We tested thirty models, and the best-predicting model considers the presence of 2 groups in the sample, assuming the fixed effect model with a Student t distribution with 20 degrees of freedom for the error. The results indicate robustness in the identification of financial constraint when the firms are classified by the clustering techniques. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
One of the most important recent improvements in cardiology is the use of ventricular assist devices (VADs) to help patients with severe heart diseases, especially when they are indicated for heart transplantation. The Institute Dante Pazzanese of Cardiology has been developing an implantable centrifugal blood pump that will be able to help a sick human heart to keep blood flow and pressure at physiological levels. This device will be used as a totally or partially implantable VAD. Therefore, improving device performance is important for a better level of interaction with the patient's behavior or conditions. However, some failures may occur if the device's pumping control does not follow the changes in the patient's behavior or conditions. The VAD control system must be fault tolerant, must adapt dynamically to changes in the patient's cardiovascular system, and must also respond to changes in the patient's conditions and behavior. This work proposes an application of the mechatronic approach to this class of devices based on advanced techniques for control, instrumentation, and automation to define a method for developing a hierarchical supervisory control system that is able to perform VAD control dynamically, automatically, and securely. For this methodology, we used concepts based on Bayesian networks for patients' diagnoses, Petri nets to generate a VAD control algorithm, and Safety Instrumented Systems to ensure VAD system security. Applying these concepts, a VAD control system is being built to confirm the effectiveness of the method.
Abstract:
Safety Instrumented Systems (SIS) are designed to prevent and/or mitigate accidents, avoiding undesirable high potential risk scenarios, assuring protection of people's health, protecting the environment and saving costs of industrial equipment. The design of these systems requires formal methods for ensuring the safety requirements but, according to material published in this area, no consolidated procedure for this task has been identified. In this sense, this article introduces a formal method for diagnosis and treatment of critical faults based on Bayesian network (BN) and Petri net (PN). This approach considers diagnosis and treatment for each safety instrumented function (SIF), including a hazard and operability (HAZOP) study of the equipment or system under control. It also uses BN and Behavioral Petri net (BPN) for diagnoses and decision-making, and the PN for the synthesis, modeling and control to be implemented by a Safety Programmable Logic Controller (PLC). An application example considering the diagnosis and treatment of critical faults is presented and illustrates the proposed methodology.