921 results for Bayesian classifier
                                
Abstract:
Genetic recombination can produce heterogeneous phylogenetic histories within a set of homologous genes. Delineating recombination events is important in the study of molecular evolution, as inference of such events provides a clearer picture of the phylogenetic relationships among different gene sequences or genomes. Nevertheless, detecting recombination events can be a daunting task, as the performance of different recombination-detecting approaches can vary, depending on evolutionary events that take place after recombination. We previously evaluated the effects of post-recombination events on the prediction accuracy of recombination-detecting approaches using simulated nucleotide sequence data. The main conclusion, supported by other studies, is that one should not depend on a single method when searching for recombination events. In this paper, we introduce a two-phase strategy, applying three statistical measures to detect the occurrence of recombination events, and a Bayesian phylogenetic approach to delineate breakpoints of such events in nucleotide sequences. We evaluate the performance of these approaches using simulated data, and demonstrate the applicability of this strategy to empirical data. The two-phase strategy proves to be time-efficient when applied to large datasets, and yields high-confidence results.
                                
Abstract:
This paper examines the article system in interlanguage grammar, focusing on Japanese learners of English, whose native language lacks articles. It will be demonstrated that for the acquisition of the English article system, count/mass distinctions and definiteness are the crucial factors. Although Japanese does not employ an article system to encode these aspects, it will be argued that they are nevertheless syntactically encoded through its classifier system. Hence, the problem for these learners must be to map these features onto the appropriate surface forms, as the Missing Surface Inflection Hypothesis predicts (Prévost & White 2000). This suggestion will further be supported empirically by a fill-in-the-article task. It will be concluded that these Japanese learners understand the English article system fairly well, possibly due to their native language, yet have problems with realizing the relevant features (i.e. count/mass distinctions and definiteness) in the target language.
                                
Abstract:
There are many techniques for electricity market price forecasting. However, most of them are designed for expected price analysis rather than price spike forecasting. An effective method of predicting the occurrence of spikes has not yet been reported in the literature. In this paper, a data mining based approach is presented to give a reliable forecast of the occurrence of price spikes. Combined with the spike value prediction techniques developed by the same authors, the proposed approach aims at providing a comprehensive tool for price spike forecasting. Feature selection techniques are first described to identify the attributes relevant to the occurrence of spikes. A simple introduction to the classification techniques is given for completeness. Two algorithms, support vector machine and probability classifier, are chosen as the spike occurrence predictors and are discussed in detail. Realistic market data are used to test the proposed model, with promising results.
                                
Abstract:
A significant problem in the collection of responses to potentially sensitive questions, such as those relating to illegal, immoral or embarrassing activities, is non-sampling error due to refusal to respond or false responses. Eichhorn & Hayre (1983) suggested the use of scrambled responses to reduce this form of bias. This paper considers a linear regression model in which the dependent variable is unobserved but for which the sum or product with a scrambling random variable of known distribution is known. The performance of two likelihood-based estimators is investigated, namely a Bayesian estimator achieved through a Markov chain Monte Carlo (MCMC) sampling scheme, and a classical maximum-likelihood estimator. These two estimators and an estimator suggested by Singh, Joarder & King (1996) are compared. Monte Carlo results show that the Bayesian estimator outperforms the classical estimators in almost all cases, and the relative performance of the Bayesian estimator improves as the responses become more scrambled.
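The multiplicative scrambling setup described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not either of the likelihood-based estimators studied in the paper: it uses a simple moment-style correction (divide the scrambled response by the known mean of the scrambling variable, then fit ordinary least squares). The coefficient values, the uniform scrambling distribution, the sample size, and the helper name ols_line are all hypothetical choices made for illustration.

```python
import random


def ols_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical "true" regression and scrambling distribution (assumptions).
TRUE_INTERCEPT, TRUE_SLOPE = 2.0, 3.0
S_LOW, S_HIGH = 0.5, 1.5            # S ~ Uniform(0.5, 1.5), known to the analyst
S_MEAN = (S_LOW + S_HIGH) / 2.0     # E[S] = 1.0

xs, zs = [], []
for _ in range(20000):
    x = rng.random()
    y = TRUE_INTERCEPT + TRUE_SLOPE * x + rng.gauss(0.0, 1.0)  # never observed
    s = rng.uniform(S_LOW, S_HIGH)                             # private scrambler
    xs.append(x)
    zs.append(y * s)                                           # only Z = Y * S is reported

# Since S is independent of Y, E[Z | x] = E[S] * (intercept + slope * x),
# so regressing Z / E[S] on x recovers the coefficients on average.
intercept_hat, slope_hat = ols_line(xs, [z / S_MEAN for z in zs])
print(f"intercept ~ {intercept_hat:.2f}, slope ~ {slope_hat:.2f}")
```

The correction keeps the estimates unbiased but inflates their variance as the scrambling distribution widens, which is the regime where the abstract reports the largest relative gain for the Bayesian estimator.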
                                
Abstract:
Item noise models of recognition assert that interference at retrieval is generated by the words from the study list. Context noise models of recognition assert that interference at retrieval is generated by the contexts in which the test word has appeared. The authors introduce the bind cue decide model of episodic memory, a Bayesian context noise model, and demonstrate how it can account for data from the item noise and dual-processing approaches to recognition memory. From the item noise perspective, list strength and list length effects, the mirror effect for word frequency and concreteness, and the effects of the similarity of other words in a list are considered. From the dual-processing perspective, process dissociation data on the effects of length, temporal separation of lists, strength, and diagnosticity of context are examined. The authors conclude that the context noise approach to recognition is a viable alternative to existing approaches.
                                
Abstract:
Intelligent design theorist William Dembski has proposed an explanatory filter for distinguishing between events due to chance, lawful regularity or design. We show that if Dembski's filter were adopted as a scientific heuristic, some classical developments in science would not be rational, and that Dembski's assertion that the filter reliably identifies rarefied design requires ignoring the state of background knowledge. If background information changes even slightly, the filter's conclusion will vary wildly. Dembski fails to overcome Hume's objections to arguments from design.
                                
Abstract:
The phylogeny of the Australian legume genus Daviesia was estimated using sequences of the internal transcribed spacers of nuclear ribosomal DNA. Partial congruence was found with previous analyses using morphology, including strong support for monophyly of the genus and for a sister group relationship between the clade D. anceps + D. epiphyllum and the rest of the genus. A previously unplaced bird-pollinated species, D. pachyloma, was well supported as sister to the only other bird-pollinated species in the genus, D. speciosa, indicating a single origin of bird pollination in their common ancestor. Other morphological groups within Daviesia were not supported and require reassessment. A strong and previously unreported sister clade of Daviesia consists of the two monotypic genera Erichsenia and Viminaria. These share phyllode-like leaves and indehiscent fruits. The evolutionary history of cord roots, which have anomalous secondary thickening, was explored using parsimony. Cord roots are limited to three separate clades but have a complex history involving a small number of gains (most likely 0-3) and losses (0-5). The anomalous structure of cord roots (adventitious vascular strands embedded in a parenchymatous matrix) may facilitate nutrient storage, and the roots may be contractile. Both functions may be related to a postfire resprouting adaptation. Alternatively, cord roots may be an adaptation to the low-nutrient lateritic soils of Western Australia. However, tests for association between root type, soil type, and growth habit were equivocal, depending on whether the variables were treated as phylogenetically dependent (insignificant) or independent (significant).
                                
Abstract:
Butterflyfish are colourful, pan-tropical coastal fish that are important and distinctive members of coral reef communities. A successful systematic scheme and a robust phylogeny are considered essential for further understanding their biogeography and ecology, although recent cladistic treatments of butterflyfish phylogeny, based on soft tissue and bone morphology and coded at the generic and subgeneric levels, differ in character coding and subsequently in tree topology. This study provides an independent test of the morphologically based hypotheses, using molecular systematic data from two partial mitochondrial gene fragments, cytochrome b (cytb) and small subunit rRNA (rrnS), for 52 ingroup chaetodontids and seven pomacanthids used to root the molecular trees. Individual gene trees were largely compatible, and a combined molecular phylogeny, inferred from Bayesian analysis, was used to test alternative hypotheses suggested by morphological analyses. The tree was also used to map the latest morphological matrix in order to evaluate potential synapomorphies for various nodes defining butterflyfish interrelationships. A clade comprised of Chelmon and Coradion was sister group to other chaetodontids. Heniochus and Hemitaurichthys were each resolved as monophyletic groups, and as sister taxa. Of the taxa sampled, Prognathodes was resolved as the sister genus to Chaetodon. Of the ten Chaetodon subgenera sampled, all were monophyletic, but their interrelationships differed significantly from those inferred from morphological characters. Lepidochaetodon was the most basal subgenus, followed by Exornator and the remaining subgenera. Molecular data support the sister group relationship between Corallochaetodon and Citharoedus suggested by morphology, but major differences occur among the remaining more derived taxa. Chaetodon trifascialis and C. oligacanthus were resolved as sister taxa, adding weight to the inclusion of the latter in C. Megaprotodon. Of those pairs of taxa known to hybridize and sampled with molecular data, all were closely related phylogenetically, except those hybrids known to occur in the Rabdophorus subgenus. Two base changes separated C. pelewensis from C. paucifasciatus, which have been regarded previously as a single species. Cytb provided greater resolution than rrnS and will likely provide additional resolution with greater taxon sampling.
                                
Abstract:
An analysis of the relationships of the major arthropod groups was undertaken using mitochondrial genome data to examine the hypotheses that Hexapoda is polyphyletic and that Collembola is more closely related to branchiopod crustaceans than insects. We sought to examine the sensitivity of this relationship to outgroup choice, data treatment, gene choice and optimality criteria used in the phylogenetic analysis of mitochondrial genome data. Additionally, we sequenced the mitochondrial genome of an archaeognathan, Nesomachilis australica, to improve taxon selection in the apterygote insects, a group poorly represented in previous mitochondrial phylogenies. The sister group of the Collembola was rarely resolved in our analyses with a significant level of support. The use of different outgroups (myriapods, nematodes, or annelids + mollusks) resulted in many different placements of Collembola. The way in which the dataset was coded for analysis (DNA, DNA with the exclusion of third codon position and as amino acids) also had marked effects on tree topology. We found that nodal support was spread evenly throughout the 13 mitochondrial genes and the exclusion of genes resulted in significantly less resolution in the inferred trees. Optimality criteria had a much lesser effect on topology than the preceding factors; parsimony and Bayesian trees for a given data set and treatment were quite similar. We therefore conclude that the relationships of the extant arthropod groups as inferred by mitochondrial genomes are highly vulnerable to outgroup choice, data treatment and gene choice, and no consistent alternative hypothesis of Collembola's relationships is supported. Pending the resolution of these identified problems with the application of mitogenomic data to basal arthropod relationships, it is difficult to justify the rejection of hexapod monophyly, which is well supported on morphological grounds. (c) The Willi Hennig Society 2004.
                                
Abstract:
We discuss the expectation propagation (EP) algorithm for approximate Bayesian inference using a factorizing posterior approximation. For neural network models, we use a central limit theorem argument to make EP tractable when the number of parameters is large. For two types of models, we show that EP can achieve optimal generalization performance when data are drawn from a simple distribution.
                                
Abstract:
Background/Aims: Approximately four million Africans were taken as slaves to Brazil, where they interbred extensively with Amerindians and Europeans. We have previously shown that while most White Brazilians carry Y chromosomes of European origin, they display high proportions of African and Amerindian mtDNA lineages, because of sex-biased genetic admixture. Methods: We studied the Y chromosome and mtDNA haplogroup structure of 120 Black males from Sao Paulo, Brazil. Results: Only 48% of the Y chromosomes, but 85% of the mtDNA haplogroups, were characteristic of sub-Saharan Africa, confirming our previous observation of sexually biased mating. We mined literature data for mtDNA and Y chromosome haplogroup frequencies for African native populations from regions involved in the Atlantic slave trade. Principal Components Analysis and Bayesian analysis of population structure revealed no genetic differentiation of Y chromosome marker frequencies between the African regions. However, mtDNA examination revealed considerable genetic structure, with three clusters at Central-West Africa, West Africa and Southeast Africa. A hypothesis is proposed to explain this structure. Conclusion: Using these mtDNA data we could obtain for the first time an estimate of the relative ancestral contribution of Central-West (0.445), West (0.431) and Southeast Africa (0.123) to African Brazilians from Sao Paulo. These estimates are consistent with historical information. Copyright (c) 2008 S. Karger AG, Basel.
                                
Abstract:
For the purpose of developing a longitudinal model to predict hand-and-foot syndrome (HFS) dynamics in patients receiving capecitabine, data from two large phase III studies were used. Of 595 patients in the capecitabine arms, 400 patients were randomly selected to build the model, and the other 195 were assigned for model validation. A score for risk of developing HFS was modeled using the proportional odds model, a sigmoidal maximum effect model driven by capecitabine accumulation as estimated through a kinetic-pharmacodynamic model and a Markov process. The lower the calculated creatinine clearance value at inclusion, the higher was the risk of HFS. Model validation was performed by visual and statistical predictive checks. The predictive dynamic model of HFS in patients receiving capecitabine allows the prediction of toxicity risk based on cumulative capecitabine dose and previous HFS grade. This dose-toxicity model will be useful in developing Bayesian individual treatment adaptations and may be of use in the clinic.
                                
Abstract:
Fogo selvagem (FS) is mediated by pathogenic, predominantly IgG4, anti-desmoglein 1 (Dsg1) autoantibodies and is endemic in Limao Verde, Brazil. IgG and IgG subclass autoantibodies were tested in a sample of 214 FS patients and 261 healthy controls by Dsg1 ELISA. For model selection, the sample was randomly divided into training (50%), validation (25%), and test (25%) sets. Using the training and validation sets, IgG4 was chosen as the best predictor of FS, with index values above 6.43 classified as FS. Using the test set, IgG4 has sensitivity of 92% (95% confidence interval (95% CI): 82-95%), specificity of 97% (95% CI: 89-100%), and area under the curve of 0.97 (95% CI: 0.94-1.00). The IgG4 positive predictive value (PPV) in Limao Verde (3% FS prevalence) was 49%. The sensitivity, specificity, and PPV of IgG anti-Dsg1 were 87, 91, and 23%, respectively. The IgG4-based classifier was validated by testing 11 FS patients before and after clinical disease and 60 Japanese pemphigus foliaceus patients. It classified 21 of 96 normal individuals from a Limao Verde cohort as having FS serology. On the basis of its PPV, half of the 21 individuals may currently have preclinical FS and could develop clinical disease in the future. Identifying individuals during preclinical FS will enhance our ability to identify the etiological agent(s) triggering FS.
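The PPV figures quoted above follow from Bayes' theorem applied to the reported sensitivity, specificity, and 3% prevalence. The short sketch below reproduces that arithmetic; the function name positive_predictive_value is a hypothetical helper introduced here, and the calculation assumes the point estimates can be plugged in directly, ignoring their confidence intervals.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Bayes' theorem: P(disease | positive test)."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


PREVALENCE = 0.03  # FS prevalence in Limao Verde, as stated in the abstract

# Point estimates taken from the abstract.
igg4_ppv = positive_predictive_value(0.92, 0.97, PREVALENCE)
igg_ppv = positive_predictive_value(0.87, 0.91, PREVALENCE)

print(f"IgG4 PPV: {igg4_ppv:.0%}")  # about 49%
print(f"IgG PPV:  {igg_ppv:.0%}")   # about 23%
```

With a PPV near 49%, roughly half of the 21 seropositive normal individuals would be expected to be true positives, which is the basis for the abstract's statement about preclinical FS.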
                                
Abstract:
Hepatitis C virus (HCV) is a frequent cause of acute and chronic hepatitis and a leading cause of cirrhosis of the liver and hepatocellular carcinoma. HCV is classified in six major genotypes and more than 70 subtypes. In Colombian blood banks, serum samples were tested for anti-HCV antibodies using a third-generation ELISA. The aim of this study was to characterize the viral sequences in plasma of 184 volunteer blood donors who attended the "Banco Nacional de Sangre de la Cruz Roja Colombiana," Bogota, Colombia. Three different HCV genomic regions were amplified by nested PCR. The first of these was a segment of 180 bp of the 5'UTR region to confirm the previous diagnosis by ELISA. From those that were positive for the 5'UTR region, two further segments were amplified for genotyping and subtyping by phylogenetic analysis: a segment of 380 bp from the NS5B region, and a segment of 391 bp from the E1 region. The distribution of HCV subtypes was: 1b (82.8%), 1a (5.7%), 2a (5.7%), 2b (2.8%), and 3a (2.8%). By applying Bayesian Markov chain Monte Carlo simulation, it was estimated that HCV-1b was introduced into Bogota around 1950. Also, this subtype spread at an exponential rate between about 1970 and about 1990, after which transmission of HCV was reduced by anti-HCV testing of this population. Among Colombian blood donors, HCV genotype 1b is the most frequent genotype, especially in large urban conglomerates such as Bogota, as is the case in other South American countries. J. Med. Virol. 82: 1889-1898, 2010. (C) 2010 Wiley-Liss, Inc.
                                
Abstract:
Molecular epidemiological data concerning the hepatitis B virus (HBV) in Chile are not known completely. Since the HBV genotype F is the most prevalent in the country, the goal of this study was to obtain full HBV genome sequences from patients infected chronically in order to determine their subgenotypes and the occurrence of resistance-associated mutations. Twenty-one serum samples from antiviral drug-naive patients with chronic hepatitis B were subjected to full-length PCR amplification, and both strands of the whole genomes were fully sequenced. Phylogenetic analyses were performed along with reference sequences available from GenBank (n = 290). The sequences were aligned using Clustal X and edited in the SE-AL software. Bayesian phylogenetic analyses were conducted by Markov Chain Monte Carlo simulations (MCMC) for 10 million generations in order to obtain the substitution tree using BEAST. The sequences were also analyzed for the presence of primary drug resistance mutations using CodonCode Aligner Software. The phylogenetic analyses indicated that all sequences were found to be the HBV subgenotype F1b, clustered into four different groups, suggesting that diverse lineages of this subgenotype may be circulating within this population of Chilean patients. J. Med. Virol. 83: 1530-1536, 2011. (C) 2011 Wiley-Liss, Inc.
 
                    