64 results for Bayesian statistic
Abstract:
Proceedings of the 11th Australasian Remote Sensing and Photogrammetry Conference
Abstract:
Genetic recombination can produce heterogeneous phylogenetic histories within a set of homologous genes. Delineating recombination events is important in the study of molecular evolution, as inference of such events provides a clearer picture of the phylogenetic relationships among different gene sequences or genomes. Nevertheless, detecting recombination events can be a daunting task, as the performance of different recombination-detecting approaches can vary, depending on evolutionary events that take place after recombination. We recently evaluated the effects of post-recombination events on the prediction accuracy of recombination-detecting approaches using simulated nucleotide sequence data. The main conclusion, supported by other studies, is that one should not depend on a single method when searching for recombination events. In this paper, we introduce a two-phase strategy, applying three statistical measures to detect the occurrence of recombination events, and a Bayesian phylogenetic approach in delineating breakpoints of such events in nucleotide sequences. We evaluate the performance of these approaches using simulated data, and demonstrate the applicability of this strategy to empirical data. The two-phase strategy proves to be time-efficient when applied to large datasets, and yields high-confidence results.
Abstract:
The received view of an ad hoc hypothesis is that it accounts for only the observation(s) it was designed to account for, and so non-adhocness is generally held to be necessary or important for an introduced hypothesis or modification to a theory. Attempts by Popper and several others to convincingly explicate this view, however, prove to be unsuccessful or of doubtful value, and familiar and firmer criteria for evaluating the hypotheses or modified theories so classified are characteristically available. These points are obscured largely because the received view fails to adequately separate psychology from methodology or to recognise ambiguities in the use of 'ad hoc'.
Abstract:
The generalized Gibbs sampler (GGS) is a recently developed Markov chain Monte Carlo (MCMC) technique that enables Gibbs-like sampling of state spaces that lack a convenient representation in terms of a fixed coordinate system. This paper describes a new sampler, called the tree sampler, which uses the GGS to sample from a state space consisting of phylogenetic trees. The tree sampler is useful for a wide range of phylogenetic applications, including Bayesian, maximum likelihood, and maximum parsimony methods. A fast new algorithm to search for a maximum parsimony phylogeny is presented, using the tree sampler in the context of simulated annealing. The mathematics underlying the algorithm is explained and its time complexity is analyzed. The method is tested on two large data sets consisting of 123 sequences and 500 sequences, respectively. The new algorithm is shown to compare very favorably in terms of speed and accuracy to the program DNAPARS from the PHYLIP package.
Abstract:
Genetic recombination can produce heterogeneous phylogenetic histories within a set of homologous genes. Delineating recombination events is important in the study of molecular evolution, as inference of such events provides a clearer picture of the phylogenetic relationships among different gene sequences or genomes. Nevertheless, detecting recombination events can be a daunting task, as the performance of different recombination-detecting approaches can vary, depending on evolutionary events that take place after recombination. We previously evaluated the effects of post-recombination events on the prediction accuracy of recombination-detecting approaches using simulated nucleotide sequence data. The main conclusion, supported by other studies, is that one should not depend on a single method when searching for recombination events. In this paper, we introduce a two-phase strategy, applying three statistical measures to detect the occurrence of recombination events, and a Bayesian phylogenetic approach to delineate breakpoints of such events in nucleotide sequences. We evaluate the performance of these approaches using simulated data, and demonstrate the applicability of this strategy to empirical data. The two-phase strategy proves to be time-efficient when applied to large datasets, and yields high-confidence results.
Abstract:
Data mining is the process of identifying valid, implicit, previously unknown, potentially useful and understandable information in large databases. It is an important step in the process of knowledge discovery in databases (Olaru & Wehenkel, 1999). In a data mining process, input data can be structured, semi-structured, or unstructured, and can take the form of text, categorical values or numerical values. One of the important characteristics of data mining is its ability to deal with data that are large in volume, distributed, time-variant, noisy, and high-dimensional. A large number of data mining algorithms have been developed for different applications. For example, association rule mining can be useful for market basket problems, clustering algorithms can be used to discover trends in unsupervised learning problems, classification algorithms can be applied in decision-making problems, and sequential and time series mining algorithms can be used in predicting events, fault detection, and other supervised learning problems (Vapnik, 1999). Classification is among the most important tasks in data mining, particularly for data mining applications in engineering fields. Together with regression, classification is used mainly for predictive modelling. So far, a number of classification algorithms have been used in practice. According to Sebastiani (2002), the main classification algorithms can be categorized as: decision tree and rule-based approaches such as C4.5 (Quinlan, 1996); probabilistic methods such as the Bayesian classifier (Lewis, 1998); on-line methods such as Winnow (Littlestone, 1988) and CVFDT (Hulten, 2001); neural network methods (Rumelhart, Hinton & Williams, 1986); example-based methods such as k-nearest neighbors (Duda & Hart, 1973); and SVM (Cortes & Vapnik, 1995). Other important techniques for classification tasks include Associative Classification (Liu et al., 1998) and Ensemble Classification (Tumer, 1996).
Abstract:
A significant problem in the collection of responses to potentially sensitive questions, such as those relating to illegal, immoral or embarrassing activities, is non-sampling error due to refusal to respond or false responses. Eichhorn & Hayre (1983) suggested the use of scrambled responses to reduce this form of bias. This paper considers a linear regression model in which the dependent variable is unobserved but for which the sum or product with a scrambling random variable of known distribution is known. The performance of two likelihood-based estimators is investigated, namely a Bayesian estimator achieved through a Markov chain Monte Carlo (MCMC) sampling scheme, and a classical maximum-likelihood estimator. These two estimators and an estimator suggested by Singh, Joarder & King (1996) are compared. Monte Carlo results show that the Bayesian estimator outperforms the classical estimators in almost all cases, and the relative performance of the Bayesian estimator improves as the responses become more scrambled.
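The scrambled-response setting described in the abstract above can be illustrated with a small simulation. The sketch below is not the Bayesian or maximum-likelihood estimator the paper investigates. It is a simple moment-based illustration, assuming multiplicative scrambling with a scrambling variable of known mean, a single predictor, and Gaussian errors. All names and parameter values (`simulate_scrambled`, `ols`, `SCRAMBLE_MEAN`, the uniform ranges) are hypothetical choices made for the example.

```python
import random

# Assumed for illustration: the scrambling variable S is Uniform(0.5, 1.5),
# so its known mean is 1.0.
SCRAMBLE_MEAN = 1.0


def simulate_scrambled(n, beta0, beta1, seed=0):
    """Simulate predictors x and scrambled responses z = y * s.

    The true response y = beta0 + beta1 * x + noise is never reported;
    only the product with the scrambling variable s is observed.
    """
    rng = random.Random(seed)
    xs, zs = [], []
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)
        y = beta0 + beta1 * x + rng.gauss(0.0, 1.0)
        s = rng.uniform(0.5, 1.5)
        xs.append(x)
        zs.append(y * s)
    return xs, zs


def ols(xs, ys):
    """Closed-form least squares for one predictor: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Because s is independent of x and the noise, E[z | x] = E[s] * (beta0 + beta1 * x),
# so regressing z / E[s] on x recovers the coefficients on average.
xs, zs = simulate_scrambled(20000, beta0=2.0, beta1=0.5)
b0_hat, b1_hat = ols(xs, [z / SCRAMBLE_MEAN for z in zs])
print(f"intercept ~ {b0_hat:.3f}, slope ~ {b1_hat:.3f}")
```

The estimates are unbiased under these assumptions but noisier than a regression on the unscrambled responses would be, which is the kind of efficiency loss that motivates likelihood-based estimators.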
Abstract:
Background: The Perceived Need for Care Questionnaire (PNCQ) was designed for the Australian National Survey of Mental Health and Wellbeing. The PNCQ complemented collection of data on diagnosis and disability with the survey participants' perceptions of their needs for mental health care and the meeting of those needs. The four-stage design of the PNCQ mimics a conversational exploration of the topic of perceived needs. Five categories of perceived need are each assigned to one of four levels of perceived need (no need, unmet need, partially met need and met need). For unmet need and partially met need, information on barriers to care is collected. Methods: Inter-rater reliabilities of perceived needs assessed by the PNCQ were examined in a study of 145 anxiety clinic attenders. Construct validity of these items was tested, using a multi-trait multi-method approach and hypotheses regarding extreme groups, in a study with a sample of 51 general practice and community psychiatric service patients. Results: The instrument is brief to administer and has proved feasible for use in various settings. Inter-rater reliabilities for major categories, measured by the kappa statistic, exceeded 0.60 in most cases; for the summary category of all perceived needs, inter-rater reliability was 0.62. The multi-trait multi-method approach lent support to the construct validity of the instrument, as did findings in extreme groups. Conclusions: The PNCQ shows acceptable feasibility, reliability and validity, adding to the range of assessment tools available for epidemiological and health services research.
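For readers unfamiliar with the kappa statistic used in the abstract above to report inter-rater reliability, the following is a minimal sketch of Cohen's kappa for two raters. It assumes the unweighted two-rater form; the abstract does not say which variant was used. The function name `cohen_kappa` and the example ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies,
    # summed over labels.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical ratings using the four levels of perceived need.
rater_1 = ["no need", "unmet", "met", "met", "partially met", "no need"]
rater_2 = ["no need", "unmet", "met", "partially met", "partially met", "no need"]
print(round(cohen_kappa(rater_1, rater_2), 3))
```

Kappa rescales raw agreement so that 0 corresponds to agreement expected by chance and 1 to perfect agreement, which is why it is preferred to a simple percentage of matching ratings.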
Abstract:
Item noise models of recognition assert that interference at retrieval is generated by the words from the study list. Context noise models of recognition assert that interference at retrieval is generated by the contexts in which the test word has appeared. The authors introduce the bind cue decide model of episodic memory, a Bayesian context noise model, and demonstrate how it can account for data from the item noise and dual-processing approaches to recognition memory. From the item noise perspective, list strength and list length effects, the mirror effect for word frequency and concreteness, and the effects of the similarity of other words in a list are considered. From the dual-processing perspective, process dissociation data on the effects of length, temporal separation of lists, strength, and diagnosticity of context are examined. The authors conclude that the context noise approach to recognition is a viable alternative to existing approaches.
Abstract:
Intelligent design theorist William Dembski has proposed an explanatory filter for distinguishing between events due to chance, lawful regularity or design. We show that if Dembski's filter were adopted as a scientific heuristic, some classical developments in science would not be rational, and that Dembski's assertion that the filter reliably identifies rarefied design requires ignoring the state of background knowledge. If background information changes even slightly, the filter's conclusion will vary wildly. Dembski fails to overcome Hume's objections to arguments from design.
Abstract:
We have previously found an association between variations in schizophrenia birth rates and varying levels of perinatal sunshine duration. This study examines whether such an association can also be found for (a) affective psychosis, and (b) broadly defined non-affective psychoses. Data for individuals born between 1931 and 1970 in Australia with ICD9 Other Psychosis (295–299) were obtained from the Queensland Mental Health Statistical System. ‘Affective psychosis’ included affective psychosis, schizo-affective psychosis, and depressive and excitative non-organic psychoses. ‘Non-affective psychosis’ included schizophrenia, paranoid disorders and other non-organic psychoses. Those receiving both affective and non-affective psychotic diagnoses were excluded. Rates per 10,000 live monthly general population births were calculated. For each month, we assessed the agreement (using the kappa statistic) between trends in (a) birth rates and (b) long-term trends in seasonally adjusted perinatal sunshine duration. The analyses were performed separately for males and females. There were 6265 with non-affective psychosis (M = 3964, rate 66/10,000; F = 2299, 44/10,000) and 2858 with affective psychosis (M = 1392, 24/10,000; F = 1466, 28/10,000). There were no significant associations between (a) affective psychosis birth rates for either males or females and (b) sunshine duration. There was a significant association between non-affective psychosis birth rates for males only and (b) sunshine duration (kappa = 0.15, p < 0.001). This suggests that, as a risk factor, the effect of reduced perinatal sunshine is specifically associated with males who develop non-affective psychosis. The Stanley Foundation supported this project.
Abstract:
The phylogeny of the Australian legume genus Daviesia was estimated using sequences of the internal transcribed spacers of nuclear ribosomal DNA. Partial congruence was found with previous analyses using morphology, including strong support for monophyly of the genus and for a sister group relationship between the clade D. anceps + D. pachyloma and the rest of the genus. A previously unplaced bird-pollinated species, D. epiphyllum, was well supported as sister to the only other bird-pollinated species in the genus, D. speciosa, indicating a single origin of bird pollination in their common ancestor. Other morphological groups within Daviesia were not supported and require reassessment. A strong and previously unreported sister clade of Daviesia consists of the two monotypic genera Erichsenia and Viminaria. These share phyllode-like leaves and indehiscent fruits. The evolutionary history of cord roots, which have anomalous secondary thickening, was explored using parsimony. Cord roots are limited to three separate clades but have a complex history involving a small number of gains (most likely 0-3) and losses (0-5). The anomalous structure of cord roots (adventitious vascular strands embedded in a parenchymatous matrix) may facilitate nutrient storage, and the roots may be contractile. Both functions may be related to a postfire resprouting adaptation. Alternatively, cord roots may be an adaptation to the low-nutrient lateritic soils of Western Australia. However, tests for association between root type, soil type, and growth habit were equivocal, depending on whether the variables were treated as phylogenetically dependent (insignificant) or independent (significant).