909 resultados para Automatic Inference
Resumo:
The rate at which a given site in a gene sequence alignment evolves over time may vary. This phenomenon-known as heterotachy-can bias or distort phylogenetic trees inferred from models of sequence evolution that assume rates of evolution are constant. Here, we describe a phylogenetic mixture model designed to accommodate heterotachy. The method sums the likelihood of the data at each site over more than one set of branch lengths on the same tree topology. A branch-length set that is best for one site may differ from the branch-length set that is best for some other site, thereby allowing different sites to have different rates of change throughout the tree. Because rate variation may not be present in all branches, we use a reversible-jump Markov chain Monte Carlo algorithm to identify those branches in which reliable amounts of heterotachy occur. We implement the method in combination with our 'pattern-heterogeneity' mixture model, applying it to simulated data and five published datasets. We find that complex evolutionary signals of heterotachy are routinely present over and above variation in the rate or pattern of evolution across sites, that the reversible-jump method requires far fewer parameters than conventional mixture models to describe it, and serves to identify the regions of the tree in which heterotachy is most pronounced. The reversible-jump procedure also removes the need for a posteriori tests of 'significance' such as the Akaike or Bayesian information criterion tests, or Bayes factors. Heterotachy has important consequences for the correct reconstruction of phylogenies as well as for tests of hypotheses that rely on accurate branch-length information. These include molecular clocks, analyses of tempo and mode of evolution, comparative studies and ancestral state reconstruction. The model is available from the authors' website, and can be used for the analysis of both nucleotide and morphological data.
Resumo:
Stephens and Donnelly have introduced a simple yet powerful importance sampling scheme for computing the likelihood in population genetic models. Fundamental to the method is an approximation to the conditional probability of the allelic type of an additional gene, given those currently in the sample. As noted by Li and Stephens, the product of these conditional probabilities for a sequence of draws that gives the frequency of allelic types in a sample is an approximation to the likelihood, and can be used directly in inference. The aim of this note is to demonstrate the high level of accuracy of "product of approximate conditionals" (PAC) likelihood when used with microsatellite data. Results obtained on simulated microsatellite data show that this strategy leads to a negligible bias over a wide range of the scaled mutation parameter theta. Furthermore, the sampling variance of likelihood estimates as well as the computation time are lower than that obtained with importance sampling on the whole range of theta. It follows that this approach represents an efficient substitute to IS algorithms in computer intensive (e.g. MCMC) inference methods in population genetics. (c) 2006 Elsevier Inc. All rights reserved.
Resumo:
Varroa destructor is a parasitic mite of the Eastern honeybee Apis cerana. Fifty years ago, two distinct evolutionary lineages (Korean and Japanese) invaded the Western honeybee Apis mellifera. This haplo-diploid parasite species reproduces mainly through brother sister matings, a system which largely favors the fixation of new mutations. In a worldwide sample of 225 individuals from 21 locations collected on Western honeybees and analyzed at 19 microsatellite loci, a series of de novo mutations was observed. Using historical data concerning the invasion, this original biological system has been exploited to compare three mutation models with allele size constraints for microsatellite markers: stepwise (SMM) and generalized (GSM) mutation models, and a model with mutation rate increasing exponentially with microsatellite length (ESM). Posterior probabilities of the three models have been estimated for each locus individually using reversible jump Markov Chain Monte Carlo. The relative support of each model varies widely among loci, but the GSM is the only model that always receives at least 9% support, whatever the locus. The analysis also provides robust estimates of mutation parameters for each locus and of the divergence time of the two invasive lineages (67,000 generations with a 90% credibility interval of 35,000-174,000). With an average of 10 generations per year, this divergence time fits with the last post-glacial Korea Japan land separation. (c) 2005 Elsevier Inc. All rights reserved.
Resumo:
Accurately measured peptide masses can be used for large-scale protein identification from bacterial whole-cell digests as an alternative to tandem mass spectrometry (MS/MS) provided mass measurement errors of a few parts-per-million (ppm) are obtained. Fourier transform ion cyclotron resonance (FTICR) mass spectrometry (MS) routinely achieves such mass accuracy either with internal calibration or by regulating the charge in the analyzer cell. We have developed a novel and automated method for internal calibration of liquid chromatography (LC)/FTICR data from whole-cell digests using peptides in the sample identified by concurrent MS/MS together with ambient polydimethyl-cyclosiloxanes as internal calibrants in the mass spectra. The method reduced mass measurement error from 4.3 +/- 3.7 ppm to 0.3 +/- 2.3 ppm in an E. coli LC/FTICR dataset of 1000 MS and MS/MS spectra and is applicable to all analyses of complex protein digests by FTICRMS. Copyright (c) 2006 John Wiley & Sons, Ltd.
Resumo:
Event-related functional magnetic resonance imaging (efMRI) has emerged as a powerful technique for detecting brains' responses to presented stimuli. A primary goal in efMRI data analysis is to estimate the Hemodynamic Response Function (HRF) and to locate activated regions in human brains when specific tasks are performed. This paper develops new methodologies that are important improvements not only to parametric but also to nonparametric estimation and hypothesis testing of the HRF. First, an effective and computationally fast scheme for estimating the error covariance matrix for efMRI is proposed. Second, methodologies for estimation and hypothesis testing of the HRF are developed. Simulations support the effectiveness of our proposed methods. When applied to an efMRI dataset from an emotional control study, our method reveals more meaningful findings than the popular methods offered by AFNI and FSL. (C) 2008 Elsevier B.V. All rights reserved.
Resumo:
Ranald Roderick Macdonald (1945-2007) was an important contributor to mathematical psychology in the UK, as a referee and action editor for British Journal of Mathematical and Statistical Psychology and as a participant and organizer at the British Psychological Society's Mathematics, statistics and computing section meetings. This appreciation argues that his most important contribution was to the foundations of significance testing, where his concern about what information was relevant in interpreting the results of significance tests led him to be a persuasive advocate for the 'Weak Fisherian' form of hypothesis testing.
Resumo:
The utility of an "ecologically rational" recognition-based decision rule in multichoice decision problems is analyzed, varying the type of judgment required (greater or lesser). The maximum size and range of a counterintuitive advantage associated with recognition-based judgment (the "less-is-more effect") is identified for a range of cue validity values. Greater ranges of the less-is-more effect occur when participants are asked which is the greatest of to choices (m > 2) than which is the least. Less-is-more effects also have greater range for larger values of in. This implies that the classic two-altemative forced choice task, as studied by Goldstein and Gigerenzer (2002), may not be the most appropriate test case for less-is-more effects.
Resumo:
Studies of ignorance-driven decision making have been employed to analyse when ignorance should prove advantageous on theoretical grounds or else they have been employed to examine whether human behaviour is consistent with an ignorance-driven inference strategy (e. g., the recognition heuristic). In the current study we examine whether-under conditions where such inferences might be expected-the advantages that theoretical analyses predict are evident in human performance data. A single experiment shows that, when asked to make relative wealth judgements, participants reliably use recognition as a basis for their judgements. Their wealth judgements under these conditions are reliably more accurate when some of the target names are unknown than when participants recognize all of the names (a "less-is-more effect"). These results are consistent across a number of variations: the number of options given to participants and the nature of the wealth judgement. A basic model of recognition-based inference predicts these effects.
Resumo:
“Fast & frugal” heuristics represent an appealing way of implementing bounded rationality and decision-making under pressure. The recognition heuristic is the simplest and most fundamental of these heuristics. Simulation and experimental studies have shown that this ignorance-driven heuristic inference can prove superior to knowledge based inference (Borges, Goldstein, Ortman & Gigerenzer, 1999; Goldstein & Gigerenzer, 2002) and have shown how the heuristic could develop from ACT-R’s forgetting function (Schooler & Hertwig, 2005). Mathematical analyses also demonstrate that, under certain conditions, a “less-is-more effect” will always occur (Goldstein & Gigerenzer, 2002). The further analyses presented in this paper show, however, that these conditions may constitute a special case and that the less-is-more effect in decision-making is subject to the moderating influence of the number of options to be considered and the framing of the question.
Resumo:
The identification of criminal networks is not a routine exploratory process within the current practice of the law enforcement authorities; rather it is triggered by specific evidence of criminal activity being investigated. A network is identified when a criminal comes to notice and any associates who could also be potentially implicated would need to be identified if only to be eliminated from the enquiries as suspects or witnesses as well as to prevent and/or detect crime. However, an identified network may not be the one causing most harm in a given area.. This paper identifies a methodology to identify all of the criminal networks that are present within a Law Enforcement Area, and, prioritises those that are causing most harm to the community. Each crime is allocated a score based on its crime type and how recently the crime was committed; the network score, which can be used as decision support to help prioritise it for law enforcement purposes, is the sum of the individual crime scores.
Resumo:
There are still major challenges in the area of automatic indexing and retrieval of multimedia content data for very large multimedia content corpora. Current indexing and retrieval applications still use keywords to index multimedia content and those keywords usually do not provide any knowledge about the semantic content of the data. With the increasing amount of multimedia content, it is inefficient to continue with this approach. In this paper, we describe the project DREAM, which addresses such challenges by proposing a new framework for semi-automatic annotation and retrieval of multimedia based on the semantic content. The framework uses the Topic Map Technology, as a tool to model the knowledge automatically extracted from the multimedia content using an Automatic Labelling Engine. We describe how we acquire knowledge from the content and represent this knowledge using the support of NLP to automatically generate Topic Maps. The framework is described in the context of film post-production.
Resumo:
The main activity carried out by the geophysicist when interpreting seismic data, in terms of both importance and time spent is tracking (or picking) seismic events. in practice, this activity turns out to be rather challenging, particularly when the targeted event is interrupted by discontinuities such as geological faults or exhibits lateral changes in seismic character. In recent years, several automated schemes, known as auto-trackers, have been developed to assist the interpreter in this tedious and time-consuming task. The automatic tracking tool available in modem interpretation software packages often employs artificial neural networks (ANN's) to identify seismic picks belonging to target events through a pattern recognition process. The ability of ANNs to track horizons across discontinuities largely depends on how reliably data patterns characterise these horizons. While seismic attributes are commonly used to characterise amplitude peaks forming a seismic horizon, some researchers in the field claim that inherent seismic information is lost in the attribute extraction process and advocate instead the use of raw data (amplitude samples). This paper investigates the performance of ANNs using either characterisation methods, and demonstrates how the complementarity of both seismic attributes and raw data can be exploited in conjunction with other geological information in a fuzzy inference system (FIS) to achieve an enhanced auto-tracking performance.
Resumo:
The 3D reconstruction of a Golgi-stained dendritic tree from a serial stack of images captured with a transmitted light bright-field microscope is investigated. Modifications to the bootstrap filter are discussed such that the tree structure may be estimated recursively as a series of connected segments. The tracking performance of the bootstrap particle filter is compared against Differential Evolution, an evolutionary global optimisation method, both in terms of robustness and accuracy. It is found that the particle filtering approach is significantly more robust and accurate for the data considered.
Resumo:
The externally recorded electroencephalogram (EEG) is contaminated with signals that do not originate from the brain, collectively known as artefacts. Thus, EEG signals must be cleaned prior to any further analysis. In particular, if the EEG is to be used in online applications such as Brain-Computer Interfaces (BCIs) the removal of artefacts must be performed in an automatic manner. This paper investigates the robustness of Mutual Information based features to inter-subject variability for use in an automatic artefact removal system. The system is based on the separation of EEG recordings into independent components using a temporal ICA method, RADICAL, and the utilisation of a Support Vector Machine for classification of the components into EEG and artefact signals. High accuracy and robustness to inter-subject variability is achieved.