939 results for STATISTICAL DATA
Abstract:
An appropriate model of recent human evolution is not only important for understanding our own history, but is also necessary to disentangle the effects of demography and selection on genome diversity. Although most genetic data support the view that our species originated recently in Africa, it is still unclear whether it completely replaced former members of the Homo genus or whether some interbreeding occurred during its range expansion. Several scenarios of modern human evolution have been proposed on the basis of molecular and paleontological data, but their likelihood has never been statistically assessed. Using DNA data from 50 nuclear loci sequenced in African, Asian and Native American samples, we show here by extensive simulations that a simple African replacement model with exponential growth has a higher probability (78%) than alternative multiregional evolution or assimilation scenarios. A Bayesian analysis of the data under this best-supported model points to an origin of our species approximately 141 thousand years ago (Kya), an exit out of Africa approximately 51 Kya, and a recent colonization of the Americas approximately 10.5 Kya. We also find that the African replacement model explains not only the shallow ancestry of mtDNA and Y chromosomes but also the occurrence of deep lineages at some autosomal loci, which had previously been interpreted as a sign of interbreeding with Homo erectus.
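A minimal sketch (not the authors' pipeline) of the kind of simulation-based, ABC-style scenario comparison described in this abstract: summary statistics are simulated under competing demographic models with parameters drawn from priors, and each model's approximate posterior probability is the fraction of accepted simulations it contributes. The summary statistic, priors and tolerance below are entirely hypothetical placeholders.

```python
# Toy rejection-ABC model choice; all quantities are hypothetical stand-ins for
# the multi-locus sequence simulations used in the actual study.
import numpy as np

rng = np.random.default_rng(1)
obs_stat = 0.8                      # hypothetical observed summary statistic

def simulate(model, n):
    """Draw parameters from each model's prior and return a toy summary statistic."""
    if model == "replacement":
        growth = rng.uniform(1.0, 10.0, n)
        return np.log(growth) + rng.normal(0, 0.5, n)
    else:  # "multiregional"
        migration = rng.uniform(0.0, 1.0, n)
        return migration + rng.normal(0, 0.5, n)

n_sims, tol = 100_000, 0.05
models = np.array(["replacement", "multiregional"])
choice = rng.integers(0, 2, n_sims)          # equal prior model probabilities
stats = np.where(choice == 0,
                 simulate("replacement", n_sims),
                 simulate("multiregional", n_sims))
accepted = np.abs(stats - obs_stat) < tol
post = np.bincount(choice[accepted], minlength=2) / accepted.sum()
print(dict(zip(models, post.round(3))))      # approximate posterior model probabilities
```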
Recent developments in genetic data analysis: what can they tell us about human demographic history?
Abstract:
Over the last decade, a number of new methods of population genetic analysis based on likelihood have been introduced. This review describes and explains the general statistical techniques that have recently been used, and discusses the underlying population genetic models. Experimental papers that use these methods to infer human demographic and phylogeographic history are reviewed. It appears that the use of likelihood has hitherto had little impact in the field of human population genetics, which is still primarily driven by more traditional approaches. However, with the current uncertainty about the effects of natural selection, population structure and ascertainment of single-nucleotide polymorphism markers, it is suggested that likelihood-based methods may have a greater impact in the future.
Abstract:
Heterogeneity in lifetime data may be modelled by multiplying an individual's hazard by an unobserved frailty. We test for the presence of frailty of this kind in univariate and bivariate data with Weibull-distributed lifetimes, using statistics based on the ordered Cox-Snell residuals from the null model of no frailty. The form of the statistics is suggested by outlier testing in the gamma distribution. We find through simulation that the sum of the k largest or k smallest order statistics, for suitably chosen k, provides a powerful test when the frailty distribution is assumed to be gamma or positive stable, respectively. We provide recommended values of k for sample sizes up to 100 and simple formulae for estimated critical values for tests at the 5% level.
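The test idea can be sketched as follows (a toy illustration, not the paper's recommended k values or critical-value formulae): fit the Weibull null model, compute the Cox-Snell residuals as the fitted cumulative hazards, take the sum of the k largest order statistics, and compare it against a critical value simulated under the null.

```python
# Sketch of a "sum of the k largest Cox-Snell residuals" frailty test.
# k, sample size and the simulated 5% critical value are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def top_k_statistic(times, k):
    shape, loc, scale = stats.weibull_min.fit(times, floc=0)   # null Weibull fit
    residuals = (times / scale) ** shape                       # Cox-Snell residuals
    return np.sort(residuals)[-k:].sum()

# simulate the null distribution of the statistic for this sample size
n, k, n_rep = 100, 5, 2000
null = np.array([top_k_statistic(stats.weibull_min.rvs(1.5, scale=2.0,
                                                       size=n, random_state=rng), k)
                 for _ in range(n_rep)])
crit = np.quantile(null, 0.95)                                 # 5%-level critical value

# example data: gamma frailty multiplying the Weibull hazard
frailty = rng.gamma(shape=2.0, scale=0.5, size=n)
t_frail = stats.weibull_min.rvs(1.5, scale=2.0, size=n, random_state=rng) / frailty ** (1 / 1.5)
print(top_k_statistic(t_frail, k) > crit)    # True suggests frailty is present
```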
Abstract:
Resolving the relationships between Metazoa and other eukaryotic groups as well as between metazoan phyla is central to the understanding of the origin and evolution of animals. The current view is based on limited data sets, either a single gene with many species (e.g., ribosomal RNA) or many genes but with only a few species. Because a reliable phylogenetic inference simultaneously requires numerous genes and numerous species, we assembled a very large data set containing 129 orthologous proteins (approximately 30,000 aligned amino acid positions) for 36 eukaryotic species. Included in the alignments are data from the choanoflagellate Monosiga ovata, obtained through the sequencing of about 1,000 cDNAs. We provide conclusive support for choanoflagellates as the closest relative of animals and for fungi as the second closest. The monophyly of Plantae and chromalveolates was recovered but without strong statistical support. Within animals, in contrast to the monophyly of Coelomata observed in several recent large-scale analyses, we recovered a paraphyletic Coelomata, with nematodes and platyhelminths nested within. To include a diverse sample of organisms, data from EST projects were used for several species, resulting in a large amount of missing data in our alignment (about 25%). By using different approaches, we verify that the inferred phylogeny is not sensitive to these missing data. Therefore, this large data set provides a reliable phylogenetic framework for studying eukaryotic and animal evolution and will be easily extendable when large amounts of sequence information become available from a broader taxonomic range.
Abstract:
Background: MHC Class I molecules present antigenic peptides to cytotoxic T cells, which forms an integral part of the adaptive immune response. Peptides are bound within a groove formed by the MHC heavy chain. Previous approaches to MHC Class I-peptide binding prediction have largely concentrated on the peptide anchor residues located at the P2 and C-terminus positions. Results: A large dataset comprising MHC-peptide structural complexes was created by remodelling pre-determined X-ray crystallographic structures. Static energetic analysis, following energy minimisation, was performed on the dataset in order to characterise interactions between bound peptides and the MHC Class I molecule, partitioning the interactions within the groove into van der Waals, electrostatic and total non-bonded energy contributions. Conclusion: The QSAR techniques of Genetic Function Approximation (GFA) and Genetic Partial Least Squares (G/PLS) were used to identify key interactions between the two molecules by comparing the calculated energy values with experimentally determined BL50 data. Although the peptide termini binding interactions help ensure the stability of the MHC Class I-peptide complex, the central region of the peptide is also important in defining the specificity of the interaction. As thermodynamic studies indicate that peptide association and dissociation may be driven entropically, it may be necessary to incorporate entropic contributions into future calculations.
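For illustration only, the sketch below relates per-position interaction-energy descriptors to binding data with plain PLS regression; the genetic-algorithm variable selection performed by GFA and G/PLS is not reproduced, and all data are synthetic placeholders.

```python
# Plain PLS regression as a stand-in for G/PLS-style QSAR modelling.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n_complexes, n_descriptors = 120, 27    # e.g. vdW/electrostatic/total terms per peptide position
X = rng.normal(size=(n_complexes, n_descriptors))        # energy terms (hypothetical)
true_w = np.zeros(n_descriptors)
true_w[[1, 4, 13, 25]] = [0.8, -0.6, 0.5, 0.4]
y = X @ true_w + rng.normal(scale=0.3, size=n_complexes)  # stand-in for (log) BL50 values

pls = PLSRegression(n_components=3)
q2 = cross_val_score(pls, X, y, cv=5, scoring="r2")
pls.fit(X, y)
# descriptors with the largest absolute coefficients flag the energy terms
# that dominate binding in this toy example
top = np.argsort(np.abs(pls.coef_.ravel()))[::-1][:4]
print("cross-validated R2:", q2.mean().round(3), "| key descriptors:", top)
```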
Abstract:
The application of prediction theories has been widely practised for many years in industries such as manufacturing, defence and aerospace. Although these theories are not new, they have not been widely applied within the building services industry. Collectively, the building services industry should take a deeper look at these approaches in comparison with the traditional deterministic approaches currently being practised. By extending their application into this industry, this paper seeks to provide an overview of how simplified stochastic modelling, coupled with availability and reliability predictions using historical data compiled from various sources, could enhance the quality of building services systems.
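As a hedged illustration of what such a stochastic prediction might look like, the sketch below simulates failure and repair cycles for a single building-services component by Monte Carlo, using an assumed Weibull failure model and lognormal repair-time model; the parameter values stand in for figures that would be fitted from historical records.

```python
# Monte Carlo availability sketch from assumed failure/repair distributions.
import numpy as np

rng = np.random.default_rng(7)
n_cycles, n_runs = 200, 5000
shape, mtbf_hours = 1.8, 4000.0          # assumed Weibull failure model
mttr_hours, repair_sd = 8.0, 0.5         # assumed lognormal repair-time model

scale = mtbf_hours                       # Weibull scale taken as roughly the MTBF for this sketch
uptime = rng.weibull(shape, size=(n_runs, n_cycles)) * scale
downtime = rng.lognormal(mean=np.log(mttr_hours), sigma=repair_sd, size=(n_runs, n_cycles))
availability = uptime.sum(axis=1) / (uptime + downtime).sum(axis=1)

print("mean availability:", availability.mean().round(4))
print("5th percentile   :", np.quantile(availability, 0.05).round(4))
```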
Abstract:
A means of assessing, monitoring and controlling aggregate emissions from multi-instrument Emissions Trading Schemes is proposed. The approach allows contributions from different instruments with different forms of emissions targets to be integrated. Where Emissions Trading Schemes are helping meet specific national targets, the approach allows the entry requirements of new participants to be calculated and set at a level that will achieve these targets. The approach is multi-levelled, and may be extended downwards to support pooling of participants within instruments, or upwards to embed Emissions Trading Schemes within a wider suite of policies and measures with hard and soft targets. Aggregate emissions from each instrument are treated stochastically. Emissions from the scheme as a whole are then the joint probability distribution formed by integrating the emissions from its instruments. Because a Bayesian approach is adopted, qualitative and semi-qualitative data from expert opinion can be used where quantitative data is not currently available, or is incomplete. This approach helps government retain sufficient control over emissions trading scheme targets to allow them to meet their emissions reduction obligations, while minimising the need for retrospectively adjusting existing participants’ conditions of entry. This maintains participant confidence, while providing the necessary policy levers for good governance.
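The aggregation idea can be sketched as a simple Monte Carlo calculation: each instrument's annual emissions are represented by a probability distribution (fitted to monitoring data where available, elicited from experts where not), and the scheme total is the distribution of their sum, compared against a hard target. The distribution choices and numbers below are illustrative assumptions, not values from the paper.

```python
# Monte Carlo aggregation of emissions across instruments of a trading scheme.
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
# instrument A: well-monitored, approximately normal (MtCO2e)
inst_a = rng.normal(loc=12.0, scale=0.8, size=n)
# instrument B: sparse data, expert-elicited triangular distribution
inst_b = rng.triangular(left=4.0, mode=5.5, right=8.0, size=n)
# instrument C: skewed baseline-and-credit style emissions (lognormal)
inst_c = rng.lognormal(mean=np.log(3.0), sigma=0.25, size=n)

total = inst_a + inst_b + inst_c
target = 22.0                                    # hypothetical national hard target (MtCO2e)
print("P(total > target) =", (total > target).mean().round(3))
print("95% upper bound   =", np.quantile(total, 0.95).round(2), "MtCO2e")
```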
Abstract:
Baking and 2-g mixograph analyses were performed for 55 cultivars (19 spring and 36 winter wheat) from various quality classes from the 2002 harvest in Poland. An instrumented 2-g direct-drive mixograph was used to study the mixing characteristics of the wheat cultivars. A number of parameters were extracted automatically from each mixograph trace and correlated with baking volume and flour quality parameters (protein content and high molecular weight glutenin subunit [HMW-GS] composition by SDS-PAGE) using multiple linear regression statistical analysis. Principal component analysis of the mixograph data discriminated between four flour quality classes, and predictions of baking volume were obtained using several selected mixograph parameters, chosen using a best subsets regression routine, giving R² values of 0.862-0.866. In particular, three new spring wheat strains (CHD 502a-c) recently registered in Poland were highly discriminated and predicted to give high baking volume on the basis of two mixograph parameters: peak bandwidth and 10-min bandwidth.
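A minimal sketch, on synthetic data, of the analysis pipeline outlined above: PCA over the extracted mixograph parameters, followed by a small linear model predicting baking volume from the two bandwidth parameters highlighted in the abstract. The parameter values and class structure are invented for demonstration.

```python
# PCA of mixograph-style parameters plus a two-predictor baking-volume model.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(5)
n_cultivars, n_params = 55, 12
X = rng.normal(size=(n_cultivars, n_params))              # extracted mixograph parameters (synthetic)
peak_bw, ten_min_bw = X[:, 0], X[:, 1]
volume = 550 + 40 * peak_bw + 25 * ten_min_bw + rng.normal(scale=15, size=n_cultivars)

scores = PCA(n_components=2).fit_transform(X)             # scores used to discriminate quality classes
predictors = np.column_stack([peak_bw, ten_min_bw])
reg = LinearRegression().fit(predictors, volume)
r2 = reg.score(predictors, volume)
print("PC scores shape:", scores.shape, "| R2 from the two bandwidth parameters:", round(r2, 3))
```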
Abstract:
Wireless Personal Area Networks (WPANs) are offering high data rates suitable for interconnecting high bandwidth personal consumer devices (Wireless HD streaming, Wireless-USB and Bluetooth EDR). ECMA-368 is the Physical (PHY) and Media Access Control (MAC) backbone of many of these wireless devices. WPAN devices tend to operate in an ad-hoc based network, so it is important to successfully latch onto the network and become part of one of the available piconets. This paper presents a new algorithm for detecting the Packet/Frame Sync (PFS) signal in ECMA-368 to identify piconets and aid symbol timing. The algorithm is based on correlating the received PFS symbols with the expected locally stored symbols over the 24 or 12 PFS symbols, selecting the likely time-frequency code (TFC) as the statistical mode of the 24 or 12 best correlation results. The results are very favorable, showing an improvement margin on the order of 11.5 dB in reference sensitivity tests between the performance achieved with this algorithm and that of comparable systems.
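The detection rule described above can be sketched as follows: correlate each received PFS symbol against the locally stored candidate sequences, record which candidate wins for each symbol, and choose the TFC as the statistical mode of the 24 (or 12) per-symbol winners. The stored sequences below are random placeholders, not the actual ECMA-368 preamble definitions.

```python
# Correlation-then-mode TFC detection sketch with placeholder sequences.
import numpy as np

rng = np.random.default_rng(2)
n_candidates, n_symbols, sym_len = 7, 24, 128
stored = rng.choice([-1.0, 1.0], size=(n_candidates, sym_len))      # local reference symbols

true_tfc = 3
received = stored[true_tfc] + rng.normal(scale=1.5, size=(n_symbols, sym_len))  # noisy repeats

corr = received @ stored.T                       # correlation of each symbol with each candidate
per_symbol_best = corr.argmax(axis=1)            # best-matching candidate per PFS symbol
values, counts = np.unique(per_symbol_best, return_counts=True)
detected_tfc = values[counts.argmax()]           # statistical mode over the 24 symbols
print("detected TFC index:", detected_tfc)
```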
Abstract:
Population size estimation with discrete or nonparametric mixture models is considered, and reliable ways of constructing the nonparametric mixture model estimator are reviewed and set into perspective. Construction of the maximum likelihood estimator of the mixing distribution is done for any number of components up to the global nonparametric maximum likelihood bound using the EM algorithm. In addition, the estimators of Chao and Zelterman are considered, with some generalisations of Zelterman's estimator. All computations are done with CAMCR, software developed specifically for population size estimation with mixture models. Several examples and data sets are discussed and the estimators illustrated. Problems using the mixture model-based estimators are highlighted.
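The two closed-form estimators named here are easy to sketch from the frequency-of-frequencies counts (the nonparametric-mixture EM fit and the CAMCR software itself are not reproduced, and the counts below are illustrative): Chao's lower bound is n + f1²/(2·f2), and Zelterman's estimator is n / (1 − exp(−2·f2/f1)), where f1 and f2 are the numbers of units observed exactly once and exactly twice.

```python
# Chao and Zelterman population-size estimators from illustrative count data.
import numpy as np

counts = np.array([3, 1, 2, 1, 1, 4, 2, 1, 1, 1, 2, 1, 5, 1, 2])  # times each observed unit was seen
n = counts.size                         # number of distinct units observed
f1 = np.sum(counts == 1)                # units seen exactly once
f2 = np.sum(counts == 2)                # units seen exactly twice

chao = n + f1 ** 2 / (2 * f2)                        # Chao's lower-bound estimator
lam_hat = 2 * f2 / f1                                # Zelterman's Poisson-rate estimate
zelterman = n / (1 - np.exp(-lam_hat))               # Zelterman's population-size estimator
print(f"observed n = {n}, Chao = {chao:.1f}, Zelterman = {zelterman:.1f}")
```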
Abstract:
It is generally assumed that the variability of neuronal morphology has an important effect on both the connectivity and the activity of the nervous system, but this effect has not been thoroughly investigated. Neuroanatomical archives represent a crucial tool to explore structure-function relationships in the brain. We are developing computational tools to describe, generate, store and render large sets of three-dimensional neuronal structures in a format that is compact, quantitative, accurate and readily accessible to the neuroscientist. Single-cell neuroanatomy can be characterized quantitatively at several levels. In computer-aided neuronal tracing files, a dendritic tree is described as a series of cylinders, each represented by diameter, spatial coordinates and the connectivity to other cylinders in the tree. This ‘Cartesian’ description constitutes a completely accurate mapping of dendritic morphology but it bears little intuitive information for the neuroscientist. In contrast, a classical neuroanatomical analysis characterizes neuronal dendrites on the basis of the statistical distributions of morphological parameters, e.g. maximum branching order or bifurcation asymmetry. This description is intuitively more accessible, but it only yields information on the collective anatomy of a group of dendrites, i.e. it is not complete enough to provide a precise ‘blueprint’ of the original data. We are adopting a third, intermediate level of description, which consists of the algorithmic generation of neuronal structures within a certain morphological class based on a set of ‘fundamental’, measured parameters. This description is as intuitive as a classical neuroanatomical analysis (parameters have an intuitive interpretation), and as complete as a Cartesian file (the algorithms generate and display complete neurons). The advantages of the algorithmic description of neuronal structure are immense. If an algorithm can measure the values of a handful of parameters from an experimental database and generate virtual neurons whose anatomy is statistically indistinguishable from that of their real counterparts, a great deal of data compression and amplification can be achieved. Data compression results from the quantitative and complete description of thousands of neurons with a handful of statistical distributions of parameters. Data amplification is possible because, from a set of experimental neurons, many more virtual analogues can be generated. This approach could allow one, in principle, to create and store a neuroanatomical database containing data for an entire human brain in a personal computer. We are using two programs, L-NEURON and ARBORVITAE, to investigate systematically the potential of several different algorithms for the generation of virtual neurons. Using these programs, we have generated anatomically plausible virtual neurons for several morphological classes, including guinea pig cerebellar Purkinje cells and cat spinal cord motor neurons. These virtual neurons are stored in an online electronic archive of dendritic morphology. This process highlights the potential and the limitations of the ‘computational neuroanatomy’ strategy for neuroscience databases.
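As a toy sketch of the 'algorithmic' level of description, the code below grows a virtual dendritic tree by recursively sampling segment lengths, diameters and bifurcation decisions from statistical distributions. In practice (as in L-NEURON or ARBORVITAE) the distribution parameters would be measured from traced neurons; here both the distribution choices and the parameter values are invented.

```python
# Recursive generation of a virtual dendritic tree from assumed parameter distributions.
import numpy as np

rng = np.random.default_rng(11)

def grow(diameter, depth=0, max_depth=8):
    """Return a nested dict describing one (virtual) dendritic subtree."""
    length = rng.gamma(shape=2.0, scale=15.0)          # segment length (um), assumed gamma
    node = {"diam": round(diameter, 2), "len": round(length, 1), "children": []}
    p_branch = 0.7 * np.exp(-0.25 * depth)             # branching gets rarer with depth
    if depth < max_depth and rng.random() < p_branch:
        ratio = rng.beta(5, 2)                         # daughter/parent diameter ratio
        for _ in range(2):                             # bifurcate into two daughters
            node["children"].append(grow(diameter * ratio, depth + 1, max_depth))
    return node

def count_tips(node):
    return 1 if not node["children"] else sum(count_tips(c) for c in node["children"])

virtual_dendrite = grow(diameter=2.5)
print("terminal tips in this virtual dendrite:", count_tips(virtual_dendrite))
```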
Abstract:
We compared output from 3 dynamic process-based models (DMs: ECOSSE, MILLENNIA and the Durham Carbon Model) and 9 bioclimatic envelope models (BCEMs; including BBOG ensemble and PEATSTASH) ranging from simple threshold to semi-process-based models. Model simulations were run at 4 British peatland sites using historical climate data and climate projections under a medium (A1B) emissions scenario from the 11-RCM (regional climate model) ensemble underpinning UKCP09. The models showed that blanket peatlands are vulnerable to projected climate change; however, predictions varied between models as well as between sites. All BCEMs predicted a shift from presence to absence of a climate associated with blanket peat, where the sites with the lowest total annual precipitation were closest to the presence/absence threshold. DMs showed a more variable response. ECOSSE predicted a decline in net C sink and shift to net C source by the end of this century. The Durham Carbon Model predicted a smaller decline in the net C sink strength, but no shift to net C source. MILLENNIA predicted a slight overall increase in the net C sink. In contrast to the BCEM projections, the DMs predicted that the sites with coolest temperatures and greatest total annual precipitation showed the largest change in carbon sinks. In this model inter-comparison, the greatest variation in model output in response to climate change projections was not between the BCEMs and DMs but between the DMs themselves, because of different approaches to modelling soil organic matter pools and decomposition amongst other processes. The difference in the sign of the response has major implications for future climate feedbacks, climate policy and peatland management. Enhanced data collection, in particular monitoring peatland response to current change, would significantly improve model development and projections of future change.
Abstract:
We assessed the vulnerability of blanket peat to climate change in Great Britain using an ensemble of 8 bioclimatic envelope models. We used 4 published models that ranged from simple threshold models, based on total annual precipitation, to Generalised Linear Models (GLMs, based on mean annual temperature). In addition, 4 new models were developed which included measures of water deficit as threshold, classification tree, GLM and generalised additive models (GAM). Models that included measures of both hydrological conditions and maximum temperature provided a better fit to the mapped peat area than models based on hydrological variables alone. Under UKCIP02 projections for high (A1FI) and low (B1) greenhouse gas emission scenarios, 7 out of the 8 models showed a decline in the bioclimatic space associated with blanket peat. Eastern regions (Northumbria, North York Moors, Orkney) were shown to be more vulnerable than higher-altitude, western areas (Highlands, Western Isles and Argyle, Bute and The Trossachs). These results suggest a long-term decline in the distribution of actively growing blanket peat, especially under the high emissions scenario, although it is emphasised that existing peatlands may well persist for decades under a changing climate. Observational data from long-term monitoring and manipulation experiments in combination with process-based models are required to explore the nature and magnitude of climate change impacts on these vulnerable areas more fully.
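One of the model forms mentioned, a GLM (logistic) bioclimatic envelope, can be sketched as below on synthetic grid cells: presence/absence of blanket peat is modelled from mean annual temperature, total annual precipitation and a simple water-deficit proxy, and the fitted envelope is then applied to perturbed climate values. A real fit would use the mapped peat area and UKCIP02 climate surfaces; everything here, including the deficit proxy, is an assumption.

```python
# Logistic-GLM bioclimatic envelope sketch on synthetic grid-cell data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)
n_cells = 2000
temp = rng.normal(7.5, 2.0, n_cells)                 # mean annual temperature (degC)
precip = rng.normal(1400, 450, n_cells)              # total annual precipitation (mm)
deficit = np.clip(900 - precip + 60 * (temp - 7), 0, None)   # crude water-deficit proxy (mm)

# synthetic 'truth': peat favoured where it is cool, wet and deficit is small
logit = -1.0 - 0.8 * (temp - 7.5) + 0.004 * (precip - 1400) - 0.01 * deficit
presence = rng.random(n_cells) < 1 / (1 + np.exp(-logit))

envelope = LogisticRegression().fit(np.column_stack([temp, precip, deficit]), presence)
future = np.column_stack([temp + 3.0, precip * 0.95,
                          np.clip(900 - precip * 0.95 + 60 * (temp + 3 - 7), 0, None)])
print("current suitable fraction:", presence.mean().round(2),
      "| projected suitable fraction:", envelope.predict(future).mean().round(2))
```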
Abstract:
We explore the potential for making statistical decadal predictions of sea surface temperatures (SSTs) in a perfect model analysis, with a focus on the Atlantic basin. Various statistical methods (lagged correlations, Linear Inverse Modelling and Constructed Analogue) are found to have significant skill in predicting the internal variability of Atlantic SSTs for up to a decade ahead in control integrations of two different global climate models (GCMs), namely HadCM3 and HadGEM1. Statistical methods which consider non-local information tend to perform best, but the most successful method depends on the region considered, the GCM data used and the prediction lead time. However, the Constructed Analogue method tends to have the highest skill at longer lead times. Importantly, the regions of greatest prediction skill can be very different to regions identified as potentially predictable from variance explained arguments. This finding suggests that significant local decadal variability is not necessarily a prerequisite for skilful decadal predictions, and that the statistical methods are capturing some of the dynamics of low-frequency SST evolution. In particular, using data from HadGEM1, significant skill at lead times of 6–10 years is found in the tropical North Atlantic, a region with relatively little decadal variability compared to interannual variability. This skill appears to come from reconstructing the SSTs in the far north Atlantic, suggesting that the more northern latitudes are optimal for SST observations to improve predictions. We additionally explore whether adding sub-surface temperature data improves these decadal statistical predictions, and find that, again, it depends on the region, prediction lead time and GCM data used. Overall, we argue that the estimated prediction skill motivates the further development of statistical decadal predictions of SSTs as a benchmark for current and future GCM-based decadal climate predictions.
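The Constructed Analogue method can be sketched as follows: the current SST anomaly field is expressed as a least-squares combination of past states from a long control run, and the same weights are then applied to the states observed a lead time later. The fields below are random stand-ins for GCM output, not HadCM3 or HadGEM1 data.

```python
# Constructed Analogue decadal prediction sketch on toy anomaly fields.
import numpy as np

rng = np.random.default_rng(4)
n_years, n_gridpoints, lead = 400, 300, 10
sst = np.cumsum(rng.normal(size=(n_years, n_gridpoints)), axis=0) * 0.1   # toy low-frequency fields
sst -= sst.mean(axis=0)                                                   # anomalies

library = sst[: n_years - lead - 1]                  # candidate past analogue states
library_future = sst[lead : n_years - 1]             # the same states, 'lead' years later
target = sst[n_years - 1]                            # current state to be matched

weights, *_ = np.linalg.lstsq(library.T, target, rcond=None)   # constructed-analogue weights
forecast = library_future.T @ weights                          # prediction 'lead' years ahead
print("forecast field shape:", forecast.shape,
      "| reconstruction error:", np.linalg.norm(library.T @ weights - target).round(3))
```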