824 results for decentralised data fusion framework
Abstract:
Traffic safety engineers are among the early adopters of Bayesian statistical tools for analyzing crash data. As in many other areas of application, empirical Bayes methods were their first choice, perhaps because they represent an intuitively appealing, yet relatively easy to implement alternative to purely classical approaches. With the enormous progress in numerical methods made in recent years and with the availability of free, easy to use software that permits implementing a fully Bayesian approach, however, there is now ample justification to progress towards fully Bayesian analyses of crash data. The fully Bayesian approach, in particular as implemented via multi-level hierarchical models, has many advantages over the empirical Bayes approach. In a full Bayesian analysis, prior information and all available data are seamlessly integrated into posterior distributions on which practitioners can base their inferences. All uncertainties are thus accounted for in the analyses and there is no need to pre-process data to obtain Safety Performance Functions and other such prior estimates of the effect of covariates on the outcome of interest. In this light, fully Bayesian methods may well be less costly to implement and may result in safety estimates with more realistic standard errors. In this manuscript, we present the full Bayesian approach to analyzing traffic safety data and focus on highlighting the differences between the empirical Bayes and the full Bayes approaches. We use an illustrative example to discuss a step-by-step Bayesian analysis of the data and to show some of the types of inferences that are possible within the full Bayesian framework.
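The abstract gives no formulas, so the following is only a minimal sketch of the empirical Bayes step it contrasts with the full Bayes approach. It assumes the common gamma-Poisson setup, in which a pre-estimated Safety Performance Function supplies a prior mean and an overdispersion parameter. The function name `eb_estimate` and its arguments are hypothetical and not taken from the manuscript.

```python
def eb_estimate(observed, spf_mean, dispersion):
    """Gamma-Poisson posterior mean and variance for one site.

    Assumed model (not stated in the abstract):
      crash rate ~ Gamma(shape=dispersion, rate=dispersion / spf_mean)
      observed   ~ Poisson(crash rate)
    """
    # Weight given to the prior (SPF) prediction; the rest goes to the data.
    weight = dispersion / (dispersion + spf_mean)
    post_mean = weight * spf_mean + (1.0 - weight) * observed

    # Conjugate update: the posterior is again a gamma distribution.
    post_shape = dispersion + observed
    post_rate = dispersion / spf_mean + 1.0
    post_var = post_shape / post_rate ** 2
    return post_mean, post_var


# Hypothetical site: 10 observed crashes, SPF predicts 4, dispersion 2.
mean, var = eb_estimate(observed=10, spf_mean=4.0, dispersion=2.0)
print(round(mean, 3), round(var, 3))
```

In this sketch `spf_mean` and `dispersion` are treated as known constants, which is the simplification the abstract criticises. A fully Bayesian analysis would instead place priors on them and propagate their uncertainty into the posterior.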
Abstract:
In June 2006, the Swiss Parliament made two important decisions with regard to the governance of public registers and the identification of individuals. It adopted a new law on the harmonisation of population registers in order to simplify statistical data collection and data exchange from around 4,000 decentralized registers, and it also approved the introduction of a Unique Person Identifier (UPI). The law is rather vague about the implementation of this harmonisation, and even though many projects are currently being undertaken in this domain, most of them are quite technical. We believe there is a need for analysis tools and therefore we propose a conceptual framework based on three pillars (Privacy, Identity and Governance) to analyse the requirements in terms of data management for population registers.
Abstract:
This research provides a description of the process followed in order to assemble a "Social Accounting Matrix" for Spain corresponding to the year 2000 (SAMSP00). As argued in the paper, this process attempts to reconcile ESA95 conventions with requirements of applied general equilibrium modelling. Particularly, problems related to the level of aggregation of net taxation data, and to the valuation system used for expressing the monetary value of input-output transactions have deserved special attention. Since the adoption of ESA95 conventions, input-output transactions have been preferably valued at basic prices, which impose additional difficulties on modellers interested in computing applied general equilibrium models. This paper addresses these difficulties by developing a procedure that allows SAM-builders to change the valuation system of input-output transactions conveniently. In addition, this procedure produces new data related to net taxation information.
Abstract:
Many species contain genetic lineages that are phylogenetically intermixed with those of other species. In the Sorex araneus group, previous results based on mtDNA and Y chromosome sequence data showed an incongruent position of Sorex granarius within this group. In this study, we explored the relationship between species within the S. araneus group, aiming to resolve the particular position of S. granarius. In this context, we sequenced a total of 2447 base pairs (bp) of X-linked and nuclear genes from 47 individuals of the S. araneus group. The same taxa were also analyzed within a Bayesian framework with nine autosomal microsatellites. These analyses revealed that all markers apart from mtDNA showed similar patterns, suggesting that the problematic position of S. granarius is best explained by an incongruent behavior by mtDNA. Given their close phylogenetic relationship and their close geographic distribution, the most likely explanation for this pattern is past mtDNA introgression from S. araneus race Carlit to S. granarius.
Abstract:
This study aims to improve the accuracy and usability of Iowa Falling Weight Deflectometer (FWD) data by incorporating significant enhancements into the fully automated software system for rapid processing of the FWD data. These enhancements include: (1) refined prediction of backcalculated pavement layer modulus through deflection basin matching/optimization, (2) temperature correction of backcalculated Hot-Mix Asphalt (HMA) layer modulus, (3) computation of 1993 AASHTO design guide related effective SN (SNeff) and effective k-value (keff), (4) computation of Iowa DOT asphalt concrete (AC) overlay design related Structural Rating (SR) and k-value (k), and (5) enhancement of user-friendliness of input and output from the software tool. A high-quality, easy-to-use backcalculation software package, referred to as I-BACK (the Iowa Pavement Backcalculation Software), was developed to achieve the project goals and requirements. This report presents the theoretical background behind the incorporated enhancements as well as guidance on the use of I-BACK developed in this study. The developed tool, I-BACK, provides more fine-tuned ANN pavement backcalculation results by implementing a deflection basin matching optimizer for conventional flexible, full-depth, rigid, and composite pavements. Implementation of this tool within Iowa DOT will facilitate accurate pavement structural evaluation and rehabilitation designs for pavement/asset management purposes. This research has also set the framework for the development of a simplified FWD deflection-based HMA overlay design procedure, which is one of the recommended areas for future research.
Abstract:
The temporal dynamics of species diversity are shaped by variations in the rates of speciation and extinction, and there is a long history of inferring these rates using first and last appearances of taxa in the fossil record. Understanding diversity dynamics critically depends on unbiased estimates of the unobserved times of speciation and extinction for all lineages, but the inference of these parameters is challenging due to the complex nature of the available data. Here, we present a new probabilistic framework to jointly estimate species-specific times of speciation and extinction and the rates of the underlying birth-death process based on the fossil record. The rates are allowed to vary through time independently of each other, and the probability of preservation and sampling is explicitly incorporated in the model to estimate the true lifespan of each lineage. We implement a Bayesian algorithm to assess the presence of rate shifts by exploring alternative diversification models. Tests on a range of simulated data sets reveal the accuracy and robustness of our approach against violations of the underlying assumptions and various degrees of data incompleteness. Finally, we demonstrate the application of our method with the diversification of the mammal family Rhinocerotidae and reveal a complex history of repeated and independent temporal shifts of both speciation and extinction rates, leading to the expansion and subsequent decline of the group. The estimated parameters of the birth-death process implemented here are directly comparable with those obtained from dated molecular phylogenies. Thus, our model represents a step towards integrating phylogenetic and fossil information to infer macroevolutionary processes.
Abstract:
The bacterial insertion sequence IS21 contains two genes, istA and istB, which are organized as an operon. IS21 spontaneously forms tandem repeats designated (IS21)2. Plasmids carrying (IS21)2 react efficiently with other replicons, producing cointegrates via a cut-and-paste mechanism. Here we show that transposition of a single IS21 element (simple insertion) and cointegrate formation involving (IS21)2 result from two distinct non-replicative pathways, which are essentially due to two differentiated IstA proteins, transposase and cointegrase. In Escherichia coli, transposase was characterized as the full-length, 46 kDa product of the istA gene, whereas the 45 kDa cointegrase was expressed, in-frame, from a natural internal translation start of istA. The istB gene, which could be experimentally disconnected from istA, provided a helper protein that strongly stimulated the transposase and cointegrase-driven reactions. Site-directed mutagenesis was used to express either cointegrase or transposase from the istA gene. Cointegrase promoted replicon fusion at high frequencies by acting on IS21 ends which were linked by 2, 3, or 4 bp junction sequences in (IS21)2. By contrast, cointegrase poorly catalyzed simple insertion of IS21 elements. Transposase had intermediate, uniform activity in both pathways. The ability of transposase to synapse two widely spaced IS21 ends may reside in the eight N-terminal amino acid residues which are absent from cointegrase. Given the 2 or 3 bp spacing in naturally occurring IS21 tandems and the specialization of cointegrase, the fulminant spread of IS21 via cointegration can now be understood.
Abstract:
As a thorough aggregation of probability and graph theory, Bayesian networks currently enjoy widespread interest as a means for studying factors that affect the coherent evaluation of scientific evidence in forensic science. Paper I of this series of papers intends to contribute to the discussion of Bayesian networks as a framework that is helpful for both illustrating and implementing statistical procedures that are commonly employed for the study of uncertainties (e.g. the estimation of unknown quantities). While the respective statistical procedures are widely described in the literature, the primary aim of this paper is to offer an essentially non-technical introduction to how interested readers may use these analytical approaches - with the help of Bayesian networks - for processing their own forensic science data. Attention is mainly drawn to the structure and underlying rationale of a series of basic and context-independent network fragments that users may incorporate as building blocks while constructing larger inference models. As an example of how this may be done, the proposed concepts will be used in a second paper (Part II) for specifying graphical probability networks whose purpose is to assist forensic scientists in the evaluation of scientific evidence encountered in the context of forensic document examination (i.e. results of the analysis of black toners present on printed or copied documents).
Abstract:
Advanced neuroinformatics tools are required for methods of connectome mapping, analysis, and visualization. The inherent multi-modality of connectome datasets poses new challenges for data organization, integration, and sharing. We have designed and implemented the Connectome Viewer Toolkit - a set of free and extensible open source neuroimaging tools written in Python. The key components of the toolkit are as follows: (1) The Connectome File Format is an XML-based container format to standardize multi-modal data integration and structured metadata annotation. (2) The Connectome File Format Library enables management and sharing of connectome files. (3) The Connectome Viewer is an integrated research and development environment for visualization and analysis of multi-modal connectome data. The Connectome Viewer's plugin architecture supports extensions with network analysis packages and an interactive scripting shell, to enable easy development and community contributions. Integration with tools from the scientific Python community allows the leveraging of numerous existing libraries for powerful connectome data mining, exploration, and comparison. We demonstrate the applicability of the Connectome Viewer Toolkit using Diffusion MRI datasets processed by the Connectome Mapper. The Connectome Viewer Toolkit is available from http://www.cmtk.org/
Abstract:
The phloem performs essential systemic functions in tracheophytes, yet little is known about its molecular genetic specification. Here we show that application of the peptide ligand CLAVATA3/EMBRYO SURROUNDING REGION 45 (CLE45) specifically inhibits specification of protophloem in Arabidopsis roots by locking the sieve element precursor cell in its preceding developmental state. CLE45 treatment, as well as viable transgenic expression of a weak CLE45(G6T) variant, interferes not only with commitment to sieve element fate but also with the formative sieve element precursor cell division that creates protophloem and metaphloem cell files. However, the absence of this division appears to be a secondary effect of discontinuous sieve element files and subsequent systemically reduced auxin signaling in the root meristem. In the absence of the formative sieve element precursor cell division, metaphloem identity is seemingly adopted by the normally procambial cell file instead, pointing to possibly independent positional cues for metaphloem formation. The protophloem formation and differentiation defects in brevis radix (brx) and octopus (ops) mutants are similar to those observed in transgenic seedlings with increased CLE45 activity and can be rescued by loss of function of a putative CLE45 receptor, BARELY ANY MERISTEM 3 (BAM3). Conversely, a dominant gain-of-function ops allele or mild OPS dosage increase suppresses brx defects and confers CLE45 resistance. Thus, our data suggest that delicate quantitative interplay between the opposing activities of BAM3-mediated CLE45 signals and OPS-dependent signals determines cellular commitment to protophloem sieve element fate, with OPS acting as a positive, quantitative master regulator of phloem fate.
Abstract:
Gene set enrichment (GSE) analysis is a popular framework for condensing information from gene expression profiles into a pathway or signature summary. The strengths of this approach over single gene analysis include noise and dimension reduction, as well as greater biological interpretability. As molecular profiling experiments move beyond simple case-control studies, robust and flexible GSE methodologies are needed that can model pathway activity within highly heterogeneous data sets. To address this challenge, we introduce Gene Set Variation Analysis (GSVA), a GSE method that estimates variation of pathway activity over a sample population in an unsupervised manner. We demonstrate the robustness of GSVA in a comparison with current state of the art sample-wise enrichment methods. Further, we provide examples of its utility in differential pathway activity and survival analysis. Lastly, we show how GSVA works analogously with data from both microarray and RNA-seq experiments. GSVA provides increased power to detect subtle pathway activity changes over a sample population in comparison to corresponding methods. While GSE methods are generally regarded as end points of a bioinformatic analysis, GSVA constitutes a starting point to build pathway-centric models of biology. Moreover, GSVA contributes to the current need of GSE methods for RNA-seq data. GSVA is an open source software package for R which forms part of the Bioconductor project and can be downloaded at http://www.bioconductor.org.
Abstract:
BACKGROUND: Findings from randomised trials have shown a higher early risk of stroke after carotid artery stenting than after carotid endarterectomy. We assessed whether white-matter lesions affect the perioperative risk of stroke in patients treated with carotid artery stenting versus carotid endarterectomy. METHODS: Patients with symptomatic carotid artery stenosis included in the International Carotid Stenting Study (ICSS) were randomly allocated to receive carotid artery stenting or carotid endarterectomy. Copies of baseline brain imaging were analysed by two investigators, who were masked to treatment, for the severity of white-matter lesions using the age-related white-matter changes (ARWMC) score. Randomisation was done with a computer-generated sequence (1:1). Patients were divided into two groups using the median ARWMC. We analysed the risk of stroke within 30 days of revascularisation using a per-protocol analysis. ICSS is registered with controlled-trials.com, number ISRCTN 25337470. FINDINGS: 1036 patients (536 randomly allocated to carotid artery stenting, 500 to carotid endarterectomy) had baseline imaging available. Median ARWMC score was 7, and patients were dichotomised into those with a score of 7 or more and those with a score of less than 7. In patients treated with carotid artery stenting, those with an ARWMC score of 7 or more had an increased risk of stroke compared with those with a score of less than 7 (HR for any stroke 2·76, 95% CI 1·17-6·51; p=0·021; HR for non-disabling stroke 3·00, 1·10-8·36; p=0·031), but we did not see a similar association in patients treated with carotid endarterectomy (HR for any stroke 1·18, 0·40-3·55; p=0·76; HR for disabling or fatal stroke 1·41, 0·38-5·26; p=0·607). Carotid artery stenting was associated with a higher risk of stroke compared with carotid endarterectomy in patients with an ARWMC score of 7 or more (HR for any stroke 2·98, 1·29-6·93; p=0·011; HR for non-disabling stroke 6·34, 1·45-27·71; p=0·014), but there was no risk difference in patients with an ARWMC score of less than 7. INTERPRETATION: The presence of white-matter lesions on brain imaging should be taken into account when selecting patients for carotid revascularisation. Carotid artery stenting should be avoided in patients with more extensive white-matter lesions, but might be an acceptable alternative to carotid endarterectomy in patients with less extensive lesions. FUNDING: Medical Research Council, the Stroke Association, Sanofi-Synthélabo, the European Union Research Framework Programme 5.
Abstract:
BACKGROUND: There is an ever-increasing volume of data on host genes that are modulated during HIV infection, influence disease susceptibility or carry genetic variants that impact HIV infection. We created GuavaH (Genomic Utility for Association and Viral Analyses in HIV, http://www.GuavaH.org), a public resource that supports multipurpose analysis of genome-wide genetic variation and gene expression profile across multiple phenotypes relevant to HIV biology. FINDINGS: We included original data from 8 genome and transcriptome studies addressing viral and host responses in and ex vivo. These studies cover phenotypes such as HIV acquisition, plasma viral load, disease progression, viral replication cycle, latency and viral-host genome interaction. This represents genome-wide association data from more than 4,000 individuals, exome sequencing data from 392 individuals, in vivo transcriptome microarray data from 127 patients/conditions, and 60 sets of RNA-seq data. Additionally, GuavaH allows visualization of protein variation in ~8,000 individuals from the general population. The publicly available GuavaH framework supports queries on (i) unique single nucleotide polymorphism across different HIV related phenotypes, (ii) gene structure and variation, (iii) in vivo gene expression in the setting of human infection (CD4+ T cells), and (iv) in vitro gene expression data in models of permissive infection, latency and reactivation. CONCLUSIONS: The complexity of the analysis of host genetic influences on HIV biology and pathogenesis calls for comprehensive motors of research on curated data. The tool developed here allows queries and supports validation of the rapidly growing body of host genomic information pertinent to HIV research.
Abstract:
Knowledge about spatial biodiversity patterns is a basic criterion for reserve network design. Although herbarium collections hold large quantities of information, the data are often scattered and cannot supply complete spatial coverage. Alternatively, herbarium data can be used to fit species distribution models, and their predictions can be used to provide complete spatial coverage and derive species richness maps. Here, we build on previous efforts to propose an improved compositionalist framework for using species distribution models to better inform conservation management. We illustrate the approach with models fitted with six different methods and combined using an ensemble approach for 408 plant species in a tropical and megadiverse country (Ecuador). As a complementary view to the traditional richness hotspots methodology, which consists of a simple stacking of species distribution maps, the compositionalist modelling approach used here combines separate predictions for different pools of species to identify areas of alternative suitability for conservation. Our results show that the compositionalist approach better captures the established protected areas than the traditional richness hotspots strategies and allows the identification of areas in Ecuador that would optimally complement the current protection network. Further studies should aim at refining the approach with more groups and additional species information.
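The "simple stacking of species distribution maps" mentioned above can be illustrated with a small sketch. It covers only the traditional richness-hotspot baseline, not the authors' compositionalist method. It assumes each species map is a binary presence/absence grid of equal size, and the function names `stack_richness` and `hotspots`, the toy grids and the threshold are all hypothetical.

```python
def stack_richness(presence_maps):
    """Sum binary presence/absence grids cell by cell to get species richness."""
    rows = len(presence_maps[0])
    cols = len(presence_maps[0][0])
    richness = [[0] * cols for _ in range(rows)]
    for grid in presence_maps:
        for r in range(rows):
            for c in range(cols):
                richness[r][c] += grid[r][c]
    return richness


def hotspots(richness, threshold):
    """Return (row, col) cells whose stacked richness reaches the threshold."""
    return [
        (r, c)
        for r, row in enumerate(richness)
        for c, value in enumerate(row)
        if value >= threshold
    ]


# Three made-up species on a 2 x 2 grid (1 = predicted present).
species_a = [[1, 0], [1, 1]]
species_b = [[1, 1], [0, 1]]
species_c = [[0, 0], [1, 1]]

richness = stack_richness([species_a, species_b, species_c])
print(richness)               # [[2, 1], [2, 3]]
print(hotspots(richness, 3))  # [(1, 1)]
```

Stacking treats all species as one pool. The compositionalist approach described in the abstract instead keeps separate predictions for different pools of species, which this sketch does not attempt.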