824 results for decentralised data fusion framework


Relevance:

30.00%

Publisher:

Abstract:

The OLS estimator of the intergenerational earnings correlation is biased towards zero, while the instrumental variables estimator is biased upwards. The first of these results arises because of measurement error, while the latter rests on the presumption that parental education is an invalid instrument. We propose a panel data framework for quantifying the asymptotic biases of these estimators, as well as a mis-specification test for the IV estimator. [Author]
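The downward bias referred to here is the classical errors-in-variables attenuation result; a one-line reminder (standard textbook algebra, not reproduced from the paper, with the observed proxy for the parent's permanent earnings written as \tilde{x} = x + u and measurement error u uncorrelated with x):

    \mathrm{plim}\,\hat{\beta}_{OLS} \;=\; \beta\,\frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2} \;<\; \beta .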

Relevance:

30.00%

Publisher:

Abstract:

In this work we analyze how patchy distributions of CO2 and brine within sand reservoirs may lead to significant attenuation and velocity dispersion effects, which in turn may have a profound impact on surface seismic data. The ultimate goal of this paper is to contribute to the understanding of these processes within the framework of the seismic monitoring of CO2 sequestration, a key strategy to mitigate global warming. We first carry out a Monte Carlo analysis to study the statistical behavior of attenuation and velocity dispersion of compressional waves traveling through rocks with properties similar to those at the Utsira Sand, Sleipner field, containing quasi-fractal patchy distributions of CO2 and brine. These results show that the mean patch size and CO2 saturation play key roles in the observed wave-induced fluid flow effects. The latter can be remarkably important when CO2 concentrations are low and mean patch sizes are relatively large. To analyze these effects on the corresponding surface seismic data, we perform numerical simulations of wave propagation considering reservoir models and CO2 accumulation patterns similar to the CO2 injection site in the Sleipner field. These numerical experiments suggest that wave-induced fluid flow effects may produce changes in the reservoir's seismic response, modifying significantly the main seismic attributes usually employed in the characterization of these environments. Consequently, the determination of the nature of the fluid distributions as well as the proper modeling of the seismic data constitute important aspects that should not be ignored in the seismic monitoring of CO2 sequestration problems.
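As a rough numerical illustration of why the geometry of the fluid distribution matters, the sketch below (a toy example, not the authors' White-model or Monte Carlo computation; the moduli, porosity and saturations are assumed round numbers for a soft sand) contrasts the saturated P-wave modulus obtained with uniform, fine-scale (Reuss) fluid mixing against the patchy (Voigt) end member, using the standard Gassmann equation. Wave-induced fluid flow moves the effective modulus between these two limits as frequency changes, which is the source of the attenuation and dispersion discussed above.

    # Toy contrast of uniform vs patchy brine/CO2 saturation in a soft sand.
    # All rock and fluid values are assumed, illustrative numbers.
    import numpy as np

    K_min, K_dry, G_dry, phi = 36.9e9, 2.6e9, 0.85e9, 0.37   # mineral, dry-frame, shear moduli (Pa), porosity
    K_brine, K_co2 = 2.3e9, 0.08e9                            # fluid bulk moduli (Pa), assumed

    def gassmann(K_fl):
        """Saturated bulk modulus from Gassmann's equation."""
        num = (1.0 - K_dry / K_min) ** 2
        den = phi / K_fl + (1.0 - phi) / K_min - K_dry / K_min ** 2
        return K_dry + num / den

    for S_co2 in (0.05, 0.2, 0.5):
        S_br = 1.0 - S_co2
        # Uniform saturation: Reuss (harmonic) average of fluid moduli, then Gassmann.
        K_uniform = gassmann(1.0 / (S_br / K_brine + S_co2 / K_co2))
        # Patchy end member: Voigt average of the fully brine- and fully CO2-saturated moduli.
        K_patchy = S_br * gassmann(K_brine) + S_co2 * gassmann(K_co2)
        M_uni, M_pat = K_uniform + 4 * G_dry / 3, K_patchy + 4 * G_dry / 3  # P-wave moduli
        print(f"S_CO2={S_co2:.2f}: uniform M={M_uni/1e9:.2f} GPa, patchy M={M_pat/1e9:.2f} GPa")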

Relevance:

30.00%

Publisher:

Abstract:

Biofuels are considered a promising substitute for fossil fuels with regard to reducing greenhouse gas emissions. However, limiting the assessment of their impacts to the potential benefits for climate change mitigation is shortsighted; global sustainability assessments are necessary to determine the sustainability of supply chains. We propose a new criteria-based global framework enabling a comprehensive international comparison of bioethanol supply chains. The strength of this framework is that the sustainability indicators are selected on three criteria: relevance, reliability and adaptability to the local context. Sustainability is addressed along environmental, social and economic dimensions. The framework has been applied to a specific question: from a Swiss perspective, is bioethanol produced locally in Switzerland more sustainable than bioethanol imported from Brazil? Because the framework integrates the local context in its indicator definitions, Brazilian bioethanol production is shown to be energy efficient and economically attractive for Brazil. From a strictly economic point of view, bioethanol production within Switzerland is not justified for Swiss consumption, and it is questionable on environmental grounds. The social dimension is difficult to assess due to the lack of reliable data and is strongly linked to the agricultural policy in both countries. Minimum sustainability criteria for imported bioethanol need to be established to avoid unwanted negative or leakage effects.

Relevance:

30.00%

Publisher:

Abstract:

The paper addresses the concept of multicointegration in a panel data framework. The proposal builds upon the panel data cointegration procedures developed in Pedroni (2004), for which we compute the moments of the parametric statistics. When individuals are either cross-section independent, or cross-section dependence can be removed by cross-section demeaning, our approach can be applied to the wider framework of mixed I(2) and I(1) stochastic process analysis. The paper also deals with the issue of cross-section dependence using approximate common factor models. Finite sample performance is investigated through Monte Carlo simulations. Finally, we illustrate the use of the procedure by investigating the relationship between inventories, sales and production for a panel of US industries.
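For readers unfamiliar with the term, a minimal sketch of multicointegration in the Granger-Lee sense (a standard textbook illustration built on the inventories-sales example mentioned at the end of the abstract, not the paper's own notation): with I(1) production p_t and sales s_t,

    u_t = p_t - \beta s_t \sim I(0), \qquad I_t = \sum_{j \le t} u_j \sim I(1), \qquad I_t - \gamma s_t \sim I(0),

so the cumulated cointegration errors (the inventory stock) cointegrate a second time with the original flow series; equivalently, the analysis can be cast in terms of mixed I(2) and I(1) processes, as done in the paper.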

Relevance:

30.00%

Publisher:

Abstract:

The postsynaptic density protein PSD-95 is a major element of synapses. PSD-95 is involved in aging, Alzheimer's disease (AD) and numerous psychiatric disorders. However, contradictory data about PSD-95 expression in aging and AD have been reported. Indeed, in AD versus control brains, PSD-95 expression varies by region, increasing in the frontal cortex, at least in an early stage, and decreasing in the temporal cortex. In contrast, PSD-95 expression is decreased in transgenic mouse models of aging and AD; in behaviorally impaired versus unimpaired aged rodents it can either decrease or increase; and it is increased in rodents raised in enriched environments. Different factors explain these contradictory results in both animals and humans, among them concomitant psychiatric endophenotypes such as depression. The possible involvement of PSD-95 in reactive and/or compensatory mechanisms during AD progression is underscored, at least before substantial synaptic elimination occurs. Thus, in AD, but not in AD transgenic mice, enhanced expression might precede the diminution commonly observed in advanced aging. A two-compartment cell model, separating events taking place in cell bodies from those at synapses, is presented. Overall, these data suggest that AD research will progress by untangling pathological from protective events, a prerequisite for effective therapeutic strategies.

Relevance:

30.00%

Publisher:

Abstract:

This research describes the process followed to assemble a "Social Accounting Matrix" for Spain for the year 2000 (SAMSP00). As argued in the paper, this process attempts to reconcile ESA95 conventions with the requirements of applied general equilibrium modelling. In particular, problems related to the level of aggregation of net taxation data, and to the valuation system used for expressing the monetary value of input-output transactions, have received special attention. Since the adoption of ESA95 conventions, input-output transactions have preferably been valued at basic prices, which imposes additional difficulties on modellers interested in computing applied general equilibrium models. This paper addresses these difficulties by developing a procedure that allows SAM-builders to change the valuation system of input-output transactions conveniently. In addition, this procedure produces new data related to net taxation information.
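For reference, the change of valuation such a procedure has to handle is governed by the standard national-accounts identity (the general ESA95 relation, not the specific adjustment algorithm developed in the paper): for each input-output transaction,

    v^{\mathrm{purchasers}}_{ij} \;=\; v^{\mathrm{basic}}_{ij} \;+\; \text{taxes less subsidies on products} \;+\; \text{trade and transport margins}.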

Relevance:

30.00%

Publisher:

Abstract:

The coverage and volume of geo-referenced datasets are extensive and incessantly growing. The systematic capture of geo-referenced information generates large volumes of spatio-temporal data to be analyzed. Clustering and visualization play a key role in exploratory data analysis and in the extraction of knowledge embedded in these data. However, the special characteristics of these data pose new challenges for visualization and clustering: for instance, their complex structures, the large number of samples, variables involved in a temporal context, high dimensionality, and large variability in cluster shapes.

The central aim of my thesis is to propose new algorithms and methodologies for clustering and visualization, in order to assist the extraction of knowledge from spatio-temporal geo-referenced data and thus improve decision-making processes.

I present two original algorithms, one for clustering, the Fuzzy Growing Hierarchical Self-Organizing Networks (FGHSON), and a second for exploratory visual data analysis, the Tree-structured Self-Organizing Maps Component Planes. In addition, I present methodologies that, combined with the FGHSON and the Tree-structured SOM Component Planes, allow space and time to be integrated seamlessly and simultaneously in order to extract knowledge embedded in a temporal context.

The originality of the FGHSON lies in its capability to reflect the underlying structure of a dataset in a hierarchical, fuzzy way. A hierarchical fuzzy representation of clusters is crucial when data include complex structures with large variability in cluster shapes, variances, densities and numbers of clusters. The most important characteristics of the FGHSON are: (1) it does not require an a priori setup of the number of clusters; (2) the algorithm executes several self-organizing processes in parallel, so that when dealing with large datasets the processes can be distributed, reducing the computational cost; and (3) only three parameters are needed to set up the algorithm. In the case of the Tree-structured SOM Component Planes, the novelty of the algorithm lies in its ability to create a structure that allows the visual exploratory analysis of large high-dimensional datasets. This algorithm creates a hierarchical structure of Self-Organizing Map Component Planes, arranging projections of similar variables in the same branches of the tree. Hence, similarities in the variables' behavior can easily be detected (e.g. local correlations, maximal and minimal values, and outliers).

Both the FGHSON and the Tree-structured SOM Component Planes were applied to several agroecological problems, proving to be very efficient in the exploratory analysis and clustering of spatio-temporal datasets.

In this thesis I also tested three soft competitive learning algorithms: two well-known unsupervised soft competitive algorithms, namely the Self-Organizing Maps (SOMs) and the Growing Hierarchical Self-Organizing Maps (GHSOMs), and a third that is our original contribution, the FGHSON. Although these algorithms have been used in several areas, to my knowledge no previous work has applied and compared their performance on spatio-temporal geospatial data, as is done in this thesis.

I propose original methodologies to explore spatio-temporal geo-referenced datasets through time. Our approach uses time windows to capture temporal similarities and variations by means of the FGHSON clustering algorithm. The developed methodologies are used in two case studies: in the first, the objective was to find similar agroecozones through time; in the second, it was to find similar environmental patterns shifted in time.

Several results presented in this thesis have led to new contributions to agroecological knowledge, for instance in sugar cane and blackberry production. Finally, in the framework of this thesis we developed several software tools: (1) a Matlab toolbox that implements the FGHSON algorithm, and (2) a program called BIS (Bio-inspired Identification of Similar agroecozones), an interactive graphical user interface tool which integrates the FGHSON algorithm with Google Earth in order to show zones with similar agroecological characteristics.
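To make the self-organizing principle underlying these algorithms concrete, here is a minimal plain SOM written in NumPy. It is deliberately not the FGHSON (no growing, no hierarchy, no fuzziness), and the data, grid size and learning schedule are assumptions chosen only for the toy example:

    # Minimal Self-Organizing Map sketch: codebook vectors on a 2-D grid are pulled
    # towards samples, with a neighbourhood that shrinks over time.
    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.normal(size=(500, 4))               # toy feature vectors (assumed)
    rows, cols, dim = 6, 6, data.shape[1]
    weights = rng.normal(size=(rows, cols, dim))   # codebook vectors on a 6x6 grid
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)

    n_iter, sigma0, lr0 = 2000, 3.0, 0.5
    for t in range(n_iter):
        x = data[rng.integers(len(data))]
        # Best-matching unit: node whose codebook vector is closest to the sample.
        dists = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)
        # Decaying neighbourhood radius and learning rate.
        sigma = sigma0 * np.exp(-t / n_iter)
        lr = lr0 * np.exp(-t / n_iter)
        # Gaussian neighbourhood on the grid, centred at the BMU.
        h = np.exp(-np.sum((grid - np.array(bmu)) ** 2, axis=-1) / (2 * sigma ** 2))
        weights += lr * h[..., None] * (x - weights)

    # Each sample is assigned to (clustered by) its best-matching unit.
    labels = np.array([np.argmin(np.linalg.norm(weights - x, axis=-1)) for x in data])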

Relevance:

30.00%

Publisher:

Abstract:

The biological properties of wild-type A75/17 and cell culture-adapted Onderstepoort canine distemper virus differ markedly. To learn more about the molecular basis for these differences, we have isolated and sequenced the protein-coding regions of the attachment and fusion proteins of wild-type canine distemper virus strain A75/17. In the attachment protein, a total of 57 amino acid differences were observed between the Onderstepoort strain and strain A75/17, and these were distributed evenly over the entire protein. Interestingly, the attachment protein of strain A75/17 contained an extension of three amino acids at the C terminus. Expression studies showed that the attachment protein of strain A75/17 had a higher apparent molecular mass than the attachment protein of the Onderstepoort strain, in both the presence and absence of tunicamycin. In the fusion protein, 60 amino acid differences were observed between the two strains, of which 44 were clustered in the much smaller F2 portion of the molecule. Significantly, the AUG that has been proposed as a translation initiation codon in the Onderstepoort strain is an AUA codon in strain A75/17. Detailed mutation analyses showed that both the first and second AUGs of strain A75/17 are the major translation initiation sites of the fusion protein. Similar analyses demonstrated that, also in the Onderstepoort strain, the first two AUGs are the translation initiation codons which contribute most to the generation of precursor molecules yielding the mature form of the fusion protein.

Relevance:

30.00%

Publisher:

Abstract:

With the advancement of high-throughput sequencing and the dramatic increase in available genetic data, statistical modeling has become an essential part of the field of molecular evolution. Statistical modeling has led to many interesting discoveries in the field, from the detection of highly conserved or diverse regions in a genome to phylogenetic inference of species' evolutionary history. Among the different types of genome sequences, protein-coding regions are particularly interesting because of their impact on proteins. The building blocks of proteins, i.e. amino acids, are coded by triplets of nucleotides, known as codons. Accordingly, studying the evolution of codons leads to a fundamental understanding of how proteins function and evolve. Current codon models can be classified into three principal groups: mechanistic codon models, empirical codon models and hybrid ones. Mechanistic models attract particular attention due to the clarity of their underlying biological assumptions and parameters. However, they suffer from simplified assumptions that are required to overcome the burden of computational complexity. The main assumptions applied to current mechanistic codon models are (a) double and triple substitutions of nucleotides within codons are negligible, (b) there is no mutation variation among the nucleotides of a single codon, and (c) the HKY nucleotide model is sufficient to capture the essence of transition-transversion rates at the nucleotide level. In this thesis, I pursue two main objectives. The first is to develop a framework of mechanistic codon models, named the KCM-based model family framework, based on holding or relaxing the assumptions above. Accordingly, eight different models are proposed from the eight combinations of holding or relaxing the assumptions, from the simplest one that holds all of them to the most general one that relaxes all of them. The models derived from the proposed framework allow me to investigate the biological plausibility of the three simplified assumptions on real data sets, as well as to find the best model aligned with the underlying characteristics of each data set. Our experiments show that holding all three assumptions is not realistic for any of the real data sets; using simple models that hold these assumptions can therefore be misleading and result in inaccurate parameter estimates. The second objective is to develop a generalized mechanistic codon model that relaxes the three simplifying assumptions while remaining computationally efficient, by using a matrix operation known as the Kronecker product. Our experiments show that, on randomly chosen data sets, the proposed generalized mechanistic codon model outperforms the other codon models with respect to the AICc metric in roughly half of the data sets. In addition, I show through several experiments that the proposed general model is biologically plausible.
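As a concrete illustration of the kind of construction involved (an independent sketch, not the thesis' KCM implementation), the following builds a 64x64 codon rate matrix from an HKY nucleotide matrix via a Kronecker sum; by construction it only allows single-nucleotide changes per instant, which is exactly assumption (a) that the generalized framework above relaxes:

    # Codon rate matrix from a nucleotide matrix via a Kronecker sum (single
    # substitutions only). A selection step (e.g. scaling nonsynonymous entries
    # by omega) would be applied on top in a full mechanistic model.
    import numpy as np

    def hky(kappa=2.0, pi=(0.25, 0.25, 0.25, 0.25)):
        """HKY85 nucleotide rate matrix, order A, C, G, T."""
        pi = np.asarray(pi, dtype=float)
        transitions = {(0, 2), (2, 0), (1, 3), (3, 1)}      # A<->G, C<->T
        Q = np.zeros((4, 4))
        for i in range(4):
            for j in range(4):
                if i != j:
                    Q[i, j] = (kappa if (i, j) in transitions else 1.0) * pi[j]
        np.fill_diagonal(Q, -Q.sum(axis=1))
        return Q

    Q_nuc, I4 = hky(), np.eye(4)
    # Kronecker sum: the three codon positions mutate independently, one at a time.
    Q_codon = (np.kron(np.kron(Q_nuc, I4), I4)
               + np.kron(np.kron(I4, Q_nuc), I4)
               + np.kron(np.kron(I4, I4), Q_nuc))
    print(Q_codon.shape)                                    # (64, 64)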

Relevance:

30.00%

Publisher:

Abstract:

The emergence of powerful new technologies, the existence of large quantities of data, and increasing demands for the extraction of added value from these technologies and data have created a number of significant challenges for those charged with both corporate and information technology management. The possibilities are great, the expectations high, and the risks significant. Organisations seeking to employ cloud technologies and exploit the value of the data to which they have access, be this in the form of "Big Data" available from different external sources or data held within the organisation, in structured or unstructured formats, need to understand the risks involved in such activities. Data owners have responsibilities towards the subjects of the data and must also, frequently, demonstrate that they are in compliance with current standards, laws and regulations. This thesis sets out to explore the nature of the technologies that organisations might utilise, identify the most pertinent constraints and risks, and propose a framework for the management of data from discovery to external hosting that will allow the most significant risks to be managed through the definition, implementation, and performance of appropriate internal control activities.

Relevance:

30.00%

Publisher:

Abstract:

Microstructure imaging from diffusion magnetic resonance (MR) data represents an invaluable tool to study non-invasively the morphology of tissues and to provide a biological insight into their microstructural organization. In recent years, a variety of biophysical models have been proposed to associate particular patterns observed in the measured signal with specific microstructural properties of the neuronal tissue, such as axon diameter and fiber density. Despite very appealing results showing that the estimated microstructure indices agree very well with histological examinations, existing techniques require computationally very expensive non-linear procedures to fit the models to the data which, in practice, demand the use of powerful computer clusters for large-scale applications. In this work, we present a general framework for Accelerated Microstructure Imaging via Convex Optimization (AMICO) and show how to re-formulate this class of techniques as convenient linear systems which, then, can be efficiently solved using very fast algorithms. We demonstrate this linearization of the fitting problem for two specific models, i.e. ActiveAx and NODDI, providing a very attractive alternative for parameter estimation in those techniques; however, the AMICO framework is general and flexible enough to work also for the wider space of microstructure imaging methods. Results demonstrate that AMICO represents an effective means to accelerate the fit of existing techniques drastically (up to four orders of magnitude faster) while preserving accuracy and precision in the estimated model parameters (correlation above 0.9). We believe that the availability of such ultrafast algorithms will help to accelerate the spread of microstructure imaging to larger cohorts of patients and to study a wider spectrum of neurological disorders.
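The computational core of this approach can be illustrated with a toy dictionary fit (the matrices below are synthetic, not the actual ActiveAx or NODDI response functions): once the model is expressed as a linear combination of precomputed atoms, each voxel reduces to a small convex problem such as non-negative least squares.

    # Toy AMICO-style fit: solve min ||A x - y||_2 subject to x >= 0 per voxel.
    import numpy as np
    from scipy.optimize import nnls

    rng = np.random.default_rng(0)
    n_meas, n_atoms = 90, 40                                 # measurements x dictionary atoms (assumed sizes)
    A = np.abs(rng.normal(size=(n_meas, n_atoms)))           # stand-in response-function dictionary
    x_true = np.zeros(n_atoms); x_true[[3, 17]] = [0.7, 0.3] # sparse ground-truth weights
    y = A @ x_true + 0.01 * rng.normal(size=n_meas)          # noisy signal for one voxel

    x_hat, residual = nnls(A, y)                             # fast convex fit
    # Microstructure indices are then simple functions of the recovered weights.
    print(np.round(x_hat[[3, 17]], 3), residual)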

Relevance:

30.00%

Publisher:

Abstract:

The present research deals with an application of artificial neural networks to multitask learning from spatial environmental data. The real case study (sediment contamination of Lake Geneva) consists of 8 pollutants. There are different relationships between these variables, from linear correlations to strong nonlinear dependencies. The main idea is to construct subsets of pollutants which can be efficiently modeled together within the multitask framework. The proposed two-step approach is based on: 1) the criterion of nonlinear predictability of each variable k, assessed by analyzing all possible models composed from the rest of the variables, using a General Regression Neural Network (GRNN) as the model; 2) multitask learning of the best model using a multilayer perceptron, and spatial predictions. The results of the study are analyzed using both machine learning and geostatistical tools.
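A minimal sketch of step 1 follows (a GRNN is essentially a Nadaraya-Watson kernel regressor); the data, bandwidth and train/test split are toy assumptions, not the Lake Geneva measurements.

    # GRNN / Nadaraya-Watson sketch: predict one variable from the others and
    # score its "nonlinear predictability" via the held-out error.
    import numpy as np

    def grnn_predict(X_train, y_train, X_query, sigma=0.5):
        """Kernel-weighted average of training targets (one pass, no training)."""
        d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
        w = np.exp(-d2 / (2.0 * sigma ** 2))
        return (w @ y_train) / w.sum(axis=1)

    rng = np.random.default_rng(1)
    X = rng.uniform(size=(200, 7))                                        # 7 predictor pollutants (toy)
    y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1] + 0.1 * rng.normal(size=200)  # target pollutant k (toy)
    train, test = slice(0, 150), slice(150, 200)
    y_hat = grnn_predict(X[train], y[train], X[test], sigma=0.3)
    print("test RMSE:", np.sqrt(np.mean((y_hat - y[test]) ** 2)))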

Relevance:

30.00%

Publisher:

Abstract:

We study theoretical and empirical aspects of the mean exit time (MET) of financial time series. The theoretical modeling is done within the framework of the continuous time random walk. We empirically verify that the mean exit time follows a quadratic scaling law with an associated prefactor that is specific to the analyzed stock. We perform a series of statistical tests to determine which kinds of correlation are responsible for this specificity. The main contribution is associated with the autocorrelation property of stock returns. We introduce and solve analytically both two-state and three-state Markov chain models. The analytical results obtained with the two-state Markov chain model allow us to obtain a data collapse of the 20 measured MET profiles onto a single master curve.
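The quadratic scaling law can be checked on synthetic data with a few lines of Monte Carlo (uncorrelated Gaussian steps below, so the prefactor is the trivial one rather than the stock-specific prefactors studied in the paper):

    # Mean exit time of a random walk out of [-L, L]: MET grows roughly like L^2.
    import numpy as np

    rng = np.random.default_rng(42)

    def mean_exit_time(L, n_paths=2000, max_steps=200_000, sigma=1.0):
        times = []
        for _ in range(n_paths):
            x, t = 0.0, 0
            while abs(x) < L and t < max_steps:
                x += sigma * rng.normal()
                t += 1
            times.append(t)
        return np.mean(times)

    for L in (2.0, 4.0, 8.0):
        met = mean_exit_time(L)
        print(f"L={L:4.1f}  MET={met:8.1f}  MET/L^2={met / L**2:.2f}")   # ratio ~ constant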

Relevance:

30.00%

Publisher:

Abstract:

Traffic safety engineers are among the early adopters of Bayesian statistical tools for analyzing crash data. As in many other areas of application, empirical Bayes methods were their first choice, perhaps because they represent an intuitively appealing, yet relatively easy to implement, alternative to purely classical approaches. With the enormous progress in numerical methods made in recent years, and with the availability of free, easy-to-use software that permits implementing a fully Bayesian approach, however, there is now ample justification to progress towards fully Bayesian analyses of crash data. The fully Bayesian approach, in particular as implemented via multi-level hierarchical models, has many advantages over the empirical Bayes approach. In a full Bayesian analysis, prior information and all available data are seamlessly integrated into posterior distributions on which practitioners can base their inferences. All uncertainties are thus accounted for in the analyses, and there is no need to pre-process data to obtain Safety Performance Functions and other such prior estimates of the effect of covariates on the outcome of interest. In this light, fully Bayesian methods may well be less costly to implement and may result in safety estimates with more realistic standard errors. In this manuscript, we present the full Bayesian approach to analyzing traffic safety data and focus on highlighting the differences between the empirical Bayes and the full Bayes approaches. We use an illustrative example to discuss a step-by-step Bayesian analysis of the data and to show some of the types of inferences that are possible within the full Bayesian framework.
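To make the empirical-versus-full-Bayes distinction concrete, here is a toy sketch with a simple Poisson-gamma model and synthetic counts (much simpler than the multi-level hierarchical models discussed in the manuscript): empirical Bayes plugs in point estimates of the hyperparameters, while the full Bayes version integrates over them.

    # Empirical Bayes vs full Bayes for per-site crash rates, synthetic data.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)
    n_sites = 50
    theta_true = rng.gamma(shape=2.0, scale=1.0, size=n_sites)   # latent crash rates
    y = rng.poisson(theta_true)                                   # observed crash counts

    # Empirical Bayes: estimate a gamma(a, b) prior by method of moments, then plug in.
    m, v = y.mean(), y.var()
    b_hat = m / max(v - m, 1e-6)            # rate
    a_hat = m * b_hat                       # shape
    eb_mean = (y + a_hat) / (1.0 + b_hat)   # posterior means with plugged-in hyperparameters

    # Full Bayes (grid approximation): average site posteriors over the
    # hyperparameter posterior instead of using a single plug-in value.
    a_grid, b_grid = np.linspace(0.5, 6.0, 60), np.linspace(0.2, 3.0, 60)
    A, B = np.meshgrid(a_grid, b_grid, indexing="ij")
    # Marginal likelihood of each count given (a, b): y_i ~ NegBinom(a, b/(1+b)).
    logpost = sum(stats.nbinom.logpmf(yi, A, B / (1.0 + B)) for yi in y)
    w = np.exp(logpost - logpost.max()); w /= w.sum()
    fb_mean = np.array([np.sum(w * (yi + A) / (1.0 + B)) for yi in y])

    print(np.round(eb_mean[:5], 2), np.round(fb_mean[:5], 2))

The full-Bayes estimates typically come with wider, more realistic uncertainty because the hyperparameter uncertainty is propagated rather than ignored.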