982 resultados para PREDICTIONS
Resumo:
A 0.125 degree raster or grid-based Geographic Information System with data on tsetse, trypanosomosis, animal production, agriculture and land use has recently been developed in Togo. This paper addresses the problem of generating tsetse distribution and abundance maps from remotely sensed data, using a restricted amount of field data. A discriminant analysis model is tested using contemporary tsetse data and remotely sensed, low resolution data acquired from the National Oceanographic and Atmospheric Administration and Meteosat platforms. A split sample technique is adopted where a randomly selected part of the field measured data (training set) serves to predict the other part (predicted set). The obtained results are then compared with field measured data per corresponding grid-square. Depending on the size of the training set the percentage of concording predictions varies from 80 to 95 for distribution figures and from 63 to 74 for abundance. These results confirm the potential of satellite data application and multivariate analysis for the prediction, not only of the tsetse distribution, but more importantly of their abundance. This opens up new avenues because satellite predictions and field data may be combined to strengthen or substitute one another and thus reduce costs of field surveys.
Resumo:
BACKGROUND: The ambition of most molecular biologists is the understanding of the intricate network of molecular interactions that control biological systems. As scientists uncover the components and the connectivity of these networks, it becomes possible to study their dynamical behavior as a whole and discover what is the specific role of each of their components. Since the behavior of a network is by no means intuitive, it becomes necessary to use computational models to understand its behavior and to be able to make predictions about it. Unfortunately, most current computational models describe small networks due to the scarcity of kinetic data available. To overcome this problem, we previously published a methodology to convert a signaling network into a dynamical system, even in the total absence of kinetic information. In this paper we present a software implementation of such methodology. RESULTS: We developed SQUAD, a software for the dynamic simulation of signaling networks using the standardized qualitative dynamical systems approach. SQUAD converts the network into a discrete dynamical system, and it uses a binary decision diagram algorithm to identify all the steady states of the system. Then, the software creates a continuous dynamical system and localizes its steady states which are located near the steady states of the discrete system. The software permits to make simulations on the continuous system, allowing for the modification of several parameters. Importantly, SQUAD includes a framework for perturbing networks in a manner similar to what is performed in experimental laboratory protocols, for example by activating receptors or knocking out molecular components. Using this software we have been able to successfully reproduce the behavior of the regulatory network implicated in T-helper cell differentiation. CONCLUSION: The simulation of regulatory networks aims at predicting the behavior of a whole system when subject to stimuli, such as drugs, or determine the role of specific components within the network. The predictions can then be used to interpret and/or drive laboratory experiments. SQUAD provides a user-friendly graphical interface, accessible to both computational and experimental biologists for the fast qualitative simulation of large regulatory networks for which kinetic data is not necessarily available.
Resumo:
We investigate the determinants of teamwork and workers cooperation within the firm. Up to now the literature has almost exclusively focused on workers incentives as the main determinants for workers cooperation. We take a broader look at the firm's organizational design and analyze the impact that different aspects of it might have on cooperation. In particular, we consider the way in which the degree of decentralization of decisions and the use of complementary HRM practices (what we call the .rm.s vertical organizational design) can affect workers'collaboration with each other. We test the model's predictions on a unique dataset on Spanish small and medium size firms containing a rich set of variables that allows us to use sensible proxies for workers cooperation. We find that the decentralization of labor decisions (and to a less extent that of task planning) has a positive impact on workers cooperation. Likewise, cooperation is positively correlated to many of the HRM practices that seem to favor workers'interaction the most. We also confirm the previous finding that collaborative efforts respond positively to pay incentives, and particularly, to group or company incentives.
Resumo:
Predictions that deforestation would reduce American cutaneous leishmaniasis incidence have proved incorrect. Presentations at a recent international workshop, instead, demonstrated frequent domestication of transmission throughout Latin America. While posing new threats, this process also increases the effectiveness of vector control in and around houses. New approaches for sand fly control and effective targeting of resources are reviewed.
Resumo:
The paper presents an approach for mapping of precipitation data. The main goal is to perform spatial predictions and simulations of precipitation fields using geostatistical methods (ordinary kriging, kriging with external drift) as well as machine learning algorithms (neural networks). More practically, the objective is to reproduce simultaneously both the spatial patterns and the extreme values. This objective is best reached by models integrating geostatistics and machine learning algorithms. To demonstrate how such models work, two case studies have been considered: first, a 2-day accumulation of heavy precipitation and second, a 6-day accumulation of extreme orographic precipitation. The first example is used to compare the performance of two optimization algorithms (conjugate gradients and Levenberg-Marquardt) of a neural network for the reproduction of extreme values. Hybrid models, which combine geostatistical and machine learning algorithms, are also treated in this context. The second dataset is used to analyze the contribution of radar Doppler imagery when used as external drift or as input in the models (kriging with external drift and neural networks). Model assessment is carried out by comparing independent validation errors as well as analyzing data patterns.
Resumo:
This article studies how product introduction decisions relate to profitability and uncertainty in the context of multi-product firms and product differentiation. These two features, common to many modern industries, have not received much attention in the literature as compared to the classical problem of firm entry, even if the determinants of firm and product entry are quite different. The theoretical predictions about the sign of the impact of uncertainty on product entry are not conclusive. Therefore, an econometric model relating firms’ product introduction decisions with profitability and profit uncertainty is proposed. Firm’s estimated profits are obtained from a structural model of product demand and supply, and uncertainty is proxied by profits’ variance. The empirical analysis is carried out using data on the Spanish car industry for the period 1990-2000. The results show a positive relationship between product introduction and profitability, and a negative one with respect to profit variability. Interestingly, the degree of uncertainty appears to be a driving force of entry stronger than profitability, suggesting that the product proliferation process in the Spanish car market may have been mainly a consequence of lower uncertainty rather than the result of having a more profitable market. Keywords: Product introduction, entry, uncertainty, multiproduct firms, automobile JEL codes: L11, L13
Resumo:
Models predicting species spatial distribution are increasingly applied to wildlife management issues, emphasising the need for reliable methods to evaluate the accuracy of their predictions. As many available datasets (e.g. museums, herbariums, atlas) do not provide reliable information about species absences, several presence-only based analyses have been developed. However, methods to evaluate the accuracy of their predictions are few and have never been validated. The aim of this paper is to compare existing and new presenceonly evaluators to usual presence/absence measures. We use a reliable, diverse, presence/absence dataset of 114 plant species to test how common presence/absence indices (Kappa, MaxKappa, AUC, adjusted D-2) compare to presenceonly measures (AVI, CVI, Boyce index) for evaluating generalised linear models (GLM). Moreover we propose a new, threshold-independent evaluator, which we call "continuous Boyce index". All indices were implemented in the B10MAPPER software. We show that the presence-only evaluators are fairly correlated (p > 0.7) to the presence/absence ones. The Boyce indices are closer to AUC than to MaxKappa and are fairly insensitive to species prevalence. In addition, the Boyce indices provide predicted-toexpected ratio curves that offer further insights into the model quality: robustness, habitat suitability resolution and deviation from randomness. This information helps reclassifying predicted maps into meaningful habitat suitability classes. The continuous Boyce index is thus both a complement to usual evaluation of presence/absence models and a reliable measure of presence-only based predictions.
Resumo:
Des progrès significatifs ont été réalisés dans le domaine de l'intégration quantitative des données géophysique et hydrologique l'échelle locale. Cependant, l'extension à de plus grandes échelles des approches correspondantes constitue encore un défi majeur. Il est néanmoins extrêmement important de relever ce défi pour développer des modèles fiables de flux des eaux souterraines et de transport de contaminant. Pour résoudre ce problème, j'ai développé une technique d'intégration des données hydrogéophysiques basée sur une procédure bayésienne de simulation séquentielle en deux étapes. Cette procédure vise des problèmes à plus grande échelle. L'objectif est de simuler la distribution d'un paramètre hydraulique cible à partir, d'une part, de mesures d'un paramètre géophysique pertinent qui couvrent l'espace de manière exhaustive, mais avec une faible résolution (spatiale) et, d'autre part, de mesures locales de très haute résolution des mêmes paramètres géophysique et hydraulique. Pour cela, mon algorithme lie dans un premier temps les données géophysiques de faible et de haute résolution à travers une procédure de réduction déchelle. Les données géophysiques régionales réduites sont ensuite reliées au champ du paramètre hydraulique à haute résolution. J'illustre d'abord l'application de cette nouvelle approche dintégration des données à une base de données synthétiques réaliste. Celle-ci est constituée de mesures de conductivité hydraulique et électrique de haute résolution réalisées dans les mêmes forages ainsi que destimations des conductivités électriques obtenues à partir de mesures de tomographic de résistivité électrique (ERT) sur l'ensemble de l'espace. Ces dernières mesures ont une faible résolution spatiale. La viabilité globale de cette méthode est testée en effectuant les simulations de flux et de transport au travers du modèle original du champ de conductivité hydraulique ainsi que du modèle simulé. Les simulations sont alors comparées. Les résultats obtenus indiquent que la procédure dintégration des données proposée permet d'obtenir des estimations de la conductivité en adéquation avec la structure à grande échelle ainsi que des predictions fiables des caractéristiques de transports sur des distances de moyenne à grande échelle. Les résultats correspondant au scénario de terrain indiquent que l'approche d'intégration des données nouvellement mise au point est capable d'appréhender correctement les hétérogénéitées à petite échelle aussi bien que les tendances à gande échelle du champ hydraulique prévalent. Les résultats montrent également une flexibilté remarquable et une robustesse de cette nouvelle approche dintégration des données. De ce fait, elle est susceptible d'être appliquée à un large éventail de données géophysiques et hydrologiques, à toutes les gammes déchelles. Dans la deuxième partie de ma thèse, j'évalue en détail la viabilité du réechantillonnage geostatique séquentiel comme mécanisme de proposition pour les méthodes Markov Chain Monte Carlo (MCMC) appliquées à des probmes inverses géophysiques et hydrologiques de grande dimension . L'objectif est de permettre une quantification plus précise et plus réaliste des incertitudes associées aux modèles obtenus. En considérant une série dexemples de tomographic radar puits à puits, j'étudie deux classes de stratégies de rééchantillonnage spatial en considérant leur habilité à générer efficacement et précisément des réalisations de la distribution postérieure bayésienne. Les résultats obtenus montrent que, malgré sa popularité, le réechantillonnage séquentiel est plutôt inefficace à générer des échantillons postérieurs indépendants pour des études de cas synthétiques réalistes, notamment pour le cas assez communs et importants où il existe de fortes corrélations spatiales entre le modèle et les paramètres. Pour résoudre ce problème, j'ai développé un nouvelle approche de perturbation basée sur une déformation progressive. Cette approche est flexible en ce qui concerne le nombre de paramètres du modèle et lintensité de la perturbation. Par rapport au rééchantillonage séquentiel, cette nouvelle approche s'avère être très efficace pour diminuer le nombre requis d'itérations pour générer des échantillons indépendants à partir de la distribution postérieure bayésienne. - Significant progress has been made with regard to the quantitative integration of geophysical and hydrological data at the local scale. However, extending corresponding approaches beyond the local scale still represents a major challenge, yet is critically important for the development of reliable groundwater flow and contaminant transport models. To address this issue, I have developed a hydrogeophysical data integration technique based on a two-step Bayesian sequential simulation procedure that is specifically targeted towards larger-scale problems. The objective is to simulate the distribution of a target hydraulic parameter based on spatially exhaustive, but poorly resolved, measurements of a pertinent geophysical parameter and locally highly resolved, but spatially sparse, measurements of the considered geophysical and hydraulic parameters. To this end, my algorithm links the low- and high-resolution geophysical data via a downscaling procedure before relating the downscaled regional-scale geophysical data to the high-resolution hydraulic parameter field. I first illustrate the application of this novel data integration approach to a realistic synthetic database consisting of collocated high-resolution borehole measurements of the hydraulic and electrical conductivities and spatially exhaustive, low-resolution electrical conductivity estimates obtained from electrical resistivity tomography (ERT). The overall viability of this method is tested and verified by performing and comparing flow and transport simulations through the original and simulated hydraulic conductivity fields. The corresponding results indicate that the proposed data integration procedure does indeed allow for obtaining faithful estimates of the larger-scale hydraulic conductivity structure and reliable predictions of the transport characteristics over medium- to regional-scale distances. The approach is then applied to a corresponding field scenario consisting of collocated high- resolution measurements of the electrical conductivity, as measured using a cone penetrometer testing (CPT) system, and the hydraulic conductivity, as estimated from electromagnetic flowmeter and slug test measurements, in combination with spatially exhaustive low-resolution electrical conductivity estimates obtained from surface-based electrical resistivity tomography (ERT). The corresponding results indicate that the newly developed data integration approach is indeed capable of adequately capturing both the small-scale heterogeneity as well as the larger-scale trend of the prevailing hydraulic conductivity field. The results also indicate that this novel data integration approach is remarkably flexible and robust and hence can be expected to be applicable to a wide range of geophysical and hydrological data at all scale ranges. In the second part of my thesis, I evaluate in detail the viability of sequential geostatistical resampling as a proposal mechanism for Markov Chain Monte Carlo (MCMC) methods applied to high-dimensional geophysical and hydrological inverse problems in order to allow for a more accurate and realistic quantification of the uncertainty associated with the thus inferred models. Focusing on a series of pertinent crosshole georadar tomographic examples, I investigated two classes of geostatistical resampling strategies with regard to their ability to efficiently and accurately generate independent realizations from the Bayesian posterior distribution. The corresponding results indicate that, despite its popularity, sequential resampling is rather inefficient at drawing independent posterior samples for realistic synthetic case studies, notably for the practically common and important scenario of pronounced spatial correlation between model parameters. To address this issue, I have developed a new gradual-deformation-based perturbation approach, which is flexible with regard to the number of model parameters as well as the perturbation strength. Compared to sequential resampling, this newly proposed approach was proven to be highly effective in decreasing the number of iterations required for drawing independent samples from the Bayesian posterior distribution.
Resumo:
Report for the scientific sojourn carried out at the University of California at Berkeley, from September to December 2007. Environmental niche modelling (ENM) techniques are powerful tools to predict species potential distributions. In the last ten years, a plethora of novel methodological approaches and modelling techniques have been developed. During three months, I stayed at the University of California, Berkeley, working under the supervision of Dr. David R. Vieites. The aim of our work was to quantify the error committed by these techniques, but also to test how an increase in the sample size affects the resultant predictions. Using MaxEnt software we generated distribution predictive maps, from different sample sizes, of the Eurasian quail (Coturnix coturnix) in the Iberian Peninsula. The quail is a generalist species from a climatic point of view, but an habitat specialist. The resultant distribution maps were compared with the real distribution of the species. This distribution was obtained from recent bird atlases from Spain and Portugal. Results show that ENM techniques can have important errors when predicting the species distribution of generalist species. Moreover, an increase of sample size is not necessary related with a better performance of the models. We conclude that a deep knowledge of the species’ biology and the variables affecting their distribution is crucial for an optimal modelling. The lack of this knowledge can induce to wrong conclusions.
Resumo:
Predictive species distribution modelling (SDM) has become an essential tool in biodiversity conservation and management. The choice of grain size (resolution) of environmental layers used in modelling is one important factor that may affect predictions. We applied 10 distinct modelling techniques to presence-only data for 50 species in five different regions, to test whether: (1) a 10-fold coarsening of resolution affects predictive performance of SDMs, and (2) any observed effects are dependent on the type of region, modelling technique, or species considered. Results show that a 10 times change in grain size does not severely affect predictions from species distribution models. The overall trend is towards degradation of model performance, but improvement can also be observed. Changing grain size does not equally affect models across regions, techniques, and species types. The strongest effect is on regions and species types, with tree species in the data sets (regions) with highest locational accuracy being most affected. Changing grain size had little influence on the ranking of techniques: boosted regression trees remain best at both resolutions. The number of occurrences used for model training had an important effect, with larger sample sizes resulting in better models, which tended to be more sensitive to grain. Effect of grain change was only noticeable for models reaching sufficient performance and/or with initial data that have an intrinsic error smaller than the coarser grain size.
Resumo:
Significant progress has been made with regard to the quantitative integration of geophysical and hydrological data at the local scale. However, extending the corresponding approaches to the scale of a field site represents a major, and as-of-yet largely unresolved, challenge. To address this problem, we have developed downscaling procedure based on a non-linear Bayesian sequential simulation approach. The main objective of this algorithm is to estimate the value of the sparsely sampled hydraulic conductivity at non-sampled locations based on its relation to the electrical conductivity logged at collocated wells and surface resistivity measurements, which are available throughout the studied site. The in situ relationship between the hydraulic and electrical conductivities is described through a non-parametric multivariatekernel density function. Then a stochastic integration of low-resolution, large-scale electrical resistivity tomography (ERT) data in combination with high-resolution, local-scale downhole measurements of the hydraulic and electrical conductivities is applied. The overall viability of this downscaling approach is tested and validated by comparing flow and transport simulation through the original and the upscaled hydraulic conductivity fields. Our results indicate that the proposed procedure allows obtaining remarkably faithful estimates of the regional-scale hydraulic conductivity structure and correspondingly reliable predictions of the transport characteristics over relatively long distances.
Resumo:
Executive Summary The first essay of this dissertation investigates whether greater exchange rate uncertainty (i.e., variation over time in the exchange rate) fosters or depresses the foreign investment of multinational firms. In addition to the direct capital financing it supplies, foreign investment can be a source of valuable technology and know-how, which can have substantial positive effects on a host country's economic growth. Thus, it is critically important for policy makers and central bankers, among others, to understand how multinationals base their investment decisions on the characteristics of foreign exchange markets. In this essay, I first develop a theoretical framework to improve our knowledge regarding how the aggregate level of foreign investment responds to exchange rate uncertainty when an economy consists of many firms, each of which is making decisions. The analysis predicts a U-shaped effect of exchange rate uncertainty on the total level of foreign investment of the economy. That is, the effect is negative for low levels of uncertainty and positive for higher levels of uncertainty. This pattern emerges because the relationship between exchange rate volatility and 'the probability of investment is negative for firms with low productivity at home (i.e., firms that find it profitable to invest abroad) and the relationship is positive for firms with high productivity at home (i.e., firms that prefer exporting their product). This finding stands in sharp contrast to predictions in the existing literature that consider a single firm's decision to invest in a unique project. The main contribution of this research is to show that the aggregation over many firms produces a U-shaped pattern between exchange rate uncertainty and the probability of investment. Using data from industrialized countries for the period of 1982-2002, this essay offers a comprehensive empirical analysis that provides evidence in support of the theoretical prediction. In the second essay, I aim to explain the time variation in sovereign credit risk, which captures the risk that a government may be unable to repay its debt. The importance of correctly evaluating such a risk is illustrated by the central role of sovereign debt in previous international lending crises. In addition, sovereign debt is the largest asset class in emerging markets. In this essay, I provide a pricing formula for the evaluation of sovereign credit risk in which the decision to default on sovereign debt is made by the government. The pricing formula explains the variation across time in daily credit spreads - a widely used measure of credit risk - to a degree not offered by existing theoretical and empirical models. I use information on a country's stock market to compute the prevailing sovereign credit spread in that country. The pricing formula explains a substantial fraction of the time variation in daily credit spread changes for Brazil, Mexico, Peru, and Russia for the 1998-2008 period, particularly during the recent subprime crisis. I also show that when a government incentive to default is allowed to depend on current economic conditions, one can best explain the level of credit spreads, especially during the recent period of financial distress. In the third essay, I show that the risk of sovereign default abroad can produce adverse consequences for the U.S. equity market through a decrease in returns and an increase in volatility. The risk of sovereign default, which is no longer limited to emerging economies, has recently become a major concern for financial markets. While sovereign debt plays an increasing role in today's financial environment, the effects of sovereign credit risk on the U.S. financial markets have been largely ignored in the literature. In this essay, I develop a theoretical framework that explores how the risk of sovereign default abroad helps explain the level and the volatility of U.S. equity returns. The intuition for this effect is that negative economic shocks deteriorate the fiscal situation of foreign governments, thereby increasing the risk of a sovereign default that would trigger a local contraction in economic growth. The increased risk of an economic slowdown abroad amplifies the direct effect of these shocks on the level and the volatility of equity returns in the U.S. through two channels. The first channel involves a decrease in the future earnings of U.S. exporters resulting from unfavorable adjustments to the exchange rate. The second channel involves investors' incentives to rebalance their portfolios toward safer assets, which depresses U.S. equity prices. An empirical estimation of the model with monthly data for the 1994-2008 period provides evidence that the risk of sovereign default abroad generates a strong leverage effect during economic downturns, which helps to substantially explain the level and the volatility of U.S. equity returns.
Resumo:
The identification of all human chromosome 21 (HC21) genes is a necessary step in understanding the molecular pathogenesis of trisomy 21 (Down syndrome). The first analysis of the sequence of 21q included 127 previously characterized genes and predicted an additional 98 novel anonymous genes. Recently we evaluated the quality of this annotation by characterizing a set of HC21 open reading frames (C21orfs) identified by mapping spliced expressed sequence tags (ESTs) and predicted genes (PREDs), identified only in silico. This study underscored the limitations of in silico-only gene prediction, as many PREDs were incorrectly predicted. To refine the HC21 annotation, we have developed a reliable algorithm to extract and stringently map sequences that contain bona fide 3' transcript ends to the genome. We then created a specific 21q graphical display allowing an integrated view of the data that incorporates new ESTs as well as features such as CpG islands, repeats, and gene predictions. Using these tools we identified 27 new putative genes. To validate these, we sequenced previously cloned cDNAs and carried out RT-PCR, 5'- and 3'-RACE procedures, and comparative mapping. These approaches substantiated 19 new transcripts, thus increasing the HC21 gene count by 9.5%. These transcripts were likely not previously identified because they are small and encode small proteins. We also identified four transcriptional units that are spliced but contain no obvious open reading frame. The HC21 data presented here further emphasize that current gene prediction algorithms miss a substantial number of transcripts that nevertheless can be identified using a combination of experimental approaches and multiple refined algorithms.
Resumo:
DREAM is an initiative that allows researchers to assess how well their methods or approaches can describe and predict networks of interacting molecules [1]. Each year, recently acquired datasets are released to predictors ahead of publication. Researchers typically have about three months to predict the masked data or network of interactions, using any predictive method. Predictions are assessed prior to an annual conference where the best predictions are unveiled and discussed. Here we present the strategy we used to make a winning prediction for the DREAM3 phosphoproteomics challenge. We used Amelia II, a multiple imputation software method developed by Gary King, James Honaker and Matthew Blackwell[2] in the context of social sciences to predict the 476 out of 4624 measurements that had been masked for the challenge. To chose the best possible multiple imputation parameters to apply for the challenge, we evaluated how transforming the data and varying the imputation parameters affected the ability to predict additionally masked data. We discuss the accuracy of our findings and show that multiple imputations applied to this dataset is a powerful method to accurately estimate the missing data. We postulate that multiple imputations methods might become an integral part of experimental design as a mean to achieve cost savings in experimental design or to increase the quantity of samples that could be handled for a given cost.
Resumo:
In androdioecious metapopulations, where males co-occur with hermaphrodites, the absence of males from certain populations or regions may be explained by locally high selfing rates, high hermaphrodite outcross siring success (e.g. due to high pollen production by hermaphrodites), or to stochastic processes (e.g. the failure of males to invade populations or regions following colonization or range expansion by hermaphrodites). In the Iberian Peninsula and Morocco, the presence of males with hermaphrodites in the wind-pollinated androdioecious plant Mercurialis annua (Euphorbiaceae) varies both among populations within relatively small regions and among regions, with some regions lacking males from all populations. The species is known to have expanded its range into the Iberian Peninsula from a southern refugium. To account for variation in male presence in M. annua, we test the following hypotheses: (1) that males are absent in areas where plant densities are lower, because selfing rates should be correspondingly higher; (2) that males are absent in areas where hermaphrodites produce more pollen; and (3) that males are absent in areas where there is an elevated proportion of populations in which plant density and hermaphrodite pollen production disfavour their invasion. We found support for predictions two and three in Morocco (the putative Pleistocene refugium for M. annua) but no support for any hypothesis in Iberia (the expanded range). Our results are partially consistent with a hypothesis of sex-allocation equilibrium for populations in Morocco; in Iberia, the absence of males from large geographical regions is more consistent with a model of sex-ratio evolution in a metapopulation with recurrent population turnover. Our study points to the role of both frequency-dependent selection and contingencies imposed by colonization during range expansions and in metapopulations.