964 results for Data Migration Processes Modeling
Abstract:
Currently, the identification of two cryptic Iberian amphibians, Discoglossus galganoi Capula, Nascetti, Lanza, Bullini and Crespo, 1985 and Discoglossus jeanneae Busack, 1986, relies on molecular characterization. To provide a means to discern the distributions of these species, we used 385-base-pair sequences of the cytochrome b gene to identify 54 Spanish populations of Discoglossus. These data and a series of environmental variables were used to build a logistic regression model capable of probabilistically assigning a specimen of Discoglossus found in any Universal Transverse Mercator (UTM) grid cell of 10 km × 10 km to one of the two species. Western longitudes, wide river basins, and semipermeable (mainly siliceous) and sandstone substrates favored the presence of D. galganoi, while eastern longitudes, mountainous areas, severe flooding, and impermeable (mainly clay) or basic (limestone and gypsum) substrates favored D. jeanneae. Considering odds of 4:1, fifteen percent of the UTM cells were predicted to be shared by both species, whereas 51% were clearly in favor of D. galganoi and 34% in favor of D. jeanneae. These results suggest that these two species have parapatric distributions and allow for preliminary identification of potential secondary contact areas. The method applied here can be generalized and used for other geographic problems posed by cryptic species.
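A minimal sketch of this kind of probabilistic cell-by-cell assignment, assuming hypothetical environmental predictors and simulated training populations (none of the variable names or values come from the study):

```python
# Sketch: probabilistic assignment of Discoglossus species per UTM cell
# using logistic regression; predictors and data are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Hypothetical training data: one row per sequenced population,
# columns = environmental variables (e.g., longitude, basin width, substrate).
X_train = rng.normal(size=(54, 3))
y_train = rng.integers(0, 2, size=54)  # 0 = D. galganoi, 1 = D. jeanneae

model = LogisticRegression().fit(X_train, y_train)

# Score every UTM grid cell and apply the 4:1 odds rule:
# p >= 0.8 -> D. jeanneae, p <= 0.2 -> D. galganoi, otherwise "shared".
X_cells = rng.normal(size=(1000, 3))
p = model.predict_proba(X_cells)[:, 1]
label = np.where(p >= 0.8, "jeanneae", np.where(p <= 0.2, "galganoi", "shared"))
print({k: int((label == k).sum()) for k in ("galganoi", "jeanneae", "shared")})
```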
Abstract:
The fast development of Information and Communication Technologies (ICT) offers new opportunities to realize future smart cities. To understand, manage and forecast a city's behavior, it is necessary to analyze different kinds of data from the most varied data acquisition systems. The aim of this research activity, in the framework of Data Science and Complex Systems Physics, is to provide stakeholders with new knowledge tools to improve the sustainability of mobility demand in future cities. From this perspective, the governance of mobility demand generated by large tourist flows is becoming a vital issue for the quality of life in the historical centers of Italian cities, and will worsen in the near future due to the continuing globalization process. Another critical theme is sustainable mobility, which aims to reduce private transportation in cities and improve multimodal mobility. We analyze the statistical properties of urban mobility in Venice, Rimini, and Bologna using different datasets provided by companies and local authorities. We develop algorithms and tools for cartography extraction, trip reconstruction, multimodality classification, and mobility simulation. We show the existence of characteristic mobility paths and statistical properties that depend on the means of transport and the type of user. Finally, we use our results to model and simulate the overall behavior of cars moving in the Emilia-Romagna Region and pedestrians moving in Venice, with software able to replicate in silico the demand for mobility and its dynamics.
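As an illustration of the trip-reconstruction step, a common approach is to split a vehicle's GPS ping stream wherever a long stop occurs; the gap threshold below is an assumption, not a value from the study:

```python
# Sketch: reconstructing trips from a stream of timestamped GPS pings by
# splitting on long stops; the 20-minute gap threshold is an assumption.
from datetime import datetime, timedelta

def split_trips(pings, max_gap=timedelta(minutes=20)):
    """pings: list of (timestamp, lat, lon) sorted by time."""
    trips, current = [], [pings[0]]
    for prev, cur in zip(pings, pings[1:]):
        if cur[0] - prev[0] > max_gap:   # long stop -> start a new trip
            trips.append(current)
            current = []
        current.append(cur)
    trips.append(current)
    return trips

pings = [(datetime(2020, 1, 1, 8, 0) + timedelta(minutes=m), 45.43, 12.33)
         for m in (0, 2, 4, 40, 42)]      # a 36-minute stop splits the stream
print([len(t) for t in split_trips(pings)])  # -> [3, 2]
```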
Abstract:
This dissertation contains four essays that all share a common purpose: developing new methodologies to exploit the potential of high-frequency data for the measurement, modeling and forecasting of the volatility and correlations of financial assets. The first two chapters provide useful tools for univariate applications, while the last two chapters develop multivariate methodologies. In chapter 1, we introduce a new class of univariate volatility models named FloGARCH models. FloGARCH models provide a parsimonious joint model for low-frequency returns and realized measures, and are sufficiently flexible to capture long memory as well as asymmetries related to leverage effects. We analyze the performance of the models in a realistic numerical study and on the basis of a data set composed of 65 equities. Using more than 10 years of high-frequency transactions, we document significant statistical gains related to the FloGARCH models in terms of in-sample fit, out-of-sample fit and forecasting accuracy compared to classical and Realized GARCH models. In chapter 2, using 12 years of high-frequency transactions for 55 U.S. stocks, we argue that combining low-frequency exogenous economic indicators with high-frequency financial data improves the ability of conditionally heteroskedastic models to forecast the volatility of returns, their full multi-step-ahead conditional distribution and the multi-period Value-at-Risk. Using a refined version of the Realized LGARCH model allowing for a time-varying intercept and implemented with realized kernels, we document that nominal corporate profits and term spreads have strong long-run predictive ability and generate accurate risk-measure forecasts over long horizons. The results are based on several loss functions and tests, including the Model Confidence Set. Chapter 3 is a joint work with David Veredas. We study the class of disentangled realized estimators for the integrated covariance matrix of Brownian semimartingales with finite-activity jumps. These estimators separate correlations and volatilities. We analyze different combinations of quantile- and median-based realized volatilities, and four estimators of realized correlations with three synchronization schemes. Their finite-sample properties are studied under four data generating processes, in the presence or absence of microstructure noise, and under synchronous and asynchronous trading. The main finding is that the pre-averaged version of disentangled estimators based on Gaussian ranks (for the correlations) and median deviations (for the volatilities) provides a precise, computationally efficient, and easy alternative for measuring integrated covariances on the basis of noisy and asynchronous prices. Along these lines, a minimum variance portfolio application shows the superiority of this disentangled realized estimator in terms of numerous performance metrics. Chapter 4 is co-authored with Niels S. Hansen, Asger Lunde and Kasper V. Olesen, all affiliated with CREATES at Aarhus University. We propose to use the Realized Beta GARCH model to exploit the potential of high-frequency data in commodity markets. The model produces high-quality forecasts of pairwise correlations between commodities, which can be used to construct a composite covariance matrix. We evaluate the quality of this matrix in a portfolio context and compare it to models used in the industry. We demonstrate significant economic gains in a realistic setting including short-selling constraints and transaction costs.
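For readers unfamiliar with realized measures, the sketch below shows a generic GARCH-style recursion driven by daily realized variance; it is not the FloGARCH model itself, and all parameters and data are illustrative:

```python
# Sketch (not the FloGARCH model itself): a generic GARCH-style recursion
# driven by a realized measure, illustrating how high-frequency information
# enters a daily volatility model; all parameter values are illustrative.
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical 5-minute returns for 250 days, 78 intervals per day.
intraday = rng.normal(scale=0.001, size=(250, 78))
rv = (intraday ** 2).sum(axis=1)          # daily realized variance

omega, beta, gamma = 1e-7, 0.7, 0.25      # illustrative parameters
h = np.empty_like(rv)
h[0] = rv.mean()
for t in range(1, len(rv)):
    # conditional variance updated by yesterday's realized measure
    h[t] = omega + beta * h[t - 1] + gamma * rv[t - 1]

print(f"mean conditional vol: {np.sqrt(h.mean()):.4%}")
```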
Abstract:
A research program on atmospheric boundary layer processes and local wind regimes in complex terrain was conducted in the vicinity of Lake Tekapo in the southern Alps of New Zealand, during two 1-month field campaigns in 1997 and 1999. The effects of the interaction of thermal and dynamic forcing were of specific interest, with a particular focus on the interaction of thermal forcing of differing scales. The rationale and objectives of the field and modeling program are described, along with the methodology used to achieve them. Specific research aims include improved knowledge of the role of surface forcing associated with varying energy balances across heterogeneous terrain, thermal influences on boundary layer and local wind development, and dynamic influences of the terrain through channeling effects. Data were collected using a network of surface meteorological and energy balance stations, radiosonde and pilot balloon soundings, tethered balloon and kite-based systems, sodar, and an instrumented light aircraft. These data are being used to investigate the energetics of surface heat fluxes, the effects of localized heating/cooling and advective processes on atmospheric boundary layer development, and dynamic channeling. A complementary program of numerical modeling includes application of the Regional Atmospheric Modeling System (RAMS) to case studies characterizing typical boundary layer structures and airflow patterns observed around Lake Tekapo. Some initial results derived from the special observation periods are used to illustrate progress made to date. In spite of the difficulties involved in obtaining good data and undertaking modeling experiments in such complex terrain, initial results show that surface thermal heterogeneity has a significant influence on local atmospheric structure and wind fields in the vicinity of the lake. This influence occurs particularly in the morning. However, dynamic channeling effects and the larger-scale thermal effect of the mountain region frequently override these more local features later in the day.
Abstract:
Given the rate of projected environmental change for the 21st century, urgent adaptation and mitigation measures are required to slow down the ongoing erosion of biodiversity. Even though increasing evidence shows that recent human-induced environmental changes have already triggered species' range shifts, changes in phenology and species' extinctions, accurate projections of species' responses to future environmental changes are more difficult to ascertain. This is problematic, since there is a growing awareness of the need to adopt proactive conservation planning measures using forecasts of species' responses to future environmental changes. There is a substantial body of literature describing and assessing the impacts of various scenarios of climate and land-use change on species' distributions. Model predictions include a wide range of assumptions and limitations that are widely acknowledged but compromise their use for developing reliable adaptation and mitigation strategies for biodiversity. Indeed, amongst the most used models, few, if any, explicitly deal with migration processes, the dynamics of populations at the "trailing edge" of shifting populations, species' interactions and the interaction between the effects of climate and land-use. In this review, we propose two main avenues to progress the understanding and prediction of the different processes occurring on the leading and trailing edge of the species' distribution in response to any global change phenomena. Deliberately focusing on plant species, we first explore the different ways to incorporate species' migration in the existing modelling approaches, given data and knowledge limitations and the dual effects of climate and land-use factors. Secondly, we explore the mechanisms and processes happening at the trailing edge of a shifting species' distribution and how to implement them into a modelling approach. We finally conclude this review with clear guidelines on how such modelling improvements will benefit conservation strategies in a changing world. (c) 2007 Rubel Foundation, ETH Zurich. Published by Elsevier GmbH. All rights reserved.
Abstract:
Managing the complexity of an enterprise system, given the number of entities and the variety of decisions and processes to be controlled, is a very hard task because it involves integrating the enterprise's operations and its information systems. Moreover, enterprises find themselves in a constant process of change, reacting to a dynamic and competitive environment in which their business processes are continually altered. Transforming business processes into models allows them to be analyzed and redefined, and computational tools make it possible to minimize the cost and risks of an enterprise integration design. This article argues for the necessity of modeling processes in order to define enterprise business requirements more precisely, and for the adequate use of modeling methodologies. As a practical example, the paper addresses process modeling in the domain of demand forecasting, a domain built from a theoretical review. The resulting reference models are transformed into information systems, with the aim of providing a generic solution and a starting point for better forecasting practice. The proposal is to fit the information system to the real needs of an enterprise, enabling it to obtain and track better results while minimizing design errors, time, money and effort. The enterprise process models are expressed in the CIMOSA language, and the supporting information system is modeled in UML.
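As one concrete instance of a forecasting routine that such a demand-forecasting reference model could wrap, here is simple exponential smoothing; this choice is illustrative and not prescribed by the article:

```python
# Sketch: simple exponential smoothing as one concrete forecasting routine
# that a demand-forecasting reference model could wrap; an illustrative
# choice, not the method prescribed by the article.
def ses_forecast(demand, alpha=0.3):
    """Return one-step-ahead forecasts for a demand history."""
    level = demand[0]
    forecasts = [level]
    for d in demand[1:]:
        level = alpha * d + (1 - alpha) * level  # update smoothed level
        forecasts.append(level)
    return forecasts

history = [120, 132, 101, 134, 90, 150]
print(ses_forecast(history))
```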
Abstract:
Multiple outcomes data are commonly used to characterize treatment effects in medical research, for instance, multiple symptoms to characterize potential remission of a psychiatric disorder. Often a global, i.e. symptom-invariant, treatment effect is evaluated. Such a treatment effect may overgeneralize the effect across the outcomes. On the other hand, individual treatment effects, varying across all outcomes, are complicated to interpret, and their estimation may lose precision relative to a global summary. An effective compromise is to summarize the treatment effect through patterns of treatment effects, i.e. "differentiated effects." In this paper we propose a two-category model to differentiate treatment effects into two groups. A model-fitting algorithm and simulation study are presented, and several methods are developed to analyze the heterogeneity present in the treatment effects. The method is illustrated using an analysis of schizophrenia symptom data.
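A rough sketch of the idea of differentiated effects, not the authors' algorithm: estimate one effect per outcome, then group the effects into two categories (here with 2-means clustering on simulated data):

```python
# Sketch (not the authors' algorithm): estimate a treatment effect per
# outcome, then group the effects into two categories with 2-means
# clustering; data are simulated for illustration.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
n, k = 200, 8                              # patients, symptom outcomes
treat = rng.integers(0, 2, size=n)
true_effect = np.where(np.arange(k) < 4, 0.2, 1.0)   # two effect groups
y = rng.normal(size=(n, k)) + treat[:, None] * true_effect

# Per-outcome effect = mean difference between treated and control.
effects = y[treat == 1].mean(axis=0) - y[treat == 0].mean(axis=0)
groups = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
    effects.reshape(-1, 1))
print(np.round(effects, 2), groups)
```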
Abstract:
We describe the time evolution of gene expression levels by using a time translation matrix to predict future expression levels of genes based on their expression levels at some initial time. We deduce the time translation matrix for previously published DNA microarray gene expression data sets by modeling them within a linear framework, using the characteristic modes obtained by singular value decomposition. The resulting time translation matrix provides a measure of the relationships among the modes and governs their time evolution. We show that a truncated matrix linking just a few modes is a good approximation of the full time translation matrix. This finding suggests that the number of essential connections among the genes is small.
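A minimal sketch of this construction on synthetic data: extract the characteristic modes by SVD, then fit the time translation matrix M in a truncated mode space by least squares so that x(t+1) ≈ M x(t):

```python
# Sketch: fitting a time translation matrix M with x(t+1) ~= M x(t) in a
# truncated space of SVD modes, as a linear model of expression dynamics;
# the data here are synthetic stand-ins for a microarray time course.
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 12))            # genes x time points

# Characteristic modes from singular value decomposition.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
r = 3                                      # keep a few essential modes
A = np.diag(s[:r]) @ Vt[:r]                # mode amplitudes over time (r x T)

# Least-squares time translation matrix in mode space: A[:, 1:] ~= M A[:, :-1]
M, *_ = np.linalg.lstsq(A[:, :-1].T, A[:, 1:].T, rcond=None)
M = M.T
pred = M @ A[:, :-1]                       # predicted next-step amplitudes
print(M.shape, np.linalg.norm(pred - A[:, 1:]) / np.linalg.norm(A[:, 1:]))
```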
Abstract:
Key Performance Indicators (KPIs) and their predictions are widely used by enterprises for informed decision making. Nevertheless, a very important factor, which is generally overlooked, is that top-level strategic KPIs are actually driven by operational-level business processes. These two domains are, however, mostly segregated and analysed in silos with different Business Intelligence solutions. In this paper, we propose an approach for advanced Business Simulations, which converges the two domains by utilising process execution & business data, and concepts from Business Dynamics (BD) and Business Ontologies, to promote better system understanding and detailed KPI predictions. Our approach incorporates the automated creation of Causal Loop Diagrams, thus empowering the analyst to critically examine the complex dependencies hidden in the massive amounts of available enterprise data. We have further evaluated our proposed approach in the context of a retail use case that involved verification of the automatically generated causal models by a domain expert.
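A toy sketch of deriving a causal-loop-style signed graph from lagged correlations between KPI time series; the paper's ontology- and Business-Dynamics-based generation is far richer, and the KPI names here are invented:

```python
# Sketch: a crude causal-loop-style graph from lag-1 correlations between
# KPI series; a stand-in for the paper's ontology-based generation.
import numpy as np
import networkx as nx

rng = np.random.default_rng(4)
T = 200
kpis = {"footfall": rng.normal(size=T)}
kpis["sales"] = 0.8 * np.roll(kpis["footfall"], 1) + rng.normal(size=T) * 0.3
kpis["stockouts"] = -0.6 * np.roll(kpis["sales"], 1) + rng.normal(size=T) * 0.3

G = nx.DiGraph()
names = list(kpis)
for a in names:
    for b in names:
        if a == b:
            continue
        # lag-1 correlation of a against b as a candidate causal link
        r = np.corrcoef(kpis[a][:-1], kpis[b][1:])[0, 1]
        if abs(r) > 0.5:
            G.add_edge(a, b, sign="+" if r > 0 else "-")
print(list(G.edges(data=True)))
```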
Abstract:
The goal of this study is to provide a framework for future researchers to understand and use the FARSITE wildfire-forecasting model with data assimilation. Current wildfire models lack the ability to provide accurate predictions of fire front position faster than real time. When FARSITE is coupled with a recursive ensemble filter, data assimilation improves the forecast. The scope includes an explanation of the standalone FARSITE application, technical details on FARSITE integration with a parallel program coupler called OpenPALM, and a demonstration of the FARSITE-Ensemble Kalman Filter software using the FireFlux I experiment by Craig Clements. The results show that the fire front forecast is better with the proposed data-driven methodology than with the standalone FARSITE model.
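A sketch of the analysis step of a stochastic Ensemble Kalman Filter, the kind of recursive ensemble filter used here; the state vector is an abstract stand-in, not actual FARSITE output:

```python
# Sketch: the analysis step of a stochastic Ensemble Kalman Filter; the
# state is an abstract fire-front parameter vector, not FARSITE output.
import numpy as np

rng = np.random.default_rng(5)
n_ens, n_state, n_obs = 40, 6, 2

ens = rng.normal(size=(n_state, n_ens))       # forecast ensemble (columns)
H = np.zeros((n_obs, n_state)); H[0, 0] = H[1, 3] = 1.0   # observation operator
R = 0.1 * np.eye(n_obs)                       # observation error covariance
y = np.array([1.5, -0.7])                     # observed fire-front quantities

Xm = ens.mean(axis=1, keepdims=True)
Xp = ens - Xm                                 # ensemble perturbations
P = Xp @ Xp.T / (n_ens - 1)                   # sample covariance
K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)  # Kalman gain

# Perturbed-observation update of each member.
y_pert = y[:, None] + rng.multivariate_normal(np.zeros(n_obs), R, n_ens).T
analysis = ens + K @ (y_pert - H @ ens)
print(analysis.mean(axis=1))
```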
Abstract:
This article deals with modeling the scavenging of particulate sulfate and gaseous sulfur dioxide, with emphasis on the synoptic conditions at different sampling sites, in order to determine whether in-cloud or below-cloud scavenging processes dominate in the Metropolitan Area of São Paulo (RMSP). Three sampling sites were chosen: GV (Granja Viana) on the outskirts of the RMSP, and IAG-USP and Mackenzie (RMSP center). Based on the synoptic conditions, a group of events was chosen and modeled numerically with a simple scavenging model. These synoptic conditions were usually convective cloud storms, which are common in the RMSP. The results show that in-cloud processes were dominant (80%) for sulfate/sulfur dioxide scavenging, with below-cloud processes accounting for around 20% of the total. Clearly convective events, with total rainfall higher than 20 mm, are better modeled than stratiform events, with a correlation coefficient of 0.92. There is also a clear association between events with higher rainfall amounts and the ratio between the modeled and observed data sets, with a correlation coefficient of 0.63. Additionally, the suburban sampling site, GV, as expected given its distance from pollution sources, generally presents smaller amounts of rainwater sulfate (modeled and observed) than the central sampling site, Mackenzie, where event characteristics partially explain the differences in rainfall concentrations.
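Simple scavenging models of this kind typically treat below-cloud washout as first-order decay, C(t) = C0 exp(-λt); a sketch with an illustrative coefficient (not a value from the study):

```python
# Sketch: first-order washout, C(t) = C0 * exp(-lambda * t); the
# scavenging coefficient is illustrative, not taken from the study.
import numpy as np

def washout(c0, scav_coeff, minutes):
    """Below-cloud concentration decay under a constant scavenging rate."""
    t = np.arange(minutes + 1) * 60.0          # seconds
    return c0 * np.exp(-scav_coeff * t)

conc = washout(c0=10.0, scav_coeff=1e-4, minutes=60)  # ug/m3 over one hour
print(f"removed after 1 h: {100 * (1 - conc[-1] / conc[0]):.1f}%")
```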
Abstract:
The natural evolution of information systems in organizations involves, on the one hand, the installation of up-to-date equipment and, on the other, the adoption of new business-support applications, keeping pace with market developments, changes in the business model and the maturing of the organization in a new context. This evolution often requires preserving existing data and functionality not covered by the new applications, which leads to the development and execution of processes for data migration, application migration and the integration of legacy systems. These processes are constrained by the available technology and by the existing knowledge about the legacy systems, and are sensitive to the context in which they take place. This dissertation presents a state of the art of migration and integration approaches, describes the various alternatives, and illustrates, in a systematic and comparative way, exercises carried out using different approaches in a real, changing migration and integration environment.
Abstract:
The last three decades have seen quite dramatic changes in the way we model time-dependent data. Linear processes have taken center stage in the modeling of time series. As far as second-order properties are concerned, the theory and the methodology are very adequate. However, there is more and more evidence that linear models are not sufficiently flexible and rich for modeling purposes, and that failure to account for non-linearities can be very misleading and have undesired consequences.
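A small sketch of the point: a threshold autoregression whose regime-dependent dynamics are invisible to a single linear autocorrelation but appear once we condition on the regime (model and parameters are illustrative):

```python
# Sketch: a threshold autoregression; a single lag-1 autocorrelation
# averages over two regimes and hides the nonlinear structure.
import numpy as np

rng = np.random.default_rng(6)
n = 5000
x = np.zeros(n)
for t in range(1, n):
    phi = 0.8 if x[t - 1] < 0 else -0.4       # regime-dependent dynamics
    x[t] = phi * x[t - 1] + rng.normal()

lin_acf = np.corrcoef(x[:-1], x[1:])[0, 1]
# Nonlinearity shows up when conditioning on the regime:
acf_neg = np.corrcoef(x[:-1][x[:-1] < 0], x[1:][x[:-1] < 0])[0, 1]
acf_pos = np.corrcoef(x[:-1][x[:-1] >= 0], x[1:][x[:-1] >= 0])[0, 1]
print(round(lin_acf, 2), round(acf_neg, 2), round(acf_pos, 2))
```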
Abstract:
An accentuated process of technological evolution can be observed today all over the globe. Companies, whether small, medium or large, are increasingly dependent on computerized systems to carry out their business processes, and consequently generate business information in which the data often have no relationship to one another. Most conventional computer systems are not designed to manage and store strategic information, preventing that information from serving as a strategic resource. Decisions are therefore made based on the experience of managers, when they could be based on historical facts stored by the various systems. In general, organizations have a lot of data but in most cases extract little information from it, which is a problem in competitive markets. As organizations seek to evolve and outperform the competition in decision making, the term Business Intelligence (BI) arises in this context. GisGeo Information Systems is a company that develops GIS-based software (geographic information systems) following an open-source tools philosophy. Its main product is based on the geographic location of various types of vehicles, on data collection and, consequently, on its analysis (kilometers traveled, duration of a trip between two defined points, fuel consumption, etc.). This is the setting for this project, whose objective is to give a different perspective to the existing data by crossing BI concepts with the system implemented in the company, in keeping with its philosophy. The project covers some of the most important concepts underlying BI, such as the dimensional model, the data warehouse, the ETL process and OLAP, following the Ralph Kimball methodology. Some of the main open-source tools on the market are also studied, along with their advantages and disadvantages relative to one another. In conclusion, the solution developed according to the criteria set out by the company is presented as a proof of concept of the applicability of Business Intelligence to the field of Geographic Information Systems (GIS), using an open-source tool that supports data visualization through dashboards.
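A miniature sketch of the Kimball-style dimensional model idea: a fact table joined to a dimension and rolled up into the kind of aggregate a dashboard would show; the table and column names are hypothetical, not GisGeo's schema:

```python
# Sketch: a miniature Kimball-style star schema for fleet-tracking data,
# with one ETL step aggregating a fact table against its dimension;
# table and column names are hypothetical.
import pandas as pd

dim_vehicle = pd.DataFrame({"vehicle_id": [1, 2], "type": ["van", "truck"]})
fact_trips = pd.DataFrame({
    "vehicle_id": [1, 1, 2],
    "date":       ["2024-05-01", "2024-05-02", "2024-05-01"],
    "km":         [120.5, 98.0, 310.2],
    "fuel_l":     [10.1, 8.3, 40.7],
})

# ETL-style load step: join facts to the dimension and roll up by type.
mart = (fact_trips.merge(dim_vehicle, on="vehicle_id")
        .groupby("type")[["km", "fuel_l"]].sum())
mart["l_per_100km"] = 100 * mart["fuel_l"] / mart["km"]
print(mart)  # the kind of aggregate a dashboard would visualize
```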