960 results for Discrete Data Models


Relevance: 30.00%

Abstract:

The objective of this work was to assess the degree of multicollinearity and to identify the variables involved in linear dependence relations in additive-dominant models. Data on birth weight (n=141,567), yearling weight (n=58,124), and scrotal circumference (n=20,371) of Montana Tropical composite cattle were used. Diagnosis of multicollinearity was based on the variance inflation factor (VIF) and on the evaluation of the condition indexes and eigenvalues of the correlation matrix among explanatory variables. The first model studied (RM) included the fixed effect of dam age class at calving and the covariates associated with the direct and maternal additive and non-additive effects. The second model (R) included all the effects of the RM model except the maternal additive effects. Multicollinearity was detected in both models for all traits considered, with VIF values ranging from 1.03 to 70.20 for RM and from 1.03 to 60.70 for R. Collinearity increased as variables were added to the model and as the number of observations decreased, and it was classified as weak, with condition index values between 10.00 and 26.77. In general, the variables associated with additive and non-additive effects were involved in multicollinearity, partly because of the natural connection between these covariates as fractions of the biological types in the breed composition.
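
The VIF and condition-index diagnostics used here are straightforward to reproduce. Below is a minimal sketch in Python, assuming statsmodels is available and using a hypothetical covariate matrix X in place of the actual additive and non-additive effect covariates.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

def collinearity_diagnostics(X: pd.DataFrame):
    # Variance inflation factor of each covariate regressed on the others
    vif = pd.Series(
        [variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
        index=X.columns, name="VIF",
    )
    # Condition indexes: sqrt(largest eigenvalue / each eigenvalue) of the
    # correlation matrix among the explanatory variables
    eigvals = np.linalg.eigvalsh(np.corrcoef(X.values, rowvar=False))
    condition_indexes = np.sqrt(eigvals.max() / eigvals)
    return vif, np.sort(condition_indexes)
```

By the usual conventions (not specific to this paper), VIF values above 10 and condition indexes between 10 and 30 signal weak to moderate collinearity, which matches the 10.00-26.77 range reported above.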

Relevance: 30.00%

Abstract:

The paper presents some contemporary approaches to spatial environmental data analysis. The main topics concentrate on the decision-oriented problems of environmental spatial data mining and modeling: valorization and representativity of data with the help of exploratory data analysis, spatial predictions, probabilistic and risk mapping, and the development and application of conditional stochastic simulation models. The innovative part of the paper presents an integrated/hybrid model: machine learning (ML) residuals sequential simulations (MLRSS). The models are based on multilayer perceptron and support vector regression ML algorithms used for modeling long-range spatial trends, combined with sequential simulations of the residuals. ML algorithms deliver nonlinear solutions for spatially non-stationary problems, which are difficult for the geostatistical approach. Geostatistical tools (variography) are used to characterize the performance of ML algorithms by analyzing the quality and quantity of the spatially structured information extracted from the data with ML algorithms. Sequential simulations provide an efficient assessment of uncertainty and spatial variability. A case study of the Chernobyl fallout illustrates the performance of the proposed model. It is shown that probability mapping, provided by the combination of ML data-driven and geostatistical model-based approaches, can be used efficiently in the decision-making process.
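
The hybrid trend-plus-residuals idea can be sketched compactly. The following is a minimal illustration in Python, assuming scikit-learn's MLPRegressor as the trend model; the sequential simulation of the residuals is left out, since it would come from a geostatistical routine not shown here.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def ml_trend_and_residuals(coords: np.ndarray, values: np.ndarray):
    """Split the data into an ML-modelled trend and stationary residuals."""
    # Step 1: model the long-range spatial trend with a multilayer perceptron
    trend_model = MLPRegressor(hidden_layer_sizes=(20, 20), max_iter=5000)
    trend_model.fit(coords, values)
    trend = trend_model.predict(coords)
    # Step 2: the residuals, now closer to stationarity, are what the
    # sequential (geo)simulation step operates on
    residuals = values - trend
    return trend_model, residuals
```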

Relevance: 30.00%

Abstract:

The objective of this work was to select semivariogram models to estimate the population density of the fig fly (Zaprionus indianus; Diptera: Drosophilidae) throughout the year, using ordinary kriging. Nineteen monitoring sites were demarcated in an area of 8,200 m², cropped with six fruit tree species: persimmon, citrus, fig, guava, apple, and peach. During a 24-month period, 106 weekly evaluations were done at these sites. The average number of adult fig flies captured weekly per trap, during each month, was fitted to the circular, spherical, pentaspherical, exponential, Gaussian, rational quadratic, hole effect, K-Bessel, J-Bessel, and stable semivariogram models, using ordinary kriging interpolation. The models with the best fit were selected by cross-validation. Each data set (month) has a particular spatial dependence structure, which makes it necessary to define specific semivariogram models in order to improve the fit to the experimental semivariogram. Therefore, it was not possible to determine a standard semivariogram model; instead, six theoretical models were selected: circular, Gaussian, hole effect, K-Bessel, J-Bessel, and stable.
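
Model selection of this kind can be sketched with the scikit-gstat package, which implements a subset of the ten models listed above (e.g., spherical, exponential, Gaussian, stable). In this minimal illustration, coords and counts are hypothetical stand-ins for the trap locations and the monthly mean fly counts, and the variogram's RMSE against the experimental semivariogram serves as a simple stand-in for full cross-validation.

```python
import numpy as np
from skgstat import Variogram

CANDIDATES = ["spherical", "exponential", "gaussian", "stable"]

def best_variogram_model(coords: np.ndarray, counts: np.ndarray) -> str:
    # Fit each candidate model and keep the one with the smallest error
    scores = {m: Variogram(coords, counts, model=m).rmse for m in CANDIDATES}
    return min(scores, key=scores.get)
```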

Relevance: 30.00%

Abstract:

Panel data can be arranged into a matrix in two ways, called 'long' and 'wide' formats (LF and WF). The two formats suggest two alternative model approaches for analyzing panel data: (i) univariate regression with varying intercept; and (ii) multivariate regression with latent variables (a particular case of a structural equation model, SEM). The present paper compares the two approaches, showing in which circumstances they yield equivalent (in some cases, even numerically equal) results. We show that the univariate approach gives results equivalent to the multivariate approach when restrictions of time invariance (in the paper, the TI assumption) are imposed on the parameters of the multivariate model. It is shown that the restrictions implicit in the univariate approach can be assessed by chi-square difference testing of two nested multivariate models. In addition, common tests encountered in the econometric analysis of panel data, such as the Hausman test, are shown to have an equivalent representation as chi-square difference tests. Commonalities and differences between the univariate and multivariate approaches are illustrated using an empirical panel data set of firms' profitability as well as simulated panel data.
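
The two layouts are easy to picture with a toy example. The sketch below uses pandas, with a hypothetical firm-year panel of profitability; pivot produces the wide format and melt recovers the long one.

```python
import pandas as pd

# Long format (LF): one row per firm-year observation
df_long = pd.DataFrame({
    "firm": ["A", "A", "B", "B"],
    "year": [2001, 2002, 2001, 2002],
    "profit": [1.0, 1.2, 0.8, 0.9],
})

# LF -> WF: one row per firm, one column per year
df_wide = df_long.pivot(index="firm", columns="year", values="profit")

# WF -> LF again
df_back = df_wide.reset_index().melt(
    id_vars="firm", var_name="year", value_name="profit"
)
```

In SEM terms, each column of the wide format becomes one observed variable of the multivariate model, which is what makes the nested chi-square difference tests described above possible.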

Relevance: 30.00%

Abstract:

This work is devoted to the problem of reconstructing the basis weight structure of a paper web with black-box techniques. The data analyzed come from a real paper machine and were collected by an off-line scanner. The principal mathematical tool used in this work is Autoregressive Moving Average (ARMA) modelling. When coupled with the Discrete Fourier Transform (DFT), it gives a very flexible and interesting tool for analyzing properties of the paper web. Both ARMA and DFT are used independently to represent the given signal in a simplified version of our algorithm, but the final goal is to combine the two. The Ljung-Box Q-statistic lack-of-fit test, combined with the root mean squared error coefficient, gives a tool to separate significant signals from noise.
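
The ARMA-plus-diagnostics part of the workflow can be sketched with statsmodels. In this minimal illustration, signal is a random placeholder for a scanned basis weight profile; the DFT stage and the combined algorithm are not shown.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

signal = np.random.default_rng(0).standard_normal(500)  # placeholder data

# Fit an ARMA(2, 1) model (an ARIMA model with d = 0)
result = ARIMA(signal, order=(2, 0, 1)).fit()

# Ljung-Box Q-statistic on the residuals: large p-values suggest the model
# has absorbed the serial correlation, leaving white-noise residuals
ljung_box = acorr_ljungbox(result.resid, lags=[10])
rmse = np.sqrt(np.mean(result.resid ** 2))
print(ljung_box, rmse)
```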

Relevance: 30.00%

Abstract:

Context. The understanding of Galaxy evolution can be facilitated by the use of population synthesis models, which allow one to test hypotheses on the star formation history, star evolution, and the chemical and dynamical evolution of the Galaxy. Aims. The new version of the Besançon Galaxy Model (hereafter BGM) aims to provide a more flexible and powerful tool to investigate the initial mass function (IMF) and star formation rate (SFR) of the Galactic disc. Methods. We present a new strategy for the generation of thin disc stars which treats the IMF, SFR, and evolutionary tracks as free parameters. We have updated most of the ingredients for the star count production and, for the first time, binary stars are generated in a consistent way. We keep in this new scheme the local dynamical self-consistency, as in Bienaymé et al. (1987). We then compare simulations from the new model with Tycho-2 data and the local luminosity function, as a first test to verify and constrain the new ingredients. The effects of changing thirteen different ingredients of the model are systematically studied. Results. For the first time, a full-sky comparison is performed between the BGM and data. This strategy allows us to constrain the IMF slope at high masses, which is found to be close to 3.0, excluding a shallower slope such as Salpeter's. The SFR is found to be decreasing whatever IMF is assumed. The model is compatible with a local dark matter density of 0.011 M☉ pc⁻³, implying that there is no compelling evidence for a significant amount of dark matter in the disc. While the model is fitted to Tycho-2 data, a magnitude-limited sample with V < 11, we check that it is still consistent with fainter stars. Conclusions. The new model constitutes a new basis for further comparisons with large-scale surveys and is being prepared to become a powerful tool for the analysis of the Gaia mission data.
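
As a side note on the IMF result, drawing stellar masses from a power-law IMF with a slope close to the value quoted above is a short inverse-transform exercise. The sketch below is illustrative only; the mass limits and the sampling scheme are assumptions, not the BGM's actual implementation.

```python
import numpy as np

def sample_imf(n: int, alpha: float = 3.0, m_min: float = 1.0,
               m_max: float = 100.0, seed: int = 0) -> np.ndarray:
    """Draw n masses from dN/dm proportional to m**(-alpha) on [m_min, m_max]."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=n)
    a = 1.0 - alpha  # CDF inversion for a truncated power law (alpha != 1)
    return (m_min**a + u * (m_max**a - m_min**a)) ** (1.0 / a)
```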

Relevance: 30.00%

Abstract:

Nowadays the variety of fuels used in power boilers is widening, and new boiler constructions and operating models have to be developed. This research and development is done in small pilot plants, where a faster analysis of the boiler mass and heat balance is needed to be able to make the right decisions already during the test run. The barrier to determining the boiler balance during test runs is the long process of chemical analyses of the collected input and output matter samples. The present work concentrates on finding a way to determine the boiler balance without chemical analyses and on optimising the test rig to get the best possible accuracy for the heat and mass balance of the boiler. The purpose of this work was to create an automatic boiler balance calculation method for the 4 MW CFB/BFB pilot boiler of Kvaerner Pulping Oy, located in Messukylä in Tampere. The calculation was created in the data management computer of the pilot plant's automation system. The calculation is made in the Microsoft Excel environment, which gives a good basis and functions for handling large databases and calculations without any delicate programming. The automation system in the pilot plant was reconstructed and updated by Metso Automation Oy during 2001, and the new system, MetsoDNA, has good data management properties, which is necessary for large calculations such as the boiler balance calculation. Two possible methods for calculating the boiler balance during a test run were found: either the fuel flow is determined and used to calculate the boiler's mass balance, or the unburned carbon loss is estimated and the mass balance of the boiler is calculated on the basis of the boiler's heat balance. Both methods have their own weaknesses, so they were implemented in parallel in the calculation, and the choice of method was left to the user. The user also needs to define the fuels used and some solid mass flows that are not measured automatically by the automation system. A sensitivity analysis showed that the most essential values for accurate boiler balance determination are the flue gas oxygen content, the boiler's measured heat output, and the lower heating value of the fuel. The theoretical part of this work concentrates on the error management of these measurements and analyses, and on measurement accuracy and boiler balance calculation in theory. The empirical part concentrates on the creation of the balance calculation for the boiler in question and on describing the work environment.
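
The heat-balance route described above reduces, at its core, to recovering the fuel mass flow from the measured heat output and the fuel's lower heating value. The sketch below is a simplified illustration; the efficiency and unburned-carbon-loss figures are hypothetical placeholders for values the user would supply, not the pilot plant's parameters.

```python
def fuel_flow_from_heat_balance(heat_output_kw: float,
                                lhv_mj_per_kg: float,
                                efficiency: float = 0.88,
                                unburned_carbon_loss: float = 0.01) -> float:
    """Estimate fuel mass flow (kg/s) from the boiler's heat balance."""
    useful_fraction = efficiency * (1.0 - unburned_carbon_loss)
    # kW -> MJ/s, then divide by the effective heating value
    return (heat_output_kw / 1000.0) / (lhv_mj_per_kg * useful_fraction)

# Example: the 4 MW pilot boiler firing a fuel with an LHV of 18 MJ/kg
print(fuel_flow_from_heat_balance(4000.0, 18.0))
```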

Relevance: 30.00%

Abstract:

BACKGROUND: Only a few countries have cohorts enabling specific and up-to-date cardiovascular disease (CVD) risk estimation. Individual risk assessment based on study samples that differ too much from the target population could jeopardize the benefit of risk charts in general practice. Our aim was to provide up-to-date and valid CVD risk estimation for a Swiss population using a novel record linkage approach. METHODS: Anonymous record linkage was used to follow up (for mortality, until 2008) 9,853 men and women aged 25-74 years who participated in the Swiss MONICA (MONItoring of trends and determinants in CVD) study of 1983-92. The linkage success was 97.8%; loss to follow-up during 1990-2000 was 4.7%. Based on the ESC SCORE methodology (Weibull regression), we used age, sex, blood pressure, smoking, and cholesterol to generate three models. We compared 1) the original SCORE model with 2) a recalibrated model and 3) a new model, using the Brier score (BS) and cross-validation. RESULTS: Based on the cross-validated BS, the new model (BS = 14107 × 10⁻⁶) was somewhat more appropriate for risk estimation than the original (BS = 14190 × 10⁻⁶) and the recalibrated (BS = 14172 × 10⁻⁶) models. Particularly at younger ages, the derived absolute risks were consistently lower than those from the original and the recalibrated models, mainly owing to a smaller impact of total cholesterol. CONCLUSION: Record linkage of observational and routine data is an efficient procedure for obtaining valid and up-to-date CVD risk estimates for a specific population.
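
The cross-validated Brier score comparison can be sketched with scikit-learn. In this minimal illustration, logistic regression stands in for the Weibull survival regression of the SCORE methodology, and X and y are hypothetical covariates (age, sex, blood pressure, smoking, cholesterol) and 0/1 outcomes.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import cross_val_predict

def cv_brier_score(model, X: np.ndarray, y: np.ndarray) -> float:
    # Out-of-fold predicted probabilities, then the mean squared difference
    # between predictions and observed outcomes (the Brier score)
    p = cross_val_predict(model, X, y, cv=10, method="predict_proba")[:, 1]
    return brier_score_loss(y, p)

# Usage: cv_brier_score(LogisticRegression(max_iter=1000), X, y)
```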

Relevance: 30.00%

Abstract:

The increasing interest aroused by more advanced forecasting techniques, together with the requirement for more accurate forecasts of tourism demand at the destination level due to the constant growth of world tourism, has led us to evaluate the forecasting performance of neural modelling relative to that of time series methods at a regional level. Seasonality and volatility are important features of tourism data, which makes it a particularly favourable context in which to compare the forecasting performance of linear models to that of nonlinear alternative approaches. Pre-processed official statistical data on overnight stays and tourist arrivals from all the different countries of origin to Catalonia from 2001 to 2009 are used in the study. When comparing the forecasting accuracy of the different techniques for different time horizons, autoregressive integrated moving average models outperform self-exciting threshold autoregressions and artificial neural network models, especially for shorter horizons. These results suggest that there is a trade-off between the degree of pre-processing and the accuracy of the forecasts obtained with neural networks, which are more suitable in the presence of nonlinearity in the data. In spite of the significant differences between countries, which can be explained by different patterns of consumer behaviour, we also find that forecasts of tourist arrivals are more accurate than forecasts of overnight stays.
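
The head-to-head comparison can be sketched as follows, with statsmodels for the ARIMA benchmark and a small scikit-learn network as the nonlinear competitor; series is a hypothetical stand-in for a monthly tourism series, and MAPE is used as the accuracy measure.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from sklearn.neural_network import MLPRegressor

def mape(actual, forecast):
    return float(np.mean(np.abs((actual - forecast) / actual))) * 100

series = np.abs(np.random.default_rng(1).standard_normal(120)) + 10  # placeholder

train, test = series[:-12], series[-12:]

# ARIMA forecast for the last 12 months
arima_fc = ARIMA(train, order=(1, 1, 1)).fit().forecast(steps=12)

# Neural network autoregression: predict y_t from the previous 12 values
X = np.array([train[i:i + 12] for i in range(len(train) - 12)])
y = train[12:]
nn = MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000).fit(X, y)

# Recursive multi-step forecast with the fitted network
history = list(train[-12:])
nn_fc = []
for _ in range(12):
    nxt = float(nn.predict(np.array(history[-12:]).reshape(1, -1))[0])
    nn_fc.append(nxt)
    history.append(nxt)

print(mape(test, arima_fc), mape(test, np.array(nn_fc)))
```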

Relevance: 30.00%

Abstract:

The objective of this paper is to examine whether informal labor markets affect the flows of foreign direct investment (FDI), and whether this effect is similar in developed and developing countries. To this end, different public data sources, such as the World Bank (WB) and the United Nations Conference on Trade and Development (UNCTAD), are used, and panel econometric models are estimated for a sample of 65 countries over a 14-year period (1996-2009). In addition, the paper uses a dynamic model as an extension of the analysis to establish whether such an effect exists and what its indicators and significance may be.
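
A fixed-effects specification of the kind estimated in such studies can be sketched with the linearmodels package; the variable names below are hypothetical placeholders, not the paper's actual regressors.

```python
import pandas as pd
from linearmodels.panel import PanelOLS

def fit_fdi_model(df: pd.DataFrame):
    """df is expected to carry a (country, year) MultiIndex."""
    model = PanelOLS.from_formula(
        "fdi_inflows ~ informal_labor + gdp_growth + EntityEffects",
        data=df,
    )
    # Cluster standard errors by country
    return model.fit(cov_type="clustered", cluster_entity=True)
```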

Relevance: 30.00%

Abstract:

Forecasting coal resources and reserves is critical for coal mine development. Thickness maps are commonly used for assessing coal resources and reserves; however, they are limited in their ability to capture coal splitting effects in thick and heterogeneous coal zones. As an alternative, three-dimensional geostatistical methods are used to populate the facies distribution within a densely drilled heterogeneous coal zone in the As Pontes Basin (NW Spain). Coal distribution in this zone is mainly characterized by coal-dominated areas in the central parts of the basin interfingering with terrigenous-dominated alluvial fan zones at the margins. The three-dimensional models obtained are applied to forecast coal resources and reserves. Predictions using subsets of the entire dataset are also generated to understand the performance of the methods under limited data constraints. Three-dimensional facies interpolation methods tend to overestimate coal resources and reserves due to interpolation smoothing. Facies simulation methods yield resource predictions similar to those of conventional thickness map approximations. Reserves predicted by facies simulation methods are mainly influenced by: a) the specific coal proportion threshold used to determine whether a block can be recovered, and b) the capability of the modelling strategy to reproduce areal trends in coal proportions and splitting between coal-dominated and terrigenous-dominated areas of the basin. Reserve predictions differ between the simulation methods, even with dense conditioning datasets. The simulation methods can be ranked according to the correlation of their outputs with predictions from the directly interpolated coal proportion maps: a) with low-density datasets, sequential indicator simulation with trends yields the best correlation; b) with high-density datasets, sequential indicator simulation with post-processing yields the best correlation, because the areal trends are provided implicitly by the dense conditioning data.
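
The reserve-thresholding step (point a above) is simple to illustrate: a simulated block counts as recoverable only if its coal proportion exceeds a cut-off. The sketch below is illustrative; the density and cut-off values are assumptions, not the study's parameters.

```python
import numpy as np

def recoverable_reserves(coal_proportion: np.ndarray,
                         block_volume_m3: float,
                         coal_density_t_per_m3: float = 1.3,
                         cutoff: float = 0.5) -> float:
    """Tonnage of coal in blocks whose proportion clears the cut-off."""
    recoverable = coal_proportion >= cutoff
    coal_volume = float((coal_proportion[recoverable] * block_volume_m3).sum())
    return coal_volume * coal_density_t_per_m3
```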

Relevance: 30.00%

Abstract:

The epithelial Na+ channel (ENaC) is highly selective for Na+ and Li+ over K+ and is blocked by the diuretic amiloride. ENaC is a heterotetramer made of two alpha, one beta, and one gamma homologous subunits, each subunit comprising two transmembrane segments. Amino acid residues involved in binding of the pore blocker amiloride are located in the pre-M2 segment of the beta and gamma subunits, which precedes the second putative transmembrane alpha helix (M2). A residue in the alpha subunit (alphaS589) at the NH2 terminus of M2 is critical for the molecular sieving properties of ENaC. ENaC is more permeable to Li+ than to Na+ ions. The concentration of half-maximal unitary conductance is 38 mM for Na+ and 118 mM for Li+, a kinetic property that can account for the differences in Li+ and Na+ permeability. We show here that mutation of amino acid residues at homologous positions in the pre-M2 segment of the alpha, beta, and gamma subunits (alphaG587, betaG529, gammaS541) decreases the Li+/Na+ selectivity by changing the apparent channel affinity for Li+ and Na+. Fitting single-channel data of Li+ permeation to a discrete-state model including three barriers and two binding sites revealed that these mutations increased the energy needed for the translocation of Li+ from an outer ion binding site through the selectivity filter. Mutation of betaG529 to Ser, Cys, or Asp made ENaC partially permeable to K+ and larger ions, similar to the previously reported alphaS589 mutations. We conclude that the residues alphaG587 to alphaS589 and homologous residues in the beta and gamma subunits form the selectivity filter, which tightly accommodates Na+ and Li+ ions and excludes larger ions like K+.
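
The saturating unitary conductance quoted above follows a simple single-site binding form, g = g_max·[S]/(K_m + [S]), with K_m of 38 mM for Na+ and 118 mM for Li+. The sketch below evaluates that relation; the g_max values are hypothetical placeholders, and the full three-barrier, two-site fit is not reproduced here.

```python
import numpy as np

def unitary_conductance(conc_mm: np.ndarray, g_max_ps: float,
                        k_m_mm: float) -> np.ndarray:
    """Michaelis-Menten-like saturation of single-channel conductance."""
    return g_max_ps * conc_mm / (k_m_mm + conc_mm)

conc = np.array([10.0, 38.0, 100.0, 300.0])  # ion concentrations, mM
print(unitary_conductance(conc, g_max_ps=10.0, k_m_mm=38.0))   # Na+
print(unitary_conductance(conc, g_max_ps=10.0, k_m_mm=118.0))  # Li+
```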

Relevance: 30.00%

Abstract:

In this paper, we consider a discrete-time risk process allowing for delay in claim settlement, which introduces a certain type of dependence in the process. From martingale theory, an expression for the ultimate ruin probability is obtained, and Lundberg-type inequalities are derived. The impact of delay in claim settlement is then investigated. To this end, a convex order comparison of the aggregate claim amounts is performed with the corresponding non-delayed risk model, and numerical simulations are carried out with Belgian market data.
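
While the paper's results are analytic (martingale arguments and Lundberg-type bounds), the effect of settlement delay on ruin can also be illustrated by simulation. The sketch below is a crude Monte Carlo stand-in, with a one-period delay and Poisson claims as purely illustrative assumptions.

```python
import numpy as np

def ruin_probability(u0: float, premium: float, horizon: int = 200,
                     n_paths: int = 5000, seed: int = 0) -> float:
    """Estimate a finite-horizon ruin probability with delayed settlement."""
    rng = np.random.default_rng(seed)
    ruined = 0
    for _ in range(n_paths):
        surplus, pending = u0, 0.0
        for _ in range(horizon):
            surplus += premium - pending       # settle last period's claims
            pending = float(rng.poisson(0.9))  # claims incurred this period
            if surplus < 0:
                ruined += 1
                break
    return ruined / n_paths

print(ruin_probability(u0=10.0, premium=1.0))
```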

Relevance: 30.00%

Abstract:

A rigorous unit operation model is developed for vapor membrane separation. The new model is able to describe temperature-, pressure-, and concentration-dependent permeation, as well as real-fluid effects, in vapor and gas separation with hydrocarbon-selective rubbery polymeric membranes. Permeation through the membrane is described by a separate treatment of sorption and diffusion within the membrane. Chemical engineering thermodynamics is used to describe the equilibrium sorption of vapors and gases in rubbery membranes with equation-of-state models for polymeric systems. A new modification of the UNIFAC model is also proposed for this purpose. Various thermodynamic models are extensively compared in order to verify the models' ability to predict and correlate experimental vapor-liquid equilibrium data. Penetrant transport through the selective layer of the membrane is described with the generalized Maxwell-Stefan equations, which are able to account for the bulk flux contribution as well as the diffusive coupling effect. A method is described to compute and correlate binary penetrant-membrane diffusion coefficients from the experimental permeability coefficients at different temperatures and pressures. A fluid flow model for spiral-wound modules is derived from the conservation equations of mass, momentum, and energy. The conservation equations are presented in a discretized form using the control volume approach. A combination of the permeation model and the fluid flow model yields the desired rigorous model for vapor membrane separation. The model is implemented in an in-house process simulator, so that vapor membrane separation may be evaluated as an integral part of a process flowsheet.
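
One step of the workflow, correlating binary penetrant-membrane diffusion coefficients with temperature, can be sketched as follows. The Arrhenius form and the solution-diffusion identity used here are standard textbook relations; the thesis's actual correlation procedure may differ, and all parameters would come from the measured permeabilities.

```python
import numpy as np

R = 8.314  # universal gas constant, J/(mol K)

def arrhenius_diffusivity(T_kelvin: np.ndarray, d0: float,
                          activation_energy_j_mol: float) -> np.ndarray:
    """D(T) = D0 * exp(-E_a / (R T))."""
    return d0 * np.exp(-activation_energy_j_mol / (R * T_kelvin))

def diffusivity_from_permeability(permeability, sorption_coefficient):
    # Solution-diffusion identity: P = S * D, so D = P / S, which is how
    # diffusivities are recovered once the sorption model fixes S
    return permeability / sorption_coefficient
```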

Relevance: 30.00%

Abstract:

Research on power-line communications has concentrated on home automation, broadband indoor communications, and broadband data transfer in the low voltage distribution network between the home and the transformer station. Little research has focused on the high-frequency characteristics of industrial low voltage distribution networks. The industrial low voltage distribution network may be utilised as a communication channel for the data transfer required by the on-line condition monitoring of electric motors. The advantage of using power-line data transfer is that it does not require the installation of new cables. In the first part of this work, the characteristics of industrial low voltage distribution network components and a pilot distribution network are measured and modelled for power-line communication frequencies up to 30 MHz. The distributed inductances, capacitances, and attenuation of MCMK-type low voltage power cables are measured in the frequency band 100 kHz - 30 MHz, and an attenuation formula for the cables is formed based on the measurements. The input impedances of electric motors (15-250 kW) are measured using several signal couplings, and a measurement-based input impedance model for an electric motor with a slotted stator is formed. The model is designed for the frequency band 10 kHz - 30 MHz. Next, the effect of a DC (direct current) voltage link inverter on power-line data transfer is briefly analysed. Finally, a pilot distribution network is formed, and signal attenuation in the communication channels of the pilot environment is measured. The results are compared with simulations carried out using the developed models and the measured parameters for cables and motors. In the second part of this work, a narrowband power-line data transfer system is developed for the data transfer of the on-line condition monitoring of electric motors. It is developed using standard integrated circuits. The system is tested in the pilot environment, and the applicability of the system to the data transfer required by the on-line condition monitoring of electric motors is analysed.
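
The measured attenuation formula itself is not given in the abstract, but attenuation correlations for power-line channels are commonly written as A(f, d) = exp(-(a0 + a1·f^k)·d). The sketch below evaluates that generic form; the coefficients are illustrative placeholders, not the measured MCMK cable values.

```python
import numpy as np

def cable_attenuation_db(freq_hz: np.ndarray, length_m: float,
                         a0: float = 1e-3, a1: float = 1e-10,
                         k: float = 0.7) -> np.ndarray:
    """Attenuation in dB over a cable of the given length."""
    amplitude = np.exp(-(a0 + a1 * freq_hz**k) * length_m)
    return -20.0 * np.log10(amplitude)

# Attenuation at 100 kHz, 1 MHz and 30 MHz over 100 m of cable
print(cable_attenuation_db(np.array([1e5, 1e6, 3e7]), length_m=100.0))
```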