945 results for Data modeling


Relevance:

30.00%

Publisher:

Abstract:

In this paper, an alternative skew Student-t family of distributions is studied. It is obtained as an extension of the generalized Student-t (GS-t) family introduced by McDonald and Newey [10]. The extension obtained can be seen as a reparametrization of the skewed GS-t distribution considered by Theodossiou [14]. A key element in the construction of this extension is that it can be stochastically represented as a mixture of an epsilon-skew-power-exponential distribution [1] and a generalized-gamma distribution. From this representation, we can readily derive theoretical properties and easy-to-implement simulation schemes. We study its main properties, including the stochastic representation, moments, and asymmetry and kurtosis coefficients. We also derive the Fisher information matrix, which is shown to be nonsingular in some special cases, such as when the asymmetry parameter is null, that is, in the vicinity of symmetry, and discuss maximum-likelihood estimation. Simulation studies for some particular cases and a real data analysis are also reported, illustrating the usefulness of the proposed extension.
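
The practical value of such a stochastic representation is that sampling reduces to drawing from the component distributions. As a minimal sketch of the idea, the snippet below uses the classical normal/chi-square mixture representation of the ordinary Student-t as a simpler stand-in; the paper's epsilon-skew-power-exponential/generalized-gamma mixture is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def student_t_via_mixture(nu, size):
    # Classical scale-mixture representation: Z / sqrt(V) with
    # Z ~ N(0, 1) and nu * V ~ chi^2_nu gives a Student-t_nu draw.
    z = rng.standard_normal(size)
    v = rng.chisquare(nu, size) / nu
    return z / np.sqrt(v)

samples = student_t_via_mixture(nu=5, size=100_000)
print(samples.mean(), samples.var())  # variance should be near nu/(nu-2) = 5/3
```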

Relevance:

30.00%

Publisher:

Abstract:

Background: Infant mortality is an important measure of human development, related to the level of welfare of a society. In order to inform public policy, various studies have tried to identify the factors that influence infant mortality at an aggregate level. The objective of this paper is to analyze the regional pattern of infant mortality rates (IMR) in Brazil, evaluating the effect of infrastructure, socio-economic, and demographic variables to understand its distribution across the country. Methods: Regressions including socio-economic and living-conditions variables are conducted within a panel data structure. More specifically, a spatial panel data model with fixed effects and a spatial error autocorrelation structure is used to address spatial dependence. The spatial modeling approach takes into account the potential presence of spillovers between neighboring spatial units. The spatial units considered are Minimum Comparable Areas, defined to provide a consistent definition across Census years. Data are drawn from the 1980, 1991, and 2000 Censuses of Brazil and from data collected by the Ministry of Health (DATASUS). In order to identify the influence of health care infrastructure, variables related to the number of public and private hospitals are included. Results: The results indicate that the panel model with spatial effects provides the best fit to the data. The analysis confirms that the provision of health care infrastructure and social policy measures (e.g., improving educational attainment) are linked to reduced infant mortality. An original finding concerns the role of spatial effects in the analysis of IMR. Spillover effects associated with health infrastructure and water and sanitation facilities imply that there are regional benefits beyond the unit of analysis. Conclusions: A spatial modeling approach is important for producing reliable estimates in the analysis of panel IMR data. Substantively, this paper contributes to our understanding of the physical and social factors that influence IMR in the case of a developing country.
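
A spatial-error specification is typically motivated by spatially autocorrelated regression residuals. A minimal sketch of that diagnostic, computing Moran's I of OLS residuals over a hypothetical row-standardized weights matrix (toy data, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy example: Moran's I on regression residuals, the usual diagnostic
# that motivates a spatial-error specification like the paper's.
n = 100
W = (rng.random((n, n)) < 0.05).astype(float)  # hypothetical adjacency
np.fill_diagonal(W, 0.0)
W = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)  # row-standardize

x = rng.standard_normal(n)
y = 2.0 + 0.5 * x + rng.standard_normal(n)

# OLS residuals
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta

# Moran's I = (n / S0) * (e' W e) / (e' e), with S0 the sum of all weights
S0 = W.sum()
moran_I = (n / S0) * (e @ W @ e) / (e @ e)
print(f"Moran's I of residuals: {moran_I:.3f}")
```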

Relevance:

30.00%

Publisher:

Abstract:

Consistent in silico models for ADME properties are useful tools in early drug discovery. Here, we report hologram QSAR modeling of human intestinal absorption using a dataset of 638 compounds with associated experimental data. The final validated models are consistent and robust for the consensus prediction of this important pharmacokinetic property and are suitable for virtual screening applications. (C) 2012 Elsevier Ltd. All rights reserved.
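
Consensus prediction in this context usually means averaging the outputs of several validated models. A minimal sketch with generic regressors standing in for hologram descriptors (all data here are hypothetical):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)

# Hypothetical stand-in descriptors/targets; the paper's molecular
# holograms for the 638 compounds are not reproduced here.
X = rng.random((638, 32))
y = X[:, :4].sum(axis=1) + 0.1 * rng.standard_normal(638)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = [Ridge(alpha=1.0), RandomForestRegressor(n_estimators=200, random_state=0)]
preds = [m.fit(X_tr, y_tr).predict(X_te) for m in models]
consensus = np.mean(preds, axis=0)  # simple averaging consensus
print("consensus RMSE:", float(np.sqrt(np.mean((consensus - y_te) ** 2))))
```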

Relevance:

30.00%

Publisher:

Abstract:

Period-adding cascades have been observed experimentally and numerically in the dynamics of neurons and pancreatic cells, lasers, electric circuits, chemical reactions, oceanic internal waves, and also in air bubbling. We show that the period-adding cascades appearing in bubbling from a nozzle submerged in a viscous liquid can be reproduced by a simple model, based on hydrodynamical principles, describing the time evolution of two variables, bubble position and air-chamber pressure, through a system of differential equations with a detachment rule based on force balance. The model further reduces to an iterated one-dimensional map giving the pressures at the detachments, where the time between bubbles comes out as an observable of the dynamics. The model not only shows good agreement with experimental data, but is also able to predict the influence of the main parameters involved, such as the length of the hose connecting the air supply to the needle, the needle radius, and the needle length. (C) 2012 American Institute of Physics. [http://dx.doi.org/10.1063/1.3695345]
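
Once the dynamics is reduced to a one-dimensional map, periodic windows can be detected by iterating the map and measuring the orbit period. A sketch of that mechanics, using the logistic map as a familiar stand-in (the paper derives its own pressure map, and its cascades are period-adding rather than period-doubling):

```python
import numpy as np

def iterate_map(f, x0, n_transient=1000, n_keep=64):
    # Iterate a one-dimensional map and return the asymptotic orbit.
    x = x0
    for _ in range(n_transient):
        x = f(x)
    orbit = []
    for _ in range(n_keep):
        x = f(x)
        orbit.append(x)
    return np.array(orbit)

def orbit_period(orbit, tol=1e-6):
    # Smallest p with |x[k+p] - x[k]| < tol for all k (period estimate).
    for p in range(1, len(orbit) // 2):
        if np.all(np.abs(orbit[p:] - orbit[:-p]) < tol):
            return p
    return None

# Hypothetical stand-in map, not the paper's pressure map.
for r in (2.8, 3.2, 3.5):
    orbit = iterate_map(lambda x: r * x * (1.0 - x), 0.4)
    print(f"r = {r}: period {orbit_period(orbit)}")
```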

Relevance:

30.00%

Publisher:

Abstract:

We present an analytic description of numerical results for the Landau-gauge SU(2) gluon propagator D(p²), obtained from lattice simulations (in the scaling region) for the largest lattice sizes to date, in d = 2, 3 and 4 space-time dimensions. Fits to the gluon data in 3d and in 4d show very good agreement with the tree-level prediction of the refined Gribov-Zwanziger (RGZ) framework, supporting a massive behavior for D(p²) in the infrared limit. In particular, we investigate the propagator's pole structure and provide estimates of the dynamical mass scales that can be associated with dimension-two condensates in the theory. In the 2d case, fitting the data requires a noninteger power of the momentum p in the numerator of the expression for D(p²). In this case, an infinite-volume-limit extrapolation gives D(0) = 0. Our analysis suggests that this result is related to a particular symmetry in the complex-pole structure of the propagator and not to purely imaginary poles, as would be expected in the original Gribov-Zwanziger scenario.
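
Fits of this kind amount to least-squares estimation of a rational tree-level form. A sketch with scipy on synthetic data, using a schematic RGZ-type parameterization D(p²) = C(p² + s)/(p⁴ + u p² + t); the paper's exact form and lattice data are not reproduced here:

```python
import numpy as np
from scipy.optimize import curve_fit

# Schematic tree-level RGZ-type rational form (see the paper for the exact one).
def rgz_propagator(p2, C, s, u, t):
    # Poles in p^2 are complex conjugates when u**2 < 4*t.
    return C * (p2 + s) / (p2**2 + u * p2 + t)

# Hypothetical synthetic "lattice" data standing in for the real points.
rng = np.random.default_rng(3)
p2 = np.linspace(0.01, 10.0, 80)
true = rgz_propagator(p2, 1.0, 2.5, 0.8, 1.3)
data = true * (1.0 + 0.02 * rng.standard_normal(p2.size))

popt, pcov = curve_fit(rgz_propagator, p2, data, p0=(1.0, 1.0, 1.0, 1.0))
print("fitted (C, s, u, t):", np.round(popt, 3))
print("D(0) estimate:", rgz_propagator(0.0, *popt))
```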

Relevance:

30.00%

Publisher:

Abstract:

Background: Several models have been designed to predict the survival of patients with heart failure. While available and widely used both for stratification and for deciding among treatment options at the individual level, these models have several limitations. Specifically, some clinical variables may have an influence on prognosis that changes over time. Statistical models that accommodate this characteristic may help in evaluating prognosis. The aim of the present study was to analyze and quantify the impact of modeling heart failure survival allowing for covariates with time-varying effects known to be independent predictors of overall mortality in this clinical setting. Methodology: Survival data from an inception cohort of five hundred patients diagnosed with heart failure functional class III and IV between 2002 and 2004 and followed up to 2006 were analyzed using the proportional hazards Cox model, variations of the Cox model, and the Aalen additive model. Principal Findings: One hundred and eighty-eight (188) patients died during follow-up. For the patients under study, age, serum sodium, hemoglobin, serum creatinine, and left ventricular ejection fraction were significantly associated with mortality. Evidence of a time-varying effect was suggested for the last three. Both high hemoglobin and high LV ejection fraction were associated with a reduced risk of dying, with a stronger initial effect. High creatinine, associated with an increased risk of dying, also presented a stronger initial effect. The impact of age and sodium was constant over time. Conclusions: The current study points to the importance of evaluating covariates with time-varying effects in heart failure models. The analysis performed suggests that variations of the Cox and Aalen models constitute a valuable tool for identifying these variables. The implementation of covariates with time-varying effects in heart failure prognostication models may reduce bias and increase the specificity of such models.
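
In Python, the same two model families can be sketched with the lifelines package (a software choice assumed here for illustration; the paper does not specify it). Synthetic data stand in for the cohort:

```python
import numpy as np
import pandas as pd
from lifelines import AalenAdditiveFitter, CoxPHFitter

rng = np.random.default_rng(4)
n = 500

# Hypothetical covariates and exponential survival times.
df = pd.DataFrame({
    "age": rng.normal(60, 10, n),
    "sodium": rng.normal(138, 4, n),
})
hazard = np.exp(0.03 * (df["age"] - 60) - 0.05 * (df["sodium"] - 138))
df["T"] = rng.exponential(1.0 / hazard)
df["E"] = (df["T"] < 4.0).astype(int)   # administrative censoring at t = 4
df["T"] = df["T"].clip(upper=4.0)

# Proportional-hazards fit (constant covariate effects).
cph = CoxPHFitter().fit(df, duration_col="T", event_col="E")
cph.print_summary()

# The Aalen additive model lets covariate effects vary with time.
aaf = AalenAdditiveFitter().fit(df, duration_col="T", event_col="E")
print(aaf.cumulative_hazards_.head())
```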

Relevance:

30.00%

Publisher:

Abstract:

Species distribution models (SDMs) can be useful for different conservation purposes. We discuss the importance of fitting the spatial scale and of using current records and relevant predictors for conservation aims. We chose the jaguar (Panthera onca) as the target species and Brazil and the Atlantic Forest biome as study areas. We tested two different extents (continent and biome) and resolutions (~4 km and ~1 km) in Maxent with 186 records and 11 predictors (bioclimatic, elevation, land-use, and landscape structure). All models presented satisfactory AUC values (>0.70) and low omission errors (<23%). SDMs were scale-sensitive, as the use of a reduced extent yielded significant gains in model performance, generating more constrained and realistic predictive distribution maps. Continental-scale models performed poorly in predicting the potential current jaguar distribution, instead approximating the historical distribution. Specificity increased significantly from coarse- to finer-scale models due to the reduction of overprediction. The variability of the environmental space (E-space) differed between the continental and biome scales for most climatic variables, and the representation of the E-space by the predictors differed significantly (t = 2.42, d.f. = 9, P < 0.05). Refining the spatial scale, incorporating landscape variables, and improving the quality of biological data are essential for improving model predictions for conservation purposes.
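
AUC and omission error are the two evaluation measures quoted above. A minimal presence/background sketch with a logistic classifier standing in for Maxent (hypothetical records and predictors; Maxent itself is a different, maximum-entropy estimator):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(5)

# Hypothetical presence/background data with two bioclimatic predictors.
n_pres, n_back = 186, 1000
X_pres = rng.normal([25.0, 1200.0], [2.0, 150.0], (n_pres, 2))
X_back = rng.normal([22.0, 900.0], [5.0, 400.0], (n_back, 2))
X = np.vstack([X_pres, X_back])
y = np.r_[np.ones(n_pres), np.zeros(n_back)]

clf = LogisticRegression(max_iter=1000).fit(X, y)
scores = clf.predict_proba(X)[:, 1]
print("AUC:", round(roc_auc_score(y, scores), 3))

# Omission error at a fixed threshold: fraction of presences predicted absent.
thr = 0.5
omission = float(np.mean(scores[:n_pres] < thr))
print("omission error:", round(omission, 3))
```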

Relevance:

30.00%

Publisher:

Abstract:

A nonlinear analysis is performed for the purpose of identifying the pitch freeplay nonlinearity and its effect on the type of bifurcation of a two-degree-of-freedom aeroelastic system. The databases for the identification are generated from experimental investigations of a pitch-plunge rigid airfoil supported by a nonlinear torsional spring. Experiments and a linear analysis are performed to validate the parameters of the linearized equations. Based on the periodic responses of the experimental data, which include the flutter frequency and its third harmonic, the freeplay nonlinearity is approximated by a polynomial expansion up to the third order. This representation allows us to use the normal form of the Hopf bifurcation to characterize the type of instability. Based on numerical integrations, the coefficients of the polynomial expansion representing the freeplay nonlinearity are identified. Published by Elsevier Ltd.
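
The cubic representation of a freeplay stiffness can be illustrated directly: assume a deadband restoring moment and fit odd polynomial terms by least squares. The paper instead identifies the coefficients from measured responses via numerical integration; the freeplay law and values below are hypothetical.

```python
import numpy as np

# Freeplay restoring moment: zero inside the deadband, linear outside.
def freeplay(theta, k=1.0, delta=0.05):
    return k * np.where(np.abs(theta) > delta,
                        theta - np.sign(theta) * delta, 0.0)

# Third-order polynomial approximation (odd terms only, by symmetry),
# as in the cubic representation used for the Hopf normal form.
theta = np.linspace(-0.2, 0.2, 401)
A = np.column_stack([theta, theta**3])
coef, *_ = np.linalg.lstsq(A, freeplay(theta), rcond=None)
k1, k3 = coef
print(f"fitted: M(theta) ~ {k1:.3f}*theta + {k3:.2f}*theta^3")
```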

Relevance:

30.00%

Publisher:

Abstract:

In this paper, we carry out robust modeling and influence diagnostics in Birnbaum-Saunders (BS) regression models. Specifically, we present some aspects of the BS and log-BS distributions and their generalizations based on the Student-t distribution, and develop BS-t regression models, including maximum likelihood estimation based on the EM algorithm and diagnostic tools. In addition, we apply the results to real insurance data, which illustrates the usefulness of the proposed model. Copyright (c) 2011 John Wiley & Sons, Ltd.
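
The baseline BS distribution is available in scipy as fatiguelife, which is enough to sketch maximum-likelihood fitting. The paper's BS-t extension, which replaces the normal kernel with a Student-t and uses EM, is not implemented here.

```python
from scipy import stats

# scipy's `fatiguelife` is the Birnbaum-Saunders distribution
# (shape alpha, scale beta); sample from it, then refit by ML.
alpha, beta = 0.5, 2.0
data = stats.fatiguelife.rvs(alpha, loc=0, scale=beta, size=2000, random_state=7)

alpha_hat, loc_hat, beta_hat = stats.fatiguelife.fit(data, floc=0)
print(f"alpha: {alpha_hat:.3f} (true {alpha}), beta: {beta_hat:.3f} (true {beta})")
```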

Relevance:

30.00%

Publisher:

Abstract:

Current scientific applications produce large amounts of data, whose processing, handling, and analysis require large-scale computing infrastructures such as clusters and grids. In this area, studies aim at improving the performance of data-intensive applications by optimizing data accesses. To achieve this goal, distributed storage systems have employed techniques such as data replication, migration, distribution, and access parallelism. However, the main drawback of those studies is that they do not take application behavior into account when performing data access optimization. This limitation motivated this paper, which applies strategies to support the online prediction of application behavior in order to optimize data access operations on distributed systems, without requiring any information on past executions. To accomplish this goal, the approach organizes application behaviors as time series and then analyzes and classifies those series according to their properties. Based on these properties, the approach selects modeling techniques to represent the series and perform predictions, which are later used to optimize data access operations. This new approach was implemented and evaluated using the OptorSim simulator, sponsored by the LHC-CERN project and widely employed by the scientific community. Experiments confirm that the new approach reduces application execution time by about 50 percent, especially when handling large amounts of data.
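
The core idea, classify a series by its properties and then pick a matching predictor, can be sketched in a few lines. The toy trace and the lag-1 autocorrelation test below are illustrative stand-ins for the paper's classification scheme.

```python
import numpy as np

rng = np.random.default_rng(8)

# Toy stand-in for an application's access trace.
t = np.arange(500)
series = 100 + 20 * np.sin(2 * np.pi * t / 50) + rng.normal(0, 2, t.size)

def lag1_autocorr(x):
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

rho = lag1_autocorr(series)
if rho > 0.5:
    # Strongly autocorrelated: predict the next value with an AR(1) fit.
    a, b = np.polyfit(series[:-1], series[1:], 1)
    prediction = a * series[-1] + b
else:
    # Weak structure: fall back to the recent mean.
    prediction = series[-50:].mean()

print(f"lag-1 autocorrelation {rho:.2f}, next-value prediction {prediction:.1f}")
```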

Relevance:

30.00%

Publisher:

Abstract:

An experimental study of the drag-reduction phenomenon in dispersed oil-water flow has been performed in a 26-mm-i.d., 12-m-long horizontal glass pipe. The flow was characterized using a novel wire-mesh sensor based on capacitance measurements and high-speed video recording. New two-phase pressure gradient, volume fraction, and phase distribution data have been used in the analysis. Drag reduction and a significant slip ratio were detected at oil volume fractions between 10 and 45%, at high mixture Reynolds numbers, and with water as the dominant phase. Phase-fraction distribution diagrams and cross-sectional imaging of the flow suggested the presence of a higher amount of water near the pipe wall. Based on this, a phenomenology is proposed to explain drag reduction in dispersed flow in situations where the slip ratio is significant. A simple phenomenological model is developed, and the agreement between model predictions and data, including data from the literature, is encouraging. (c) 2011 American Institute of Chemical Engineers AIChE J, 2012
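
Drag reduction is diagnosed against a single-phase or homogeneous-mixture friction baseline. A sketch of that baseline calculation for the pipe diameter quoted above; the fluid properties and velocity are illustrative, and the Blasius correlation is a generic smooth-pipe choice, not the paper's model.

```python
# Homogeneous-mixture baseline against which drag reduction is judged;
# all property values are illustrative, not the paper's measurements.
D = 0.026                        # pipe i.d. [m]
U = 3.0                          # mixture velocity [m/s]
eps_o = 0.30                     # oil volume fraction
rho_w, rho_o = 998.0, 850.0      # densities [kg/m^3]
mu_w = 1.0e-3                    # water viscosity [Pa s]

rho_m = eps_o * rho_o + (1 - eps_o) * rho_w
Re_m = rho_m * U * D / mu_w      # water-continuous: water viscosity used
f = 0.079 * Re_m ** -0.25        # Blasius (Fanning) friction factor
dpdx = 2 * f * rho_m * U**2 / D  # frictional pressure gradient [Pa/m]

print(f"Re_m = {Re_m:.3e}, predicted dP/dx = {dpdx:.0f} Pa/m")
# Drag reduction: a measured dP/dx below this homogeneous prediction.
```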

Relevance:

30.00%

Publisher:

Abstract:

Statistical methods have been widely employed to assess the capabilities of credit scoring classification models in order to reduce the risk of wrong decisions when granting credit facilities to clients. The predictive quality of a classification model can be evaluated based on measures such as sensitivity, specificity, predictive values, accuracy, correlation coefficients, and information-theoretic measures such as relative entropy and mutual information. In this paper we analyze the performance of a naive logistic regression model (Hosmer & Lemeshow, 1989) and a logistic regression with state-dependent sample selection model (Cramer, 2004) applied to simulated data. As a case study, the methodology is also illustrated on a data set extracted from a Brazilian bank portfolio. Our simulation results revealed no statistically significant difference in predictive capacity between the naive logistic regression model and the logistic regression with state-dependent sample selection model. However, there is a strong difference between the distributions of the estimated default probabilities from these two statistical modeling techniques, with the naive logistic regression model always underestimating such probabilities, particularly in the presence of balanced samples. (C) 2012 Elsevier Ltd. All rights reserved.
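
The quoted performance measures are straightforward to compute from a fitted scorecard. A sketch for the naive logistic regression case on simulated data; the state-dependent sample-selection variant of Cramer (2004) requires a custom likelihood and is not shown.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(9)

# Hypothetical scores: two predictors, imbalanced default rate (~10%).
n = 5000
X = rng.standard_normal((n, 2))
p_default = 1 / (1 + np.exp(-(-2.5 + 1.2 * X[:, 0] - 0.8 * X[:, 1])))
y = rng.binomial(1, p_default)

clf = LogisticRegression().fit(X, y)
pred = (clf.predict_proba(X)[:, 1] > 0.5).astype(int)

tn, fp, fn, tp = confusion_matrix(y, pred).ravel()
print("sensitivity:", tp / (tp + fn))
print("specificity:", tn / (tn + fp))
print("mean estimated default prob:", clf.predict_proba(X)[:, 1].mean())
```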

Relevance:

30.00%

Publisher:

Abstract:

Background: To understand the molecular mechanisms underlying important biological processes, a detailed description of the networks of gene products involved is required. In order to define and understand such molecular networks, several statistical methods have been proposed in the literature to estimate gene regulatory networks from time-series microarray data. However, several problems remain. First, information flow needs to be inferred, in addition to the correlation between genes. Second, we usually try to identify large networks from a large number of genes (parameters) using a smaller number of microarray experiments (samples). Given this situation, which is rather frequent in bioinformatics, it is difficult to perform statistical tests using methods that model large gene-gene networks. In addition, most models rely on dimension reduction via clustering techniques; the resulting network is therefore not a gene-gene network but a module-module network. Here, we present the Sparse Vector Autoregressive (SVAR) model as a solution to these problems. Results: We applied the SVAR model to estimate gene regulatory networks from gene expression profiles obtained in time-series microarray experiments. Through extensive simulations applying the SVAR method to artificial regulatory networks, we show that SVAR can infer true positive edges even when the number of samples is smaller than the number of genes. Moreover, it is possible to control for false positives, a significant advantage over other methods described in the literature, which are based on ranks or score functions. By applying SVAR to actual HeLa cell cycle gene expression data, we were able to identify well-known transcription factor targets. Conclusion: The proposed SVAR method is able to model gene regulatory networks in the frequent situation in which the number of samples is lower than the number of genes, making it possible to naturally infer partial Granger causalities without any a priori information. In addition, we present a statistical test to control the false discovery rate, which was not previously possible using other gene regulatory network models.
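
A sparse VAR can be sketched as one L1-penalized regression per target gene, which is one common way to obtain sparse autoregressive coefficients when samples are fewer than genes (an illustrative choice, not necessarily the paper's estimator):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(10)

# Toy regime the SVAR method targets: p genes, n time points, n < p.
p, n = 30, 20
A_true = np.zeros((p, p))
A_true[1, 0] = 0.8           # gene 0 activates gene 1
A_true[2, 1] = -0.7          # gene 1 represses gene 2

X = np.zeros((n, p))
X[0] = rng.standard_normal(p)
for t in range(1, n):
    X[t] = X[t - 1] @ A_true.T + rng.standard_normal(p)

# One L1-penalized regression per target gene: x_i(t) on x(t-1).
A_hat = np.zeros((p, p))
for i in range(p):
    A_hat[i] = Lasso(alpha=0.1).fit(X[:-1], X[1:, i]).coef_

print("recovered (target, source) edges:",
      np.argwhere(np.abs(A_hat) > 0.3).tolist())
```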

Relevance:

30.00%

Publisher:

Abstract:

Polynomial Chaos Expansion (PCE) is widely recognized as a flexible tool to represent different types of random variables and processes. However, applications to real, experimental data are still limited. In this article, PCE is used to represent the random time evolution of metal corrosion growth in marine environments. The PCE coefficients are determined so as to represent the data of 45 corrosion coupons tested by Jeffrey and Melchers (2001) at Taylors Beach, Australia. The accuracy of the representation and the possibilities for model extrapolation are considered in the study. Results show that reasonably accurate smooth representations of the corrosion process can be obtained; the representation is not more accurate because a smooth model is being used to represent non-smooth corrosion data. Random corrosion leads to time-variant reliability problems, due to resistance degradation over time, and such problems are not trivial to solve, especially under random process loading. Two example problems are solved herein, showing how the developed PCE representations can be employed in the reliability analysis of structures subject to marine corrosion. Monte Carlo simulation is used to solve the resulting time-variant reliability problems; an accurate and more computationally efficient solution is also presented.
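
A one-dimensional version of the construction: expand a random quantity in probabilists' Hermite polynomials of a standard normal germ and estimate the coefficients by regression. The target model below is a hypothetical stand-in, not the coupon data.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval, hermevander

rng = np.random.default_rng(11)

# PCE sketch: expand a random quantity (corrosion loss at a fixed
# exposure time, say) in Hermite polynomials of a standard normal xi.
xi = rng.standard_normal(2000)
loss = np.exp(0.3 * xi)  # hypothetical lognormal corrosion loss [mm]

deg = 4
Psi = hermevander(xi, deg)                  # design matrix He_0..He_4(xi)
coef, *_ = np.linalg.lstsq(Psi, loss, rcond=None)
print("PCE coefficients:", np.round(coef, 4))

# The expansion can then be sampled cheaply, e.g. inside a Monte Carlo
# reliability loop:
xi_new = rng.standard_normal(5)
print("surrogate evaluations:", np.round(hermeval(xi_new, coef), 3))
```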

Relevance:

30.00%

Publisher:

Abstract:

This work provides a numerical and experimental investigation of fatigue crack growth behavior in steel weldments, including crack closure effects and their coupled interaction with weld strength mismatch. A central objective of this study is to extend previously developed frameworks for the evaluation of crack closure effects on fatigue crack growth rates (FCGR) to steel weldments while, at the same time, gaining additional understanding of commonly adopted criteria for crack closure loads and their influence on the fatigue life of structural welds. Very detailed nonlinear finite element analyses using 3-D models of compact tension C(T) fracture specimens with center-cracked, square-groove welds provide the evolution of crack growth with the cyclic stress intensity factor, which is required for the estimation of the closure loads. Fatigue crack growth tests conducted on plane-sided, shallow-cracked C(T) specimens provide the necessary data against which crack closure effects on fatigue crack growth behavior can be assessed. Overall, the present investigation provides additional support for estimation procedures of plasticity-induced crack closure loads in fatigue analyses of structural steels and their weldments.
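
The role of closure loads in fatigue life can be sketched with a Paris-law integration in which only the portion of the cycle above the opening load drives growth. The constants and the opening-level assumption below are illustrative, not the paper's results.

```python
import math

# Paris-law crack growth with a simple closure correction: only the part
# of the load cycle above the opening level is assumed to drive growth.
C, m = 1e-11, 3.0            # Paris constants (da/dN in m/cycle, K in MPa*sqrt(m))
Y = 1.12                     # geometry factor, held constant for simplicity
S_max, S_min = 100.0, 10.0   # remote stress cycle [MPa]
K_op_ratio = 0.3             # assumed opening level: K_op = 0.3 * K_max

a, n_cycles, block = 1.0e-3, 0, 1000   # integrate in blocks of 1000 cycles
while a < 10.0e-3:
    K_max = Y * S_max * math.sqrt(math.pi * a)
    K_min = Y * S_min * math.sqrt(math.pi * a)
    K_op = max(K_op_ratio * K_max, K_min)
    dK_eff = K_max - K_op               # effective stress intensity range
    a += C * dK_eff ** m * block
    n_cycles += block

print(f"cycles from a = 1 mm to 10 mm: {n_cycles:,}")
```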