962 resultados para data complexity
Resumo:
This essay is a trial on giving some mathematical ideas about the concept of biological complexity, trying to explore four different attributes considered to be essential to characterize a complex system in a biological context: decomposition, heterogeneous assembly, self-organization, and adequacy. It is a theoretical and speculative approach, opening some possibilities to further numerical and experimental work, illustrated by references to several researches that applied the concepts presented here. (C) 2008 Elsevier B.V. All rights reserved.
Resumo:
This letter addresses the optimization and complexity reduction of switch-reconfigured antennas. A new optimization technique based on graph models is investigated. This technique is used to minimize the redundancy in a reconfigurable antenna structure and reduce its complexity. A graph modeling rule for switch-reconfigured antennas is proposed, and examples are presented.
Resumo:
In this work, an axisymmetric two-dimensional finite element model was developed to simulate instrumented indentation testing of thin ceramic films deposited onto hard steel substrates. The level of film residual stress (sigma(r)), the film elastic modulus (E) and the film work hardening exponent (n) were varied to analyze their effects on indentation data. These numerical results were used to analyze experimental data that were obtained with titanium nitride coated specimens, in which the substrate bias applied during deposition was modified to obtain films with different levels of sigma(r). Good qualitative correlation was obtained when numerical and experimental results were compared, as long as all film properties are considered in the analyses, and not only sigma(r). The numerical analyses were also used to further understand the effect of sigma(r) on the mechanical properties calculated based on instrumented indentation data. In this case, the hardness values obtained based on real or calculated contact areas are similar only when sink-in occurs, i.e. with high n or high ratio VIE, where Y is the yield strength of the film. In an additional analysis, four ratios (R/h(max)) between indenter tip radius and maximum penetration depth were simulated to analyze the combined effects of R and sigma(r) on the indentation load-displacement curves. In this case, or did not significantly affect the load curve exponent, which was affected only by the indenter tip radius. On the other hand, the proportional curvature coefficient was significantly affected by sigma(r) and n. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
For the first time, we introduce and study some mathematical properties of the Kumaraswamy Weibull distribution that is a quite flexible model in analyzing positive data. It contains as special sub-models the exponentiated Weibull, exponentiated Rayleigh, exponentiated exponential, Weibull and also the new Kumaraswamy exponential distribution. We provide explicit expressions for the moments and moment generating function. We examine the asymptotic distributions of the extreme values. Explicit expressions are derived for the mean deviations, Bonferroni and Lorenz curves, reliability and Renyi entropy. The moments of the order statistics are calculated. We also discuss the estimation of the parameters by maximum likelihood. We obtain the expected information matrix. We provide applications involving two real data sets on failure times. Finally, some multivariate generalizations of the Kumaraswamy Weibull distribution are discussed. (C) 2010 The Franklin Institute. Published by Elsevier Ltd. All rights reserved.
Resumo:
Estimation of Taylor`s power law for species abundance data may be performed by linear regression of the log empirical variances on the log means, but this method suffers from a problem of bias for sparse data. We show that the bias may be reduced by using a bias-corrected Pearson estimating function. Furthermore, we investigate a more general regression model allowing for site-specific covariates. This method may be efficiently implemented using a Newton scoring algorithm, with standard errors calculated from the inverse Godambe information matrix. The method is applied to a set of biomass data for benthic macrofauna from two Danish estuaries. (C) 2011 Elsevier B.V. All rights reserved.
Resumo:
Interval-censored survival data, in which the event of interest is not observed exactly but is only known to occur within some time interval, occur very frequently. In some situations, event times might be censored into different, possibly overlapping intervals of variable widths; however, in other situations, information is available for all units at the same observed visit time. In the latter cases, interval-censored data are termed grouped survival data. Here we present alternative approaches for analyzing interval-censored data. We illustrate these techniques using a survival data set involving mango tree lifetimes. This study is an example of grouped survival data.
Resumo:
This paper proposes a regression model considering the modified Weibull distribution. This distribution can be used to model bathtub-shaped failure rate functions. Assuming censored data, we consider maximum likelihood and Jackknife estimators for the parameters of the model. We derive the appropriate matrices for assessing local influence on the parameter estimates under different perturbation schemes and we also present some ways to perform global influence. Besides, for different parameter settings, sample sizes and censoring percentages, various simulations are performed and the empirical distribution of the modified deviance residual is displayed and compared with the standard normal distribution. These studies suggest that the residual analysis usually performed in normal linear regression models can be straightforwardly extended for a martingale-type residual in log-modified Weibull regression models with censored data. Finally, we analyze a real data set under log-modified Weibull regression models. A diagnostic analysis and a model checking based on the modified deviance residual are performed to select appropriate models. (c) 2008 Elsevier B.V. All rights reserved.
Resumo:
In this study, regression models are evaluated for grouped survival data when the effect of censoring time is considered in the model and the regression structure is modeled through four link functions. The methodology for grouped survival data is based on life tables, and the times are grouped in k intervals so that ties are eliminated. Thus, the data modeling is performed by considering the discrete models of lifetime regression. The model parameters are estimated by using the maximum likelihood and jackknife methods. To detect influential observations in the proposed models, diagnostic measures based on case deletion, which are denominated global influence, and influence measures based on small perturbations in the data or in the model, referred to as local influence, are used. In addition to those measures, the local influence and the total influential estimate are also employed. Various simulation studies are performed and compared to the performance of the four link functions of the regression models for grouped survival data for different parameter settings, sample sizes and numbers of intervals. Finally, a data set is analyzed by using the proposed regression models. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
A four-parameter extension of the generalized gamma distribution capable of modelling a bathtub-shaped hazard rate function is defined and studied. The beauty and importance of this distribution lies in its ability to model monotone and non-monotone failure rate functions, which are quite common in lifetime data analysis and reliability. The new distribution has a number of well-known lifetime special sub-models, such as the exponentiated Weibull, exponentiated generalized half-normal, exponentiated gamma and generalized Rayleigh, among others. We derive two infinite sum representations for its moments. We calculate the density of the order statistics and two expansions for their moments. The method of maximum likelihood is used for estimating the model parameters and the observed information matrix is obtained. Finally, a real data set from the medical area is analysed.
Resumo:
Joint generalized linear models and double generalized linear models (DGLMs) were designed to model outcomes for which the variability can be explained using factors and/or covariates. When such factors operate, the usual normal regression models, which inherently exhibit constant variance, will under-represent variation in the data and hence may lead to erroneous inferences. For count and proportion data, such noise factors can generate a so-called overdispersion effect, and the use of binomial and Poisson models underestimates the variability and, consequently, incorrectly indicate significant effects. In this manuscript, we propose a DGLM from a Bayesian perspective, focusing on the case of proportion data, where the overdispersion can be modeled using a random effect that depends on some noise factors. The posterior joint density function was sampled using Monte Carlo Markov Chain algorithms, allowing inferences over the model parameters. An application to a data set on apple tissue culture is presented, for which it is shown that the Bayesian approach is quite feasible, even when limited prior information is available, thereby generating valuable insight for the researcher about its experimental results.
Resumo:
The application of airborne laser scanning (ALS) technologies in forest inventories has shown great potential to improve the efficiency of forest planning activities. Precise estimates, fast assessment and relatively low complexity can explain the good results in terms of efficiency. The evolution of GPS and inertial measurement technologies, as well as the observed lower assessment costs when these technologies are applied to large scale studies, can explain the increasing dissemination of ALS technologies. The observed good quality of results can be expressed by estimates of volumes and basal area with estimated error below the level of 8.4%, depending on the size of sampled area, the quantity of laser pulses per square meter and the number of control plots. This paper analyzes the potential of an ALS assessment to produce certain forest inventory statistics in plantations of cloned Eucalyptus spp with precision equal of superior to conventional methods. The statistics of interest in this case were: volume, basal area, mean height and dominant trees mean height. The ALS flight for data assessment covered two strips of approximately 2 by 20 Km, in which clouds of points were sampled in circular plots with a radius of 13 m. Plots were sampled in different parts of the strips to cover different stand ages. The clouds of points generated by the ALS assessment: overall height mean, standard error, five percentiles (height under which we can find 10%, 30%, 50%,70% and 90% of the ALS points above ground level in the cloud), and density of points above ground level in each percentile were calculated. The ALS statistics were used in regression models to estimate mean diameter, mean height, mean height of dominant trees, basal area and volume. Conventional forest inventory sample plots provided real data. For volume, an exploratory assessment involving different combinations of ALS statistics allowed for the definition of the most promising relationships and fitting tests based on well known forest biometric models. The models based on ALS statistics that produced the best results involved: the 30% percentile to estimate mean diameter (R(2)=0,88 and MQE%=0,0004); the 10% and 90% percentiles to estimate mean height (R(2)=0,94 and MQE%=0,0003); the 90% percentile to estimate dominant height (R(2)=0,96 and MQE%=0,0003); the 10% percentile and mean height of ALS points to estimate basal area (R(2)=0,92 and MQE%=0,0016); and, to estimate volume, age and the 30% and 90% percentiles (R(2)=0,95 MQE%=0,002). Among the tested forest biometric models, the best fits were provided by the modified Schumacher using age and the 90% percentile, modified Clutter using age, mean height of ALS points and the 70% percentile, and modified Buckman using age, mean height of ALS points and the 10% percentile.
Resumo:
Grass reference evapotranspiration (ETo) is an important agrometeorological parameter for climatological and hydrological studies, as well as for irrigation planning and management. There are several methods to estimate ETo, but their performance in different environments is diverse, since all of them have some empirical background. The FAO Penman-Monteith (FAD PM) method has been considered as a universal standard to estimate ETo for more than a decade. This method considers many parameters related to the evapotranspiration process: net radiation (Rn), air temperature (7), vapor pressure deficit (Delta e), and wind speed (U); and has presented very good results when compared to data from lysimeters Populated with short grass or alfalfa. In some conditions, the use of the FAO PM method is restricted by the lack of input variables. In these cases, when data are missing, the option is to calculate ETo by the FAD PM method using estimated input variables, as recommended by FAD Irrigation and Drainage Paper 56. Based on that, the objective of this study was to evaluate the performance of the FAO PM method to estimate ETo when Rn, Delta e, and U data are missing, in Southern Ontario, Canada. Other alternative methods were also tested for the region: Priestley-Taylor, Hargreaves, and Thornthwaite. Data from 12 locations across Southern Ontario, Canada, were used to compare ETo estimated by the FAD PM method with a complete data set and with missing data. The alternative ETo equations were also tested and calibrated for each location. When relative humidity (RH) and U data were missing, the FAD PM method was still a very good option for estimating ETo for Southern Ontario, with RMSE smaller than 0.53 mm day(-1). For these cases, U data were replaced by the normal values for the region and Delta e was estimated from temperature data. The Priestley-Taylor method was also a good option for estimating ETo when U and Delta e data were missing, mainly when calibrated locally (RMSE = 0.40 mm day(-1)). When Rn was missing, the FAD PM method was not good enough for estimating ETo, with RMSE increasing to 0.79 mm day(-1). When only T data were available, adjusted Hargreaves and modified Thornthwaite methods were better options to estimate ETo than the FAO) PM method, since RMSEs from these methods, respectively 0.79 and 0.83 mm day(-1), were significantly smaller than that obtained by FAO PM (RMSE = 1.12 mm day(-1). (C) 2009 Elsevier B.V. All rights reserved.
Resumo:
Leaf wetness duration (LWD) is a key parameter in agricultural meteorology since it is related to epidemiology of many important crops, controlling pathogen infection and development rates. Because LWD is not widely measured, several methods have been developed to estimate it from weather data. Among the models used to estimate LWD, those that use physical principles of dew formation and dew and/or rain evaporation have shown good portability and sufficiently accurate results, but their complexity is a disadvantage for operational use. Alternatively, empirical models have been used despite their limitations. The simplest empirical models use only relative humidity data. The objective of this study was to evaluate the performance of three RH-based empirical models to estimate LWD in four regions around the world that have different climate conditions. Hourly LWD, air temperature, and relative humidity data were obtained from Ames, IA (USA), Elora, Ontario (Canada), Florence, Toscany (Italy), and Piracicaba, Sao Paulo State (Brazil). These data were used to evaluate the performance of the following empirical LWD estimation models: constant RH threshold (RH >= 90%); dew point depression (DPD); and extended RH threshold (EXT_RH). Different performance of the models was observed in the four locations. In Ames, Elora and Piracicaba, the RH >= 90% and DPD models underestimated LWD, whereas in Florence these methods overestimated LWD, especially for shorter wet periods. When the EXT_RH model was used, LWD was overestimated for all locations, with a significant increase in the errors. In general, the RH >= 90% model performed best, presenting the highest general fraction of correct estimates (F(C)), between 0.87 and 0.92, and the lowest false alarm ratio (F(AR)), between 0.02 and 0.31. The use of specific thresholds for each location improved accuracy of the RH model substantially, even when independent data were used; MAE ranged from 1.23 to 1.89 h, which is very similar to errors obtained with published physical models for LWD estimation. Based on these results, we concluded that, if calibrated locally, LWD can be estimated with acceptable accuracy by RH above a specific threshold, and that the EXT_RH method was unsuitable for estimating LWD at the locations used in this study. (C) 2007 Elsevier B.V. All rights reserved.
Resumo:
The van Genuchten expressions for the unsaturated soil hydraulic properties, first published in 1980, are used frequently in various vadose zone flow and transport applications assuming a specific relationship between the m and n soil hydraulic parameters. By comparison, probably because of the complexity of the hydraulic conductivity equations, the more general solutions with independent m and n values are rarely used. We expressed the general van Genuchten-Mualem and van Genuchten-Burdine hydraulic conductivity equations in terms of hypergeometric functions, which can be approximated by infinite series that converge rapidly for relatively large values of the van Genuchten-Mualem parameter n but only very slowly when n is close to one. Alternative equations were derived that provide very close approximations of the analytical results. The newly proposed equations allow the use of independent values of the parameters m and n in the soil water retention model of van Genuchten for subsequent prediction of the van Genuchten-Mualem and van Genuchten-Burdine hydraulic conductivity models, thus providing more flexibility in fitting experimental pressure-head-dependent water content, theta(h), and hydraulic conductivity, K(h), or K(theta) data.
Resumo:
This article presents a statistical model of agricultural yield data based on a set of hierarchical Bayesian models that allows joint modeling of temporal and spatial autocorrelation. This method captures a comprehensive range of the various uncertainties involved in predicting crop insurance premium rates as opposed to the more traditional ad hoc, two-stage methods that are typically based on independent estimation and prediction. A panel data set of county-average yield data was analyzed for 290 counties in the State of Parana (Brazil) for the period of 1990 through 2002. Posterior predictive criteria are used to evaluate different model specifications. This article provides substantial improvements in the statistical and actuarial methods often applied to the calculation of insurance premium rates. These improvements are especially relevant to situations where data are limited.