980 results for Data errors


Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this work, an axisymmetric two-dimensional finite element model was developed to simulate instrumented indentation testing of thin ceramic films deposited onto hard steel substrates. The level of film residual stress (sigma(r)), the film elastic modulus (E) and the film work hardening exponent (n) were varied to analyze their effects on indentation data. These numerical results were used to analyze experimental data that were obtained with titanium nitride coated specimens, in which the substrate bias applied during deposition was modified to obtain films with different levels of sigma(r). Good qualitative correlation was obtained when numerical and experimental results were compared, as long as all film properties are considered in the analyses, and not only sigma(r). The numerical analyses were also used to further understand the effect of sigma(r) on the mechanical properties calculated based on instrumented indentation data. In this case, the hardness values obtained based on real or calculated contact areas are similar only when sink-in occurs, i.e. with high n or high ratio Y/E, where Y is the yield strength of the film. In an additional analysis, four ratios (R/h(max)) between indenter tip radius and maximum penetration depth were simulated to analyze the combined effects of R and sigma(r) on the indentation load-displacement curves. In this case, sigma(r) did not significantly affect the load curve exponent, which was affected only by the indenter tip radius. On the other hand, the proportional curvature coefficient was significantly affected by sigma(r) and n. (C) 2010 Elsevier B.V. All rights reserved.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

An important topic in genomic sequence analysis is the identification of protein coding regions. In this context, several coding DNA model-independent methods based on the occurrence of specific patterns of nucleotides at coding regions have been proposed. Nonetheless, these methods have not been completely suitable due to their dependence on an empirically predefined window length required for a local analysis of a DNA region. We introduce a method based on a modified Gabor-wavelet transform (MGWT) for the identification of protein coding regions. This novel transform is tuned to analyze periodic signal components and presents the advantage of being independent of the window length. We compared the performance of the MGWT with other methods by using eukaryote data sets. The results show that MGWT outperforms all assessed model-independent methods with respect to identification accuracy. These results indicate that the source of at least part of the identification errors produced by the previous methods is the fixed working scale. The new method not only avoids this source of errors but also makes a tool available for detailed exploration of the nucleotide occurrence.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

For the first time, we introduce and study some mathematical properties of the Kumaraswamy Weibull distribution, which is a quite flexible model for analyzing positive data. It contains as special sub-models the exponentiated Weibull, exponentiated Rayleigh, exponentiated exponential, Weibull and also the new Kumaraswamy exponential distribution. We provide explicit expressions for the moments and moment generating function. We examine the asymptotic distributions of the extreme values. Explicit expressions are derived for the mean deviations, Bonferroni and Lorenz curves, reliability and Rényi entropy. The moments of the order statistics are calculated. We also discuss the estimation of the parameters by maximum likelihood. We obtain the expected information matrix. We provide applications involving two real data sets on failure times. Finally, some multivariate generalizations of the Kumaraswamy Weibull distribution are discussed. (C) 2010 The Franklin Institute. Published by Elsevier Ltd. All rights reserved.
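The abstract does not state the distribution function, so the following is only a minimal sketch under an assumed parameterization: the usual Kumaraswamy-G construction F(x) = 1 - (1 - G(x)^a)^b applied to a Weibull baseline G(x) = 1 - exp(-(lam*x)^c). The function and parameter names (`kw_weibull_cdf`, `a`, `b`, `c`, `lam`) are illustrative and may differ from the paper's notation.

```python
import math


def weibull_cdf(x, c, lam):
    """Baseline Weibull CDF G(x) = 1 - exp(-(lam * x) ** c); zero for x <= 0."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((lam * x) ** c))


def kw_weibull_cdf(x, a, b, c, lam):
    """Assumed Kumaraswamy-G form with a Weibull baseline:
    F(x) = 1 - (1 - G(x) ** a) ** b, with shape parameters a, b > 0."""
    g = weibull_cdf(x, c, lam)
    return 1.0 - (1.0 - g ** a) ** b
```

Under this assumed form, setting a = b = 1 recovers the Weibull CDF and setting b = 1 gives G(x)^a, the exponentiated Weibull, which is consistent with the sub-models the abstract lists.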

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The knowledge of soil water storage (SWS) of soil profiles is crucial for the adoption of vegetation restoration practices. With the aim of identifying representative sites to obtain the mean SWS of a watershed, a time stability analysis of neutron probe evaluations of SWS was performed by means of relative differences and Spearman rank correlation coefficients. At the same time, the effects of different neutron probe calibration procedures were explored on time stability analysis, mean SWS estimation, and preservation of the spatial variability of SWS. The selected watershed, with deep gullies and undulating slopes which cover an area of 20 ha, is characterized by an Ust-Sandiic Entisol and an Aeolian sandy soil. The dominant vegetation species are bunge needlegrass (Stipa bungeana Trin.) and korshinsk peashrub (Caragana korshinskii Kom.). From June 11, 2007 to July 23, 2008, SWS of the top 1 m soil layer was evaluated for 20 dates, based on neutron probe data of 12 sampling sites. Three calibration procedures were employed: type I, most complete, with each site having its own linear calibration equation (TrE); type II, with TrE equations extended over the whole field; and type III, with one single linear calibration curve for the whole field (UnE) and also correcting its intercept based on site specific relative difference analysis (RdE) and on linear fitting of data (RcE), both maintaining the same slope. A strong time stability of SWS estimated by TrE equations was identified. Soil particle size and soil organic matter content were recognized as the influencing factors for spatial variability of SWS. Land use influenced neither the spatial variability nor the time stability of SWS. Time stability analysis identified one site to represent the mean SWS of the whole watershed with mean absolute percentage errors of less than 10%; therefore, this site can be used as a predictor for the mean SWS of the watershed. 
Some equations of type II were found to be unsatisfactory to yield reliable mean SWS values or in preserving the associated soil spatial variability. Hence, it is recommended to be cautious in extending calibration equations to other sites since they might not consider the field variability. For the equations with corrected intercept (type III), which consider the spatial variability of calibration in a different way in relation to TrE, it was found that they can yield satisfactory means and standard deviation of SWS, except for the RdE equations, which largely leveled off the SWS values in the watershed. Correlation analysis showed that the neutron probe calibration was linked to soil bulk density and to organic matter content. Therefore, spatial variability of soil properties should be taken into account during the process of neutron probe calibration. This study provides useful information on the mean SWS observation with a time stable site and on distinct neutron probe calibration procedures, and it should be extended to soil water management studies with neutron probes, e.g., the process of vegetation restoration in wider area and soil types of the Loess Plateau in China. (C) 2009 Elsevier B.V. All rights reserved.
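The relative-difference step of a time stability analysis can be sketched as follows. This is an illustration under assumptions, not the authors' code: it uses the common definition in which the relative difference of site i on date j is (SWS_ij - field mean on date j) / field mean on date j, and it treats the site whose mean relative difference is closest to zero as the most representative. The names `relative_differences` and `most_representative_site` are hypothetical.

```python
from statistics import mean, stdev


def relative_differences(sws):
    """sws[j][i] is the soil water storage on date j at site i.

    Returns one (mean relative difference, standard deviation) pair per site.
    Needs at least two dates so that the standard deviation is defined.
    """
    n_sites = len(sws[0])
    per_site = [[] for _ in range(n_sites)]
    for row in sws:
        field_mean = mean(row)  # spatial mean over all sites on this date
        for i, value in enumerate(row):
            per_site[i].append((value - field_mean) / field_mean)
    return [(mean(d), stdev(d)) for d in per_site]


def most_representative_site(sws):
    """Index of the site whose mean relative difference is closest to zero."""
    stats = relative_differences(sws)
    return min(range(len(stats)), key=lambda i: abs(stats[i][0]))
```

A small standard deviation of the relative differences indicates that a site keeps its rank over time; a site with both a near-zero mean and a small standard deviation is a natural candidate for estimating the watershed mean.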

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Interval-censored survival data, in which the event of interest is not observed exactly but is only known to occur within some time interval, occur very frequently. In some situations, event times might be censored into different, possibly overlapping intervals of variable widths; however, in other situations, information is available for all units at the same observed visit time. In the latter cases, interval-censored data are termed grouped survival data. Here we present alternative approaches for analyzing interval-censored data. We illustrate these techniques using a survival data set involving mango tree lifetimes. This study is an example of grouped survival data.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper proposes a regression model considering the modified Weibull distribution. This distribution can be used to model bathtub-shaped failure rate functions. Assuming censored data, we consider maximum likelihood and Jackknife estimators for the parameters of the model. We derive the appropriate matrices for assessing local influence on the parameter estimates under different perturbation schemes and we also present some ways to perform global influence. Besides, for different parameter settings, sample sizes and censoring percentages, various simulations are performed and the empirical distribution of the modified deviance residual is displayed and compared with the standard normal distribution. These studies suggest that the residual analysis usually performed in normal linear regression models can be straightforwardly extended for a martingale-type residual in log-modified Weibull regression models with censored data. Finally, we analyze a real data set under log-modified Weibull regression models. A diagnostic analysis and a model checking based on the modified deviance residual are performed to select appropriate models. (c) 2008 Elsevier B.V. All rights reserved.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this study, regression models are evaluated for grouped survival data when the effect of censoring time is considered in the model and the regression structure is modeled through four link functions. The methodology for grouped survival data is based on life tables, and the times are grouped in k intervals so that ties are eliminated. Thus, the data modeling is performed by considering the discrete models of lifetime regression. The model parameters are estimated by using the maximum likelihood and jackknife methods. To detect influential observations in the proposed models, diagnostic measures based on case deletion, which are denominated global influence, and influence measures based on small perturbations in the data or in the model, referred to as local influence, are used. In addition to those measures, the local influence and the total influential estimate are also employed. Various simulation studies are performed to compare the performance of the four link functions of the regression models for grouped survival data for different parameter settings, sample sizes and numbers of intervals. Finally, a data set is analyzed by using the proposed regression models. (C) 2010 Elsevier B.V. All rights reserved.
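The life-table idea behind grouped survival data can be illustrated without any regression structure. The sketch below is an assumption-laden simplification, not the models of the paper: it estimates a discrete hazard per interval as events divided by units at risk, and it assumes that units censored in an interval stay at risk throughout that interval and leave before the next one. The function name `life_table` is hypothetical.

```python
def life_table(at_risk_start, events, censored):
    """Discrete-time life table for k grouped intervals.

    events[k] and censored[k] are the counts observed in interval k.
    Returns (hazards, survival), where survival[k] is the estimated
    probability of surviving beyond interval k.
    """
    hazards, survival = [], []
    n, s = at_risk_start, 1.0
    for d, c in zip(events, censored):
        h = d / n if n > 0 else 0.0  # conditional probability of the event
        s *= 1.0 - h                 # product of conditional survivals
        hazards.append(h)
        survival.append(s)
        n -= d + c                   # remove events and censored units
    return hazards, survival
```

In a regression setting such as the one described above, the interval hazard would additionally depend on covariates through a link function; that part is not shown here.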

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A four-parameter extension of the generalized gamma distribution capable of modelling a bathtub-shaped hazard rate function is defined and studied. The beauty and importance of this distribution lies in its ability to model monotone and non-monotone failure rate functions, which are quite common in lifetime data analysis and reliability. The new distribution has a number of well-known lifetime special sub-models, such as the exponentiated Weibull, exponentiated generalized half-normal, exponentiated gamma and generalized Rayleigh, among others. We derive two infinite sum representations for its moments. We calculate the density of the order statistics and two expansions for their moments. The method of maximum likelihood is used for estimating the model parameters and the observed information matrix is obtained. Finally, a real data set from the medical area is analysed.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Joint generalized linear models and double generalized linear models (DGLMs) were designed to model outcomes for which the variability can be explained using factors and/or covariates. When such factors operate, the usual normal regression models, which inherently exhibit constant variance, will under-represent variation in the data and hence may lead to erroneous inferences. For count and proportion data, such noise factors can generate a so-called overdispersion effect, and the use of binomial and Poisson models underestimates the variability and, consequently, incorrectly indicates significant effects. In this manuscript, we propose a DGLM from a Bayesian perspective, focusing on the case of proportion data, where the overdispersion can be modeled using a random effect that depends on some noise factors. The posterior joint density function was sampled using Markov chain Monte Carlo algorithms, allowing inferences over the model parameters. An application to a data set on apple tissue culture is presented, for which it is shown that the Bayesian approach is quite feasible, even when limited prior information is available, thereby generating valuable insight for the researcher about the experimental results.
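Overdispersion in proportion data can be checked with a simple diagnostic before any DGLM is fitted. The sketch below is not the Bayesian model of the abstract; it assumes a single common success probability and computes the Pearson chi-square statistic divided by its degrees of freedom, where values well above 1 suggest more variability than a binomial model allows. The function name `pearson_dispersion` is hypothetical.

```python
def pearson_dispersion(successes, trials):
    """Pearson dispersion for binomial counts under one pooled probability.

    successes[i] out of trials[i] for each experimental unit i.
    Returns chi-square / (number of units - 1).
    """
    p = sum(successes) / sum(trials)  # pooled estimate of the probability
    if not 0.0 < p < 1.0:
        raise ValueError("pooled proportion must lie strictly between 0 and 1")
    chi2 = sum(
        (y - n * p) ** 2 / (n * p * (1.0 - p))
        for y, n in zip(successes, trials)
    )
    return chi2 / (len(successes) - 1)
```

A value near 1 is what a binomial model predicts; standard errors from a plain binomial fit become too small as the statistic grows, which is the problem the abstract describes.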

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Grass reference evapotranspiration (ETo) is an important agrometeorological parameter for climatological and hydrological studies, as well as for irrigation planning and management. There are several methods to estimate ETo, but their performance in different environments is diverse, since all of them have some empirical background. The FAO Penman-Monteith (FAO PM) method has been considered as a universal standard to estimate ETo for more than a decade. This method considers many parameters related to the evapotranspiration process: net radiation (Rn), air temperature (T), vapor pressure deficit (Delta e), and wind speed (U); and has presented very good results when compared to data from lysimeters populated with short grass or alfalfa. In some conditions, the use of the FAO PM method is restricted by the lack of input variables. In these cases, when data are missing, the option is to calculate ETo by the FAO PM method using estimated input variables, as recommended by FAO Irrigation and Drainage Paper 56. Based on that, the objective of this study was to evaluate the performance of the FAO PM method to estimate ETo when Rn, Delta e, and U data are missing, in Southern Ontario, Canada. Other alternative methods were also tested for the region: Priestley-Taylor, Hargreaves, and Thornthwaite. Data from 12 locations across Southern Ontario, Canada, were used to compare ETo estimated by the FAO PM method with a complete data set and with missing data. The alternative ETo equations were also tested and calibrated for each location. When relative humidity (RH) and U data were missing, the FAO PM method was still a very good option for estimating ETo for Southern Ontario, with RMSE smaller than 0.53 mm day(-1). For these cases, U data were replaced by the normal values for the region and Delta e was estimated from temperature data. 
The Priestley-Taylor method was also a good option for estimating ETo when U and Delta e data were missing, mainly when calibrated locally (RMSE = 0.40 mm day(-1)). When Rn was missing, the FAO PM method was not good enough for estimating ETo, with RMSE increasing to 0.79 mm day(-1). When only T data were available, adjusted Hargreaves and modified Thornthwaite methods were better options to estimate ETo than the FAO PM method, since RMSEs from these methods, respectively 0.79 and 0.83 mm day(-1), were significantly smaller than that obtained by FAO PM (RMSE = 1.12 mm day(-1)). (C) 2009 Elsevier B.V. All rights reserved.
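As a rough illustration of a temperature-only estimate and of the RMSE used to compare methods, the sketch below implements the standard Hargreaves equation in the form given in FAO Irrigation and Drainage Paper 56, ETo = 0.0023 (Tmean + 17.8) (Tmax - Tmin)^0.5 Ra, with Ra expressed in mm day(-1) of equivalent evaporation. This is the unadjusted textbook form, not the locally adjusted version evaluated in the study, and the function names are hypothetical.

```python
import math


def hargreaves_eto(tmax, tmin, ra):
    """Standard (unadjusted) Hargreaves reference evapotranspiration, mm/day.

    tmax, tmin: daily maximum and minimum air temperature in degrees Celsius.
    ra: extraterrestrial radiation in mm/day of equivalent evaporation.
    """
    tmean = (tmax + tmin) / 2.0
    return 0.0023 * (tmean + 17.8) * math.sqrt(tmax - tmin) * ra


def rmse(estimated, reference):
    """Root mean square error between two equally long series."""
    squared = [(e - r) ** 2 for e, r in zip(estimated, reference)]
    return math.sqrt(sum(squared) / len(squared))
```

In a comparison like the one described above, `reference` would hold ETo from the FAO PM method with complete data and `estimated` the values from an alternative method or from FAO PM with estimated inputs.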

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Leaf wetness duration (LWD) is a key parameter in agricultural meteorology since it is related to the epidemiology of diseases of many important crops, controlling pathogen infection and development rates. Because LWD is not widely measured, several methods have been developed to estimate it from weather data. Among the models used to estimate LWD, those that use physical principles of dew formation and dew and/or rain evaporation have shown good portability and sufficiently accurate results, but their complexity is a disadvantage for operational use. Alternatively, empirical models have been used despite their limitations. The simplest empirical models use only relative humidity data. The objective of this study was to evaluate the performance of three RH-based empirical models to estimate LWD in four regions around the world that have different climate conditions. Hourly LWD, air temperature, and relative humidity data were obtained from Ames, IA (USA), Elora, Ontario (Canada), Florence, Tuscany (Italy), and Piracicaba, São Paulo State (Brazil). These data were used to evaluate the performance of the following empirical LWD estimation models: constant RH threshold (RH >= 90%); dew point depression (DPD); and extended RH threshold (EXT_RH). Different performance of the models was observed in the four locations. In Ames, Elora and Piracicaba, the RH >= 90% and DPD models underestimated LWD, whereas in Florence these methods overestimated LWD, especially for shorter wet periods. When the EXT_RH model was used, LWD was overestimated for all locations, with a significant increase in the errors. In general, the RH >= 90% model performed best, presenting the highest general fraction of correct estimates (F(C)), between 0.87 and 0.92, and the lowest false alarm ratio (F(AR)), between 0.02 and 0.31. 
The use of specific thresholds for each location improved accuracy of the RH model substantially, even when independent data were used; MAE ranged from 1.23 to 1.89 h, which is very similar to errors obtained with published physical models for LWD estimation. Based on these results, we concluded that, if calibrated locally, LWD can be estimated with acceptable accuracy by RH above a specific threshold, and that the EXT_RH method was unsuitable for estimating LWD at the locations used in this study. (C) 2007 Elsevier B.V. All rights reserved.
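The constant RH threshold model and the two skill scores can be sketched as below. The abstract does not define F(C) or F(AR), so the code assumes the usual contingency-table definitions: fraction correct = (hits + correct negatives) / total hours, and false alarm ratio = false alarms / (hits + false alarms). Function names are hypothetical and the threshold is a parameter so that a locally calibrated value can be used.

```python
def estimate_wet_hours(rh_series, threshold=90.0):
    """Flag an hour as wet when relative humidity (%) reaches the threshold."""
    return [rh >= threshold for rh in rh_series]


def contingency_scores(observed_wet, estimated_wet):
    """Return (fraction correct, false alarm ratio) for hourly wet/dry flags."""
    hits = false_alarms = misses = correct_negatives = 0
    for obs, est in zip(observed_wet, estimated_wet):
        if obs and est:
            hits += 1
        elif est:
            false_alarms += 1
        elif obs:
            misses += 1
        else:
            correct_negatives += 1
    total = hits + false_alarms + misses + correct_negatives
    fraction_correct = (hits + correct_negatives) / total
    predicted_wet = hits + false_alarms
    false_alarm_ratio = false_alarms / predicted_wet if predicted_wet else 0.0
    return fraction_correct, false_alarm_ratio
```

Estimated LWD for a day is then simply the number of flagged hours, for example `sum(estimate_wet_hours(rh_series))`.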

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Leaf wetness duration (LWD) is related to plant disease occurrence and is therefore a key parameter in agrometeorology. As LWD is seldom measured at standard weather stations, it must be estimated in order to ensure the effectiveness of warning systems and the scheduling of chemical disease control. Among the models used to estimate LWD, those that use physical principles of dew formation and dew and/or rain evaporation have shown good portability and sufficiently accurate results for operational use. However, the requirement of net radiation (Rn) is a disadvantage for operational physical models, since this variable is usually not measured over crops or even at standard weather stations. With the objective of proposing a solution for this problem, this study has evaluated the ability of four models to estimate hourly Rn and their impact on LWD estimates using a Penman-Monteith approach. A field experiment was carried out in Elora, Ontario, Canada, with measurements of LWD, Rn and other meteorological variables over mowed turfgrass for a 58 day period during the growing season of 2003. Four models for estimating hourly Rn based on different combinations of incoming solar radiation (Rg), air temperature (T), relative humidity (RH), cloud cover (CC) and cloud height (CH), were evaluated. Measured and estimated hourly Rn values were applied in a Penman-Monteith model to estimate LWD. Correlating measured and estimated Rn, we observed that all models performed well in terms of estimating hourly Rn. However, when cloud data were used the models overestimated positive Rn and underestimated negative Rn. When only Rg and T were used to estimate hourly Rn, the model underestimated positive Rn and no tendency was observed for negative Rn. The best performance was obtained with Model I, which presented, in general, the smallest mean absolute error (MAE) and the highest C-index. 
When measured LWD was compared to the Penman-Monteith LWD, calculated with measured and estimated Rn, few differences were observed. Both precision and accuracy were high, with the slopes of the relationships ranging from 0.96 to 1.02 and R² from 0.85 to 0.92, resulting in C-indices between 0.87 and 0.93. The LWD mean absolute errors associated with Rn estimates were between 1.0 and 1.5 h, which is sufficient for use in plant disease management schemes.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This article presents a statistical model of agricultural yield data based on a set of hierarchical Bayesian models that allows joint modeling of temporal and spatial autocorrelation. This method captures a comprehensive range of the various uncertainties involved in predicting crop insurance premium rates as opposed to the more traditional ad hoc, two-stage methods that are typically based on independent estimation and prediction. A panel data set of county-average yield data was analyzed for 290 counties in the State of Parana (Brazil) for the period of 1990 through 2002. Posterior predictive criteria are used to evaluate different model specifications. This article provides substantial improvements in the statistical and actuarial methods often applied to the calculation of insurance premium rates. These improvements are especially relevant to situations where data are limited.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Allele frequency distributions and population data for 12 Y-chromosomal short tandem repeats (STRs) included in the PowerPlex® Y System (Promega) were obtained for a sample of 200 healthy unrelated males living in São Paulo State (Southeast of Brazil). A total of 192 haplotypes were identified, of which 184 were unique and 8 were found in 2 individuals. The average gene diversity of the 12 Y-STRs was 0.6746 and the haplotype diversity was 0.9996. Pairwise analysis confirmed that our population is more similar to those of Italy, North Portugal and Spain, and more distant from that of Japan. (c) 2007 Elsevier Ireland Ltd. All rights reserved.
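The reported haplotype diversity can be checked from the counts given in the abstract. The sketch below assumes the commonly used unbiased estimator h = n / (n - 1) * (1 - sum of squared haplotype frequencies); the abstract does not say which estimator the authors used, and the function name is hypothetical.

```python
def haplotype_diversity(counts):
    """Unbiased diversity estimate from per-haplotype observation counts."""
    n = sum(counts)
    sum_squared_freq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_squared_freq)


# 184 haplotypes seen once and 8 haplotypes seen twice: 200 males in total.
counts = [1] * 184 + [2] * 8
diversity = haplotype_diversity(counts)
```

With these counts the estimate comes out at about 0.9996, in line with the value stated in the abstract.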

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The Brazilian Network of Food Data Systems (BRASILFOODS) has maintained the Brazilian Food Composition Database-USP (TBCA-USP) (http://www.fcf.usp.br/tabela) since 1998. Besides the constant compilation, analysis and update work in the database, the network tries to innovate through the introduction of food information that may contribute to decrease the risk for non-transmissible chronic diseases, such as the profile of carbohydrates and flavonoids in foods. In 2008, data on carbohydrates, individually analyzed, of 112 foods, and 41 data related to the glycemic response produced by foods widely consumed in the country were included in the TBCA-USP. Data (773) about the different flavonoid subclasses of 197 Brazilian foods were compiled and the quality of each data point was evaluated according to the USDA's data quality evaluation system. In 2007, BRASILFOODS/USP and INFOODS/FAO organized the 7th International Food Data Conference "Food Composition and Biodiversity". This conference was a unique opportunity for interaction between renowned researchers and participants from several countries and it allowed the discussion of aspects that may improve the food composition area. During the period, the LATINFOODS Regional Technical Compilation Committee and BRASILFOODS disseminated to Latin America the Form and Manual for Data Compilation, version 2009, taught a Food Composition Data Compilation course and developed many activities related to data production and compilation. (C) 2010 Elsevier Inc. All rights reserved.