60 resultados para grouping estimators
Resumo:
Nowadays the composting process has shown itself to be an alternative in the treatment of municipal solid wastes by composting plants. However, although more than 50% of the waste generated by the Brazilian population is composed of matter susceptible to organic composting, this process is, still today, insufficiently developed in Brazil, due to low compost quality and lack of investments in the sector. The objective of this work was to use physical analyses to evaluate the quality of the compost produced at 14 operative composting plants in the Sao Paulo State in Brazil. For this purpose, size distribution and total inert content tests were done. The results were analyzed by grouping the plants according to their productive processes: plants with a rotating drum, plants with shredders or mills, and plants without treatment after the sorting conveyor belt. Compost quality was analyzed considering the limits imposed by the Brazilian Legislation and the European standards for inert contents. The size distribution tests showed the influence of the machinery after the sorting conveyer on the granule sizes as well as the inert content, which contributes to the presence of materials that reduce the quality of the final product. (C) 2007 Elsevier Ltd. All rights reserved.
Resumo:
We studied, for the first time, the near-infrared, stellar and baryonic Tully-Fisher relations for a sample of field galaxies taken from a homogeneous Fabry-Perot sample of galaxies [the Gassendi HAlpha survey of SPirals (GHASP) survey]. The main advantage of GHASP over other samples is that the maximum rotational velocities were estimated from 2D velocity fields, avoiding assumptions about the inclination and position angle of the galaxies. By combining these data with 2MASS photometry, optical colours, HI masses and different mass-to-light ratio estimators, we found a slope of 4.48 +/- 0.38 and 3.64 +/- 0.28 for the stellar and baryonic Tully-Fisher relation, respectively. We found that these values do not change significantly when different mass-to-light ratio recipes were used. We also point out, for the first time, that the rising rotation curves as well as asymmetric rotation curves show a larger dispersion in the Tully-Fisher relation than the flat ones or the symmetric ones. Using the baryonic mass and the optical radius of galaxies, we found that the surface baryonic mass density is almost constant for all the galaxies of this sample. In this study we also emphasize the presence of a break in the NIR Tully-Fisher relation at M(H,K) similar to -20 and we confirm that late-type galaxies present higher total-to-baryonic mass ratios than early-type spirals, suggesting that supernova feedback is actually an important issue in late-type spirals. Due to the well-defined sample selection criteria and the homogeneity of the data analysis, the Tully-Fisher relation for GHASP galaxies can be used as a reference for the study of this relation in other environments and at higher redshifts.
Resumo:
Point placement strategies aim at mapping data points represented in higher dimensions to bi-dimensional spaces and are frequently used to visualize relationships amongst data instances. They have been valuable tools for analysis and exploration of data sets of various kinds. Many conventional techniques, however, do not behave well when the number of dimensions is high, such as in the case of documents collections. Later approaches handle that shortcoming, but may cause too much clutter to allow flexible exploration to take place. In this work we present a novel hierarchical point placement technique that is capable of dealing with these problems. While good grouping and separation of data with high similarity is maintained without increasing computation cost, its hierarchical structure lends itself both to exploration in various levels of detail and to handling data in subsets, improving analysis capability and also allowing manipulation of larger data sets.
Resumo:
In this paper we deal with robust inference in heteroscedastic measurement error models Rather than the normal distribution we postulate a Student t distribution for the observed variables Maximum likelihood estimates are computed numerically Consistent estimation of the asymptotic covariance matrices of the maximum likelihood and generalized least squares estimators is also discussed Three test statistics are proposed for testing hypotheses of interest with the asymptotic chi-square distribution which guarantees correct asymptotic significance levels Results of simulations and an application to a real data set are also reported (C) 2009 The Korean Statistical Society Published by Elsevier B V All rights reserved
Resumo:
The south region of Sao Paulo city hosts the Guarapiranga dam, responsible for water supply to 25% of the city population. Their surroundings have been subject to intense and irregular occupation by people from very low socioeconomics classes. Measurements undertaken on sediment and particulate materials in the dam revealed concentrations of lead. copper, zinc and cadmium above internationally accepted limits. Epidemiological and toxicological studies undertaken by the World Health Organization in individuals exhibiting lead concentrations in blood, near or below the maximum recommended (10 mu g dl(-1)), surprisingly revealed that toxic effects are more intense in individuals belonging to low socioeconomics classes. Motivated by these facts, we aimed at the investigation of chronic incorporation of lead. as well as the use of our BIOKINETICS code, which is based on an accepted ICRP biokinetics model for lead, in order to extrapolate the results from teeth to other organs. The focus of our data taking was children from poor families, living in a small, restrict and allegedly contaminated area in Sao Paulo city. Thus, a total of 74 human teeth were collected. The average concentration of lead in teeth of children 5 to 10 years old was determined by means of a high-resolution inductively coupled plasma mass spectrometer (ICP-MS). For standardization of the measurements, an animal bone certified material (H-Animal Bone), from the International Atomic Energy Agency, was analyzed. The amount of lead in children living in the surroundings of the dam, was approximately 40% higher than those from the control region, and the average lead concentration was equal to 1.3 mu g g(-1) approximately. Grouping the results in terms of gender, tooth type and condition, it was concluded that a carious molar of boys is a much more efficient contamination pathway for lead, resulting in concentrations 70% higher than in the control region. We also inferred the average concentrations of lead in other organs of these children, by making use of our BIOKINETIC code. (C) 2008 Elsevier Ltd. All rights reserved.
Resumo:
We consider bipartitions of one-dimensional extended systems whose probability distribution functions describe stationary states of stochastic models. We define estimators of the information shared between the two subsystems. If the correlation length is finite, the estimators stay finite for large system sizes. If the correlation length diverges, so do the estimators. The definition of the estimators is inspired by information theory. We look at several models and compare the behaviors of the estimators in the finite-size scaling limit. Analytical and numerical methods as well as Monte Carlo simulations are used. We show how the finite-size scaling functions change for various phase transitions, including the case where one has conformal invariance.
Resumo:
There has been great interest in deciding whether a combinatorial structure satisfies some property, or in estimating the value of some numerical function associated with this combinatorial structure, by considering only a randomly chosen substructure of sufficiently large, but constant size. These problems are called property testing and parameter testing, where a property or parameter is said to be testable if it can be estimated accurately in this way. The algorithmic appeal is evident, as, conditional on sampling, this leads to reliable constant-time randomized estimators. Our paper addresses property testing and parameter testing for permutations in a subpermutation perspective; more precisely, we investigate permutation properties and parameters that can be well approximated based on a randomly chosen subpermutation of much smaller size. In this context, we use a theory of convergence of permutation sequences developed by the present authors [C. Hoppen, Y. Kohayakawa, C.G. Moreira, R.M. Sampaio, Limits of permutation sequences through permutation regularity, Manuscript, 2010, 34pp.] to characterize testable permutation parameters along the lines of the work of Borgs et al. [C. Borgs, J. Chayes, L Lovasz, V.T. Sos, B. Szegedy, K. Vesztergombi, Graph limits and parameter testing, in: STOC`06: Proceedings of the 38th Annual ACM Symposium on Theory of Computing, ACM, New York, 2006, pp. 261-270.] in the case of graphs. Moreover, we obtain a permutation result in the direction of a famous result of Alon and Shapira [N. Alon, A. Shapira, A characterization of the (natural) graph properties testable with one-sided error, SIAM J. Comput. 37 (6) (2008) 1703-1727.] stating that every hereditary graph property is testable. (C) 2011 Elsevier B.V. All rights reserved.
Resumo:
Mixed models may be defined with or without reference to sampling, and can be used to predict realized random effects, as when estimating the latent values of study subjects measured with response error. When the model is specified without reference to sampling, a simple mixed model includes two random variables, one stemming from an exchangeable distribution of latent values of study subjects and the other, from the study subjects` response error distributions. Positive probabilities are assigned to both potentially realizable responses and artificial responses that are not potentially realizable, resulting in artificial latent values. In contrast, finite population mixed models represent the two-stage process of sampling subjects and measuring their responses, where positive probabilities are only assigned to potentially realizable responses. A comparison of the estimators over the same potentially realizable responses indicates that the optimal linear mixed model estimator (the usual best linear unbiased predictor, BLUP) is often (but not always) more accurate than the comparable finite population mixed model estimator (the FPMM BLUP). We examine a simple example and provide the basis for a broader discussion of the role of conditioning, sampling, and model assumptions in developing inference.
Resumo:
Predictors of random effects are usually based on the popular mixed effects (ME) model developed under the assumption that the sample is obtained from a conceptual infinite population; such predictors are employed even when the actual population is finite. Two alternatives that incorporate the finite nature of the population are obtained from the superpopulation model proposed by Scott and Smith (1969. Estimation in multi-stage surveys. J. Amer. Statist. Assoc. 64, 830-840) or from the finite population mixed model recently proposed by Stanek and Singer (2004. Predicting random effects from finite population clustered samples with response error. J. Amer. Statist. Assoc. 99, 1119-1130). Predictors derived under the latter model with the additional assumptions that all variance components are known and that within-cluster variances are equal have smaller mean squared error (MSE) than the competitors based on either the ME or Scott and Smith`s models. As population variances are rarely known, we propose method of moment estimators to obtain empirical predictors and conduct a simulation study to evaluate their performance. The results suggest that the finite population mixed model empirical predictor is more stable than its competitors since, in terms of MSE, it is either the best or the second best and when second best, its performance lies within acceptable limits. When both cluster and unit intra-class correlation coefficients are very high (e.g., 0.95 or more), the performance of the empirical predictors derived under the three models is similar. (c) 2007 Elsevier B.V. All rights reserved.
Resumo:
We obtain adjustments to the profile likelihood function in Weibull regression models with and without censoring. Specifically, we consider two different modified profile likelihoods: (i) the one proposed by Cox and Reid [Cox, D.R. and Reid, N., 1987, Parameter orthogonality and approximate conditional inference. Journal of the Royal Statistical Society B, 49, 1-39.], and (ii) an approximation to the one proposed by Barndorff-Nielsen [Barndorff-Nielsen, O.E., 1983, On a formula for the distribution of the maximum likelihood estimator. Biometrika, 70, 343-365.], the approximation having been obtained using the results by Fraser and Reid [Fraser, D.A.S. and Reid, N., 1995, Ancillaries and third-order significance. Utilitas Mathematica, 47, 33-53.] and by Fraser et al. [Fraser, D.A.S., Reid, N. and Wu, J., 1999, A simple formula for tail probabilities for frequentist and Bayesian inference. Biometrika, 86, 655-661.]. We focus on point estimation and likelihood ratio tests on the shape parameter in the class of Weibull regression models. We derive some distributional properties of the different maximum likelihood estimators and likelihood ratio tests. The numerical evidence presented in the paper favors the approximation to Barndorff-Nielsen`s adjustment.
Resumo:
In this paper, a novel statistical test is introduced to compare two locally stationary time series. The proposed approach is a Wald test considering time-varying autoregressive modeling and function projections in adequate spaces. The covariance structure of the innovations may be also time- varying. In order to obtain function estimators for the time- varying autoregressive parameters, we consider function expansions in splines and wavelet bases. Simulation studies provide evidence that the proposed test has a good performance. We also assess its usefulness when applied to a financial time series.
Resumo:
This paper deals with asymptotic results on a multivariate ultrastructural errors-in-variables regression model with equation errors Sufficient conditions for attaining consistent estimators for model parameters are presented Asymptotic distributions for the line regression estimators are derived Applications to the elliptical class of distributions with two error assumptions are presented The model generalizes previous results aimed at univariate scenarios (C) 2010 Elsevier Inc All rights reserved
Resumo:
When missing data occur in studies designed to compare the accuracy of diagnostic tests, a common, though naive, practice is to base the comparison of sensitivity, specificity, as well as of positive and negative predictive values on some subset of the data that fits into methods implemented in standard statistical packages. Such methods are usually valid only under the strong missing completely at random (MCAR) assumption and may generate biased and less precise estimates. We review some models that use the dependence structure of the completely observed cases to incorporate the information of the partially categorized observations into the analysis and show how they may be fitted via a two-stage hybrid process involving maximum likelihood in the first stage and weighted least squares in the second. We indicate how computational subroutines written in R may be used to fit the proposed models and illustrate the different analysis strategies with observational data collected to compare the accuracy of three distinct non-invasive diagnostic methods for endometriosis. The results indicate that even when the MCAR assumption is plausible, the naive partial analyses should be avoided.
Resumo:
We review some issues related to the implications of different missing data mechanisms on statistical inference for contingency tables and consider simulation studies to compare the results obtained under such models to those where the units with missing data are disregarded. We confirm that although, in general, analyses under the correct missing at random and missing completely at random models are more efficient even for small sample sizes, there are exceptions where they may not improve the results obtained by ignoring the partially classified data. We show that under the missing not at random (MNAR) model, estimates on the boundary of the parameter space as well as lack of identifiability of the parameters of saturated models may be associated with undesirable asymptotic properties of maximum likelihood estimators and likelihood ratio tests; even in standard cases the bias of the estimators may be low only for very large samples. We also show that the probability of a boundary solution obtained under the correct MNAR model may be large even for large samples and that, consequently, we may not always conclude that a MNAR model is misspecified because the estimate is on the boundary of the parameter space.
Resumo:
In chemical analyses performed by laboratories, one faces the problem of determining the concentration of a chemical element in a sample. In practice, one deals with the problem using the so-called linear calibration model, which considers that the errors associated with the independent variables are negligible compared with the former variable. In this work, a new linear calibration model is proposed assuming that the independent variables are subject to heteroscedastic measurement errors. A simulation study is carried out in order to verify some properties of the estimators derived for the new model and it is also considered the usual calibration model to compare it with the new approach. Three applications are considered to verify the performance of the new approach. Copyright (C) 2010 John Wiley & Sons, Ltd.