998 results for Statistical Convergence
Abstract:
Efficient automatic protein classification is of central importance in genomic annotation. As an independent way to check the reliability of the classification, we propose a statistical approach to test whether two sets of protein domain sequences coming from two families of the Pfam database are significantly different. We model protein sequences as realizations of Variable Length Markov Chains (VLMC) and we use the context trees as a signature of each protein family. Our approach is based on a Kolmogorov-Smirnov-type goodness-of-fit test proposed by Balding et al. [Limit theorems for sequences of random trees (2008), DOI: 10.1007/s11749-008-0092-z]. The test statistic is a supremum over the space of trees of a function of the two samples; its computation grows, in principle, exponentially fast with the maximal number of nodes of the potential trees. We show how to transform this problem into a max-flow problem over a related graph, which can be solved using a Ford-Fulkerson algorithm in time polynomial in that number. We apply the test to 10 randomly chosen protein domain families from the seed of the Pfam-A database (high quality, manually curated families). The test shows that the distributions of context trees coming from different families are significantly different. We emphasize that this is a novel mathematical approach to validate the automatic clustering of sequences in any context. We also study the performance of the test via simulations on Galton-Watson related processes.
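The abstract above reduces the test statistic to a max-flow problem solved with a Ford-Fulkerson algorithm. The abstract does not describe how the related graph is built, so the following is only a minimal sketch of the generic max-flow step, using the Edmonds-Karp variant of Ford-Fulkerson (breadth-first augmenting paths). The function name `max_flow` and the dict-of-dicts graph layout are assumptions made for illustration, not the authors' implementation.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Return the maximum flow value from source to sink.

    capacity[u][v] is the capacity of the directed edge u -> v.
    Edmonds-Karp: Ford-Fulkerson with shortest augmenting paths (BFS).
    """
    if source == sink:
        raise ValueError("source and sink must differ")

    # Residual graph: forward edges plus reverse edges starting at 0.
    residual = {source: {}, sink: {}}
    for u, edges in capacity.items():
        for v, cap in edges.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + cap
            residual[v].setdefault(u, 0)

    total = 0
    while True:
        # Breadth-first search for an augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path is left

        # Bottleneck capacity along the path found.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            u = parent[v]
            bottleneck = min(bottleneck, residual[u][v])
            v = u

        # Push the bottleneck flow and update reverse edges.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck
```

Breadth-first search bounds the number of augmentations polynomially in the graph size, which is the property the abstract relies on when it contrasts the max-flow formulation with the exponential search over trees.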
Abstract:
The Community Climate Model (CCM3) from the National Center for Atmospheric Research (NCAR) is used to investigate the effect of South Atlantic sea surface temperature (SST) anomalies on the interannual to decadal variability of South American precipitation. Two ensembles composed of multidecadal simulations forced with monthly SST data from the Hadley Centre for the period 1949 to 2001 are analysed. A statistical treatment based on signal-to-noise ratio and Empirical Orthogonal Functions (EOF) is applied to the ensembles in order to reduce the internal variability among the integrations. The ensemble treatment shows a spatial and temporal dependence of reproducibility. A high degree of reproducibility is found in the tropics, while the extratropics are apparently less reproducible. Austral autumn (MAM) and spring (SON) precipitation appears to be more reproducible over the South America-South Atlantic region than the summer (DJF) and winter (JJA) rainfall. While the Inter-tropical Convergence Zone (ITCZ) region is dominated by external variance, the South Atlantic Convergence Zone (SACZ) over South America is predominantly determined by internal variance, which makes it a difficult phenomenon to predict. In contrast, the SACZ over the western South Atlantic appears to be more sensitive to the subtropical SST anomalies than over the continent. An attempt is made to separate the atmospheric response forced by the South Atlantic SST anomalies from that associated with the El Niño-Southern Oscillation (ENSO). Results show that both the South Atlantic and Pacific SSTs modulate the intensity and position of the SACZ during DJF. In particular, the subtropical South Atlantic SSTs are more important than ENSO in determining the position of the SACZ over the southeast Brazilian coast during DJF. On the other hand, the ENSO signal seems to influence the intensity of the SACZ not only in DJF but also in MAM, especially over its oceanic branch.
Both local and remote influences, however, are confounded by the large internal variance in the region. During MAM and JJA, the South Atlantic SST anomalies affect the magnitude and the meridional displacement of the ITCZ. In JJA, the ENSO has relatively little influence on the interannual variability of the simulated rainfall. During SON, however, the ENSO seems to counteract the effect of the subtropical South Atlantic SST variations on convection over South America.
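The abstract above contrasts external (SST-forced) and internal variance through a signal-to-noise ratio. The abstract does not give the exact estimator, so the sketch below assumes one common definition: external variance is the variance over time of the ensemble mean, and internal variance is the time average of the spread of the members about that mean. The function name and the `ensemble[member][time]` layout are hypothetical.

```python
from statistics import mean, pvariance


def signal_to_noise(ensemble):
    """Ratio of external to internal variance at one grid point.

    ensemble[m][t] is the value simulated by member m at time t.
    Assumed definition, not necessarily the one used in the study.
    """
    n_times = len(ensemble[0])
    # Ensemble mean at each time step (the forced signal).
    ens_mean = [mean(member[t] for member in ensemble) for t in range(n_times)]
    # External variance: variability of the ensemble mean over time.
    external = pvariance(ens_mean)
    # Internal variance: average spread of the members around the mean.
    internal = mean(
        pvariance([member[t] for member in ensemble]) for t in range(n_times)
    )
    # Raises ZeroDivisionError if all members are identical at every time.
    return external / internal
```

Under this definition, a ratio well above 1 would correspond to the highly reproducible regions described above, and a ratio below 1 to regions dominated by internal variance.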
Abstract:
This article aims to contribute to the reflection on educational statistics as a source for research on the history of education. The main concern was to reveal how the educational statistics for the period from 1871 to 1931 were produced by the central government. Official reports from the General Statistics Directory and the statistical yearbooks released by that department were analyzed in search of the recommendations and definitions that guided the work. By problematizing the documentary issues surrounding educational statistics and their usual interpretations, the intention was to reduce ignorance about the origin of the school figures, which are occasionally used in current research without appropriate critical examination.
Abstract:
This study presents the results for a mature landfill leachate treated by a homogeneous catalytic ozonation process with Fe²⁺ and Fe³⁺ ions at acidic pH. Quality assessments were performed using Taguchi's method (L8 design). Statistical analysis showed strong synergism between molecular ozone and ferric ions, pointing to their catalytic effect on hydroxyl radical (•OH) generation. Higher organic matter removal rates require an ozone flow of 5 L h⁻¹ (590 mg h⁻¹ O₃) and a ferric ion concentration of 5 mg L⁻¹.
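The L8 design mentioned above is the two-level orthogonal array with 8 runs and up to 7 columns. The sketch below builds the usual L8 layout from three basis columns and their XOR combinations, and checks orthogonality. Which factors the study assigned to which columns is not stated in the abstract, so no factor names are attached here; both function names are hypothetical.

```python
from itertools import combinations


def taguchi_l8():
    """Return the L8 array as 8 runs x 7 columns with levels 1 and 2."""
    runs = []
    for i in range(8):
        # Three basis columns taken from the bits of the run index.
        a, b, c = (i >> 2) & 1, (i >> 1) & 1, i & 1
        # Remaining columns are XOR combinations (interaction columns).
        bits = [a, b, a ^ b, c, a ^ c, b ^ c, a ^ b ^ c]
        runs.append([bit + 1 for bit in bits])
    return runs


def is_orthogonal(array):
    """True if every pair of columns shows each level pair equally often."""
    n_cols = len(array[0])
    for i, j in combinations(range(n_cols), 2):
        counts = {}
        for row in array:
            key = (row[i], row[j])
            counts[key] = counts.get(key, 0) + 1
        if len(counts) != 4 or len(set(counts.values())) != 1:
            return False
    return True
```

Each row is one experimental run; a study with fewer than seven factors would use only some of the columns and leave the others for interactions or error estimation.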
Abstract:
This work presents a statistical study on the variability of the mechanical properties of hardened self-compacting concrete, including the compressive strength, splitting tensile strength and modulus of elasticity. Comparing the experimental results with those derived from several codes and recommendations makes it possible to evaluate whether the hardened behaviour of self-compacting concrete can be appropriately predicted by the existing formulations. The variables analyzed include the maximum aggregate size and the paste and gravel content. The self-compacting concretes analyzed showed variability measures in the same range as that expected for conventional vibrated concrete, with all the results within a confidence level of 95%. Among the several formulations for conventional concrete considered in this study, it was observed that a safe estimate of the modulus of elasticity can be obtained from the value of compressive strength, with lower strength self-compacting concretes presenting higher safety margins. However, most codes overestimate the tensile strength of the material. (C) 2010 Elsevier Ltd. All rights reserved.
Abstract:
This paper presents an Adaptive Maximum Entropy (AME) approach for modeling biological species. The Maximum Entropy algorithm (MaxEnt) is one of the most widely used methods for modeling the geographical distribution of biological species. The approach presented here is an alternative to the classical algorithm. Instead of using the same set of features throughout training, the AME approach tries to insert or remove a single feature at each iteration. The aim is to reach convergence faster without affecting the performance of the generated models. The preliminary experiments performed well, showing improvements in both accuracy and execution time. Comparisons with other algorithms are beyond the scope of this paper. Some important research directions are proposed as future work.
Abstract:
This paper analyzes the convergence of the constant modulus algorithm (CMA) in a decision feedback equalizer using only a feedback filter. Several works have already observed that the CMA performs better than the decision-directed algorithm in the adaptation of the decision feedback equalizer, but theoretical analysis has always proved difficult, especially due to the analytical difficulties presented by the constant modulus criterion. In this paper, we surmount this obstacle by using a recent result concerning CM analysis, first obtained in a linear finite impulse response context with the objective of comparing its solutions to the ones obtained through the Wiener criterion. The theoretical analysis presented here confirms the robustness of the CMA when applied to the adaptation of the decision feedback equalizer and also defines a class of channels for which the algorithm will suffer from ill-convergence when initialized at the origin.
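For readers unfamiliar with the CMA, the sketch below shows one stochastic-gradient update of the textbook constant modulus cost E[(|y|² - R)²] for a generic transversal filter. It is an illustration only: the abstract does not specify the regressor of the feedback filter, so `inputs` is a hypothetical placeholder for whatever samples feed that filter, and `cma_step`, `step_size` and `dispersion` are names introduced here.

```python
def cma_step(weights, inputs, step_size, dispersion=1.0):
    """One CMA update. Returns (new_weights, output).

    Assumes the convention y = sum_k w_k * x_k and the cost
    (|y|^2 - R)^2, with R given by `dispersion`.
    """
    # Filter output for the current regressor.
    output = sum(w * x for w, x in zip(weights, inputs))
    # Constant modulus error term: (|y|^2 - R) * y.
    error = (abs(output) ** 2 - dispersion) * output
    # Gradient step: w_k <- w_k - mu * error * conj(x_k).
    new_weights = [
        w - step_size * error * complex(x).conjugate()
        for w, x in zip(weights, inputs)
    ]
    return new_weights, output
```

The update vanishes whenever the output is zero, so an all-zero weight vector applied to a fixed regressor never moves. This is one reason initialization matters for CMA in general; the specific ill-convergence class identified in the paper is not reproduced here.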
Abstract:
Although the formulation of the nonlinear theory of H∞ control has been well developed, solving the Hamilton-Jacobi-Isaacs equation remains a challenge and is the major bottleneck for practical application of the theory. Several numerical methods have been proposed for its solution. In this paper, results on convergence and stability for a successive Galerkin approximation approach for nonlinear H∞ control via output feedback are presented. An example illustrating the application of the algorithm is presented.
Abstract:
The objective of this project was to evaluate protein extraction from soybean flour in dairy whey using a multivariate statistical method with a 2^3 factorial design. The influence of three variables was considered: temperature, pH and percentage of sodium chloride, against the process-specific response variable (percentage of protein extraction). In the study of protein extraction against time and temperature, the treatments at 80 °C for 2 h gave the highest total protein values (5.99%). The percentage of protein extraction increased with heating time. The maximum of the function that represents the protein extraction was therefore analysed with the 2^3 factorial experiment. The results showed that all the variables were important to extraction. The statistical analysis showed that pH, temperature and percentage of sodium chloride were not sufficient to describe the extraction process, since it was not possible to obtain the inflection point of the mathematical function; on the other hand, the mathematical model was significant as well as predictive.
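A 2^3 factorial experiment runs every combination of low and high levels of three factors (here temperature, pH and sodium chloride percentage). The following is a minimal sketch of how such a design and its main effects can be computed; the function names are hypothetical and no data from the study are used.

```python
from itertools import product


def full_factorial(n_factors):
    """All runs of a two-level design, coded -1 (low) and +1 (high)."""
    return [list(levels) for levels in product((-1, 1), repeat=n_factors)]


def main_effects(design, responses):
    """Main effect per factor: mean response at +1 minus mean at -1."""
    effects = []
    for k in range(len(design[0])):
        high = [y for run, y in zip(design, responses) if run[k] == 1]
        low = [y for run, y in zip(design, responses) if run[k] == -1]
        effects.append(sum(high) / len(high) - sum(low) / len(low))
    return effects
```

With `full_factorial(3)` the design has 8 runs. A two-level design can only estimate linear effects and interactions, which is consistent with the abstract's remark that the turning point of the response function could not be located from these factors alone.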
Abstract:
We define a new type of self-similarity for one-parameter families of stochastic processes, which applies to certain important families of processes that are not self-similar in the conventional sense. This includes Hougaard Lévy processes such as the Poisson processes, Brownian motions with drift and the inverse Gaussian processes, and some new fractional Hougaard motions defined as moving averages of Hougaard Lévy processes. Such families have many properties in common with ordinary self-similar processes, including the form of their covariance functions, and the fact that they appear as limits in a Lamperti-type limit theorem for families of stochastic processes.
Abstract:
This article considers alternative methods to calculate the fair premium rate of crop insurance contracts based on county yields. The premium rate was calculated using parametric and nonparametric approaches to estimate the conditional agricultural yield density. These methods were applied to a data set of county yields provided by the Brazilian Institute of Geography and Statistics (IBGE), for the period from 1990 through 2002, for soybean, corn and wheat, in the State of Paraná. In this article, we propose methodological alternatives for pricing crop insurance contracts that result in more accurate premium rates in a situation of limited data.
Abstract:
Hydrodynamic studies were conducted in a semi-cylindrical spouted bed column of diameter 150 mm, height 1000 mm, conical base included angle of 60° and inlet orifice diameter 25 mm. Pressure transducers at several axial positions were used to obtain pressure fluctuation time series with 1.2 and 2.4 mm glass beads at U/U_ms from 0.3 to 1.6, and static bed depths from 150 to 600 mm. The conditions covered several flow regimes (fixed bed, incipient spouting, stable spouting, pulsating spouting, slugging, bubble spouting and fluidization). Images of the system dynamics were also acquired through the transparent walls with a digital camera. The data were analyzed via statistical, mutual information theory, spectral and Hurst's Rescaled Range methods to assess the potential of these methods to characterize the spouting quality. The results indicate that these methods have potential for monitoring spouted bed operation.
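Hurst's Rescaled Range (R/S) method, one of the analyses listed above, can be sketched as follows. This is a minimal version under one common convention (population standard deviation, non-overlapping windows, least-squares slope on a log-log plot); the abstract does not state which variant or window sizes were used, so `window_sizes` and both function names are assumptions.

```python
from math import log, sqrt


def rescaled_range(series):
    """R/S statistic of a single window of samples."""
    n = len(series)
    mean = sum(series) / n
    # Cumulative sum of deviations from the window mean.
    cumulative, running = [], 0.0
    for value in series:
        running += value - mean
        cumulative.append(running)
    value_range = max(cumulative) - min(cumulative)
    std = sqrt(sum((value - mean) ** 2 for value in series) / n)
    # Raises ZeroDivisionError for a constant window (std == 0).
    return value_range / std


def hurst_exponent(series, window_sizes):
    """Slope of log(mean R/S) against log(window size)."""
    xs, ys = [], []
    for size in window_sizes:
        # Split the series into non-overlapping windows of this size.
        windows = [
            series[i:i + size]
            for i in range(0, len(series) - size + 1, size)
        ]
        mean_rs = sum(rescaled_range(w) for w in windows) / len(windows)
        xs.append(log(size))
        ys.append(log(mean_rs))
    # Ordinary least-squares slope (needs at least two window sizes).
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator
```

An exponent near 0.5 indicates uncorrelated fluctuations and values closer to 1 indicate persistent ones; a pure linear trend gives a value close to 1 with this estimator.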
Abstract:
Three main models of parameter setting have been proposed: the Variational model proposed by Yang (2002; 2004), the Structured Acquisition model endorsed by Baker (2001; 2005), and the Very Early Parameter Setting (VEPS) model advanced by Wexler (1998). The VEPS model contends that parameters are set early. The Variational model supposes that children employ statistical learning mechanisms to decide among competing parameter values, so this model anticipates delays in parameter setting when critical input is sparse, and gradual setting of parameters. On the Structured Acquisition model, delays occur because parameters form a hierarchy, with higher-level parameters set before lower-level parameters. Assuming that children freely choose the initial value, children will sometimes mis-set parameters. However, when that happens, the input is expected to trigger a precipitous rise in one parameter value and a corresponding decline in the other value. We will point to the kind of child language data that is needed in order to adjudicate among these competing models.
Abstract:
OBJECTIVE: To describe variation in all-cause and selected cause-specific mortality rates across Australia. METHODS: Mortality and population data for 1997 were obtained from the Australian Bureau of Statistics. All-cause and selected cause-specific mortality rates were calculated and directly standardised to the 1997 Australian population in 5-year age groups. Selected major causes of death included cancer, coronary artery disease, cerebrovascular disease, diabetes, accidents and suicide. Rates are reported by statistical division, and State and Territory. RESULTS: All-cause age-standardised mortality was 6.98 per 1000 in 1997 and this varied 2-fold from a low in the statistical division of Pilbara, Western Australia (5.78, 95% confidence interval 5.06-6.56), to a high in Northern Territory-excluding Darwin (11.30, 10.67-11.98). Similar mortality variation (all p<0.0001) exists for cancer (1.01-2.23 per 1000) and coronary artery disease (0.99-2.23 per 1000), the two biggest killers. Larger variation (all p<0.0001) exists for cerebrovascular disease (0.7-11.8 per 10,000), diabetes (0.7-6.9 per 10,000), accidents (1.7-7.2 per 10,000) and suicide (0.6-3.8 per 10,000). Less marked variation was observed when analysed by State and Territory, but Northern Territory consistently has the highest age-standardised mortality rates. CONCLUSIONS: Analysed by statistical division, substantial mortality gradients exist across Australia, suggesting an inequitable distribution of the determinants of health. Further research is required to better understand this heterogeneity.
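Direct standardisation, as used in the METHODS above, weights each age-specific rate by the share of that age group in a standard population. The sketch below assumes parallel lists with one entry per age group; the function name is hypothetical and no figures from the study are used.

```python
def directly_standardised_rate(deaths, population, standard_population):
    """Directly age-standardised mortality rate.

    deaths[i] and population[i] describe age group i in the study area;
    standard_population[i] is the size of the same age group in the
    standard population.
    """
    total_standard = sum(standard_population)
    rate = 0.0
    for d, n, s in zip(deaths, population, standard_population):
        # Age-specific rate weighted by the standard population share.
        rate += (d / n) * (s / total_standard)
    return rate
```

Multiplying the result by 1000 or 10,000 gives rates in the units reported above. When the standard population has the same age structure as the study area, the result equals the crude rate.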
Abstract:
The increase in women's purchasing power has led some companies to adopt product differentiation strategies and to produce products aimed specifically at female consumers. The auto industry is not immune to this phenomenon, since women account for approximately half of automobile sales in the country. Considering the differences in consumption and behavior between women and men, the following question was posed: are there differences between the choices men associate with the automobile and the choices women associate with it? Participants were presented with items found in people's daily lives and valued by them, and were asked to choose items and associate them with the automobile. Analysis of the results revealed more similarities than differences between the choices associated with the automobile by men and by women. The similarity between the choices suggests that the representations, meanings and values assigned to the car by men and women are similar, and thus that the strategy of product differentiation does not apply to the automotive industry.