21 results for Multivariate statistical methods

in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo


Relevance: 100.00%

Abstract:

Concentrations of 39 organic compounds were determined in three fractions (head, heart and tail) obtained from the pot still distillation of fermented sugarcane juice. The results were evaluated using analysis of variance (ANOVA), Tukey's test, principal component analysis (PCA), hierarchical cluster analysis (HCA) and linear discriminant analysis (LDA). According to PCA and HCA, the experimental data form three clusters. The head fractions give rise to a more clearly defined group, whereas the heart and tail fractions show some overlap, consistent with their acid composition. The predictive abilities of the LDA model for classifying the three fractions were 90.5% in calibration and 100% in validation. This model recognized as heart twelve of the thirteen commercial cachaças (92.3%) with good sensory characteristics, thus showing potential for guiding the process of cuts.

Relevance: 100.00%

Abstract:

The concentrations of 39 organic compounds were determined in three fractions (head, heart and tail) obtained from the pot still distillation of fermented sugarcane juice. The results were evaluated using analysis of variance (ANOVA), Tukey's test, principal component analysis (PCA), hierarchical cluster analysis (HCA) and linear discriminant analysis (LDA). According to PCA and HCA, the experimental data lead to the formation of three clusters. The head fractions gave rise to a more clearly defined group. The heart and tail fractions showed some overlap, consistent with their acid composition. The predictive abilities in calibration and validation of the models generated by LDA for classifying the three fractions were 90.5% and 100%, respectively. This model recognized as heart twelve of thirteen commercial cachaças (92.3%) with good sensory characteristics, showing potential for guiding the process of cuts.

Relevance: 90.00%

Abstract:

The use of statistical methods to analyze large databases of text has been useful in unveiling patterns of human behavior and establishing historical links between cultures and languages. In this study, we identified literary movements by treating books published from 1590 to 1922 as complex networks, whose metrics were analyzed with multivariate techniques to generate six clusters of books. The latter correspond to time periods coinciding with relevant literary movements over the last five centuries. The most important factor contributing to the distinctions between different literary styles was the average shortest path length, in particular the asymmetry of its distribution. Furthermore, over time there has emerged a trend toward larger average shortest path lengths, which is correlated with increased syntactic complexity, and a more uniform use of the words reflected in a smaller power-law coefficient for the distribution of word frequency. Changes in literary style were also found to be driven by opposition to earlier writing styles, as revealed by the analysis performed with geometrical concepts. The approaches adopted here are generic and may be extended to analyze a number of features of languages and cultures.
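The central metric in this abstract, the average shortest path length of a text network, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the networks were built, so the word-adjacency construction in the hypothetical `build_word_network` (each word linked to the word that follows it, undirected, no preprocessing) is an assumption, as are both function names.

```python
from collections import defaultdict, deque


def build_word_network(text):
    """Assumed construction: link each word to the next one (undirected)."""
    words = text.lower().split()
    graph = defaultdict(set)
    for left, right in zip(words, words[1:]):
        if left != right:  # skip self-loops from immediately repeated words
            graph[left].add(right)
            graph[right].add(left)
    return graph


def average_shortest_path_length(graph):
    """Mean breadth-first-search distance over all ordered pairs of connected nodes."""
    total, pairs = 0, 0
    for source in list(graph):
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        total += sum(distance.values())
        pairs += len(distance) - 1  # every reached node except the source
    return total / pairs if pairs else 0.0
```

For a book, one value of this metric (together with the shape of the distribution of per-node path lengths) would become one feature in the multivariate analysis the abstract describes.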

Relevance: 90.00%

Abstract:

Background: The genetic mechanisms underlying interindividual blood pressure variation reflect the complex interplay of both genetic and environmental variables. The current standard statistical methods for detecting genes involved in the regulation mechanisms of complex traits are based on univariate analysis. Few studies have focused on the search for and understanding of quantitative trait loci responsible for gene × environment interactions or on multiple-trait analysis. Composite interval mapping has been extended to multiple traits and may be an interesting approach to such a problem. Methods: We used multiple-trait analysis for quantitative trait locus mapping of loci having different effects on systolic blood pressure with NaCl exposure. The animals studied were 188 rats, the progeny of an F2 rat intercross between a hypertensive and a normotensive strain, genotyped at 179 polymorphic markers across the rat genome. To accommodate the correlational structure of measurements taken in the same animals, we applied univariate and multivariate strategies for analyzing the data. Results: We detected a new quantitative trait locus in a region close to marker R589 on chromosome 5 of the rat genome, not previously identified through serial analysis of individual traits. In addition, we were able to justify analytically the parametric restrictions in terms of regression coefficients responsible for the gain in precision with the adopted analytical approach. Conclusion: Future work should focus on fine mapping and the identification of the causative variant responsible for this quantitative trait locus signal. The multivariable strategy might be valuable in the study of genetic determinants of interindividual variation of antihypertensive drug effectiveness.

Relevance: 90.00%

Abstract:

Background: Patients under haemodialysis are considered at high risk of acquiring hepatitis B virus (HBV) infection. Since few data are reported from Brazil, our aim was to assess the frequency of and risk factors for HBV infection in haemodialysis patients from 22 dialysis centres in Santa Catarina State, southern Brazil. Methods: This study includes 813 patients, 149 haemodialysis workers and 772 healthy controls matched by sex and age. Serum samples were assayed for HBV markers and viraemia was detected by nested PCR. HBV was genotyped by partial S gene sequencing. Univariate and multivariate statistical analyses with stepwise logistic regression were carried out to analyse the relationship between HBV infection and the characteristics of patients and their dialysis units. Results: The frequency of HBV infection was 10.0%, 2.7% and 2.7% among patients, haemodialysis workers and controls, respectively. Among patients, the most frequent HBV genotypes were A (30.6%), D (57.1%) and F (12.2%). Univariate analysis showed an association between HBV infection and total time in haemodialysis, type of dialysis equipment, hygiene and sterilization of equipment, number of times the dialysis lines and filters were reused, number of patients per care worker and current HCV infection. The logistic regression model showed that total time in haemodialysis, number of times the dialysis lines and filters were reused, and number of patients per worker were significantly related to HBV infection. Conclusions: The frequency of HBV infection among haemodialysis patients in Santa Catarina State is very high. The most frequent HBV genotypes were A, D and F. The risk of a patient becoming HBV positive increases 1.47 times with each month of haemodialysis; 1.96 times if the dialysis unit reuses the lines and filters ≥ 10 times compared with units that reuse them < 10 times; and 3.42 times if the number of patients per worker is more than five. Sequence similarity among the HBV S gene from isolates of different patients pointed to nosocomial transmission.

Relevance: 80.00%

Abstract:

The structure of the intertidal and subtidal benthic macrofauna in the northeastern region of Todos os Santos Bay (TSB), northeast Brazil, was investigated over a period of two years. Relationships with environmental parameters were studied through uni- and multivariate statistical analyses, and the main distributional patterns were shown to be especially related to sediment type and content of organic fractions (carbon, nitrogen, phosphorus), on both temporal and spatial scales. Polychaete annelids accounted for more than 70% of the total fauna and showed low densities, species richness and diversity, except for the area situated on the reef banks. These banks constitute a peculiar environment in relation to the rest of the region, having coarse sediments poor in organic matter and rich in biodetritic carbonates, as well as an abundant and diverse fauna. The intertidal region and the shallower area nearer to the oil refinery RLAM, with sediments composed mainly of fine sand, seem to constitute an unstable system with few highly dominant species, such as Armandia polyophthalma and Laeonereis acuta. In the other regions of TSB, where muddy bottoms predominated, densities and diversity were low, especially at the stations near the refinery. Here the lowest values of the biological indicators occurred together with the highest organic compound content. In addition, the nearest sites (stations 4 and 7) were sometimes azoic. The adjacent Caboto, initially considered a control area, presented low density but intermediate values of species diversity, which indicates a less disturbed environment in relation to the pelitic infralittoral in front of the refinery. The ordination analyses revealed five homogeneous groups of stations (intertidal; reef banks; pelitic infralittoral; mixed sediments; Caboto) with different specific patterns, a fact which seems to be mainly related to granulometry and the chemical characteristics of the sediment.

Relevance: 80.00%

Abstract:

The Gumbel distribution is perhaps the most widely applied statistical distribution for problems in engineering. We propose a generalization, referred to as the Kumaraswamy Gumbel distribution, and provide a comprehensive treatment of its structural properties. We obtain the analytical shapes of the density and hazard rate functions. We calculate explicit expressions for the moments and generating function. The variation of the skewness and kurtosis measures is examined and the asymptotic distribution of the extreme values is investigated. Explicit expressions are also derived for the moments of order statistics. The methods of maximum likelihood and parametric bootstrap and a Bayesian procedure are proposed for estimating the model parameters. We obtain the expected information matrix. An application of the new model to a real dataset illustrates its potential. Two bivariate generalizations of the model are proposed.

Relevance: 80.00%

Abstract:

Mesoclemmys heliostemma (Testudines: Chelidae) was described based on five vouchered specimens and nine live specimens from the western Amazon basin. Some authors questioned its status as a valid species, suggesting that it represents a junior synonym of M. raniceps. Here, we report on eight additional specimens from eastern Peru and northern Brazil, and provide descriptive statistics of morphological characters for hatchlings, juveniles, and adults of M. heliostemma, M. raniceps, and M. gibba. We also test for group differences through univariate and multivariate statistical analyses, and discuss some advantages of this methodology. Our data suggest that all three taxa are morphologically divergent, and that M. heliostemma is a valid species.

Relevance: 80.00%

Abstract:

Many discussions have enlarged the literature in bibliometrics since Hirsch's proposal of the so-called h-index. Ranking papers according to their citations, this index characterizes a researcher solely by the largest number h of his or her papers that are each cited at least h times. A closed formula for the h-index distribution that can be applied to distinct databases is not yet known. In fact, obtaining such a distribution requires knowledge of the citation distribution of the authors and its specificities. Instead of dealing with randomly chosen researchers, here we address different groups based on distinct databases. The first group is composed of physicists and biologists, with data extracted from the Institute for Scientific Information (ISI). The second group is composed of computer scientists, with data extracted from Google Scholar. In this paper, we obtain a general formula for the h-index probability density function (pdf) for groups of authors by using generalized exponentials in the context of escort probability. Our analysis includes the use of several statistical methods to estimate the necessary parameters. An exhaustive comparison among the possible candidate distributions is also used to describe the way the citations are distributed among authors. The h-index pdf should be used to classify groups of researchers from a quantitative point of view, which is of interest for eliminating obscure qualitative methods. (C) 2011 Elsevier B.V. All rights reserved.
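The h-index definition used in this abstract can be made concrete with a short sketch. The function name `h_index` and the citation counts in the usage note are illustrative assumptions; the paper's formula for the distribution of h across groups of authors is not reproduced here.

```python
def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)  # most-cited paper first
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h
```

For example, a hypothetical author whose papers have 10, 8, 5, 4 and 3 citations has four papers with at least four citations each, but not five with at least five, so `h_index([10, 8, 5, 4, 3])` returns 4.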

Relevance: 80.00%

Abstract:

The realization that statistical physics methods can be applied to analyze written texts represented as complex networks has led to several developments in natural language processing, including automatic summarization and evaluation of machine translation. Most importantly, so far only a few metrics of complex networks have been used and therefore there is ample opportunity to enhance the statistics-based methods as new measures of network topology and dynamics are created. In this paper, we employ for the first time the metrics betweenness, vulnerability and diversity to analyze written texts in Brazilian Portuguese. Using strategies based on diversity metrics, a better performance in automatic summarization is achieved in comparison to previous work employing complex networks. With an optimized method the Rouge score (an automatic evaluation method used in summarization) was 0.5089, which is the best value ever achieved for an extractive summarizer with statistical methods based on complex networks for Brazilian Portuguese. Furthermore, the diversity metric can detect keywords with high precision, which is why we believe it is suitable to produce good summaries. It is also shown that incorporating linguistic knowledge through a syntactic parser does enhance the performance of the automatic summarizers, as expected, but the increase in the Rouge score is only minor. These results reinforce the suitability of complex network methods for improving automatic summarizers in particular, and treating text in general. (C) 2011 Elsevier B.V. All rights reserved.

Relevance: 80.00%

Abstract:

Genes involved in host-pathogen interactions are often strongly affected by positive natural selection. The Duffy antigen, coded by the Duffy antigen receptor for chemokines (DARC) gene, serves as a receptor for Plasmodium vivax in humans and for Plasmodium knowlesi in some nonhuman primates. In the majority of sub-Saharan Africans, a nucleic acid variant in GATA-1 of the gene promoter is responsible for the nonexpression of the Duffy antigen on red blood cells and consequently resistance to invasion by P. vivax. The Duffy antigen also acts as a receptor for chemokines and is expressed in red blood cells and many other tissues of the body. Because of this dual role, we sequenced a 3,000-bp region encompassing the entire DARC gene as well as part of its 5' and 3' flanking regions in a phylogenetic sample of primates and used statistical methods to evaluate the nature of selection pressures acting on the gene during its evolution. We analyzed both coding and regulatory regions of the DARC gene. The regulatory analysis showed accelerated rates of substitution at several sites near known motifs. Our tests of positive selection in the coding region using maximum likelihood by branch sites and maximum likelihood by codon sites did not yield statistically significant evidence for the action of positive selection. However, the maximum likelihood test in which the gene was subdivided into different structural regions showed that the known binding region for P. vivax/P. knowlesi is under very different selective pressures than the remainder of the gene. In fact, most of the gene appears to be under strong purifying selection, but this is not evident in the binding region. We suggest that the binding region is under the influence of two opposing selective pressures, positive selection possibly exerted by the parasite and purifying selection exerted by chemokines.

Relevance: 80.00%

Abstract:

Chaabene, H, Hachana, Y, Franchini, E, Mkaouer, B, Montassar, M, and Chamari, K. Reliability and construct validity of the karate-specific aerobic test. J Strength Cond Res 26(12): 3454-3460, 2012. The aim of this study was to examine the absolute and relative reliabilities and the external responsiveness of the karate-specific aerobic test (KSAT). This study comprised 43 male karatekas; 19 of them participated in the first study to establish test-retest reliability, and 40, selected on the basis of their karate experience and level of practice, participated in the second study to identify the external responsiveness of the KSAT. The latter group was divided into 2 categories: a national-level group (Gn) and a regional-level group (Gr). Analysis showed excellent test-retest reliability of time to exhaustion (TE), with intraclass correlation coefficient ICC(3,1) >0.90, standard error of measurement (SEM) <5% (3.2%) and mean difference (bias) ± 95% limits of agreement of -9.5 ± 78.8 seconds. There was a significant difference between test and retest sessions in peak lactate concentration (Peak [La]) (9.12 ± 2.59 vs. 8.05 ± 2.67 mmol·L⁻¹; p < 0.05) but not in peak heart rate (HRpeak) or rating of perceived exertion (RPE) (196 ± 9 vs. 194 ± 9 b·min⁻¹ and 7.6 ± 0.93 vs. 7.8 ± 1.15, respectively; p > 0.05). National-level karate athletes (1,032 ± 101 seconds) were better than regional-level athletes (841 ± 134 seconds) on TE performance during the KSAT (p < 0.001). Thus, the KSAT provided good external responsiveness. The area under the receiver operating characteristic curve was >0.70 (0.86; 95% confidence interval: 0.72-0.95). A significant difference was detected in Peak [La] between the national-level (6.09 ± 1.78 mmol·L⁻¹) and regional-level (8.48 ± 2.63 mmol·L⁻¹) groups, but not in HRpeak (194 ± 8 vs. 195 ± 8 b·min⁻¹) or RPE (7.57 ± 1.15 vs. 7.42 ± 1.1). The results of this study indicate that the KSAT provides excellent absolute and relative reliabilities. The KSAT can effectively distinguish karate athletes of different competitive levels. Thus, the KSAT may be suitable for field assessment of the aerobic fitness of karate practitioners.
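The "bias ± 95% limits of agreement" figure reported for time to exhaustion is conventionally obtained from the paired test-retest differences as their mean ± 1.96 times their standard deviation. The sketch below assumes that convention (the abstract does not spell out its computation); the function name `limits_of_agreement` and any data passed to it are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(test, retest):
    """Return (bias, half_width); the 95% limits are bias - half_width and bias + half_width."""
    differences = [second - first for first, second in zip(test, retest)]
    bias = mean(differences)  # systematic shift between the two sessions
    half_width = 1.96 * stdev(differences)  # uses the sample SD of the paired differences
    return bias, half_width
```

With made-up times to exhaustion of 100, 110 and 120 seconds at test and 102, 108 and 123 seconds at retest, the differences are 2, -2 and 3 seconds, giving a bias of 1 second and a half-width of 1.96 × √7 seconds.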

Relevance: 80.00%

Abstract:

BACKGROUND: It is widely accepted that red wines constitute one of the most important sources of dietary polyphenolic antioxidants. However, it is still not known how variables such as variety, vintage, country of origin, and retail price are associated with the antioxidant activity and sensory profile of South American red wines. In this regard, 80 samples produced in Brazil, Chile and Argentina were assessed in relation to their sensory properties, color and in vitro antioxidant activity, and the results were subjected to multivariate statistical techniques. RESULTS: Samples were grouped in clusters, characterized by high, intermediate and low in vitro antioxidant activity, sensory properties and prices. Wines with high antioxidant activity were associated with high retail prices and overall perception of sensory quality. CONCLUSION: South American wines produced from Vitis vinifera such as Syrah, Malbec and Cabernet Sauvignon had higher in vitro antioxidant activity and also higher sensory quality than wines produced from Vitis labrusca. This result was independent of vintage (2002-2010), corroborating the idea that the same grape varietal, even when produced in different years, displays similar sensory characteristics and antioxidant activity. (C) 2011 Society of Chemical Industry

Relevance: 80.00%

Abstract:

Background: Statistical methods for estimating usual intake require at least two short-term dietary measurements in a subsample of the target population. However, the percentage of individuals with a second dietary measurement (replication rate) may influence the precision of estimates, such as percentiles and proportions of individuals below cut-offs of intake. Objective: To investigate the precision of usual food intake estimates using different replication rates and different sample sizes. Participants/setting: Adolescents participating in the continuous National Health and Nutrition Examination Survey 2007-2008 (n=1,304) who completed two 24-hour recalls. Statistical analyses performed: The National Cancer Institute method was used to estimate the usual intake of dark green vegetables in the original sample comprising 1,304 adolescents with a replication rate of 100%. A bootstrap with 100 replications was performed to estimate CIs for percentiles and proportions of individuals below cut-offs of intake. Using the same bootstrap replications, four groups of data sets were sampled with different replication rates (80%, 60%, 40%, and 20%). For each data set created, the National Cancer Institute method was performed and percentiles, CIs, and proportions of individuals below cut-offs were calculated. Precision estimates were checked by comparing each CI obtained from data sets with different replication rates with the CI obtained from the original data set. Further, we sampled 1,000, 750, 500, and 250 individuals from the original data set and performed the same analytical procedures. Results: Percentiles of intake and percentages of individuals below the cut-off points were similar across the replication rates and sample sizes, but the CIs widened as the replication rate decreased. Wider CIs were observed at replication rates of 40% and 20%. Conclusions: The precision of the usual intake estimates decreased when low replication rates were used. However, even with different sample sizes, replication rates >40% may not lead to an important loss of precision. J Acad Nutr Diet. 2012;112:1015-1020.
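The bootstrap step described in this abstract can be illustrated with a generic percentile bootstrap. This is a minimal sketch, not the National Cancer Institute method, which fits a far more elaborate model to repeated 24-hour recalls; the function name `bootstrap_ci`, the default of the median as the statistic, and the fixed seed are assumptions made for the illustration. Only the choice of 100 replications comes from the abstract.

```python
import random
from statistics import median


def bootstrap_ci(values, statistic=median, n_boot=100, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample with replacement, then read off the tails."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = sorted(
        statistic(rng.choices(values, k=len(values))) for _ in range(n_boot)
    )
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper
```

Running the same function on progressively smaller subsamples of a data set is one simple way to see the interval widen as less information is available, which mirrors the pattern the abstract reports for lower replication rates.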

Relevance: 80.00%

Abstract:

In this work, 50 ceramic fragments from the Lago Grande archaeological site and 30 from the Osvaldo archaeological site were compared to assess elemental similarities. The aim is to perform a preliminary comparison between the sites, which are located in the central Amazon, Brazil. The analytical technique employed to obtain the elemental composition of the ceramics was instrumental neutron activation analysis (INAA). The data set obtained was explored with the multivariate statistical techniques of cluster, principal component and discriminant analysis. The analyzed elements were: Na, Lu, U, Yb, La, Th, Cr, Cs, Sc, Fe, Eu, Ce and Hf. The results showed the existence of at least two compositional groups for Lago Grande and Osvaldo. Each compositional group of the Osvaldo archaeological site matches one group of Lago Grande. Correlated with the archaeological background, the results suggest commercial or cultural exchange in the region, which is indicative of socio-cultural interactions between those sites.