941 results for Multivariate Statistics
Abstract:
We present a novel maximum-likelihood-based algorithm for estimating the distribution of alignment scores from the scores of unrelated sequences in a database search. Using a new method for measuring the accuracy of p-values, we show that our maximum-likelihood-based algorithm is more accurate than existing regression-based and lookup table methods. We explore a more sophisticated way of modeling and estimating the score distributions (using a two-component mixture model and expectation maximization), but conclude that this does not improve significantly over simply ignoring scores with small E-values during estimation. Finally, we measure the classification accuracy of p-values estimated in different ways and observe that inaccurate p-values can, somewhat paradoxically, lead to higher classification accuracy. We explain this paradox and argue that statistical accuracy, not classification accuracy, should be the primary criterion in comparisons of similarity search methods that return p-values that adjust for target sequence length.
Abstract:
This article develops a weighted least squares version of Levene's test of homogeneity of variance for a general design, available for both univariate and multivariate situations. When the design is balanced, the univariate and two common multivariate test statistics turn out to be proportional to the corresponding ordinary least squares test statistics obtained from an analysis of variance of the absolute values of the standardized mean-based residuals from the original analysis of the data. The constant of proportionality is simply a design-dependent multiplier (which does not necessarily tend to unity). Explicit results are presented for randomized block and Latin square designs and are illustrated for factorial treatment designs and split-plot experiments. The distribution of the univariate test statistic is close to a standard F-distribution, although it can be slightly underdispersed. For a complex design, the test assesses homogeneity of variance across blocks, treatments, or treatment factors and offers an objective interpretation of residual plots.
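For orientation, the classical one-way Levene statistic that the article generalizes can be sketched as follows. This is the ordinary, unweighted version (an analysis of variance on absolute deviations from group means), not the weighted least squares test developed in the article; the function name `levene_statistic` is a hypothetical helper introduced here for illustration.

```python
from statistics import mean


def levene_statistic(groups):
    """Classical one-way Levene statistic (mean-based, unweighted).

    Illustrative sketch only: this is NOT the weighted least squares
    version described in the abstract.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviations of each observation from its own group mean.
    abs_dev = []
    for g in groups:
        centre = mean(g)
        abs_dev.append([abs(x - centre) for x in g])
    group_means = [mean(z) for z in abs_dev]
    grand_mean = sum(sum(z) for z in abs_dev) / n_total
    # One-way ANOVA sums of squares computed on the absolute deviations.
    between = sum(len(z) * (m - grand_mean) ** 2
                  for z, m in zip(abs_dev, group_means))
    within = sum((v - m) ** 2
                 for z, m in zip(abs_dev, group_means) for v in z)
    return (n_total - k) / (k - 1) * between / within
```

Under the usual assumptions the statistic is compared with an F distribution on (k - 1, N - k) degrees of freedom, where k is the number of groups and N the total sample size.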
Abstract:
This paper proposes a template for modelling complex datasets that integrates traditional statistical modelling approaches with more recent advances in statistics and modelling through an exploratory framework. Our approach builds on the well-known and long-standing idea of 'good practice in statistics' by establishing a comprehensive framework for modelling that focuses on exploration, prediction, interpretation and reliability assessment, the last being a relatively new idea that allows individual assessment of predictions. The integrated framework we present comprises two stages. The first involves the use of exploratory methods to help visually understand the data and identify a parsimonious set of explanatory variables. The second encompasses a two-step modelling process, in which non-parametric methods such as decision trees and generalized additive models are used to identify important variables and their relationship with the response before a final predictive model is considered. We focus on fitting the predictive model using parametric, non-parametric and Bayesian approaches. This paper is motivated by a medical problem where interest focuses on developing a risk stratification system for morbidity of 1,710 cardiac patients given a suite of demographic, clinical and preoperative variables. Although the methods are applied specifically to this case study, they can be applied in any field, irrespective of the type of response.
Abstract:
Coffee cultivation is one of the agribusiness activities of greatest socioeconomic importance among those linked to world agricultural trade. One of the greatest contributions of quantitative genetics to plant breeding is the possibility of predicting genetic gains. When different selection criteria are considered, predicting the gains for each criterion is of great importance, because it guides breeders on how to use the available genetic material to obtain the maximum possible gain for the traits of interest. The present study was established in July 2004 at the Fazenda Experimental de Bananal do Norte, run by Incaper, in the district of Pacotuba, municipality of Cachoeiro de Itapemirim, in the southern region of the state, with the aim of selecting the best plants among and within half-sib progenies of Coffea canephora using different selection criteria. Individual and joint analyses of variance were performed for 26 half-sib progenies of Coffea canephora. The experimental design was randomized blocks with four additional checks, four replicates, and plots of five plants, at a spacing of 3.0 m x 1.2 m. Data from the last five harvests were considered. The traits measured were: flowering, maturation, bean size, weight, plant stature, vigor, rust, cercospora leaf spot, branch dieback, overall score, percentage of floater fruits, and leaf miner. All statistical analyses were performed with the GENES software (a computational application for genetics and statistics). Selection gains were estimated for a selection percentage of 20% among and within progenies, which was kept the same for all traits.
All traits were selected in the positive direction, except flowering, plant stature, rust, cercospora leaf spot, branch dieback, percentage of floater fruits, and leaf miner, for which a decrease in the original means was sought. The selection criteria studied were: conventional selection among and within families, the combined selection index, mass selection, and stratified mass selection. This dissertation comprises two chapters, in which biometric analyses were performed, such as the estimation of genetic parameters. For most of the traits studied, significant differences (P<0.05) were found among genotypes; together with the genotypic coefficients of variation, the genotypic coefficient of determination, and the CVg/CVe ratio, these indicate genetic variability in the material for most traits and favorable conditions for obtaining genetic gains through selection. These traits were also correlated. The data were subjected to analysis of variance and multivariate analysis, applying clustering and UPGMA, means tests, and a correlation study. For clustering, the generalized Mahalanobis distance was used as the dissimilarity measure, and Tocher's method was used to delimit the groups. Genetic diversity was found for traits associated with physiological quality, seed reserve mobilization, and seedling dimensions and biomass. Four groups of genotypes could be formed. Seed dry mass, seed reserve reduction, and seedling dry mass are positively correlated with one another, whereas seed reserve reduction and the efficiency of converting those reserves into seedlings are negatively correlated.
According to the results obtained, all traits showed differing levels of genetic variability, and the selection criteria used proved efficient for breeding. The combined selection index gave the best results in terms of gains and is indicated as the most appropriate criterion for the genetic improvement of the population studied. In the correlation studies, the phenotypic correlation exceeded the genotypic correlation in 70% of cases, showing a greater influence of environmental factors relative to genotypic ones, and conditions favorable to the improvement of the different traits. In the genetic divergence study, clustering by Tocher's technique distributed the genotypes into three groups.
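The dissimilarity measure named in this abstract, the generalized Mahalanobis distance, can be illustrated with a minimal sketch for the two-trait case. The function name, the restriction to two traits, and the input layout are assumptions made here for brevity; the GENES software used in the study is not reproduced.

```python
def mahalanobis_sq(x, y, cov):
    """Squared Mahalanobis distance D^2 between two 2-trait mean vectors.

    Hypothetical helper: `cov` is a 2x2 residual covariance matrix given
    as ((s11, s12), (s21, s22)).
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of a 2x2 matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x[0] - y[0], x[1] - y[1])
    # Quadratic form dx' * inv * dx.
    return (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
            + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
```

With an identity covariance matrix the result reduces to the squared Euclidean distance; correlated or differently scaled traits change the distance, which is why the measure is preferred when traits are on different scales.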
Abstract:
Master's dissertation, Economic and Business Sciences, 6 December 2012, Universidade dos Açores.
Abstract:
The present study, covering students from public schools and a private school on the island of São Miguel (Azores, Portugal), aims to identify the difficulties that students in the 3rd and 4th years of primary education face in solving tasks that involve constructing, reading and interpreting statistical tables and graphs, in the context of Organization and Data Handling (ODH). We present the main results obtained with statistical methods, among which we highlight some non-parametric hypothesis tests and Categorical Principal Component Analysis (CatPCA), chosen because of the nature of the variables in the questionnaire (mostly nominal and ordinal).
Abstract:
ABSTRACT OBJECTIVE To identify the factors that interfere with the access of adolescents and young people to childbirth care in the Northeast region of Brazil. METHODS Cross-sectional study with 3,014 adolescents and young people admitted to the selected maternity wards to give birth in the Northeast region of Brazil. The sample design was probabilistic, in two stages: the first corresponded to the health establishments and the second to women who had recently given birth and their babies. The data were collected through interviews and consultation of hospital records, using a pre-tested electronic form. Descriptive statistics were used for the univariate analysis, Pearson's Chi-square test for the bivariate analysis and multiple logistic regression for the multivariate analysis. Sociodemographic variables, obstetrical history, and birth care were analyzed. RESULTS Half of the adolescents and young people interviewed had not been given guidance on where they should go when in labor, and among those who had, 23.5% did not give birth in the indicated health service. Furthermore, one third (33.3%) had to travel in search of assisted birth, and the majority (66.7%) of the postpartum women came to the maternity ward by their own means. In the bivariate analysis, the variables marital status, paid work, health insurance, number of previous pregnancies, parity, city location, and type of health establishment showed a significant association (p < 0.20) with inadequate access to childbirth care. The multivariate analysis showed that married adolescents and young people (p < 0.015), those with no health insurance (p < 0.002) and those from the countryside (p < 0.001) were more likely to have inadequate access to childbirth care. CONCLUSIONS Adolescents and young women who are married, have no health insurance, or come from the countryside are more likely to have inadequate access to birth care.
The articulation between outpatient care and birth care can improve this access and, consequently, minimize the maternal and fetal risks that arise from a lack of systematic hospitalization planning.
Abstract:
Probability and Statistics—Selected Problems is a book for senior undergraduate and graduate students who want to quickly review basic material in probability and statistics. Descriptive statistics are presented first, followed by a review of probability. Discrete and continuous distributions are then presented. Sampling and estimation, together with hypothesis testing, are covered in the last two chapters. Solutions to the proposed exercises are listed for the reader's reference.
Abstract:
Associations between socio-demographic factors, water contact patterns and Schistosoma mansoni infection were investigated in 506 individuals (87% of inhabitants over 1 year of age) in an endemic area in Brazil (Divino), aiming at determining priorities for public health measures to prevent the infection. Those who eliminated S. mansoni eggs (n = 198) were compared to those without eggs in the stools (n = 308). The following explanatory variables were considered: age, sex, color, previous treatment with schistosomicide, place of birth, quality of the houses, water supply for the household, distance from houses to stream, and frequency and reasons for water contact. Factors found to be independently associated with the infection were age (10-19 and > 20 yrs old), and water contact for agricultural activities, fishing, and swimming or bathing (Adjusted relative odds = 5.0, 2.4, 3.2, 2.1 and 2.0, respectively). This suggests the need for public health measures to prevent the infection, emphasizing water contact for leisure and agricultural activities in this endemic area.
Abstract:
This study aims to optimize the water quality monitoring of a polluted watercourse (Leça River, Portugal) through principal component analysis (PCA) and cluster analysis (CA). These statistical methodologies were applied to physicochemical, bacteriological and ecotoxicological data (with the marine bacterium Vibrio fischeri and the green alga Chlorella vulgaris) obtained from water samples collected at seven monitoring sites during five campaigns (February, May, June, August, and September 2006). The results for some variables were assigned to water quality classes according to national guidelines. Chemical and bacteriological quality data led to the classification of Leça River water quality as "bad" or "very bad". PCA and CA identified monitoring sites with similar pollution patterns and distinguished site 1 (located in the upstream stretch of the river) from all other sampling sites downstream. Ecotoxicity results corroborated this classification, revealing differences in space and time. The present study includes not only physical, chemical and bacteriological but also ecotoxicological parameters, which opens new perspectives in river water characterization. Moreover, the application of PCA and CA is very useful for optimizing water quality monitoring networks, since it helps define the minimum number of sites and their location. Thus, these tools can support appropriate management decisions.
Abstract:
A cross-sectional case-control study of the association between reduced work ability and S. japonicum infection was carried out in an area of moderate endemicity for schistosomiasis japonica in the southern part of Dongting Lake, China. A total of 120 cases with reduced work ability and 240 controls without reduced work ability, matched to the cases by age, sex and occupation, participated in the study. The mean age was 37.6 years (21-60), the male:female ratio was 60:40, and the prevalence of S. japonicum infection among participants was 28.3%. Infection was found in 49.2% (59/120) of cases and 17.9% (43/240) of controls. The odds ratio for reduced work ability among those who had schistosomiasis was 4.34 (95% confidence interval 2.58-7.34), and among those with S. japonicum infection at more than 100 eggs per gram it rose to 12.67 (95% confidence interval 3.64-46.39). After adjustment by multiple logistic regression, heavier intensity of S. japonicum infection and splenomegaly due to S. japonicum infection were confirmed as the main risk factors for reduced work ability in the population studied.
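As a rough check on the figures in this abstract, a crude (unmatched) odds ratio with a Woolf-type 95% confidence interval can be computed from the reported counts: 59 infected among 120 cases and 43 infected among 240 controls. This is a sketch under the assumption of a simple 2x2 table. The study used matched controls and may have used a different estimator, so the result need not reproduce the published 4.34 exactly. The function name is hypothetical.

```python
from math import exp, log, sqrt


def crude_odds_ratio(exposed_cases, n_cases, exposed_controls, n_controls,
                     z=1.96):
    """Crude odds ratio and Woolf (log-based) confidence interval.

    Hypothetical helper for an unmatched 2x2 table; z=1.96 gives an
    approximate 95% interval.
    """
    a = exposed_cases
    b = n_cases - exposed_cases          # unexposed cases
    c = exposed_controls
    d = n_controls - exposed_controls    # unexposed controls
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under Woolf's approximation.
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Counts taken from the abstract: 59/120 cases and 43/240 controls infected.
crude_or, ci_lower, ci_upper = crude_odds_ratio(59, 120, 43, 240)
```

The crude estimate lands close to the reported value, with an interval of similar width, which is consistent with the published figures coming from a closely related calculation.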
Abstract:
Since psychiatric deinstitutionalization, the emphasis of public mental health policy has shifted to community services and shorter periods of hospitalization. Families have therefore become the main providers of daily care and support for patients. The difficulties of this new role, and caregivers' lack of preparation for it, have generated a sense of burden among relatives that can affect their physical and mental health. Several studies have investigated the consequences of becoming the caregiver of a psychiatric patient, but few have examined the impact on caregivers' mental health. The present study investigated the relationship between burden and mental health among family caregivers of psychiatric patients. The participants were 74 family caregivers of patients diagnosed with schizophrenia who were treated at the outpatient clinic of the Serviço de Referência em Saúde Mental in Divinópolis, Minas Gerais, Brazil. The caregivers took part in a structured interview in which the scale for assessing the burden on relatives of psychiatric patients (Escala de Avaliação da Sobrecarga dos Familiares de Pacientes Psiquiátricos, FBIS-BR) was applied, along with the Beck Depression Inventory (BDI) to assess the caregivers' mental health. Descriptive, univariate and multivariate statistical analyses were performed. The results showed that most caregivers were female (78.40%) and parents of the patient (62.20%), with a mean age of 59.14 years. The caregivers had a mean objective global burden of 2.05 (SD ± 0.54) on a scale of 1 to 5 points, and a mean subjective global burden of 2.44 (SD ± 0.71) on a scale of 1 to 4 points. The BDI results showed that 42 caregivers could be classified as having minimal depression (56.80%), 17 mild depression (23.00%), 7 moderate depression (9.50%) and 8 severe depression (10.80%). Significant positive correlations were found between the level of depression and the degree of burden, both global and on the subscales. Multivariate analysis showed that the main predictor of caregivers' depression was the subjective global burden. Other predictors were the objective burden of daily routines and of supervising patients' problem behaviors, and the subjective burden of worries about the patient. These findings show the impact of the caregiver role on the mental health of relatives and point to the need for greater attention to caregivers of psychiatric patients on the part of managers and professionals in the field.
Abstract:
Burn mortality statistics may be misleading unless they account properly for the many factors that can influence outcome. Such estimates are useful for patients and others making medical and financial decisions concerning their care. This study aimed to define the clinical, microbiological and laboratory predictors of mortality with a view to improving burn care. Data were collected on independent variables, which were analyzed sequentially and cumulatively, employing univariate statistics and a pooled, cross-sectional, multivariate logistic regression to establish which variables better predict the probability of mortality. Survivors and non-survivors among burn patients were compared to define the predictive factors of mortality. The mortality rate was 5.0%. Higher age, larger burn area, presence of fungi in the wound, shorter length of stay and the presence of multi-resistant bacteria in the wound significantly predicted increased mortality. The authors conclude that the patients most likely to die are those aged over 50 years, those with limited skin donor sites and those with multi-resistant bacteria and fungi in the wound.
Abstract:
In health-related research it is common to have multiple outcomes of interest in a single study. These outcomes are often analysed separately, ignoring the correlation between them. One would expect a multivariate approach to be a more efficient alternative to individual analyses of each outcome. Surprisingly, this is not always the case. In this article we discuss different settings of linear models and compare the multivariate and univariate approaches. We show that, for linear regression models, the estimates of the regression parameters associated with covariates that are shared across the outcomes are the same for the multivariate and univariate models, while for outcome-specific covariates the multivariate model performs better in terms of efficiency.
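The equivalence for shared covariates can be seen in the simplest setting, where every outcome has the same design matrix. The derivation below is a standard textbook argument, sketched under assumptions that are not taken from the article: m outcomes, n observations, a common n x p design matrix X of full column rank, and errors with covariance \Sigma \otimes I_n. The notation is introduced here for illustration.

```latex
% Stacked model for m outcomes sharing the design matrix X:
\operatorname{vec}(Y) = (I_m \otimes X)\,\operatorname{vec}(B) + \operatorname{vec}(E),
\qquad \operatorname{Cov}\bigl(\operatorname{vec}(E)\bigr) = \Sigma \otimes I_n .

% Generalized least squares estimator:
\operatorname{vec}(\hat{B})
  = \bigl[(I_m \otimes X)^{\top}(\Sigma^{-1} \otimes I_n)(I_m \otimes X)\bigr]^{-1}
    (I_m \otimes X)^{\top}(\Sigma^{-1} \otimes I_n)\operatorname{vec}(Y)

  = \bigl[\Sigma^{-1} \otimes X^{\top}X\bigr]^{-1}
    \bigl(\Sigma^{-1} \otimes X^{\top}\bigr)\operatorname{vec}(Y)

  = \bigl(I_m \otimes (X^{\top}X)^{-1}X^{\top}\bigr)\operatorname{vec}(Y).
```

The covariance \Sigma cancels, so each column of \hat{B} is the ordinary least squares fit of the corresponding outcome on X. When the outcomes have different covariates the design is no longer of the form I_m \otimes X, the cancellation fails, and the joint estimator can be more efficient, which is consistent with the result stated in the abstract.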