970 results for Multivariate data
Abstract:
We compare Bayesian methodology utilizing free-ware BUGS (Bayesian Inference Using Gibbs Sampling) with the traditional structural equation modelling approach based on another free-ware package, Mx. Dichotomous and ordinal (three category) twin data were simulated according to different additive genetic and common environment models for phenotypic variation. Practical issues are discussed in using Gibbs sampling as implemented by BUGS to fit subject-specific Bayesian generalized linear models, where the components of variation may be estimated directly. The simulation study (based on 2000 twin pairs) indicated that there is a consistent advantage in using the Bayesian method to detect a correct model under certain specifications of additive genetics and common environmental effects. For binary data, both methods had difficulty in detecting the correct model when the additive genetic effect was low (between 10 and 20%) or of moderate range (between 20 and 40%). Furthermore, neither method could adequately detect a correct model that included a modest common environmental effect (20%) even when the additive genetic effect was large (50%). Power was significantly improved with ordinal data for most scenarios, except for the case of low heritability under a true ACE model. We illustrate and compare both methods using data from 1239 twin pairs over the age of 50 years, who were registered with the Australian National Health and Medical Research Council Twin Registry (ATR) and presented symptoms associated with osteoarthritis occurring in joints of the hand.
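As a rough illustration of the additive genetic (A) and common environment (C) scenarios described in this abstract, the sketch below computes the twin-pair correlations implied by the standard biometrical ACE model: A + C for monozygotic pairs and 0.5 x A + C for dizygotic pairs. The function name and example values are illustrative assumptions. This is not the BUGS or Mx model fitted in the study, and for dichotomous or ordinal data the correlations refer to an underlying liability scale.

```python
def expected_twin_correlations(a2, c2):
    """Return (r_mz, r_dz) implied by an ACE variance decomposition.

    a2: proportion of variance due to additive genetic effects (A)
    c2: proportion of variance due to common environment (C)
    The remainder, 1 - a2 - c2, is unique environment (E).
    """
    if a2 < 0 or c2 < 0 or a2 + c2 > 1:
        raise ValueError("a2 and c2 must be non-negative and sum to at most 1")
    r_mz = a2 + c2        # MZ twins share all additive genetic effects
    r_dz = 0.5 * a2 + c2  # DZ twins share half of them on average
    return r_mz, r_dz


# Scenario mentioned in the abstract: large A (50%) with modest C (20%).
r_mz, r_dz = expected_twin_correlations(0.50, 0.20)
print(f"r_MZ = {r_mz:.2f}, r_DZ = {r_dz:.2f}")
```

The gap between the two correlations is what carries the information about A versus C, which is one way to see why a modest C is hard to detect.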
Abstract:
This paper proposes a template for modelling complex datasets that integrates traditional statistical modelling approaches with more recent advances in statistics and modelling through an exploratory framework. Our approach builds on the well-known and long standing traditional idea of 'good practice in statistics' by establishing a comprehensive framework for modelling that focuses on exploration, prediction, interpretation and reliability assessment, a relatively new idea that allows individual assessment of predictions. The integrated framework we present comprises two stages. The first involves the use of exploratory methods to help visually understand the data and identify a parsimonious set of explanatory variables. The second encompasses a two step modelling process, where the use of non-parametric methods such as decision trees and generalized additive models are promoted to identify important variables and their modelling relationship with the response before a final predictive model is considered. We focus on fitting the predictive model using parametric, non-parametric and Bayesian approaches. This paper is motivated by a medical problem where interest focuses on developing a risk stratification system for morbidity of 1,710 cardiac patients given a suite of demographic, clinical and preoperative variables. Although the methods we use are applied specifically to this case study, these methods can be applied across any field, irrespective of the type of response.
Abstract:
The tests that are currently available for the measurement of overexpression of the human epidermal growth factor-2 (HER2) in breast cancer have shown considerable problems in accuracy and interlaboratory reproducibility. Although these problems are partly alleviated by the use of validated, standardised 'kits', there may be considerable cost involved in their use. Prior to testing it may therefore be an advantage to be able to predict from basic pathology data whether a cancer is likely to overexpress HER2. In this study, we have correlated pathology features of cancers with the frequency of HER2 overexpression assessed by immunohistochemistry (IHC) using HercepTest (Dako). In addition, fluorescence in situ hybridisation (FISH) has been used to re-test the equivocal cancers and interobserver variation in assessing HER2 overexpression has been examined by a slide circulation scheme. Of the 1536 cancers, 1144 (74.5%) did not overexpress HER2. Unequivocal overexpression (3+ by IHC) was seen in 186 cancers (12%) and an equivocal result (2+ by IHC) was seen in 206 cancers (13%). Of the 156 IHC 3+ cancers for which complete data was available, 149 (95.5%) were ductal NST and 152 (97%) were histological grade 2 or 3. Only 1 of 124 infiltrating lobular carcinomas (0.8%) showed HER2 overexpression. None of the 49 'special types' of carcinoma showed HER2 overexpression. Re-testing by FISH of a proportion of the IHC 2+ cancers showed that only 25 (23%) of those assessable exhibited HER2 gene amplification, but 46 of the 47 IHC 3+ cancers (98%) were confirmed as showing gene amplification. Circulating slides for the assessment of HER2 score showed a moderate level of agreement between pathologists (kappa 0.4). As a result of this study we would advocate consideration of a triage approach to HER-2 testing. 
Infiltrating lobular and special types of carcinoma may not need to be routinely tested at presentation nor may grade 1 NST carcinomas in which only 1.4% have been shown to overexpress HER2. Testing of these carcinomas may be performed when HER2 status is required to assist in therapeutic or other clinical/prognostic decision-making. The highest yield of HER2 overexpressing carcinomas is seen in the grade 3 NST subgroup in which 24% are positive by IHC. (C) 2003 Elsevier Science Ltd. All rights reserved.
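The interobserver agreement above is summarised by a kappa statistic. The abstract does not say which variant was used, so the following is only a minimal sketch of the two-rater Cohen's kappa, with hypothetical scores and a hypothetical function name. A slide circulation scheme with several pathologists would more plausibly need a multi-rater version.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty item list")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical IHC scores from two pathologists (not data from the study).
pathologist_1 = ["0", "1+", "2+", "3+", "2+", "0"]
pathologist_2 = ["0", "2+", "2+", "3+", "3+", "0"]
print(round(cohens_kappa(pathologist_1, pathologist_2), 3))
```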
Abstract:
Measurement of exchange of substances between blood and tissue has been a long-lasting challenge to physiologists, and considerable theoretical and experimental accomplishments were achieved before the development of the positron emission tomography (PET). Today, when modeling data from modern PET scanners, little use is made of earlier microvascular research in the compartmental models, which have become the standard model by which the vast majority of dynamic PET data are analysed. However, modern PET scanners provide data with a sufficient temporal resolution and good counting statistics to allow estimation of parameters in models with more physiological realism. We explore the standard compartmental model and find that incorporation of blood flow leads to paradoxes, such as kinetic rate constants being time-dependent, and tracers being cleared from a capillary faster than they can be supplied by blood flow. The inability of the standard model to incorporate blood flow consequently raises a need for models that include more physiology, and we develop microvascular models which remove the inconsistencies. The microvascular models can be regarded as a revision of the input function. Whereas the standard model uses the organ inlet concentration as the concentration throughout the vascular compartment, we consider models that make use of spatial averaging of the concentrations in the capillary volume, which is what the PET scanner actually registers. The microvascular models are developed for both single- and multi-capillary systems and include effects of non-exchanging vessels. They are suitable for analysing dynamic PET data from any capillary bed using either intravascular or diffusible tracers, in terms of physiological parameters which include regional blood flow. (C) 2003 Elsevier Ltd. All rights reserved.
Abstract:
We consider a mixture model approach to the regression analysis of competing-risks data. Attention is focused on inference concerning the effects of factors on both the probability of occurrence and the hazard rate conditional on each of the failure types. These two quantities are specified in the mixture model using the logistic model and the proportional hazards model, respectively. We propose a semi-parametric mixture method to estimate the logistic and regression coefficients jointly, whereby the component-baseline hazard functions are completely unspecified. Estimation is by maximum likelihood based on the full likelihood, implemented via an expectation-conditional maximization (ECM) algorithm. Simulation studies are performed to compare the performance of the proposed semi-parametric method with a fully parametric mixture approach. The results show that when the component-baseline hazard is monotonically increasing, the semi-parametric and fully parametric mixture approaches are comparable for mildly and moderately censored samples. When the component-baseline hazard is not monotonically increasing, the semi-parametric method consistently provides less biased estimates than a fully parametric approach and is comparable in efficiency in the estimation of the parameters for all levels of censoring. The methods are illustrated using a real data set of prostate cancer patients treated with different dosages of the drug diethylstilbestrol. Copyright (C) 2003 John Wiley & Sons, Ltd.
Abstract:
This study investigated the haemodynamic response to the 90-minute application of 85 Hz transcutaneous electrical nerve stimulation (TENS) to the T1 and T5 nerve roots. Comparison was made between 20 healthy subjects who had TENS stimulation and a separate group of 20 healthy subjects who rested for 90 minutes. Pulse and blood pressure were measured just prior to the start of TENS stimulation, after 30 minutes of stimulation, and after 90 minutes of stimulation (immediately after stopping TENS) or at completion of the rest time depending on group allocation. The rate pressure product was calculated from the pulse and systolic blood pressure data. Multivariate repeated measures analysis showed a significant group effect for TENS (p = 0.048). Univariate repeated measures analyses showed a significant group by time effect due to TENS on systolic blood pressure over the 90-minute time period (p = 0.028). Separate group repeated measures ANOVA showed a significant decline in heart rate (p = 0.000), systolic blood pressure (p = 0.013) and rate pressure product (p = 0.000) for the TENS group, while the control resting group showed a significant decline in heart rate only (p = 0.04). The application of 85 Hz TENS to the upper thoracic nerve roots causes no adverse haemodynamic effects in healthy subjects.
Abstract:
The effect of number of samples and selection of data for analysis on the calculation of surface motor unit potential (SMUP) size in the statistical method of motor unit number estimates (MUNE) was determined in 10 normal subjects and 10 with amyotrophic lateral sclerosis (ALS). We recorded 500 sequential compound muscle action potentials (CMAPs) at three different stable stimulus intensities (10–50% of maximal CMAP). Estimated mean SMUP sizes were calculated using Poisson statistical assumptions from the variance of 500 sequential CMAP obtained at each stimulus intensity. The results with the 500 data points were compared with smaller subsets from the same data set. The results using a range of 50–80% of the 500 data points were compared with the full 500. The effect of restricting analysis to data between 5–20% of the CMAP and to standard deviation limits was also assessed. No differences in mean SMUP size were found with stimulus intensity or use of different ranges of data. Consistency was improved with a greater sample number. Data within 5% of CMAP size gave both increased consistency and reduced mean SMUP size in many subjects, but excluded valid responses present at that stimulus intensity. These changes were more prominent in ALS patients in whom the presence of isolated SMUP responses was a striking difference from normal subjects. Noise, spurious data, and large SMUP limited the Poisson assumptions. When these factors are considered, consistent statistical MUNE can be calculated from a continuous sequence of data points. A 2 to 2.5 SD or 10% window are reasonable methods of limiting data for analysis. Muscle Nerve 27: 320–331, 2003
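The Poisson assumption mentioned in this abstract can be made concrete. If each response is the sum of K motor units of equal size s, with K Poisson distributed, then the mean response is s times the Poisson rate and the variance is s squared times that rate, so variance divided by mean recovers s. The sketch below applies that ratio. The function names, the optional baseline subtraction and the simulated values are assumptions for illustration, not the authors' analysis pipeline.

```python
import random
import statistics


def estimate_smup_size(cmap_values, baseline=0.0):
    """Estimate mean SMUP size as variance / mean of baseline-corrected CMAPs."""
    responses = [value - baseline for value in cmap_values]
    mean_response = statistics.mean(responses)
    if mean_response <= 0:
        raise ValueError("mean baseline-corrected response must be positive")
    return statistics.variance(responses) / mean_response


def estimate_mune(max_cmap, smup_size):
    """Motor unit number estimate: maximal CMAP divided by mean SMUP size."""
    return max_cmap / smup_size


# Simulated check (hypothetical numbers): 500 responses, each the sum of
# Binomial(200, 0.1) units of size 0.05. This is approximately Poisson with
# rate 20, so the estimate should land near the unit size of 0.05.
random.seed(1)
unit_size = 0.05
simulated = [
    unit_size * sum(random.random() < 0.1 for _ in range(200))
    for _ in range(500)
]
smup = estimate_smup_size(simulated)
print(f"estimated SMUP size: {smup:.3f}")
print(f"MUNE for a 10.0 maximal CMAP: {estimate_mune(10.0, smup):.0f}")
```

Because the estimate depends on a sample variance, it stabilises as the number of responses grows, in line with the abstract's finding that consistency improved with a greater sample number.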
Abstract:
In microarray studies, clustering techniques are often applied to derive meaningful insights from the data. In the past, hierarchical methods have been the primary clustering tool employed for this task. The hierarchical algorithms have mainly been applied heuristically to these cluster analysis problems. Further, a major limitation of these methods is their inability to determine the number of clusters. Thus there is a need for a model-based approach to these clustering problems. To this end, McLachlan et al. [7] developed a mixture model-based algorithm (EMMIX-GENE) for the clustering of tissue samples. To further investigate the EMMIX-GENE procedure as a model-based approach, we present a case study involving the application of EMMIX-GENE to the breast cancer data studied recently by van 't Veer et al. [10]. Our analysis considers the problem of clustering the tissue samples on the basis of the genes, which is a non-standard problem because the number of genes greatly exceeds the number of tissue samples. We demonstrate how EMMIX-GENE can be useful in reducing the initial set of genes down to a more computationally manageable size. The results from this analysis also emphasise the difficulty associated with the task of separating two tissue groups on the basis of a particular subset of genes. These results also shed light on why supervised methods have such a high misallocation error rate for the breast cancer data.
Abstract:
For zygosity diagnosis in the absence of genotypic data, or in the recruitment phase of a twin study where only single twins from same-sex pairs are being screened, or to provide a test for sample duplication leading to the false identification of a dizygotic pair as monozygotic, the appropriate analysis of respondents' answers to questions about zygosity is critical. Using data from a young adult Australian twin cohort (N = 2094 complete pairs and 519 singleton twins from same-sex pairs with complete responses to all zygosity items), we show that application of latent class analysis (LCA), fitting a 2-class model, yields results that show good concordance with traditional methods of zygosity diagnosis, but with certain important advantages. These include the ability, in many cases, to assign zygosity with specified probability on the basis of responses of a single informant (advantageous when one zygosity type is being oversampled); and the ability to quantify the probability of misassignment of zygosity, allowing prioritization of cases for genotyping as well as identification of cases of probable laboratory error. Out of 242 twins (from 121 like-sex pairs) where genotypic data were available for zygosity confirmation, only a single case was identified of incorrect zygosity assignment by the latent class algorithm. Zygosity assignment for that single case was identified by the LCA as uncertain (probability of being a monozygotic twin only 76%), and the co-twin's responses clearly identified the pair as dizygotic (probability of being dizygotic 100%). In the absence of genotypic data, or as a safeguard against sample duplication, application of LCA for zygosity assignment or confirmation is strongly recommended.
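Once a 2-class latent class model has been fitted, assigning zygosity with a stated probability from a single informant's answers reduces to Bayes' rule under the model's conditional-independence assumption. The sketch below shows only that final step. The class prior and item-response probabilities are hypothetical placeholders, not estimates from the Australian cohort, and fitting the model itself (for example by EM) is not shown.

```python
def posterior_mz(responses, prior_mz, p_yes_mz, p_yes_dz):
    """Posterior probability of the MZ class given binary item responses.

    responses: list of 0/1 answers to the zygosity items
    prior_mz:  class proportion for MZ (DZ is 1 - prior_mz)
    p_yes_mz, p_yes_dz: per-item probability of answering 1 within each class
    """
    if not (len(responses) == len(p_yes_mz) == len(p_yes_dz)):
        raise ValueError("responses and item probabilities must align")

    def likelihood(p_yes):
        # Items are assumed independent given the latent class.
        value = 1.0
        for answer, p in zip(responses, p_yes):
            value *= p if answer else (1.0 - p)
        return value

    joint_mz = prior_mz * likelihood(p_yes_mz)
    joint_dz = (1.0 - prior_mz) * likelihood(p_yes_dz)
    return joint_mz / (joint_mz + joint_dz)


# Hypothetical items such as "alike as two peas in a pod?" and
# "mistaken for each other by strangers?".
probability = posterior_mz([1, 0], prior_mz=0.5,
                           p_yes_mz=[0.9, 0.8], p_yes_dz=[0.2, 0.3])
print(f"P(MZ | responses) = {probability:.2f}")
```

A posterior well away from 0 or 1, like the 76% case reported in the abstract, is what would flag a pair for genotyping.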
Abstract:
Acetohydroxyacid synthase (AHAS, EC 4.1.3.18) catalyses the first step in branched-chain amino acid biosynthesis and is the target for sulfonylurea and imidazolinone herbicides, which act as potent and specific inhibitors. Mutants of the enzyme have been identified that are resistant to particular herbicides. However, the selectivity of these mutants towards various sulfonylureas and imidazolinones has not been determined systematically. Now that the structure of the yeast enzyme is known, both in the absence and presence of a bound herbicide, a detailed understanding of the molecular interactions between the enzyme and its inhibitors becomes possible. Here we construct 10 active mutants of yeast AHAS, purify the enzymes and determine their sensitivity to six sulfonylureas and three imidazolinones. An additional three active mutants were constructed with a view to increasing imidazolinone sensitivity. These three variants were purified and tested for their sensitivity to the imidazolinones only. Substantial differences are observed in the sensitivity of the 13 mutants to the various inhibitors and these differences are interpreted in terms of the structure of the herbicide-binding site on the enzyme.
Abstract:
This work is a case study aimed at developing a data mart that enables the Escola Nacional de Administração Pública (ENAP) to understand the profile and overall employment status of the federal civil servants who trained at the School over the last 7 years. The application was developed by cross-referencing the database of the system that manages the courses taught by ENAP, which stores information on the students trained, the courses delivered, the results achieved, the profile of the teaching staff and other information on the School's activities, with data generated by the Sistema Integrado de Administração de Recursos Humanos (SIAPE). The SIAPE extraction targeted records on employment status, positions, careers, functions, agencies and some personal data of the students, who are federal civil servants registered in SIAPE.
Abstract:
Bakeriella lata sp. nov. (Brazil, Rondônia), Bakeriella aurata sp. nov. (Brazil, Amazonas) and Bakeriella sulcaticeps sp. nov. (Brazil, Amazonas) are described and illustrated. New geographic records and variation data for B. cristata Evans, 1964, B. floridana Evans, 1964, B. flavicornis Kieffer, 1910, B. incompleta Azevedo, 1994, B. mira Evans, 1997, B. montivaga (Kieffer, 1910), B. olmeca Evans, 1964 and B. subcarinata Evans, 1965 are provided. The male of B. incompleta is described for the first time.
Prevalence of and factors associated with M. tuberculosis infection among community health workers
Abstract:
Introduction: Tuberculosis is an ancient disease that remains a serious public health problem worldwide. Objective: To estimate the prevalence of, and the factors associated with, latent MTB infection among community health workers in the primary health care network of municipalities prioritized for TB control: Cuiabá/MT, Manaus/AM, Salvador/BA and Vitória/ES. Methods: Cross-sectional study in which data were collected through a questionnaire of open and closed questions on personal characteristics, information about tuberculosis, use of preventive measures, etc. A tuberculin skin test was applied and read after 48-72 h by trained nurses, with 5 mm and 10 mm of induration as the positive cut-off points. Multivariable analysis was performed by hierarchical logistic regression. Variables associated with the outcome at p<0.1 were entered in the model. Independent variables that remained associated with the outcome after adjustment (p<0.05) were kept in the model. The study was approved by the Human Research Ethics Committee of the Centro de Ciências da Saúde, Universidade Federal do Espírito Santo (registration no. CEP-07/2010), and by the Municipal Health Departments through a letter of presentation. Results: 322 community health workers (ACS) voluntarily agreed to participate by signing the informed consent form. Of these, 10 did not return for the reading and were counted as losses, and one individual was excluded because of a positive rapid HIV test, giving a final sample of 311 participants. 
Among the ACS screened, tuberculin skin test positivity at the 10 mm and 5 mm induration cut-offs was 37.30% (95% CI: 0.31-0.42) and 57.88% (95% CI: 0.52-0.63), respectively. Conclusions: A routine tuberculin skin testing programme is needed, combined with interventions to reduce the risk of nosocomial transmission, as well as further studies to evaluate the efficacy of new tests for detecting latent tuberculosis.
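For readers who want to see how a 95% confidence interval for a prevalence like the ones above can be obtained, here is a minimal sketch of the normal-approximation (Wald) interval. The abstract does not state which interval method the authors used, so the choice of method and the function name are assumptions, and the output need not match the reported bounds exactly.

```python
import math


def wald_ci(p_hat, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    if not 0.0 <= p_hat <= 1.0 or n <= 0:
        raise ValueError("p_hat must be in [0, 1] and n must be positive")
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    # Clip to the valid range for a proportion.
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Positivity at the 10 mm and 5 mm cut-offs, final sample of 311 participants.
for label, p_hat in (("10 mm", 0.3730), ("5 mm", 0.5788)):
    low, high = wald_ci(p_hat, 311)
    print(f"{label}: {p_hat:.4f} ({low:.2f} to {high:.2f})")
```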
Abstract:
Coffee growing is one of the agribusiness activities of greatest socioeconomic importance among those linked to world agricultural trade. One of the major contributions of quantitative genetics to plant breeding is the ability to predict genetic gains. When different selection criteria are considered, predicting the gain for each criterion is very important because it guides breeders on how to use the available genetic material to obtain the maximum possible gain for the traits of interest. This study was set up in July 2004 at the Bananal do Norte Experimental Farm, run by Incaper, in the district of Pacotuba, municipality of Cachoeiro de Itapemirim, in the southern region of the state, with the aim of selecting the best plants among and within half-sib progenies of Coffea canephora using different selection criteria. Individual and joint analyses of variance were carried out for 26 half-sib progenies of Coffea canephora. The experimental design was randomized blocks with four additional checks, four replications and plots of five plants, spaced 3.0 m x 1.2 m. Data from the last five harvests were considered. The traits measured were: flowering, maturation, bean size, weight, plant size, vigour, rust, cercospora leaf spot, dieback, overall score, percentage of floater fruits and leaf miner. All statistical analyses were performed with the GENES software for genetics and statistics. Selection gains were estimated for a selection percentage of 20% among and within progenies, kept the same for all traits. 
All traits were selected in the positive direction, except flowering, plant size, rust, cercospora leaf spot, dieback, percentage of floater fruits and leaf miner, which were selected to decrease their original means. The selection criteria studied were: conventional selection among and within families, combined selection index, mass selection and stratified mass selection. This dissertation comprises two chapters, in which biometric analyses were carried out, such as the estimation of genetic parameters. For most of the traits studied, significant differences (P<0.05) were found among genotypes which, together with the genotypic coefficients of variation, the genotypic coefficient of determination and the CVg/CVe ratio, indicate genetic variability in the material for most traits and favourable conditions for obtaining genetic gains through selection. These traits were also correlated. The data were subjected to analysis of variance and multivariate analysis, applying clustering and UPGMA, tests of means and a correlation study. In the clustering, the Mahalanobis generalized distance was used as the dissimilarity measure and the Tocher method was used to delimit the groups. Genetic diversity was found for the traits associated with physiological quality, seed reserve mobilization, and seedling dimensions and biomass. Four groups of genotypes could be formed. Seed dry mass, seed reserve reduction and seedling dry mass are positively correlated with one another, whereas seed reserve reduction and the efficiency of converting those reserves into seedlings are negatively correlated. 
According to the results, all traits showed differing levels of genetic variability and the selection criteria used proved efficient for breeding; the combined selection index gave the best results in terms of gains and is indicated as the most appropriate criterion for the genetic improvement of the population studied. In the correlation studies, the phenotypic correlation exceeded the genotypic correlation in 70% of cases, showing a greater influence of environmental than genotypic factors and conditions favourable to the improvement of the different traits. In the genetic divergence study, clustering by the Tocher technique distributed the genotypes into three groups.
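The Mahalanobis generalized distance used above as a dissimilarity measure scales the difference between two genotype mean vectors by the inverse of the residual covariance matrix, so correlated traits with different variances are put on a common footing. The sketch below works this out for two traits only, to stay within the standard library. The trait values, the covariance matrix and the function name are hypothetical, and it does not reproduce the GENES software or the Tocher grouping.

```python
def mahalanobis_sq(mean_a, mean_b, cov):
    """Squared Mahalanobis distance between two 2-trait mean vectors.

    cov is the 2x2 (residual) covariance matrix as nested lists.
    """
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    if det == 0:
        raise ValueError("covariance matrix is singular")
    d1 = mean_a[0] - mean_b[0]
    d2 = mean_a[1] - mean_b[1]
    # d' * inverse(cov) * d, using the closed-form inverse of a 2x2 matrix.
    return (d1 * (s22 * d1 - s12 * d2) + d2 * (s11 * d2 - s21 * d1)) / det


# Hypothetical genotype means for two traits and a residual covariance matrix.
genotype_1 = [12.0, 3.1]
genotype_2 = [10.5, 3.9]
residual_cov = [[1.5, 0.2], [0.2, 0.4]]
print(round(mahalanobis_sq(genotype_1, genotype_2, residual_cov), 3))
```

Computing this for every pair of genotypes gives the dissimilarity matrix that a grouping method such as Tocher or UPGMA would take as input.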