971 results for Genetic clustering analysis
Abstract:
We compare Bayesian methodology utilizing free-ware BUGS (Bayesian Inference Using Gibbs Sampling) with the traditional structural equation modelling approach based on another free-ware package, Mx. Dichotomous and ordinal (three category) twin data were simulated according to different additive genetic and common environment models for phenotypic variation. Practical issues are discussed in using Gibbs sampling as implemented by BUGS to fit subject-specific Bayesian generalized linear models, where the components of variation may be estimated directly. The simulation study (based on 2000 twin pairs) indicated that there is a consistent advantage in using the Bayesian method to detect a correct model under certain specifications of additive genetics and common environmental effects. For binary data, both methods had difficulty in detecting the correct model when the additive genetic effect was low (between 10 and 20%) or of moderate range (between 20 and 40%). Furthermore, neither method could adequately detect a correct model that included a modest common environmental effect (20%) even when the additive genetic effect was large (50%). Power was significantly improved with ordinal data for most scenarios, except for the case of low heritability under a true ACE model. We illustrate and compare both methods using data from 1239 twin pairs over the age of 50 years, who were registered with the Australian National Health and Medical Research Council Twin Registry (ATR) and presented symptoms associated with osteoarthritis occurring in joints of the hand.
Abstract:
In microarray studies, clustering techniques are often applied to derive meaningful insights from the data. In the past, hierarchical methods have been the primary clustering tool employed to perform this task. The hierarchical algorithms have been mainly applied heuristically to these cluster analysis problems. Further, a major limitation of these methods is their inability to determine the number of clusters. Thus there is a need for a model-based approach to these clustering problems. To this end, McLachlan et al. [7] developed a mixture model-based algorithm (EMMIX-GENE) for the clustering of tissue samples. To further investigate the EMMIX-GENE procedure as a model-based approach, we present a case study involving the application of EMMIX-GENE to the breast cancer data as studied recently in van 't Veer et al. [10]. Our analysis considers the problem of clustering the tissue samples on the basis of the genes, which is a non-standard problem because the number of genes greatly exceeds the number of tissue samples. We demonstrate how EMMIX-GENE can be useful in reducing the initial set of genes down to a more computationally manageable size. The results from this analysis also emphasise the difficulty associated with the task of separating two tissue groups on the basis of a particular subset of genes. These results also shed light on why supervised methods have such a high misallocation error rate for the breast cancer data.
Abstract:
Epstein-Barr virus (EBV)-encoded oncogene latent membrane protein (LMP) 1, which is consistently expressed in multiple EBV-associated malignancies, has been proposed as a potential target antigen for any future vaccine designed to control these malignancies. However, the high degree of genetic variation in the LMP1 sequence has been considered a major impediment for its use as a potential immunotherapeutic target for the treatment of EBV-associated malignancies. In the present study, we have employed a highly efficient strategy, based on ex vivo functional assays, to conduct an extensive sequence-wide analysis of LMP1-specific T-cell responses in a large panel of healthy virus carriers of diverse ethnic origin and nasopharyngeal carcinoma patients. By comparing the frequencies of T cells specific for overlapping peptides spanning LMP1, we mapped a number of novel HLA class I- and class II-restricted LMP1 T-cell epitopes, including an epitope with dual HLA class I restriction. More importantly, extensive sequence analysis of LMP1 revealed that the majority of the T-cell epitopes were highly conserved in EBV isolates from Caucasian, Papua New Guinean, African, and Southeast Asian populations, while unique geographically constrained genetic variation was observed within one HLA A2 supertype-restricted epitope. These findings indicate that conserved LMP1 epitopes should be considered in designing epitope-based immunotherapeutic strategies against EBV-associated malignancies in different ethnic populations.
Abstract:
This paper delineates the development of a prototype hybrid knowledge-based system for the optimum design of liquid retaining structures by coupling the blackboard architecture, an expert system shell (VISUAL RULE STUDIO) and a genetic algorithm (GA). Through custom-built interactive graphical user interfaces in a user-friendly environment, the user is guided through the design process, which includes preliminary design, load specification, model generation, finite element analysis, code compliance checking, and member sizing optimization. For structural optimization, the GA is applied to the minimum cost design of structural systems with discrete reinforced concrete sections. The design of a typical liquid retaining structure is illustrated. The results demonstrate very fast convergence, as near-optimal solutions are obtained after exploring only a small portion of the search space. This system can act as a consultant to assist novice designers in the design of liquid retaining structures.
Abstract:
This paper discusses a document discovery tool based on Conceptual Clustering by Formal Concept Analysis. The program allows users to navigate e-mail using a visual lattice metaphor rather than a tree. It implements a virtual file structure over e-mail where files and entire directories can appear in multiple positions. The content and shape of the lattice formed by the conceptual ontology can assist in e-mail discovery. The system described provides more flexibility in retrieving stored e-mails than what is normally available in e-mail clients. The paper discusses how conceptual ontologies can leverage traditional document retrieval systems and aid knowledge discovery in document collections.
Abstract:
Pili of Neisseria meningitidis are a key virulence factor, being the major adhesin of this capsulate organism and contributing to specificity for the human host. Pili are post-translationally modified by addition of either an O-linked trisaccharide, Gal (beta1-4) Gal (alpha1-3) 2,4-diacetamido-2,4,6-trideoxyhexose or an O-linked disaccharide Gal (alpha1,3) GlcNAc. The role of these structures in meningococcal pathogenesis has not been resolved. In previous studies we identified two separate genetic loci, pglA and pglBCD, involved in pilin glycosylation. Putative functions have been allocated to these genes; however, there are not enough genes to account for the complete biosynthesis of the described structures, suggesting additional genes remain to be identified. In addition, it is not known why some strains express the trisaccharide structure and some the disaccharide structure. In order to find additional genes involved in the biosynthesis of these structures, we used the recently published group A strain Z2491 and group B strain MC58 Neisseria meningitidis genomes and the unfinished Neisseria meningitidis group C strain FAM18 and Neisseria gonorrhoeae strain FA1090 genomes to identify novel genes involved in pilin glycosylation, based on homology to known oligosaccharide biosynthetic genes. We identified a new gene involved in pilin glycosylation designated pglE and examined four additional genes pglB/B2, pglF, pglG and pglH. A strain survey revealed that pglE and pglF were present in each strain examined. The pglG, pglH and pglB2 polymorphisms were not found in strain C311#3 but were present in a large number of clinical isolates. Insertional mutations were constructed in pglE and pglF in N. meningitidis strain C311#3, a strain with well-defined lipopolysaccharide (LPS) and pilin-linked glycan structures. 
Increased gel migration of the pilin subunit molecules of pglE and pglF mutants was observed by Western analysis, indicating truncation of the trisaccharide structure. Antisera specific for the C311#3 trisaccharide failed to react with pilin from these pglE and pglF mutants. GC-MS analysis of the sugar composition of the pglE mutant showed a reduction in galactose compared with C311#3 wild type. Analysis of amino acid sequence homologies has suggested specific roles for pglE and pglF in the biosynthesis of the trisaccharide structure. Further, we present evidence that pglE, which contains heptanucleotide repeats, is responsible for the phase variation between trisaccharide and disaccharide structures in strain C311#3 and other strains. We also present evidence that pglG, pglH and pglB2 are potentially phase variable.
Abstract:
Coffee cultivation is one of the most socioeconomically important agribusiness activities in world agricultural trade. One of the greatest contributions of quantitative genetics to plant breeding is the ability to predict genetic gains. When different selection criteria are considered, predicting the gains for each criterion is very important, because it guides breeders on how to use the available genetic material so as to obtain the greatest possible gains for the traits of interest. The present experiment was set up in July 2004 at the Bananal do Norte Experimental Farm, run by Incaper, in the district of Pacotuba, municipality of Cachoeiro de Itapemirim, in the southern region of the state, with the aim of selecting the best plants among and within half-sib progenies of Coffea canephora using different selection criteria. Individual and joint analyses of variance were carried out for 26 half-sib progenies of Coffea canephora. The experimental design was randomized blocks with four additional controls, four replications and plots of five plants, at a spacing of 3.0 m x 1.2 m. Data from the last five harvests were considered. The traits measured were: flowering, ripening, bean size, weight, plant size, vigor, rust, cercospora leaf spot, branch dieback, general score, percentage of floater fruits, and leaf miner. All statistical analyses were performed with the GENES software for genetics and statistics. Selection gains were estimated for a selection percentage of 20% among and within progenies, which was kept the same for all traits. 
All traits were selected in the positive direction, except flowering, plant size, rust, cercospora leaf spot, branch dieback, percentage of floater fruits, and leaf miner, which were selected to decrease their original means. The selection criteria studied were: conventional selection among and within families, the combined selection index, mass selection and stratified mass selection. This dissertation comprises two chapters, in which biometric analyses were performed, such as the estimation of genetic parameters. For most of the traits studied, significant differences (P<0.05) were found among genotypes which, together with the genotypic coefficients of variation, the genotypic coefficient of determination and the CVg/CVe ratio, indicate genetic variability in the genetic material for most traits and favorable conditions for obtaining genetic gains through selection. These traits were also correlated. The data were subjected to analysis of variance and multivariate analysis, applying clustering techniques and UPGMA, tests of means and a study of correlations. For clustering, the Mahalanobis generalized distance was used as the dissimilarity measure, and Tocher's method was used to delimit the groups. Genetic diversity was found for traits associated with physiological quality, seed reserve mobilization, and seedling dimensions and biomass. Four groups of genotypes could be formed. Seed dry mass, seed reserve reduction and seedling dry mass are positively correlated with one another, whereas seed reserve reduction and the efficiency of converting these reserves into seedlings are negatively correlated. 
According to the results, all traits showed different levels of genetic variability and the selection criteria used proved efficient for breeding; the combined selection index gave the best results in terms of gains and is indicated as the most appropriate criterion for the genetic improvement of the population studied. In the correlation studies, in 70% of cases the phenotypic correlation was higher than the genotypic correlation, showing a greater influence of environmental factors relative to genotypic ones, and conditions favorable to the improvement of the different traits. In the study of genetic divergence, clustering of the genotypes by Tocher's method indicated that the genotypes were distributed in three groups.
Abstract:
In this investigation, a cluster analysis was used to separate Guimarães (Portugal) residents into clusters according to their perceptions of the impacts of tourism development. This approach is rarely applied to Portuguese data and is even rarer for world heritage sites. The world heritage designation is believed to make an area more attractive to tourists. The clustering procedure analysed 400 observations from a survey of Guimarães residents and revealed the existence of three clusters: the Sceptics, the Moderately Optimistic and the Enthusiasts. The results were consistent with the empirical literature, with the emergent nature of the destination found to be relevant. The fact that tourism is relatively recent in this destination is reflected mainly in the tendency of most residents to downplay the negative impacts of tourism development.
Abstract:
Genetic diversity in a collection of 64 sugar apple accessions collected from different municipalities in northern Minas Gerais was assessed by RAPD analysis. Using 20 selected RAPD primers 167 fragments were generated, of which 48 were polymorphic (28.7%) producing an average of 2.4 polymorphic fragments per primer. Low percentage of polymorphism (< 29%) was observed by using the set of primers indicating low level of genetic variation among the 64 accessions evaluated. Genetic relationships were estimated using Jaccard's coefficient of similarity. Accessions from different municipalities clustered together indicating no correlation between molecular grouping and geographical origin. The dendrogram revealed five clusters. The first cluster grouped C19 and G29 accessions collected from the municipalities of Verdelândia and Monte Azul, respectively. The second cluster grouped G16 and B11 accessions collected from the municipalities of Monte Azul and Coração de Jesus, respectively. The remaining accessions were grouped in three clusters, with 8, 15 and 37 accessions, respectively. In summary, RAPD showed a low percentage of polymorphism in the germplasm collection.
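As a rough illustration of the two quantities named in this abstract, the sketch below computes Jaccard's similarity coefficient for a pair of presence/absence band profiles and reproduces the polymorphism percentage from the reported counts (48 of 167 fragments). The band profiles `acc_a` and `acc_b` are hypothetical and are not taken from the study; only the counts 48 and 167 come from the abstract.

```python
def jaccard(a, b):
    """Jaccard similarity between two 0/1 band profiles of equal length.

    Shared absences (0, 0) are ignored, which is why the coefficient is a
    common choice for dominant markers such as RAPD.
    """
    if len(a) != len(b):
        raise ValueError("profiles must score the same set of bands")
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Convention assumed here: two profiles with no bands at all are identical.
    return shared / either if either else 1.0


def percent_polymorphic(polymorphic, total):
    """Share of scored fragments that are polymorphic, in percent."""
    return 100.0 * polymorphic / total


# Hypothetical band profiles for two accessions (1 = band present).
acc_a = [1, 1, 0, 1, 0, 1]
acc_b = [1, 0, 0, 1, 1, 1]

print(jaccard(acc_a, acc_b))                   # 3 shared bands out of 5 present in either
print(round(percent_polymorphic(48, 167), 1))  # counts reported in the abstract
```

A dendrogram such as the one described would then be built from the matrix of pairwise similarities; the abstract does not state which linkage method was used, so that step is left out here.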
Abstract:
Understanding the genetic variability of a species is crucial for the progress of a genetic breeding program and requires characterization and evaluation of germplasm. This study aimed to characterize and evaluate 101 tomato subsamples of the Salad group (fresh market) and two commercial controls, one of the Salad group (cv. Fanny) and another of the Santa Cruz group (cv. Santa Clara). Four experiments were conducted in a randomized block design with three replications and five plants per plot. The joint analysis of variance was performed and characteristics with significant complex interaction between control and experiment were excluded. Subsequently, the multicollinearity diagnostic test was carried out and characteristics that contributed to severe multicollinearity were excluded. The relative importance of each characteristic for genetic divergence was calculated by Singh's method (Singh, 1981), and the less important ones were excluded according to Garcia (1998). Results showed large genetic divergence among the subsamples for morphological, agronomic and organoleptic characteristics, indicating potential for genetic improvement. The characteristics total soluble solids, mean number of good fruits per plant, endocarp thickness, mean mass of marketable fruit per plant, total acidity, mean number of unmarketable fruit per plant, internode diameter, internode length, main stem thickness and leaf width contributed little to the genetic divergence between the subsamples and may be excluded in future studies.
Abstract:
Background: Regulating mechanisms of branching morphogenesis of fetal lung rat explants have been an essential tool for molecular research. This work presents a new methodology to accurately quantify the epithelium, outer contour and peripheral airway buds of lung explants during cellular development from microscopic images. Methods: The outer contour was defined using an adaptive and multi-scale threshold algorithm whose level was automatically calculated based on an entropy maximization criterion. The inner lung epithelium was defined by a clustering procedure that groups small image regions according to the minimum description length principle and local statistical properties. Finally, the number of peripheral buds was counted as the skeleton branched ends from a skeletonized image of the inner lung epithelium. Results: The time for lung branching morphometric analysis was reduced by 98% compared with the manual method. Best results were obtained in the first two days of cellular development, with lower standard deviations. Non-significant differences were found between the automatic and manual results in all culture days. Conclusions: The proposed method introduces a series of advantages related to its intuitive use and accuracy, making the technique suitable for images with different lighting characteristics and allowing a reliable comparison between different researchers.
Abstract:
Regulating mechanisms of branching morphogenesis of fetal lung rat explants have been an essential tool for molecular research. This work presents a new methodology to accurately quantify the epithelium, outer contour, and peripheral airway buds of lung explants during cellular development from microscopic images. Methods. The outer contour was defined using an adaptive and multiscale threshold algorithm whose level was automatically calculated based on an entropy maximization criterion. The inner lung epithelium was defined by a clustering procedure that groups small image regions according to the minimum description length principle and local statistical properties. Finally, the number of peripheral buds was counted as the skeleton branched ends from a skeletonized image of the inner lung epithelium. Results. The time for lung branching morphometric analysis was reduced by 98% compared with the manual method. Best results were obtained in the first two days of cellular development, with lower standard deviations. Nonsignificant differences were found between the automatic and manual results in all culture days. Conclusions. The proposed method introduces a series of advantages related to its intuitive use and accuracy, making the technique suitable for images with different lighting characteristics and allowing a reliable comparison between different researchers.
Abstract:
A previously developed model is used to numerically simulate real clinical cases of the surgical correction of scoliosis. This model consists of one-dimensional finite elements with spatial deformation in which (i) the column is represented by its axis; (ii) the vertebrae are assumed to be rigid; and (iii) the deformability of the column is concentrated in springs that connect the successive rigid elements. The metallic rods used for the surgical correction are modeled by beam elements with linear elastic behavior. To obtain the forces at the connections between the metallic rods and the vertebrae, geometrically non-linear finite element analyses are performed. The tightening sequence determines the magnitude of the forces applied to the patient's column, and it is desirable to keep those forces as small as possible. In this study, a Genetic Algorithm optimization is applied to this model in order to determine the sequence that minimizes the corrective forces applied during the surgery. This amounts to finding the optimal permutation of the integers 1, ... , n, n being the number of vertebrae involved. As such, we are faced with a combinatorial optimization problem isomorphic to the Traveling Salesman Problem. The fitness evaluation requires one computationally intensive Finite Element Analysis per candidate solution and, thus, a parallel implementation of the Genetic Algorithm is developed.
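The search over permutations of 1, ..., n described in this abstract can be sketched with a small permutation-coded genetic algorithm. Everything below is an assumption made for illustration: the abstract does not specify the operators, so order crossover, swap mutation and truncation selection are stand-ins, and `toy_cost` is a hypothetical placeholder for the finite element analysis that the paper uses as its fitness function.

```python
import random


def order_crossover(p1, p2, rng):
    """Order crossover (OX): keep a slice of p1, fill the rest in p2's order."""
    n = len(p1)
    i, j = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[i:j + 1] = p1[i:j + 1]
    kept = set(p1[i:j + 1])
    fill = [gene for gene in p2 if gene not in kept]
    for k in range(n):
        if child[k] is None:
            child[k] = fill.pop(0)
    return child


def swap_mutation(perm, rng, rate=0.2):
    """With probability `rate`, exchange two positions of the sequence."""
    perm = perm[:]
    if rng.random() < rate:
        a, b = rng.sample(range(len(perm)), 2)
        perm[a], perm[b] = perm[b], perm[a]
    return perm


def toy_cost(perm):
    """Hypothetical cost, NOT the paper's model: distance from the identity order."""
    return sum(abs(value - (pos + 1)) for pos, value in enumerate(perm))


def genetic_search(n, cost, generations=200, pop_size=30, seed=0):
    """Minimize `cost` over permutations of 1..n; returns the best one found."""
    rng = random.Random(seed)
    population = [rng.sample(range(1, n + 1), n) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=cost)
        parents = population[:pop_size // 2]  # truncation selection, keeps the elite
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            children.append(swap_mutation(order_crossover(p1, p2, rng), rng))
        population = parents + children
    return min(population, key=cost)


best = genetic_search(8, toy_cost)
print(best, toy_cost(best))
```

Because each candidate's cost is evaluated independently of the others, the evaluations within a generation are the natural unit to distribute across workers, which is consistent with the parallel implementation the abstract mentions; this sketch evaluates them serially.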
Abstract:
OBJECTIVE: To estimate the incidence rate of type 1 diabetes in the urban area of Santiago, Chile, from March 21, 1997 to March 20, 1998, and to assess the spatio-temporal clustering of cases during that period. METHODS: All sixty-one incident cases were located temporally (day of diagnosis) and spatially (place of residence) in the area of study. Knox's method was used to assess spatio-temporal clustering of incident cases. RESULTS: The overall incidence rate of type 1 diabetes was 4.11 cases per 100,000 children aged less than 15 years per year (95% confidence interval: 3.06-5.14). The incidence rate seems to have increased since the last estimate of the incidence calculated for the years 1986-1992 in the metropolitan region of Santiago. Different combinations of space-time intervals were evaluated to assess spatio-temporal clustering. The smallest p-value was found for the combination of critical distances of 750 meters and 60 days (uncorrected p-value = 0.048). CONCLUSIONS: Although these are preliminary results regarding space-time clustering in Santiago, exploratory analysis of the data would suggest a possible aggregation of incident cases in space-time coordinates.
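The idea behind Knox's method can be sketched as counting the pairs of cases that are close in both space and time, then asking how unusual that count is. The sketch below uses the critical distances quoted in the abstract (750 meters, 60 days) but hypothetical case coordinates. The abstract does not say how the p-value was obtained, so the Monte Carlo permutation of diagnosis days used here is an assumption, not the study's procedure.

```python
import random
from itertools import combinations
from math import hypot


def knox_statistic(cases, max_dist, max_days):
    """Number of case pairs within max_dist (meters) and max_days of each other.

    cases: list of (x_m, y_m, day) tuples.
    """
    close = 0
    for (x1, y1, t1), (x2, y2, t2) in combinations(cases, 2):
        if hypot(x1 - x2, y1 - y2) <= max_dist and abs(t1 - t2) <= max_days:
            close += 1
    return close


def knox_permutation_p(cases, max_dist, max_days, n_perm=999, seed=0):
    """Monte Carlo p-value: shuffle diagnosis days over the fixed locations."""
    rng = random.Random(seed)
    observed = knox_statistic(cases, max_dist, max_days)
    days = [t for _, _, t in cases]
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(days)
        shuffled = [(x, y, t) for (x, y, _), t in zip(cases, days)]
        if knox_statistic(shuffled, max_dist, max_days) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least + 1) / (n_perm + 1)


# Hypothetical cases: (x in meters, y in meters, day of diagnosis).
cases = [(0, 0, 0), (100, 100, 10), (5000, 5000, 12), (200, 0, 300), (9000, 100, 305)]

observed, p_value = knox_permutation_p(cases, max_dist=750, max_days=60)
print(observed, p_value)
```

Trying several distance and time thresholds and reporting the smallest p-value, as the abstract describes, inflates the chance of a false positive, which is presumably why that p-value is labeled uncorrected.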
Abstract:
This paper presents five different clustering methods to identify typical load profiles of medium voltage (MV) electricity consumers. These methods are intended to be used in a smart grid environment to extract useful knowledge about customers' behaviour. The obtained knowledge can be used to support a decision tool, not only for utilities but also for consumers. Load profiles can be used by the utilities to identify the aspects that cause system load peaks and enable the development of specific contracts with their customers. The framework presented throughout the paper consists of several steps, namely the data pre-processing phase, the application of clustering algorithms and the evaluation of the quality of the partition, which is supported by cluster validity indices. The process ends with the analysis of the discovered knowledge. To validate the proposed framework, a case study with a real database of 208 MV consumers is used.
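The three framework steps named in this abstract (pre-processing, clustering, validity assessment) can be sketched end to end on toy data. The abstract names neither the five clustering methods nor the validity indices, so the choices below are assumptions: peak normalization for pre-processing, k-means with a deterministic farthest-point start for clustering, and the Davies-Bouldin index for validity. The four load curves are hypothetical.

```python
from math import sqrt


def normalize(profile):
    """Scale a load curve by its peak so that shapes, not magnitudes, are compared."""
    peak = max(profile)
    return [v / peak for v in profile] if peak else list(profile)


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_profile(profiles):
    return [sum(col) / len(profiles) for col in zip(*profiles)]


def kmeans(profiles, k, iterations=20):
    """Plain k-means; starts from the first profile plus farthest-point picks."""
    centroids = [profiles[0]]
    while len(centroids) < k:
        centroids.append(max(profiles, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = []
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in profiles]
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster empties
                centroids[c] = mean_profile(members)
    return labels, centroids


def davies_bouldin(profiles, labels, centroids):
    """Davies-Bouldin index (lower is better); assumes non-empty, distinct clusters."""
    k = len(centroids)
    scatter = []
    for c in range(k):
        members = [p for p, lab in zip(profiles, labels) if lab == c]
        scatter.append(sum(sqrt(sq_dist(p, centroids[c])) for p in members) / len(members))
    worst = [
        max((scatter[i] + scatter[j]) / sqrt(sq_dist(centroids[i], centroids[j]))
            for j in range(k) if j != i)
        for i in range(k)
    ]
    return sum(worst) / k


# Hypothetical load curves (kW at four times of day) for four consumers.
raw = [[10, 80, 100, 20], [22, 150, 200, 30], [90, 20, 10, 100], [50, 8, 5, 45]]
profiles = [normalize(p) for p in raw]

labels, centroids = kmeans(profiles, k=2)
db = davies_bouldin(profiles, labels, centroids)
print(labels, db)
```

In a comparison of several clustering methods, an index like this would be computed for each method and each candidate number of clusters, and the resulting centroids would serve as the typical load profiles.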