926 results for Genomic data integration
Abstract:
Diagnostic methods are an important tool in regression analysis for detecting anomalies in fitted models, such as departures from the error assumptions and the presence of outliers and influential observations. Assuming censored data, we considered a classical analysis and a Bayesian analysis with noninformative priors for the parameters of a model with a cure fraction. The Bayesian approach used Markov chain Monte Carlo methods with Metropolis-Hastings algorithm steps to obtain the posterior summaries of interest. Several influence methods, such as local influence, total local influence of an individual, local influence on predictions and generalized leverage, were derived, analyzed and discussed for survival data with a cure fraction and covariates. The relevance of the approach was illustrated with a real data set, in which removing the most influential observations changed the decision about which model best fits the data.
Abstract:
QTL mapping provides useful information for breeding programs, since it allows the estimation of the genomic locations and genetic effects of chromosomal regions related to the expression of quantitative traits. The objective of this study was to map QTL related to several agronomically important traits associated with grain yield: ear weight (EW), prolificacy (PROL), ear number (NE), ear length (EL) and diameter (ED), number of rows on the ear (NRE) and number of kernels per row on the ear (NKPR). Four hundred F2:3 tropical maize progenies were evaluated in five environments in Piracicaba, Sao Paulo, Brazil. The genetic map was previously estimated and had 117 microsatellite loci with an average distance of 14 cM. Data were analyzed using Composite Interval Mapping for each trait. Thirty six QTL were mapped and related to the expression of EW (2), PROL (3), NE (2), EL (5), ED (5), NRE (10), NKPR (5). Few QTL were mapped since there was high GxE interaction. Traits EW, PROL and NE showed high genetic correlation with grain yield, and several QTL mapped to similar genomic regions, which could cause the observed correlation. However, further analyses using appropriate statistical models are required to separate linked from pleiotropic QTL. Five QTL (named Ew1, Ne1, Ed3, Nre3 and Nre10) had high genetic effects, explaining from 10.8% (Nre3) to 16.9% (Nre10) of the phenotypic variance, and could be considered in further studies.
Abstract:
Some factors complicate comparisons between linkage maps from different studies. This problem can be resolved if measures of precision, such as confidence intervals and frequency distributions, are associated with markers. We examined the precision of distances and ordering of microsatellite markers in the consensus linkage maps of chromosomes 1, 3 and 4 from two F2 reciprocal Brazilian chicken populations, using bootstrap sampling. Single and consensus maps were constructed. The consensus map was compared with the International Consensus Linkage Map and with the whole genome sequence. Some loci showed segregation distortion and missing data, but this did not affect the analyses negatively. Several inversions and position shifts were detected, based on 95% confidence intervals and frequency distributions of loci. Some discrepancies in distances between loci and in ordering were due to chance, whereas others could be attributed to other effects, including reciprocal crosses, sampling error of the founder animals from the two populations, F2 population structure, number of and distance between microsatellite markers, number of informative meioses, loci segregation patterns, and sex. In the Brazilian consensus GGA1, locus LEI1038 was in a position closer to the true genome sequence than in the International Consensus Map, whereas for GGA3 and GGA4, no such differences were found. Extending these analyses to the remaining chromosomes should facilitate comparisons and the integration of several available genetic maps, allowing meta-analyses for map construction and quantitative trait loci (QTL) mapping. The precision of the estimates of QTL positions and their effects would be increased with such information.
Abstract:
Xylella fastidiosa is a Gram-negative plant pathogen that causes many economically important diseases. Analyses of completely sequenced X. fastidiosa genome strains allowed the identification of many prophage-like elements and possible phage remnants, accounting for up to 15% of the genome composition. To better evaluate the recent evolution of the X. fastidiosa chromosome backbone among distinct pathovars, the number and location of prophage-like regions in two finished genomes (9a5c and Temecula1) and in two candidate molecules (Ann1 and Dixon) were assessed. Based on comparative best bidirectional hit analyses, the majority (51%) of the predicted genes in the X. fastidiosa prophage-like regions are related to structural phage genes belonging to the Siphoviridae family. Electron micrographs reveal the existence of putative viral particles with morphology similar to lambda phages in the bacterial cell in planta. Moreover, analysis of microarray data indicates that the 9a5c strain cultivated under stress conditions shows enhanced expression of phage anti-repressor genes, suggesting that phages switch from the lysogenic to the lytic cycle under stress-induced situations. Furthermore, virulence-associated proteins and toxins are found within these prophage-like elements, suggesting an important role in host adaptation. Finally, clustering analyses of phage integrase genes based on multiple alignment patterns reveal that they group into five lineages, all possessing a tyrosine recombinase catalytic domain and phylogenetically close to other integrases found in phages that are genetic mosaics and able to perform generalized and specialized transduction. Integration sites and tRNA association are also evidenced. In summary, we present comparative and experimental evidence supporting the association and contribution of phage activity to the differentiation of Xylella genomes.
Abstract:
We present the genome sequences of a new clinical isolate of the important human pathogen, Aspergillus fumigatus, A1163, and two closely related but rarely pathogenic species, Neosartorya fischeri NRRL181 and Aspergillus clavatus NRRL1. Comparative genomic analysis of A1163 with the recently sequenced A. fumigatus isolate Af293 has identified core, variable and up to 2% unique genes in each genome. While the core genes are 99.8% identical at the nucleotide level, identity for variable genes can be as low as 40%. The most divergent loci appear to contain heterokaryon incompatibility (het) genes associated with fungal programmed cell death such as developmental regulator rosA. Cross-species comparison has revealed that 8.5%, 13.5% and 12.6%, respectively, of A. fumigatus, N. fischeri and A. clavatus genes are species-specific. These genes are significantly smaller in size than core genes, contain fewer exons and exhibit a subtelomeric bias. Most of them cluster together in 13 chromosomal islands, which are enriched for pseudogenes, transposons and other repetitive elements. At least 20% of A. fumigatus-specific genes appear to be functional and involved in carbohydrate and chitin catabolism, transport, detoxification, secondary metabolism and other functions that may facilitate the adaptation to heterogeneous environments such as soil or a mammalian host. Contrary to what was suggested previously, their origin cannot be attributed to horizontal gene transfer (HGT), but instead is likely to involve duplication, diversification and differential gene loss (DDL). The role of duplication in the origin of lineage-specific genes is further underlined by the discovery of genomic islands that seem to function as designated "gene dumps" and, perhaps, simultaneously, as "gene factories".
Abstract:
With the development of technology, mainly the Internet, more and more electronic services are being offered to customers in all areas of business, especially information services such as virtual libraries. This article proposes a new way of providing services to virtual library customers and presents a methodology for implementing electronic services oriented by these customers' life situations. Analytical observation of several national virtual library sites indicated that offering services based on life situations and relationship interest situations can promote the service to customers, providing greater satisfaction and, consequently, improving the quality of the information services offered. Visits to those sites and critical analysis of the data collected during the visits, supported by the results of bibliographic research, enabled the description of this methodology. We conclude that providing services in isolation or according to the user's profile on virtual library sites is not always enough to meet the needs and expectations of customers, which suggests offering these services based on life situations and relationship interest situations as a complement that adds value to the virtual library business. The methodology is relevant because it indicates new opportunities to provide quality virtual library services and serves as a guide for managers of information providers, enabling new means of access to information services for such customers, with a focus on proactivity and service integration, in order to solve real problems definitively.
Abstract:
Background: The post-genomic era has brought new challenges regarding the understanding of the organization and function of the human genome. Many of these challenges are centered on the meaning of differential gene regulation under distinct biological conditions and can be addressed by analyzing the Multiple Differential Expression (MDE) of genes associated with normal and abnormal biological processes. Currently, MDE analyses are limited to the usual differential expression methods, which were initially designed for paired analysis. Results: We propose a web platform named ProbFAST for MDE analysis, which uses Bayesian inference to identify key genes that are intuitively prioritized by means of probabilities. A simulation study revealed that our method performs better than other approaches, and when applied to public expression data it showed the flexibility to obtain relevant genes biologically associated with normal and abnormal biological processes. Conclusions: ProbFAST is a freely accessible web-based application that enables MDE analysis on a global scale. It offers an efficient methodological approach for MDE analysis of a set of genes that are turned on and off, related to functional information, during the evolution of a tumor or tissue differentiation. The ProbFAST server can be accessed at http://gdm.fmrp.usp.br/probfast.
Abstract:
Large-conductance Ca(2+)-activated K(+) channels (BK) play a fundamental role in modulating membrane potential in many cell types. The gating of BK channels and its modulation by Ca(2+) and voltage have been the subject of intensive research for almost three decades, yielding several of the most complicated kinetic mechanisms ever proposed. These mechanisms are characterized by a large number of open and closed states arranged, respectively, in two planes, named tiers. Transitions between states in the same plane are cooperative and modulated by Ca(2+). Transitions across planes are highly concerted and voltage-dependent. Here we reexamine the validity of the two-tiered hypothesis by restricting attention to the modulation by Ca(2+). Large single-channel data sets at five Ca(2+) concentrations were simultaneously analyzed from a Bayesian perspective by using hidden Markov models and Markov chain Monte Carlo stochastic integration techniques. Our results support a dramatic reduction in model complexity, favoring a simple mechanism derived from the Monod-Wyman-Changeux allosteric model for homotetramers that is able to explain the Ca(2+) modulation of the gating process. This model differs from the standard Monod-Wyman-Changeux scheme in that it distinguishes whether two Ca(2+) ions are bound to adjacent or diagonal subunits of the tetramer.
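For orientation, the standard Monod-Wyman-Changeux scheme that the abstract takes as its starting point can be sketched in a few lines. This is only the textbook scheme with four identical, independent binding sites; it does not implement the adjacent/diagonal distinction that the abstract's model adds. The function name, parameter names and any numeric values are hypothetical and are not taken from the paper.

```python
def mwc_open_probability(ca, l0, k_closed, k_open, n_sites=4):
    """Equilibrium open probability in the standard MWC scheme.

    ca        -- Ca2+ concentration (same units as the dissociation constants)
    l0        -- [closed]/[open] equilibrium constant with no Ca2+ bound
    k_closed  -- Ca2+ dissociation constant of a site in the closed conformation
    k_open    -- Ca2+ dissociation constant of a site in the open conformation
    n_sites   -- number of identical binding sites (4 for a homotetramer)
    """
    # Statistical weight of all open states (0..n_sites ions bound).
    open_weight = (1.0 + ca / k_open) ** n_sites
    # Statistical weight of all closed states, scaled by l0.
    closed_weight = l0 * (1.0 + ca / k_closed) ** n_sites
    return open_weight / (open_weight + closed_weight)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the shape of the curve.
    for ca in (0.0, 1.0, 10.0, 100.0):
        print(ca, mwc_open_probability(ca, l0=1000.0, k_closed=10.0, k_open=1.0))
```

When Ca2+ binds more tightly to the open conformation (k_open < k_closed), the open probability rises monotonically with concentration, which is the qualitative behavior the allosteric picture is meant to capture.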
Abstract:
We consider a nontrivial one-species population dynamics model with finite and infinite carrying capacities. Time-dependent intrinsic and extrinsic growth rates are considered in these models. Through the model per capita growth rate we obtain a heuristic general procedure to generate scaling functions to collapse data into a simple linear behavior even if an extrinsic growth rate is included. With this data collapse, all the models studied become independent from the parameters and initial condition. Analytical solutions are found when time-dependent coefficients are considered. These solutions allow us to perceive nontrivial transitions between species extinction and survival and to calculate the transition's critical exponents. Considering an extrinsic growth rate as a cancer treatment, we show that the relevant quantity depends not only on the intensity of the treatment, but also on when the cancerous cell growth is maximum.
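The idea of collapsing data onto a single straight line can be illustrated with the simplest finite-carrying-capacity case. The abstract does not give its model equations, so the sketch below assumes the plain Verhulst (logistic) model with constant coefficients, dN/dt = r N (1 - N/K); the function names and parameter values are hypothetical. For this model the scaled quantity ln[(K/N - 1)/(K/N0 - 1)] equals -r t for every choice of r, K and N0.

```python
import math


def verhulst(t, n0, r, k):
    """Closed-form solution of dN/dt = r*N*(1 - N/K) with N(0) = n0."""
    return k / (1.0 + (k / n0 - 1.0) * math.exp(-r * t))


def collapsed(n, n0, k):
    """Scaled variable that equals -r*t on every Verhulst trajectory.

    Requires n0 != k and n != k so that the ratio inside the log is defined.
    """
    return math.log((k / n - 1.0) / (k / n0 - 1.0))


if __name__ == "__main__":
    # Two hypothetical parameter sets; both fall on the same line y = -x
    # when the scaled variable is plotted against x = r*t.
    for n0, r, k in [(1.0, 0.5, 100.0), (20.0, 2.0, 50.0)]:
        for t in (0.5, 1.0, 2.0):
            n = verhulst(t, n0, r, k)
            print(r * t, collapsed(n, n0, k))
```

Plotting the scaled variable against r t removes the dependence on the parameters and the initial condition, which is the sense of "data collapse" used here; the paper's procedure covers more general models than this one.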
Abstract:
Background: Genome-wide association studies (GWAS) are becoming the approach of choice to identify genetic determinants of complex phenotypes and common diseases. The astonishing amount of data generated and the use of distinct genotyping platforms with variable genomic coverage remain analytical challenges. Imputation algorithms combine information from directly genotyped markers with the haplotypic structure of the population of interest to infer a poorly genotyped or missing marker, and are considered a near-zero-cost approach for comparing and combining data generated in different studies. Several reports have stated that imputed markers have an overall acceptable accuracy, but no published report has performed a pairwise comparison of imputed and empirical association statistics for a complete set of GWAS markers. Results: In this report we identified a total of 73 imputed markers that yielded a nominally statistically significant association at P < 10^-5 for type 2 diabetes mellitus and compared them with results based on empirical allelic frequencies. Interestingly, despite their overall high correlation, association statistics based on imputed frequencies were discordant in 35 of the 73 (47%) associated markers, considerably inflating the type I error rate of imputed markers. We comprehensively tested several quality thresholds, the haplotypic structure underlying imputed markers and the use of flanking markers as predictors of inaccurate association statistics derived from imputed markers. Conclusions: Our results suggest that association statistics from imputed markers that fall in specific minor allele frequency (MAF) ranges, are located in weak linkage disequilibrium blocks or deviate strongly from local patterns of association are prone to inflated false-positive association signals.
The present study highlights the potential of imputation procedures and proposes simple procedures for selecting the best imputed markers for follow-up genotyping studies.
Abstract:
Introduction: The successful integration of stem cells in the adult brain has become a central issue in modern neuroscience. In this study we sought to test the hypothesis that survival and neurodifferentiation of mesenchymal stem cells (MSCs) may depend on microenvironmental conditions according to the site of implant in the brain. Methods: MSCs were isolated from adult rats and labeled with enhanced green fluorescent protein (eGFP) lentivirus. A cell suspension was implanted stereotactically into the brain of 50 young rats, into one neurogenic area (hippocampus) and into one nonneurogenic area (striatum). Animals were sacrificed 6 or 12 weeks after surgery, and brains were stained for mature neuronal markers. Cells coexpressing NeuN (neuronal-specific nuclear protein) and GFP (green fluorescent protein) were counted stereologically at both targets. Results: The isolated cell population was able to generate neurons positive for microtubule-associated protein 2 (MAP2), neuronal-specific nuclear protein (NeuN), and neurofilament 200 (NF200) in vitro. Electrophysiology confirmed expression of voltage-gated ionic channels. Once implanted into the hippocampus, cells survived for up to 12 weeks, migrated away from the graft, and gave rise to mature neurons able to synthesize neurotransmitters. By contrast, massive cell degeneration was seen in the striatum, with no significant migration. Induction of neuronal differentiation with increased cyclic adenosine monophosphate in the culture medium before implantation favored differentiation in vivo. Conclusions: Our data demonstrated that survival and differentiation of MSCs are strongly dependent upon a permissive microenvironment. Identification of the pro-neurogenic factors present in the hippocampus could subsequently allow for the integration of stem cells into nonpermissive areas of the central nervous system.
Abstract:
Based on pre-DNA racial/color methodology, clinical and pharmacological trials have traditionally considered the different geographical regions of Brazil as being very heterogeneous. We wished to ascertain how such diversity of regional color categories correlated with ancestry. Using a panel of 40 validated ancestry-informative insertion-deletion DNA polymorphisms, we estimated individually the European, African and Amerindian ancestry components of 934 self-categorized White, Brown or Black Brazilians from the four most populous regions of the country. We unraveled great ancestral diversity between and within the different regions. In particular, color categories in the northern part of Brazil diverged significantly in their ancestry proportions from their counterparts in the southern part of the country, indicating that diverse regional semantics were being used in the self-classification as White, Brown or Black. To circumvent these regional subjective differences in color perception, we estimated the general ancestry proportions of each of the four regions in a form independent of color considerations. For that, we multiplied the proportions of a given ancestry in a given color category by the official census information about the proportion of that color category in the specific region, to arrive at a "total ancestry" estimate. Once such a calculation was performed, there emerged a much higher level of uniformity than previously expected. In all regions studied, the European ancestry was predominant, with proportions ranging from 60.6% in the Northeast to 77.7% in the South. We propose that the immigration of six million Europeans to Brazil in the 19th and 20th centuries, a phenomenon described and intended as the "whitening of Brazil", is in large part responsible for dissipating previous ancestry dissimilarities that reflected region-specific population histories.
These findings, of both clinical and sociological importance for Brazil, should also be relevant to other countries with ancestrally admixed populations.
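The "total ancestry" calculation described in this abstract is a census-weighted average, and a minimal sketch may make it concrete. The abstract states only the multiplication step; summing the products over the color categories of a region is assumed here as the natural reading. The function name and every number below are hypothetical and are not data from the study.

```python
def total_ancestry(ancestry_by_color, color_share):
    """Census-weighted ancestry proportions for one region.

    ancestry_by_color -- {color: {ancestry: proportion}}, each inner dict sums to 1
    color_share       -- {color: census proportion of that color in the region}
    """
    totals = {}
    for color, share in color_share.items():
        for ancestry, proportion in ancestry_by_color[color].items():
            # Multiply the ancestry proportion within a color category by the
            # census share of that category, then accumulate over categories.
            totals[ancestry] = totals.get(ancestry, 0.0) + share * proportion
    return totals


if __name__ == "__main__":
    # Hypothetical inputs for a single region (not values from the paper).
    ancestry_by_color = {
        "White": {"European": 0.80, "African": 0.10, "Amerindian": 0.10},
        "Brown": {"European": 0.60, "African": 0.25, "Amerindian": 0.15},
        "Black": {"European": 0.40, "African": 0.50, "Amerindian": 0.10},
    }
    color_share = {"White": 0.5, "Brown": 0.4, "Black": 0.1}
    print(total_ancestry(ancestry_by_color, color_share))
```

Because the weights come from census shares rather than from how individuals in a given region label themselves relative to other regions, the resulting estimate can be compared across regions without relying on a shared meaning of the color categories.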
Abstract:
Background: The inherent complexity of statistical methods and clinical phenomena compels researchers with diverse domains of expertise to work in interdisciplinary teams, in which no member has complete knowledge of the counterparts' fields. As a result, knowledge exchange is often characterized by miscommunication that leads to misinterpretation, ultimately resulting in errors in research and even in clinical practice. Although communication has a central role in interdisciplinary collaboration and miscommunication can have a negative impact on research processes, to the best of our knowledge no study has yet explored how data analysis specialists and clinical researchers communicate over time. Methods/Principal Findings: We conducted a qualitative analysis of encounters between clinical researchers and data analysis specialists (epidemiologist, clinical epidemiologist, and data mining specialist). These encounters were recorded and systematically analyzed using a grounded theory methodology for extraction of emerging themes, followed by data triangulation and analysis of negative cases for validation. A policy analysis was then performed using a system dynamics methodology, looking for potential interventions to improve this process. Four major emerging themes were found. Definitions using lay language were frequently employed as a way to bridge the language gap between the specialties. Thought experiments presented a series of "what if" situations that helped clarify how the method or information from the other field would behave if exposed to alternative situations, ultimately aiding in explaining their main objective. Metaphors and analogies were used to translate concepts across fields, from the unfamiliar to the familiar. Prolepsis was used to anticipate study outcomes, thus helping specialists understand the current context based on an understanding of their final goal.
Conclusion/Significance: The communication between clinical researchers and data analysis specialists presents multiple challenges that can lead to errors.
Abstract:
Introduction: Work disability is a major consequence of rheumatoid arthritis (RA), associated not only with traditional disease activity variables, but also more significantly with demographic, functional, occupational, and societal variables. Recent reports suggest that the use of biologic agents offers potential for reduced work disability rates, but the conclusions are based on surrogate disease activity measures derived from studies primarily from Western countries. Methods: The Quantitative Standard Monitoring of Patients with RA (QUEST-RA) multinational database of 8,039 patients in 86 sites in 32 countries, 16 with high gross domestic product (GDP) (>24K US dollars (USD) per capita) and 16 low-GDP countries (<11K USD), was analyzed for work and disability status at onset and over the course of RA and clinical status of patients who continued working or had stopped working in high-GDP versus low-GDP countries according to all RA Core Data Set measures. Associations of work disability status with RA Core Data Set variables and indices were analyzed using descriptive statistics and regression analyses. Results: At the time of first symptoms, 86% of men (range 57%-100% among countries) and 64% (19%-87%) of women <65 years were working. More than one third (37%) of these patients reported subsequent work disability because of RA. Among 1,756 patients whose symptoms had begun during the 2000s, the probabilities of continuing to work were 80% (95% confidence interval (CI) 78%-82%) at 2 years and 68% (95% CI 65%-71%) at 5 years, with similar patterns in high-GDP and low-GDP countries. Patients who continued working versus stopped working had significantly better clinical status for all clinical status measures and patient self-report scores, with similar patterns in high-GDP and low-GDP countries. However, patients who had stopped working in high-GDP countries had better clinical status than patients who continued working in low-GDP countries. 
The most significant identifier of work disability in all subgroups was Health Assessment Questionnaire (HAQ) functional disability score. Conclusions: Work disability rates remain high among people with RA during this millennium. In low-GDP countries, people remain working with high levels of disability and disease activity. Cultural and economic differences between societies affect work disability as an outcome measure for RA.
Abstract:
Background: Understanding the genetic diversity of the human immunodeficiency virus type 1 (HIV-1) is critical to lay the groundwork for the design of successful drugs or vaccines. In this study we aimed to characterize and define the molecular prevalence of HIV-1 subclade F1 currently circulating in Sao Paulo, Brazil. Methods: A total of 36 samples were selected from 888 adult patients residing in Sao Paulo who had previously been diagnosed in two independent studies in our laboratory as being infected with subclade F1 based on pol subgenomic fragment sequencing. Proviral DNA was amplified from the purified genomic DNA of all 36 blood samples by PCR of 5 overlapping fragments followed by direct sequencing. Sequence data were obtained from the 5 fragments of pure subclade F1, and phylogenetic trees were constructed and compared with previously published sequences. Subclade F1 sequences that exhibited mosaic structure with other subtypes were omitted from any further analysis. Results: Fragment amplification and sequencing confirmed that only 5 of the sequences classified as subclade F1 from the pol region remained subclade F1 across the whole genome, giving an estimated true prevalence of 0.56%. The results also showed a single phylogenetic cluster of the Brazilian subclade F1 along with non-Brazilian South American isolates in both the subgenomic and the full-length genome analyses, with an overall intrasubtype nucleotide divergence of 6.9%. The nucleotide differences within the South American and Central African F1 strains, in the C2-C3 region of env, were 8.5% and 12.3%, respectively. Conclusion: Altogether, our findings showed a surprisingly low prevalence rate of subclade F1 in Brazil and suggest that these isolates originated in Central Africa and were subsequently introduced to South America.