224 results for bioinformatic
Abstract:
In the present paper, we introduce BioPatML.NET, an application library for the Microsoft Windows .NET framework [2] that implements the BioPatML pattern definition language and sequence search engine. BioPatML.NET is integrated with the Microsoft Biology Foundation (MBF) application library [3], unifying the parsers and annotation services supported or emerging through MBF with the language, search framework and pattern repository of BioPatML. End users who wish to exploit the BioPatML.NET engine and repository without engaging the services of a programmer may do so via the freely accessible web-based BioPatML Editor, which we describe below.
Abstract:
This study aimed to identify new peptide antigens from Chlamydia (C.) trachomatis in a proof-of-concept approach that could be used to develop an epitope-based serological diagnostic for C. trachomatis-related infertility in women. A bioinformatics analysis was conducted on several immunodominant proteins from C. trachomatis to identify predicted immunoglobulin epitopes unique to C. trachomatis. A peptide array of these epitopes was screened against participant sera. The participants (all female) were categorized into the following cohorts based on their infection and gynecological history: acute (single treated infection with C. trachomatis), multiple (more than one C. trachomatis infection, all treated), sequelae (PID or tubal infertility with a history of C. trachomatis infection), and infertile (no history of C. trachomatis infection and no detected tubal damage). The bioinformatics strategy identified several promising epitopes. Participants who reacted positively in the peptide 11 ELISA were more likely to be in the sequelae cohort than in the infertile cohort, with an odds ratio of 16.3 (95% CI 1.65–160), 95% specificity and 46% sensitivity (0.19–0.74). The peptide 11 ELISA has the potential to be further developed as a screening tool for use during the early IVF work-up, and provides proof of concept that further peptide antigens may be identified using bioinformatics and screening approaches.
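The odds ratio, sensitivity and specificity reported above are all derived from a 2×2 table of test result against cohort. The following is a minimal sketch of that arithmetic, not the study's own analysis: the function name `two_by_two_summary` and the counts in the usage line are hypothetical, and the log-based (Woolf) confidence interval is an assumption, since the abstract does not state how its interval was computed.

```python
import math


def two_by_two_summary(tp, fp, fn, tn, z=1.96):
    """Summarise a 2x2 diagnostic table.

    tp: test-positive participants in the case cohort
    fp: test-positive participants in the comparison cohort
    fn: test-negative participants in the case cohort
    tn: test-negative participants in the comparison cohort
    z:  normal quantile for the interval (1.96 gives roughly 95%)
    """
    odds_ratio = (tp * tn) / (fp * fn)
    # Woolf method (an assumption here): standard error of log(OR).
    se_log_or = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    log_or = math.log(odds_ratio)
    return {
        "odds_ratio": odds_ratio,
        "ci_low": math.exp(log_or - z * se_log_or),
        "ci_high": math.exp(log_or + z * se_log_or),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts for illustration only; they are not the study's data.
summary = two_by_two_summary(tp=10, fp=2, fn=5, tn=20)
print(summary)
```

With small cell counts the interval becomes very wide, which is consistent with the broad interval quoted in the abstract.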
Abstract:
Rapid advances in sequencing technologies (Next Generation Sequencing or NGS) have led to a vast increase in the quantity of bioinformatics data available, and this increasing scale presents enormous challenges to researchers seeking to identify complex interactions. This paper is concerned with the domain of transcriptional regulation, and the use of visualisation to identify relationships between specific regulatory proteins (the transcription factors or TFs) and their associated target genes (TGs). We present preliminary work from an ongoing study which aims to determine the effectiveness of different visual representations and large-scale displays in supporting discovery. Following an iterative process of implementation and evaluation, representations were tested by potential users in the bioinformatics domain to determine their efficacy, and to understand better the range of ad hoc practices among bioinformatics-literate users. Results from two rounds of small-scale user studies are considered, with initial findings suggesting that bioinformaticians require richly detailed views of TF data, features to compare TF layouts between organisms quickly, and ways to keep track of interesting data points.
Abstract:
C1q is the first subcomponent of the classical pathway in the complement system and a major link between innate and acquired immunity. The globular C1q (gC1q) domain has also been found in many non-complement C1q-domain-containing (C1qDC) proteins, which have a crystal structure similar to that of the multifunctional tumor necrosis factor (TNF) ligand family and have diverse functions. In this study, we identified a total of 52 independent gene sequences encoding C1qDC proteins through comprehensive searches of zebrafish genome, cDNA and EST databases. Compared with 31 orthologous genes in human and different numbers in other species, this suggests significant selective pressure during vertebrate evolution. The domain organization of C1qDC proteins mainly includes a leading signal peptide, a collagen-like region of variable length, and a C-terminal C1q domain. There are 11 highly conserved residues within the C1q domain, 2 of which are invariant within the zebrafish gene set. More extensive database searches also revealed homologous C1qDC proteins in other vertebrates, invertebrates and even bacteria, but no homologous sequences encoding C1qDC proteins were found in many species that have a more recent evolutionary history with zebrafish. Therefore, further studies on C1qDC genes among different species will help us understand the evolutionary mechanisms of innate and acquired immunity.
Abstract:
This article presents a new method for predicting viral resistance to seven protease inhibitors from the HIV-1 genotype, and for identifying the positions in the protease gene at which the specific nature of the mutation affects resistance. The neural network Analog ARTMAP predicts protease inhibitor resistance from viral genotypes. A feature selection method detects genetic positions that contribute to resistance both alone and through interactions with other positions. This method has identified positions 35, 37, 62, and 77, where traditional feature selection methods have not detected a contribution to resistance. At several positions in the protease gene, mutations confer differing degrees of resistance, depending on the specific amino acid to which the sequence has mutated. To find these positions, an Amino Acid Space is introduced to represent genes in a vector space that captures the functional similarity between amino acid pairs. Feature selection identifies several new positions, including 36, 37, and 43, with amino acid-specific contributions to resistance. Analog ARTMAP networks applied to inputs that represent specific amino acids at these positions perform better than networks that use only mutation locations.
Abstract:
Objective: Diabetic nephropathy (DN) is a microvascular complication of diabetes. Members of the WNT/β-catenin pathways have been implicated in interstitial fibrosis and glomerular sclerosis, characteristic hallmarks of DN. These processes are controlled, in part, by transcription factors (TFs), proteins which bind to gene promoter regions attenuating their regulation. We sought to identify predicted cis-acting transcription factor binding sites (TFBS) over-represented within the promoter regions of WNT pathway members compared to genes across the genome. Methods: We assessed the frequency of 62 TFBS motifs from the JASPAR databases on 65 WNT pathway genes. P-values were estimated on the hypergeometric distribution for each TF. Gene expression profiles of enriched motifs were examined from DN-related datasets to assess clinical significance. Results: TFBS motifs for transcription factor AP-2 alpha (TFAP2A), myeloid zinc finger 1 (MZF1), and specificity protein 1 (SP1) were significantly enriched within WNT pathway genes (P-values < 6.83 × 10^-29, 1.34 × 10^-11 and 3.01 × 10^-6, respectively). MZF1 gene expression was significantly increased in DN in a whole kidney dataset (fold change = 1.16; 16% increase; P = 0.03). TFAP2A gene expression was decreased in an independent dataset (fold change = -1.02; P = 0.03). SP1 was not differentially expressed in any datasets examined. Conclusions: Three TFBS profiles are significantly enriched within the WNT pathway genes examined, highlighting the use of in silico analyses for identifying key regulators of this pathway. Modification of TF binding to gene promoter regions involved in DN pathology may limit progression, making refinement of targeted therapeutic strategies possible through clearer delineation of their role.
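The enrichment P-values in the Methods above come from the hypergeometric distribution. As a rough sketch of how such an over-representation test can be computed, the function below returns the upper-tail probability of seeing at least `k` motif-bearing genes in a gene set. The function name, argument names and the counts in the usage line are hypothetical; the abstract gives the gene set size (65) but not the background size or the per-motif counts.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """Upper-tail hypergeometric P-value, P(X >= k).

    N: genes in the background (e.g. genome-wide promoter set)
    K: background genes whose promoter contains the motif
    n: genes in the set of interest (e.g. pathway members)
    k: genes in the set of interest whose promoter contains the motif
    """
    if not (0 <= k <= n <= N and 0 <= K <= N):
        raise ValueError("counts must satisfy 0 <= k <= n <= N and 0 <= K <= N")
    total = comb(N, n)  # number of ways to draw the gene set
    upper = min(n, K)   # largest possible overlap
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible overlaps contribute nothing to the sum.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Hypothetical counts for illustration only: 65 pathway genes, 40 of them
# carrying the motif, against an assumed background of 20000 genes of
# which 3000 carry the motif.
print(hypergeom_enrichment_p(k=40, n=65, K=3000, N=20000))
```

Exact integer arithmetic is used here to avoid floating-point underflow for very small tail probabilities; a statistics library would normally be preferred when many motifs are tested and multiple-testing correction is needed.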
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Here I focus on three main topics that best represent the projects I worked on during my three-year PhD, which I spent in different research laboratories addressing computationally and practically important problems in modern molecular genomics. The first topic is the use of a livestock species (pigs) as a model of obesity, a complex human dysfunction. My efforts here concern the detection and annotation of Single Nucleotide Polymorphisms. I developed a pipeline for mining human and porcine sequences. Starting from a set of human genes related to obesity, the platform returns a list of annotated porcine SNPs extracted from a new set of potential obesity genes. 565 of these SNPs were analyzed on an Illumina chip to test their involvement in obesity in a population of more than 500 pigs. Results will be discussed. The computational analysis and the experiments were done in collaboration with the Biocomputing group and Dr. Luca Fontanesi, respectively, under the direction of Prof. Rita Casadio at the University of Bologna, Italy. The second topic concerns developing a methodology, based on Factor Analysis, to simultaneously mine information from different levels of biological organization. With specific test cases we develop models of the complexity of the mRNA-miRNA molecular interaction in brain tumors, measured indirectly by microarray and quantitative PCR. This work was done under the supervision of Prof. Christine Nardini at the "CAS-MPG Partner Institute for Computational Biology" in Shanghai, China (co-founded by the Max Planck Society and the Chinese Academy of Sciences). The third topic concerns the development of a new method to replace the variety of PCR technologies routinely adopted to characterize unknown flanking DNA regions of a viral integration locus in the human genome after clinical gene therapy.
This new method is entirely based on next generation sequencing; it reduces the time required to detect insertion sites and decreases the complexity of the procedure. This work was done in collaboration with the group of Dr. Manfred Schmidt at the Nationales Centrum für Tumorerkrankungen (Heidelberg, Germany), supervised by Dr. Annette Deichmann and Dr. Ali Nowrouzi. Furthermore, I add as an Appendix the description of an R package for gene network reconstruction that I helped to develop for scientific usage (http://www.bioconductor.org/help/bioc-views/release/bioc/html/BUS.html).
Abstract:
The aging process is characterized by a progressive decline in fitness at all levels of physiological organization, from single molecules up to the whole organism. Studies have confirmed inflammaging, a chronic low-level inflammation, as a deeply intertwined partner of the aging process, which may provide the "common soil" upon which age-related diseases develop and flourish. Thus, although inflammation per se is a physiological process, it can rapidly become detrimental if it goes out of control, causing an excess of local and systemic inflammatory response, a striking risk factor for the elderly population. Developing interventions to counteract the establishment of this state is thus a top priority. Diet, among other factors, is a good candidate to regulate inflammation. Building on this consideration, the EU project NU-AGE is now trying to assess whether a Mediterranean diet, fortified for the needs of the elderly population, may help in modulating inflammaging. To do so, NU-AGE enrolled a total of 1250 subjects, half of whom followed a 1-year diet, and characterized them by means of the most advanced -omics and non-omics analyses. The aim of this thesis was the development of a solid data management pipeline able to cope efficiently with the results of these assays, which are now flowing into a centralized database, ready to be used to test the most disparate scientific hypotheses. At the same time, the work described here encompasses the data analysis of the GEHA project, which was focused on identifying the genetic determinants of longevity, with a particular focus on developing and applying a method for detecting epistatic interactions in human mtDNA. Finally, in an effort to promote the adoption of NGS technologies in everyday pipelines, we developed an NGS variant calling pipeline devoted to solving the sequencing-related issues of mtDNA.