47 results for bioinformatics
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
Bioinformatics is a rapidly evolving research field dedicated to analyzing and managing biological data with computational resources. This paper aims to overview some of the processes and applications currently implemented at CCiT-UB's Bioinformatics Unit, focusing mainly on the areas of Genomics, Transcriptomics and Proteomics.
Abstract:
Peer-reviewed
Abstract:
Despite the variety of available Web services registries specially aimed at Life Sciences, their scope is usually restricted to a limited set of well-defined types of services. While dedicated registries are generally tied to a particular format, general-purpose ones adhere more closely to standards and usually rely on Web Service Definition Language (WSDL). Although WSDL is flexible enough to support common Web services types, its lack of semantic expressiveness led to various initiatives to describe Web services via ontology languages. Nevertheless, WSDL 2.0 descriptions gained a standard representation based on Web Ontology Language (OWL). BioSWR is a novel Web services registry that provides standard Resource Description Framework (RDF) based Web services descriptions along with the traditional WSDL based ones. The registry provides a Web-based interface for Web services registration, querying and annotation, and is also accessible programmatically via a Representational State Transfer (REST) API or using the SPARQL Protocol and RDF Query Language. The BioSWR server is located at http://inb.bsc.es/BioSWR/ and its code is available at https://sourceforge.net/projects/bioswr/ under the LGPL license.
Abstract:
As the computing power of processing nodes grows, more and more data-intensive applications, such as bioinformatics applications, will be run on non-dedicated clusters. Non-dedicated clusters are characterized by their ability to combine the execution of local users' applications with scientific or commercial applications executed in parallel. Knowing what effect data-intensive applications have on a mix of other workload types (batch, interactive, SRT, etc.) in non-dedicated environments enables the development of more efficient scheduling policies. Some I/O-intensive applications are based on the MapReduce paradigm, in which the frameworks that run them, such as Hadoop, handle data locality and load balancing automatically and work with distributed file systems. Hadoop's performance can be improved without increasing hardware costs by tuning several key configuration parameters to the cluster specifications, the size of the input data and the complexity of the processing. Tuning these parameters can be too complex for the user and/or administrator, but it aims to ensure more suitable performance. This work proposes to evaluate the impact of I/O-intensive applications on job scheduling in non-dedicated clusters under the MPI and MapReduce paradigms.
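The MapReduce paradigm mentioned in this abstract can be illustrated with a minimal, single-process sketch. It does not use Hadoop or any Hadoop API; the function names (`map_kmers`, `shuffle`, `reduce_counts`, `count_kmers`) and the k-mer counting task are assumptions chosen only for illustration, and a real framework would distribute the map and reduce calls across nodes.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k=3):
    """Map phase: emit a (k-mer, 1) pair for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle phase: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce phase: sum the counts collected for one key."""
    return key, sum(values)


def count_kmers(reads, k=3):
    """Run map, shuffle and reduce in sequence (hypothetical driver)."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    return dict(reduce_counts(key, vals) for key, vals in shuffle(mapped).items())


if __name__ == "__main__":
    print(count_kmers(["ACGT", "CGTA"], k=3))
```

Because each map call touches one read and each reduce call touches one key, both phases can run in parallel, which is the property such frameworks exploit.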
Abstract:
Nowadays biology produces large quantities of data that only computing can handle. Bioinformatics applications are the most important analysis and comparison tool we have for understanding life and deciphering these data. This project focuses on the study of applications for aligning genetic sequences, and more specifically on two optimal algorithms based on dynamic programming: Needleman-Wunsch and Smith-Waterman. To improve the performance of these algorithms when aligning large sequences, we propose several implementation versions. We seek improvements in both time and space, and we exploit parallelism to obtain them. We compare the results of analyzing the versions to obtain the data needed to assess cost, gain and performance.
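As a rough illustration of the two dynamic-programming algorithms named in this abstract, the sketch below computes only the optimal alignment scores, keeping two rows of the matrix at a time so that memory grows with the shorter dimension rather than with the full table. The scoring values (match 1, mismatch -1, linear gap -1) are assumptions, and this is not one of the implementation versions proposed in the project.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of a and b (Needleman-Wunsch, linear gap cost)."""
    # Row 0: aligning the empty prefix of a against prefixes of b costs only gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against the empty string
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best local alignment score of a and b (Smith-Waterman, linear gap cost)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores never drop below 0, so a local alignment can restart anywhere.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


if __name__ == "__main__":
    print(needleman_wunsch_score("GATTACA", "GATTACA"))
    print(smith_waterman_score("XXGATTACAYY", "GATTACA"))
```

Cells on the same anti-diagonal depend only on earlier anti-diagonals, which is one common way such algorithms are parallelized; recovering the alignment itself, rather than just the score, needs additional bookkeeping not shown here.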
Abstract:
Sequence alignment applications are an important tool for the scientific community. These bioinformatics applications are used in many different fields, such as medicine, biology, pharmacology and genetics. Today, sequence alignment algorithms have high complexity and must handle an ever larger volume of data. For this reason, alternatives must be sought so that these applications can cope with the day-to-day growth of sequence databases. This project studies and investigates improvements to this type of application, such as the use of parallel systems, which can improve performance considerably.
Abstract:
A growing number of applications in scientific fields such as bioinformatics and the geosciences are written under the MapReduce model using open-source tools such as Apache Hadoop. This project arises from the need to integrate Hadoop into HPC environments so that applications developed under the MapReduce paradigm can be run there. Two frameworks designed to make this integration easier for developers are analyzed: HoD and myHadoop. The project analyzes both the range of environments these frameworks offer for running MapReduce applications and the performance of Hadoop clusters generated with HoD or myHadoop compared with a physical Hadoop cluster.
Abstract:
The recent revolution in techniques for generating genomic data has led to exponential growth in the amount of data generated, making work on optimizing the management and handling of this information more necessary than ever. This work tackles three aspects of the problem: the dissemination of information, the integration of data from diverse sources and, finally, its visualization. Building on the Distributed Annotation System (DAS), we have created an application for the automated creation of new data sources in a standardized, programmatically accessible format from simple data files. This software, easyDAS, is running at the European Bioinformatics Institute. The system facilitates and encourages the sharing and dissemination of genomic data in usable formats. jsDAS is a DAS client library that makes it quick and easy to incorporate DAS data into any web application. Taking advantage of what DAS offers, it can integrate data from multiple sources coherently and robustly. GenExp is a prototype of a highly interactive web-based genome browser that facilitates the exploration of genomes in real time. It can integrate data from any DAS source and create a client-side representation of them using the latest advances in web technologies.
Abstract:
We present building blocks for algorithms for the efficient reduction of square factors, i.e. direct repetitions in strings. The basic problem is this: given a string, compute all strings that can be obtained by reducing factors of the form zz to z. Two types of algorithms are treated: an offline algorithm is one that can compute a data structure on the given string in advance before the actual search for the square begins; in contrast, online algorithms receive all input only at the time when a request is made. For offline algorithms we treat the following problem: Let u and w be two strings such that w is obtained from u by reducing a square factor zz to only z. If we further are given the suffix table of u, how can we derive the suffix table for w without computing it from scratch? As the suffix table plays a key role in online algorithms for the detection of squares in a string, this derivation can make the iterated reduction of squares more efficient. On the other hand, we also show how a suffix array, used for the offline detection of squares, can be adapted to the new string resulting from the deletion of a square. Because the deletion is a very local change, this adaptation is more efficient than the computation of the new suffix array from scratch.
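The basic problem stated in this abstract can be made concrete with a brute-force sketch. It is not the paper's algorithm: it finds squares by direct substring comparison and rebuilds the suffix array naively, which is exactly the from-scratch recomputation the paper aims to avoid. All function names are hypothetical, and including the input string itself in the closure is an assumption.

```python
def reduce_squares_once(s):
    """Every string obtained from s by replacing one square factor zz with z."""
    results = set()
    n = len(s)
    for start in range(n):
        for half in range(1, (n - start) // 2 + 1):
            # s[start:start+half] is z; the next `half` characters must repeat it.
            if s[start:start + half] == s[start + half:start + 2 * half]:
                results.add(s[:start + half] + s[start + 2 * half:])
    return results


def reduce_squares_closure(s):
    """All strings reachable from s by iterated square reduction (s included)."""
    seen = {s}
    frontier = [s]
    while frontier:
        current = frontier.pop()
        for reduced in reduce_squares_once(current):
            if reduced not in seen:
                seen.add(reduced)
                frontier.append(reduced)
    return seen


def suffix_array(s):
    """Naive suffix array: start positions of suffixes in lexicographic order."""
    return sorted(range(len(s)), key=lambda i: s[i:])


if __name__ == "__main__":
    print(sorted(reduce_squares_closure("aabb")))
    print(suffix_array("banana"))
```

Each reduction deletes only one copy of z, so most suffixes keep their relative order; that locality is what makes updating an existing structure attractive compared with calling something like `suffix_array` again on every reduced string.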
Abstract:
A report of the 6th Georgia Tech-Oak Ridge National Lab International Conference on Bioinformatics 'In silico Biology: Gene Discovery and Systems Genomics', Atlanta, USA, 15-17 November, 2007.
Abstract:
Malaria in pregnancy forms a substantial part of the worldwide burden of malaria, with an estimated annual death toll of up to 200,000 infants, as well as increased maternal morbidity and mortality. Studies of genetic susceptibility to malaria have so far focused on infant malaria, with only a few studies investigating the genetic basis of placental malaria, focusing only on a limited number of candidate genes. The aim of this study therefore was to identify novel host genetic factors involved in placental malaria infection. To this end we carried out a nested case-control study on 180 Mozambican pregnant women with placental malaria infection, and 180 controls within an intervention trial of malaria prevention. We genotyped 880 SNPs in a set of 64 functionally related genes involved in glycosylation and innate immunity. A SNP located in the gene FUT9, rs3811070, was significantly associated with placental malaria infection (OR = 2.31, permutation p-value = 0.028). Haplotypic analysis revealed a similarly strong association of a common haplotype of four SNPs including rs3811070. FUT9 codes for a fucosyl-transferase that catalyzes the last step in the biosynthesis of the Lewis-x antigen, which forms part of the Lewis blood group-related antigens. These results therefore suggest an involvement of this antigen in the pathogenesis of placental malaria infection.
Abstract:
A large proportion of the death toll associated with malaria is a consequence of malaria infection during pregnancy, causing up to 200,000 infant deaths annually. We previously published the first extensive genetic association study of placental malaria infection, and here we extend this analysis considerably, investigating genetic variation in over 9,000 SNPs in more than 1,000 genes involved in immunity and inflammation for their involvement in susceptibility to placental malaria infection. We applied a new approach incorporating results from both single-gene analysis and gene-gene interactions on a protein-protein interaction network. We found suggestive associations of variants in the gene KLRK1 in the single-gene analysis, as well as evidence for associations of multiple members of the IL-7/IL-7R signalling cascade in the combined analysis. To our knowledge, this is the first large-scale genetic study on placental malaria infection to date, opening the door for follow-up studies trying to elucidate the genetic basis of this neglected form of malaria.
Abstract:
This paper presents the platform developed in the PANACEA project, a distributed factory that automates the stages involved in the acquisition, production, updating and maintenance of Language Resources required by Machine Translation and other Language Technologies. We adopt a set of tools that have been successfully used in the Bioinformatics field, adapt them to the needs of our field and use them to deploy web services, which can be combined to build more complex processing chains (workflows). This paper describes the platform and its different components (web services, registry, workflows, social network and interoperability). We demonstrate the scalability of the platform by carrying out a set of massive data experiments. Finally, a validation of the platform across a set of required criteria proves its usability for different types of users (non-technical users and providers).
Abstract:
Background: Diet plays a role in the development of the immune system, and polyunsaturated fatty acids can modulate the expression of a variety of genes. Human milk contains conjugated linoleic acid (CLA), a fatty acid that seems to contribute to immune development. Indeed, recent studies carried out in our group in suckling animals have shown that the immune function is enhanced after feeding them with an 80:20 isomer mix composed of c9,t11 and t10,c12 CLA. However, little work has been done on the effects of CLA on gene expression, and even less regarding immune system development in early life. Results: The expression profile of mesenteric lymph nodes from animals supplemented with CLA during gestation and suckling through dam's milk (Group A) or by oral gavage (Group B), supplemented just during suckling (Group C) and control animals (Group D) was determined with the aid of the specific GeneChip® Rat Genome 230 2.0 (Affymetrix). Bioinformatics analyses were performed using the GeneSpring GX software package v10.0.2 and led to the identification of 89 genes differentially expressed in all three dietary approaches. Generation of a biological association network evidenced several genes, such as connective tissue growth factor (Ctgf), tissue inhibitor of metalloproteinase 1 (Timp1), galanin (Gal), synaptotagmin 1 (Syt1), growth factor receptor bound protein 2 (Grb2), actin gamma 2 (Actg2) and smooth muscle alpha actin (Acta2), as highly interconnected nodes of the resulting network. Gene underexpression was confirmed by Real-Time RT-PCR. Conclusions: Ctgf, Timp1, Gal and Syt1, among others, are genes modulated by CLA supplementation that may have a role in mucosal immune responses in early life.