970 results for Genomics -- Data processing
Abstract:
The human body is composed of a huge number of cells acting together in a concerted manner. The current understanding is that proteins perform most of the activities necessary to keep a cell alive. The DNA, on the other hand, stores in the genome the information on how to produce the different proteins. Regulating gene transcription is thus the first important step that can affect the life of a cell and modify its functions and its responses to the environment. Regulation is a complex operation that involves specialized proteins, the transcription factors. Transcription factors (TFs) can bind to DNA and activate the processes leading to the expression of genes into new proteins. Errors in this process may lead to disease. In particular, some transcription factors have been associated with cancer, a lethal pathological state characterized by uncontrolled cellular proliferation, invasion of healthy tissues and abnormal responses to stimuli. Understanding cancer-related regulatory programs is a difficult task, often involving several TFs interacting together and influencing each other's activity. This thesis presents new computational methodologies to study gene regulation. In addition, we present applications of our methods to the understanding of cancer-related regulatory programs. The understanding of transcriptional regulation is a major challenge. We address this difficult question by combining computational approaches with large collections of heterogeneous experimental data. In detail, we design signal processing tools to recover transcription factor binding sites on the DNA from genome-wide surveys such as chromatin immunoprecipitation assays on tiling arrays (ChIP-chip). We then use the localization information about TF binding to explain the expression levels of regulated genes. In this way we identify a regulatory synergy between two TFs, the oncogene C-MYC and SP1.
C-MYC and SP1 bind preferentially at promoters, and when SP1 binds next to C-MYC on the DNA, the nearby gene is strongly expressed. The association between the two TFs at promoters is reflected by the conservation of the binding sites across mammals and by the permissive underlying chromatin states; it represents an important control mechanism involved in cellular proliferation, and thereby in cancer. Secondly, we identify the characteristics of the target genes of the TF estrogen receptor alpha (hERa) and we study the influence of hERa in regulating transcription. Upon estrogen signaling, hERa binds to DNA to regulate transcription of its targets in concert with its co-factors. To overcome the scarcity of experimental data about the binding sites of other TFs that may interact with hERa, we conduct an in silico analysis of the sequences underlying the ChIP sites using the collection of position weight matrices (PWMs) of the hERa partners FOXA1 and SP1. We combine ChIP-chip and ChIP-paired-end-diTags (ChIP-PET) data on hERa binding to DNA with the sequence information to explain gene expression levels in a large collection of cancer tissue samples and in studies of the response of cells to estrogen. We confirm that hERa binding sites are distributed throughout the genome. However, we distinguish between binding sites near promoters and binding sites along the transcripts. The first group shows weak binding of hERa and a high occurrence of SP1 motifs, in particular near estrogen-responsive genes. The second group shows strong binding of hERa and a significant correlation between the number of binding sites along a gene and the strength of gene induction in the presence of estrogen. Some binding sites of the second group also show the presence of FOXA1, but the role of this TF still needs to be investigated. Different mechanisms have been proposed to explain hERa-mediated induction of gene expression.
Our work supports the model in which hERa activates gene expression from distal binding sites by interacting with promoter-bound TFs such as SP1. hERa has been associated with survival rates of breast cancer patients, though explanatory models are still incomplete; this result is important for better understanding how hERa can control gene expression. Thirdly, we address the difficult question of regulatory network inference. We tackle this problem by analyzing time series of biological measurements such as mRNA levels or protein concentrations. Our approach uses well-established penalized linear regression models in which we impose sparseness on the connectivity of the regulatory network. We extend this method by enforcing the coherence of the regulatory dependencies: a TF must behave coherently as an activator or as a repressor on all its targets. This requirement is implemented as constraints on the signs of the regression coefficients in the penalized linear regression model. Our approach is better at reconstructing meaningful biological networks than previous methods based on penalized regression. The method is tested on the DREAM2 challenge of reconstructing a five-gene/TF regulatory network, where it obtains the best performance in the "undirected signed excitatory" category. Thus, these bioinformatics methods, which are reliable, interpretable and fast enough to cover large biological datasets, have enabled us to better understand gene regulation in humans.
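The sign-constrained sparse regression described in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes a first-order linear model in which each gene's level at time t+1 is regressed on all levels at time t, it assumes the activator/repressor sign of each TF is supplied in advance (the abstract does not say how the signs are chosen), and it solves the problem with a simple projected proximal-gradient loop. The names `sign_constrained_lasso`, `infer_network`, `signs` and `lam` are hypothetical.

```python
import numpy as np

def sign_constrained_lasso(X, y, signs, lam=0.05, n_iter=2000):
    """Minimise (1/2n)*||y - X w||^2 + lam*||w||_1 subject to signs[j]*w[j] >= 0."""
    n = X.shape[0]
    Xs = X * signs  # flip columns so that every transformed coefficient must be >= 0
    step = n / np.linalg.norm(Xs, 2) ** 2  # 1 / Lipschitz constant of the gradient
    u = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = Xs.T @ (Xs @ u - y) / n
        # Proximal step for the L1 penalty restricted to non-negative values
        u = np.maximum(0.0, u - step * (grad + lam))
    return signs * u  # map back to the original sign convention

def infer_network(series, signs, lam=0.05):
    """series: (T, G) time series. Returns A with A[i, j] = effect of gene j on gene i.

    Assumption: a first-order model x(t+1) ~ A x(t); every column j of A shares
    the sign signs[j], i.e. each regulator is coherently an activator or a repressor.
    """
    X, Y = series[:-1], series[1:]
    rows = [sign_constrained_lasso(X, Y[:, i], signs, lam)
            for i in range(series.shape[1])]
    return np.vstack(rows)

# Synthetic data only; no biological meaning is implied.
rng = np.random.default_rng(0)
series = rng.normal(size=(50, 5))
signs = np.array([1.0, -1.0, 1.0, 1.0, -1.0])  # assumed: +1 activator, -1 repressor
A = infer_network(series, signs)
print(np.round(A, 3))
```

Flipping the columns by the assumed signs turns the constraint into plain non-negativity, so the L1 penalty and the sign constraint are handled together by a single clipping step.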
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-06
Abstract:
Maltose-binding protein is the periplasmic component of the ABC transporter responsible for the uptake of maltose/maltodextrins. The Xanthomonas axonopodis pv. citri maltose-binding protein MalE has been crystallized at 293 K using the hanging-drop vapour-diffusion method. The crystal belonged to the primitive hexagonal space group P6(1)22, with unit-cell parameters a = 123.59, b = 123.59, c = 304.20 angstrom, and contained two molecules in the asymmetric unit. It diffracted to 2.24 angstrom resolution.
Abstract:
Networked control systems (NCSs) are spatially distributed systems in which the communication between sensors, actuators and controllers occurs through a shared band-limited digital communication network. However, the use of a shared communication network, in contrast to several dedicated independent connections, introduces new challenges, which are even more acute in large-scale and dense networked control systems. In this paper we investigate a recently introduced technique for gathering information from a dense sensor network for use in networked control applications. Efficiently obtaining an approximate interpolation of the sensed data offers a good tradeoff between the accuracy of the measured input signals and the delay to actuation, both of which are important for the quality of control. We introduce a variation on the state-of-the-art algorithms which we prove performs relatively better because it takes into account the changes of the input signal over time while obtaining the approximate interpolation.
Abstract:
Cooperating objects (COs) is a recently coined term used to signify the convergence of classical embedded computer systems, wireless sensor networks and robotics and control. We present essential elements of a reference architecture for scalable data processing for the CO paradigm.
Abstract:
Analog single event transients (ASETs) are caused by the interaction of a heavy ion or a high-energy proton with a sensitive device in an analog circuit. The interaction of the ion with a bipolar or MOS field-effect transistor induces electron-hole pairs that produce spikes, which can propagate to the output of the analog component and cause transients that may induce system-level failures. The most serious problems caused by this phenomenon occur in the space environment, which is very rich in heavy ions. Typical cases are the on-board computers of satellites and other spacecraft. However, owing to the continuous shrinking of transistor dimensions (which brings increased sensitivity), this phenomenon has begun to be observed at sea level, caused mainly by the impact of atmospheric neutrons. These effects can cause severe problems in computer systems with analog interfaces from which they obtain data for processing, and they have become one of the most serious problems facing designers of very-large-scale integration systems. Typical cases are systems on chip that include high-performance processing modules such as analog interfaces.
The general objective of the project is to study the susceptibility of computer systems to ASETs in their analog sections and to propose strategies for mitigating the resulting errors. The specific objectives are:
- To propose new ASET models based on device-level simulations solved by the finite element method.
- To use the models to identify the sections most prone to producing errors, which are therefore candidates for the application of radiation-hardening techniques.
- To use these models to study the nature of the errors produced in data processing systems.
- To propose novel solutions for mitigating these effects within the analog circuits themselves, preventing their propagation to the digital sections.
- To propose solutions for mitigating the effects at the system level.
To carry out the project, a bottom-up procedure is proposed for the research, starting with descriptions at the physical level and then raising the level of abstraction at which the circuit is modeled. The MOS devices will be modeled physically and the models solved by the finite element method. Injecting charge into the sensitive regions of the models will make it possible to determine the profiles of the current pulses that must be injected at the circuit level to emulate these effects. These procedures will be carried out for the different building blocks of the analog interfaces, proposing error mitigation strategies at different levels.
The expected results of the project include hardware for error detection and tolerance to this type of event, making it possible to increase the reliability of information processing systems, as well as new data on the effects of radiation on semiconductors, new transient fault models that allow these events to be simulated at the circuit level, and the identification of sensitive regions of typical analog interfaces that must be radiation-hardened.
Abstract:
This work describes data processing in a flaw detector with a combined multisectional eddy-current transducer and a heterofrequency magnetic field. The application of this method to detecting flaws in rods and pipes under conditions of significant transverse displacement is described.
Abstract:
Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
This article analyses the results of an empirical study of the 200 most popular UK-based websites in various sectors of e-commerce services. The study provides empirical evidence of unlawful processing of personal data. It comprises a survey of the methods used to seek and obtain consent to process personal data for direct marketing and advertising, and a test of the frequency of unsolicited commercial emails (UCE) received by customers as a consequence of registering with a website and submitting personal information to it. Part One of the article presents a conceptual and normative account of data protection, with a discussion of the ethical values on which EU data protection law is grounded and an outline of the elements that must be in place to seek and obtain valid consent to process personal data. Part Two discusses the outcomes of the empirical study, which reveals a significant gap between EU legal theory and practice in data protection. Although a large majority of the websites in the sample (69%) have a system in place to ask for separate consent to marketing activities, only 16.2% of them obtain consent that is valid under the standards set by EU law. The test with UCE shows that only one out of three websites (30.5%) respects the will of the data subject not to receive commercial communications. It also shows that, when submitting personal data in online transactions, there is a high probability (50%) of encountering a website that will ignore the refusal of consent and send UCE. The article concludes that there is a severe lack of compliance by UK online service providers with essential requirements of data protection law. In this respect, it suggests that the standard of implementation, information and supervision by the UK authorities is inadequate, especially in light of the clarifications provided at EU level.
Abstract:
Background: Expression microarrays are increasingly used to obtain large-scale transcriptomic information on a wide range of biological samples. Nevertheless, there is still much debate on the best ways to process data, to design experiments and to analyse the output. Furthermore, many of the more sophisticated mathematical approaches to data analysis in the literature remain inaccessible to much of the biological research community. In this study we examine ways of extracting and analysing a large data set obtained using the Agilent long oligonucleotide transcriptomics platform, applied to a set of human macrophage and dendritic cell samples. Results: We describe and validate a series of data extraction, transformation and normalisation steps which are implemented via a new R function. Analysis of replicate normalised reference data demonstrates that intra-array variability is small (only around 2% of the mean log signal), while inter-array variability from replicate array measurements has a standard deviation (SD) of around 0.5 log2 units (6% of the mean). The common practice of working with ratios of Cy5/Cy3 signal offers little further improvement in terms of reducing error. Comparison with expression data obtained using Arabidopsis samples demonstrates that the large number of genes in each sample showing a low level of transcription reflects the real complexity of the cellular transcriptome. Multidimensional scaling is used to show that the processed data identify an underlying structure which reflects some of the key biological variables that define the data set. This structure is robust, allowing reliable comparison of samples collected over a number of years and by a variety of operators. Conclusions: This study outlines a robust and easily implemented pipeline for extracting, transforming, normalising and visualising transcriptomic array data from the Agilent expression platform.
The analysis is used to obtain quantitative estimates of the SD arising from experimental (non biological) intra- and interarray variability, and for a lower threshold for determining whether an individual gene is expressed. The study provides a reliable basis for further more extensive studies of the systems biology of eukaryotic cells.
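The variability estimates quoted in this abstract can be illustrated with a short sketch. It is not the R function the authors describe. It assumes raw intensities arranged as one row per replicate array and one column per probe, and it uses median alignment as a stand-in for the unspecified normalisation step. The names `log2_signal`, `median_align`, `replicate_sd` and `sd_percent_of_mean` are hypothetical.

```python
import numpy as np

def log2_signal(raw, floor=1.0):
    """Log2-transform raw intensities, flooring at `floor` to avoid log of values <= 0."""
    return np.log2(np.maximum(raw, floor))

def median_align(log_signal):
    """Shift each array (row) so its median equals the median of the per-array medians.

    Assumption: a simple stand-in for the normalisation step, which the
    abstract does not specify.
    """
    per_array = np.median(log_signal, axis=1, keepdims=True)
    return log_signal - per_array + np.median(per_array)

def replicate_sd(log_signal):
    """Mean over probes of the sample SD across replicate arrays (rows)."""
    return float(np.mean(np.std(log_signal, axis=0, ddof=1)))

def sd_percent_of_mean(log_signal):
    """Replicate SD expressed as a percentage of the mean log signal."""
    return 100.0 * replicate_sd(log_signal) / float(np.mean(log_signal))

# Two replicate arrays, two probes; the second array is uniformly 4x brighter.
raw = np.array([[4.0, 16.0],
                [16.0, 64.0]])
logged = log2_signal(raw)
aligned = median_align(logged)
print(replicate_sd(logged), replicate_sd(aligned))
```

In this toy input the only difference between the two arrays is a global intensity shift, so the inter-array SD drops to zero once the medians are aligned; real replicate data would retain a residual SD after alignment.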
Abstract:
A student from the Data Processing program at the New York Trade School is shown working. Black and white photograph with some edge damage due to writing in black along the top.
Abstract:
Felice Gigante, a graduate of the New York Trade School Electronics program, works on a machine in his job as Data Processing Customer Engineer for the International Business Machines Corp. Original caption reads, "Felice Gigante - Electronices, International Business Machines Corp." Black and white photograph with caption glued to reverse.
Abstract:
GPS technology is now embedded in portable, low-cost electronic devices to track the movements of mobile objects. This has greatly affected the transportation field by creating a novel and rich source of traffic data on the road network. Although GPS devices hold significant promise for overcoming problems such as underreporting, respondent fatigue, inaccuracies and other human errors in data collection, the technology is still relatively new and raises many issues for potential users. These issues tend to revolve around three areas: reliability, data processing and the related applications. This thesis studies GPS tracking from the methodological, technical and practical perspectives. It first evaluates the reliability of GPS-based traffic data using an experiment covering three traffic modes (car, bike and bus) traveling along the road network. It then outlines the general procedure for processing GPS tracking data and discusses related issues uncovered by using real-world GPS tracking data from 316 cars. Thirdly, it investigates the influence of road network density on finding optimal locations for enhancing travel efficiency and decreasing travel cost. The results show that the geographical positioning is reliable. Velocity is slightly underestimated, whereas altitude measurements are unreliable. Post-processing techniques using auxiliary information are found to be necessary and important for resolving the inaccuracy of GPS data. The density of the road network influences the finding of optimal locations. The influence stabilizes at a certain level and does not deteriorate when the node density is higher.
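One common post-processing step for GPS tracking data can be sketched as follows. This is not the procedure used in the thesis. It assumes each fix is a `(time_in_seconds, latitude, longitude)` tuple sorted by time, estimates speed from the great-circle distance between consecutive fixes, and drops fixes that imply an implausible speed. The threshold `max_speed` and the function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def segment_speeds(fixes):
    """Speeds in m/s between consecutive fixes; segments with non-positive dt are skipped."""
    speeds = []
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(fixes, fixes[1:]):
        dt = t1 - t0
        if dt > 0:
            speeds.append(haversine_m(lat0, lon0, lat1, lon1) / dt)
    return speeds

def drop_implausible(fixes, max_speed=70.0):
    """Keep a fix only if the speed from the last kept fix is at most max_speed (m/s)."""
    if not fixes:
        return []
    kept = [fixes[0]]
    for t, lat, lon in fixes[1:]:
        t0, lat0, lon0 = kept[-1]
        dt = t - t0
        if dt > 0 and haversine_m(lat0, lon0, lat, lon) / dt <= max_speed:
            kept.append((t, lat, lon))
    return kept

# Synthetic track along the equator; the third fix is a position jump.
track = [(0, 0.0, 0.0), (1, 0.0, 0.0001), (2, 0.0, 1.0), (3, 0.0, 0.0002)]
clean = drop_implausible(track)
print(len(clean), segment_speeds(clean))
```

A speed computed this way uses the straight-line distance between fixes, which can never exceed the distance actually travelled, so sparse sampling on curved roads biases the estimate downward.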