947 resultados para Data pre-processing


Relevância:

40.00% 40.00%

Publicador:

Resumo:

We have analysed the extent of base-pairing interactions between spacer sequences of histone pre-mRNA and U7 snRNA present in the trans-acting U7 snRNP and their importance for histone RNA 3' end processing in vitro. For the efficiently processed mouse H4-12 gene, a computer analysis revealed that additional base pairs could be formed with U7 RNA outside of the previously recognised spacer element (stem II). One complementarity (stem III) is located more 3' and involves nucleotides from the very 5' end of U7 RNA. The other, more 5' located complementarity (stem I) involves nucleotides of the Sm binding site of U7 RNA, a part known to interact with snRNP structural proteins. These potential stem structures are separated from each other by short internal loops of unpaired nucleotides. Mutational analyses of the pre-mRNA indicate that stems II and III are equally important for interaction with the U7 snRNP and for processing, whereas mutations in stem I have moderate effects on processing efficiency, but do not impair complex formation with the U7 snRNP. Thus nucleotides near the processing site may be important for processing, but do not contribute to the assembly of an active complex by forming a stem I structure. The importance of stem III was confirmed by the ability of a complementary mutation in U7 RNA to suppress a stem III mutation in a complementation assay using Xenopus laevis oocytes. The main role of the factor(s) binding to the upstream hairpin loop is to stabilise the U7-pre-mRNA complex. This was shown by either stabilising (by mutation) or destabilising (by increased temperature) the U7-pre-mRNA base-pairing under conditions where hairpin factor binding was either allowed or prevented (by mutation or competition). The hairpin dependence of processing was found to be inversely related to the strength of the U7-pre-mRNA interaction.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We have studied the requirements for efficient histone-specific RNA 3' processing in nuclear extract from mammalian tissue culture cells. Processing is strongly impaired by mutations in the pre-mRNA spacer element that reduce the base-pairing potential with U7 RNA. Moreover, by exchanging the hairpin and spacer elements of two differently processed H4 genes, we find that this difference is exclusively due to the spacer element. Finally, processing is inhibited by the addition of competitor RNAs, if these contain a wild-type spacer sequence, but not if their spacer element is mutated. Conversely, the importance of the hairpin for histone RNA 3' processing is highly variable: A hairpin mutant of the H4-12 gene is processed with almost wild-type efficiency in extract from K21 mouse mastocytoma cells but is strongly affected in HeLa cell extract, whereas an identical hairpin mutant of the H4-1 gene is affected in both extracts. The hairpin defect of H4-12-specific RNA in HeLa cells can be overcome by a compensatory mutation that increases the base complementarity to U7 snRNA. Very similar results were also obtained in RNA competition experiments: processing of H4-12-specific RNA can be competed by RNA carrying a wild-type hairpin element in extract from HeLa, but not K21 cells, whereas processing of H4-1-specific RNA can be competed in both extracts. With two additional histone genes we obtained results that were in one case intermediate and in the other similar to those obtained with H4-1. These results suggest that hairpin binding factor(s) can cooperatively support the ability of U7 snRNPs to form an active processing complex, but is(are) not directly involved in the processing mechanism.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Identifying accurate numbers of soldiers determined to be medically not ready after completing soldier readiness processing may help inform Army leadership about ongoing pressures on the military involved in long conflict with regular deployment. In Army soldiers screened using the SRP checklist for deployment, what is the prevalence of soldiers determined to be medically not ready? Study group. 15,289 soldiers screened at all 25 Army deployment platform sites with the eSRP checklist over a 4-month period (June 20, 2009 to October 20, 2009). The data included for analysis included age, rank, component, gender and final deployment medical readiness status from MEDPROS database. Methods.^ This information was compiled and univariate analysis using chi-square was conducted for each of the key variables by medical readiness status. Results. Descriptive epidemiology Of the total sample 1548 (9.7%) were female and 14319 (90.2%) were male. Enlisted soldiers made up 13,543 (88.6%) of the sample and officers 1,746 (11.4%). In the sample, 1533 (10.0%) were soldiers over the age of 40 and 13756 (90.0%) were age 18-40. Reserve, National Guard and Active Duty made up 1,931 (12.6%), 2,942 (19.2%) and 10,416 (68.1%) respectively. Univariate analysis. Overall 1226 (8.0%) of the soldiers screened were determined to be medically not ready for deployment. Biggest predictive factor was female gender OR (2.8; 2.57-3.28) p<0.001. Followed by enlisted rank OR (2.01; 1.60-2.53) p<0.001. Reserve component OR (1.33; 1.16-1.53) p<0.001 and Guard OR (0.37; 0.30-0.46) p<0.001. For age > 40 demonstrated OR (1.2; 1.09-1.50) p<0.003. Overall the results underscore there may be key demographic groups relating to medical readiness that can be targeted with programs and funding to improve overall military medical readiness.^

Relevância:

40.00% 40.00%

Publicador:

Resumo:

High Angular Resolution Diffusion Imaging (HARDI) techniques, including Diffusion Spectrum Imaging (DSI), have been proposed to resolve crossing and other complex fiber architecture in the human brain white matter. In these methods, directional information of diffusion is inferred from the peaks in the orientation distribution function (ODF). Extensive studies using histology on macaque brain, cat cerebellum, rat hippocampus and optic tracts, and bovine tongue are qualitatively in agreement with the DSI-derived ODFs and tractography. However, there are only two studies in the literature which validated the DSI results using physical phantoms and both these studies were not performed on a clinical MRI scanner. Also, the limited studies which optimized DSI in a clinical setting, did not involve a comparison against physical phantoms. Finally, there is lack of consensus on the necessary pre- and post-processing steps in DSI; and ground truth diffusion fiber phantoms are not yet standardized. Therefore, the aims of this dissertation were to design and construct novel diffusion phantoms, employ post-processing techniques in order to systematically validate and optimize (DSI)-derived fiber ODFs in the crossing regions on a clinical 3T MR scanner, and develop user-friendly software for DSI data reconstruction and analysis. Phantoms with a fixed crossing fiber configuration of two crossing fibers at 90° and 45° respectively along with a phantom with three crossing fibers at 60°, using novel hollow plastic capillaries and novel placeholders, were constructed. T2-weighted MRI results on these phantoms demonstrated high SNR, homogeneous signal, and absence of air bubbles. Also, a technique to deconvolve the response function of an individual peak from the overall ODF was implemented, in addition to other DSI post-processing steps. This technique greatly improved the angular resolution of the otherwise unresolvable peaks in a crossing fiber ODF. The effects of DSI acquisition parameters and SNR on the resultant angular accuracy of DSI on the clinical scanner were studied and quantified using the developed phantoms. With a high angular direction sampling and reasonable levels of SNR, quantification of a crossing region in the 90°, 45° and 60° phantoms resulted in a successful detection of angular information with mean ± SD of 86.93°±2.65°, 44.61°±1.6° and 60.03°±2.21° respectively, while simultaneously enhancing the ODFs in regions containing single fibers. For the applicability of these validated methodologies in DSI, improvement in ODFs and fiber tracking from known crossing fiber regions in normal human subjects were demonstrated; and an in-house software package in MATLAB which streamlines the data reconstruction and post-processing for DSI, with easy to use graphical user interface was developed. In conclusion, the phantoms developed in this dissertation offer a means of providing ground truth for validation of reconstruction and tractography algorithms of various diffusion models (including DSI). Also, the deconvolution methodology (when applied as an additional DSI post-processing step) significantly improved the angular accuracy of the ODFs obtained from DSI, and should be applicable to ODFs obtained from the other high angular resolution diffusion imaging techniques.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In a series of attempts to research and document relevant sloshing type phenomena, a series of experiments have been conducted. The aim of this paper is to describe the setup and data processing of such experiments. A sloshing tank is subjected to angular motion. As a result pressure registers are obtained at several locations, together with the motion data, torque and a collection of image and video information. The experimental rig and the data acquisition systems are described. Useful information for experimental sloshing research practitioners is provided. This information is related to the liquids used in the experiments, the dying techniques, tank building processes, synchronization of acquisition systems, etc. A new procedure for reconstructing experimental data, that takes into account experimental uncertainties, is presented. This procedure is based on a least squares spline approximation of the data. Based on a deterministic approach to the first sloshing wave impact event in a sloshing experiment, an uncertainty analysis procedure of the associated first pressure peak value is described.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The manipulation and handling of an ever increasing volume of data by current data-intensive applications require novel techniques for e?cient data management. Despite recent advances in every aspect of data management (storage, access, querying, analysis, mining), future applications are expected to scale to even higher degrees, not only in terms of volumes of data handled but also in terms of users and resources, often making use of multiple, pre-existing autonomous, distributed or heterogeneous resources.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

To date, big data applications have focused on the store-and-process paradigm. In this paper we describe an initiative to deal with big data applications for continuous streams of events. In many emerging applications, the volume of data being streamed is so large that the traditional ‘store-then-process’ paradigm is either not suitable or too inefficient. Moreover, soft-real time requirements might severely limit the engineering solutions. Many scenarios fit this description. In network security for cloud data centres, for instance, very high volumes of IP packets and events from sensors at firewalls, network switches and routers and servers need to be analyzed and should detect attacks in minimal time, in order to limit the effect of the malicious activity over the IT infrastructure. Similarly, in the fraud department of a credit card company, payment requests should be processed online and need to be processed as quickly as possible in order to provide meaningful results in real-time. An ideal system would detect fraud during the authorization process that lasts hundreds of milliseconds and deny the payment authorization, minimizing the damage to the user and the credit card company.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The Microarray technique is rather powerful, as it allows to test up thousands of genes at a time, but this produces an overwhelming set of data files containing huge amounts of data, which is quite difficult to pre-process, separate, classify and correlate for interesting conclusions to be extracted. Modern machine learning, data mining and clustering techniques based on information theory, are needed to read and interpret the information contents buried in those large data sets. Independent Component Analysis method can be used to correct the data affected by corruption processes or to filter the uncorrectable one and then clustering methods can group similar genes or classify samples. In this paper a hybrid approach is used to obtain a two way unsupervised clustering for a corrected microarray data.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper proposes the optimization relaxation approach based on the analogue Hopfield Neural Network (HNN) for cluster refinement of pre-classified Polarimetric Synthetic Aperture Radar (PolSAR) image data. We consider the initial classification provided by the maximum-likelihood classifier based on the complex Wishart distribution, which is then supplied to the HNN optimization approach. The goal is to improve the classification results obtained by the Wishart approach. The classification improvement is verified by computing a cluster separability coefficient and a measure of homogeneity within the clusters. During the HNN optimization process, for each iteration and for each pixel, two consistency coefficients are computed, taking into account two types of relations between the pixel under consideration and its corresponding neighbors. Based on these coefficients and on the information coming from the pixel itself, the pixel under study is re-classified. Different experiments are carried out to verify that the proposed approach outperforms other strategies, achieving the best results in terms of separability and a trade-off with the homogeneity preserving relevant structures in the image. The performance is also measured in terms of computational central processing unit (CPU) times.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Due to the advancement of both, information technology in general, and databases in particular; data storage devices are becoming cheaper and data processing speed is increasing. As result of this, organizations tend to store large volumes of data holding great potential information. Decision Support Systems, DSS try to use the stored data to obtain valuable information for organizations. In this paper, we use both data models and use cases to represent the functionality of data processing in DSS following Software Engineering processes. We propose a methodology to develop DSS in the Analysis phase, respective of data processing modeling. We have used, as a starting point, a data model adapted to the semantics involved in multidimensional databases or data warehouses, DW. Also, we have taken an algorithm that provides us with all the possible ways to automatically cross check multidimensional model data. Using the aforementioned, we propose diagrams and descriptions of use cases, which can be considered as patterns representing the DSS functionality, in regard to DW data processing, DW on which DSS are based. We highlight the reusability and automation benefits that this can be achieved, and we think this study can serve as a guide in the development of DSS.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Due to the relative transparency of its embryos and larvae, the zebrafish is an ideal model organism for bioimaging approaches in vertebrates. Novel microscope technologies allow the imaging of developmental processes in unprecedented detail, and they enable the use of complex image-based read-outs for high-throughput/high-content screening. Such applications can easily generate Terabytes of image data, the handling and analysis of which becomes a major bottleneck in extracting the targeted information. Here, we describe the current state of the art in computational image analysis in the zebrafish system. We discuss the challenges encountered when handling high-content image data, especially with regard to data quality, annotation, and storage. We survey methods for preprocessing image data for further analysis, and describe selected examples of automated image analysis, including the tracking of cells during embryogenesis, heartbeat detection, identification of dead embryos, recognition of tissues and anatomical landmarks, and quantification of behavioral patterns of adult fish. We review recent examples for applications using such methods, such as the comprehensive analysis of cell lineages during early development, the generation of a three-dimensional brain atlas of zebrafish larvae, and high-throughput drug screens based on movement patterns. Finally, we identify future challenges for the zebrafish image analysis community, notably those concerning the compatibility of algorithms and data formats for the assembly of modular analysis pipelines.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In the last years, there has been an increase in the amount of real-time data generated. Sensors attached to things are transforming how we interact with our environment. Extracting meaningful information from these streams of data is essential for some application areas and requires processing systems that scale to varying conditions in data sources, complex queries, and system failures. This paper describes ongoing research on the development of a scalable RDF streaming engine.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Con el auge del Cloud Computing, las aplicaciones de proceso de datos han sufrido un incremento de demanda, y por ello ha cobrado importancia lograr m�ás eficiencia en los Centros de Proceso de datos. El objetivo de este trabajo es la obtenci�ón de herramientas que permitan analizar la viabilidad y rentabilidad de diseñar Centros de Datos especializados para procesamiento de datos, con una arquitectura, sistemas de refrigeraci�ón, etc. adaptados. Algunas aplicaciones de procesamiento de datos se benefician de las arquitecturas software, mientras que en otras puede ser m�ás eficiente un procesamiento con arquitectura hardware. Debido a que ya hay software con muy buenos resultados en el procesamiento de grafos, como el sistema XPregel, en este proyecto se realizará una arquitectura hardware en VHDL, implementando el algoritmo PageRank de Google de forma escalable. Se ha escogido este algoritmo ya que podr��á ser m�ás eficiente en arquitectura hardware, debido a sus características concretas que se indicaráan m�ás adelante. PageRank sirve para ordenar las p�áginas por su relevancia en la web, utilizando para ello la teorí��a de grafos, siendo cada página web un vértice de un grafo; y los enlaces entre páginas, las aristas del citado grafo. En este proyecto, primero se realizará un an�álisis del estado de la técnica. Se supone que la implementaci�ón en XPregel, un sistema de procesamiento de grafos, es una de las m�ás eficientes. Por ello se estudiará esta �ultima implementaci�ón. Sin embargo, debido a que Xpregel procesa, en general, algoritmos que trabajan con grafos; no tiene en cuenta ciertas caracterí��sticas del algoritmo PageRank, por lo que la implementaci�on no es �optima. Esto es debido a que en PageRank, almacenar todos los datos que manda un mismo v�értice es un gasto innecesario de memoria ya que todos los mensajes que manda un vértice son iguales entre sí e iguales a su PageRank. Se realizará el diseño en VHDL teniendo en cuenta esta caracter��ística del citado algoritmo,evitando almacenar varias veces los mensajes que son iguales. Se ha elegido implementar PageRank en VHDL porque actualmente las arquitecturas de los sistemas operativos no escalan adecuadamente. Se busca evaluar si con otra arquitectura se obtienen mejores resultados. Se realizará un diseño partiendo de cero, utilizando la memoria ROM de IPcore de Xillinx (Software de desarrollo en VHDL), generada autom�áticamente. Se considera hacer cuatro tipos de módulos para que as�� el procesamiento se pueda hacer en paralelo. Se simplificar�á la estructura de XPregel con el fin de intentar aprovechar la particularidad de PageRank mencionada, que hace que XPregel no le saque el m�aximo partido. Despu�és se escribirá el c�ódigo, realizando una estructura escalable, ya que en la computación intervienen millones de páginas web. A continuación, se sintetizar�á y se probará el código en una FPGA. El �ultimo paso será una evaluaci�ón de la implementaci�ón, y de posibles mejoras en cuanto al consumo.