915 results for "Data replication processes"


Relevance: 30.00%

Publisher:

Abstract:

BACKGROUND: Blochmannia are obligately intracellular bacterial mutualists of ants of the tribe Camponotini. Blochmannia perform key nutritional functions for the host, including synthesis of several essential amino acids. We used Illumina technology to sequence the genome of Blochmannia associated with Camponotus vafer. RESULTS: Although Blochmannia vafer retains many nutritional functions, it is missing glutamine synthetase (glnA), a component of the nitrogen recycling pathway encoded by the previously sequenced B. floridanus and B. pennsylvanicus. With the exception of Ureaplasma, B. vafer is the only sequenced bacterium to date that encodes urease but lacks the ability to assimilate ammonia into glutamine or glutamate. Loss of glnA occurred in a deletion hotspot near the putative replication origin. Overall, compared to the likely gene set of their common ancestor, 31 genes are missing or eroded in B. vafer, compared to 28 in B. floridanus and four in B. pennsylvanicus. Three genes (queA, visC and yggS) show convergent loss or erosion, suggesting relaxed selection for their functions. Eight B. vafer genes contain frameshifts in homopolymeric tracts that may be corrected by transcriptional slippage. Two of these encode DNA replication proteins: dnaX, which we infer is also frameshifted in B. floridanus, and dnaG. CONCLUSIONS: Comparing the B. vafer genome with B. pennsylvanicus and B. floridanus refines the core genes shared within the mutualist group, thereby clarifying functions required across ant host species. This third genome also allows us to track gene loss and erosion in a phylogenetic context to more fully understand processes of genome reduction.
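The frameshifts described above sit in homopolymeric tracts, which can be located by a simple run-length scan. A minimal sketch (the function name and the length threshold are illustrative assumptions, not the authors' pipeline):

```python
def homopolymer_tracts(seq, min_len=8):
    """Return (start, base, length) for each single-base run of at least min_len."""
    tracts = []
    i = 0
    while i < len(seq):
        j = i
        while j < len(seq) and seq[j] == seq[i]:
            j += 1                      # extend the current run
        if j - i >= min_len:
            tracts.append((i, seq[i], j - i))
        i = j                           # jump to the next run
    return tracts

print(homopolymer_tracts("ATGCAAAAAAAAAGC", min_len=8))  # [(4, 'A', 9)]
```

Runs flagged this way are candidate sites where transcriptional slippage could restore the reading frame.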

Relevance: 30.00%

Publisher:

Abstract:

BACKGROUND: Historically, only partial assessments of data quality have been performed in clinical trials, for which the most common method of measuring database error rates has been to compare the case report form (CRF) to database entries and count discrepancies. Importantly, errors arising from medical record abstraction and transcription are rarely evaluated as part of such quality assessments. Electronic Data Capture (EDC) technology has had a further impact, as paper CRFs typically leveraged for quality measurement are not used in EDC processes. METHODS AND PRINCIPAL FINDINGS: The National Institute on Drug Abuse Treatment Clinical Trials Network has developed, implemented, and evaluated methodology for holistically assessing data quality on EDC trials. We characterize the average source-to-database error rate (14.3 errors per 10,000 fields) for the first year of use of the new evaluation method. This error rate was significantly lower than the average of published error rates for source-to-database audits, and was similar to CRF-to-database error rates reported in the published literature. We attribute this largely to an absence of medical record abstraction on the trials we examined, and to an outpatient setting characterized by less acute patient conditions. CONCLUSIONS: Historically, medical record abstraction is the most significant source of error by an order of magnitude, and should be measured and managed during the course of clinical trials. Source-to-database error rates are highly dependent on the amount of structured data collection in the clinical setting and on the complexity of the medical record, dependencies that should be considered when developing data quality benchmarks.
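Error rates like the 14.3 per 10,000 fields reported above are a simple ratio. A minimal sketch (the audit counts in the example are made up to reproduce a similar rate, not the study's actual tallies):

```python
def error_rate_per_10k(errors, fields_inspected):
    """Errors per 10,000 fields, the unit commonly used for database audit error rates."""
    if fields_inspected == 0:
        raise ValueError("no fields inspected")
    return 10_000 * errors / fields_inspected

# hypothetical audit: 93 discrepancies found across 65,000 source-to-database fields
print(round(error_rate_per_10k(93, 65_000), 1))  # 14.3
```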

Relevance: 30.00%

Publisher:

Abstract:

BACKGROUND: The inherent complexity of statistical methods and clinical phenomena compels researchers with diverse domains of expertise to work in interdisciplinary teams, in which no member has complete knowledge of the others' fields. As a result, knowledge exchange may often be characterized by miscommunication leading to misinterpretation, ultimately resulting in errors in research and even clinical practice. Although communication plays a central role in interdisciplinary collaboration and miscommunication can negatively affect research processes, to the best of our knowledge no study has yet explored how data analysis specialists and clinical researchers communicate over time. METHODS/PRINCIPAL FINDINGS: We conducted qualitative analysis of encounters between clinical researchers and data analysis specialists (epidemiologist, clinical epidemiologist, and data mining specialist). These encounters were recorded and systematically analyzed using a grounded theory methodology for extraction of emerging themes, followed by data triangulation and analysis of negative cases for validation. A policy analysis was then performed using a system dynamics methodology looking for potential interventions to improve this process. Four major emerging themes were found. Definitions using lay language were frequently employed as a way to bridge the language gap between the specialties. Thought experiments presented a series of "what if" situations that helped clarify how the method or information from the other field would behave, if exposed to alternative situations, ultimately aiding in explaining their main objective. Metaphors and analogies were used to translate concepts across fields, from the unfamiliar to the familiar. Prolepsis was used to anticipate study outcomes, thus helping specialists understand the current context based on an understanding of their final goal. 
CONCLUSION/SIGNIFICANCE: The communication between clinical researchers and data analysis specialists presents multiple challenges that can lead to errors.

Relevance: 30.00%

Publisher:

Abstract:

Adrenergic receptors are prototypic models for the study of the relations between structure and function of G protein-coupled receptors. Each receptor is encoded by a distinct gene. These receptors are integral membrane proteins with several striking structural features. They consist of a single subunit containing seven stretches of 20-28 hydrophobic amino acids that represent potential membrane-spanning alpha-helices. Many of these receptors share considerable amino acid sequence homology, particularly in the transmembrane domains. All of these macromolecules share other similarities that include one or more potential sites of extracellular N-linked glycosylation near the amino terminus and several potential sites of regulatory phosphorylation that are located intracellularly. By using a variety of techniques, it has been demonstrated that various regions of the receptor molecules are critical for different receptor functions. The seven transmembrane regions of the receptors appear to form a ligand-binding pocket. Cysteine residues in the extracellular domains may stabilize the ligand-binding pocket by participating in disulfide bonds. The cytoplasmic domains contain regions capable of interacting with G proteins and various kinases and are therefore important in such processes as signal transduction, receptor-G protein coupling, receptor sequestration, and down-regulation. Finally, regions of these macromolecules may undergo posttranslational modifications important in the regulation of receptor function. Our understanding of these complex relations is constantly evolving and much work remains to be done. Greater understanding of the basic mechanisms involved in G protein-coupled, receptor-mediated signal transduction may provide leads into the nature of certain pathophysiological states.

Relevance: 30.00%

Publisher:

Abstract:

An enterprise information system (EIS) is an integrated data-applications platform characterized by diverse, heterogeneous, and distributed data sources. For many enterprises, a number of business processes still depend heavily on static rule-based methods and extensive human expertise. Enterprises are faced with the need for optimizing operation scheduling, improving resource utilization, discovering useful knowledge, and making data-driven decisions.

This thesis research is focused on real-time optimization and knowledge discovery that addresses workflow optimization, resource allocation, as well as data-driven predictions of process-execution times, order fulfillment, and enterprise service-level performance. In contrast to prior work on data analytics techniques for enterprise performance optimization, the emphasis here is on realizing scalable and real-time enterprise intelligence based on a combination of heterogeneous system simulation, combinatorial optimization, machine-learning algorithms, and statistical methods.

On-demand digital-print service is a representative enterprise requiring a powerful EIS. We use real-life data from Reischling Press, Inc. (RPI), a digital-print-service provider (PSP), to evaluate our optimization algorithms.

In order to handle the increase in volume and diversity of demands, we first present a high-performance, scalable, and real-time production scheduling algorithm for production automation based on an incremental genetic algorithm (IGA). The objective of this algorithm is to optimize the order dispatching sequence and balance resource utilization. Compared to prior work, this solution is scalable for a high volume of orders and it provides fast scheduling solutions for orders that require complex fulfillment procedures. Experimental results highlight its potential benefit in reducing production inefficiencies and enhancing the productivity of an enterprise.
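The incremental flavor of such a scheduler, reusing the previous best dispatch sequence when new orders arrive instead of re-solving from scratch, can be illustrated with a toy permutation GA. Everything here (function names, the weighted-completion-time cost, population size, mutation scheme) is an illustrative assumption, not the thesis's actual IGA:

```python
import random

def schedule_cost(seq, proc_times):
    """Toy cost: sum of completion times (earlier-finishing orders preferred)."""
    t, cost = 0, 0
    for order in seq:
        t += proc_times[order]
        cost += t
    return cost

def incremental_ga(prev_best, new_orders, proc_times, pop=30, gens=200, seed=0):
    """Seed the population with the previous schedule plus appended new orders,
    then improve by tournament selection and swap mutation (steady-state GA)."""
    rng = random.Random(seed)
    base = prev_best + new_orders
    population = [base[:]] + [rng.sample(base, len(base)) for _ in range(pop - 1)]
    for _ in range(gens):
        a, b = rng.sample(population, 2)
        winner = min(a, b, key=lambda s: schedule_cost(s, proc_times))
        child = winner[:]
        i, j = rng.sample(range(len(child)), 2)
        child[i], child[j] = child[j], child[i]       # swap mutation
        loser = max(population, key=lambda s: schedule_cost(s, proc_times))
        if schedule_cost(child, proc_times) < schedule_cost(loser, proc_times):
            population[population.index(loser)] = child
    return min(population, key=lambda s: schedule_cost(s, proc_times))

times = {"o1": 5, "o2": 2, "o3": 8, "o4": 1}
best = incremental_ga(["o1", "o3"], ["o2", "o4"], times)
```

Because the previous schedule seeds the population, the result is never worse than naively appending the new orders, which is the point of the incremental approach.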

We next discuss analysis and prediction of different attributes involved in hierarchical components of an enterprise. We start from a study of the fundamental processes related to real-time prediction. Our process-execution time and process-status prediction models integrate statistical methods with machine-learning algorithms. In addition to improved prediction accuracy compared to stand-alone machine-learning algorithms, they also provide a probabilistic estimate of the predicted status. An order generally consists of multiple serial and parallel processes. We next introduce an order-fulfillment prediction model that combines the advantages of multiple classification models by incorporating flexible decision-integration mechanisms. Experimental results show that adopting due dates recommended by the model can significantly reduce the enterprise's late-delivery ratio. Finally, we investigate service-level attributes that reflect the overall performance of an enterprise. We analyze and decompose time-series data into different components according to their hierarchical periodic nature, perform correlation analysis, and develop univariate prediction models for each component as well as multivariate models for correlated components. Predictions for the original time series are aggregated from the predictions of its components. In addition to a significant increase in mid-term prediction accuracy, this distributed modeling strategy also improves short-term time-series prediction accuracy.
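The "decompose, predict per component, aggregate" strategy can be illustrated with a toy sketch. The names and the naive per-component forecasts below are ours, not the thesis's models:

```python
def decompose(series, period):
    """Split a series into a periodic component (mean per phase) and a remainder."""
    seasonal = [sum(series[i::period]) / len(series[i::period]) for i in range(period)]
    remainder = [x - seasonal[i % period] for i, x in enumerate(series)]
    return seasonal, remainder

def predict_next(series, period):
    """Forecast the next point as seasonal mean + a naive (last-value) remainder
    forecast; the two component forecasts are aggregated by summation."""
    seasonal, remainder = decompose(series, period)
    return seasonal[len(series) % period] + remainder[-1]

# toy periodic signal with period 3: the next point should continue the cycle
data = [10, 12, 14, 10, 12, 14, 10, 12, 14]
print(predict_next(data, period=3))  # 10.0
```

In a real system each component would get its own (possibly multivariate) model, but the aggregation step is the same.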

In summary, this thesis research has led to a set of characterization, optimization, and prediction tools for an EIS to derive insightful knowledge from data and use it as guidance for production management. It is expected to provide solutions for enterprises to increase reconfigurability, automate more procedures, and obtain data-driven recommendations for effective decision-making.


Relevance: 30.00%

Publisher:

Abstract:

© 2016 Burnetti et al. Cells have evolved oscillators with different frequencies to coordinate periodic processes. Here we studied the interaction of two oscillators, the cell division cycle (CDC) and the yeast metabolic cycle (YMC), in budding yeast. Previous work suggested that the CDC and YMC interact to separate high oxygen consumption (HOC) from DNA replication to prevent genetic damage. To test this hypothesis, we grew diverse strains in chemostat and measured DNA replication and oxygen consumption with high temporal resolution at different growth rates. Our data showed that HOC is not strictly separated from DNA replication; rather, cell cycle Start is coupled with the initiation of HOC and catabolism of storage carbohydrates. The logic of this YMC-CDC coupling may be to ensure that DNA replication and cell division occur only when sufficient cellular energy reserves have accumulated. Our results also uncovered a quantitative relationship between CDC period and YMC period across different strains. More generally, our approach shows how studies in genetically diverse strains efficiently identify robust phenotypes and steer the experimentalist away from strain-specific idiosyncrasies.

Relevance: 30.00%

Publisher:

Abstract:

Technology-supported citizen science has created huge volumes of data with increasing potential to facilitate scientific progress; however, verifying data quality remains a substantial hurdle due to the limitations of existing data quality mechanisms. In this study, we adopted a mixed-methods approach to investigate community-based data validation practices and the characteristics of records of wildlife species observations that affected the outcomes of collaborative data quality management in an online community where people record what they see in nature. The findings describe processes that both relied upon and added to information provenance through information-stewardship behaviors, which led to improved reliability and informativity. The likelihood of community-based validation interactions was predicted by several factors, including the types of organisms observed and whether the data were submitted from a mobile device. We conclude with implications for technology design, citizen science practices, and research.

Relevance: 30.00%

Publisher:

Abstract:

We study two marked point process models based on the Cox process. These models are used to describe the probabilistic structure of the rainfall intensity process. The mathematical formulation of the models is described, and some second-moment characteristics of the rainfall depth and aggregated processes are considered. The derived second-order properties of the accumulated rainfall amounts at different levels of aggregation are used to examine the model fit. A brief data analysis is presented. Copyright © 1998 John Wiley & Sons, Ltd.
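The defining feature of a marked Cox process, that the Poisson arrival rate is itself random and each event carries a random mark, can be sketched in a toy simulation. The two-state dry/storm intensity, the exponential depth marks, and all rates are illustrative assumptions, not the models analysed in the paper:

```python
import math
import random

def poisson(rng, lam):
    """Knuth's inverse-transform Poisson sampler (fine for small rates)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def simulate_cox_rainfall(hours, seed=1):
    """Toy marked Cox process: each hour draws a random intensity state
    (doubly stochastic), then Poisson cell arrivals, each with an
    exponential rainfall-depth mark."""
    rng = random.Random(seed)
    events = []
    for h in range(hours):
        rate = rng.choice([0.2, 3.0])      # random intensity process (cells/hour)
        for _ in range(poisson(rng, rate)):
            t = h + rng.random()           # arrival time within the hour
            depth = rng.expovariate(1.0)   # exponential depth mark (mm)
            events.append((t, depth))
    return events

events = simulate_cox_rainfall(24)
```

Aggregating the depth marks over fixed intervals yields accumulated rainfall amounts whose second-order properties can be compared with the model's, as the paper does.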

Relevance: 30.00%

Publisher:

Abstract:

A casting route is often the most cost-effective means of producing engineering components. However, certain materials, particularly those based on Ti, TiAl and Zr alloy systems, are very reactive in the molten condition and must be melted in special furnaces. Induction Skull Melting (ISM) is the most widely-used process for melting these alloys prior to casting components such as turbine blades, engine valves, turbocharger rotors and medical prostheses. A major research project is underway with the specific target of developing robust techniques for casting TiAl components. The aims include increasing the superheat in the molten metal to allow thin section components to be cast, improving the quality of the cast components and increasing the energy efficiency of the process. As part of this, the University of Greenwich (UK) is developing a computer model of the ISM process in close collaboration with the University of Birmingham (UK) where extensive melting trials are being undertaken. This paper describes the experimental measurements to obtain data to feed into and to validate the model. These include measurements of the true RMS current applied to the induction coil, the heat transfer from the molten metal to the crucible cooling water, and the shape of the column of semi-levitated molten metal. Data are presented for Al, Ni and TiAl.
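The "true RMS" of the coil current mentioned above is the square root of the mean of the squared samples, which, unlike a rectified-average reading, is correct for non-sinusoidal induction waveforms. A minimal sketch:

```python
import math

def true_rms(samples):
    """True RMS of a sampled waveform: sqrt of the mean of squares."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

# one full cycle of a sampled sine of amplitude 10 A -> RMS = 10/sqrt(2)
wave = [10 * math.sin(2 * math.pi * k / 100) for k in range(100)]
print(round(true_rms(wave), 2))  # 7.07
```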

Relevance: 30.00%

Publisher:

Abstract:

Thermally stimulated current (TSC) spectroscopy is attracting increasing attention as a means of materials characterization, particularly for measuring slow relaxation processes in solid samples. However, wider use of the technique within the pharmaceutical field has been inhibited by difficulties associated with the interpretation of TSC data, particularly in deconvoluting dipolar relaxation processes from charge distribution phenomena. Here, we present evidence that space charge and electrode contact effects may play a significant role in the generation of peaks that have thus far proved difficult to interpret. We also introduce the use of a stabilization temperature in order to control the space charge magnitude. We have studied amorphous indometacin as a model drug compound and have varied the measurement parameters (stabilization and polarization temperatures), interpreting the changes in spectral composition in terms of charge redistribution processes. More specifically, we suggest that charge drift and diffusion processes, charge injection from the electrodes, and high-activation-energy charge redistribution processes may all contribute to the appearance of shoulders and 'spurious' peaks. We present recommendations for eliminating or reducing these effects that may allow more confident interpretation of TSC data.

Relevance: 30.00%

Publisher:

Abstract:

Görzig, H., Engel, F., Brocks, H., Vogel, T. & Hemmje, M. (2015, August). Towards Data Management Planning Support for Research Data. Paper presented at the ASE International Conference on Data Science, Stanford, United States of America.

Relevance: 30.00%

Publisher:

Abstract:

Noise is one of the main factors degrading the quality of original multichannel remote sensing data, and its presence affects classification efficiency, object detection, and other final tasks. Thus, pre-filtering is often used to remove noise before solving the final tasks of multichannel remote sensing. Recent studies indicate that the classical additive noise model is not adequate for images formed by modern multichannel sensors operating in the visible and infrared bands. However, this fact is often ignored by researchers designing noise removal methods and algorithms. Because of this, we focus on the classification of multichannel remote sensing images in the case of signal-dependent noise present in component images. Three approaches to filtering of multichannel images for the considered noise model are analysed, all based on the discrete cosine transform (DCT) in blocks. The study is carried out not only in terms of conventional filtering efficiency metrics (MSE) but also in terms of multichannel data classification accuracy (probability of correct classification, confusion matrix). The proposed classification system combines a pre-processing stage, where a DCT-based filter processes the blocks of the multichannel remote sensing image, with the classification stage. Two modern classifiers are employed: a radial basis function neural network and support vector machines. Simulations are carried out for a three-channel image from the Landsat TM sensor. Different cases of learning are considered: using noise-free samples of the test multichannel image, the noisy multichannel image, and the pre-filtered one. It is shown that using the pre-filtered image for training produces better classification than training on the noisy image. It is demonstrated that the best results for both groups of quantitative criteria are obtained when the proposed 3D discrete cosine transform filter equipped with a variance-stabilizing transform is applied. The classification results obtained for data pre-filtered in different ways are in agreement for both considered classifiers. A comparison of classifier performance is carried out as well. The radial basis function neural network classifier is less sensitive to noise in the original images, but after pre-filtering the performance of both classifiers is approximately the same.
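The role of a variance-stabilizing transform for signal-dependent noise can be sketched as follows. We use the Anscombe transform, which maps Poisson-like noise to approximately unit-variance additive noise, and a simple moving average as a stand-in for the paper's 3D block-DCT filter; all names and the example values are illustrative:

```python
import math

def anscombe(x):
    """Variance-stabilizing transform: Poisson-like signal-dependent noise
    becomes approximately unit-variance additive noise."""
    return 2.0 * math.sqrt(x + 3.0 / 8.0)

def inverse_anscombe(y):
    return (y / 2.0) ** 2 - 3.0 / 8.0

def stabilised_smooth(pixels, window=3):
    """Sketch of 'stabilize, denoise, invert': transform, apply a simple
    moving-average filter, then map back to the original intensity scale."""
    t = [anscombe(p) for p in pixels]
    half = window // 2
    filtered = []
    for i in range(len(t)):
        lo, hi = max(0, i - half), min(len(t), i + half + 1)
        filtered.append(sum(t[lo:hi]) / (hi - lo))
    return [inverse_anscombe(v) for v in filtered]

noisy_row = [50, 54, 49, 120, 118, 122, 51, 48]
print([round(v, 1) for v in stabilised_smooth(noisy_row)])
```

The point of stabilizing first is that an additive-noise filter (like the block-DCT filter with a fixed threshold) then behaves consistently across bright and dark regions.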

Relevance: 30.00%

Publisher:

Abstract:

While documentation of climate effects on marine ecosystems has a long history, the underlying processes have often remained elusive. In this paper we review some of the ecosystem responses to climate variability and discuss the possible mechanisms through which climate acts. Effects of climatological and oceanographic variables, such as temperature, sea ice, turbulence, and advection, on marine organisms are discussed in terms of their influence on growth, distribution, reproduction, activity rates, recruitment, and mortality. Organisms tend to be limited to specific thermal ranges, with experimental findings showing that sufficient oxygen supply by ventilation and circulation occurs only within these ranges. Indirect effects of climate forcing through the food web are also discussed. Research and data needs required to improve our knowledge of the processes linking climate to ecosystem changes are presented, together with an assessment of our ability to predict ecosystem responses to future climate change scenarios. © 2009 Elsevier B.V. All rights reserved.

Relevance: 30.00%

Publisher:

Abstract:

Novel techniques have been developed for increasing the value of cloud-affected sequences of Advanced Very High Resolution Radiometer (AVHRR) sea-surface temperature (SST) data and Sea-viewing Wide Field-of-view Sensor (SeaWiFS) ocean colour data for visualising dynamic physical and biological oceanic processes such as fronts, eddies and blooms. The proposed composite front map approach is to combine the location, strength and persistence of all fronts observed over several days into a single map, which allows intuitive interpretation of mesoscale structures. This method achieves a synoptic view without blurring dynamic features, an inherent problem with conventional time-averaging compositing methods. Objective validation confirms a significant improvement in feature visibility on composite maps compared to individual front maps. A further novel aspect is the automated detection of ocean colour fronts, correctly locating 96% of chlorophyll fronts in a test data set. A sizeable data set of 13,000 AVHRR and 1200 SeaWiFS scenes automatically processed using this technique is applied to the study of dynamic processes off the Iberian Peninsula such as mesoscale eddy generation, and many additional applications are identified. Front map animations provide a unique insight into the evolution of upwelling and eddies.
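The composite-front idea, combining the location, strength, and persistence of fronts observed over several days into one map, can be illustrated with a toy per-pixel rule. The strength-times-persistence weighting below is our simplification for illustration, not the paper's exact compositing algorithm:

```python
def composite_front_map(daily_maps):
    """Combine several days of front-gradient maps into one composite:
    each pixel keeps the strongest front observed, weighted by the
    fraction of days a front was detected there (persistence)."""
    days = len(daily_maps)
    rows, cols = len(daily_maps[0]), len(daily_maps[0][0])
    composite = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            strengths = [m[r][c] for m in daily_maps]
            persistence = sum(1 for s in strengths if s > 0) / days
            composite[r][c] = max(strengths) * persistence
    return composite

# two days of toy 2x2 front-strength maps (0 = no front detected)
day1 = [[0.0, 0.8], [0.0, 0.0]]
day2 = [[0.0, 0.9], [0.3, 0.0]]
comp = composite_front_map([day1, day2])
print(comp)  # [[0.0, 0.9], [0.15, 0.0]]
```

Because each pixel keeps the strongest single-day observation rather than a time average, persistent dynamic features stay sharp instead of being blurred, which is the advantage the paper claims over conventional time-averaged composites.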