3 resultados para classification aided by clustering

em Duke University


Relevância:

100.00% 100.00%

Publicador:

Resumo:

An enterprise information system (EIS) is an integrated data-applications platform characterized by diverse, heterogeneous, and distributed data sources. For many enterprises, a number of business processes still depend heavily on static rule-based methods and extensive human expertise. Enterprises are faced with the need for optimizing operation scheduling, improving resource utilization, discovering useful knowledge, and making data-driven decisions.

This thesis research is focused on real-time optimization and knowledge discovery that addresses workflow optimization, resource allocation, as well as data-driven predictions of process-execution times, order fulfillment, and enterprise service-level performance. In contrast to prior work on data analytics techniques for enterprise performance optimization, the emphasis here is on realizing scalable and real-time enterprise intelligence based on a combination of heterogeneous system simulation, combinatorial optimization, machine-learning algorithms, and statistical methods.

On-demand digital-print service is a representative enterprise requiring a powerful EIS.We use real-life data from Reischling Press, Inc. (RPI), a digit-print-service provider (PSP), to evaluate our optimization algorithms.

In order to handle the increase in volume and diversity of demands, we first present a high-performance, scalable, and real-time production scheduling algorithm for production automation based on an incremental genetic algorithm (IGA). The objective of this algorithm is to optimize the order dispatching sequence and balance resource utilization. Compared to prior work, this solution is scalable for a high volume of orders and it provides fast scheduling solutions for orders that require complex fulfillment procedures. Experimental results highlight its potential benefit in reducing production inefficiencies and enhancing the productivity of an enterprise.

We next discuss analysis and prediction of different attributes involved in hierarchical components of an enterprise. We start from a study of the fundamental processes related to real-time prediction. Our process-execution time and process status prediction models integrate statistical methods with machine-learning algorithms. In addition to improved prediction accuracy compared to stand-alone machine-learning algorithms, it also performs a probabilistic estimation of the predicted status. An order generally consists of multiple series and parallel processes. We next introduce an order-fulfillment prediction model that combines advantages of multiple classification models by incorporating flexible decision-integration mechanisms. Experimental results show that adopting due dates recommended by the model can significantly reduce enterprise late-delivery ratio. Finally, we investigate service-level attributes that reflect the overall performance of an enterprise. We analyze and decompose time-series data into different components according to their hierarchical periodic nature, perform correlation analysis,

and develop univariate prediction models for each component as well as multivariate models for correlated components. Predictions for the original time series are aggregated from the predictions of its components. In addition to a significant increase in mid-term prediction accuracy, this distributed modeling strategy also improves short-term time-series prediction accuracy.

In summary, this thesis research has led to a set of characterization, optimization, and prediction tools for an EIS to derive insightful knowledge from data and use them as guidance for production management. It is expected to provide solutions for enterprises to increase reconfigurability, accomplish more automated procedures, and obtain data-driven recommendations or effective decisions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Existing theories explain why operons are advantageous in prokaryotes, but their occurrence in metazoans is an enigma. Nematode operon genes, typically consisting of growth genes, are significantly upregulated during recovery from growth-arrested states. This expression pattern is anticorrelated to nonoperon genes, consistent with a competition for transcriptional resources. We find that transcriptional resources are initially limiting during recovery and that recovering animals are highly sensitive to any additional decrease in transcriptional resources. We provide evidence that operons become advantageous because, by clustering growth genes into operons, fewer promoters compete for the limited transcriptional machinery, effectively increasing the concentration of transcriptional resources and accelerating recovery. Mathematical modeling reveals how a moderate increase in transcriptional resources can substantially enhance transcription rate and recovery. This design principle occurs in different nematodes and the chordate C. intestinalis. As transition from arrest to rapid growth is shared by many metazoans, operons could have evolved to facilitate these processes.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Human activities represent a significant burden on the global water cycle, with large and increasing demands placed on limited water resources by manufacturing, energy production and domestic water use. In addition to changing the quantity of available water resources, human activities lead to changes in water quality by introducing a large and often poorly-characterized array of chemical pollutants, which may negatively impact biodiversity in aquatic ecosystems, leading to impairment of valuable ecosystem functions and services. Domestic and industrial wastewaters represent a significant source of pollution to the aquatic environment due to inadequate or incomplete removal of chemicals introduced into waters by human activities. Currently, incomplete chemical characterization of treated wastewaters limits comprehensive risk assessment of this ubiquitous impact to water. In particular, a significant fraction of the organic chemical composition of treated industrial and domestic wastewaters remains uncharacterized at the molecular level. Efforts aimed at reducing the impacts of water pollution on aquatic ecosystems critically require knowledge of the composition of wastewaters to develop interventions capable of protecting our precious natural water resources.

The goal of this dissertation was to develop a robust, extensible and high-throughput framework for the comprehensive characterization of organic micropollutants in wastewaters by high-resolution accurate-mass mass spectrometry. High-resolution mass spectrometry provides the most powerful analytical technique available for assessing the occurrence and fate of organic pollutants in the water cycle. However, significant limitations in data processing, analysis and interpretation have limited this technique in achieving comprehensive characterization of organic pollutants occurring in natural and built environments. My work aimed to address these challenges by development of automated workflows for the structural characterization of organic pollutants in wastewater and wastewater impacted environments by high-resolution mass spectrometry, and to apply these methods in combination with novel data handling routines to conduct detailed fate studies of wastewater-derived organic micropollutants in the aquatic environment.

In Chapter 2, chemoinformatic tools were implemented along with novel non-targeted mass spectrometric analytical methods to characterize, map, and explore an environmentally-relevant “chemical space” in municipal wastewater. This was accomplished by characterizing the molecular composition of known wastewater-derived organic pollutants and substances that are prioritized as potential wastewater contaminants, using these databases to evaluate the pollutant-likeness of structures postulated for unknown organic compounds that I detected in wastewater extracts using high-resolution mass spectrometry approaches. Results showed that application of multiple computational mass spectrometric tools to structural elucidation of unknown organic pollutants arising in wastewaters improved the efficiency and veracity of screening approaches based on high-resolution mass spectrometry. Furthermore, structural similarity searching was essential for prioritizing substances sharing structural features with known organic pollutants or industrial and consumer chemicals that could enter the environment through use or disposal.

I then applied this comprehensive methodological and computational non-targeted analysis workflow to micropollutant fate analysis in domestic wastewaters (Chapter 3), surface waters impacted by water reuse activities (Chapter 4) and effluents of wastewater treatment facilities receiving wastewater from oil and gas extraction activities (Chapter 5). In Chapter 3, I showed that application of chemometric tools aided in the prioritization of non-targeted compounds arising at various stages of conventional wastewater treatment by partitioning high dimensional data into rational chemical categories based on knowledge of organic chemical fate processes, resulting in the classification of organic micropollutants based on their occurrence and/or removal during treatment. Similarly, in Chapter 4, high-resolution sampling and broad-spectrum targeted and non-targeted chemical analysis were applied to assess the occurrence and fate of organic micropollutants in a water reuse application, wherein reclaimed wastewater was applied for irrigation of turf grass. Results showed that organic micropollutant composition of surface waters receiving runoff from wastewater irrigated areas appeared to be minimally impacted by wastewater-derived organic micropollutants. Finally, Chapter 5 presents results of the comprehensive organic chemical composition of oil and gas wastewaters treated for surface water discharge. Concurrent analysis of effluent samples by complementary, broad-spectrum analytical techniques, revealed that low-levels of hydrophobic organic contaminants, but elevated concentrations of polymeric surfactants, which may effect the fate and analysis of contaminants of concern in oil and gas wastewaters.

Taken together, my work represents significant progress in the characterization of polar organic chemical pollutants associated with wastewater-impacted environments by high-resolution mass spectrometry. Application of these comprehensive methods to examine micropollutant fate processes in wastewater treatment systems, water reuse environments, and water applications in oil/gas exploration yielded new insights into the factors that influence transport, transformation, and persistence of organic micropollutants in these systems across an unprecedented breadth of chemical space.