5 results for Incremental mining

at Duke University


Relevance:

20.00%

Abstract:

BACKGROUND: Over the past two decades more than fifty thousand unique clinical and biological samples have been assayed using the Affymetrix HG-U133 and HG-U95 GeneChip microarray platforms. This substantial repository has been used extensively to characterize changes in gene expression between biological samples, but has not been previously mined en masse for changes in mRNA processing. We explored the possibility of using HG-U133 microarray data to identify changes in alternative mRNA processing in several available archival datasets. RESULTS: Data from these and other gene expression microarrays can now be mined for changes in transcript isoform abundance using a program described here, SplicerAV. Using in vivo and in vitro breast cancer microarray datasets, SplicerAV was able to perform both gene- and isoform-specific expression profiling within the same microarray dataset. Our reanalysis of Affymetrix U133 Plus 2.0 data generated by in vitro over-expression of HRAS, E2F3, beta-catenin (CTNNB1), SRC, and MYC identified several hundred oncogene-induced mRNA isoform changes, one of which revealed a previously unknown mechanism of EGFR family activation. Using clinical data, SplicerAV predicted 241 isoform changes between low- and high-grade breast tumors, with changes enriched among genes coding for guanyl-nucleotide exchange factors, metalloprotease inhibitors, and mRNA processing factors. Isoform changes in 15 genes were associated with aggressive cancer across the three breast cancer datasets. CONCLUSIONS: Using SplicerAV, we identified several hundred previously uncharacterized isoform changes induced by in vitro oncogene over-expression and revealed a previously unknown mechanism of EGFR activation in human mammary epithelial cells. We analyzed Affymetrix GeneChip data from over 400 human breast tumors in three independent studies, making this the largest clinical dataset analyzed for en masse changes in alternative mRNA processing. The capacity to detect RNA isoform changes in archival microarray data using SplicerAV allowed us to carry out the first analysis of isoform-specific mRNA changes directly associated with cancer survival.

Relevance:

20.00%

Abstract:

Mountaintop mining (MTM) is the primary procedure for surface coal exploration within the central Appalachian region of the eastern United States, and it is known to contaminate streams in local watersheds. In this study, we measured the chemical and isotopic compositions of water samples from MTM-impacted tributaries and streams in the Mud River watershed in West Virginia. We systematically document the isotopic compositions of three major constituents: sulfur isotopes in sulfate (δ³⁴S_SO₄), carbon isotopes in dissolved inorganic carbon (δ¹³C_DIC), and strontium isotopes (⁸⁷Sr/⁸⁶Sr). The data show that δ³⁴S_SO₄, δ¹³C_DIC, Sr/Ca, and ⁸⁷Sr/⁸⁶Sr measured in saline- and selenium-rich MTM-impacted tributaries are distinguishable from those of the surface water upstream of mining impacts. These tracers can therefore be used to delineate and quantify the impact of MTM in watersheds. High Sr/Ca and low ⁸⁷Sr/⁸⁶Sr characterize tributaries that originated from active MTM areas, while tributaries from reclaimed MTM areas had low Sr/Ca and high ⁸⁷Sr/⁸⁶Sr. Leaching experiments of rocks from the watershed show that pyrite oxidation and carbonate dissolution control the solute chemistry, with distinct ⁸⁷Sr/⁸⁶Sr ratios characterizing different rock sources. We propose that MTM operations that access the deeper Kanawha Formation generate residual mined rocks in valley fills from which effluents with distinctive ⁸⁷Sr/⁸⁶Sr and Sr/Ca imprints affect the quality of the Appalachian watersheds.

Relevance:

20.00%

Abstract:

An enterprise information system (EIS) is an integrated data-applications platform characterized by diverse, heterogeneous, and distributed data sources. For many enterprises, a number of business processes still depend heavily on static rule-based methods and extensive human expertise. Enterprises are faced with the need for optimizing operation scheduling, improving resource utilization, discovering useful knowledge, and making data-driven decisions.

This thesis research is focused on real-time optimization and knowledge discovery that addresses workflow optimization, resource allocation, as well as data-driven predictions of process-execution times, order fulfillment, and enterprise service-level performance. In contrast to prior work on data analytics techniques for enterprise performance optimization, the emphasis here is on realizing scalable and real-time enterprise intelligence based on a combination of heterogeneous system simulation, combinatorial optimization, machine-learning algorithms, and statistical methods.

On-demand digital-print service is a representative enterprise requiring a powerful EIS. We use real-life data from Reischling Press, Inc. (RPI), a digital-print-service provider (PSP), to evaluate our optimization algorithms.

In order to handle the increase in volume and diversity of demands, we first present a high-performance, scalable, and real-time production scheduling algorithm for production automation based on an incremental genetic algorithm (IGA). The objective of this algorithm is to optimize the order dispatching sequence and balance resource utilization. Compared to prior work, this solution is scalable for a high volume of orders and it provides fast scheduling solutions for orders that require complex fulfillment procedures. Experimental results highlight its potential benefit in reducing production inefficiencies and enhancing the productivity of an enterprise.
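As a rough illustration of the scheduling idea above, the following Python sketch evolves an order dispatching sequence with a small genetic algorithm and reuses the previous best sequence as a seed when a new order arrives. The abstract does not describe the IGA's internals, so everything here is an assumption made for illustration: the single-resource model, the total-tardiness objective, the order crossover and swap mutation, the reseeding step standing in for "incremental", and all order names and numbers.

```python
import random


def total_tardiness(sequence, durations, due_dates):
    """Sum of lateness over all orders when they are processed one after another."""
    clock, late = 0, 0
    for order in sequence:
        clock += durations[order]
        late += max(0, clock - due_dates[order])
    return late


def order_crossover(parent_a, parent_b, rng):
    """Keep a slice of parent_a in place and fill the rest in parent_b's order."""
    i, j = sorted(rng.sample(range(len(parent_a) + 1), 2))
    middle = parent_a[i:j]
    rest = [gene for gene in parent_b if gene not in middle]
    return rest[:i] + middle + rest[i:]


def evolve(seed_sequence, durations, due_dates, generations=200, pop_size=30, rng=None):
    """Return the best dispatching sequence found, starting from seed_sequence."""
    rng = rng or random.Random(0)

    def cost(sequence):
        return total_tardiness(sequence, durations, due_dates)

    # The seed is always part of the first population; the rest are random shuffles.
    population = [list(seed_sequence)]
    while len(population) < pop_size:
        population.append(rng.sample(seed_sequence, len(seed_sequence)))

    for _ in range(generations):
        population.sort(key=cost)
        survivors = population[: pop_size // 2]  # elitist selection keeps the best half
        children = []
        while len(survivors) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(survivors, 2)
            child = order_crossover(parent_a, parent_b, rng)
            if rng.random() < 0.2:  # occasional swap mutation
                x, y = rng.sample(range(len(child)), 2)
                child[x], child[y] = child[y], child[x]
            children.append(child)
        population = survivors + children
    return min(population, key=cost)


# Hypothetical orders: processing time and due date in arbitrary time units.
durations = {"A": 3, "B": 2, "C": 4, "D": 1}
due_dates = {"A": 4, "B": 2, "C": 10, "D": 3}
first = evolve(["A", "B", "C", "D"], durations, due_dates)

# A new order arrives: append it to the previous best sequence and evolve again.
durations["E"], due_dates["E"] = 2, 5
seed = first + ["E"]
best = evolve(seed, durations, due_dates)
```

Because the seed stays in the initial population and the best half always survives, the returned sequence is never worse than the seed under this cost function, which is one plausible reason to reseed rather than restart.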

We next discuss analysis and prediction of different attributes involved in hierarchical components of an enterprise. We start from a study of the fundamental processes related to real-time prediction. Our process-execution time and process status prediction models integrate statistical methods with machine-learning algorithms. In addition to improving prediction accuracy over stand-alone machine-learning algorithms, these models also provide a probabilistic estimate of the predicted status. An order generally consists of multiple series and parallel processes. We next introduce an order-fulfillment prediction model that combines advantages of multiple classification models by incorporating flexible decision-integration mechanisms. Experimental results show that adopting due dates recommended by the model can significantly reduce the enterprise late-delivery ratio. Finally, we investigate service-level attributes that reflect the overall performance of an enterprise. We analyze and decompose time-series data into different components according to their hierarchical periodic nature, perform correlation analysis, and develop univariate prediction models for each component as well as multivariate models for correlated components. Predictions for the original time series are aggregated from the predictions of its components. In addition to a significant increase in mid-term prediction accuracy, this distributed modeling strategy also improves short-term time-series prediction accuracy.
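The decompose, predict, and aggregate strategy described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the thesis's models: it assumes a single known period, uses a per-phase mean as the periodic component, fits a straight line to the remainder, and sums the two component predictions. The function names and the sample data are hypothetical.

```python
def decompose(series, period):
    """Split a series into a per-phase periodic profile and a remainder."""
    overall_mean = sum(series) / len(series)
    profile = []
    for phase in range(period):
        values = series[phase::period]
        profile.append(sum(values) / len(values) - overall_mean)
    remainder = [x - profile[t % period] for t, x in enumerate(series)]
    return profile, remainder


def fit_line(values):
    """Ordinary least-squares line through the points (0, v0), (1, v1), ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    covariance = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    variance = sum((t - t_mean) ** 2 for t in range(n))
    slope = covariance / variance
    return slope, v_mean - slope * t_mean


def forecast(series, period, horizon):
    """Predict each component separately, then add the predictions together."""
    profile, remainder = decompose(series, period)
    slope, intercept = fit_line(remainder)  # univariate model for the remainder
    start = len(series)
    return [
        slope * t + intercept + profile[t % period]  # aggregate the components
        for t in range(start, start + horizon)
    ]


# Hypothetical service-level metric with a period of four time steps.
history = [5.0, 7.0, 6.0, 9.0] * 6
prediction = forecast(history, period=4, horizon=4)
```

A hierarchy of periods (for example daily within weekly) would repeat the `decompose` step on the remainder with the longer period; that extension is left out here to keep the sketch short.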

In summary, this thesis research has led to a set of characterization, optimization, and prediction tools for an EIS to derive insightful knowledge from data and use it as guidance for production management. It is expected to provide solutions for enterprises to increase reconfigurability, accomplish more automated procedures, and obtain data-driven recommendations or effective decisions.

Relevance:

20.00%

Abstract:

Selenium (Se) is a micronutrient necessary for the function of a variety of important enzymes; Se also exhibits a narrow range in concentrations between essentiality and toxicity. Oviparous vertebrates such as birds and fish are especially sensitive to Se toxicity, which causes reproductive impairment and defects in embryo development. Selenium occurs naturally in the Earth's crust, but it can be mobilized by a variety of anthropogenic activities, including agricultural practices, coal burning, and mining.

Mountaintop removal/valley fill (MTR/VF) coal mining is a form of surface mining found throughout central Appalachia in the United States that involves blasting off the tops of mountains to access underlying coal seams. Spoil rock from the mountain is placed into adjacent valleys, forming valley fills, which bury stream headwaters and negatively impact surface water quality. This research focused on the biological impacts of Se leached from MTR/VF coal mining operations located around the Mud River, West Virginia.

In order to assess the status of Se in a lotic (flowing) system such as the Mud River, surface water, insects, and fish samples including creek chub (Semotilus atromaculatus) and green sunfish (Lepomis cyanellus) were collected from a mining-impacted site as well as from a reference site not impacted by mining. Analysis of samples from the mined site showed increased conductivity and Se in the surface waters compared to the reference site in addition to increased concentrations of Se in insects and fish. Histological analysis of mined-site fish gills showed a lack of normal parasites, suggesting parasite populations may be disrupted due to poor water quality. X-ray absorption near-edge spectroscopy techniques were used to determine the speciation of Se in insect and creek chub samples. Insects contained approximately 40-50% inorganic Se (selenate and selenite) and 50-60% organic Se (Se-methionine and Se-cystine) while fish tissues contained lower proportions of inorganic Se than insects, instead having higher proportions of organic Se in the forms of methyl-Se-cysteine, Se-cystine, and Se-methionine.

Otoliths, calcified inner ear structures, were also collected from Mud River creek chubs and green sunfish and analyzed for Se content using laser ablation inductively coupled plasma mass spectrometry (LA-ICP-MS). Significant differences were found between the two species of fish, based on the concentrations of otolith Se. Green sunfish otoliths from all sites contained background or low concentrations of otolith Se (< 1 µg/g) that were not significantly different between mined and unmined sites. In contrast, creek chub otoliths from the historically mined site contained much higher (≥ 5 µg/g, up to approximately 68 µg/g) concentrations of Se than for the same species at the unmined site or for the green sunfish. Otolith Se concentrations were related to muscle Se concentrations for creek chubs (R² = 0.54, p = 0.0002 for the last 20% of the otolith Se versus muscle Se) while no relationship was observed for green sunfish.

Additional experiments using biofilms grown in the Mud River showed increased Se in mined site biofilms compared to the reference site. When we fed fathead minnows (Pimephales promelas) on these biofilms in the laboratory they accumulated higher concentrations of Se in liver and ovary tissues compared to fathead minnows fed on reference site biofilms. No differences in Se accumulation were found in muscle from either treatment group. Biofilms were also centrifuged and separated into filamentous green algae and the remaining diatom fraction. The majority of Se was found in the diatom fraction, with only about one third of total biofilm Se concentration present in the filamentous green algae fraction.

Finally, zebrafish (Danio rerio) embryos were exposed to aqueous Se in the form of selenate, selenite, and L-selenomethionine in an attempt to determine if oxidative stress plays a role in selenium embryo toxicity. Selenate and selenite exposure did not induce embryo deformities (lordosis and craniofacial malformation). L-selenomethionine, however, induced significantly higher deformity rates at 100 µg/L compared to controls. Antioxidant rescue of L-selenomethionine-induced deformities was attempted in embryos using N-acetylcysteine (NAC). Pretreatment with NAC significantly reduced deformities in the zebrafish embryos secondarily treated with L-selenomethionine, suggesting that oxidative stress may play a role in Se toxicity. Selenite exposure also induced a 6.6-fold increase in glutathione-S-transferase pi class 2 gene expression, which is involved in xenobiotic transformation. No changes in gene expression were observed for selenate or L-selenomethionine-exposed embryos.

The findings in this dissertation contribute to the understanding of how Se bioaccumulates in a lotic system and is transferred through a simulated food web, in addition to further exploring oxidative stress as a potential mechanism for Se-induced embryo toxicity. Future studies should continue to pursue the role of oxidative stress and other mechanisms in Se toxicity and the biotransformation of Se in aquatic ecosystems.

Relevance:

20.00%

Abstract:

Many factors such as poverty, ineffective institutions and environmental regulations may prevent developing countries from managing how natural resources are extracted to meet a strong market demand. Extraction of some resources has reached such proportions that evidence is measurable from space. We present recent evidence of the global demand for a single commodity and the ecosystem destruction resulting from commodity extraction, recorded by satellites for one of the most biodiverse areas of the world. We find that since 2003, recent mining deforestation in Madre de Dios, Peru, is increasing nonlinearly alongside a constant annual rate of increase in international gold price (∼18%/yr). We detect that the new pattern of mining deforestation (1915 ha/year, 2006-2009) is outpacing that of nearby settlement deforestation. We show that gold price is linked with exponential increases in Peruvian national mercury imports over time (R² = 0.93, p = 0.04, 2003-2009). Given the past rates of increase we predict that mercury imports may more than double for 2011 (∼500 t/year). Virtually all of Peru's mercury imports are used in artisanal gold mining. Much of the mining increase is unregulated/artisanal in nature, lacking environmental impact analysis or miner education. As a result, large quantities of mercury are being released into the atmosphere, sediments and waterways. Other developing countries endowed with gold deposits are likely experiencing similar environmental destruction in response to recent record high gold prices. The increasing availability of satellite imagery ought to evoke further studies linking economic variables with land use and cover changes on the ground.
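To make the "constant annual rate of increase" concrete, the short Python sketch below works through the compound-growth arithmetic for the gold price figure quoted in the abstract (about 18% per year). Treating the rate as exactly 18% and compounding it annually are simplifying assumptions, and the function names are illustrative. The same arithmetic is not applied to mercury imports, because the abstract does not state their growth rate.

```python
import math


def growth_factor(annual_rate, years):
    """Multiple by which a quantity grows at a constant compound annual rate."""
    return (1.0 + annual_rate) ** years


def doubling_time(annual_rate):
    """Years needed to double at a constant compound annual rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)


GOLD_PRICE_RATE = 0.18  # ~18%/yr, the figure quoted in the abstract

# Price multiple implied over the 2003-2009 window discussed in the abstract.
price_multiple_2003_2009 = growth_factor(GOLD_PRICE_RATE, 2009 - 2003)

# Time for the price to double at that rate.
years_to_double = doubling_time(GOLD_PRICE_RATE)
```

Under these assumptions the price rises roughly 2.7-fold over the six years and doubles in a little over four years, which shows how a constant percentage rate produces the exponential pattern the abstract describes.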