5 results for visual data analysis

in Digital Commons - Michigan Tech


Relevance:

100.00%

Abstract:

Analyzing large-scale gene expression data is a labor-intensive and time-consuming process. To make data analysis easier, we developed a set of pipelines for rapid processing and analysis of poplar gene expression data for knowledge discovery. Of the pipelines developed, the differentially expressed genes (DEGs) pipeline is designed to identify biologically important genes that are differentially expressed at one of multiple time points for a given condition. The pathway analysis pipeline was designed to identify differentially expressed metabolic pathways. The protein domain enrichment pipeline identifies the enriched protein domains present in the DEGs. Finally, the Gene Ontology (GO) enrichment analysis pipeline was developed to identify the enriched GO terms in the DEGs. Our pipeline tools can analyze both microarray and high-throughput gene expression data. These two types of data are obtained by two different technologies. Microarray technology measures gene expression levels via microarray chips, which are collections of microscopic DNA spots attached to a solid (glass) surface, whereas high-throughput sequencing, also called next-generation sequencing, is a newer technology that measures gene expression levels by directly sequencing mRNAs and obtaining each mRNA's copy number in cells or tissues. We also developed a web portal (http://sys.bio.mtu.edu/) that makes all pipelines publicly available so that users can analyze their own gene expression data. In addition to the analyses mentioned above, the portal can also perform GO hierarchy analysis, i.e., construct GO trees from a list of GO terms given as input.
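The abstract does not say which statistic the GO enrichment pipeline uses. One common choice is a one-sided hypergeometric test, which asks how likely it is to see at least the observed number of annotated genes among the DEGs by chance. The sketch below assumes that test; the function name and its parameters are illustrative and are not taken from the pipeline.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """One-sided hypergeometric p-value, P(X >= hits).

    population: number of genes in the background set
    annotated:  background genes that carry the GO term
    sample:     number of DEGs
    hits:       DEGs that carry the GO term
    (All four names are assumptions made for this sketch.)
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass of every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total
```

A small p-value for a term suggests that the term is over-represented among the DEGs. When many terms are tested, a multiple-testing correction would normally follow.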

Relevance:

100.00%

Abstract:

Nitrogen and water are essential for plant growth and development. In this study, we designed experiments to produce gene expression data from poplar roots under nitrogen starvation and water deprivation conditions. We found that a low concentration of nitrogen led first to increased root elongation, followed by lateral root proliferation and eventually increased root biomass. To identify genes regulating root growth and development under nitrogen starvation and water deprivation, we designed a series of data analysis procedures through which we successfully identified biologically important genes. Differentially Expressed Genes (DEGs) analysis identified the genes that are differentially expressed under nitrogen starvation or drought. Protein domain enrichment analysis identified enriched themes (in the same domains) that are highly interactive during the treatment. Gene Ontology (GO) enrichment analysis allowed us to identify biological processes that changed during nitrogen starvation. Based on the above analyses, we examined the local Gene Regulatory Network (GRN) and identified a number of transcription factors. Testing showed that one of them is a transcription factor ranked high in the hierarchy that affects root growth under nitrogen starvation. Analyzing gene expression data is very tedious and time-consuming. To avoid manual analysis, we worked to automate a computational pipeline, which can now be used to identify DEGs and perform protein domain analysis in a single run. It is implemented as Perl and R scripts.
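The pipeline itself is written in Perl and R and its DEG criteria are not given in the abstract. Purely as an illustration of what a DEG filter does, the following Python sketch flags genes whose log2 fold change between a control and a treated condition exceeds a threshold. The threshold, the pseudocount, and the input format are assumptions; a real analysis would also use replicates and a statistical test.

```python
from math import log2


def differentially_expressed(control, treated, min_abs_log2fc=1.0, pseudocount=1.0):
    """Return {gene: log2 fold change} for genes passing the threshold.

    control, treated: dicts mapping a gene ID to an expression value
    (hypothetical input format). The pseudocount avoids division by zero
    for genes with no measured expression.
    """
    result = {}
    for gene, base in control.items():
        if gene not in treated:
            continue  # skip genes measured in only one condition
        lfc = log2((treated[gene] + pseudocount) / (base + pseudocount))
        if abs(lfc) >= min_abs_log2fc:
            result[gene] = lfc
    return result
```

With the default threshold of 1.0, a gene is reported when its expression at least doubles or halves under treatment.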

Relevance:

90.00%

Abstract:

Turrialba is one of the largest and most active stratovolcanoes in the Central Cordillera of Costa Rica and an excellent target for validating satellite data against ground-based measurements because of its high elevation, relative ease of access, and persistent elevated SO2 degassing. The Ozone Monitoring Instrument (OMI) aboard the Aura satellite makes daily global observations of atmospheric trace gases and is used in this investigation to obtain volcanic SO2 retrievals in the Turrialba volcanic plume. We present and evaluate the relative accuracy of two OMI SO2 data analysis procedures: the automatic Band Residual Index (BRI) technique and the manual Normalized Cloud-mass (NCM) method. We find a linear correlation and good quantitative agreement between SO2 burdens derived from the BRI and NCM techniques, with an improved correlation when wet season data are excluded. We also present the first comparisons between volcanic SO2 emission rates obtained from ground-based mini-DOAS measurements at Turrialba and three new OMI SO2 data analysis techniques: the MODIS smoke estimation, OMI SO2 lifetime, and OMI SO2 transect techniques. The OMI SO2 retrievals were robustly validated, with both qualitative and quantitative agreement under specific atmospheric conditions, demonstrating the utility of satellite measurements for estimating accurate SO2 emission rates and monitoring passively degassing volcanoes.
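The comparison of BRI and NCM burdens, with and without wet season data, amounts to computing a correlation coefficient on two subsets of paired values. A minimal sketch follows, assuming Pearson's r as the measure and a hypothetical record format of (BRI burden, NCM burden, wet-season flag); the abstract does not specify either.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def burden_correlation(records, include_wet_season=True):
    """Correlate paired burdens, optionally dropping wet-season records.

    records: iterable of (bri_burden, ncm_burden, is_wet_season) tuples
    (a hypothetical layout chosen for this sketch).
    """
    kept = [(b, n) for b, n, wet in records if include_wet_season or not wet]
    return pearson_r([b for b, _ in kept], [n for _, n in kept])
```

Calling `burden_correlation` once with `include_wet_season=True` and once with `False` gives the two coefficients to compare.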

Relevance:

90.00%

Abstract:

Water resource depletion and sanitation are growing problems around the world. One solution to both problems is the use of composting latrines, which require no water and have been recommended by the World Health Organization as an improved sanitation technology. However, little analysis has been done on the decomposition process occurring inside the latrine, including what temperatures are reached and which variables most affect the composting process. Better knowledge of how outside variables affect composting latrines can help development workers decide whether to implement this technology and better educate users on appropriate maintenance methods. This report presents a full, detailed construction manual and a temperature data analysis of a double vault composting latrine. During the author's two-year Peace Corps service in rural Paraguay, he was involved in building twenty-one composting latrines and took detailed temperature readings and visual observations of his personal latrine for ten months. The author also took limited temperature readings of fourteen community members' latrines over a three-month period. These data points were analyzed to find correlations between compost temperatures and several variables. The two main variables found to affect compost temperatures were the seasonal trends in outside temperature and the mixing of the compost together with the addition of moisture. Outside seasonal temperature changes were compared to those of the compost, and a linear regression yielded an R² value of 0.89. Mixing the compost and adding water, or a water/urine mixture, raised the compost temperature 100% of the time, with seasonal temperatures determining the rate and duration of the increase. The temperature readings were also used to find events in which certain temperatures were held long enough to reach total pathogen destruction in the compost.
Four different events were recorded in which a temperature of 122°F (50°C) was held for at least 24 hours, ensuring total pathogen destruction in that area of the compost. One event of 114.8°F (46°C) held for one week was also recorded, again ensuring total pathogen destruction. Analysis of the temperature data showed, however, that the compost reached total pathogen destruction levels in only ten percent of the data points. Because of this, the storage time recommendation outlined by the World Health Organization should be followed. The WHO recommends storing compost for 1.5-2 years in climates with ambient temperatures of 2-20°C (35-68°F), and for at least 1 year with ambient temperatures of 20-35°C (68-95°F). If these storage durations are attainable, the double vault composting latrine is an economical and achievable solution to sanitation that also conserves water resources.
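The hold-time criteria quoted in this abstract (50°C for at least 24 hours, or 46°C for one week) can be checked mechanically against a temperature log. The sketch below assumes timestamped readings given as (hours since start, temperature in °C) and assumes the compost stayed above the threshold between consecutive qualifying readings. The function names and the reading format are hypothetical and do not come from the report.

```python
def f_to_c(temp_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def hold_events(readings, threshold_c=50.0, min_hours=24.0):
    """Find spans where temperature stayed at or above a threshold.

    readings: list of (hours_since_start, temp_c), sorted by time.
    Returns (start, end) pairs for each run of consecutive readings at or
    above `threshold_c` whose first and last reading are at least
    `min_hours` apart.
    """
    events = []
    start = last = None
    for t, temp in readings:
        if temp >= threshold_c:
            if start is None:
                start = t  # a new run begins at this reading
            last = t
        else:
            if start is not None and last - start >= min_hours:
                events.append((start, last))
            start = None  # the run is broken by a reading below threshold
    # Close a run that is still open at the end of the log.
    if start is not None and last - start >= min_hours:
        events.append((start, last))
    return events
```

For the one-week criterion, the same function can be called with `threshold_c=46.0` and `min_hours=168.0`. Sparse readings make the result optimistic, since dips between readings go unseen.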

Relevance:

90.00%

Abstract:

Principal Component Analysis (PCA) is a popular method for dimension reduction that is used in many fields, including data compression, image processing, and exploratory data analysis. However, the traditional PCA method has several drawbacks: it is not efficient for high-dimensional data, and it cannot compute sufficiently accurate principal components when a relatively large portion of the data is missing. In this report, we propose using the EM-PCA method for dimension reduction of power system measurements with missing data, and we provide a comparative study of the traditional PCA and EM-PCA methods. Our extensive experimental results show that the EM-PCA method is more effective and more accurate than the traditional PCA method for dimension reduction of power system measurement data when a large portion of the data set is missing.
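The abstract does not spell out which EM-PCA variant the report uses. A common formulation alternates between estimating each sample's low-dimensional score from its observed entries (E-step) and re-estimating the principal direction from those scores (M-step), so missing entries never need to be filled in beforehand. The sketch below shows that idea for a single component only, assuming the columns are already mean-centered and that missing values are marked with `None`. All names are illustrative.

```python
from math import sqrt


def _scores(rows, c):
    """E-step: least-squares score per sample, using observed entries only."""
    scores = []
    for row in rows:
        num = sum(c[i] * v for i, v in enumerate(row) if v is not None)
        den = sum(c[i] ** 2 for i, v in enumerate(row) if v is not None)
        scores.append(num / den if den else 0.0)
    return scores


def em_pca_rank1(rows, n_iter=200):
    """Estimate one principal direction from data with missing entries.

    rows: list of samples, each a list of floats or None (missing).
    Assumes mean-centered columns. Returns (direction, scores).
    """
    d = len(rows[0])
    c = [1.0 / sqrt(d)] * d  # arbitrary unit-length starting direction
    for _ in range(n_iter):
        s = _scores(rows, c)
        # M-step: update each loading from the samples that observe it.
        for i in range(d):
            num = sum(s[n] * r[i] for n, r in enumerate(rows) if r[i] is not None)
            den = sum(s[n] ** 2 for n, r in enumerate(rows) if r[i] is not None)
            if den:
                c[i] = num / den
        norm = sqrt(sum(v * v for v in c))
        if norm:
            c = [v / norm for v in c]  # keep the direction at unit length
    return c, _scores(rows, c)


def fill_missing(rows, c, scores):
    """Replace each None with its rank-1 reconstruction score * loading."""
    return [
        [v if v is not None else scores[n] * c[i] for i, v in enumerate(row)]
        for n, row in enumerate(rows)
    ]
```

A multi-component version would replace the scalar divisions with small linear solves; the single-component case is shown because it keeps the two steps easy to follow.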