6 results for EXPRESSION DATA

in Digital Commons - Michigan Tech


Relevance:

100.00%

Publisher:

Abstract:

Analyzing large-scale gene expression data is a labor-intensive and time-consuming process. To make data analysis easier, we developed a set of pipelines for rapid processing and analysis of poplar gene expression data for knowledge discovery. Of the pipelines developed, the differentially expressed genes (DEGs) pipeline is designed to identify biologically important genes that are differentially expressed at one of multiple time points or conditions. The pathway analysis pipeline was designed to identify differentially expressed metabolic pathways. The protein domain enrichment pipeline can identify the enriched protein domains present in the DEGs. Finally, the Gene Ontology (GO) enrichment analysis pipeline was developed to identify the enriched GO terms in the DEGs. Our pipeline tools can analyze both microarray and high-throughput sequencing gene expression data. These two types of data are obtained by two different technologies. Microarray technology measures gene expression levels via microarray chips, each a collection of microscopic DNA spots attached to a solid (glass) surface, whereas high-throughput sequencing, also called next-generation sequencing, is a newer technology that measures gene expression levels by directly sequencing mRNAs and obtaining each mRNA’s copy number in cells or tissues. We also developed a web portal (http://sys.bio.mtu.edu/) that makes all pipelines available to the public so that users can analyze their own gene expression data. In addition to the analyses mentioned above, the portal can also perform GO hierarchy analysis, i.e. construct GO trees using a list of GO terms as input.
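The abstract does not say which statistical test the GO enrichment pipeline uses. As a rough illustration only, enrichment of a GO term among DEGs is often scored with a one-sided hypergeometric test; the sketch below assumes that choice. The function name and all gene counts are hypothetical and are not taken from the thesis or the web portal.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, term_genes, deg_count, deg_with_term):
    """One-sided hypergeometric p-value: probability of seeing at least
    `deg_with_term` annotated genes when `deg_count` genes are drawn without
    replacement from `total_genes`, of which `term_genes` carry the GO term."""
    upper = min(term_genes, deg_count)
    denom = comb(total_genes, deg_count)
    tail = sum(
        comb(term_genes, k) * comb(total_genes - term_genes, deg_count - k)
        for k in range(deg_with_term, upper + 1)
    )
    return tail / denom


# Hypothetical counts: 1000 genes on the platform, 50 annotated with the term,
# 100 DEGs, 15 of which carry the term (about 5 would be expected by chance).
p_value = hypergeom_enrichment_p(
    total_genes=1000, term_genes=50, deg_count=100, deg_with_term=15
)
print(f"enrichment p-value: {p_value:.3g}")
```

In practice one such p-value would be computed per GO term and then corrected for multiple testing; that step is omitted here.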

Relevance:

70.00%

Publisher:

Abstract:

Nitrogen and water are essential for plant growth and development. In this study, we designed experiments to produce gene expression data of poplar roots under nitrogen starvation and water deprivation conditions. We found that a low concentration of nitrogen led first to increased root elongation, followed by lateral root proliferation and eventually increased root biomass. To identify genes regulating root growth and development under nitrogen starvation and water deprivation, we designed a series of data analysis procedures, through which we successfully identified biologically important genes. Differentially Expressed Genes (DEGs) analysis identified the genes that are differentially expressed under nitrogen starvation or drought. Protein domain enrichment analysis identified enriched themes (in the same domains) that are highly interactive during the treatment. Gene Ontology (GO) enrichment analysis allowed us to identify biological processes that changed during nitrogen starvation. Based on the above analyses, we examined the local Gene Regulatory Network (GRN) and identified a number of transcription factors. Testing showed that one of them is a transcription factor ranked high in the hierarchy that affects root growth under nitrogen starvation. Analyzing gene expression data manually is tedious and time-consuming. To avoid manual analysis, we worked to automate a computational pipeline, which can now be used for identification of DEGs and protein domain analysis in a single run. It is implemented in Perl and R scripts.
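The abstract states that the pipeline is written in Perl and R but gives no detail on how DEGs are called. Purely as an illustration of the idea, the Python sketch below flags genes whose log2 fold change between control and treated replicates exceeds a threshold. The threshold, the pseudocount, the gene names and the expression values are all assumptions made for this example, and a real analysis would normally add a statistical test with multiple-testing correction.

```python
from math import log2
from statistics import mean


def log2_fold_change(control, treated, pseudocount=1.0):
    """log2 ratio of mean treated to mean control expression.
    The pseudocount (an assumption) avoids division by zero."""
    return log2((mean(treated) + pseudocount) / (mean(control) + pseudocount))


def call_degs(expression, threshold=1.0):
    """Return {gene: log2 fold change} for genes with |log2FC| >= threshold.
    `expression` maps gene -> (control replicates, treated replicates)."""
    degs = {}
    for gene, (control, treated) in expression.items():
        lfc = log2_fold_change(control, treated)
        if abs(lfc) >= threshold:
            degs[gene] = lfc
    return degs


# Hypothetical expression values, e.g. control vs. nitrogen-starved roots.
expression = {
    "geneA": ([10.0, 12.0, 11.0], [45.0, 50.0, 40.0]),  # induced
    "geneB": ([30.0, 28.0, 32.0], [29.0, 31.0, 30.0]),  # unchanged
    "geneC": ([80.0, 75.0, 85.0], [18.0, 20.0, 22.0]),  # repressed
}
degs = call_degs(expression)
for gene, lfc in sorted(degs.items()):
    print(gene, round(lfc, 2))
```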

Relevance:

60.00%

Publisher:

Abstract:

Secondary metabolites play an important role in plant protection against biotic and abiotic stress. In Populus, phenolic glycosides (PGs) and condensed tannins (CTs) are two such groups of compounds derived from the common phenylpropanoid pathway. The basal levels and the inducibility of PGs and CTs depend on genetic as well as environmental factors, such as soil nitrogen (N) level. Carbohydrate allocation, transport and sink strength also affect PG and CT levels. A negative correlation between the levels of PGs and CTs has been observed in several studies. However, the molecular mechanism underlying this relationship is not known. We used a cell culture system to understand the negative correlation between PGs and CTs. Under normal culture conditions, neither salicin nor higher-order PGs accumulated in cell cultures. Several factors, such as hormones, light, organelles and precursors, are discussed in the context of the inability of aspen suspension cells to synthesize PGs. Salicin and its isomer, isosalicin, were detected in cell cultures fed with salicyl alcohol, salicylaldehyde and helicin. At higher levels (5 mM) of salicyl alcohol feeding, accumulation of salicins led to reduced CT production in the cells. Based on metabolic and gene expression data, the CT reduction in salicin-accumulating cells is partly a result of regulatory changes at the transcriptional level affecting carbon partitioning between growth processes and phenylpropanoid CT biosynthesis. Based on molecular studies, the glycosyltransferases GT1-2 and GT1-246 may function in glycosylation of simple phenolics, such as salicyl alcohol, in cell cultures. The uptake of such glycosides into the vacuole may be mediated to some extent by the tonoplast-localized multidrug-resistance associated protein transporters PtMRP1 and PtMRP6. In Populus, sucrose is the common transported carbohydrate and its transport is possibly regulated by sucrose transporters (SUTs). SUTs are also capable of transporting simple PGs, such as salicin.
Therefore, we characterized the SUT gene family in Populus and investigated, by transgenic analysis, the possible role of the most abundantly expressed member, PtSUT4, in PG-CT homeostasis using plants grown under varying nitrogen regimes. PtSUT4 transgenic plants were phenotypically similar to the wildtype plants, except that the leaf area-to-stem volume ratio was higher for transgenic plants. In SUT4 transgenics, levels of non-structural carbohydrates, such as sucrose and starch, were altered in mature leaves. Compared to wildtype plants, the levels of PGs and CTs in green tissues of transgenic plants were lower under N-replete conditions but higher under N-depleted conditions. Based on our results, SUT4 partly regulates N-level dependent PG-CT homeostasis by differential carbohydrate allocation.

Relevance:

60.00%

Publisher:

Abstract:

The developmental processes and functions of an organism are controlled by its genes and the proteins derived from them. The identification of key genes and the reconstruction of gene networks can provide a model to help us understand the regulatory mechanisms for the initiation and progression of biological processes or functional abnormalities (e.g. diseases) in living organisms. In this dissertation, I have developed statistical methods to identify the genes and transcription factors (TFs) involved in biological processes, constructed their regulatory networks, and evaluated some existing association methods to find robust methods for coexpression analyses. Two kinds of data sets were used for this work: genotype data and gene expression microarray data. On the basis of these data sets, this dissertation has two major parts, together forming six chapters. The first part deals with developing association methods for rare variants using genotype data (chapters 4 and 5). The second part deals with developing and/or evaluating statistical methods to identify genes and TFs involved in biological processes, and with constructing their regulatory networks using gene expression data (chapters 2, 3, and 6). For the first part, I have developed two methods to find the groupwise association of rare variants with given diseases or traits. The first method is based on kernel machine learning and can be applied to both quantitative and qualitative traits. Simulation results showed that the proposed method has improved power over the existing weighted sum (WS) method in most settings. The second method uses multiple phenotypes to select a few top significant genes. It then finds the association of each gene with each phenotype while controlling for population stratification by adjusting the data for ancestry using principal components. This method was applied to GAW 17 data and was able to find several disease risk genes.
For the second part, I have worked on three problems. The first problem involved the evaluation of eight gene association methods. A comprehensive comparison of these methods, with further analysis, demonstrates the distinct and common performance characteristics of these eight gene association methods. For the second problem, an algorithm named the bottom-up graphical Gaussian model was developed to identify the TFs that regulate pathway genes and to reconstruct their hierarchical regulatory networks. This algorithm produced highly significant results, and this is the first report of such hierarchical networks for these pathways. The third problem dealt with developing another algorithm, called the top-down graphical Gaussian model, that identifies the network governed by a specific TF. The network produced by the algorithm was shown to be highly accurate.
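Graphical Gaussian models are generally built on partial correlations, i.e. the correlation between two genes after removing the effect of other genes. The sketch below shows only the simplest first-order case, with one conditioning variable; it is not the bottom-up or top-down algorithm described in the abstract, whose details are not given here. The expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def partial_correlation(x, y, z):
    """First-order partial correlation of x and y given z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical expression profiles: a TF and two genes that both track it.
tf = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_x = [1.5, 1.5, 3.5, 3.5, 5.5, 5.5]
gene_y = [1.5, 1.5, 2.5, 4.5, 5.5, 5.5]

r_xy = pearson(gene_x, gene_y)
r_xy_given_tf = partial_correlation(gene_x, gene_y, tf)
print(f"correlation: {r_xy:.2f}, partial given TF: {r_xy_given_tf:.2f}")
```

The two genes are strongly correlated overall, but much less so once the TF is conditioned on, which is the kind of signal used to separate direct from indirect associations.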

Relevance:

60.00%

Publisher:

Abstract:

Sporulation is a process in which some bacteria divide asymmetrically to form tough, protective endospores, which help them survive in a hazardous environment for quite a long time. The factors that can trigger this process are diverse: heat, radiation, chemicals and lack of nutrients can all lead to the formation of endospores. Sporulation leads to low productivity during industrial production. However, the sporulation mechanism in the spore-forming bacterium Clostridium thermocellum is still unclear. If a regulatory network of sporulation can be built, we may be able to find ways to inhibit this process. In this study, a computational method is applied to predict the sporulation network in Clostridium thermocellum. A working sporulation network model with 40 newly predicted genes and 4 function groups is built using a network construction program, CINPER. Five sets of microarray expression data from Clostridium thermocellum under different conditions were collected. The analysis shows that the predicted result is reasonable.

Relevance:

30.00%

Publisher:

Abstract:

Phytic acid is the major storage form of phosphorus and inositol in seeds and legumes. It forms insoluble phytate salts by chelating positively charged mineral ions. Non-ruminant animals are not able to digest phytate due to the lack of phytases in their GI tracts; the undigested phytate is therefore excreted, leading to environmental contamination. Supplementation of animal feed with phytases has proven to be an effective strategy to alleviate nutritional and environmental issues. The unique catalytic and thermal stability properties of alkaline phytase from lily pollen (LlALP) suggest that it has the potential to be useful as a feed supplement. Our goal is to develop a method for the production of substantial amounts of rLlALP for animal feed and structural studies. rLlALP2 has been successfully expressed in the yeast Pichia pastoris. However, expression yield was modest (8-10 mg/L). Gene copy number has been identified as an important parameter in enhancing protein yields. Multicopy clones were selected using Zeocin-resistance-based vectors and by challenging transformants with high Zeocin levels under different conditions. The data indicate that increasing selection pressure led to the generation of clones with amplification of both the rLlAlp2 and Zeor genes, although the two genes were not equally amplified. Additionally, step-wise selection methods generated clones with greater amplification. The effects of transgene copy number and gene sequence optimization on expression levels of rLlALP2 were examined. The data indicate that increasing the copy number of rLlAlp2 in transformed clones was detrimental to expression level. The use of a sequence-optimized rLlAlp2 (op-rLlAlp2) increased expression yield of the active enzyme by 25-50%, suggesting that transcription and translation efficiency are not major bottlenecks in the production of rLlALP2.
Lowering the induction temperature to 20 °C led to an increase in enzyme activity of 1.2 to 20-fold, suggesting that protein folding or post-translational processes may be limiting factors for rLlALP2 production. Cumulatively, optimization of copy number, gene sequence optimization and reduced temperature led to a three-fold increase in rLlALP2 enzyme activity (25-30 mg/L). In an effort to simplify the purification process of rLlALP2, extracellular expression of phytase was investigated. Extracellular expression is dependent on the presence of an appropriate secretion signal upstream of the transgene; native signal peptide(s) present in the transgene may also influence secretion efficiency. The data suggest that deletion of both the N- and C-terminal signal peptides of rLlALP2 enhanced α-mating factor (α-MF)-driven secretion of LlALP2 by four-fold. The secretion signal peptide of chicken egg white lysozyme was ineffective in secreting rLlALP2 in P. pastoris. To enhance rLlALP2 secretion, the effectiveness of the strong inducible promoter (PAOX1) was compared with that of the constitutive promoter (PGAP). The intracellular yield of rLlALP2 was about four-fold greater under the control of PGAP compared to PAOX1, and the extracellular expression level of rLlALP2 was around eight-fold greater (75-100 mg/L). The successful production of active rLlALP2 in P. pastoris will allow us to conduct the animal feed supplementation studies and structural studies.