3 resultados para Return-based pricing kernel

em Digital Commons - Michigan Tech


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The developmental processes and functions of an organism are controlled by the genes and the proteins that are derived from these genes. The identification of key genes and the reconstruction of gene networks can provide a model to help us understand the regulatory mechanisms for the initiation and progression of biological processes or functional abnormalities (e.g. diseases) in living organisms. In this dissertation, I have developed statistical methods to identify the genes and transcription factors (TFs) involved in biological processes, constructed their regulatory networks, and also evaluated some existing association methods to find robust methods for coexpression analyses. Two kinds of data sets were used for this work: genotype data and gene expression microarray data. On the basis of these data sets, this dissertation has two major parts, together forming six chapters. The first part deals with developing association methods for rare variants using genotype data (chapter 4 and 5). The second part deals with developing and/or evaluating statistical methods to identify genes and TFs involved in biological processes, and construction of their regulatory networks using gene expression data (chapter 2, 3, and 6). For the first part, I have developed two methods to find the groupwise association of rare variants with given diseases or traits. The first method is based on kernel machine learning and can be applied to both quantitative as well as qualitative traits. Simulation results showed that the proposed method has improved power over the existing weighted sum method (WS) in most settings. The second method uses multiple phenotypes to select a few top significant genes. It then finds the association of each gene with each phenotype while controlling the population stratification by adjusting the data for ancestry using principal components. This method was applied to GAW 17 data and was able to find several disease risk genes. For the second part, I have worked on three problems. First problem involved evaluation of eight gene association methods. A very comprehensive comparison of these methods with further analysis clearly demonstrates the distinct and common performance of these eight gene association methods. For the second problem, an algorithm named the bottom-up graphical Gaussian model was developed to identify the TFs that regulate pathway genes and reconstruct their hierarchical regulatory networks. This algorithm has produced very significant results and it is the first report to produce such hierarchical networks for these pathways. The third problem dealt with developing another algorithm called the top-down graphical Gaussian model that identifies the network governed by a specific TF. The network produced by the algorithm is proven to be of very high accuracy.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Virtually every sector of business and industry that uses computing, including financial analysis, search engines, and electronic commerce, incorporate Big Data analysis into their business model. Sophisticated clustering algorithms are popular for deducing the nature of data by assigning labels to unlabeled data. We address two main challenges in Big Data. First, by definition, the volume of Big Data is too large to be loaded into a computer’s memory (this volume changes based on the computer used or available, but there is always a data set that is too large for any computer). Second, in real-time applications, the velocity of new incoming data prevents historical data from being stored and future data from being accessed. Therefore, we propose our Streaming Kernel Fuzzy c-Means (stKFCM) algorithm, which reduces both computational complexity and space complexity significantly. The proposed stKFCM only requires O(n2) memory where n is the (predetermined) size of a data subset (or data chunk) at each time step, which makes this algorithm truly scalable (as n can be chosen based on the available memory). Furthermore, only 2n2 elements of the full N × N (where N >> n) kernel matrix need to be calculated at each time-step, thus reducing both the computation time in producing the kernel elements and also the complexity of the FCM algorithm. Empirical results show that stKFCM, even with relatively very small n, can provide clustering performance as accurately as kernel fuzzy c-means run on the entire data set while achieving a significant speedup.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Increases in oil prices after the economic recession have been surprising for domestic oil production in the United States since the beginning of 2009. Not only did the conventional oil extraction increase, but unconventional oil production and exploration also improved greatly with the favorable economic conditions. This favorable economy encourages companies to invest in new reservoirs and technological developments. Recently, enhanced drilling techniques including hydraulic fracturing and horizontal drilling have been supporting the domestic economy by way of unconventional shale and tight oil from various U.S. locations. One of the main contributors to this oil boom is the unconventional oil production from the North Dakota Bakken field. Horizontal drilling has increased oil production in the Bakken field, but the economic issues of unconventional oil extraction are still debatable due to volatile oil prices, high decline rates of production, a limited production period, high production costs, and lack of transportation. The economic profitability and viability of the unconventional oil play in the North Dakota Bakken was tested with an economic analysis of average Bakken unconventional well features. Scenario analysis demonstrated that a typical North Dakota Bakken unconventional oil well is profitable and viable as shown by three financial metrics; net present value, internal rate of return, and break-even prices.