11 resultados para Constraint based modeling

em Cochin University of Science


Relevância:

90.00% 90.00%

Publicador:

Resumo:

Computational Biology is the research are that contributes to the analysis of biological data through the development of algorithms which will address significant research problems.The data from molecular biology includes DNA,RNA ,Protein and Gene expression data.Gene Expression Data provides the expression level of genes under different conditions.Gene expression is the process of transcribing the DNA sequence of a gene into mRNA sequences which in turn are later translated into proteins.The number of copies of mRNA produced is called the expression level of a gene.Gene expression data is organized in the form of a matrix. Rows in the matrix represent genes and columns in the matrix represent experimental conditions.Experimental conditions can be different tissue types or time points.Entries in the gene expression matrix are real values.Through the analysis of gene expression data it is possible to determine the behavioral patterns of genes such as similarity of their behavior,nature of their interaction,their respective contribution to the same pathways and so on. Similar expression patterns are exhibited by the genes participating in the same biological process.These patterns have immense relevance and application in bioinformatics and clinical research.Theses patterns are used in the medical domain for aid in more accurate diagnosis,prognosis,treatment planning.drug discovery and protein network analysis.To identify various patterns from gene expression data,data mining techniques are essential.Clustering is an important data mining technique for the analysis of gene expression data.To overcome the problems associated with clustering,biclustering is introduced.Biclustering refers to simultaneous clustering of both rows and columns of a data matrix. Clustering is a global whereas biclustering is a local model.Discovering local expression patterns is essential for identfying many genetic pathways that are not apparent otherwise.It is therefore necessary to move beyond the clustering paradigm towards developing approaches which are capable of discovering local patterns in gene expression data.A biclusters is a submatrix of the gene expression data matrix.The rows and columns in the submatrix need not be contiguous as in the gene expression data matrix.Biclusters are not disjoint.Computation of biclusters is costly because one will have to consider all the combinations of columans and rows in order to find out all the biclusters.The search space for the biclustering problem is 2 m+n where m and n are the number of genes and conditions respectively.Usually m+n is more than 3000.The biclustering problem is NP-hard.Biclustering is a powerful analytical tool for the biologist.The research reported in this thesis addresses the problem of biclustering.Ten algorithms are developed for the identification of coherent biclusters from gene expression data.All these algorithms are making use of a measure called mean squared residue to search for biclusters.The objective here is to identify the biclusters of maximum size with the mean squared residue lower than a given threshold. All these algorithms begin the search from tightly coregulated submatrices called the seeds.These seeds are generated by K-Means clustering algorithm.The algorithms developed can be classified as constraint based,greedy and metaheuristic.Constarint based algorithms uses one or more of the various constaints namely the MSR threshold and the MSR difference threshold.The greedy approach makes a locally optimal choice at each stage with the objective of finding the global optimum.In metaheuristic approaches particle Swarm Optimization(PSO) and variants of Greedy Randomized Adaptive Search Procedure(GRASP) are used for the identification of biclusters.These algorithms are implemented on the Yeast and Lymphoma datasets.Biologically relevant and statistically significant biclusters are identified by all these algorithms which are validated by Gene Ontology database.All these algorithms are compared with some other biclustering algorithms.Algorithms developed in this work overcome some of the problems associated with the already existing algorithms.With the help of some of the algorithms which are developed in this work biclusters with very high row variance,which is higher than the row variance of any other algorithm using mean squared residue, are identified from both Yeast and Lymphoma data sets.Such biclusters which make significant change in the expression level are highly relevant biologically.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Agent based simulation is a widely developing area in artificial intelligence.The simulation studies are extensively used in different areas of disaster management. This work deals with the study of an agent based evacuation simulation which is being done to handle the various evacuation behaviors.Various emergent behaviors of agents are addressed here. Dynamic grouping behaviors of agents are studied. Collision detection and obstacle avoidances are also incorporated in this approach.Evacuation is studied with single exits and multiple exits and efficiency is measured in terms of evacuation rate, collision rate etc.Net logo is the tool used which helps in the efficient modeling of scenarios in evacuation

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This study is concerned with Autoregressive Moving Average (ARMA) models of time series. ARMA models form a subclass of the class of general linear models which represents stationary time series, a phenomenon encountered most often in practice by engineers, scientists and economists. It is always desirable to employ models which use parameters parsimoniously. Parsimony will be achieved by ARMA models because it has only finite number of parameters. Even though the discussion is primarily concerned with stationary time series, later we will take up the case of homogeneous non stationary time series which can be transformed to stationary time series. Time series models, obtained with the help of the present and past data is used for forecasting future values. Physical science as well as social science take benefits of forecasting models. The role of forecasting cuts across all fields of management-—finance, marketing, production, business economics, as also in signal process, communication engineering, chemical processes, electronics etc. This high applicability of time series is the motivation to this study.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This study focuses on the onset of southwest monsoon over Kerala. India Meteorological Department (IMD) has been using a semi-objective method to define monsoon onset. The main objectives of the study are to understand the monsoon onset processes, to simulate monsoon onset in a GCM using as input the atmospheric conditions and Sea Surface Temperature, 10 days earlier to the onset, to develop a method for medium range prediction of the date of onset of southwest monsoon over Kerala and to examine the possibility of objectively defining the date of Monsoon Onset over Kerala (MOK). It gives a broad description of regional monsoon systems and monsoon onsets over Asia and Australia. Asian monsoon includes two separate subsystems, Indain monsoon and East Asian monsoon. It is seen from this study that the duration of the different phases of the onset process are dependent on the period of ISO. Based on the study of the monsoon onset process, modeling studies can be done for better understanding of the ocean-atmosphere interaction especially those associated with the warm pool in the Bay of Bengal and the Arabian Sea.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The thesis has covered various aspects of modeling and analysis of finite mean time series with symmetric stable distributed innovations. Time series analysis based on Box and Jenkins methods are the most popular approaches where the models are linear and errors are Gaussian. We highlighted the limitations of classical time series analysis tools and explored some generalized tools and organized the approach parallel to the classical set up. In the present thesis we mainly studied the estimation and prediction of signal plus noise model. Here we assumed the signal and noise follow some models with symmetric stable innovations.We start the thesis with some motivating examples and application areas of alpha stable time series models. Classical time series analysis and corresponding theories based on finite variance models are extensively discussed in second chapter. We also surveyed the existing theories and methods correspond to infinite variance models in the same chapter. We present a linear filtering method for computing the filter weights assigned to the observation for estimating unobserved signal under general noisy environment in third chapter. Here we consider both the signal and the noise as stationary processes with infinite variance innovations. We derived semi infinite, double infinite and asymmetric signal extraction filters based on minimum dispersion criteria. Finite length filters based on Kalman-Levy filters are developed and identified the pattern of the filter weights. Simulation studies show that the proposed methods are competent enough in signal extraction for processes with infinite variance.Parameter estimation of autoregressive signals observed in a symmetric stable noise environment is discussed in fourth chapter. Here we used higher order Yule-Walker type estimation using auto-covariation function and exemplify the methods by simulation and application to Sea surface temperature data. We increased the number of Yule-Walker equations and proposed a ordinary least square estimate to the autoregressive parameters. Singularity problem of the auto-covariation matrix is addressed and derived a modified version of the Generalized Yule-Walker method using singular value decomposition.In fifth chapter of the thesis we introduced partial covariation function as a tool for stable time series analysis where covariance or partial covariance is ill defined. Asymptotic results of the partial auto-covariation is studied and its application in model identification of stable auto-regressive models are discussed. We generalize the Durbin-Levinson algorithm to include infinite variance models in terms of partial auto-covariation function and introduce a new information criteria for consistent order estimation of stable autoregressive model.In chapter six we explore the application of the techniques discussed in the previous chapter in signal processing. Frequency estimation of sinusoidal signal observed in symmetric stable noisy environment is discussed in this context. Here we introduced a parametric spectrum analysis and frequency estimate using power transfer function. Estimate of the power transfer function is obtained using the modified generalized Yule-Walker approach. Another important problem in statistical signal processing is to identify the number of sinusoidal components in an observed signal. We used a modified version of the proposed information criteria for this purpose.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

After skin cancer, breast cancer accounts for the second greatest number of cancer diagnoses in women. Currently the etiologies of breast cancer are unknown, and there is no generally accepted therapy for preventing it. Therefore, the best way to improve the prognosis for breast cancer is early detection and treatment. Computer aided detection systems (CAD) for detecting masses or micro-calcifications in mammograms have already been used and proven to be a potentially powerful tool , so the radiologists are attracted by the effectiveness of clinical application of CAD systems. Fractal geometry is well suited for describing the complex physiological structures that defy the traditional Euclidean geometry, which is based on smooth shapes. The major contribution of this research include the development of • A new fractal feature to accurately classify mammograms into normal and normal (i)With masses (benign or malignant) (ii) with microcalcifications (benign or malignant) • A novel fast fractal modeling method to identify the presence of microcalcifications by fractal modeling of mammograms and then subtracting the modeled image from the original mammogram. The performances of these methods were evaluated using different standard statistical analysis methods. The results obtained indicate that the developed methods are highly beneficial for assisting radiologists in making diagnostic decisions. The mammograms for the study were obtained from the two online databases namely, MIAS (Mammographic Image Analysis Society) and DDSM (Digital Database for Screening Mammography.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Quantile functions are efficient and equivalent alternatives to distribution functions in modeling and analysis of statistical data (see Gilchrist, 2000; Nair and Sankaran, 2009). Motivated by this, in the present paper, we introduce a quantile based Shannon entropy function. We also introduce residual entropy function in the quantile setup and study its properties. Unlike the residual entropy function due to Ebrahimi (1996), the residual quantile entropy function determines the quantile density function uniquely through a simple relationship. The measure is used to define two nonparametric classes of distributions

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Partial moments are extensively used in literature for modeling and analysis of lifetime data. In this paper, we study properties of partial moments using quantile functions. The quantile based measure determines the underlying distribution uniquely. We then characterize certain lifetime quantile function models. The proposed measure provides alternate definitions for ageing criteria. Finally, we explore the utility of the measure to compare the characteristics of two lifetime distributions

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Modeling nonlinear systems using Volterra series is a century old method but practical realizations were hampered by inadequate hardware to handle the increased computational complexity stemming from its use. But interest is renewed recently, in designing and implementing filters which can model much of the polynomial nonlinearities inherent in practical systems. The key advantage in resorting to Volterra power series for this purpose is that nonlinear filters so designed can be made to work in parallel with the existing LTI systems, yielding improved performance. This paper describes the inclusion of a quadratic predictor (with nonlinearity order 2) with a linear predictor in an analog source coding system. Analog coding schemes generally ignore the source generation mechanisms but focuses on high fidelity reconstruction at the receiver. The widely used method of differential pnlse code modulation (DPCM) for speech transmission uses a linear predictor to estimate the next possible value of the input speech signal. But this linear system do not account for the inherent nonlinearities in speech signals arising out of multiple reflections in the vocal tract. So a quadratic predictor is designed and implemented in parallel with the linear predictor to yield improved mean square error performance. The augmented speech coder is tested on speech signals transmitted over an additive white gaussian noise (AWGN) channel.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, a novel fast method for modeling mammograms by deterministic fractal coding approach to detect the presence of microcalcifications, which are early signs of breast cancer, is presented. The modeled mammogram obtained using fractal encoding method is visually similar to the original image containing microcalcifications, and therefore, when it is taken out from the original mammogram, the presence of microcalcifications can be enhanced. The limitation of fractal image modeling is the tremendous time required for encoding. In the present work, instead of searching for a matching domain in the entire domain pool of the image, three methods based on mean and variance, dynamic range of the image blocks, and mass center features are used. This reduced the encoding time by a factor of 3, 89, and 13, respectively, in the three methods with respect to the conventional fractal image coding method with quad tree partitioning. The mammograms obtained from The Mammographic Image Analysis Society database (ground truth available) gave a total detection score of 87.6%, 87.6%, 90.5%, and 87.6%, for the conventional and the proposed three methods, respectively.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Software systems are progressively being deployed in many facets of human life. The implication of the failure of such systems, has an assorted impact on its customers. The fundamental aspect that supports a software system, is focus on quality. Reliability describes the ability of the system to function under specified environment for a specified period of time and is used to objectively measure the quality. Evaluation of reliability of a computing system involves computation of hardware and software reliability. Most of the earlier works were given focus on software reliability with no consideration for hardware parts or vice versa. However, a complete estimation of reliability of a computing system requires these two elements to be considered together, and thus demands a combined approach. The present work focuses on this and presents a model for evaluating the reliability of a computing system. The method involves identifying the failure data for hardware components, software components and building a model based on it, to predict the reliability. To develop such a model, focus is given to the systems based on Open Source Software, since there is an increasing trend towards its use and only a few studies were reported on the modeling and measurement of the reliability of such products. The present work includes a thorough study on the role of Free and Open Source Software, evaluation of reliability growth models, and is trying to present an integrated model for the prediction of reliability of a computational system. The developed model has been compared with existing models and its usefulness of is being discussed.