5 resultados para Soft computing
em Cochin University of Science
Resumo:
Microarray data analysis is one of data mining tool which is used to extract meaningful information hidden in biological data. One of the major focuses on microarray data analysis is the reconstruction of gene regulatory network that may be used to provide a broader understanding on the functioning of complex cellular systems. Since cancer is a genetic disease arising from the abnormal gene function, the identification of cancerous genes and the regulatory pathways they control will provide a better platform for understanding the tumor formation and development. The major focus of this thesis is to understand the regulation of genes responsible for the development of cancer, particularly colorectal cancer by analyzing the microarray expression data. In this thesis, four computational algorithms namely fuzzy logic algorithm, modified genetic algorithm, dynamic neural fuzzy network and Takagi Sugeno Kang-type recurrent neural fuzzy network are used to extract cancer specific gene regulatory network from plasma RNA dataset of colorectal cancer patients. Plasma RNA is highly attractive for cancer analysis since it requires a collection of small amount of blood and it can be obtained at any time in repetitive fashion allowing the analysis of disease progression and treatment response.
Resumo:
Post-transcriptional gene silencing by RNA interference is mediated by small interfering RNA called siRNA. This gene silencing mechanism can be exploited therapeutically to a wide variety of disease-associated targets, especially in AIDS, neurodegenerative diseases, cholesterol and cancer on mice with the hope of extending these approaches to treat humans. Over the recent past, a significant amount of work has been undertaken to understand the gene silencing mediated by exogenous siRNA. The design of efficient exogenous siRNA sequences is challenging because of many issues related to siRNA. While designing efficient siRNA, target mRNAs must be selected such that their corresponding siRNAs are likely to be efficient against that target and unlikely to accidentally silence other transcripts due to sequence similarity. So before doing gene silencing by siRNAs, it is essential to analyze their off-target effects in addition to their inhibition efficiency against a particular target. Hence designing exogenous siRNA with good knock-down efficiency and target specificity is an area of concern to be addressed. Some methods have been developed already by considering both inhibition efficiency and off-target possibility of siRNA against agene. Out of these methods, only a few have achieved good inhibition efficiency, specificity and sensitivity. The main focus of this thesis is to develop computational methods to optimize the efficiency of siRNA in terms of “inhibition capacity and off-target possibility” against target mRNAs with improved efficacy, which may be useful in the area of gene silencing and drug design for tumor development. This study aims to investigate the currently available siRNA prediction approaches and to devise a better computational approach to tackle the problem of siRNA efficacy by inhibition capacity and off-target possibility. The strength and limitations of the available approaches are investigated and taken into consideration for making improved solution. Thus the approaches proposed in this study extend some of the good scoring previous state of the art techniques by incorporating machine learning and statistical approaches and thermodynamic features like whole stacking energy to improve the prediction accuracy, inhibition efficiency, sensitivity and specificity. Here, we propose one Support Vector Machine (SVM) model, and two Artificial Neural Network (ANN) models for siRNA efficiency prediction. In SVM model, the classification property is used to classify whether the siRNA is efficient or inefficient in silencing a target gene. The first ANNmodel, named siRNA Designer, is used for optimizing the inhibition efficiency of siRNA against target genes. The second ANN model, named Optimized siRNA Designer, OpsiD, produces efficient siRNAs with high inhibition efficiency to degrade target genes with improved sensitivity-specificity, and identifies the off-target knockdown possibility of siRNA against non-target genes. The models are trained and tested against a large data set of siRNA sequences. The validations are conducted using Pearson Correlation Coefficient, Mathews Correlation Coefficient, Receiver Operating Characteristic analysis, Accuracy of prediction, Sensitivity and Specificity. It is found that the approach, OpsiD, is capable of predicting the inhibition capacity of siRNA against a target mRNA with improved results over the state of the art techniques. Also we are able to understand the influence of whole stacking energy on efficiency of siRNA. The model is further improved by including the ability to identify the “off-target possibility” of predicted siRNA on non-target genes. Thus the proposed model, OpsiD, can predict optimized siRNA by considering both “inhibition efficiency on target genes and off-target possibility on non-target genes”, with improved inhibition efficiency, specificity and sensitivity. Since we have taken efforts to optimize the siRNA efficacy in terms of “inhibition efficiency and offtarget possibility”, we hope that the risk of “off-target effect” while doing gene silencing in various bioinformatics fields can be overcome to a great extent. These findings may provide new insights into cancer diagnosis, prognosis and therapy by gene silencing. The approach may be found useful for designing exogenous siRNA for therapeutic applications and gene silencing techniques in different areas of bioinformatics.
Resumo:
One major component of power system operation is generation scheduling. The objective of the work is to develop efficient control strategies to the power scheduling problems through Reinforcement Learning approaches. The three important active power scheduling problems are Unit Commitment, Economic Dispatch and Automatic Generation Control. Numerical solution methods proposed for solution of power scheduling are insufficient in handling large and complex systems. Soft Computing methods like Simulated Annealing, Evolutionary Programming etc., are efficient in handling complex cost functions, but find limitation in handling stochastic data existing in a practical system. Also the learning steps are to be repeated for each load demand which increases the computation time.Reinforcement Learning (RL) is a method of learning through interactions with environment. The main advantage of this approach is it does not require a precise mathematical formulation. It can learn either by interacting with the environment or interacting with a simulation model. Several optimization and control problems have been solved through Reinforcement Learning approach. The application of Reinforcement Learning in the field of Power system has been a few. The objective is to introduce and extend Reinforcement Learning approaches for the active power scheduling problems in an implementable manner. The main objectives can be enumerated as:(i) Evolve Reinforcement Learning based solutions to the Unit Commitment Problem.(ii) Find suitable solution strategies through Reinforcement Learning approach for Economic Dispatch. (iii) Extend the Reinforcement Learning solution to Automatic Generation Control with a different perspective. (iv) Check the suitability of the scheduling solutions to one of the existing power systems.First part of the thesis is concerned with the Reinforcement Learning approach to Unit Commitment problem. Unit Commitment Problem is formulated as a multi stage decision process. Q learning solution is developed to obtain the optimwn commitment schedule. Method of state aggregation is used to formulate an efficient solution considering the minimwn up time I down time constraints. The performance of the algorithms are evaluated for different systems and compared with other stochastic methods like Genetic Algorithm.Second stage of the work is concerned with solving Economic Dispatch problem. A simple and straight forward decision making strategy is first proposed in the Learning Automata algorithm. Then to solve the scheduling task of systems with large number of generating units, the problem is formulated as a multi stage decision making task. The solution obtained is extended in order to incorporate the transmission losses in the system. To make the Reinforcement Learning solution more efficient and to handle continuous state space, a fimction approximation strategy is proposed. The performance of the developed algorithms are tested for several standard test cases. Proposed method is compared with other recent methods like Partition Approach Algorithm, Simulated Annealing etc.As the final step of implementing the active power control loops in power system, Automatic Generation Control is also taken into consideration.Reinforcement Learning has already been applied to solve Automatic Generation Control loop. The RL solution is extended to take up the approach of common frequency for all the interconnected areas, more similar to practical systems. Performance of the RL controller is also compared with that of the conventional integral controller.In order to prove the suitability of the proposed methods to practical systems, second plant ofNeyveli Thennal Power Station (NTPS IT) is taken for case study. The perfonnance of the Reinforcement Learning solution is found to be better than the other existing methods, which provide the promising step towards RL based control schemes for practical power industry.Reinforcement Learning is applied to solve the scheduling problems in the power industry and found to give satisfactory perfonnance. Proposed solution provides a scope for getting more profit as the economic schedule is obtained instantaneously. Since Reinforcement Learning method can take the stochastic cost data obtained time to time from a plant, it gives an implementable method. As a further step, with suitable methods to interface with on line data, economic scheduling can be achieved instantaneously in a generation control center. Also power scheduling of systems with different sources such as hydro, thermal etc. can be looked into and Reinforcement Learning solutions can be achieved.
Resumo:
Learning Disability (LD) is a general term that describes specific kinds of learning problems. It is a neurological condition that affects a child's brain and impairs his ability to carry out one or many specific tasks. The learning disabled children are neither slow nor mentally retarded. This disorder can make it problematic for a child to learn as quickly or in the same way as some child who isn't affected by a learning disability. An affected child can have normal or above average intelligence. They may have difficulty paying attention, with reading or letter recognition, or with mathematics. It does not mean that children who have learning disabilities are less intelligent. In fact, many children who have learning disabilities are more intelligent than an average child. Learning disabilities vary from child to child. One child with LD may not have the same kind of learning problems as another child with LD. There is no cure for learning disabilities and they are life-long. However, children with LD can be high achievers and can be taught ways to get around the learning disability. In this research work, data mining using machine learning techniques are used to analyze the symptoms of LD, establish interrelationships between them and evaluate the relative importance of these symptoms. To increase the diagnostic accuracy of learning disability prediction, a knowledge based tool based on statistical machine learning or data mining techniques, with high accuracy,according to the knowledge obtained from the clinical information, is proposed. The basic idea of the developed knowledge based tool is to increase the accuracy of the learning disability assessment and reduce the time used for the same. Different statistical machine learning techniques in data mining are used in the study. Identifying the important parameters of LD prediction using the data mining techniques, identifying the hidden relationship between the symptoms of LD and estimating the relative significance of each symptoms of LD are also the parts of the objectives of this research work. The developed tool has many advantages compared to the traditional methods of using check lists in determination of learning disabilities. For improving the performance of various classifiers, we developed some preprocessing methods for the LD prediction system. A new system based on fuzzy and rough set models are also developed for LD prediction. Here also the importance of pre-processing is studied. A Graphical User Interface (GUI) is designed for developing an integrated knowledge based tool for prediction of LD as well as its degree. The designed tool stores the details of the children in the student database and retrieves their LD report as and when required. The present study undoubtedly proves the effectiveness of the tool developed based on various machine learning techniques. It also identifies the important parameters of LD and accurately predicts the learning disability in school age children. This thesis makes several major contributions in technical, general and social areas. The results are found very beneficial to the parents, teachers and the institutions. They are able to diagnose the child’s problem at an early stage and can go for the proper treatments/counseling at the correct time so as to avoid the academic and social losses.
Resumo:
Learning Disability (LD) is a neurological condition that affects a child’s brain and impairs his ability to carry out one or many specific tasks. LD affects about 15 % of children enrolled in schools. The prediction of LD is a vital and intricate job. The aim of this paper is to design an effective and powerful tool, using the two intelligent methods viz., Artificial Neural Network and Adaptive Neuro-Fuzzy Inference System, for measuring the percentage of LD that affected in school-age children. In this study, we are proposing some soft computing methods in data preprocessing for improving the accuracy of the tool as well as the classifier. The data preprocessing is performed through Principal Component Analysis for attribute reduction and closest fit algorithm is used for imputing missing values. The main idea in developing the LD prediction tool is not only to predict the LD present in children but also to measure its percentage along with its class like low or minor or major. The system is implemented in Mathworks Software MatLab 7.10. The results obtained from this study have illustrated that the designed prediction system or tool is capable of measuring the LD effectively