8 results for Domain Ontology

at Cochin University of Science and Technology


Relevance:

20.00%

Publisher:

Abstract:

Computational Biology is the research area that contributes to the analysis of biological data through the development of algorithms addressing significant research problems. The data from molecular biology include DNA, RNA, protein and gene expression data. Gene expression data provide the expression levels of genes under different conditions. Gene expression is the process of transcribing the DNA sequence of a gene into mRNA sequences, which in turn are later translated into proteins; the number of copies of mRNA produced is called the expression level of a gene. Gene expression data are organized in the form of a matrix: rows represent genes, columns represent experimental conditions (different tissue types or time points), and the entries are real values. Through the analysis of gene expression data it is possible to determine behavioral patterns of genes, such as the similarity of their behavior, the nature of their interactions, their respective contributions to the same pathways, and so on. Genes participating in the same biological process exhibit similar expression patterns. These patterns have immense relevance and application in bioinformatics and clinical research; in the medical domain they aid more accurate diagnosis, prognosis, treatment planning, drug discovery and protein network analysis. To identify such patterns from gene expression data, data mining techniques are essential. Clustering is an important data mining technique for the analysis of gene expression data. To overcome the problems associated with clustering, biclustering was introduced. Biclustering refers to the simultaneous clustering of both rows and columns of a data matrix: clustering is a global model, whereas biclustering is a local one. Discovering local expression patterns is essential for identifying many genetic pathways that are not apparent otherwise. It is therefore necessary to move beyond the clustering paradigm towards approaches capable of discovering local patterns in gene expression data. A bicluster is a submatrix of the gene expression data matrix; its rows and columns need not be contiguous as in the original matrix, and biclusters need not be disjoint. Computing biclusters is costly because all combinations of rows and columns must be considered in order to find all the biclusters. The search space for the biclustering problem is 2^(m+n), where m and n are the numbers of genes and conditions respectively; usually m+n is more than 3000, and the biclustering problem is NP-hard. Biclustering is nevertheless a powerful analytical tool for the biologist. The research reported in this thesis addresses the biclustering problem. Ten algorithms are developed for the identification of coherent biclusters from gene expression data. All of them make use of a measure called the mean squared residue to search for biclusters; the objective is to identify biclusters of maximum size with a mean squared residue lower than a given threshold.
All these algorithms begin the search from tightly coregulated submatrices called seeds, which are generated by the K-Means clustering algorithm. The algorithms developed can be classified as constraint-based, greedy and metaheuristic. Constraint-based algorithms use one or more constraints, namely the MSR threshold and the MSR difference threshold. The greedy approach makes a locally optimal choice at each stage with the objective of finding the global optimum. In the metaheuristic approaches, Particle Swarm Optimization (PSO) and variants of the Greedy Randomized Adaptive Search Procedure (GRASP) are used for the identification of biclusters. These algorithms were implemented on the Yeast and Lymphoma datasets. Biologically relevant and statistically significant biclusters are identified by all these algorithms and validated against the Gene Ontology database. All the algorithms are compared with other biclustering algorithms, and the algorithms developed in this work overcome some of the problems associated with existing ones. With the help of some of the developed algorithms, biclusters with very high row variance, higher than that achieved by any other algorithm using the mean squared residue, are identified from both the Yeast and Lymphoma data sets. Such biclusters, which reflect significant changes in expression level, are highly relevant biologically.
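The mean squared residue these algorithms optimize is the coherence measure introduced by Cheng and Church, and it can be computed directly from a candidate submatrix. A minimal Python sketch (the toy matrix is illustrative; seed generation and the search strategies themselves are beyond this snippet):

```python
import numpy as np

def mean_squared_residue(B):
    """Mean squared residue (Cheng & Church) of a bicluster submatrix B.

    Each residue is b_ij - row_mean_i - col_mean_j + overall_mean;
    the MSR is the mean of the squared residues. A lower MSR means
    more coherent expression across the selected genes and conditions.
    """
    row_means = B.mean(axis=1, keepdims=True)
    col_means = B.mean(axis=0, keepdims=True)
    residues = B - row_means - col_means + B.mean()
    return float((residues ** 2).mean())

# Toy example: a perfectly additive expression pattern has MSR == 0,
# so it would pass any positive MSR threshold.
B = np.array([[1.0, 2.0, 3.0],
              [2.0, 3.0, 4.0],
              [5.0, 6.0, 7.0]])
print(mean_squared_residue(B))  # 0.0
```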

Relevance:

20.00%

Publisher:

Abstract:

The detection of buried objects using time-domain free-space measurements was carried out in the near field. The location of a hidden object was determined from an analysis of the reflected signal, and the method can be extended to detect any number of objects. Measurements were carried out in the X- and Ku-bands using ordinary rectangular pyramidal horn antennas with 15 dB gain; the same antenna was used as the transmitter and receiver. The experimental results were compared with simulated results obtained by applying the two-dimensional finite-difference time-domain (FDTD) method, and the two agree well with each other. The dispersive nature of the dielectric medium was taken into account in the simulation.
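To illustrate the simulation side, a minimal one-dimensional FDTD update loop in free space is sketched below; the actual study uses a two-dimensional grid with a dispersive dielectric model, and the grid size, source, and probe position here are illustrative assumptions only.

```python
import numpy as np

# Minimal 1-D FDTD (Yee scheme) in free space, normalized units:
# E and H are staggered by half a cell in space and half a step in time.
nz, nt = 200, 400            # grid points and time steps (illustrative)
ez = np.zeros(nz)            # electric field samples
hy = np.zeros(nz - 1)        # magnetic field samples, staggered

for t in range(nt):
    hy += ez[1:] - ez[:-1]                 # H update from the curl of E
    ez[1:-1] += hy[1:] - hy[:-1]           # E update from the curl of H
    ez[nz // 4] += np.exp(-((t - 30.0) / 10.0) ** 2)  # Gaussian pulse source

# A buried object appears as a permittivity contrast in the grid; the
# delay of the reflected pulse at the source encodes the object's depth.
```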

Relevance:

20.00%

Publisher:

Abstract:

Data mining is one of the most active research areas today, with a wide variety of applications in everyday life. It is concerned with finding interesting hidden patterns in large historical databases. For example, from a sales database one can discover a pattern such as "people who buy magazines tend to buy newspapers as well"; from the sales point of view, the advantage is that these items can be placed together in the shop to increase sales. In this research work, data mining is applied to the domain of placement chance prediction, since taking a wise career decision is crucial for any student. In India, technical manpower analysis is carried out by the National Technical Manpower Information System (NTMIS), established in 1983-84 by India's Ministry of Education & Culture. The NTMIS comprises a lead centre in the IAMR, New Delhi, and 21 nodal centres located in different parts of the country; the Kerala State Nodal Centre is located at Cochin University of Science and Technology. The nodal centre collects placement information by sending postal questionnaires to graduated students on a regular basis. From this raw data, a historical database was prepared. Each record in this database includes entrance rank range, reservation, sector, sex, and engineering branch, and for each such combination of attributes the corresponding placement chance is computed and stored. From this data, various popular data mining models are built and tested; these models can be used to predict the most suitable branch for a new student matching one of the above combinations of criteria. A detailed performance comparison of the various data mining models is also carried out. This research work proposes a combination of data mining models, namely a hybrid stacking ensemble, for better predictions. Strategies to predict the overall absorption rate for various branches, as well as the time it takes for all the students of a particular branch to get placed, are also proposed. Finally, this research work puts forward a new data mining algorithm, C4.5*stat, for numeric data sets, which is shown to have competitive accuracy on the standard UCI benchmark data sets. It also proposes an optimization strategy, parameter tuning, to improve the standard C4.5 algorithm. In summary, this research work passes through all four dimensions of a typical data mining research project: application to a domain, development of classifier models, optimization, and ensemble methods.
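The hybrid stacking ensemble can be sketched with scikit-learn; the base learners and meta-learner chosen here are illustrative assumptions (the thesis's own C4.5*stat classifier is not part of any standard library, so an ordinary decision tree stands in for it):

```python
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Stacking: heterogeneous base models whose out-of-fold predictions
# are combined by a meta-learner.
base_learners = [
    ("tree", DecisionTreeClassifier()),   # stand-in for a C4.5-style tree
    ("nb", GaussianNB()),
    ("rf", RandomForestClassifier()),
]
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression())

# X would encode entrance rank range, reservation, sector, sex and
# branch; y the observed placement outcome (shapes are illustrative).
# stack.fit(X_train, y_train)
# placement_chance = stack.predict_proba(X_new)
```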

Relevance:

20.00%

Publisher:

Abstract:

Regional climate models are becoming increasingly popular for providing high-resolution climate change information for impact assessments that inform adaptation options. Many countries and provinces requiring these assessments are as small as 200,000 km², significantly smaller than the ideal domain needed for successful application of one-way nested regional climate models. Therefore, assessments on sub-regional scales (e.g., river basins) are generally carried out using climate change simulations performed for relatively larger regions. Here we show that the seasonal mean hydrological cycle and the day-to-day precipitation variations of a sub-region within the model domain are sensitive to the domain size, even though the large-scale circulation features over the region are largely insensitive. On seasonal timescales, the relatively smaller domains intensify the hydrological cycle by increasing the net transport of moisture into the study region, thereby enhancing the precipitation and local recycling of moisture. On daily timescales, the simulations run over smaller domains produce a higher number of moderate precipitation days in the sub-region relative to the corresponding larger-domain simulations. An assessment of daily variations of water vapor and vertical velocity within the sub-region indicates that the smaller domains may favor more frequent moderate uplifting and subsequent precipitation in the region. The results remain largely insensitive to the horizontal resolution of the model, indicating the robustness of the domain-size influence on the regional model solutions. These domain-size-dependent precipitation characteristics have the potential to add one more level of uncertainty to downscaled projections.

Relevance:

20.00%

Publisher:

Abstract:

Anticipating the increase in video information in the future, archiving of news is an important activity in the visual media industry. As the volume of archives increases, it becomes difficult for journalists to find the appropriate content using current search tools. This paper provides the details of a study we conducted on the news extraction systems used in different news channels in Kerala. Semantic web technologies can be applied effectively here, since news archiving shares many of the characteristics and problems of the WWW. Since the visual news archives of different media resources follow different metadata standards, interoperability between the resources is also an issue. The World Wide Web Consortium has proposed a draft ontology framework for media resources that addresses these interoperability issues. In this paper, the W3C-proposed framework and its drawbacks are also discussed.
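As a rough illustration of how such an ontology framework enables interoperable annotations, the sketch below uses rdflib with the namespace of the W3C Ontology for Media Resources; the clip identifier is hypothetical, and the property names are assumptions based on that ontology's published vocabulary.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

# Namespace of the W3C Ontology for Media Resources; the property
# names used below are assumptions based on its published vocabulary.
MA = Namespace("http://www.w3.org/ns/ma-ont#")

g = Graph()
clip = URIRef("http://example.org/archive/news-clip-001")  # hypothetical ID
g.add((clip, RDF.type, MA.MediaResource))
g.add((clip, MA.title, Literal("Evening news bulletin")))
g.add((clip, MA.locator, URIRef("http://example.org/video/clip-001.mp4")))

# Any archive publishing metadata against the same vocabulary can be
# queried uniformly, which is the interoperability goal discussed above.
print(g.serialize(format="turtle"))
```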

Relevance:

20.00%

Publisher:

Abstract:

In this paper, we have evolved a generic software architecture for a domain-specific distributed embedded system. The system under consideration belongs to the Command, Control and Communication systems domain. Systems in this domain have very long operational lifetimes, and their quality attributes are as important as their functional requirements. The main guiding principle followed in this paper for evolving the software architecture has been the functional independence of the modules. The quality attributes considered most important for the system are maintainability and modifiability. Architectural styles best suited to the functionally independent modules are proposed with a focus on these quality attributes. The software architecture for the system is envisioned as a collection of the architectural styles of the functionally independent modules identified.

Relevance:

20.00%

Publisher:

Abstract:

The standard separable two-dimensional wavelet transform has achieved great success in image denoising applications due to its sparse representation of images. However, it fails to capture efficiently the anisotropic geometric structures in images, such as edges and contours, because these intersect too many wavelet basis functions and lead to a non-sparse representation. In this paper a novel denoising scheme based on a multidirectional and anisotropic wavelet transform, the directionlet transform, is presented. Image denoising in the wavelet domain is extended to the directionlet domain so that image features concentrate in fewer coefficients and more effective thresholding becomes possible. The image is first segmented and the dominant direction of each segment is identified to build a directional map. Then, according to the directional map, the directionlet transform is taken along the dominant direction of each selected segment. The decomposed images with directional energy are used for scale-dependent, subband-adaptive optimal threshold computation based on the SURE risk. This threshold is applied to all sub-bands except the LLL subband. The threshold-corrected sub-bands, together with the unprocessed first sub-band (LLL), are given as input to the inverse directionlet algorithm to obtain the denoised image. Experimental results show that the proposed method outperforms standard wavelet-based denoising methods in terms of numerical and visual quality.
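Directionlet transforms are not available in standard libraries, but the thresholding pipeline being extended can be sketched with the ordinary separable wavelet transform using PyWavelets; the wavelet, decomposition level, and universal-threshold rule below are illustrative assumptions (the paper replaces the transform with a direction-adaptive one and computes a subband-adaptive SURE threshold instead).

```python
import numpy as np
import pywt

def wavelet_denoise(img, wavelet="db4", level=2):
    """Baseline separable wavelet denoising: soft-threshold the detail
    subbands and leave the approximation (low-pass) subband untouched."""
    coeffs = pywt.wavedec2(img, wavelet, level=level)
    # Noise estimate from the finest diagonal subband, then the
    # universal threshold (the paper uses a SURE-based one instead).
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    thresh = sigma * np.sqrt(2 * np.log(img.size))
    denoised = [coeffs[0]]  # approximation subband left unprocessed
    for detail in coeffs[1:]:
        denoised.append(tuple(pywt.threshold(d, thresh, mode="soft")
                              for d in detail))
    return pywt.waverec2(denoised, wavelet)
```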

Relevance:

20.00%

Publisher:

Abstract:

When simulation modeling is used for performance improvement studies of complex systems such as transport terminals, domain-specific conceptual modeling constructs can be used by modelers to create structured models. A two-stage procedure, comprising identification of the problem characteristics/cluster ('knowledge acquisition') and identification of standard models for the problem cluster ('model abstraction'), was found to be effective in creating structured models when applied to certain logistics terminal systems. In this paper we discuss methods and examples related to the knowledge acquisition and model abstraction stages for the development of three different categories of terminal system models.