879 resultados para Edge based analysis


Relevância:

40.00% 40.00%

Publicador:

Resumo:

Computational Biology is the research are that contributes to the analysis of biological data through the development of algorithms which will address significant research problems.The data from molecular biology includes DNA,RNA ,Protein and Gene expression data.Gene Expression Data provides the expression level of genes under different conditions.Gene expression is the process of transcribing the DNA sequence of a gene into mRNA sequences which in turn are later translated into proteins.The number of copies of mRNA produced is called the expression level of a gene.Gene expression data is organized in the form of a matrix. Rows in the matrix represent genes and columns in the matrix represent experimental conditions.Experimental conditions can be different tissue types or time points.Entries in the gene expression matrix are real values.Through the analysis of gene expression data it is possible to determine the behavioral patterns of genes such as similarity of their behavior,nature of their interaction,their respective contribution to the same pathways and so on. Similar expression patterns are exhibited by the genes participating in the same biological process.These patterns have immense relevance and application in bioinformatics and clinical research.Theses patterns are used in the medical domain for aid in more accurate diagnosis,prognosis,treatment planning.drug discovery and protein network analysis.To identify various patterns from gene expression data,data mining techniques are essential.Clustering is an important data mining technique for the analysis of gene expression data.To overcome the problems associated with clustering,biclustering is introduced.Biclustering refers to simultaneous clustering of both rows and columns of a data matrix. Clustering is a global whereas biclustering is a local model.Discovering local expression patterns is essential for identfying many genetic pathways that are not apparent otherwise.It is therefore necessary to move beyond the clustering paradigm towards developing approaches which are capable of discovering local patterns in gene expression data.A biclusters is a submatrix of the gene expression data matrix.The rows and columns in the submatrix need not be contiguous as in the gene expression data matrix.Biclusters are not disjoint.Computation of biclusters is costly because one will have to consider all the combinations of columans and rows in order to find out all the biclusters.The search space for the biclustering problem is 2 m+n where m and n are the number of genes and conditions respectively.Usually m+n is more than 3000.The biclustering problem is NP-hard.Biclustering is a powerful analytical tool for the biologist.The research reported in this thesis addresses the problem of biclustering.Ten algorithms are developed for the identification of coherent biclusters from gene expression data.All these algorithms are making use of a measure called mean squared residue to search for biclusters.The objective here is to identify the biclusters of maximum size with the mean squared residue lower than a given threshold. All these algorithms begin the search from tightly coregulated submatrices called the seeds.These seeds are generated by K-Means clustering algorithm.The algorithms developed can be classified as constraint based,greedy and metaheuristic.Constarint based algorithms uses one or more of the various constaints namely the MSR threshold and the MSR difference threshold.The greedy approach makes a locally optimal choice at each stage with the objective of finding the global optimum.In metaheuristic approaches particle Swarm Optimization(PSO) and variants of Greedy Randomized Adaptive Search Procedure(GRASP) are used for the identification of biclusters.These algorithms are implemented on the Yeast and Lymphoma datasets.Biologically relevant and statistically significant biclusters are identified by all these algorithms which are validated by Gene Ontology database.All these algorithms are compared with some other biclustering algorithms.Algorithms developed in this work overcome some of the problems associated with the already existing algorithms.With the help of some of the algorithms which are developed in this work biclusters with very high row variance,which is higher than the row variance of any other algorithm using mean squared residue, are identified from both Yeast and Lymphoma data sets.Such biclusters which make significant change in the expression level are highly relevant biologically.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this thesis, different techniques for image analysis of high density microarrays have been investigated. Most of the existing image analysis techniques require prior knowledge of image specific parameters and direct user intervention for microarray image quantification. The objective of this research work was to develop of a fully automated image analysis method capable of accurately quantifying the intensity information from high density microarrays images. The method should be robust against noise and contaminations that commonly occur in different stages of microarray development.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this paper, we propose a multispectral analysis system using wavelet based Principal Component Analysis (PCA), to improve the brain tissue classification from MRI images. Global transforms like PCA often neglects significant small abnormality details, while dealing with a massive amount of multispectral data. In order to resolve this issue, input dataset is expanded by detail coefficients from multisignal wavelet analysis. Then, PCA is applied on the new dataset to perform feature analysis. Finally, an unsupervised classification with Fuzzy C-Means clustering algorithm is used to measure the improvement in reproducibility and accuracy of the results. A detailed comparative analysis of classified tissues with those from conventional PCA is also carried out. Proposed method yielded good improvement in classification of small abnormalities with high sensitivity/accuracy values, 98.9/98.3, for clinical analysis. Experimental results from synthetic and clinical data recommend the new method as a promising approach in brain tissue analysis.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Multispectral analysis is a promising approach in tissue classification and abnormality detection from Magnetic Resonance (MR) images. But instability in accuracy and reproducibility of the classification results from conventional techniques keeps it far from clinical applications. Recent studies proposed Independent Component Analysis (ICA) as an effective method for source signals separation from multispectral MR data. However, it often fails to extract the local features like small abnormalities, especially from dependent real data. A multisignal wavelet analysis prior to ICA is proposed in this work to resolve these issues. Best de-correlated detail coefficients are combined with input images to give better classification results. Performance improvement of the proposed method over conventional ICA is effectively demonstrated by segmentation and classification using k-means clustering. Experimental results from synthetic and real data strongly confirm the positive effect of the new method with an improved Tanimoto index/Sensitivity values, 0.884/93.605, for reproduced small white matter lesions

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This study is an attempt to situate the quality of life and standard of living of local communities in ecotourism destinations inter alia their perception on forest conservation and the satisfaction level of the local community. 650 EDC/VSS members from Kerala demarcated into three zones constitute the data source. Four variables have been considered for evaluating the quality of life of the stakeholders of ecotourism sites, which is then funneled to the income-education spectrum for hypothesizing into the SLI framework. Zone-wise analysis of the community members working in tourism sector shows that the community members have benefited totally from tourism development in the region as they have got both employments as well as secured livelihood options. Most of the quality of life-indicators of the community in the eco-tourist centres show a promising position. The community perception does not show any negative impact on environment as well as on their local culture.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Formal Concept Analysis is an unsupervised learning technique for conceptual clustering. We introduce the notion of iceberg concept lattices and show their use in Knowledge Discovery in Databases (KDD). Iceberg lattices are designed for analyzing very large databases. In particular they serve as a condensed representation of frequent patterns as known from association rule mining. In order to show the interplay between Formal Concept Analysis and association rule mining, we discuss the algorithm TITANIC. We show that iceberg concept lattices are a starting point for computing condensed sets of association rules without loss of information, and are a visualization method for the resulting rules.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Association rules are a popular knowledge discovery technique for warehouse basket analysis. They indicate which items of the warehouse are frequently bought together. The problem of association rule mining has first been stated in 1993. Five years later, several research groups discovered that this problem has a strong connection to Formal Concept Analysis (FCA). In this survey, we will first introduce some basic ideas of this connection along a specific algorithm, TITANIC, and show how FCA helps in reducing the number of resulting rules without loss of information, before giving a general overview over the history and state of the art of applying FCA for association rule mining.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A key argument for modeling knowledge in ontologies is the easy re-use and re-engineering of the knowledge. However, beside consistency checking, current ontology engineering tools provide only basic functionalities for analyzing ontologies. Since ontologies can be considered as (labeled, directed) graphs, graph analysis techniques are a suitable answer for this need. Graph analysis has been performed by sociologists for over 60 years, and resulted in the vivid research area of Social Network Analysis (SNA). While social network structures in general currently receive high attention in the Semantic Web community, there are only very few SNA applications up to now, and virtually none for analyzing the structure of ontologies. We illustrate in this paper the benefits of applying SNA to ontologies and the Semantic Web, and discuss which research topics arise on the edge between the two areas. In particular, we discuss how different notions of centrality describe the core content and structure of an ontology. From the rather simple notion of degree centrality over betweenness centrality to the more complex eigenvector centrality based on Hermitian matrices, we illustrate the insights these measures provide on two ontologies, which are different in purpose, scope, and size.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

I present a novel design methodology for the synthesis of automatic controllers, together with a computational environment---the Control Engineer's Workbench---integrating a suite of programs that automatically analyze and design controllers for high-performance, global control of nonlinear systems. This work demonstrates that difficult control synthesis tasks can be automated, using programs that actively exploit and efficiently represent knowledge of nonlinear dynamics and phase space and effectively use the representation to guide and perform the control design. The Control Engineer's Workbench combines powerful numerical and symbolic computations with artificial intelligence reasoning techniques. As a demonstration, the Workbench automatically designed a high-quality maglev controller that outperforms a previous linear design by a factor of 20.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper describes a trainable system capable of tracking faces and facialsfeatures like eyes and nostrils and estimating basic mouth features such as sdegrees of openness and smile in real time. In developing this system, we have addressed the twin issues of image representation and algorithms for learning. We have used the invariance properties of image representations based on Haar wavelets to robustly capture various facial features. Similarly, unlike previous approaches this system is entirely trained using examples and does not rely on a priori (hand-crafted) models of facial features based on optical flow or facial musculature. The system works in several stages that begin with face detection, followed by localization of facial features and estimation of mouth parameters. Each of these stages is formulated as a problem in supervised learning from examples. We apply the new and robust technique of support vector machines (SVM) for classification in the stage of skin segmentation, face detection and eye detection. Estimation of mouth parameters is modeled as a regression from a sparse subset of coefficients (basis functions) of an overcomplete dictionary of Haar wavelets.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Image analysis and graphics synthesis can be achieved with learning techniques using directly image examples without physically-based, 3D models. In our technique: -- the mapping from novel images to a vector of "pose" and "expression" parameters can be learned from a small set of example images using a function approximation technique that we call an analysis network; -- the inverse mapping from input "pose" and "expression" parameters to output images can be synthesized from a small set of example images and used to produce new images using a similar synthesis network. The techniques described here have several applications in computer graphics, special effects, interactive multimedia and very low bandwidth teleconferencing.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In the eighties, John Aitchison (1986) developed a new methodological approach for the statistical analysis of compositional data. This new methodology was implemented in Basic routines grouped under the name CODA and later NEWCODA inMatlab (Aitchison, 1997). After that, several other authors have published extensions to this methodology: Marín-Fernández and others (2000), Barceló-Vidal and others (2001), Pawlowsky-Glahn and Egozcue (2001, 2002) and Egozcue and others (2003). (...)

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Usually, psychometricians apply classical factorial analysis to evaluate construct validity of order rank scales. Nevertheless, these scales have particular characteristics that must be taken into account: total scores and rank are highly relevant

Relevância:

40.00% 40.00%

Publicador:

Resumo:

One of the major problems in machine vision is the segmentation of images of natural scenes. This paper presents a new proposal for the image segmentation problem which has been based on the integration of edge and region information. The main contours of the scene are detected and used to guide the posterior region growing process. The algorithm places a number of seeds at both sides of a contour allowing stating a set of concurrent growing processes. A previous analysis of the seeds permits to adjust the homogeneity criterion to the regions's characteristics. A new homogeneity criterion based on clustering analysis and convex hull construction is proposed

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper presents a study of connection availability in GMPLS over optical transport networks (OTN) taking into account different network topologies. Two basic path protection schemes are considered and compared with the no protection case. The selected topologies are heterogeneous in geographic coverage, network diameter, link lengths, and average node degree. Connection availability is also computed considering the reliability data of physical components and a well-known network availability model. Results show several correspondences between suitable path protection algorithms and several network topology characteristics