992 results for information bottleneck


Relevance:

100.00% 100.00%

Publisher:

Abstract:

In image processing, segmentation algorithms constitute one of the main focuses of research. In this paper, new image segmentation algorithms based on a hard version of the information bottleneck method are presented. The objective of this method is to extract a compact representation of a variable, considered the input, with minimal loss of mutual information with respect to another variable, considered the output. First, we introduce a split-and-merge algorithm based on the definition of an information channel between a set of regions (input) of the image and the intensity histogram bins (output). From this channel, the maximization of the mutual information gain is used to optimize the image partitioning. Then, the regions obtained in the previous phase are merged by minimizing the loss of mutual information. From the inversion of the above channel, we also present a new histogram clustering algorithm based on the minimization of the mutual information loss, where the input variable now represents the histogram bins and the output is given by the set of regions obtained from the above split-and-merge algorithm. Finally, we introduce two new clustering algorithms which show how the information bottleneck method can be applied to the registration channel obtained when two multimodal images are correctly aligned. Different experiments on 2-D and 3-D images show the behavior of the proposed algorithms.
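
The merging criterion described in this abstract (merge the pair of regions whose union loses the least mutual information) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy joint distribution, and the exhaustive pairwise search are all assumptions made for illustration.

```python
import math

def mutual_information(joint):
    """I(X;Y) in bits for a joint distribution given as rows of p(x, y)."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (px[i] * py[j]))
    return mi

def merge_rows(joint, a, b):
    """Return a new joint distribution with rows a and b merged into one cluster."""
    merged = [pa + pb for pa, pb in zip(joint[a], joint[b])]
    rest = [row for k, row in enumerate(joint) if k not in (a, b)]
    return rest + [merged]

def best_merge(joint):
    """Find the pair of rows whose merge loses the least mutual information."""
    base = mutual_information(joint)
    best = None
    for a in range(len(joint)):
        for b in range(a + 1, len(joint)):
            loss = base - mutual_information(merge_rows(joint, a, b))
            if best is None or loss < best[0]:
                best = (loss, a, b)
    return best

# Hypothetical toy channel: 3 regions (rows) x 2 histogram bins (columns).
# Entries are joint probabilities p(region, bin) and sum to 1.
joint = [
    [0.30, 0.05],
    [0.28, 0.07],
    [0.05, 0.25],
]
loss, a, b = best_merge(joint)
print(f"merge regions {a} and {b}, losing {loss:.4f} bits")
```

Regions 0 and 1 have nearly identical bin distributions, so merging them costs very little mutual information. Repeating the step until a stopping criterion is met gives a greedy hard clustering of the input variable.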

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In population studies, most current methods focus on identifying one outcome-related SNP at a time by testing for differences in genotype frequencies between disease and healthy groups or among different population groups. However, testing a great number of SNPs simultaneously raises a multiple-testing problem and produces false-positive results. Although this problem can be dealt with effectively through several approaches such as Bonferroni correction, permutation testing and false discovery rates, patterns of joint effects of several genes, each with a weak effect, might not be detectable. With the availability of high-throughput genotyping technology, searching for multiple scattered SNPs over the whole genome and modeling their joint effect on the target variable has become possible. Exhaustive search of all SNP subsets is computationally infeasible for millions of SNPs in a genome-wide study. Several effective feature selection methods combined with classification functions have been proposed to search for an optimal SNP subset in large data sets where the number of feature SNPs far exceeds the number of observations.
In this study, we take two steps to achieve the goal. First, we selected 1000 SNPs through an effective filter method, and then we performed feature selection wrapped around a classifier to identify an optimal SNP subset for predicting disease. We also developed a novel classification method, the sequential information bottleneck method, wrapped inside different search algorithms to identify an optimal subset of SNPs for classifying the outcome variable. This new method was compared with classical linear discriminant analysis in terms of classification performance. Finally, we performed chi-square tests to look at the relationship between each SNP and disease from another point of view.
In general, our results show that filtering features using the harmonic mean of sensitivity and specificity (HMSS) through linear discriminant analysis (LDA) is better than using LDA training accuracy or mutual information in our study. Our results also demonstrate that exhaustive search of small subsets of one, two or three SNPs based on the best 100 composite 2-SNPs can find an optimal subset, and that further inclusion of more SNPs through a heuristic algorithm does not always increase the performance of SNP subsets. Although sequential forward floating selection can be applied to prevent the nesting effect of forward selection, it does not always outperform the latter because of overfitting from observing more complex subset states.
Our results also indicate that HMSS, as a criterion for evaluating the classification ability of a function, can be used on imbalanced data without modifying the original dataset, unlike classification accuracy. Our four studies suggest that Sequential Information Bottleneck (sIB), a new unsupervised technique, can be adopted to predict the outcome, and that its ability to detect the target status is superior to that of traditional LDA in the study.
From our results we can see that the best test probability-HMSS for predicting CVD, stroke, CAD and psoriasis through sIB is 0.59406, 0.641815, 0.645315 and 0.678658, respectively. In terms of group prediction accuracy, the highest test accuracy of sIB for diagnosing a normal status among controls can reach 0.708999, 0.863216, 0.639918 and 0.850275, respectively, in the four studies if the test accuracy among cases is required to be not less than 0.4. On the other hand, the highest test accuracy of sIB for diagnosing a disease among cases can reach 0.748644, 0.789916, 0.705701 and 0.749436, respectively, in the four studies if the test accuracy among controls is required to be at least 0.4.
A further genome-wide association study through chi-square test shows that no significant SNPs are detected at the cut-off level 9.09451E-08 in the Framingham heart study of CVD. Study results in WTCCC can detect only two significant SNPs associated with CAD. In the genome-wide study of psoriasis, most of the top 20 SNP markers with impressive classification accuracy are also significantly associated with the disease through chi-square test at the cut-off value 1.11E-07.
Although our classification methods can achieve high accuracy in the study, complete descriptions of those classification results (95% confidence intervals or statistical tests of differences) require more cost-effective methods or a more efficient computing system, neither of which is currently available in our genome-wide study. We should also note that the purpose of this study is to identify subsets of SNPs with high prediction ability, and SNPs with good discriminant power are not necessarily causal markers for the disease.
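
The abstract above uses the harmonic mean of sensitivity and specificity (HMSS) as a selection criterion that tolerates imbalanced data. The sketch below only illustrates the general definition of a harmonic mean applied to those two rates. The function names and the toy counts are hypothetical, and the thesis may compute or threshold the quantity differently.

```python
def hmss(tp, fn, tn, fp):
    """Harmonic mean of sensitivity and specificity from confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    if sensitivity + specificity == 0:
        return 0.0
    return 2 * sensitivity * specificity / (sensitivity + specificity)

def accuracy(tp, fn, tn, fp):
    """Plain classification accuracy, shown for comparison."""
    return (tp + tn) / (tp + fn + tn + fp)

# Hypothetical imbalanced sample: 10 cases, 90 controls.
# A degenerate classifier labels every subject as a control.
tp, fn, tn, fp = 0, 10, 90, 0
print("accuracy:", accuracy(tp, fn, tn, fp))
print("HMSS:", hmss(tp, fn, tn, fp))
```

In this toy case the degenerate classifier reaches an accuracy of 0.9 but an HMSS of 0.0, because its sensitivity is zero. That is the kind of behavior that makes a criterion of this form usable on imbalanced data without resampling.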

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Understanding the guiding principles of sensory coding strategies is a main goal in computational neuroscience. Among others, the principles of predictive coding and slowness appear to capture aspects of sensory processing. Predictive coding postulates that sensory systems are adapted to the structure of their input signals such that information about future inputs is encoded. Slow feature analysis (SFA) is a method for extracting slowly varying components from quickly varying input signals, thereby learning temporally invariant features. Here, we use the information bottleneck method to state an information-theoretic objective function for temporally local predictive coding. We then show that the linear case of SFA can be interpreted as a variant of predictive coding that maximizes the mutual information between the current output of the system and the input signal in the next time step. This demonstrates that the slowness principle and predictive coding are intimately related.
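
For orientation, the general information bottleneck Lagrangian can be specialized to the temporally local setting this abstract describes. The notation here is an assumption, not taken from the paper: X_t is the current input, Y_t the current output of the system, X_{t+1} the next input, and β a trade-off parameter.

```latex
\min_{p(y_t \mid x_t)} \; \mathcal{L}
  \;=\; I(X_t; Y_t) \;-\; \beta \, I(Y_t; X_{t+1})
```

The first term penalizes the information the output keeps about the current input (compression). The second rewards the information the output carries about the next input (prediction), which is the quantity the abstract says linear SFA can be interpreted as maximizing.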

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The intersection of the Social Web and the Semantic Web has put folksonomy in the spotlight for its potential to overcome the knowledge acquisition bottleneck and provide insight into the "wisdom of the crowds". Folksonomy, which results from collaborative tagging activities, has provided insight into users' understanding of Web resources, which might be useful for searching and organizing purposes. However, collaborative tagging vocabulary poses some challenges, since tags are freely chosen by users and may exhibit synonymy and polysemy problems. In order to overcome these challenges and boost the potential of folksonomy as emergent semantics, we propose to consolidate the diverse vocabulary into consolidated entities and concepts. We propose to extract a tag ontology through an ontology learning process to represent the semantics of a tagging community. This paper presents a novel approach to learning the ontology based on the widely used lexical database WordNet. We present personalization strategies to disambiguate the semantics of tags by combining the opinion of WordNet lexicographers with users' tagging behavior. We provide empirical evaluations by using the semantic information contained in the ontology in a tag recommendation experiment. The results show that using the semantic relationships in the ontology improves the accuracy of the tag recommender.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper reports laboratory experiments designed to study the impact of public information about past departure rates on congestion levels and travel costs. Our design is based on a discrete version of Arnott et al.'s (1990) bottleneck model. In all treatments, congestion occurs and the observed travel costs are quite similar to the predicted ones. Subjects' capacity to coordinate is not affected by the availability of public information on past departure rates, by the number of drivers or by the relative cost of delay. This seeming absence of treatment effects is confirmed by our finding that a parameter-free reinforcement learning model best characterises individual behaviour.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The South American fur seal, Arctocephalus australis, was one of the earliest otariid seals to be exploited by humans: at least 6000 years ago on the Atlantic coast and 4000 years ago on the Pacific coast of South America. More than 750,000 fur seals were killed in Uruguay until 1991. However, a climatological phenomenon, the severe 1997-1998 El Niño Southern Oscillation (ENSO), was responsible for a 72% decline of the Peruvian fur seal population due to starvation as a consequence of warming sea-surface temperatures and reduced primary productivity. Currently, there is no precise information on global population size or on the species' conservation status. The present study includes the first bottleneck test for the Pacific and Atlantic populations of A. australis based on the analysis of seven microsatellite loci. A genetic bottleneck compromises the evolutionary potential of a population to respond to environmental changes. The outlook becomes even more alarming given current global warming models that predict stronger and more frequent ENSO events in the future. Our analysis found moderate support for deviation from neutrality-equilibrium for the Pacific population of fur seals and none for the Atlantic population. This difference between populations reflects different demographic histories and is consistent with a greater reduction in population size in the Pacific. Such an event could be a result of the synergistic effects of recurrent ENSO events and the anthropogenic impact (sealing and prey overfishing) on this population.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The objective of this research is to investigate the consequences of sharing or using information generated in one phase of a project in subsequent life cycle phases. Sometimes the assumptions supporting the information change, and at other times the context within which the information was created changes in a way that causes the information to become invalid. Often these inconsistencies are not discovered until the damage has occurred. This study builds on previous research that proposed a framework based on the metaphor of 'ecosystems' to model such inconsistencies in the 'supply chain' of life cycle information (Brokaw and Mukherjee, 2012). Such inconsistencies often result in litigation. Therefore, this paper studies a set of legal cases that resulted from inconsistencies in life cycle information, within the ecosystems framework. For each project, the errant information type, the creator and user of the information and their relationship, and the time of creation and usage of the information in the life cycle of the project are investigated to assess the causes of failure of precise and accurate information flow, as well as the impact of such failures in later stages of the project. The analysis shows that misleading information is mostly due to lack of collaboration. In addition, in all the studied cases, lack of compliance checking, imprecise data and insufficient clarifications hinder the accurate and smooth flow of information. The paper presents findings regarding the bottleneck of the information flow process during the design, construction and post-construction phases. It also highlights the role of collaboration as well as information integration and management during the project life cycle, and presents a baseline for improvement in the information supply chain through the life cycle of the project.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-06

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The slowdown in the drug discovery pipeline is due, in part, to a lack of structural and functional information available for new drug targets. Membrane proteins, the targets of well over 50% of marketed pharmaceuticals, present a particular challenge. As they are not naturally abundant, they must be produced recombinantly for the structural biology that is a prerequisite to structure-based drug design. Unfortunately, however, obtaining high yields of functional, recombinant membrane proteins remains a major bottleneck in contemporary bioscience. While repeated rounds of trial-and-error optimization have not revealed (and cannot reveal) mechanistic details of the biology of recombinant protein production, examination of the host response has provided new insights. To this end, we published an early transcriptome analysis that identified genes implicated in high-yielding yeast cell factories, which has enabled the engineering of improved production strains. These advances offer hope that the bottleneck of membrane protein production can be relieved rationally.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Procedural knowledge is the knowledge required to perform certain tasks. It forms an important part of expertise, and is crucial for learning new tasks. This paper summarises existing work on procedural knowledge acquisition, and identifies two major challenges that remain to be solved in this field: automating the acquisition process to tackle the bottleneck in the formalization of procedural knowledge, and enabling machine understanding and manipulation of procedural knowledge. It is believed that recent advances in information extraction techniques can be applied to compose a comprehensive solution to these challenges. We identify specific tasks required to achieve the goal, and present detailed analyses of new research challenges and opportunities. It is expected that these analyses will interest researchers of various knowledge management tasks, particularly knowledge acquisition and capture.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The realization of the Semantic Web is constrained by a knowledge acquisition bottleneck, i.e. the problem of how to add RDF mark-up to the millions of ordinary web pages that already exist. Information Extraction (IE) has been proposed as a solution to the annotation bottleneck. In the task-based evaluation reported here, we compared the performance of users without access to annotation, users working with annotations produced from manually constructed knowledge bases, and users working with annotations augmented using IE. We looked at retrieval performance, overlap between retrieved items and the two sets of annotations, and usage of annotation options. Automatically generated annotations were found to add value to the browsing experience in the scenario investigated. Copyright 2005 ACM.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Every high-resolution imaging system suffers from the bottleneck problem. This problem relates to the huge amount of data transmitted from the sensor array to a digital signal processor (DSP) and to a performance bottleneck caused by the requirement to process a large amount of information in parallel. The same problem exists in biological vision systems, where the information sensed by many millions of receptors must be transmitted and processed in real time. Models describing solutions to the bottleneck problem in biological systems fall within the field of visual attention. This paper presents the bottleneck problem in imagers used for real-time salient target tracking and proposes a simple solution that employs models of attention found in biological systems. The bottleneck problem in imaging systems is presented, the existing models of visual attention are discussed, and the architecture of the proposed imager is shown.