7 resultados para Uniformly Convex
em Boston University Digital Common
Resumo:
BACKGROUND:In the current climate of high-throughput computational biology, the inference of a protein's function from related measurements, such as protein-protein interaction relations, has become a canonical task. Most existing technologies pursue this task as a classification problem, on a term-by-term basis, for each term in a database, such as the Gene Ontology (GO) database, a popular rigorous vocabulary for biological functions. However, ontology structures are essentially hierarchies, with certain top to bottom annotation rules which protein function predictions should in principle follow. Currently, the most common approach to imposing these hierarchical constraints on network-based classifiers is through the use of transitive closure to predictions.RESULTS:We propose a probabilistic framework to integrate information in relational data, in the form of a protein-protein interaction network, and a hierarchically structured database of terms, in the form of the GO database, for the purpose of protein function prediction. At the heart of our framework is a factorization of local neighborhood information in the protein-protein interaction network across successive ancestral terms in the GO hierarchy. We introduce a classifier within this framework, with computationally efficient implementation, that produces GO-term predictions that naturally obey a hierarchical 'true-path' consistency from root to leaves, without the need for further post-processing.CONCLUSION:A cross-validation study, using data from the yeast Saccharomyces cerevisiae, shows our method offers substantial improvements over both standard 'guilt-by-association' (i.e., Nearest-Neighbor) and more refined Markov random field methods, whether in their original form or when post-processed to artificially impose 'true-path' consistency. Further analysis of the results indicates that these improvements are associated with increased predictive capabilities (i.e., increased positive predictive value), and that this increase is consistent uniformly with GO-term depth. Additional in silico validation on a collection of new annotations recently added to GO confirms the advantages suggested by the cross-validation study. Taken as a whole, our results show that a hierarchical approach to network-based protein function prediction, that exploits the ontological structure of protein annotation databases in a principled manner, can offer substantial advantages over the successive application of 'flat' network-based methods.
Resumo:
We give an explicit and easy-to-verify characterization for subsets in finite total orders (infinitely many of them in general) to be uniformly definable by a first-order formula. From this characterization we derive immediately that Beth's definability theorem does not hold in any class of finite total orders, as well as that McColm's first conjecture is true for all classes of finite total orders. Another consequence is a natural 0-1 law for definable subsets on finite total orders expressed as a statement about the possible densities of first-order definable subsets.
Resumo:
A well-known paradigm for load balancing in distributed systems is the``power of two choices,''whereby an item is stored at the less loaded of two (or more) random alternative servers. We investigate the power of two choices in natural settings for distributed computing where items and servers reside in a geometric space and each item is associated with the server that is its nearest neighbor. This is in fact the backdrop for distributed hash tables such as Chord, where the geometric space is determined by clockwise distance on a one-dimensional ring. Theoretically, we consider the following load balancing problem. Suppose that servers are initially hashed uniformly at random to points in the space. Sequentially, each item then considers d candidate insertion points also chosen uniformly at random from the space,and selects the insertion point whose associated server has the least load. For the one-dimensional ring, and for Euclidean distance on the two-dimensional torus, we demonstrate that when n data items are hashed to n servers,the maximum load at any server is log log n / log d + O(1) with high probability. While our results match the well-known bounds in the standard setting in which each server is selected equiprobably, our applications do not have this feature, since the sizes of the nearest-neighbor regions around servers are non-uniform. Therefore, the novelty in our methods lies in developing appropriate tail bounds on the distribution of nearest-neighbor region sizes and in adapting previous arguments to this more general setting. In addition, we provide simulation results demonstrating the load balance that results as the system size scales into the millions.
Resumo:
We study the impact of heterogeneity of nodes, in terms of their energy, in wireless sensor networks that are hierarchically clustered. In these networks some of the nodes become cluster heads, aggregate the data of their cluster members and transmit it to the sink. We assume that a percentage of the population of sensor nodes is equipped with additional energy resources-this is a source of heterogeneity which may result from the initial setting or as the operation of the network evolves. We also assume that the sensors are randomly (uniformly) distributed and are not mobile, the coordinates of the sink and the dimensions of the sensor field are known. We show that the behavior of such sensor networks becomes very unstable once the first node dies, especially in the presence of node heterogeneity. Classical clustering protocols assume that all the nodes are equipped with the same amount of energy and as a result, they can not take full advantage of the presence of node heterogeneity. We propose SEP, a heterogeneous-aware protocol to prolong the time interval before the death of the first node (we refer to as stability period), which is crucial for many applications where the feedback from the sensor network must be reliable. SEP is based on weighted election probabilities of each node to become cluster head according to the remaining energy in each node. We show by simulation that SEP always prolongs the stability period compared to (and that the average throughput is greater than) the one obtained using current clustering protocols. We conclude by studying the sensitivity of our SEP protocol to heterogeneity parameters capturing energy imbalance in the network. We found that SEP yields longer stability region for higher values of extra energy brought by more powerful nodes.
Resumo:
Wireless sensor networks are characterized by limited energy resources. To conserve energy, application-specific aggregation (fusion) of data reports from multiple sensors can be beneficial in reducing the amount of data flowing over the network. Furthermore, controlling the topology by scheduling the activity of nodes between active and sleep modes has often been used to uniformly distribute the energy consumption among all nodes by de-synchronizing their activities. We present an integrated analytical model to study the joint performance of in-network aggregation and topology control. We define performance metrics that capture the tradeoffs among delay, energy, and fidelity of the aggregation. Our results indicate that to achieve high fidelity levels under medium to high event reporting load, shorter and fatter aggregation/routing trees (toward the sink) offer the best delay-energy tradeoff as long as topology control is well coordinated with routing.
Resumo:
We present two algorithms for computing distances along a non-convex polyhedral surface. The first algorithm computes exact minimal-geodesic distances and the second algorithm combines these distances to compute exact shortest-path distances along the surface. Both algorithms have been extended to compute the exact minimalgeodesic paths and shortest paths. These algorithms have been implemented and validated on surfaces for which the correct solutions are known, in order to verify the accuracy and to measure the run-time performance, which is cubic or less for each algorithm. The exact-distance computations carried out by these algorithms are feasible for large-scale surfaces containing tens of thousands of vertices, and are a necessary component of near-isometric surface flattening methods that accurately transform curved manifolds into flat representations.
Resumo:
This article describes a neural network model that addresses the acquisition of speaking skills by infants and subsequent motor equivalent production of speech sounds. The model learns two mappings during a babbling phase. A phonetic-to-orosensory mapping specifies a vocal tract target for each speech sound; these targets take the form of convex regions in orosensory coordinates defining the shape of the vocal tract. The babbling process wherein these convex region targets are formed explains how an infant can learn phoneme-specific and language-specific limits on acceptable variability of articulator movements. The model also learns an orosensory-to-articulatory mapping wherein cells coding desired movement directions in orosensory space learn articulator movements that achieve these orosensory movement directions. The resulting mapping provides a natural explanation for the formation of coordinative structures. This mapping also makes efficient use of redundancy in the articulator system, thereby providing the model with motor equivalent capabilities. Simulations verify the model's ability to compensate for constraints or perturbations applied to the articulators automatically and without new learning and to explain contextual variability seen in human speech production.