995 results for selection signature
Abstract:
R. Jensen and Q. Shen, 'Tolerance-based and Fuzzy-Rough Feature Selection,' Proceedings of the 16th International Conference on Fuzzy Systems (FUZZ-IEEE'07), pp. 877-882, 2007.
Abstract:
R. Jensen and Q. Shen, 'Webpage Classification with ACO-enhanced Fuzzy-Rough Feature Selection,' Proceedings of the Fifth International Conference on Rough Sets and Current Trends in Computing (RSCTC 2006), LNAI 4259, pp. 147-156, 2006.
Abstract:
C. Shang and Q. Shen, 'Aiding classification of gene expression data with feature selection: a comparative study,' Computational Intelligence Research, vol. 1, no. 1, pp. 68-76.
Abstract:
Q. Shen and R. Jensen, 'Approximation-based feature selection and application for algae population estimation,' Applied Intelligence, vol. 28, no. 2, pp. 167-181, 2008. Sponsorship: EPSRC RONO: EP/E058388/1
Abstract:
Computational Intelligence and Feature Selection provides a high-level audience with both the background and the fundamental ideas behind feature selection, with an emphasis on techniques based on rough and fuzzy sets, including their hybridizations. It introduces set theory, fuzzy set theory, rough set theory, and fuzzy-rough set theory, and illustrates the power and efficacy of the feature selection techniques described through real-world applications and worked examples. Program files implementing the major algorithms covered, together with the necessary instructions and datasets, are available on the Web.
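The dependency-degree search at the heart of the rough-set techniques the book covers can be illustrated compactly. Below is a minimal Python sketch of a crisp QuickReduct-style greedy search; the function names and data layout (rows as dicts, a separate label list) are our own choices for illustration, not the book's companion code, and the fuzzy-rough variants replace the crisp equivalence classes used here with fuzzy ones.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices by their values on attrs: the equivalence
    classes of the indiscernibility relation."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def dependency(rows, labels, attrs):
    """Dependency degree: fraction of rows in the positive region,
    i.e. whose equivalence class maps to a single decision label.
    The empty attribute set scores 0 by the usual convention."""
    if not attrs:
        return 0.0
    pos = sum(len(b) for b in partition(rows, attrs)
              if len({labels[i] for i in b}) == 1)
    return pos / len(rows)

def quickreduct(rows, labels, all_attrs):
    """Greedy forward search: repeatedly add the attribute that most
    increases the dependency degree, stopping once it matches the
    dependency of the full attribute set."""
    target = dependency(rows, labels, all_attrs)
    reduct = []
    while dependency(rows, labels, reduct) < target:
        best = max((a for a in all_attrs if a not in reduct),
                   key=lambda a: dependency(rows, labels, reduct + [a]))
        reduct.append(best)
    return reduct
```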
Abstract:
Alexander, N.; Rhodes, M.; and Myers, H. (2007). International market selection: measuring actions instead of intentions. Journal of Services Marketing, 21(6), pp. 424-434. RAE2008
Abstract:
Elliott, G. N., Worgan, H., Broadhurst, D. I., Draper, J. H., Scullion, J. (2007). Soil differentiation using fingerprint Fourier transform infrared spectroscopy, chemometrics and genetic algorithm-based feature selection. Soil Biology & Biochemistry, 39 (11), 2888-2896. Sponsorship: BBSRC / NERC RAE2008
Abstract:
Identification of common sub-sequences for a group of functionally related DNA sequences can shed light on the role of such elements in cell-specific gene expression. In the megakaryocytic lineage, no single transcription factor has been described as lineage-specific, raising the possibility that a cluster of gene promoter sequences presents a unique signature. Here, the megakaryocytic gene promoter group, which consists of both human and mouse 5' non-coding regions, served as a case study. A methodology for group-combinatorial search has been implemented as a customized software platform. It extracts the longest common sequences for a group of related DNA sequences and allows for single gaps of varying length, as well as double- and multiple-gap sequences. The results point to common DNA sequences in a group of genes selectively expressed in megakaryocytes that do not appear in a large group of control, random, and specific sequences. This suggests a role for a combination of these sequences in cell-specific gene expression in the megakaryocytic lineage. The data also point to an intrinsic cross-species difference in the organization of 5' non-coding sequences within mammalian genomes. This methodology may be used for the identification of regulatory sequences in other lineages.
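The abstract describes a group-combinatorial search for the longest sequences common to every member of a promoter group. The platform itself is not reproduced here; the following is a brute-force Python sketch of the ungapped core only (the published method additionally allows single, double, and multiple gaps and screens hits against control sequences), with hypothetical function names.

```python
def longest_common_motif(seqs):
    """Longest substring shared exactly by every sequence in the group.
    Candidates are drawn from the shortest sequence, longest first, so
    the first hit is a longest common sequence."""
    ref = min(seqs, key=len)
    for length in range(len(ref), 0, -1):
        for start in range(len(ref) - length + 1):
            cand = ref[start:start + length]
            if all(cand in s for s in seqs):
                return cand
    return ""

# Example: longest_common_motif(["GATTACA", "TTACAGG", "CTTACAG"]) -> "TTACA"
```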
Abstract:
Space carving has emerged as a powerful method for multiview scene reconstruction. Although a wide variety of methods have been proposed, the quality of the reconstruction remains highly dependent on the photometric consistency measure and on the threshold used to carve away voxels. In this paper, we present a novel photo-consistency measure motivated by a multiset variant of the chamfer distance. The new measure is robust to high amounts of within-view color variance and also takes into account the projection angles of back-projected pixels. Another critical issue in space carving is the selection of the photo-consistency threshold used to determine which surface voxels are kept or carved away. We propose a reliable threshold selection technique that examines the photo-consistency values at contour generator points, which are points lying on both the surface of the object and the visual hull. The threshold is determined from a percentile ranking of the photo-consistency values of these generator points. This technique is applicable to a wide variety of photo-consistency measures, including the new measure presented in this paper. Also presented is a method, based on receiver operating characteristic (ROC) curves, for choosing between photo-consistency measures and voxel array resolutions prior to carving.
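The threshold-selection idea lends itself to a short sketch: collect photo-consistency values at contour generator points and take a percentile as the carving threshold. In the Python sketch below, the 90th-percentile default and the treatment of photo-consistency as a cost (lower is more consistent) are our assumptions, not values from the paper.

```python
import numpy as np

def carving_threshold(generator_costs, percentile=90.0):
    """Set the carving threshold at a percentile of the photo-consistency
    costs observed at contour generator points (voxels lying on both the
    object surface and the visual hull). The 90th percentile is a
    placeholder; the paper selects the rank empirically."""
    return float(np.percentile(generator_costs, percentile))

def keep_voxel(cost, threshold):
    """A voxel survives carving when its photo-consistency cost does
    not exceed the threshold."""
    return cost <= threshold
```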
Abstract:
As distributed information services like the World Wide Web become increasingly popular on the Internet, problems of scale are clearly evident. A promising technique that addresses many of these problems is service (or document) replication. However, when a service is replicated, clients then need the additional ability to find a "good" provider of that service. In this paper we report on techniques for finding good service providers without a priori knowledge of server location or network topology. We consider the use of two principal metrics for measuring distance in the Internet: hops and round-trip latency. We show that these two metrics yield very different results in practice. Surprisingly, we show data indicating that the number of hops between two hosts in the Internet is not strongly correlated with round-trip latency. Thus, the distance in hops between two hosts is not necessarily a good predictor of the expected latency of a document transfer. Instead of using known or measured distances in hops, we show that the extra runtime cost incurred by dynamic latency measurement is well justified by the resulting improved performance. In addition, we show that selection based on dynamic latency measurement performs much better in practice than any static selection scheme. Finally, the difference between the distributions of hops and latencies is fundamental enough to suggest differences in algorithms for server replication. We show that conclusions drawn about service replication based on the distribution of hops need to be revised when the distribution of latencies is considered instead.
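The dynamic-selection policy the abstract argues for, measuring latency at request time and picking the best replica rather than trusting static hop counts, can be sketched in a few lines. The Python below times a TCP connection setup as a crude round-trip probe; the paper's own measurement tools differ, so treat this purely as an illustration of the policy.

```python
import socket
import time

def probe_rtt(host, port=80, timeout=2.0):
    """One timed TCP connection setup as a rough round-trip probe;
    unreachable servers are treated as infinitely far away."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return float("inf")

def pick_replica(hosts):
    """Dynamic selection: probe every replica at request time and fetch
    from the one with the smallest measured latency, instead of from the
    statically nearest-in-hops server."""
    return min(hosts, key=probe_rtt)
```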
Abstract:
Replication is a commonly proposed solution to problems of scale associated with distributed services. However, when a service is replicated, each client must be assigned a server. Prior work has generally assumed that assignment to be static. In contrast, we propose dynamic server selection, and show that it enables application-level congestion avoidance. To make dynamic server selection practical, we demonstrate the use of three tools. In addition to direct measurements of round-trip latency, we introduce and validate two new tools: bprobe, which estimates the maximum possible bandwidth along a given path; and cprobe, which estimates the current congestion along a path. Using these tools we demonstrate dynamic server selection and compare it to previous static approaches. We show that dynamic server selection consistently outperforms static policies by as much as 50%. Furthermore, we demonstrate the importance of each of our tools in performing dynamic server selection.
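A plausible way to combine the three measurements into a selection decision is a first-order transfer-time estimate: one round trip plus document size over available bandwidth. The Python sketch below assumes probe results are already in hand; the formula and the probes dictionary layout are our simplification, not the paper's exact model of the RTT, bprobe, and cprobe outputs.

```python
def estimated_transfer_time(latency_s, avail_bw_bps, doc_bytes):
    """First-order estimate: one round trip plus document size divided
    by currently available bandwidth (bytes per second). Combining the
    probe outputs this way is our simplification."""
    return latency_s + doc_bytes / avail_bw_bps

def pick_server(probes, doc_bytes):
    """probes maps server -> (latency_s, avail_bw_bps), measured per
    request. Choose the server minimizing estimated transfer time."""
    return min(probes,
               key=lambda s: estimated_transfer_time(*probes[s], doc_bytes))
```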
Abstract:
High-speed networks, such as ATM networks, are expected to support diverse Quality of Service (QoS) constraints, including real-time QoS guarantees. Real-time QoS is required by many applications such as those that involve voice and video communication. To support such services, routing algorithms that allow applications to reserve the needed bandwidth over a Virtual Circuit (VC) have been proposed. Commonly, these bandwidth-reservation algorithms assign VCs to routes using the least-loaded concept, and thus result in balancing the load over the set of all candidate routes. In this paper, we show that for such reservation-based protocols, which allow for the exclusive use of a preset fraction of a resource's bandwidth for an extended period of time, load balancing is not desirable as it results in resource fragmentation, which adversely affects the likelihood of accepting new reservations. In particular, we show that load-balancing VC routing algorithms are not appropriate when the main objective of the routing protocol is to increase the probability of finding routes that satisfy incoming VC requests, as opposed to equalizing the bandwidth utilization along the various routes. We present an on-line VC routing scheme that is based on the concept of "load profiling", which allows a distribution of "available" bandwidth across a set of candidate routes to match the characteristics of incoming VC QoS requests. We show the effectiveness of our load-profiling approach when compared to traditional load-balancing and load-packing VC routing schemes.
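The contrast between load balancing and load profiling can be caricatured as worst-fit versus best-fit placement of a VC's bandwidth demand. The Python sketch below captures only that intuition; the paper's load-profiling scheme matches the entire distribution of residual bandwidth across candidate routes to the observed profile of incoming QoS requests, which this best-fit stand-in does not attempt.

```python
def least_loaded(spare, demand):
    """Load balancing: route the VC over the candidate with the most
    spare bandwidth (worst fit), which tends to fragment capacity.
    spare maps route -> available bandwidth."""
    ok = [r for r in spare if spare[r] >= demand]
    return max(ok, key=spare.get) if ok else None

def profile_aware(spare, demand):
    """A crude stand-in for load profiling: pick the feasible route whose
    spare bandwidth is closest to the demand (best fit), preserving large
    residuals for large future requests."""
    ok = [r for r in spare if spare[r] >= demand]
    return min(ok, key=lambda r: spare[r] - demand) if ok else None
```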
Abstract:
A foundational issue underlying many overlay network applications, ranging from routing to P2P file sharing, is that of connectivity management, i.e., folding new arrivals into the existing mesh and re-wiring to cope with changing network conditions. Previous work has considered the problem from two perspectives: devising practical heuristics for specific applications designed to work well in real deployments, and providing abstractions for the underlying problem that are tractable to address via theoretical analyses, especially game-theoretic analysis. Our work unifies these two thrusts by distilling insights gleaned from clean theoretical models, notably that under natural resource constraints, selfish players can select neighbors so as to efficiently reach near-equilibria that also provide high global performance. Using Egoist, a prototype overlay routing system we implemented on PlanetLab, we demonstrate that our neighbor selection primitives significantly outperform existing heuristics on a variety of performance metrics; that Egoist is competitive with an optimal but unscalable full-mesh approach; and that it remains highly effective under significant churn. We also describe variants of Egoist's current design that would enable it to scale to much larger overlays and to cater effectively to applications, such as P2P file sharing in unstructured overlays, that rely on primitives such as scoped flooding rather than routing.
Abstract:
A foundational issue underlying many overlay network applications, ranging from routing to P2P file sharing, is that of connectivity management, i.e., folding new arrivals into an existing overlay and re-wiring to cope with changing network conditions. Previous work has considered the problem from two perspectives: devising practical heuristics for specific applications designed to work well in real deployments, and providing abstractions for the underlying problem that are analytically tractable, especially via game-theoretic analysis. In this paper, we unify these two thrusts by using insights gleaned from novel, realistic theoretical models in the design of Egoist – a prototype overlay routing system that we implemented, deployed, and evaluated on PlanetLab. Using measurements on PlanetLab and trace-based simulations, we demonstrate that Egoist's neighbor selection primitives significantly outperform existing heuristics on a variety of performance metrics, including delay, available bandwidth, and node utilization. Moreover, we demonstrate that Egoist is competitive with an optimal but unscalable full-mesh approach, remains highly effective under significant churn, is robust to cheating, and incurs minimal overhead. Finally, we discuss some of the potential benefits Egoist may offer to applications.
Abstract:
In a typical overlay network for routing or content sharing, each node must select a fixed number of immediate overlay neighbors for routing traffic or content queries. A selfish node entering such a network would select neighbors so as to minimize the weighted sum of expected access costs to all its destinations. Previous work on selfish neighbor selection has built intuition with simple models in which edges are undirected, access costs are modeled by hop counts, and nodes have potentially unbounded degrees. In practice, however, important constraints not captured by these models lead to richer games with substantively and fundamentally different outcomes. Our work models neighbor selection as a game involving directed links, constraints on the number of allowed neighbors, and costs reflecting both network latency and node preference. We express a node's "best response" wiring strategy as a k-median problem on asymmetric distance, and use this formulation to obtain pure Nash equilibria. We experimentally examine the properties of such stable wirings on synthetic topologies, as well as on real topologies and maps constructed from PlanetLab and AS-level Internet measurements. Our results indicate that selfish nodes can reap substantial performance benefits when connecting to overlay networks composed of non-selfish nodes. On the other hand, in overlays dominated by selfish nodes, the resulting stable wirings are so thoroughly optimized that even non-selfish newcomers can extract near-optimal performance through naive wiring strategies.
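The "best response" formulation admits a direct, if naive, implementation: enumerate all k-subsets of candidate neighbors and keep the one minimizing total access cost, computed from directed link costs plus asymmetric distances. The Python sketch below is exhaustive and thus workable only for small candidate sets (the underlying k-median problem is NP-hard in general); the data layout is our assumption, not the paper's.

```python
from itertools import combinations

def access_cost(link, dist, neighbors, dests):
    """Total cost to reach every destination through the cheapest chosen
    neighbor: own directed link cost plus the neighbor's (asymmetric)
    distance to the destination."""
    return sum(min(link[n] + dist[n][t] for n in neighbors) for t in dests)

def best_response(link, dist, dests, k):
    """Exhaustive best response over all k-subsets of candidate neighbors.
    link: candidate -> directed link cost from this node;
    dist: candidate -> destination -> asymmetric distance."""
    return min(combinations(link, k),
               key=lambda nbrs: access_cost(link, dist, nbrs, dests))
```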