Digital Library

991 results for "stored"

ART: An ontology based tool for the translation of papers into Semantic Web format

Relevance: 10.00%

Abstract:

To be presented at the SIG/ISMB07 ontology workshop: http://bio-ontologies.org.uk/index.php. To be published in BMC Bioinformatics. Sponsorship: JISC

Fuzzy-Rough Attribute Reduction with Application to Web Categorization.

Relevance: 10.00%

Abstract:

R. Jensen and Q. Shen, 'Fuzzy-Rough Attribute Reduction with Application to Web Categorization,' Fuzzy Sets and Systems, vol. 141, no. 3, pp. 469-485, 2004.

Webpage Classification with ACO-enhanced Fuzzy-Rough Feature Selection.

Relevance: 10.00%

Abstract:

R. Jensen and Q. Shen, 'Webpage Classification with ACO-enhanced Fuzzy-Rough Feature Selection,' Proceedings of the Fifth International Conference on Rough Sets and Current Trends in Computing (RSCTC 2006), LNAI 4259, pp. 147-156, 2006.

Exhibiting the behaviour of time-delayed systems via an extension to qualitative simulation

Relevance: 10.00%

Abstract:

I. Miguel and Q. Shen. Exhibiting the behaviour of time-delayed systems via an extension to qualitative simulation. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, 35(2):298-305, 2005.

Absence of turnover and futile cycling of sucrose in leaves of Lolium temulentum L.: implications for metabolic compartmentation

Relevance: 10.00%

Abstract:

Cairns, A. J., Gallagher, J. A. (2004). Absence of turnover and futile cycling of sucrose in leaves of Lolium temulentum L.: implications for metabolic compartmentation. Planta, 219 (5), 836-846. Sponsorship: BBSRC RAE2008

Aplicação de MonetDB na avaliação de desempenho de bases de dados verticais (Application of MonetDB to the performance evaluation of vertical databases)

Relevance: 10.00%

Abstract:

Dissertation submitted to Universidade Fernando Pessoa in partial fulfilment of the requirements for the degree of Master in Informatics Engineering, Information Systems and Multimedia track.

SIEGE: Smoking Induced Epithelial Gene Expression Database

Relevance: 10.00%

Abstract:

The SIEGE (Smoking Induced Epithelial Gene Expression) database is a clinical resource for compiling and analyzing gene expression data from epithelial cells of the human intra-thoracic airway. This database supports a translational research study whose goal is to profile the changes in airway gene expression that are induced by cigarette smoke. RNA is isolated from airway epithelium obtained at bronchoscopy from current-, former- and never-smoker subjects, and hybridized to Affymetrix HG-U133A Genechips, which measure the level of expression of ~22 500 human transcripts. The microarray data generated along with relevant patient information is uploaded to SIEGE by study administrators using the database's web interface, found at http://pulm.bumc.bu.edu/siegeDB. PERL-coded scripts integrated with SIEGE perform various quality control functions including the processing, filtering and formatting of stored data. The R statistical package is used to import database expression values and execute a number of statistical analyses including t-tests, correlation coefficients and hierarchical clustering. Values from all statistical analyses can be queried through CGI-based tools and web forms found on the "Search" section of the database website. Query results are embedded with graphical capabilities as well as with links to other databases containing valuable gene resources, including Entrez Gene, GO, Biocarta, GeneCards, dbSNP and the NCBI Map Viewer.

Quantitative Particle Characterization by Scattered Ultrasound

Relevance: 10.00%

Abstract:

The topic of this thesis is an acoustic scattering technique for determining the compressibility and density of individual particles. The particles, which have diameters on the order of 10 µm, are modeled as fluid spheres. Ultrasonic tone bursts of 2 µsec duration and 30 MHz center frequency scatter from individual particles as they traverse the focal region of two confocally positioned transducers. One transducer acts as a receiver while the other both transmits and receives acoustic signals. The resulting scattered bursts are detected at 90° and at 180° (backscattered). Using either the long wavelength (Rayleigh) or the weak scatterer (Born) approximations, it is possible to determine the compressibility and density of the particle provided we possess a priori knowledge of the particle size and the host properties. The detected scattered signals are digitized and stored in computer memory. With this information we can compute the mean compressibility and density averaged over a population of particles (typically 1000 particles) or display histograms of scattered amplitude statistics. An experiment was first run to assess the feasibility of using polystyrene polymer microspheres to calibrate the instrument. A second study was performed on the buffy coat harvested from whole human blood. Finally, Chinese hamster ovary cells which were subject to hyperthermia treatment were studied in order to see if the instrument could detect heat-induced membrane blebbing.

AIDA-Based Distributed File System

Relevance: 10.00%

Abstract:

This paper describes a prototype implementation of a Distributed File System (DFS) based on the Adaptive Information Dispersal Algorithm (AIDA). Using AIDA, a file block is encoded and dispersed into smaller blocks stored on a number of DFS nodes distributed over a network. The implementation devises file creation, read, and write operations. In particular, when reading a file, the DFS accepts an optional timing constraint, which it uses to determine the level of redundancy needed for the read operation. The tighter the timing constraint, the more nodes in the DFS are queried for encoded blocks. Write operations update all blocks in all DFS nodes--with future implementations possibly including the use of read and write quorums. This work was conducted under the supervision of Professor Azer Bestavros (best@cs.bu.edu) in the Computer Science Department as part of Mohammad Makarechian's Master's project.

Geometric Generalizations of the Power of Two Choices

Relevance: 10.00%

Abstract:

A well-known paradigm for load balancing in distributed systems is the "power of two choices," whereby an item is stored at the less loaded of two (or more) random alternative servers. We investigate the power of two choices in natural settings for distributed computing where items and servers reside in a geometric space and each item is associated with the server that is its nearest neighbor. This is in fact the backdrop for distributed hash tables such as Chord, where the geometric space is determined by clockwise distance on a one-dimensional ring. Theoretically, we consider the following load balancing problem. Suppose that servers are initially hashed uniformly at random to points in the space. Sequentially, each item then considers d candidate insertion points also chosen uniformly at random from the space, and selects the insertion point whose associated server has the least load. For the one-dimensional ring, and for Euclidean distance on the two-dimensional torus, we demonstrate that when n data items are hashed to n servers, the maximum load at any server is log log n / log d + O(1) with high probability. While our results match the well-known bounds in the standard setting in which each server is selected equiprobably, our applications do not have this feature, since the sizes of the nearest-neighbor regions around servers are non-uniform. Therefore, the novelty in our methods lies in developing appropriate tail bounds on the distribution of nearest-neighbor region sizes and in adapting previous arguments to this more general setting. In addition, we provide simulation results demonstrating the load balance that results as the system size scales into the millions.
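
The insertion process described in this abstract can be illustrated with a small simulation. The following is only a minimal sketch, not the authors' code: the function name `ring_loads`, the seed parameter, and the choice of a Chord-style clockwise successor as the "associated server" are assumptions made for illustration.

```python
import bisect
import random


def ring_loads(n, d, seed=0):
    """Hash n servers and n items onto the unit ring; return per-server loads.

    Each item draws d candidate points uniformly at random and is stored at
    the candidate whose owning server currently holds the fewest items.
    """
    rng = random.Random(seed)
    servers = sorted(rng.random() for _ in range(n))
    load = [0] * n

    def owner(point):
        # Assumed ownership rule: first server clockwise from the point,
        # wrapping around past the last server back to index 0.
        return bisect.bisect_left(servers, point) % n

    for _ in range(n):
        candidates = [owner(rng.random()) for _ in range(d)]
        best = min(candidates, key=lambda s: load[s])
        load[best] += 1
    return load


if __name__ == "__main__":
    for d in (1, 2, 3):
        print(d, max(ring_loads(10_000, d)))
```

With d = 1 the maximum load is driven by the largest arc on the ring; with d = 2 or more it is typically much smaller, which is the effect the stated bound quantifies.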

On trip planning queries in spatial databases

Relevance: 10.00%

Abstract:

In this paper we discuss a new type of query in Spatial Databases, called Trip Planning Query (TPQ). Given a set of points P in space, where each point belongs to a category, and given two points s and e, TPQ asks for the best trip that starts at s, passes through exactly one point from each category, and ends at e. An example of a TPQ is when a user wants to visit a set of different places and at the same time minimize the total travelling cost, e.g. what is the shortest travelling plan for me to visit an automobile shop, a CVS pharmacy outlet, and a Best Buy shop along my trip from A to B? The trip planning query is an extension of the well-known traveling salesman problem (TSP) and therefore is NP-hard. The difficulty of this query lies in the existence of multiple choices for each category. In this paper, we first study fast approximation algorithms for the trip planning query in a metric space, assuming that the data set fits in main memory, and give a theoretical analysis of their approximation bounds. Then, the trip planning query is examined for data sets that do not fit in main memory and must be stored on disk. For the disk-resident data, we consider two cases. In one case, we assume that the points are located in Euclidean space and indexed with an R-tree. In the other case, we consider the problem of points that lie on the edges of a spatial network (e.g. road network) and the distance between two points is defined using the shortest distance over the network. Finally, we give an experimental evaluation of the proposed algorithms using synthetic data sets generated on real road networks.
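
To make the query definition concrete, here is a minimal in-memory sketch of one simple greedy heuristic: from the current position, always move to the closest point of any category not yet visited, then finish at e. It is an illustration under stated assumptions (Euclidean distance in the plane, every category non-empty, the hypothetical function name `greedy_trip`), not necessarily one of the algorithms analysed in the paper, and it gives no optimality guarantee.

```python
import math


def greedy_trip(start, end, points_by_category):
    """Build a trip from start to end visiting one point per category.

    points_by_category maps a category name to a non-empty list of (x, y)
    points. Returns (route, total_length).
    """
    current = start
    remaining = dict(points_by_category)
    route = [start]
    total = 0.0
    while remaining:
        # Closest point among all categories that are still unvisited.
        category, point = min(
            ((c, p) for c, pts in remaining.items() for p in pts),
            key=lambda cp: math.dist(current, cp[1]),
        )
        total += math.dist(current, point)
        route.append(point)
        current = point
        del remaining[category]
    # Close the trip at the required end point.
    total += math.dist(current, end)
    route.append(end)
    return route, total


if __name__ == "__main__":
    places = {
        "auto shop": [(1.0, 0.0), (5.0, 5.0)],
        "pharmacy": [(2.0, 0.0), (9.0, 9.0)],
        "electronics": [(3.0, 0.0)],
    }
    print(greedy_trip((0.0, 0.0), (10.0, 0.0), places))
```

Because each category offers several candidate points, a locally closest choice can force a long detour later; that is the "multiple choices" difficulty the abstract refers to.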

On the Interaction Between an Operating System and Web Server

Relevance: 10.00%

Abstract:

This paper examines how and why web server performance changes as the workload at the server varies. We measure the performance of a PC acting as a standalone web server, running Apache on top of Linux. We use two important tools to understand what aspects of software architecture and implementation determine performance at the server. The first is a tool that we developed, called WebMonitor, which measures activity and resource consumption, both in the operating system and in the web server. The second is the kernel profiling facility distributed as part of Linux. We vary the workload at the server along two important dimensions: the number of clients concurrently accessing the server, and the size of the documents stored on the server. Our results quantify and show how more clients and larger files stress the web server and operating system in different and surprising ways. Our results also show the importance of fixed costs (i.e., opening and closing TCP connections, and updating the server log) in determining web server performance.

A Hierarchical Characterization of a Live Streaming Media Workload

Relevance: 10.00%

Abstract:

We present what we believe to be the first thorough characterization of live streaming media content delivered over the Internet. Our characterization of over five million requests spanning a 28-day period is done at three increasingly granular levels, corresponding to clients, sessions, and transfers. Our findings support two important conclusions. First, we show that the nature of interactions between users and objects is fundamentally different for live versus stored objects. Access to stored objects is user driven, whereas access to live objects is object driven. This reversal of active/passive roles of users and objects leads to interesting dualities. For instance, our analysis underscores a Zipf-like profile for user interest in a given object, which is to be contrasted to the classic Zipf-like popularity of objects for a given user. Also, our analysis reveals that transfer lengths are highly variable and that this variability is due to the stickiness of clients to a particular live object, as opposed to structural (size) properties of objects. Second, based on observations we make, we conjecture that the particular characteristics of live media access workloads are likely to be highly dependent on the nature of the live content being accessed. In our study, this dependence is clear from the strong temporal correlations we observed in the traces, which we attribute to the synchronizing impact of live content on access characteristics. Based on our analyses, we present a model for live media workload generation that incorporates many of our findings, and which we implement in GISMO [19].

Simple Load Balancing for Distributed Hash Tables

Relevance: 10.00%

Abstract:

Distributed hash tables have recently become a useful building block for a variety of distributed applications. However, current schemes based upon consistent hashing require both considerable implementation complexity and substantial storage overhead to achieve desired load balancing goals. We argue in this paper that these goals can be achieved more simply and more cost-effectively. First, we suggest the direct application of the "power of two choices" paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing methods, including load-stealing or load-shedding schemes, as well as providing natural fault-tolerance mechanisms.
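
The idea of associating a small constant number of hash values with each key can be sketched as a toy, single-process store. Everything named here is hypothetical (the class `TwoChoiceStore`, salting SHA-256 with the choice index, and probing all d candidates on lookup); it is meant only to show the shape of the technique, not the scheme as specified in the paper.

```python
import hashlib


class TwoChoiceStore:
    """Toy key-value store: each key has d candidate servers."""

    def __init__(self, num_servers, d=2):
        self.d = d
        self.servers = [dict() for _ in range(num_servers)]

    def _candidates(self, key):
        # Derive d server indices by salting the key with the choice number.
        n = len(self.servers)
        return [
            int(hashlib.sha256(f"{i}:{key}".encode()).hexdigest(), 16) % n
            for i in range(self.d)
        ]

    def put(self, key, value):
        cands = self._candidates(key)
        # Update in place if the key already lives on one of its candidates.
        for s in cands:
            if key in self.servers[s]:
                self.servers[s][key] = value
                return s
        # Otherwise store it on the least loaded candidate.
        target = min(cands, key=lambda s: len(self.servers[s]))
        self.servers[target][key] = value
        return target

    def get(self, key):
        # Assumed lookup strategy: probe every candidate server.
        for s in self._candidates(key):
            if key in self.servers[s]:
                return self.servers[s][key]
        raise KeyError(key)


if __name__ == "__main__":
    store = TwoChoiceStore(num_servers=16)
    for i in range(200):
        store.put(f"key-{i}", i)
    print(sorted(len(s) for s in store.servers))
```

Since the candidate set for a key is fixed, the same d locations are natural places to look when an item is moved by a load-stealing or load-shedding step, or to hold replicas for fault tolerance.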

Amorphous Placement and Retrieval of Sensory Data in Sparse Mobile Ad-Hoc Networks

Relevance: 10.00%

Abstract:

Personal communication devices are increasingly being equipped with sensors that are able to passively collect information from their surroundings, information that could be stored in fairly small local caches. We envision a system in which users of such devices use their collective sensing, storage, and communication resources to query the state of (possibly remote) neighborhoods. The goal of such a system is to achieve the highest query success ratio using the least communication overhead (power). We show that the use of Data Centric Storage (DCS), or directed placement, is a viable approach for achieving this goal, but only when the underlying network is well connected. Alternatively, we propose amorphous placement, in which sensory samples are cached locally and informed exchanges of cached samples are used to diffuse the sensory data throughout the whole network. In handling queries, the local cache is searched first for potential answers. If unsuccessful, the query is forwarded to one or more direct neighbors for answers. This technique leverages node mobility and caching capabilities to avoid the multi-hop communication overhead of directed placement. Using a simplified mobility model, we provide analytical lower and upper bounds on the ability of amorphous placement to achieve uniform field coverage in one and two dimensions. We show that combining informed shuffling of cached samples upon an encounter between two nodes with the querying of direct neighbors could lead to significant performance improvements. For instance, under realistic mobility models, our simulation experiments show that amorphous placement achieves a 10% to 40% better query answering ratio at a 25% to 35% savings in consumed power over directed placement.
