942 resultados para Dynamic storage allocation (Computer science)
Resumo:
In an outsourced database system the data owner publishes information through a number of remote, untrusted servers with the goal of enabling clients to access and query the data more efficiently. As clients cannot trust servers, query authentication is an essential component in any outsourced database system. Clients should be given the capability to verify that the answers provided by the servers are correct with respect to the actual data published by the owner. While existing work provides authentication techniques for selection and projection queries, there is a lack of techniques for authenticating aggregation queries. This article introduces the first known authenticated index structures for aggregation queries. First, we design an index that features good performance characteristics for static environments, where few or no updates occur to the data. Then, we extend these ideas and propose more involved structures for the dynamic case, where the database owner is allowed to update the data arbitrarily. Our structures feature excellent average case performance for authenticating queries with multiple aggregate attributes and multiple selection predicates. We also implement working prototypes of the proposed techniques and experimentally validate the correctness of our ideas.
Resumo:
In this project we design and implement a centralized hashing table in the snBench sensor network environment. We discuss the feasibility of this approach and compare and contrast with the distributed hashing architecture, with particular discussion regarding the conditions under which a centralized architecture makes sense. There are numerous computational tasks that require persistence of data in a sensor network environment. To help motivate the need for data storage in snBench we demonstrate a practical application of the technology whereby a video camera can monitor a room to detect the presence of a person and send an alert to the appropriate authorities.
Resumo:
Spotting patterns of interest in an input signal is a very useful task in many different fields including medicine, bioinformatics, economics, speech recognition and computer vision. Example instances of this problem include spotting an object of interest in an image (e.g., a tumor), a pattern of interest in a time-varying signal (e.g., audio analysis), or an object of interest moving in a specific way (e.g., a human's body gesture). Traditional spotting methods, which are based on Dynamic Time Warping or hidden Markov models, use some variant of dynamic programming to register the pattern and the input while accounting for temporal variation between them. At the same time, those methods often suffer from several shortcomings: they may give meaningless solutions when input observations are unreliable or ambiguous, they require a high complexity search across the whole input signal, and they may give incorrect solutions if some patterns appear as smaller parts within other patterns. In this thesis, we develop a framework that addresses these three problems, and evaluate the framework's performance in spotting and recognizing hand gestures in video. The first contribution is a spatiotemporal matching algorithm that extends the dynamic programming formulation to accommodate multiple candidate hand detections in every video frame. The algorithm finds the best alignment between the gesture model and the input, and simultaneously locates the best candidate hand detection in every frame. This allows for a gesture to be recognized even when the hand location is highly ambiguous. The second contribution is a pruning method that uses model-specific classifiers to reject dynamic programming hypotheses with a poor match between the input and model. Pruning improves the efficiency of the spatiotemporal matching algorithm, and in some cases may improve the recognition accuracy. The pruning classifiers are learned from training data, and cross-validation is used to reduce the chance of overpruning. The third contribution is a subgesture reasoning process that models the fact that some gesture models can falsely match parts of other, longer gestures. By integrating subgesture reasoning the spotting algorithm can avoid the premature detection of a subgesture when the longer gesture is actually being performed. Subgesture relations between pairs of gestures are automatically learned from training data. The performance of the approach is evaluated on two challenging video datasets: hand-signed digits gestured by users wearing short sleeved shirts, in front of a cluttered background, and American Sign Language (ASL) utterances gestured by ASL native signers. The experiments demonstrate that the proposed method is more accurate and efficient than competing approaches. The proposed approach can be generally applied to alignment or search problems with multiple input observations, that use dynamic programming to find a solution.
Resumo:
Personal communication devices are increasingly equipped with sensors that are able to collect and locally store information from their environs. The mobility of users carrying such devices, and hence the mobility of sensor readings in space and time, opens new horizons for interesting applications. In particular, we envision a system in which the collective sensing, storage and communication resources, and mobility of these devices could be leveraged to query the state of (possibly remote) neighborhoods. Such queries would have spatio-temporal constraints which must be met for the query answers to be useful. Using a simplified mobility model, we analytically quantify the benefits from cooperation (in terms of the system's ability to satisfy spatio-temporal constraints), which we show to go beyond simple space-time tradeoffs. In managing the limited storage resources of such cooperative systems, the goal should be to minimize the number of unsatisfiable spatio-temporal constraints. We show that Data Centric Storage (DCS), or "directed placement", is a viable approach for achieving this goal, but only when the underlying network is well connected. Alternatively, we propose, "amorphous placement", in which sensory samples are cached locally, and shuffling of cached samples is used to diffuse the sensory data throughout the whole network. We evaluate conditions under which directed versus amorphous placement strategies would be more efficient. These results lead us to propose a hybrid placement strategy, in which the spatio-temporal constraints associated with a sensory data type determine the most appropriate placement strategy for that data type. We perform an extensive simulation study to evaluate the performance of directed, amorphous, and hybrid placement protocols when applied to queries that are subject to timing constraints. Our results show that, directed placement is better for queries with moderately tight deadlines, whereas amorphous placement is better for queries with looser deadlines, and that under most operational conditions, the hybrid technique gives the best compromise.
Resumo:
Modern cellular channels in 3G networks incorporate sophisticated power control and dynamic rate adaptation which can have significant impact on adaptive transport layer protocols, such as TCP. Though there exists studies that have evaluated the performance of TCP over such networks, they are based solely on observations at the transport layer and hence have no visibility into the impact of lower layer dynamics, which are a key characteristic of these networks. In this work, we present a detailed characterization of TCP behavior based on cross-layer measurement of transport layer, as well as RF and MAC layer parameters. In particular, through a series of active TCP/UDP experiments and measurement of the relevant variables at all three layers, we characterize both, the wireless scheduler and the radio link protocol in a commercial CDMA2000 network and assess their impact on TCP dynamics. Somewhat surprisingly, our findings indicate that the wireless scheduler is mostly insensitive to channel quality and sector load over short timescales and is mainly affected by the transport layer data rate. Furthermore, with the help of a robust correlation measure, Normalized Mutual Information, we were able to quantify the impact of the wireless scheduler and the radio link protocol on various TCP parameters such as the round trip time, throughput and packet loss rate.
Resumo:
Emerging configurable infrastructures such as large-scale overlays and grids, distributed testbeds, and sensor networks comprise diverse sets of available computing resources (e.g., CPU and OS capabilities and memory constraints) and network conditions (e.g., link delay, bandwidth, loss rate, and jitter) whose characteristics are both complex and time-varying. At the same time, distributed applications to be deployed on these infrastructures exhibit increasingly complex constraints and requirements on resources they wish to utilize. Examples include selecting nodes and links to schedule an overlay multicast file transfer across the Grid, or embedding a network experiment with specific resource constraints in a distributed testbed such as PlanetLab. Thus, a common problem facing the efficient deployment of distributed applications on these infrastructures is that of "mapping" application-level requirements onto the network in such a manner that the requirements of the application are realized, assuming that the underlying characteristics of the network are known. We refer to this problem as the network embedding problem. In this paper, we propose a new approach to tackle this combinatorially-hard problem. Thanks to a number of heuristics, our approach greatly improves performance and scalability over previously existing techniques. It does so by pruning large portions of the search space without overlooking any valid embedding. We present a construction that allows a compact representation of candidate embeddings, which is maintained by carefully controlling the order via which candidate mappings are inserted and invalid mappings are removed. We present an implementation of our proposed technique, which we call NETEMBED – a service that identify feasible mappings of a virtual network configuration (the query network) to an existing real infrastructure or testbed (the hosting network). We present results of extensive performance evaluation experiments of NETEMBED using several combinations of real and synthetic network topologies. Our results show that our NETEMBED service is quite effective in identifying one (or all) possible embeddings for quite sizable queries and hosting networks – much larger than what any of the existing techniques or services are able to handle.
Resumo:
Much work has been done on learning from failure in search to boost solving of combinatorial problems, such as clause-learning and clause-weighting in boolean satisfiability (SAT), nogood and explanation-based learning, and constraint weighting in constraint satisfaction problems (CSPs). Many of the top solvers in SAT use clause learning to good effect. A similar approach (nogood learning) has not had as large an impact in CSPs. Constraint weighting is a less fine-grained approach where the information learnt gives an approximation as to which variables may be the sources of greatest contention. In this work we present two methods for learning from search using restarts, in order to identify these critical variables prior to solving. Both methods are based on the conflict-directed heuristic (weighted-degree heuristic) introduced by Boussemart et al. and are aimed at producing a better-informed version of the heuristic by gathering information through restarting and probing of the search space prior to solving, while minimizing the overhead of these restarts. We further examine the impact of different sampling strategies and different measurements of contention, and assess different restarting strategies for the heuristic. Finally, two applications for constraint weighting are considered in detail: dynamic constraint satisfaction problems and unary resource scheduling problems.
Resumo:
An enterprise information system (EIS) is an integrated data-applications platform characterized by diverse, heterogeneous, and distributed data sources. For many enterprises, a number of business processes still depend heavily on static rule-based methods and extensive human expertise. Enterprises are faced with the need for optimizing operation scheduling, improving resource utilization, discovering useful knowledge, and making data-driven decisions.
This thesis research is focused on real-time optimization and knowledge discovery that addresses workflow optimization, resource allocation, as well as data-driven predictions of process-execution times, order fulfillment, and enterprise service-level performance. In contrast to prior work on data analytics techniques for enterprise performance optimization, the emphasis here is on realizing scalable and real-time enterprise intelligence based on a combination of heterogeneous system simulation, combinatorial optimization, machine-learning algorithms, and statistical methods.
On-demand digital-print service is a representative enterprise requiring a powerful EIS.We use real-life data from Reischling Press, Inc. (RPI), a digit-print-service provider (PSP), to evaluate our optimization algorithms.
In order to handle the increase in volume and diversity of demands, we first present a high-performance, scalable, and real-time production scheduling algorithm for production automation based on an incremental genetic algorithm (IGA). The objective of this algorithm is to optimize the order dispatching sequence and balance resource utilization. Compared to prior work, this solution is scalable for a high volume of orders and it provides fast scheduling solutions for orders that require complex fulfillment procedures. Experimental results highlight its potential benefit in reducing production inefficiencies and enhancing the productivity of an enterprise.
We next discuss analysis and prediction of different attributes involved in hierarchical components of an enterprise. We start from a study of the fundamental processes related to real-time prediction. Our process-execution time and process status prediction models integrate statistical methods with machine-learning algorithms. In addition to improved prediction accuracy compared to stand-alone machine-learning algorithms, it also performs a probabilistic estimation of the predicted status. An order generally consists of multiple series and parallel processes. We next introduce an order-fulfillment prediction model that combines advantages of multiple classification models by incorporating flexible decision-integration mechanisms. Experimental results show that adopting due dates recommended by the model can significantly reduce enterprise late-delivery ratio. Finally, we investigate service-level attributes that reflect the overall performance of an enterprise. We analyze and decompose time-series data into different components according to their hierarchical periodic nature, perform correlation analysis,
and develop univariate prediction models for each component as well as multivariate models for correlated components. Predictions for the original time series are aggregated from the predictions of its components. In addition to a significant increase in mid-term prediction accuracy, this distributed modeling strategy also improves short-term time-series prediction accuracy.
In summary, this thesis research has led to a set of characterization, optimization, and prediction tools for an EIS to derive insightful knowledge from data and use them as guidance for production management. It is expected to provide solutions for enterprises to increase reconfigurability, accomplish more automated procedures, and obtain data-driven recommendations or effective decisions.
Resumo:
Scheduling a set of jobs over a collection of machines to optimize a certain quality-of-service measure is one of the most important research topics in both computer science theory and practice. In this thesis, we design algorithms that optimize {\em flow-time} (or delay) of jobs for scheduling problems that arise in a wide range of applications. We consider the classical model of unrelated machine scheduling and resolve several long standing open problems; we introduce new models that capture the novel algorithmic challenges in scheduling jobs in data centers or large clusters; we study the effect of selfish behavior in distributed and decentralized environments; we design algorithms that strive to balance the energy consumption and performance.
The technically interesting aspect of our work is the surprising connections we establish between approximation and online algorithms, economics, game theory, and queuing theory. It is the interplay of ideas from these different areas that lies at the heart of most of the algorithms presented in this thesis.
The main contributions of the thesis can be placed in one of the following categories.
1. Classical Unrelated Machine Scheduling: We give the first polygorithmic approximation algorithms for minimizing the average flow-time and minimizing the maximum flow-time in the offline setting. In the online and non-clairvoyant setting, we design the first non-clairvoyant algorithm for minimizing the weighted flow-time in the resource augmentation model. Our work introduces iterated rounding technique for the offline flow-time optimization, and gives the first framework to analyze non-clairvoyant algorithms for unrelated machines.
2. Polytope Scheduling Problem: To capture the multidimensional nature of the scheduling problems that arise in practice, we introduce Polytope Scheduling Problem (\psp). The \psp problem generalizes almost all classical scheduling models, and also captures hitherto unstudied scheduling problems such as routing multi-commodity flows, routing multicast (video-on-demand) trees, and multi-dimensional resource allocation. We design several competitive algorithms for the \psp problem and its variants for the objectives of minimizing the flow-time and completion time. Our work establishes many interesting connections between scheduling and market equilibrium concepts, fairness and non-clairvoyant scheduling, and queuing theoretic notion of stability and resource augmentation analysis.
3. Energy Efficient Scheduling: We give the first non-clairvoyant algorithm for minimizing the total flow-time + energy in the online and resource augmentation model for the most general setting of unrelated machines.
4. Selfish Scheduling: We study the effect of selfish behavior in scheduling and routing problems. We define a fairness index for scheduling policies called {\em bounded stretch}, and show that for the objective of minimizing the average (weighted) completion time, policies with small stretch lead to equilibrium outcomes with small price of anarchy. Our work gives the first linear/ convex programming duality based framework to bound the price of anarchy for general equilibrium concepts such as coarse correlated equilibrium.
Resumo:
Review of: Peter Reimann and Hans Spada (eds), Learning in Humans and Machines: Towards an Interdisciplinary Learning Science, Pergamon. (1995). ISBN: 978-0080425696
Resumo:
The factors that are driving the development and use of grids and grid computing, such as size, dynamic features, distribution and heterogeneity, are also pushing to the forefront service quality issues. These include performance, reliability and security. Although grid middleware can address some of these issues on a wider scale, it has also become imperative to ensure adequate service provision at local level. Load sharing in clusters can contribute to the provision of a high quality service, by exploiting both static and dynamic information. This paper is concerned with the presentation of a load sharing scheme, that can satisfy grid computing requirements. It follows a proactive, non preemptive and distributed approach. Load information is gathered continuously before it is needed, and a task is allocated to the most appropriate node for execution. Performance and reliability are enhanced by the decentralised nature of the scheme and the symmetric roles of the nodes. In addition, the scheme exhibits transparency characteristics that facilitate integration with the grid.
Resumo:
Computer equipment, once viewed as leading edge, is quickly condemned as obsolete and banished to basement store rooms or rubbish bins. The magpie instincts of some of the academics and technicians at the University of Greenwich, London, preserved some such relics in cluttered offices and garages to the dismay of colleagues and partners. When the University moved into its new campus in the historic buildings of the Old Royal Naval College in the center of Greenwich, corridor space in King William Court provided an opportunity to display some of this equipment so that students could see these objects and gain a more vivid appreciation of their subject's history.
Resumo:
Computational Fluid Dynamics (CFD) is gradually becoming a powerful and almost essential tool for the design, development and optimization of engineering applications. However the mathematical modelling of the erratic turbulent motion remains the key issue when tackling such flow phenomena. The reliability of CFD analysis depends heavily on the turbulence model employed together with the wall functions implemented. In order to resolve the abrupt changes in the turbulent energy and other parameters situated at near wall regions a particularly fine mesh is necessary which inevitably increases the computer storage and run-time requirements. Turbulence modelling can be considered to be one of the three key elements in CFD. Precise mathematical theories have evolved for the other two key elements, grid generation and algorithm development. The principal objective of turbulence modelling is to enhance computational procedures of efficient accuracy to reproduce the main structures of three dimensional fluid flows. The flow within an electronic system can be characterized as being in a transitional state due to the low velocities and relatively small dimensions encountered. This paper presents simulated CFD results for an investigation into the predictive capability of turbulence models when considering both fluid flow and heat transfer phenomena. Also a new two-layer hybrid kε / kl turbulence model for electronic application areas will be presented which holds the advantages of being cheap in terms of the computational mesh required and is also economical with regards to run-time.
Resumo:
Distributed applications are being deployed on ever-increasing scale and with ever-increasing functionality. Due to the accompanying increase in behavioural complexity, self-management abilities, such as self-healing, have become core requirements. A key challenge is the smooth embedding of such functionality into our systems. Natural distributed systems such as ant colonies have evolved highly efficient behaviour. These emergent systems achieve high scalability through the use of low complexity communication strategies and are highly robust through large-scale replication of simple, anonymous entities. Ways to engineer this fundamentally non-deterministic behaviour for use in distributed applications are being explored. An emergent, dynamic, cluster management scheme, which forms part of a hierarchical resource management architecture, is presented. Natural biological systems, which embed self-healing behaviour at several levels, have influenced the architecture. The resulting system is a simple, lightweight and highly robust platform on which cluster-based autonomic applications can be deployed.
Proposed methodology for the use of computer simulation to enhance aircraft evacuation certification
Resumo:
In this paper a methodology for the application of computer simulation to evacuation certification of aircraft is suggested. This involves the use of computer simulation, historic certification data, component testing, and full-scale certification trials. The methodology sets out a framework for how computer simulation should be undertaken in a certification environment and draws on experience from both the marine and building industries. In addition, a phased introduction of computer models to certification is suggested. This involves as a first step the use of computer simulation in conjunction with full-scale testing. The combination of full-scale trial, computer simulation (and if necessary component testing) provides better insight into aircraft evacuation performance capabilities by generating a performance probability distribution rather than a single datum. Once further confidence in the technique is established the requirement for the full-scale demonstration could be dropped. The second step in the adoption of computer simulation for certification involves the introduction of several scenarios based on, for example, exit availability, instructed by accident analysis. The final step would be the introduction of more realistic accident scenarios. This would require the continued development of aircraft evacuation modeling technology to include additional behavioral features common in real accident scenarios.