18 resultados para Running Lamps.
em Boston University Digital Common
Resumo:
DSpace is an open source software platform that enables organizations to: - Capture and describe digital material using a submission workflow module, or a variety of programmatic ingest options - Distribute an organization's digital assets over the web through a search and retrieval system - Preserve digital assets over the long term This system documentation includes a functional overview of the system, which is a good introduction to the capabilities of the system, and should be readable by nontechnical personnel. Everyone should read this section first because it introduces some terminology used throughout the rest of the documentation. For people actually running a DSpace service, there is an installation guide, and sections on configuration and the directory structure. Note that as of DSpace 1.2, the administration user interface guide is now on-line help available from within the DSpace system. Finally, for those interested in the details of how DSpace works, and those potentially interested in modifying the code for their own purposes, there is a detailed architecture and design section.
Resumo:
Implementations are presented of two common algorithms for integer factorization, Pollard’s “p – 1” method and the SQUFOF method. The algorithms are implemented in the F# language, a functional programming language developed by Microsoft and officially released for the first time in 2010. The algorithms are thoroughly tested on a set of large integers (up to 64 bits in size), running both on a physical machine and a Windows Azure machine instance. Analysis of the relative performance between the two environments indicates comparable performance when taking into account the difference in computing power. Further analysis reveals that the relative performance of the Azure implementation tends to improve as the magnitudes of the integers increase, indicating that such an approach may be suitable for larger, more complex factorization tasks. Finally, several questions are presented for future research, including the performance of F# and related languages for more efficient, parallelizable algorithms, and the relative cost and performance of factorization algorithms in various environments, including physical hardware and commercial cloud computing offerings from the various vendors in the industry.
Resumo:
For communication-intensive parallel applications, the maximum degree of concurrency achievable is limited by the communication throughput made available by the network. In previous work [HPS94], we showed experimentally that the performance of certain parallel applications running on a workstation network can be improved significantly if a congestion control protocol is used to enhance network performance. In this paper, we characterize and analyze the communication requirements of a large class of supercomputing applications that fall under the category of fixed-point problems, amenable to solution by parallel iterative methods. This results in a set of interface and architectural features sufficient for the efficient implementation of the applications over a large-scale distributed system. In particular, we propose a direct link between the application and network layer, supporting congestion control actions at both ends. This in turn enhances the system's responsiveness to network congestion, improving performance. Measurements are given showing the efficacy of our scheme to support large-scale parallel computations.
Resumo:
Wireless sensor networks have recently emerged as enablers of important applications such as environmental, chemical and nuclear sensing systems. Such applications have sophisticated spatial-temporal semantics that set them aside from traditional wireless networks. For example, the computation of temperature averaged over the sensor field must take into account local densities. This is crucial since otherwise the estimated average temperature can be biased by over-sampling areas where a lot more sensors exist. Thus, we envision that a fundamental service that a wireless sensor network should provide is that of estimating local densities. In this paper, we propose a lightweight probabilistic density inference protocol, we call DIP, which allows each sensor node to implicitly estimate its neighborhood size without the explicit exchange of node identifiers as in existing density discovery schemes. The theoretical basis of DIP is a probabilistic analysis which gives the relationship between the number of sensor nodes contending in the neighborhood of a node and the level of contention measured by that node. Extensive simulations confirm the premise of DIP: it can provide statistically reliable and accurate estimates of local density at a very low energy cost and constant running time. We demonstrate how applications could be built on top of our DIP-based service by computing density-unbiased statistics from estimated local densities.
Resumo:
With the increased use of "Virtual Machines" (VMs) as vehicles that isolate applications running on the same host, it is necessary to devise techniques that enable multiple VMs to share underlying resources both fairly and efficiently. To that end, one common approach is to deploy complex resource management techniques in the hosting infrastructure. Alternately, in this paper, we advocate the use of self-adaptation in the VMs themselves based on feedback about resource usage and availability. Consequently, we define a "Friendly" VM (FVM) to be a virtual machine that adjusts its demand for system resources, so that they are both efficiently and fairly allocated to competing FVMs. Such properties are ensured using one of many provably convergent control rules, such as AIMD. By adopting this distributed application-based approach to resource management, it is not necessary to make assumptions about the underlying resources nor about the requirements of FVMs competing for these resources. To demonstrate the elegance and simplicity of our approach, we present a prototype implementation of our FVM framework in User-Mode Linux (UML)-an implementation that consists of less than 500 lines of code changes to UML. We present an analytic, control-theoretic model of FVM adaptation, which establishes convergence and fairness properties. These properties are also backed up with experimental results using our prototype FVM implementation.
Resumo:
This paper addresses the problem of analyzing performance of WWW servers. The web has experienced a phenomenal growth and has become the most popular Internet application. As a consequence of its large popularity, the Internet has suffered from various performance problems, such as network congestion and overloaded servers. These days, it is not uncommon to find servers refusing connections because they are overloaded. Performance has always been a key issue in the design and operation of on-line systems. With regard to Internet, performance is also critical, because users want fast and easy access to all objects (i.e., documents, pictures, audio, and video) available on the net. Thus, it is important to understand WWW performance issues. This paper focuses on the performance analysis of a Web server. Using a synthetic benchmark (WebStone), we analyze three different Web server software running on top of a Windows NT platform and performing some typical WWW tasks.
Resumo:
Server performance has become a crucial issue for improving the overall performance of the World-Wide Web. This paper describes Webmonitor, a tool for evaluating and understanding server performance, and presents new results for a realistic workload. Webmonitor measures activity and resource consumption, both within the kernel and in HTTP processes running in user space. Webmonitor is implemented using an efficient combination of sampling and event-driven techniques that exhibit low overhead. Our initial implementation is for the Apache World-Wide Web server running on the Linux operating system. We demonstrate the utility of Webmonitor by measuring and understanding the performance of a Pentium-based PC acting as a dedicated WWW server. Our workload uses a file size distribution with a heavy tail. This captures the fact that Web servers must concurrently handle some requests for large audio and video files, and a large number of requests for small documents, containing text or images. Our results show that in a Web server saturated by client requests, over 90% of the time spent handling HTTP requests is spent in the kernel. Furthermore, keeping TCP connections open, as required by TCP, causes a factor of 2-9 increase in the elapsed time required to service an HTTP request. Data gathered from Webmonitor provide insight into the causes of this performance penalty. Specifically, we observe a significant increase in resource consumption along three dimensions: the number of HTTP processes running at the same time, CPU utilization, and memory utilization. These results emphasize the important role of operating system and network protocol implementation in determining Web server performance.
Resumo:
ImageRover is a search by image content navigation tool for the world wide web. To gather images expediently, the image collection subsystem utilizes a distributed fleet of WWW robots running on different computers. The image robots gather information about the images they find, computing the appropriate image decompositions and indices, and store this extracted information in vector form for searches based on image content. At search time, users can iteratively guide the search through the selection of relevant examples. Search performance is made efficient through the use of an approximate, optimized k-d tree algorithm. The system employs a novel relevance feedback algorithm that selects the distance metrics appropriate for a particular query.
Resumo:
This paper examines how and why web server performance changes as the workload at the server varies. We measure the performance of a PC acting as a standalone web server, running Apache on top of Linux. We use two important tools to understand what aspects of software architecture and implementation determine performance at the server. The first is a tool that we developed, called WebMonitor, which measures activity and resource consumption, both in the operating system and in the web server. The second is the kernel profiling facility distributed as part of Linux. We vary the workload at the server along two important dimensions: the number of clients concurrently accessing the server, and the size of the documents stored on the server. Our results quantify and show how more clients and larger files stress the web server and operating system in different and surprising ways. Our results also show the importance of fixed costs (i.e., opening and closing TCP connections, and updating the server log) in determining web server performance.
Resumo:
ImageRover is a search by image content navigation tool for the world wide web. The staggering size of the WWW dictates certain strategies and algorithms for image collection, digestion, indexing, and user interface. This paper describes two key components of the ImageRover strategy: image digestion and relevance feedback. Image digestion occurs during image collection; robots digest the images they find, computing image decompositions and indices, and storing this extracted information in vector form for searches based on image content. Relevance feedback occurs during index search; users can iteratively guide the search through the selection of relevant examples. ImageRover employs a novel relevance feedback algorithm to determine the weighted combination of image similarity metrics appropriate for a particular query. ImageRover is available and running on the web site.
Resumo:
We consider a Delay Tolerant Network (DTN) whose users (nodes) are connected by an underlying Mobile Ad hoc Network (MANET) substrate. Users can declaratively express high-level policy constraints on how "content" should be routed. For example, content may be diverted through an intermediary DTN node for the purposes of preprocessing, authentication, etc. To support such capability, we implement Predicate Routing [7] where high-level constraints of DTN nodes are mapped into low-level routing predicates at the MANET level. Our testbed uses a Linux system architecture and leverages User Mode Linux [2] to emulate every node running a DTN Reference Implementation code [5]. In our initial prototype, we use the On Demand Distance Vector (AODV) MANET routing protocol. We use the network simulator ns-2 (ns-emulation version) to simulate the mobility and wireless connectivity of both DTN and MANET nodes. We show preliminary throughput results showing the efficient and correct operation of propagating routing predicates, and as a side effect, the performance benefit of content re-routing that dynamically (on-demand) breaks the underlying end-to-end TCP connection into shorter-length TCP connections.
Resumo:
The problem of discovering frequent poly-regions (i.e. regions of high occurrence of a set of items or patterns of a given alphabet) in a sequence is studied, and three efficient approaches are proposed to solve it. The first one is entropy-based and applies a recursive segmentation technique that produces a set of candidate segments which may potentially lead to a poly-region. The key idea of the second approach is the use of a set of sliding windows over the sequence. Each sliding window covers a sequence segment and keeps a set of statistics that mainly include the number of occurrences of each item or pattern in that segment. Combining these statistics efficiently yields the complete set of poly-regions in the given sequence. The third approach applies a technique based on the majority vote, achieving linear running time with a minimal number of false negatives. After identifying the poly-regions, the sequence is converted to a sequence of labeled intervals (each one corresponding to a poly-region). An efficient algorithm for mining frequent arrangements of intervals is applied to the converted sequence to discover frequently occurring arrangements of poly-regions in different parts of DNA, including coding regions. The proposed algorithms are tested on various DNA sequences producing results of significant biological meaning.
Resumo:
Attributing a dollar value to a keyword is an essential part of running any profitable search engine advertising campaign. When an advertiser has complete control over the interaction with and monetization of each user arriving on a given keyword, the value of that term can be accurately tracked. However, in many instances, the advertiser may monetize arrivals indirectly through one or more third parties. In such cases, it is typical for the third party to provide only coarse-grained reporting: rather than report each monetization event, users are aggregated into larger channels and the third party reports aggregate information such as total daily revenue for each channel. Examples of third parties that use channels include Amazon and Google AdSense. In such scenarios, the number of channels is generally much smaller than the number of keywords whose value per click (VPC) we wish to learn. However, the advertiser has flexibility as to how to assign keywords to channels over time. We introduce the channelization problem: how do we adaptively assign keywords to channels over the course of multiple days to quickly obtain accurate VPC estimates of all keywords? We relate this problem to classical results in weighing design, devise new adaptive algorithms for this problem, and quantify the performance of these algorithms experimentally. Our results demonstrate that adaptive weighing designs that exploit statistics of term frequency, variability in VPCs across keywords, and flexible channel assignments over time provide the best estimators of keyword VPCs.
Resumo:
A mechanism is proposed that integrates low-level (image processing), mid-level (recursive 3D trajectory estimation), and high-level (action recognition) processes. It is assumed that the system observes multiple moving objects via a single, uncalibrated video camera. A novel extended Kalman filter formulation is used in estimating the relative 3D motion trajectories up to a scale factor. The recursive estimation process provides a prediction and error measure that is exploited in higher-level stages of action recognition. Conversely, higher-level mechanisms provide feedback that allows the system to reliably segment and maintain the tracking of moving objects before, during, and after occlusion. The 3D trajectory, occlusion, and segmentation information are utilized in extracting stabilized views of the moving object. Trajectory-guided recognition (TGR) is proposed as a new and efficient method for adaptive classification of action. The TGR approach is demonstrated using "motion history images" that are then recognized via a mixture of Gaussian classifier. The system was tested in recognizing various dynamic human outdoor activities; e.g., running, walking, roller blading, and cycling. Experiments with synthetic data sets are used to evaluate stability of the trajectory estimator with respect to noise.
Resumo:
A combined 2D, 3D approach is presented that allows for robust tracking of moving people and recognition of actions. It is assumed that the system observes multiple moving objects via a single, uncalibrated video camera. Low-level features are often insufficient for detection, segmentation, and tracking of non-rigid moving objects. Therefore, an improved mechanism is proposed that integrates low-level (image processing), mid-level (recursive 3D trajectory estimation), and high-level (action recognition) processes. A novel extended Kalman filter formulation is used in estimating the relative 3D motion trajectories up to a scale factor. The recursive estimation process provides a prediction and error measure that is exploited in higher-level stages of action recognition. Conversely, higher-level mechanisms provide feedback that allows the system to reliably segment and maintain the tracking of moving objects before, during, and after occlusion. The 3D trajectory, occlusion, and segmentation information are utilized in extracting stabilized views of the moving object that are then used as input to action recognition modules. Trajectory-guided recognition (TGR) is proposed as a new and efficient method for adaptive classification of action. The TGR approach is demonstrated using "motion history images" that are then recognized via a mixture-of-Gaussians classifier. The system was tested in recognizing various dynamic human outdoor activities: running, walking, roller blading, and cycling. Experiments with real and synthetic data sets are used to evaluate stability of the trajectory estimator with respect to noise.