850 resultados para parallel scalability
Resumo:
This research focuses on automatically adapting a search engine size in response to fluctuations in query workload. Deploying a search engine in an Infrastructure as a Service (IaaS) cloud facilitates allocating or deallocating computer resources to or from the engine. Our solution is to contribute an adaptive search engine that will repeatedly re-evaluate its load and, when appropriate, switch over to a dierent number of active processors. We focus on three aspects and break them out into three sub-problems as follows: Continually determining the Number of Processors (CNP), New Grouping Problem (NGP) and Regrouping Order Problem (ROP). CNP means that (in the light of the changes in the query workload in the search engine) there is a problem of determining the ideal number of processors p active at any given time to use in the search engine and we call this problem CNP. NGP happens when changes in the number of processors are determined and it must also be determined which groups of search data will be distributed across the processors. ROP is how to redistribute this data onto processors while keeping the engine responsive and while also minimising the switchover time and the incurred network load. We propose solutions for these sub-problems. For NGP we propose an algorithm for incrementally adjusting the index to t the varying number of virtual machines. For ROP we present an ecient method for redistributing data among processors while keeping the search engine responsive. Regarding the solution for CNP, we propose an algorithm determining the new size of the search engine by re-evaluating its load. We tested the solution performance using a custom-build prototype search engine deployed in the Amazon EC2 cloud. Our experiments show that when we compare our NGP solution with computing the index from scratch, the incremental algorithm speeds up the index computation 2{10 times while maintaining a similar search performance. The chosen redistribution method is 25% to 50% faster than other methods and reduces the network load around by 30%. For CNP we present a deterministic algorithm that shows a good ability to determine a new size of search engine. When combined, these algorithms give an adapting algorithm that is able to adjust the search engine size with a variable workload.
Resumo:
Optimization of adaptive traffic signal timing is one of the most complex problems in traffic control systems. This dissertation presents a new method that applies the parallel genetic algorithm (PGA) to optimize adaptive traffic signal control in the presence of transit signal priority (TSP). The method can optimize the phase plan, cycle length, and green splits at isolated intersections with consideration for the performance of both the transit and the general vehicles. Unlike the simple genetic algorithm (GA), PGA can provide better and faster solutions needed for real-time optimization of adaptive traffic signal control. ^ An important component in the proposed method involves the development of a microscopic delay estimation model that was designed specifically to optimize adaptive traffic signal with TSP. Macroscopic delay models such as the Highway Capacity Manual (HCM) delay model are unable to accurately consider the effect of phase combination and phase sequence in delay calculations. In addition, because the number of phases and the phase sequence of adaptive traffic signal may vary from cycle to cycle, the phase splits cannot be optimized when the phase sequence is also a decision variable. A "flex-phase" concept was introduced in the proposed microscopic delay estimation model to overcome these limitations. ^ The performance of PGA was first evaluated against the simple GA. The results show that PGA achieved both faster convergence and lower delay for both under- or over-saturated traffic conditions. A VISSIM simulation testbed was then developed to evaluate the performance of the proposed PGA-based adaptive traffic signal control with TSP. The simulation results show that the PGA-based optimizer for adaptive TSP outperformed the fully actuated NEMA control in all test cases. The results also show that the PGA-based optimizer was able to produce TSP timing plans that benefit the transit vehicles while minimizing the impact of TSP on the general vehicles. The VISSIM testbed developed in this research provides a powerful tool to design and evaluate different TSP strategies under both actuated and adaptive signal control. ^
Resumo:
This research is motivated by a practical application observed at a printed circuit board (PCB) manufacturing facility. After assembly, the PCBs (or jobs) are tested in environmental stress screening (ESS) chambers (or batch processing machines) to detect early failures. Several PCBs can be simultaneously tested as long as the total size of all the PCBs in the batch does not violate the chamber capacity. PCBs from different production lines arrive dynamically to a queue in front of a set of identical ESS chambers, where they are grouped into batches for testing. Each line delivers PCBs that vary in size and require different testing (or processing) times. Once a batch is formed, its processing time is the longest processing time among the PCBs in the batch, and its ready time is given by the PCB arriving last to the batch. ESS chambers are expensive and a bottleneck. Consequently, its makespan has to be minimized. ^ A mixed-integer formulation is proposed for the problem under study and compared to a formulation recently published. The proposed formulation is better in terms of the number of decision variables, linear constraints and run time. A procedure to compute the lower bound is proposed. For sparse problems (i.e. when job ready times are dispersed widely), the lower bounds are close to optimum. ^ The problem under study is NP-hard. Consequently, five heuristics, two metaheuristics (i.e. simulated annealing (SA) and greedy randomized adaptive search procedure (GRASP)), and a decomposition approach (i.e. column generation) are proposed—especially to solve problem instances which require prohibitively long run times when a commercial solver is used. Extensive experimental study was conducted to evaluate the different solution approaches based on the solution quality and run time. ^ The decomposition approach improved the lower bounds (or linear relaxation solution) of the mixed-integer formulation. At least one of the proposed heuristic outperforms the Modified Delay heuristic from the literature. For sparse problems, almost all the heuristics report a solution close to optimum. GRASP outperforms SA at a higher computational cost. The proposed approaches are viable to implement as the run time is very short. ^
Resumo:
Parallel processing is prevalent in many manufacturing and service systems. Many manufactured products are built and assembled from several components fabricated in parallel lines. An example of this manufacturing system configuration is observed at a manufacturing facility equipped to assemble and test web servers. Characteristics of a typical web server assembly line are: multiple products, job circulation, and paralleling processing. The primary objective of this research was to develop analytical approximations to predict performance measures of manufacturing systems with job failures and parallel processing. The analytical formulations extend previous queueing models used in assembly manufacturing systems in that they can handle serial and different configurations of paralleling processing with multiple product classes, and job circulation due to random part failures. In addition, appropriate correction terms via regression analysis were added to the approximations in order to minimize the gap in the error between the analytical approximation and the simulation models. Markovian and general type manufacturing systems, with multiple product classes, job circulation due to failures, and fork and join systems to model parallel processing were studied. In the Markovian and general case, the approximations without correction terms performed quite well for one and two product problem instances. However, it was observed that the flow time error increased as the number of products and net traffic intensity increased. Therefore, correction terms for single and fork-join stations were developed via regression analysis to deal with more than two products. The numerical comparisons showed that the approximations perform remarkably well when the corrections factors were used in the approximations. In general, the average flow time error was reduced from 38.19% to 5.59% in the Markovian case, and from 26.39% to 7.23% in the general case. All the equations stated in the analytical formulations were implemented as a set of Matlab scripts. By using this set, operations managers of web server assembly lines, manufacturing or other service systems with similar characteristics can estimate different system performance measures, and make judicious decisions - especially setting delivery due dates, capacity planning, and bottleneck mitigation, among others.
Resumo:
Modern geographical databases, which are at the core of geographic information systems (GIS), store a rich set of aspatial attributes in addition to geographic data. Typically, aspatial information comes in textual and numeric format. Retrieving information constrained on spatial and aspatial data from geodatabases provides GIS users the ability to perform more interesting spatial analyses, and for applications to support composite location-aware searches; for example, in a real estate database: “Find the nearest homes for sale to my current location that have backyard and whose prices are between $50,000 and $80,000”. Efficient processing of such queries require combined indexing strategies of multiple types of data. Existing spatial query engines commonly apply a two-filter approach (spatial filter followed by nonspatial filter, or viceversa), which can incur large performance overheads. On the other hand, more recently, the amount of geolocation data has grown rapidly in databases due in part to advances in geolocation technologies (e.g., GPS-enabled smartphones) that allow users to associate location data to objects or events. The latter poses potential data ingestion challenges of large data volumes for practical GIS databases. In this dissertation, we first show how indexing spatial data with R-trees (a typical data pre-processing task) can be scaled in MapReduce—a widely-adopted parallel programming model for data intensive problems. The evaluation of our algorithms in a Hadoop cluster showed close to linear scalability in building R-tree indexes. Subsequently, we develop efficient algorithms for processing spatial queries with aspatial conditions. Novel techniques for simultaneously indexing spatial with textual and numeric data are developed to that end. Experimental evaluations with real-world, large spatial datasets measured query response times within the sub-second range for most cases, and up to a few seconds for a small number of cases, which is reasonable for interactive applications. Overall, the previous results show that the MapReduce parallel model is suitable for indexing tasks in spatial databases, and the adequate combination of spatial and aspatial attribute indexes can attain acceptable response times for interactive spatial queries with constraints on aspatial data.
Resumo:
With the exponential increasing demands and uses of GIS data visualization system, such as urban planning, environment and climate change monitoring, weather simulation, hydrographic gauge and so forth, the geospatial vector and raster data visualization research, application and technology has become prevalent. However, we observe that current web GIS techniques are merely suitable for static vector and raster data where no dynamic overlaying layers. While it is desirable to enable visual explorations of large-scale dynamic vector and raster geospatial data in a web environment, improving the performance between backend datasets and the vector and raster applications remains a challenging technical issue. This dissertation is to implement these challenging and unimplemented areas: how to provide a large-scale dynamic vector and raster data visualization service with dynamic overlaying layers accessible from various client devices through a standard web browser, and how to make the large-scale dynamic vector and raster data visualization service as rapid as the static one. To accomplish these, a large-scale dynamic vector and raster data visualization geographic information system based on parallel map tiling and a comprehensive performance improvement solution are proposed, designed and implemented. They include: the quadtree-based indexing and parallel map tiling, the Legend String, the vector data visualization with dynamic layers overlaying, the vector data time series visualization, the algorithm of vector data rendering, the algorithm of raster data re-projection, the algorithm for elimination of superfluous level of detail, the algorithm for vector data gridding and re-grouping and the cluster servers side vector and raster data caching.
Resumo:
Large read-only or read-write transactions with a large read set and a small write set constitute an important class of transactions used in such applications as data mining, data warehousing, statistical applications, and report generators. Such transactions are best supported with optimistic concurrency, because locking of large amounts of data for extended periods of time is not an acceptable solution. The abort rate in regular optimistic concurrency algorithms increases exponentially with the size of the transaction. The algorithm proposed in this dissertation solves this problem by using a new transaction scheduling technique that allows a large transaction to commit safely with significantly greater probability that can exceed several orders of magnitude versus regular optimistic concurrency algorithms. A performance simulation study and a formal proof of serializability and external consistency of the proposed algorithm are also presented.^ This dissertation also proposes a new query optimization technique (lazy queries). Lazy Queries is an adaptive query execution scheme which optimizes itself as the query runs. Lazy queries can be used to find an intersection of sub-queries in a very efficient way, which does not require full execution of large sub-queries nor does it require any statistical knowledge about the data.^ An efficient optimistic concurrency control algorithm used in a massively parallel B-tree with variable-length keys is introduced. B-trees with variable-length keys can be effectively used in a variety of database types. In particular, we show how such a B-tree was used in our implementation of a semantic object-oriented DBMS. The concurrency control algorithm uses semantically safe optimistic virtual "locks" that achieve very fine granularity in conflict detection. This algorithm ensures serializability and external consistency by using logical clocks and backward validation of transactional queries. A formal proof of correctness of the proposed algorithm is also presented. ^
Resumo:
Orthogonal Frequency-Division Multiplexing (OFDM) has been proved to be a promising technology that enables the transmission of higher data rate. Multicarrier Code-Division Multiple Access (MC-CDMA) is a transmission technique which combines the advantages of both OFDM and Code-Division Multiplexing Access (CDMA), so as to allow high transmission rates over severe time-dispersive multi-path channels without the need of a complex receiver implementation. Also MC-CDMA exploits frequency diversity via the different subcarriers, and therefore allows the high code rates systems to achieve good Bit Error Rate (BER) performances. Furthermore, the spreading in the frequency domain makes the time synchronization requirement much lower than traditional direct sequence CDMA schemes. There are still some problems when we use MC-CDMA. One is the high Peak-to-Average Power Ratio (PAPR) of the transmit signal. High PAPR leads to nonlinear distortion of the amplifier and results in inter-carrier self-interference plus out-of-band radiation. On the other hand, suppressing the Multiple Access Interference (MAI) is another crucial problem in the MC-CDMA system. Imperfect cross-correlation characteristics of the spreading codes and the multipath fading destroy the orthogonality among the users, and then cause MAI, which produces serious BER degradation in the system. Moreover, in uplink system the received signals at a base station are always asynchronous. This also destroys the orthogonality among the users, and hence, generates MAI which degrades the system performance. Besides those two problems, the interference should always be considered seriously for any communication system. In this dissertation, we design a novel MC-CDMA system, which has low PAPR and mitigated MAI. The new Semi-blind channel estimation and multi-user data detection based on Parallel Interference Cancellation (PIC) have been applied in the system. The Low Density Parity Codes (LDPC) has also been introduced into the system to improve the performance. Different interference models are analyzed in multi-carrier communication systems and then the effective interference suppression for MC-CDMA systems is employed in this dissertation. The experimental results indicate that our system not only significantly reduces the PAPR and MAI but also effectively suppresses the outside interference with low complexity. Finally, we present a practical cognitive application of the proposed system over the software defined radio platform.
Resumo:
With hundreds of millions of users reporting locations and embracing mobile technologies, Location Based Services (LBSs) are raising new challenges. In this dissertation, we address three emerging problems in location services, where geolocation data plays a central role. First, to handle the unprecedented growth of generated geolocation data, existing location services rely on geospatial database systems. However, their inability to leverage combined geographical and textual information in analytical queries (e.g. spatial similarity joins) remains an open problem. To address this, we introduce SpsJoin, a framework for computing spatial set-similarity joins. SpsJoin handles combined similarity queries that involve textual and spatial constraints simultaneously. LBSs use this system to tackle different types of problems, such as deduplication, geolocation enhancement and record linkage. We define the spatial set-similarity join problem in a general case and propose an algorithm for its efficient computation. Our solution utilizes parallel computing with MapReduce to handle scalability issues in large geospatial databases. Second, applications that use geolocation data are seldom concerned with ensuring the privacy of participating users. To motivate participation and address privacy concerns, we propose iSafe, a privacy preserving algorithm for computing safety snapshots of co-located mobile devices as well as geosocial network users. iSafe combines geolocation data extracted from crime datasets and geosocial networks such as Yelp. In order to enhance iSafe's ability to compute safety recommendations, even when crime information is incomplete or sparse, we need to identify relationships between Yelp venues and crime indices at their locations. To achieve this, we use SpsJoin on two datasets (Yelp venues and geolocated businesses) to find venues that have not been reviewed and to further compute the crime indices of their locations. Our results show a statistically significant dependence between location crime indices and Yelp features. Third, review centered LBSs (e.g., Yelp) are increasingly becoming targets of malicious campaigns that aim to bias the public image of represented businesses. Although Yelp actively attempts to detect and filter fraudulent reviews, our experiments showed that Yelp is still vulnerable. Fraudulent LBS information also impacts the ability of iSafe to provide correct safety values. We take steps toward addressing this problem by proposing SpiDeR, an algorithm that takes advantage of the richness of information available in Yelp to detect abnormal review patterns. We propose a fake venue detection solution that applies SpsJoin on Yelp and U.S. housing datasets. We validate the proposed solutions using ground truth data extracted by our experiments and reviews filtered by Yelp.
Resumo:
The Center for Humanities in an Urban Environment presents a screening of the WPBT South Florida Public Television Documentary "Parallel Lives". Afterward, the central figures of the film Mrs Arva Moore Parks and Dr Dorothy Jenkins Fields joined the audience for a talk back. Event held on August 31, 2011 at the Green Library, Modesto Maidique Campus, Florida International University.
Resumo:
This thesis explores the relationship of architecture and water through the design of an urban spa that offers both a bodily and a poetic experience of water. Research included investigation of recent architectural projects that enhance and order the view, sound, and touch of water as well as projects that integrate fountains, showers and reflecting pools into the experience of a building. In the design of the spa, the movement of water was based metaphorically on the natural water cycle: evaporation, condensation and collection of water in pools. The building presents fountains, rivulets, and pools in a descending sequence that represents the natural flow of water. The temperature of water and the activities of the spa follow the same descending sequence, progressing from a warm water bath at the top of the building to cool swimming pool at the plaza level in a contemporary interpretation of the experience of a Roman Bath.
Resumo:
Orthogonal Frequency-Division Multiplexing (OFDM) has been proved to be a promising technology that enables the transmission of higher data rate. Multicarrier Code-Division Multiple Access (MC-CDMA) is a transmission technique which combines the advantages of both OFDM and Code-Division Multiplexing Access (CDMA), so as to allow high transmission rates over severe time-dispersive multi-path channels without the need of a complex receiver implementation. Also MC-CDMA exploits frequency diversity via the different subcarriers, and therefore allows the high code rates systems to achieve good Bit Error Rate (BER) performances. Furthermore, the spreading in the frequency domain makes the time synchronization requirement much lower than traditional direct sequence CDMA schemes. There are still some problems when we use MC-CDMA. One is the high Peak-to-Average Power Ratio (PAPR) of the transmit signal. High PAPR leads to nonlinear distortion of the amplifier and results in inter-carrier self-interference plus out-of-band radiation. On the other hand, suppressing the Multiple Access Interference (MAI) is another crucial problem in the MC-CDMA system. Imperfect cross-correlation characteristics of the spreading codes and the multipath fading destroy the orthogonality among the users, and then cause MAI, which produces serious BER degradation in the system. Moreover, in uplink system the received signals at a base station are always asynchronous. This also destroys the orthogonality among the users, and hence, generates MAI which degrades the system performance. Besides those two problems, the interference should always be considered seriously for any communication system. In this dissertation, we design a novel MC-CDMA system, which has low PAPR and mitigated MAI. The new Semi-blind channel estimation and multi-user data detection based on Parallel Interference Cancellation (PIC) have been applied in the system. The Low Density Parity Codes (LDPC) has also been introduced into the system to improve the performance. Different interference models are analyzed in multi-carrier communication systems and then the effective interference suppression for MC-CDMA systems is employed in this dissertation. The experimental results indicate that our system not only significantly reduces the PAPR and MAI but also effectively suppresses the outside interference with low complexity. Finally, we present a practical cognitive application of the proposed system over the software defined radio platform.
Resumo:
Parallel processing is prevalent in many manufacturing and service systems. Many manufactured products are built and assembled from several components fabricated in parallel lines. An example of this manufacturing system configuration is observed at a manufacturing facility equipped to assemble and test web servers. Characteristics of a typical web server assembly line are: multiple products, job circulation, and paralleling processing. The primary objective of this research was to develop analytical approximations to predict performance measures of manufacturing systems with job failures and parallel processing. The analytical formulations extend previous queueing models used in assembly manufacturing systems in that they can handle serial and different configurations of paralleling processing with multiple product classes, and job circulation due to random part failures. In addition, appropriate correction terms via regression analysis were added to the approximations in order to minimize the gap in the error between the analytical approximation and the simulation models. Markovian and general type manufacturing systems, with multiple product classes, job circulation due to failures, and fork and join systems to model parallel processing were studied. In the Markovian and general case, the approximations without correction terms performed quite well for one and two product problem instances. However, it was observed that the flow time error increased as the number of products and net traffic intensity increased. Therefore, correction terms for single and fork-join stations were developed via regression analysis to deal with more than two products. The numerical comparisons showed that the approximations perform remarkably well when the corrections factors were used in the approximations. In general, the average flow time error was reduced from 38.19% to 5.59% in the Markovian case, and from 26.39% to 7.23% in the general case. All the equations stated in the analytical formulations were implemented as a set of Matlab scripts. By using this set, operations managers of web server assembly lines, manufacturing or other service systems with similar characteristics can estimate different system performance measures, and make judicious decisions - especially setting delivery due dates, capacity planning, and bottleneck mitigation, among others.
Resumo:
The reverse time migration algorithm (RTM) has been widely used in the seismic industry to generate images of the underground and thus reduce the risk of oil and gas exploration. Its widespread use is due to its high quality in underground imaging. The RTM is also known for its high computational cost. Therefore, parallel computing techniques have been used in their implementations. In general, parallel approaches for RTM use a coarse granularity by distributing the processing of a subset of seismic shots among nodes of distributed systems. Parallel approaches with coarse granularity for RTM have been shown to be very efficient since the processing of each seismic shot can be performed independently. For this reason, RTM algorithm performance can be considerably improved by using a parallel approach with finer granularity for the processing assigned to each node. This work presents an efficient parallel algorithm for 3D reverse time migration with fine granularity using OpenMP. The propagation algorithm of 3D acoustic wave makes up much of the RTM. Different load balancing were analyzed in order to minimize possible losses parallel performance at this stage. The results served as a basis for the implementation of other phases RTM: backpropagation and imaging condition. The proposed algorithm was tested with synthetic data representing some of the possible underground structures. Metrics such as speedup and efficiency were used to analyze its parallel performance. The migrated sections show that the algorithm obtained satisfactory performance in identifying subsurface structures. As for the parallel performance, the analysis clearly demonstrate the scalability of the algorithm achieving a speedup of 22.46 for the propagation of the wave and 16.95 for the RTM, both with 24 threads.
Resumo:
The main focus of this research is to design and develop a high performance linear actuator based on a four bar mechanism. The present work includes the detailed analysis (kinematics and dynamics), design, implementation and experimental validation of the newly designed actuator. High performance is characterized by the acceleration of the actuator end effector. The principle of the newly designed actuator is to network the four bar rhombus configuration (where some bars are extended to form an X shape) to attain high acceleration. Firstly, a detailed kinematic analysis of the actuator is presented and kinematic performance is evaluated through MATLAB simulations. A dynamic equation of the actuator is achieved by using the Lagrangian dynamic formulation. A SIMULINK control model of the actuator is developed using the dynamic equation. In addition, Bond Graph methodology is presented for the dynamic simulation. The Bond Graph model comprises individual component modeling of the actuator along with control. Required torque was simulated using the Bond Graph model. Results indicate that, high acceleration (around 20g) can be achieved with modest (3 N-m or less) torque input. A practical prototype of the actuator is designed using SOLIDWORKS and then produced to verify the proof of concept. The design goal was to achieve the peak acceleration of more than 10g at the middle point of the travel length, when the end effector travels the stroke length (around 1 m). The actuator is primarily designed to operate in standalone condition and later to use it in the 3RPR parallel robot. A DC motor is used to operate the actuator. A quadrature encoder is attached with the DC motor to control the end effector. The associated control scheme of the actuator is analyzed and integrated with the physical prototype. From standalone experimentation of the actuator, around 17g acceleration was achieved by the end effector (stroke length was 0.2m to 0.78m). Results indicate that the developed dynamic model results are in good agreement. Finally, a Design of Experiment (DOE) based statistical approach is also introduced to identify the parametric combination that yields the greatest performance. Data are collected by using the Bond Graph model. This approach is helpful in designing the actuator without much complexity.