9 resultados para parallel computation model

em Digital Commons at Florida International University


Relevância:

80.00% 80.00%

Publicador:

Resumo:

As massive data sets become increasingly available, people are facing the problem of how to effectively process and understand these data. Traditional sequential computing models are giving way to parallel and distributed computing models, such as MapReduce, both due to the large size of the data sets and their high dimensionality. This dissertation, as in the same direction of other researches that are based on MapReduce, tries to develop effective techniques and applications using MapReduce that can help people solve large-scale problems. Three different problems are tackled in the dissertation. The first one deals with processing terabytes of raster data in a spatial data management system. Aerial imagery files are broken into tiles to enable data parallel computation. The second and third problems deal with dimension reduction techniques that can be used to handle data sets of high dimensionality. Three variants of the nonnegative matrix factorization technique are scaled up to factorize matrices of dimensions in the order of millions in MapReduce based on different matrix multiplication implementations. Two algorithms, which compute CANDECOMP/PARAFAC and Tucker tensor decompositions respectively, are parallelized in MapReduce based on carefully partitioning the data and arranging the computation to maximize data locality and parallelism.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Modern geographical databases, which are at the core of geographic information systems (GIS), store a rich set of aspatial attributes in addition to geographic data. Typically, aspatial information comes in textual and numeric format. Retrieving information constrained on spatial and aspatial data from geodatabases provides GIS users the ability to perform more interesting spatial analyses, and for applications to support composite location-aware searches; for example, in a real estate database: “Find the nearest homes for sale to my current location that have backyard and whose prices are between $50,000 and $80,000”. Efficient processing of such queries require combined indexing strategies of multiple types of data. Existing spatial query engines commonly apply a two-filter approach (spatial filter followed by nonspatial filter, or viceversa), which can incur large performance overheads. On the other hand, more recently, the amount of geolocation data has grown rapidly in databases due in part to advances in geolocation technologies (e.g., GPS-enabled smartphones) that allow users to associate location data to objects or events. The latter poses potential data ingestion challenges of large data volumes for practical GIS databases. In this dissertation, we first show how indexing spatial data with R-trees (a typical data pre-processing task) can be scaled in MapReduce—a widely-adopted parallel programming model for data intensive problems. The evaluation of our algorithms in a Hadoop cluster showed close to linear scalability in building R-tree indexes. Subsequently, we develop efficient algorithms for processing spatial queries with aspatial conditions. Novel techniques for simultaneously indexing spatial with textual and numeric data are developed to that end. Experimental evaluations with real-world, large spatial datasets measured query response times within the sub-second range for most cases, and up to a few seconds for a small number of cases, which is reasonable for interactive applications. Overall, the previous results show that the MapReduce parallel model is suitable for indexing tasks in spatial databases, and the adequate combination of spatial and aspatial attribute indexes can attain acceptable response times for interactive spatial queries with constraints on aspatial data.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This research investigates a new structural system utilising modular construction. Five-sided boxes are cast on-site and stacked together to form a building. An analytical model was created of a typical building in each of two different analysis programs utilising the finite element method (Robot Millennium and ETABS). The pros and cons of both Robot Millennium and ETABS are listed at several key stages in the development of an analytical model utilising this structural system. Robot Millennium was initially utilised but created an analytical model too large to be successfully run. The computation requirements were too large for conventional computers. Therefore Robot Millennium was abandoned in favour of ETABS, whose more simplistic algorithms and assumptions permitted running this large computation model. Tips are provided as well as pitfalls signalled throughout the process of modelling such complex buildings of this type. ^ The building under high seismic loading required a new horizontal shear mechanism. This dissertation has proposed to create a secondary floor that ties to the modular box through the use of gunwales, and roughened surfaces with epoxy coatings. In addition, vertical connections necessitated a new type of shear wall. These shear walls consisted of waffled external walls tied through both reinforcement and a secondary concrete pour. ^ This structural system has generated a new building which was found to be very rigid compared to a conventional structure. The proposed modular building exhibited a period of 1.27 seconds, which is about one-fifth of a conventional building. The maximum lateral drift occurs under seismic loading with a magnitude of 6.14 inches which is one-quarter of a conventional building's drift. The deflected shape and pattern of the interstorey drifts are consistent with those of a coupled shear wall building. In conclusion, the computer analysis indicate that this new structure exceeds current code requirements for both hurricane winds and high seismic loads, and concomitantly provides a shortened construction time with reduced funding. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

3D geographic information system (GIS) is data and computation intensive in nature. Internet users are usually equipped with low-end personal computers and network connections of limited bandwidth. Data reduction and performance optimization techniques are of critical importance in quality of service (QoS) management for online 3D GIS. In this research, QoS management issues regarding distributed 3D GIS presentation were studied to develop 3D TerraFly, an interactive 3D GIS that supports high quality online terrain visualization and navigation. ^ To tackle the QoS management challenges, multi-resolution rendering model, adaptive level of detail (LOD) control and mesh simplification algorithms were proposed to effectively reduce the terrain model complexity. The rendering model is adaptively decomposed into sub-regions of up-to-three detail levels according to viewing distance and other dynamic quality measurements. The mesh simplification algorithm was designed as a hybrid algorithm that combines edge straightening and quad-tree compression to reduce the mesh complexity by removing geometrically redundant vertices. The main advantage of this mesh simplification algorithm is that grid mesh can be directly processed in parallel without triangulation overhead. Algorithms facilitating remote accessing and distributed processing of volumetric GIS data, such as data replication, directory service, request scheduling, predictive data retrieving and caching were also proposed. ^ A prototype of the proposed 3D TerraFly implemented in this research demonstrates the effectiveness of our proposed QoS management framework in handling interactive online 3D GIS. The system implementation details and future directions of this research are also addressed in this thesis. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Today's wireless networks rely mostly on infrastructural support for their operation. With the concept of ubiquitous computing growing more popular, research on infrastructureless networks have been rapidly growing. However, such types of networks face serious security challenges when deployed. This dissertation focuses on designing a secure routing solution and trust modeling for these infrastructureless networks. ^ The dissertation presents a trusted routing protocol that is capable of finding a secure end-to-end route in the presence of malicious nodes acting either independently or in collusion, The solution protects the network from active internal attacks, known to be the most severe types of attacks in an ad hoc application. Route discovery is based on trust levels of the nodes, which need to be dynamically computed to reflect the malicious behavior in the network. As such, we have developed a trust computational model in conjunction with the secure routing protocol that analyzes the different malicious behavior and quantifies them in the model itself. Our work is the first step towards protecting an ad hoc network from colluding internal attack. To demonstrate the feasibility of the approach, extensive simulation has been carried out to evaluate the protocol efficiency and scalability with both network size and mobility. ^ This research has laid the foundation for developing a variety of techniques that will permit people to justifiably trust the use of ad hoc networks to perform critical functions, as well as to process sensitive information without depending on any infrastructural support and hence will enhance the use of ad hoc applications in both military and civilian domains. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Optimization of adaptive traffic signal timing is one of the most complex problems in traffic control systems. This dissertation presents a new method that applies the parallel genetic algorithm (PGA) to optimize adaptive traffic signal control in the presence of transit signal priority (TSP). The method can optimize the phase plan, cycle length, and green splits at isolated intersections with consideration for the performance of both the transit and the general vehicles. Unlike the simple genetic algorithm (GA), PGA can provide better and faster solutions needed for real-time optimization of adaptive traffic signal control. ^ An important component in the proposed method involves the development of a microscopic delay estimation model that was designed specifically to optimize adaptive traffic signal with TSP. Macroscopic delay models such as the Highway Capacity Manual (HCM) delay model are unable to accurately consider the effect of phase combination and phase sequence in delay calculations. In addition, because the number of phases and the phase sequence of adaptive traffic signal may vary from cycle to cycle, the phase splits cannot be optimized when the phase sequence is also a decision variable. A "flex-phase" concept was introduced in the proposed microscopic delay estimation model to overcome these limitations. ^ The performance of PGA was first evaluated against the simple GA. The results show that PGA achieved both faster convergence and lower delay for both under- or over-saturated traffic conditions. A VISSIM simulation testbed was then developed to evaluate the performance of the proposed PGA-based adaptive traffic signal control with TSP. The simulation results show that the PGA-based optimizer for adaptive TSP outperformed the fully actuated NEMA control in all test cases. The results also show that the PGA-based optimizer was able to produce TSP timing plans that benefit the transit vehicles while minimizing the impact of TSP on the general vehicles. The VISSIM testbed developed in this research provides a powerful tool to design and evaluate different TSP strategies under both actuated and adaptive signal control. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Parallel processing is prevalent in many manufacturing and service systems. Many manufactured products are built and assembled from several components fabricated in parallel lines. An example of this manufacturing system configuration is observed at a manufacturing facility equipped to assemble and test web servers. Characteristics of a typical web server assembly line are: multiple products, job circulation, and paralleling processing. The primary objective of this research was to develop analytical approximations to predict performance measures of manufacturing systems with job failures and parallel processing. The analytical formulations extend previous queueing models used in assembly manufacturing systems in that they can handle serial and different configurations of paralleling processing with multiple product classes, and job circulation due to random part failures. In addition, appropriate correction terms via regression analysis were added to the approximations in order to minimize the gap in the error between the analytical approximation and the simulation models. Markovian and general type manufacturing systems, with multiple product classes, job circulation due to failures, and fork and join systems to model parallel processing were studied. In the Markovian and general case, the approximations without correction terms performed quite well for one and two product problem instances. However, it was observed that the flow time error increased as the number of products and net traffic intensity increased. Therefore, correction terms for single and fork-join stations were developed via regression analysis to deal with more than two products. The numerical comparisons showed that the approximations perform remarkably well when the corrections factors were used in the approximations. In general, the average flow time error was reduced from 38.19% to 5.59% in the Markovian case, and from 26.39% to 7.23% in the general case. All the equations stated in the analytical formulations were implemented as a set of Matlab scripts. By using this set, operations managers of web server assembly lines, manufacturing or other service systems with similar characteristics can estimate different system performance measures, and make judicious decisions - especially setting delivery due dates, capacity planning, and bottleneck mitigation, among others.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Parallel processing is prevalent in many manufacturing and service systems. Many manufactured products are built and assembled from several components fabricated in parallel lines. An example of this manufacturing system configuration is observed at a manufacturing facility equipped to assemble and test web servers. Characteristics of a typical web server assembly line are: multiple products, job circulation, and paralleling processing. The primary objective of this research was to develop analytical approximations to predict performance measures of manufacturing systems with job failures and parallel processing. The analytical formulations extend previous queueing models used in assembly manufacturing systems in that they can handle serial and different configurations of paralleling processing with multiple product classes, and job circulation due to random part failures. In addition, appropriate correction terms via regression analysis were added to the approximations in order to minimize the gap in the error between the analytical approximation and the simulation models. Markovian and general type manufacturing systems, with multiple product classes, job circulation due to failures, and fork and join systems to model parallel processing were studied. In the Markovian and general case, the approximations without correction terms performed quite well for one and two product problem instances. However, it was observed that the flow time error increased as the number of products and net traffic intensity increased. Therefore, correction terms for single and fork-join stations were developed via regression analysis to deal with more than two products. The numerical comparisons showed that the approximations perform remarkably well when the corrections factors were used in the approximations. In general, the average flow time error was reduced from 38.19% to 5.59% in the Markovian case, and from 26.39% to 7.23% in the general case. All the equations stated in the analytical formulations were implemented as a set of Matlab scripts. By using this set, operations managers of web server assembly lines, manufacturing or other service systems with similar characteristics can estimate different system performance measures, and make judicious decisions - especially setting delivery due dates, capacity planning, and bottleneck mitigation, among others.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In the presented thesis work, the meshfree method with distance fields was coupled with the lattice Boltzmann method to obtain solutions of fluid-structure interaction problems. The thesis work involved development and implementation of numerical algorithms, data structure, and software. Numerical and computational properties of the coupling algorithm combining the meshfree method with distance fields and the lattice Boltzmann method were investigated. Convergence and accuracy of the methodology was validated by analytical solutions. The research was focused on fluid-structure interaction solutions in complex, mesh-resistant domains as both the lattice Boltzmann method and the meshfree method with distance fields are particularly adept in these situations. Furthermore, the fluid solution provided by the lattice Boltzmann method is massively scalable, allowing extensive use of cutting edge parallel computing resources to accelerate this phase of the solution process. The meshfree method with distance fields allows for exact satisfaction of boundary conditions making it possible to exactly capture the effects of the fluid field on the solid structure.