13 resultados para HDFS bottleneck

em Digital Commons at Florida International University


Relevância:

10.00% 10.00%

Publicador:

Resumo:

This research is motivated by a practical application observed at a printed circuit board (PCB) manufacturing facility. After assembly, the PCBs (or jobs) are tested in environmental stress screening (ESS) chambers (or batch processing machines) to detect early failures. Several PCBs can be simultaneously tested as long as the total size of all the PCBs in the batch does not violate the chamber capacity. PCBs from different production lines arrive dynamically to a queue in front of a set of identical ESS chambers, where they are grouped into batches for testing. Each line delivers PCBs that vary in size and require different testing (or processing) times. Once a batch is formed, its processing time is the longest processing time among the PCBs in the batch, and its ready time is given by the PCB arriving last to the batch. ESS chambers are expensive and a bottleneck. Consequently, its makespan has to be minimized. ^ A mixed-integer formulation is proposed for the problem under study and compared to a formulation recently published. The proposed formulation is better in terms of the number of decision variables, linear constraints and run time. A procedure to compute the lower bound is proposed. For sparse problems (i.e. when job ready times are dispersed widely), the lower bounds are close to optimum. ^ The problem under study is NP-hard. Consequently, five heuristics, two metaheuristics (i.e. simulated annealing (SA) and greedy randomized adaptive search procedure (GRASP)), and a decomposition approach (i.e. column generation) are proposed—especially to solve problem instances which require prohibitively long run times when a commercial solver is used. Extensive experimental study was conducted to evaluate the different solution approaches based on the solution quality and run time. ^ The decomposition approach improved the lower bounds (or linear relaxation solution) of the mixed-integer formulation. At least one of the proposed heuristic outperforms the Modified Delay heuristic from the literature. For sparse problems, almost all the heuristics report a solution close to optimum. GRASP outperforms SA at a higher computational cost. The proposed approaches are viable to implement as the run time is very short. ^

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Disk drives are the bottleneck in the processing of large amounts of data used in almost all common applications. File systems attempt to reduce this by storing data sequentially on the disk drives, thereby reducing the access latencies. Although this strategy is useful when data is retrieved sequentially, the access patterns in real world workloads is not necessarily sequential and this mismatch results in storage I/O performance degradation. This thesis demonstrates that one way to improve the storage performance is to reorganize data on disk drives in the same way in which it is mostly accessed. We identify two classes of accesses: static, where access patterns do not change over the lifetime of the data and dynamic, where access patterns frequently change over short durations of time, and propose, implement and evaluate layout strategies for each of these. Our strategies are implemented in a way that they can be seamlessly integrated or removed from the system as desired. We evaluate our layout strategies for static policies using tree-structured XML data where accesses to the storage device are mostly of two kinds—parent-to-child or child-to-sibling. Our results show that for a specific class of deep-focused queries, the existing file system layout policy performs better by 5–54X. For the non-deep-focused queries, our native layout mechanism shows an improvement of 3–127X. To improve performance of the dynamic access patterns, we implement a self-optimizing storage system that performs rearranges popular block accesses on a dedicated partition based on the observed workload characteristics. Our evaluation shows an improvement of over 80% in the disk busy times over a range of workloads. These results show that applying the knowledge of data access patterns for allocation decisions can substantially improve the I/O performance.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Parallel processing is prevalent in many manufacturing and service systems. Many manufactured products are built and assembled from several components fabricated in parallel lines. An example of this manufacturing system configuration is observed at a manufacturing facility equipped to assemble and test web servers. Characteristics of a typical web server assembly line are: multiple products, job circulation, and paralleling processing. The primary objective of this research was to develop analytical approximations to predict performance measures of manufacturing systems with job failures and parallel processing. The analytical formulations extend previous queueing models used in assembly manufacturing systems in that they can handle serial and different configurations of paralleling processing with multiple product classes, and job circulation due to random part failures. In addition, appropriate correction terms via regression analysis were added to the approximations in order to minimize the gap in the error between the analytical approximation and the simulation models. Markovian and general type manufacturing systems, with multiple product classes, job circulation due to failures, and fork and join systems to model parallel processing were studied. In the Markovian and general case, the approximations without correction terms performed quite well for one and two product problem instances. However, it was observed that the flow time error increased as the number of products and net traffic intensity increased. Therefore, correction terms for single and fork-join stations were developed via regression analysis to deal with more than two products. The numerical comparisons showed that the approximations perform remarkably well when the corrections factors were used in the approximations. In general, the average flow time error was reduced from 38.19% to 5.59% in the Markovian case, and from 26.39% to 7.23% in the general case. All the equations stated in the analytical formulations were implemented as a set of Matlab scripts. By using this set, operations managers of web server assembly lines, manufacturing or other service systems with similar characteristics can estimate different system performance measures, and make judicious decisions - especially setting delivery due dates, capacity planning, and bottleneck mitigation, among others.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A job shop with one batch processing and several discrete machines is analyzed. Given a set of jobs, their process routes, processing requirements, and size, the objective is to schedule the jobs such that the makespan is minimized. The batch processing machine can process a batch of jobs as long as the machine capacity is not violated. The batch processing time is equal to the longest processing job in the batch. The problem under study can be represented as Jm:batch:Cmax. If no batches were formed, the scheduling problem under study reduces to the classical job shop scheduling problem (i.e. Jm:: Cmax), which is known to be NP-hard. This research extends the scheduling literature by combining Jm::Cmax with batch processing. The primary contributions are the mathematical formulation, a new network representation and several solution approaches. The problem under study is observed widely in metal working and other industries, but received limited or no attention due to its complexity. A novel network representation of the problem using disjunctive and conjunctive arcs, and a mathematical formulation are proposed to minimize the makespan. Besides that, several algorithms, like batch forming heuristics, dispatching rules, Modified Shifting Bottleneck, Tabu Search (TS) and Simulated Annealing (SA), were developed and implemented. An experimental study was conducted to evaluate the proposed heuristics, and the results were compared to those from a commercial solver (i.e., CPLEX). TS and SA, with the combination of MWKR-FF as the initial solution, gave the best solutions among all the heuristics proposed. Their results were close to CPLEX; and for some larger instances, with total operations greater than 225, they were competitive in terms of solution quality and runtime. For some larger problem instances, CPLEX was unable to report a feasible solution even after running for several hours. Between SA and the experimental study indicated that SA produced a better average Cmax for all instances. The solution approaches proposed will benefit practitioners to schedule a job shop (with both discrete and batch processing machines) more efficiently. The proposed solution approaches are easier to implement and requires short run times to solve large problem instances.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This dissertation develops a process improvement method for service operations based on the Theory of Constraints (TOC), a management philosophy that has been shown to be effective in manufacturing for decreasing WIP and improving throughput. While TOC has enjoyed much attention and success in the manufacturing arena, its application to services in general has been limited. The contribution to industry and knowledge is a method for improving global performance measures based on TOC principles. The method proposed in this dissertation will be tested using discrete event simulation based on the scenario of the service factory of airline turnaround operations. To evaluate the method, a simulation model of aircraft turn operations of a U.S. based carrier was made and validated using actual data from airline operations. The model was then adjusted to reflect an application of the Theory of Constraints for determining how to deploy the scarce resource of ramp workers. The results indicate that, given slight modifications to TOC terminology and the development of a method for constraint identification, the Theory of Constraints can be applied with success to services. Bottlenecks in services must be defined as those processes for which the process rates and amount of work remaining are such that completing the process will not be possible without an increase in the process rate. The bottleneck ratio is used to determine to what degree a process is a constraint. Simulation results also suggest that redefining performance measures to reflect a global business perspective of reducing costs related to specific flights versus the operational local optimum approach of turning all aircraft quickly results in significant savings to the company. Savings to the annual operating costs of the airline were simulated to equal 30% of possible current expenses for misconnecting passengers with a modest increase in utilization of the workers through a more efficient heuristic of deploying them to the highest priority tasks. This dissertation contributes to the literature on service operations by describing a dynamic, adaptive dispatch approach to manage service factory operations similar to airline turnaround operations using the management philosophy of the Theory of Constraints.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This research is motivated by the need for considering lot sizing while accepting customer orders in a make-to-order (MTO) environment, in which each customer order must be delivered by its due date. Job shop is the typical operation model used in an MTO operation, where the production planner must make three concurrent decisions; they are order selection, lot size, and job schedule. These decisions are usually treated separately in the literature and are mostly led to heuristic solutions. The first phase of the study is focused on a formal definition of the problem. Mathematical programming techniques are applied to modeling this problem in terms of its objective, decision variables, and constraints. A commercial solver, CPLEX is applied to solve the resulting mixed-integer linear programming model with small instances to validate the mathematical formulation. The computational result shows it is not practical for solving problems of industrial size, using a commercial solver. The second phase of this study is focused on development of an effective solution approach to this problem of large scale. The proposed solution approach is an iterative process involving three sequential decision steps of order selection, lot sizing, and lot scheduling. A range of simple sequencing rules are identified for each of the three subproblems. Using computer simulation as the tool, an experiment is designed to evaluate their performance against a set of system parameters. For order selection, the proposed weighted most profit rule performs the best. The shifting bottleneck and the earliest operation finish time both are the best scheduling rules. For lot sizing, the proposed minimum cost increase heuristic, based on the Dixon-Silver method performs the best, when the demand-to-capacity ratio at the bottleneck machine is high. The proposed minimum cost heuristic, based on the Wagner-Whitin algorithm is the best lot-sizing heuristic for shops of a low demand-to-capacity ratio. The proposed heuristic is applied to an industrial case to further evaluate its performance. The result shows it can improve an average of total profit by 16.62%. This research contributes to the production planning research community with a complete mathematical definition of the problem and an effective solution approach to solving the problem of industry scale.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This research aims at a study of the hybrid flow shop problem which has parallel batch-processing machines in one stage and discrete-processing machines in other stages to process jobs of arbitrary sizes. The objective is to minimize the makespan for a set of jobs. The problem is denoted as: FF: batch1,sj:Cmax. The problem is formulated as a mixed-integer linear program. The commercial solver, AMPL/CPLEX, is used to solve problem instances to their optimality. Experimental results show that AMPL/CPLEX requires considerable time to find the optimal solution for even a small size problem, i.e., a 6-job instance requires 2 hours in average. A bottleneck-first-decomposition heuristic (BFD) is proposed in this study to overcome the computational (time) problem encountered while using the commercial solver. The proposed BFD heuristic is inspired by the shifting bottleneck heuristic. It decomposes the entire problem into three sub-problems, and schedules the sub-problems one by one. The proposed BFD heuristic consists of four major steps: formulating sub-problems, prioritizing sub-problems, solving sub-problems and re-scheduling. For solving the sub-problems, two heuristic algorithms are proposed; one for scheduling a hybrid flow shop with discrete processing machines, and the other for scheduling parallel batching machines (single stage). Both consider job arrival and delivery times. An experiment design is conducted to evaluate the effectiveness of the proposed BFD, which is further evaluated against a set of common heuristics including a randomized greedy heuristic and five dispatching rules. The results show that the proposed BFD heuristic outperforms all these algorithms. To evaluate the quality of the heuristic solution, a procedure is developed to calculate a lower bound of makespan for the problem under study. The lower bound obtained is tighter than other bounds developed for related problems in literature. A meta-search approach based on the Genetic Algorithm concept is developed to evaluate the significance of further improving the solution obtained from the proposed BFD heuristic. The experiment indicates that it reduces the makespan by 1.93 % in average within a negligible time when problem size is less than 50 jobs.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Paracalanus quasimodo and Temora turbinata are two calanoid copepods prominent in the planktonic communities of the southeastern United States. Despite their prominence, the species and population level structure of these copepods is yet unexplored. The phylogeographic, temporal and phylogenetic structure of P. quasimodo and T. turbinata are examined in my study. Samples were collected from ten sites along the Gulf of Mexico and Florida peninsular coasts. Three sites were sampled quarterly for two years. Individuals were screened for unique ITS-1 sequences with denaturing gradient gel electrophoresis. Unique variants were sequenced at the nuclear ITS-1 and mitochondrial COI loci. Sampling sites were analyzed for pairwise community differences and for variances between geographic and temporal groupings. Genetic variants were analyzed for phylogenetic and coalescent topology. Paracalanus quasimodo is highly structured geographically with populations divided between the Gulf of Mexico, temperate Atlantic and subtropical Atlantic, in addition to isolation by distance. No significant differences were detected between the T. turbinata samples. Both P. quasimodo and T. turbinata are stable within sites over time and between sites within a sampling period, with two exceptions. The first was a pilot sample from Miami taken two years prior to the general sampling whose community showed significant differences from most of the other Miami samples. Paracalanus quasimodo had a positive correlation of Fst with time. The second was high temporal variability detected in the samples from Fort Pierce. Phylogenetically, both P. quasimodo and T. turbinata were in well supported, congeneric clades. Paracalanus quasimodo was not monophyletic, divided into two well-supported clades. Temora turbinata variants were in one clade with insignificant support for topology within the clade and very little intraspecific variation. Paracalanus quasimodo and T. turbinata populations show opposite trends. Paracalanus quasimodo occurs near shore and shows population structure mediated by hydrological features and distance, both geographic and temporal. The phylogeny shows two deeply divergent clades suggestive of cryptic speciation. In contrast, T. turbinata populations range further offshore and show little geographic or temporal structure. However, the low genetic variation detected in this region suggests a recent bottleneck event.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Toll plazas have several toll payment types such as manual, automatic coin machines, electronic and mixed lanes. In places with high traffic flow, the presence of toll plaza causes a lot of traffic congestion; this creates a bottleneck for the traffic flow, unless the correct mix of payment types is in operation. The objective of this research is to determine the optimal lane configuration for the mix of the methods of payment so that the waiting time in the queue at the toll plaza is minimized. A queuing model representing the toll plaza system and a nonlinear integer program have been developed to determine the optimal mix. The numerical results show that the waiting time can be decreased at the toll plaza by changing the lane configuration. For the case study developed an improvement in the waiting time as high as 96.37 percent was noticed during the morning peak hour.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Variable Speed Limit (VSL) strategies identify and disseminate dynamic speed limits that are determined to be appropriate based on prevailing traffic conditions, road surface conditions, and weather conditions. This dissertation develops and evaluates a shockwave-based VSL system that uses a heuristic switching logic-based controller with specified thresholds of prevailing traffic flow conditions. The system aims to improve operations and mobility at critical bottlenecks. Before traffic breakdown occurrence, the proposed VSL’s goal is to prevent or postpone breakdown by decreasing the inflow and achieving uniform distribution in speed and flow. After breakdown occurrence, the VSL system aims to dampen traffic congestion by reducing the inflow traffic to the congested area and increasing the bottleneck capacity by deactivating the VSL at the head of the congested area. The shockwave-based VSL system pushes the VSL location upstream as the congested area propagates upstream. In addition to testing the system using infrastructure detector-based data, this dissertation investigates the use of Connected Vehicle trajectory data as input to the shockwave-based VSL system performance. Since the field Connected Vehicle data are not available, as part of this research, Vehicle-to-Infrastructure communication is modeled in the microscopic simulation to obtain individual vehicle trajectories. In this system, wavelet transform is used to analyze aggregated individual vehicles’ speed data to determine the locations of congestion. The currently recommended calibration procedures of simulation models are generally based on the capacity, volume and system-performance values and do not specifically examine traffic breakdown characteristics. However, since the proposed VSL strategies are countermeasures to the impacts of breakdown conditions, considering breakdown characteristics in the calibration procedure is important to have a reliable assessment. Several enhancements were proposed in this study to account for the breakdown characteristics at bottleneck locations in the calibration process. In this dissertation, performance of shockwave-based VSL is compared to VSL systems with different fixed VSL message sign locations utilizing the calibrated microscopic model. The results show that shockwave-based VSL outperforms fixed-location VSL systems, and it can considerably decrease the maximum back of queue and duration of breakdown while increasing the average speed during breakdown.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Disk drives are the bottleneck in the processing of large amounts of data used in almost all common applications. File systems attempt to reduce this by storing data sequentially on the disk drives, thereby reducing the access latencies. Although this strategy is useful when data is retrieved sequentially, the access patterns in real world workloads is not necessarily sequential and this mismatch results in storage I/O performance degradation. This thesis demonstrates that one way to improve the storage performance is to reorganize data on disk drives in the same way in which it is mostly accessed. We identify two classes of accesses: static, where access patterns do not change over the lifetime of the data and dynamic, where access patterns frequently change over short durations of time, and propose, implement and evaluate layout strategies for each of these. Our strategies are implemented in a way that they can be seamlessly integrated or removed from the system as desired. We evaluate our layout strategies for static policies using tree-structured XML data where accesses to the storage device are mostly of two kinds - parent-tochild or child-to-sibling. Our results show that for a specific class of deep-focused queries, the existing file system layout policy performs better by 5-54X. For the non-deep-focused queries, our native layout mechanism shows an improvement of 3-127X. To improve performance of the dynamic access patterns, we implement a self-optimizing storage system that performs rearranges popular block accesses on a dedicated partition based on the observed workload characteristics. Our evaluation shows an improvement of over 80% in the disk busy times over a range of workloads. These results show that applying the knowledge of data access patterns for allocation decisions can substantially improve the I/O performance.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Parallel processing is prevalent in many manufacturing and service systems. Many manufactured products are built and assembled from several components fabricated in parallel lines. An example of this manufacturing system configuration is observed at a manufacturing facility equipped to assemble and test web servers. Characteristics of a typical web server assembly line are: multiple products, job circulation, and paralleling processing. The primary objective of this research was to develop analytical approximations to predict performance measures of manufacturing systems with job failures and parallel processing. The analytical formulations extend previous queueing models used in assembly manufacturing systems in that they can handle serial and different configurations of paralleling processing with multiple product classes, and job circulation due to random part failures. In addition, appropriate correction terms via regression analysis were added to the approximations in order to minimize the gap in the error between the analytical approximation and the simulation models. Markovian and general type manufacturing systems, with multiple product classes, job circulation due to failures, and fork and join systems to model parallel processing were studied. In the Markovian and general case, the approximations without correction terms performed quite well for one and two product problem instances. However, it was observed that the flow time error increased as the number of products and net traffic intensity increased. Therefore, correction terms for single and fork-join stations were developed via regression analysis to deal with more than two products. The numerical comparisons showed that the approximations perform remarkably well when the corrections factors were used in the approximations. In general, the average flow time error was reduced from 38.19% to 5.59% in the Markovian case, and from 26.39% to 7.23% in the general case. All the equations stated in the analytical formulations were implemented as a set of Matlab scripts. By using this set, operations managers of web server assembly lines, manufacturing or other service systems with similar characteristics can estimate different system performance measures, and make judicious decisions - especially setting delivery due dates, capacity planning, and bottleneck mitigation, among others.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Paracalanus quasimodo and Temora turbinata are two calanoid copepods prominent in the planktonic communities of the southeastern United States. Despite their prominence, the species and population level structure of these copepods is yet unexplored. The phylogeographic, temporal and phylogenetic structure of P. quasimodo and T. turbinata are examined in my study. Samples were collected from ten sites along the Gulf of Mexico and Florida peninsular coasts. Three sites were sampled quarterly for two years. Individuals were screened for unique ITS-1 sequences with denaturing gradient gel electrophoresis. Unique variants were sequenced at the nuclear ITS-1 and mitochondrial COI loci. Sampling sites were analyzed for pairwise community differences and for variances between geographic and temporal groupings. Genetic variants were analyzed for phylogenetic and coalescent topology. Paracalanus quasimodo is highly structured geographically with populations divided between the Gulf of Mexico, temperate Atlantic and subtropical Atlantic, in addition to isolation by distance. No significant differences were detected between the T. turbinata samples. Both P. quasimodo and T. turbinata are stable within sites over time and between sites within a sampling period, with two exceptions. The first was a pilot sample from Miami taken two years prior to the general sampling whose community showed significant differences from most of the other Miami samples. Paracalanus quasimodo had a positive correlation of Fst with time. The second was high temporal variability detected in the samples from Fort Pierce. Phylogenetically, both P. quasimodo and T. turbinata were in well supported, congeneric clades. Paracalanus quasimodo was not monophyletic, divided into two well-supported clades. Temora turbinata variants were in one clade with insignificant support for topology within the clade and very little intraspecific variation. Paracalanus quasimodo and T. turbinata populations show opposite trends. Paracalanus quasimodo occurs near shore and shows population structure mediated by hydrological features and distance, both geographic and temporal. The phylogeny shows two deeply divergent clades suggestive of cryptic speciation. In contrast, T. turbinata populations range further offshore and show little geographic or temporal structure. However, the low genetic variation detected in this region suggests a recent bottleneck event.