9 resultados para parallel processing systems

em Digital Commons at Florida International University


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Parallel processing is prevalent in many manufacturing and service systems. Many manufactured products are built and assembled from several components fabricated in parallel lines. An example of this manufacturing system configuration is observed at a manufacturing facility equipped to assemble and test web servers. Characteristics of a typical web server assembly line are: multiple products, job circulation, and paralleling processing. The primary objective of this research was to develop analytical approximations to predict performance measures of manufacturing systems with job failures and parallel processing. The analytical formulations extend previous queueing models used in assembly manufacturing systems in that they can handle serial and different configurations of paralleling processing with multiple product classes, and job circulation due to random part failures. In addition, appropriate correction terms via regression analysis were added to the approximations in order to minimize the gap in the error between the analytical approximation and the simulation models. Markovian and general type manufacturing systems, with multiple product classes, job circulation due to failures, and fork and join systems to model parallel processing were studied. In the Markovian and general case, the approximations without correction terms performed quite well for one and two product problem instances. However, it was observed that the flow time error increased as the number of products and net traffic intensity increased. Therefore, correction terms for single and fork-join stations were developed via regression analysis to deal with more than two products. The numerical comparisons showed that the approximations perform remarkably well when the corrections factors were used in the approximations. In general, the average flow time error was reduced from 38.19% to 5.59% in the Markovian case, and from 26.39% to 7.23% in the general case. All the equations stated in the analytical formulations were implemented as a set of Matlab scripts. By using this set, operations managers of web server assembly lines, manufacturing or other service systems with similar characteristics can estimate different system performance measures, and make judicious decisions - especially setting delivery due dates, capacity planning, and bottleneck mitigation, among others.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Parallel processing is prevalent in many manufacturing and service systems. Many manufactured products are built and assembled from several components fabricated in parallel lines. An example of this manufacturing system configuration is observed at a manufacturing facility equipped to assemble and test web servers. Characteristics of a typical web server assembly line are: multiple products, job circulation, and paralleling processing. The primary objective of this research was to develop analytical approximations to predict performance measures of manufacturing systems with job failures and parallel processing. The analytical formulations extend previous queueing models used in assembly manufacturing systems in that they can handle serial and different configurations of paralleling processing with multiple product classes, and job circulation due to random part failures. In addition, appropriate correction terms via regression analysis were added to the approximations in order to minimize the gap in the error between the analytical approximation and the simulation models. Markovian and general type manufacturing systems, with multiple product classes, job circulation due to failures, and fork and join systems to model parallel processing were studied. In the Markovian and general case, the approximations without correction terms performed quite well for one and two product problem instances. However, it was observed that the flow time error increased as the number of products and net traffic intensity increased. Therefore, correction terms for single and fork-join stations were developed via regression analysis to deal with more than two products. The numerical comparisons showed that the approximations perform remarkably well when the corrections factors were used in the approximations. In general, the average flow time error was reduced from 38.19% to 5.59% in the Markovian case, and from 26.39% to 7.23% in the general case. All the equations stated in the analytical formulations were implemented as a set of Matlab scripts. By using this set, operations managers of web server assembly lines, manufacturing or other service systems with similar characteristics can estimate different system performance measures, and make judicious decisions - especially setting delivery due dates, capacity planning, and bottleneck mitigation, among others.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Ensemble Stream Modeling and Data-cleaning are sensor information processing systems have different training and testing methods by which their goals are cross-validated. This research examines a mechanism, which seeks to extract novel patterns by generating ensembles from data. The main goal of label-less stream processing is to process the sensed events to eliminate the noises that are uncorrelated, and choose the most likely model without over fitting thus obtaining higher model confidence. Higher quality streams can be realized by combining many short streams into an ensemble which has the desired quality. The framework for the investigation is an existing data mining tool. First, to accommodate feature extraction such as a bush or natural forest-fire event we make an assumption of the burnt area (BA*), sensed ground truth as our target variable obtained from logs. Even though this is an obvious model choice the results are disappointing. The reasons for this are two: One, the histogram of fire activity is highly skewed. Two, the measured sensor parameters are highly correlated. Since using non descriptive features does not yield good results, we resort to temporal features. By doing so we carefully eliminate the averaging effects; the resulting histogram is more satisfactory and conceptual knowledge is learned from sensor streams. Second is the process of feature induction by cross-validating attributes with single or multi-target variables to minimize training error. We use F-measure score, which combines precision and accuracy to determine the false alarm rate of fire events. The multi-target data-cleaning trees use information purity of the target leaf-nodes to learn higher order features. A sensitive variance measure such as ƒ-test is performed during each node's split to select the best attribute. Ensemble stream model approach proved to improve when using complicated features with a simpler tree classifier. The ensemble framework for data-cleaning and the enhancements to quantify quality of fitness (30% spatial, 10% temporal, and 90% mobility reduction) of sensor led to the formation of streams for sensor-enabled applications. Which further motivates the novelty of stream quality labeling and its importance in solving vast amounts of real-time mobile streams generated today.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Bio-systems are inherently complex information processing systems. Furthermore, physiological complexities of biological systems limit the formation of a hypothesis in terms of behavior and the ability to test hypothesis. More importantly the identification and classification of mutation in patients are centric topics in today's cancer research. Next generation sequencing (NGS) technologies can provide genome-wide coverage at a single nucleotide resolution and at reasonable speed and cost. The unprecedented molecular characterization provided by NGS offers the potential for an individualized approach to treatment. These advances in cancer genomics have enabled scientists to interrogate cancer-specific genomic variants and compare them with the normal variants in the same patient. Analysis of this data provides a catalog of somatic variants, present in tumor genome but not in the normal tissue DNA. In this dissertation, we present a new computational framework to the problem of predicting the number of mutations on a chromosome for a certain patient, which is a fundamental problem in clinical and research fields. We begin this dissertation with the development of a framework system that is capable of utilizing published data from a longitudinal study of patients with acute myeloid leukemia (AML), who's DNA from both normal as well as malignant tissues was subjected to NGS analysis at various points in time. By processing the sequencing data at the time of cancer diagnosis using the components of our framework, we tested it by predicting the genomic regions to be mutated at the time of relapse and, later, by comparing our results with the actual regions that showed mutations (discovered at relapse time). We demonstrate that this coupling of the algorithm pipeline can drastically improve the predictive abilities of searching a reliable molecular signature. Arguably, the most important result of our research is its superior performance to other methods like Radial Basis Function Network, Sequential Minimal Optimization, and Gaussian Process. In the final part of this dissertation, we present a detailed significance, stability and statistical analysis of our model. A performance comparison of the results are presented. This work clearly lays a good foundation for future research for other types of cancer.^

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The development cost of any civil infrastructure is very high; during its life span, the civil structure undergoes a lot of physical loads and environmental effects which damage the structure. Failing to identify this damage at an early stage may result in severe property loss and may become a potential threat to people and the environment. Thus, there is a need to develop effective damage detection techniques to ensure the safety and integrity of the structure. One of the Structural Health Monitoring methods to evaluate a structure is by using statistical analysis. In this study, a civil structure measuring 8 feet in length, 3 feet in diameter, embedded with thermocouple sensors at 4 different levels is analyzed under controlled and variable conditions. With the help of statistical analysis, possible damage to the structure was analyzed. The analysis could detect the structural defects at various levels of the structure.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Ensemble Stream Modeling and Data-cleaning are sensor information processing systems have different training and testing methods by which their goals are cross-validated. This research examines a mechanism, which seeks to extract novel patterns by generating ensembles from data. The main goal of label-less stream processing is to process the sensed events to eliminate the noises that are uncorrelated, and choose the most likely model without over fitting thus obtaining higher model confidence. Higher quality streams can be realized by combining many short streams into an ensemble which has the desired quality. The framework for the investigation is an existing data mining tool. First, to accommodate feature extraction such as a bush or natural forest-fire event we make an assumption of the burnt area (BA*), sensed ground truth as our target variable obtained from logs. Even though this is an obvious model choice the results are disappointing. The reasons for this are two: One, the histogram of fire activity is highly skewed. Two, the measured sensor parameters are highly correlated. Since using non descriptive features does not yield good results, we resort to temporal features. By doing so we carefully eliminate the averaging effects; the resulting histogram is more satisfactory and conceptual knowledge is learned from sensor streams. Second is the process of feature induction by cross-validating attributes with single or multi-target variables to minimize training error. We use F-measure score, which combines precision and accuracy to determine the false alarm rate of fire events. The multi-target data-cleaning trees use information purity of the target leaf-nodes to learn higher order features. A sensitive variance measure such as f-test is performed during each node’s split to select the best attribute. Ensemble stream model approach proved to improve when using complicated features with a simpler tree classifier. The ensemble framework for data-cleaning and the enhancements to quantify quality of fitness (30% spatial, 10% temporal, and 90% mobility reduction) of sensor led to the formation of streams for sensor-enabled applications. Which further motivates the novelty of stream quality labeling and its importance in solving vast amounts of real-time mobile streams generated today.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This research is motivated by a practical application observed at a printed circuit board (PCB) manufacturing facility. After assembly, the PCBs (or jobs) are tested in environmental stress screening (ESS) chambers (or batch processing machines) to detect early failures. Several PCBs can be simultaneously tested as long as the total size of all the PCBs in the batch does not violate the chamber capacity. PCBs from different production lines arrive dynamically to a queue in front of a set of identical ESS chambers, where they are grouped into batches for testing. Each line delivers PCBs that vary in size and require different testing (or processing) times. Once a batch is formed, its processing time is the longest processing time among the PCBs in the batch, and its ready time is given by the PCB arriving last to the batch. ESS chambers are expensive and a bottleneck. Consequently, its makespan has to be minimized. ^ A mixed-integer formulation is proposed for the problem under study and compared to a formulation recently published. The proposed formulation is better in terms of the number of decision variables, linear constraints and run time. A procedure to compute the lower bound is proposed. For sparse problems (i.e. when job ready times are dispersed widely), the lower bounds are close to optimum. ^ The problem under study is NP-hard. Consequently, five heuristics, two metaheuristics (i.e. simulated annealing (SA) and greedy randomized adaptive search procedure (GRASP)), and a decomposition approach (i.e. column generation) are proposed—especially to solve problem instances which require prohibitively long run times when a commercial solver is used. Extensive experimental study was conducted to evaluate the different solution approaches based on the solution quality and run time. ^ The decomposition approach improved the lower bounds (or linear relaxation solution) of the mixed-integer formulation. At least one of the proposed heuristic outperforms the Modified Delay heuristic from the literature. For sparse problems, almost all the heuristics report a solution close to optimum. GRASP outperforms SA at a higher computational cost. The proposed approaches are viable to implement as the run time is very short. ^

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This research aims at a study of the hybrid flow shop problem which has parallel batch-processing machines in one stage and discrete-processing machines in other stages to process jobs of arbitrary sizes. The objective is to minimize the makespan for a set of jobs. The problem is denoted as: FF: batch1,sj:Cmax. The problem is formulated as a mixed-integer linear program. The commercial solver, AMPL/CPLEX, is used to solve problem instances to their optimality. Experimental results show that AMPL/CPLEX requires considerable time to find the optimal solution for even a small size problem, i.e., a 6-job instance requires 2 hours in average. A bottleneck-first-decomposition heuristic (BFD) is proposed in this study to overcome the computational (time) problem encountered while using the commercial solver. The proposed BFD heuristic is inspired by the shifting bottleneck heuristic. It decomposes the entire problem into three sub-problems, and schedules the sub-problems one by one. The proposed BFD heuristic consists of four major steps: formulating sub-problems, prioritizing sub-problems, solving sub-problems and re-scheduling. For solving the sub-problems, two heuristic algorithms are proposed; one for scheduling a hybrid flow shop with discrete processing machines, and the other for scheduling parallel batching machines (single stage). Both consider job arrival and delivery times. An experiment design is conducted to evaluate the effectiveness of the proposed BFD, which is further evaluated against a set of common heuristics including a randomized greedy heuristic and five dispatching rules. The results show that the proposed BFD heuristic outperforms all these algorithms. To evaluate the quality of the heuristic solution, a procedure is developed to calculate a lower bound of makespan for the problem under study. The lower bound obtained is tighter than other bounds developed for related problems in literature. A meta-search approach based on the Genetic Algorithm concept is developed to evaluate the significance of further improving the solution obtained from the proposed BFD heuristic. The experiment indicates that it reduces the makespan by 1.93 % in average within a negligible time when problem size is less than 50 jobs.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This dissertation presents and evaluates a methodology for scheduling medical application workloads in virtualized computing environments. Such environments are being widely adopted by providers of "cloud computing" services. In the context of provisioning resources for medical applications, such environments allow users to deploy applications on distributed computing resources while keeping their data secure. Furthermore, higher level services that further abstract the infrastructure-related issues can be built on top of such infrastructures. For example, a medical imaging service can allow medical professionals to process their data in the cloud, easing them from the burden of having to deploy and manage these resources themselves. In this work, we focus on issues related to scheduling scientific workloads on virtualized environments. We build upon the knowledge base of traditional parallel job scheduling to address the specific case of medical applications while harnessing the benefits afforded by virtualization technology. To this end, we provide the following contributions: (1) An in-depth analysis of the execution characteristics of the target applications when run in virtualized environments. (2) A performance prediction methodology applicable to the target environment. (3) A scheduling algorithm that harnesses application knowledge and virtualization-related benefits to provide strong scheduling performance and quality of service guarantees. In the process of addressing these pertinent issues for our target user base (i.e. medical professionals and researchers), we provide insight that benefits a large community of scientific application users in industry and academia. Our execution time prediction and scheduling methodologies are implemented and evaluated on a real system running popular scientific applications. We find that we are able to predict the execution time of a number of these applications with an average error of 15%. Our scheduling methodology, which is tested with medical image processing workloads, is compared to that of two baseline scheduling solutions and we find that it outperforms them in terms of both the number of jobs processed and resource utilization by 20–30%, without violating any deadlines. We conclude that our solution is a viable approach to supporting the computational needs of medical users, even if the cloud computing paradigm is not widely adopted in its current form.