10 resultados para Hybrid clustering algorithm
em Digital Commons at Florida International University
Resumo:
The effectiveness of an optimization algorithm can be reduced to its ability to navigate an objective function’s topology. Hybrid optimization algorithms combine various optimization algorithms using a single meta-heuristic so that the hybrid algorithm is more robust, computationally efficient, and/or accurate than the individual algorithms it is made of. This thesis proposes a novel meta-heuristic that uses search vectors to select the constituent algorithm that is appropriate for a given objective function. The hybrid is shown to perform competitively against several existing hybrid and non-hybrid optimization algorithms over a set of three hundred test cases. This thesis also proposes a general framework for evaluating the effectiveness of hybrid optimization algorithms. Finally, this thesis presents an improved Method of Characteristics Code with novel boundary conditions, which better characterizes pipelines than previous codes. This code is coupled with the hybrid optimization algorithm in order to optimize the operation of real-world piston pumps.
Resumo:
This research is to establish new optimization methods for pattern recognition and classification of different white blood cells in actual patient data to enhance the process of diagnosis. Beckman-Coulter Corporation supplied flow cytometry data of numerous patients that are used as training sets to exploit the different physiological characteristics of the different samples provided. The methods of Support Vector Machines (SVM) and Artificial Neural Networks (ANN) were used as promising pattern classification techniques to identify different white blood cell samples and provide information to medical doctors in the form of diagnostic references for the specific disease states, leukemia. The obtained results prove that when a neural network classifier is well configured and trained with cross-validation, it can perform better than support vector classifiers alone for this type of data. Furthermore, a new unsupervised learning algorithm---Density based Adaptive Window Clustering algorithm (DAWC) was designed to process large volumes of data for finding location of high data cluster in real-time. It reduces the computational load to ∼O(N) number of computations, and thus making the algorithm more attractive and faster than current hierarchical algorithms.
Resumo:
Online Social Network (OSN) services provided by Internet companies bring people together to chat, share the information, and enjoy the information. Meanwhile, huge amounts of data are generated by those services (they can be regarded as the social media ) every day, every hour, even every minute, and every second. Currently, researchers are interested in analyzing the OSN data, extracting interesting patterns from it, and applying those patterns to real-world applications. However, due to the large-scale property of the OSN data, it is difficult to effectively analyze it. This dissertation focuses on applying data mining and information retrieval techniques to mine two key components in the social media data — users and user-generated contents. Specifically, it aims at addressing three problems related to the social media users and contents: (1) how does one organize the users and the contents? (2) how does one summarize the textual contents so that users do not have to go over every post to capture the general idea? (3) how does one identify the influential users in the social media to benefit other applications, e.g., Marketing Campaign? The contribution of this dissertation is briefly summarized as follows. (1) It provides a comprehensive and versatile data mining framework to analyze the users and user-generated contents from the social media. (2) It designs a hierarchical co-clustering algorithm to organize the users and contents. (3) It proposes multi-document summarization methods to extract core information from the social network contents. (4) It introduces three important dimensions of social influence, and a dynamic influence model for identifying influential users.
Resumo:
Online Social Network (OSN) services provided by Internet companies bring people together to chat, share the information, and enjoy the information. Meanwhile, huge amounts of data are generated by those services (they can be regarded as the social media ) every day, every hour, even every minute, and every second. Currently, researchers are interested in analyzing the OSN data, extracting interesting patterns from it, and applying those patterns to real-world applications. However, due to the large-scale property of the OSN data, it is difficult to effectively analyze it. This dissertation focuses on applying data mining and information retrieval techniques to mine two key components in the social media data — users and user-generated contents. Specifically, it aims at addressing three problems related to the social media users and contents: (1) how does one organize the users and the contents? (2) how does one summarize the textual contents so that users do not have to go over every post to capture the general idea? (3) how does one identify the influential users in the social media to benefit other applications, e.g., Marketing Campaign? The contribution of this dissertation is briefly summarized as follows. (1) It provides a comprehensive and versatile data mining framework to analyze the users and user-generated contents from the social media. (2) It designs a hierarchical co-clustering algorithm to organize the users and contents. (3) It proposes multi-document summarization methods to extract core information from the social network contents. (4) It introduces three important dimensions of social influence, and a dynamic influence model for identifying influential users.
Resumo:
Numerical optimization is a technique where a computer is used to explore design parameter combinations to find extremes in performance factors. In multi-objective optimization several performance factors can be optimized simultaneously. The solution to multi-objective optimization problems is not a single design, but a family of optimized designs referred to as the Pareto frontier. The Pareto frontier is a trade-off curve in the objective function space composed of solutions where performance in one objective function is traded for performance in others. A Multi-Objective Hybridized Optimizer (MOHO) was created for the purpose of solving multi-objective optimization problems by utilizing a set of constituent optimization algorithms. MOHO tracks the progress of the Pareto frontier approximation development and automatically switches amongst those constituent evolutionary optimization algorithms to speed the formation of an accurate Pareto frontier approximation. Aerodynamic shape optimization is one of the oldest applications of numerical optimization. MOHO was used to perform shape optimization on a 0.5-inch ballistic penetrator traveling at Mach number 2.5. Two objectives were simultaneously optimized: minimize aerodynamic drag and maximize penetrator volume. This problem was solved twice. The first time the problem was solved by using Modified Newton Impact Theory (MNIT) to determine the pressure drag on the penetrator. In the second solution, a Parabolized Navier-Stokes (PNS) solver that includes viscosity was used to evaluate the drag on the penetrator. The studies show the difference in the optimized penetrator shapes when viscosity is absent and present in the optimization. In modern optimization problems, objective function evaluations may require many hours on a computer cluster to perform these types of analysis. One solution is to create a response surface that models the behavior of the objective function. Once enough data about the behavior of the objective function has been collected, a response surface can be used to represent the actual objective function in the optimization process. The Hybrid Self-Organizing Response Surface Method (HYBSORSM) algorithm was developed and used to make response surfaces of objective functions. HYBSORSM was evaluated using a suite of 295 non-linear functions. These functions involve from 2 to 100 variables demonstrating robustness and accuracy of HYBSORSM.
Resumo:
This research aims at a study of the hybrid flow shop problem which has parallel batch-processing machines in one stage and discrete-processing machines in other stages to process jobs of arbitrary sizes. The objective is to minimize the makespan for a set of jobs. The problem is denoted as: FF: batch1,sj:Cmax. The problem is formulated as a mixed-integer linear program. The commercial solver, AMPL/CPLEX, is used to solve problem instances to their optimality. Experimental results show that AMPL/CPLEX requires considerable time to find the optimal solution for even a small size problem, i.e., a 6-job instance requires 2 hours in average. A bottleneck-first-decomposition heuristic (BFD) is proposed in this study to overcome the computational (time) problem encountered while using the commercial solver. The proposed BFD heuristic is inspired by the shifting bottleneck heuristic. It decomposes the entire problem into three sub-problems, and schedules the sub-problems one by one. The proposed BFD heuristic consists of four major steps: formulating sub-problems, prioritizing sub-problems, solving sub-problems and re-scheduling. For solving the sub-problems, two heuristic algorithms are proposed; one for scheduling a hybrid flow shop with discrete processing machines, and the other for scheduling parallel batching machines (single stage). Both consider job arrival and delivery times. An experiment design is conducted to evaluate the effectiveness of the proposed BFD, which is further evaluated against a set of common heuristics including a randomized greedy heuristic and five dispatching rules. The results show that the proposed BFD heuristic outperforms all these algorithms. To evaluate the quality of the heuristic solution, a procedure is developed to calculate a lower bound of makespan for the problem under study. The lower bound obtained is tighter than other bounds developed for related problems in literature. A meta-search approach based on the Genetic Algorithm concept is developed to evaluate the significance of further improving the solution obtained from the proposed BFD heuristic. The experiment indicates that it reduces the makespan by 1.93 % in average within a negligible time when problem size is less than 50 jobs.
Resumo:
The accurate and reliable estimation of travel time based on point detector data is needed to support Intelligent Transportation System (ITS) applications. It has been found that the quality of travel time estimation is a function of the method used in the estimation and varies for different traffic conditions. In this study, two hybrid on-line travel time estimation models, and their corresponding off-line methods, were developed to achieve better estimation performance under various traffic conditions, including recurrent congestion and incidents. The first model combines the Mid-Point method, which is a speed-based method, with a traffic flow-based method. The second model integrates two speed-based methods: the Mid-Point method and the Minimum Speed method. In both models, the switch between travel time estimation methods is based on the congestion level and queue status automatically identified by clustering analysis. During incident conditions with rapidly changing queue lengths, shock wave analysis-based refinements are applied for on-line estimation to capture the fast queue propagation and recovery. Travel time estimates obtained from existing speed-based methods, traffic flow-based methods, and the models developed were tested using both simulation and real-world data. The results indicate that all tested methods performed at an acceptable level during periods of low congestion. However, their performances vary with an increase in congestion. Comparisons with other estimation methods also show that the developed hybrid models perform well in all cases. Further comparisons between the on-line and off-line travel time estimation methods reveal that off-line methods perform significantly better only during fast-changing congested conditions, such as during incidents. The impacts of major influential factors on the performance of travel time estimation, including data preprocessing procedures, detector errors, detector spacing, frequency of travel time updates to traveler information devices, travel time link length, and posted travel time range, were investigated in this study. The results show that these factors have more significant impacts on the estimation accuracy and reliability under congested conditions than during uncongested conditions. For the incident conditions, the estimation quality improves with the use of a short rolling period for data smoothing, more accurate detector data, and frequent travel time updates.
Resumo:
Modern power networks incorporate communications and information technology infrastructure into the electrical power system to create a smart grid in terms of control and operation. The smart grid enables real-time communication and control between consumers and utility companies allowing suppliers to optimize energy usage based on price preference and system technical issues. The smart grid design aims to provide overall power system monitoring, create protection and control strategies to maintain system performance, stability and security. This dissertation contributed to the development of a unique and novel smart grid test-bed laboratory with integrated monitoring, protection and control systems. This test-bed was used as a platform to test the smart grid operational ideas developed here. The implementation of this system in the real-time software creates an environment for studying, implementing and verifying novel control and protection schemes developed in this dissertation. Phasor measurement techniques were developed using the available Data Acquisition (DAQ) devices in order to monitor all points in the power system in real time. This provides a practical view of system parameter changes, system abnormal conditions and its stability and security information system. These developments provide valuable measurements for technical power system operators in the energy control centers. Phasor Measurement technology is an excellent solution for improving system planning, operation and energy trading in addition to enabling advanced applications in Wide Area Monitoring, Protection and Control (WAMPAC). Moreover, a virtual protection system was developed and implemented in the smart grid laboratory with integrated functionality for wide area applications. Experiments and procedures were developed in the system in order to detect the system abnormal conditions and apply proper remedies to heal the system. A design for DC microgrid was developed to integrate it to the AC system with appropriate control capability. This system represents realistic hybrid AC/DC microgrids connectivity to the AC side to study the use of such architecture in system operation to help remedy system abnormal conditions. In addition, this dissertation explored the challenges and feasibility of the implementation of real-time system analysis features in order to monitor the system security and stability measures. These indices are measured experimentally during the operation of the developed hybrid AC/DC microgrids. Furthermore, a real-time optimal power flow system was implemented to optimally manage the power sharing between AC generators and DC side resources. A study relating to real-time energy management algorithm in hybrid microgrids was performed to evaluate the effects of using energy storage resources and their use in mitigating heavy load impacts on system stability and operational security.
Resumo:
Recently, energy efficiency or green IT has become a hot issue for many IT infrastructures as they attempt to utilize energy-efficient strategies in their enterprise IT systems in order to minimize operational costs. Networking devices are shared resources connecting important IT infrastructures, especially in a data center network they are always operated 24/7 which consume a huge amount of energy, and it has been obviously shown that this energy consumption is largely independent of the traffic through the devices. As a result, power consumption in networking devices is becoming more and more a critical problem, which is of interest for both research community and general public. Multicast benefits group communications in saving link bandwidth and improving application throughput, both of which are important for green data center. In this paper, we study the deployment strategy of multicast switches in hybrid mode in energy-aware data center network: a case of famous fat-tree topology. The objective is to find the best location to deploy multicast switch not only to achieve optimal bandwidth utilization but also to minimize power consumption. We show that it is possible to easily achieve nearly 50% of energy consumption after applying our proposed algorithm.
Resumo:
Modern power networks incorporate communications and information technology infrastructure into the electrical power system to create a smart grid in terms of control and operation. The smart grid enables real-time communication and control between consumers and utility companies allowing suppliers to optimize energy usage based on price preference and system technical issues. The smart grid design aims to provide overall power system monitoring, create protection and control strategies to maintain system performance, stability and security. This dissertation contributed to the development of a unique and novel smart grid test-bed laboratory with integrated monitoring, protection and control systems. This test-bed was used as a platform to test the smart grid operational ideas developed here. The implementation of this system in the real-time software creates an environment for studying, implementing and verifying novel control and protection schemes developed in this dissertation. Phasor measurement techniques were developed using the available Data Acquisition (DAQ) devices in order to monitor all points in the power system in real time. This provides a practical view of system parameter changes, system abnormal conditions and its stability and security information system. These developments provide valuable measurements for technical power system operators in the energy control centers. Phasor Measurement technology is an excellent solution for improving system planning, operation and energy trading in addition to enabling advanced applications in Wide Area Monitoring, Protection and Control (WAMPAC). Moreover, a virtual protection system was developed and implemented in the smart grid laboratory with integrated functionality for wide area applications. Experiments and procedures were developed in the system in order to detect the system abnormal conditions and apply proper remedies to heal the system. A design for DC microgrid was developed to integrate it to the AC system with appropriate control capability. This system represents realistic hybrid AC/DC microgrids connectivity to the AC side to study the use of such architecture in system operation to help remedy system abnormal conditions. In addition, this dissertation explored the challenges and feasibility of the implementation of real-time system analysis features in order to monitor the system security and stability measures. These indices are measured experimentally during the operation of the developed hybrid AC/DC microgrids. Furthermore, a real-time optimal power flow system was implemented to optimally manage the power sharing between AC generators and DC side resources. A study relating to real-time energy management algorithm in hybrid microgrids was performed to evaluate the effects of using energy storage resources and their use in mitigating heavy load impacts on system stability and operational security.