2 resultados para Fuzzy c-means algorithm

em Digital Commons at Florida International University


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The primary aim of this dissertation is to develop data mining tools for knowledge discovery in biomedical data when multiple (homogeneous or heterogeneous) sources of data are available. The central hypothesis is that, when information from multiple sources of data are used appropriately and effectively, knowledge discovery can be better achieved than what is possible from only a single source. ^ Recent advances in high-throughput technology have enabled biomedical researchers to generate large volumes of diverse types of data on a genome-wide scale. These data include DNA sequences, gene expression measurements, and much more; they provide the motivation for building analysis tools to elucidate the modular organization of the cell. The challenges include efficiently and accurately extracting information from the multiple data sources; representing the information effectively, developing analytical tools, and interpreting the results in the context of the domain. ^ The first part considers the application of feature-level integration to design classifiers that discriminate between soil types. The machine learning tools, SVM and KNN, were used to successfully distinguish between several soil samples. ^ The second part considers clustering using multiple heterogeneous data sources. The resulting Multi-Source Clustering (MSC) algorithm was shown to have a better performance than clustering methods that use only a single data source or a simple feature-level integration of heterogeneous data sources. ^ The third part proposes a new approach to effectively incorporate incomplete data into clustering analysis. Adapted from K-means algorithm, the Generalized Constrained Clustering (GCC) algorithm makes use of incomplete data in the form of constraints to perform exploratory analysis. Novel approaches for extracting constraints were proposed. For sufficiently large constraint sets, the GCC algorithm outperformed the MSC algorithm. ^ The last part considers the problem of providing a theme-specific environment for mining multi-source biomedical data. The database called PlasmoTFBM, focusing on gene regulation of Plasmodium falciparum, contains diverse information and has a simple interface to allow biologists to explore the data. It provided a framework for comparing different analytical tools for predicting regulatory elements and for designing useful data mining tools. ^ The conclusion is that the experiments reported in this dissertation strongly support the central hypothesis.^

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Freeway systems are becoming more congested each day. One contribution to freeway traffic congestion comprises platoons of on-ramp traffic merging into freeway mainlines. As a relatively low-cost countermeasure to the problem, ramp meters are being deployed in both directions of an 11-mile section of I-95 in Miami-Dade County, Florida. The local Fuzzy Logic (FL) ramp metering algorithm implemented in Seattle, Washington, has been selected for deployment. The FL ramp metering algorithm is powered by the Fuzzy Logic Controller (FLC). The FLC depends on a series of parameters that can significantly alter the behavior of the controller, thus affecting the performance of ramp meters. However, the most suitable values for these parameters are often difficult to determine, as they vary with current traffic conditions. Thus, for optimum performance, the parameter values must be fine-tuned. This research presents a new method of fine tuning the FLC parameters using Particle Swarm Optimization (PSO). PSO attempts to optimize several important parameters of the FLC. The objective function of the optimization model incorporates the METANET macroscopic traffic flow model to minimize delay time, subject to the constraints of reasonable ranges of ramp metering rates and FLC parameters. To further improve the performance, a short-term traffic forecasting module using a discrete Kalman filter was incorporated to predict the downstream freeway mainline occupancy. This helps to detect the presence of downstream bottlenecks. The CORSIM microscopic simulation model was selected as the platform to evaluate the performance of the proposed PSO tuning strategy. The ramp-metering algorithm incorporating the tuning strategy was implemented using CORSIM's run-time extension (RTE) and was tested on the aforementioned I-95 corridor. The performance of the FLC with PSO tuning was compared with the performance of the existing FLC without PSO tuning. The results show that the FLC with PSO tuning outperforms the existing FL metering, fixed-time metering, and existing conditions without metering in terms of total travel time savings, average speed, and system-wide throughput.