8 resultados para frequency based knowledge discovery
em Digital Commons at Florida International University
Resumo:
The primary aim of this dissertation is to develop data mining tools for knowledge discovery in biomedical data when multiple (homogeneous or heterogeneous) sources of data are available. The central hypothesis is that, when information from multiple sources of data are used appropriately and effectively, knowledge discovery can be better achieved than what is possible from only a single source. ^ Recent advances in high-throughput technology have enabled biomedical researchers to generate large volumes of diverse types of data on a genome-wide scale. These data include DNA sequences, gene expression measurements, and much more; they provide the motivation for building analysis tools to elucidate the modular organization of the cell. The challenges include efficiently and accurately extracting information from the multiple data sources; representing the information effectively, developing analytical tools, and interpreting the results in the context of the domain. ^ The first part considers the application of feature-level integration to design classifiers that discriminate between soil types. The machine learning tools, SVM and KNN, were used to successfully distinguish between several soil samples. ^ The second part considers clustering using multiple heterogeneous data sources. The resulting Multi-Source Clustering (MSC) algorithm was shown to have a better performance than clustering methods that use only a single data source or a simple feature-level integration of heterogeneous data sources. ^ The third part proposes a new approach to effectively incorporate incomplete data into clustering analysis. Adapted from K-means algorithm, the Generalized Constrained Clustering (GCC) algorithm makes use of incomplete data in the form of constraints to perform exploratory analysis. Novel approaches for extracting constraints were proposed. For sufficiently large constraint sets, the GCC algorithm outperformed the MSC algorithm. ^ The last part considers the problem of providing a theme-specific environment for mining multi-source biomedical data. The database called PlasmoTFBM, focusing on gene regulation of Plasmodium falciparum, contains diverse information and has a simple interface to allow biologists to explore the data. It provided a framework for comparing different analytical tools for predicting regulatory elements and for designing useful data mining tools. ^ The conclusion is that the experiments reported in this dissertation strongly support the central hypothesis.^
Resumo:
High efficiency of power converters placed between renewable energy sources and the utility grid is required to maximize the utilization of these sources. Power quality is another aspect that requires large passive elements (inductors, capacitors) to be placed between these sources and the grid. The main objective is to develop higher-level high frequency-based power converter system (HFPCS) that optimizes the use of hybrid renewable power injected into the power grid. The HFPCS provides high efficiency, reduced size of passive components, higher levels of power density realization, lower harmonic distortion, higher reliability, and lower cost. The dynamic modeling for each part in this system is developed, simulated and tested. The steady-state performance of the grid-connected hybrid power system with battery storage is analyzed. Various types of simulations were performed and a number of algorithms were developed and tested to verify the effectiveness of the power conversion topologies. A modified hysteresis-control strategy for the rectifier and the battery charging/discharging system was developed and implemented. A voltage oriented control (VOC) scheme was developed to control the energy injected into the grid. The developed HFPCS was compared experimentally with other currently available power converters. The developed HFPCS was employed inside a microgrid system infrastructure, connecting it to the power grid to verify its power transfer capabilities and grid connectivity. Grid connectivity tests verified these power transfer capabilities of the developed converter in addition to its ability of serving the load in a shared manner. In order to investigate the performance of the developed system, an experimental setup for the HF-based hybrid generation system was constructed. We designed a board containing a digital signal processor chip on which the developed control system was embedded. The board was fabricated and experimentally tested. The system's high precision requirements were verified. Each component of the system was built and tested separately, and then the whole system was connected and tested. The simulation and experimental results confirm the effectiveness of the developed converter system for grid-connected hybrid renewable energy systems as well as for hybrid electric vehicles and other industrial applications.
Resumo:
Today, databases have become an integral part of information systems. In the past two decades, we have seen different database systems being developed independently and used in different applications domains. Today's interconnected networks and advanced applications, such as data warehousing, data mining & knowledge discovery and intelligent data access to information on the Web, have created a need for integrated access to such heterogeneous, autonomous, distributed database systems. Heterogeneous/multidatabase research has focused on this issue resulting in many different approaches. However, a single, generally accepted methodology in academia or industry has not emerged providing ubiquitous intelligent data access from heterogeneous, autonomous, distributed information sources. ^ This thesis describes a heterogeneous database system being developed at High-performance Database Research Center (HPDRC). A major impediment to ubiquitous deployment of multidatabase technology is the difficulty in resolving semantic heterogeneity. That is, identifying related information sources for integration and querying purposes. Our approach considers the semantics of the meta-data constructs in resolving this issue. The major contributions of the thesis work include: (i) providing a scalable, easy-to-implement architecture for developing a heterogeneous multidatabase system, utilizing Semantic Binary Object-oriented Data Model (Sem-ODM) and Semantic SQL query language to capture the semantics of the data sources being integrated and to provide an easy-to-use query facility; (ii) a methodology for semantic heterogeneity resolution by investigating into the extents of the meta-data constructs of component schemas. This methodology is shown to be correct, complete and unambiguous; (iii) a semi-automated technique for identifying semantic relations, which is the basis of semantic knowledge for integration and querying, using shared ontologies for context-mediation; (iv) resolutions for schematic conflicts and a language for defining global views from a set of component Sem-ODM schemas; (v) design of a knowledge base for storing and manipulating meta-data and knowledge acquired during the integration process. This knowledge base acts as the interface between integration and query processing modules; (vi) techniques for Semantic SQL query processing and optimization based on semantic knowledge in a heterogeneous database environment; and (vii) a framework for intelligent computing and communication on the Internet applying the concepts of our work. ^
Resumo:
With the proliferation of multimedia data and ever-growing requests for multimedia applications, there is an increasing need for efficient and effective indexing, storage and retrieval of multimedia data, such as graphics, images, animation, video, audio and text. Due to the special characteristics of the multimedia data, the Multimedia Database management Systems (MMDBMSs) have emerged and attracted great research attention in recent years. Though much research effort has been devoted to this area, it is still far from maturity and there exist many open issues. In this dissertation, with the focus of addressing three of the essential challenges in developing the MMDBMS, namely, semantic gap, perception subjectivity and data organization, a systematic and integrated framework is proposed with video database and image database serving as the testbed. In particular, the framework addresses these challenges separately yet coherently from three main aspects of a MMDBMS: multimedia data representation, indexing and retrieval. In terms of multimedia data representation, the key to address the semantic gap issue is to intelligently and automatically model the mid-level representation and/or semi-semantic descriptors besides the extraction of the low-level media features. The data organization challenge is mainly addressed by the aspect of media indexing where various levels of indexing are required to support the diverse query requirements. In particular, the focus of this study is to facilitate the high-level video indexing by proposing a multimodal event mining framework associated with temporal knowledge discovery approaches. With respect to the perception subjectivity issue, advanced techniques are proposed to support users' interaction and to effectively model users' perception from the feedback at both the image-level and object-level.
Resumo:
There is growing popularity in the use of composite indices and rankings for cross-organizational benchmarking. However, little attention has been paid to alternative methods and procedures for the computation of these indices and how the use of such methods may impact the resulting indices and rankings. This dissertation developed an approach for assessing composite indices and rankings based on the integration of a number of methods for aggregation, data transformation and attribute weighting involved in their computation. The integrated model developed is based on the simulation of composite indices using methods and procedures proposed in the area of multi-criteria decision making (MCDM) and knowledge discovery in databases (KDD). The approach developed in this dissertation was automated through an IT artifact that was designed, developed and evaluated based on the framework and guidelines of the design science paradigm of information systems research. This artifact dynamically generates multiple versions of indices and rankings by considering different methodological scenarios according to user specified parameters. The computerized implementation was done in Visual Basic for Excel 2007. Using different performance measures, the artifact produces a number of excel outputs for the comparison and assessment of the indices and rankings. In order to evaluate the efficacy of the artifact and its underlying approach, a full empirical analysis was conducted using the World Bank's Doing Business database for the year 2010, which includes ten sub-indices (each corresponding to different areas of the business environment and regulation) for 183 countries. The output results, which were obtained using 115 methodological scenarios for the assessment of this index and its ten sub-indices, indicated that the variability of the component indicators considered in each case influenced the sensitivity of the rankings to the methodological choices. Overall, the results of our multi-method assessment were consistent with the World Bank rankings except in cases where the indices involved cost indicators measured in per capita income which yielded more sensitive results. Low income level countries exhibited more sensitivity in their rankings and less agreement between the benchmark rankings and our multi-method based rankings than higher income country groups.
Resumo:
Today, databases have become an integral part of information systems. In the past two decades, we have seen different database systems being developed independently and used in different applications domains. Today's interconnected networks and advanced applications, such as data warehousing, data mining & knowledge discovery and intelligent data access to information on the Web, have created a need for integrated access to such heterogeneous, autonomous, distributed database systems. Heterogeneous/multidatabase research has focused on this issue resulting in many different approaches. However, a single, generally accepted methodology in academia or industry has not emerged providing ubiquitous intelligent data access from heterogeneous, autonomous, distributed information sources. This thesis describes a heterogeneous database system being developed at Highperformance Database Research Center (HPDRC). A major impediment to ubiquitous deployment of multidatabase technology is the difficulty in resolving semantic heterogeneity. That is, identifying related information sources for integration and querying purposes. Our approach considers the semantics of the meta-data constructs in resolving this issue. The major contributions of the thesis work include: (i.) providing a scalable, easy-to-implement architecture for developing a heterogeneous multidatabase system, utilizing Semantic Binary Object-oriented Data Model (Sem-ODM) and Semantic SQL query language to capture the semantics of the data sources being integrated and to provide an easy-to-use query facility; (ii.) a methodology for semantic heterogeneity resolution by investigating into the extents of the meta-data constructs of component schemas. This methodology is shown to be correct, complete and unambiguous; (iii.) a semi-automated technique for identifying semantic relations, which is the basis of semantic knowledge for integration and querying, using shared ontologies for context-mediation; (iv.) resolutions for schematic conflicts and a language for defining global views from a set of component Sem-ODM schemas; (v.) design of a knowledge base for storing and manipulating meta-data and knowledge acquired during the integration process. This knowledge base acts as the interface between integration and query processing modules; (vi.) techniques for Semantic SQL query processing and optimization based on semantic knowledge in a heterogeneous database environment; and (vii.) a framework for intelligent computing and communication on the Internet applying the concepts of our work.
Resumo:
With the advantages and popularity of Permanent Magnet (PM) motors due to their high power density, there is an increasing incentive to use them in variety of applications including electric actuation. These applications have strict noise emission standards. The generation of audible noise and associated vibration modes are characteristics of all electric motors, it is especially problematic in low speed sensorless control rotary actuation applications using high frequency voltage injection technique. This dissertation is aimed at solving the problem of optimizing the sensorless control algorithm for low noise and vibration while achieving at least 12 bit absolute accuracy for speed and position control. The low speed sensorless algorithm is simulated using an improved Phase Variable Model, developed and implemented in a hardware-in-the-loop prototyping environment. Two experimental testbeds were developed and built to test and verify the algorithm in real time.^ A neural network based modeling approach was used to predict the audible noise due to the high frequency injected carrier signal. This model was created based on noise measurements in an especially built chamber. The developed noise model is then integrated into the high frequency based sensorless control scheme so that appropriate tradeoffs and mitigation techniques can be devised. This will improve the position estimation and control performance while keeping the noise below a certain level. Genetic algorithms were used for including the noise optimization parameters into the developed control algorithm.^ A novel wavelet based filtering approach was proposed in this dissertation for the sensorless control algorithm at low speed. This novel filter was capable of extracting the position information at low values of injection voltage where conventional filters fail. This filtering approach can be used in practice to reduce the injected voltage in sensorless control algorithm resulting in significant reduction of noise and vibration.^ Online optimization of sensorless position estimation algorithm was performed to reduce vibration and to improve the position estimation performance. The results obtained are important and represent original contributions that can be helpful in choosing optimal parameters for sensorless control algorithm in many practical applications.^
Resumo:
This study tests Ogbu and Simons' Cultural-Ecological Theory of School Performance using data from the Progress in International Reading Literacy Study of 2001 (PIRLS), a large-scale international survey and reading assessment involving fourth grade students from 35 countries, including the United States. This theory argues that Black immigrant students outperform their non-immigrant counterparts, academically, and that achievement differences are attributed to stronger educational commitment in Black immigrant families. Four hypotheses are formulated to test this theory: Black immigrant students have (a) more receptive attitudes toward reading; (b) a more positive reading self-concept; and (c) a higher level of reading literacy. Furthermore, (d) the relationship of immigrant status to reading perceptions and literacy persists after including selected predictors. These hypotheses are tested separately for girls and boys, while also examining immigrant students' generational status (i.e., foreign-born or second-generation). ^ PIRLS data from a subset of Black students (N=525) in the larger U.S. sample of 3,763 are analyzed to test the hypotheses, using analysis of variance, correlation and multiple regression techniques. Findings reveal that hypotheses a and b are not confirmed (contradicting the Cultural-Ecological Theory) and c and d are partially supported (lending partial support to the theory). Specifically, immigrant and non-immigrant students did not differ in attitudes toward reading or reading self-concept; second-generation immigrant boys outperformed both non-immigrant and foreign-born immigrant boys in reading literacy, but no differences were found among girls; and, while being second-generation immigrant had a relatively stronger relationship to reading literacy for boys, among girls, selected socio-cultural predictors, number of books in the home and length of U.S. residence, had relatively stronger relationship to reading self-concept than did immigrant status. This study, therefore, indicates that future research employing the Cultural-Ecological Theory should: (a) take gender and generational status into account (b) identify additional socio-cultural predictors of Black children's academic perceptions and performance; and (c) continue to build on this body of evidence-based knowledge to better inform educational policy and school personnel in addressing needs of all children. ^