7 results for Knowledge organization
in Digital Commons at Florida International University
Abstract:
The primary aim of this dissertation is to develop data mining tools for knowledge discovery in biomedical data when multiple (homogeneous or heterogeneous) sources of data are available. The central hypothesis is that, when information from multiple sources of data is used appropriately and effectively, knowledge discovery can be achieved better than is possible from a single source alone.
Recent advances in high-throughput technology have enabled biomedical researchers to generate large volumes of diverse types of data on a genome-wide scale. These data include DNA sequences, gene expression measurements, and much more; they provide the motivation for building analysis tools to elucidate the modular organization of the cell. The challenges include efficiently and accurately extracting information from the multiple data sources, representing the information effectively, developing analytical tools, and interpreting the results in the context of the domain.
The first part considers the application of feature-level integration to design classifiers that discriminate between soil types. The machine learning tools SVM and KNN were used to successfully distinguish between several soil samples.
The second part considers clustering using multiple heterogeneous data sources. The resulting Multi-Source Clustering (MSC) algorithm was shown to perform better than clustering methods that use only a single data source or a simple feature-level integration of heterogeneous data sources.
The third part proposes a new approach to effectively incorporate incomplete data into clustering analysis. Adapted from the K-means algorithm, the Generalized Constrained Clustering (GCC) algorithm makes use of incomplete data in the form of constraints to perform exploratory analysis. Novel approaches for extracting constraints were proposed. For sufficiently large constraint sets, the GCC algorithm outperformed the MSC algorithm.
The last part considers the problem of providing a theme-specific environment for mining multi-source biomedical data. The PlasmoTFBM database, which focuses on gene regulation in Plasmodium falciparum, contains diverse information and offers a simple interface that lets biologists explore the data. It provided a framework for comparing different analytical tools for predicting regulatory elements and for designing useful data mining tools.
The conclusion is that the experiments reported in this dissertation strongly support the central hypothesis.
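The abstract does not describe how the GCC algorithm works internally. As a rough illustration of how pairwise constraints can steer a K-means-style assignment step, the Python sketch below follows the general COP-KMeans idea (must-link and cannot-link pairs). It is not the dissertation's method; the function names, the pair-based constraint format, and the toy data are assumptions made for illustration only.

```python
import math
import random


def violates(i, c, labels, must_link, cannot_link):
    """Return True if putting point i in cluster c breaks a constraint
    against a point that has already been assigned in this pass."""
    for a, b in must_link:
        other = b if a == i else a if b == i else None
        if other is not None and labels[other] is not None and labels[other] != c:
            return True
    for a, b in cannot_link:
        other = b if a == i else a if b == i else None
        if other is not None and labels[other] == c:
            return True
    return False


def constrained_kmeans(points, k, must_link=(), cannot_link=(), iters=20, seed=0):
    """COP-KMeans-style sketch (hypothetical helper, not the GCC algorithm).

    points: list of equal-length tuples of floats.
    must_link / cannot_link: iterables of (i, j) index pairs.
    Returns a list of cluster labels, or None if a point has no feasible cluster.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = None
    for _ in range(iters):
        new_labels = [None] * len(points)
        for i, p in enumerate(points):
            # Try clusters from the nearest center to the farthest.
            order = sorted(range(k), key=lambda c: math.dist(p, centers[c]))
            for c in order:
                if not violates(i, c, new_labels, must_link, cannot_link):
                    new_labels[i] = c
                    break
            else:
                return None  # every cluster violates some constraint
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [points[i] for i in range(len(points)) if labels[i] == c]
            if members:
                centers[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    print(constrained_kmeans(data, 2, must_link=[(0, 2)], cannot_link=[(0, 1)]))
```

Because each point is checked against the points assigned earlier in the same pass, any labeling the function returns satisfies every constraint; the price is that the greedy assignment can fail (return `None`) even when a feasible clustering exists.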
Abstract:
With the proliferation of multimedia data and ever-growing requests for multimedia applications, there is an increasing need for efficient and effective indexing, storage, and retrieval of multimedia data, such as graphics, images, animation, video, audio, and text. Because of the special characteristics of multimedia data, Multimedia Database Management Systems (MMDBMSs) have emerged and attracted great research attention in recent years. Although much research effort has been devoted to this area, it is still far from mature and many open issues remain. This dissertation focuses on three of the essential challenges in developing an MMDBMS, namely the semantic gap, perception subjectivity, and data organization, and proposes a systematic and integrated framework with a video database and an image database serving as the testbed. In particular, the framework addresses these challenges separately yet coherently from three main aspects of an MMDBMS: multimedia data representation, indexing, and retrieval. In terms of multimedia data representation, the key to addressing the semantic gap is to intelligently and automatically model mid-level representations and/or semi-semantic descriptors in addition to extracting low-level media features. The data organization challenge is mainly addressed through media indexing, where various levels of indexing are required to support diverse query requirements. In particular, this study facilitates high-level video indexing by proposing a multimodal event mining framework associated with temporal knowledge discovery approaches. With respect to perception subjectivity, advanced techniques are proposed to support users' interaction and to effectively model users' perception from feedback at both the image level and the object level.
Abstract:
In broad terms (including a thief's use of existing credit card, bank, or other accounts), the number of identity fraud victims in the United States ranges from 9 to 10 million per year, or roughly 4% of the US adult population. The average annual theft per stolen identity was estimated at $6,383 in 2006, up approximately 22% from $5,248 in 2003, while estimated total theft increased from $53.2 billion in 2003 to $56.6 billion in 2006. About three million Americans each year fall victim to the worst kind of identity fraud: new account fraud. Names, Social Security numbers, dates of birth, and other data are acquired fraudulently from the issuing organization or from the victim; these data are then used to create fraudulent identity documents. In turn, these are presented to other organizations as evidence of identity, used to open new lines of credit, secure loans, “flip” property, or otherwise turn a profit in a victim's name. This is much more time consuming, and typically more costly, to repair than fraudulent use of existing accounts.
This research borrows from well-established theoretical backgrounds in an effort to answer the question: what is it that makes identity documents credible? Most importantly, identification of the components of credibility draws upon personal construct psychology, the underpinning for the repertory grid technique, a form of structured interviewing that arrives at a description of the interviewee’s constructs on a given topic, such as the credibility of identity documents. This represents a substantial contribution to theory, being the first research to use the repertory grid technique to elicit from experts the mental constructs they use to evaluate the credibility of different types of identity documents reviewed in the course of opening new accounts. The research identified twenty-one characteristics, different subsets of which are present on different types of identity documents. Expert evaluations of these documents in different scenarios suggest that visual characteristics are most important for a physical document, while authenticated personal data are most important for a digital document.
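The dollar figures quoted in this abstract can be cross-checked with a few lines of Python. The sketch assumes that estimated total theft is roughly the average theft per stolen identity multiplied by the number of victims; the abstract does not state this relationship explicitly, and the variable names are illustrative.

```python
# Figures quoted in the abstract.
avg_2003, avg_2006 = 5248, 6383          # average theft per stolen identity (USD)
total_2003, total_2006 = 53.2e9, 56.6e9  # estimated total theft (USD)

# Relative increase in the average theft per stolen identity.
increase = (avg_2006 - avg_2003) / avg_2003

# Implied victim counts, ASSUMING total = average * number of victims.
victims_2003 = total_2003 / avg_2003
victims_2006 = total_2006 / avg_2006

print(f"Increase in average theft: {increase:.1%}")
print(f"Implied victims, 2003: {victims_2003 / 1e6:.1f} million")
print(f"Implied victims, 2006: {victims_2006 / 1e6:.1f} million")
```

Under that assumption, the increase works out to about 21.6%, in line with the "approximately 22%" quoted, and the implied victim counts (about 10.1 million in 2003 and 8.9 million in 2006) are close to the stated range of 9 to 10 million per year.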
Abstract:
In broad terms (including a thief's use of existing credit card, bank, or other accounts), the number of identity fraud victims in the United States ranges from 9 to 10 million per year, or roughly 4% of the US adult population. The average annual theft per stolen identity was estimated at $6,383 in 2006, up approximately 22% from $5,248 in 2003, while estimated total theft increased from $53.2 billion in 2003 to $56.6 billion in 2006. About three million Americans each year fall victim to the worst kind of identity fraud: new account fraud. Names, Social Security numbers, dates of birth, and other data are acquired fraudulently from the issuing organization or from the victim; these data are then used to create fraudulent identity documents. In turn, these are presented to other organizations as evidence of identity, used to open new lines of credit, secure loans, “flip” property, or otherwise turn a profit in a victim's name. This is much more time consuming, and typically more costly, to repair than fraudulent use of existing accounts.
This research borrows from well-established theoretical backgrounds in an effort to answer the question: what is it that makes identity documents credible? Most importantly, identification of the components of credibility draws upon personal construct psychology, the underpinning for the repertory grid technique, a form of structured interviewing that arrives at a description of the interviewee’s constructs on a given topic, such as the credibility of identity documents. This represents a substantial contribution to theory, being the first research to use the repertory grid technique to elicit from experts the mental constructs they use to evaluate the credibility of different types of identity documents reviewed in the course of opening new accounts. The research identified twenty-one characteristics, different subsets of which are present on different types of identity documents. Expert evaluations of these documents in different scenarios suggest that visual characteristics are most important for a physical document, while authenticated personal data are most important for a digital document.
Abstract:
This route planner, funded by the Palm Beach Metropolitan Planning Organization (MPO), is a joint effort by the Florida International University GIS Center and the University of Florida Geomatics Program at the Fort Lauderdale Research and Education Center. It is designed as a planning tool for bicyclists. Assistance was received from the Palm Beach County Bicycle, Greenways, Pedestrian Advisory Committee.
Abstract:
This route planner, funded by the Miami-Dade Metropolitan Planning Organization (MPO), is a joint effort by the Florida International University GIS Center and the University of Florida Geomatics Program at the Fort Lauderdale Research and Education Center. It is designed as a planning tool for bicyclists. Assistance was received from the Miami-Dade County Bicycle/Pedestrian Advisory Committee and various cyclists and transportation professionals.
Abstract:
This route planner, funded by the Broward Metropolitan Planning Organization (MPO), is a joint effort by the Florida International University GIS Center and the University of Florida Geomatics Program at the Fort Lauderdale Research and Education Center. It is designed as a planning tool for bicyclists. Assistance was received from the Broward County Bicycle/Pedestrian Advisory Committee and various cyclists and transportation professionals.