7 results for Data validation
in Digital Commons at Florida International University
Abstract:
With the advent of peer-to-peer networks and, more importantly, sensor networks, the desire to extract useful information from continuous and unbounded streams of data has become more prominent. For example, in tele-health applications, sensor-based data streaming systems are used to continuously and accurately monitor Alzheimer's patients and their surrounding environment. Typically, the requirements of such applications necessitate the cleaning and filtering of continuous, corrupted, and incomplete data streams gathered wirelessly under dynamically varying conditions. Yet existing data stream cleaning and filtering schemes are incapable of capturing the dynamics of the environment while simultaneously suppressing the losses and corruption introduced by uncertain environmental, hardware, and network conditions. Consequently, existing data cleaning and filtering paradigms are being challenged. This dissertation develops novel schemes for cleaning data streams received from a wireless sensor network operating under non-linear and dynamically varying conditions. The study establishes a paradigm for validating spatio-temporal associations among data sources to enhance data cleaning. To reduce the complexity of the validation process, the developed solution maps the requirements of the application onto a geometrical space and identifies the potential sensor nodes of interest. Additionally, this dissertation models a wireless sensor network data reduction system, ascertaining that segregating the data adaptation and prediction processes augments the data reduction rates. The schemes presented in this study are evaluated using simulation and information theory concepts. The results demonstrate that dynamic conditions of the environment are better managed when validation is used for data cleaning. They also show that when a fast-convergent adaptation process is deployed, data reduction rates are significantly improved.
Targeted applications of the developed methodology include machine health monitoring, tele-health, environmental and habitat monitoring, intermodal transportation, and homeland security.
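The prediction-based data reduction described in this abstract can be illustrated as a dual-prediction scheme: sender and receiver run the same predictor, and a reading is transmitted only when the prediction error exceeds a tolerance. This is a minimal generic sketch, not the dissertation's actual adaptation/prediction method; the last-value predictor and the `TOLERANCE` value are assumptions for illustration.

```python
# Hypothetical sketch of dual-prediction data reduction in a sensor
# network: sender and receiver share the same (last-value) predictor,
# and a sample is radioed only when the predictor's error exceeds a
# tolerance. All names and values here are illustrative.

TOLERANCE = 0.5  # acceptable reconstruction error (sensor units)

def reduce_stream(readings, tolerance=TOLERANCE):
    """Return (transmitted, reconstructed) using last-value prediction."""
    transmitted = []    # samples actually sent over the radio
    reconstructed = []  # the receiver's view of the stream
    prediction = None
    for value in readings:
        if prediction is None or abs(value - prediction) > tolerance:
            transmitted.append(value)  # predictor missed: send the sample
            prediction = value         # both sides resynchronize
        reconstructed.append(prediction)
    return transmitted, reconstructed
```

On a slowly varying stream most samples are suppressed while the reconstruction error stays within the tolerance, which is the trade-off data reduction schemes of this kind exploit.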
Abstract:
Distance learning is growing and transforming educational institutions. The increasing use of distance learning by higher education institutions, and particularly community colleges, coupled with the higher level of student attrition in online courses than in traditional classrooms, suggests that increased attention should be paid to factors that affect online student course completion. The purpose of the study was to develop and validate an instrument to predict community college online student course completion based on faculty perceptions, yielding a prediction model of online course completion rates. Social Presence and Media Richness theories were used to develop a theoretically driven measure of online course completion. This research study involved surveying 311 community college faculty who had taught at least one online course in the past 2 years. Email addresses of participating faculty were provided by two South Florida community colleges. Each participant was contacted through email and given a link to an Internet survey. The survey response rate was 63% (192 out of 303 available questionnaires). Data were analyzed through factor analysis, alpha reliability, and multiple regression. The exploratory factor analysis, using principal component analysis with varimax rotation, yielded a four-factor solution that accounted for 48.8% of the variance. Consistent with Social Presence theory, the factors, with their percent of variance in parentheses, were: immediacy (21.2%), technological immediacy (11.0%), online communication and interactivity (10.3%), and intimacy (6.3%). Internal consistency of the four factors was calculated using Cronbach's (1951) alpha, with reliability coefficients ranging between .680 and .828. Multiple regression analysis yielded a model that significantly predicted 11% of the variance of the dependent variable, the percentage of students who completed the online course.
As indicated in the literature (Johnson & Keil, 2002; Newberry, 2002), Media Richness theory appears to be closely related to Social Presence theory. However, elements from this theory did not emerge in the factor analysis.
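The internal-consistency statistic this study reports, Cronbach's alpha, can be computed directly from respondents' item scores. The survey data themselves are not reproduced in the abstract, so the sketch below is a generic stdlib implementation; the toy score matrix in the test is invented for illustration.

```python
# Cronbach's alpha for an n-respondents x k-items score matrix,
# alpha = k/(k-1) * (1 - sum(item variances) / variance(total score)).
# Pure-stdlib sketch; the survey items themselves are hypothetical.
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: list of rows, one row of k item scores per respondent."""
    k = len(item_scores[0])                       # number of items
    columns = list(zip(*item_scores))             # scores grouped per item
    item_var_sum = sum(variance(col) for col in columns)
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)
```

Perfectly covarying items yield alpha = 1.0; items that move independently pull alpha toward 0, which is why the .680 to .828 range above indicates acceptable-to-good internal consistency.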
Abstract:
This dissertation evaluated the feasibility of using commercially available immortalized cell lines to build a tissue-engineered in vitro blood-brain barrier (BBB) co-culture model for preliminary drug development studies. A mouse endothelial cell line and a rat astrocyte cell line purchased from the American Type Culture Collection (ATCC) were the building blocks of the co-culture model. An astrocyte-derived acellular extracellular matrix (aECM) was introduced in the co-culture model to provide a novel in vitro biomimetic basement membrane on which the endothelial cells could form tight junctions. Trans-endothelial electrical resistance (TEER) and solute mass transport studies were used to quantitatively evaluate tight junction formation in the in vitro BBB models. Immunofluorescence microscopy and Western blot analysis were used to qualitatively verify the in vitro expression of occludin, one of the earliest discovered tight junction proteins. Experimental data from a total of 12 experiments conclusively showed that the novel BBB in vitro co-culture model with the astrocyte-derived aECM (CO+aECM) was promising in terms of establishing tight junction formation, as represented by TEER values, transport profiles, and tight junction protein expression, when compared with traditional co-culture (CO) setups and with endothelial cells cultured alone. Experimental data were also found to be comparable with several existing in vitro BBB models built by various methods. An in vitro colorimetric sulforhodamine B (SRB) assay revealed that the co-cultured samples with aECM suffered less cell loss on the basal sides of the insert membranes than traditional co-culture samples. The novel tissue engineering approach using immortalized cell lines with the addition of aECM proved to be a relevant alternative to traditional BBB in vitro modeling.
Abstract:
Due to rapid advances in computing and sensing technologies, enormous amounts of data are being generated every day in various applications. The integration of data mining and data visualization has been widely used to analyze these massive and complex data sets and discover hidden patterns. For both data mining and visualization to be effective, it is important to include visualization techniques in the mining process and to present the discovered patterns in a more comprehensive visual view. In this dissertation, four related problems are studied to explore the integration of data mining and data visualization: dimensionality reduction for visualizing high-dimensional datasets, visualization-based clustering evaluation, interactive document mining, and exploration of multiple clusterings. In particular, we 1) propose an efficient feature selection method (reliefF + mRMR) for preprocessing high-dimensional datasets; 2) present DClusterE, which integrates cluster validation with user interaction and provides rich visualization tools for users to examine document clustering results from multiple perspectives; 3) design two interactive document summarization systems that involve users' efforts and generate customized summaries from 2D sentence layouts; and 4) propose a new framework that organizes different input clusterings into a hierarchical tree structure and allows interactive exploration of multiple clustering solutions.
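The mRMR (minimum-redundancy, maximum-relevance) half of the reliefF + mRMR selector mentioned above greedily picks the feature that is most relevant to the target while being least redundant with the features already chosen. The sketch below substitutes absolute Pearson correlation for the mutual-information criterion mRMR normally uses, purely to stay self-contained; the function names and toy data are illustrative assumptions, not the dissertation's implementation.

```python
# Hypothetical mRMR-style greedy feature selection, with absolute
# Pearson correlation standing in for mutual information.

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

def mrmr_select(features, target, n_select):
    """Greedily pick feature indices: max relevance, min mean redundancy."""
    selected, remaining = [], list(range(len(features)))
    while remaining and len(selected) < n_select:
        def score(i):
            relevance = abs(pearson(features[i], target))
            redundancy = (sum(abs(pearson(features[i], features[j]))
                              for j in selected) / len(selected)
                          if selected else 0.0)
            return relevance - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Given a feature that is an exact duplicate of one already selected, the redundancy penalty drives the selector toward a less correlated feature instead, which is the behavior that distinguishes mRMR from relevance-only ranking.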
Abstract:
Hydrophobicity, as measured by Log P, is an important molecular property related to toxicity and carcinogenicity. With increasing public health concern over the effects of Disinfection By-Products (DBPs), there are considerable benefits in developing Quantitative Structure-Activity Relationship (QSAR) models capable of accurately predicting Log P. In this research, Log P values of 173 DBP compounds in 6 functional classes were used to develop QSAR models by applying 3 molecular descriptors, namely the Energy of the Lowest Unoccupied Molecular Orbital (ELUMO), the Number of Chlorine atoms (NCl), and the Number of Carbon atoms (NC), in Multiple Linear Regression (MLR) analysis. The QSAR models developed were validated based on the Organization for Economic Co-operation and Development (OECD) principles, and the model Applicability Domain (AD) and mechanistic interpretation were explored. Considering the very complex nature of DBPs, the established QSAR models performed very well with respect to goodness-of-fit, robustness, and predictability. The predicted Log P values of DBPs from the QSAR models were found to be significant, with R2 values from 81% to 98%. The Leverage Approach with Williams plots was applied to detect and remove outliers, consequently increasing R2 by approximately 2% to 13% for the different DBP classes. The developed QSAR models were statistically validated for their predictive power by the Leave-One-Out (LOO) and Leave-Many-Out (LMO) cross-validation methods. Finally, Monte Carlo simulation was used to assess the variations and inherent uncertainties in the QSAR models of Log P and to determine the most influential parameters in Log P prediction.
The QSAR models developed in this dissertation will have a broad applicability domain because the research data set covered six of the eight common DBP classes (halogenated alkanes, alkenes, aromatics, aldehydes, ketones, and carboxylic acids), which have been brought to the attention of regulatory agencies in recent years. Furthermore, the QSAR models are suitable for predicting similar DBP compounds within the same applicability domain. The selection and integration of the various methodologies developed in this research may also benefit future research in similar fields.
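The Leave-One-Out validation mentioned above refits the regression with each compound held out in turn and scores the held-out predictions with a predictive Q2 = 1 - PRESS / SS. A minimal one-descriptor sketch follows; the dissertation's models regress Log P on three descriptors (ELUMO, NCl, NC), but the validation loop is the same, and the toy data in the test are invented.

```python
# Minimal sketch of Leave-One-Out (LOO) cross-validation for a
# one-descriptor linear QSAR model. Q2 = 1 - PRESS/SS, where PRESS
# accumulates squared errors on held-out points. Data are invented.

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def loo_q2(xs, ys):
    """Predictive Q2 from leave-one-out refits."""
    my = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]       # hold out compound i
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return 1 - press / sum((y - my) ** 2 for y in ys)
```

Unlike the in-sample R2, Q2 penalizes models that only interpolate their training compounds, which is why LOO/LMO validation is part of the OECD QSAR principles cited above.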
Abstract:
This thesis extended previous research on critical decision making and problem solving by refining and validating a measure designed to assess the use of critical thinking and critical discussion in sociomoral dilemmas. The purpose of this thesis was twofold: 1) to refine the administration of the Critical Thinking Subscale of the CDP so as to elicit more adequate responses, and to refine the coding and scoring procedures for the total measure; and 2) to collect preliminary data on the initial reliabilities of the measure. Subjects consisted of 40 undergraduate students at Florida International University. Results indicate that the use of longer probes on the Critical Thinking Subscale was more effective in eliciting the adequate responses necessary for coding and evaluating the subjects' performance. Analyses of the psychometric properties of the measure consisted of test-retest reliability and inter-rater reliability.
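Inter-rater reliability for categorically coded responses like these is often summarized with Cohen's kappa, which corrects raw agreement for agreement expected by chance. The thesis abstract does not state which coefficient was used, so the sketch below is an illustrative choice, not the study's actual procedure.

```python
# Cohen's kappa for two raters' categorical codes:
# kappa = (p_observed - p_expected) / (1 - p_expected).
# Illustrative only; assumes the raters are not in perfect
# chance-level agreement (p_expected < 1).
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters' code sequences."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)
```

Kappa reaches 1.0 only for perfect agreement and falls toward 0 when raters agree no more often than chance, which makes it a stricter check than percent agreement when coding open-ended critical-thinking responses.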