11 results for Data selection

in Digital Commons at Florida International University


Relevance:

60.00%

Publisher:

Abstract:

Physiological signals, which are controlled by the autonomic nervous system (ANS), could be used to detect the affective state of computer users and therefore find applications in medicine and engineering. The Pupil Diameter (PD) seems to provide a strong indication of the affective state, as found by previous research, but it has not been investigated fully yet.
In this study, new approaches based on monitoring and processing the PD signal for off-line and on-line affective assessment ("relaxation" vs. "stress") are proposed. Wavelet denoising and Kalman filtering methods are first used to remove abrupt changes in the raw PD signal. Then three features (PDmean, PDmax and PDWalsh) are extracted from the preprocessed PD signal for the affective state classification. In order to select more relevant and reliable physiological data for further analysis, two types of data selection methods are applied, which are based on the paired t-test and subject self-evaluation, respectively. In addition, five different kinds of classifiers are implemented on the selected data, which achieve average accuracies up to 86.43% and 87.20%, respectively. Finally, the receiver operating characteristic (ROC) curve is utilized to investigate the discriminating potential of each individual feature by evaluation of the area under the ROC curve, which reaches values above 0.90.
For the on-line affective assessment, a hard threshold is implemented first in order to remove the eye blinks from the PD signal and then a moving average window is utilized to obtain the representative value PDr for every one-second time interval of PD. There are three main steps for the on-line affective assessment algorithm, which are preparation, feature-based decision voting and affective determination. The final results show that the accuracies are 72.30% and 73.55% for the data subsets, which were respectively chosen using two types of data selection methods (paired t-test and subject self-evaluation).
In order to further analyze the efficiency of affective recognition through the PD signal, the Galvanic Skin Response (GSR) was also monitored and processed. The highest affective assessment classification rate obtained from GSR processing is only 63.57% (based on the off-line processing algorithm). The overall results confirm that the PD signal should be considered as one of the most powerful physiological signals to include in future automated real-time affective recognition systems, especially for detecting the "relaxation" vs. "stress" states.
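The on-line preprocessing step mentioned in this abstract (a hard threshold to remove eye blinks, then an average over each one-second interval to obtain PDr) might be sketched roughly as follows. This is only an illustration, not the author's implementation: the threshold value `min_valid_mm`, the sampling rate, the function names, and the use of non-overlapping one-second windows are all assumptions made for the example.

```python
from statistics import mean


def remove_blinks(samples, min_valid_mm=1.0):
    """Drop samples at or below a hard threshold.

    Assumption: blinks appear as zero or near-zero diameter readings,
    and 1.0 mm is a hypothetical cut-off chosen for this sketch.
    """
    return [s for s in samples if s > min_valid_mm]


def representative_values(samples, sample_rate_hz, min_valid_mm=1.0):
    """Return one representative value (PDr) per one-second interval.

    Each interval holds `sample_rate_hz` consecutive samples. Blink
    samples are removed first, then the remaining samples are averaged.
    An interval that contains only blink samples yields None.
    """
    pdr = []
    for start in range(0, len(samples), sample_rate_hz):
        window = remove_blinks(samples[start:start + sample_rate_hz], min_valid_mm)
        pdr.append(mean(window) if window else None)
    return pdr


if __name__ == "__main__":
    # Hypothetical readings in millimetres at 4 samples per second.
    readings = [4.0, 4.2, 0.0, 4.4, 5.0, 0.0, 0.0, 5.2]
    print(representative_values(readings, sample_rate_hz=4))
```

Returning `None` for an all-blink interval keeps the one-value-per-second alignment, which a later voting step could rely on; a real system might interpolate instead.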

Relevance:

60.00%

Publisher:

Abstract:

Physiological signals, which are controlled by the autonomic nervous system (ANS), could be used to detect the affective state of computer users and therefore find applications in medicine and engineering. The Pupil Diameter (PD) seems to provide a strong indication of the affective state, as found by previous research, but it has not been investigated fully yet. In this study, new approaches based on monitoring and processing the PD signal for off-line and on-line affective assessment (“relaxation” vs. “stress”) are proposed. Wavelet denoising and Kalman filtering methods are first used to remove abrupt changes in the raw PD signal. Then three features (PDmean, PDmax and PDWalsh) are extracted from the preprocessed PD signal for the affective state classification. In order to select more relevant and reliable physiological data for further analysis, two types of data selection methods are applied, which are based on the paired t-test and subject self-evaluation, respectively. In addition, five different kinds of classifiers are implemented on the selected data, which achieve average accuracies up to 86.43% and 87.20%, respectively. Finally, the receiver operating characteristic (ROC) curve is utilized to investigate the discriminating potential of each individual feature by evaluation of the area under the ROC curve, which reaches values above 0.90. For the on-line affective assessment, a hard threshold is implemented first in order to remove the eye blinks from the PD signal and then a moving average window is utilized to obtain the representative value PDr for every one-second time interval of PD. There are three main steps for the on-line affective assessment algorithm, which are preparation, feature-based decision voting and affective determination. The final results show that the accuracies are 72.30% and 73.55% for the data subsets, which were respectively chosen using two types of data selection methods (paired t-test and subject self-evaluation). In order to further analyze the efficiency of affective recognition through the PD signal, the Galvanic Skin Response (GSR) was also monitored and processed. The highest affective assessment classification rate obtained from GSR processing is only 63.57% (based on the off-line processing algorithm). The overall results confirm that the PD signal should be considered as one of the most powerful physiological signals to include in future automated real-time affective recognition systems, especially for detecting the “relaxation” vs. “stress” states.

Relevance:

30.00%

Publisher:

Abstract:

Methods for accessing data on the Web have been the focus of active research over the past few years. In this thesis we propose a method for representing Web sites as data sources. We designed a Data Extractor data retrieval solution that allows us to define queries to Web sites and process resulting data sets. Data Extractor is being integrated into the MSemODB heterogeneous database management system. With its help, database queries can be distributed over both local and Web data sources within the MSemODB framework.
Data Extractor treats Web sites as data sources, controlling query execution and data retrieval. It works as an intermediary between the applications and the sites. Data Extractor utilizes a twofold “custom wrapper” approach for information retrieval. Wrappers for the majority of sites are easily built using a powerful and expressive scripting language, while complex cases are processed using Java-based wrappers that utilize a specially designed library of data retrieval, parsing and Web access routines. In addition to wrapper development we thoroughly investigate issues associated with Web site selection, analysis and processing.
Data Extractor is designed to act as a data retrieval server, as well as an embedded data retrieval solution. We also use it to create mobile agents that are shipped over the Internet to the client's computer to perform data retrieval on behalf of the user. This approach allows Data Extractor to distribute and scale well.
This study confirms the feasibility of building custom wrappers for Web sites. This approach provides accuracy of data retrieval, and power and flexibility in handling of complex cases.

Relevance:

30.00%

Publisher:

Abstract:

Microarray technology provides a high-throughput technique to study gene expression. Microarrays can help us diagnose different types of cancers, understand biological processes, assess host responses to drugs and pathogens, find markers for specific diseases, and much more. Microarray experiments generate large amounts of data. Thus, effective data processing and analysis are critical for making reliable inferences from the data.
The first part of the dissertation addresses the problem of finding an optimal set of genes (biomarkers) to classify a set of samples as diseased or normal. Three statistical gene selection methods (GS, GS-NR, and GS-PCA) were developed to identify a set of genes that best differentiate between samples. A comparative study on different classification tools was performed and the best combinations of gene selection and classifiers for multi-class cancer classification were identified. For most of the benchmarking cancer data sets, the gene selection method proposed in this dissertation, GS, outperformed other gene selection methods. The classifiers based on Random Forests, neural network ensembles, and K-nearest neighbor (KNN) showed consistently good performance. A striking commonality among these classifiers is that they all use a committee-based approach, suggesting that ensemble classification methods are superior.
The same biological problem may be studied at different research labs and/or performed using different lab protocols or samples. In such situations, it is important to combine results from these efforts. The second part of the dissertation addresses the problem of pooling the results from different independent experiments to obtain improved results. Four statistical pooling techniques (Fisher inverse chi-square method, Logit method, Stouffer's Z transform method, and Liptak-Stouffer weighted Z-method) were investigated in this dissertation. These pooling techniques were applied to the problem of identifying cell cycle-regulated genes in two different yeast species. As a result, improved sets of cell cycle-regulated genes were identified. The last part of the dissertation explores the effectiveness of wavelet data transforms for the task of clustering. Discrete wavelet transforms, with an appropriate choice of wavelet bases, were shown to be effective in producing clusters that were biologically more meaningful.
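Two of the pooling techniques named in this abstract have short textbook forms, sketched below for independent one-sided p-values. This is a generic illustration of the standard formulas, not the dissertation's code: the function names are hypothetical, and the Logit method is omitted. Passing `weights` to the second function gives the weighted (Liptak-Stouffer) variant.

```python
import math
from statistics import NormalDist


def fisher_combined_p(p_values):
    """Fisher's inverse chi-square method.

    The statistic -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null hypothesis. For an even number of
    degrees of freedom the survival function has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!, so no external library is needed.
    Each p-value must lie strictly between 0 and 1.
    """
    k = len(p_values)
    half = -sum(math.log(p) for p in p_values)  # equals statistic / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


def stouffer_combined_p(p_values, weights=None):
    """Stouffer's Z transform method, optionally weighted.

    Each p-value is mapped to a z-score; the weighted sum of z-scores,
    divided by the square root of the sum of squared weights, is again
    standard normal under the null hypothesis.
    """
    nd = NormalDist()
    if weights is None:
        weights = [1.0] * len(p_values)
    z_scores = [nd.inv_cdf(1.0 - p) for p in p_values]
    combined = sum(w * z for w, z in zip(weights, z_scores))
    combined /= math.sqrt(sum(w * w for w in weights))
    return 1.0 - nd.cdf(combined)


if __name__ == "__main__":
    # Hypothetical p-values for one gene from three independent experiments.
    p = [0.04, 0.10, 0.07]
    print(fisher_combined_p(p), stouffer_combined_p(p))
```

Both functions assume the experiments are independent; correlated experiments would need a different null distribution.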

Relevance:

30.00%

Publisher:

Abstract:

Due to the rapid advances in computing and sensing technologies, enormous amounts of data are being generated every day in various applications. The integration of data mining and data visualization has been widely used to analyze these massive and complex data sets to discover hidden patterns. For both data mining and visualization to be effective, it is important to include the visualization techniques in the mining process and to generate the discovered patterns for a more comprehensive visual view. In this dissertation, four related problems (dimensionality reduction for visualizing high-dimensional datasets, visualization-based clustering evaluation, interactive document mining, and multiple clusterings exploration) are studied to explore the integration of data mining and data visualization. In particular, we 1) propose an efficient feature selection method (reliefF + mRMR) for preprocessing high-dimensional datasets; 2) present DClusterE to integrate cluster validation with user interaction and provide rich visualization tools for users to examine document clustering results from multiple perspectives; 3) design two interactive document summarization systems to involve users' efforts and generate customized summaries from 2D sentence layouts; and 4) propose a new framework which organizes the different input clusterings into a hierarchical tree structure and allows for interactive exploration of multiple clustering solutions.

Relevance:

30.00%

Publisher:

Abstract:

Context: Research suggests internships, mentorship, and specialized school programs positively influence career selection; however, little data exists specific to athletic training. Objective: We identified high school (HS) experiences influencing career choice in college athletic training students (ATS). Design: Our survey included 35 Likert-type close-ended questions, which were reviewed by a panel of faculty and peers to establish content and construct validity. Setting: Participants completed an online questionnaire at their convenience. Participants: 217 college ATS (153 female, 64 male) from a random selection of accredited programs on the east coast. We excluded minors, freshmen, and undecided majors from the study. Informed consent was implied by proceeding to the questionnaire. Data Collection and Analysis: We used descriptive statistics to analyze the data collected via a secure website. Results: Mentors were most influential in the decision of career path (62.4%; n=131/210), with 85.2% (n=138/162) reporting mentors were readily available to answer questions regarding career options and 53.1% (n=86/162) reporting mentors counseled them regarding HS electives. Of participants involved in an internship (41.0%; n=86/210), most developed such opportunities independently (66.3%; n=57/86). Respondents who attended traditional HS suggested providing diverse electives (71.9%; n=133/185), additional internship (53.5%; n=99/185), and mentorship (33.0%; n=61/185) opportunities to effectively educate students regarding career options. Conclusions: College ATS who gained internship experience during HS report the opportunity positively influenced their career selection. Mentors support HS students by offering insight and expertise in guiding students’ career choices. Participants suggested HS afford diverse electives with internship and mentorship opportunities to positively influence interested students towards pursuing a career in athletic training.

Relevance:

30.00%

Publisher:

Abstract:

County jurisdictions in America are increasingly exercising self-government in the provision of public community services through the context of second order federalism. In states exercising this form of contemporary governance, county governments with "reformed" policy-making structures and professional management practices have begun to rival or surpass municipalities in the delivery of local services with regional implications such as environmental protection (Benton 2002, 2003; Marando and Reeves, 1993).
The voter referendum, a form of direct democracy, is an important component of county land preservation and environmental protection governmental policies. The recent growth and success of land preservation voter referendums nationwide reflects an increase in citizen participation in government and their desire to protect vacant land and its natural environment from threats of over-development, urbanization and sprawl, loss of open space and farmland, deterioration of ecosystems, and inadequate park and recreational amenities.
The study's design employs a sequential, mixed method. First, a quantitative approach employs the Heckman two-step model. It is fitted with variables for the non-random sample of 227 voter referendum counties and all non-voter referendum counties in the U.S. from 1988 to 2009. Second, the qualitative data collected from the in-depth investigation of three South Florida county case studies with twelve public administrator interviews is transformed for integration with the quantitative findings. The purpose of the qualitative method is to complement, explain and enrich the statistical analysis of county demographic, socio-economic, terrain, regional, governance and government, political preference, environmentalism, and referendum-specific factors.
The research finds that government factors are significant in terms of the success of land preservation voter referendums; more specifically, the presence of self-government authority (home rule charter), a reformed structure (county administrator/manager or elected executive), and environmental interest groups. In addition, this study concludes that successful counties are often located on the coast, exhibit population and housing growth, and have older and more educated citizens who vote Democratic in presidential elections. The analysis of case study documents and public administrator interviews finds that pragmatic considerations of timing, local politics and networking of regional stakeholders are also important features of success. Further research is suggested utilizing additional public participation, local government and public administration factors.

Relevance:

30.00%

Publisher:

Abstract:

Habitat selection decisions by consumers have the potential to shape ecosystems. Understanding the factors that influence habitat selection is therefore critical to understanding ecosystem function. This is especially true of mesoconsumers because they provide the link between upper and lower trophic levels. We examined the factors influencing microhabitat selection of marine mesoconsumers – juvenile giant shovelnose rays (Glaucostegus typus), reticulate whiprays (Himantura uarnak), and pink whiprays (H. fai) – in a coastal ecosystem with intact predator and prey populations and marked spatial and temporal thermal heterogeneity. Using a combination of belt transects and data on water temperature, tidal height, prey abundance, predator abundance and ray behavior, we found that giant shovelnose rays and reticulate whiprays were most often found resting in nearshore microhabitats, especially at low tidal heights during the warm season. Microhabitat selection did not match predictions derived from distributions of prey. Although at a coarse scale ray distributions appeared to match predictions of behavioral thermoregulation theory, fine-scale examination revealed a mismatch. The selection of the shallow nearshore microhabitat at low tidal heights during periods of high predator abundance (warm season) suggests that this microhabitat may serve as a refuge, although it may come with metabolic costs due to higher temperatures. The results of this study highlight the importance of predators in the habitat selection decisions of mesoconsumers and show that, within thermal gradients, factors such as predation risk must be considered in addition to behavioral thermoregulation to explain habitat selection decisions. Furthermore, increasing water temperatures predicted by climate change may result in complex trade-offs that might have important implications for ecosystem dynamics.

Relevance:

30.00%

Publisher:

Abstract:

The promise of Wireless Sensor Networks (WSNs) is the autonomous collaboration of a collection of sensors to accomplish specific goals that a single sensor cannot. Sensor networking serves a range of applications by providing the raw data on which further analyses and actions are based. Imprecision in the collected data can seriously mislead the decision-making process of sensor-based applications, resulting in ineffectiveness or failure of the application objectives. Because inherent WSN characteristics often spoil the raw sensor readings, many research efforts attempt to improve the accuracy of the corrupted or "dirty" sensor data. The dirty data need to be cleaned or corrected. However, the data cleaning solutions developed so far restrict themselves to static WSNs, where deployed sensors rarely move during operation. Nowadays, many emerging applications relying on WSNs need sensor mobility to enhance application efficiency and usage flexibility. The location of deployed sensors needs to be dynamic. Also, each sensor would function independently and contribute its resources. Sensors mounted on vehicles to monitor traffic conditions are one prospective example. Sensor mobility causes transient changes in network topology and in the correlation among sensor streams. Because they rely on static relationships among sensors, the existing methods for cleaning sensor data in static WSNs are invalid in such mobile scenarios. Therefore, a data cleaning solution that considers sensor movements is needed. This dissertation aims to improve the quality of sensor data by considering the consequences of various trajectory relationships of autonomous mobile sensors in the system. First of all, we address the dynamic network topology due to sensor mobility. The concept of a virtual sensor is presented and used for spatio-temporal selection of neighboring sensors to help in cleaning sensor data streams. This method is one of the first methods to clean data in mobile sensor environments. We also study the mobility pattern of moving sensors relative to boundaries of sub-areas of interest. We developed a belief-based analysis to determine the reliable sets of neighboring sensors to improve the cleaning performance, especially when node density is relatively low. Finally, we design a novel sketch-based technique to clean data from internal sensors where spatio-temporal relationships among sensors cannot lead to data correlations among sensor streams.

Relevance:

30.00%

Publisher:

Abstract:

Methods for accessing data on the Web have been the focus of active research over the past few years. In this thesis we propose a method for representing Web sites as data sources. We designed a Data Extractor data retrieval solution that allows us to define queries to Web sites and process resulting data sets. Data Extractor is being integrated into the MSemODB heterogeneous database management system. With its help, database queries can be distributed over both local and Web data sources within the MSemODB framework. Data Extractor treats Web sites as data sources, controlling query execution and data retrieval. It works as an intermediary between the applications and the sites. Data Extractor utilizes a twofold "custom wrapper" approach for information retrieval. Wrappers for the majority of sites are easily built using a powerful and expressive scripting language, while complex cases are processed using Java-based wrappers that utilize a specially designed library of data retrieval, parsing and Web access routines. In addition to wrapper development we thoroughly investigate issues associated with Web site selection, analysis and processing. Data Extractor is designed to act as a data retrieval server, as well as an embedded data retrieval solution. We also use it to create mobile agents that are shipped over the Internet to the client's computer to perform data retrieval on behalf of the user. This approach allows Data Extractor to distribute and scale well. This study confirms the feasibility of building custom wrappers for Web sites. This approach provides accuracy of data retrieval, and power and flexibility in handling of complex cases.

Relevance:

30.00%

Publisher:

Abstract:

County jurisdictions in America are increasingly exercising self-government in the provision of public community services through the context of second order federalism. In states exercising this form of contemporary governance, county governments with “reformed” policy-making structures and professional management practices have begun to rival or surpass municipalities in the delivery of local services with regional implications such as environmental protection (Benton 2002, 2003; Marando and Reeves, 1993). The voter referendum, a form of direct democracy, is an important component of county land preservation and environmental protection governmental policies. The recent growth and success of land preservation voter referendums nationwide reflects an increase in citizen participation in government and their desire to protect vacant land and its natural environment from threats of over-development, urbanization and sprawl, loss of open space and farmland, deterioration of ecosystems, and inadequate park and recreational amenities. The study’s design employs a sequential, mixed method. First, a quantitative approach employs the Heckman two-step model. It is fitted with variables for the non-random sample of 227 voter referendum counties and all non-voter referendum counties in the U.S. from 1988 to 2009. Second, the qualitative data collected from the in-depth investigation of three South Florida county case studies with twelve public administrator interviews is transformed for integration with the quantitative findings. The purpose of the qualitative method is to complement, explain and enrich the statistical analysis of county demographic, socio-economic, terrain, regional, governance and government, political preference, environmentalism, and referendum-specific factors. The research finds that government factors are significant in terms of the success of land preservation voter referendums; more specifically, the presence of self-government authority (home rule charter), a reformed structure (county administrator/manager or elected executive), and environmental interest groups. In addition, this study concludes that successful counties are often located on the coast, exhibit population and housing growth, and have older and more educated citizens who vote Democratic in presidential elections. The analysis of case study documents and public administrator interviews finds that pragmatic considerations of timing, local politics and networking of regional stakeholders are also important features of success. Further research is suggested utilizing additional public participation, local government and public administration factors.