5 results for Knowledge Discovery Tools

at Brock University, Canada


Relevance:

80.00%

Publisher:

Abstract:

The curse of dimensionality is a major problem in machine learning, data mining and knowledge discovery. Exhaustive search for the optimal subset of relevant features in a high-dimensional dataset is NP-hard. Suboptimal population-based stochastic algorithms such as genetic programming (GP) and genetic algorithms (GA) are good choices for searching large search spaces, and are usually more feasible than exhaustive and deterministic search algorithms. On the other hand, population-based stochastic algorithms often suffer from premature convergence on mediocre suboptimal solutions. The Age Layered Population Structure (ALPS) is a novel metaheuristic for overcoming premature convergence in evolutionary algorithms and for improving search in the fitness landscape. The ALPS paradigm uses an age measure to control breeding and competition between individuals in the population. This thesis uses a modification of the ALPS GP strategy called Feature Selection ALPS (FSALPS) for feature subset selection and classification in varied supervised learning tasks. FSALPS uses a novel frequency count system to rank features in the GP population based on evolved feature frequencies. The ranked features are translated into probabilities, which are used to control evolutionary processes such as terminal-symbol selection for the construction of GP trees and sub-trees. The FSALPS metaheuristic continuously refines the feature subset selection process while simultaneously evolving efficient classifiers through a non-converging evolutionary process that favors selection of features with high discrimination of class labels. We investigated and compared the performance of canonical GP, ALPS and FSALPS on high-dimensional benchmark classification datasets, including a hyperspectral image. Using ANOVA with Tukey’s HSD test at a 95% confidence level, ALPS and FSALPS dominated canonical GP in evolving smaller but efficient trees with less expression bloat. FSALPS significantly outperformed canonical GP, ALPS and some feature selection strategies reported in the related literature on dimensionality reduction.
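The frequency-count idea described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation: the encoding of an individual as a list of feature indices, the additive `smoothing` term and the function names are all assumptions made for illustration.

```python
import random
from collections import Counter


def feature_probabilities(population, n_features, smoothing=1.0):
    """Convert feature usage counts in a population into selection probabilities.

    `population` is a list of individuals, each represented (hypothetically)
    as the list of feature indices used as terminals in its GP tree.
    `smoothing` is an assumption of this sketch: it keeps features that are
    currently unused selectable, so they are never ruled out completely.
    """
    counts = Counter(f for individual in population for f in individual)
    weights = [counts.get(i, 0) + smoothing for i in range(n_features)]
    total = sum(weights)
    return [w / total for w in weights]


def pick_terminal(probabilities, rng=random):
    """Sample one feature index, biased toward frequently evolved features."""
    indices = range(len(probabilities))
    return rng.choices(indices, weights=probabilities, k=1)[0]


# Toy population of three individuals over five features (made-up data).
population = [[0, 2, 2], [2, 3], [0, 2]]
probs = feature_probabilities(population, n_features=5)
terminal = pick_terminal(probs)
```

In this toy population feature 2 appears most often, so it receives the largest probability and is the most likely terminal when a new tree or sub-tree is built.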

Relevance:

80.00%

Publisher:

Abstract:

Feature selection plays an important role in knowledge discovery and data mining. In traditional rough set theory, feature selection using a reduct (the minimal discerning set of attributes) is an important area. Nevertheless, the original definition of a reduct is restrictive, so earlier research proposed taking into account not only the horizontal reduction of information by feature selection, but also a vertical reduction that considers suitable subsets of the original set of objects. Following that work, a new approach to generating bireducts using a multi-objective genetic algorithm is proposed. Although genetic algorithms have been used to calculate reducts in some previous works, we did not find any work in which genetic algorithms were adopted to calculate bireducts. Compared to earlier work in this area, the proposed method has less randomness in generating bireducts. The genetic algorithm system estimated the quality of each bireduct by the values of two objective functions as evolution progressed, so that a set of bireducts with optimized values of these objectives was obtained. Different fitness evaluation methods and genetic operators, such as crossover and mutation, were applied and the prediction accuracies were compared. Five datasets were used to test the proposed method and two datasets were used to perform a comparison study. Statistical analysis using the one-way ANOVA test was performed to determine whether the differences between the results were significant. The experiment showed that the proposed method was able to reduce the number of bireducts needed to achieve good prediction accuracy. The influence of different genetic operators and fitness evaluation strategies on the prediction accuracy was also analyzed. The prediction accuracies of the proposed method were shown to be comparable with the best results in the machine learning literature, and in some cases exceeded them.
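A small Python sketch may help make the bireduct idea concrete. It rests on assumptions the abstract does not confirm: the two objectives are taken to be the number of attributes (to minimize) and the number of covered objects (to maximize), and the function names and toy decision table are hypothetical.

```python
from itertools import combinations


def is_consistent(table, decisions, attrs, objects):
    """Check that `attrs` discerns every pair of `objects` with different decisions.

    `table[i][a]` is the value of attribute `a` for object `i`, and
    `decisions[i]` is the decision class of object `i`.
    """
    for i, j in combinations(sorted(objects), 2):
        same_on_attrs = all(table[i][a] == table[j][a] for a in attrs)
        if decisions[i] != decisions[j] and same_on_attrs:
            return False
    return True


def objectives(attrs, objects):
    """Assumed objective pair: (attribute count to minimize, object count to maximize)."""
    return (len(attrs), len(objects))


def dominates(p, q):
    """Pareto dominance for (minimize first component, maximize second)."""
    return p[0] <= q[0] and p[1] >= q[1] and p != q


# Toy decision table (made-up data): objects 0 and 3 agree on every
# attribute but have different decisions, so no attribute subset can
# cover both of them at once.
table = [[0, 1], [0, 0], [1, 1], [0, 1]]
decisions = [0, 1, 1, 1]

covers_three = is_consistent(table, decisions, {0, 1}, {0, 1, 2})
covers_all = is_consistent(table, decisions, {0, 1}, {0, 1, 2, 3})
```

Here `covers_three` is true and `covers_all` is false, which is the vertical reduction at work: dropping one conflicting object lets the attribute subset stay consistent on the rest. A multi-objective genetic algorithm would then compare candidate pairs with a dominance test such as `dominates`.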

Relevance:

30.00%

Publisher:

Abstract:

There continues to be a shortage of health professionals interested in providing care for older adults. Part of the problem seems to stem from negative perceptions of geriatrics as a clinical speciality. This study examines the knowledge, attitudes and career decisions of physical therapy students in Ontario before and after an educational intervention. Surveys were conducted with 144 physical therapy students from five universities before and after their geriatrics course in order to measure their knowledge, attitudes and interest in working with older adults. The incoming class of physical therapy students (n = 186) acted as control subjects for the study. The Revised Palmore Facts on Aging Quiz measured the students' knowledge of aging (Miller & Dodder, 1980). The Revised Tuckman-Lorge (Axelrod & Eisdorfer, 1961) and the Kogan Old People Scales (Kogan, 1961) were used to examine attitude. An environmental scale was developed based on the work of Snape (1986) to measure the impact of working conditions on the students' career choices. A 10-point Likert-type scale based on the work of Michielutte & Diseker (1985) was modified and used to measure career interest in working with the elderly. On independent-sample t-tests, positive attitudes were related to the demographic characteristic of gender; ethnicity was negatively related; and marital status was found to be unrelated to attitude (p < .05). Having a relationship with an older adult and taking courses in gerontology were also found to be positively related to attitude (p < .05). Results from a between-subjects design that compared students before and after the course showed that knowledge scores improved from pretest to posttest (p < .05). In general, attitude scores improved from T1 to T2 on both measurement tools (p < .05). The environmental and vocational interest scales yielded statistically significant differences between the control and experimental groups during the intervention period (p < .05). The results of this research indicated that knowledge and attitudes improve after an educational intervention; however, there was little impact on the students' overall career decisions. Further research is needed to examine the complex relationship between attitude and behaviour and its impact on students' career choices. In addition, the impact of the geriatric clinical environment on students' attitudes and career decisions needs to be further explored.

Relevance:

30.00%

Publisher:

Abstract:

Moemen Abdalla, HCC Biomarkers
Hepatocellular carcinoma (HCC) is a major healthcare problem, representing the third most common cause of cancer-related mortality worldwide. Chronic infections with hepatitis B virus (HBV) and/or hepatitis C virus (HCV) are the major risk factors for the development of HCC. The incidence of HBV-associated HCC is in decline as a result of an effective HBV vaccine; however, since an equally effective HCV vaccine has not yet been developed, there are 130 million HCV infected patients worldwide who are at high risk of developing HCC. Because reliable parameters and tools for the early detection of HCC among high-risk individuals are severely lacking, HCC patients are always diagnosed at a late stage, when surgical solutions or effective treatment are not possible. Using urine as a non-invasive sample source, two approaches (proteomic-based and genomic-based) were pursued with the common goal of discovering potential biomarker candidates for the early detection of HCC among high-risk chronic HCV infected patients. Urine was collected from 106 HCV infected Egyptian patients, 32 of whom had already developed HCC and 74 of whom were diagnosed as HCC-free at the time of initial sample collection. Urine samples were also collected from 12 healthy control individuals. Total urinary proteins, trans-renal nucleic acid (Tr-NA) and microRNA (miRNA) were isolated from urine using novel methodologies and silicon carbide-loaded spin columns.
In the first, proteomic-based approach, liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) was used to identify potential candidates from pooled urine samples. This was followed by validation of the relative expression levels of proteins present in urine among all the patients using quantitative real-time PCR (qRT-PCR). This approach revealed that significant over-expression of three proteins, DJ-1, Chromatin Assembly Factor-1 (CAF-1) and Heat Shock Protein 60 (HSP60), was characteristic of HCV infected patients who had developed HCC. As a single HCC biomarker, CAF-1 over-expression identified HCC among HCV infected patients with a specificity of 90%, a sensitivity of 66% and an overall diagnostic accuracy of 78%. Moreover, the CAF-1/HSP60 tandem identified HCC among HCV infected patients with a specificity of 92%, a sensitivity of 61% and an overall diagnostic accuracy of 77%.
In the second, genomic-based approach, two different strategies were pursued. The first was miRNA-based. The expression levels of miRNAs isolated from urine were studied using the Illumina MicroRNA Expression Profiling Assay. This was followed by qRT-PCR-based validation of the deregulated expression of the identified miRNA candidates among all the patients. This strategy shed light on the deregulated expression of a number of miRNAs that may either have a role in the development of HCC among HCV infected patients (i.e. miR-640, miR-765, miR-200a, miR-521 and miR-520) or allow a better understanding of the viral-host interaction (miR-152, miR-486, miR-219, miR-452, miR-425, miR-154 and miR-31). Moreover, the deregulated expression of both miR-618 and miR-650 appeared to be a common event among HCV infected patients who had developed HCC. The search for putative targets of these two miRNAs suggested that miR-618 may be a potent oncogene, as it targets the tumor-suppressor gene low density lipoprotein-related protein 12 (LRP12), while miR-650 may be a potent tumor-suppressor gene, as it is thought to downregulate the TNF receptor-associated factor-4 (TRAF4) oncogene. The specificity of the miR-618 and miR-650 deregulated expression patterns for the early detection of HCC among HCV infected patients was 68% and 58%, respectively, whereas the sensitivity was 64% and 72%, respectively. When the deregulated expression of both miRNAs was combined as a tandem biomarker, the specificity and the sensitivity were 75% and 58%, respectively.
In the second, trans-renal nucleic acid-based strategy, urinary apoptotic nucleic acid (uaNA) levels of 70 ng/mL or more were found to be a good predictor of HCC among chronic HCV infected patients. The specificity and the sensitivity of this diagnostic approach were 76% and 86%, respectively, with an overall diagnostic value of 81%. The uaNA levels correlated positively with HCC disease progression as monitored by epigenetic changes in a panel of eight tumor-suppressor genes (TSGs) using methylation-sensitive PCR. Moreover, the pairing of high uaNA levels (≥ 70 ng/mL) and CAF-1 over-expression produced a highly specific (100%) multiple-marker HCC biomarker with an acceptable sensitivity of 64% and a diagnostic accuracy of 82%. In comparison, high uaNA levels (≥ 70 ng/mL) in tandem with HSP60 over-expression were less specific (89%) but more sensitive (72%), resulting in a diagnostic accuracy of 64%. The specificities of miR-650 deregulated expression in combination with either high uaNA content or HSP60 over-expression were 82% and 79%, respectively, whereas the sensitivities of these combinations were 64% and 58%, respectively. The potential biomarkers identified in this study compare favorably with the diagnostic accuracy of the α-fetoprotein levels test, which has a specificity of 75%, a sensitivity of 68% and an overall diagnostic accuracy of 70%.
Here we present a study that shows the significance of using urine as a non-invasive sample source for the identification of promising HCC biomarkers. We have also introduced new techniques for the isolation of different urinary macromolecules, especially miRNA, from urine. Furthermore, we strongly recommend the potential biomarkers identified in this study as focal points for future research on HCC diagnosis. A larger testing pool will determine whether their use is practical for mass population screening. This explorative study identified potential targets that merit further investigation for the development of diagnostically accurate biomarkers isolated from 1-2 mL urine samples acquired in a non-invasive manner.
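The sensitivity, specificity and accuracy figures quoted above follow the standard confusion-matrix definitions, which the Python sketch below computes. The patient labels are made-up toy data, not values from the study, and the AND rule used to pair two markers is an assumption: the abstract does not state how tandem biomarkers were combined.

```python
def confusion_counts(truth, predicted):
    """Count true/false positives and negatives from parallel boolean lists."""
    tp = sum(t and p for t, p in zip(truth, predicted))
    fn = sum(t and not p for t, p in zip(truth, predicted))
    tn = sum(not t and not p for t, p in zip(truth, predicted))
    fp = sum(not t and p for t, p in zip(truth, predicted))
    return tp, fn, tn, fp


def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)   # fraction of diseased cases detected
    specificity = tn / (tn + fp)   # fraction of disease-free cases cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


def combine_and(marker_a, marker_b):
    """Assumed pairing rule: the tandem is positive only if both markers are."""
    return [a and b for a, b in zip(marker_a, marker_b)]


# Toy data: four diseased and four disease-free patients (hypothetical).
truth = [True, True, True, True, False, False, False, False]
marker_a = [True, True, True, False, True, False, False, False]
marker_b = [True, True, False, True, False, True, False, False]

single = diagnostic_metrics(*confusion_counts(truth, marker_a))
tandem = diagnostic_metrics(*confusion_counts(truth, combine_and(marker_a, marker_b)))
```

Under an AND rule the tandem can only lose positives, so specificity rises while sensitivity falls. That is the same direction of trade-off as the uaNA and CAF-1 pairing reported in the abstract.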