16 results for gene selection

in Deakin Research Online - Australia


Relevance:

100.00%

Abstract:

Background: Feature selection techniques are critical to the analysis of high-dimensional datasets. This is especially true of gene selection from microarray data, which commonly have an extremely high feature-to-sample ratio. Beyond its essential objectives of reducing data noise, reducing data redundancy, improving sample classification accuracy, and improving model generalization, feature selection also helps biologists focus on the selected genes to further validate their biological hypotheses.
Results: In this paper we describe an improved hybrid system for gene selection. It is based on a recently proposed genetic ensemble (GE) system. To enhance the generalization property of the selected genes or gene subsets and to overcome the overfitting problem of the GE system, we devised a mapping strategy to fuse the goodness information of each gene provided by multiple filtering algorithms. This information is then used in the initialization and mutation operations of the genetic ensemble system.
Conclusion: We used four benchmark microarray datasets (including both binary-class and multi-class classification problems) for proof of concept and model evaluation. The experimental results indicate that the proposed multi-filter enhanced genetic ensemble (MF-GE) system is able to improve sample classification accuracy, generate more compact gene subsets, and converge to the selection results more quickly. The MF-GE system is very flexible, as various combinations of multiple filters and classifiers can be incorporated based on the data characteristics and user preferences.
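
The idea of fusing per-gene goodness scores from several filters can be illustrated with a minimal sketch. The abstract does not specify the mapping strategy, so the rank-averaging rule, the function names, and the toy scores below are assumptions for illustration only, not the MF-GE method itself.

```python
def rank_normalize(scores):
    """Map raw filter scores to [0, 1] by rank (highest score -> 1.0).

    Ties are broken by position; a real implementation might average tied ranks.
    """
    n = len(scores)
    if n == 1:
        return [1.0]
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    for position, gene_index in enumerate(order):
        ranks[gene_index] = position / (n - 1)
    return ranks


def fuse_filter_scores(filter_scores):
    """Average the rank-normalized scores across filters, one value per gene."""
    normalized = [rank_normalize(s) for s in filter_scores]
    n_genes = len(filter_scores[0])
    return [sum(col[g] for col in normalized) / len(normalized)
            for g in range(n_genes)]


# Toy example: three hypothetical filters scoring four genes.
scores = [
    [0.9, 0.1, 0.5, 0.3],   # hypothetical filter A
    [2.0, 0.5, 3.0, 1.0],   # hypothetical filter B
    [10.0, 1.0, 8.0, 2.0],  # hypothetical filter C
]
fused = fuse_filter_scores(scores)
best_gene = max(range(len(fused)), key=lambda g: fused[g])
```

Rank normalization is used here because filters report scores on different scales; a fused score of this kind could then bias which genes an evolutionary search includes at initialization or mutation.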

Relevance:

100.00%

Abstract:

This paper introduces a novel method for gene selection based on a modification of the analytic hierarchy process (AHP). The modified AHP (MAHP) is able to deal with quantitative factors that are statistics of five individual gene ranking methods: two-sample t-test, entropy test, receiver operating characteristic curve, Wilcoxon test, and signal-to-noise ratio. The most prominent discriminant genes serve as inputs to a range of classifiers including linear discriminant analysis, k-nearest neighbors, probabilistic neural network, support vector machine, and multilayer perceptron. Gene subsets selected by MAHP are compared with those of four competing approaches: information gain, symmetrical uncertainty, Bhattacharyya distance, and ReliefF. Four benchmark microarray datasets (diffuse large B-cell lymphoma, leukemia cancer, prostate, and colon) are used for the experiments. As the number of samples in microarray datasets is limited, the leave-one-out cross-validation strategy is applied rather than traditional cross-validation. Experimental results demonstrate the significant dominance of the proposed MAHP over the competing methods in terms of both accuracy and stability. With the benefit of low computational cost, MAHP is useful for cancer diagnosis using DNA gene expression profiles in real clinical practice.
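
Leave-one-out cross-validation, as mentioned above, holds out each sample once and trains on all the others. The sketch below illustrates the procedure under stated assumptions: the nearest-centroid classifier and the toy two-gene data are stand-ins chosen for brevity and are not among the classifiers or datasets named in the abstract.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose class centroid is closest (squared Euclidean)."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[y] = counts.get(y, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [v / counts[label] for v in acc]
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(xs, ys, predict=nearest_centroid_predict):
    """Hold out each sample once, train on the rest, return fraction correct."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Toy data (hypothetical): 6 samples, 2 genes, two well-separated classes.
xs = [[1.0, 1.2], [0.9, 1.0], [1.1, 0.8], [3.0, 3.1], [3.2, 2.9], [2.8, 3.0]]
ys = ["A", "A", "A", "B", "B", "B"]
accuracy = loocv_accuracy(xs, ys)
```

In a full pipeline, gene selection would normally be repeated inside each fold so that the held-out sample never influences which genes are chosen; the sketch omits that step.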

Relevance:

100.00%

Abstract:

This paper introduces a novel approach to gene selection based on a substantial modification of the analytic hierarchy process (AHP). The modified AHP systematically integrates the outcomes of individual filter methods to select the most informative genes for microarray classification. Five individual ranking methods (t-test, entropy, receiver operating characteristic (ROC) curve, Wilcoxon test, and signal-to-noise ratio) are employed to rank genes. These ranked genes are then used as inputs to the modified AHP. Additionally, a method that uses a fuzzy standard additive model (FSAM) for cancer classification based on genes selected by AHP is also proposed in this paper. Traditional FSAM learning is a hybrid process comprising unsupervised structure learning and supervised parameter tuning. A genetic algorithm (GA) is incorporated between the unsupervised and supervised training stages to optimize the number of fuzzy rules. The integration of the GA enables FSAM to deal with the high-dimensional, low-sample nature of microarray data and thus enhances the efficiency of the classification. Experiments are carried out on numerous microarray datasets. Results demonstrate the performance dominance of the AHP-based gene selection over the single ranking methods. Furthermore, the AHP-FSAM combination shows high accuracy in microarray data classification compared to various competing classifiers. The proposed approach is therefore useful for medical practitioners and clinicians as a decision support system that can be implemented in real medical practice.
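
As an illustration of one of the individual ranking methods listed above, the sketch below ranks genes by signal-to-noise ratio. The abstract does not define the statistic; the form used here, (mean_a - mean_b) / (sd_a + sd_b), is the one commonly used in microarray studies and is an assumption, as are the function names and the expression values.

```python
from statistics import mean, stdev


def signal_to_noise(class_a, class_b):
    """SNR of one gene: (mean_a - mean_b) / (sd_a + sd_b).

    Raises ZeroDivisionError if both classes have zero variance.
    """
    return (mean(class_a) - mean(class_b)) / (stdev(class_a) + stdev(class_b))


def rank_genes_by_snr(expr_a, expr_b):
    """Return gene names sorted by |SNR|, most discriminative first.

    expr_a and expr_b map gene name -> expression values in each class.
    """
    scores = {g: abs(signal_to_noise(expr_a[g], expr_b[g])) for g in expr_a}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical expression values for three genes in two tumor classes.
expr_a = {"g1": [5.0, 5.2, 4.8], "g2": [2.0, 3.0, 4.0], "g3": [1.0, 1.1, 0.9]}
expr_b = {"g1": [1.0, 1.2, 0.8], "g2": [3.0, 4.0, 5.0], "g3": [1.0, 0.9, 1.1]}
ranking = rank_genes_by_snr(expr_a, expr_b)
```

Each of the five ranking methods would yield a list like `ranking`; an aggregation step such as the modified AHP then combines those lists into a single gene subset.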

Relevance:

100.00%

Abstract:

This paper proposes a modification to the analytic hierarchy process (AHP) to select the most informative genes that serve as inputs to an interval type-2 fuzzy logic system (IT2FLS) for cancer classification. Unlike the conventional AHP, the modified AHP allows us to process quantitative factors that are ranking outcomes of individual gene selection methods including t-test, entropy, receiver operating characteristic curve, Wilcoxon test, and signal-to-noise ratio. The IT2FLS is introduced for the classification task because of its ability to handle nonlinear, noisy, and outlier data, which are common problems in cancer microarray gene expression profiles. An unsupervised learning strategy using fuzzy c-means clustering is employed to initialize the parameters of the IT2FLS. Other classifiers such as the multilayer perceptron network, support vector machine, and fuzzy ARTMAP are also implemented for comparison. Experiments are carried out on three well-known microarray datasets: diffuse large B-cell lymphoma, leukemia cancer, and prostate. A leave-one-out cross-validation strategy is applied for the experiments rather than traditional cross-validation. Results demonstrate the performance dominance of the IT2FLS over the competing classifiers. More noticeably, the modified AHP improves the classification performance not only of the IT2FLS but of all other classifiers as well. Accordingly, the proposed combination of the modified AHP and IT2FLS is a powerful tool for cancer classification and can be implemented as a real clinical decision support system that is useful for medical practitioners.

Relevance:

70.00%

Abstract:

This paper introduces an approach to cancer classification through gene expression profiles by designing supervised learning hidden Markov models (HMMs). Gene expression of each tumor type is modelled by an HMM, which maximizes the likelihood of the data. Prominent discriminant genes are selected by a novel method based on a modification of the analytic hierarchy process (AHP). Unlike conventional AHP, the modified AHP can process quantitative factors that are ranking outcomes of individual gene selection methods including t-test, entropy, receiver operating characteristic curve, Wilcoxon test, and signal-to-noise ratio. The modified AHP aggregates the ranking results of individual gene selection methods to form stable and robust gene subsets. Experimental results demonstrate the performance dominance of the HMM approach over six comparable classifiers. Results also show that gene subsets generated by the modified AHP lead to greater accuracy and stability compared to competing gene selection methods, i.e. information gain, symmetrical uncertainty, Bhattacharyya distance, and ReliefF. The modified AHP improves the classification performance not only of the HMM but also of all other classifiers. Accordingly, the proposed combination of the modified AHP and HMM is a powerful tool for cancer classification and useful as a real clinical decision support system for medical practitioners.

Relevance:

70.00%

Abstract:

Lung cancer is a leading cause of cancer-related death worldwide. Early diagnosis of cancer has been shown to be greatly helpful for curing the disease effectively. Microarray technology provides a promising approach to exploiting gene profiles for cancer diagnosis. In this study, the authors propose a gene expression programming (GEP)-based model to predict lung cancer from microarray data. The authors use two gene selection methods to extract the significant lung cancer related genes, and accordingly propose different GEP-based prediction models. Prediction performance evaluations and comparisons between the authors' GEP models and three representative machine learning methods (support vector machine, multi-layer perceptron, and radial basis function neural network) were conducted thoroughly on real microarray lung cancer datasets. Reliability was assessed by cross-dataset validation. The experimental results show that the GEP model using fewer feature genes outperformed the other models in terms of accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve. It is concluded that the GEP model is a better solution to lung cancer prediction problems.

Relevance:

60.00%

Abstract:

This article reports our experience in agent-based hybrid construction for microarray data analysis. The contributions are twofold: We demonstrate that agent-based approaches are suitable for building hybrid systems in general, and that a genetic ensemble system is appropriate for microarray data analysis in particular. Created using an agent-based framework, this genetic ensemble system for microarray data analysis excels in both sample classification accuracy and gene selection reproducibility.

Relevance:

40.00%

Abstract:

 Although population genetic theory is largely based on the premise that loci under study are selectively neutral, it has been acknowledged that the study of DNA sequence data under the influence of selection can be useful. In some circumstances, these loci show increased population differentiation and gene diversity. Highly polymorphic loci may be especially useful when studying populations having low levels of diversity overall, such as is often the case with threatened or newly established invasive populations. Using common starlings Sturnus vulgaris sampled from invasive Australian populations, we investigated sequence data of the dopamine receptor D4 gene (DRD4), a locus suspected to be under selection for novelty-seeking behaviour in a range of taxa including humans and passerine birds. We hypothesised that such behaviour may be advantageous when species encounter novel environments, such as during invasion. In addition to analyses to detect the presence of selection, we also estimated population differentiation and gene diversity using DRD4 data and compared these estimates to those from microsatellite and mitochondrial DNA sequence data, using the same individuals. We found little evidence for selection on DRD4 in starlings. However, we did find elevated levels of within-population gene diversity when compared to microsatellites and mitochondrial DNA sequence, as well as a greater degree of population differentiation. We suggest that sequence data from putatively nonneutral loci are a useful addition to studies of invasive populations, where low genetic variability is expected.

Relevance:

30.00%

Abstract:

Coral reef fishes are expected to experience rising sea surface temperatures due to climate change. How well tropical reef fishes will respond to these increased temperatures and which genes are important in the response to elevated temperatures is not known. Microarray technology provides a powerful tool for gene discovery studies, but the development of microarrays for individual species can be expensive and time-consuming. In this study, we tested the suitability of a Danio rerio oligonucleotide microarray for application in a species with few genomic resources, the coral reef fish Pomacentrus moluccensis. Results from a comparative genomic hybridization experiment and direct sequence comparisons indicate that for most genes there is considerable sequence similarity between the two species, suggesting that the D. rerio array is useful for genomic studies of P. moluccensis. We employed this heterologous microarray approach to characterize the early transcriptional response to heat stress in P. moluccensis. A total of 111 gene loci, many of which are involved in protein processing, transcription, and cell growth, showed significant changes in transcript abundance following exposure to elevated temperatures. Changes in transcript abundance were validated for a selection of candidate genes using quantitative real-time polymerase chain reaction. This study demonstrates that heterologous microarrays can be successfully employed to study species for which specific microarrays have not yet been developed, and so have the potential to greatly enhance the utility of microarray technology to the field of environmental and functional genomics.

Relevance:

30.00%

Abstract:

Mutations in the leucine-rich, glioma-inactivated 1 gene, LGI1, cause autosomal-dominant lateral temporal lobe epilepsy via unknown mechanisms. LGI1 belongs to a subfamily of leucine-rich repeat genes comprising four members (LGI1–LGI4) in mammals. In this study, both comparative developmental and molecular evolutionary methods were applied to investigate the evolution of the LGI gene family and, subsequently, the functional importance of its different gene members. Our phylogenetic studies suggest that LGI genes evolved early in the vertebrate lineage. Genetic and expression analyses of all five zebrafish lgi genes revealed duplications of lgi1 and lgi2, each resulting in two paralogous gene copies with mostly nonoverlapping expression patterns. Furthermore, all vertebrate LGI1 orthologs experience high levels of purifying selection, which argues for an essential role of this gene in neural development or function. The approach used here, which combines expression and selection data, illustrates that in poorly characterized gene families a framework of evolutionary and expression analyses can identify those genes that are functionally most important and are therefore prime candidates for human disorders.

Relevance:

30.00%

Abstract:

Interleukins 2 and 15 (IL-2 and IL-15) are highly differentiated but related cytokines with overlapping, yet also distinct functions, and established benefits for medical drug use. The present study identified a gene for an ancient third IL-2/15 family member in reptiles and mammals, interleukin 15-like (IL-15L), which hitherto was only reported in fish. IL-15L genes with intact open reading frames (ORFs), evidence of transcription, and a recent past of purifying selection were found for cattle, horse, sheep, pig, and rabbit. In human and mouse the IL-15L ORF is incapacitated. Although deduced IL-15L proteins share only ~21% overall amino acid identity with IL-15, they share many of the IL-15 residues important for binding to receptor chain IL-15Rα, and recombinant bovine IL-15L was indeed shown to interact with IL-15Rα. Comparison of sequence motifs indicates that the capacity for binding IL-15Rα is an ancestral characteristic of the IL-2/15/15L family, in accordance with a recent study which showed that in fish both IL-2 and IL-15 can bind IL-15Rα. Evidence reveals that the species lineage leading to mammals started out with three similar cytokines, IL-2, IL-15, and IL-15L, and that later in evolution (1) IL-2 and the IL-2Rα receptor chain acquired a new and specific binding mode and (2) IL-15L was lost in several but not all groups of mammals. The present study forms an important step forward in understanding this potent family of cytokines, and may help to improve future strategies for their application in veterinary and human medicine.

Relevance:

30.00%

Abstract:

Colour is an important factor in food detection and acquisition by animals using visually based foraging. Colour can be used to identify the suitability of a food source or improve the efficiency of food detection, and can even be linked to mate choice. Food colour preferences are known to exist, but whether these preferences are heritable and how these preferences evolve is unknown. Using the freshwater fish Poecilia reticulata, we artificially selected for chase behaviour towards two different-coloured moving stimuli: red and blue spots. A response to selection was only seen for chase behaviours towards the red, with realized heritabilities ranging from 0.25 to 0.30. Despite intense selection, no significant chase response was recorded for the blue-selected lines. This lack of response may be due to the motion-detection mechanism in the guppy visual system and may have novel implications for the evolvability of responses to colour-related signals. The behavioural response to several colours after five generations of selection suggests that the colour opponency system of the fish may regulate the response to selection.
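
For readers unfamiliar with the term, realized heritability is conventionally estimated from the breeder's equation as the ratio of the response to selection to the selection differential. The numbers in the worked line below are purely illustrative and are not taken from the study.

```latex
h^2_{\text{realized}} = \frac{R}{S}, \qquad
\text{e.g. } S = 2.0,\; R = 0.5 \;\Rightarrow\; h^2_{\text{realized}} = \frac{0.5}{2.0} = 0.25
```

Here R is the change in the population mean between generations and S is the difference between the mean of the selected parents and the mean of the whole parental population.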

Relevance:

30.00%

Abstract:

 BACKGROUND: Interactions between wildlife and humans are increasing. Urban animals are often less wary of humans than their non-urban counterparts, which could be explained by habituation, adaptation or local site selection. Under local site selection, individuals that are less tolerant of humans are less likely to settle in urban areas. However, there is little evidence for such temperament-based site selection, and even less is known about its underlying genetic basis. We tested whether site selection in urban and non-urban habitats by black swans (Cygnus atratus) was associated with polymorphisms in two genes linked to fear in animals, the dopamine receptor D4 (DRD4) and serotonin transporter (SERT) genes.

RESULTS: Wariness in swans was highly repeatable between disturbance events (repeatability = 0.61), and non-urban swans initiated escape from humans earlier than urban swans. We found no inter-individual variation in the SERT gene, but identified five DRD4 genotypes and an association between DRD4 genotype and wariness. Individuals possessing the most common DRD4 genotype were less wary than individuals possessing rarer genotypes. As predicted by the local site selection hypothesis, genotypes associated with wary behaviour were over three times more frequent at the non-urban site. This resulted in moderate population differentiation at DRD4 (FST = 0.080), despite the sites being separated by only 30 km, a short distance for this highly mobile species. Low population differentiation at neutrally selected microsatellite loci and the likely occasional migration of swans between the populations reduce the likelihood of local site adaptations.

CONCLUSION: Our results suggest that wariness in swans is partly genetically determined and that wary swans settle in less-disturbed areas. More generally, our findings suggest that site-specific management strategies that consider the temperament of local animals may be necessary.
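
The population differentiation statistic FST reported above can be illustrated with a simplified sketch. The abstract does not say which estimator was used, so the classic form FST = (H_T - H_S) / H_T for a biallelic locus with equally weighted subpopulations is an assumption, and the allele frequencies are hypothetical rather than data from the study.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) at a biallelic locus."""
    return 2.0 * p * (1.0 - p)


def wright_fst(allele_freqs):
    """FST = (H_T - H_S) / H_T for equally weighted subpopulations.

    allele_freqs holds the frequency of one allele in each subpopulation.
    Undefined (division by zero) when the pooled frequency is 0 or 1.
    """
    n = len(allele_freqs)
    h_s = sum(expected_heterozygosity(p) for p in allele_freqs) / n
    p_bar = sum(allele_freqs) / n
    h_t = expected_heterozygosity(p_bar)
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies at two sites (not data from the study).
fst = wright_fst([0.5, 0.7])
```

Values near 0 indicate that the subpopulations share similar allele frequencies, while larger values indicate stronger differentiation.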