7 results for feature vector
at Brock University, Canada
Abstract:
Recombinant adenoviruses are currently under intense investigation as potential gene delivery and gene expression vectors with applications in human and veterinary medicine. As part of our efforts to develop a bovine adenovirus type 2 (BAV2) based vector system, the nucleotide sequence of BAV2 was determined. Sixty-six open reading frames (ORFs) were found with the potential to encode polypeptides at least 50 amino acid (aa) residues long. Thirty-one of the BAV2 polypeptide sequences were found to share homology with already identified adenovirus proteins. The arrangement of the genes revealed that the BAV2 genomic organization closely resembles that of well-characterized human adenoviruses. In the course of this study, continuous propagation of BAV2 over many generations in cell culture resulted in the isolation of a spontaneous BAV2 mutant in which the E3 region was deleted. Restriction enzyme, sequencing and PCR analyses produced concordant results that precisely located the deletion and revealed that its size was exactly 1299 bp. The E3-deleted virus was plaque-purified and further propagated in cell culture. Replication of this virus, which lacks a portion of the E3 region, did not appear to be affected, at least in cell culture. Attempts to rescue a recombinant BAV2 virus with the bacterial kanamycin resistance gene in the E3 region yielded a candidate, as verified by extensive Southern blotting and PCR analyses. Attempts to purify the recombinant virus were not successful, suggesting that such recombinant BAV2 was helper-dependent. Ten clones containing full-length BAV2 genomes in a pWE15 cosmid vector were constructed. The infectivity of these constructs was tested using different transfection methods. The BAV2 genomic clones appeared to be infectious only after an extended incubation period. This may be due to limitations of the various transfection methods tested, or to biological differences between virus- and E. coli-derived BAV2 DNA.
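The ORF count in the abstract above depends on a length threshold of 50 amino acids. The following Python sketch shows one common way such a scan could be done; it is not the authors' procedure. It assumes an ORF runs from an ATG start codon to the first in-frame stop codon and searches all six reading frames. The function names and the toy sequence are hypothetical.

```python
# Assumed ORF definition: ATG start to the first in-frame stop codon.
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=50):
    """Return (strand, frame, start, codons) for each ORF of at least min_codons.

    `start` is the index of the ATG on the scanned strand; `codons` counts the
    start codon but not the stop codon, so it equals the polypeptide length.
    """
    results = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    codons = (i - start) // 3
                    if codons >= min_codons:
                        results.append((strand, frame, start, codons))
                    start = None
    return results


# Toy sequence: one 61-codon ORF on the forward strand (hypothetical data).
toy = "CC" + "ATG" + "GCT" * 60 + "TAA" + "GG"
print(find_orfs(toy))
```

Raising `min_codons` above 61 makes the toy ORF drop out, which is the same kind of threshold effect that determines how many ORFs a genome scan reports.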
Abstract:
A feature-based fitness function is applied in a genetic programming system to synthesize stochastic gene regulatory network models whose behaviour is defined by a time course of protein expression levels. Typically, when targeting time series data, the fitness function is based on a sum of errors involving the values of the fluctuating signal. While this approach is successful in many instances, its performance can deteriorate in the presence of noise. This thesis explores a fitness measure determined from a set of statistical features characterizing the time series' sequence of values, rather than the actual values themselves. Through a series of experiments involving symbolic regression with added noise and gene regulatory network models based on the stochastic π-calculus, it is shown to successfully target oscillating and non-oscillating signals. This practical and versatile fitness function offers an alternative approach, worthy of consideration for use in algorithms that evaluate noisy or stochastic behaviour.
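To make the contrast between the two fitness styles concrete, here is a minimal Python sketch. The feature set (mean, standard deviation, minimum, maximum, mean-crossing rate) is an assumption chosen for illustration and is not the set used in the thesis. The function names and the synthetic signals are likewise hypothetical.

```python
import math
import random
import statistics


def features(series):
    """Summarize a time series by a few statistics instead of its raw values."""
    mean = statistics.fmean(series)
    spread = statistics.pstdev(series)
    # Fraction of consecutive samples that cross the mean: a rough oscillation cue.
    crossings = sum(
        1 for a, b in zip(series, series[1:]) if (a - mean) * (b - mean) < 0
    )
    return [mean, spread, min(series), max(series), crossings / (len(series) - 1)]


def feature_fitness(candidate, target):
    """Lower is better: summed absolute difference between feature vectors."""
    return sum(abs(c - t) for c, t in zip(features(candidate), features(target)))


def sum_of_errors(candidate, target):
    """Conventional pointwise fitness: summed absolute error per sample."""
    return sum(abs(c - t) for c, t in zip(candidate, target))


rng = random.Random(0)
# Noisy oscillating target, a clean candidate shifted by a quarter period,
# and a flat candidate that does not oscillate at all.
target = [math.sin(0.2 * i) + rng.gauss(0, 0.1) for i in range(200)]
shifted = [math.cos(0.2 * i) for i in range(200)]
flat = [0.0] * 200

print("feature fitness:", feature_fitness(shifted, target), feature_fitness(flat, target))
print("sum of errors:  ", sum_of_errors(shifted, target), sum_of_errors(flat, target))
```

In this toy setup the pointwise score ranks the flat signal ahead of the shifted oscillation, while the feature score prefers the candidate that actually oscillates. This only illustrates the general idea of scoring statistical character rather than exact values; it does not reproduce the thesis's experiments.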
Abstract:
Remote sensing techniques involving hyperspectral imagery have applications in a number of sciences that study aspects of the surface of the planet. The analysis of hyperspectral images is complex because of the large amount of information involved and the noise within that data. Identifying minerals, rocks, vegetation and other materials in such images is one application of hyperspectral remote sensing in the earth sciences. This thesis evaluates the performance of a classification technique and a clustering technique on hyperspectral images for mineral identification. Support Vector Machines (SVM) and Self-Organizing Maps (SOM) are applied as the classification and clustering techniques, respectively. Principal Component Analysis (PCA) is used to prepare the data to be analyzed. The purpose of using PCA is to reduce the amount of data that needs to be processed by identifying the most important components within the data. A well-studied dataset from Cuprite, Nevada and a dataset of more complex data from Baffin Island were used to assess the performance of these techniques. The main goal of this research study is to evaluate the advantage of training a classifier on a small amount of data compared to an unsupervised method. Determining the effect of feature extraction on the accuracy of the clustering and classification methods is another goal of this research. This thesis concludes that using PCA increases the learning accuracy, especially in classification. SVM classifies the Cuprite data with high precision, and SOM challenges SVM on datasets with a high level of noise (such as Baffin Island).
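The PCA preprocessing step described above can be sketched in a few lines of dependency-free Python. This is a minimal illustration, not the thesis's pipeline: it extracts only the first principal component, uses power iteration on the sample covariance matrix, and runs on synthetic five-band "pixels" whose band loadings are invented for the example.

```python
import random


def first_principal_component(rows, iterations=200):
    """Return (unit direction, scores) for the top principal component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Project each centered sample onto the component: d values become one.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


rng = random.Random(1)
loadings = [1.0, 2.0, 3.0, 2.0, 1.0]  # hypothetical band responses
pixels = []
for _ in range(300):
    t = rng.gauss(0, 1)  # one latent "material abundance" per pixel
    pixels.append([l * t + rng.gauss(0, 0.05) for l in loadings])

direction, scores = first_principal_component(pixels)
print([round(x, 3) for x in direction])
```

Each five-band pixel is reduced to a single score, which is the kind of compression that would precede a classifier or clustering step. A real hyperspectral workflow would keep several components and would normally rely on a numerical library rather than hand-written loops.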
Abstract:
Adenoviruses are the viruses most commonly used in the development of oncolytic therapy. Oncolytic adenoviruses are genetically modified to selectively replicate in and kill tumor cells. The p53 molecule is a tumor suppressor protein that responds to viral infection through the activation of apoptosis, which is inhibited by the adenovirus E1B55kDa protein, leading to a progressive viral lytic cycle. The non-specificity of replication has limited the use of wild-type adenovirus in cancer therapy. This issue was addressed by using an E1b-deleted Ad that can only replicate in cells with a deficiency in the p53 protein, a common feature of most cancer cells. Although it has demonstrated a moderate success rate, E1b55kDa-deleted Ad has not been approved as a standard therapy for all cancer types. Several studies have revealed that E1b-deleted Ad replication was independent of p53 status in the cell, as the virus replicated better in some p53-deficient cancers than in others. However, this mechanism has not been investigated in depth. Therefore, the objective of this study is to understand the relationship between p53 status, levels and functional activity, and oncolytic Ad5dlE1b55kDa replication efficiency. Firstly, five transient p53 expression vectors that contain different regulatory elements were engineered and then evaluated in H1299, HEK293 and HeLa cell lines. Data indicated that the vector containing the MARs and HPRE regulatory elements achieved the highest stability of p53 expression. Secondly, we used these vectors to examine the effect of various p53 expression levels on the replication efficiency of oncolytic Ad5dlE1b55kDa. We found that the level of p53 in the cell had an insignificant effect on the oncolytic virus's replication. However, the functional activity of p53 had a significant effect on its replication, as Ad5dlE1b55kDa was shown to have selective activity in H1299 cells (p53-null). In contrast, a decrease in viral replication was found in HeLa cells (p53-positive).
Finally, the effect of p53's functional activity on the replication efficiency of oncolytic Ad5dlE1b55kDa was examined. Viral growth was evaluated in H1299 cells expressing a number of p53 mutants. The p53-R175H mutant successfully rescued viral growth by allowing the virus to exert its mechanism of selectivity. The mechanism entailed deregulating the expression of specific genes in the p53 pathway, namely cell cycle and apoptosis genes, to promote viral production, leading to an efficient oncolytic effect. These results confirmed that oncolytic Ad5dlE1b55kDa sensitivity is mutation-type specific. Therefore, before the virus is applied clinically as a cancer therapy for p53-deficient tumors, the type of p53 mutation must be determined to ensure an efficient antitumor effect.
Abstract:
The curse of dimensionality is a major problem in the fields of machine learning, data mining and knowledge discovery. Exhaustive search for the optimal subset of relevant features from a high-dimensional dataset is NP-hard. Sub-optimal population-based stochastic algorithms such as genetic programming (GP) and genetic algorithms (GA) are good choices for searching through large search spaces, and are usually more feasible than exhaustive and deterministic search algorithms. On the other hand, population-based stochastic algorithms often suffer from premature convergence on mediocre sub-optimal solutions. The Age Layered Population Structure (ALPS) is a novel metaheuristic for overcoming the problem of premature convergence in evolutionary algorithms, and for improving search in the fitness landscape. The ALPS paradigm uses an age measure to control breeding and competition between individuals in the population. This thesis uses a modification of the ALPS GP strategy called Feature Selection ALPS (FSALPS) for feature subset selection and classification of varied supervised learning tasks. FSALPS uses a novel frequency count system to rank features in the GP population based on evolved feature frequencies. The ranked features are translated into probabilities, which are used to control evolutionary processes such as terminal-symbol selection for the construction of GP trees and sub-trees. The FSALPS metaheuristic continuously refines the feature subset selection process while simultaneously evolving efficient classifiers through a non-converging evolutionary process that favors selection of features with high discrimination of class labels. We investigated and compared the performance of canonical GP, ALPS and FSALPS on high-dimensional benchmark classification datasets, including a hyperspectral image. Using Tukey's HSD ANOVA test at the 95% confidence level, ALPS and FSALPS dominated canonical GP in evolving smaller but efficient trees with less bloat.
FSALPS significantly outperformed canonical GP and ALPS, as well as some feature selection strategies reported in the related literature on dimensionality reduction.
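The step in which feature frequencies become selection probabilities can be illustrated with a short Python sketch. This is a simplified reading of the abstract, not the FSALPS implementation. It assumes each individual is represented simply as the list of feature indices used in its tree, and it adds a hypothetical smoothing term (`floor`) so that unused features keep a nonzero chance of being picked.

```python
import random
from collections import Counter


def feature_probabilities(population, n_features, floor=1.0):
    """Turn feature usage counts across a population into selection probabilities."""
    counts = Counter(f for individual in population for f in individual)
    # `floor` is an assumed smoothing term, not something stated in the abstract.
    weights = [counts.get(f, 0) + floor for f in range(n_features)]
    total = sum(weights)
    return [w / total for w in weights]


def pick_terminal(probabilities, rng):
    """Sample one feature index to use as a terminal when building a GP tree."""
    return rng.choices(range(len(probabilities)), weights=probabilities, k=1)[0]


# Hypothetical population: each entry lists the feature indices in one tree.
population = [[0, 2, 2], [2, 3], [2, 0], [1]]
probs = feature_probabilities(population, n_features=5)

rng = random.Random(42)
sample = [pick_terminal(probs, rng) for _ in range(10)]
print([round(p, 3) for p in probs], sample)
```

Feature 2 appears most often in the toy population, so it receives the largest probability and is favored when new terminals are drawn. Feature 4 never appears but can still be sampled occasionally, which keeps the search from discarding features permanently.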
Abstract:
New Feature at Niagara – Clark Hill Islands (5 islands situated in the rapids of the Niagara River). These islands are currently known as Dufferin Islands, 22 ½ cm. x 15 ½ cm, n.d.