10 results for Correlation algorithm
at Brock University, Canada
Abstract:
In this thesis we analyze dictionary graphs and several other kinds of graphs using the PageRank algorithm. We calculated the correlation between the degree and the PageRank of all nodes for graphs obtained from the Merriam-Webster dictionary, a French dictionary, and the WordNet hypernym and synonym dictionaries. Our conclusion was that PageRank can be a good tool for comparing the quality of dictionaries. We also studied some artificial social and random graphs. When we omitted some random nodes from each of these graphs, we did not notice any significant changes in the ranking of the nodes according to their PageRank. We also discovered that some of the social graphs selected for our study were less resistant to changes in PageRank.
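As an illustration of the degree versus PageRank comparison described in this abstract, the following is a minimal sketch, not the thesis code. It assumes an undirected toy graph, a damping factor of 0.85, power iteration, and the Pearson coefficient; the abstract does not say which correlation measure or graph orientation was used, and the names `pagerank`, `edges`, and `corr` are hypothetical.

```python
import numpy as np

def pagerank(adj, damping=0.85, tol=1e-12, max_iter=1000):
    """PageRank by power iteration on a dense adjacency matrix."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    out_deg = adj.sum(axis=1)
    # Row-stochastic transition matrix; nodes without out-links jump uniformly.
    P = np.where(out_deg[:, None] > 0,
                 adj / np.maximum(out_deg, 1.0)[:, None],
                 1.0 / n)
    r = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        r_next = (1.0 - damping) / n + damping * (P.T @ r)
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r

# Toy undirected graph (assumed for illustration, not data from the thesis).
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4), (4, 5)]
n_nodes = 6
adj = np.zeros((n_nodes, n_nodes))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0

pr = pagerank(adj)
degree = adj.sum(axis=1)
# Pearson correlation between node degree and PageRank score.
corr = np.corrcoef(degree, pr)[0, 1]
print(f"degree/PageRank correlation: {corr:.3f}")
```

On undirected graphs the two quantities tend to track each other closely, so a comparison of this kind is most informative on directed graphs, where degree would have to be split into in-degree and out-degree.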
Abstract:
Solid-state nuclear magnetic resonance (NMR) spectroscopy is a powerful technique for studying structural and dynamical properties of disordered and partially ordered materials, such as glasses, polymers, liquid crystals, and biological materials. In particular, two-dimensional (2D) NMR methods such as ¹³C-¹³C correlation spectroscopy under magic-angle-spinning (MAS) conditions have been used to measure structural constraints on the secondary structure of proteins and polypeptides. Amyloid fibrils implicated in a broad class of diseases such as Alzheimer's are known to contain a particular repeating structural motif, called a β-sheet. However, the details of such structures are poorly understood, primarily because the structural constraints extracted from the 2D NMR data in the form of the so-called Ramachandran (backbone torsion) angle distributions, g(φ, ψ), are strongly model-dependent. Inverse theory methods are used to extract Ramachandran angle distributions from a set of 2D MAS and constant-time double-quantum-filtered dipolar recoupling (CTDQFD) data. This is a vastly underdetermined problem, and the stability of the inverse mapping is problematic. Tikhonov regularization is a well-known method of improving the stability of the inverse; in this work it is extended to use a new regularization functional based on the Laplacian rather than on the norm of the function itself. In this way, one makes use of the inherently two-dimensional nature of the underlying Ramachandran maps. In addition, the existing numerical procedure is modified as appropriate for an underdetermined inverse problem. Stability of the algorithm with respect to the signal-to-noise (S/N) ratio is examined using a simulated data set. The results show excellent convergence to the true angle distribution function g(φ, ψ) for S/N ratios above 100.
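The idea of penalizing the Laplacian of a two-dimensional map instead of its norm can be sketched as follows. This is a generic illustration under stated assumptions, not the procedure used in the thesis: the kernel `K` is a random matrix standing in for the NMR forward model, the grid uses zero boundary values rather than the periodicity of torsion angles, and the names `laplacian_2d`, `tikhonov_laplacian`, and `objective` are hypothetical.

```python
import numpy as np

def laplacian_2d(n):
    """Five-point discrete Laplacian on an n x n grid (zero boundary values assumed)."""
    d = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
    eye = np.eye(n)
    return np.kron(eye, d) + np.kron(d, eye)

def objective(K, d, g, n, lam):
    """Data misfit plus Laplacian penalty for a grid function g (n x n)."""
    L = laplacian_2d(n)
    g = np.asarray(g).ravel()
    return np.sum((K @ g - d) ** 2) + lam * np.sum((L @ g) ** 2)

def tikhonov_laplacian(K, d, n, lam):
    """Minimize ||K g - d||^2 + lam * ||L g||^2 via the regularized normal equations."""
    L = laplacian_2d(n)
    lhs = K.T @ K + lam * (L.T @ L)
    return np.linalg.solve(lhs, K.T @ d).reshape(n, n)

# Synthetic underdetermined problem: 40 measurements, 144 unknowns (all assumed).
rng = np.random.default_rng(0)
n, m = 12, 40
X, Y = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
g_true = np.exp(-((X - 5) ** 2 + (Y - 7) ** 2) / (2 * 2.0 ** 2))
K = rng.normal(size=(m, n * n))
d = K @ g_true.ravel() + 0.01 * rng.normal(size=m)

g_hat = tikhonov_laplacian(K, d, n, lam=1e-2)
rel_residual = np.linalg.norm(K @ g_hat.ravel() - d) / np.linalg.norm(d)
print(f"relative data residual: {rel_residual:.4f}")
```

Because the penalty acts on second differences in both grid directions, it favours maps that are smooth in two dimensions, which is the property the abstract attributes to the Laplacian-based functional.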
Abstract:
The crystal structure of the compound 2-benzoylethylidene-3-(2,4-dibromophenyl)-2,3-dihydro-5-phenyl-1,3,4-thiadiazole* C23H16Br2N2OS (BRMEO) has been determined using three-dimensional X-ray diffraction data. The crystal form is monoclinic, space group P21/c, a = 17.492(4), b = 16.979(1), c = 14.962(1) Å, α = γ = 90°, β = 106.46(1)°, Z = 8, graphite-monochromatized Mo Kα radiation, λ = 0.71073 Å, Do = 1.62 g/cc and Dc = 1.65 g/cc. The data were collected on a Nonius CAD-4 diffractometer. The following atoms were made anisotropic: Br, S, N, O, C7, and C14-C16 for each independent molecule; the rest were left isotropic. For 3112 independent reflections with F > 6σ(F), R = 0.057. The compound has two independent molecules within the asymmetric unit. Two different conformers were observed which pack well together. S---O interaction distances of 2.493(6) and 2.478(7) Å were observed for molecules A and B respectively. These values are consistent with earlier findings for 2-benzoylmethylene-3-(2,4-dibromophenyl)-2,3-dihydro-5-phenyl-1,3,4-thiadiazole C22H14Br2N2OS (BRPHO) and 2-benzoylpropylidene-3-(2,4-dibromophenyl)-2,3-dihydro-5-phenyl-1,3,4-thiadiazole C24H18Br2N2OS (BRPETO), where the S---O distances are less than the van der Waals distance (3.25 Å) but greater than that expected for a single bond (1.50 Å). From the results and the literature it appears obvious that the energy/reaction coordinate pathway has a minimum between the end structures (the mono- and bicyclic compounds). * See reference (21) for nomenclature.
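The cell constants quoted in this abstract can be cross-checked against the reported density with a short calculation. The sketch below assumes the values read as a = 17.492, b = 16.979, c = 14.962 Å, β = 106.46°, eight formula units per cell, and the formula C23H16Br2N2OS; the atomic masses are standard rounded values, and the function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # formula units per mole

# Standard atomic masses in g/mol, rounded.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "Br": 79.904,
               "N": 14.007, "O": 15.999, "S": 32.06}

def monoclinic_volume(a, b, c, beta_deg):
    """Unit-cell volume in cubic angstroms for a monoclinic cell (alpha = gamma = 90 degrees)."""
    return a * b * c * math.sin(math.radians(beta_deg))

def calculated_density(formula, z, volume_a3):
    """Density in g/cm^3 from composition, formula units per cell, and cell volume."""
    molar_mass = sum(ATOMIC_MASS[element] * count for element, count in formula.items())
    volume_cm3 = volume_a3 * 1e-24  # 1 cubic angstrom = 1e-24 cm^3
    return z * molar_mass / (AVOGADRO * volume_cm3)

# Values as read from the abstract (interpretation assumed, see lead-in).
brmeo = {"C": 23, "H": 16, "Br": 2, "N": 2, "O": 1, "S": 1}
volume = monoclinic_volume(17.492, 16.979, 14.962, 106.46)
density = calculated_density(brmeo, z=8, volume_a3=volume)
print(f"V = {volume:.1f} cubic angstroms, calculated density = {density:.3f} g/cc")
```

Under these assumptions the result should land close to the 1.65 g/cc given in the abstract, which supports reading the garbled cell parameters as above.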
Abstract:
This thesis introduces the Salmon Algorithm, a search meta-heuristic which can be used for a variety of combinatorial optimization problems. This algorithm is loosely based on the path-finding behaviour of salmon swimming upstream to spawn. There are a number of tunable parameters in the algorithm, so experiments were conducted to find the optimum parameter settings for different search spaces. The algorithm was tested on one instance of the Traveling Salesman Problem and found to have superior performance to an Ant Colony Algorithm and a Genetic Algorithm. It was then tested on three coding theory problems: optimal edit codes, optimal Hamming distance codes, and optimal covering codes. The algorithm produced improvements on the best known values for five of the six test cases using edit codes. It matched the best known results on four of the seven Hamming codes as well as all three of the covering codes. The results suggest the Salmon Algorithm is competitive with established guided random search techniques, and may be superior in some search spaces.
Abstract:
Understanding the machinery of gene regulation to control gene expression has been one of the main focuses of bioinformaticians for years. We use a multi-objective genetic algorithm to evolve a specialized version of side effect machines for degenerate motif discovery. We compare some suggested objectives for the motifs they find, test different multi-objective scoring schemes and probabilistic models for the background sequence models and report our results on a synthetic dataset and some biological benchmarking suites. We conclude with a comparison of our algorithm with some widely used motif discovery algorithms in the literature and suggest future directions for research in this area.
Abstract:
Cellular stress resistance has been shown to be highly correlated with longevity. However, the mechanisms conferring this stress resistance have yet to be identified. Maintenance of protein homeostasis is a critical component of cellular maintenance and stress resistance. Superior protein homeostasis capacities may thus underlie the greater stress resistance observed in longer-lived animals; however, little vertebrate data have been provided supporting this idea. I used two different experimental approaches to test the associations of protein homeostasis capacities with stress resistance and lifespan: 1) a comparison between a large set of vertebrate species with varying body masses and lifespans and 2) a comparison of long-lived Snell dwarf mice and their normal littermates. Protein homeostasis mechanisms including protein degradation activity, protein repair activity and molecular chaperone levels were examined. These measurements were performed in liver, heart and brain tissues, and isolated myoblasts. My results indicated that neither protein degradation nor protein repair was upregulated in association with enhanced stress resistance and longevity in either an interspecies or an intraspecies context. In contrast, my results did show a positive correlation between molecular chaperone levels and maximum lifespan (MLSP). However, there was no elevation of chaperone levels in the long-lived Snell dwarf mouse, indicating that other mechanisms are linked to its increased lifespan. Therefore, these results suggest that molecular chaperones are involved in increasing animal lifespan in an interspecies context.
Abstract:
DNA assembly is among the most fundamental and difficult problems in bioinformatics. Near-optimal assembly solutions are available for bacterial and other small genomes. However, assembling large and complex genomes, especially the human genome, using Next-Generation Sequencing (NGS) technologies has proven very difficult because of the highly repetitive and complex nature of the human genome, short read lengths, uneven data coverage, and tools that are not specifically built for human genomes. Moreover, many algorithms do not scale to human genome datasets containing hundreds of millions of short reads. The DNA assembly problem is usually divided into several subproblems, including DNA data error detection and correction, contig creation, scaffolding, and contig orientation; each can be seen as a distinct research area. This thesis focuses specifically on creating contigs from short reads and combining them with the outputs of other tools in order to obtain better results. Three assemblers, SOAPdenovo [Li09], Velvet [ZB08], and Meraculous [CHS+11], are selected for comparison. The results show that the approach developed in this thesis produces results comparable to those of the other assemblers, and that combining our contigs with the outputs of other tools produces the best results, outperforming all the other assemblers investigated.
Abstract:
This research evaluated (a) the correlations among math anxiety, math attitudes, and achievement in math and (b) differences in these variables by gender among grade 9 students in a high school located in southern Ontario. Data were compiled from participant responses to the Attitudes Toward Math Inventory (ATMI) and the Math Anxiety Rating Scale for Adolescents (MARS-A), and achievement data were gathered from participants' grade 9 academic math course marks and the EQAO Grade 9 Assessment of Mathematics. Nonparametric tests were conducted to determine whether there were relationships between the variables and to explore whether gender differences in anxiety, attitudes, and achievement existed for this sample. Results indicated that math anxiety was not related to math achievement but was a strong correlate of attitudes toward math. A strong positive relationship was found between math attitudes and achievement in math. Specifically, self-confidence in math, enjoyment of math, value of math, and motivation were all positive correlates of achievement in math. Also, results for gender comparisons were nonsignificant, indicating that gender differences in math anxiety, math attitudes, and math achievement scores were not prevalent in this group of grade 9 students. Therefore, attitudes toward math were considered to be a stronger predictor of performance than math anxiety or gender for this group.
Abstract:
Ordered gene problems are a very common class of optimization problems. Because of their popularity, countless algorithms have been developed in an attempt to find high-quality solutions to them. It is also common to see many different types of problems reduced to ordered gene style problems, since many popular heuristics and metaheuristics exist for them. Multiple ordered gene problems are studied, namely the travelling salesman problem, the bin packing problem, and the graph colouring problem. In addition, two bioinformatics problems not traditionally seen as ordered gene problems are studied: DNA error correction and DNA fragment assembly. These problems are studied with multiple variations and combinations of heuristics and metaheuristics with two distinct types or representations. The majority of the algorithms are built around the Recentering-Restarting Genetic Algorithm. The algorithm variations were successful on all problems studied, and particularly on the two bioinformatics problems. For DNA error correction, multiple cases were found with 100% of the codes being corrected. The algorithm variations were also able to beat all other state-of-the-art DNA fragment assemblers on 13 out of 16 benchmark problem instances.
Abstract:
Understanding the relationship between genetic diseases and the genes associated with them is an important problem regarding human health. The vast amount of data created from a large number of high-throughput experiments performed in the last few years has resulted in an unprecedented growth in computational methods to tackle the disease gene association problem. Nowadays, it is clear that a genetic disease is not a consequence of a defect in a single gene. Instead, the disease phenotype is a reflection of various genetic components interacting in a complex network. In fact, genetic diseases, like any other phenotype, occur as a result of various genes working in sync with each other in a single or several biological module(s). Using a genetic algorithm, our method tries to evolve communities containing the set of potential disease genes likely to be involved in a given genetic disease. Having a set of known disease genes, we first obtain a protein-protein interaction (PPI) network containing all the known disease genes. All the other genes inside the procured PPI network are then considered as candidate disease genes as they lie in the vicinity of the known disease genes in the network. Our method attempts to find communities of potential disease genes strongly working with one another and with the set of known disease genes. As a proof of concept, we tested our approach on 16 breast cancer genes and 15 Parkinson's Disease genes. We obtained comparable or better results than CIPHER, ENDEAVOUR and GPEC, three of the most reliable and frequently used disease-gene ranking frameworks.