877 resultados para Graph Algorithms
Resumo:
A large amount of biological data has been produced in the last years. Important knowledge can be extracted from these data by the use of data analysis techniques. Clustering plays an important role in data analysis, by organizing similar objects from a dataset into meaningful groups. Several clustering algorithms have been proposed in the literature. However, each algorithm has its bias, being more adequate for particular datasets. This paper presents a mathematical formulation to support the creation of consistent clusters for biological data. Moreover. it shows a clustering algorithm to solve this formulation that uses GRASP (Greedy Randomized Adaptive Search Procedure). We compared the proposed algorithm with three known other algorithms. The proposed algorithm presented the best clustering results confirmed statistically. (C) 2009 Elsevier Ltd. All rights reserved.
Resumo:
There is an increasing interest in the application of Evolutionary Algorithms (EAs) to induce classification rules. This hybrid approach can benefit areas where classical methods for rule induction have not been very successful. One example is the induction of classification rules in imbalanced domains. Imbalanced data occur when one or more classes heavily outnumber other classes. Frequently, classical machine learning (ML) classifiers are not able to learn in the presence of imbalanced data sets, inducing classification models that always predict the most numerous classes. In this work, we propose a novel hybrid approach to deal with this problem. We create several balanced data sets with all minority class cases and a random sample of majority class cases. These balanced data sets are fed to classical ML systems that produce rule sets. The rule sets are combined creating a pool of rules and an EA is used to build a classifier from this pool of rules. This hybrid approach has some advantages over undersampling, since it reduces the amount of discarded information, and some advantages over oversampling, since it avoids overfitting. The proposed approach was experimentally analysed and the experimental results show an improvement in the classification performance measured as the area under the receiver operating characteristics (ROC) curve.
Resumo:
J.A. Ferreira Neto, E.C. Santos Junior, U. Fra Paleo, D. Miranda Barros, and M.C.O. Moreira. 2011. Optimal subdivision of land in agrarian reform projects: an analysis using genetic algorithms. Cien. Inv. Agr. 38(2): 169-178. The objective of this manuscript is to develop a new procedure to achieve optimal land subdivision using genetic algorithms (GA). The genetic algorithm was tested in the rural settlement of Veredas, located in Minas Gerais, Brazil. This implementation was based on the land aptitude and its productivity index. The sequence of tests in the study was carried out in two areas with eight different agricultural aptitude classes, including one area of 391.88 ha subdivided into 12 lots and another of 404.1763 ha subdivided into 14 lots. The effectiveness of the method was measured using the shunting line standard value of a parceled area lot`s productivity index. To evaluate each parameter, a sequence of 15 calculations was performed to record the best individual fitness average (MMI) found for each parameter variation. The best parameter combination found in testing and used to generate the new parceling with the GA was the following: 320 as the generation number, a population of 40 individuals, 0.8 mutation tax, and a 0.3 renewal tax. The solution generated rather homogeneous lots in terms of productive capacity.
Resumo:
We describe the canonical and microcanonical Monte Carlo algorithms for different systems that can be described by spin models. Sites of the lattice, chosen at random, interchange their spin values, provided they are different. The canonical ensemble is generated by performing exchanges according to the Metropolis prescription whereas in the microcanonical ensemble, exchanges are performed as long as the total energy remains constant. A systematic finite size analysis of intensive quantities and a comparison with results obtained from distinct ensembles are performed and the quality of results reveal that the present approach may be an useful tool for the study of phase transitions, specially first-order transitions. (C) 2009 Elsevier B.V. All rights reserved.
Resumo:
In this paper we present a novel approach for multispectral image contextual classification by combining iterative combinatorial optimization algorithms. The pixel-wise decision rule is defined using a Bayesian approach to combine two MRF models: a Gaussian Markov Random Field (GMRF) for the observations (likelihood) and a Potts model for the a priori knowledge, to regularize the solution in the presence of noisy data. Hence, the classification problem is stated according to a Maximum a Posteriori (MAP) framework. In order to approximate the MAP solution we apply several combinatorial optimization methods using multiple simultaneous initializations, making the solution less sensitive to the initial conditions and reducing both computational cost and time in comparison to Simulated Annealing, often unfeasible in many real image processing applications. Markov Random Field model parameters are estimated by Maximum Pseudo-Likelihood (MPL) approach, avoiding manual adjustments in the choice of the regularization parameters. Asymptotic evaluations assess the accuracy of the proposed parameter estimation procedure. To test and evaluate the proposed classification method, we adopt metrics for quantitative performance assessment (Cohen`s Kappa coefficient), allowing a robust and accurate statistical analysis. The obtained results clearly show that combining sub-optimal contextual algorithms significantly improves the classification performance, indicating the effectiveness of the proposed methodology. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
The assessment of routing protocols for mobile wireless networks is a difficult task, because of the networks` dynamic behavior and the absence of benchmarks. However, some of these networks, such as intermittent wireless sensors networks, periodic or cyclic networks, and some delay tolerant networks (DTNs), have more predictable dynamics, as the temporal variations in the network topology can be considered as deterministic, which may make them easier to study. Recently, a graph theoretic model-the evolving graphs-was proposed to help capture the dynamic behavior of such networks, in view of the construction of least cost routing and other algorithms. The algorithms and insights obtained through this model are theoretically very efficient and intriguing. However, there is no study about the use of such theoretical results into practical situations. Therefore, the objective of our work is to analyze the applicability of the evolving graph theory in the construction of efficient routing protocols in realistic scenarios. In this paper, we use the NS2 network simulator to first implement an evolving graph based routing protocol, and then to use it as a benchmark when comparing the four major ad hoc routing protocols (AODV, DSR, OLSR and DSDV). Interestingly, our experiments show that evolving graphs have the potential to be an effective and powerful tool in the development and analysis of algorithms for dynamic networks, with predictable dynamics at least. In order to make this model widely applicable, however, some practical issues still have to be addressed and incorporated into the model, like adaptive algorithms. We also discuss such issues in this paper, as a result of our experience.
Resumo:
Let M = (V, E, A) be a mixed graph with vertex set V, edge set E and arc set A. A cycle cover of M is a family C = {C(1), ... , C(k)} of cycles of M such that each edge/arc of M belongs to at least one cycle in C. The weight of C is Sigma(k)(i=1) vertical bar C(i)vertical bar. The minimum cycle cover problem is the following: given a strongly connected mixed graph M without bridges, find a cycle cover of M with weight as small as possible. The Chinese postman problem is: given a strongly connected mixed graph M, find a minimum length closed walk using all edges and arcs of M. These problems are NP-hard. We show that they can be solved in polynomial time if M has bounded tree-width. (C) 2008 Elsevier B.V. All rights reserved.
Resumo:
We consider the problems of finding the maximum number of vertex-disjoint triangles (VTP) and edge-disjoint triangles (ETP) in a simple graph. Both problems are NP-hard. The algorithm with the best approximation ratio known so far for these problems has ratio 3/2 + epsilon, a result that follows from a more general algorithm for set packing obtained by Hurkens and Schrijver [On the size of systems of sets every t of which have an SDR, with an application to the worst-case ratio of heuristics for packing problems, SIAM J. Discrete Math. 2(1) (1989) 68-72]. We present improvements on the approximation ratio for restricted cases of VTP and ETP that are known to be APX-hard: we give an approximation algorithm for VTP on graphs with maximum degree 4 with ratio slightly less than 1.2, and for ETP on graphs with maximum degree 5 with ratio 4/3. We also present an exact linear-time algorithm for VTP on the class of indifference graphs. (C) 2007 Elsevier B.V. All rights reserved.
Resumo:
Chagas disease is nowadays the most serious parasitic health problem. This disease is caused by Trypanosoma cruzi. The great number of deaths and the insufficient effectiveness of drugs against this parasite have alarmed the scientific community worldwide. In an attempt to overcome this problem, a model for the design and prediction of new antitrypanosomal agents was obtained. This used a mixed approach, containing simple descriptors based on fragments and topological substructural molecular design descriptors. A data set was made up of 188 compounds, 99 of them characterized an antitrypanosomal activity and 88 compounds that belong to other pharmaceutical categories. The model showed sensitivity, specificity and accuracy values above 85%. Quantitative fragmental contributions were also calculated. Then, and to confirm the quality of the model, 15 structures of molecules tested as antitrypanosomal compounds (that we did not include in this study) were predicted, taking into account the information on the abovementioned calculated fragmental contributions. The model showed an accuracy of 100% which means that the ""in silico"" methodology developed by our team is promising for the rational design of new antitrypanosomal drugs. (C) 2009 Wiley Periodicals, Inc. J Comput Chem 31: 882-894. 2010
Resumo:
The increasing resistance of Mycobacterium tuberculosis to the existing drugs has alarmed the worldwide scientific community. In an attempt to overcome this problem, two models for the design and prediction of new antituberculosis agents were obtained. The first used a mixed approach, containing descriptors based on fragments and the topological substructural molecular design approach (TOPS-MODE) descriptors. The other model used a combination of two-dimensional (2D) and three-dimensional (3D) descriptors. A data set of 167 compounds with great structural variability, 72 of them antituberculosis agents and 95 compounds belonging to other pharmaceutical categories, was analyzed. The first model showed sensitivity, specificity, and accuracy values above 80% and the second one showed values higher than 75% for these statistical indices. Subsequently, 12 structures of imidazoles not included in this study were designed, taking into account the two models. In both cases accuracy was 100%, showing that the methodology in silico developed by us is promising for the rational design of antituberculosis drugs.