826 resultados para Population set-based methods
Resumo:
This paper applies a genetic algorithm with hierarchically structured population to solve unconstrained optimization problems. The population has individuals distributed in several overlapping clusters, each one with a leader and a variable number of support individuals. The hierarchy establishes that leaders must be fitter than its supporters with the topological organization of the clusters following a tree. Computational tests evaluate different population structures, population sizes and crossover operators for better algorithm performance. A set of known benchmark test problems is solved and the results found are compared with those obtained from other methods described in the literature, namely, two genetic algorithms, a simulated annealing, a differential evolution and a particle swarm optimization. The results indicate that the method employed is capable of achieving better performance than the previous approaches in regard as the two criteria usually employed for comparisons: the number of function evaluations and rate of success. The method also has a superior performance if the number of problems solved is taken into account. (C) 2013 Elsevier B.V. All rights reserved.
Resumo:
Textual document set has become an important and rapidly growing information source in the web. Text classification is one of the crucial technologies for information organisation and management. Text classification has become more and more important and attracted wide attention of researchers from different research fields. In this paper, many feature selection methods, the implement algorithms and applications of text classification are introduced firstly. However, because there are much noise in the knowledge extracted by current data-mining techniques for text classification, it leads to much uncertainty in the process of text classification which is produced from both the knowledge extraction and knowledge usage, therefore, more innovative techniques and methods are needed to improve the performance of text classification. It has been a critical step with great challenge to further improve the process of knowledge extraction and effectively utilization of the extracted knowledge. Rough Set decision making approach is proposed to use Rough Set decision techniques to more precisely classify the textual documents which are difficult to separate by the classic text classification methods. The purpose of this paper is to give an overview of existing text classification technologies, to demonstrate the Rough Set concepts and the decision making approach based on Rough Set theory for building more reliable and effective text classification framework with higher precision, to set up an innovative evaluation metric named CEI which is very effective for the performance assessment of the similar research, and to propose a promising research direction for addressing the challenging problems in text classification, text mining and other relative fields.
Resumo:
Feature selection aims to determine a minimal feature subset from a problem domain while retaining a suitably high accuracy in representing the original features. Rough set theory (RST) has been used as such a tool with much success. RST enables the discovery of data dependencies and the reduction of the number of attributes contained in a dataset using the data alone, requiring no additional information. This chapter describes the fundamental ideas behind RST-based approaches and reviews related feature selection methods that build on these ideas. Extensions to the traditional rough set approach are discussed, including recent selection methods based on tolerance rough sets, variable precision rough sets and fuzzy-rough sets. Alternative search mechanisms are also highly important in rough set feature selection. The chapter includes the latest developments in this area, including RST strategies based on hill-climbing, genetic algorithms and ant colony optimization.
Resumo:
This work presents exact algorithms for the Resource Allocation and Cyclic Scheduling Problems (RA&CSPs). Cyclic Scheduling Problems arise in a number of application areas, such as in hoist scheduling, mass production, compiler design (implementing scheduling loops on parallel architectures), software pipelining, and in embedded system design. The RA&CS problem concerns time and resource assignment to a set of activities, to be indefinitely repeated, subject to precedence and resource capacity constraints. In this work we present two constraint programming frameworks facing two different types of cyclic problems. In first instance, we consider the disjunctive RA&CSP, where the allocation problem considers unary resources. Instances are described through the Synchronous Data-flow (SDF) Model of Computation. The key problem of finding a maximum-throughput allocation and scheduling of Synchronous Data-Flow graphs onto a multi-core architecture is NP-hard and has been traditionally solved by means of heuristic (incomplete) algorithms. We propose an exact (complete) algorithm for the computation of a maximum-throughput mapping of applications specified as SDFG onto multi-core architectures. Results show that the approach can handle realistic instances in terms of size and complexity. Next, we tackle the Cyclic Resource-Constrained Scheduling Problem (i.e. CRCSP). We propose a Constraint Programming approach based on modular arithmetic: in particular, we introduce a modular precedence constraint and a global cumulative constraint along with their filtering algorithms. Many traditional approaches to cyclic scheduling operate by fixing the period value and then solving a linear problem in a generate-and-test fashion. Conversely, our technique is based on a non-linear model and tackles the problem as a whole: the period value is inferred from the scheduling decisions. The proposed approaches have been tested on a number of non-trivial synthetic instances and on a set of realistic industrial instances achieving good results on practical size problem.
Resumo:
Obesity is a multifactorial trait, which comprises an independent risk factor for cardiovascular disease (CVD). The aim of the current work is to study the complex etiology beneath obesity and identify genetic variations and/or factors related to nutrition that contribute to its variability. To this end, a set of more than 2300 white subjects who participated in a nutrigenetics study was used. For each subject a total of 63 factors describing genetic variants related to CVD (24 in total), gender, and nutrition (38 in total), e.g. average daily intake in calories and cholesterol, were measured. Each subject was categorized according to body mass index (BMI) as normal (BMI ≤ 25) or overweight (BMI > 25). Two artificial neural network (ANN) based methods were designed and used towards the analysis of the available data. These corresponded to i) a multi-layer feed-forward ANN combined with a parameter decreasing method (PDM-ANN), and ii) a multi-layer feed-forward ANN trained by a hybrid method (GA-ANN) which combines genetic algorithms and the popular back-propagation training algorithm.
Resumo:
A simulation model adopting a health system perspective showed population-based screening with DXA, followed by alendronate treatment of persons with osteoporosis, or with anamnestic fracture and osteopenia, to be cost-effective in Swiss postmenopausal women from age 70, but not in men. INTRODUCTION: We assessed the cost-effectiveness of a population-based screen-and-treat strategy for osteoporosis (DXA followed by alendronate treatment if osteoporotic, or osteopenic in the presence of fracture), compared to no intervention, from the perspective of the Swiss health care system. METHODS: A published Markov model assessed by first-order Monte Carlo simulation was refined to reflect the diagnostic process and treatment effects. Women and men entered the model at age 50. Main screening ages were 65, 75, and 85 years. Age at bone densitometry was flexible for persons fracturing before the main screening age. Realistic assumptions were made with respect to persistence with intended 5 years of alendronate treatment. The main outcome was cost per quality-adjusted life year (QALY) gained. RESULTS: In women, costs per QALY were Swiss francs (CHF) 71,000, CHF 35,000, and CHF 28,000 for the main screening ages of 65, 75, and 85 years. The threshold of CHF 50,000 per QALY was reached between main screening ages 65 and 75 years. Population-based screening was not cost-effective in men. CONCLUSION: Population-based DXA screening, followed by alendronate treatment in the presence of osteoporosis, or of fracture and osteopenia, is a cost-effective option in Swiss postmenopausal women after age 70.
Resumo:
We present a technique to reconstruct the electromagnetic properties of a medium or a set of objects buried inside it from boundary measurements when applying electric currents through a set of electrodes. The electromagnetic parameters may be recovered by means of a gradient method without a priori information on the background. The shape, location and size of objects, when present, are determined by a topological derivative-based iterative procedure. The combination of both strategies allows improved reconstructions of the objects and their properties, assuming a known background.
Resumo:
An information filtering (IF) system monitors an incoming document stream to find the documents that match the information needs specified by the user profiles. To learn to use the user profiles effectively is one of the most challenging tasks when developing an IF system. With the document selection criteria better defined based on the users’ needs, filtering large streams of information can be more efficient and effective. To learn the user profiles, term-based approaches have been widely used in the IF community because of their simplicity and directness. Term-based approaches are relatively well established. However, these approaches have problems when dealing with polysemy and synonymy, which often lead to an information overload problem. Recently, pattern-based approaches (or Pattern Taxonomy Models (PTM) [160]) have been proposed for IF by the data mining community. These approaches are better at capturing sematic information and have shown encouraging results for improving the effectiveness of the IF system. On the other hand, pattern discovery from large data streams is not computationally efficient. Also, these approaches had to deal with low frequency pattern issues. The measures used by the data mining technique (for example, “support” and “confidences”) to learn the profile have turned out to be not suitable for filtering. They can lead to a mismatch problem. This thesis uses the rough set-based reasoning (term-based) and pattern mining approach as a unified framework for information filtering to overcome the aforementioned problems. This system consists of two stages - topic filtering and pattern mining stages. The topic filtering stage is intended to minimize information overloading by filtering out the most likely irrelevant information based on the user profiles. A novel user-profiles learning method and a theoretical model of the threshold setting have been developed by using rough set decision theory. The second stage (pattern mining) aims at solving the problem of the information mismatch. This stage is precision-oriented. A new document-ranking function has been derived by exploiting the patterns in the pattern taxonomy. The most likely relevant documents were assigned higher scores by the ranking function. Because there is a relatively small amount of documents left after the first stage, the computational cost is markedly reduced; at the same time, pattern discoveries yield more accurate results. The overall performance of the system was improved significantly. The new two-stage information filtering model has been evaluated by extensive experiments. Tests were based on the well-known IR bench-marking processes, using the latest version of the Reuters dataset, namely, the Reuters Corpus Volume 1 (RCV1). The performance of the new two-stage model was compared with both the term-based and data mining-based IF models. The results demonstrate that the proposed information filtering system outperforms significantly the other IF systems, such as the traditional Rocchio IF model, the state-of-the-art term-based models, including the BM25, Support Vector Machines (SVM), and Pattern Taxonomy Model (PTM).
Development of novel DNA-based methods for the measurement of length polymorphisms (microsatellites)
Resumo:
The Macroscopic Fundamental Diagram (MFD) relates space-mean density and flow. Since the MFD represents the area-wide network traffic performance, studies on perimeter control strategies and network-wide traffic state estimation utilising the MFD concept have been reported. Most previous works have utilised data from fixed sensors, such as inductive loops, to estimate the MFD, which can cause biased estimation in urban networks due to queue spillovers at intersections. To overcome the limitation, recent literature reports the use of trajectory data obtained from probe vehicles. However, these studies have been conducted using simulated datasets; limited works have discussed the limitations of real datasets and their impact on the variable estimation. This study compares two methods for estimating traffic state variables of signalised arterial sections: a method based on cumulative vehicle counts (CUPRITE), and one based on vehicles’ trajectory from taxi Global Positioning System (GPS) log. The comparisons reveal some characteristics of taxi trajectory data available in Brisbane, Australia. The current trajectory data have limitations in quantity (i.e., the penetration rate), due to which the traffic state variables tend to be underestimated. Nevertheless, the trajectory-based method successfully captures the features of traffic states, which suggests that the trajectories from taxis can be a good estimator for the network-wide traffic states.
Resumo:
Finite-state methods have been adopted widely in computational morphology and related linguistic applications. To enable efficient development of finite-state based linguistic descriptions, these methods should be a freely available resource for academic language research and the language technology industry. The following needs can be identified: (i) a registry that maps the existing approaches, implementations and descriptions, (ii) managing the incompatibilities of the existing tools, (iii) increasing synergy and complementary functionality of the tools, (iv) persistent availability of the tools used to manipulate the archived descriptions, (v) an archive for free finite-state based tools and linguistic descriptions. Addressing these challenges contributes to building a common research infrastructure for advanced language technology.