855 resultados para optimal feature selection
Resumo:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Resumo:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Resumo:
In this paper, we perform a thorough analysis of a spectral phase-encoded time spreading optical code division multiple access (SPECTS-OCDMA) system based on Walsh-Hadamard (W-H) codes aiming not only at finding optimal code-set selections but also at assessing its loss of security due to crosstalk. We prove that an inadequate choice of codes can make the crosstalk between active users to become large enough so as to cause the data from the user of interest to be detected by other user. The proposed algorithm for code optimization targets code sets that produce minimum bit error rate (BER) among all codes for a specific number of simultaneous users. This methodology allows us to find optimal code sets for any OCDMA system, regardless the code family used and the number of active users. This procedure is crucial for circumventing the unexpected lack of security due to crosstalk. We also show that a SPECTS-OCDMA system based on W-H 32(64) fundamentally limits the number of simultaneous users to 4(8) with no security violation due to crosstalk. More importantly, we prove that only a small fraction of the available code sets is actually immune to crosstalk with acceptable BER (<10(-9)) i.e., approximately 0.5% for W-H 32 with four simultaneous users, and about 1 x 10(-4)% for W-H 64 with eight simultaneous users.
Resumo:
Radiomics is the high-throughput extraction and analysis of quantitative image features. For non-small cell lung cancer (NSCLC) patients, radiomics can be applied to standard of care computed tomography (CT) images to improve tumor diagnosis, staging, and response assessment. The first objective of this work was to show that CT image features extracted from pre-treatment NSCLC tumors could be used to predict tumor shrinkage in response to therapy. This is important since tumor shrinkage is an important cancer treatment endpoint that is correlated with probability of disease progression and overall survival. Accurate prediction of tumor shrinkage could also lead to individually customized treatment plans. To accomplish this objective, 64 stage NSCLC patients with similar treatments were all imaged using the same CT scanner and protocol. Quantitative image features were extracted and principal component regression with simulated annealing subset selection was used to predict shrinkage. Cross validation and permutation tests were used to validate the results. The optimal model gave a strong correlation between the observed and predicted shrinkages with . The second objective of this work was to identify sets of NSCLC CT image features that are reproducible, non-redundant, and informative across multiple machines. Feature sets with these qualities are needed for NSCLC radiomics models to be robust to machine variation and spurious correlation. To accomplish this objective, test-retest CT image pairs were obtained from 56 NSCLC patients imaged on three CT machines from two institutions. For each machine, quantitative image features with concordance correlation coefficient values greater than 0.90 were considered reproducible. Multi-machine reproducible feature sets were created by taking the intersection of individual machine reproducible feature sets. Redundant features were removed through hierarchical clustering. The findings showed that image feature reproducibility and redundancy depended on both the CT machine and the CT image type (average cine 4D-CT imaging vs. end-exhale cine 4D-CT imaging vs. helical inspiratory breath-hold 3D CT). For each image type, a set of cross-machine reproducible, non-redundant, and informative image features was identified. Compared to end-exhale 4D-CT and breath-hold 3D-CT, average 4D-CT derived image features showed superior multi-machine reproducibility and are the best candidates for clinical correlation.
Resumo:
This paper studies feature subset selection in classification using a multiobjective estimation of distribution algorithm. We consider six functions, namely area under ROC curve, sensitivity, specificity, precision, F1 measure and Brier score, for evaluation of feature subsets and as the objectives of the problem. One of the characteristics of these objective functions is the existence of noise in their values that should be appropriately handled during optimization. Our proposed algorithm consists of two major techniques which are specially designed for the feature subset selection problem. The first one is a solution ranking method based on interval values to handle the noise in the objectives of this problem. The second one is a model estimation method for learning a joint probabilistic model of objectives and variables which is used to generate new solutions and advance through the search space. To simplify model estimation, l1 regularized regression is used to select a subset of problem variables before model learning. The proposed algorithm is compared with a well-known ranking method for interval-valued objectives and a standard multiobjective genetic algorithm. Particularly, the effects of the two new techniques are experimentally investigated. The experimental results show that the proposed algorithm is able to obtain comparable or better performance on the tested datasets.
Resumo:
Errata sheet inserted.
Resumo:
Aircraft manufacturing industries are looking for solutions in order to increase their productivity. One of the solutions is to apply the metrology systems during the production and assembly processes. Metrology Process Model (MPM) (Maropoulos et al, 2007) has been introduced which emphasises metrology applications with assembly planning, manufacturing processes and product designing. Measurability analysis is part of the MPM and the aim of this analysis is to check the feasibility for measuring the designed large scale components. Measurability Analysis has been integrated in order to provide an efficient matching system. Metrology database is structured by developing the Metrology Classification Model. Furthermore, the feature-based selection model is also explained. By combining two classification models, a novel approach and selection processes for integrated measurability analysis system (MAS) are introduced and such integrated MAS could provide much more meaningful matching results for the operators. © Springer-Verlag Berlin Heidelberg 2010.
Resumo:
A decision-maker, when faced with a limited and fixed budget to collect data in support of a multiple attribute selection decision, must decide how many samples to observe from each alternative and attribute. This allocation decision is of particular importance when the information gained leads to uncertain estimates of the attribute values as with sample data collected from observations such as measurements, experimental evaluations, or simulation runs. For example, when the U.S. Department of Homeland Security must decide upon a radiation detection system to acquire, a number of performance attributes are of interest and must be measured in order to characterize each of the considered systems. We identified and evaluated several approaches to incorporate the uncertainty in the attribute value estimates into a normative model for a multiple attribute selection decision. Assuming an additive multiple attribute value model, we demonstrated the idea of propagating the attribute value uncertainty and describing the decision values for each alternative as probability distributions. These distributions were used to select an alternative. With the goal of maximizing the probability of correct selection we developed and evaluated, under several different sets of assumptions, procedures to allocate the fixed experimental budget across the multiple attributes and alternatives. Through a series of simulation studies, we compared the performance of these allocation procedures to the simple, but common, allocation procedure that distributed the sample budget equally across the alternatives and attributes. We found the allocation procedures that were developed based on the inclusion of decision-maker knowledge, such as knowledge of the decision model, outperformed those that neglected such information. Beginning with general knowledge of the attribute values provided by Bayesian prior distributions, and updating this knowledge with each observed sample, the sequential allocation procedure performed particularly well. These observations demonstrate that managing projects focused on a selection decision so that the decision modeling and the experimental planning are done jointly, rather than in isolation, can improve the overall selection results.
Resumo:
The selection of the optimal operating conditions for an industrial acrylonitrile recovery unit was conducted by the systematic application of the response surface methodology, based on the minimum energy consumption and products specifications as process constraints. Unit models and plant simulation were validated against operating data and information. A sensitivity analysis was carried out in order to identify the set of parameters that strongly affect the trajectories of the system while keeping products specifications. The results suggest that energy savings of up to 10% are possible by systematically adjusting operating conditions.
Resumo:
The analysis of investment in the electric power has been the subject of intensive research for many years. The efficient generation and distribution of electrical energy is a difficult task involving the operation of a complex network of facilities, often located over very large geographical regions. Electric power utilities have made use of an enormous range of mathematical models. Some models address time spans which last for a fraction of a second, such as those that deal with lightning strikes on transmission lines while at the other end of the scale there are models which address time horizons consisting of ten or twenty years; these usually involve long range planning issues. This thesis addresses the optimal long term capacity expansion of an interconnected power system. The aim of this study has been to derive a new, long term planning model which recognises the regional differences which exist for energy demand and which are present in the construction and operation of power plant and transmission line equipment. Perhaps the most innovative feature of the new model is the direct inclusion of regional energy demand curves in the nonlinear form. This results in a nonlinear capacity expansion model. After review of the relevant literature, the thesis first develops a model for the optimal operation of a power grid. This model directly incorporates regional demand curves. The model is a nonlinear programming problem containing both integer and continuous variables. A solution algorithm is developed which is based upon a resource decomposition scheme that separates the integer variables from the continuous ones. The decompostion of the operating problem leads to an interactive scheme which employs a mixed integer programming problem, known as the master, to generate trial operating configurations. The optimum operating conditions of each trial configuration is found using a smooth nonlinear programming model. The dual vector recovered from this model is subsequently used by the master to generate the next trial configuration. The solution algorithm progresses until lower and upper bounds converge. A range of numerical experiments are conducted and these experiments are included in the discussion. Using the operating model as a basis, a regional capacity expansion model is then developed. It determines the type, location and capacity of additional power plants and transmission lines, which are required to meet predicted electicity demands. A generalised resource decompostion scheme, similar to that used to solve the operating problem, is employed. The solution algorithm is used to solve a range of test problems and the results of these numerical experiments are reported. Finally, the expansion problem is applied to the Queensland electricity grid in Australia.