994 results for sample subset optimization (SSO)


Relevance: 30.00%

Abstract:

This paper presents the formulation of a combinatorial optimization problem with the following characteristics: (i) the search space is the power set of a finite set structured as a Boolean lattice; (ii) the cost function forms a U-shaped curve when applied to any lattice chain. This formulation applies to feature selection in the context of pattern recognition. The known approaches to this problem are branch-and-bound algorithms and heuristics that explore the search space only partially. Branch-and-bound algorithms are equivalent to the full search, while heuristics are not. This paper presents a branch-and-bound algorithm that differs from the known ones by exploiting the lattice structure and the U-shaped chain curves of the search space. The main contribution of this paper is the architecture of this algorithm, which is based on the representation and exploration of the search space through new lattice properties proven here. Several experiments with well-known public data indicate the superiority of the proposed method over sequential floating forward selection (SFFS), a popular heuristic that gives good results in very short computational time. In all experiments, the proposed method achieved better or equal results in similar or even smaller computational time. (C) 2009 Elsevier Ltd. All rights reserved.
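The pruning idea behind the U-shaped chain property can be sketched in a few lines. This is an illustrative toy, not the paper's algorithm: walking up any chain of the Boolean lattice, once the cost rises it cannot fall again on that chain, so the branch can be cut.

```python
def u_curve_search(features, cost):
    """Explore chains of the Boolean lattice bottom-up, cutting a branch
    as soon as the cost increases: with U-shaped chain costs, the minimum
    on that chain has already been passed. Toy sketch only."""
    best, best_cost = frozenset(), cost(frozenset())
    stack = [(frozenset(), best_cost)]
    visited = {frozenset()}
    while stack:
        node, c = stack.pop()
        if c < best_cost:
            best, best_cost = node, c
        for f in features:
            child = node | {f}
            if f in node or child in visited:
                continue
            visited.add(child)
            child_cost = cost(child)
            if child_cost <= c:  # still on the descending arm of the U
                stack.append((child, child_cost))
    return set(best), best_cost

# toy cost, U-shaped in subset size with its minimum at |S| = 2
best, value = u_curve_search("abcd", lambda s: (len(s) - 2) ** 2)
```

Once the cost along a chain starts to increase, no superset further up that chain is expanded, so large parts of the power set are never evaluated.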

Relevance: 30.00%

Abstract:

In this work, the separation of nine phenolic acids (benzoic, caffeic, chlorogenic, p-coumaric, ferulic, gallic, protocatechuic, syringic, and vanillic acid) was approached by a 3² factorial design in electrolytes consisting of sodium tetraborate buffer (STB) in the concentration range of 10-50 mmol L⁻¹ and methanol in the volume percentage of 5-20%. Derringer's desirability functions combined globally were tested as response functions. An optimal electrolyte composed of 50 mmol L⁻¹ tetraborate buffer at pH 9.2 and 7.5% (v/v) methanol allowed baseline resolution of all phenolic acids under investigation in less than 15 min. In order to promote sample clean-up, to preconcentrate the phenolic fraction, and to release esterified phenolic acids from the fruit matrix, elaborate liquid-liquid extraction procedures followed by alkaline hydrolysis were performed. The proposed methodology was fully validated (linearity from 10.0 to 100 µg mL⁻¹, R² > 0.999; LOD and LOQ from 1.32 to 3.80 µg mL⁻¹ and from 4.01 to 11.5 µg mL⁻¹, respectively; intra-day precision better than 2.8% CV for migration time and 5.4% CV for peak area; inter-day precision better than 4.8% CV for migration time and 4.8-11% CV for peak area; recoveries from 81% to 115%) and applied successfully to the evaluation of the phenolic contents of abiu-roxo (Chrysophyllum caimito), wild mulberry growing in Brazil (Morus nigra L.), and tree tomato (Cyphomandra betacea). Values in the range of 1.50-47.3 µg g⁻¹ were found, with smaller amounts occurring as free phenolic acids. (C) 2009 Elsevier B.V. All rights reserved.
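Derringer's approach maps each response onto a 0-1 desirability and combines them with a geometric mean, so a single unacceptable response zeroes the global score. A minimal sketch; the acceptance limits and resolution values below are invented, not those of the study:

```python
import math

def desirability(y, lo, hi, s=1.0):
    # one-sided "larger is better" Derringer-Suich transform:
    # 0 below the unacceptable limit lo, 1 above the target hi
    if y <= lo:
        return 0.0
    if y >= hi:
        return 1.0
    return ((y - lo) / (hi - lo)) ** s

def global_desirability(ds):
    # geometric mean: any zero individual desirability zeroes the result
    if min(ds) == 0.0:
        return 0.0
    return math.exp(sum(math.log(d) for d in ds) / len(ds))

# e.g. resolutions of two critical peak pairs (made-up numbers)
d1 = desirability(1.8, lo=1.0, hi=2.0)   # 0.8
d2 = desirability(1.5, lo=1.0, hi=2.0)   # 0.5
D = global_desirability([d1, d2])
```

The factorial-design points would each be scored by `D`, and the electrolyte composition maximizing it taken as the optimum.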

Relevance: 30.00%

Abstract:

Optimization of the photo-Fenton degradation of copper phthalocyanine blue was achieved by response surface methodology (RSM) constructed with the aid of a sequential injection analysis (SIA) system coupled to a homemade photo-reactor. The highest degradation percentage was obtained under the following conditions: [H₂O₂]/[phthalocyanine] = 7, [H₂O₂]/[FeSO₄] = 10, pH = 2.5, and stopped-flow time in the photo-reactor = 30 s. The SIA system was designed to prepare a monosegment containing the reagents and sample, to pump it toward the photo-reactor for the specified time, and to send the products to a flow-through spectrophotometer for monitoring the color reduction of the dye. Changes in parameters such as reagent molar ratios, residence time, and pH were made through modifications in the software commanding the SIA system, without the need for physical reconfiguration of reagents around the selection valve. The proposed procedure and system fed the statistical program with degradation data for fast construction of response surface plots. After optimization, 97% of the dye was degraded. (C) 2009 Elsevier B.V. All rights reserved.
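The RSM step amounts to fitting a second-order polynomial to the measured degradation and locating its stationary point. A minimal NumPy sketch with synthetic data; the factors and numbers are invented for illustration:

```python
import numpy as np

def fit_quadratic_surface(x1, x2, y):
    # y ≈ b0 + b1*x1 + b2*x2 + b3*x1² + b4*x2² + b5*x1*x2
    X = np.column_stack([np.ones_like(x1), x1, x2,
                         x1 * x1, x2 * x2, x1 * x2])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b

def stationary_point(b):
    # solve grad y = 0 for the candidate optimum of the fitted surface
    A = np.array([[2 * b[3], b[5]], [b[5], 2 * b[4]]])
    return np.linalg.solve(A, -b[1:3])

# synthetic degradation surface peaking at (pH 2.5, stop time 30 s)
g1, g2 = np.meshgrid(np.linspace(2, 3, 5), np.linspace(10, 50, 5))
x1, x2 = g1.ravel(), g2.ravel()
y = 97 - 40 * (x1 - 2.5) ** 2 - 0.01 * (x2 - 30) ** 2
opt = fit_quadratic_surface(x1, x2, y)
best = stationary_point(opt)
```

In the reported setup, the SIA system supplies the `y` values automatically for each factor combination, so the surface can be rebuilt quickly.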

Relevance: 30.00%

Abstract:

Mineral potential mapping is the process of combining a set of input maps, each representing a distinct geo-scientific variable, to produce a single map which ranks areas according to their potential to host deposits of a particular type. The maps are combined using a mapping function which must be either provided by an expert (knowledge-driven approach), or induced from sample data (data-driven approach). Current data-driven approaches using multilayer perceptrons (MLPs) to represent the mapping function have several inherent problems: they rely heavily on subjective judgment in selecting training data and are highly sensitive to this selection; they do not utilize the contextual information provided by unlabeled data; and, there is no objective interpretation of the values output by the MLP. This paper presents a novel approach which overcomes these three problems.
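One way to give the mapping function's outputs an objective interpretation is to fit a probabilistic classifier over the per-cell values of the input maps, so each output is a calibrated probability. A hedged sketch with invented data, using logistic regression as a stand-in for the MLP discussed above:

```python
import numpy as np

def train_logistic(X, y, lr=0.5, steps=2000):
    # plain gradient descent on the logistic loss; outputs are
    # probabilities, so each cell's score has a direct interpretation
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        z = np.clip(X @ w, -30, 30)       # avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(-z))
        w -= lr * X.T @ (p - y) / len(y)
    return w

# two invented evidence layers per grid cell, plus a bias column
rng = np.random.default_rng(0)
layers = rng.normal(size=(200, 2))
cells = np.column_stack([np.ones(200), layers])
deposit = (layers.sum(axis=1) > 0).astype(float)  # synthetic labels
w = train_logistic(cells, deposit)
prob = 1.0 / (1.0 + np.exp(-np.clip(cells @ w, -30, 30)))  # 0..1 map
```

Ranking cells by `prob` yields the potential map; the selection-sensitivity and unlabeled-data issues raised above are not addressed by this toy.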

Relevance: 30.00%

Abstract:

Background: Feature selection techniques are critical to the analysis of high-dimensional datasets. This is especially true in gene selection from microarray data, which commonly have an extremely high feature-to-sample ratio. In addition to essential objectives such as reducing data noise, reducing data redundancy, improving sample classification accuracy, and improving model generalization, feature selection also helps biologists focus on the selected genes to further validate their biological hypotheses.
Results: In this paper we describe an improved hybrid system for gene selection. It is based on a recently proposed genetic ensemble (GE) system. To enhance the generalization property of the selected genes or gene subsets and to overcome the overfitting problem of the GE system, we devised a mapping strategy to fuse the goodness information of each gene provided by multiple filtering algorithms. This information is then used for initialization and mutation operation of the genetic ensemble system.
Conclusion: We used four benchmark microarray datasets (including both binary-class and multi-class classification problems) for proof of concept and model evaluation. The experimental results indicate that the proposed multi-filter enhanced genetic ensemble (MF-GE) system is able to improve sample classification accuracy, generate more compact gene subsets, and converge to the selection results more quickly. The MF-GE system is very flexible, as various combinations of multiple filters and classifiers can be incorporated based on the data characteristics and user preferences.
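The fusion step can be sketched as rank-normalising each filter's gene scores and averaging them into per-gene selection probabilities that bias the GA's initial population. This is an illustration of the idea, not the paper's exact mapping strategy, and the scores are invented:

```python
import random

def fuse_filter_scores(score_lists):
    # each filter scores every gene; rank-normalise so that different
    # filters' score scales become comparable, then average the ranks
    n = len(score_lists[0])
    fused = [0.0] * n
    for scores in score_lists:
        order = sorted(range(n), key=lambda i: scores[i])
        for rank, i in enumerate(order):
            fused[i] += rank / (n - 1)
    return [f / len(score_lists) for f in fused]

def biased_individual(probs, rng):
    # include gene i in the chromosome with its fused probability
    return [1 if rng.random() < p else 0 for p in probs]

rng = random.Random(0)
probs = fuse_filter_scores([[0.2, 0.9, 0.5], [0.1, 0.8, 0.6]])
ind = biased_individual(probs, rng)
```

The same probabilities could also steer the mutation operator, so genes rated highly by several filters are flipped in more often than out.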

Relevance: 30.00%

Abstract:

Purpose: To compare tear film osmolarity measurements between in situ and vapor pressure osmometers. Repeatability of in situ measurements and the effect of sample collection techniques on tear film osmolarity were also evaluated.

Methods: Osmolarity was measured in one randomly determined eye of 52 healthy participants using the in situ (TearLab Corporation, San Diego, CA) and the vapor pressure (Vapro 5520; Wescor, Inc., Logan, UT) osmometers. In a subset of 20 participants, tear osmolarity was measured twice on-eye with the in situ osmometer and was additionally determined on a sample of nonstimulated collected tears (3 µL) with both instruments.

Results: Mean (SD) tear film osmolarity with the in situ osmometer was 299.2 (10.3) mOsmol/L compared with 298.4 (10) mmol/kg with the vapor pressure osmometer, which correlated moderately (r = 0.5, P < 0.05). Limits of agreement between the two instruments were -19.7 to +20.5 mOsmol/L. Using collected tears, measurements with the vapor pressure osmometer were marginally higher (mean [SD], 303.0 [11.0] vs 299.3 [8.0] mOsmol/L; P > 0.05) but correlated well with those using the in situ osmometer (r = 0.9, P < 0.05). The mean (SD) osmolarity of on-eye tears was 5.0 (6.6) mOsmol/L higher than that of collected tears, when both measurements were conducted with the in situ osmometer. This was a consistent effect because the measurements correlated well (r = 0.65, P < 0.05). The in situ osmometer showed good repeatability with a coefficient of repeatability of 9.4 mOsmol/L (r = 0.8, P < 0.05).
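The limits of agreement reported above are Bland-Altman 95% limits: the mean of the paired differences ± 1.96 times their standard deviation. A minimal sketch; the paired readings below are invented, not the study's data:

```python
import statistics

def limits_of_agreement(a, b):
    # Bland-Altman 95% limits for paired readings from two instruments
    d = [x - y for x, y in zip(a, b)]
    m = statistics.mean(d)
    s = statistics.stdev(d)
    return m - 1.96 * s, m + 1.96 * s

# invented paired osmolarity readings (mOsmol/L)
in_situ = [300.1, 295.4, 310.2, 299.8, 302.5]
vapor = [298.9, 297.0, 308.5, 301.2, 300.0]
lo, hi = limits_of_agreement(in_situ, vapor)
```

Narrow limits centred near zero indicate the instruments can be used interchangeably; the ±20 mOsmol/L limits above suggest they cannot.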

Conclusions: Correlation between the two instruments was better when compared on collected tear samples. Tear film osmolarity measurement is influenced by the sample collection technique with the osmolarity of on-eye tears being higher than that of collected tears. This highlights the importance of measuring tear film osmolarity directly on-eye. The in situ osmometer has good repeatability for conducting this measurement.

Relevance: 30.00%

Abstract:

The Intelligent Water Drop (IWD) algorithm is a recent stochastic swarm-based method that is useful for solving combinatorial and function optimization problems. In this paper, we investigate the effectiveness of the selection method in the solution construction phase of the IWD algorithm. Instead of the fitness proportionate selection method in the original IWD algorithm, two ranking-based selection methods, namely linear ranking and exponential ranking, are proposed. Both ranking-based selection methods aim to solve the identified limitations of the fitness proportionate selection method as well as to enable the IWD algorithm to escape from local optima and ensure its search diversity. To evaluate the usefulness of the proposed ranking-based selection methods, a series of experiments pertaining to three combinatorial optimization problems, i.e., rough set feature subset selection, multiple knapsack and travelling salesman problems, is conducted. The results demonstrate that the exponential ranking selection method is able to preserve the search diversity, therefore improving the performance of the IWD algorithm. © 2014 Elsevier Ltd. All rights reserved.
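Exponential ranking selection can be sketched as follows: selection probability depends only on a solution's rank, with weights decaying geometrically from best to worst, which decouples selection pressure from raw fitness magnitudes. This is an illustration of the general technique, not the paper's exact formulation:

```python
def exponential_ranking_probs(fitnesses, c=0.9):
    # rank 0 = worst solution; weight c**(n-1-rank) grows toward the
    # best, so relative fitness magnitudes no longer matter, only order
    n = len(fitnesses)
    order = sorted(range(n), key=lambda i: fitnesses[i])
    weights = [c ** (n - 1 - rank) for rank in range(n)]
    total = sum(weights)
    probs = [0.0] * n
    for rank, i in enumerate(order):
        probs[i] = weights[rank] / total
    return probs

# e.g. probabilities for a drop choosing among three candidate moves
probs = exponential_ranking_probs([3.1, 9.8, 5.5])
```

A smaller `c` sharpens the bias toward the best-ranked candidates; values near 1 flatten it, preserving search diversity.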

Relevance: 30.00%

Abstract:

Surface modification of precipitated calcium carbonate particles (calcite) in a planetary ball mill using stearic acid as a modification agent for making dispersion in hydrocarbon oil was investigated. Different parameters for processing (milling) such as milling time, ball-to-sample ratio, and molar ratio of the reactant were varied and analyzed for optimization. The physical properties of the hydrophobically modified calcium carbonate particles were measured; the particle size and morphology of the resulting samples were characterized by transmission electron microscopy and X-ray diffraction. The surface coating thickness was estimated using small angle X-ray scattering. © 2014 American Coatings Association & Oil and Colour Chemists' Association.

Relevance: 30.00%

Abstract:

Industrial producers face the task of optimizing the production process in an attempt to achieve the desired quality, such as mechanical properties, with the lowest energy consumption. In industrial carbon fiber production, the fibers are processed in bundles (batches) containing several thousand filaments, and consequently the energy optimization will be a stochastic process, as it involves uncertainty, imprecision, or randomness. This paper presents a stochastic optimization model to reduce energy consumption for a given range of desired mechanical properties. Several processing condition sets are developed and, for each set of conditions, 50 samples of fiber are analyzed for their tensile strength and modulus. The energy consumption during production of the samples is carefully monitored on the processing equipment. Then, five standard distribution functions are examined to determine those which can best describe the distribution of the mechanical properties of the filaments. To verify the goodness of fit and correlation statistics, the Kolmogorov-Smirnov test is used. In order to estimate the parameters of the selected distribution (Weibull), the maximum likelihood, least-squares, and genetic algorithm methods are compared. An array of factors, including the sample size, the confidence level, and the relative error of the estimated parameters, is used for evaluating the tensile strength and modulus properties. The energy consumption and N2 gas cost are modeled by the convex hull method. Finally, in order to optimize carbon fiber production quality, its energy consumption, and total cost, mixed integer linear programming is utilized. The results show that using the stochastic optimization models, we are able to predict the production quality in a given range and minimize the energy consumption of the industrial process.
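The least-squares route to the Weibull parameters mentioned above can be sketched via median-rank regression. The strength data below are synthetic quantiles, not the paper's measurements:

```python
import math

def weibull_lsq(data):
    """Least-squares (median-rank regression) estimate of the Weibull
    shape (beta) and scale (eta): linearise F(x) = 1 - exp(-(x/eta)**beta)
    as ln(-ln(1 - F)) = beta*ln(x) - beta*ln(eta). Illustrative only."""
    xs = sorted(data)
    n = len(xs)
    # Bernard's median-rank plotting positions (i - 0.3) / (n + 0.4)
    pts = [(math.log(x),
            math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))))
           for i, x in enumerate(xs, start=1)]
    mx = sum(p for p, _ in pts) / n
    my = sum(q for _, q in pts) / n
    beta = (sum((p - mx) * (q - my) for p, q in pts) /
            sum((p - mx) ** 2 for p, _ in pts))
    eta = math.exp(mx - my / beta)
    return beta, eta

# synthetic "tensile strength" sample: quantiles of a Weibull(beta=3, eta=2)
sample = [2.0 * (-math.log(1.0 - (i - 0.5) / 100)) ** (1 / 3)
          for i in range(1, 101)]
beta_hat, eta_hat = weibull_lsq(sample)
```

The maximum-likelihood and genetic-algorithm estimators compared in the paper would replace this regression step; the linearised form is the simplest of the three.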

Relevance: 30.00%

Abstract:

Prognosis, such as predicting mortality, is common in medicine. When confronted with small numbers of samples, as in rare medical conditions, the task is challenging. We propose a framework for classification with data with small numbers of samples. Conceptually, our solution is a hybrid of multi-task and transfer learning, employing data samples from source tasks as in transfer learning, but considering all tasks together as in multi-task learning. Each task is modelled jointly with other related tasks by directly augmenting the data from other tasks. The degree of augmentation depends on the task relatedness and is estimated directly from the data. We apply the model on three diverse real-world data sets (healthcare data, handwritten digit data and face data) and show that our method outperforms several state-of-the-art multi-task learning baselines. We extend the model for online multi-task learning where the model parameters are incrementally updated given new data or new tasks. The novelty of our method lies in offering a hybrid multi-task/transfer learning model to exploit sharing across tasks at the data-level and joint parameter learning.
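The data-level augmentation idea can be sketched as a toy ridge model for the target task that pools in the source task's samples, down-weighted by an estimated relatedness. The relatedness proxy here is an invented stand-in for the paper's estimator, and all data are synthetic:

```python
import numpy as np

def relatedness(Xt, Xs):
    # crude proxy for task relatedness: closeness of feature means
    # (an assumption; the paper estimates relatedness from the data)
    d = np.linalg.norm(Xt.mean(axis=0) - Xs.mean(axis=0))
    return 1.0 / (1.0 + d)

def augmented_ridge(Xt, yt, Xs, ys, lam=1e-2):
    # target samples get full weight, source samples weight rho
    rho = relatedness(Xt, Xs)
    X = np.vstack([Xt, Xs])
    y = np.concatenate([yt, ys])
    w = np.concatenate([np.ones(len(yt)), np.full(len(ys), rho)])
    A = (X * w[:, None]).T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, (X * w[:, None]).T @ y)

rng = np.random.default_rng(1)
true_w = np.array([1.0, -1.0, 0.5])
Xt = rng.normal(size=(10, 3));  yt = np.sign(Xt @ true_w)   # tiny target
Xs = rng.normal(size=(200, 3)); ys = np.sign(Xs @ true_w)   # related source
w_hat = augmented_ridge(Xt, yt, Xs, ys)
```

With only ten target samples the pooled fit recovers a usable decision direction, which is the point of augmenting rare-condition tasks with related data.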

Relevance: 30.00%

Abstract:

It is crucial for a neuron spike sorting algorithm to cluster data from different neurons efficiently. In this study, the search capability of the Genetic Algorithm (GA) is exploited for identifying the optimal feature subset for neuron spike sorting with a clustering algorithm. Two important objectives of the optimization process are considered: to reduce the number of features and increase the clustering performance. Specifically, we employ a binary GA with the silhouette evaluation criterion as the fitness function for neuron spike sorting using the Super-Paramagnetic Clustering (SPC) algorithm. The clustering results of SPC with and without the GA-based feature selector are evaluated using benchmark synthetic neuron spike data sets. The outcome indicates the usefulness of the GA in identifying a smaller feature set with improved clustering performance.
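The silhouette fitness the GA maximises can be sketched in plain Python: a toy clustering evaluator that assumes every cluster holds at least two points, standing apart from the SPC algorithm itself:

```python
import math

def silhouette_score(points, labels):
    # mean silhouette width: (b - a) / max(a, b) per point, where a is
    # the mean distance within the point's cluster and b the smallest
    # mean distance to another cluster (assumes no singleton clusters)
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    total = 0.0
    for p, l in zip(points, labels):
        a = (sum(math.dist(p, q) for q in clusters[l]) /
             (len(clusters[l]) - 1))
        b = min(sum(math.dist(p, q) for q in c) / len(c)
                for k, c in clusters.items() if k != l)
        total += (b - a) / max(a, b)
    return total / len(points)

# two well-separated toy "spike feature" clusters
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labels = [0, 0, 1, 1]
score = silhouette_score(points, labels)
```

A binary chromosome would mask feature columns before clustering, and the GA keeps subsets whose silhouette is high while penalising subset size.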

Relevance: 30.00%

Abstract:

Markovian algorithms for estimating the global maximum or minimum of real-valued functions defined on some domain Ω ⊂ ℝᵈ are presented. Conditions on the search schemes that preserve the asymptotic distribution are derived. Global and local search schemes satisfying these conditions are analysed and shown to yield sharper confidence intervals when compared to the i.i.d. case.
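The kind of Markovian search analysed here can be illustrated with a simple chain on Ω = [0, 1] whose uphill-acceptance probability shrinks over time. This is an illustration of a Markovian minimiser, not one of the paper's schemes:

```python
import random

def markov_minimise(f, steps=5000, rng=None):
    # random-walk Markov chain: always accept downhill moves, accept
    # uphill moves with probability 1/t so the chain settles near a
    # minimum while retaining some early exploration
    rng = rng or random.Random(0)
    x = rng.random()
    fx = f(x)
    best, fbest = x, fx
    for t in range(1, steps + 1):
        y = min(1.0, max(0.0, x + rng.gauss(0, 0.1)))  # stay in [0, 1]
        fy = f(y)
        if fy <= fx or rng.random() < 1.0 / t:
            x, fx = y, fy
            if fx < fbest:
                best, fbest = x, fx
    return best, fbest

best, fbest = markov_minimise(lambda x: (x - 0.3) ** 2)
```

The dependence between successive states is exactly what the paper's conditions control: the visited values are no longer i.i.d., yet the extreme-value estimate can still be sharper.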

Relevance: 30.00%

Abstract:

An economic-statistical model is developed for variable parameters (VP) X̄ charts in which all design parameters vary adaptively, that is, each of the design parameters (sample size, sampling interval, and control-limit width) varies as a function of the most recent process information. The cost function for controlling the process quality through a VP X̄ chart is derived. During the optimization of the cost function, constraints are imposed on the expected times to signal when the process is in and out of control. In this way, the required statistical properties can be assured. Through a numerical example, the proposed economic-statistical design approach for VP X̄ charts is compared to the economic design for VP X̄ charts and to the economic-statistical and economic designs for fixed parameters (FP) X̄ charts in terms of the operating cost and the expected times to signal. From this example, it is possible to assess the benefits provided by the proposed model. Varying some input parameters, their effect on the optimal cost and on the optimal values of the design parameters was analysed.
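The adaptive logic can be sketched as a zone rule: the position of the current sample mean determines the sample size, sampling interval, and control-limit width of the next sample. The design values below are invented illustrations, not the cost-optimal values of the model:

```python
def next_design(xbar, mu0, sigma, n, w=1.0):
    """Toy VP rule: pick the next sample's design from the zone of the
    current sample mean (warning limit w, in units of sigma/sqrt(n))."""
    z = abs(xbar - mu0) / (sigma / n ** 0.5)
    if z >= 3.0:
        return {"signal": True}       # beyond the control limit
    if z < w:
        # central zone: relax (small sample, long interval, wide limits)
        return {"n": 3, "interval_h": 1.0, "limit_k": 3.0}
    # warning zone: tighten (large sample, short interval, narrow limits)
    return {"n": 8, "interval_h": 0.25, "limit_k": 2.5}
```

The economic-statistical design searches over the two design triples (and the warning limit) to minimise expected operating cost subject to the constraints on expected times to signal.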

Relevance: 30.00%

Abstract:

The main variables in the procedure for dissolving silicate rocks by acid attack in an open Teflon vessel, for the analysis of micro-elements by ICP-AES, were determined. The results obtained for some samples showed a strong dependence on the mineralogical composition of the rock, so an alkaline fusion step after the acid dissolution was recommended. The decomposition procedure uses 20 mL of a 3:1 HF:HNO3 acid mixture for a 250 mg fraction of pulverized sample. The recommended temperatures were 60 °C for the attack and 90 °C for acid volatilization. The fusion step with 50 mg LiBO2 at 1000 °C may be used if unattacked residue is observed in the solution. The whole procedure takes 6 h per sample. Nine types of silicate rocks with different mineralogical and chemical compositions were chosen for the optimization of the variables. The elements used were Ce, Y, Yb, and Zr. In addition, ultrasonic nebulization was used. The relative standard deviations obtained for five determinations were 0.7% and 1.4% for triplicate samples. The mineralogical and textural information from the petrographic analysis of the samples indicated the need to extend the fusion step in the optimized procedure.