40 results for stochastic search variable selection
in Repositório Científico do Instituto Politécnico de Lisboa - Portugal
Abstract:
Tuberculosis (TB) is a worldwide infectious disease that has shown extremely high mortality levels over time. The urgent need to develop new antitubercular drugs stems from the increasing rate of appearance of strains resistant to multiple commonly used drugs, and from the longer durations of therapy and recovery, particularly in immuno-compromised patients. The major goal of the present study is the exploration of data from different families of compounds through a variety of machine learning techniques, so that robust QSAR-based models can be developed to further guide the quest for new potent anti-TB compounds. Eight QSAR models were built using various types of descriptors (from the ADRIANA.Code and Dragon software) with two publicly available, structurally diverse data sets, including recent data deposited in PubChem. The QSAR methodologies used Random Forests and Associative Neural Networks. Predictions for the external evaluation sets reached accuracies in the range 0.76-0.88 (for active/inactive classification) and Q² = 0.66-0.89 for regression. The models developed in this study can be used to estimate the anti-TB activity of drug candidates at early stages of drug development.
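As a rough illustration of the Random Forests classification step described above, the sketch below trains a scikit-learn classifier on a placeholder descriptor matrix; the names X and y and the random data stand in for the ADRIANA.Code/Dragon descriptors and activity labels, which are not reproduced here, and the associative neural network models are not covered.

```python
# Minimal sketch of a QSAR classification workflow with Random Forests.
# Assumes descriptors (e.g., exported from Dragon or ADRIANA.Code) are already
# available as a numeric matrix X and activity labels as y; the descriptor
# computation itself is not reproduced here.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 100))          # placeholder descriptor matrix
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # placeholder active/inactive labels

X_tr, X_ev, y_tr, y_ev = train_test_split(X, y, test_size=0.25, random_state=0)
model = RandomForestClassifier(n_estimators=500, random_state=0)
model.fit(X_tr, y_tr)
print("external-set accuracy:", accuracy_score(y_ev, model.predict(X_ev)))
```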
Abstract:
Research on the problem of feature selection for clustering continues to develop. This is a challenging task, mainly due to the absence of class labels to guide the search for relevant features. Categorical feature selection for clustering has rarely been addressed in the literature, with most of the proposed approaches having focused on numerical data. In this work, we propose an approach to simultaneously cluster categorical data and select a subset of relevant features. Our approach is based on a modification of a finite mixture model (of multinomial distributions), in which a set of latent variables indicates the relevance of each feature. To estimate the model parameters, we implement a variant of the expectation-maximization algorithm that simultaneously selects the subset of relevant features, using a minimum message length criterion. The proposed approach compares favourably with two baseline methods: a filter based on an entropy measure and a wrapper based on mutual information. The results obtained on synthetic data illustrate the ability of the proposed expectation-maximization method to recover ground truth. An application to real data from official statistics shows its usefulness.
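The paper's feature-saliency latent variables and minimum message length criterion are not reproduced below; the sketch only shows the basic EM loop for a finite mixture of categorical (multinomial) distributions that such an approach builds on, with an invented toy data set.

```python
# Minimal sketch of EM for a finite mixture of categorical (multinomial)
# distributions. The feature-saliency latent variables and the minimum
# message length (MML) criterion from the paper are omitted; this only shows
# the basic clustering machinery those extensions build on.
import numpy as np

def em_categorical_mixture(X, n_components, n_categories, n_iter=100, seed=0):
    """X: (N, D) integer-coded categorical data; n_categories: categories per feature."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    pi = np.full(n_components, 1.0 / n_components)
    # theta[k][d] is a probability vector over the categories of feature d
    theta = [[rng.dirichlet(np.ones(n_categories[d])) for d in range(D)]
             for _ in range(n_components)]
    for _ in range(n_iter):
        # E-step: responsibilities, computed in the log domain for stability
        log_r = np.tile(np.log(pi), (N, 1))
        for k in range(n_components):
            for d in range(D):
                log_r[:, k] += np.log(theta[k][d][X[:, d]] + 1e-12)
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update mixing weights and per-feature categorical parameters
        pi = r.mean(axis=0)
        for k in range(n_components):
            for d in range(D):
                counts = np.array([r[X[:, d] == v, k].sum()
                                   for v in range(n_categories[d])])
                theta[k][d] = (counts + 1e-6) / (counts.sum() + 1e-6 * len(counts))
    return pi, theta, r

# Toy usage: two clusters that differ only in the first feature
rng = np.random.default_rng(1)
X = np.vstack([np.c_[np.zeros(50, int), rng.integers(0, 3, 50)],
               np.c_[np.ones(50, int), rng.integers(0, 3, 50)]])
pi, theta, resp = em_categorical_mixture(X, n_components=2, n_categories=[2, 3])
print(np.round(pi, 2))
```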
Abstract:
Feature selection is a central problem in machine learning and pattern recognition. On large datasets (in terms of dimension and/or number of instances), using search-based or wrapper techniques can be computationally prohibitive. Moreover, many filter methods based on relevance/redundancy assessment also take a prohibitively long time on high-dimensional datasets. In this paper, we propose efficient unsupervised and supervised feature selection/ranking filters for high-dimensional datasets. These methods use low-complexity relevance and redundancy criteria, are applicable to supervised, semi-supervised, and unsupervised learning, and can act as pre-processors for computationally intensive methods, focusing their attention on smaller subsets of promising features. The experimental results, with up to 10⁵ features, show the time efficiency of our methods, with lower generalization error than state-of-the-art techniques, while being dramatically simpler and faster.
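A minimal sketch of a generic relevance/redundancy ranking filter is given below; the variance and absolute-correlation criteria used here are simple stand-ins and not necessarily the exact low-complexity measures proposed in the paper.

```python
# Hedged sketch of a simple relevance/redundancy filter for feature ranking.
# Relevance = feature variance (unsupervised) or |correlation with the label|
# (supervised); redundancy = |pairwise correlation| above a threshold. These
# are generic stand-ins, not the paper's specific criteria.
import numpy as np

def rank_and_prune(X, y=None, redundancy_thr=0.9):
    N, D = X.shape
    if y is None:
        relevance = X.var(axis=0)                       # unsupervised relevance
    else:
        Xc = X - X.mean(axis=0)
        yc = y - y.mean()
        denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
        relevance = np.abs(Xc.T @ yc) / denom           # |corr(feature, label)|
    order = np.argsort(-relevance)                      # most relevant first
    kept = []
    for j in order:
        # drop feature j if it is too correlated with an already kept feature
        redundant = any(abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) > redundancy_thr
                        for k in kept)
        if not redundant:
            kept.append(j)
    return kept

# Toy usage: feature 1 duplicates feature 0 and should be pruned
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=200)
y = X[:, 0] + rng.normal(size=200)
print(rank_and_prune(X, y))
```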
Abstract:
This paper addresses the self-scheduling problem for a thermal power producer taking part in a pool-based electricity market as a price-taker, with bilateral contracts and emission constraints. An approach based on stochastic mixed-integer linear programming is proposed for solving the self-scheduling problem. Uncertainty regarding the electricity price is considered through a set of scenarios computed by simulation and scenario reduction. Thermal units are modelled by variable costs, start-up costs and technical operating constraints, such as forbidden operating zones, ramp up/down limits and minimum up/down time limits. A requirement on emission allowances to mitigate the carbon footprint is modelled by a stochastic constraint. Supply functions for different emission allowance levels are assessed in order to establish the optimal bidding strategy. A case study is presented to illustrate the usefulness and proficiency of the proposed approach in supporting bidding strategies.
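The sketch below sets up a heavily simplified scenario-based self-scheduling MILP in PuLP, assuming a single unit, a handful of invented price scenarios and an aggregate emission cap; the ramp limits, start-up costs, forbidden operating zones and minimum up/down times from the paper are omitted, and all numbers are illustrative.

```python
# Hedged sketch of a two-stage, scenario-based self-scheduling MILP for a
# single thermal unit. The commitment u[t] is shared across scenarios while
# dispatch p[s][t] adapts to each price scenario.
from pulp import LpProblem, LpVariable, LpMaximize, LpBinary, lpSum

T, S = 4, 3                                  # hours, price scenarios
price = [[40, 55, 70, 50], [30, 45, 60, 40], [50, 65, 80, 60]]  # EUR/MWh
prob_s = [0.3, 0.4, 0.3]                     # scenario probabilities
p_min, p_max = 50.0, 200.0                   # MW
var_cost = 35.0                              # EUR/MWh
emis_rate, allowance = 0.9, 600.0            # tCO2/MWh, tCO2 over the horizon

m = LpProblem("self_scheduling", LpMaximize)
u = LpVariable.dicts("u", range(T), cat=LpBinary)
p = LpVariable.dicts("p", (range(S), range(T)), lowBound=0)

m += lpSum(prob_s[s] * (price[s][t] - var_cost) * p[s][t]
           for s in range(S) for t in range(T))          # expected profit
for s in range(S):
    for t in range(T):
        m += p[s][t] >= p_min * u[t]                     # minimum stable output
        m += p[s][t] <= p_max * u[t]                     # capacity limit
    m += lpSum(emis_rate * p[s][t] for t in range(T)) <= allowance
m.solve()
print([int(u[t].value()) for t in range(T)])             # commitment schedule
```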
Abstract:
Materials selection is a matter of great importance to engineering design, and software tools are valuable to inform decisions in the early stages of product development. However, when a set of alternative materials is available for the different parts a product is made of, the question of which optimal material mix to choose for a group of parts is not trivial, so the engineer/designer typically proceeds part by part. Optimizing each part per se can lead to a globally sub-optimal solution from the product point of view. An optimization procedure is therefore needed that can deal with products with multiple parts, each with discrete design variables, and determine the optimal solution under different objectives. To solve this multiobjective optimization problem, a new routine based on the Direct MultiSearch (DMS) algorithm is created. Results from the Pareto front can help the designer align his/her materials selection, for a complete set of materials, with the product attribute objectives, depending on the relative importance of each objective.
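To make the discrete multiobjective problem concrete, the toy sketch below enumerates material choices for two hypothetical parts and keeps the Pareto-optimal combinations for mass and cost; exhaustive enumeration is used only because the example is tiny, whereas the paper applies Direct MultiSearch (DMS). All materials and property values are invented.

```python
# Illustrative sketch: pick one material per part from a discrete catalogue and
# keep the Pareto-optimal (non-dominated) combinations for two objectives,
# mass and cost, both to be minimized.
from itertools import product

# per-part catalogue: material -> (mass_kg, cost_eur); values are illustrative
parts = [
    {"steel": (2.0, 5.0), "aluminium": (0.9, 9.0), "cfrp": (0.5, 30.0)},
    {"steel": (1.5, 4.0), "aluminium": (0.7, 7.0)},
]

def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

candidates = []
for combo in product(*(p.items() for p in parts)):
    mass = sum(v[0] for _, v in combo)
    cost = sum(v[1] for _, v in combo)
    candidates.append((tuple(name for name, _ in combo), (mass, cost)))

pareto = [c for c in candidates
          if not any(dominates(o[1], c[1]) for o in candidates)]
for names, (mass, cost) in pareto:
    print(names, f"mass={mass:.1f} kg cost={cost:.1f} EUR")
```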
Abstract:
The interplay of seasonality, the system's nonlinearities and intrinsic stochasticity is studied for a seasonally forced susceptible-exposed-infective-recovered (SEIR) stochastic model. The model is explored in the parameter region that corresponds to childhood infectious diseases such as measles. The power spectrum of the stochastic fluctuations around the attractors of the deterministic system that describes the model in the thermodynamic limit is computed analytically and validated by stochastic simulations for large system sizes. Size effects are studied through additional simulations. Other effects, such as switching between coexisting attractors induced by stochasticity, often mentioned in the literature as playing an important role in the dynamics of childhood infectious diseases, are also investigated. The main conclusion is that stochastic amplification, rather than these effects, is the key ingredient for understanding the observed incidence patterns.
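The sketch below runs one discrete-time stochastic realisation of a seasonally forced SEIR model with illustrative measles-like parameters and estimates the power spectrum of the infective fluctuations by FFT; it does not reproduce the paper's analytic calculation around the deterministic attractor.

```python
# Sketch of a seasonally forced stochastic SEIR simulation and the power
# spectrum of the fluctuations in the number of infectives. Parameter values
# are illustrative (measles-like); the paper's analytic spectrum is not shown.
import numpy as np

rng = np.random.default_rng(1)
N = 1_000_000                          # population size
beta0, beta1 = 1250.0, 0.1             # contact rate (1/yr) and seasonal amplitude
sigma, gamma, mu = 365/8, 365/5, 1/50  # 1/latency, 1/infectious period, birth rate (1/yr)
dt, years = 1/365, 60

S, E, I = int(0.06 * N), 400, 250
series = []
steps = int(years / dt)
for step in range(steps):
    t = step * dt
    beta = beta0 * (1 + beta1 * np.cos(2 * np.pi * t))
    new_inf = rng.binomial(S, 1 - np.exp(-beta * I / N * dt))   # S -> E
    new_sym = rng.binomial(E, 1 - np.exp(-sigma * dt))          # E -> I
    new_rec = rng.binomial(I, 1 - np.exp(-gamma * dt))          # I -> R
    S += rng.poisson(mu * N * dt) - new_inf - rng.binomial(S - new_inf, mu * dt)
    E += new_inf - new_sym - rng.binomial(E - new_sym, mu * dt)
    I += new_sym - new_rec - rng.binomial(I - new_rec, mu * dt)
    series.append(I)

x = np.asarray(series[steps // 2:], dtype=float)   # discard the transient
x -= x.mean()
power = np.abs(np.fft.rfft(x)) ** 2
freqs = np.fft.rfftfreq(len(x), d=dt)              # cycles per year
peak = freqs[1:][np.argmax(power[1:])]
print(f"dominant fluctuation period ~ {1/peak:.2f} years")
```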
Abstract:
Lossless compression algorithms of the Lempel-Ziv (LZ) family are widely used nowadays. Regarding time and memory requirements, LZ encoding is much more demanding than decoding. In order to speed up the encoding process, efficient data structures, like suffix trees, have been used. In this paper, we explore the use of suffix arrays to hold the dictionary of the LZ encoder, and propose an algorithm to search over it. We show that the resulting encoder attains roughly the same compression ratios as those based on suffix trees. However, the amount of memory required by the suffix array is fixed, and much lower than the variable amount of memory used by encoders based on suffix trees (which depends on the text to encode). We conclude that suffix arrays, when compared to suffix trees in terms of the trade-off among time, memory, and compression ratio, may be preferable in scenarios (e.g., embedded systems) where memory is at a premium and high speed is not critical.
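The toy sketch below illustrates the core operation, longest-match search over a suffix array. The suffix array is built naively and the search simply inspects the neighbours of the query's insertion position; a practical encoder would use an efficient suffix-array construction and a bounded dictionary, as discussed in the paper. (The bisect key argument requires Python 3.10+.)

```python
# Toy sketch of LZ-style longest-match search over a suffix array.
from bisect import bisect_left

def suffix_array(s):
    # naive O(n^2 log n) construction, fine for a toy example
    return sorted(range(len(s)), key=lambda i: s[i:])

def lcp_len(a, b):
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n

def longest_match(dictionary, sa, lookahead):
    """Return (offset, length) of the longest dictionary substring matching a prefix of lookahead."""
    pos = bisect_left(sa, lookahead, key=lambda i: dictionary[i:])
    best = (0, 0)
    for j in (pos - 1, pos):          # the best match is adjacent to the insertion point
        if 0 <= j < len(sa):
            length = lcp_len(dictionary[sa[j]:], lookahead)
            if length > best[1]:
                best = (sa[j], length)
    return best

text = "abracadabra_abracadabra"
cut = 12                              # dictionary = text already encoded
dictionary, lookahead = text[:cut], text[cut:]
sa = suffix_array(dictionary)
print(longest_match(dictionary, sa, lookahead))   # (0, 11): "abracadabra"
```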
Abstract:
Motion-compensated frame interpolation (MCFI) is one of the most efficient solutions to generate side information (SI) in the context of distributed video coding. However, it creates SI with rather significant motion-compensated errors for some frame regions and rather small errors for others, depending on the video content. In this paper, a low-complexity Intra mode selection algorithm is proposed to select the most 'critical' blocks in the WZ frame and help the decoder with some reliable data for those blocks. For each block, the novel coding mode selection algorithm estimates the encoding rate for the Intra-based and WZ coding modes and determines the best coding mode while maintaining a low encoder complexity. The proposed solution is evaluated in terms of rate-distortion performance, with improvements of up to 1.2 dB over a WZ-coding-mode-only solution.
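As a rough illustration only, the sketch below flags 'critical' blocks with a simple SAD measure between co-located blocks of the two key frames; this proxy is an assumption made here for illustration and is not the paper's rate-estimation-based mode decision.

```python
# Simplified sketch of block "criticality" selection for a Wyner-Ziv (WZ)
# frame: blocks whose co-located content changes most between the two key
# frames are assumed to be poorly predicted by MCFI and flagged for Intra coding.
import numpy as np

def critical_blocks(key_prev, key_next, block=8, fraction=0.1):
    """Return (row, col) block indices of the most 'critical' blocks."""
    h, w = key_prev.shape
    scores = {}
    for r in range(0, h - block + 1, block):
        for c in range(0, w - block + 1, block):
            diff = np.abs(key_prev[r:r+block, c:c+block].astype(int)
                          - key_next[r:r+block, c:c+block].astype(int))
            scores[(r // block, c // block)] = diff.sum()   # SAD between key frames
    n_critical = max(1, int(fraction * len(scores)))
    return sorted(scores, key=scores.get, reverse=True)[:n_critical]

# Toy usage with random 8-bit frames and one changed region
rng = np.random.default_rng(0)
prev = rng.integers(0, 256, (64, 64), dtype=np.uint8)
next_ = prev.copy()
next_[16:24, 32:40] = rng.integers(0, 256, (8, 8), dtype=np.uint8)
print(critical_blocks(prev, next_))   # block (2, 4) should be among the flagged ones
```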
Abstract:
This paper describes experimental work done in the search for more profitable and sustainable alternatives for biodiesel production, using heterogeneous catalysts instead of the conventional homogeneous alkaline catalysts, such as NaOH, KOH or sodium methoxide, for the methanolysis reaction. This experimental work is a first stage in the development and optimization of new solid catalysts able to produce biodiesel from vegetable oils. The heterogeneous catalytic process differs in many ways from the homogeneous process currently used in industry. The main advantage is that it requires lower investment costs, since there is no need for methanol/catalyst, biodiesel/catalyst and glycerine/catalyst separation steps. This work resulted in the selection of CaO and Li-modified CaO catalysts, which showed very good catalytic performance with high activity and stability. In fact, FAME yields higher than 92% were observed in two consecutive reaction batches without expensive intermediate reactivation procedures. Therefore, these catalysts appear to be suitable for biodiesel production.
Abstract:
Reclaimed water from small wastewater treatment facilities in the rural areas of the Beira Interior region (Portugal) may constitute an alternative water source for aquifer recharge. A 21-month monitoring period in a constructed wetland treatment system has shown that 21,500 m³ year⁻¹ of treated wastewater (reclaimed water) could be used for aquifer recharge. A GIS-based multi-criteria analysis was performed, combining ten thematic maps and economic, environmental and technical criteria, in order to produce a suitability map for the location of sites for reclaimed water infiltration. The areas chosen for aquifer recharge with infiltration basins are mainly composed of anthrosol, more than 1 m deep and with a fine sand texture, which allows an average infiltration velocity of up to 1 m d⁻¹. These characteristics will provide a final polishing treatment of the reclaimed water after infiltration (soil aquifer treatment, SAT), suitable for the removal of the residual load (trace organics, nutrients, heavy metals and pathogens). The risk of groundwater contamination is low, since the water table in the anthrosol areas ranges from 10 m to 50 m. On the other hand, these depths guarantee an unsaturated zone suitable for SAT. An area of 13,944 ha was selected for study, but only 1607 ha are suitable for reclaimed water infiltration. Approximately 1280 m² were considered enough to set up 4 infiltration basins working in flooding and drying cycles.
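A back-of-the-envelope check of the basin sizing figures quoted above, assuming (for illustration only) a 50% flooding/drying duty cycle:

```python
# Arithmetic check of the basin sizing. The 50% duty cycle is an assumption
# made here for illustration, not a value taken from the paper.
reclaimed = 21_500            # m3/year of treated wastewater
infiltration = 1.0            # m/day average infiltration velocity
duty_cycle = 0.5              # assumed fraction of time a basin is flooded

daily_volume = reclaimed / 365                      # ~59 m3/day to infiltrate
area_needed = daily_volume / (infiltration * duty_cycle)
print(f"required wetted area ~ {area_needed:.0f} m2 "
      f"(vs ~1280 m2 available over 4 basins)")
```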
Abstract:
We study the implications of the searches based on H → τ⁺τ⁻ by the ATLAS and CMS collaborations for the parameter space of the two-Higgs-doublet model (2HDM). In the 2HDM, the scalars can decay into a tau pair with a branching ratio larger than the SM one, leading to constraints on the 2HDM parameter space. We show that in model II, values of tan β > 1.8 are definitively excluded if the pseudoscalar is in the mass range 110 GeV < m_A < 145 GeV. We also discuss the implications for the 2HDM of the recent dimuon search by the ATLAS collaboration for a CP-odd scalar in the mass range 4-12 GeV.
Abstract:
Introduction – The increase in microgravity exposure time causes musculoskeletal deconditioning that needs to be prevented through training. Objectives – To identify the patterns of these alterations and to describe the training programmes used for their prevention during and after microgravity exposure. Methods – This literature review is based on a search conducted via MEDLINE/PubMed and PEDro using the following search words: "spaceflight rehabilitation", "spaceflight muscle", "microgravity muscle" and "bed rest muscle". The search was followed by an article selection. Results – The studies found report a differential muscle-tendon response, with training protecting these structures totally or partially. Conclusion – According to the literature, high-intensity, low-repetition resistance training combined with specific exercises is the most appropriate approach to address the deconditioning.
Abstract:
Power converters play a vital role in the integration of wind power into the electrical grid. Variable-speed wind turbine generator systems are of considerable interest for grid connection at constant frequency. In this paper, comprehensive simulation studies are carried out with three power converter topologies: matrix, two-level and multilevel. A fractional-order control strategy is studied for the variable-speed operation of wind turbine generator systems. The studies compare power converter topologies and control strategies, and reveal that the multilevel converter and the proposed fractional-order control strategy enable an improvement in power quality, in comparison with the other power converters using a classical integer-order control strategy.
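As an illustration of what a fractional-order control law involves, the sketch below approximates a fractional-order PI^λ controller with a truncated Grünwald-Letnikov sum and applies it to a toy first-order plant; the gains, the order λ and the plant are illustrative and not the tuning studied in the paper.

```python
# Minimal sketch of a fractional-order PI^lambda control law, with the
# fractional integral approximated by a truncated Grunwald-Letnikov sum.
import numpy as np

def gl_coefficients(alpha, n):
    """Coefficients c_j = (-1)^j * binom(alpha, j) for the GL approximation."""
    c = np.empty(n)
    c[0] = 1.0
    for j in range(1, n):
        c[j] = c[j - 1] * (1.0 - (alpha + 1.0) / j)
    return c

# Fractional-order PI controller: u = Kp*e + Ki * D^(-lam) e
Kp, Ki, lam, h = 2.0, 1.0, 0.7, 0.01          # gains, fractional order, step (s)
steps = 1000
c = gl_coefficients(-lam, steps)              # alpha = -lam -> fractional integral

e_hist = np.zeros(steps)
y, ref = 0.0, 1.0
for k in range(steps):
    e_hist[k] = ref - y
    # GL sum: h^lam * sum_j c_j * e[k-j] approximates the fractional integral
    frac_int = h**lam * np.dot(c[:k + 1], e_hist[k::-1])
    u = Kp * e_hist[k] + Ki * frac_int
    y += h * (-y + u)                         # toy first-order plant dy/dt = -y + u
print(f"final output y = {y:.3f} (reference = {ref})")
```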
Abstract:
As wind power generation undergoes rapid growth, new technical challenges emerge: dynamic stability and power quality. The influence of wind speed disturbances and a pitch control malfunction on the quality of the energy injected into the electric grid is studied for variable-speed wind turbines with different power-electronic converter topologies. Additionally, a new control strategy is proposed for the variable-speed operation of wind turbines with permanent magnet synchronous generators. The performance of disturbance attenuation and system robustness is ascertained. Simulation results are presented and conclusions are duly drawn.
Abstract:
Final project submitted for the degree of Master in Civil Engineering (Mestre em Engenharia Civil).