Digital Library

153 results for attribute subset selection

in CentAUR: Central Archive University of Reading - UK

Sparse model identification using orthogonal forward regression with basis pursuit and D-optimality

Relevance: 80.00%

Abstract:

An efficient model identification algorithm for a large class of linear-in-the-parameters models is introduced that simultaneously optimises the model approximation ability, sparsity and robustness. The derived model parameters in each forward regression step are initially estimated via orthogonal least squares (OLS) and then tuned with a new gradient-descent learning algorithm based on basis pursuit that minimises the l1 norm of the parameter estimate vector. The model subset selection cost function includes a D-optimality design criterion that maximises the determinant of the design matrix of the subset to ensure model robustness and to enable the model selection procedure to terminate automatically at a sparse model. The proposed approach is based on the forward OLS algorithm using the modified Gram-Schmidt procedure. Both the parameter tuning procedure, based on basis pursuit, and the model selection criterion, based on D-optimality, which is effective in ensuring model robustness, are integrated with the forward regression. As a consequence, the inherent computational efficiency of the conventional forward OLS approach is maintained in the proposed algorithm. Examples demonstrate the effectiveness of the new approach.

Nonlinear model structure design and construction using orthogonal least squares and D-optimality design

Relevance: 80.00%

Abstract:

A very efficient learning algorithm for model subset selection is introduced based on a new composite cost function that simultaneously optimizes the model approximation ability and model robustness and adequacy. The derived model parameters are estimated via forward orthogonal least squares, but the model subset selection cost function includes a D-optimality design criterion that maximizes the determinant of the design matrix of the subset to ensure the robustness, adequacy, and parsimony of the final model. The proposed approach is based on the forward orthogonal least squares (OLS) algorithm: the new D-optimality-based cost function is constructed from the orthogonalization process, which retains the computational efficiency of the conventional forward OLS approach. Illustrative examples are included to demonstrate the effectiveness of the new approach.
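The selection rule described in this abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes a composite cost of the form J = SSE - beta * log det(W^T W), where W holds the orthogonalised regressors, so that adding a term with orthogonal vector w and coefficient g lowers J by g^2 (w^T w) + beta * log(w^T w). The function name forward_ols_dopt, the weight beta, and the stopping rule (stop when no candidate lowers J) are assumptions made for this example.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def forward_ols_dopt(columns, y, beta):
    """Greedy forward selection with Gram-Schmidt orthogonalisation.

    columns: candidate regressors, each a list of floats
    y:       target vector
    beta:    weight of the D-optimality term (hypothetical tuning parameter)
    Returns the indices of the selected regressors in selection order.
    """
    work = [list(c) for c in columns]  # candidates, orthogonalised as we go
    remaining = set(range(len(columns)))
    selected = []
    while remaining:
        best, best_gain = None, 0.0
        for j in sorted(remaining):
            w = work[j]
            ww = dot(w, w)
            if ww < 1e-12:  # numerically dependent on terms already chosen
                continue
            g = dot(w, y) / ww  # least-squares coefficient in orthogonal space
            # Assumed reduction in the composite cost from adding this term.
            gain = g * g * ww + beta * math.log(ww)
            if gain > best_gain:
                best, best_gain = j, gain
        if best is None:  # no candidate lowers the cost: terminate
            break
        selected.append(best)
        remaining.discard(best)
        w = work[best]
        ww = dot(w, w)
        # Remove the chosen direction from every remaining candidate.
        for j in remaining:
            coef = dot(w, work[j]) / ww
            work[j] = [cj - coef * wj for cj, wj in zip(work[j], w)]
    return selected
```

With beta set to zero the sketch reduces to plain forward OLS and keeps adding terms while any error reduction remains; a positive beta penalises candidates whose orthogonalised energy w^T w is small, which is what lets the procedure stop at a sparse model.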

Neurofuzzy design and model construction of nonlinear dynamical processes from data

Relevance: 80.00%

Abstract:

A common problem in many data-based modelling algorithms, such as associative memory networks, is the curse of dimensionality. In this paper, a new two-stage neurofuzzy system design and construction algorithm (NeuDeC) for nonlinear dynamical processes is introduced to effectively tackle this problem. A new simple preprocessing method is initially derived and applied to reduce the rule base, followed by a fine model detection process based on the reduced rule set by using forward orthogonal least squares model structure detection. In both stages, new A-optimality experimental design-based criteria are used. In the preprocessing stage, a lower bound of the A-optimality design criterion is derived and applied as a subset selection metric, but in the later stage, the A-optimality design criterion is incorporated into a new composite cost function that minimises model prediction error as well as penalises the model parameter variance. The utilisation of NeuDeC leads to unbiased model parameters with low parameter variance and the additional benefit of a parsimonious model structure. Numerical examples are included to demonstrate the effectiveness of this new modelling approach for high dimensional inputs.

Nonlinear model structure detection using optimum experimental design and orthogonal least squares

Relevance: 80.00%

Abstract:

A very efficient learning algorithm for model subset selection is introduced based on a new composite cost function that simultaneously optimizes the model approximation ability and model adequacy. The derived model parameters are estimated via forward orthogonal least squares, but the subset selection cost function includes an A-optimality design criterion to minimize the variance of the parameter estimates that ensures the adequacy and parsimony of the final model. An illustrative example is included to demonstrate the effectiveness of the new approach.

A comparative review of dimension reduction methods in approximate Bayesian computation

Relevance: 80.00%

Abstract:

Approximate Bayesian computation (ABC) methods make use of comparisons between simulated and observed summary statistics to overcome the problem of computationally intractable likelihood functions. As the practical implementation of ABC requires computations based on vectors of summary statistics, rather than full data sets, a central question is how to derive low-dimensional summary statistics from the observed data with minimal loss of information. In this article we provide a comprehensive review and comparison of the performance of the principal methods of dimension reduction proposed in the ABC literature. The methods are split into three nonmutually exclusive classes consisting of best subset selection methods, projection techniques and regularization. In addition, we introduce two new methods of dimension reduction. The first is a best subset selection method based on Akaike and Bayesian information criteria, and the second uses ridge regression as a regularization procedure. We illustrate the performance of these dimension reduction techniques through the analysis of three challenging models and data sets.

Key performance indicators (KPIs) and priority setting in using the multi-attribute approach for assessing sustainable intelligent buildings

Relevance: 30.00%

Abstract:

The main objectives of this paper are: firstly, to identify key issues related to sustainable intelligent buildings (environmental, social, economic and technological factors) and develop a conceptual model for the selection of the appropriate KPIs; secondly, to test critically stakeholders' perceptions and values of selected KPIs for intelligent buildings; and thirdly, to develop a new model for measuring the level of sustainability for sustainable intelligent buildings. This paper uses a consensus-based model (Sustainable Built Environment Tool, SuBETool), which is analysed using the analytical hierarchical process (AHP) for multi-criteria decision-making. The use of the multi-attribute model for priority setting in the sustainability assessment of intelligent buildings is introduced. The paper commences by reviewing the literature on sustainable intelligent buildings research and presents a pilot study investigating the problems of complexity and subjectivity. This study is based upon a survey of perceptions held by selected stakeholders and the value they attribute to selected KPIs. It is argued that the benefit of the proposed model (SuBETool) is that it is a ‘tool’ for ‘comparative’ rather than absolute measurement. It has the potential to provide useful lessons from current sustainability assessment methods for the strategic future of sustainable intelligent buildings in order to improve a building's performance and to deliver objective outcomes. Findings of this survey enrich the field of intelligent buildings in two ways. Firstly, they give a detailed insight into the selection of sustainable building indicators, as well as their degree of importance. Secondly, they test critically stakeholders' perceptions and values of selected KPIs for intelligent buildings. It is concluded that the priority levels for selected criteria are largely dependent on the integrated design team, which includes the client, architects, engineers and facilities managers.

Ratio selection for classification models

Relevance: 30.00%

Abstract:

This paper is concerned with the selection of inputs for classification models based on ratios of measured quantities. For this purpose, all possible ratios are built from the quantities involved and variable selection techniques are used to choose a convenient subset of ratios. In this context, two selection techniques are proposed: one based on a pre-selection procedure and another based on a genetic algorithm. In an example involving the financial distress prediction of companies, the models obtained from ratios selected by the proposed techniques compare favorably to a model using ratios usually found in the financial distress literature.
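The first step described in this abstract, building all possible ratios from the measured quantities, can be sketched as follows. This is an illustration only: the function name all_ratios and the quantity names in the usage note are hypothetical, and the subsequent selection step (pre-selection or genetic algorithm) is not shown.

```python
from itertools import permutations


def all_ratios(sample):
    """Return every ordered ratio a/b of the quantities in one observation.

    sample: dict mapping a quantity name to its measured value.
    Ratios with a zero denominator are skipped.
    """
    return {
        f"{a}/{b}": sample[a] / sample[b]
        for a, b in permutations(sample, 2)
        if sample[b] != 0
    }
```

For n quantities this produces up to n(n-1) candidate ratios, for example all_ratios({"debt": 50.0, "assets": 200.0, "sales": 100.0}) yields six ratios including "debt/assets". The rapid growth of this candidate set is why a variable selection technique is needed afterwards.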

Automatic near real-time selection of flood water levels from high resolution Synthetic Aperture Radar images for assimilation into hydraulic models: a case study

Relevance: 30.00%

Abstract:

Flood extents caused by fluvial floods in urban and rural areas may be predicted by hydraulic models. Assimilation may be used to correct the model state and improve the estimates of the model parameters or external forcing. One common observation assimilated is the water level at various points along the modelled reach. Distributed water levels may be estimated indirectly along the flood extents in Synthetic Aperture Radar (SAR) images by intersecting the extents with the floodplain topography. It is necessary to select a subset of levels for assimilation because adjacent levels along the flood extent will be strongly correlated. A method for selecting such a subset automatically and in near real-time is described, which would allow the SAR water levels to be used in a forecasting model. The method first selects candidate waterline points in flooded rural areas having low slope. The waterline levels and positions are corrected for the effects of double reflections between the water surface and emergent vegetation at the flood edge. Waterline points are also selected in flooded urban areas away from radar shadow and layover caused by buildings, with levels similar to those in adjacent rural areas. The resulting points are thinned to reduce spatial autocorrelation using a top-down clustering approach. The method was developed using a TerraSAR-X image from a particular case study involving urban and rural flooding. The waterline points extracted proved to be spatially uncorrelated, with levels reasonably similar to those determined manually from aerial photographs, and in good agreement with those of nearby gauges.

Advanced feature selection methods in multinominal dementia classification from structural MRI data

Relevance: 30.00%

Abstract:

Recent studies have shown that features extracted from brain MRIs can discriminate well between Alzheimer’s disease and Mild Cognitive Impairment. This study provides an algorithm that sequentially applies advanced feature selection methods to find the best subset of features in terms of binary classification accuracy. The classifiers that provided the highest accuracies were then used to solve a multi-class problem with the one-versus-one strategy. Although several approaches based on Regions of Interest (ROIs) extraction exist, the predictive power of features has not yet been investigated by comparing filter and wrapper techniques. The findings of this work suggest that (i) IntraCranial Volume (ICV) normalization can lead to overfitting and worsen the prediction accuracy on the test set, and (ii) the combined use of a Random Forest-based filter with a Support Vector Machines-based wrapper improves the accuracy of binary classification.

Estimation after subpopulation selection in adaptive seamless trials

Relevance: 30.00%

Abstract:

During the development of new therapies, it is not uncommon to test whether a new treatment works better than the existing treatment for all patients who suffer from a condition (full population) or for a subset of the full population (subpopulation). One approach that may be used for this objective is to have two separate trials, where in the first trial, data are collected to determine if the new treatment benefits the full population or the subpopulation. The second trial is a confirmatory trial to test the new treatment in the population selected in the first trial. In this paper, we consider the more efficient two-stage adaptive seamless designs (ASDs), where in stage 1, data are collected to select the population to test in stage 2. In stage 2, additional data are collected to perform confirmatory analysis for the selected population. Unlike the approach that uses two separate trials, for ASDs, stage 1 data are also used in the confirmatory analysis. Although ASDs are efficient, using stage 1 data both for selection and confirmatory analysis introduces selection bias and consequently statistical challenges in making inference. We focus on point estimation for such trials. We describe the extent of bias for estimators that ignore multiple hypotheses and the selection of the population that is most likely to give positive trial results based on observed stage 1 data. We then derive conditionally unbiased estimators and examine their mean squared errors for different scenarios.

Selection-mutation balance models with epistatic selection

Relevance: 20.00%

Abstract:

We present an application of birth-and-death processes on configuration spaces to a generalized mutation-selection balance model. The model describes the aging of population as a process of accumulation of mutations in a genotype. A rigorous treatment demands that mutations correspond to points in abstract spaces. Our model describes an infinite-population, infinite-sites model in continuum. The dynamical equation which describes the system is of Kimura-Maruyama type. The problem can be posed in terms of evolution of states (differential equation) or, equivalently, represented in terms of Feynman-Kac formula. The questions of interest are the existence of a solution, its asymptotic behavior, and properties of the limiting state. In the non-epistatic case the problem was posed and solved in [Steinsaltz D., Evans S.N., Wachter K.W., Adv. Appl. Math., 2005, 35(1)]. In our model we consider a topological space X as the space of positions of mutations and the influence of epistatic potentials

Bias from farmer self-selection in genetically modified crop productivity estimates: Evidence from Indian data

Relevance: 20.00%

Abstract:

In the continuing debate over the impact of genetically modified (GM) crops on farmers of developing countries, it is important to accurately measure magnitudes such as farm-level yield gains from GM crop adoption. Yet most farm-level studies in the literature do not control for farmer self-selection, a potentially important source of bias in such estimates. We use farm-level panel data from Indian cotton farmers to investigate the yield effect of GM insect-resistant cotton. We explicitly take into account the fact that the choice of crop variety is an endogenous variable which might lead to bias from self-selection. A production function is estimated using a fixed-effects model to control for selection bias. Our results show that efficient farmers adopt Bacillus thuringiensis (Bt) cotton at a higher rate than their less efficient peers. This suggests that cross-sectional estimates of the yield effect of Bt cotton, which do not control for self-selection effects, are likely to be biased upwards. However, after controlling for selection bias, we still find that there is a significant positive yield effect from adoption of Bt cotton that more than offsets the additional cost of Bt seed.
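The reasoning in this abstract, that a fixed-effects model removes bias from farmer self-selection, can be illustrated with a minimal sketch of the within (demeaning) estimator for a single regressor. This is not the authors' production function; the function name within_slope and the data layout are assumptions for illustration.

```python
def within_slope(groups):
    """Fixed-effects (within) slope for one regressor.

    groups: dict mapping a unit id (e.g. a farm) to a list of (x, y) pairs.
    Each unit's own means are subtracted before pooling, so any
    time-invariant unit effect drops out of the estimate.
    """
    sxy = 0.0
    sxx = 0.0
    for obs in groups.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    return sxy / sxx
```

As a usage example with invented toy numbers (not data from the study), take an efficient farm with observations (0, 10), (1, 12), (1, 12) and a less efficient farm with (0, 1), (0, 1), (1, 3), where x marks adoption. The within slope is 2, whereas treating all six observations as one group gives 5, because the efficient farm adopts more often and its higher baseline yield is wrongly credited to adoption.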

Modelling habitat selection and distribution of the critically endangered Jerdon's courser Rhinoptilus bitorquatus in scrub jungle: an application of a new tracking method

Relevance: 20.00%

Abstract:

1. Jerdon's courser Rhinoptilus bitorquatus is a nocturnally active cursorial bird that is only known to occur in a small area of scrub jungle in Andhra Pradesh, India, and is listed as critically endangered by the IUCN. Information on its habitat requirements is needed urgently to underpin conservation measures. We quantified the habitat features that correlated with the use of different areas of scrub jungle by Jerdon's coursers, and developed a model to map potentially suitable habitat over large areas from satellite imagery and facilitate the design of surveys of Jerdon's courser distribution. 2. We used 11 arrays of 5-m long tracking strips consisting of smoothed fine soil to detect the footprints of Jerdon's coursers, and measured tracking rates (tracking events per strip night). We counted the number of bushes and trees, and described other attributes of vegetation and substrate in a 10-m square plot centred on each strip. We obtained reflectance data from Landsat 7 satellite imagery for the pixel within which each strip lay. 3. We used logistic regression models to describe the relationship between tracking rate by Jerdon's coursers and characteristics of the habitat around the strips, using ground-based survey data and satellite imagery. 4. Jerdon's coursers were most likely to occur where the density of large (>2 m tall) bushes was in the range 300-700 ha⁻¹ and where the density of smaller bushes was less than 1000 ha⁻¹. This habitat was detectable using satellite imagery. 5. Synthesis and applications. The occurrence of Jerdon's courser is strongly correlated with the density of bushes and trees, and is in turn affected by grazing with domestic livestock, woodcutting and mechanical clearance of bushes to create pasture, orchards and farmland. It is likely that there is an optimal level of grazing and woodcutting that would maintain or create suitable conditions for the species.
Knowledge of the species' distribution is incomplete and there is considerable pressure from human use of apparently suitable habitats. Hence, distribution mapping is a high conservation priority. A two-step procedure is proposed, involving the use of ground surveys of bush density to calibrate satellite image-based mapping of potential habitat. These maps could then be used to select priority areas for Jerdon's courser surveys. The use of tracking strips to study habitat selection and distribution has potential in studies of other scarce and secretive species.

The impact of contractor selection method on transaction costs: a review

Relevance: 20.00%

Abstract:

The basic premise of transaction-cost theory is that the decision to outsource, rather than to undertake work in-house, is determined by the relative costs incurred in each of these forms of economic organization. In construction the "make or buy" decision invariably leads to a contract. Reducing the costs of entering into a contractual relationship (transaction costs) raises the value of production and is therefore desirable. Commonly applied methods of contractor selection may not minimise the costs of contracting. Research evidence suggests that although competitive tendering typically results in the lowest bidder winning the contract this may not represent the lowest project cost after completion. Multi-parameter and quantitative models for contractor selection have been developed to identify the best (or least risky) among bidders. A major area in which research is still needed is in investigating the impact of different methods of contractor selection on the costs of entering into a contract and the decision to outsource.

Variable selection and interpretation in correlation principal components

Relevance: 20.00%
