961 results for Selection Problems


Relevance: 40.00%

Abstract:

Fitting statistical models is computationally challenging when the sample size or the dimension of the dataset is huge. An attractive approach for down-scaling the problem size is to first partition the dataset into subsets and then fit using distributed algorithms. The dataset can be partitioned either horizontally (in the sample space) or vertically (in the feature space), and the challenge arises in defining an algorithm with low communication, theoretical guarantees and excellent practical performance in general settings. For sample space partitioning, I propose a MEdian Selection Subset AGgregation Estimator (message) algorithm for solving these issues. The algorithm applies feature selection in parallel for each subset using regularized regression or a Bayesian variable selection method, calculates the 'median' feature inclusion index, estimates coefficients for the selected features in parallel for each subset, and then averages these estimates. The algorithm is simple, involves minimal communication, scales efficiently in sample size, and has theoretical guarantees. I provide extensive experiments to show excellent performance in feature selection, estimation, prediction, and computation time relative to usual competitors.
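The four steps named in this abstract (select per subset, take the median inclusion index, re-estimate per subset, average) can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the callbacks `select` and `fit` are hypothetical placeholders for whatever selection and estimation methods are used, and the tie rule for an even number of subsets is an assumption.

```python
from statistics import mean, median


def message(subsets, select, fit):
    """Sketch of the select / median / re-estimate / average pipeline.

    subsets: list of row-wise data subsets (any representation).
    select(subset) -> list of 0/1 flags, one per feature (hypothetical callback).
    fit(subset, features) -> list of coefficients, one per selected feature
                             (hypothetical callback).
    """
    # Step 1: run feature selection on each subset (parallelisable).
    inclusion = [select(s) for s in subsets]
    n_features = len(inclusion[0])

    # Step 2: keep a feature when its median inclusion flag is 1.
    # Assumption: a median of exactly 0.5 (a tie) drops the feature.
    selected = [
        j for j in range(n_features)
        if median([flags[j] for flags in inclusion]) > 0.5
    ]

    # Step 3: re-estimate coefficients on each subset, selected features only.
    estimates = [fit(s, selected) for s in subsets]

    # Step 4: average the per-subset estimates coefficient by coefficient.
    coefficients = [
        mean([est[k] for est in estimates]) for k in range(len(selected))
    ]
    return selected, coefficients
```

Only the 0/1 inclusion vectors and the short coefficient vectors move between workers in this sketch, which is one way to read the abstract's claim of minimal communication.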

While sample space partitioning is useful in handling datasets with large sample size, feature space partitioning is more effective when the data dimension is high. Existing methods for partitioning features, however, are either vulnerable to high correlations or inefficient in reducing the model dimension. In the thesis, I propose a new embarrassingly parallel framework named DECO for distributed variable selection and parameter estimation. In DECO, variables are first partitioned and allocated to m distributed workers. The decorrelated subset data within each worker are then fitted via any algorithm designed for high-dimensional problems. We show that by incorporating the decorrelation step, DECO can achieve consistent variable selection and parameter estimation on each subset with (almost) no assumptions. In addition, the convergence rate is nearly minimax optimal for both sparse and weakly sparse models and does not depend on the partition number m. Extensive numerical experiments are provided to illustrate the performance of the new framework.

For datasets with both large sample sizes and high dimensionality, I propose a new "divide-and-conquer" framework DEME (DECO-message) by leveraging both the DECO and the message algorithms. The new framework first partitions the dataset in the sample space into row cubes using message and then partitions the feature space of the cubes using DECO. This procedure is equivalent to partitioning the original data matrix into multiple small blocks, each with a feasible size that can be stored and fitted in a computer in parallel. The results are then synthesized via the DECO and message algorithms in reverse order to produce the final output. The whole framework is extremely scalable.
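The block view of the data matrix can be sketched as below. This shows only the row-then-column partitioning, not the DECO or message fitting steps, and the contiguous, nearly equal chunking is an assumption; the thesis may assign rows and columns differently.

```python
def split_indices(n, parts):
    """Split range(n) into `parts` contiguous, nearly equal chunks."""
    bounds = [i * n // parts for i in range(parts + 1)]
    return [list(range(bounds[i], bounds[i + 1])) for i in range(parts)]


def partition_blocks(matrix, row_parts, col_parts):
    """Return blocks[a][b]: the sub-matrix for row chunk a and column chunk b."""
    rows = split_indices(len(matrix), row_parts)
    cols = split_indices(len(matrix[0]), col_parts)
    return [
        [[[matrix[i][j] for j in c] for i in r] for c in cols]
        for r in rows
    ]
```

Each `blocks[a][b]` is small enough to hand to a single worker; results would then be combined across columns first and across rows second, mirroring the reverse order described above.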

Relevance: 40.00%

Abstract:

This paper presents a case-based heuristic selection approach for automated university course and exam timetabling. The method described in this paper is motivated by the goal of developing timetabling systems that are fundamentally more general than the current state of the art. Heuristics that worked well in previous similar situations are memorized in a case base and are retrieved for solving the problem at hand. Knowledge discovery techniques are employed in two distinct scenarios. Firstly, we model the problem and the problem solving situations along with specific heuristics for those problems. Secondly, we refine the case base and discard cases which prove not to be useful in solving new problems. Experimental results are presented and analyzed. It is shown that case-based reasoning can act effectively as an intelligent approach to learn which heuristics work well for particular timetabling situations. We conclude by outlining and discussing potential research issues in this critical area of knowledge discovery for different difficult timetabling problems.
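The retrieval step of such a case base can be sketched as a nearest-neighbour lookup. This is an illustrative assumption, not the paper's similarity measure: the feature vectors, the Euclidean distance, and the heuristic names in the usage example are all hypothetical.

```python
from math import dist


def retrieve_heuristic(case_base, features):
    """Return the heuristic stored with the case most similar to `features`.

    case_base: list of dicts with a numeric "features" tuple and a
               "heuristic" label (hypothetical representation).
    """
    best = min(case_base, key=lambda case: dist(case["features"], features))
    return best["heuristic"]


# Usage with made-up problem descriptors and heuristic labels.
case_base = [
    {"features": (0.2, 0.9), "heuristic": "largest degree first"},
    {"features": (0.8, 0.1), "heuristic": "saturation degree"},
]
print(retrieve_heuristic(case_base, (0.7, 0.2)))
```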

Relevance: 30.00%

Abstract:

For a sustainable building industry, not only should the environmental and economic indicators be evaluated but also the societal indicators for building. Current indicators can be in conflict with each other, which makes it difficult for decision makers to quantify and assess sustainability clearly. For a sustainable building, the objectives of decreasing both adverse environmental impact and cost are in conflict. In addition, even though both objectives may be satisfied, building management systems may present other problems such as convenience of occupants, flexibility of building, or technical maintenance, which are difficult to quantify as exact assessment data. These conflicting problems confronting building managers or planners render building management more difficult. This paper presents a methodology to evaluate a sustainable building considering socio-economic and environmental characteristics of buildings, and is intended to assist the decision making of building planners or practitioners. The suggested methodology employs three main concepts: linguistic variables, fuzzy numbers, and an analytic hierarchy process. The linguistic variables are used to represent the degree of appropriateness of qualitative indicators, which are vague or uncertain. These linguistic variables are then translated into fuzzy numbers to reflect their uncertainties and aggregated into the final fuzzy decision value using a hierarchical structure. Through a case study, the suggested methodology is applied to the evaluation of a building. The result demonstrates that the suggested approach can be a useful tool for evaluating a building for sustainability.
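The path from linguistic ratings to a single decision value can be sketched as below. Everything specific here is an assumption made for illustration: the three linguistic terms, their triangular fuzzy numbers, the single-level hierarchy, and the crisp weights (which in the paper's method would come from the analytic hierarchy process).

```python
# Hypothetical mapping from linguistic terms to triangular fuzzy numbers
# written as (low, mode, high).
TERMS = {
    "poor": (0.0, 0.0, 0.25),
    "fair": (0.25, 0.5, 0.75),
    "good": (0.75, 1.0, 1.0),
}


def aggregate(ratings, weights):
    """Weighted sum of triangular fuzzy numbers with crisp weights summing to 1."""
    total = [0.0, 0.0, 0.0]
    for name, term in ratings.items():
        for k, value in enumerate(TERMS[term]):
            total[k] += weights[name] * value
    return tuple(total)


def defuzzify(fuzzy):
    """Centroid of a triangular fuzzy number (low, mode, high)."""
    return sum(fuzzy) / 3
```

For example, rating a building "good" on environment and "fair" on cost with equal weights gives the fuzzy value `(0.5, 0.75, 0.875)`, which `defuzzify` reduces to a single score for ranking.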

Relevance: 30.00%

Abstract:

We study model selection strategies based on penalized empirical loss minimization. We point out a tight relationship between error estimation and data-based complexity penalization: any good error estimate may be converted into a data-based penalty function and the performance of the estimate is governed by the quality of the error estimate. We consider several penalty functions, involving error estimates on independent test data, empirical VC dimension, empirical VC entropy, and margin-based quantities. We also consider the maximal difference between the error on the first half of the training data and the second half, and the expected maximal discrepancy, a closely related capacity estimate that can be calculated by Monte Carlo integration. Maximal discrepancy penalty functions are appealing for pattern classification problems, since their computation is equivalent to empirical risk minimization over the training data with some labels flipped.
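The last claim, that the maximal discrepancy reduces to empirical risk minimization with flipped labels, can be checked numerically on a toy problem. The sketch assumes labels in {+1, -1}, an even sample size so the two halves are equal, and a small finite class of threshold classifiers chosen only for illustration. With the first-half labels flipped, the maximal discrepancy equals 1 minus twice the minimum empirical error.

```python
def error(classifier, sample):
    """Fraction of (x, y) pairs the classifier gets wrong; labels are +1 or -1."""
    return sum(classifier(x) != y for x, y in sample) / len(sample)


def max_discrepancy_direct(classifiers, sample):
    """Max over f of (error on the first half minus error on the second half)."""
    half = len(sample) // 2
    first, second = sample[:half], sample[half:]
    return max(error(f, first) - error(f, second) for f in classifiers)


def max_discrepancy_by_flipping(classifiers, sample):
    """Same quantity via error minimization after flipping first-half labels."""
    half = len(sample) // 2
    flipped = [(x, -y) for x, y in sample[:half]] + list(sample[half:])
    return 1 - 2 * min(error(f, flipped) for f in classifiers)


def threshold(t):
    """Hypothetical 1-D classifier: predict +1 when x >= t, else -1."""
    return lambda x: 1 if x >= t else -1


sample = [(0.1, 1), (0.4, 1), (0.6, 1), (0.9, -1)]
classifiers = [threshold(t) for t in (0.0, 0.5, 1.0)]
print(max_discrepancy_direct(classifiers, sample))
print(max_discrepancy_by_flipping(classifiers, sample))
```

Both calls give the same value, so any routine that minimizes training error over the class can be reused to compute the penalty.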

Relevance: 30.00%

Abstract:

Packaged software is pre-built with the intention of licensing it to users in domestic settings and work organisations. This thesis focuses upon the work organisation where packaged software has been characterised as one of the latest ‘solutions’ to the problems of information systems. The study investigates the packaged software selection process that has, to date, been largely viewed as objective and rational. In contrast, this interpretive study is based on a 2½-year-long field study of organisational experiences with packaged software selection at T.Co, a consultancy organisation based in the United Kingdom. Emerging from the iterative process of case study and action research is an alternative theory of packaged software selection. The research argues that packaged software selection is far from the rationalistic and linear process that previous studies suggest. Instead, the study finds that aspects of the traditional process of selection incorporating the activities of gathering requirements, evaluation and selection based on ‘best fit’ may or may not take place. Furthermore, even where these aspects occur they may not have equal weight or impact upon implementation and usage as may be expected. This is due to the influence of those multiple realities which originate from the organisational and market environments within which packages are created, selected and used, the lack of homogeneity in organisational contexts and the variously interpreted characteristics of the package in question.

Relevance: 30.00%

Abstract:

The key to reducing cost of electric vehicles is integration. All too often systems such as the motor, motor controller, batteries and vehicle chassis/body are considered as separate problems. The truth is that a lot of trade-offs can be made between these systems, causing an overall improvement in many areas including total cost. Motor controller and battery cost have a relatively simple relationship; the less energy lost in the motor controller the less energy that has to be carried in the batteries, hence the lower the battery cost. A motor controller’s cost is primarily influenced by the cost of the switches. This paper will therefore present a method of assessing the optimal switch selection on the premise that the optimal switch is the one that produces the lowest system cost, where system cost is the cost of batteries + switches.
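The selection rule stated here (pick the switch that minimizes battery cost plus switch cost) can be sketched as follows. The cost model is a simplifying assumption, not the paper's: the battery is sized to carry the useful energy plus the energy lost in the switches, and the field names `cost` and `loss_wh` are hypothetical.

```python
def system_cost(switch, useful_energy_wh, battery_cost_per_wh):
    """System cost = battery cost + switch cost under the assumed model."""
    # The battery must carry the useful energy plus the controller losses.
    battery_energy_wh = useful_energy_wh + switch["loss_wh"]
    return battery_energy_wh * battery_cost_per_wh + switch["cost"]


def best_switch(switches, useful_energy_wh, battery_cost_per_wh):
    """Return the candidate switch with the lowest system cost."""
    return min(
        switches,
        key=lambda s: system_cost(s, useful_energy_wh, battery_cost_per_wh),
    )
```

With made-up figures, a cheap switch costing 100 that loses 500 Wh loses out to a switch costing 250 that loses 100 Wh once battery energy costs 0.5 per Wh, which illustrates why the cheapest switch is not necessarily the optimal one.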

Relevance: 30.00%

Abstract:

Nowadays, most of the infrastructure development projects undertaken are complex in nature. In practice, public clients who do not have a good understanding of design and management may suffer severe losses, especially in infrastructure projects. There is a need to attract the right consultant to secure the client's investment in infrastructure developments. Throughout the project life cycle, consultants play a vital role from the inception to the completion stage of a project. A few studies in Malaysia show that infrastructure projects involving irrigation and drainage have experienced problems such as poor workmanship, delay and cost overrun due to the consultant's inability or the client's incompetence in recruiting consultants in time. This highlights the need for aided decision making and an efficient system to select the best consultant by using a Decision Support System (DSS). On the other hand, recent trends reveal that most DSS in construction only concentrate on decision model development. These models are impractical and unused as they are complicated or difficult for laymen such as project managers to utilize. Thus, this research attempts to develop an efficient DSS for consultant selection, namely consultDeSS. Driven by the motivation and research aims, this study deployed the Design Science Research Methodology (DSRM) as the dominant approach, in combination with case studies at the Malaysian Department of Irrigation and Drainage (DID). Two real projects involving irrigation and drainage infrastructure were used to design, implement and evaluate the artefact. The 3-tier consultDeSS was revised after the evaluation and the design was significantly improved based on user feedback. Developing desirable tools that fit the client's needs will enhance productivity and minimize conflict within groups and organisations. The tool is more usable and efficient compared to previous studies in construction. Thus, this research has demonstrated a purposeful artefact with a practical and valid structured development approach that is applicable to a variety of problems in the construction discipline.

Relevance: 30.00%

Abstract:

Purpose – This paper seeks to analyse the process of packaged software selection in a small organization, focussing particularly on the role of IT consultants as intermediaries in the process. Design/methodology/approach – This is based upon a longitudinal, qualitative field study concerning the adoption of a customer relationship management package in an SME management consultancy. Findings – The authors illustrate how the process of “salesmanship”, an activity directed by the vendor/consultant and focussed on the interests of senior management, marginalises user needs and ultimately secures the procurement of the software package. Research limitations/implications – Despite the best intentions the authors lose something of the rich detail of the lived experience of technology in presenting the case study as a linear narrative. Specifically, the authors have been unable to do justice to the complexity of the multifarious ways in which individual perceptions of the project were influenced and shaped by the opinions of others. Practical implications – Practitioners, particularly those from within SMEs, should be made aware of the ways in which external parties may have a vested interest in steering projects in a particular direction, which may not necessarily align with their own interests. Originality/value – This study highlights in detail the role of consultants and vendors in software selection processes, an area which has received minimal attention to date. Prior work in this area emphasises the necessary conditions for, and positive outcomes of, appointing external parties in an SME context, with only limited attention being paid to the potential problems such engagements may bring.

Relevance: 30.00%

Abstract:

By definition, the two faces of a pi bond are equivalent.1 However, they are rendered nonequivalent in most molecules because of the absence of a plane of symmetry encompassing the double bond and the adjacent substituents. As a result, additions to trigonal centers from the two faces need not be equally facile. Exploiting this stereodifferentiation in a controlled manner represents one of the core problems in organic synthesis. Evidently, the factors which determine such diastereoselection need to be delineated in as much detail as possible.

Relevance: 30.00%

Abstract:

In this paper, we develop a game theoretic approach for clustering features in a learning problem. Feature clustering can serve as an important preprocessing step in many problems such as feature selection, dimensionality reduction, etc. In this approach, we view features as rational players of a coalitional game where they form coalitions (or clusters) among themselves in order to maximize their individual payoffs. We show how Nash Stable Partition (NSP), a well known concept in coalitional game theory, provides a natural way of clustering features. Through this approach, one can obtain some desirable properties of the clusters by choosing appropriate payoff functions. For a small number of features, the NSP based clustering can be found by solving an integer linear program (ILP). However, for a large number of features, the ILP based approach does not scale well and hence we propose a hierarchical approach. Interestingly, a key result that we prove on the equivalence between a k-size NSP of a coalitional game and minimum k-cut of an appropriately constructed graph comes in handy for large scale problems. In this paper, we use the feature selection problem (in a classification setting) as a running example to illustrate our approach. We conduct experiments to illustrate the efficacy of our approach.
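What it means for a clustering of features to be a Nash stable partition can be sketched with a direct check. The payoff used here (the sum of pairwise similarities to the other members of the coalition) is an assumption chosen for illustration; the paper leaves the payoff function as a design choice.

```python
def payoff(i, coalition, similarity):
    """Hypothetical payoff: total similarity of feature i to the other members."""
    return sum(similarity[i][j] for j in coalition if j != i)


def is_nash_stable(partition, similarity):
    """True if no feature gains by moving alone to another coalition
    or to a new, empty coalition.

    partition: list of disjoint sets of feature indices.
    similarity: square matrix of pairwise similarities (assumed symmetric).
    """
    for coalition in partition:
        for i in coalition:
            current = payoff(i, coalition, similarity)
            alternatives = [c for c in partition if c is not coalition] + [set()]
            if any(payoff(i, c | {i}, similarity) > current for c in alternatives):
                return False
    return True


# Features 0 and 1 resemble each other, as do 2 and 3; cross pairs do not.
similarity = [
    [0, 1, -1, -1],
    [1, 0, -1, -1],
    [-1, -1, 0, 1],
    [-1, -1, 1, 0],
]
print(is_nash_stable([{0, 1}, {2, 3}], similarity))
print(is_nash_stable([{0, 2}, {1, 3}], similarity))
```

This brute-force check only verifies a given partition; finding one is the part that the abstract addresses with an ILP for small problems and a hierarchical method for large ones.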

Relevance: 30.00%

Abstract:

The freshwater river systems and floodplains of Bangladesh are the breeding grounds for 13 endemic species of carps and barbs and a large number of other fish species, including a number of exotic carps and other species that have been introduced for aquaculture. Since 1967, breeding of endemic and exotic aquaculture species for seed production through hypophysation techniques has become a common practice. The paper describes the present status of broodstock management, identifies problems, and suggests some guidelines to control negative selection and inbreeding in hatchery stocks in Bangladesh.

Relevance: 30.00%

Abstract:

The application of high performance textiles has grown significantly in the last 10 to 15 years. Various research groups throughout the United Kingdom, such as the Department of Trade and Industry, have identified technical textiles as a field for future development. There is little design guidance for joining of flexible materials or general property models that can be applied to these materials. This lack is due to the large diversity of properties, structures and resulting behaviours of the materials that are classified as "Flexible Materials". This dissertation explores the issues that are involved in characterising the materials at the fibre, bulk and textile levels. Different units of measurement are used for each stage of the manufacturing process of flexible materials and this disparity creates problems when trying to make general comparisons (e.g. comparing textiles to polymer films). Thus, a possible solution to this is to create selection charts that allow designers to compare the strength of materials for a given mass per unit area. A design tool was created using the Cambridge Engineering Selector (CES) software to enable the selection of joining processes for materials. The tool is effective in selecting a reduced number of viable joining processes. Through case studies it was shown that designers are required to examine the selected processes (identified by the software) in greater detail - in particular the economics and geometry of the joint - in order to identify the optimum joining process.

Relevance: 30.00%

Abstract:

In most recent substructuring methods, a fundamental role is played by the coarse space. For some of these methods (e.g. BDDC and FETI-DP), its definition relies on a 'minimal' set of coarse nodes (sometimes called corners) which assures invertibility of local subdomain problems and also of the global coarse problem. This basic set is typically enhanced by enforcing continuity of functions at some generalized degrees of freedom, such as average values on edges or faces of subdomains. We revisit existing algorithms for selection of corners. The main contribution of this paper consists of proposing a new heuristic algorithm for this purpose. Considering faces as the basic building blocks of the interface, inherent parallelism, and better robustness with respect to disconnected subdomains are among features of the new technique. The advantages of the presented algorithm in comparison to some earlier approaches are demonstrated on three engineering problems of structural analysis solved by the BDDC method.

Relevance: 30.00%

Abstract:

This thesis presents a learning-based approach for detecting classes of objects and patterns with variable image appearance but highly predictable image boundaries. It consists of two parts. In part one, we introduce our object and pattern detection approach using a concrete human face detection example. The approach first builds a distribution-based model of the target pattern class in an appropriate feature space to describe the target's variable image appearance. It then learns from examples a similarity measure for matching new patterns against the distribution-based target model. The approach makes few assumptions about the target pattern class and should therefore be fairly general, as long as the target class has predictable image boundaries. Because our object and pattern detection approach is very much learning-based, how well a system eventually performs depends heavily on the quality of training examples it receives. The second part of this thesis looks at how one can select high quality examples for function approximation learning tasks. We propose an active learning formulation for function approximation, and show for three specific approximation function classes, that the active example selection strategy learns its target with fewer data samples than random sampling. We then simplify the original active learning formulation, and show how it leads to a tractable example selection paradigm, suitable for use in many object and pattern detection problems.