864 results for Feature selection algorithm


Relevance:

30.00% 30.00%

Publisher:

Abstract:

Most machine-learning algorithms are designed for datasets with features of a single type whereas very little attention has been given to datasets with mixed-type features. We recently proposed a model to handle mixed types with a probabilistic latent variable formalism. This proposed model describes the data by type-specific distributions that are conditionally independent given the latent space and is called generalised generative topographic mapping (GGTM). It has often been observed that visualisations of high-dimensional datasets can be poor in the presence of noisy features. In this paper we therefore propose to extend the GGTM to estimate feature saliency values (GGTMFS) as an integrated part of the parameter learning process with an expectation-maximisation (EM) algorithm. The efficacy of the proposed GGTMFS model is demonstrated both for synthetic and real datasets.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Due to dynamic variability, identifying the specific conditions under which non-functional requirements (NFRs) are satisfied may only be possible at runtime. Therefore, it is necessary to consider the dynamic treatment of relevant information during requirements specification. The associated data can be gathered by monitoring the execution of the application and its underlying environment to support reasoning about how well the current application configuration fulfils the established requirements. This paper presents a dynamic decision-making infrastructure to support both NFR representation and monitoring, and to reason about the degree of satisfaction of NFRs at runtime. The infrastructure is composed of: (i) an extended feature model aligned with a domain-specific language for representing NFRs to be monitored at runtime; (ii) a monitoring infrastructure to continuously assess NFRs at runtime; and (iii) a flexible decision-making process to select the best available configuration based on the satisfaction degree of the NFRs. The evaluation of the approach has shown that it is able to choose application configurations that fit user NFRs well based on runtime information. The evaluation also revealed that the proposed infrastructure provided consistent indicators regarding the best application configurations that fit user NFRs. Finally, a benefit of our approach is that it allows us to quantify the level of satisfaction with respect to the NFR specification.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The value of online Question Answering (Q&A) communities is driven by the question-answering behaviour of their members. Finding the questions that members are willing to answer is therefore vital to the efficient operation of such communities. In this paper, we aim to identify the parameters that correlate with such behaviours. We train different models and construct effective predictions using various user, question and thread feature sets. We show that answering behaviour can be predicted with a high level of success.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Lifelong surveillance is not cost-effective after endovascular aneurysm repair (EVAR), but is required to detect aortic complications which are fatal if untreated (type 1/3 endoleak, sac expansion, device migration). Aneurysm morphology determines the probability of aortic complications and therefore the need for surveillance, but existing analyses have proven incapable of identifying patients at sufficiently low risk to justify abandoning surveillance. This study aimed to improve the prediction of aortic complications through the application of machine-learning techniques. Patients undergoing EVAR at 2 centres were studied from 2004 to 2010. Aneurysm morphology had previously been studied to derive the SGVI Score for predicting aortic complications. Bayesian Neural Networks were designed using the same data, to dichotomise patients into groups at low or high risk of aortic complications. Network training was performed only on patients treated at centre 1. External validation was performed by assessing network performance independently of network training, on patients treated at centre 2. Discrimination was assessed by Kaplan-Meier analysis to compare aortic complications in predicted low-risk versus predicted high-risk patients. 761 patients aged 75 ± 7 years underwent EVAR in 2 centres. Mean follow-up was 36 ± 20 months. Neural networks were created incorporating neck angulation/length/diameter/volume; AAA diameter/area/volume/length/tortuosity; and common iliac tortuosity/diameter. A 19-feature network predicted aortic complications with excellent discrimination and external validation (5-year freedom from aortic complications in predicted low-risk vs predicted high-risk patients: 97.9% vs. 63%; p < 0.0001). A Bayesian Neural-Network algorithm can identify patients in whom it may be safe to abandon surveillance after EVAR. This proposal requires prospective study.
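The Kaplan-Meier comparison mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `kaplan_meier` and the follow-up values are hypothetical, chosen only to show how "freedom from complications" is estimated when some patients are censored.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times  -- follow-up time per patient (any consistent unit, e.g. months)
    events -- 1 if the complication occurred at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        occurred = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            occurred += data[i][1]
            removed += 1
            i += 1
        if occurred > 0:
            # Product-limit step: multiply by the fraction that stays event-free.
            survival *= 1.0 - occurred / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical follow-up data (months) for one predicted-risk group.
curve = kaplan_meier([6, 12, 12, 24, 36, 48], [1, 0, 1, 0, 1, 0])
```

Running the same function separately on the predicted low-risk and high-risk groups gives the two curves whose 5-year values such a study would compare.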

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Principal component analysis (PCA) is well established in dimensionality reduction, and kernel PCA (KPCA) has also been proposed for statistical data analysis. However, KPCA fails to detect the nonlinear structure of data well when outliers exist. To mitigate this problem, this paper presents a novel algorithm, named iterative robust KPCA (IRKPCA). IRKPCA deals well with outliers and can be carried out in an iterative manner, which makes it suitable for processing incremental input data. As in traditional robust PCA (RPCA), a binary field is employed to characterize the outlier process, and the optimization problem is formulated as maximizing the marginal distribution of a Gibbs distribution. In this paper, this optimization problem is solved by stochastic gradient descent techniques. In IRKPCA, the outlier process is in a high-dimensional feature space, and therefore the kernel trick is used. IRKPCA can be regarded as a kernelized version of RPCA and a robust form of the kernel Hebbian algorithm. Experimental results on synthetic data demonstrate the effectiveness of IRKPCA. © 2010 Taylor & Francis.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A model of multiple criteria decision making is presented for selecting the “best” of a finite number of alternatives. Techniques of scoring the alternatives and weighting the criteria are combined with different evaluating procedures and amalgamated in an interactive algorithm. Application of this method for choosing the best tender in a competitive bidding is discussed and a case is presented in some detail.
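As a rough illustration of combining alternative scores with criterion weights, the sketch below uses a plain weighted sum. That is only one possible evaluating procedure and is an assumption here, not the interactive algorithm of the paper. The tender names, criteria, scores and weights are all hypothetical.

```python
def rank_alternatives(scores, weights):
    """Rank alternatives by the weighted sum of their criterion scores.

    scores  -- {alternative: {criterion: score}}, higher score is better
    weights -- {criterion: weight}; normalised here so they sum to 1
    """
    total = sum(weights.values())
    norm = {c: w / total for c, w in weights.items()}
    totals = {
        name: sum(norm[c] * s[c] for c in norm)
        for name, s in scores.items()
    }
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical tenders scored 0-10 on three hypothetical criteria.
tenders = {
    "Tender A": {"price": 7, "quality": 9, "delivery": 6},
    "Tender B": {"price": 9, "quality": 6, "delivery": 7},
    "Tender C": {"price": 5, "quality": 8, "delivery": 9},
}
weights = {"price": 5, "quality": 3, "delivery": 2}
ranking = rank_alternatives(tenders, weights)
```

Changing the weights and re-running is the simplest form of the interaction such a method supports: the decision maker sees how the ranking reacts to a different emphasis on the criteria.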

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Since the seminal works of Markowitz (1952), Sharpe (1964), and Lintner (1965), numerous studies on portfolio selection and performance measurement have been based upon the mean-variance framework. However, several researchers (e.g., Arditti (1967, 1971), Samuelson (1970), and Rubinstein (1973)) argue that the higher moments cannot be neglected unless there is reason to believe that: (i) the asset returns are normally distributed and the investor's utility function is quadratic, or (ii) the empirical evidence demonstrates that higher moments are irrelevant to the investor's decision. Based on the same argument, this dissertation investigates the impact of higher moments of return distributions on three issues concerning 14 international stock markets.

First, portfolio selection with skewness is determined using Polynomial Goal Programming, in which investor preferences for skewness can be incorporated. The empirical findings suggest that the return distributions of international stock markets are not normally distributed, and that the incorporation of skewness into an investor's portfolio decision causes a major change in the construction of the optimal portfolio. The evidence also indicates that an investor will trade expected portfolio return for skewness. Moreover, when short sales are allowed, investors are better off as they attain higher expected return and skewness simultaneously.

Second, the performance of international stock markets is evaluated using two types of performance measures: (i) the two-moment performance measures of Sharpe (1966) and Treynor (1965), and (ii) the higher-moment performance measures of Prakash and Bear (1986) and Stephens and Proffitt (1991). The empirical evidence indicates that higher moments of return distributions are significant and relevant to the investor's decision. Thus, the higher-moment performance measures should be more appropriate for evaluating the performance of international stock markets. The evidence also indicates that various measures provide a vastly different performance ranking of the markets, albeit in the same direction.

Finally, the inter-temporal stability of the international stock markets is investigated using the Parhizgari and Prakash (1989) algorithm for the Sen and Puri (1968) test, which accounts for non-normality of return distributions. The empirical findings provide strong evidence of stability in international stock market movements. However, when the Anderson test, which assumes normality of return distributions, is employed, stability in the correlation structure is rejected. This suggests that the non-normality of the return distribution is an important factor that cannot be ignored in the investigation of inter-temporal stability of international stock markets.
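To make "higher moments" concrete, the following sketch computes the mean, variance and skewness (the third standardised moment) of a return series. It uses population moments and assumes the series is not constant. The function name and sample values are illustrative only and do not come from the dissertation.

```python
import statistics


def sample_moments(returns):
    """Return (mean, variance, skewness) using population (1/n) moments."""
    n = len(returns)
    mean = statistics.fmean(returns)
    m2 = sum((r - mean) ** 2 for r in returns) / n  # second central moment
    m3 = sum((r - mean) ** 3 for r in returns) / n  # third central moment
    skewness = m3 / m2 ** 1.5  # assumes m2 > 0, i.e. returns are not all equal
    return mean, m2, skewness


# A symmetric series has zero skewness; one large gain makes it positive.
_, _, skew_symmetric = sample_moments([-1.0, 0.0, 1.0])
_, _, skew_right = sample_moments([0.0, 0.0, 0.0, 4.0])
```

A mean-variance analysis would treat two series with equal mean and variance as equivalent; the skewness term is what separates them when returns are not normally distributed.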

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Recent advances in airborne Light Detection and Ranging (LIDAR) technology allow rapid and inexpensive measurements of topography over large areas. Airborne LIDAR systems usually return a 3-dimensional cloud of point measurements from reflective objects scanned by the laser beneath the flight path. This technology is becoming a primary method for extracting information about different kinds of geometrical objects, such as high-resolution digital terrain models (DTMs), buildings, and trees. In the past decade, LIDAR has attracted growing interest from researchers in the fields of remote sensing and GIS. Compared to traditional data sources, such as aerial photography and satellite images, LIDAR measurements are not influenced by sun shadow and relief displacement. However, the voluminous data pose a new challenge for the automated extraction of geometrical information from LIDAR measurements, because many raster image processing techniques cannot be directly applied to irregularly spaced LIDAR points.

In this dissertation, a framework is proposed to automatically extract information about different kinds of geometrical objects, such as terrain and buildings, from LIDAR data. Such objects are essential to numerous applications such as flood modeling, landslide prediction, and hurricane animation. The framework consists of several intuitive algorithms. First, a progressive morphological filter was developed to detect non-ground LIDAR measurements. By gradually increasing the window size and elevation difference threshold of the filter, the measurements of vehicles, vegetation, and buildings are removed, while ground data are preserved. Then, building measurements are identified from the non-ground measurements using a region growing algorithm based on the plane-fitting technique. Raw footprints for segmented building measurements are derived by connecting boundary points and are further simplified and adjusted by several proposed operations to remove noise caused by irregularly spaced LIDAR measurements. To reconstruct 3D building models, the raw 2D topology of each building is first extracted and then further adjusted. Since the adjusting operations for simple building models do not work well on 2D topology, a 2D snake algorithm is proposed to adjust it. The 2D snake algorithm consists of newly defined energy functions for topology adjustment and a linear algorithm to find the minimal energy value of 2D snake problems. Data sets from urbanized areas including large institutional, commercial, and small residential buildings were employed to test the proposed framework. The results demonstrated that the proposed framework achieves very good performance.
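The idea of the progressive morphological filter can be sketched on a one-dimensional, regularly spaced elevation profile. This is a deliberately simplified illustration under stated assumptions: the dissertation works on irregularly spaced 3D points, and the window sizes, thresholds and elevations below are hypothetical.

```python
def erode(z, half):
    """Minimum over a window of 2*half+1 samples, clipped at the profile ends."""
    n = len(z)
    return [min(z[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def dilate(z, half):
    """Maximum over a window of 2*half+1 samples, clipped at the profile ends."""
    n = len(z)
    return [max(z[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def progressive_morphological_filter(z, half_windows, thresholds):
    """Flag non-ground samples in a 1-D elevation profile.

    Each pass applies a morphological opening (erosion, then dilation) with a
    larger window and a larger elevation-difference threshold than the last.
    A sample is flagged when the opening lowers it by more than the threshold.
    """
    surface = list(z)
    non_ground = [False] * len(z)
    for half, threshold in zip(half_windows, thresholds):
        opened = dilate(erode(surface, half), half)
        for i, (before, after) in enumerate(zip(surface, opened)):
            if before - after > threshold:
                non_ground[i] = True
        surface = opened  # the next pass works on the smoothed surface
    return non_ground


# Hypothetical profile: a 2 m vehicle at index 3, an 8 m building at 6..9.
profile = [0.0, 0.1, 0.0, 2.0, 0.1, 0.0, 8.0, 8.0, 8.0, 8.0, 0.1, 0.0, 0.0]
flags = progressive_morphological_filter(profile, [1, 2, 4], [0.5, 1.0, 2.0])
```

The narrow object is removed by the smallest window, while the wider building survives the first pass and is only removed once the window exceeds its width, which is why the window must grow progressively.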

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Traffic incidents are non-recurring events that can cause a temporary reduction in roadway capacity. They have been recognized as a major contributor to traffic congestion on our nation’s highway systems. To alleviate their impacts on capacity, automatic incident detection (AID) has been applied as an incident management strategy to reduce the total incident duration. AID relies on an algorithm to identify the occurrence of incidents by analyzing real-time traffic data collected from surveillance detectors. Significant research has been performed to develop AID algorithms for incident detection on freeways; however, similar research on major arterial streets remains largely at the initial stage of development and testing. This dissertation research aims to identify design strategies for the deployment of an Artificial Neural Network (ANN) based AID algorithm for major arterial streets. A section of the US-1 corridor in Miami-Dade County, Florida was coded in the CORSIM microscopic simulation model to generate data for both model calibration and validation. To better capture the relationship between the traffic data and the corresponding incident status, Discrete Wavelet Transform (DWT) and data normalization were applied to the simulated data. Multiple ANN models were then developed for different detector configurations, historical data usage, and the selection of traffic flow parameters. To assess the performance of different design alternatives, the model outputs were compared based on both detection rate (DR) and false alarm rate (FAR). The results show that the best models were able to achieve a high DR of between 90% and 95%, a mean time to detect (MTTD) of 55-85 seconds, and a FAR below 4%. The results also show that a detector configuration including only the mid-block and upstream detectors performs almost as well as one that also includes a downstream detector. 
In addition, DWT was found to be able to improve model performance, and the use of historical data from previous time cycles improved the detection rate. Speed was found to have the most significant impact on the detection rate, while volume was found to contribute the least. The results from this research provide useful insights on the design of AID for arterial street applications.
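The three performance measures named in this abstract can be computed as in the sketch below. The definitions follow one common convention, which is an assumption here since the abstract does not spell them out: DR is the share of incidents with at least one alarm, FAR is false alarms divided by all detection intervals, and MTTD is the mean delay from incident start to first alarm. The function name and data are hypothetical.

```python
def aid_metrics(alarms, incidents):
    """Compute (DR, FAR, MTTD) for an incident detection algorithm.

    alarms    -- list of bool, one decision per detection interval
    incidents -- list of (start, end) interval indices, end exclusive
    MTTD is returned in intervals; multiply by the interval length for seconds.
    """
    in_incident = [False] * len(alarms)
    detected = 0
    delays = []
    for start, end in incidents:
        for i in range(start, end):
            in_incident[i] = True
        # First alarm raised while this incident is active, if any.
        first = next((i for i in range(start, end) if alarms[i]), None)
        if first is not None:
            detected += 1
            delays.append(first - start)
    false_alarms = sum(1 for a, inc in zip(alarms, in_incident) if a and not inc)
    dr = detected / len(incidents) if incidents else None
    far = false_alarms / len(alarms) if alarms else None
    mttd = sum(delays) / len(delays) if delays else None
    return dr, far, mttd


# Hypothetical run: 20 intervals, two incidents, one detected, one false alarm.
alarms = [False] * 20
for idx in (4, 5, 9):
    alarms[idx] = True
dr, far, mttd = aid_metrics(alarms, [(3, 7), (12, 15)])
```

Reporting all three together matters because they trade off: a more sensitive model raises DR and lowers MTTD but usually raises FAR as well.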

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This research is motivated by the need to consider lot sizing while accepting customer orders in a make-to-order (MTO) environment, in which each customer order must be delivered by its due date. The job shop is the typical operation model used in an MTO operation, where the production planner must make three concurrent decisions: order selection, lot size, and job schedule. These decisions are usually treated separately in the literature and mostly lead to heuristic solutions. The first phase of the study focuses on a formal definition of the problem. Mathematical programming techniques are applied to model this problem in terms of its objective, decision variables, and constraints. A commercial solver, CPLEX, is applied to solve the resulting mixed-integer linear programming model on small instances to validate the mathematical formulation. The computational results show that using a commercial solver is not practical for problems of industrial size. The second phase of this study focuses on the development of an effective solution approach for large-scale instances of this problem. The proposed solution approach is an iterative process involving three sequential decision steps: order selection, lot sizing, and lot scheduling. A range of simple sequencing rules is identified for each of the three subproblems. Using computer simulation as the tool, an experiment is designed to evaluate their performance against a set of system parameters. For order selection, the proposed weighted most profit rule performs best. The shifting bottleneck and the earliest operation finish time rules are both the best scheduling rules. For lot sizing, the proposed minimum cost increase heuristic, based on the Dixon-Silver method, performs best when the demand-to-capacity ratio at the bottleneck machine is high. The proposed minimum cost heuristic, based on the Wagner-Whitin algorithm, is the best lot-sizing heuristic for shops with a low demand-to-capacity ratio. The proposed heuristic is applied to an industrial case to further evaluate its performance. The results show that it can improve average total profit by 16.62%. This research contributes to the production planning research community a complete mathematical definition of the problem and an effective solution approach for problems of industrial scale.
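For context, the classical Wagner-Whitin algorithm on which one of the lot-sizing heuristics is based can be sketched as a short dynamic program. This is the textbook uncapacitated single-item version with a fixed setup cost and a linear holding cost, not the dissertation's heuristic. The demand and cost figures are hypothetical.

```python
def wagner_whitin(demand, setup_cost, holding_cost):
    """Return (minimum total cost, lot size per period).

    demand       -- demand per period
    setup_cost   -- fixed cost charged in every period with production
    holding_cost -- cost of carrying one unit for one period
    """
    T = len(demand)
    best = [0.0] + [float("inf")] * T  # best[t]: min cost to cover periods 1..t
    last_order = [0] * (T + 1)         # period of the last setup in that plan
    for t in range(1, T + 1):
        for j in range(1, t + 1):
            # A setup in period j covers the demand of periods j..t.
            holding = sum(
                holding_cost * (k - j) * demand[k - 1] for k in range(j, t + 1)
            )
            cost = best[j - 1] + setup_cost + holding
            if cost < best[t]:
                best[t] = cost
                last_order[t] = j
    # Walk back through the chosen setups to recover the lot sizes.
    lots = [0] * T
    t = T
    while t > 0:
        j = last_order[t]
        lots[j - 1] = sum(demand[j - 1:t])
        t = j - 1
    return best[T], lots


# Hypothetical five-period instance.
cost, lots = wagner_whitin([20, 50, 10, 50, 50], setup_cost=100, holding_cost=1)
```

Because this recursion ignores capacity, it fits the low demand-to-capacity case described in the abstract, where the bottleneck machine rarely constrains the lot sizes.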

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Global connectivity for anyone, anywhere, at any time, providing high-speed, high-quality, and reliable communication channels for mobile devices, is now becoming a reality. The credit mainly goes to recent technological advances in wireless communications, which comprise a wide range of technologies, services, and applications to fulfill the particular needs of end-users in different deployment scenarios (Wi-Fi, WiMAX, and 3G/4G cellular systems). In such a heterogeneous wireless environment, one of the key ingredients for providing efficient ubiquitous computing with guaranteed quality and continuity of service is the design of intelligent handoff algorithms. Traditional single-metric handoff decision algorithms, such as those based on Received Signal Strength (RSS), are not efficient and intelligent enough to minimize the number of unnecessary handoffs, decision delays, and call-dropping and/or blocking probabilities. This research presented a novel approach for the design and implementation of a multi-criteria vertical handoff algorithm for heterogeneous wireless networks. Several parallel Fuzzy Logic Controllers were utilized in combination with different types of ranking algorithms and metric weighting schemes to implement two major modules: the first module estimated the necessity of handoff, and the second selected the best network as the target of handoff. Simulations based on different traffic classes and various types of wireless networks were carried out by implementing a wireless test-bed inspired by the concept of the Rudimentary Network Emulator (RUNE). Simulation results indicated that the proposed scheme provided better performance in terms of minimizing unnecessary handoffs, call dropping, and call blocking and handoff blocking probabilities. When subjected to Conversational traffic and compared against the RSS-based reference algorithm, the proposed scheme, utilizing the FTOPSIS ranking algorithm, was able to reduce the average outage probability of MSs moving at high speeds by 17%, the new call blocking probability by 22%, the handoff blocking probability by 16%, and the average handoff rate by 40%. The significant reduction in the resulting handoff rate gives the MS more efficient power consumption and longer battery life. These percentages indicated a higher probability of guaranteed session continuity and quality of the currently utilized service, resulting in higher user satisfaction levels.
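The network-selection step can be illustrated with the crisp (non-fuzzy) TOPSIS ranking method. This is a simplification: the abstract refers to the fuzzy variant, FTOPSIS, whose details are not given here. The criteria, weights and measurements below are hypothetical, and the sketch assumes no criterion column is all zeros.

```python
import math


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] per alternative (higher is better).

    matrix  -- matrix[i][j] is the value of alternative i on criterion j
    weights -- one weight per criterion; normalised here to sum to 1
    benefit -- benefit[j] is True when larger values of criterion j are better
    """
    n_crit = len(weights)
    w_sum = sum(weights)
    # Vector-normalise each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [
        [(weights[j] / w_sum) * row[j] / norms[j] for j in range(n_crit)]
        for row in matrix
    ]
    columns = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)  # distance to the ideal solution
        d_neg = math.dist(row, worst)  # distance to the anti-ideal solution
        total = d_pos + d_neg
        scores.append(d_neg / total if total else 0.5)
    return scores


# Hypothetical candidates: columns are bandwidth (benefit), delay and cost.
networks = [[10, 20, 1], [5, 40, 2], [2, 60, 3]]
scores = topsis(networks, weights=[0.5, 0.3, 0.2], benefit=[True, False, False])
```

The handoff target would be the network with the highest score; a fuzzy variant replaces the crisp matrix entries with fuzzy numbers but keeps the same ideal and anti-ideal distance logic.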

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Physiological signals, which are controlled by the autonomic nervous system (ANS), could be used to detect the affective state of computer users and therefore find applications in medicine and engineering. The Pupil Diameter (PD) seems to provide a strong indication of the affective state, as found by previous research, but it has not yet been fully investigated.

In this study, new approaches based on monitoring and processing the PD signal for off-line and on-line affective assessment ("relaxation" vs. "stress") are proposed. Wavelet denoising and Kalman filtering methods are first used to remove abrupt changes in the raw PD signal. Then three features (PDmean, PDmax and PDWalsh) are extracted from the preprocessed PD signal for affective state classification. In order to select more relevant and reliable physiological data for further analysis, two types of data selection methods are applied, based on the paired t-test and subject self-evaluation, respectively. In addition, five different kinds of classifiers are implemented on the selected data, achieving average accuracies of up to 86.43% and 87.20%, respectively. Finally, the receiver operating characteristic (ROC) curve is used to investigate the discriminating potential of each individual feature by evaluating the area under the ROC curve, which reaches values above 0.90.

For the on-line affective assessment, a hard threshold is first applied to remove eye blinks from the PD signal, and a moving average window is then used to obtain the representative value PDr for every one-second interval of PD. The on-line affective assessment algorithm has three main steps: preparation, feature-based decision voting, and affective determination. The final results show accuracies of 72.30% and 73.55% for the data subsets chosen using the two data selection methods (paired t-test and subject self-evaluation, respectively).

In order to further analyze the efficiency of affective recognition through the PD signal, the Galvanic Skin Response (GSR) was also monitored and processed. The highest affective assessment classification rate obtained from GSR processing is only 63.57% (based on the off-line processing algorithm). The overall results confirm that the PD signal should be considered one of the most powerful physiological signals for future automated real-time affective recognition systems, especially for detecting the "relaxation" vs. "stress" states.
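The on-line preprocessing step (blink removal by a hard threshold, then one representative value per second) might look like the following sketch. Several details are assumptions, not taken from the study: blinks are assumed to appear as samples at or below the threshold, the averaging windows are non-overlapping, and the sampling rate, threshold and sample values are hypothetical.

```python
def representative_pd(samples, rate_hz, blink_threshold):
    """Return one averaged pupil-diameter value per full one-second interval.

    samples         -- raw pupil diameter readings
    rate_hz         -- integer number of samples per second
    blink_threshold -- readings at or below this value are treated as blinks
    An interval made up entirely of blink samples yields None.
    """
    values = []
    for start in range(0, len(samples) - rate_hz + 1, rate_hz):
        # Hard threshold: drop blink samples before averaging.
        window = [s for s in samples[start:start + rate_hz] if s > blink_threshold]
        values.append(sum(window) / len(window) if window else None)
    return values


# Hypothetical 4 Hz recording (mm); zeros mark blinks.
raw = [4.0, 4.2, 0.0, 3.8, 5.0, 0.0, 0.0, 5.2]
pdr = representative_pd(raw, rate_hz=4, blink_threshold=1.0)
```

Dropping blink samples before averaging keeps a brief blink from dragging the one-second value down, which would otherwise look like a sudden pupil constriction to the later decision steps.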

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Supply chain operations directly affect service levels. Decisions on changes to facilities are generally based on overall cost, leaving out the efficiency of each unit. By decomposing the supply chain superstructure, efficiency analysis of the facilities (warehouses or distribution centers) that serve customers can be easily implemented. With the proposed algorithm, the selection of a facility is based on service level maximization and not just cost minimization, as the analysis filters all feasible solutions using the Data Envelopment Analysis (DEA) technique. Through multiple iterations, solutions are filtered via DEA and only the efficient ones are selected, leading to cost minimization. In this work, the problem of optimal supply chain network design is addressed with a DEA-based algorithm. A Branch and Efficiency (B&E) algorithm is deployed to solve this problem. In this DEA approach, each solution (a potentially installed warehouse, plant, etc.) is treated as a Decision Making Unit and is thus characterized by inputs and outputs. Through additional constraints named “efficiency cuts”, the algorithm selects only efficient solutions that provide better objective function values. The applicability of the proposed algorithm is demonstrated through illustrative examples.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Knowledge-based radiation treatment is an emerging concept in radiotherapy. It mainly refers to techniques that can guide or automate treatment planning in the clinic by learning from prior knowledge. Different models have been developed to realize it, one of which was proposed by Yuan et al. at Duke for lung IMRT planning. This model can automatically determine both beam configuration and optimization objectives with non-coplanar beams based on patient-specific anatomical information. Although plans automatically generated by this model demonstrate equivalent or better dosimetric quality compared to clinically approved plans, its validity and generality are limited by the empirical assignment of a coefficient called the angle spread constraint, defined in the beam efficiency index used for beam ranking. To eliminate these limitations, a systematic study of this coefficient is needed to acquire evidence for its optimal value.

To achieve this purpose, eleven lung cancer patients with complex tumor shapes, for whom non-coplanar beams were adopted in the clinically approved plans, were retrospectively studied in the framework of the automatic lung IMRT treatment algorithm. The primary and boost plans used in three patients were treated as different cases due to the different target size and shape. A total of 14 lung cases were thus re-planned using the knowledge-based automatic lung IMRT planning algorithm by varying the angle spread constraint from 0 to 1 in increments of 0.2. A modified beam angle efficiency index, used to navigate the beam selection, was adopted. Great efforts were made to ensure that the quality of the plans associated with every angle spread constraint was as good as possible. Important dosimetric parameters for the PTV and OARs, quantitatively reflecting the plan quality, were extracted from the DVHs and analyzed as a function of the angle spread constraint for each case. Comparisons of these parameters between clinical plans and model-based plans were evaluated by two-sample Student's t-tests, and a regression analysis was performed on a composite index, built on the percentage errors between dosimetric parameters in the model-based plans and those in the clinical plans, as a function of the angle spread constraint.

Results show that model-based plans generally have equivalent or better quality than clinically approved plans, qualitatively and quantitatively. All dosimetric parameters except those for the lungs in the automatically generated plans are statistically better than or comparable to those in the clinical plans. On average, a reduction of more than 15% in the conformity index and homogeneity index for the PTV and in V40 and V60 for the heart is observed, along with increases of 8% and 3% in V5 and V20 for the lungs, respectively. The intra-plan comparison among model-based plans demonstrates that plan quality does not change much for angle spread constraints larger than 0.4. Further examination of the variation curve of the composite index as a function of the angle spread constraint shows that 0.6 is the optimal value, resulting in statistically the best achievable plans.