100 results for statistical machine learning


Relevance:

100.00%

Publisher:

Abstract:

Traffic congestion is one of the major problems in modern cities. This study applies machine learning methods to determine green times that minimize total delay at an isolated intersection. Q-learning and neural networks are applied to set signal light times and minimize total delays. The intersection is assumed to behave like an intelligent agent that learns how to set green times in each cycle based on traffic information. A comparison between Q-learning and the neural network is presented. In Q-learning, treating green time as continuous requires a large state space, making the learning process practically impossible. In contrast, the neural network model can easily set an appropriate green time to fit the traffic demand. The performance of the proposed neural network is compared with two traditional alternatives for controlling traffic lights. Simulation results indicate that the proposed method greatly reduces the total delay in the network compared to the alternative methods.
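The state-space argument in this abstract can be illustrated with a minimal tabular Q-learning sketch. Everything specific here is an assumption for illustration and is not taken from the study: the state encoding (binned queue lengths per approach), the reward (negative delay), the candidate green times, and the `table_size` helper.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


def table_size(queue_levels, approaches, green_options):
    """Number of Q-table entries when each approach's queue is binned into
    `queue_levels` levels and green time is one of `green_options` values."""
    return (queue_levels ** approaches) * green_options


# Hypothetical discretisation: three candidate green times, in seconds.
actions = [10, 20, 30]
q = defaultdict(float)  # unseen (state, action) pairs start at 0.0

# Hypothetical transition: state is a tuple of binned queue lengths,
# reward is the negative of the delay observed in the cycle.
new_value = q_update(q, state=(2, 1), action=20, reward=-35.0,
                     next_state=(1, 1), actions=actions)

# Finer green-time resolution multiplies the table the agent must fill in.
coarse = table_size(queue_levels=10, approaches=4, green_options=3)
fine = table_size(queue_levels=10, approaches=4, green_options=51)
```

With these hypothetical bins, moving from 3 to 51 green-time options grows the table from 30,000 to 510,000 entries, each of which has to be visited many times before its estimate is reliable. A function approximator such as a neural network avoids enumerating the entries.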

Relevance:

100.00%

Publisher:

Abstract:

Developers sometimes maintain an internal copy of another software package or fork the development of an existing project. This practice can lead to software vulnerabilities when the embedded code is not kept up to date with upstream sources. We propose an automated solution to identify clones of packages without any prior knowledge of these relationships. We then correlate clones with vulnerability information to identify outstanding security problems. This approach motivates software maintainers to avoid using cloned packages and to link against system-wide libraries. We propose over 30 novel features that enable us to use pattern classification to accurately identify package-level clones. To our knowledge, we are the first to consider clone detection as a classification problem. Our results show that our system, Clonewise, compares well to manually tracked databases. Based on our work, over 30 unknown package clones and vulnerabilities have been identified and patched.

Relevance:

100.00%

Publisher:

Abstract:

Structural MRI offers anatomical details and high sensitivity to pathological changes. It can demonstrate certain patterns of brain changes present at a structural level. Research to date has shown that volumetric analysis of brain regions is important in depression detection. However, such analysis has seen minimal use in depression detection studies at the individual level. Optimally combining various brain volumetric features/attributes and summarizing the data into a distinctive set of variables remain difficult. This study investigates machine learning algorithms that automatically identify relevant data attributes for depression detection. Different machine learning techniques are studied for depression classification based on attributes extracted from structural MRI (sMRI) data. The attributes include volumes calculated from the whole brain, white matter, grey matter and hippocampus. Attribute subset selection is performed to remove redundant attributes using three filtering methods and one hybrid method, in combination with ranker search algorithms. The highest average classification accuracy, obtained by using a combination of both SVM-EM and IG-Random Tree algorithms, is 85.23%. The classification approach implemented in this study can achieve higher accuracy than most reported studies using sMRI data, specifically for detection of depression.

Relevance:

100.00%

Publisher:

Abstract:

Using the prediction of cancer outcome as a model, we tested the hypothesis that analysing routinely collected digital data contained in an electronic administrative record (EAR) with machine-learning techniques could enhance conventional methods of predicting clinical outcomes.

Relevance:

100.00%

Publisher:

Abstract:

For years, we have relied on population surveys to keep track of regional public health statistics, including the prevalence of non-communicable diseases. Because of the cost and limitations of such surveys, we often do not have up-to-date data on the health outcomes of a region. In this paper, we examined the feasibility of inferring regional health outcomes from socio-demographic data that are widely available and regularly updated through national censuses and community surveys. Using data for 50 American states (excluding Washington DC) from 2007 to 2012, we constructed a machine-learning model to predict the prevalence of six non-communicable disease (NCD) outcomes (four NCDs and two major clinical risk factors), based on population socio-demographic characteristics from the American Community Survey. We found that regional prevalence estimates for non-communicable diseases can be reasonably predicted. The predictions were highly correlated with the observed data, both in the states included in the derivation model (median correlation 0.88) and in those excluded from model development for use as a completely separate validation sample (median correlation 0.85). This demonstrates that the model had sufficient external validity to make good predictions, based on demographics alone, for areas not included in the model development. This highlights both the utility of this sophisticated approach to model development, and the vital importance of simple socio-demographic characteristics as both indicators and determinants of chronic disease.
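The validation summary reported in this abstract (a median correlation between predicted and observed prevalence) can be sketched as follows. This is an illustrative sketch only: it assumes a Pearson correlation and a median taken across outcomes, neither of which the abstract states, and the outcome names and numbers are hypothetical placeholders rather than study data.

```python
import math
from statistics import median


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical held-out regions: outcome -> (observed %, predicted %).
held_out = {
    "outcome_a": ([8.0, 9.5, 11.0, 12.5], [8.4, 9.1, 11.3, 12.0]),
    "outcome_b": ([25.0, 28.0, 31.0, 30.0], [26.0, 27.5, 30.0, 31.0]),
    "outcome_c": ([30.0, 33.0, 29.0, 35.0], [31.0, 32.0, 30.0, 34.0]),
}

# One correlation per outcome, then the median across outcomes.
per_outcome = {name: pearson(obs, pred) for name, (obs, pred) in held_out.items()}
median_r = median(per_outcome.values())
```

Computing the same summary on regions used for fitting and on regions withheld from fitting gives the two figures that the abstract compares.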

Relevance:

100.00%

Publisher:

Abstract:

In this paper, two real-world medical classification problems using electrocardiogram (ECG) and auscultatory blood pressure (Korotkoff) signals are examined. A total of nine machine learning models are applied to classify the medical data sets. Several performance metrics are computed, including accuracy, sensitivity, specificity, and the area under the receiver operating characteristic curve. In addition to the original data sets, noisy data sets are generated to evaluate the robustness of the classifiers against noise. The 10-fold cross-validation method is used to compute the performance statistics, to ensure that the classification results for the ECG and Korotkoff signals are statistically reliable. The outcomes indicate that while logistic regression models perform best on the original data sets, ensemble machine learning models achieve good accuracy rates on noisy data sets.
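The evaluation protocol described in this abstract, k-fold cross-validation scored by accuracy, sensitivity and specificity, can be sketched in plain Python. The sketch makes several assumptions that are not from the paper: the one-feature threshold classifier is a toy stand-in for the nine models, the data are synthetic, predictions are pooled across folds before scoring, and the ROC area is omitted for brevity.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true-positive rate) and specificity (true-negative rate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


def fit_threshold(xs, ys):
    """Toy classifier: the midpoint between the two class means of one feature."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def cross_validate(xs, ys, k=10):
    """Train on k-1 folds, predict the held-out fold, then score pooled predictions."""
    preds = [None] * len(xs)
    for fold in k_fold_indices(len(xs), k):
        held = set(fold)
        train = [i for i in range(len(xs)) if i not in held]
        threshold = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            preds[i] = 1 if xs[i] > threshold else 0
    return confusion_metrics(ys, preds)


# Synthetic, well-separated data: 50 negatives (0..49) and 50 positives (100..149).
xs = [float(v) for v in range(50)] + [float(v) for v in range(100, 150)]
ys = [0] * 50 + [1] * 50
scores = cross_validate(xs, ys, k=10)
```

Every sample is predicted exactly once by a model that never saw it during training, which is what makes the pooled metrics an estimate of out-of-sample performance.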

Relevance:

100.00%

Publisher:

Abstract:

This thesis advances several theoretical and practical aspects of the recently introduced restricted Boltzmann machine - a powerful probabilistic and generative framework for modelling data and learning representations. The contributions of this study represent a systematic and common theme in learning structured representations from complex data.