796 resultados para Data-Mining Techniques


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper deals with the problem of using the data mining models in a real-world situation where the user can not provide all the inputs with which the predictive model is built. A learning system framework, Query Based Learning System (QBLS), is developed for improving the performance of the predictive models in practice where not all inputs are available for querying to the system. The automatic feature selection algorithm called Query Based Feature Selection (QBFS) is developed for selecting features to obtain a balance between the relative minimum subset of features and the relative maximum classification accuracy. Performance of the QBLS system and the QBFS algorithm is successfully demonstrated with a real-world application

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The management of main material prices of provincial highway project quota has problems of lag and blindness. Framework of provincial highway project quota data MIS and main material price data warehouse were established based on WEB firstly. Then concrete processes of provincial highway project main material prices were brought forward based on BP neural network algorithmic. After that standard BP algorithmic, additional momentum modify BP network algorithmic, self-adaptive study speed improved BP network algorithmic were compared in predicting highway project main prices. The result indicated that it is feasible to predict highway main material prices using BP NN, and using self-adaptive study speed improved BP network algorithmic is the relatively best one.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Dealing with the ever-growing information overload in the Internet, Recommender Systems are widely used online to suggest potential customers item they may like or find useful. Collaborative Filtering is the most popular techniques for Recommender Systems which collects opinions from customers in the form of ratings on items, services or service providers. In addition to the customer rating about a service provider, there is also a good number of online customer feedback information available over the Internet as customer reviews, comments, newsgroups post, discussion forums or blogs which is collectively called user generated contents. This information can be used to generate the public reputation of the service providers’. To do this, data mining techniques, specially recently emerged opinion mining could be a useful tool. In this paper we present a state of the art review of Opinion Mining from online customer feedback. We critically evaluate the existing work and expose cutting edge area of interest in opinion mining. We also classify the approaches taken by different researchers into several categories and sub-categories. Each of those steps is analyzed with their strength and limitations in this paper.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Many data mining techniques have been proposed for mining useful patterns in databases. However, how to effectively utilize discovered patterns is still an open research issue, especially in the domain of text mining. Most existing methods adopt term-based approaches. However, they all suffer from the problems of polysemy and synonymy. This paper presents an innovative technique, pattern taxonomy mining, to improve the effectiveness of using discovered patterns for finding useful information. Substantial experiments on RCV1 demonstrate that the proposed solution achieves encouraging performance.

Relevância:

100.00% 100.00%

Publicador:

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Road curves are an important feature of road infrastructure and many serious crashes occur on road curves. In Queensland, the number of fatalities is twice as many on curves as that on straight roads. Therefore, there is a need to reduce drivers’ exposure to crash risk on road curves. Road crashes in Australia and in the Organisation for Economic Co-operation and Development(OECD) have plateaued in the last five years (2004 to 2008) and the road safety community is desperately seeking innovative interventions to reduce the number of crashes. However, designing an innovative and effective intervention may prove to be difficult as it relies on providing theoretical foundation, coherence, understanding, and structure to both the design and validation of the efficiency of the new intervention. Researchers from multiple disciplines have developed various models to determine the contributing factors for crashes on road curves with a view towards reducing the crash rate. However, most of the existing methods are based on statistical analysis of contributing factors described in government crash reports. In order to further explore the contributing factors related to crashes on road curves, this thesis designs a novel method to analyse and validate these contributing factors. The use of crash claim reports from an insurance company is proposed for analysis using data mining techniques. To the best of our knowledge, this is the first attempt to use data mining techniques to analyse crashes on road curves. Text mining technique is employed as the reports consist of thousands of textual descriptions and hence, text mining is able to identify the contributing factors. Besides identifying the contributing factors, limited studies to date have investigated the relationships between these factors, especially for crashes on road curves. Thus, this study proposed the use of the rough set analysis technique to determine these relationships. The results from this analysis are used to assess the effect of these contributing factors on crash severity. The findings obtained through the use of data mining techniques presented in this thesis, have been found to be consistent with existing identified contributing factors. Furthermore, this thesis has identified new contributing factors towards crashes and the relationships between them. A significant pattern related with crash severity is the time of the day where severe road crashes occur more frequently in the evening or night time. Tree collision is another common pattern where crashes that occur in the morning and involves hitting a tree are likely to have a higher crash severity. Another factor that influences crash severity is the age of the driver. Most age groups face a high crash severity except for drivers between 60 and 100 years old, who have the lowest crash severity. The significant relationship identified between contributing factors consists of the time of the crash, the manufactured year of the vehicle, the age of the driver and hitting a tree. Having identified new contributing factors and relationships, a validation process is carried out using a traffic simulator in order to determine their accuracy. The validation process indicates that the results are accurate. This demonstrates that data mining techniques are a powerful tool in road safety research, and can be usefully applied within the Intelligent Transport System (ITS) domain. The research presented in this thesis provides an insight into the complexity of crashes on road curves. The findings of this research have important implications for both practitioners and academics. For road safety practitioners, the results from this research illustrate practical benefits for the design of interventions for road curves that will potentially help in decreasing related injuries and fatalities. For academics, this research opens up a new research methodology to assess crash severity, related to road crashes on curves.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Road safety is a major concern worldwide. Road safety will improve as road conditions and their effects on crashes are continually investigated. This paper proposes to use the capability of data mining to include the greater set of road variables for all available crashes with skid resistance values across the Queensland state main road network in order to understand the relationships among crash, traffic and road variables. This paper presents a data mining based methodology for the road asset management data to find out the various road properties that contribute unduly to crashes. The models demonstrate high levels of accuracy in predicting crashes in roads when various road properties are included. This paper presents the findings of these models to show the relationships among skid resistance, crashes, crash characteristics and other road characteristics such as seal type, seal age, road type, texture depth, lane count, pavement width, rutting, speed limit, traffic rates intersections, traffic signage and road design and so on.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Developing safe and sustainable road systems is a common goal in all countries. Applications to assist with road asset management and crash minimization are sought universally. This paper presents a data mining methodology using decision trees for modeling the crash proneness of road segments using available road and crash attributes. The models quantify the concept of crash proneness and demonstrate that road segments with only a few crashes have more in common with non-crash roads than roads with higher crash counts. This paper also examines ways of dealing with highly unbalanced data sets encountered in the study.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

It is commonly accepted that wet roads have higher risk of crash than dry roads; however, providing evidence to support this assumption presents some difficulty. This paper presents a data mining case study in which predictive data mining is applied to model the skid resistance and crash relationship to search for discernable differences in the probability of wet and dry road segments having crashes based on skid resistance. The models identify an increased probability of wet road segments having crashes for mid-range skid resistance values.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Road crashes cost world and Australian society a significant proportion of GDP, affecting productivity and causing significant suffering for communities and individuals. This paper presents a case study that generates data mining models that contribute to understanding of road crashes by allowing examination of the role of skid resistance (F60) and other road attributes in road crashes. Predictive data mining algorithms, primarily regression trees, were used to produce road segment crash count models from the road and traffic attributes of crash scenarios. The rules derived from the regression trees provide evidence of the significance of road attributes in contributing to crash, with a focus on the evaluation of skid resistance.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Many data mining techniques have been proposed for mining useful patterns in text documents. However, how to effectively use and update discovered patterns is still an open research issue, especially in the domain of text mining. Since most existing text mining methods adopted term-based approaches, they all suffer from the problems of polysemy and synonymy. Over the years, people have often held the hypothesis that pattern (or phrase) based approaches should perform better than the term-based ones, but many experiments did not support this hypothesis. This paper presents an innovative technique, effective pattern discovery which includes the processes of pattern deploying and pattern evolving, to improve the effectiveness of using and updating discovered patterns for finding relevant and interesting information. Substantial experiments on RCV1 data collection and TREC topics demonstrate that the proposed solution achieves encouraging performance.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Road asset managers are overwhelmed with a high volume of raw data which they need to process and utilise in supporting their decision making. This paper presents a method that processes road-crash data of a whole road network and exposes hidden value inherent in the data by deploying the clustering data mining method. The goal of the method is to partition the road network into a set of groups (classes) based on common data and characterise the class crash types to produce a crash profiles for each cluster. By comparing similar road classes with differing crash types and rates, insight can be gained into these differences that are caused by the particular characteristics of their roads. These differences can be used as evidence in knowledge development and decision support.