979 results for mining areas
Abstract:
This paper deals with the problem of using data mining models in real-world situations where the user cannot provide all the inputs with which the predictive model was built. A learning system framework, the Query Based Learning System (QBLS), is developed to improve the performance of predictive models in practice, where not all inputs are available for querying the system. An automatic feature selection algorithm, Query Based Feature Selection (QBFS), is developed to select features that balance a relatively small feature subset against relatively high classification accuracy. The performance of the QBLS system and the QBFS algorithm is successfully demonstrated with a real-world application.
Abstract:
The management of main material prices in provincial highway project quotas suffers from lag and blindness. First, a web-based framework for a provincial highway project quota data MIS and a main material price data warehouse was established. Then, a concrete process for predicting provincial highway project main material prices was put forward based on the BP neural network algorithm. After that, the standard BP algorithm, the BP algorithm modified with additional momentum, and the BP algorithm improved with a self-adaptive learning rate were compared in predicting highway project main material prices. The results indicate that it is feasible to predict highway main material prices using a BP neural network, and that the BP algorithm with a self-adaptive learning rate performs best of the three.
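The BP variants compared in this abstract differ mainly in how each weight update is formed. The following is a minimal Python sketch on a one-dimensional toy loss, not the paper's network or data; the function names, the quadratic loss, and the constants 0.9, 1.05 and 0.7 are illustrative assumptions. It shows a momentum update and a simple accept/reject rule for adapting the learning rate.

```python
def loss(w):
    """Toy loss with its minimum at w = 3 (a stand-in for a network's error)."""
    return (w - 3.0) ** 2


def grad(w):
    """Derivative of the toy loss."""
    return 2.0 * (w - 3.0)


def train_momentum(w=0.0, lr=0.1, momentum=0.9, steps=500):
    """Gradient descent with an additional momentum term."""
    velocity = 0.0
    for _ in range(steps):
        # Blend the previous update with the new gradient step.
        velocity = momentum * velocity - lr * grad(w)
        w += velocity
    return w


def train_adaptive_lr(w=0.0, lr=0.01, grow=1.05, shrink=0.7, steps=300):
    """Gradient descent whose learning rate adapts to the loss trend."""
    for _ in range(steps):
        candidate = w - lr * grad(w)
        if loss(candidate) < loss(w):
            # The step reduced the loss: keep it and speed up slightly.
            w = candidate
            lr *= grow
        else:
            # The step did not reduce the loss: discard it and slow down.
            lr *= shrink
    return w, lr


if __name__ == "__main__":
    print(train_momentum())
    print(train_adaptive_lr())
```

Both routines approach the minimum of the toy loss. In the adaptive variant a step is kept only if it lowers the loss, so the learning rate can grow aggressively without the iterate diverging.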
Abstract:
Queensland University of Technology (QUT) is faced with a rapidly growing research agenda built upon a strategic research capacity-building program. This presentation will outline the results of a project that has recently investigated QUT’s research support requirements and has developed a model for the support of eResearch across the university. QUT’s research building strategy has produced growth at the faculty level and within its research institutes. This increased research activity is pushing the need for university-wide eResearch platforms capable of providing infrastructure and support in areas such as collaboration, data, networking, authentication and authorisation, workflows and the grid. One of the driving forces behind the investigation is the data-centric nature of modern research. It is now critical that researchers have access to supported infrastructure that allows the collection, analysis, aggregation and sharing of large data volumes for exploration and mining in order to gain new insights and to generate new knowledge. However, recent surveys into current research data management practices by the Australian Partnership for Sustainable Repositories (APSR) and by QUT itself have revealed serious shortcomings in areas such as research data management, especially its long term maintenance for reuse and authoritative evidence of research findings. While these internal university pressures are building, at the same time there are external pressures that are magnifying them. For example, recent compliance guidelines from bodies such as the ARC, NHMRC and Universities Australia indicate that institutions need to provide facilities for the safe and secure storage of research data along with a surrounding set of policies on its retention, ownership and accessibility.
The newly formed Australian National Data Service (ANDS) is developing strategies and guidelines for research data management and research institutions are a central focus, responsible for managing and storing institutional data on platforms that can be federated nationally and internationally for wider use. For some time QUT has recognised the importance of eResearch and has been active in a number of related areas: ePrints to digitally publish research papers, grid computing portals and workflows, institution-wide provisioning and authentication systems, and legal protocols for copyright management. QUT also has two widely recognised centres focused on fundamental research into eResearch itself: the OAK LAW project (Open Access to Knowledge), which focuses upon legal issues relating to eResearch, and the Microsoft QUT eResearch Centre, whose goal is to accelerate scientific research discovery through new smart software. In order to better harness all of these resources and improve research outcomes, the university recently established a project to investigate how it might better organise the support of eResearch. This presentation will outline the project outcomes, which include a flexible and sustainable eResearch support service model addressing short and longer term research needs, identification of the resources required to establish and sustain the service, and the development of research data management policies and implementation plans.
Abstract:
A number of factors have been shown to influence residential property prices in various locations. Studies have identified the importance of location in relation to services, transport and proximity to negative factors such as power lines and cell phone towers. Often the socio-economic status of a residential precinct can determine the overall quality and nature of the streetscapes in that area, with higher value suburbs or locations offering a better visual appearance compared to areas where these factors are not present. However, does the same value for a good streetscape apply in lower socio-economic areas, or are buyers more motivated by less aesthetic factors such as the size of the house, construction materials or land size? This paper analyses specific streets in a lower to middle socio-economic suburb of Christchurch, New Zealand to determine if the location of a house in a street with good streetscape appeal has greater value, investment performance and saleability compared to adjoining streets with less aesthetic appeal.
Abstract:
Real-Time Kinematic (RTK) positioning is a technique used to provide precise positioning services at centimetre accuracy level in the context of Global Navigation Satellite Systems (GNSS). While a Network-based RTK (NRTK) system involves multiple continuously operating reference stations (CORS), the simplest form of an NRTK system is a single-base RTK. In Australia there are several NRTK services operating in different states and over 1000 single-base RTK systems to support precise positioning applications for surveying, mining, agriculture, and civil construction in regional areas. Additionally, future generation GNSS constellations, including modernised GPS, Galileo, GLONASS, and Compass, with multiple frequencies have either been developed or will become fully operational in the next decade. A trend in the future development of RTK systems is to make use of various isolated operating network and single-base RTK systems and multiple GNSS constellations for extended service coverage and improved performance. Several computational challenges have been identified for future NRTK services, including:
• Multiple GNSS constellations and multiple frequencies
• Large scale, wide area NRTK services with a network of networks
• Complex computation algorithms and processes
• Greater part of positioning processes shifting from user end to network centre with the ability to cope with hundreds of simultaneous users’ requests (reverse RTK)
Based on these four challenges, there are two major requirements for NRTK data processing: expandable computing power and scalable data sharing/transferring capability. This research explores new approaches to address these future NRTK challenges and requirements using the Grid Computing facility, in particular for large data processing burdens and complex computation algorithms.
A Grid Computing based NRTK framework is proposed in this research, which is a layered framework consisting of: 1) Client layer with the form of Grid portal; 2) Service layer; 3) Execution layer. The user’s request is passed through these layers, and scheduled to different Grid nodes in the network infrastructure. A proof-of-concept demonstration for the proposed framework is performed in a five-node Grid environment at QUT and also Grid Australia. The Networked Transport of RTCM via Internet Protocol (Ntrip) open source software is adopted to download real-time RTCM data from multiple reference stations through the Internet, followed by job scheduling and simplified RTK computing. The system performance has been analysed and the results have preliminarily demonstrated the concepts and functionality of the new NRTK framework based on Grid Computing, whilst some aspects of the performance of the system are yet to be improved in future work.
Abstract:
Research has noted a ‘pronounced pattern of increase with increasing remoteness’ of death rates in road crashes. However, crash characteristics by remoteness are not commonly or consistently reported, with definitions of rural and urban often relying on proxy representations such as prevailing speed limit. The current paper seeks to evaluate the efficacy of the Accessibility / Remoteness Index of Australia (ARIA+) in identifying trends in road crashes. ARIA+ does not rely on road-specific measures and uses distances to populated centres to attribute a score to an area, which can in turn be grouped into 5 classifications of increasing remoteness. The current paper applies these classifications at the broad level of the Australian Bureau of Statistics' Statistical Local Areas, thus avoiding precise crash locating or dedicated mapping software. Analyses used Queensland road crash database details for all 31,346 crashes resulting in a fatality or hospitalisation occurring between 1st July 2001 and 30th June 2006 inclusive. Results showed that this simplified application of ARIA+ aligned with previous definitions such as speed limit, while also providing further delineation. Differences in crash contributing factors were noted with increasing remoteness, such as a greater representation of alcohol and ‘excessive speed for circumstances’. Other factors such as the predominance of younger drivers in crashes differed little by remoteness classification. The results are discussed in terms of the utility of remoteness as a graduated rather than binary (rural/urban) construct and the potential for combining ARIA crash data with census and hospital datasets.
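The grouping of area scores into five classes, followed by a crash count per class, can be sketched in a few lines of Python. This is an illustrative sketch, not the paper's procedure: the class names and the cut-off values below are assumptions based on commonly cited remoteness area definitions and should be checked against the official ARIA+ documentation, and the function names and area identifiers are hypothetical.

```python
from bisect import bisect_left
from collections import Counter

# ASSUMPTION: inclusive upper bounds of the first four classes on a 0-15 scale.
UPPER_BOUNDS = [0.2, 2.4, 5.92, 10.53]
# ASSUMPTION: class labels, ordered by increasing remoteness.
CLASSES = [
    "Major Cities",
    "Inner Regional",
    "Outer Regional",
    "Remote",
    "Very Remote",
]


def remoteness_class(score):
    """Map an area's remoteness score to one of the five classes."""
    if not 0.0 <= score <= 15.0:
        raise ValueError(f"score out of range: {score}")
    # bisect_left keeps a score equal to a bound in the lower class.
    return CLASSES[bisect_left(UPPER_BOUNDS, score)]


def crashes_by_class(crash_areas, area_scores):
    """Count crashes per class, given each crash's area identifier."""
    return Counter(remoteness_class(area_scores[area]) for area in crash_areas)


if __name__ == "__main__":
    scores = {"area_a": 0.1, "area_b": 11.0}  # hypothetical areas
    print(crashes_by_class(["area_a", "area_a", "area_b"], scores))
```

Because each crash only needs the identifier of the area it occurred in, no precise crash coordinates or mapping software are required, which mirrors the simplification the abstract describes.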
Abstract:
Research has shown that road lane width impacts on driver behaviour. This literature review provides guidelines to assist in the design, construction and retrofitting of urban roads to accommodate road users' safety requirements. It focuses on the impacts of lane widths on cyclists and motor vehicle safety behaviour. The literature review commenced with a search of library databases. Peer reviewed articles and road authority (local, state and national) reports were reviewed. The majority of studies investigating the effects of lane width on driver behaviour were simulator based, while research into cycling safety involved data collected from actual traffic environments. Results show that marked road lane width influences perceived task difficulty, risk perception and possibly speed choice. The positioning of cyclists in traffic lanes is influenced by the presence of on-road cycling facilities and the total roadway width. The lateral displacement between bicycle and vehicle is smallest when a bicycle facility is present. Lower, or reduced, vehicle speeds play a significant role in improving bicyclist and pedestrian safety. It is also shown that if road lane widths in urban areas were reduced, to a functional width that was less than the current guidelines of 3.5m, it could result in a safer road environment for all road users.
Abstract:
Classical negotiation models are weak in supporting real-world business negotiations because these models often assume that the preference information of each negotiator is made public. Although parametric learning methods have been proposed for acquiring the preference information of negotiation opponents, these methods suffer from the strong assumptions about the specific utility function and negotiation mechanism employed by the opponents. Consequently, it is difficult to apply these learning methods to the heterogeneous negotiation agents participating in e‑marketplaces. This paper illustrates the design, development, and evaluation of a nonparametric negotiation knowledge discovery method which is underpinned by the well-known Bayesian learning paradigm. According to our empirical testing, the novel knowledge discovery method can speed up the negotiation processes while maintaining negotiation effectiveness. To the best of our knowledge, this is the first nonparametric negotiation knowledge discovery method developed and evaluated in the context of multi-issue bargaining over e‑marketplaces.
Abstract:
It is a big challenge to clearly identify the boundary between positive and negative streams. Several attempts have used negative feedback to solve this challenge; however, there are two issues in using negative relevance feedback to improve the effectiveness of information filtering. The first is how to select constructive negative samples in order to reduce the space of negative documents. The second is how to decide which noisy extracted features should be updated based on the selected negative samples. This paper proposes a pattern mining based approach to select some offenders from the negative documents, where an offender can be used to reduce the side effects of noisy features. It also classifies extracted features (i.e., terms) into three categories: positive specific terms, general terms, and negative specific terms. In this way, multiple revising strategies can be used to update extracted features. An iterative learning algorithm is also proposed to implement this approach on RCV1, and substantial experiments show that the proposed approach achieves encouraging performance.
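The three term categories can be illustrated with a deliberately simplified Python sketch. It assigns a term by plain set membership (only in positive documents, only in negative documents, or in both), which is an assumption made for illustration and not the paper's pattern-based method; the function name and sample terms are hypothetical.

```python
def classify_terms(positive_docs, negative_docs):
    """Split terms into three categories by where they occur.

    Each document is an iterable of terms. This membership rule is a
    simplification: the paper derives its categories from mined patterns
    and selected offenders, not from raw occurrence alone.
    """
    pos_terms = set().union(*positive_docs)
    neg_terms = set().union(*negative_docs)
    return {
        "positive_specific": pos_terms - neg_terms,
        "general": pos_terms & neg_terms,
        "negative_specific": neg_terms - pos_terms,
    }


if __name__ == "__main__":
    positives = [["mining", "gold"], ["mining", "export"]]
    negatives = [["gold", "jewellery"]]
    for category, terms in classify_terms(positives, negatives).items():
        print(category, sorted(terms))
```

Keeping the categories separate is what allows a different revising strategy per category, for example boosting positive specific terms while down-weighting negative specific ones.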
Abstract:
To deal with the ever-growing information overload on the Internet, Recommender Systems are widely used online to suggest to potential customers items they may like or find useful. Collaborative Filtering is the most popular technique for Recommender Systems; it collects opinions from customers in the form of ratings on items, services or service providers. In addition to customer ratings of a service provider, a good deal of online customer feedback is available over the Internet as customer reviews, comments, newsgroup posts, discussion forums or blogs, collectively called user generated content. This information can be used to generate the public reputation of service providers. To do this, data mining techniques, especially the recently emerged opinion mining, could be a useful tool. In this paper we present a state of the art review of Opinion Mining from online customer feedback. We critically evaluate the existing work and expose cutting edge areas of interest in opinion mining. We also classify the approaches taken by different researchers into several categories and sub-categories. Each of these is analyzed with its strengths and limitations in this paper.
Abstract:
An information filtering (IF) system monitors an incoming document stream to find the documents that match the information needs specified by the user profiles. Learning to use the user profiles effectively is one of the most challenging tasks when developing an IF system. With the document selection criteria better defined based on the users’ needs, filtering large streams of information can be more efficient and effective. To learn the user profiles, term-based approaches have been widely used in the IF community because of their simplicity and directness. Term-based approaches are relatively well established. However, these approaches have problems when dealing with polysemy and synonymy, which often lead to an information overload problem. Recently, pattern-based approaches (or Pattern Taxonomy Models (PTM) [160]) have been proposed for IF by the data mining community. These approaches are better at capturing semantic information and have shown encouraging results for improving the effectiveness of the IF system. On the other hand, pattern discovery from large data streams is not computationally efficient. Also, these approaches have had to deal with low frequency pattern issues. The measures used by data mining techniques (for example, “support” and “confidence”) to learn the profile have turned out to be unsuitable for filtering. They can lead to a mismatch problem. This thesis uses rough set-based reasoning (term-based) and a pattern mining approach as a unified framework for information filtering to overcome the aforementioned problems. This system consists of two stages: topic filtering and pattern mining. The topic filtering stage is intended to minimize information overload by filtering out the most likely irrelevant information based on the user profiles. A novel user-profile learning method and a theoretical model of the threshold setting have been developed by using rough set decision theory.
The second stage (pattern mining) aims at solving the problem of information mismatch. This stage is precision-oriented. A new document-ranking function has been derived by exploiting the patterns in the pattern taxonomy. The most likely relevant documents were assigned higher scores by the ranking function. Because a relatively small number of documents is left after the first stage, the computational cost is markedly reduced; at the same time, pattern discovery yields more accurate results. The overall performance of the system was improved significantly. The new two-stage information filtering model has been evaluated by extensive experiments. Tests were based on the well-known IR benchmarking processes, using the latest version of the Reuters dataset, namely the Reuters Corpus Volume 1 (RCV1). The performance of the new two-stage model was compared with both term-based and data mining-based IF models. The results demonstrate that the proposed information filtering system significantly outperforms the other IF systems, such as the traditional Rocchio IF model, state-of-the-art term-based models including BM25 and Support Vector Machines (SVM), and the Pattern Taxonomy Model (PTM).
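The "support" and "confidence" measures mentioned in this abstract have standard definitions in pattern mining, sketched below in Python for documents treated as sets of terms. The function names and the sample documents are illustrative assumptions and are not taken from the thesis.

```python
def support(pattern, documents):
    """Fraction of documents that contain every term of the pattern."""
    if not documents:
        return 0.0
    pattern = set(pattern)
    hits = sum(1 for doc in documents if pattern <= set(doc))
    return hits / len(documents)


def confidence(antecedent, consequent, documents):
    """Estimate of P(consequent present | antecedent present)."""
    base = support(antecedent, documents)
    if base == 0.0:
        return 0.0
    joint = support(set(antecedent) | set(consequent), documents)
    return joint / base


if __name__ == "__main__":
    docs = [
        {"data", "mining"},
        {"data", "mining", "pattern"},
        {"data"},
        {"pattern"},
    ]
    print(support({"data", "mining"}, docs))
    print(confidence({"data"}, {"mining"}, docs))
```

Both measures describe how often a pattern occurs rather than how well it discriminates relevant from irrelevant documents, which is one way to read the mismatch problem the abstract raises.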
Abstract:
The wide range of contributing factors and circumstances surrounding crashes on road curves suggests that no single intervention can prevent these crashes. This paper presents a novel methodology, based on data mining techniques, to identify contributing factors and the relationships between them. It identifies contributing factors that influence the risk of a crash. Incident records, described using free text, from a large insurance company were analysed with rough set theory. Rough set theory was used to discover dependencies among data and to reason using the vague, uncertain and imprecise information that characterised the insurance dataset. The results show that male drivers between 50 and 59 years old who were involved in a collision while driving during evening peak hours had the lowest crash risk. Drivers between 25 and 29 years old, driving a new car from around midnight to 6 am, had the highest risk. The analysis of the most significant contributing factors on curves suggests that drivers with 25 to 42 years of driving experience who are driving a new vehicle have the highest crash cost risk, characterised by the vehicle running off the road and hitting a tree. This research complements existing statistically based approaches to analysing road crashes. Our data mining approach is supported by proven theory and will allow road safety practitioners to effectively understand the dependencies between contributing factors and the crash type, with a view to designing tailored countermeasures.
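Rough set theory handles imprecise data by bracketing a target set between a lower and an upper approximation. The Python sketch below computes both in the standard way: records that share the same attribute values are indiscernible, and each group is included according to how it overlaps the target. The record fields, values and the "high cost" target are hypothetical and are not drawn from the insurance dataset.

```python
from collections import defaultdict


def approximations(records, attributes, target):
    """Return (lower, upper) approximations of the target set of record ids.

    records: dict mapping record id -> dict of attribute values.
    attributes: attribute names used to decide indiscernibility.
    target: set of record ids belonging to the concept of interest.
    """
    # Group records that cannot be told apart on the chosen attributes.
    classes = defaultdict(set)
    for record_id, record in records.items():
        key = tuple(record[name] for name in attributes)
        classes[key].add(record_id)

    lower, upper = set(), set()
    for members in classes.values():
        if members <= target:
            lower |= members  # certainly in the concept
        if members & target:
            upper |= members  # possibly in the concept
    return lower, upper


if __name__ == "__main__":
    crashes = {  # hypothetical records
        1: {"age": "25-29", "time": "night"},
        2: {"age": "25-29", "time": "night"},
        3: {"age": "50-59", "time": "evening"},
        4: {"age": "25-29", "time": "evening"},
    }
    high_cost = {1, 3}
    print(approximations(crashes, ["age", "time"], high_cost))
```

Records 1 and 2 are indiscernible but only one is in the target, so they appear in the upper approximation only. That gap between the two approximations is how the theory represents uncertainty in the data.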
Abstract:
Despite all attempts to prevent fraud, it continues to be a major threat to industry and government. Traditionally, organizations have focused on fraud prevention rather than detection, to combat fraud. In this paper we present a role mining inspired approach to represent user behaviour in Enterprise Resource Planning (ERP) systems, primarily aimed at detecting opportunities to commit fraud or potentially suspicious activities. We have adapted an approach which uses set theory to create transaction profiles based on analysis of user activity records. Based on these transaction profiles, we propose a set of (1) anomaly types to detect potentially suspicious user behaviour and (2) scenarios to identify inadequate segregation of duties in an ERP environment. In addition, we present two algorithms to construct a directed acyclic graph to represent relationships between transaction profiles. Experiments were conducted using a real dataset obtained from a teaching environment and a demonstration dataset, both using SAP R/3, presently the most predominant ERP system. The results of this empirical research demonstrate the effectiveness of the proposed approach.
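A set-based view of transaction profiles makes a segregation-of-duties check straightforward to express. The Python sketch below is a simplified illustration and not the paper's algorithms: a profile is the set of transactions a user has executed, and a user is flagged when the profile contains both halves of a conflicting pair. The user names, transaction labels and conflict list are hypothetical and are not real SAP transaction codes.

```python
from collections import defaultdict


def build_profiles(activity_log):
    """Build one transaction profile (a set) per user from (user, transaction) records."""
    profiles = defaultdict(set)
    for user, transaction in activity_log:
        profiles[user].add(transaction)
    return dict(profiles)


def sod_violations(profiles, conflicting_pairs):
    """Return, per user, the conflicting pairs fully contained in that user's profile."""
    violations = {}
    for user, profile in profiles.items():
        hits = [pair for pair in conflicting_pairs if set(pair) <= profile]
        if hits:
            violations[user] = hits
    return violations


if __name__ == "__main__":
    log = [  # hypothetical activity records
        ("alice", "CREATE_VENDOR"),
        ("alice", "POST_PAYMENT"),
        ("bob", "POST_PAYMENT"),
    ]
    conflicts = [("CREATE_VENDOR", "POST_PAYMENT")]
    print(sod_violations(build_profiles(log), conflicts))
```

Because profiles are plain sets, subset relations between them are also cheap to test, which is the kind of relationship a directed acyclic graph over profiles could record.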
Abstract:
The misuse of alcohol is well documented in Australia and has been associated with disorders and harms that often require police attention. The extent of alcohol-related incidents requiring police attention has been recorded as substantial in some Australian cities (Arro, Crook, & Fenton, 1992; Davey & French, 1995; Ireland & Thommeny, 1993). A significant proportion of harmful drinking occurs in and around licensed premises (Jochelson, 1997; Stockwell, Masters, Phillips, Daly, Gahegan, Midford, & Philp, 1998; Borges, Cherpitel, & Rosovsky, 1998) and most of these incidents are not reported to police (Bryant & Williams, 2000; Lister, Hobbs, Hall, & Winlow, 2000). Alcohol-related incidents have also been found to be concentrated in certain places at certain times (Jochelson, 1997) and therefore manipulating the context in which these incidents occur may provide a means to prevent and reduce the harm associated with alcohol misuse. One of the major objectives of the present program of research was to investigate the occurrence and resource impact of alcohol-related incidents on operational (general duties) policing across a large geographical area. A second objective of the thesis was to examine the characteristics and temporal/spatial dynamics of police attended alcohol incidents in the context of Place Based theories of crime. It was envisaged that this approach would reveal the patterns of the most prevalent offences and demonstrate the relevance of Place Based theories of crime to understanding these patterns. In addition, the role of alcohol, time and place were also explored in order to examine the association between non criminal traffic offences and other types of criminal offences. A final objective of the thesis was to examine the impact of a situational crime prevention strategy that had been initiated to reduce the violence and disorder associated with late-night liquor trading premises. 
The program of research in this doctoral thesis has been undertaken through the presentation of published papers. The research was conducted in three stages which produced six manuscripts, five of which were submitted to peer reviewed journals and one of which was published in peer reviewed conference proceedings. Stage One included two studies (Studies 1 & 2), both of which involved a cross sectional approach to examine the prevalence and characteristics of alcohol-related incidents requiring police attendance across three large geographical areas that included metropolitan cities, provincial regions and rural areas. Stage Two of the program of research also comprised two cross sectional quantitative studies (Studies 3 & 4) that investigated the temporal and spatial dynamics of the major offence categories attended by operational police in a specific Police District (Gold Coast). Stage Three of the program of research involved two studies (Studies 5 & 6) that assessed the effectiveness of a situational crime prevention strategy. The studies employed a pre-post design to assess the impact on crime, disorder and violence of preventing patrons from entering late-night liquor trading premises between 3 a.m. and 5 a.m. (lockout policy). Although Study Five was solely quantitative in nature, Study Six included both quantitative and qualitative aspects. The approach adopted in Study Six therefore facilitated not only a quantitative comparison of the impact of the lockout policy on different policing areas, but also enabled the processes related to the implementation of the lockout policy to be examined. The thesis reports a program of research involving a common data collection method, followed by a series of studies conducted to explore different aspects of the data. The data was collected from three sources. Firstly, a pilot phase was undertaken to provide participants with training.
Secondly a main study period was undertaken immediately following the pilot phase. The first and second sources of data were collected between 29th March 2004 and 2nd May 2004. Thirdly, additional data was collected between the 1st April 2005 and 31st May 2005. Participants in the current program of research were first response operational police officers who completed a modified activity log over a 9 week period (4 week pilot phase & 5 week survey study phase), identifying the type, prevalence and characteristics of alcohol-related incidents that were attended. During the study period police officers attended 31,090 alcohol-related incidents. Studies One and Two revealed that a substantial proportion of current police work involves attendance at alcohol-related incidents (i.e., 25% largely involving young males aged between 17 and 24 years). The most common incidents police attended were vehicle and/or traffic matters, disturbances and offences against property. The major category of offences most likely to involve alcohol included vehicle/traffic matters, disturbances and offences against the person (e.g., common & serious assaults). These events were most likely to occur in the late evenings and early hours of the morning on the weekends, and importantly, usually took longer for police to complete than non alcohol-related incidents. The findings in Studies Three and Four suggest that serious traffic offences, disturbances and offences against the person share similar characteristics and occur in concentrated places at similar times. In addition, it was found that time, place and incident type all have an influence on whether an incident attended by a police officer is alcohol-related. Alcohol-related incidents are more likely to occur in particular locations in the late evenings and early mornings on the weekends. 
In particular, there was a strong association between the occurrence of alcohol-related disturbances and alcohol-related serious traffic offences in regards to place and time. In general, stealing and property offences were not alcohol-related and occurred in daylight hours during weekdays. The results of Studies Five and Six were mixed. A number of alcohol-related offences requiring police attention were significantly reduced for some policing areas and for some types of offences following the implementation of the lockout policy. However, in some locations the lockout policy appeared to have a negative or minimal impact. Interviews with licensees revealed that although all were initially opposed to the lockout policy as they believed it would have a negative impact on business, most perceived some benefits from its introduction. Some of the benefits included, improved patron safety and the development of better business strategies to increase patron numbers. In conclusion, the overall findings of the six studies highlight the pervasive nature of alcohol across a range of criminal incidents, demonstrating the tremendous impact alcohol-related incidents have on police. The findings also demonstrate the importance of time and place in predicting the occurrence of alcohol-related offences. Although this program of research did not set out to test Place Based theories of crime, these theories were used to inform the interpretation of findings. The findings in the current research program provide evidence for the relevance of Place Based theories of crime to understanding the factors contributing to violence and disorder, and designing relevant crime prevention strategies. For instance, the results in Studies Five and Six provide supportive evidence that this novel lockout initiative can be beneficial for public safety by reducing some types of offences in particular areas in and around late-night liquor trading premises. 
Finally, intelligence-led policing initiatives based on problem-oriented policing, such as the lockout policy examined in this thesis, have potential as a major crime prevention technique to reduce specific types of alcohol-related offences.