868 resultados para Data mining
Resumo:
Experience plays an important role in building management. “How often will this asset need repair?” or “How much time is this repair going to take?” are types of questions that project and facility managers face daily in planning activities. Failure or success in developing good schedules, budgets and other project management tasks depend on the project manager's ability to obtain reliable information to be able to answer these types of questions. Young practitioners tend to rely on information that is based on regional averages and provided by publishing companies. This is in contrast to experienced project managers who tend to rely heavily on personal experience. Another aspect of building management is that many practitioners are seeking to improve available scheduling algorithms, estimating spreadsheets and other project management tools. Such “micro-scale” levels of research are important in providing the required tools for the project manager's tasks. However, even with such tools, low quality input information will produce inaccurate schedules and budgets as output. Thus, it is also important to have a broad approach to research at a more “macro-scale.” Recent trends show that the Architectural, Engineering, Construction (AEC) industry is experiencing explosive growth in its capabilities to generate and collect data. There is a great deal of valuable knowledge that can be obtained from the appropriate use of this data and therefore the need has arisen to analyse this increasing amount of available data. Data Mining can be applied as a powerful tool to extract relevant and useful information from this sea of data. Knowledge Discovery in Databases (KDD) and Data Mining (DM) are tools that allow identification of valid, useful, and previously unknown patterns so large amounts of project data may be analysed. These technologies combine techniques from machine learning, artificial intelligence, pattern recognition, statistics, databases, and visualization to automatically extract concepts, interrelationships, and patterns of interest from large databases. The project involves the development of a prototype tool to support facility managers, building owners and designers. This Industry focused report presents the AIMMTM prototype system and documents how and what data mining techniques can be applied, the results of their application and the benefits gained from the system. The AIMMTM system is capable of searching for useful patterns of knowledge and correlations within the existing building maintenance data to support decision making about future maintenance operations. The application of the AIMMTM prototype system on building models and their maintenance data (supplied by industry partners) utilises various data mining algorithms and the maintenance data is analysed using interactive visual tools. The application of the AIMMTM prototype system to help in improving maintenance management and building life cycle includes: (i) data preparation and cleaning, (ii) integrating meaningful domain attributes, (iii) performing extensive data mining experiments in which visual analysis (using stacked histograms), classification and clustering techniques, associative rule mining algorithm such as “Apriori” and (iv) filtering and refining data mining results, including the potential implications of these results for improving maintenance management. Maintenance data of a variety of asset types were selected for demonstration with the aim of discovering meaningful patterns to assist facility managers in strategic planning and provide a knowledge base to help shape future requirements and design briefing. Utilising the prototype system developed here, positive and interesting results regarding patterns and structures of data have been obtained.
Resumo:
The project has further developed two programs for the industry partners related to service life prediction and salt deposition. The program for Queensland Department of Main Roads which predicts salt deposition on different bridge structures at any point in Queensland has been further refined by looking at more variables. It was found that the height of the bridge significantly affects the salt deposition levels only when very close to the coast. However the effect of natural cleaning of salt by rainfall was incorporated into the program. The user interface allows selection of a location in Queensland, followed by a bridge component. The program then predicts the annual salt deposition rate and rates the likely severity of the environment. The service life prediction program for the Queensland Department of Public Works has been expanded to include 10 common building components, in a variety of environments. Data mining procedures have been used to develop the program and increase the usefulness of the application. A Query Based Learning System (QBLS) has been developed which is based on a data-centric model with extensions to provide support for user interaction. The program is based on number of sources of information about the service life of building components. These include the Delphi survey, the CSIRO Holistic model and a school survey. During the project, the Holistic model was modified for each building component and databases generated for the locations of all Queensland schools. Experiments were carried out to verify and provide parameters for the modelling. These included instrumentation of a downpipe, measurements on pH and chloride levels in leaf litter, EIS measurements and chromate leaching from Colorbond materials and dose tests to measure corrosion rates of new materials. A further database was also generated for inclusion in the program through a large school survey. Over 30 schools in a range of environments from tropical coastal to temperate inland were visited and the condition of the building components rated on a scale of 0-5. The data was analysed and used to calculate an average service life for each component/material combination in the environments, where sufficient examples were available.
Resumo:
Real-World Data Mining Applications generally do not end up with the creation of the models. The use of the model is the final purpose especially in prediction tasks. The problem arises when the model is built based on much more information than that the user can provide in using the model. As a result, the performance of model reduces drastically due to many missing attributes values. This paper develops a new learning system framework, called as User Query Based Learning System (UQBLS), for building data mining models best suitable for users use. We demonstrate its deployment in a real-world application of the lifetime prediction of metallic components in buildings
Resumo:
AIMM stands for 'Agents for Improved Maintenance Management.' The AIMM system is a prototype tool that has developed the state of the art life cycle modelling of buildings through the linking of a 3D model with maintenance data to allow both the facility manager and the designer to gain access to building maintenance information and knowledge that is currently inaccessible. AIMM integrates data mining agents into the maintenance process to produce timely data for the facility manager on the effects of different maintenance regimes.
Resumo:
This paper describes the approach taken to the XML Mining track at INEX 2008 by a group at the Queensland University of Technology. We introduce the K-tree clustering algorithm in an Information Retrieval context by adapting it for document clustering. Many large scale problems exist in document clustering. K-tree scales well with large inputs due to its low complexity. It offers promising results both in terms of efficiency and quality. Document classification was completed using Support Vector Machines.
Resumo:
Automatic detection of suspicious activities in CCTV camera feeds is crucial to the success of video surveillance systems. Such a capability can help transform the dumb CCTV cameras into smart surveillance tools for fighting crime and terror. Learning and classification of basic human actions is a precursor to detecting suspicious activities. Most of the current approaches rely on a non-realistic assumption that a complete dataset of normal human actions is available. This paper presents a different approach to deal with the problem of understanding human actions in video when no prior information is available. This is achieved by working with an incomplete dataset of basic actions which are continuously updated. Initially, all video segments are represented by Bags-Of-Words (BOW) method using only Term Frequency-Inverse Document Frequency (TF-IDF) features. Then, a data-stream clustering algorithm is applied for updating the system's knowledge from the incoming video feeds. Finally, all the actions are classified into different sets. Experiments and comparisons are conducted on the well known Weizmann and KTH datasets to show the efficacy of the proposed approach.
Resumo:
Drivers' ability to react to unpredictable events deteriorates when exposed to highly predictable and uneventful driving tasks. Particularly, highway design reduces the driving task mainly to a lane-keeping one. It contributes to hypovigilance and road crashes as drivers are often not aware that their driving behaviour is impaired. Monotony increases fatigue, however, the fatigue community has mainly focused on endogenous factors leading to fatigue such as sleep deprivation. This paper focuses on the exogenous factor monotony which contributes to hypovigilance. Objective measurements of the effects of monotonous driving conditions on the driver and the vehicle's dynamics is systematically reviewed with the aim of justifying the relevance of the need for a mathematical framework that could predict hypovigilance in real-time. Although electroencephalography (EEG) is one of the most reliable measures of vigilance, it is obtrusive. This suggests to predict from observable variables the time when the driver is hypovigilant. Outlined is a vision for future research in the modelling of driver vigilance decrement due to monotonous driving conditions. A mathematical model for predicting drivers’ hypovigilance using information like lane positioning, steering wheel movements and eye blinks is provided. Such a modelling of driver vigilance should enable the future development of an in-vehicle device that detects driver hypovigilance in advance, thus offering the potential to enhance road safety and prevent road crashes.
Resumo:
This paper describes a novel framework for facial expression recognition from still images by selecting, optimizing and fusing ‘salient’ Gabor feature layers to recognize six universal facial expressions using the K nearest neighbor classifier. The recognition comparisons with all layer approach using JAFFE and Cohn-Kanade (CK) databases confirm that using ‘salient’ Gabor feature layers with optimized sizes can achieve better recognition performance and dramatically reduce computational time. Moreover, comparisons with the state of the art performances demonstrate the effectiveness of our approach.
Resumo:
In condition-based maintenance (CBM), effective diagnostics and prognostics are essential tools for maintenance engineers to identify imminent fault and to predict the remaining useful life before the components finally fail. This enables remedial actions to be taken in advance and reschedules production if necessary. This paper presents a technique for accurate assessment of the remnant life of machines based on historical failure knowledge embedded in the closed loop diagnostic and prognostic system. The technique uses the Support Vector Machine (SVM) classifier for both fault diagnosis and evaluation of health stages of machine degradation. To validate the feasibility of the proposed model, the five different level data of typical four faults from High Pressure Liquefied Natural Gas (HP-LNG) pumps were used for multi-class fault diagnosis. In addition, two sets of impeller-rub data were analysed and employed to predict the remnant life of pump based on estimation of health state. The results obtained were very encouraging and showed that the proposed prognosis system has the potential to be used as an estimation tool for machine remnant life prediction in real life industrial applications.
Resumo:
Intelligent software agents are promising in improving the effectiveness of e-marketplaces for e-commerce. Although a large amount of research has been conducted to develop negotiation protocols and mechanisms for e-marketplaces, existing negotiation mechanisms are weak in dealing with complex and dynamic negotiation spaces often found in e-commerce. This paper illustrates a novel knowledge discovery method and a probabilistic negotiation decision making mechanism to improve the performance of negotiation agents. Our preliminary experiments show that the probabilistic negotiation agents empowered by knowledge discovery mechanisms are more effective and efficient than the Pareto optimal negotiation agents in simulated e-marketplaces.
Resumo:
Over the years, people have often held the hypothesis that negative feedback should be very useful for largely improving the performance of information filtering systems; however, we have not obtained very effective models to support this hypothesis. This paper, proposes an effective model that use negative relevance feedback based on a pattern mining approach to improve extracted features. This study focuses on two main issues of using negative relevance feedback: the selection of constructive negative examples to reduce the space of negative examples; and the revision of existing features based on the selected negative examples. The former selects some offender documents, where offender documents are negative documents that are most likely to be classified in the positive group. The later groups the extracted features into three groups: the positive specific category, general category and negative specific category to easily update the weight. An iterative algorithm is also proposed to implement this approach on RCV1 data collections, and substantial experiments show that the proposed approach achieves encouraging performance.
Resumo:
Random Indexing K-tree is the combination of two algorithms suited for large scale document clustering.
Resumo:
This article explores two matrix methods to induce the ``shades of meaning" (SoM) of a word. A matrix representation of a word is computed from a corpus of traces based on the given word. Non-negative Matrix Factorisation (NMF) and Singular Value Decomposition (SVD) compute a set of vectors corresponding to a potential shade of meaning. The two methods were evaluated based on loss of conditional entropy with respect to two sets of manually tagged data. One set reflects concepts generally appearing in text, and the second set comprises words used for investigations into word sense disambiguation. Results show that for NMF consistently outperforms SVD for inducing both SoM of general concepts as well as word senses. The problem of inducing the shades of meaning of a word is more subtle than that of word sense induction and hence relevant to thematic analysis of opinion where nuances of opinion can arise.
Resumo:
We argue that web service discovery technology should help the user navigate a complex problem space by providing suggestions for services which they may not be able to formulate themselves as (s)he lacks the epistemic resources to do so. Free text documents in service environments provide an untapped source of information for augmenting the epistemic state of the user and hence their ability to search effectively for services. A quantitative approach to semantic knowledge representation is adopted in the form of semantic space models computed from these free text documents. Knowledge of the user’s agenda is promoted by associational inferences computed from the semantic space. The inferences are suggestive and aim to promote human abductive reasoning to guide the user from fuzzy search goals into a better understanding of the problem space surrounding the given agenda. Experimental results are discussed based on a complex and realistic planning activity.
Resumo:
This paper proposes a novel Hybrid Clustering approach for XML documents (HCX) that first determines the structural similarity in the form of frequent subtrees and then uses these frequent subtrees to represent the constrained content of the XML documents in order to determine the content similarity. The empirical analysis reveals that the proposed method is scalable and accurate.