991 results for Data-cleaning
Abstract:
With the advent of Service Oriented Architecture, Web Services have gained tremendous popularity. Due to the availability of a large number of Web services, finding an appropriate Web service according to the requirement of the user is a challenge. This warrants the need to establish an effective and reliable process of Web service discovery. A considerable body of research has emerged to develop methods to improve the accuracy of Web service discovery to match the best service. The process of Web service discovery results in suggesting many individual services that partially fulfil the user’s interest. Considering the semantic relationships of the words used in describing the services, as well as the input and output parameters, can lead to accurate Web service discovery. Appropriate linking of individual matched services should fully satisfy the requirements of the user. This research proposes to integrate a semantic model and a data mining technique to enhance the accuracy of Web service discovery. A novel three-phase Web service discovery methodology has been proposed. The first phase performs match-making to find semantically similar Web services for a user query. In order to perform semantic analysis on the content present in the Web service description language document, the support-based latent semantic kernel is constructed using an innovative concept of binning and merging on a large quantity of text documents covering diverse domains of knowledge. The use of a generic latent semantic kernel constructed with a large number of terms helps to find the hidden meaning of the query terms which otherwise could not be found. Sometimes a single Web service is unable to fully satisfy the requirement of the user. In such cases, a composition of multiple inter-related Web services is presented to the user. The task of checking the possibility of linking multiple Web services is done in the second phase. 
Once the feasibility of linking Web services is checked, the objective is to provide the user with the best composition of Web services. In the link analysis phase, the Web services are modelled as nodes of a graph and an all-pairs shortest-path algorithm is applied to find the optimum path at the minimum cost for traversal. The third phase, system integration, integrates the results from the preceding two phases by using an original fusion algorithm in the fusion engine. Finally, the recommendation engine, an integral part of the system integration phase, makes the final recommendations, including individual and composite Web services, to the user. In order to evaluate the performance of the proposed method, extensive experimentation has been performed. Results of the proposed support-based semantic kernel method of Web service discovery are compared with the results of the standard keyword-based information-retrieval method and a clustering-based machine-learning method of Web service discovery. The proposed method outperforms both information-retrieval and machine-learning based methods. Experimental results and statistical analysis also show that the best Web service compositions are obtained by considering 10 to 15 Web services that are found in phase-I for linking. Empirical results also ascertain that the fusion engine boosts the accuracy of Web service discovery by combining the inputs from both the semantic analysis (phase-I) and the link analysis (phase-II) in a systematic fashion. Overall, the accuracy of Web service discovery with the proposed method shows a significant improvement over traditional discovery methods.
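The link analysis phase described above can be illustrated with a small sketch. The abstract does not name the specific all-pairs algorithm, so Floyd-Warshall is used here as one standard choice; the service names and linking costs are hypothetical and serve only as an example.

```python
import math

def all_pairs_shortest_paths(nodes, edges):
    """Floyd-Warshall over a directed graph.

    edges maps (src, dst) -> non-negative linking cost.
    Returns (dist, nxt), where nxt[u][v] is the next hop on a cheapest path.
    """
    dist = {u: {v: (0.0 if u == v else math.inf) for v in nodes} for u in nodes}
    nxt = {u: {v: None for v in nodes} for u in nodes}
    for (u, v), cost in edges.items():
        if cost < dist[u][v]:
            dist[u][v] = cost
            nxt[u][v] = v
    for k in nodes:            # allow k as an intermediate service
        for i in nodes:
            for j in nodes:
                via = dist[i][k] + dist[k][j]
                if via < dist[i][j]:
                    dist[i][j] = via
                    nxt[i][j] = nxt[i][k]
    return dist, nxt

def composition(nxt, src, dst):
    """Rebuild the cheapest chain of services from src to dst ([] if none)."""
    if src != dst and nxt[src][dst] is None:
        return []
    path = [src]
    while src != dst:
        src = nxt[src][dst]
        path.append(src)
    return path

# Hypothetical services; an edge means the output of one can feed the next.
services = ["geocode", "weather", "forecast", "alert"]
links = {
    ("geocode", "weather"): 1.0,
    ("weather", "forecast"): 1.0,
    ("geocode", "forecast"): 3.0,
    ("forecast", "alert"): 1.0,
}
dist, nxt = all_pairs_shortest_paths(services, links)
best = composition(nxt, "geocode", "alert")
```

In this toy graph the direct geocode-to-forecast link costs more than going through the weather service, so the cheapest composition passes through all four services.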
Abstract:
Experience plays an important role in building management. “How often will this asset need repair?” or “How much time is this repair going to take?” are types of questions that project and facility managers face daily in planning activities. Failure or success in developing good schedules, budgets and other project management tasks depends on the project manager's ability to obtain reliable information to be able to answer these types of questions. Young practitioners tend to rely on information that is based on regional averages and provided by publishing companies. This is in contrast to experienced project managers who tend to rely heavily on personal experience. Another aspect of building management is that many practitioners are seeking to improve available scheduling algorithms, estimating spreadsheets and other project management tools. Such “micro-scale” levels of research are important in providing the required tools for the project manager's tasks. However, even with such tools, low quality input information will produce inaccurate schedules and budgets as output. Thus, it is also important to have a broad approach to research at a more “macro-scale.” Recent trends show that the Architectural, Engineering, Construction (AEC) industry is experiencing explosive growth in its capabilities to generate and collect data. There is a great deal of valuable knowledge that can be obtained from the appropriate use of this data and therefore the need has arisen to analyse this increasing amount of available data. Data Mining can be applied as a powerful tool to extract relevant and useful information from this sea of data. Knowledge Discovery in Databases (KDD) and Data Mining (DM) are tools that allow identification of valid, useful, and previously unknown patterns so large amounts of project data may be analysed. 
These technologies combine techniques from machine learning, artificial intelligence, pattern recognition, statistics, databases, and visualization to automatically extract concepts, interrelationships, and patterns of interest from large databases. The project involves the development of a prototype tool to support facility managers, building owners and designers. This final report presents the AIMM™ prototype system and documents how and what data mining techniques can be applied, the results of their application and the benefits gained from the system. The AIMM™ system is capable of searching for useful patterns of knowledge and correlations within the existing building maintenance data to support decision making about future maintenance operations. The application of the AIMM™ prototype system on building models and their maintenance data (supplied by industry partners) utilises various data mining algorithms, and the maintenance data is analysed using interactive visual tools. The application of the AIMM™ prototype system to help in improving maintenance management and building life cycle includes: (i) data preparation and cleaning, (ii) integrating meaningful domain attributes, (iii) performing extensive data mining experiments using visual analysis (stacked histograms), classification and clustering techniques, and association rule mining algorithms such as “Apriori”, and (iv) filtering and refining data mining results, including the potential implications of these results for improving maintenance management. Maintenance data of a variety of asset types were selected for demonstration with the aim of discovering meaningful patterns to assist facility managers in strategic planning and provide a knowledge base to help shape future requirements and design briefing. Utilising the prototype system developed here, positive and interesting results regarding patterns and structures of data have been obtained.
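As an illustration of the association rule mining step mentioned above, the following is a minimal Apriori-style sketch in plain Python. It is not the prototype's implementation; the maintenance records, attribute names and support threshold are hypothetical.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search.

    Returns {frozenset: support}, where support is the fraction of
    transactions that contain the itemset.
    """
    n = len(transactions)
    sets = [frozenset(t) for t in transactions]
    current = {frozenset([item]) for t in sets for item in t}
    frequent = {}
    k = 1
    while current:
        counts = {c: sum(1 for t in sets if c <= t) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: combine frequent itemsets into candidates of size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]

# Hypothetical, already-cleaned work orders encoded as attribute=value items.
records = [
    {"asset=air_handler", "fault=filter", "priority=low"},
    {"asset=air_handler", "fault=filter", "priority=low"},
    {"asset=air_handler", "fault=belt", "priority=high"},
    {"asset=pump", "fault=seal", "priority=high"},
]
frequent = frequent_itemsets(records, min_support=0.5)
rule_conf = confidence(frequent, {"fault=filter"}, {"priority=low"})
```

With these toy records, every filter fault is a low-priority job, so the rule "fault=filter implies priority=low" has confidence 1.0. Real maintenance data would need the preparation and cleaning step first, for example normalising asset names before encoding them as items.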
Abstract:
The construction industry has adopted information technology in its processes in the form of computer aided design and drafting, construction documentation and maintenance. The data generated within the construction industry has become increasingly overwhelming. Data mining is a sophisticated data search capability that uses classification algorithms to discover patterns and correlations within a large volume of data. This paper presents the selection and application of data mining techniques on maintenance data of buildings. The results of applying such techniques, and the potential benefits of using those results to identify useful patterns of knowledge and correlations that support decision making for improving the management of the building life cycle, are presented and discussed.
Using Agents for Mining Maintenance Data while interacting in 3D Object-oriented Virtual Environments
Abstract:
This report demonstrates the development of: (a) an object-oriented representation that provides a 3D interactive environment using data provided by Woods Bagot; (b) a basis for agent technology for mining building maintenance data; and (c) 3D interaction in virtual environments using the object-oriented representation. The application of data mining to an industry maintenance database was demonstrated in the previous report.
Abstract:
This report demonstrates:
• Development of software agents for data mining
• Linking of data mining to the building model in virtual environments
• Linking of knowledge development with the building model in virtual environments
• Demonstration of software agents for data mining
• Population with maintenance data
Abstract:
Experience plays an important role in building management. “How often will this asset need repair?” or “How much time is this repair going to take?” are types of questions that project and facility managers face daily in planning activities. Failure or success in developing good schedules, budgets and other project management tasks depends on the project manager's ability to obtain reliable information to be able to answer these types of questions. Young practitioners tend to rely on information that is based on regional averages and provided by publishing companies. This is in contrast to experienced project managers who tend to rely heavily on personal experience. Another aspect of building management is that many practitioners are seeking to improve available scheduling algorithms, estimating spreadsheets and other project management tools. Such “micro-scale” levels of research are important in providing the required tools for the project manager's tasks. However, even with such tools, low quality input information will produce inaccurate schedules and budgets as output. Thus, it is also important to have a broad approach to research at a more “macro-scale.” Recent trends show that the Architectural, Engineering, Construction (AEC) industry is experiencing explosive growth in its capabilities to generate and collect data. There is a great deal of valuable knowledge that can be obtained from the appropriate use of this data and therefore the need has arisen to analyse this increasing amount of available data. Data Mining can be applied as a powerful tool to extract relevant and useful information from this sea of data. Knowledge Discovery in Databases (KDD) and Data Mining (DM) are tools that allow identification of valid, useful, and previously unknown patterns so large amounts of project data may be analysed. 
These technologies combine techniques from machine learning, artificial intelligence, pattern recognition, statistics, databases, and visualization to automatically extract concepts, interrelationships, and patterns of interest from large databases. The project involves the development of a prototype tool to support facility managers, building owners and designers. This industry-focused report presents the AIMM™ prototype system and documents how and what data mining techniques can be applied, the results of their application and the benefits gained from the system. The AIMM™ system is capable of searching for useful patterns of knowledge and correlations within the existing building maintenance data to support decision making about future maintenance operations. The application of the AIMM™ prototype system on building models and their maintenance data (supplied by industry partners) utilises various data mining algorithms, and the maintenance data is analysed using interactive visual tools. The application of the AIMM™ prototype system to help in improving maintenance management and building life cycle includes: (i) data preparation and cleaning, (ii) integrating meaningful domain attributes, (iii) performing extensive data mining experiments using visual analysis (stacked histograms), classification and clustering techniques, and association rule mining algorithms such as “Apriori”, and (iv) filtering and refining data mining results, including the potential implications of these results for improving maintenance management. Maintenance data of a variety of asset types were selected for demonstration with the aim of discovering meaningful patterns to assist facility managers in strategic planning and provide a knowledge base to help shape future requirements and design briefing. Utilising the prototype system developed here, positive and interesting results regarding patterns and structures of data have been obtained.
Abstract:
The building life cycle process is complex and prone to fragmentation as it moves through its various stages. The number of participants, and the diversity, specialisation and isolation both in space and time of their activities, have dramatically increased over time. The data generated within the construction industry has become increasingly overwhelming. Most currently available computer tools for the building industry have offered productivity improvement in the transmission of graphical drawings and textual specifications, without addressing more fundamental changes in building life cycle management. Facility managers and building owners are primarily concerned with highlighting areas of existing or potential maintenance problems in order to improve building performance, satisfy occupants and minimise turnover and, in particular, the operational cost of maintenance. In doing so, they collect large amounts of data that are stored in the building’s maintenance database. The work described in this paper is targeted at adding value to the design and maintenance of buildings by turning maintenance data into information and knowledge. Data mining technology presents an opportunity to increase significantly the rate at which the volumes of data generated through the maintenance process can be turned into useful information. This can be done using classification algorithms to discover patterns and correlations within a large volume of data. This paper presents how and what data mining techniques can be applied on maintenance data of buildings to identify the impediments to better performance of building assets. It demonstrates what sorts of knowledge can be found in maintenance records. The benefits to the construction industry lie in turning passive data in databases into knowledge that can improve the efficiency of the maintenance process and of future designs that incorporate that maintenance knowledge.
Abstract:
Qualitative research methods require transparency to ensure the ‘trustworthiness’ of the data analysis. The intricate processes of organizing, coding and analyzing the data are often rendered invisible in the presentation of the research findings, which requires a ‘leap of faith’ for the reader. Computer assisted data analysis software can be used to make the research process more transparent, without sacrificing rich, interpretive analysis by the researcher. This article describes in detail how one software package was used in a poststructural study to link and code multiple forms of data to four research questions for fine-grained analysis. This description will be useful for researchers seeking to use qualitative data analysis software as an analytic tool.
Abstract:
This project, as part of a broader Sustainable Sub-divisions research agenda, addresses the role of natural ventilation in reducing the use of energy required to cool dwellings.
Abstract:
In the case of industrial relations research, particularly that which sets out to examine practices within workplaces, the best way to study this real-life context is to work for the organisation. Studies conducted by researchers working within the organisation comprise some of the (broad) field’s classic research (cf. Roy, 1954; Burawoy, 1979). Participant and non-participant ethnographic research provides an opportunity to investigate workplace behaviour beyond the scope of questionnaires and interviews. However, we suggest that the data collected outside a workplace can be just as important as the data collected inside the organisation’s walls. In recent years the introduction of anti-smoking legislation in Australia has meant that people who smoke cigarettes are no longer allowed to do so inside buildings. Not only are smokers forced outside to engage in their habit, but they have to smoke prescribed distances from doorways, or in some workplaces outside the property line. This chapter considers the importance of cigarette-smoking employees in ethnographic research. Through data collected across three separate research projects, the chapter argues that smokers, as social outcasts in the workplace, can provide a wealth of important research data. We suggest that smokers also appear more likely to provide stories that contradict the ‘management’ or ‘organisational’ position. Thus, within the haze of smoke, researchers can uncover a level of discontent with the ‘corporate line’ presented inside the workplace. There are several aspects to the increased propensity of smokers to provide a contradictory or discontented story. It may be that the researcher is better able to establish a rapport with smokers, as there is a removal of the artificial wall a researcher presents as an outsider. It may also be that a research location physically outside the boundaries of the organisation provides workers with the freedom to express their discontent. 
The authors offer no definitive answers; rather, this chapter is intended to extend our knowledge of workplace research by highlighting the methodological value of using smokers as research subjects. We present the experience of three separate case studies where interactions with cigarette smokers have provided either important organisational data or alternatively a means of entering what Cunnison (1966) referred to as the ‘gossip circle’. The final section of the chapter draws on the evidence to demonstrate how the community of smokers, as social outcasts, is valuable in investigating workplace issues. For researchers and practitioners, these social outcasts may very well prove to be an important barometer of employee attitudes; attitudes that perhaps cannot be measured through traditional staff surveys.
Abstract:
This project is an extension of a previous CRC project (220-059-B) which developed a program for life prediction of gutters in Queensland schools. A number of sources of information on service life of metallic building components were formed into databases linked to a Case-Based Reasoning Engine which extracted relevant cases from each source. In the initial software, no attempt was made to choose between the results offered or construct a case for retention in the casebase. In this phase of the project, alternative data mining techniques will be explored and evaluated. A process for selecting a unique service life prediction for each query will also be investigated. This report summarises the initial evaluation of several data mining techniques.
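The abstract does not say how a unique service life prediction would be selected from the retrieved cases. One possible approach, shown here purely as an assumption, is a similarity-weighted average over the most similar cases. The attribute names, weights and service life values below are invented for illustration and do not come from the project.

```python
def similarity(query, case, weights):
    """Weighted fraction of attributes on which the case matches the query."""
    total = sum(weights.values())
    matched = sum(w for attr, w in weights.items()
                  if query.get(attr) == case.get(attr))
    return matched / total

def predict_service_life(query, casebase, weights, k=3):
    """Similarity-weighted mean of 'life_years' over the k most similar cases.

    Returns None when no retrieved case shares any attribute with the query.
    """
    scored = sorted(
        ((similarity(query, case, weights), case) for case in casebase),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    weight_sum = sum(score for score, _ in scored)
    if weight_sum == 0:
        return None
    return sum(score * case["life_years"] for score, case in scored) / weight_sum

# Hypothetical cases for a metallic gutter; values are illustrative only.
casebase = [
    {"material": "galvanised", "environment": "marine", "life_years": 10.0},
    {"material": "galvanised", "environment": "inland", "life_years": 30.0},
    {"material": "zincalume", "environment": "marine", "life_years": 20.0},
]
weights = {"material": 1.0, "environment": 1.0}

estimate = predict_service_life(
    {"material": "galvanised", "environment": "marine"}, casebase, weights
)
no_match = predict_service_life(
    {"material": "copper", "environment": "alpine"}, casebase, weights
)
```

The exact match contributes with weight 1.0 and the two partial matches with weight 0.5 each, which yields a single estimate instead of three competing results.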
Abstract:
Maintenance of bridge structures is a major issue for the Queensland Department of Main Roads. In the previous phase of this CRC project an initial approach was made towards the development of a program for lifetime prediction of metallic bridge components. This involved the analysis of five representative bridge structures with respect to salt deposition (a major contributor to metallic corrosion) to determine common elements to be used as “cases”, since those defined for buildings are not applicable. The five bridges analysed were the Gladstone Port Access Road Overpass, Stewart Road Overpass, South Johnstone River Bridge, Johnson Creek Bridge and the Ward River Bridge.
Abstract:
A survey of schools in several different climates was carried out to determine the condition of building components of interest in the project. Schools in Melbourne, the Victorian Surf Coast, Brisbane, Townsville and the Sunshine Coast were inspected. A rating system was devised to categorise the components and the results were collated in tables. Analysis of the data (where sufficient examples permitted) resulted in formulae to predict the service life of the components, and a database was derived.