1000 resultados para Pattern Cleaning


Relevância:

60.00% 60.00%

Publicador:

Resumo:

This thesis is a study for automatic discovery of text features for describing user information needs. It presents an innovative data-mining approach that discovers useful knowledge from both relevance and non-relevance feedback information. The proposed approach can largely reduce noises in discovered patterns and significantly improve the performance of text mining systems. This study provides a promising method for the study of Data Mining and Web Intelligence.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

With the overwhelming increase in the amount of texts on the web, it is almost impossible for people to keep abreast of up-to-date information. Text mining is a process by which interesting information is derived from text through the discovery of patterns and trends. Text mining algorithms are used to guarantee the quality of extracted knowledge. However, the extracted patterns using text or data mining algorithms or methods leads to noisy patterns and inconsistency. Thus, different challenges arise, such as the question of how to understand these patterns, whether the model that has been used is suitable, and if all the patterns that have been extracted are relevant. Furthermore, the research raises the question of how to give a correct weight to the extracted knowledge. To address these issues, this paper presents a text post-processing method, which uses a pattern co-occurrence matrix to find the relation between extracted patterns in order to reduce noisy patterns. The main objective of this paper is not only reducing the number of closed sequential patterns, but also improving the performance of pattern mining as well. The experimental results on Reuters Corpus Volume 1 data collection and TREC filtering topics show that the proposed method is promising.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Birds that remove ectoparasites and other food material from their hosts are iconic illustrations of mutualistic-commensalistic cleaning associations. To assess the complex pattern of food resource use embedded in cleaning interactions of an assemblage of birds and their herbivorous mammal hosts in open habitats in Brazil, we used a network approach that characterized their patterns of association. Cleaning interactions showed a distinctly nested pattern, related to the number of interactions of cleaners and hosts and to the range of food types that each host species provided. Hosts that provided a wide range of food types (flies, ticks, tissue and blood, and organic debris) were attended by more species of cleaners and formed the core of the web. On the other hand, core cleaner species did not exploit the full range of available food resources, but used a variety of host species to exploit these resources instead. The structure that we found indicates that cleaners rely on cleaning interactions to obtain food types that would not be available otherwise (e.g., blood-engorged ticks or horseflies, wounded tissue). Additionally, a nested organization for the cleaner bird mammalian herbivore association means that both generalist and selective species take part in the interactions and that partners of selective species form an ordered subset of the partners of generalist species. The availability of predictable protein-rich food sources for birds provided by cleaning interactions may lead to an evolutionary pathway favoring their increased use by birds that forage opportunistically. Received 30 June 2011, accepted 10 November 2011.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The promise of Wireless Sensor Networks (WSNs) is the autonomous collaboration of a collection of sensors to accomplish some specific goals which a single sensor cannot offer. Basically, sensor networking serves a range of applications by providing the raw data as fundamentals for further analyses and actions. The imprecision of the collected data could tremendously mislead the decision-making process of sensor-based applications, resulting in an ineffectiveness or failure of the application objectives. Due to inherent WSN characteristics normally spoiling the raw sensor readings, many research efforts attempt to improve the accuracy of the corrupted or "dirty" sensor data. The dirty data need to be cleaned or corrected. However, the developed data cleaning solutions restrict themselves to the scope of static WSNs where deployed sensors would rarely move during the operation. Nowadays, many emerging applications relying on WSNs need the sensor mobility to enhance the application efficiency and usage flexibility. The location of deployed sensors needs to be dynamic. Also, each sensor would independently function and contribute its resources. Sensors equipped with vehicles for monitoring the traffic condition could be depicted as one of the prospective examples. The sensor mobility causes a transient in network topology and correlation among sensor streams. Based on static relationships among sensors, the existing methods for cleaning sensor data in static WSNs are invalid in such mobile scenarios. Therefore, a solution of data cleaning that considers the sensor movements is actively needed. This dissertation aims to improve the quality of sensor data by considering the consequences of various trajectory relationships of autonomous mobile sensors in the system. First of all, we address the dynamic network topology due to sensor mobility. The concept of virtual sensor is presented and used for spatio-temporal selection of neighboring sensors to help in cleaning sensor data streams. This method is one of the first methods to clean data in mobile sensor environments. We also study the mobility pattern of moving sensors relative to boundaries of sub-areas of interest. We developed a belief-based analysis to determine the reliable sets of neighboring sensors to improve the cleaning performance, especially when node density is relatively low. Finally, we design a novel sketch-based technique to clean data from internal sensors where spatio-temporal relationships among sensors cannot lead to the data correlations among sensor streams.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Enterprise Application Integration (EAI) is a challenging area that is attracting growing attention from the software industry and the research community. A landscape of languages and techniques for EAI has emerged and is continuously being enriched with new proposals from different software vendors and coalitions. However, little or no effort has been dedicated to systematically evaluate and compare these languages and techniques. The work reported in this paper is a first step in this direction. It presents an in-depth analysis of a language, namely the Business Modeling Language, specifically developed for EAI. The framework used for this analysis is based on a number of workflow and communication patterns. This framework provides a basis for evaluating the advantages and drawbacks of EAI languages with respect to recurrent problems and situations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Cat’s claw creeper, Macfadyena unguis-cati (L.) Gentry (Bignoniaceae) is a major environmental weed of riparian areas, rainforest communities and remnant natural vegetation in coastal Queensland and New South Wales, Australia. In densely infested areas, it smothers standing vegetation, including large trees, and causes canopy collapse. Quantitative data on the ecology of this invasive vine are generally lacking. The present study examines the underground tuber traits of M. unguis-cati and explores their links with aboveground parameters at five infested sites spanning both riparian and inland vegetation. Tubers were abundant in terms of density (~1000 per m2), although small in size and low in level of interconnectivity. M. unguis-cati also exhibits multiple stems per plant. Of all traits screened, the link between stand (stem density) and tuber density was the most significant and yielded a promising bivariate relationship for the purposes of estimation, prediction and management of what lies beneath the soil surface of a given M. unguis-cati infestation site. The study also suggests that new recruitment is primarily from seeds, not from vegetative propagation as previously thought. The results highlight the need for future biological-control efforts to focus on introducing specialist seed- and pod-feeding insects to reduce seed-output.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Experience plays an important role in building management. “How often will this asset need repair?” or “How much time is this repair going to take?” are types of questions that project and facility managers face daily in planning activities. Failure or success in developing good schedules, budgets and other project management tasks depend on the project manager's ability to obtain reliable information to be able to answer these types of questions. Young practitioners tend to rely on information that is based on regional averages and provided by publishing companies. This is in contrast to experienced project managers who tend to rely heavily on personal experience. Another aspect of building management is that many practitioners are seeking to improve available scheduling algorithms, estimating spreadsheets and other project management tools. Such “micro-scale” levels of research are important in providing the required tools for the project manager's tasks. However, even with such tools, low quality input information will produce inaccurate schedules and budgets as output. Thus, it is also important to have a broad approach to research at a more “macro-scale.” Recent trends show that the Architectural, Engineering, Construction (AEC) industry is experiencing explosive growth in its capabilities to generate and collect data. There is a great deal of valuable knowledge that can be obtained from the appropriate use of this data and therefore the need has arisen to analyse this increasing amount of available data. Data Mining can be applied as a powerful tool to extract relevant and useful information from this sea of data. Knowledge Discovery in Databases (KDD) and Data Mining (DM) are tools that allow identification of valid, useful, and previously unknown patterns so large amounts of project data may be analysed. These technologies combine techniques from machine learning, artificial intelligence, pattern recognition, statistics, databases, and visualization to automatically extract concepts, interrelationships, and patterns of interest from large databases. The project involves the development of a prototype tool to support facility managers, building owners and designers. This final report presents the AIMMTM prototype system and documents how and what data mining techniques can be applied, the results of their application and the benefits gained from the system. The AIMMTM system is capable of searching for useful patterns of knowledge and correlations within the existing building maintenance data to support decision making about future maintenance operations. The application of the AIMMTM prototype system on building models and their maintenance data (supplied by industry partners) utilises various data mining algorithms and the maintenance data is analysed using interactive visual tools. The application of the AIMMTM prototype system to help in improving maintenance management and building life cycle includes: (i) data preparation and cleaning, (ii) integrating meaningful domain attributes, (iii) performing extensive data mining experiments in which visual analysis (using stacked histograms), classification and clustering techniques, associative rule mining algorithm such as “Apriori” and (iv) filtering and refining data mining results, including the potential implications of these results for improving maintenance management. Maintenance data of a variety of asset types were selected for demonstration with the aim of discovering meaningful patterns to assist facility managers in strategic planning and provide a knowledge base to help shape future requirements and design briefing. Utilising the prototype system developed here, positive and interesting results regarding patterns and structures of data have been obtained.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Experience plays an important role in building management. “How often will this asset need repair?” or “How much time is this repair going to take?” are types of questions that project and facility managers face daily in planning activities. Failure or success in developing good schedules, budgets and other project management tasks depend on the project manager's ability to obtain reliable information to be able to answer these types of questions. Young practitioners tend to rely on information that is based on regional averages and provided by publishing companies. This is in contrast to experienced project managers who tend to rely heavily on personal experience. Another aspect of building management is that many practitioners are seeking to improve available scheduling algorithms, estimating spreadsheets and other project management tools. Such “micro-scale” levels of research are important in providing the required tools for the project manager's tasks. However, even with such tools, low quality input information will produce inaccurate schedules and budgets as output. Thus, it is also important to have a broad approach to research at a more “macro-scale.” Recent trends show that the Architectural, Engineering, Construction (AEC) industry is experiencing explosive growth in its capabilities to generate and collect data. There is a great deal of valuable knowledge that can be obtained from the appropriate use of this data and therefore the need has arisen to analyse this increasing amount of available data. Data Mining can be applied as a powerful tool to extract relevant and useful information from this sea of data. Knowledge Discovery in Databases (KDD) and Data Mining (DM) are tools that allow identification of valid, useful, and previously unknown patterns so large amounts of project data may be analysed. These technologies combine techniques from machine learning, artificial intelligence, pattern recognition, statistics, databases, and visualization to automatically extract concepts, interrelationships, and patterns of interest from large databases. The project involves the development of a prototype tool to support facility managers, building owners and designers. This Industry focused report presents the AIMMTM prototype system and documents how and what data mining techniques can be applied, the results of their application and the benefits gained from the system. The AIMMTM system is capable of searching for useful patterns of knowledge and correlations within the existing building maintenance data to support decision making about future maintenance operations. The application of the AIMMTM prototype system on building models and their maintenance data (supplied by industry partners) utilises various data mining algorithms and the maintenance data is analysed using interactive visual tools. The application of the AIMMTM prototype system to help in improving maintenance management and building life cycle includes: (i) data preparation and cleaning, (ii) integrating meaningful domain attributes, (iii) performing extensive data mining experiments in which visual analysis (using stacked histograms), classification and clustering techniques, associative rule mining algorithm such as “Apriori” and (iv) filtering and refining data mining results, including the potential implications of these results for improving maintenance management. Maintenance data of a variety of asset types were selected for demonstration with the aim of discovering meaningful patterns to assist facility managers in strategic planning and provide a knowledge base to help shape future requirements and design briefing. Utilising the prototype system developed here, positive and interesting results regarding patterns and structures of data have been obtained.