852 results for 080109 Pattern Recognition and Data Mining
Abstract:
Data mining is the process of identifying valid, implicit, previously unknown, potentially useful and understandable information in large databases. It is an important step in the process of knowledge discovery in databases (Olaru & Wehenkel, 1999). In a data mining process, input data can be structured, semi-structured, or unstructured, and can take the form of text, categorical values or numerical values. One of the important characteristics of data mining is its ability to deal with data that are large in volume, distributed, time-variant, noisy, and high-dimensional. A large number of data mining algorithms have been developed for different applications. For example, association rule mining can be useful for market basket problems, clustering algorithms can be used to discover trends in unsupervised learning problems, classification algorithms can be applied in decision-making problems, and sequential and time series mining algorithms can be used in predicting events, fault detection, and other supervised learning problems (Vapnik, 1999). Classification is among the most important tasks in data mining, particularly for applications in engineering fields. Together with regression, classification is used mainly for predictive modelling. A number of classification algorithms are in practical use. According to Sebastiani (2002), the main classification algorithms can be categorized as: decision tree and rule-based approaches such as C4.5 (Quinlan, 1996); probabilistic methods such as the Bayesian classifier (Lewis, 1998); on-line methods such as Winnow (Littlestone, 1988) and CVFDT (Hulten, 2001); neural network methods (Rumelhart, Hinton & Williams, 1986); example-based methods such as k-nearest neighbors (Duda & Hart, 1973); and SVM (Cortes & Vapnik, 1995). Other important techniques for classification tasks include Associative Classification (Liu et al., 1998) and Ensemble Classification (Tumer, 1996).
Abstract:
This paper develops an interactive approach for exploratory spatial data analysis. Measures of attribute similarity and spatial proximity are combined in a clustering model to support the identification of patterns in spatial information. Relationships between the developed clustering approach, spatial data mining and choropleth display are discussed. Analysis of property crime rates in Brisbane, Australia is presented. A surprising finding in this research is that there are substantial inconsistencies in standard choropleth display options found in two widely used commercial geographical information systems, both in terms of definition and performance. The comparative results demonstrate the usefulness and appeal of the developed approach in a geographical information system environment for exploratory spatial data analysis.
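The abstract does not say how attribute similarity and spatial proximity are combined. As a minimal sketch only, one common option is a weighted sum of the two dissimilarities. The function name, the `weight` parameter and the dictionary layout below are hypothetical, and in practice both terms would need to be scaled to comparable ranges first.

```python
import math


def combined_distance(a, b, weight=0.5):
    """Blend attribute dissimilarity and spatial distance for two areas.

    Each area is a dict with 'value' (an attribute such as a crime rate)
    and 'xy' (centroid coordinates). `weight` in [0, 1] is a hypothetical
    knob: 1.0 uses the attribute only, 0.0 uses location only.
    """
    attribute_gap = abs(a["value"] - b["value"])
    spatial_gap = math.dist(a["xy"], b["xy"])  # Euclidean distance
    return weight * attribute_gap + (1.0 - weight) * spatial_gap


# Two illustrative areas (made-up numbers, not data from the paper).
area_1 = {"value": 1.0, "xy": (0.0, 0.0)}
area_2 = {"value": 3.0, "xy": (3.0, 4.0)}
print(combined_distance(area_1, area_2, weight=0.5))
```

A distance of this kind could then be passed to any clustering routine that accepts pairwise dissimilarities.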
Abstract:
The principle of using induction rules based on spatial environmental data to model a soil map has previously been demonstrated. Whilst the general pattern of classes of large spatial extent and those with close association with geology were delineated, small classes and the detailed spatial pattern of the map were less well rendered. Here we examine several strategies to improve the quality of the soil map models generated by rule induction. Terrain attributes that are better suited to landscape description at a resolution of 250 m are introduced as predictors of soil type. A map sampling strategy is developed. Classification error is reduced by using boosting rather than cross validation to improve the model. Further, the benefit of incorporating the local spatial context for each environmental variable into the rule induction is examined. The best model was achieved by sampling in proportion to the spatial extent of the mapped classes, boosting the decision trees, and using spatial contextual information extracted from the environmental variables.
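Sampling "in proportion to the spatial extent of the mapped classes" can be illustrated with a small allocation routine. This is a sketch under assumptions, not the authors' procedure: the function name and the class areas are hypothetical, and simple rounding means the counts may not sum exactly to the requested total.

```python
def proportional_allocation(class_areas, total_samples):
    """Split a sample budget across map classes in proportion to area.

    `class_areas` maps a class name to its mapped area (any unit).
    Counts are rounded, so their sum can differ slightly from
    `total_samples` for awkward proportions.
    """
    total_area = sum(class_areas.values())
    return {
        name: round(total_samples * area / total_area)
        for name, area in class_areas.items()
    }


# Hypothetical soil classes and areas, for illustration only.
print(proportional_allocation({"A": 600, "B": 300, "C": 100}, 50))
```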
Abstract:
The intracellular bacterium Legionella pneumophila induces a severe form of pneumonia called Legionnaires' disease, which is characterized by a strong neutrophil (NE) infiltrate in the lungs of infected individuals. Although the participation of pattern recognition receptors, such as Toll-like receptors, was recently demonstrated, there is no information on the role of nod-like receptors (NLRs) in bacterial recognition in vivo and in NE recruitment to the lungs. Here, we employed a murine model of Legionnaires' disease to evaluate host and bacterial factors involved in NE recruitment to the lungs of mice. We found that the L. pneumophila type four secretion system, known as Dot/Icm, was required for NE recruitment, as dot/icm mutants failed to trigger NE recruitment in a process independent of bacterial multiplication. By using mice deficient for Nod1, Nod2, and Rip2, we found that these receptors accounted for NE recruitment to the lungs of infected mice. In addition, Rip2-dependent responses were important for cytokine production and bacterial clearance. Collectively, these studies show that Nod1, Nod2, and Rip2 account for the generation of innate immune responses in vivo, which are important for NE recruitment and bacterial clearance in a murine model of Legionnaires' disease. (C) 2010 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.
Abstract:
This paper establishes a methodology for characterizing the electric power profiles of medium voltage (MV) consumers. The characterization is supported by the knowledge discovery in databases (KDD) process. Data mining techniques are used to obtain typical load profiles of MV customers and specific knowledge of their consumption habits. In order to form the different customer classes and to find a set of representative consumption patterns, a hierarchical clustering algorithm and a clustering ensemble combination approach (WEACS) are used. Taking into account the typical consumption profile of the class to which each customer belongs, new tariff options were defined and new energy coefficient prices were proposed. Finally, based on the results obtained, the consequences for the interaction between customers and electric power suppliers are analyzed.
Abstract:
At present, power system operation produces huge volumes of data that are still treated in a very limited way. Knowledge discovery and machine learning can make use of these data, yielding relevant knowledge with a very positive impact. In the context of competitive electricity markets these data are of even higher value, which makes the application of data mining techniques in power systems increasingly relevant. This paper presents two cases based on real data, showing the importance of data mining for supporting demand response and for supporting player strategic behavior.
Abstract:
In this work the liver contour is semi-automatically segmented and quantified in order to help the identification and diagnosis of diffuse liver disease. The features extracted from the liver contour are used jointly with clinical and laboratory data in the staging process. The classification results of a support vector machine, a Bayesian classifier and a k-nearest neighbor classifier are compared. A population of 88 patients at five different stages of diffuse liver disease and a leave-one-out cross-validation strategy are used in the classification process. The best results are obtained using the k-nearest neighbor classifier, with an overall accuracy of 80.68%. The good performance of the proposed method suggests that it is a reliable indicator that can improve the information available for staging diffuse liver disease.
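Leave-one-out cross-validation with a k-nearest neighbor classifier can be sketched as follows. This is a generic illustration in plain Python, not the authors' pipeline: the Euclidean metric, the value of `k`, the function names and the toy samples are all assumptions. As a side note, an accuracy of 80.68% on 88 patients is consistent with 71 correct classifications (71/88 ≈ 0.8068).

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k nearest samples."""
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(samples, k=3):
    """Hold out each sample in turn, train on the rest, and score it."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        training = samples[:i] + samples[i + 1:]
        if knn_predict(training, features, k) == label:
            correct += 1
    return correct / len(samples)


# Toy, well-separated data (hypothetical feature vectors and labels).
toy = [
    ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
    ((10, 10), "b"), ((10, 11), "b"), ((11, 10), "b"),
]
print(leave_one_out_accuracy(toy, k=1))
```

Leave-one-out suits small cohorts such as this one because every patient is used for testing exactly once while the model is still trained on all the others.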
Abstract:
Software for pattern recognition of the larvae of the mosquitoes Aedes aegypti and Aedes albopictus, biological vectors of dengue and yellow fever, has been developed. Rapid field identification of larvae using a digital camera linked to a laptop computer equipped with this software may greatly help prevention campaigns.
Abstract:
Earthworks tasks aim at levelling the ground surface at a target construction area and precede any kind of structural construction (e.g., road and railway construction). They comprise sequential tasks, such as excavation, transportation, spreading and compaction, and rely strongly on heavy mechanical equipment and repetitive processes. In this context, it is essential to optimize the usage of all available resources under two key criteria: the cost and duration of earthwork projects. In this paper, we present an integrated system that uses two artificial intelligence based techniques: data mining and evolutionary multi-objective optimization. The former is used to build data-driven models capable of providing realistic estimates of resource productivity, while the latter is used to optimize resource allocation considering the two main earthwork objectives (duration and cost). Experiments using real-world data from a construction site have shown that the proposed system is competitive when compared with current manual earthwork design.
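With two objectives to minimize, cost and duration, candidate plans are usually compared by Pareto dominance. The sketch below shows only that dominance filter, not the evolutionary search described in the abstract. The function name and the plan tuples are hypothetical.

```python
def pareto_front(plans):
    """Return the names of plans not dominated on (cost, duration).

    Each plan is a (name, cost, duration) tuple; lower is better for both.
    A plan is dominated if another plan is no worse on both objectives
    and strictly better on at least one.
    """
    front = []
    for name, cost, duration in plans:
        dominated = any(
            c <= cost and d <= duration and (c < cost or d < duration)
            for _, c, d in plans
        )
        if not dominated:
            front.append(name)
    return front


# Made-up allocation plans: (name, cost, duration in days).
plans = [("A", 100, 10), ("B", 80, 12), ("C", 120, 12), ("D", 90, 9)]
print(pareto_front(plans))
```

In this toy set, plans A and C are each beaten on both objectives by another plan, so only B and D remain as trade-offs between cost and duration.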
Abstract:
Football is nowadays considered one of the most popular sports. It has acquired a prominent position in the betting world, where millions of euros change hands during a single football match. The lack of profitability for football bettors has been highlighted as a problem. This problem gave rise to this research proposal, which analyses whether there is a way to help users increase the profits on their bets. Data mining models were induced with the purpose of helping gamblers increase their profits in the medium to long term. Although the models can fail, the results achieved for four of the seven targets are encouraging and suggest that the system can help to increase profits. Every defined target has two possible classes to predict, for example, whether there are more or fewer than 7.5 corners in a single game. The data mining models for the targets of more or fewer than 7.5 corners, 8.5 corners, 1.5 goals and 3.5 goals achieved the pre-defined thresholds. The models were implemented in a prototype, which is a pervasive decision support system. This system was developed to serve as an interface for any user, whether an expert or someone with no knowledge of football.
Abstract:
Data Mining, Vision Restoration, Treatment outcome prediction, Self-Organising-Map
Abstract:
Magdeburg, Univ., Faculty of Computer Science (Fak. für Informatik), dissertation, 2013
Abstract:
The spleen plays a crucial role in the development of immunity to malaria, but the role of pattern recognition receptors (PRRs) in splenic effector cells during malaria infection is poorly understood. In the present study, we analysed the expression of selected PRRs in splenic effector cells from BALB/c mice infected with the lethal and non-lethal Plasmodium yoelii strains 17XL and 17X, respectively, and the non-lethal Plasmodium chabaudi chabaudi AS strain. The results of these experiments showed fewer significant changes in the expression of PRRs in AS-infected mice than in 17X- and 17XL-infected mice. Mannose receptor C type 2 (MRC2) expression increased with parasitemia, whereas Toll-like receptors and sialoadhesin (Sn) decreased in mice infected with P. chabaudi AS. In contrast, MRC type 1 (MRC1), MRC2 and EGF-like module containing mucin-like hormone receptor-like sequence 1 (F4/80) expression decreased with parasitemia in mice infected with 17X, whereas MRC1 and MRC2 increased and F4/80 decreased in mice infected with 17XL. Furthermore, macrophage receptor with collagenous structure and CD68 declined rapidly after initial parasitemia. SIGNR1 and Sn expression demonstrated minor variations in the spleens of mice infected with either strain. Notably, macrophage scavenger receptor (Msr1) and dendritic cell-associated C-type lectin 2 expression increased at both the transcript and protein levels in 17XL-infected mice with 50% parasitemia. Furthermore, the increased lethality of 17X infection in Msr1 -/- mice demonstrated a protective role for Msr1. Our results suggest a dual role for these receptors in parasite clearance and protection in 17X infection and lethality in 17XL infection.