21 resultados para 380305 Knowledge Representation and Machine Learning

em Repositório Institucional UNESP - Universidade Estadual Paulista "Julio de Mesquita Filho"


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Characteristics of speech, especially figures of speech, are used by specific communities or domains, and, in this way, reflect their identities through their choice of vocabulary. This topic should be an object of study in the context of knowledge representation once it deals with different contexts of production of documents. This study aims to explore the dimensions of the concepts of euphemism, dysphemism, and orthophemism, focusing on the latter with the goal of extracting a concept which can be included in discussions about subject analysis and indexing. Euphemism is used as an alternative to a non-preferred expression or as an alternative to an offensive attribution-to avoid potential offense taken by the listener or by other persons, for instance, pass away. Dysphemism, on the other hand, is used by speakers to talk about people and things that frustrate and annoy them-their choice of language indicates disapproval and the topic is therefore denigrated, humiliated, or degraded, for instance, kick the bucket. While euphemism tries to make something sound better, dysphemism tries to make something sound worse. Orthophemism (Allan and Burridge 2006) is also used as an alternative to expressions, but it is a preferred, formal, and direct language of expression when representing an object or a situation, for instance, die. This paper suggests that the comprehension and use of such concepts could support the following issues: possible contributions from linguistics and terminology to subject analysis as demonstrated by Talamo et al. (1992); decrease of polysemy and ambiguity of terms used to represent certain topics of documents; and construction and evaluation of indexing languages. The concept of orthophemism can also serves to support associative relationships in the context of subject analysis, indexing, and even information retrieval related to more specific requests.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The correct classification of sugar according to its physico-chemical characteristics directly influences the value of the product and its acceptance by the market. This study shows that using an electronic tongue system along with established techniques of supervised learning leads to the correct classification of sugar samples according to their qualities. In this paper, we offer two new real, public and non-encoded sugar datasets whose attributes were automatically collected using an electronic tongue, with and without pH controlling. Moreover, we compare the performance achieved by several established machine learning methods. Our experiments were diligently designed to ensure statistically sound results and they indicate that k-nearest neighbors method outperforms other evaluated classifiers and, hence, it can be used as a good baseline for further comparison. © 2012 IEEE.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The presence of precipitates in metallic materials affects its durability, resistance and mechanical properties. Hence, its automatic identification by image processing and machine learning techniques may lead to reliable and efficient assessments on the materials. In this paper, we introduce four widely used supervised pattern recognition techniques to accomplish metallic precipitates segmentation in scanning electron microscope images from dissimilar welding on a Hastelloy C-276 alloy: Support Vector Machines, Optimum-Path Forest, Self Organizing Maps and a Bayesian classifier. Experimental results demonstrated that all classifiers achieved similar recognition rates with good results validated by an expert in metallographic image analysis. © 2011 Springer-Verlag Berlin Heidelberg.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background: The genome-wide identification of both morbid genes, i.e., those genes whose mutations cause hereditary human diseases, and druggable genes, i.e., genes coding for proteins whose modulation by small molecules elicits phenotypic effects, requires experimental approaches that are time-consuming and laborious. Thus, a computational approach which could accurately predict such genes on a genome-wide scale would be invaluable for accelerating the pace of discovery of causal relationships between genes and diseases as well as the determination of druggability of gene products.Results: In this paper we propose a machine learning-based computational approach to predict morbid and druggable genes on a genome-wide scale. For this purpose, we constructed a decision tree-based meta-classifier and trained it on datasets containing, for each morbid and druggable gene, network topological features, tissue expression profile and subcellular localization data as learning attributes. This meta-classifier correctly recovered 65% of known morbid genes with a precision of 66% and correctly recovered 78% of known druggable genes with a precision of 75%. It was than used to assign morbidity and druggability scores to genes not known to be morbid and druggable and we showed a good match between these scores and literature data. Finally, we generated decision trees by training the J48 algorithm on the morbidity and druggability datasets to discover cellular rules for morbidity and druggability and, among the rules, we found that the number of regulating transcription factors and plasma membrane localization are the most important factors to morbidity and druggability, respectively.Conclusions: We were able to demonstrate that network topological features along with tissue expression profile and subcellular localization can reliably predict human morbid and druggable genes on a genome-wide scale. Moreover, by constructing decision trees based on these data, we could discover cellular rules governing morbidity and druggability.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Interactive visual representations complement traditional statistical and machine learning techniques for data analysis, allowing users to play a more active role in a knowledge discovery process and making the whole process more understandable. Though visual representations are applicable to several stages of the knowledge discovery process, a common use of visualization is in the initial stages to explore and organize a sometimes unknown and complex data set. In this context, the integrated and coordinated - that is, user actions should be capable of affecting multiple visualizations when desired - use of multiple graphical representations allows data to be observed from several perspectives and offers richer information than isolated representations. In this paper we propose an underlying model for an extensible and adaptable environment that allows independently developed visualization components to be gradually integrated into a user configured knowledge discovery application. Because a major requirement when using multiple visual techniques is the ability to link amongst them, so that user actions executed on a representation propagate to others if desired, the model also allows runtime configuration of coordinated user actions over different visual representations. We illustrate how this environment is being used to assist data exploration and organization in a climate classification problem.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevância:

100.00% 100.00%

Publicador:

Resumo:

One common problem in all basic techniques of knowledge representation is the handling of the trade-off between precision of inferences and resource constraints, such as time and memory. Michalski and Winston (1986) suggested the Censored Production Rule (CPR) as an underlying representation and computational mechanism to enable logic based systems to exhibit variable precision in which certainty varies while specificity stays constant. As an extension of CPR, the Hierarchical Censored Production Rules (HCPRs) system of knowledge representation, proposed by Bharadwaj & Jain (1992), exhibits both variable certainty as well as variable specificity and offers mechanisms for handling the trade-off between the two. An HCPR has the form: Decision If(preconditions) Unless(censor) Generality(general_information) Specificity(specific_information). As an attempt towards evolving a generalized knowledge representation, an Extended Hierarchical Censored Production Rules (EHCPRs) system is suggested in this paper. With the inclusion of new operators, an Extended Hierarchical Censored Production Rule (EHCPR) takes the general form: Concept If (Preconditions) Unless (Exceptions) Generality (General-Concept) Specificity (Specific Concepts) Has_part (default: structural-parts) Has_property (default:characteristic-properties) Has_instance (instances). How semantic networks and frames are represented in terms of an EHCPRs is shown. Multiple inheritance, inheritance with and without cancellation, recognition with partial match, and a few default logic problems are shown to be tackled efficiently in the proposed system.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Plant phenology has gained importance in the context of global change research, stimulating the development of new technologies for phenological observation. Digital cameras have been successfully used as multi-channel imaging sensors, providing measures of leaf color change information (RGB channels), or leafing phenological changes in plants. We monitored leaf-changing patterns of a cerrado-savanna vegetation by taken daily digital images. We extract RGB channels from digital images and correlated with phenological changes. Our first goals were: (1) to test if the color change information is able to characterize the phenological pattern of a group of species; and (2) to test if individuals from the same functional group may be automatically identified using digital images. In this paper, we present a machine learning approach to detect phenological patterns in the digital images. Our preliminary results indicate that: (1) extreme hours (morning and afternoon) are the best for identifying plant species; and (2) different plant species present a different behavior with respect to the color change information. Based on those results, we suggest that individuals from the same functional group might be identified using digital images, and introduce a new tool to help phenology experts in the species identification and location on-the-ground. ©2012 IEEE.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Wireless Sensor Networks (WSNs) can be used to monitor hazardous and inaccessible areas. In these situations, the power supply (e.g. battery) of each node cannot be easily replaced. One solution to deal with the limited capacity of current power supplies is to deploy a large number of sensor nodes, since the lifetime and dependability of the network will increase through cooperation among nodes. Applications on WSN may also have other concerns, such as meeting temporal deadlines on message transmissions and maximizing the quality of information. Data fusion is a well-known technique that can be useful for the enhancement of data quality and for the maximization of WSN lifetime. In this paper, we propose an approach that allows the implementation of parallel data fusion techniques in IEEE 802.15.4 networks. One of the main advantages of the proposed approach is that it enables a trade-off between different user-defined metrics through the use of a genetic machine learning algorithm. Simulations and field experiments performed in different communication scenarios highlight significant improvements when compared with, for instance, the Gur Game approach or the implementation of conventional periodic communication techniques over IEEE 802.15.4 networks. © 2013 Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Plant phenology is one of the most reliable indicators of species responses to global climate change, motivating the development of new technologies for phenological monitoring. Digital cameras or near remote systems have been efficiently applied as multi-channel imaging sensors, where leaf color information is extracted from the RGB (Red, Green, and Blue) color channels, and the changes in green levels are used to infer leafing patterns of plant species. In this scenario, texture information is a great ally for image analysis that has been little used in phenology studies. We monitored leaf-changing patterns of Cerrado savanna vegetation by taking daily digital images. We extract RGB channels from the digital images and correlate them with phenological changes. Additionally, we benefit from the inclusion of textural metrics for quantifying spatial heterogeneity. Our first goals are: (1) to test if color change information is able to characterize the phenological pattern of a group of species; (2) to test if the temporal variation in image texture is useful to distinguish plant species; and (3) to test if individuals from the same species may be automatically identified using digital images. In this paper, we present a machine learning approach based on multiscale classifiers to detect phenological patterns in the digital images. Our results indicate that: (1) extreme hours (morning and afternoon) are the best for identifying plant species; (2) different plant species present a different behavior with respect to the color change information; and (3) texture variation along temporal images is promising information for capturing phenological patterns. Based on those results, we suggest that individuals from the same species and functional group might be identified using digital images, and introduce a new tool to help phenology experts in the identification of new individuals from the same species in the image and their location on the ground. © 2013 Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Since its emergence as a discipline, in the nineteenth century (1889), the theory and practice of Archival Science have focused on the arrangement and description of archival materials as complementary and inseparable nuclear processes that aim to classify, to order, to describe and to give access to records. These processes have their specific goals sharing one in common: the representation of archival knowledge. In the late 1980 a paradigm shift was announced in Archival Science, especially after the appearance of the new forms of document production and information technologies. The discipline was then invited to rethink its theoretical and methodological bases founded in the nineteenth century so it could handle the contemporary archival knowledge production, organization and representation. In this sense, the present paper aims to discuss, under a theoretical perspective, the archival representation, more specifically the archival description facing these changes and proposals, in order to illustrate the challenges faced by Contemporary Archival Science in a new context of production, organization and representation of archival knowledge.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes the use of the Principal Components Analysis (PCA) method to represent and to analyse soccer players' actions distribution in the pitch. The seven games of the Brazilian National Team during the 2002 World Cup were analysed. The player's position actions were measured from videotapes in a computer interface. The results were: a) the graphical representation, given by two orthogonal segments in the two directions of maximal variability and centred at the mean of each player's actions position; b) the eccentricity measurement, given by the variability ratio and c) the actions zone area, given by variability product. The results showed that the individual characteristics of acting were well represented by the PCA, allowing comparisons among games and providing insights related to the tactical organisation of the team.