5 results for Distance-based techniques
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
Although the automatic identification of nontechnical losses has been studied extensively, the problem of selecting the most representative features in order to boost identification accuracy and to characterize possible illegal consumers has not attracted much attention in this context. In this paper, we focus on this problem by reviewing three evolutionary-based techniques for feature selection, and we also introduce one of them in this context. The results demonstrated that selecting the most representative features can substantially improve the classification accuracy of possible frauds in datasets composed of industrial and commercial profiles.
Abstract:
The impact of tannery sludge application on soil microbial community and diversity is poorly understood. We studied the microbial community in an agricultural soil following two applications (2006 and 2007) of tannery sludge with annual application rates of 0.0, 2.3 and 22.6 Mg ha⁻¹. The soil was sampled 12 and 271 days after the second (2007) application. Community structure was assessed via phospholipid fatty acid (PLFA) analysis, and the physiological profile of the soil microbial community via the Biolog method. Tannery sludge application changed soil chemical properties, increasing the soil pH and electrical conductivity (EC) as well as available P and mineral N concentrations. The higher sludge application rate changed the community structure and the physiological profile of the microbial community at both sampling dates. However, there was no clear link between community structure and carbon substrate utilization. According to the Distance Based Linear Models Analysis, the fatty acids 16:0 and i17:0 together contributed 84% to the observed PLFA patterns, whereas the chemical properties available P, mineral N, Ca and pH together contributed 54%. At 12 days, tannery sludge application increased the average well color development from 0.46 to 0.87 after 48 h, and reduced the time elapsed before reaching the midpoint carbon substrate utilization (s) from 71 to 44 h, an effect still apparent nine months after application at the higher sludge application rate. The dominant signature fatty acids and kinetic parameters (r and s) were correlated to the concentrations of available P, Ca, mineral N, pH and EC. © 2012 Elsevier B.V. All rights reserved.
Abstract:
Individuals with Down syndrome (DS) carry three copies of the cystathionine beta-synthase (CβS) gene. The increase in the dosage of this gene results in an altered profile of metabolites involved in the folate pathway, including reduced homocysteine (Hcy), methionine, S-adenosylhomocysteine (SAH) and S-adenosylmethionine (SAM). Furthermore, previous studies in individuals with DS have shown that genetic variants in genes involved in the folate pathway influence the concentrations of the products of this metabolism. The purpose of this study is to investigate whether polymorphisms in genes involved in folate metabolism affect the plasma concentrations of Hcy and methylmalonic acid (MMA) along with the concentration of serum folate in individuals with DS. Twelve genetic polymorphisms were investigated in 90 individuals with DS (median age 1.29 years, range 0.07-30.35 years; 49 male and 41 female). Genotyping for the polymorphisms was performed either by polymerase chain reaction (PCR) based techniques or by direct sequencing. Plasma concentrations of Hcy and MMA were measured by liquid chromatography-tandem mass spectrometry as previously described, and serum folate was quantified using a competitive immunoassay. Our results indicate that the MTHFR C677T, MTR A2756G, TC2 C776G and BHMT G742A polymorphisms, along with MMA concentration, are predictors of Hcy concentration. They also show that age and Hcy concentration are predictors of MMA concentration. These findings could help to understand how genetic variation impacts folate metabolism and what metabolic consequences these variants have in individuals with trisomy 21.
Abstract:
XML similarity evaluation has become a central issue in the database and information communities, with applications ranging over document clustering, version control, data integration and ranked retrieval. Various algorithms for comparing hierarchically structured data, XML documents in particular, have been proposed in the literature. Most of them make use of techniques for finding the edit distance between tree structures, XML documents being commonly modeled as Ordered Labeled Trees. Yet, a thorough investigation of current approaches led us to identify several similarity aspects, i.e., sub-tree related structural and semantic similarities, which are not sufficiently addressed while comparing XML documents. In this paper, we provide an integrated and fine-grained comparison framework to deal with both structural and semantic similarities in XML documents (detecting the occurrences and repetitions of structurally and semantically similar sub-trees), and to allow the end-user to adjust the comparison process according to her requirements. Our framework consists of four main modules for (i) discovering the structural commonalities between sub-trees, (ii) identifying sub-tree semantic resemblances, (iii) computing tree-based edit operation costs, and (iv) computing tree edit distance. Experimental results demonstrate higher comparison accuracy with respect to alternative methods, while timing experiments reflect the impact of semantic similarity on overall system performance.
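To make the notion of edit distance between ordered labeled trees concrete, the following is a minimal sketch of a constrained, top-down tree edit distance over XML elements. It is not the framework described in the abstract: it ignores semantic similarity and sub-tree repetition, assumes unit costs (relabeling a node costs 1, inserting or deleting a sub-tree costs its node count), and only compares element tags. The function names `subtree_size` and `tree_distance` are hypothetical.

```python
import xml.etree.ElementTree as ET


def subtree_size(node):
    """Number of elements in the sub-tree rooted at node."""
    return 1 + sum(subtree_size(child) for child in node)


def tree_distance(a, b):
    """Top-down edit distance between two ordered labeled trees.

    Assumed unit costs: relabel = 1, insert/delete a whole
    sub-tree = its number of nodes. Roots are always matched.
    """
    relabel = 0 if a.tag == b.tag else 1
    kids_a, kids_b = list(a), list(b)
    m, n = len(kids_a), len(kids_b)

    # d[i][j]: cost of turning the first i children of a
    # into the first j children of b (string-edit-style table).
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + subtree_size(kids_a[i - 1])
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + subtree_size(kids_b[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + subtree_size(kids_a[i - 1]),  # delete sub-tree
                d[i][j - 1] + subtree_size(kids_b[j - 1]),  # insert sub-tree
                d[i - 1][j - 1]
                + tree_distance(kids_a[i - 1], kids_b[j - 1]),  # match
            )
    return relabel + d[m][n]


def xml_distance(xml_a, xml_b):
    """Parse two XML strings and return their tree distance."""
    return tree_distance(ET.fromstring(xml_a), ET.fromstring(xml_b))
```

For example, `xml_distance("<a><b/><c/></a>", "<a><b/><d/></a>")` compares two documents that differ in a single element label. A framework of the kind described above would replace the fixed unit costs with costs derived from structural and semantic sub-tree similarity.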
Abstract:
Traditional supervised data classification considers only physical features (e.g., distance or similarity) of the input data. Here, this type of learning is called low level classification. On the other hand, the human (animal) brain performs both low and high orders of learning and readily identifies patterns according to the semantic meaning of the input data. Data classification that considers not only physical attributes but also the pattern formation is, here, referred to as high level classification. In this paper, we propose a hybrid classification technique that combines both types of learning. The low level term can be implemented by any classification technique, while the high level term is realized by the extraction of features of the underlying network constructed from the input data. Thus, the former classifies the test instances by their physical features or class topologies, while the latter measures the compliance of the test instances to the pattern formation of the data. Our study shows that the proposed technique can not only realize classification according to the pattern formation, but can also improve the performance of traditional classification techniques. Furthermore, as the complexity of the class configuration increases, such as the mixture among different classes, a larger portion of the high level term is required to get correct classification. This feature confirms that the high level classification has a special importance in complex situations of classification. Finally, we show how the proposed technique can be employed in a real-world application, where it is capable of identifying variations and distortions of handwritten digit images. As a result, it supplies an improvement in the overall pattern recognition rate.
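One plausible reading of "a larger portion of the high level term" is a weighted mix of two per-class scores. The sketch below assumes a convex combination controlled by a weight `lam` in [0, 1]; the abstract does not state the exact form, so the formula, the name `lam`, and the functions `hybrid_scores` and `predict` are all assumptions for illustration. The low level and high level scores are taken as given inputs rather than computed from a classifier or a network.

```python
def hybrid_scores(low, high, lam):
    """Mix per-class scores: (1 - lam) * low + lam * high.

    low, high: dicts mapping class label -> score (assumed inputs).
    lam: assumed weight of the high level term, between 0 and 1.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if low.keys() != high.keys():
        raise ValueError("both terms must score the same classes")
    return {c: (1.0 - lam) * low[c] + lam * high[c] for c in low}


def predict(low, high, lam):
    """Return the class label with the largest mixed score."""
    scores = hybrid_scores(low, high, lam)
    return max(scores, key=scores.get)
```

With `low = {"A": 0.6, "B": 0.4}` and `high = {"A": 0.2, "B": 0.8}`, setting `lam = 0.0` follows the low level term alone, while raising `lam` shifts the decision toward the class favored by the high level term.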