7 results for Interest Similarity
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
We compared the microbial community composition in soils from the Brazilian Amazon with two contrasting histories: anthrosols and their adjacent non-anthrosol soils of the same mineralogy. The anthrosols, also known as the Amazonian Dark Earths or terra preta, were managed by the indigenous pre-Columbian Indians between 500 and 8,700 years before present and are characterized by unusually high cation exchange capacity, phosphorus (P), and calcium (Ca) contents, and soil carbon pools that contain a high proportion of incompletely combusted biomass as biochar or black carbon (BC). We sampled paired anthrosol and unmodified soils from four locations in the Manaus, Brazil, region that differed in their current land use and soil type. Community DNA was extracted from sampled soils and characterized by use of denaturing gradient gel electrophoresis (DGGE) and terminal restriction fragment length polymorphism. DNA bands of interest from Bacteria and Archaea DGGE gels were cloned and sequenced. In cluster analyses of the DNA fingerprints, microbial communities from the anthrosols grouped together regardless of current land use or soil type and were distinct from those in their respective, paired adjacent soils. For the Archaea, the anthrosol communities diverged from the adjacent soils by over 90%. A greater overall richness was observed for Bacteria sequences as compared with those of the Archaea. Most of the sequences obtained were novel and matched those in databases at less than 98% similarity. Several sequences obtained only from the anthrosols grouped at 93% similarity with the Verrucomicrobia, a genus commonly found in rice paddies in the tropics. Sequences closely related to Proteobacteria and Cyanobacteria sp. were recovered only from adjacent soil samples. Sequences related to Pseudomonas, Acidobacteria, and Flexibacter sp. were recovered from both anthrosols and adjacent soils.
The strong similarities among the microbial communities present in the anthrosols for both the Bacteria and Archaea suggest that the microbial community composition in these soils is controlled more strongly by their historical soil management than by soil type or current land use. The anthrosols had consistently higher concentrations of incompletely combusted organic black carbon material (BC), higher soil pH, and higher concentrations of P and Ca compared to their respective adjacent soils. Such characteristics may help to explain the longevity and distinctiveness of the anthrosols in the Amazonian landscape and guide us in recreating soils with sustained high fertility in otherwise nutrient-poor soils in modern times.
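The abstract does not say which similarity coefficient was used in the cluster analyses of the DNA fingerprints. As a minimal sketch only, assuming each fingerprint is reduced to the set of band positions present in a lane, a Jaccard coefficient is one simple way to compare two profiles. The function name and all band positions below are hypothetical and are not data from the study.

```python
def jaccard_similarity(bands_a, bands_b):
    """Jaccard similarity between two sets of fingerprint band positions."""
    a, b = set(bands_a), set(bands_b)
    union = a | b
    if not union:
        # Assumption of this sketch: two empty profiles count as identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical band positions (illustrative only, not measurements).
anthrosol_site1 = {1, 2, 3, 5, 8}
anthrosol_site2 = {1, 2, 3, 5, 9}
adjacent_site1 = {4, 6, 7, 8}

# Similarity between two anthrosol profiles, and between an anthrosol
# profile and its paired adjacent-soil profile.
sim_within = jaccard_similarity(anthrosol_site1, anthrosol_site2)
sim_between = jaccard_similarity(anthrosol_site1, adjacent_site1)
print(f"anthrosol vs anthrosol: {sim_within:.3f}")
print(f"anthrosol vs adjacent:  {sim_between:.3f}")
```

A pairwise matrix of such values could then be passed to any hierarchical clustering routine; band intensity is ignored here, which is a simplification.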
Abstract:
The classification of texts has become a major endeavor with so much electronic material available, for it is an essential task in several applications, including search engines and information retrieval. There are different ways to define similarity for grouping similar texts into clusters, as the concept of similarity may depend on the purpose of the task. For instance, in topic extraction similar texts mean those within the same semantic field, whereas in author recognition stylistic features should be considered. In this study, we introduce ways to classify texts employing concepts of complex networks, which may be able to capture syntactic, semantic and even pragmatic features. The interplay between various metrics of the complex networks is analyzed with three applications, namely identification of machine translation (MT) systems, evaluation of quality of machine translated texts and authorship recognition. We shall show that topological features of the networks representing texts can enhance the ability to identify MT systems in particular cases. For evaluating the quality of MT texts, on the other hand, high correlation was obtained with methods capable of capturing the semantics. This was expected because the gold standards used are themselves based on word co-occurrence. Notwithstanding, the Katz similarity, which involves semantics and structure in the comparison of texts, achieved the highest correlation with the NIST measurement, indicating that in some cases the combination of both approaches can improve the ability to quantify quality in MT. In authorship recognition, again the topological features were relevant in some contexts, though for the books and authors analyzed good results were obtained with semantic features as well.
Because hybrid approaches encompassing semantic and topological features have not been extensively used, we believe that the methodology proposed here may be useful to enhance text classification considerably, as it combines well-established strategies. (c) 2012 Elsevier B.V. All rights reserved.
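The abstract does not describe how texts are turned into networks. A minimal sketch, assuming a simple word adjacency model in which each word is linked to the word that follows it, shows how two topological features (degree and clustering coefficient) could be read off a text. The function names and the sample sentence are hypothetical, and the whitespace tokenization is a deliberate simplification.

```python
from collections import defaultdict


def build_adjacency_network(text):
    """Undirected word network linking each word to the word after it."""
    words = text.lower().split()  # naive tokenization, assumed for brevity
    graph = defaultdict(set)
    for left, right in zip(words, words[1:]):
        if left != right:  # skip self-loops from repeated words
            graph[left].add(right)
            graph[right].add(left)
    return dict(graph)


def clustering_coefficient(graph, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    neighbors = list(graph.get(node, ()))
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbors[j] in graph[neighbors[i]]
    )
    return 2.0 * links / (k * (k - 1))


sample = "the cat saw the dog and the cat ran"
network = build_adjacency_network(sample)
for word in sorted(network):
    degree = len(network[word])
    cc = clustering_coefficient(network, word)
    print(f"{word:>4}  degree={degree}  clustering={cc:.3f}")
```

Vectors of such per-word or averaged metrics could then serve as features for a classifier, alongside the semantic features the abstract mentions.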
Abstract:
The ability to discriminate nestmates from non-nestmates in insect societies is essential to protect colonies from conspecific invaders. The acceptance threshold hypothesis predicts that organisms whose recognition systems classify recipients without errors should optimize the balance between acceptance and rejection. In this process, cuticular hydrocarbons play an important role as cues of recognition in social insects. The aim of this study was to determine whether guards exhibit a restrictive level of rejection towards chemically distinct individuals, becoming more permissive during encounters with either nestmate or non-nestmate individuals bearing chemically similar profiles. The study demonstrates that Melipona asilvai (Hymenoptera: Apidae: Meliponini) guards exhibit a flexible system of nestmate recognition according to the degree of chemical similarity between the incoming forager and their own cuticular hydrocarbon profile. Guards became less restrictive in their acceptance rates when they encountered non-nestmates with highly similar chemical profiles, which they probably mistook for nestmates, hence broadening their acceptance level.
Abstract:
HTLV-1 is endemic in Brazil and HIV/HTLV-1 coinfection has been detected, mostly in the northeast region. Cosmopolitan HTLV-1a is the main subtype that circulates in Brazil. This study characterized 17 HTLV-1 isolates from HIV-coinfected patients of southern (n = 7) and southeastern (n = 10) Brazil. HTLV-1 provirus DNA was amplified by nested PCR (env and LTR) and sequenced. Env sequences (705 bp) from 15 isolates and LTR sequences (731 bp) from 17 isolates showed 99.5% and 98.8% similarity among sequences, respectively. Comparing these sequences with the ATK (HTLV-1a) and Mel5 (HTLV-1c) prototypes, similarities of 99% and 97.4%, respectively, for env and LTR with ATK, and 91.6% and 90.3% with Mel5, were detected. Phylogenetic analysis showed that all sequences belonged to the transcontinental subgroup A of the Cosmopolitan subtype, clustering in two Latin American clusters.
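The abstract reports similarity percentages without stating how they were computed. As an illustrative sketch only, assuming two sequences that are already aligned to equal length, percent identity can be taken as the share of matching positions. Skipping positions where either sequence has a gap is an assumption of this sketch, and the function name and example strings are hypothetical, not sequences from the study.

```python
def percent_identity(seq_a, seq_b):
    """Percent of matching positions between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gapped columns are not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical aligned fragments (illustrative only).
isolate = "ACGTACGTAC"
prototype = "ACGTACGTAT"
print(f"{percent_identity(isolate, prototype):.1f}% identity")
```

Other conventions, such as counting gaps as mismatches, give different percentages, so the convention should be stated alongside any reported value.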
Abstract:
Background: Depressive symptoms and chronic disease have adverse effects on patients' health-related quality of life (H-RQOL). However, little is known about this effect on H-RQOL when only the two core depressive symptoms, loss of interest and depressed mood, are considered. The objective of this study is to investigate H-RQOL in the presence of loss of interest and depressed mood at a general medical outpatient unit. Methods: We evaluated 553 patients at their first attendance at a general medical outpatient unit of a teaching hospital. H-RQOL was assessed with the Medical Outcomes Study 36-item Short-Form Health Survey (SF-36). Depressed mood and loss of interest were assessed by the Primary Care Evaluation of Mental Disorders (PRIME-MD) Patient Questionnaire. A physician performed the diagnosis of chronic diseases by clinical judgment and classified them into 13 pre-defined categories. We used multiple linear regression to investigate associations between each domain of H-RQOL and our two core depression symptoms. The presence of chronic diseases and demographic variables were included in the models as covariates. Results: Among the 553 patients, 70.5% were women with a mean age of 41.0 years (range 18-85, SD ± 15.4). Loss of interest was reported by 54.6%, and depressed mood by 59.7% of the patients. At least one chronic disease was diagnosed in 59.5% of patients; cardiovascular disease was the most prevalent, affecting 20.6% of our patients. Loss of interest and depressed mood were significantly associated with decreased scores in all domains of H-RQOL after adjustment for possible confounders. The presence of any chronic disease was associated with a decrease in the domain of vitality. The analysis of each individual chronic disease category revealed that no category was associated with a decrease in more than one domain of H-RQOL. Conclusion: Loss of interest and depressed mood were associated with significant decreases in H-RQOL.
We recommend these simple tests for screening in general practice.
Abstract:
In this paper, we present a novel approach to perform similarity queries over medical images, maintaining the semantics of a given query posed by the user. Content-based image retrieval systems relying on relevance feedback techniques usually request the users to label relevant/irrelevant images. Thus, we present a highly effective strategy to survey user profiles, taking advantage of such labeling to implicitly gather the user's perceptual similarity. The profiles maintain the settings desired for each user, allowing tuning of the similarity assessment, which encompasses the dynamic change of the distance function employed through an interactive process. Experiments on medical images show that the method is effective and can improve the decision-making process during analysis.
Abstract:
The ubiquity of time series data across almost all human endeavors has produced a great interest in time series data mining in the last decade. While dozens of classification algorithms have been applied to time series, recent empirical evidence strongly suggests that simple nearest neighbor classification is exceptionally difficult to beat. The choice of distance measure used by the nearest neighbor algorithm is important, and depends on the invariances required by the domain. For example, motion capture data typically requires invariance to warping, and cardiology data requires invariance to the baseline (the mean value). Similarly, recent work suggests that for time series clustering, the choice of clustering algorithm is much less important than the choice of distance measure used. In this work we make a somewhat surprising claim. There is an invariance that the community seems to have missed: complexity invariance. Intuitively, the problem is that in many domains the different classes may have different complexities, and pairs of complex objects, even those which subjectively may seem very similar to the human eye, tend to be further apart under current distance measures than pairs of simple objects. This fact introduces errors in nearest neighbor classification, where some complex objects may be incorrectly assigned to a simpler class. Similarly, for clustering this effect can introduce errors by “suggesting” to the clustering algorithm that subjectively similar, but complex objects belong in a sparser and larger-diameter cluster than is truly warranted. We introduce the first complexity-invariant distance measure for time series, and show that it generally produces significant improvements in classification and clustering accuracy. We further show that this improvement does not compromise efficiency, since we can lower bound the measure and use a modification of the triangular inequality, thus making use of most existing indexing and data mining algorithms.
We evaluate our ideas with the largest and most comprehensive set of time series mining experiments ever attempted in a single work, and show that complexity-invariant distance measures can produce improvements in classification and clustering in the vast majority of cases.
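The abstract does not give the formula for the measure. The sketch below assumes one commonly cited formulation, in which the Euclidean distance between two equal-length series is multiplied by a correction factor, the ratio of the larger to the smaller complexity estimate, where complexity is the root of the summed squared differences between consecutive points. The function names and the handling of constant series are assumptions of this sketch.

```python
import math


def euclidean_distance(q, c):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, c)))


def complexity_estimate(series):
    """Root of the summed squared differences between consecutive points."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(series, series[1:])))


def complexity_invariant_distance(q, c):
    """Euclidean distance scaled by a complexity correction factor."""
    if len(q) != len(c):
        raise ValueError("series must have the same length")
    ce_q, ce_c = complexity_estimate(q), complexity_estimate(c)
    low, high = min(ce_q, ce_c), max(ce_q, ce_c)
    if low == 0:
        # Assumption of this sketch: two constant series get no correction;
        # a constant series against a varying one is infinitely far apart.
        factor = 1.0 if high == 0 else math.inf
    else:
        factor = high / low
    return euclidean_distance(q, c) * factor


# Hypothetical series: one oscillating, one nearly flat.
complex_series = [0, 1, 0, 1, 0]
simple_series = [0, 0, 0, 0, 1]
print(euclidean_distance(complex_series, simple_series))
print(complexity_invariant_distance(complex_series, simple_series))
```

Because the factor is at least 1, the corrected distance is never smaller than the Euclidean distance, so pairs whose complexities differ are pushed further apart while pairs of similar complexity are left almost unchanged.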