10 results for similarity indices
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
The classification of texts has become a major endeavor with so much electronic material available, for it is an essential task in several applications, including search engines and information retrieval. There are different ways to define similarity for grouping similar texts into clusters, as the concept of similarity may depend on the purpose of the task. For instance, in topic extraction similar texts mean those within the same semantic field, whereas in author recognition stylistic features should be considered. In this study, we introduce ways to classify texts employing concepts of complex networks, which may be able to capture syntactic, semantic and even pragmatic features. The interplay between various metrics of the complex networks is analyzed with three applications, namely identification of machine translation (MT) systems, evaluation of the quality of machine-translated texts and authorship recognition. We shall show that topological features of the networks representing texts can enhance the ability to identify MT systems in particular cases. For evaluating the quality of MT texts, on the other hand, high correlation was obtained with methods capable of capturing the semantics. This was expected because the gold standards used are themselves based on word co-occurrence. Notwithstanding, the Katz similarity, which involves both semantics and structure in the comparison of texts, achieved the highest correlation with the NIST measurement, indicating that in some cases the combination of both approaches can improve the ability to quantify quality in MT. In authorship recognition, again the topological features were relevant in some contexts, though for the books and authors analyzed good results were obtained with semantic features as well. Because hybrid approaches encompassing semantic and topological features have not been extensively used, we believe that the methodology proposed here may be useful to enhance text classification considerably, as it combines well-established strategies. (c) 2012 Elsevier B.V. All rights reserved.
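As a rough illustration of the Katz similarity mentioned in this abstract, the sketch below computes the textbook Katz index between nodes of a small undirected network as a truncated power series, sum over k of beta^k * A^k. The abstract does not say how the authors build their networks or apply the measure to pairs of texts, so the toy adjacency matrix, the attenuation factor `beta`, and the cutoff `max_len` are all assumptions made for this example.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def katz_similarity(adj, beta=0.1, max_len=10):
    """Truncated Katz index: sum_{k=1..max_len} beta^k * A^k.

    Each walk of length k between two nodes contributes beta^k, so short
    connections dominate. The full series only converges when beta is
    smaller than 1 / (largest eigenvalue of A); beta and max_len here are
    illustrative choices, not values taken from the paper.
    """
    n = len(adj)
    sim = [[0.0] * n for _ in range(n)]
    power = [[float(i == j) for j in range(n)] for i in range(n)]  # A^0
    weight = 1.0
    for _ in range(max_len):
        power = mat_mul(power, adj)  # now A^k
        weight *= beta               # now beta^k
        for i in range(n):
            for j in range(n):
                sim[i][j] += weight * power[i][j]
    return sim


# Hypothetical 4-node path network 0 - 1 - 2 - 3 (e.g. adjacent words).
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
sim = katz_similarity(adj)
for row in sim:
    print(["%.4f" % value for value in row])
```

In this toy network the similarity of node 0 to nodes 1, 2 and 3 decreases with distance, which is the behaviour the index is designed to capture.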
Abstract:
This study examined whether there is an association between surface electromyography (EMG) of masticatory muscles, orofacial myofunctional status and temporomandibular disorder (TMD) severity scores. Forty-two women with TMD (mean 30 years, SD 8) and 18 healthy women (mean 26 years, SD 6) were examined. According to the Research Diagnostic Criteria for TMD (RDC/TMD), all patients had myogenous disorders plus disk displacements with reduction. Surface EMG of masseter and temporal muscles was performed during maximum teeth clenching either on cotton rolls or in intercuspal position. Standardized EMG indices were obtained. Validated protocols were used to determine the perceived severity of TMD and to assess orofacial myofunctional status. TMD patients showed more asymmetry between right and left muscle pairs, and more unbalanced contractile activities of contralateral masseter and temporal muscles (p < 0.05, t-test), worse orofacial myofunctional status and higher TMD severity scores (p < 0.05, Mann-Whitney test) than healthy subjects. The Spearman coefficient revealed significant correlations between EMG indices, orofacial myofunctional status and TMD severity (p < 0.05). In conclusion, these methods will provide useful information for TMD diagnosis and future therapeutic planning. (C) 2011 Elsevier Ltd. All rights reserved.
Abstract:
Effects of roads on wildlife and its habitat have been measured using metrics such as the nearest road distance, road density, and effective mesh size. In this work we introduce two new indices: (1) Integral Road Effect (IRE), which measures the summed effects of the points along a road at a fixed point in the forest; and (2) Average Value of the Infinitesimal Road Effect (AVIRE), which measures the average of the effects of roads at this point. IRE is formally defined as the line integral of a special function (the infinitesimal road effect) along the curves that model the roads, whereas AVIRE is the quotient of IRE by the length of the roads. Combining tools of ArcGIS software with a numerical algorithm, we calculated these and other road and habitat cover indices in a sample of points in a human-modified landscape in the Brazilian Atlantic Forest, where data on the abundance of two groups of small mammals (forest specialists and habitat generalists) were collected in the field. We then compared, through the Akaike Information Criterion (AIC), a set of candidate regression models to explain the variation in small mammal abundance, including models with our two new road indices (AVIRE and IRE), models with other road effect indices (nearest road distance, mesh size, and road density), and reference models (containing only habitat indices, or only the intercept without the effect of any variable). Compared to other road effect indices, AVIRE showed the best performance in explaining the abundance of forest specialist species, whereas the nearest road distance performed best for generalist species. AVIRE and habitat together were included in the best model for both small mammal groups; that is, higher abundance of specialist and generalist small mammals occurred where there is lower average road effect (less AVIRE) and more habitat. Moreover, unlike the other road effect indices (except mesh size), AVIRE was not significantly correlated with the habitat cover of specialists and generalists, which allows the effect of roads to be separated from the effect of habitat on small mammal communities. We suggest that the proposed indices and GIS procedures could also be useful to describe other spatial ecological phenomena, such as edge effect in habitat fragments. (C) 2012 Elsevier B.V. All rights reserved.
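The definitions of IRE (a line integral along the road curves) and AVIRE (IRE divided by total road length) can be sketched numerically as follows. This is only an illustration under stated assumptions: roads are modelled as polylines, the integral is approximated with a midpoint rule, and the exponential-decay kernel `exp_decay` with its `scale` parameter is a hypothetical stand-in, since the abstract does not specify the infinitesimal road effect function or the numerical algorithm the authors used.

```python
import math


def road_effect_indices(point, roads, kernel, steps_per_segment=100):
    """Approximate IRE and AVIRE at `point`.

    roads  : list of polylines, each a list of (x, y) vertices
    kernel : function of the distance between `point` and a road point;
             stands in for the infinitesimal road effect (assumed form)
    Returns (ire, avire), where avire = ire / total road length.
    """
    px, py = point
    ire = 0.0
    total_length = 0.0
    for road in roads:
        for (x0, y0), (x1, y1) in zip(road, road[1:]):
            seg_len = math.hypot(x1 - x0, y1 - y0)
            total_length += seg_len
            ds = seg_len / steps_per_segment
            # Midpoint rule along the segment.
            for k in range(steps_per_segment):
                t = (k + 0.5) / steps_per_segment
                qx = x0 + t * (x1 - x0)
                qy = y0 + t * (y1 - y0)
                ire += kernel(math.hypot(qx - px, qy - py)) * ds
    avire = ire / total_length if total_length > 0 else 0.0
    return ire, avire


def exp_decay(distance, scale=50.0):
    """Hypothetical kernel: road effect decays exponentially with distance."""
    return math.exp(-distance / scale)


# One straight 100-unit road; compare a point near it with a point far away.
roads = [[(0.0, 0.0), (100.0, 0.0)]]
ire_near, avire_near = road_effect_indices((50.0, 10.0), roads, exp_decay)
ire_far, avire_far = road_effect_indices((50.0, 200.0), roads, exp_decay)
print(ire_near, avire_near)
print(ire_far, avire_far)
```

With any decaying kernel, the point closer to the road receives the larger IRE and AVIRE, matching the interpretation of "less AVIRE" as a lower average road effect.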
Abstract:
Soil mapping has gained prominence in the scientific community because, as environmental awareness increases, it becomes necessary to understand more about the distribution of soil in the landscape, as well as its potential and its limitations for use. The main aim of this study was therefore to apply landscape-representing indices, using geoprocessing, to support the delimitation of different landscape compartments. Primary indices used were altitude above channel network (AACN) and secondary channel network base level (CNBL), multiresolution index of valley bottom flatness (MRVBF) and wetness index (ITW), with the Canguiri Experimental Farm, located in Pinhais, in the metropolitan region of Curitiba, as the object of study. To correlate the chemical and granulometric attributes of the sampling groups, totaling 17 points (Sugamosto, 2002), with the landscape indices, a simple linear (Pearson) correlation matrix was generated in the Statistica software. The conclusion is that the landscape-representing indices used in the grouping analysis were effective in supporting soil mapping at the suborder level of the Brazilian Soil Classification System.
Abstract:
Forensic age estimation is an important element of anthropological research, as it produces one of the primary sources of data that researchers use to establish the identity of a living person or of unknown bodily remains. The aim of this study was to determine whether the chronology of third molar mineralization could be an accurate indicator of estimated age in a Brazilian population sample. If so, mineralization could determine the probability of an individual being 18 years or older. The study evaluated 407 panoramic radiographs of males and females from the past 5 years in order to assess the mineralization status of the mandibular third molars. The evaluation was carried out using an adaptation of Demirjian's system. The results indicated a strong correlation between chronological age and the mineralization of the mandibular third molars. They also indicated that the modern Brazilian generation tends to show earlier mandibular third molar mineralization than older Brazilian generations and people of other nationalities. Males reached developmental stages slightly earlier than females, but statistically significant differences between the sexes were not found. The probability that an individual with third molar mineralization stage H had reached an age of 18 years or older was 96.8% and 98.6% for males and females, respectively. (C) 2011 Elsevier Ireland Ltd. All rights reserved.
Abstract:
The ability to discriminate nestmates from non-nestmates in insect societies is essential to protect colonies from conspecific invaders. The acceptance threshold hypothesis predicts that organisms whose recognition systems classify recipients without errors should optimize the balance between acceptance and rejection. In this process, cuticular hydrocarbons play an important role as cues of recognition in social insects. The aim of this study was to determine whether guards exhibit a restrictive level of rejection towards chemically distinct individuals and become more permissive during encounters with nestmate or non-nestmate individuals bearing chemically similar profiles. The study demonstrates that Melipona asilvai (Hymenoptera: Apidae: Meliponini) guards exhibit a flexible system of nestmate recognition according to the degree of chemical similarity between the incoming forager's cuticular hydrocarbon profile and the guard's own. Guards became less restrictive in their acceptance rates when they encountered non-nestmates with highly similar chemical profiles, which they probably mistook for nestmates, hence broadening their acceptance level.
Abstract:
Spatial data warehouses (SDWs) allow for spatial analysis together with analytical multidimensional queries over huge volumes of data. The challenge is to retrieve data related to ad hoc spatial query windows according to spatial predicates, avoiding the high cost of joining large tables. Therefore, mechanisms to provide efficient query processing over SDWs are essential. In this paper, we propose two efficient indices for SDWs: the SB-index and the HSB-index. The proposed indices share the following characteristics. They enable multidimensional queries with spatial predicates over SDWs and also support predefined spatial hierarchies. Furthermore, they compute the spatial predicate and transform it into a conventional one, which can be evaluated together with other conventional predicates by accessing a star-join Bitmap index. While the SB-index has a sequential data structure, the HSB-index uses a hierarchical data structure to enable clustering of spatial objects and a specialized buffer pool to decrease the number of disk accesses. The advantages of the SB-index and the HSB-index over the DBMS resources for SDW indexing (i.e. star-join computation and materialized views) were investigated through performance tests, which issued roll-up operations extended with containment and intersection range queries. The performance results showed that improvements ranged from 68% up to 99% over both the star-join computation and the materialized view. Furthermore, the proposed indices proved to be very compact, adding less than 1% to the storage requirements. Therefore, both the SB-index and the HSB-index are excellent choices for SDW indexing. Choosing between the SB-index and the HSB-index mainly depends on the query selectivity of spatial predicates. While low query selectivity benefits the HSB-index, the SB-index provides better performance for higher query selectivity.
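The idea of turning a spatial predicate into a conventional one and then combining it with other predicates through bitmaps can be illustrated with a toy sketch. Everything concrete here is assumed for the example: the `(key, bounding box)` entry layout, the use of Python integers as bitmaps, the hypothetical `city` and `year` attributes, and the omission of any refinement against exact geometries. It is not the authors' data structure, only a picture of the two-step evaluation the abstract describes.

```python
def intersects(mbr, window):
    """True if two axis-aligned rectangles (x0, y0, x1, y1) overlap."""
    ax0, ay0, ax1, ay1 = mbr
    bx0, by0, bx1, by1 = window
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1


def spatial_keys(entries, window):
    """Sequentially scan (key, mbr) entries and keep keys that satisfy the
    spatial predicate. The result is a conventional 'key IN (...)' list."""
    return [key for key, mbr in entries if intersects(mbr, window)]


def build_bitmaps(fact_keys):
    """One bitmap per key value; bit r is set when fact row r has that key."""
    bitmaps = {}
    for row, key in enumerate(fact_keys):
        bitmaps[key] = bitmaps.get(key, 0) | (1 << row)
    return bitmaps


def rows_matching(bitmaps, keys, other_bitmap):
    """OR the bitmaps of the selected keys, then AND with another predicate."""
    spatial_bitmap = 0
    for key in keys:
        spatial_bitmap |= bitmaps.get(key, 0)
    combined = spatial_bitmap & other_bitmap
    return [row for row in range(combined.bit_length()) if (combined >> row) & 1]


# Hypothetical spatial dimension: three cities with bounding boxes.
entries = [(1, (0, 0, 2, 2)), (2, (5, 5, 7, 7)), (3, (1, 1, 3, 3))]
# Hypothetical fact table: city key of each of six rows.
fact_city_keys = [1, 2, 3, 1, 2, 3]
city_bitmaps = build_bitmaps(fact_city_keys)
# Hypothetical conventional predicate (e.g. year = 2010) true for rows 0..3.
year_bitmap = 0b001111

keys = spatial_keys(entries, (1.5, 1.5, 4, 4))
rows = rows_matching(city_bitmaps, keys, year_bitmap)
print(keys, rows)
```

The fact table is never joined with the spatial dimension; only the small list of qualifying keys crosses from the spatial step to the bitmap step.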
Abstract:
HTLV-1 is endemic in Brazil and HIV/HTLV-1 coinfection has been detected, mostly in the northeast region. Cosmopolitan HTLV-1a is the main subtype that circulates in Brazil. This study characterized 17 HTLV-1 isolates from HIV-coinfected patients of southern (n = 7) and southeastern (n = 10) Brazil. HTLV-1 provirus DNA was amplified by nested PCR (env and LTR) and sequenced. Env sequences (705 bp) from 15 isolates and LTR sequences (731 bp) from 17 isolates showed 99.5% and 98.8% similarity among sequences, respectively. Comparing these sequences with the ATK (HTLV-1a) and Mel5 (HTLV-1c) prototypes, we detected similarities of 99% (env) and 97.4% (LTR) with ATK, and of 91.6% (env) and 90.3% (LTR) with Mel5. Phylogenetic analysis showed that all sequences belonged to the transcontinental subgroup A of the Cosmopolitan subtype, clustering in two Latin American clusters.
Abstract:
Background: Decreased heart rate variability (HRV) is related to higher morbidity and mortality. In this study we evaluated the linear and nonlinear indices of HRV in stable angina patients submitted to coronary angiography. Methods: We studied 77 unselected patients for elective coronary angiography, who were divided into two groups: coronary artery disease (CAD) and non-CAD. For analysis of HRV indices, HRV was recorded beat by beat with the volunteers in the supine position for 40 minutes. We analyzed the linear indices in the time domain (SDNN [standard deviation of normal-to-normal intervals], NN50 [total number of adjacent RR intervals differing in duration by more than 50 ms] and RMSSD [root mean square of successive differences]) and in the frequency domain (ultra-low frequency [ULF], ≤ 0.003 Hz; very low frequency [VLF], 0.003–0.04 Hz; low frequency [LF], 0.04–0.15 Hz; and high frequency [HF], 0.15–0.40 Hz), as well as the ratio between the LF and HF components (LF/HF). For the nonlinear indices we evaluated SD1, SD2, SD1/SD2, approximate entropy (−ApEn), α1, α2, Lyapunov exponent, Hurst exponent, autocorrelation and correlation dimension. The cutoff point of each variable for the predictive tests was obtained from the receiver operating characteristic (ROC) curve. The area under the ROC curve was calculated by the extended trapezoidal rule, and areas under the curve ≥ 0.650 were considered relevant. Results: Coronary artery disease patients presented reduced values of SDNN, RMSSD, NN50, HF, SD1, SD2 and −ApEn. HF ≤ 66 ms², RMSSD ≤ 23.9 ms, ApEn ≤ −0.296 and NN50 ≤ 16 presented the best discriminatory power for the presence of significant coronary obstruction. Conclusion: We suggest the use of heart rate variability analysis in the linear and nonlinear domains for prognostic purposes in patients with stable angina pectoris, in view of their overall impairment.
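For readers who want to see how the time-domain and Poincaré indices named in this abstract are usually obtained, here is a minimal sketch that computes SDNN, RMSSD, NN50, SD1 and SD2 from a list of RR intervals in milliseconds. It uses common textbook definitions (sample standard deviations, and SD1 and SD2 derived from SDNN and the standard deviation of successive differences); the abstract does not state which exact conventions or software the authors used, and the RR series below is made up for illustration.

```python
import math
import statistics


def hrv_indices(rr_ms):
    """Compute basic HRV indices from RR intervals given in milliseconds."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]  # successive differences
    sdnn = statistics.stdev(rr_ms)                      # SD of all intervals
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    nn50 = sum(1 for d in diffs if abs(d) > 50)         # pairs differing > 50 ms
    sdsd = statistics.stdev(diffs)                      # SD of the differences
    # Poincare plot descriptors expressed through SDNN and SDSD.
    sd1 = math.sqrt(0.5) * sdsd
    sd2 = math.sqrt(max(2 * sdnn ** 2 - 0.5 * sdsd ** 2, 0.0))
    return {"SDNN": sdnn, "RMSSD": rmssd, "NN50": nn50, "SD1": sd1, "SD2": sd2}


# Made-up RR series (ms); real recordings would hold thousands of beats.
rr = [800, 860, 790, 810, 805, 900]
indices = hrv_indices(rr)
for name, value in indices.items():
    print(name, value)
```

Lower values of these indices correspond to reduced variability, which is the pattern the abstract reports for the CAD group.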
Abstract:
In this paper, we present a novel approach to perform similarity queries over medical images, maintaining the semantics of a given query posed by the user. Content-based image retrieval systems relying on relevance feedback techniques usually request the users to label relevant/irrelevant images. Thus, we present a highly effective strategy to survey user profiles, taking advantage of such labeling to implicitly gather the user's perceptual similarity. The profiles maintain the settings desired for each user, allowing tuning of the similarity assessment, which encompasses the dynamic change of the distance function employed through an interactive process. Experiments on medical images show that the method is effective and can improve the decision-making process during analysis.