989 results for Text similarity measures
Abstract:
telligence applications for the banking industry. Searches were performed in relevant journals, resulting in 219 articles published between 2002 and 2013. To analyze such a large number of manuscripts, text mining techniques were used in pursuit of relevant terms in both the business intelligence and banking domains. Moreover, latent Dirichlet allocation modeling was used in order to group articles into several relevant topics. The analysis was conducted using a dictionary of terms belonging to both the banking and business intelligence domains. This procedure allowed for the identification of relationships between terms and the topics grouping articles, enabling hypotheses to emerge regarding research directions. To confirm such hypotheses, relevant articles were collected and scrutinized, allowing the text mining procedure to be validated. The results show that credit in banking is clearly the main application trend, particularly predicting risk and thus supporting credit approval or denial. There is also a relevant interest in bankruptcy and fraud prediction. Customer retention seems to be associated, although weakly, with targeting, justifying bank offers to reduce churn. In addition, a large number of articles focused more on business intelligence techniques and their applications, using the banking industry just for evaluation, and thus not clearly claiming benefits for the banking business. By identifying these current research topics, this study also highlights opportunities for future research.
Abstract:
Many texture measures have been developed and used for improving land-cover classification accuracy, but rarely has research examined the role of textures in improving the performance of aboveground biomass estimations. The relationship between texture and biomass is poorly understood. This paper used Landsat Thematic Mapper (TM) data to explore relationships between TM image textures and aboveground biomass in Rondônia, Brazilian Amazon. Eight grey level co-occurrence matrix (GLCM) based texture measures (i.e., mean, variance, homogeneity, contrast, dissimilarity, entropy, second moment, and correlation), associated with seven different window sizes (5x5, 7x7, 9x9, 11x11, 15x15, 19x19, and 25x25), and five TM bands (TM 2, 3, 4, 5, and 7) were analyzed. Pearson's correlation coefficient was used to analyze texture and biomass relationships. This research indicates that most textures are weakly correlated with successional vegetation biomass, but some textures are significantly correlated with mature forest biomass. In contrast, TM spectral signatures are significantly correlated with successional vegetation biomass, but weakly correlated with mature forest biomass. Our findings imply that textures may be critical in improving mature forest biomass estimation, but relatively less important for successional vegetation biomass estimation.
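As an illustration of the GLCM-based texture measures named in this abstract, here is a minimal sketch in plain Python. It assumes a symmetric, normalised co-occurrence matrix built from a single pixel offset over a small array of integer grey levels; the function names `glcm` and `texture_measures`, the one-pixel horizontal offset, and the particular homogeneity formula (inverse difference moment) are illustrative choices, not details taken from the study. The study computed textures inside moving windows over TM bands, whereas this sketch treats the whole input array as one window.

```python
import math
from collections import Counter


def glcm(image, dx=1, dy=0):
    """Return a symmetric, normalised GLCM as a dict {(i, j): probability}.

    `image` is a list of rows of integer grey levels; (dx, dy) is the
    pixel offset between the two members of each co-occurring pair.
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[(a, b)] += 1
                counts[(b, a)] += 1  # count both directions (symmetric GLCM)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def texture_measures(p):
    """Compute a few common texture measures from a normalised GLCM."""
    return {
        "contrast": sum(v * (i - j) ** 2 for (i, j), v in p.items()),
        "dissimilarity": sum(v * abs(i - j) for (i, j), v in p.items()),
        # Inverse difference moment, one common definition of homogeneity.
        "homogeneity": sum(v / (1 + (i - j) ** 2) for (i, j), v in p.items()),
        "entropy": -sum(v * math.log(v) for v in p.values()),
        "second_moment": sum(v * v for v in p.values()),
    }
```

Calling `texture_measures(glcm(window))` on each moving window would yield one texture value per window position, which could then be correlated with field biomass measurements.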
Abstract:
We investigate palm species distribution, richness and abundance along the Mokoti, a seasonally-dry river of southeastern Amazon and compare it to the patterns observed at a large scale, comprising the entire Brazilian territory. A total of 694 palms belonging to 10 species were sampled at the Mokoti River basin. Although the species showed diverse distribution patterns, we found that local palm abundance, richness and tree basal area were significantly higher from the hills to the bottomlands of the study region, revealing a positive association of these measures with moisture. The analyses at the larger spatial scale also showed a strong influence of vapor pressure (a measure of moisture content of the air, in turn modulated by temperature) and seasonality in temperature: the richest regions were those where temperature and humidity were simultaneously high, and which also presented a lower degree of seasonality in temperature. These results indicate that the distribution of palms seems to be strongly associated with climatic variables, supporting the idea that, by 'putting all the eggs in one basket' (a consequence of survival depending on the preservation of a single irreplaceable bud), palms have become vulnerable to extreme environmental conditions. Hence, their distribution is concentrated in those tropical and sub-tropical regions with constant conditions of (mild to high) temperature and moisture all year round.
Abstract:
Transcriptional Regulatory Networks (TRNs) are a powerful tool for representing several interactions that occur within a cell. Recent studies have provided information to help researchers in the tasks of building and understanding these networks. One of the major sources of information for building TRNs is the biomedical literature. However, due to the rapidly increasing number of scientific papers, it is quite difficult to analyse the large number of papers that have been published on this subject. This fact has heightened the importance of Biomedical Text Mining approaches in this task. Also, owing to the lack of adequate standards, as the number of databases increases, inconsistencies concerning gene and protein names and identifiers are common. In this work, we developed an integrated approach for the reconstruction of TRNs that retrieves the relevant information from important biological databases and inserts it into a single repository, named KREN. We also applied text mining techniques over this integrated repository to build TRNs. To do so, it was necessary to create a dictionary of names and synonyms associated with these entities and to develop an approach that retrieves all the abstracts of the related scientific papers stored in PubMed, in order to create a corpus of data about genes. Furthermore, these tasks were integrated into @Note, a software system that provides several methods from the Biomedical Text Mining field, including algorithms for Named Entity Recognition (NER), extraction of all relevant terms from publication abstracts, and extraction of relationships between biological entities (genes, proteins and transcription factors). Finally, we extended this tool to allow the reconstruction of Transcriptional Regulatory Networks from the scientific literature.
Abstract:
A preliminary survey of the spider fauna in natural and artificial forest gap formations at Porto Urucu, a petroleum/natural gas production facility in the Urucu river basin, Coari, Amazonas, Brazil is presented. Sampling was conducted both occasionally and using a protocol composed of a suite of techniques: beating trays (32 samples), nocturnal manual samplings (48), sweeping nets (16), Winkler extractors (24), and pitfall traps (120). A total of 4201 spiders, belonging to 43 families and 393 morphospecies, were collected during the dry season, in July 2003. Excluding the occasional samples, the observed richness was 357 species. In a performance test of seven species richness estimators, the Incidence-based Coverage Estimator (ICE) was the best-fit estimator, with 639 estimated species. To evaluate differences in species richness associated with natural and artificial gaps, samples taken from the center of the gaps up to 300 meters inside the adjacent forest matrix were compared by inspecting the confidence intervals of individual-based rarefaction curves for each treatment. The observed species richness was significantly higher in natural gaps combined with adjacent forest than in artificial gaps combined with adjacent forest. Moreover, a community similarity analysis of the fauna collected under both treatments demonstrated that there were considerable differences in species composition. The significantly higher abundance of Lycosidae in artificial gap forest is explained by the presence of herbaceous vegetation in the gaps themselves. Ctenidae was significantly more abundant in the natural gap forest, probably due to the increased shelter availability provided by the fallen trees in the gaps themselves. Both families are identified as potential indicators of environmental change related to the establishment or recovery of artificial gaps in the study area.
Abstract:
The high tree diversity and vast extent of Amazonian forests challenge our understanding of how tree species abundance and composition vary across this region. Information about these parameters, usually obtained from tree inventory plots, is essential for revealing patterns of tree diversity. Numerous tree inventory plots have been established in Amazonia, yet the tree species composition and diversity of white-sand and terra-firme forests of the upper Rio Negro remain poorly understood. Here, we present data from eight new one-hectare tree inventory plots established in the upper Rio Negro, four of which were located in white-sand forests and four in terra-firme forests. Overall, we registered 4703 trees with a diameter at breast height > 10 cm. These trees belong to 49 families, 215 genera, and 603 species. We found that tree communities of terra-firme and white-sand forests in the upper Rio Negro differ significantly from each other in their species composition. Tree communities of white-sand forests show a higher floristic similarity and lower diversity than those of terra-firme forests. We argue that the mechanisms driving differences between tree communities of white-sand and terra-firme forests are related to habitat size, which ultimately influences large-scale and long-term evolutionary processes.
Abstract:
Lakes play an important role in biogeochemical, ecological and hydrological processes in the river-floodplain system. The aim of this study was to evaluate the dynamics of the limnological conditions of Catalão Lake, an Amazon floodplain lake. To this end, some of the main limnological variables (O2, temperature, pH, nutrients, electrical conductivity) of Catalão Lake were analyzed at temporal and spatial scales. The study was conducted between November 2004 and August 2005. Sampling excursions were carried out every three months, one for each of the four hydrological periods (low water, rising water, high water and falling water). Sampling points were chosen so as to obtain a gradient of distance from the Negro River. Limnological profiles in Catalão Lake showed generally acidic to slightly alkaline water, with low levels of dissolved oxygen and low concentrations of soluble reactive phosphorus. The Negro River seems to exert the main influence during the rising water period, while the Solimões River is the principal controlling river during peak water. The Principal Component Analysis (PCA) grouped the seasonal collections by hydrological period, showing the formation of a north-south spatial gradient within the lake in relation to the limnological variables. Multivariate dispersion analysis based on the distance-to-centroid method demonstrated an increase in similarity over the course of the hydrological cycle, as the lake was inundated in response to the flood pulse of the main river channels. However, the greatest spatial homogeneity in the lake was observed in the epilimnion layer during the falling water period. The daily analysis of variation indicated an oligomictic pattern during the years in which the lake was permanently connected to the Negro River. Although Catalão Lake receives large quantities of both black water from the Negro River and sediment-filled water from the Solimões River, the physical and chemical characteristics of the lake are more similar to those of the Solimões (várzea lake) than the Negro (blackwater lake).
Abstract:
Phlebotomine sand flies are insects of medical importance. Species in the Neotropical region are highly diverse. Some of these species are considered cryptic because the morphological similarity between adult females of different species makes identification especially difficult. The aim of this study was to analyze and describe the armature in the genital atrium (AGA) of some adult female sand flies, in order to discover new taxonomic characters that make it possible to distinguish between species that would otherwise be treated as cryptic. The AGA of 16 phlebotomine sand fly species are described. Distinct differences were found in the shape and size of the armature, the presence or absence of spines on the armature, and the shape, size, and grouping patterns of the spines. These characters made it possible to distinguish between the species studied.
Abstract:
Field collection and herbaria data have not allowed the diversity of aquatic plants from Northern Brazil to be quantified, so biogeographic patterns could not be detected. Therefore, our objectives were to identify and quantify the aquatic macrophytes of the North Brazilian states by analyzing herbaria data platforms (SpeciesLink and Flora do Brasil). The checklist was produced from a bibliographic search (articles published between 1980 and 2000), herbaria collections of the platforms SpeciesLink and Flora do Brasil, and field expeditions, where we utilized asystematic sampling. We also analyzed the floristic similarity of aquatic macrophytes among Northern Brazil, wetlands of distinct Brazilian regions and the Neotropics. We recorded 539 species, of which 48 are endemic to Brazil. The states with the highest number of species were Amazonas and Pará, regardless of platform. The most represented families were Poaceae (89 species), Podostemaceae (55), Cyperaceae (50) and Fabaceae (47). We highlight the unprecedented richness of Podostemaceae, due to our own field collection efforts in favorable habitats, with 25 species being endemic. Emergent and/or amphibious plants (515) were dominant in total species richness and were best represented in lotic habitats. We found significant differences in richness and floristics among states, as obtained from the platforms. There is floristic similarity between the Northern states and other Brazilian wetlands. In conclusion, we observed a rich aquatic flora in Northern Brazil, in spite of the scarcity of records for Acre, Rondônia and Tocantins; we highlight the unprecedented number of endemic species of Podostemaceae (25) and the contrasting richness between SpeciesLink and Flora do Brasil.
Abstract:
Given the limitations of different types of remote sensing images, automated land-cover classifications of the Amazon várzea may yield poor accuracy indexes. One way to improve accuracy is through the combination of images from different sensors, by either image fusion or multi-sensor classifications. Therefore, the objective of this study was to determine which classification method is more efficient in improving land cover classification accuracies for the Amazon várzea and similar wetland environments - (a) synthetically fused optical and SAR images or (b) multi-sensor classification of paired SAR and optical images. Land cover classifications based on images from a single sensor (Landsat TM or Radarsat-2) are compared with multi-sensor and image fusion classifications. Object-based image analyses (OBIA) and the J.48 data-mining algorithm were used for automated classification, and classification accuracies were assessed using the kappa index of agreement and the recently proposed allocation and quantity disagreement measures. Overall, optical-based classifications had better accuracy than SAR-based classifications. Once both datasets were combined using the multi-sensor approach, there was a 2% decrease in allocation disagreement, as the method was able to overcome part of the limitations present in both images. Accuracy decreased when image fusion methods were used, however. We therefore concluded that the multi-sensor classification method is more appropriate for classifying land cover in the Amazon várzea.
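The accuracy measures named in this abstract can be sketched from a confusion matrix as follows. This is a minimal illustration, assuming the matrix holds raw sample counts with mapped classes in rows and reference classes in columns; the function name `accuracy_measures` is hypothetical, and the study itself may have used area-weighted proportions rather than raw counts. Quantity and allocation disagreement are computed so that they sum to the total disagreement (one minus overall agreement).

```python
def accuracy_measures(matrix):
    """Return kappa, quantity disagreement and allocation disagreement.

    matrix[i][j] is the number of samples mapped as class i whose
    reference class is j (an assumed layout for this sketch).
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    p = [[cell / n for cell in row] for row in matrix]

    row_tot = [sum(p[i]) for i in range(k)]                       # mapped proportions
    col_tot = [sum(p[i][j] for i in range(k)) for j in range(k)]  # reference proportions

    observed = sum(p[i][i] for i in range(k))
    expected = sum(row_tot[i] * col_tot[i] for i in range(k))
    # Undefined (division by zero) when chance agreement equals 1.
    kappa = (observed - expected) / (1 - expected)

    # Quantity disagreement: mismatch in class proportions.
    quantity = sum(abs(row_tot[i] - col_tot[i]) for i in range(k)) / 2
    # Allocation disagreement: mismatch in spatial assignment of classes.
    allocation = sum(
        min(row_tot[i] - p[i][i], col_tot[i] - p[i][i]) for i in range(k)
    )
    return {"kappa": kappa, "quantity": quantity, "allocation": allocation}
```

Comparing the `allocation` value of a single-sensor classification with that of a multi-sensor classification is the kind of comparison the abstract reports.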
Abstract:
The analysis of changes in species composition and vegetation structure in chronosequences improves knowledge of the regeneration patterns following land abandonment in the Amazon. Here, the objective was to perform a floristic-structural analysis in mature forests (with/without timber exploitation) and secondary successions (initial, intermediate and advanced vegetation regrowth) in the Tapajós region. The regrowth age and plot locations were determined using Landsat-5/Thematic Mapper images (1984-2012). For the floristic analysis, we determined the sample sufficiency and the Shannon-Weaver (H'), Pielou evenness (J), Value of Importance (VI) and Fisher's alpha (α) indices. We applied Non-metric Multidimensional Scaling (NMDS) for similarity ordination. For the structural analysis, the diameter at breast height (DBH), total tree height (Ht), basal area (BA) and aboveground biomass (AGB) were obtained. We inspected the differences in floristic-structural attributes using Tukey and Kolmogorov-Smirnov tests. The results showed an increase in the H', J and α indices from initial regrowth to mature forests of the order of 47%, 33% and 91%, respectively. The advanced regrowth had more species in common with the intermediate stage than with the mature forest. Statistically significant differences between initial and intermediate stages (p<0.05) were observed for DBH, BA and Ht. The recovery of carbon stocks showed an AGB variation from 14.97 t ha⁻¹ (initial regrowth) to 321.47 t ha⁻¹ (mature forests). In addition to AGB, Ht was also important for discriminating the typologies.
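Two of the diversity indices named in this abstract, Shannon-Weaver (H') and Pielou evenness (J), can be sketched in a few lines. This is a minimal illustration assuming per-species abundance counts and the natural logarithm; the function names are hypothetical, and the abstract does not state which logarithm base the authors used.

```python
import math


def shannon_index(abundances):
    """Shannon-Weaver diversity H' = -sum(p_i * ln p_i) over observed species."""
    total = sum(abundances)
    return -sum(
        (n / total) * math.log(n / total) for n in abundances if n > 0
    )


def pielou_evenness(abundances):
    """Pielou evenness J = H' / ln(S), where S is the number of observed species.

    Returns 0.0 when fewer than two species are present, a convention
    chosen for this sketch because J is undefined in that case.
    """
    species = sum(1 for n in abundances if n > 0)
    if species < 2:
        return 0.0
    return shannon_index(abundances) / math.log(species)
```

A plot in which every species is equally abundant gives J = 1, while a plot dominated by a few species gives a lower J.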
Abstract:
Integrated master's dissertation in Information Systems Engineering and Management
Abstract:
OBJECTIVE: To describe suicide attempts assisted in an emergency room (ER) and acute substance consumption or dependence in these individuals. METHODS: A descriptive epidemiologic study was carried out over one year, evaluating suicide attempts assisted at the Embu das Artes ER, São Paulo, Brazil. Patients were scheduled for an unstructured psychiatric interview. The main outcome measures were: sociodemographic data, suicide attempt method, acute drug or alcohol use in the six hours prior to the attempt, and ICD-10 diagnosis of substance dependence. Descriptive analyses and the chi-square test (p < 0.05) were used to verify associations between the variables studied. RESULTS: The sample comprised 80 patients, with a mean age of 26.9 years (SD = 8.91), predominantly female (72.5%), and 21.2% adolescents. Most suicide attempts were made through medicine ingestion (62.5%). Approximately 21.2% and 7.5% reported having used alcohol and an illicit drug, respectively, within 6 hours prior to the attempt, and 10% were found to be substance dependent. All substance dependents had attempted suicide previously (p-value = 0.4). There was a significant association between suicide attempt through medicine ingestion and psychiatric treatment history (p = 0.02). CONCLUSION: More national studies are necessary to consider the role of alcohol and drugs in suicide attempts assisted in the ER, especially in chemical dependents, whose suicidal behavior is relevant.
Abstract:
Objective: To evaluate body image dissatisfaction and its relationship with physical activity and body mass index in a Brazilian sample of adolescents. Methods: A total of 275 adolescents (139 boys and 136 girls) between the ages of 14 and 18 years completed measures of body image dissatisfaction through the Contour Drawing Scale and of current physical activity through the International Physical Activity Questionnaire. Weight and height were also measured for subsequent calculation of body mass index. Results: Boys and girls differed significantly regarding body image dissatisfaction, with girls reporting higher levels of dissatisfaction. Underweight and eutrophic boys preferred to be heavier, while overweight boys preferred to be thinner; in contrast, girls desired to be thinner even when they were of normal weight. Conclusion: Body image dissatisfaction was strictly related to body mass index, but not to physical activity.