15 results for Latent Semantic Indexing
in the Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
Even though the digital processing of documents is increasingly widespread in industry, printed documents are still largely in use. In order to electronically process the contents of printed documents, information must be extracted from digital images of documents. When dealing with complex documents, in which the contents of different regions and fields can be highly heterogeneous with respect to layout, printing quality and the utilization of fonts and typing standards, the reconstruction of the contents of documents from digital images can be a difficult problem. In this article we present an efficient solution for this problem, in which the semantic contents of fields in a complex document are extracted from a digital image.
Abstract:
Increasing public interest in science information in a digital and 2.0 science era promotes a dramatic, rapid and deep change in science itself. The emergence and expansion of new technologies and internet-based tools is leading to new means to improve scientific methodology and communication, assessment, promotion and certification. It allows methods of acquisition, manipulation and storage, generating vast quantities of data that can further facilitate the research process. It also improves access to scientific results through information sharing and discussion. Content previously restricted only to specialists is now available to a wider audience. This context requires new management systems to make scientific knowledge more accessible and useable, including new measures to evaluate the reach of scientific information. The new science and research quality measures are strongly related to the new online technologies and services based on social media. Tools such as blogs, social bookmarks and online reference managers, Twitter and others offer alternative, transparent and more comprehensive information about the active interest, usage and reach of scientific publications. Another of these new filters is the Research Blogging platform, which was created in 2007 and now has over 1,230 active blogs, with over 26,960 entries posted about peer-reviewed research on subjects ranging from Anthropology to Zoology. This study takes a closer look at RB, in order to get insights into its contribution to the rapidly changing landscape of scientific communication.
Abstract:
The Neotropical evaniid genus Evaniscus Szepligeti currently includes six species. Two new species are described, Evaniscus lansdownei Mullins, sp. n. from Colombia and Brazil and E. rafaeli Kawada, sp. n. from Brazil. Evaniscus sulcigenis Roman, syn. n., is synonymized under E. rufithorax Enderlein. An identification key to species of Evaniscus is provided. Thirty-five parsimony informative morphological characters are analyzed for six ingroup and four outgroup taxa. A topology resulting in a monophyletic Evaniscus is presented with E. tibialis and E. rafaeli as sister to the remaining Evaniscus species. The Hymenoptera Anatomy Ontology and other relevant biomedical ontologies are employed to create semantic phenotype statements in Entity-Quality (EQ) format for species descriptions. This approach is an early effort to formalize species descriptions and to make descriptive data available to other domains.
Abstract:
In this paper, a new family of survival distributions is presented. It is derived by considering that the latent number of failure causes follows a Poisson distribution and the time for these causes to be activated follows an exponential distribution. Three different activation schemes are also considered. Moreover, we propose the inclusion of covariates in the model formulation in order to study their effect on the expected value of the number of causes and on the failure rate function. An inferential procedure based on the maximum likelihood method is discussed and evaluated via simulation. The developed methodology is illustrated on a real data set on ovarian cancer.
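The construction in this abstract can be illustrated with a minimal sketch. It assumes the first-activation scheme (failure happens when the earliest latent cause is activated), which is only one of the schemes the abstract alludes to. The parameter names `theta` (Poisson mean) and `lam` (exponential rate) and all function names are hypothetical and not taken from the paper. Under these assumptions the population survival function is S(t) = exp(-theta * (1 - exp(-lam * t))), which levels off at exp(-theta), the probability that no latent cause is present.

```python
import math
import random


def survival_first_activation(t, theta, lam):
    """Survival S(t) when N ~ Poisson(theta) latent causes each activate after an
    Exponential(lam) time and the first activation causes failure (assumed scheme)."""
    return math.exp(-theta * (1.0 - math.exp(-lam * t)))


def sample_poisson(theta, rng):
    """Draw from Poisson(theta) by counting unit-rate exponential arrivals in [0, theta]."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed <= theta:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def simulate_failure_time(theta, lam, rng):
    """Return one failure time, or math.inf when no latent cause is present."""
    n_causes = sample_poisson(theta, rng)
    if n_causes == 0:
        return math.inf
    return min(rng.expovariate(lam) for _ in range(n_causes))


def empirical_survival(t, theta, lam, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(T > t) under the same assumptions."""
    rng = random.Random(seed)
    alive = sum(simulate_failure_time(theta, lam, rng) > t for _ in range(n_draws))
    return alive / n_draws
```

Comparing `empirical_survival(t, theta, lam)` with `survival_first_activation(t, theta, lam)` for a few values of `t` is a quick sanity check that the closed form matches the latent-cause mechanism. The other activation schemes and the covariate effects mentioned in the abstract are not covered by this sketch.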
Abstract:
In this paper, we propose a new three-parameter long-term lifetime distribution induced by a latent complementary risk framework with decreasing, increasing and unimodal hazard functions: the long-term complementary exponential geometric distribution. The new distribution arises from latent competing risk scenarios, where the lifetime associated with a particular risk is not observable; rather, we observe only the maximum lifetime value among all risks, together with the presence of long-term survival. The properties of the proposed distribution are discussed, including its probability density function and explicit algebraic formulas for its reliability, hazard and quantile functions and order statistics. The parameter estimation is based on the usual maximum-likelihood approach. A simulation study assesses the performance of the estimation procedure. We compare the new distribution with its particular cases, as well as with the long-term Weibull distribution on three real data sets, observing its potential and competitiveness in comparison with some usual long-term lifetime distributions.
Abstract:
PTFE foils were irradiated with different ion beams (Xe, Au and U) with energies up to 1.5 GeV and fluences between 1 × 10⁸ and 1 × 10¹³ ions/cm² at room temperature. The induced modifications in the polymer were analyzed by FTIR, UV-Vis spectroscopy, and XRD. In the FTIR spectra, CF2 degradation accompanied by the formation of CF3 terminal and side groups was observed. In the UV-Vis spectra, the observed increase in the absorption at UV wavelengths is an indication of polymer carbonization. From XRD, the amorphization of the material was evidenced by the decrease in the intensity of the main diffraction peak. An exponential fit of the intensity of the IR absorption peaks resulted in the following values: 2.9 ± 0.8, 4.5 ± 0.9 and 5.6 ± 0.8 nm for the latent track radius after irradiation with Xe, Au and U beams, respectively. © 2011 Elsevier B.V. All rights reserved.
Abstract:
Background: Early progressive nonfluent aphasia (PNFA) may be difficult to differentiate from semantic dementia (SD) in a nonspecialist setting. There are descriptions of the clinical and neuropsychological profiles of patients with PNFA and SD but few systematic comparisons. Method: We compared the performance of groups with SD (n = 27) and PNFA (n = 16) with comparable ages, education, disease duration, and severity of dementia as measured by the Clinical Dementia Rating Scale on a comprehensive neuropsychological battery. Principal components analysis and intergroup comparisons were used. Results: A 5-factor solution accounted for 78.4% of the total variance with good separation of neuropsychological variables. As expected, both groups were anomic with preserved visuospatial function and mental speed. Patients with SD had lower scores on comprehension-based semantic tests and better performance on verbal working memory and phonological processing tasks. The opposite pattern was found in the PNFA group. Conclusions: Neuropsychological tests that examine verbal and nonverbal semantic associations, verbal working memory, and phonological processing are the most helpful for distinguishing between PNFA and SD.
Abstract:
Model diagnostics are an integral part of model determination, and residual analysis is an important part of model diagnostics. We adapt and implement residuals considered in the literature for the probit, logistic and skew-probit links under binary regression. New latent residuals for the skew-probit link are proposed here. We have detected the presence of outliers using the residuals proposed here for different models in a simulated dataset and a real medical dataset.
Abstract:
OBJECTIVE: To estimate the prevalences of tuberculosis and latent tuberculosis in inmates. METHODS: An observational study was carried out with inmates of a prison and a jail in the State of São Paulo, Southeastern Brazil, between March and December of 2008. Questionnaires were used to collect sociodemographic and epidemiological data. Tuberculin skin testing was administered (PPD-RT23-2TU/0.1 mL), and the following laboratory tests were also performed: sputum smear examination, sputum culture, identification of strains isolated and drug susceptibility testing. The variables were compared using Pearson's chi-square (χ²) association test, Fisher's exact test and the proportion test. RESULTS: Of the 2,435 inmates interviewed, 2,237 (91.9%) agreed to submit to tuberculin skin testing and of these, 73.0% had positive reactions. The prevalence of tuberculosis was 830.6 per 100,000 inmates. The coefficients of prevalence were 1,029.5/100,000 for inmates of the prison and 525.7/100,000 for inmates of the jail. The sociodemographic characteristics of the inmates in the two groups studied were similar; most of the inmates were young and single with little schooling. The epidemiological characteristics differed between the prison units, with the number of cases of previous tuberculosis and of previous contact with the disease greater in the prison and coughing, expectoration and smoking more common in the jail. Among the 20 Mycobacterium tuberculosis strains identified, 95.0% were sensitive to anti-tuberculosis drugs, and 5.0% were resistant to streptomycin. CONCLUSIONS: The prevalences of tuberculosis and latent tuberculosis were higher in the incarcerated population than in the general population, and they were also higher in the prison than in the jail.
Abstract:
With the increase in research on the components of Body Image, validated instruments are needed to evaluate its dimensions. The Body Change Inventory (BCI) assesses strategies used to alter body size among adolescents. The scope of this study was to describe the translation and evaluation for semantic equivalence of the BCI in the Portuguese language. The process involved the steps of (1) translation of the questionnaire to the Portuguese language; (2) back-translation to English; (3) evaluation of semantic equivalence; and (4) assessment of comprehension by professional experts and the target population. The six subscales of the instrument were translated into the Portuguese language. Language adaptations were made to render the instrument suitable for the Brazilian reality. The questions were interpreted as easily understandable by both experts and young people. The Body Change Inventory has been translated and adapted into Portuguese. Evaluation of the operational, measurement and functional equivalence is still needed.
Abstract:
We propose a new general Bayesian latent class model, based on a computationally intensive approach, for evaluating the performance of multiple diagnostic tests in situations in which no gold standard test exists. The modeling represents an interesting and suitable alternative to models with complex structures that involve the general case of several conditionally independent diagnostic tests, covariates, and strata with different disease prevalences. The technique of stratifying the population according to different disease prevalence rates does not add further marked complexity to the modeling, but it makes the model more flexible and interpretable. To illustrate the general model proposed, we evaluate the performance of six diagnostic screening tests for Chagas disease considering some epidemiological variables. Serology at the time of donation (negative, positive, inconclusive) was considered as a factor of stratification in the model. The general model with stratification of the population performed better than its competitors without stratification. The group formed by the testing laboratory Biomanguinhos FIOCRUZ-kit (c-ELISA and rec-ELISA) is the best option in the confirmation process, presenting a false-negative rate of 0.0002% from the serial scheme. We are 100% sure that the donor is healthy when these two tests have negative results and that he is chagasic when they have positive results.
Abstract:
Background: An important issue concerning the worldwide fight against stigma is the evaluation of psychiatrists’ beliefs and attitudes toward schizophrenia and mental illness in general. However, there is as yet no consensus on this matter in the literature, and results vary according to the stigma dimension assessed and to the cultural background of the sample. The aim of this investigation was to search for profiles of stigmatizing beliefs related to schizophrenia in a national sample of psychiatrists in Brazil. Methods: A sample of 1414 psychiatrists was recruited from among those attending the 2009 Brazilian Congress of Psychiatry. A questionnaire was applied in face-to-face interviews. The questionnaire addressed four stigma dimensions, all in reference to individuals with schizophrenia: stereotypes, restrictions, perceived prejudice and social distance. Stigma item scores were included in latent profile analyses; the resulting profiles were entered into multinomial logistic regression models with sociodemographics, in order to identify significant correlates. Results: Three profiles were identified. The “no stigma” subjects (n = 337) characterized individuals with schizophrenia in a positive light, disagreed with restrictions, and displayed a low level of social distance. The “unobtrusive stigma” subjects (n = 471) were significantly younger and displayed the lowest level of social distance, although most of them agreed with involuntary admission and demonstrated a high level of perceived prejudice. The “great stigma” subjects (n = 606) negatively stereotyped individuals with schizophrenia, agreed with restrictions and scored the highest on the perceived prejudice and social distance dimensions.
In comparison with the first two profiles, this last profile comprised a significantly larger number of individuals who were in frequent contact with a family member suffering from a psychiatric disorder, as well as comprising more individuals who had no such family member. Conclusions: Our study not only provides additional data related to an under-researched area but also reveals that psychiatrists are a heterogeneous group regarding stigma toward schizophrenia. The presence of different stigma profiles should be evaluated in further studies; this could enable anti-stigma initiatives to be specifically designed to effectively target the stigmatizing group.
Abstract:
There is evidence that the explicit lexical-semantic processing deficits which characterize aphasia may be observed in the absence of implicit semantic impairment. The aim of this article was to critically review the international literature on lexical-semantic processing in aphasia, as tested through the semantic priming paradigm. Specifically, this review focused on aphasia and lexical-semantic processing, the methodological strengths and weaknesses of the semantic paradigms used, and recent evidence from neuroimaging studies on lexical-semantic processing. Furthermore, evidence on dissociations between implicit and explicit lexical-semantic processing reported in the literature is discussed and interpreted by referring to functional neuroimaging evidence from healthy populations. There is evidence that semantic priming effects can be found both in fluent and in non-fluent aphasias, and that these effects are related to an extensive network which includes the temporal lobe, the pre-frontal cortex, the left frontal gyrus, the left temporal gyrus and the cingulate cortex.
Abstract:
Background: The study and analysis of gene expression measurements is the primary focus of functional genomics. Once expression data is available, biologists are faced with the task of extracting (new) knowledge associated with the underlying biological phenomenon. Most often, in order to perform this task, biologists execute a number of analysis activities on the available gene expression dataset rather than a single analysis activity. The integration of heterogeneous tools and data sources to create an integrated analysis environment represents a challenging and error-prone task. Semantic integration enables the assignment of unambiguous meanings to data shared among different applications in an integrated environment, allowing the exchange of data in a semantically consistent and meaningful way. This work aims at developing an ontology-based methodology for the semantic integration of gene expression analysis tools and data sources. The proposed methodology relies on software connectors to support not only the access to heterogeneous data sources but also the definition of transformation rules on exchanged data. Results: We have studied the different challenges involved in the integration of computer systems and the role software connectors play in this task. We have also studied a number of gene expression technologies, analysis tools and related ontologies in order to devise basic integration scenarios and propose a reference ontology for the gene expression domain. Then, we have defined a number of activities and associated guidelines to prescribe how the development of connectors should be carried out. Finally, we have applied the proposed methodology in the construction of three different integration scenarios involving the use of different tools for the analysis of different types of gene expression data.
Conclusions: The proposed methodology facilitates the development of connectors capable of semantically integrating different gene expression analysis tools and data sources. The methodology can be used in the development of connectors supporting both simple and nontrivial processing requirements, thus assuring accurate data exchange and information interpretation from exchanged data.
Abstract:
With the increasing production of information from e-government initiatives, there is also the need to transform a large volume of unstructured data into useful information for society. All this information should be easily accessible and made available in a meaningful and effective way in order to achieve semantic interoperability in electronic government services, which is a challenge to be pursued by governments around the world. Our aim is to discuss the context of e-Government Big Data and to present a framework to promote semantic interoperability through automatic generation of ontologies from unstructured information found on the Internet. We propose the use of fuzzy mechanisms to deal with natural language terms and present some related works found in this area. The results achieved in this study are based on the architectural definition, major components and requirements needed to compose the proposed framework. With this, it is possible to take advantage of the large volume of information generated from e-Government initiatives and use it to benefit society.