16 results for Machine Translation
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
The classification of texts has become a major endeavor with so much electronic material available, for it is an essential task in several applications, including search engines and information retrieval. There are different ways to define similarity for grouping similar texts into clusters, as the concept of similarity may depend on the purpose of the task. For instance, in topic extraction similar texts mean those within the same semantic field, whereas in author recognition stylistic features should be considered. In this study, we introduce ways to classify texts employing concepts of complex networks, which may be able to capture syntactic, semantic and even pragmatic features. The interplay between various metrics of the complex networks is analyzed with three applications, namely identification of machine translation (MT) systems, evaluation of quality of machine translated texts and authorship recognition. We shall show that topological features of the networks representing texts can enhance the ability to identify MT systems in particular cases. For evaluating the quality of MT texts, on the other hand, high correlation was obtained with methods capable of capturing the semantics. This was expected because the gold standards used are themselves based on word co-occurrence. Notwithstanding, the Katz similarity, which involves semantics and structure in the comparison of texts, achieved the highest correlation with the NIST measurement, indicating that in some cases the combination of both approaches can improve the ability to quantify quality in MT. In authorship recognition, again the topological features were relevant in some contexts, though for the books and authors analyzed good results were obtained with semantic features as well. Because hybrid approaches encompassing semantic and topological features have not been extensively used, we believe that the methodology proposed here may be useful to enhance text classification considerably, as it combines well-established strategies. (c) 2012 Elsevier B.V. All rights reserved.
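As an illustration of what "representing a text as a complex network" can mean, the sketch below builds a simple word-adjacency network and computes two topological measures, degree and clustering coefficient. It is a minimal sketch under stated assumptions: the abstract does not specify how the networks were built or which metrics were used, so the adjacency rule, the tokenization and the function names here are all hypothetical.

```python
from collections import defaultdict


def build_adjacency_network(text):
    """Assumed construction: link each word to the word that follows it (undirected)."""
    words = [w.strip(".,;:!?()\"'").lower() for w in text.split()]
    words = [w for w in words if w]
    neighbors = defaultdict(set)
    for left, right in zip(words, words[1:]):
        if left != right:  # skip self-loops from repeated words
            neighbors[left].add(right)
            neighbors[right].add(left)
    return dict(neighbors)


def degree(network, word):
    """Number of distinct words that appear next to `word`."""
    return len(network.get(word, ()))


def clustering_coefficient(network, word):
    """Fraction of pairs of neighbors of `word` that are themselves linked."""
    nbrs = list(network.get(word, ()))
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in network[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


sample = "the cat saw the dog and the dog saw the cat"
net = build_adjacency_network(sample)
print(degree(net, "the"))                  # 4
print(clustering_coefficient(net, "the"))  # 0.5
```

Vectors of such per-word measurements, aggregated over a text, are one plausible way to feed topological features into a classifier alongside word-frequency (semantic) features.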
Abstract:
The realization that statistical physics methods can be applied to analyze written texts represented as complex networks has led to several developments in natural language processing, including automatic summarization and evaluation of machine translation. Most importantly, so far only a few metrics of complex networks have been used and therefore there is ample opportunity to enhance the statistics-based methods as new measures of network topology and dynamics are created. In this paper, we employ for the first time the metrics betweenness, vulnerability and diversity to analyze written texts in Brazilian Portuguese. Using strategies based on diversity metrics, a better performance in automatic summarization is achieved in comparison to previous work employing complex networks. With an optimized method the Rouge score (an automatic evaluation method used in summarization) was 0.5089, which is the best value ever achieved for an extractive summarizer with statistical methods based on complex networks for Brazilian Portuguese. Furthermore, the diversity metric can detect keywords with high precision, which is why we believe it is suitable to produce good summaries. It is also shown that incorporating linguistic knowledge through a syntactic parser does enhance the performance of the automatic summarizers, as expected, but the increase in the Rouge score is only minor. These results reinforce the suitability of complex network methods for improving automatic summarizers in particular, and treating text in general. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
The automatic disambiguation of word senses (i.e., the identification of which of the meanings is used in a given context for a word that has multiple meanings) is essential for such applications as machine translation and information retrieval, and represents a key step for developing the so-called Semantic Web. Humans disambiguate words in a straightforward fashion, but this does not apply to computers. In this paper we address the problem of Word Sense Disambiguation (WSD) by treating texts as complex networks, and show that word senses can be distinguished upon characterizing the local structure around ambiguous words. Our goal was not to obtain the best possible disambiguation system, but we nevertheless found that in half of the cases our approach outperforms traditional shallow methods. We show that the hierarchical connectivity and clustering of words are usually the most relevant features for WSD. The results reported here shed light on the relationship between semantic and structural parameters of complex networks. They also indicate that when combined with traditional techniques the complex network approach may be useful to enhance the discrimination of senses in large texts. Copyright (C) EPLA, 2012
Abstract:
The adaptation of a commercially available ice machine for autonomous photovoltaic operation without batteries is presented. In this adaptation a 1040 W(p) photovoltaic array directly feeds a variable-speed drive and a 24 V(dc) source. The drive runs an induction motor coupled by belt-and-pulley to an open reciprocating compressor, while the dc source supplies a solenoid valve and the control electronics. Motor speed and refrigerant evaporation pressure are set so as to continuously match system power demand to photovoltaic power availability. The resulting system is a simple integration of robust, standard, readily available parts. It produces 27 kg of ice on a clear-sky day and has ice production costs around US$0.30/kg. Although a few machine features might be specific to Brazil, its technical and economic guidelines are applicable elsewhere. Copyright (C) 2010 John Wiley & Sons, Ltd.
Abstract:
The single machine scheduling problem with a common due date and non-identical ready times for the jobs is examined in this work. Performance is measured by the minimization of the weighted sum of earliness and tardiness penalties of the jobs. Since this problem is NP-hard, the application of constructive heuristics that exploit specific characteristics of the problem to improve their performance is investigated. The proposed approaches are examined through a computational comparative study on a set of 280 benchmark test problems with up to 1000 jobs.
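The objective described above can be sketched as a small cost function. This is a minimal sketch, assuming each job has its own earliness and tardiness weights and that a job starts as soon as both the machine and the job are available (no deliberately inserted idle time); the abstract does not state these details, and the function and parameter names are hypothetical.

```python
def weighted_earliness_tardiness(sequence, due_date):
    """Cost of processing jobs in the given order on a single machine.

    Each job is a tuple:
        (ready_time, processing_time, earliness_weight, tardiness_weight)
    All jobs share the same due date.
    """
    clock = 0
    total = 0
    for ready, processing, alpha, beta in sequence:
        start = max(clock, ready)          # wait until the job is released
        completion = start + processing
        earliness = max(0, due_date - completion)
        tardiness = max(0, completion - due_date)
        total += alpha * earliness + beta * tardiness
        clock = completion
    return total


# Hypothetical three-job instance with common due date 6.
jobs = [(0, 3, 1, 2), (2, 2, 1, 2), (6, 4, 1, 2)]
print(weighted_earliness_tardiness(jobs, due_date=6))  # 12
```

A constructive heuristic for this problem would build the job order step by step and could use a function like this one to compare candidate sequences.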
Abstract:
The objective of this study was to perform the translation and cultural adaptation of the Global Appraisal of Individual Needs - Initial instrument, and to calculate its content validity index. This is a methodological study designed for the cultural adaptation of the instrument. The instrument was translated into Portuguese in two versions, which were combined into a synthesis of the translations; this synthesis was then submitted to the evaluation of four judges, experts in the field of alcohol and other drugs. After the suggested changes were made, the instrument was back-translated and resubmitted to the judges and to the authors of the original instrument, resulting in the final version of the instrument, Avaliacao Global das Necessidades Individuais - Inicial. The content validity index of the instrument was 0.91, considered valid according to the literature. The instrument Avaliacao Global das Necessidades Individuais - Inicial was culturally adapted to the Portuguese language spoken in Brazil; however, it was not submitted to tests with the target population, which suggests further studies should be performed to test its reliability and validity.
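A content validity index can be computed in several ways, and the abstract does not say which one was used. The sketch below shows one common convention as an assumption: judges rate each item on a 4-point relevance scale, ratings of 3 or 4 count as "relevant", and the scale-level index is the average of the item-level proportions. The ratings are invented for illustration and are not the study's data.

```python
def item_cvi(ratings, relevant_from=3):
    """Proportion of judges who rated the item as relevant (assumed threshold: 3)."""
    return sum(1 for r in ratings if r >= relevant_from) / len(ratings)


def scale_cvi(ratings_by_item, relevant_from=3):
    """Average of the item-level indices across all items (averaging method assumed)."""
    scores = [item_cvi(r, relevant_from) for r in ratings_by_item]
    return sum(scores) / len(scores)


# Hypothetical ratings from four judges for three items.
ratings = [
    [4, 4, 3, 4],
    [4, 3, 2, 4],
    [3, 4, 4, 4],
]
print([item_cvi(r) for r in ratings])  # [1.0, 0.75, 1.0]
print(round(scale_cvi(ratings), 2))
```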
Abstract:
Objective: To translate, culturally adapt and validate the "Knee Society Score" (KSS) into the Portuguese language and determine its measurement properties, reproducibility and validity. Method: We analyzed 70 patients of both sexes, aged between 55 and 85 years, in a cross-sectional clinical trial, with diagnosis of primary osteoarthritis, undergoing total knee arthroplasty surgery. We assessed the patients with the English version of the KSS questionnaire and, after 30 minutes, with the Portuguese version of the KSS questionnaire, applied by a different evaluator. All the patients were assessed preoperatively, and again at three and six months postoperatively. Results: There was no statistical difference, using Cronbach's alpha index and the Bland-Altman graphical analysis, for the knee score during the preoperative period (p=1), and at three months (p=0.991) and six months postoperatively (p=0.985). There was no statistical difference for knee function score for all three periods (p=1.0). Conclusion: The Brazilian version of the Knee Society Score is easy to apply, as well as providing a valid and reliable instrument for measuring the knee score and function of Brazilian patients undergoing TKA. Level of Evidence: Level I - Diagnostic Studies Investigating a Diagnostic Test - Testing of previously developed diagnostic criteria on consecutive patients (with universally applied 'gold' reference standard).
Abstract:
STUDY DESIGN: Clinical measurement. OBJECTIVE: To translate and culturally adapt the Lower Extremity Functional Scale (LEFS) into a Brazilian Portuguese version, and to test the construct and content validity and reliability of this version in patients with knee injuries. BACKGROUND: There is no Brazilian Portuguese version of an instrument to assess the function of the lower extremity after orthopaedic injury. METHODS: The translation of the original English version of the LEFS into a Brazilian Portuguese version was accomplished using standard guidelines and tested in 31 patients with knee injuries. Subsequently, 87 patients with a variety of knee disorders completed the Brazilian Portuguese LEFS, the Medical Outcomes Study 36-Item Short-Form Health Survey, the Western Ontario and McMaster Universities Osteoarthritis Index, the International Knee Documentation Committee Subjective Knee Evaluation Form, and a visual analog scale for pain. All patients were retested within 2 days to determine reliability of these measures. Validation was assessed by determining the level of association between the Brazilian Portuguese LEFS and the other outcome measures. Reliability was documented by calculating internal consistency, test-retest reliability, and standard error of measurement. RESULTS: The Brazilian Portuguese LEFS had a high level of association with the physical component of the Medical Outcomes Study 36-Item Short-Form Health Survey (r = 0.82), the Western Ontario and McMaster Universities Osteoarthritis Index (r = 0.87), the International Knee Documentation Committee Subjective Knee Evaluation Form (r = 0.82), and the pain visual analog scale (r = -0.60) (all, P<.05). The Brazilian Portuguese LEFS had a low level of association with the mental component of the Medical Outcomes Study 36-Item Short-Form Health Survey (r = 0.38, P<.05). The internal consistency (Cronbach alpha = .952) and test-retest reliability (intraclass correlation coefficient = 0.957) of the Brazilian Portuguese version of the LEFS were high. The standard error of measurement was low (3.6) and the agreement was considered high, demonstrated by the small differences between test and retest and the narrow limit of agreement, as observed in Bland-Altman and survival-agreement plots. CONCLUSION: The translation of the LEFS into a Brazilian Portuguese version was successful in preserving the semantic and measurement properties of the original version and was shown to be valid and reliable in a Brazilian population with knee injuries. J Orthop Sports Phys Ther 2012;42(11):932-939, Epub 9 October 2012. doi:10.2519/jospt.2012.4101
Abstract:
Workplace accidents involving machines are relevant for their magnitude and their impacts on worker health. Despite consolidated criticism, explanations centered on operator error remain predominant among industry professionals, hampering preventive measures and the improvement of production-system reliability. Several initiatives were adopted by enforcement agencies in partnership with universities to stimulate the production and diffusion of analysis methodologies with a systemic approach. Starting from one accident case involving a worker who operated a brake-clutch type mechanical press, the article explores cognitive aspects and the existence of traps in the operation of this machine. It deals with a large-sized press that, despite being endowed with a light curtain in areas of access to the pressing zone, did not meet legal requirements. The safety devices gave rise to an illusion of safety, permitting activation of the machine when a worker was still within the operational zone. Preventive interventions must stimulate the tailoring of systems to the characteristics of workers, minimizing the creation of traps and encouraging safety policies and practices that replace judgments of behaviors that participate in accidents with analyses of the reasons that lead workers to act in that manner.
Abstract:
Surveillance Levels (SLs) are categories for medical patients (used in Brazil) that represent different types of medical recommendations. SLs are defined according to risk factors and the medical and developmental history of patients. Each SL is associated with specific educational and clinical measures. The objective of the present paper was to verify computer-aided, automatic assignment of SLs. The present paper proposes a computer-aided approach for automatic recommendation of SLs. The approach is based on the classification of information from patient electronic records. For this purpose, a software architecture composed of three layers was developed. The architecture is formed by a classification layer that includes a linguistic module and machine learning classification modules. The classification layer allows for the use of different classification methods, including the use of preprocessed, normalized language data drawn from the linguistic module. We report the verification and validation of the software architecture in a Brazilian pediatric healthcare institution. The results indicate that selection of attributes can have a great effect on the performance of the system. Nonetheless, our automatic recommendation of surveillance level can still benefit from improvements in processing procedures when the linguistic module is applied prior to classification. Results from our efforts can be applied to different types of medical systems. The results of systems supported by the framework presented in this paper may be used by healthcare and governmental institutions to improve healthcare services in terms of establishing preventive measures and alerting authorities about the possibility of an epidemic.
Abstract:
Defects of mitochondrial protein synthesis are clinically and genetically heterogeneous. We previously described a male infant who was born to consanguineous parents and who presented with severe congenital encephalopathy, peripheral neuropathy, myopathy, and lactic acidosis associated with deficiencies of multiple mitochondrial respiratory-chain enzymes and defective mitochondrial translation. In this work, we have characterized four additional affected family members, performed homozygosity mapping, and identified a homozygous splicing mutation in the splice donor site of exon 2 (c.504+1G>A) of RMND1 (required for meiotic nuclear division-1) in the affected individuals. Fibroblasts from affected individuals expressed two aberrant transcripts and had decreased wild-type mRNA and deficiencies of mitochondrial respiratory-chain enzymes. The RMND1 mutation caused haploinsufficiency that was rescued by overexpression of the wild-type transcript in mutant fibroblasts; this overexpression increased the levels and activities of mitochondrial respiratory-chain proteins. Knockdown of RMND1 via shRNA recapitulated the biochemical defect of the mutant fibroblasts, further supporting a loss-of-function pathomechanism in this disease. RMND1 belongs to the sif2 family, an evolutionarily conserved group of proteins that share the DUF155 domain, have unknown function, and have never been associated with human disease. We documented that the protein localizes to mitochondria in mammalian and yeast cells. Further studies are necessary for understanding the function of this protein in mitochondrial protein translation.
Abstract:
Several recent studies in the literature have identified brain morphological alterations associated with Borderline Personality Disorder (BPD). These findings are reported by studies based on voxel-based-morphometry analysis of structural MRI data, comparing mean gray-matter concentration between groups of BPD patients and healthy controls. However, mean differences between groups are not informative about the discriminative value of neuroimaging data to predict the group of individual subjects. In this paper, we go beyond mean-difference analyses and explore to what extent individual BPD patients can be differentiated from controls (25 subjects in each group), using a combination of automated morphometric tools for regional cortical thickness/volumetric estimation and a Support Vector Machine (SVM) classifier. The approach included a feature selection step in order to identify the regions containing the most discriminative information. The accuracy of this classifier was evaluated using the leave-one-subject-out procedure. The brain regions indicated as containing relevant information to discriminate groups were the orbitofrontal, rostral anterior cingulate, posterior cingulate and middle temporal cortices, among others. These areas, which are distinctively involved in emotional and affect regulation of BPD patients, were the most informative regions to achieve both sensitivity and specificity values of 80% in SVM classification. The findings suggest that this new methodology can add clinical and potential diagnostic value to neuroimaging of psychiatric disorders. (C) 2012 Elsevier Ltd. All rights reserved.
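The leave-one-subject-out evaluation mentioned above can be sketched in a few lines: each subject is held out in turn, a classifier is trained on the remaining subjects, and the held-out prediction is tallied into sensitivity and specificity. To stay self-contained, this sketch swaps in a nearest-centroid classifier in place of the SVM, so it illustrates the evaluation loop only, not the study's classifier. The feature values, labels and function names are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Stand-in classifier (NOT an SVM): choose the class with the closest mean vector."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out(features, labels, positive=1):
    """Hold out each subject once; return (sensitivity, specificity)."""
    tp = tn = fp = fn = 0
    for i, (sample, truth) in enumerate(zip(features, labels)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        guess = nearest_centroid_predict(train_x, train_y, sample)
        if truth == positive:
            tp += guess == positive
            fn += guess != positive
        else:
            fp += guess == positive
            tn += guess != positive
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical regional measurements: 0 = control, 1 = patient.
features = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8], [3.0, 3.1], [2.9, 3.2], [3.1, 2.8]]
labels = [0, 0, 0, 1, 1, 1]
print(leave_one_out(features, labels))
```

In a setup with a feature selection step, the selection would normally be repeated inside each fold, using only the training subjects, so that the held-out subject does not influence which regions are chosen.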
Abstract:
Background: Collybistin (CB), a neuron-specific guanine nucleotide exchange factor, has been implicated in targeting gephyrin-GABAA receptor clusters to inhibitory postsynaptic sites. However, little is known about additional CB partners and functions. Findings: Here, we identified the p40 subunit of the eukaryotic translation initiation factor 3 (eIF3H) as a novel binding partner of CB, documenting the interaction in yeast, non-neuronal cell lines, and the brain. In addition, we demonstrated that gephyrin also interacts with eIF3H in non-neuronal cells and forms a complex with eIF3 in the brain. Conclusions: Together, our results suggest, for the first time, that CB and gephyrin associate with the translation initiation machinery, and lend further support to the previous evidence that gephyrin may act as a regulator of synaptic protein synthesis.
Abstract:
OBJECTIVE: The Prodromal Questionnaire (PQ) is a 92-item self-report screening tool for individuals at ultra-high risk (UHR) of developing psychosis. This study aims to present the translation into Portuguese and preliminary results in UHR and first episode (FE) psychosis in a Portuguese-speaking sample. METHODS: The PQ was translated from English to Portuguese by two bilingual researchers from the research program on early psychosis of the Instituto de Psiquiatria HCFMUSP, São Paulo, Brazil (ASAS - "Evaluation and Follow up of Adolescents and Young Adults in São Paulo") and back-translated by two other researchers. The study participants (n = 11) were evaluated through the Portuguese version of the PQ and the SIPS. RESULTS: The individuals at UHR (n = 7) presented a lower score than first episode patients (n = 4). The mean scores (± standard deviation) on the positive symptoms subscale of the Portuguese version of the PQ were 13.0 ± 10.0 points for UHR individuals and 33.0 ± 10.0 points for FE patients. CONCLUSION: The UHR and FE patients in this study presented PQ scores similar to the ones found in the literature, which suggests that it is possible to use the PQ in Brazilian help-seeking individuals as a screening tool.
Abstract:
INTRODUCTION: Schizophrenia is a chronic mental disorder associated with impairment in social functioning. The most widely used scale to measure social functioning is the GAF (Global Assessment of Functioning), but it has the disadvantage of measuring symptoms and functioning at the same time, as described in its anchors. OBJECTIVES: Translation and cultural adaptation of the Personal and Social Performance scale (PSP), proposing a final version in Portuguese for use in Brazil. METHODS: We performed five steps: 1) translation; 2) back translation; 3) formal assessment of semantic equivalence; 4) debriefing; 5) analysis by experts. Interrater reliability (intraclass correlation, ICC) between two raters was also measured. RESULTS: The final version was applied by two independent investigators in 18 adults with schizophrenia (DSM-IV-TR). The interrater reliability (ICC) was 0.812 (p < 0.001). CONCLUSION: The translation and adaptation of the PSP had an adequate level of semantic equivalence between the Portuguese version and the original English version. There were no difficulties related to understanding the content expressed in the translated texts and terms. Its application was easy and it showed good interrater reliability. The PSP is a valid instrument for the measurement of personal and social functioning in schizophrenia.