24 results for machine translation programs

in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)


Relevance:

100.00%

Abstract:

Establishing metrics to assess machine translation (MT) systems automatically is now crucial owing to the widespread use of MT over the web. In this study we show that such evaluation can be done by modeling text as complex networks. Specifically, we extend our previous work by employing additional metrics of complex networks, whose results were used as input for machine learning methods and allowed MT texts of distinct qualities to be distinguished. Also shown is that the node-to-node mapping between source and target texts (English-Portuguese and Spanish-Portuguese pairs) can be improved by adding further hierarchical levels for the metrics out-degree, in-degree, hierarchical common degree, cluster coefficient, inter-ring degree, intra-ring degree and convergence ratio. The results presented here amount to a proof-of-principle that the possible capturing of a wider context with the hierarchical levels may be combined with machine learning methods to yield an approach for assessing the quality of MT systems. (C) 2010 Elsevier B.V. All rights reserved.

Relevance:

90.00%

Abstract:

Complex networks have been increasingly used in text analysis, including in connection with natural language processing tools, as important text features appear to be captured by the topology and dynamics of the networks. Following previous works that apply complex network concepts to text quality measurement, summary evaluation, and author characterization, we now focus on machine translation (MT). In this paper we assess the possible representation of texts as complex networks to evaluate cross-linguistic issues inherent in manual and machine translation. We show that translations of different quality generated by MT tools can be distinguished from their manual counterparts by means of metrics such as in-degree (ID) and out-degree (OD), clustering coefficient (CC), and shortest paths (SP). For instance, we demonstrate that the average OD in networks of automatic translations consistently exceeds the values obtained for manual ones, and that the CC values of source texts are not preserved for manual translations, but are for good automatic translations. This probably reflects the text rearrangements humans perform during manual translation. We envisage that such findings could lead to better MT tools and automatic evaluation metrics.
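As a rough illustration of the metrics named in this abstract, the sketch below builds a directed word-adjacency network from a token list and computes the average out-degree and a clustering coefficient. It is not the authors' implementation: the unweighted edges, the lack of any preprocessing, and the function names `build_network`, `average_out_degree` and `clustering_coefficient` are all assumptions made for this example.

```python
from collections import defaultdict

def build_network(tokens):
    """Directed word-adjacency network: edge a -> b when b immediately follows a.

    Assumption: edges are unweighted and self-loops are ignored.
    """
    succ = defaultdict(set)  # outgoing neighbours
    pred = defaultdict(set)  # incoming neighbours
    for a, b in zip(tokens, tokens[1:]):
        if a != b:
            succ[a].add(b)
            pred[b].add(a)
    return set(tokens), succ, pred

def average_out_degree(nodes, succ):
    """Mean number of distinct successors per node."""
    return sum(len(succ[n]) for n in nodes) / len(nodes)

def clustering_coefficient(node, succ, pred):
    """Fraction of ordered neighbour pairs (u, v) joined by an edge u -> v."""
    neigh = (succ[node] | pred[node]) - {node}
    k = len(neigh)
    if k < 2:
        return 0.0
    links = sum(1 for u in neigh for v in succ[u] if v in neigh)
    return links / (k * (k - 1))

tokens = "the cat saw the dog and the cat ran".split()
nodes, succ, pred = build_network(tokens)
print(average_out_degree(nodes, succ))
print(clustering_coefficient("the", succ, pred))
```

In a comparison of the kind described, the same functions would be applied to the source text and to each translation, and the resulting values compared across texts.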

Relevance:

80.00%

Abstract:

Due to idiosyncrasies in their syntax, semantics or frequency, Multiword Expressions (MWEs) have received special attention from the NLP community, as the methods and techniques developed for the treatment of simplex words are not necessarily suitable for them. This is certainly the case for the automatic acquisition of MWEs from corpora. A lot of effort has been directed to the task of automatically identifying them, with considerable success. In this paper, we propose an approach for the identification of MWEs in a multilingual context, as a by-product of a word alignment process, that not only deals with the identification of possible MWE candidates, but also associates some multiword expressions with semantics. The results obtained indicate the feasibility and low costs in terms of tools and resources demanded by this approach, which could, for example, facilitate and speed up lexicographic work.

Relevance:

80.00%

Abstract:

Identifying the correct sense of a word in context is crucial for many tasks in natural language processing (machine translation is an example). State-of-the-art methods for Word Sense Disambiguation (WSD) build models using hand-crafted features that usually capture shallow linguistic information. Complex background knowledge, such as semantic relationships, is typically either not used, or used in a specialised manner, due to the limitations of the feature-based modelling techniques used. On the other hand, empirical results from the use of Inductive Logic Programming (ILP) systems have repeatedly shown that they can use diverse sources of background knowledge when constructing models. In this paper, we investigate whether this ability of ILP systems could be used to improve the predictive accuracy of models for WSD. Specifically, we examine the use of a general-purpose ILP system as a method to construct a set of features using semantic, syntactic and lexical information. This feature-set is then used by a common modelling technique in the field (a support vector machine) to construct a classifier for predicting the sense of a word. In our investigation we examine one-shot and incremental approaches to feature-set construction applied to monolingual and bilingual WSD tasks. The monolingual tasks use 32 verbs and 85 verbs and nouns (in English) from the SENSEVAL-3 and SemEval-2007 benchmarks, while the bilingual WSD task consists of 7 highly ambiguous verbs in translating from English to Portuguese. The results are encouraging: the ILP-assisted models show substantial improvements over those that simply use shallow features. In addition, incremental feature-set construction appears to identify smaller and better sets of features. Taken together, the results suggest that the use of ILP with diverse sources of background knowledge provides a way of making substantial progress in the field of WSD.

Relevance:

20.00%

Abstract:

This work proposes a new approach using a committee machine of artificial neural networks to classify masses found in mammograms as benign or malignant. Three shape factors, three edge-sharpness measures, and 14 texture measures are used for the classification of 20 regions of interest (ROIs) related to malignant tumors and 37 ROIs related to benign masses. A group of multilayer perceptrons (MLPs) is employed as a committee machine of neural network classifiers. The classification results are reached by combining the responses of the individual classifiers. Experiments involving changes in the learning algorithm of the committee machine are conducted. The classification accuracy is evaluated using the area A(z) under the receiver operating characteristics (ROC) curve. The A(z) result for the committee machine is compared with the A(z) results obtained using MLPs and single-layer perceptrons (SLPs), as well as a linear discriminant analysis (LDA) classifier. Tests are carried out using the Student's t-distribution. The committee machine classifier outperforms the MLP, SLP, and LDA classifiers in the following cases: with the shape measure of spiculation index, the A(z) values of the four methods are, in order, 0.93, 0.84, 0.75, and 0.76; and with the edge-sharpness measure of acutance, the values are 0.79, 0.70, 0.69, and 0.74. Although the features with which improvement is obtained with the committee machines are not the same as those that provided the maximal value of A(z) (A(z) = 0.99 with some shape features, with or without the committee machine), they correspond to features that are not critically dependent on the accuracy of the boundaries of the masses, which is an important result. (c) 2008 SPIE and IS&T.
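The area under the ROC curve used as the accuracy measure here can be estimated directly from classifier scores: it equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. The sketch below shows that pairwise estimate. The function name `roc_auc` and the scores are invented for illustration and are not data or code from the study.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    case receives the higher score; ties count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical classifier outputs for malignant (positive) and benign (negative) ROIs.
malignant_scores = [0.9, 0.8, 0.4]
benign_scores = [0.7, 0.3]
print(roc_auc(malignant_scores, benign_scores))
```

A value of 1.0 means every positive case outranks every negative one, while 0.5 corresponds to chance-level ranking.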

Relevance:

20.00%

Abstract:

Introduction: Cervical and breast cancer are the most common malignancies among women worldwide. Effective screening can facilitate early detection and dramatically reduce mortality rates. The interface between those who perform screening and the patients who most need it is complex, and women in remote areas of rural counties face additional barriers that limit the effectiveness of cancer prevention programs. This study compared various methods to improve compliance with mass screening for breast and cervical cancer among women in a remote, rural region of Brazil. Methods: In 2003, a mobile unit was used to perform 10 156 mammograms and Papanicolaou smear tests for women living in the Barretos County region of São Paulo state, Brazil (consisting of 19 neighbouring cities). To reach the women, the following community outreach strategies were used: distribution of flyers and pamphlets; media broadcasts (via radio and car loudspeakers); and community healthcare agents (CHCAs) making home visits. Results: The most useful intervention appeared to be the home visits by CHCAs. These agents of the Family Health Programme of the Brazilian Ministry of Health reached an average of 45.6% of those screened, with radio advertisements reaching a further 11.9%. The great majority of the screened women were illiterate or had elementary level schooling (80.9%) and were of 'poor' or 'very poor' socioeconomic class (67.2%). Conclusions: Use of a mobile screening unit is a useful strategy in developing countries where local health systems have inadequate facilities for cancer screening in underserved populations. A multimodal approach to community outreach, especially using CHCAs and radio advertisements, can improve the uptake of mass screening in low-income female populations with little formal education.

Relevance:

20.00%

Abstract:

A study was designed to determine how the degree programs in information and library science available in 2000-2005 at the public universities of Madrid fit the labour market needs of their students. The methodology used was the development of a questionnaire addressed to graduates. Although the number of surveys completed is not high (118), the authors believe that the results obtained permit a series of conclusions that may be extrapolated to the entire cohort.

Relevance:

20.00%

Abstract:

With the relentless quest for improved performance driving ever tighter tolerances for manufacturing, machine tools are sometimes unable to meet the desired requirements. One option to improve the tolerances of machine tools is to compensate for their errors. Among all possible sources of machine tool error, thermally induced errors are, in general for newer machines, the most important. The present work demonstrates the evaluation and modelling of the behaviour of the thermal errors of a CNC cylindrical grinding machine during its warm-up period.

Relevance:

20.00%

Abstract:

An accurate estimate of machining time is very important for predicting delivery time and manufacturing costs, and for supporting production process planning. Most commercial CAM software systems estimate the machining time in milling operations simply by dividing the entire tool path length by the programmed feed rate. This time estimate differs drastically from the real process time because the feed rate is not always constant due to machine and computer numerical control (CNC) limitations. This study presents a practical mechanistic method for milling time estimation when machining free-form geometries. The method considers a variable called machine response time (MRT), which characterizes the real CNC machine's capacity to move at high feed rates in free-form geometries. MRT is a global performance feature which can be obtained for any type of CNC machine configuration by carrying out a simple test. For validating the methodology, a workpiece was used to generate NC programs for five different types of CNC machines. A practical industrial case study was also carried out to validate the method. The results indicated that MRT, and consequently the real machining time, depends on the CNC machine's potential; furthermore, the greater the MRT, the larger the difference between predicted milling time and real milling time. The proposed method achieved an error range from 0.3% to 12% of the real machining time, whereas the CAM estimation achieved from 211% to 1244% error. The MRT-based process is also suggested as an instrument for helping in machine tool benchmarking.
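The baseline CAM estimate criticized in this abstract (total tool path length divided by the programmed feed rate) can be sketched as follows. This shows only the naive estimate, not the MRT-based method, whose details the abstract does not give. The function names, the units (mm and mm/min) and the sample coordinates are assumptions made for this example.

```python
import math

def path_length(points):
    """Total length of a polyline tool path given as (x, y, z) points in mm."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def naive_machining_time(points, feed_rate_mm_per_min):
    """Naive CAM-style estimate in minutes.

    Assumes the machine holds the programmed feed rate along the whole path,
    which is exactly the simplification the abstract says is unrealistic.
    """
    return path_length(points) / feed_rate_mm_per_min

# Hypothetical tool path: two straight segments of 50 mm each.
tool_path = [(0, 0, 0), (30, 40, 0), (30, 40, 50)]
print(naive_machining_time(tool_path, 2000))  # 100 mm at 2000 mm/min
```

On free-form geometries made of many short segments, the real feed rate often falls below the programmed value, so this estimate tends to understate the actual machining time.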

Relevance:

20.00%

Abstract:

This paper proposes a simple high-level programming language, endowed with resources that help encode self-modifying programs. For this purpose, a conventional imperative language syntax (not explicitly stated in this paper) is augmented with special commands and statements forming an adaptive layer designed specifically for the dynamic changes to be applied to the code at run-time. The resulting language allows programmers to easily specify dynamic changes to their own program's code, and to describe the dynamic logic of their adaptive applications with little effort. In this paper, we describe the most important aspects of the design and implementation of such a language. Finally, a small example is presented for illustration purposes.

Relevance:

20.00%

Abstract:

Paper products show dimensional changes when subjected to moisture content modification. Hygroexpansivity was investigated in a commercial paper machine operating at 1256 m/min by a set of measurements on 75 g/m² reprographic bleached eucalyptus pulp paper samples. The present work shows hygroexpansivity development in different sections of the paper machine along the manufacturing direction. The measurement results demonstrate the effects of papermaking process operations on paper hygroexpansivity and lead to the confirmation of fiber orientation degree, drying restraint and shrinkage, and paper tension as significant influencing factors. Structural, strength and elastic properties of paper were also measured as a function of machine direction position and presented for discussion purposes.

Relevance:

20.00%

Abstract:

This paper addresses the minimization of the mean absolute deviation from a common due date in a two-machine flowshop scheduling problem. We present heuristics that use an algorithm, based on proposed properties, which obtains an optimal schedule for a given job sequence. A new set of benchmark problems is presented with the purpose of evaluating the heuristics. Computational experiments show that the developed heuristics outperform results found in the literature for problems with up to 500 jobs. (C) 2007 Elsevier Ltd. All rights reserved.
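To make the objective concrete, the sketch below evaluates the mean absolute deviation from a common due date for one fixed job sequence in a two-machine flowshop. It assumes every job starts as early as possible with no inserted idle time, which is a simplification and not the optimal timing algorithm the abstract refers to. The function names and job data are hypothetical.

```python
def flowshop_completion_times(jobs):
    """Completion times on machine 2 for jobs given as (p1, p2) in sequence order.

    Assumption: each operation starts as soon as its machine and its
    predecessor operation allow (no deliberately inserted idle time).
    """
    t1 = 0  # time at which machine 1 becomes free
    t2 = 0  # time at which machine 2 becomes free
    completions = []
    for p1, p2 in jobs:
        t1 += p1
        t2 = max(t1, t2) + p2
        completions.append(t2)
    return completions

def mean_absolute_deviation(completions, due_date):
    """Average of |C_j - d| over all jobs for a common due date d."""
    return sum(abs(c - due_date) for c in completions) / len(completions)

jobs = [(3, 2), (1, 4), (2, 1)]  # hypothetical processing times
completions = flowshop_completion_times(jobs)
print(completions)
print(mean_absolute_deviation(completions, due_date=8))
```

A heuristic of the kind described would search over job sequences and use an evaluation like this one (with optimal timing instead of earliest starts) to score each candidate.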

Relevance:

20.00%

Abstract:

This paper addresses the non-preemptive single machine scheduling problem to minimize total tardiness. We are interested in the online version of this problem, where orders arrive at the system at random times. Jobs have to be scheduled without knowledge of what jobs will come afterwards. The processing times and the due dates become known when the order is placed. The order release date occurs only at the beginning of periodic intervals. A customized approximate dynamic programming method is introduced for this problem. The authors also present numerical experiments that assess the reliability of the new approach and show that it performs better than a myopic policy.
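The total tardiness objective can be illustrated by evaluating one fixed processing order on a single machine, where a job cannot start before it has been released. This is only the objective function, not the approximate dynamic programming method or the myopic policy mentioned in the abstract. The tuple layout `(release, processing, due)` and the sample jobs are assumptions for this sketch.

```python
def total_tardiness(jobs):
    """Total tardiness of jobs processed non-preemptively in the given order.

    Each job is a (release, processing, due) tuple; a job starts at the later
    of its release time and the completion time of the previous job.
    """
    clock = 0
    total = 0
    for release, processing, due in jobs:
        start = max(clock, release)
        clock = start + processing
        total += max(0, clock - due)  # tardiness is zero for early jobs
    return total

# Hypothetical orders: (release time, processing time, due date).
orders = [(0, 3, 4), (1, 2, 4), (2, 4, 8)]
print(total_tardiness(orders))
```

In the online setting described, the scheduler would have to choose the next job without seeing later arrivals, and a function like this would score the resulting sequence afterwards.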

Relevance:

20.00%

Abstract:

This paper addresses the single machine scheduling problem with a common due date aiming to minimize earliness and tardiness penalties. Due to its complexity, most of the previous studies in the literature deal with this problem using heuristic and metaheuristic approaches. With the intention of contributing to the study of this problem, a branch-and-bound algorithm is proposed. Lower bounds and pruning rules that exploit properties of the problem are introduced. The proposed approach is examined through a computational comparative study with 280 problems involving different due date scenarios. In addition, the values of optimal solutions for small problems from a known benchmark are provided.

Relevance:

20.00%

Abstract:

In this study, 20 Brazilian public schools were assessed for their implementation of good manufacturing practices and standard sanitation operating procedures. We used a checklist comprising 10 parts (facilities and installations, water supply, equipment and tools, pest control, waste management, personal hygiene, sanitation, storage, documentation, and training), for a total of 69 questions. The cost of the modifications needed to correct the nonconformities found was also determined, so that technical data could serve as a basis for prioritizing decisions. The average nonconformity percentage at schools concerning the prerequisite program was 36%; 66% of the schools had inadequate installations, 65% inadequate waste management, 44% inadequate documentation, and 35% inadequate water supply and sanitation. The initial estimated cost of the changes was U.S.$24,438, with monthly investments of 1.55% of the currently needed investment. This would result in an increase of U.S.$0.015 in the cost of each served meal to recover the investment within a year. Thus, we concluded that such modifications are economically feasible and will be considered among the technical requirements when priorities for prerequisite program implementation are established.