971 results for Grammar checking


Relevance: 60.00%

Publisher:

Abstract:

Grammar-checking software sometimes produces illegitimate detections (false alarms), which we refer to here as overdetections. This study describes the experiments carried out in developing a system created to identify and mute the overdetections produced by the French grammar checker designed by the company Druide informatique. Several classifiers were trained in a supervised fashion on 14 types of detections made by the checker, using features covering various kinds of linguistic information (syntactic dependencies and categories, exploration of word context, etc.) extracted from sentences with and without overdetections. Eight of the 14 classifiers developed are now integrated into the new version of a very popular commercial grammar checker. Our experiments also showed that probabilistic language models, SVMs, and word-sense disambiguation improve the quality of these classifiers. This work is a successful example of deploying a machine-learning approach in the service of a robust, mass-market language application.
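As a rough illustration of the approach described above, here is a minimal sketch, assuming scikit-learn, of training a supervised classifier to separate legitimate detections from overdetections. The feature names, labels, and data are hypothetical stand-ins, not the actual features or detection types of the commercial system.

    # Minimal sketch (illustrative only): a supervised classifier that decides
    # whether a grammar-checker alert is legitimate or an overdetection to mute.
    # Feature names and data are hypothetical.
    from sklearn.feature_extraction import DictVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    # Each detection is described by linguistic features extracted from its
    # sentence context (POS tags, dependency relations, neighbouring words, ...).
    detections = [
        {"detection_type": "agreement", "head_pos": "NOUN", "dep_rel": "nsubj", "next_word_pos": "VERB"},
        {"detection_type": "agreement", "head_pos": "NOUN", "dep_rel": "obj",   "next_word_pos": "ADJ"},
        {"detection_type": "homophone", "head_pos": "VERB", "dep_rel": "root",  "next_word_pos": "DET"},
        {"detection_type": "homophone", "head_pos": "VERB", "dep_rel": "aux",   "next_word_pos": "PRON"},
    ]
    # 1 = legitimate detection, 0 = overdetection (false alarm) to be muted.
    labels = [1, 0, 1, 0]

    # The abstract describes one classifier per detection type; a single
    # pipeline (one-hot feature encoding + linear SVM) stands in for the recipe.
    model = make_pipeline(DictVectorizer(sparse=True), LinearSVC())
    model.fit(detections, labels)

    # At correction time, predicted overdetections are silenced before display.
    new_detection = {"detection_type": "agreement", "head_pos": "NOUN",
                     "dep_rel": "nsubj", "next_word_pos": "VERB"}
    if model.predict([new_detection])[0] == 0:
        print("overdetection: suppress this alert")
    else:
        print("legitimate detection: show this alert")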

Relevance: 60.00%

Publisher:

Abstract:

Mr. Kubon's project was inspired by the growing need for an automatic syntactic analyser (parser) of Czech that could be used in the syntactic processing of large amounts of text. Mr. Kubon notes that such a tool would be very useful, especially in corpus linguistics, where creating a large-scale "tree bank" (a collection of syntactic representations of natural-language sentences) is an important step towards investigating the properties of a given language. Syntactically parsing a whole corpus in order to obtain a representative set of syntactic structures would be almost inconceivable without some kind of robust (semi-)automatic parser, and the need for robustness grows with the size of the corpus or of any other text to be parsed. Practical experience shows that, apart from syntactically correct sentences, there are many sentences containing a "real" grammatical error; such sentences can be corrected in small-scale texts, but generally not across a whole corpus.

To complete the overall project, a number of smaller problems had to be addressed. These were:

1. the adaptation of a suitable formalism able to describe the formal grammar of the system;
2. the definition of the structure of the system's dictionary containing all relevant lexico-syntactic information;
3. the development of a formal grammar able to robustly parse Czech sentences from the test suite;
4. filling the syntactic dictionary with sample data allowing the system to be tested and debugged during its development (about 1000 words);
5. the development of a set of sample sentences containing a reasonable amount of grammatical and ungrammatical phenomena, covering some of the most typical syntactic constructions used in Czech.

Task 3, building a formal grammar, was the main task of the project. The grammar is of course far from complete (Mr. Kubon notes that it is debatable whether any formal grammar describing a natural language can ever be complete), but it covers the most frequent syntactic phenomena, allowing for the representation of the syntactic structure of simple clauses and also of certain types of complex sentences. The stress was not so much on building a wide-coverage grammar as on describing and demonstrating a method. This method takes an approach similar to that of grammar-based grammar checking: the problem of reconstructing the "correct" form of the syntactic representation of a sentence is closely related to the problem of localising and identifying syntactic errors, and without precise knowledge of the nature and location of syntactic errors it is not possible to build a reliable estimate of the "correct" syntactic tree.

The incremental way of building the grammar used in this project is also an important methodological point. Experience from previous projects showed that building a grammar as one huge block of metarules is more complicated than the incremental method, which starts with the metarules covering the most common syntactic phenomena and adds less important ones later; this is especially true from the point of view of testing and debugging the grammar. The sample syntactic dictionary containing lexico-syntactic information (task 4) now has slightly more than 1000 lexical items representing all word classes. During the creation of the dictionary it turned out that assigning complete and correct lexico-syntactic information to verbs is a complicated and time-consuming process that would itself be worth a separate project.

The final task undertaken in this project was the development of a method allowing effective testing and debugging of the grammar during its development. The consistency of new and modified rules with the rules already present is one of the crucial problems of every project aiming at a large-scale formal grammar of a natural language. The method allows the detection of any discrepancy or inconsistency of the grammar with respect to a test bed of sentences containing all syntactic phenomena covered by the grammar.

This is not only the first robust parser of Czech, but also one of the first robust parsers of a Slavic language. Since Slavic languages share a wide range of common features, it is reasonable to claim that the system may serve as a pattern for similar systems in other languages. To transfer the system to another language, it is only necessary to revise the grammar and to change the data contained in the dictionary (but not necessarily the structure of the primary lexico-syntactic information). The formalism and methods used in this project can be applied to other Slavic languages without substantial changes.
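The test-bed consistency check described in the last paragraphs can be illustrated with a small sketch. The example below uses NLTK's CFG machinery as a stand-in for the project's own formalism, and the toy grammar and test bed are hypothetical; the point is only the regression-style loop that re-parses every test sentence after a grammar change and compares the result with its expected grammaticality.

    # Minimal sketch (assumption: NLTK stands in for the project's formalism).
    import nltk

    grammar = nltk.CFG.fromstring("""
        S  -> NP VP
        NP -> Det N
        VP -> V NP
        Det -> 'the'
        N  -> 'parser' | 'sentence'
        V  -> 'accepts'
    """)
    parser = nltk.ChartParser(grammar)

    # Test bed: (tokenised sentence, expected grammaticality).
    test_bed = [
        (["the", "parser", "accepts", "the", "sentence"], True),
        (["parser", "the", "accepts", "sentence", "the"], False),
    ]

    def is_parsable(tokens):
        """Return True if the grammar assigns at least one syntactic tree."""
        try:
            return any(True for _ in parser.parse(tokens))
        except ValueError:
            # A token not covered by the lexicon counts as not parsable.
            return False

    # Re-run after every rule change: any mismatch flags an inconsistency
    # between the modified grammar and the test bed.
    for tokens, expected in test_bed:
        status = "OK" if is_parsable(tokens) == expected else "INCONSISTENT"
        print(status, " ".join(tokens))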


Relevance: 20.00%

Publisher:

Abstract:

Common sense tells us that the future is an essential element in any strategy. There is also a good deal of literature on scenario planning, an important tool for considering the future in strategic terms. Yet in many organizations there is serious resistance to the development of scenarios, and they are not broadly implemented by companies. Even organizations that do not rely heavily on scenarios do, in fact, construct visions to guide their strategies. But what happens when this vision is not consistent with the future? To address this problem, the present article proposes a method for checking the content and consistency of an organization's vision of the future, no matter how it was conceived. The proposed method is grounded in theoretical concepts from the field of future studies, which are described in this article. This study was motivated by the search for new ways of improving and using scenario techniques as a method for making strategic decisions. The method was then tested on a company in the field of information technology in order to check its operational feasibility. The test showed that the proposed method is operationally feasible and was capable of analyzing the vision of the company being studied, indicating both its shortcomings and its points of inconsistency. (C) 2007 Elsevier Ltd. All rights reserved.


Relevance: 20.00%

Publisher:

Abstract:

For the improvement of genetic material suitable for on-farm use under low-input conditions, participatory and formal plant breeding strategies are frequently presented as competing options. A common frame of reference in which to phrase the mechanisms and purposes related to breeding strategies will facilitate clearer descriptions of the similarities and differences between participatory plant breeding and formal plant breeding. In this paper an attempt is made to develop such a common framework by means of a statistically inspired language that acknowledges the importance of both on-farm trials and research-centre trials as sources of information for on-farm genetic improvement. Key concepts are the genetic correlation between environments and the heterogeneity of phenotypic and genetic variance over environments. Classic selection response theory is taken as the starting point for comparing selection trials (on farm and at the research centre) with respect to the expected genetic improvement in a target environment (low-input farms). The variance-covariance parameters that form the input for selection response comparisons traditionally come from a mixed model fit to multi-environment trial data. In this paper we propose a recently developed class of mixed models, namely multiplicative mixed models, also called factor-analytic models, for modelling genetic variances and covariances (correlations). Multiplicative mixed models allow genetic variances and covariances to depend on quantitative descriptors of the environment, and confer a high flexibility in the choice of variance-covariance structure without requiring the estimation of a prohibitively large number of parameters. As a result, detailed comparisons of selection response are facilitated. The statistical machinery involved is illustrated on an example data set consisting of barley trials from the International Center for Agricultural Research in the Dry Areas (ICARDA). Analysis of the example data showed that participatory plant breeding and formal plant breeding are better interpreted as providing complementary rather than competing information.
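For readers less familiar with the underlying theory, the expressions below summarise the classic selection-response comparison and a one-factor multiplicative (factor-analytic) genetic covariance model in generic textbook notation; they are standard forms, not formulas quoted from the paper.

    % Hedged illustration in generic notation. Selection is practised in
    % environment S (e.g. a research centre); the target is environment T
    % (low-input farms).
    \[
      CR_T = i_S \, h_S \, r_{G(S,T)} \, \sigma_{G_T},
      \qquad
      R_T = i_T \, h_T \, \sigma_{G_T},
      \qquad
      \frac{CR_T}{R_T} = \frac{i_S \, h_S \, r_{G(S,T)}}{i_T \, h_T}
    \]
    % CR_T: correlated response in T from selecting in S; R_T: direct response
    % from selecting in T itself; i: selection intensity; h: square root of
    % heritability; r_{G(S,T)}: genetic correlation between the two
    % environments; \sigma_{G_T}: genetic standard deviation in T.
    %
    % A one-factor multiplicative (factor-analytic) mixed model writes the
    % genetic covariance between environments j and k as
    \[
      \operatorname{Cov}(g_j, g_k) = \lambda_j \lambda_k + \delta_{jk}\,\psi_j ,
    \]
    % so environment-specific loadings \lambda_j (plus residual genetic
    % variances \psi_j) let variances and correlations differ across
    % environments without fitting a full unstructured covariance matrix.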


Relevance: 20.00%

Publisher:

Abstract:

Cryptographic software development is a challenging field: high performance must be achieved while ensuring correctness and compliance with low-level security policies. CAO is a domain-specific language designed to assist the development of cryptographic software. An important feature of this language is the design of a novel type system introducing native types such as predefined sized vectors, matrices and bit strings, residue classes modulo an integer, finite fields and finite field extensions, allowing for extensive static validation of source code. We present the formalisation, validation and implementation of this type system.
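As a loose illustration of the kinds of native types the abstract mentions (this is not CAO syntax, and the checks below happen at run time in Python rather than statically as in CAO), the sketch models residue classes modulo an integer and fixed-size vectors, and shows the sort of size and modulus mismatches a type system could reject before any cryptographic code runs.

    # Minimal sketch (illustrative only; not CAO syntax): residue classes
    # modulo n and vectors whose length is part of their "type".
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class ModInt:
        """An element of the residue class ring Z/nZ."""
        value: int
        modulus: int

        def __post_init__(self):
            object.__setattr__(self, "value", self.value % self.modulus)

        def __add__(self, other):
            if self.modulus != other.modulus:
                # A static type system would reject this at compile time.
                raise TypeError("cannot add elements of different residue rings")
            return ModInt(self.value + other.value, self.modulus)

    @dataclass(frozen=True)
    class SizedVector:
        """A vector whose declared length must match its contents."""
        elems: tuple
        size: int

        def __post_init__(self):
            if len(self.elems) != self.size:
                raise TypeError(f"expected {self.size} elements, got {len(self.elems)}")

        def __add__(self, other):
            if self.size != other.size:
                raise TypeError("vector sizes must match")
            return SizedVector(tuple(a + b for a, b in zip(self.elems, other.elems)), self.size)

    # Usage: well-typed operations succeed, ill-typed ones are rejected.
    a = SizedVector((ModInt(3, 7), ModInt(5, 7)), size=2)
    b = SizedVector((ModInt(6, 7), ModInt(4, 7)), size=2)
    print(a + b)                      # elementwise sum in Z/7Z
    try:
        ModInt(1, 7) + ModInt(1, 11)  # mixing moduli is a type error
    except TypeError as err:
        print("rejected:", err)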