742 results for Syntactic foams
Abstract:
The LVF dictionary (Les Verbes Français) by J. Dubois and F. Dubois-Charlier is one of the most important lexical resources for the French language, characterized by a highly relevant semantic and syntactic description. The LVF has been made available in an XML format to make its information more convenient to access for computer applications such as French natural language processing applications. With the emergence of the Semantic Web and the rapid spread of its technologies and standards such as XML, RDF/RDFS and OWL, it would be worthwhile to represent the LVF in a more formalized language so that natural language processing and Semantic Web applications can exploit it better. In this thesis we present an OWL ontological version of it, detailing the process of transforming the XML version into OWL, and we demonstrate its use in natural language processing with a semantic annotation application developed in GATE.
Abstract:
In recent years, applications incorporating an advanced dialogue module have been booming. However, the process of making these systems universal quickly becomes discouraging: since they are inherently dependent on the language for which they were designed, each new language to be integrated requires its own development time. The picture does not improve when one considers that quality often depends on the size of the training set. This project therefore seeks to speed up the process. It reports on different methods for generating multilingual versions of an initial working system with the help of statistical machine translation. The information attached to the source data is projected in order to generate related target data, which correspondingly reduces subsequent development time. To this end, several approaches were tested and analyzed. In particular, a method that groups the data before reranking the different translation candidates yields good results.
Abstract:
This thesis examines the poetics of three very different poets whose works can nonetheless be described as indeterminate and radical: Emily Dickinson (1830-1886), Gertrude Stein (1874-1946) and Caroline Bergvall (born 1962). Dickinson and Stein are Anglo-American, while Bergvall is of French-Norwegian origin, although she chooses to write in English. All three break the conventional syntactic structure of English through their poetics, which carries aesthetic and political implications. In what follows, I analyze the indeterminacy of their poetics on the basis of the notion, described by Lyn Hejinian, of description as apprehension, which presents writing as a mode of knowing rather than a means of recording what the poet already knows. The temporality of this epistemological activity is therefore that of the present of writing; it is concurrent with it. I argue that it is this temporality which, by opening writing to unforeseen events, vicissitudes, hesitations, errors and the twists of affect, causes the indeterminacy of the poetry. In the first chapter, I consider apprehension in Gertrude Stein through her career-long commitment to the "continuous present" of writing. The second chapter deals with the anxious sense of apprehension in Dickinson's poetry, where unease, by preventing or repressing a thought, suspends knowledge. Language, called upon by an experience that it cannot itself express, gives form to indeterminacy. A final chapter considers the linguistic indeterminacy of the text and exhibition Say Parsley, in which Bergvall stages the apprehension of language: an apprehension that occurs in the reader or viewer rather than in the poet.
Abstract:
We argue in this thesis that Quebec French has two subtypes of exclamative constructions. Set within a theoretical framework that draws on both the philosophy of language (speech act theory: Austin, 1962; Searle, 1969; Searle, 1979; Searle and Vanderveken, 1985) and linguistics (sentence type theory: Sadock and Zwicky, 1985; Reis, 1999), our analysis bears on a set of apparently synonymous exclamative constructions that involve the morphemes -tu, donc and assez, respectively (1). (1) Elle est-tu/donc/assez belle! (roughly, "How beautiful she is!") We show that although these exclamatives meet the criteria for identifying exclamative constructions given by Zanuttini and Portner (2003) (factivity, evaluativity/scalar implicature, expressivity/speaker orientation, and incompatibility with question/answer pairs), the exclamative speech acts performed with -tu/donc exclamatives do not have the same felicity conditions as those performed with assez exclamatives. Indeed, -tu/donc exclamatives impose a constraint on their context of utterance with respect to the epistemic position of the addressee, who must be in a position to corroborate the judgment that the speaker expresses by means of the exclamative. Assez exclamatives impose no such constraint. We show that this pragmatic distinction can be correlated with semantic and syntactic distinctions, and conclude that there are indeed two subtypes of exclamative constructions in Quebec French. In this sense, our research opens new empirical and theoretical perspectives for the description and analysis of the grammar of exclamative speech acts.
Abstract:
Corrective feedback (CF) is defined as an indication that lets learners know that their use of the L2 is incorrect (Lightbown and Spada, 2006). Researchers increasingly recognize the importance of written CF (Ferris, 2010). Research on written CF has largely concentrated on evaluating different CF techniques without first seeking to understand how teachers correct their students' written texts and to what extent the students are able to use this CF to revise their written work. This study aims to describe which CF techniques are used by teachers of francisation (French-language training) classes and how students incorporate this CF into their revisions. It also seeks to determine whether teachers' and students' practices vary according to the type of error corrected (lexical, syntactic and morphological), the technique used (direct, indirect or combined CF) and the students' writing proficiency (weak or strong). Three francisation classes took part in the study: 3 teachers and 24 students (12 judged strong and 12 weak). The students wrote a text, which the teachers corrected using their usual method. The students then rewrote their text, incorporating their teacher's CF. Interviews were also conducted with the 3 teachers and the 24 students. The results indicate that written CF is generally effective in a second language. Moreover, this effectiveness varies according to the technique used, the types of errors and the learner's level. The study shows that these three variables all play a role and that teachers should vary their CF when they correct.
Abstract:
This work aims to build an adaptable frame-based system for processing Dravidian languages. There are about 17 languages in this family, and they are spoken by the people of South India. Karaka relations are one of the most important features of Indian languages. They are the semantico-syntactic relations between verbs and other related constituents in a sentence. The karaka relations and surface case endings are analyzed for meaning extraction. This approach is comparable with the broad class of case-based grammars. The efficiency of this approach is put to the test in two applications. One is machine translation and the other is a natural language interface (NLI) for information retrieval from databases. The system mainly consists of a morphological analyzer, a local word grouper, a parser for the source language and a sentence generator for the target language. Among its contributions, this work gives an elegant and compact account of the relation between vibhakthi and karaka roles in Dravidian languages. The same basic account also explains simple and complex sentences in these languages. This suggests that the solution is not just ad hoc but has a deeper underlying unity. This methodology could be extended to other free word order languages. Since the frames designed for meaning representation are general, they are adaptable to other languages in this group and to other applications.
Abstract:
A new procedure for the classification of lower-case English characters is presented in this work. The character image is binarised and the binary image is further divided into sixteen smaller areas, called cells. Each cell is assigned a name depending on the contour present in the cell and the occupancy of the image contour in the cell. A data reduction procedure called filtering is adopted to eliminate undesirable redundant information and so reduce complexity during further processing steps. The filtered data is fed into a primitive extractor, where the primitives are extracted. Syntactic methods are employed for the classification of the character. A decision tree is used for the interaction of the various components in the scheme, like the primitive extraction and character recognition. A character is recognized by the primitive-by-primitive construction of its description. Open-ended inventories are used for including variants of the characters and for adding new members to the general class. Computer implementation of the proposal is discussed at the end using handwritten character samples. Results are analyzed and suggestions for future studies are made. The advantages of the proposal are discussed in detail.
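The first two steps described in this abstract (binarising the character image and dividing it into sixteen cells) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the 4 x 4 arrangement of the sixteen cells, the grey-level threshold of 128 and the function names are all hypothetical, and the cell-naming scheme based on contour shape is not given in the abstract, so only the occupancy of each cell is computed here.

```python
from typing import List

Grid = List[List[int]]


def binarise(image: Grid, threshold: int = 128) -> Grid:
    """Map each grey level to 1 (ink) if darker than the threshold, else 0."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]


def split_into_cells(binary: Grid, rows: int = 4, cols: int = 4) -> List[Grid]:
    """Cut the binary image into rows * cols equal cells, in row-major order.

    The 4 x 4 layout is an assumption; the abstract only says "sixteen".
    """
    height, width = len(binary), len(binary[0])
    if height % rows or width % cols:
        raise ValueError("image size must be divisible by the grid size")
    ch, cw = height // rows, width // cols
    cells = []
    for r in range(rows):
        for c in range(cols):
            band = binary[r * ch:(r + 1) * ch]
            cells.append([row[c * cw:(c + 1) * cw] for row in band])
    return cells


def occupancy(cell: Grid) -> float:
    """Fraction of ink pixels in a cell."""
    total = sum(len(row) for row in cell)
    return sum(map(sum, cell)) / total


# Toy 8 x 8 grey-level image: a vertical stroke in columns 2 and 3 (0 = black).
image = [[0 if x in (2, 3) else 255 for x in range(8)] for _ in range(8)]
cells = split_into_cells(binarise(image))
occupancies = [occupancy(cell) for cell in cells]
```

In this toy image the stroke falls entirely in the second column of cells, so those four cells are fully occupied and the other twelve are empty. A per-cell summary of this kind is the sort of input that a later naming or primitive-extraction stage could consume.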
Abstract:
Biotechnology is currently considered a useful alternative to conventional process technology in industrial and catalytic fields. The increasing awareness of the need to create green and sustainable production processes in all fields of chemistry has stimulated materials scientists to search for innovative catalyst supports. Immobilization of enzymes in inorganic matrices is very useful in practical applications because the immobilized enzymes retain their stability and catalytic activity under extreme conditions. Nanostructured inorganic, organic or hybrid organic-inorganic nanocomposites present paramount advantages to facilitate integration and miniaturization of the devices (nanotechnologies), thus affording a direct connection between the inorganic, organic and biological worlds. These properties, combined with good chemical stability, make them competent candidates for designed biocatalysts, protein-separation devices, drug delivery systems, and biosensors. Aluminosilicate clays and layered double hydroxides, displaying, respectively, cation and anion exchange properties, were found to be attractive materials for immobilization because of their hydrophilic, swelling and porosity properties, as well as their mechanical and thermal stability. The aim of this study is the replacement of inorganic catalysts by immobilized lipases to obtain purer and healthier products. Mesocellular silica foams were synthesized by an oil-in-water microemulsion templating route and were functionalized with silane and glutaraldehyde. The experimental results from IR spectroscopy and elemental analysis demonstrated the presence of immobilized lipase and also functionalization with silane and glutaraldehyde on the supports. The present work is a comprehensive study of the enzymatic synthesis of butyl isobutyrate through an esterification reaction using lipase immobilized onto mesocellular siliceous foams and montmorillonite K-10 via adsorption and covalent binding.
Moreover, the immobilization does not modify the nature of the proposed kinetic mechanism, which is of the Ping-Pong Bi-Bi type with inhibition by n-butanol. The immobilized biocatalyst can be commercially exploited for the synthesis of other short-chain flavor esters. Mesocellular silica foams (MCF) were synthesized by a microemulsion templating method via two different routes (hydrothermal and room temperature) and were functionalized with silane and glutaraldehyde. Candida rugosa lipase was adsorbed onto MCF silica and clay using heptane as the coupling medium for reactions in non-aqueous media. XRD results showed a slight broadening and lowering of d-spacing values after immobilization and modification in the case of MCF 160 and MCF35, but no change in the d-spacing in the case of K-10, which showed that the enzymes are adsorbed only on the external surface. This was further confirmed by the nitrogen adsorption measurements.
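For readers unfamiliar with the kinetic model named above, a commonly used textbook rate law for a Ping-Pong Bi-Bi mechanism with dead-end inhibition by the alcohol substrate has the form below. It is offered as an assumption about what "Ping-Pong Bi-Bi with inhibition by n-butanol" usually denotes, not as the equation fitted in this work. The symbols are labels chosen here: [A] is the isobutyric acid concentration, [B] the n-butanol concentration, V_max the maximum rate, K_{m,A} and K_{m,B} the Michaelis constants, and K_{i,B} the inhibition constant of n-butanol.

```latex
v = \frac{V_{\max}\,[A]\,[B]}
         {K_{m,B}\,[A] + K_{m,A}\,[B]\left(1 + \dfrac{[B]}{K_{i,B}}\right) + [A]\,[B]}
```

When [B] is much smaller than K_{i,B}, the bracketed factor tends to 1 and the expression reduces to the uninhibited Ping-Pong Bi-Bi form. At high [B] the denominator grows with [B] squared, so the rate falls as more alcohol is added.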
Abstract:
This thesis summarizes the results of studies on a syntax-based approach to translation between Malayalam, one of the Dravidian languages, and English, and on the development of the major modules of a prototype machine translation system from Malayalam to English. The development of the system is a pioneering effort for the Malayalam language, unattempted by previous researchers. The computational models chosen for the system are the first of their kind for Malayalam. An in-depth study has been carried out of the design of the computational models and data structures needed for the different modules of the prototype system: a morphological analyzer, a parser, a syntactic structure transfer module and a target language sentence generator. The list of part-of-speech tags, the chunk tags and the hierarchical dependencies among the chunks required for the translation process have also been generated. In the development process, the major goals are (a) accuracy of translation, (b) speed and (c) space. Accuracy-wise, smart tools for handling transfer grammar and translation standards, including equivalent words, expressions, phrases and styles in the target language, are to be developed. The grammar should be optimized with a view to obtaining a single correct parse and hence a single translated output. Speed-wise, innovative use of corpus analysis, an efficient parsing algorithm, the design of efficient data structures and run-time frequency-based rearrangement of the grammar, which substantially reduces parsing and generation time, are required. The space requirement also has to be minimised.
Abstract:
Author identification is the problem of identifying, from a given set of authors, the author of an anonymous text or of a text whose authorship is in doubt. The works of different authors are strongly distinguished by quantifiable features of the text. This paper deals with attempts to identify the most likely author of a text in Malayalam from a list of authors. Malayalam is an agglutinative Dravidian language, and few successful tools have been developed to extract syntactic and semantic features of texts in this language. We have made a detailed study of the various stylometric features that can be used to form an author's profile and have found that the frequencies of word collocations can be used to clearly distinguish an author in a highly inflectional language such as Malayalam. In our work we try to extract the word-level and character-level features present in the text to characterize the style of an author. Our first step was to create a profile for each of the candidate authors whose texts were available to us, first from word n-gram frequencies and then from variable-length character n-gram frequencies. The profiles of the authors under consideration thus formed were then compared with the features extracted from the anonymous text to suggest the most likely author.
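The profile-based comparison described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' system: the n-gram lengths (2 to 3), the profile size, the relative-difference distance and all function names are choices made here for illustration, and the toy strings stand in for real Malayalam texts.

```python
from collections import Counter
from typing import Dict


def char_ngram_profile(text: str, n_min: int = 2, n_max: int = 3,
                       size: int = 50) -> Dict[str, float]:
    """Relative frequencies of the `size` most frequent character n-grams.

    Variable-length n-grams are covered by pooling all lengths n_min..n_max.
    """
    counts = Counter(
        text[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(text) - n + 1)
    )
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.most_common(size)}


def dissimilarity(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Sum of squared relative differences over n-grams in either profile.

    This distance is an assumption; the abstract does not name the measure used.
    """
    score = 0.0
    for gram in set(p) | set(q):
        a, b = p.get(gram, 0.0), q.get(gram, 0.0)
        score += ((a - b) / ((a + b) / 2)) ** 2
    return score


def most_likely_author(samples: Dict[str, str], anonymous: str) -> str:
    """Return the candidate whose profile is closest to the anonymous text."""
    target = char_ngram_profile(anonymous)
    return min(
        samples,
        key=lambda name: dissimilarity(char_ngram_profile(samples[name]), target),
    )


# Toy data: two hypothetical "authors" with very different character habits.
samples = {"A": "abababababab", "B": "xyzxyzxyzxyz"}
guess = most_likely_author(samples, "ababab")
```

A word n-gram profile can be built the same way by splitting the text into tokens and sliding the window over the token list instead of the character string.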
Abstract:
In natural languages with a high degree of word-order freedom, syntactic phenomena like dependencies (subordinations) or valencies do not depend on the word order (or on the individual positions of the individual words). This means that some permutations of sentences of these languages are in some (important) sense syntactically equivalent. Here we study this phenomenon in a formal way. Various types of j-monotonicity for restarting automata can serve as parameters for the degree of word-order freedom and for the complexity of word order in sentences (languages). Here we combine two types of parameters on computations of restarting automata: 1. the degree of j-monotonicity, and 2. the number of rewrites per cycle. We study these notions formally in order to obtain an adequate tool for modelling and comparing formal descriptions of (natural) languages with different degrees of word-order freedom and word-order complexity.
Abstract:
Free-word order languages have long posed significant problems for standard parsing algorithms. This thesis presents an implemented parser, based on Government-Binding (GB) theory, for a particular free-word order language, Warlpiri, an aboriginal language of central Australia. The words in a sentence of a free-word order language may swap about relatively freely with little effect on meaning: the permutations of a sentence mean essentially the same thing. It is assumed that this similarity in meaning is directly reflected in the syntax. The parser presented here properly processes free word order because it assigns the same syntactic structure to the permutations of a single sentence. The parser also handles fixed word order, as well as other phenomena. On the view presented here, there is no such thing as a "configurational" or "non-configurational" language. Rather, there is a spectrum of languages that are more or less ordered. The operation of this parsing system is quite different in character from that of more traditional rule-based parsing systems, e.g., context-free parsers. In this system, parsing is carried out via the construction of two different structures, one encoding precedence information and one encoding hierarchical information. This bipartite representation is the key to handling both free- and fixed-order phenomena. This thesis first presents an overview of the portion of Warlpiri that can be parsed. Following this is a description of the linguistic theory on which the parser is based. The chapter after that describes the representations and algorithms of the parser. In conclusion, the parser is compared to related work. The appendix contains a substantial list of test cases (both grammatical and ungrammatical) that the parser has actually processed.
Abstract:
The central thesis of this report is that human language is NP-complete. That is, the process of comprehending and producing utterances is bounded above by the class NP, and below by NP-hardness. This constructive complexity thesis has two empirical consequences. The first is to predict that a linguistic theory outside NP is unnaturally powerful. The second is to predict that a linguistic theory easier than NP-hard is descriptively inadequate. To prove the lower bound, I show that the following three subproblems of language comprehension are all NP-hard: decide whether a given sound is a possible sound of a given language; disambiguate a sequence of words; and compute the antecedents of pronouns. The proofs are based directly on the empirical facts of the language user's knowledge, under an appropriate idealization. Therefore, they are invariant across linguistic theories. (For this reason, no knowledge of linguistic theory is needed to understand the proofs, only knowledge of English.) To illustrate the usefulness of the upper bound, I show that two widely accepted analyses of the language user's knowledge (of syntactic ellipsis and phonological dependencies) lead to complexity outside of NP (PSPACE-hard and undecidable, respectively). Next, guided by the complexity proofs, I construct alternate linguistic analyses that are strictly superior on descriptive grounds, as well as being less complex computationally (in NP). The report also presents a new framework for linguistic theorizing that resolves important puzzles in generative linguistics and guides the mathematical investigation of human language.
Abstract:
This paper describes a system for the computer understanding of English. The system answers questions, executes commands, and accepts information in normal English dialog. It uses semantic information and context to understand discourse and to disambiguate sentences. It combines a complete syntactic analysis of each sentence with a "heuristic understander" which uses different kinds of information about a sentence, other parts of the discourse, and general information about the world in deciding what the sentence means. It is based on the belief that a computer cannot deal reasonably with language unless it can "understand" the subject it is discussing. The program is given a detailed model of the knowledge needed by a simple robot having only a hand and an eye. We can give it instructions to manipulate toy objects, interrogate it about the scene, and give it information it will use in deduction. In addition to knowing the properties of toy objects, the program has a simple model of its own mentality. It can remember and discuss its plans and actions as well as carry them out. It enters into a dialog with a person, responding to English sentences with actions and English replies, and asking for clarification when its heuristic programs cannot understand a sentence through use of context and physical knowledge.
Abstract:
In this report, we investigate the relationship between the semantic and syntactic properties of verbs. Our work is based on English Verb Classes and Alternations (Levin, 1993). We explore how these classes are manifested in other languages, in particular in Bangla, German, and Korean. Our report includes a survey and classification of several hundred verbs from these languages into the cross-linguistic equivalents of Levin's classes. We also explore two ways in which our findings may be used to enhance WordNet: making the English syntactic information of WordNet more fine-grained, and making WordNet multilingual.