855 resultados para Catalan language -- To 1500 -- Word order -- Congresses


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper investigates several approaches to bootstrapping a new spoken language understanding (SLU) component in a target language given a large dataset of semantically-annotated utterances in some other source language. The aim is to reduce the cost associated with porting a spoken dialogue system from one language to another by minimising the amount of data required in the target language. Since word-level semantic annotations are costly, Semantic Tuple Classifiers (STCs) are used in conjunction with statistical machine translation models both of which are trained from unaligned data to further reduce development time. The paper presents experiments in which a French SLU component in the tourist information domain is bootstrapped from English data. Results show that training STCs on automatically translated data produced the best performance for predicting the utterance's dialogue act type, however individual slot/value pairs are best predicted by training STCs on the source language and using them to decode translated utterances. © 2010 ISCA.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Mandarin Chinese is based on characters which are syllabic in nature and morphological in meaning. All spoken languages have syllabiotactic rules which govern the construction of syllables and their allowed sequences. These constraints are not as restrictive as those learned from word sequences, but they can provide additional useful linguistic information. Hence, it is possible to improve speech recognition performance by appropriately combining these two types of constraints. For the Chinese language considered in this paper, character level language models (LMs) can be used as a first level approximation to allowed syllable sequences. To test this idea, word and character level n-gram LMs were trained on 2.8 billion words (equivalent to 4.3 billion characters) of texts from a wide collection of text sources. Both hypothesis and model based combination techniques were investigated to combine word and character level LMs. Significant character error rate reductions up to 7.3% relative were obtained on a state-of-the-art Mandarin Chinese broadcast audio recognition task using an adapted history dependent multi-level LM that performs a log-linearly combination of character and word level LMs. This supports the hypothesis that character or syllable sequence models are useful for improving Mandarin speech recognition performance.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Planner is a formalism for proving theorems and manipulating models in a robot. The formalism is built out of a number of problem-solving primitives together with a hierarchical multiprocess backtrack control structure. Statements can be asserted and perhaps later withdrawn as the state of the world changes. Under BACKTRACK control structure, the hierarchy of activations of functions previously executed is maintained so that it is possible to revert to any previous state. Thus programs can easily manipulate elaborate hypothetical tentative states. In addition PLANNER uses multiprocessing so that there can be multiple loci of changes in state. Goals can be established and dismissed when they are satisfied. The deductive system of PLANNER is subordinate to the hierarchical control structure in order to maintain the desired degree of control. The use of a general-purpose matching language as the basis of the deductive system increases the flexibility of the system. Instead of explicitly naming procedures in calls, procedures can be invoked implicitly by patterns of what the procedure is supposed to accomplish. The language is being applied to solve problems faced by a robot, to write special purpose routines from goal oriented language, to express and prove properties of procedures, to abstract procedures from protocols of their actions, and as a semantic base for English.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This study examines the L2 acquisition of word order variation in Spanish by three groups of L1 English learners in an instructed setting. The three groups represent learners at three different L2 proficiencies: beginners, intermediate and advanced. The aim of the study is to analyse the acquisition of word order variation in a situation where the target input is highly ambiguous, since two apparent optional forms exist in the target grammar, in order to examine how the optionality is disambiguated by learners from the earlier stages of learning to the more advanced. Our results support the hypothesis that an account based on a discourse-pragmatics deficit cannot satisfactorily explain learners’ non-targetlike representations in the contexts analysed in our study.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The capacity to attribute beliefs to others in order to understand action is one of the mainstays of human cognition. Yet it is debatable whether children attribute beliefs in the same way to all agents. In this paper, we present the results of a false-belief task concerning humans and God run with a sample of Maya children aged 4–7, and place them in the context of several psychological theories of cognitive development. Children were found to attribute beliefs in different ways to humans and God. The evidence also speaks to the debate concerning the universality and uniformity of the development of folk-psychological reasoning.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We as language instructors are tasked with preparing students to transition from language to literature courses. The shorter length of many poems makes them ideal for presentation in the language classroom, where the acquisition of communicative competence is the priority. Introductory and intermediate textbooks’ poetry offerings, however, are frequently drawn from a canon of poems by only a few nineteenth- and twentieth-century authors (Verlaine, Apollinaire, Prévert) and fail to expose students to broader aspects of French literature. This article offers strategies for presenting pre-nineteenth-century poetry to first- and second-year students of French using dizains from Scève’s Délie as examples.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This Ph.D., by thesis, proposes a speculative lens to read Internet Art via the concept of digital debris. In order to do so, the research explores the idea of digital debris in Internet Art from 1993 to 2011 in a series of nine case studies. Here, digital debris are understood as words typed in search engines and which then disappear; bits of obsolete codes which are lingering on the Internet, abandoned website, broken links or pieces of ephemeral information circulating on the Internet and which are used as a material by practitioners. In this context, the thesis asks what are digital debris? The thesis argues that the digital debris of Internet Art represent an allegorical and entropic resistance to the what Art Historian David Joselit calls the Epistemology of Search. The ambition of the research is to develop a language in-between the agency of the artist and the autonomy of the algorithm, as a way of introducing Internet Art to a pluridisciplinary audience, hence the presence of the comparative studies unfolding throughout the thesis, between Internet Art and pionners in the recycling of waste in art, the use of instructions as a medium and the programming of poetry. While many anthropological and ethnographical studies are concerned with the material object of the computer as debris once it becomes obsolete, very few studies have analysed waste as discarded data. The research shifts the focus from an industrial production of digital debris (such as pieces of hardware) to obsolete pieces of information in art practice. The research demonstrates that illustrations of such considerations can be found, for instance, in Cory Arcangel’s work Data Diaries (2001) where QuickTime files are stolen, disassembled, and then re-used in new displays. The thesis also looks at Jodi’s approach in Jodi.org (1993) and Asdfg (1998), where websites and hyperlinks are detourned, deconstructed, and presented in abstract collages that reveals the architecture of the Internet. The research starts in a typological manner and classifies the pieces of Internet Art according to the structure at play in the work. Indeed if some online works dealing with discarded documents offer a self-contained and closed system, others nurture the idea of openness and unpredictability. The thesis foregrounds the ideas generated through the artworks and interprets how those latter are visually constructed and displayed. Not only does the research questions the status of digital debris once they are incorporated into art practice but it also examine the method according to which they are retrieved, manipulated and displayed to submit that digital debris of Internet Art are the result of both semantic and automated processes, rendering them both an object of discourse and a technical reality. Finally, in order to frame the serendipity and process-based nature of the digital debris, the Ph.D. concludes that digital debris are entropic . In other words that they are items of language to-be, paradoxically locked in a constant state of realisation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis will introduce a new strongly typed programming language utilizing Self types, named Win--*Foy, along with a suitable user interface designed specifically to highlight language features. The need for such a programming language is based on deficiencies found in programming languages that support both Self types and subtyping. Subtyping is a concept that is taken for granted by most software engineers programming in object-oriented languages. Subtyping supports subsumption but it does not support the inheritance of binary methods. Binary methods contain an argument of type Self, the same type as the object itself, in a contravariant position, i.e. as a parameter. There are several arguments in favour of introducing Self types into a programming language (11. This rationale led to the development of a relation that has become known as matching [4, 5). The matching relation does not support subsumption, however, it does support the inheritance of binary methods. Two forms of matching have been proposed (lJ. Specifically, these relations are known as higher-order matching and I-bound matching. Previous research on these relations indicates that the higher-order matching relation is both reflexive and transitive whereas the f-bound matching is reflexive but not transitive (7]. The higher-order matching relation provides significant flexibility regarding inheritance of methods that utilize or return values of the same type. This flexibility, in certain situations, can restrict the programmer from defining specific classes and methods which are based on constant values [21J. For this reason, the type This is used as a second reference to the type of the object that cannot, contrary to Self, be specialized in subclasses. F-bound matching allows a programmer to define a function that will work for all types of A', a subtype of an upper bound function of type A, with the result type being dependent on A'. The use of parametric polymorphism in f-bound matching provides a connection to subtyping in object-oriented languages. This thesis will contain two main sections. Firstly, significant details concerning deficiencies of the subtype relation and the need to introduce higher-order and f-bound matching relations into programming languages will be explored. Secondly, a new programming language named Win--*Foy Functional Object-Oriented Programming Language has been created, along with a suitable user interface, in order to facilitate experimentation by programmers regarding the matching relation. The construction of the programming language and the user interface will be explained in detail.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A certificate of initiation and acceptance to the Canadian Order Chosen Friends, Thomas Cowan. The certificate reads "This certifies that evidence has been received that Thomas Cowan has been accepted and initiated by the Council name below, and has thus become a member of the Canadian Order of Chosen Friends, and entitled to all the rights and privileges of membership and a benefit of not exceeding one thousand dollars from the relief fund of said order, which shall in case of death be paid to Annie Cowan his wife in the manner and subject to the conditions set forth in the laws governing said relief fund and in the application for membership. This certificate to be in force and binding when accepted in writing by the said member, with the acceptance attested by the Councilor and Recorder and the seal of the Subordinate Council affixed, so long as said member shall comply with the requirements of the Constitution, Laws and Regulations now in force or hereafter adopted for the government of the Order: otherwise, and also in the case of granting of a new certificate, to be null and void. In witness whereof, we have hereunto attached our signatures, and affixed the seal of the Grand Council of the Canadian Order of Chosen Friends. Dated the Twenty Seventh day of July, A.D. 1891." The front and back of the certificate are available for viewing.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Lattice valued fuzziness is more general than crispness or fuzziness based on the unit interval. In this work, we present a query language for a lattice based fuzzy database. We define a Lattice Fuzzy Structured Query Language (LFSQL) taking its membership values from an arbitrary lattice L. LFSQL can handle, manage and represent crisp values, linear ordered membership degrees and also allows membership degrees from lattices with non-comparable values. This gives richer membership degrees, and hence makes LFSQL more flexible than FSQL or SQL. In order to handle vagueness or imprecise information, every entry into an L-fuzzy database is an L-fuzzy set instead of crisp values. All of this makes LFSQL an ideal query language to handle imprecise data where some factors are non-comparable. After defining the syntax of the language formally, we provide its semantics using L-fuzzy sets and relations. The semantics can be used in future work to investigate concepts such as functional dependencies. Last but not least, we present a parser for LFSQL implemented in Haskell.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Il est connu que les problèmes d'ambiguïté de la langue ont un effet néfaste sur les résultats des systèmes de Recherche d'Information (RI). Toutefois, les efforts de recherche visant à intégrer des techniques de Désambiguisation de Sens (DS) à la RI n'ont pas porté fruit. La plupart des études sur le sujet obtiennent effectivement des résultats négatifs ou peu convaincants. De plus, des investigations basées sur l'ajout d'ambiguïté artificielle concluent qu'il faudrait une très haute précision de désambiguation pour arriver à un effet positif. Ce mémoire vise à développer de nouvelles approches plus performantes et efficaces, se concentrant sur l'utilisation de statistiques de cooccurrence afin de construire des modèles de contexte. Ces modèles pourront ensuite servir à effectuer une discrimination de sens entre une requête et les documents d'une collection. Dans ce mémoire à deux parties, nous ferons tout d'abord une investigation de la force de la relation entre un mot et les mots présents dans son contexte, proposant une méthode d'apprentissage du poids d'un mot de contexte en fonction de sa distance du mot modélisé dans le document. Cette méthode repose sur l'idée que des modèles de contextes faits à partir d'échantillons aléatoires de mots en contexte devraient être similaires. Des expériences en anglais et en japonais montrent que la force de relation en fonction de la distance suit généralement une loi de puissance négative. Les poids résultant des expériences sont ensuite utilisés dans la construction de systèmes de DS Bayes Naïfs. Des évaluations de ces systèmes sur les données de l'atelier Semeval en anglais pour la tâche Semeval-2007 English Lexical Sample, puis en japonais pour la tâche Semeval-2010 Japanese WSD, montrent que les systèmes ont des résultats comparables à l'état de l'art, bien qu'ils soient bien plus légers, et ne dépendent pas d'outils ou de ressources linguistiques. La deuxième partie de ce mémoire vise à adapter les méthodes développées à des applications de Recherche d'Information. Ces applications ont la difficulté additionnelle de ne pas pouvoir dépendre de données créées manuellement. Nous proposons donc des modèles de contextes à variables latentes basés sur l'Allocation Dirichlet Latente (LDA). Ceux-ci seront combinés à la méthodes de vraisemblance de requête par modèles de langue. En évaluant le système résultant sur trois collections de la conférence TREC (Text REtrieval Conference), nous observons une amélioration proportionnelle moyenne de 12% du MAP et 23% du GMAP. Les gains se font surtout sur les requêtes difficiles, augmentant la stabilité des résultats. Ces expériences seraient la première application positive de techniques de DS sur des tâches de RI standard.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Les systèmes statistiques de traduction automatique ont pour tâche la traduction d’une langue source vers une langue cible. Dans la plupart des systèmes de traduction de référence, l'unité de base considérée dans l'analyse textuelle est la forme telle qu’observée dans un texte. Une telle conception permet d’obtenir une bonne performance quand il s'agit de traduire entre deux langues morphologiquement pauvres. Toutefois, ceci n'est plus vrai lorsqu’il s’agit de traduire vers une langue morphologiquement riche (ou complexe). Le but de notre travail est de développer un système statistique de traduction automatique comme solution pour relever les défis soulevés par la complexité morphologique. Dans ce mémoire, nous examinons, dans un premier temps, un certain nombre de méthodes considérées comme des extensions aux systèmes de traduction traditionnels et nous évaluons leurs performances. Cette évaluation est faite par rapport aux systèmes à l’état de l’art (système de référence) et ceci dans des tâches de traduction anglais-inuktitut et anglais-finnois. Nous développons ensuite un nouvel algorithme de segmentation qui prend en compte les informations provenant de la paire de langues objet de la traduction. Cet algorithme de segmentation est ensuite intégré dans le modèle de traduction à base d’unités lexicales « Phrase-Based Models » pour former notre système de traduction à base de séquences de segments. Enfin, nous combinons le système obtenu avec des algorithmes de post-traitement pour obtenir un système de traduction complet. Les résultats des expériences réalisées dans ce mémoire montrent que le système de traduction à base de séquences de segments proposé permet d’obtenir des améliorations significatives au niveau de la qualité de la traduction en terme de le métrique d’évaluation BLEU (Papineni et al., 2002) et qui sert à évaluer. Plus particulièrement, notre approche de segmentation réussie à améliorer légèrement la qualité de la traduction par rapport au système de référence et une amélioration significative de la qualité de la traduction est observée par rapport aux techniques de prétraitement de base (baseline).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Dans le but d’optimiser la représentation en mémoire des enregistrements Scheme dans le compilateur Gambit, nous avons introduit dans celui-ci un système d’annotations de type et des vecteurs contenant une représentation abrégée des enregistrements. Ces derniers omettent la référence vers le descripteur de type et l’entête habituellement présents sur chaque enregistrement et utilisent plutôt un arbre de typage couvrant toute la mémoire pour retrouver le vecteur contenant une référence. L’implémentation de ces nouvelles fonctionnalités se fait par le biais de changements au runtime de Gambit. Nous introduisons de nouvelles primitives au langage et modifions l’architecture existante pour gérer correctement les nouveaux types de données. On doit modifier le garbage collector pour prendre en compte des enregistrements contenants des valeurs hétérogènes à alignements irréguliers, et l’existence de références contenues dans d’autres objets. La gestion de l’arbre de typage doit aussi être faite automatiquement. Nous conduisons ensuite une série de tests de performance visant à déterminer si des gains sont possibles avec ces nouvelles primitives. On constate une amélioration majeure de performance au niveau de l’allocation et du comportement du gc pour les enregistrements typés de grande taille et des vecteurs d’enregistrements typés ou non. De légers surcoûts sont toutefois encourus lors des accès aux champs et, dans le cas des vecteurs d’enregistrements, au descripteur de type.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The paratext framework is now used in a variety of fields to assess, measure, analyze, and comprehend the elements that provide thresholds, allowing scholars to better understand digital objects. Researchers from many disciplines revisit paratextual theories in order to grasp what surrounds text in the digital age. Examining Paratextual Theory and its Applications in Digital Culture suggests a theoretical and practical tool for building bridges between disciplines interested in conducting joint research and exploration of digital culture. Helping scholars from different fields find an interdisciplinary framework and common language to study digital objects, this book serves as a useful reference for academics, librarians, professionals, researchers, and students, offering a collaborative outlook and perspective.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

En este trabajo se describe la naturaleza y secuencia de adquisición de las preguntas interrogativas parciales en niños de habla catalana y/o castellana dentro de un marco de análisis según el cual la adquisición de las estructuras lingüísticas se construye gradualmente desde estructuras concretas hasta estructuras más abstractas. La muestra utilizada se compone de 10 niños y niñas procedentes de corpus longitudinales cuyas edades van de los 17 meses a los 3 años. El análisis se ha realizado atendiendo a la estructura sintáctica de la oración, los errores, los pronombres y adverbios interrogativos, y la tipología verbal. Los resultados muestran que la secuencia de adquisición pasa por un momento inicial caracterizado por producciones estereotipadas o fórmulas, durante el cual sólo aparecen algunas partículas interrogativas en estructuras muy concretas. Posteriormente la interrogación aparece con otros pronombres y adverbios y se diversifica a otros verbos, además, no se observan errores en la construcción sintáctica. Estos resultados suponen un hecho diferencial respecto de estudios previos en lengua inglesa