23 resultados para Natural language processing (Computer science) -- TFC
em BORIS: Bern Open Repository and Information System - Berna - Suiça
Resumo:
Written text is an important component in the process of knowledge acquisition and communication. Poorly written text fails to deliver clear ideas to the reader no matter how revolutionary and ground-breaking these ideas are. Providing text with good writing style is essential to transfer ideas smoothly. While we have sophisticated tools to check for stylistic problems in program code, we do not apply the same techniques for written text. In this paper we present TextLint, a rule-based tool to check for common style errors in natural language. TextLint provides a structural model of written text and an extensible rule-based checking mechanism.
Resumo:
This article discusses the detection of discourse markers (DM) in dialog transcriptions, by human annotators and by automated means. After a theoretical discussion of the definition of DMs and their relevance to natural language processing, we focus on the role of like as a DM. Results from experiments with human annotators show that detection of DMs is a difficult but reliable task, which requires prosodic information from soundtracks. Then, several types of features are defined for automatic disambiguation of like: collocations, part-of-speech tags and duration-based features. Decision-tree learning shows that for like, nearly 70% precision can be reached, with near 100% recall, mainly using collocation filters. Similar results hold for well, with about 91% precision at 100% recall.
Resumo:
In his in uential article about the evolution of the Web, Berners-Lee [1] envisions a Semantic Web in which humans and computers alike are capable of understanding and processing information. This vision is yet to materialize. The main obstacle for the Semantic Web vision is that in today's Web meaning is rooted most often not in formal semantics, but in natural language and, in the sense of semiology, emerges not before interpretation and processing. Yet, an automated form of interpretation and processing can be tackled by precisiating raw natural language. To do that, Web agents extract fuzzy grassroots ontologies through induction from existing Web content. Inductive fuzzy grassroots ontologies thus constitute organically evolved knowledge bases that resemble automated gradual thesauri, which allow precisiating natural language [2]. The Web agents' underlying dynamic, self-organizing, and best-effort induction, enable a sub-syntactical bottom up learning of semiotic associations. Thus, knowledge is induced from the users' natural use of language in mutual Web interactions, and stored in a gradual, thesauri-like lexical-world knowledge database as a top-level ontology, eventually allowing a form of computing with words [3]. Since when computing with words the objects of computation are words, phrases and propositions drawn from natural languages, it proves to be a practical notion to yield emergent semantics for the Semantic Web. In the end, an improved understanding by computers on the one hand should upgrade human- computer interaction on the Web, and, on the other hand allow an initial version of human- intelligence amplification through the Web.
Resumo:
The goal of the present thesis was to investigate the production of code-switched utterances in bilinguals’ speech production. This study investigates the availability of grammatical-category information during bilingual language processing. The specific aim is to examine the processes involved in the production of Persian-English bilingual compound verbs (BCVs). A bilingual compound verb is formed when the nominal constituent of a compound verb is replaced by an item from the other language. In the present cases of BCVs the nominal constituents are replaced by a verb from the other language. The main question addressed is how a lexical element corresponding to a verb node can be placed in a slot that corresponds to a noun lemma. This study also investigates how the production of BCVs might be captured within a model of BCVs and how such a model may be integrated within incremental network models of speech production. In the present study, both naturalistic and experimental data were used to investigate the processes involved in the production of BCVs. In the first part of the present study, I collected 2298 minutes of a popular Iranian TV program and found 962 code-switched utterances. In 83 (8%) of the switched cases, insertions occurred within the Persian compound verb structure, hence, resulting in BCVs. As to the second part of my work, a picture-word interference experiment was conducted. This study addressed whether in the case of the production of Persian-English BCVs, English verbs compete with the corresponding Persian compound verbs as a whole, or whether English verbs compete with the nominal constituents of Persian compound verbs only. Persian-English bilinguals named pictures depicting actions in 4 conditions in Persian (L1). In condition 1, participants named pictures of action using the whole Persian compound verb in the context of its English equivalent distractor verb. In condition 2, only the nominal constituent was produced in the presence of the light verb of the target Persian compound verb and in the context of a semantically closely related English distractor verb. In condition 3, the whole Persian compound verb was produced in the context of a semantically unrelated English distractor verb. In condition 4, only the nominal constituent was produced in the presence of the light verb of the target Persian compound verb and in the context of a semantically unrelated English distractor verb. The main effect of linguistic unit was significant by participants and items. Naming latencies were longer in the nominal linguistic unit compared to the compound verb (CV) linguistic unit. That is, participants were slower to produce the nominal constituent of compound verbs in the context of a semantically closely related English distractor verb compared to producing the whole compound verbs in the context of a semantically closely related English distractor verb. The three-way interaction between version of the experiment (CV and nominal versions), linguistic unit (nominal and CV linguistic units), and relation (semantically related and unrelated distractor words) was significant by participants. In both versions, naming latencies were longer in the semantically related nominal linguistic unit compared to the response latencies in the semantically related CV linguistic unit. In both versions, naming latencies were longer in the semantically related nominal linguistic unit compared to response latencies in the semantically unrelated nominal linguistic unit. Both the analysis of the naturalistic data and the results of the experiment revealed that in the case of the production of the nominal constituent of BCVs, a verb from the other language may compete with a noun from the base language, suggesting that grammatical category does not necessarily provide a constraint on lexical access during the production of the nominal constituent of BCVs. There was a minimal context in condition 2 (the nominal linguistic unit) in which the nominal constituent was produced in the presence of its corresponding light verb. The results suggest that generating words within a context may not guarantee that the effect of grammatical class becomes available. A model is proposed in order to characterize the processes involved in the production of BCVs. Implications for models of bilingual language production are discussed.
Resumo:
In this paper we propose a new system that allows reliable acetabular cup placement when the THA is operated in lateral approach. Conceptually it combines the accuracy of computer-generated patient-specific morphology information with an easy-to-use mechanical guide, which effectively uses natural gravity as the angular reference. The former is achieved by using a statistical shape model-based 2D-3D reconstruction technique that can generate a scaled, patient-specific 3D shape model of the pelvis from a single conventional anteroposterior (AP) pelvic X-ray radiograph. The reconstructed 3D shape model facilitates a reliable and accurate co-registration of the mechanical guide with the patient’s anatomy in the operating theater. We validated the accuracy of our system by conducting experiments on placing seven cups to four pelvises with different morphologies. Taking the measurements from an image-free navigation system as the ground truth, our system showed an average accuracy of 2.1 ±0.7 o for inclination and an average accuracy of 1.2 ±1.4 o for anteversion.
Resumo:
Object-oriented meta-languages such as MOF or EMOF are often used to specify domain specific languages. However, these meta-languages lack the ability to describe behavior or operational semantics. Several approaches used a subset of Java mixed with OCL as executable meta-languages. In this paper, we report our experience of using Smalltalk as an executable and integrated meta-language. We validated this approach in incrementally building over the last decade, Moose, a meta-described reengineering environment. The reflective capabilities of Smalltalk support a uniform way of letting the base developer focus on his tasks while at the same time allowing him to meta-describe his domain model. The advantage of our this approach is that the developer uses the same tools and environment
Resumo:
As domain-specific modeling begins to attract widespread acceptance, pressure is increasing for the development of new domain-specific languages. Unfortunately these DSLs typically conflict with the grammar of the host language, making it difficult to compose hybrid code except at the level of strings; few mechanisms (if any) exist to control the scope of usage of multiple DSLs; and, most seriously, existing host language tools are typically unaware of the DSL extensions, thus hampering the development process. Language boxes address these issues by offering a simple, modular mechanism to encapsulate (i) compositional changes to the host language, (ii) transformations to address various concerns such as compilation and highlighting, and (iii) scoping rules to control visibility of language extensions. We describe the design and implementation of language boxes, and show with the help of several examples how modular extensions can be introduced to a host language and environment.
Resumo:
Brain processing of grammatical word class was studied analyzing event-related potential (ERP) brain fields. Normal subjects observed a randomized sequence of single German nouns and verbs on a computer screen, while 20-channel ERP field map series were recorded separately for both word classes. Spatial microstate analysis was applied, based on the observation that series of ERP maps consist of epochs of quasi-stable map landscapes and based on the rationale that different map landscapes must have been generated by different neural generators and thus suggest different brain functions. Space-oriented segmentation of the mean map series identified nine successive, different functional microstates, i.e., steps of brain information processing characterized by quasi-stable map landscapes. In the microstate from 116 to 172 msec, noun-related maps differed significantly from verb-related maps along the left–right axis. The results indicate that different neural populations represent different grammatical word classes in language processing, in agreement with clinical observations. This word class differentiation as revealed by the spatial–temporal organization of neural activity occurred at a time after word input compatible with speed of reading.
Resumo:
Online reputation management deals with monitoring and influencing the online record of a person, an organization or a product. The Social Web offers increasingly simple ways to publish and disseminate personal or opinionated information, which can rapidly have a disastrous influence on the online reputation of some of the entities. The author focuses on the Social Web and possibilities of its integration with the Semantic Web as resource for a semi-automated tracking of online reputations using imprecise natural language terms. The inherent structure of natural language supports humans not only in communication but also in the perception of the world. Thereby fuzziness is a promising tool for transforming those human perceptions into computer artifacts. Through fuzzy grassroots ontologies, the Social Semantic Web becomes more naturally and thus can streamline online reputation management. For readers interested in the cross-over field of computer science, information systems, and social sciences, this book is an ideal source for becoming acquainted with the evolving field of fuzzy online reputation management in the Social Semantic Web area.
Resumo:
Researchers suggest that personalization on the Semantic Web adds up to a Web 3.0 eventually. In this Web, personalized agents process and thus generate the biggest share of information rather than humans. In the sense of emergent semantics, which supplements traditional formal semantics of the Semantic Web, this is well conceivable. An emergent Semantic Web underlying fuzzy grassroots ontology can be accomplished through inducing knowledge from users' common parlance in mutual Web 2.0 interactions [1]. These ontologies can also be matched against existing Semantic Web ontologies, to create comprehensive top-level ontologies. On the Web, if augmented with information in the form of restrictions andassociated reliability (Z-numbers) [2], this collection of fuzzy ontologies constitutes an important basis for an implementation of Zadeh's restriction-centered theory of reasoning and computation (RRC) [3]. By considering real world's fuzziness, RRC differs from traditional approaches because it can handle restrictions described in natural language. A restriction is an answer to a question of the value of a variable such as the duration of an appointment. In addition to mathematically well-defined answers, RRC can likewise deal with unprecisiated answers as "about one hour." Inspired by mental functions, it constitutes an important basis to leverage present-day Web efforts to a natural Web 3.0. Based on natural language information, RRC may be accomplished with Z-number calculation to achieve a personalized Web reasoning and computation. Finally, through Web agents' understanding of natural language, they can react to humans more intuitively and thus generate and process information.