918 results for Lexical semantic classes
Abstract:
This study evaluates the differing claims of the Aspect Hypothesis (Andersen & Shirai 1996) and the Sentential Aspect Hypothesis (Sharma & Deo 2009) for perfective marking by L1 English learners of Mandarin. The AH predicts a narrow focus on inherent lexical aspect (the verb and predicate) in determining the use of the perfective marker le, whilst the SAH suggests that, subject to L1 influence, perfective marking agrees with the final derived aspectual class of the sentence. To test these claims, data were collected using a controlled le-insertion task, combined with oral corpus data. The results show that learners' perfective marking patterns with the sentential aspectual class and not inherent lexical aspect (where these differ), and that overall the sentential aspectual class better predicts learners' assignment of perfective marking than lexical aspect.
Abstract:
We present an account of semantic representation that focuses on distinct types of information from which word meanings can be learned. In particular, we argue that there are at least two major types of information from which we learn word meanings. The first is what we call experiential information. This is data derived both from our sensory-motor interactions with the outside world and from our experience of our own inner states, particularly our emotions. The second type of information is language-based. In particular, it is derived from the general linguistic context in which words appear. The paper spells out this proposal, summarizes research supporting this view and presents new predictions emerging from this framework.
Abstract:
This study contributes to ongoing discussions on how measures of lexical diversity (LD) can help discriminate between essays from second language learners of English whose work has been assessed as belonging to levels B1 to C2 of the Common European Framework of Reference (CEFR). The focus is in particular on how different operationalisations of what constitutes a "different word" (type) affect the LD measures themselves and their ability to discriminate between CEFR levels. The results show that basic measures of LD, such as the number of different words, the TTR (Templin 1957) and the Index of Guiraud (Guiraud 1954), explain more variance in the CEFR levels than sophisticated measures, such as D (Malvern et al. 2004), HD-D (McCarthy and Jarvis 2007) and MTLD (McCarthy 2005), provided text length is kept constant across texts. A simple count of different words (defined as lemmas and not as word families) was the best predictor of CEFR levels and explained 22 percent of the variance in overall scores on the Pearson Test of English Academic in essays written by 176 test takers.
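The basic measures named in this abstract can be illustrated with a minimal sketch. It assumes the standard definitions TTR = V/N and Guiraud's index = V/sqrt(N), where V is the number of types and N the number of tokens. The function name `lexical_diversity` and the sample sentence are hypothetical, and the sketch treats every distinct string as a type, which is only one of the operationalisations the study compares (it does no lemmatisation).

```python
import math

def lexical_diversity(tokens):
    """Return (number of types, TTR, Guiraud index) for a list of tokens.

    Assumption: a "type" is any distinct token string; the study itself
    compares several definitions (e.g. lemmas vs. word families).
    """
    if not tokens:
        raise ValueError("tokens must be non-empty")
    n_tokens = len(tokens)
    n_types = len(set(tokens))
    ttr = n_types / n_tokens                  # type-token ratio, V / N
    guiraud = n_types / math.sqrt(n_tokens)   # Guiraud's index, V / sqrt(N)
    return n_types, ttr, guiraud

# Hypothetical sample text: 11 tokens, 8 distinct word forms.
sample = "the cat sat on the mat and the dog sat too".split()
types, ttr, guiraud = lexical_diversity(sample)
print(types, round(ttr, 3), round(guiraud, 3))
```

Both ratios depend on N, which is consistent with the abstract's condition that text length be kept constant when comparing texts.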
Abstract:
The incorporation of new representations into the mental lexicon has raised numerous questions about the organisational principles that govern the process. A number of studies have argued that similarity between the new L3 items and existing representations in the L1 and L2 is the main incorporating force (Hall & Ecke, 2003; Herwig, 2001). Experimental evidence obtained through a primed picture-naming task with L1 Polish-L2 English learners of L3 Russian supports Hall and Ecke’s Parasitic Model of L3 vocabulary acquisition, displaying a significant main effect for both priming and proficiency. These results complement current models of vocabulary acquisition and lexical access in multilingual speakers.
Abstract:
This study uses the Deese-Roediger-McDermott paradigm to investigate how deaf children with cochlear implants organize their semantic networks as compared to their hearing age-mates.
Abstract:
The viability of two different classes of Lambda(t)CDM cosmologies is tested using APM 08279+5255, an old quasar at redshift z = 3.91. In the first class of models, the cosmological term scales as Lambda(t) ~ R^(-n). The particular case n = 0 describes the standard Lambda CDM model, whereas n = 2 stands for the Chen and Wu model. For an estimated age of 2 Gyr, it is found that the power index has a lower limit n > 0.21, whereas for 3 Gyr the limit is n > 0.6. Since n cannot be as large as ~0.81, the Lambda CDM and Chen and Wu models are also ruled out by this analysis. The second class of models is the one recently proposed by Wang and Meng, which describes several Lambda(t)CDM cosmologies discussed in the literature. Assuming that the true age is 2 Gyr, it is found that the epsilon parameter satisfies the lower bound epsilon > 0.11, while for 3 Gyr a lower limit of epsilon > 0.52 is obtained. These limits are slightly modified when the baryonic component is included.
Abstract:
In contrast to the many studies on the venoms of scorpions, spiders, snakes and cone snails, up to now there has been no report of the proteomic analysis of sea anemone venoms. In this work we report for the first time the peptide mass fingerprint and some novel peptides in the neurotoxic fraction (Fr III) of the sea anemone Bunodosoma cangicum venom. Fr III is neurotoxic to crabs and was purified by rp-HPLC in a C-18 column, yielding 41 fractions. By checking their molecular masses by ESI-Q-Tof and MALDI-Tof MS we found 81 components ranging from near 250 amu to approximately 6000 amu. Some of the peptidic molecules were partially sequenced through the automated Edman technique. Three of them are peptides of about 4500 amu belonging to the class of the BcIV, BDS-I, BDS-II, APETx1, APETx2 and Am-II toxins. Another three peptides represent a novel group of toxins (~3200 amu). A further three molecules (~4900 amu) belong to the group of type 1 sodium channel neurotoxins. When assayed on crab leg nerve compound action potentials, one of the BcIV- and APETx-like peptides exhibits an action similar to the type 1 sodium channel toxins in this preparation, suggesting the same target in this assay. On the other hand, one of the novel peptides, with 3176 amu, displayed an action similar to potassium channel blockage in this experiment. In summary, the proteomic analysis and mass fingerprint of fractions from sea anemone venoms through MS are valuable tools, allowing us to rapidly predict the occurrence of different groups of toxins and facilitating the search for and characterization of novel molecules without the need for full characterization of individual components by broader assays and bioassay-guided purifications. It also shows that sea anemones employ dozens of components for prey capture and defense. (C) 2008 Elsevier Inc. All rights reserved.
Abstract:
This paper is about the use of natural language to communicate with computers. Most research that has pursued this goal considers only requests expressed in English. One way to facilitate the use of several languages in natural language systems is to use an interlingua. An interlingua is an intermediary representation for natural language information that can be processed by machines. We propose to convert natural language requests into an interlingua [universal networking language (UNL)] and to execute these requests using software components. To achieve this goal, we propose OntoMap, an ontology-based architecture that performs the semantic mapping between UNL sentences and software components. OntoMap also performs component search and retrieval based on semantic information formalized in ontologies and rules.
Abstract:
This paper presents an approach for assisting low-literacy readers in accessing online information on the Web. The "Educational FACILITA" tool is a Web content adaptation tool that provides innovative features and follows more intuitive interaction models regarding accessibility concerns. In particular, we propose an interaction model and a Web application that explore the natural language processing tasks of lexical elaboration and named entity labeling for improving Web accessibility. We report on the results obtained from a pilot study on usability analysis carried out with low-literacy users. The preliminary results show that "Educational FACILITA" improves the comprehension of text elements, although the assistance mechanisms might also confuse users when word sense ambiguity is introduced by gathering, for a complex word, a list of synonyms with multiple meanings. This finding points to a future solution in which the correct sense of a complex word in a sentence is identified, resolving this pervasive characteristic of natural languages. The pilot study also found that experienced computer users find the tool more useful than novice computer users do.
Abstract:
Identifying the correct sense of a word in context is crucial for many tasks in natural language processing (machine translation is an example). State-of-the-art methods for Word Sense Disambiguation (WSD) build models using hand-crafted features that usually capture shallow linguistic information. Complex background knowledge, such as semantic relationships, is typically either not used or used in a specialised manner, due to the limitations of the feature-based modelling techniques used. On the other hand, empirical results from the use of Inductive Logic Programming (ILP) systems have repeatedly shown that they can use diverse sources of background knowledge when constructing models. In this paper, we investigate whether this ability of ILP systems could be used to improve the predictive accuracy of models for WSD. Specifically, we examine the use of a general-purpose ILP system as a method to construct a set of features using semantic, syntactic and lexical information. This feature set is then used by a common modelling technique in the field (a support vector machine) to construct a classifier for predicting the sense of a word. In our investigation we examine one-shot and incremental approaches to feature-set construction applied to monolingual and bilingual WSD tasks. The monolingual tasks use 32 verbs and 85 verbs and nouns (in English) from the SENSEVAL-3 and SemEval-2007 benchmarks, while the bilingual WSD task consists of 7 highly ambiguous verbs in translating from English to Portuguese. The results are encouraging: the ILP-assisted models show substantial improvements over those that simply use shallow features. In addition, incremental feature-set construction appears to identify smaller and better sets of features. Taken together, the results suggest that the use of ILP with diverse sources of background knowledge provides a way of making substantial progress in the field of WSD.
Abstract:
There is an increasing interest in the application of Evolutionary Algorithms (EAs) to induce classification rules. This hybrid approach can benefit areas where classical methods for rule induction have not been very successful. One example is the induction of classification rules in imbalanced domains. Imbalanced data occur when one or more classes heavily outnumber other classes. Frequently, classical machine learning (ML) classifiers are not able to learn in the presence of imbalanced data sets, inducing classification models that always predict the most numerous classes. In this work, we propose a novel hybrid approach to deal with this problem. We create several balanced data sets with all minority class cases and a random sample of majority class cases. These balanced data sets are fed to classical ML systems that produce rule sets. The rule sets are combined creating a pool of rules and an EA is used to build a classifier from this pool of rules. This hybrid approach has some advantages over undersampling, since it reduces the amount of discarded information, and some advantages over oversampling, since it avoids overfitting. The proposed approach was experimentally analysed and the experimental results show an improvement in the classification performance measured as the area under the receiver operating characteristics (ROC) curve.
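The data-preparation step described in this abstract (several balanced data sets, each with all minority class cases plus a random sample of majority class cases) might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name `balanced_subsets`, the choice to sample as many majority cases as there are minority cases, and the toy data are all hypothetical, and the later rule-induction and evolutionary steps are not shown.

```python
import random

def balanced_subsets(minority, majority, n_subsets, seed=0):
    """Build n_subsets data sets, each holding every minority case plus a
    random sample of majority cases.

    Assumption: each sample has as many majority cases as there are
    minority cases (the abstract does not state the sample size).
    """
    rng = random.Random(seed)          # seeded for reproducibility
    k = min(len(minority), len(majority))
    subsets = []
    for _ in range(n_subsets):
        sampled = rng.sample(majority, k)   # sampling without replacement
        subsets.append(list(minority) + sampled)
    return subsets

# Hypothetical toy data: (case id, class label) pairs.
minority = [("m%d" % i, 1) for i in range(3)]
majority = [("M%d" % i, 0) for i in range(10)]
subsets = balanced_subsets(minority, majority, n_subsets=4)
for s in subsets:
    print(s)
```

Because each subset draws a different majority sample, more of the majority class is seen overall than with a single undersampled set, which matches the abstract's point about reducing discarded information.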
Abstract:
Robotic mapping is the process of automatically constructing an environment representation using mobile robots. We address the problem of semantic mapping, which consists of using mobile robots to create maps that represent not only metric occupancy but also other properties of the environment. Specifically, we develop techniques to build maps that represent activity and navigability of the environment. Our approach to semantic mapping is to combine machine learning techniques with standard mapping algorithms. Supervised learning methods are used to automatically associate properties of space to the desired classification patterns. We present two methods, the first based on hidden Markov models and the second on support vector machines. Both approaches have been tested and experimentally validated in two problem domains: terrain mapping and activity-based mapping.
Abstract:
Model trees are a particular case of decision trees employed to solve regression problems. They have the advantage of presenting an interpretable output, helping the end-user to get more confidence in the prediction and providing the basis for the end-user to have new insight about the data, confirming or rejecting hypotheses previously formed. Moreover, model trees present an acceptable level of predictive performance in comparison to most techniques used for solving regression problems. Since generating the optimal model tree is an NP-Complete problem, traditional model tree induction algorithms make use of a greedy top-down divide-and-conquer strategy, which may not converge to the global optimal solution. In this paper, we propose a novel algorithm based on the use of the evolutionary algorithms paradigm as an alternate heuristic to generate model trees in order to improve the convergence to globally near-optimal solutions. We call our new approach evolutionary model tree induction (E-Motion). We test its predictive performance using public UCI data sets, and we compare the results to traditional greedy regression/model trees induction algorithms, as well as to other evolutionary approaches. Results show that our method presents a good trade-off between predictive performance and model comprehensibility, which may be crucial in many machine learning applications. (C) 2010 Elsevier Inc. All rights reserved.
Abstract:
OWL-S is an application of OWL, the Web Ontology Language, that describes the semantics of Web Services so that their discovery, selection, invocation and composition can be automated. The research literature reports the use of UML diagrams for the automatic generation of Semantic Web Service descriptions in OWL-S. This paper demonstrates a higher level of automation by generating complete Web applications from OWL-S descriptions that have themselves been generated from UML. Previously, we proposed an approach for processing OWL-S descriptions in order to produce MVC-based skeletons for Web applications. The OWL-S ontology undergoes a series of transformations in order to generate a Model-View-Controller application implemented by a combination of Java Beans, JSP, and Servlets code, respectively. In this paper, we show in detail the documents produced at each processing step. We highlight the connections between OWL-S specifications and executable code in the various Java dialects and show the Web interfaces that result from this process.
Abstract:
A group is said to have the R(infinity) property if every automorphism has an infinite number of twisted conjugacy classes. We study the question of whether G has the R(infinity) property when G is a finitely generated torsion-free nilpotent group. As a consequence, we show that for every positive integer n >= 5, there is a compact nilmanifold of dimension n on which every homeomorphism is isotopic to a fixed point free homeomorphism. As a by-product, we give a purely group theoretic proof that the free group on two generators has the R(infinity) property. The R(infinity) property for virtually abelian and for C-nilpotent groups is also discussed.