909 results for Lexical ambiguity
Abstract:
Lexical diversity measures are notoriously sensitive to variations of sample size and recent approaches to this issue typically involve the computation of the average variety of lexical units in random subsamples of fixed size. This methodology has been further extended to measures of inflectional diversity such as the average number of wordforms per lexeme, also known as the mean size of paradigm (MSP) index. In this contribution we argue that, while random sampling can indeed be used to increase the robustness of inflectional diversity measures, using a fixed subsample size is only justified under the hypothesis that the corpora that we compare have the same degree of lexematic diversity. In the more general case where they may have differing degrees of lexematic diversity, a more sophisticated strategy can and should be adopted. A novel approach to the measurement of inflectional diversity is proposed, aiming to cope not only with variations of sample size, but also with variations of lexematic diversity. The robustness of this new method is empirically assessed and the results show that while there is still room for improvement, the proposed methodology considerably attenuates the impact of lexematic diversity discrepancies on the measurement of inflectional diversity.
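As an illustration of the fixed-size subsampling baseline that this abstract discusses (not the novel method it proposes, which is not specified here), the following is a minimal sketch of the MSP index averaged over random subsamples. The function names, the `(wordform, lexeme)` token representation, and the toy corpus are assumptions made for the example.

```python
import random
from collections import defaultdict


def mean_size_of_paradigm(tokens):
    """Average number of distinct wordforms per lexeme.

    `tokens` is a list of (wordform, lexeme) pairs; this input format is
    an assumption of the sketch, not something the abstract specifies.
    """
    forms_by_lexeme = defaultdict(set)
    for wordform, lexeme in tokens:
        forms_by_lexeme[lexeme].add(wordform)
    if not forms_by_lexeme:
        raise ValueError("cannot compute MSP of an empty sample")
    total_forms = sum(len(forms) for forms in forms_by_lexeme.values())
    return total_forms / len(forms_by_lexeme)


def subsampled_msp(tokens, sample_size, n_samples=100, seed=0):
    """Mean MSP over `n_samples` random subsamples of fixed size.

    Subsamples are drawn without replacement from the token list.
    """
    if sample_size > len(tokens):
        raise ValueError("sample_size exceeds corpus size")
    rng = random.Random(seed)
    values = [
        mean_size_of_paradigm(rng.sample(tokens, sample_size))
        for _ in range(n_samples)
    ]
    return sum(values) / len(values)


# Hypothetical toy corpus: three lexemes with 3, 2 and 1 wordforms.
toy_corpus = [
    ("walk", "walk"), ("walks", "walk"), ("walked", "walk"),
    ("cat", "cat"), ("cats", "cat"),
    ("the", "the"),
]
print(mean_size_of_paradigm(toy_corpus))  # (3 + 2 + 1) / 3 = 2.0
print(subsampled_msp(toy_corpus, sample_size=4, n_samples=50))
```

Because every subsample has the same number of tokens, two corpora with different numbers of lexemes per subsample will spread those tokens over different numbers of paradigms, which is the confound the abstract argues a fixed subsample size does not address.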
Abstract:
Here we adopt a novel strategy to investigate phonological assembly. Participants performed a visual lexical decision task in English in which the letters in words and letter strings were delivered either sequentially (promoting phonological assembly) or simultaneously (not promoting phonological assembly). A region of interest analysis confirmed that regions previously associated with phonological assembly, in studies contrasting different word types (e.g. words versus pseudowords), were also identified using our novel task that controls for a number of confounding variables. Specifically, the left pars opercularis, the superior part of the ventral precentral gyrus and the supramarginal gyrus were all recruited more during sequential delivery than simultaneous delivery, even when various psycholinguistic characteristics of the stimuli were controlled. This suggests that sequential delivery of orthographic stimuli is a useful tool to explore how readers, with various levels of proficiency, use sublexical phonological processing during visual word recognition.
Abstract:
Biomedical research is currently facing a new type of challenge: an excess of information, both in terms of raw data from experiments and in the number of scientific publications describing their results. Mirroring the focus on data mining techniques to address the issues of structured data, there has recently been great interest in the development and application of text mining techniques to make more effective use of the knowledge contained in biomedical scientific publications, accessible only in the form of natural human language. This thesis describes research done in the broader scope of projects aiming to develop methods, tools and techniques for text mining tasks in general and for the biomedical domain in particular. The work described here involves more specifically the goal of extracting information from statements concerning relations of biomedical entities, such as protein-protein interactions. The approach taken is one using full parsing (syntactic analysis of the entire structure of sentences) and machine learning, aiming to develop reliable methods that can further be generalized to apply also to other domains. The five papers at the core of this thesis describe research on a number of distinct but related topics in text mining. In the first of these studies, we assessed the applicability of two popular general English parsers to biomedical text mining and, finding their performance limited, identified several specific challenges to accurate parsing of domain text. In a follow-up study focusing on parsing issues related to specialized domain terminology, we evaluated three lexical adaptation methods. We found that the accurate resolution of unknown words can considerably improve parsing performance and introduced a domain-adapted parser that reduced the error rate of the original by 10% while also roughly halving parsing time.
To establish the relative merits of parsers that differ in the applied formalisms and the representation given to their syntactic analyses, we have also developed evaluation methodology, considering different approaches to establishing comparable dependency-based evaluation results. We introduced a methodology for creating highly accurate conversions between different parse representations, demonstrating the feasibility of unification of diverse syntactic schemes under a shared, application-oriented representation. In addition to allowing formalism-neutral evaluation, we argue that such unification can also increase the value of parsers for domain text mining. As a further step in this direction, we analysed the characteristics of publicly available biomedical corpora annotated for protein-protein interactions and created tools for converting them into a shared form, thus contributing also to the unification of text mining resources. The introduced unified corpora allowed us to perform a task-oriented comparative evaluation of biomedical text mining corpora. This evaluation established clear limits on the comparability of results for text mining methods evaluated on different resources, prompting further efforts toward standardization. To support this and other research, we have also designed and annotated BioInfer, the first domain corpus of its size combining annotation of syntax and biomedical entities with a detailed annotation of their relationships. The corpus represents a major design and development effort of the research group, with manual annotation that identifies over 6000 entities, 2500 relationships and 28,000 syntactic dependencies in 1100 sentences. In addition to combining these key annotations for a single set of sentences, BioInfer was also the first domain resource to introduce a representation of entity relations that is supported by ontologies and able to capture complex, structured relationships.
Part I of this thesis presents a summary of this research in the broader context of a text mining system, and Part II contains reprints of the five included publications.
Abstract:
This paper gives a full description of the phonetics and phonology of Traditional Cockney and Popular London speech, treating these varieties as constituting a continuum rather than two separate dialects. Exemplification of the vowels, diphthongs and consonants is provided, both in isolated words and in connected speech, along with their range of variation. The frequencies of the vowels have been charted on the basis of the pronunciation of three elderly male speakers. Regarding the consonants, there are detailed observations on the features typically associated with the linguistic varieties examined: strong aspiration of unvoiced plosives, glottalization, H-dropping, L-vocalization and TH-fronting. A section on prosody provides coverage of lexical stress, rhythm and intonation. The paper takes into account up-to-date research on these phenomena, but does not deal with the most recent vowel shifts, some of which form part of Multi-cultural London English.
Abstract:
The visual angle that is projected by an object (e.g. a ball) on the retina depends on the object's size and distance. Without further information, however, the visual angle is ambiguous with respect to size and distance, because equal visual angles can be obtained from a big ball at a longer distance and a smaller one at a correspondingly shorter distance. Failure to recover the true 3D structure of the object (e.g. a ball's physical size) causing the ambiguous retinal image can lead to a timing error when catching the ball. Two opposing views currently prevail on how people resolve this ambiguity when estimating time to contact. One explanation challenges any inference about what causes the retinal image (i.e. the necessity to recover this 3D structure), and instead favors a direct analysis of optic flow. In contrast, the second view suggests that action timing could instead be based on obtaining an estimate of the 3D structure of the scene. With the latter, systematic errors will be predicted if our inference of the 3D structure fails to reveal the underlying cause of the retinal image. Here we show that hand closure in catching virtual balls is triggered by visual angle, using an assumption of a constant ball size. As a consequence of this assumption, hand closure starts when the ball is at a similar distance across trials. From that distance on, the remaining arrival time therefore depends on the ball's speed. In order to time the catch successfully, closing time was coupled with the ball's speed during the motor phase. This strategy led to an increased precision in catching but at the cost of committing systematic errors.
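The size-distance ambiguity described in this abstract can be made concrete with the standard geometric relation for the visual angle of an object viewed head-on, θ = 2·arctan(size / (2·distance)). The sketch below is illustrative only: the relation is textbook geometry rather than something stated in the abstract, and the ball sizes, distances, speeds, and function names are hypothetical.

```python
import math


def visual_angle(size, distance):
    """Visual angle (radians) subtended by an object of diameter `size`
    viewed head-on from `distance` (same length units)."""
    return 2.0 * math.atan(size / (2.0 * distance))


def distance_for_angle(size, angle):
    """Distance at which an object of diameter `size` subtends `angle`."""
    return size / (2.0 * math.tan(angle / 2.0))


def time_remaining(distance, speed):
    """Arrival time from `distance` for a ball approaching at constant `speed`."""
    return distance / speed


# A big ball far away and a small ball close by give the same visual angle.
big_far = visual_angle(0.20, 4.0)     # hypothetical 20 cm ball at 4 m
small_near = visual_angle(0.10, 2.0)  # hypothetical 10 cm ball at 2 m
print(math.isclose(big_far, small_near))

# If hand closure is triggered at a fixed visual angle and ball size is
# constant, the trigger distance is the same on every trial, so the time
# left before arrival depends only on the ball's speed.
trigger_distance = distance_for_angle(0.10, small_near)
for speed in (5.0, 10.0):  # hypothetical speeds in m/s
    print(speed, time_remaining(trigger_distance, speed))
```

Under these assumptions a faster ball leaves less time after the trigger, which is consistent with the abstract's report that closing time had to be coupled with the ball's speed.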
Abstract:
Objective: To construct a Portuguese language index of information on the practice of diagnostic radiology in order to improve the standardization of the medical language and terminology. Materials and Methods: A total of 61,461 definitive reports were collected from the database of the Radiology Information System at Hospital das Clínicas – Faculdade de Medicina de Ribeirão Preto (RIS/HCFMRP) as follows: 30,000 chest x-ray reports; 27,000 mammography reports; and 4,461 thyroid ultrasonography reports. The text mining technique was applied for the selection of terms, and the ANSI/NISO Z39.19-2005 standard was utilized to construct the index based on a thesaurus structure. The system was created in HTML. Results: The text mining resulted in a set of 358,236 (100%) words. Out of this total, 76,347 (21%) terms were selected to form the index. Such terms refer to anatomical pathology description, imaging techniques, equipment, type of study and some other composite terms. The index system was developed with 78,538 HTML web pages. Conclusion: The utilization of text mining on a radiological reports database has allowed the construction of a lexical system in the Portuguese language consistent with clinical practice in Radiology.
Abstract:
The role of grammatical class in lexical access and representation is still not well understood. Grammatical effects obtained in picture-word interference experiments have been argued to show the operation of grammatical constraints during lexicalization when syntactic integration is required by the task. Alternative views hold that the ostensibly grammatical effects actually derive from the coincidence of semantic and grammatical differences between lexical candidates. We present three picture-word interference experiments conducted in Spanish. In the first two, the semantic relatedness (related or unrelated) and the grammatical class (nouns or verbs) of the target and the distracter were manipulated in an infinitive form action naming task in order to disentangle their contributions to verb lexical access. In the third experiment, a possible confound between grammatical class and semantic domain (objects or actions) was eliminated by using action-nouns as distracters. A condition in which participants were asked to name the action pictures using an inflected form of the verb was also included to explore whether the need of syntactic integration modulated the appearance of grammatical effects. Whereas action-words (nouns or verbs), but not object-nouns, produced longer reaction times irrespective of their grammatical class in the infinitive condition, only verbs slowed latencies in the inflected form condition. Our results suggest that speech production relies on the exclusion of candidate responses that do not fulfil task-pertinent criteria like membership in the appropriate semantic domain or grammatical class. Taken together, these findings are explained by a response-exclusion account of speech output. This and alternative hypotheses are discussed.
Abstract:
The ¹H NMR data set of a series of 3-aryl (1,2,4)-oxadiazol-5-carbohydrazide benzylidene derivatives synthesized in our group was analyzed using the chemometric technique of principal component analysis (PCA). Applied to the original ¹H NMR data, PCA identified some misassignments of the aromatic proton chemical shifts. As a consequence of this multivariate analysis, nuclear Overhauser difference experiments were performed to investigate the ambiguity of other assignments of the ortho and meta aromatic hydrogens for the compound with the bromine substituent. The effect of the 1,2,4-oxadiazol group as an electron acceptor, mainly for the hydrogens 12,13, has been highlighted.
Abstract:
Performance-based studies on the psychological nature of linguistic competence can conceal significant differences in the brain processes that underlie native versus nonnative knowledge of language. Here we report results from the brain activity of very proficient early bilinguals performing a lexical decision task that illustrates this point. Two groups of Spanish-Catalan early bilinguals (Spanish-dominant and Catalan-dominant) were asked to decide whether a given form was a Catalan word or not. The nonwords were based on real words, with one vowel changed. In the experimental stimuli, the vowel change involved a Catalan-specific contrast that previous research had shown to be difficult for Spanish natives to perceive. In the control stimuli, the vowel switch involved contrasts common to Spanish and Catalan. The results indicated that the groups of bilinguals did not differ in their behavioral and event-related brain potential measurements for the control stimuli; both groups made very few errors and showed a larger N400 component for control nonwords than for control words. However, significant differences were observed for the experimental stimuli across groups: specifically, Spanish-dominant bilinguals showed great difficulty in rejecting experimental nonwords. Indeed, these participants not only showed very high error rates for these stimuli, but also did not show an error-related negativity effect in their erroneous nonword decisions. However, both groups of bilinguals showed a larger correct-related negativity when making correct decisions about the experimental nonwords. The results suggest that although some aspects of a second language system may show a remarkable lack of plasticity (like the acquisition of some foreign contrasts), first-language representations seem to be more dynamic in their capacity for adapting and incorporating new information.
Abstract:
The ERP repetition priming paradigm has been shown to be sensitive to the processing differences between regular and irregular verb forms in English and German. The purpose of the present study is to extend this research to a language with a different inflectional system, Spanish. The design (delayed visual repetition priming) was adopted from our previous study on English, and the specific linguistic phenomena we examined are priming relations between different kinds of stem (or root) forms. There were two experimental conditions: In the first condition, the prime and the target shared the same stem form, e.g., "ando-andar" [I walk-to walk], whereas in the second condition, the prime contained a marked (alternated) stem, e.g., "duermo-dormir" [I sleep-to sleep]. A reduced N400 was found for unmarked (nonalternated) stems in the primed condition, whereas marked stems showed no such effect. Moreover, control conditions demonstrated that the surface form properties (i.e., the different degree of phonetic and orthographic overlap between primes and targets) do not explain the observed priming difference. The ERP priming effect for verb forms with unmarked stems in Spanish is parallel to that found for regularly inflected verb forms in English and German. We argue that effective priming is possible because prime target pairs such as "ando-andar" access the same lexical entry for their stems. By contrast, verb forms with alternated stems (e.g., "duermo") constitute separate lexical entries, and are therefore less powerful primes for their corresponding base forms.
Abstract:
The purpose of this thesis is to contribute to the discussion of organizational commitment in business studies. The aim of the master's thesis is to identify which factors affect teleworkers' commitment and what additional challenges teleworking creates for commitment. The theoretical part of the thesis covers the different types of telework, the prerequisites for moving to telework, various commitment models, and the factors that affect commitment. Qualitative research methods were used in the study. The data consist of nine thematic interviews. The interviewed teleworkers evenly represent all forms of telework. Thematic analysis, which makes the data easier to structure, was used as the method of analysis. According to the results, teleworkers' organizational commitment is most affected by how challenging the work is, opportunities for career advancement, supervisory work, and the work atmosphere. According to the study, teleworking particularly affects the work atmosphere, the clarity of the job description, the feedback received on one's work, and internal communication.
Abstract:
This paper studies the initial development of certain language components. More precisely, we analyse the relation between three aspects that are closely involved in the grammar of the verb: morphological productivity, syntactic complexity, and verb vocabulary learning. The study is based on data about the relationship between lexical development and grammatical development, and also on proposals that a critical mass of vocabulary is needed in order to develop a grammatical component. The sample comprised six subjects who are monolingual or bilingual in Catalan and/or Spanish. Results show a morphological spurt some time after the learning of a certain quantity of verbs. Moreover, syntactic complexity is only evident some months after this morphological spurt.
Abstract:
This work studies the relationship between lexical development and morphosyntactic development. Specifically, we aim to explore the type of vocabulary that best predicts the development of verbal morphology and of grammatical complexity, and to establish the type of relationship between lexical development and morphosyntactic development. The sample comprises 517 children aged between 18 and 30 months. The data were collected using the Catalan adaptation of the MacArthur-Bates Communicative Development Inventories (CDI). The results show that the best predictor of morphological and grammatical development is closed-class vocabulary, together with general vocabulary. In addition, a predominantly linear relationship is observed between lexical development and morphosyntactic development.
Abstract:
The problem of understanding how humans perceive the quality of a reproduced image is of interest to researchers of many fields related to vision science and engineering: optics and material physics, image processing (compression and transfer), printing and media technology, and psychology. A measure for visual quality cannot be defined without ambiguity because it is ultimately the subjective opinion of an “end-user” observing the product. The purpose of this thesis is to devise computational methods to estimate the overall visual quality of prints, i.e. a numerical value that combines all the relevant attributes of the perceived image quality. The problem is limited to the perceived quality of printed photographs from the viewpoint of a consumer, and moreover, the study focuses only on digital printing methods, such as inkjet and electrophotography. The main contributions of this thesis are two novel methods to estimate the overall visual quality of prints. In the first method, the quality is computed as a visible difference between the reproduced image and the original digital (reference) image, which is assumed to have an ideal quality. The second method utilises instrumental print quality measures, such as colour densities, measured from printed technical test fields, and connects the instrumental measures to the overall quality via subjective attributes, i.e. attributes that directly contribute to the perceived quality, using a Bayesian network. Both approaches were evaluated and verified with real data, and shown to predict the subjective evaluation results well.
Abstract:
The main focus of the present thesis was on verbal episodic memory processes that are particularly vulnerable to preclinical and clinical Alzheimer’s disease (AD). Here these processes were studied by a word learning paradigm, cutting across the domains of memory and language learning studies. Moreover, the differentiation between normal aging, mild cognitive impairment (MCI) and AD was studied by the cognitive screening test CERAD. In study I, the aim was to examine how patients with amnestic MCI differ from healthy controls in the different CERAD subtests. Also, the sensitivity and specificity of the CERAD screening test to MCI and AD was examined, as previous studies on the sensitivity and specificity of the CERAD have not included MCI patients. The results indicated that MCI is characterized by an encoding deficit, as shown by the overall worse performance on the CERAD Wordlist learning test compared with controls. As a screening test, CERAD was not very sensitive to MCI. In study II, verbal learning and forgetting in amnestic MCI, AD and healthy elderly controls was investigated with an experimental word learning paradigm, where names of 40 unfamiliar objects (mainly archaic tools) were trained with or without semantic support. The object names were trained during a 4-day period and a follow-up was conducted one week, 4 weeks and 8 weeks after the training period. Manipulation of semantic support was included in the paradigm because it was hypothesized that semantic support might have some beneficial effects in the present learning task especially for the MCI group, as semantic memory is quite well preserved in MCI in contrast to episodic memory. We found that word learning was significantly impaired in MCI and AD patients, whereas forgetting patterns were similar across groups.
Semantic support showed a beneficial effect on object name retrieval in the MCI group 8 weeks after training, indicating that the MCI patients’ preserved semantic memory abilities compensated for their impaired episodic memory. The MCI group performed as well as the controls in the tasks tapping incidental learning and recognition memory, whereas the AD group showed impairment. Both the MCI and the AD group benefited less from phonological cueing than the controls. Our findings indicate that acquisition is compromised in both MCI and AD, whereas long-term retention is not affected to the same extent. Incidental learning and recognition memory seem to be well preserved in MCI. In studies III and IV, the neural correlates of naming newly learned objects were examined in healthy elderly subjects and in amnestic MCI patients by means of positron emission tomography (PET) right after the training period. The naming of newly learned objects by healthy elderly subjects recruited a left-lateralized network, including frontotemporal regions and the cerebellum, which was more extensive than the one related to the naming of familiar objects (study III). Semantic support showed no effects on the PET results for the healthy subjects. The observed activation increases may reflect lexical-semantic and lexical-phonological retrieval, as well as more general associative memory mechanisms. In study IV, compared to the controls, the MCI patients showed increased anterior cingulate activation when naming newly learned objects that had been learned without semantic support. This suggests a recruitment of additional executive and attentional resources in the MCI group.