928 results for Lexical Database
Abstract:
The identification of cognates between two distinct languages has recently started to attract the attention of NLP research, but there has been little research into using semantic evidence to detect cognates. The approach presented in this paper aims to detect English-French cognates within monolingual texts (texts that are not accompanied by aligned translated equivalents), by integrating word shape similarity approaches with word sense disambiguation techniques in order to account for context. Our implementation is based on BabelNet, a semantic network that incorporates a multilingual encyclopedic dictionary. Our approach is evaluated on two manually annotated datasets. The first one shows that across different types of natural text, our method can identify the cognates with an overall accuracy of 80%. The second one, consisting of control sentences with semi-cognates acting as either true cognates or false friends, shows that our method can identify 80% of semi-cognates acting as cognates but also identifies 75% of the semi-cognates acting as false friends.
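The abstract does not say which word shape measure is used. As a minimal sketch, assuming a normalised edit distance stands in for "word shape similarity", the orthographic half of such a pipeline could look like the following. The function names `edit_distance` and `shape_similarity` are hypothetical, and the BabelNet-based sense disambiguation step is not shown.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (dynamic programming, one row at a time)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def shape_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]: 1 minus the edit distance divided by the longer length.

    Assumption: this is an illustrative measure, not the one used in the paper.
    """
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


if __name__ == "__main__":
    # "library" / "librairie" look alike but are false friends
    # (French "librairie" means bookshop), so shape alone cannot decide.
    print(shape_similarity("library", "librairie"))
    print(shape_similarity("nation", "nation"))
```

A high score only flags a candidate pair. Deciding whether the pair is a true cognate or a false friend in a given sentence is the job the abstract assigns to word sense disambiguation.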
Abstract:
Protein adsorption at solid-liquid interfaces is critical to many applications, including biomaterials, protein microarrays and lab-on-a-chip devices. Despite this general interest, and a large amount of research over the last half century, protein adsorption cannot be predicted with an engineering level, design-orientated accuracy. Here we describe a Biomolecular Adsorption Database (BAD), freely available online, which archives the published protein adsorption data. Piecewise linear regression with breakpoint applied to the data in the BAD suggests that the input variables to protein adsorption, i.e., protein concentration in solution; protein descriptors derived from primary structure (number of residues, global protein hydrophobicity and range of amino acid hydrophobicity, isoelectric point); surface descriptors (contact angle); and fluid environment descriptors (pH, ionic strength), correlate well with the output variable, the protein concentration on the surface. Furthermore, neural network analysis revealed that the size of the BAD makes it sufficiently representative, with a neural network-based predictive error of 5% or less. Interestingly, a consistently better fit is obtained if the BAD is divided into two separate sub-sets representing protein adsorption on hydrophilic and hydrophobic surfaces, respectively. Based on these findings, selected entries from the BAD have been used to construct neural network-based estimation routines, which predict the amount of adsorbed protein, the thickness of the adsorbed layer and the surface tension of the protein-covered surface. While the BAD is of general interest, the predictions of the thickness and the surface tension of the protein-covered layers are of particular relevance to the design of microfluidics devices.
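The abstract names piecewise linear regression with a breakpoint but gives no implementation details. The sketch below illustrates the general idea for a single predictor only: try every admissible split of the sorted data, fit an ordinary least-squares line on each side, and keep the split with the smallest total squared error. The function names and the toy data are hypothetical. The two segments are fitted independently, a simplification that does not force them to meet at the breakpoint.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, sum of squared errors)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("segment needs at least two distinct x values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse


def fit_breakpoint(xs, ys):
    """Search for the single split that minimises the total squared error.

    Returns (breakpoint, left_fit, right_fit); the breakpoint is reported as the
    midpoint between the last left point and the first right point.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for k in range(2, len(pairs) - 1):  # at least two points on each side
        left, right = pairs[:k], pairs[k:]
        try:
            left_fit = fit_line(*zip(*left))
            right_fit = fit_line(*zip(*right))
        except ValueError:
            continue  # a segment had no spread in x; skip this split
        total = left_fit[2] + right_fit[2]
        if best is None or total < best[0]:
            breakpoint = (left[-1][0] + right[0][0]) / 2
            best = (total, breakpoint, left_fit, right_fit)
    if best is None:
        raise ValueError("not enough data to place a breakpoint")
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Toy data (not from the BAD): slope 2 up to x = 4, then slope -1 from x = 5.
    xs = list(range(11))
    ys = [2 * x if x < 5 else 20 - x for x in xs]
    print(fit_breakpoint(xs, ys))
```

A multi-variable analysis such as the one described would need a multivariate regression on each side of the breakpoint, which this sketch does not attempt.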
Abstract:
Public buildings and large infrastructure are typically monitored by tens or hundreds of cameras, all capturing different physical spaces and observing different types of interactions and behaviours. However, to date, in large part due to limited data availability, crowd monitoring and operational surveillance research has focused on single camera scenarios which are not representative of real-world applications. In this paper we present a new, publicly available database for large scale crowd surveillance. Footage from 12 cameras for a full work day covering the main floor of a busy university campus building, including an internal and external foyer, elevator foyers, and the main external approach, is provided, alongside annotation for crowd counting (single or multi-camera) and pedestrian flow analysis for 10 and 6 sites respectively. We describe how this large dataset can be used to perform distributed monitoring of building utilisation, and demonstrate the potential of this dataset to understand and learn the relationship between different areas of a building.
Abstract:
Spoken word production is assumed to involve stages of processing in which activation spreads through layers of units comprising lexical-conceptual knowledge and their corresponding phonological word forms. Using high-field (4T) functional magnetic resonance imaging (fMRI), we assessed whether the relationship between these stages is strictly serial or involves cascaded-interactive processing, and whether central (decision/control) processing mechanisms are involved in lexical selection. Participants performed the competitor priming paradigm in which distractor words, named from a definition and semantically related to a subsequently presented target picture, slow picture-naming latency compared to that with unrelated words. The paradigm intersperses two trials between the definition and the picture to be named, temporally separating activation in the word perception and production networks. Priming semantic competitors of target picture names significantly increased activation in the left posterior temporal cortex, and to a lesser extent the left middle temporal cortex, consistent with the predictions of cascaded-interactive models of lexical access. In addition, extensive activation was detected in the anterior cingulate and pars orbitalis of the inferior frontal gyrus. The findings indicate that lexical selection during competitor priming is biased by top-down mechanisms to reverse associations between primed distractor words and target pictures to select words that meet the current goal of speech.
Abstract:
The speed at which target pictures are named increases monotonically as a function of prior retrieval of other exemplars of the same semantic category and is unaffected by the number of intervening items. This cumulative semantic interference effect is generally attributed to three mechanisms: shared feature activation, priming and lexical-level selection. However, at least two additional mechanisms have been proposed: (1) a 'booster' to amplify lexical-level activation and (2) retrieval-induced forgetting (RIF). In a perfusion functional Magnetic Resonance Imaging (fMRI) experiment, we tested hypotheses concerning the involvement of all five mechanisms. Our results demonstrate that the cumulative interference effect is associated with perfusion signal changes in the left perirhinal and middle temporal cortices that increase monotonically according to the ordinal position of exemplars being named. The left inferior frontal gyrus (LIFG) also showed significant perfusion signal changes across ordinal presentations; however, these responses did not conform to a monotonically increasing function. None of the cerebral regions linked with RIF in prior neuroimaging and modelling studies showed significant effects. This might be due to methodological differences between the RIF paradigm and continuous naming as the latter does not involve practicing particular information. We interpret the results as indicating priming of shared features and lexical-level selection mechanisms contribute to the cumulative interference effect, while adding noise to a booster mechanism could account for the pattern of responses observed in the LIFG.
Abstract:
In two fMRI experiments, participants named pictures with superimposed distractors that were high or low in frequency or varied in terms of age of acquisition. Pictures superimposed with low-frequency words were named more slowly than those superimposed with high-frequency words, and late-acquired words interfered with picture naming to a greater extent than early-acquired words. The distractor frequency effect (Experiment 1) was associated with increased activity in left premotor and posterior superior temporal cortices, consistent with the operation of an articulatory response buffer and verbal self-monitoring system. Conversely, the distractor age-of-acquisition effect (Experiment 2) was associated with increased activity in the left middle and posterior middle temporal cortex, consistent with the operation of lexical level processes such as lemma and phonological word form retrieval. The spatially dissociated patterns of activity across the two experiments indicate that distractor effects in picture-word interference may occur at lexical or postlexical levels of processing in speech production.
Abstract:
Contemporary models of spoken word production assume conceptual feature sharing determines the speed with which objects are named in categorically-related contexts. However, statistical models of concept representation have also identified a role for feature distinctiveness, i.e., features that identify a single concept and serve to distinguish it quickly from other similar concepts. In three experiments we investigated whether distinctive features might explain reports of counter-intuitive semantic facilitation effects in the picture word interference (PWI) paradigm. In Experiment 1, categorically-related distractors matched in terms of semantic similarity ratings (e.g., zebra and pony) and manipulated with respect to feature distinctiveness (e.g., a zebra has stripes unlike other equine species) elicited interference effects of comparable magnitude. Experiments 2 and 3 investigated the role of feature distinctiveness with respect to reports of facilitated naming with part-whole distractor-target relations (e.g., a hump is a distinguishing part of a CAMEL, whereas knee is not, vs. an unrelated part such as plug). Related part distractors did not influence target picture naming latencies significantly when the part denoted by the related distractor was not visible in the target picture (whether distinctive or not; Experiment 2). When the part denoted by the related distractor was visible in the target picture, non-distinctive part distractors slowed target naming significantly at SOA of -150 ms (Experiment 3). Thus, our results show that semantic interference does occur for part-whole distractor-target relations in PWI, but only when distractors denote features shared with the target and other category exemplars. We discuss the implications of these results for some recently developed, novel accounts of lexical access in spoken word production.
Abstract:
How does the presence of a categorically related word influence picture naming latencies? In order to test competitive and noncompetitive accounts of lexical selection in spoken word production, we employed the picture–word interference (PWI) paradigm to investigate how conceptual feature overlap influences naming latencies when distractors are category coordinates of the target picture. Mahon et al. (2007. Lexical selection is not by competition: A reinterpretation of semantic interference and facilitation effects in the picture-word interference paradigm. Journal of Experimental Psychology. Learning, Memory, and Cognition, 33(3), 503–535. doi:10.1037/0278-7393.33.3.503) reported that semantically close distractors (e.g., zebra) facilitated target picture naming latencies (e.g., HORSE) compared to far distractors (e.g., whale). We failed to replicate a facilitation effect for within-category close versus far target–distractor pairings using near-identical materials based on feature production norms, instead obtaining reliably larger interference effects (Experiments 1 and 2). The interference effect did not show a monotonic increase across multiple levels of within-category semantic distance, although there was evidence of a linear trend when unrelated distractors were included in analyses (Experiment 2). Our results show that semantic interference in PWI is greater for semantically close than for far category coordinate relations, reflecting the extent of conceptual feature overlap between target and distractor. These findings are consistent with the assumptions of prominent competitive lexical selection models of speech production.
Abstract:
Visual information in the form of lip movements of the speaker has been shown to improve the performance of speech recognition and search applications. In our previous work, we proposed cross database training of synchronous hidden Markov models (SHMMs) to make use of external large and publicly available audio databases in addition to the relatively small given audio visual database. In this work, the cross database training approach is improved by performing an additional audio adaptation step, which enables audio visual SHMMs to benefit from audio observations of the external audio models before adding visual modality to them. The proposed approach outperforms the baseline cross database training approach in clean and noisy environments in terms of phone recognition accuracy as well as spoken term detection (STD) accuracy.
Abstract:
Speech recognition can be improved by using visual information in the form of lip movements of the speaker in addition to audio information. To date, state-of-the-art techniques for audio-visual speech recognition continue to use audio and visual data of the same database for training their models. In this paper, we present a new approach to make use of one modality of an external dataset in addition to a given audio-visual dataset. By so doing, it is possible to create more powerful models from other extensive audio-only databases and adapt them on our comparatively smaller multi-stream databases. Results show that the presented approach outperforms the widely adopted synchronous hidden Markov models (HMM) trained jointly on audio and visual data of a given audio-visual database for phone recognition by 29% relative. It also outperforms the external audio models trained on extensive external audio datasets and also internal audio models by 5.5% and 46% relative respectively. We also show that the proposed approach is beneficial in noisy environments where the audio source is affected by the environmental noise.
Abstract:
This article uses topological approaches to suggest that education is becoming-topological. Analyses presented in a recent double-issue of Theory, Culture & Society are used to demonstrate the utility of topology for education. In particular, the article explains education's topological character through examining the global convergence of education policy, testing and the discursive ranking of systems, schools and individuals in the promise of reforming education through the proliferation of regimes of testing at local and global levels that constitute a new form of governance through data. In this conceptualisation of global education policy, changes in the form and nature of testing combine with the emergence of global policy networks to change the nature of the local (national, regional, school and classroom) forces that operate through the ‘system’. While these forces change, they work through a discursivity that produces disciplinary effects, but in a different way. This new–old disciplinarity, or ‘database effect’, is here represented through a topological approach because of its utility for conceiving education in an increasingly networked world.
Abstract:
Fossils provide the principal basis for temporal calibrations, which are critical to the accuracy of divergence dating analyses. Translating fossil data into minimum and maximum bounds for calibrations is the most important, and often least appreciated, step of divergence dating. Properly justified calibrations require the synthesis of phylogenetic, paleontological, and geological evidence and can be difficult for non-specialists to formulate. The dynamic nature of the fossil record (e.g., new discoveries, taxonomic revisions, updates of global or local stratigraphy) requires that calibration data be updated continually lest they become obsolete. Here, we announce the Fossil Calibration Database (http://fossilcalibrations.org), a new open-access resource providing vetted fossil calibrations to the scientific community. Calibrations accessioned into this database are based on individual fossil specimens and follow best practices for phylogenetic justification and geochronological constraint. The associated Fossil Calibration Series, a calibration-themed publication series at Palaeontologia Electronica, will serve as one key pipeline for peer-reviewed calibrations to enter the database.