55 resultados para Lexical Database
Resumo:
Background Kiwifruit (Actinidia spp.) are a relatively new, but economically important crop grown in many different parts of the world. Commercial success is driven by the development of new cultivars with novel consumer traits including flavor, appearance, healthful components and convenience. To increase our understanding of the genetic diversity and gene-based control of these key traits in Actinidia, we have produced a collection of 132,577 expressed sequence tags (ESTs). Results The ESTs were derived mainly from four Actinidia species (A. chinensis, A. deliciosa, A. arguta and A. eriantha) and fell into 41,858 non redundant clusters (18,070 tentative consensus sequences and 23,788 EST singletons). Analysis of flavor and fragrance-related gene families (acyltransferases and carboxylesterases) and pathways (terpenoid biosynthesis) is presented in comparison with a chemical analysis of the compounds present in Actinidia including esters, acids, alcohols and terpenes. ESTs are identified for most genes in color pathways controlling chlorophyll degradation and carotenoid biosynthesis. In the health area, data are presented on the ESTs involved in ascorbic acid and quinic acid biosynthesis showing not only that genes for many of the steps in these pathways are represented in the database, but that genes encoding some critical steps are absent. In the convenience area, genes related to different stages of fruit softening are identified. Conclusion This large EST resource will allow researchers to undertake the tremendous challenge of understanding the molecular basis of genetic diversity in the Actinidia genus as well as provide an EST resource for comparative fruit genomics. The various bioinformatics analyses we have undertaken demonstrates the extent of coverage of ESTs for genes encoding different biochemical pathways in Actinidia.
Resumo:
The location of previously unseen and unregistered individuals in complex camera networks from semantic descriptions is a time consuming and often inaccurate process carried out by human operators, or security staff on the ground. To promote the development and evaluation of automated semantic description based localisation systems, we present a new, publicly available, unconstrained 110 sequence database, collected from 6 stationary cameras. Each sequence contains detailed semantic information for a single search subject who appears in the clip (gender, age, height, build, hair and skin colour, clothing type, texture and colour), and between 21 and 290 frames for each clip are annotated with the target subject location (over 11,000 frames are annotated in total). A novel approach for localising a person given a semantic query is also proposed and demonstrated on this database. The proposed approach incorporates clothing colour and type (for clothing worn below the waist), as well as height and build to detect people. A method to assess the quality of candidate regions, as well as a symmetry driven approach to aid in modelling clothing on the lower half of the body, is proposed within this approach. An evaluation on the proposed dataset shows that a relative improvement in localisation accuracy of up to 21 is achieved over the baseline technique.
Resumo:
Motivated by the analysis of the Australian Grain Insect Resistance Database (AGIRD), we develop a Bayesian hurdle modelling approach to assess trends in strong resistance of stored grain insects to phosphine over time. The binary response variable from AGIRD indicating presence or absence of strong resistance is characterized by a majority of absence observations and the hurdle model is a two step approach that is useful when analyzing such a binary response dataset. The proposed hurdle model utilizes Bayesian classification trees to firstly identify covariates and covariate levels pertaining to possible presence or absence of strong resistance. Secondly, generalized additive models (GAMs) with spike and slab priors for variable selection are fitted to the subset of the dataset identified from the Bayesian classification tree indicating possibility of presence of strong resistance. From the GAM we assess trends, biosecurity issues and site specific variables influencing the presence of strong resistance using a variable selection approach. The proposed Bayesian hurdle model is compared to its frequentist counterpart, and also to a naive Bayesian approach which fits a GAM to the entire dataset. The Bayesian hurdle model has the benefit of providing a set of good trees for use in the first step and appears to provide enough flexibility to represent the influence of variables on strong resistance compared to the frequentist model, but also captures the subtle changes in the trend that are missed by the frequentist and naive Bayesian models.
Resumo:
The provision of visual support to individuals with an autism spectrum disorder (ASD) is widely recommended. We explored one mechanism underlying the use of visual supports: efficiency of language processing. Two groups of children, one with and one without an ASD, participated. The groups had comparable oral and written language skills and nonverbal cognitive abilities. In two semantic priming experiments, prime modality and prime–target relatedness were manipulated. Response time and accuracy of lexical decisions on the spoken word targets were measured. In the first uni-modal experiment, both groups demonstrated significant priming effects. In the second experiment which was cross-modal, no effect for relatedness or group was found. This result is considered in the light of the attentional capacity required for access to the lexicon via written stimuli within the developing semantic system. These preliminary findings are also considered with respect to the use of visual support for children with ASD.
Resumo:
Database watermarking has received significant research attention in the current decade. Although, almost all watermarking models have been either irreversible (the original relation cannot be restored from the watermarked relation) and/or non-blind (requiring original relation to detect the watermark in watermarked relation). This model has several disadvantages over reversible and blind watermarking (requiring only watermarked relation and secret key from which the watermark is detected and original relation is restored) including inability to identify rightful owner in case of successful secondary watermarking, inability to revert the relation to original data set (required in high precision industries) and requirement to store unmarked relation at a secure secondary storage. To overcome these problems, we propose a watermarking scheme that is reversible as well as blind. We utilize difference expansion on integers to achieve reversibility. The major advantages provided by our scheme are reversibility to high quality original data set, rightful owner identification, resistance against secondary watermarking attacks, and no need to store original database at a secure secondary storage.
Resumo:
The identification of cognates between two distinct languages has recently start- ed to attract the attention of NLP re- search, but there has been little research into using semantic evidence to detect cognates. The approach presented in this paper aims to detect English-French cog- nates within monolingual texts (texts that are not accompanied by aligned translat- ed equivalents), by integrating word shape similarity approaches with word sense disambiguation techniques in order to account for context. Our implementa- tion is based on BabelNet, a semantic network that incorporates a multilingual encyclopedic dictionary. Our approach is evaluated on two manually annotated da- tasets. The first one shows that across different types of natural text, our method can identify the cognates with an overall accuracy of 80%. The second one, con- sisting of control sentences with semi- cognates acting as either true cognates or false friends, shows that our method can identify 80% of semi-cognates acting as cognates but also identifies 75% of the semi-cognates acting as false friends.
Resumo:
Protein adsorption at solid-liquid interfaces is critical to many applications, including biomaterials, protein microarrays and lab-on-a-chip devices. Despite this general interest, and a large amount of research in the last half a century, protein adsorption cannot be predicted with an engineering level, design-orientated accuracy. Here we describe a Biomolecular Adsorption Database (BAD), freely available online, which archives the published protein adsorption data. Piecewise linear regression with breakpoint applied to the data in the BAD suggests that the input variables to protein adsorption, i.e., protein concentration in solution; protein descriptors derived from primary structure (number of residues, global protein hydrophobicity and range of amino acid hydrophobicity, isoelectric point); surface descriptors (contact angle); and fluid environment descriptors (pH, ionic strength), correlate well with the output variable-the protein concentration on the surface. Furthermore, neural network analysis revealed that the size of the BAD makes it sufficiently representative, with a neural network-based predictive error of 5% or less. Interestingly, a consistently better fit is obtained if the BAD is divided in two separate sub-sets representing protein adsorption on hydrophilic and hydrophobic surfaces, respectively. Based on these findings, selected entries from the BAD have been used to construct neural network-based estimation routines, which predict the amount of adsorbed protein, the thickness of the adsorbed layer and the surface tension of the protein-covered surface. While the BAD is of general interest, the prediction of the thickness and the surface tension of the protein-covered layers are of particular relevance to the design of microfluidics devices.
Resumo:
Public buildings and large infrastructure are typically monitored by tens or hundreds of cameras, all capturing different physical spaces and observing different types of interactions and behaviours. However to date, in large part due to limited data availability, crowd monitoring and operational surveillance research has focused on single camera scenarios which are not representative of real-world applications. In this paper we present a new, publicly available database for large scale crowd surveillance. Footage from 12 cameras for a full work day covering the main floor of a busy university campus building, including an internal and external foyer, elevator foyers, and the main external approach are provided; alongside annotation for crowd counting (single or multi-camera) and pedestrian flow analysis for 10 and 6 sites respectively. We describe how this large dataset can be used to perform distributed monitoring of building utilisation, and demonstrate the potential of this dataset to understand and learn the relationship between different areas of a building.
Resumo:
Spoken word production is assumed to involve stages of processing in which activation spreads through layers of units comprising lexical-conceptual knowledge and their corresponding phonological word forms. Using high-field (4T) functional magnetic resonance imagine (fMRI), we assessed whether the relationship between these stages is strictly serial or involves cascaded-interactive processing, and whether central (decision/control) processing mechanisms are involved in lexical selection. Participants performed the competitor priming paradigm in which distractor words, named from a definition and semantically related to a subsequently presented target picture, slow picture-naming latency compared to that with unrelated words. The paradigm intersperses two trials between the definition and the picture to be named, temporally separating activation in the word perception and production networks. Priming semantic competitors of target picture names significantly increased activation in the left posterior temporal cortex, and to a lesser extent the left middle temporal cortex, consistent with the predictions of cascaded-interactive models of lexical access. In addition, extensive activation was detected in the anterior cingulate and pars orbitalis of the inferior frontal gyrus. The findings indicate that lexical selection during competitor priming is biased by top-down mechanisms to reverse associations between primed distractor words and target pictures to select words that meet the current goal of speech.
Resumo:
The speed at which target pictures are named increases monotonically as a function of prior retrieval of other exemplars of the same semantic category and is unaffected by the number of intervening items. This cumulative semantic interference effect is generally attributed to three mechanisms: shared feature activation, priming and lexical-level selection. However, at least two additional mechanisms have been proposed: (1) a 'booster' to amplify lexical-level activation and (2) retrieval-induced forgetting (RIF). In a perfusion functional Magnetic Resonance Imaging (fMRI) experiment, we tested hypotheses concerning the involvement of all five mechanisms. Our results demonstrate that the cumulative interference effect is associated with perfusion signal changes in the left perirhinal and middle temporal cortices that increase monotonically according to the ordinal position of exemplars being named. The left inferior frontal gyrus (LIFG) also showed significant perfusion signal changes across ordinal presentations; however, these responses did not conform to a monotonically increasing function. None of the cerebral regions linked with RIF in prior neuroimaging and modelling studies showed significant effects. This might be due to methodological differences between the RIF paradigm and continuous naming as the latter does not involve practicing particular information. We interpret the results as indicating priming of shared features and lexical-level selection mechanisms contribute to the cumulative interference effect, while adding noise to a booster mechanism could account for the pattern of responses observed in the LIFG.
Resumo:
In two fMRI experiments, participants named pictures with superimposed distractors that were high or low in frequency or varied in terms of age of acquisition. Pictures superimposed with low-frequency words were named more slowly than those superimposed with high-frequency words, and late-acquired words interfered with picture naming to a greater extent than early-acquired words. The distractor frequency effect (Experiment 1) was associated with increased activity in left premotor and posterior superior temporal cortices, consistent with the operation of an articulatory response buffer and verbal selfmonitoring system. Conversely, the distractor age-of-acquisition effect (Experiment 2) was associated with increased activity in the left middle and posterior middle temporal cortex, consistent with the operation of lexical level processes such as lemma and phonological word form retrieval. The spatially dissociated patterns of activity across the two experiments indicate that distractor effects in picture-word interference may occur at lexical or postlexical levels of processing in speech production.