71 results for Semantic enrichment
in Aston University Research Archive
Abstract:
Derivational morphology proposes meaningful connections between words and is largely unrepresented in lexical databases. This thesis presents a project to enrich a lexical database with morphological links and to evaluate their contribution to disambiguation. A lexical database with sense distinctions was required. WordNet was chosen because of its free availability and widespread use. Its suitability was assessed through critical evaluation with respect to specifications and criticisms, using a transparent, extensible model. The identification of serious shortcomings suggested a portable enrichment methodology, applicable to alternative resources. Although 40% of the most frequent words are prepositions, they have been largely ignored by computational linguists, so addition of prepositions was also required. The preferred approach to morphological enrichment was to infer relations from phenomena discovered algorithmically. Both existing databases and existing algorithms can capture regular morphological relations, but cannot capture exceptions correctly; neither provides any semantic information. Some morphological analysis algorithms are subject to the fallacy that morphological analysis can be performed simply by segmentation. Morphological rules, grounded in observation and etymology, govern associations between and attachment of suffixes and contribute to defining the meaning of morphological relationships. Specifying character substitutions circumvents the segmentation fallacy. Morphological rules are prone to undergeneration, minimised through a variable lexical validity requirement, and overgeneration, minimised by rule reformulation and restricting monosyllabic output. Rules take into account the morphology of ancestor languages through co-occurrences of morphological patterns. Multiple rules applicable to an input suffix need their precedence established.
The resistance of prefixations to segmentation has been addressed by identifying linking vowel exceptions and irregular prefixes. The automatic affix discovery algorithm applies heuristics to identify meaningful affixes and is combined with morphological rules into a hybrid model, fed only with empirical data, collected without supervision. Further algorithms apply the rules optimally to automatically pre-identified suffixes and break words into their component morphemes. To handle exceptions, stoplists were created in response to initial errors and fed back into the model through iterative development, leading to 100% precision, contestable only on lexicographic criteria. Stoplist length is minimised by special treatment of monosyllables and reformulation of rules. 96% of words and phrases are analysed. 218,802 directed derivational links have been encoded in the lexicon rather than the wordnet component of the model because the lexicon provides the optimal clustering of word senses. Both links and analyser are portable to an alternative lexicon. The evaluation uses the extended gloss overlaps disambiguation algorithm. The enriched model outperformed WordNet in terms of recall without loss of precision. Failure of all experiments to outperform disambiguation by frequency reflects on WordNet sense distinctions.
Abstract:
We present a vision and a proposal for using Semantic Web technologies in the organic food industry. This is a very knowledge-intensive industry at every step from the producer, to the caterer or restaurateur, through to the consumer. There is a crucial need for a concept of environmental audit which would allow the various stakeholders to know the full environmental impact of their economic choices. This is a different and parallel form of knowledge to that of price. Semantic Web technologies can be used effectively for the calculation and transfer of this type of knowledge (together with other forms of multimedia data) which could contribute considerably to the commercial and educational impact of the organic food industry. We outline how this could be achieved, as our essential objective is to show how advanced technologies could be used to both reduce ecological impact and increase public awareness.
Abstract:
The main argument of this paper is that Natural Language Processing (NLP) does, and will continue to, underlie the Semantic Web (SW), including its initial construction from unstructured sources like the World Wide Web (WWW), whether its advocates realise this or not. Chiefly, we argue, such NLP activity is the only way up to a defensible notion of meaning at conceptual levels (in the original SW diagram) based on lower level empirical computations over usage. Our aim is definitely not to claim logic-bad, NLP-good in any simple-minded way, but to argue that the SW will be a fascinating interaction of these two methodologies, again like the WWW (which has been basically a field for statistical NLP research) but with deeper content. Only NLP technologies (and chiefly information extraction) will be able to provide the requisite RDF knowledge stores for the SW from existing unstructured text databases in the WWW, and in the vast quantities needed. There is no alternative at this point, since a wholly or mostly hand-crafted SW is also unthinkable, as is a SW built from scratch and without reference to the WWW. We also assume that, whatever the limitations on current SW representational power we have drawn attention to here, the SW will continue to grow in a distributed manner so as to serve the needs of scientists, even if it is not perfect. The WWW has already shown how an imperfect artefact can become indispensable.
Abstract:
Epidemiological evidence suggests that diets rich in fruits, vegetables and pulses reduce the risk of CVD. The Physicians' Health Study has demonstrated reduction of CHD death with regular nut consumption. One major modifiable risk factor for CHD is an unhealthy diet. Thus, an almond-enrichment study has been undertaken to examine the benefit of almonds (Prunus amygdalis) in healthy individuals either with or without significant risk of vascular disease. Almonds contain various macronutrients (low SFA content, absence of cholesterol and high MUFA content) and micronutrients, including vitamin E, polyphenols and arginine, which afford vascular benefit. The effects of almond consumption (25 g/d for 4 weeks followed by 50 g/d for 4 weeks) were evaluated in three non-smoking subject groups: healthy male volunteers between the ages of 18 and 35 years (n 15); men at risk of heart disease between the ages of 18 and 35 years (n 12); mature men and women >50 years of age (n 18). A fourth control group (n 14) was followed over 8 weeks without dietary almond enrichment as a treatment control. None of the subjects withdrew from the study and 90% completed the study. The interim results of the study showed that in the three active groups there was little evidence for a change in total cholesterol, LDL-cholesterol or HDL-cholesterol. In the mature group there was a trend towards increasing HDL-cholesterol. The mature and ‘at-risk’ groups also showed significant changes in systolic blood pressure (P<0.05) during almond consumption. The healthy group showed a decrease in diastolic blood pressure (P<0.05). The ‘at-risk’ group showed a significant increase (P<0.05) in flow-mediated dilation after 8 weeks of almond consumption. Data analysis is ongoing, with completion of the study in November 2007.
The beneficial effects of almond consumption on flow-mediated dilation and blood pressure may be attributed to the high content in almonds of arginine, which serves as a precursor to the vasodilatory molecule, NO.
Abstract:
Competition between four foliose lichen species, common on slate rock surfaces in South Gwynedd, Wales, UK, was studied in experimental plots with and without nutrient enrichment by bird droppings. Fragments of the four lichens were glued to pieces of slate on horizontal boards in monoculture and in two-, three- and four-species mixtures in a factorial experimental design. In monoculture, nutrient enrichment increased thallus area of Parmelia conspersa (Ehrh. ex. Ach.) Ach., decreased thallus areas of Parmelia saxatilis (L.) Ach. and Parmelia glabratula ssp. fuliginosa (Fr. ex. Duby) Laundon, and did not affect thallus area of Phaeophyscia orbicularis (Necker) Moberg compared with untreated thalli. In the mixtures, P. conspersa and Ph. orbicularis were equally effective competitors in plots with and without nutrient enrichment. Addition of bird droppings, however, altered the ability of P. saxatilis and P. glabratula ssp. fuliginosa to compete with the other species, the competitive ability of both species being reduced in some mixtures but increased in others. The results suggest that nutrient enrichment may alter the competitive balance between the four lichen species and this may be a factor determining their relative abundance on rock surfaces in South Gwynedd.
Abstract:
This paper proposes a semantic analysis of the French free-choice indefinite 'n’importe qui'. The semantics of the indefinite is organised as a ternary structure. The (1) abstract meaning underlies all uses of the item and acts as a principle of creative interpretation generation and comprehension. This principle is actualised via (2) discrete contextual features through to (3) contextual interpretations. Thus, the “existential” reading of 'n’importe qui' is derived by a veridical reading of the arbitrary selection of a qualitatively-marked occurrence from the set of human animates. The derivation of contextual readings from the enrichment by contextual cues of an underspecified meaning has a claim to an explanatory model of the semantics of grammatical polysemous items, and is certainly relevant to model-theoretic approaches inasmuch as formal semantic notions are intricately linked to the contextual interpretation of items. It is not 'n’importe qui' itself, but its contextual interpretations which may be weak or strong, and an homonymous treatment is not possible given the continuity of the quality and free-choice dimensions from one observed reading of 'n’importe qui' to the next.
Abstract:
Category-specific disorders are frequently explained by suggesting that living and non-living things are processed in separate subsystems (e.g. Caramazza & Shelton, 1998). If subsystems exist, there should be benefits for normal processing, beyond the influence of structural similarity. However, no previous study has separated the relative influences of similarity and semantic category. We created novel examples of living and non-living things so category and similarity could be manipulated independently. Pre-tests ensured that our images evoked appropriate semantic information and were matched for familiarity. Participants were trained to associate names with the images and then performed a name-verification task under two levels of time pressure. We found no significant advantage for living things alongside strong effects of similarity. Our results suggest that similarity rather than category is the key determinant of speed and accuracy in normal semantic processing. We discuss the implications of this finding for neuropsychological studies. © 2005 Psychology Press Ltd.
Abstract:
DUE TO COPYRIGHT RESTRICTIONS ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY WITH PRIOR ARRANGEMENT
Abstract:
This work explores the relevance of semantic and linguistic description to translation, theory and practice. It is aimed towards a practical model of approach to texts to translate. As literary texts [poetry mainly] are the focus of attention, so are stylistic matters. Note, however, that 'style', and, to some extent, the conclusions of the work, are not limited to so-called literary texts. The study of semantic description reveals that most translation problems do not stem from the cognitive (langue-related), but rather from the contextual (parole-related) aspects of meaning. Thus, any linguistic model that fails to account for the latter is bound to fall short. T.G.G. does, whereas Systemics, concerned with both the 'langue' and 'parole' (stylistic and sociolinguistic mainly) aspects of meaning, provides a useful framework of approach to texts to translate. Two essential semantic principles for translation are: that meaning is the property of a language (Firth); and the 'relativity of meaning assignments' (Tymoczko). Both imply that meaning can only be assessed, correctly, in the relevant socio-cultural background. Translation is seen as a restricted creation, and the translator's approach as a three-dimensional critical one. To encompass the most technical to the most literary text, and account for variations in emphasis in any text, translation theory must be based on a typology of function (Halliday's ideational, interpersonal and textual, or Bühler's symbol, signal, symptom functions). Function [overall and specific] will dictate aims and method, and also provide the critic with criteria to assess translation faithfulness. Translation can never be reduced to purely objective methods, however. Intuitive procedures intervene, in textual interpretation and analysis, in the choice of equivalents, and in the reception of a translation.
Ultimately, translation, theory and practice, may perhaps constitute the touchstone as regards the validity of linguistic and semantic theories.
Abstract:
In a series of experiments, we tested category-specific activation in normal participants using magnetoencephalography (MEG). Our experiments explored the temporal processing of objects, as MEG characterises neural activity on the order of milliseconds. Our experiments explored object-processing, including assessing the time-course of object naming, early differences in processing living compared with nonliving objects and processing objects at the basic compared with the domain level, and late differences in processing living compared with nonliving objects and processing objects at the basic compared with the domain level. In addition to studies using normal participants, we also utilised MEG to explore category-specific processing in a patient with a deficit for living objects. Our findings support the cascade model of object naming (Humphreys et al., 1988). In addition, our findings using normal participants demonstrate early, category-specific perceptual differences. These findings are corroborated by our patient study. In our assessment of the time-course of category-specific effects as well as a separate analysis designed to measure semantic differences between living and nonliving objects, we found support for the sensory/motor model of object naming (Martin, 1998), in addition to support for the cascade model of object naming. Thus, object processing in normal participants appears to be served by a distributed network in the brain, and there are both perceptual and semantic differences between living and nonliving objects. A separate study assessing the influence of the level at which you are asked to identify an object on processing in the brain found evidence supporting the convergence zone hypothesis (Damasio, 1989). Taken together, these findings indicate the utility of MEG in exploring the time-course of object processing, isolating early perceptual and later semantic effects within the brain.
Abstract:
The topic of this thesis is the development of knowledge-based statistical software. The shortcomings of conventional statistical packages are discussed to illustrate the need to develop software which is able to exhibit a greater degree of statistical expertise, thereby reducing the misuse of statistical methods by those not well versed in the art of statistical analysis. Some of the issues involved in the development of knowledge-based software are presented and a review is given of some of the systems that have been developed so far. The majority of these have moved away from conventional architectures by adopting what can be termed an expert systems approach. The thesis then proposes an approach which is based upon the concept of semantic modelling. By representing some of the semantic meaning of data, it is conceived that a system could examine a request to apply a statistical technique and check if the use of the chosen technique was semantically sound, i.e. whether the results obtained will be meaningful. Current systems, in contrast, can only perform what can be considered as syntactic checks. The prototype system that has been implemented to explore the feasibility of such an approach is presented; the system has been designed as an enhanced variant of a conventional style statistical package. This involved developing a semantic data model to represent some of the statistically relevant knowledge about data and identifying sets of requirements that should be met for the application of the statistical techniques to be valid. Those areas of statistics covered in the prototype are measures of association and tests of location.
Abstract:
This thesis presents a new approach to designing large organizational databases. The approach emphasizes the need for a holistic approach to the design process. The development of the proposed approach was based on a comprehensive examination of the issues of relevance to the design and utilization of databases. Such issues include conceptual modelling, organization theory, and semantic theory. The conceptual modelling approach presented in this thesis is developed over three design stages, or model perspectives. In the semantic perspective, concept definitions were developed based on established semantic principles. Such definitions rely on meaning - provided by intension and extension - to determine intrinsic conceptual definitions. A tool, called meaning-based classification (MBC), is devised to classify concepts based on meaning. Concept classes are then integrated using concept definitions and a set of semantic relations which rely on concept content and form. In the application perspective, relationships are semantically defined according to the application environment. Relationship definitions include explicit relationship properties and constraints. The organization perspective introduces a new set of relations specifically developed to maintain conformity of conceptual abstractions with the nature of information abstractions implied by user requirements throughout the organization. Such relations are based on the stratification of work hierarchies, defined elsewhere in the thesis. Finally, an example of an application of the proposed approach is presented to illustrate the applicability and practicality of the modelling approach.
Abstract:
Existing theories of semantic cognition propose models of cognitive processing occurring in a conceptual space, where ‘meaning’ is derived from the spatial relationships between concepts’ mapped locations within the space. Information visualisation is a growing area of research within the field of information retrieval, and methods for presenting database contents visually in the form of spatial data management systems (SDMSs) are being developed. This thesis combined these two areas of research to investigate the benefits associated with employing spatial-semantic mapping (documents represented as objects in two- and three-dimensional virtual environments are proximally mapped dependent on the semantic similarity of their content) as a tool for improving retrieval performance and navigational efficiency when browsing for information within such systems. Positive effects associated with the quality of document mapping were observed; improved retrieval performance and browsing behaviour were witnessed when mapping was optimal. It was also shown that using a third dimension for virtual environment (VE) presentation provides sufficient additional information regarding the semantic structure of the environment that performance is increased in comparison to using two dimensions for mapping. A model that describes the relationship between retrieval performance and browsing behaviour was proposed on the basis of findings. Individual differences were not found to have any observable influence on retrieval performance or browsing behaviour when mapping quality was good. The findings from this work have implications for both cognitive modelling of semantic information, and for designing and testing information visualisation systems. These implications are discussed in the conclusions of this work.