996 resultados para Controlled vocabulary
Resumo:
O pesquisador científico necessita de informações precisas, em tempo hábil para conclusão de seus trabalhos. Com o advento da INTERNET, o processo de comunicação em linha, homem x máquina, mediado pelos mecanismos de busca, tornou-se, simultaneamente, um auxílio e uma dificuldade no processo de recuperação de informações. O pesquisador teve que adaptar-se ao modo de operar da INTERNET e incluiu conhecimentos de diferenças idiomáticas, de terminologia, além de utilizar instrumentos que lhe forneçam parâmetros para obter maior pertinência e relevância nos dados. O uso de agentes inteligentes para melhoria de resultados e a diminuição de ruídos semânticos têm sido apontados como soluções para aumento da precisão no resultado das buscas. O estudo de casos exploratório realizado analisa a pesquisa em linha a partir da teoria da informação e propõe duas formas de otimizar o processo comunicacional com vistas à pertinência e relevância dos dados obtidos: a primeira sugere a aplicação de algoritmos que utilizem o vocabulário controlado como mediador do processo de comunicação utilizando-se dos descritores para recuperação em linha. , e a segunda ressalta a importância dos agentes inteligentes no processo de comunicação homem-máquina.(AU)
Collection-Level Subject Access in Aggregations of Digital Collections: Metadata Application and Use
Resumo:
Problems in subject access to information organization systems have been under investigation for a long time. Focusing on item-level information discovery and access, researchers have identified a range of subject access problems, including quality and application of metadata, as well as the complexity of user knowledge required for successful subject exploration. While aggregations of digital collections built in the United States and abroad generate collection-level metadata of various levels of granularity and richness, no research has yet focused on the role of collection-level metadata in user interaction with these aggregations. This dissertation research sought to bridge this gap by answering the question “How does collection-level metadata mediate scholarly subject access to aggregated digital collections?” This goal was achieved using three research methods: • in-depth comparative content analysis of collection-level metadata in three large-scale aggregations of cultural heritage digital collections: Opening History, American Memory, and The European Library • transaction log analysis of user interactions, with Opening History, and • interview and observation data on academic historians interacting with two aggregations: Opening History and American Memory. It was found that subject-based resource discovery is significantly influenced by collection-level metadata richness. The richness includes such components as: 1) describing collection’s subject matter with mutually-complementary values in different metadata fields, and 2) a variety of collection properties/characteristics encoded in the free-text Description field, including types and genres of objects in a digital collection, as well as topical, geographic and temporal coverage are the most consistently represented collection characteristics in free-text Description fields. Analysis of user interactions with aggregations of digital collections yields a number of interesting findings. Item-level user interactions were found to occur more often than collection-level interactions. Collection browse is initiated more often than search, while subject browse (topical and geographic) is used most often. Majority of collection search queries fall within FRBR Group 3 categories: object, concept, and place. Significantly more object, concept, and corporate body searches and less individual person, event and class of persons searches were observed in collection searches than in item searches. While collection search is most often satisfied by Description and/or Subjects collection metadata fields, it would not retrieve a significant proportion of collection records without controlled-vocabulary subject metadata (Temporal Coverage, Geographic Coverage, Subjects, and Objects), and free-text metadata (the Description field). Observation data shows that collection metadata records in Opening History and American Memory aggregations are often viewed. Transaction log data show a high level of engagement with collection metadata records in Opening History, with the total page views for collections more than 4 times greater than item page views. Scholars observed viewing collection records valued descriptive information on provenance, collection size, types of objects, subjects, geographic coverage, and temporal coverage information. They also considered the structured display of collection metadata in Opening History more useful than the alternative approach taken by other aggregations, such as American Memory, which displays only the free-text Description field to the end-user. The results extend the understanding of the value of collection-level subject metadata, particularly free-text metadata, for the scholarly users of aggregations of digital collections. The analysis of the collection metadata created by three large-scale aggregations provides a better understanding of collection-level metadata application patterns and suggests best practices. This dissertation is also the first empirical research contribution to test the FRBR model as a conceptual and analytic framework for studying collection-level subject access.
Resumo:
Purpose – The purpose of this research is to show how the self-archiving of journal papers is a major step towards providing open access to research. However, copyright transfer agreements (CTAs) that are signed by an author prior to publication often indicate whether, and in what form, self-archiving is allowed. The SHERPA/RoMEO database enables easy access to publishers' policies in this area and uses a colour-coding scheme to classify publishers according to their self-archiving status. The database is currently being redeveloped and renamed the Copyright Knowledge Bank. However, it will still assign a colour to individual publishers indicating whether pre-prints can be self-archived (yellow), post-prints can be self-archived (blue), both pre-print and post-print can be archived (green) or neither (white). The nature of CTAs means that these decisions are rarely as straightforward as they may seem, and this paper describes the thinking and considerations that were used in assigning these colours in the light of the underlying principles and definitions of open access. Approach – Detailed analysis of a large number of CTAs led to the development of controlled vocabulary of terms which was carefully analysed to determine how these terms equate to the definition and “spirit” of open access. Findings – The paper reports on how conditions outlined by publishers in their CTAs, such as how or where a paper can be self-archived, affect the assignment of a self-archiving colour to the publisher. Value – The colour assignment is widely used by authors and repository administrators in determining whether academic papers can be self-archived. This paper provides a starting-point for further discussion and development of publisher classification in the open access environment.
Resumo:
This dissertation research points out major challenging problems with current Knowledge Organization (KO) systems, such as subject gateways or web directories: (1) the current systems use traditional knowledge organization systems based on controlled vocabulary which is not very well suited to web resources, and (2) information is organized by professionals not by users, which means it does not reflect intuitively and instantaneously expressed users’ current needs. In order to explore users’ needs, I examined social tags which are user-generated uncontrolled vocabulary. As investment in professionally-developed subject gateways and web directories diminishes (support for both BUBL and Intute, examined in this study, is being discontinued), understanding characteristics of social tagging becomes even more critical. Several researchers have discussed social tagging behavior and its usefulness for classification or retrieval; however, further research is needed to qualitatively and quantitatively investigate social tagging in order to verify its quality and benefit. This research particularly examined the indexing consistency of social tagging in comparison to professional indexing to examine the quality and efficacy of tagging. The data analysis was divided into three phases: analysis of indexing consistency, analysis of tagging effectiveness, and analysis of tag attributes. Most indexing consistency studies have been conducted with a small number of professional indexers, and they tended to exclude users. Furthermore, the studies mainly have focused on physical library collections. This dissertation research bridged these gaps by (1) extending the scope of resources to various web documents indexed by users and (2) employing the Information Retrieval (IR) Vector Space Model (VSM) - based indexing consistency method since it is suitable for dealing with a large number of indexers. As a second phase, an analysis of tagging effectiveness with tagging exhaustivity and tag specificity was conducted to ameliorate the drawbacks of consistency analysis based on only the quantitative measures of vocabulary matching. Finally, to investigate tagging pattern and behaviors, a content analysis on tag attributes was conducted based on the FRBR model. The findings revealed that there was greater consistency over all subjects among taggers compared to that for two groups of professionals. The analysis of tagging exhaustivity and tag specificity in relation to tagging effectiveness was conducted to ameliorate difficulties associated with limitations in the analysis of indexing consistency based on only the quantitative measures of vocabulary matching. Examination of exhaustivity and specificity of social tags provided insights into particular characteristics of tagging behavior and its variation across subjects. To further investigate the quality of tags, a Latent Semantic Analysis (LSA) was conducted to determine to what extent tags are conceptually related to professionals’ keywords and it was found that tags of higher specificity tended to have a higher semantic relatedness to professionals’ keywords. This leads to the conclusion that the term’s power as a differentiator is related to its semantic relatedness to documents. The findings on tag attributes identified the important bibliographic attributes of tags beyond describing subjects or topics of a document. The findings also showed that tags have essential attributes matching those defined in FRBR. Furthermore, in terms of specific subject areas, the findings originally identified that taggers exhibited different tagging behaviors representing distinctive features and tendencies on web documents characterizing digital heterogeneous media resources. These results have led to the conclusion that there should be an increased awareness of diverse user needs by subject in order to improve metadata in practical applications. This dissertation research is the first necessary step to utilize social tagging in digital information organization by verifying the quality and efficacy of social tagging. This dissertation research combined both quantitative (statistics) and qualitative (content analysis using FRBR) approaches to vocabulary analysis of tags which provided a more complete examination of the quality of tags. Through the detailed analysis of tag properties undertaken in this dissertation, we have a clearer understanding of the extent to which social tagging can be used to replace (and in some cases to improve upon) professional indexing.
Resumo:
Things change. Words change, meaning changes and use changes both words and meaning. In information access systems this means concept schemes such as thesauri or clas- sification schemes change. They always have. Concept schemes that have survived have evolved over time, moving from one version, often called an edition, to the next. If we want to manage how words and meanings - and as a conse- quence use - change in an effective manner, and if we want to be able to search across versions of concept schemes, we have to track these changes. This paper explores how we might expand SKOS, a World Wide Web Consortium (W3C) draft recommendation in order to do that kind of tracking.The Simple Knowledge Organization System (SKOS) Core Guide is sponsored by the Semantic Web Best Practices and Deployment Working Group. The second draft, edited by Alistair Miles and Dan Brickley, was issued in November 2005. SKOS is a “model for expressing the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, folksonomies, other types of controlled vocabulary and also concept schemes embedded in glossaries and terminologies” in RDF. How SKOS handles version in concept schemes is an open issue. The current draft guide suggests using OWL and DCTERMS as mechanisms for concept scheme revision.As it stands an editor of a concept scheme can make notes or declare in OWL that more than one version exists. This paper adds to the SKOS Core by introducing a tracking sys- tem for changes in concept schemes. We call this tracking system vocabulary ontogeny. Ontogeny is a biological term for the development of an organism during its lifetime. Here we use the ontogeny metaphor to describe how vocabularies change over their lifetime. Our purpose here is to create a conceptual mechanism that will track these changes and in so doing enhance information retrieval and prevent document loss through versioning, thereby enabling persistent retrieval.
Resumo:
In order to facilitate subject access interoperability a mechanism must be built that allows the different controlled vocabularies to communicate meaning, relationships, and levels of extension and intension so that different user groups using different controlled vocabularies could access collections across the network. Switching languages, the tools of controlled vocabulary compatibility, consist of a single layer that does not allow for a flexible control of the semantic levels of meaning, relationships, and extension or intension. This paper proposes a multilayered conceptual framework wherein the levels of meaning, relationships and extension and intension are each controlled as individual parameters, rather than in a single switching language.
Resumo:
Infants (12 to 17 months) were taught 2 novel words for 2 images of novel objects, by pairing isolated auditory labels with to-be-associated images. Comprehension was tested using a preferential looking task in which the infant was presented with both images together with an isolated auditory label. The auditory label usually, but not always, matched one of the images. Infants looked preferentially at images that matched the auditory stimulus. The experiment controlled within-subjects for both side bias and preference for previously named items. Infants showed learning after 12 presentations of the new words. Evidence is presented that, in certain circumstances, the duration of longest look at a target may be a more robust measure of target preference than overall looking time. The experiment provides support for previous demonstrations of rapid word learning by pre-vocabulary spurt children, and offers some methodological improvements to the preferential looking task.
Resumo:
Interoperability of water quality data depends on the use of common models, schemas and vocabularies. However, terms are usually collected during different activities and projects in isolation of one another, resulting in vocabularies that have the same scope being represented with different terms, using different formats and formalisms, and published in various access methods. Significantly, most water quality vocabularies conflate multiple concepts in a single term, e.g. quantity kind, units of measure, substance or taxon, medium and procedure. This bundles information associated with separate elements from the OGC Observations and Measurements (O&M) model into a single slot. We have developed a water quality vocabulary, formalized using RDF, and published as Linked Data. The terms were extracted from existing water quality vocabularies. The observable property model is inspired by O&M but aligned with existing ontologies. The core is an OWL ontology that extends the QUDT ontology for Unit and QuantityKind definitions. We add classes to generalize the QuantityKind model, and properties for explicit description of the conflated concepts. The key elements are defined to be sub-classes or sub-properties of SKOS elements, which enables a SKOS view to be published through standard vocabulary APIs, alongside the full view. QUDT terms are re-used where possible, supplemented with additional Unit and QuantityKind entries required for water quality. Along with items from separate vocabularies developed for objects, media, and procedures, these are linked into definitions in the actual observable property vocabulary. Definitions of objects related to chemical substances are linked to items from the Chemical Entities of Biological Interest (ChEBI) ontology. Mappings to other vocabularies, such as DBPedia, are in separately maintained files. By formalizing the model for observable properties, and clearly labelling the separate concerns, water quality observations from different sources may be more easily merged and also transformed to O&M for cross-domain applications.
Resumo:
This project was comparing the accuracy of capturing the oral pathology diagnoses among different coding systems. 55 diagnoses were selected for comparison among 5 coding systems. The results of accuracy in capturing oral diagnoses are: AFIP (96.4%), followed by Read 99 (85.5%), SNOMED 98 (74.5%), ICD-9 (43.6%), and CDT-3 (14.5%). It shows that the currently used coding systems, ICD-9 and CDT-3, were inadequate, whereas the AFIP coding system captured the majority of oral diagnoses. In conclusion, the most commonly used medical and dental coding systems lack terms for the diagnosis of oral and dental conditions.
Resumo:
CLIL instruction has been reported to be beneficial for foreign language vocabulary learning since CLIL students show higher vocabulary profiles than students of their same age in traditional EFL contexts. However, to our knowledge, the receptive vocabulary knowledge of CLIL and non-CLIL learners at the end of primary and secondary education has not been examined yet. Hence, this study aims at comparing the receptive vocabulary size 79 CLIL primary learners with the receptive vocabulary knowledge of 331 non-CLIL learners at the end of primary and secondary school. Sex-based differences were also analysed. The 2k Vocabulary Levels Test (VLT) was used for the purposes of the study. Results revealed that learners’ receptive vocabulary sizes lie within the most frequent 1000 words, non-CLIL secondary school students throw better results than primary students but the differences between the secondary group and the CLIL group are not statistically significant. As for sex-based differences, we found no significant differences among the groups. These findings led us to believe that the CLIL approach offers a benefit for vocabulary acquisition since CLIL learners have been exposed to the foreign language for a shorter period of time and the results are quite similar to their non-CLIL secondary school partners.
Resumo:
Raman spectroscopy of formamide-intercalated kaolinites treated using controlled-rate thermal analysis technology (CRTA), allowing the separation of adsorbed formamide from intercalated formamide in formamide-intercalated kaolinites, is reported. The Raman spectra of the CRTA-treated formamide-intercalated kaolinites are significantly different from those of the intercalated kaolinites, which display a combination of both intercalated and adsorbed formamide. An intense band is observed at 3629 cm-1, attributed to the inner surface hydroxyls hydrogen bonded to the formamide. Broad bands are observed at 3600 and 3639 cm-1, assigned to the inner surface hydroxyls, which are hydrogen bonded to the adsorbed water molecules. The hydroxyl-stretching band of the inner hydroxyl is observed at 3621 cm-1 in the Raman spectra of the CRTA-treated formamide-intercalated kaolinites. The results of thermal analysis show that the amount of intercalated formamide between the kaolinite layers is independent of the presence of water. Significant differences are observed in the CO stretching region between the adsorbed and intercalated formamide.
Resumo:
The thermal behaviour of halloysite fully expanded with hydrazine-hydrate has been investigated in nitrogen atmosphere under dynamic heating and at a constant, pre-set decomposition rate of 0.15 mg min-1. Under controlled-rate thermal analysis (CRTA) conditions it was possible to resolve the closely overlapping decomposition stages and to distinguish between adsorbed and bonded reagent. Three types of bonded reagent could be identified. The loosely bonded reagent amounting to 0.20 mol hydrazine-hydrate per mol inner surface hydroxyl is connected to the internal and external surfaces of the expanded mineral and is present as a space filler between the sheets of the delaminated mineral. The strongly bonded (intercalated) hydrazine-hydrate is connected to the kaolinite inner surface OH groups by the formation of hydrogen bonds. Based on the thermoanalytical results two different types of bonded reagent could be distinguished in the complex. Type 1 reagent (approx. 0.06 mol hydrazine-hydrate/mol inner surface OH) is liberated between 77 and 103°C. Type 2 reagent is lost between 103 and 227°C, corresponding to a quantity of 0.36 mol hydrazine/mol inner surface OH. When heating the complex to 77°C under CRTA conditions a new reflection appears in the XRD pattern with a d-value of 9.6 Å, in addition to the 10.2 Ĺ reflection. This new reflection disappears in contact with moist air and the complex re-expands to the original d-value of 10.2 Å in a few h. The appearance of the 9.6 Å reflection is interpreted as the expansion of kaolinite with hydrazine alone, while the 10.2 Å one is due to expansion with hydrazine-hydrate. FTIR (DRIFT) spectroscopic results showed that the treated mineral after intercalation/deintercalation and heat treatment to 300°C is slightly more ordered than the original (untreated) clay.