493 results for Automatized Indexing
Abstract:
Knowledge organization (KO) research is a field of scholarship concerned with the design, study and critique of the processes of organizing and representing documents that societies see as worthy of preserving (Tennis, 2008). In this context we are concerned with the relationship between language and action. On the one hand, we are concerned with what language can and does do for our knowledge organization systems (KOS). For example, how do the words NEGRO or INDIAN work in historical and contemporary indexing languages? In relation to this, we are also concerned with how we know about knowledge organization (KO) and its languages. On the other hand, we are concerned with how to act given this knowledge. That is, how do we carry out research and how do we design, implement, and evaluate KO systems? It is important to consider these questions in the context of our work because we are delegated by society to disseminate cultural memory. We are endowed with a perspective, prepared by an education, and granted positions whereby society asks us to ensure that documentary material is accessible to future generations. There is a social value in our work, and as such there is a social imperative to our work. We must act with good conscience, and use language judiciously, for the memory of the world is a heavy burden. In this paper, I explore these two weights of language and action that bear down on KO researchers. I first summarize what extant literature says about the knowledge claims we make with regard to KO practices and systems.
To make it clear what it is that I think we know, I create a schematic that will link claims (language) to actions in advising, implementing, or evaluating information practices and systems. I will then contrast this with what we do not know, that is, what the unanswered questions might be (Gnoli, 2008; Dahlberg, 2011), and I will discuss them in relation to the two weights in our field of KO. Further, I will try to provide a systematic overview of possible ways to address these open questions in KO research. I will draw on the concept of elenchus, the forms of epistemology, theory, and methodology in KO (Tennis, 2008), and on framework analysis, which covers the structures, work practice, and discourses of KO systems (Tennis, 2006). In so doing, I will argue for a Neopragmatic stance on the weight of language and action in KO (Rorty, 1982; 2000). I will close by addressing the lacuna left in Neopragmatic thought: the ethical imperative to use language and action in a particular good and moral way. That is, I will address the ethical imperative of KO given its weights, epistemologies, theories, and methods. To do this, I will review a sample of relevant work on deontology in both western and eastern philosophical schools (e.g., Harvey, 1995). The perspective I want to communicate in this section is that the good in carrying out KO research may begin with epistemic stances (cf., language), but ultimately stands on ethical actions. I will present an analysis describing the micro and the macro ethical concerns in relation to KO research and its advice on practice. I hope this demonstrates that the direction of epistemology, theory, and methodology in KO, while burdened with the dual weights of language and action, is clear when provided an ethical sounding board. We know how to proceed when we understand how our work can benefit the world. KO is an important, if not always understood, division of labor in a society that values its documentary heritage and memory institutions.
Being able to do good requires us to understand how to balance the weights of language and action. We must understand where we stand and be able to chart a path forward, one that does not cause harm, but adds value to the world and those that want to access recorded knowledge.
Abstract:
Describes three tensions in the theoretical literature of indexing: chief sources of evidence indexing, process of indexing (rubrics and methods), and philosophical position of indexing scholarship. Following this exposition, we argue for a change in perspective in Knowledge Organization research. Using the difference between prescriptive and descriptive linguistics as a metaphor, we advocate for a shift to a more descriptive, rather than the customary prescriptive, approach to the theoretical and empirical study of indexing, and by extension Knowledge Organization.
Abstract:
Describes three units of time helpful for understanding and evaluating classificatory structures: long time (versions and states of classification schemes), short time (the act of indexing as repeated ritual or form), and micro-time (where stages of the interpretation process of indexing are separated out and inventoried). Concludes with a short discussion of how time and the impermanence of classification also conjure up an artistic conceptualization of indexing, and briefly uses that to question the seemingly dominant understanding of classification practice as an outcome of scientific management and assembly line thought.
Abstract:
Describes four waves of Ranganathan’s dynamic theory of classification. Outlines components that distinguish each wave, and proposes ways in which this understanding can inform systems design in the contemporary environment, particularly with regard to interoperability and scheme versioning. Ends with an appeal to better understand the relationship between structure and semantics in faceted classification schemes and similar indexing languages.
Abstract:
Subject ontogeny is the life of the subject in an indexing language (e.g., classification scheme like the DDC). Examining how a subject is treated over time tells us about the anatomy of an indexing language. For example, gypsies as a subject has been handled differently in different editions of the DDC.
Abstract:
In this article, we describe the development of an extension to the Simple Knowledge Organization System (SKOS) to accommodate the needs of vocabulary development applications (VDA) managing metadata schemes and requiring close tracking of change to both those schemes and their member concepts. We take a neopragmatic epistemic stance in asserting the need for an entity in SKOS modeling to mediate between the abstract concept and the concrete scheme. While the SKOS model sufficiently describes entities for modeling the current state of a scheme in support of indexing and search on the Semantic Web, it lacks the expressive power to serve the needs of VDA needing to maintain scheme historical continuity. We demonstrate preliminarily that conceptualizations drawn from empirical work in modeling entities in the bibliographic universe, such as works, texts, and exemplars, can provide the basis for SKOS extension in ways that support more rigorous demands of capturing concept evolution in VDA.
Abstract:
In reflecting on the practice of knowledge organization, we tacitly or explicitly root our conceptions of work and its value in some epistemic and ontological foundation. Zen Buddhist philosophy offers a unique set of conceptions vis-à-vis organizing, indexing, and describing documents. When we engage in knowledge organization, we are setting our mind to work with an intention. We intend to make some sort of intervention. We then create a form, a realization of an abstraction (like classes or terms) [1]; we do this from a foundation of some set of beliefs (epistemology, ontology, and ethics); and because we have to make decisions about what to privilege, we need to decide what is foremost in our minds. We must ask: what is the most important thing? Form, foundation, and the ethos of foremost evoke in our reflection on work a number of ethical, epistemic, and ontological concerns that ripple throughout our conceptions of space, “good work”, aesthetics, and moral mandate [2,3]. We reflect on this.
Abstract:
This paper outlines the purposes, predications, functions, and contexts of information organization frameworks, including bibliographic control, information retrieval, resource discovery, resource description, open access scholarly indexing, personal information management protocols, and social tagging, in order to compare and contrast those purposes, predications, functions, and contexts. Information organization frameworks, for the purpose of this paper, consist of information organization systems (classification schemes, taxonomies, ontologies, bibliographic descriptions, etc.), methods of conceiving of and creating the systems, and the work processes involved in maintaining these systems. The paper first outlines the theoretical literature of these information organization frameworks. In conclusion, this paper establishes the first part of an evaluation rubric for a function, predication, purpose, and context analysis.
Abstract:
Conventional web search engines are centralised in that a single entity crawls and indexes the documents selected for future retrieval and controls the relevance models used to determine which documents are relevant to a given user query. As a result, these search engines suffer from several technical drawbacks such as handling scale, timeliness and reliability, in addition to ethical concerns such as commercial manipulation and information censorship. Alleviating the need to rely entirely on a single entity, Peer-to-Peer (P2P) Information Retrieval (IR) has been proposed as a solution, as it distributes the functional components of a web search engine (from crawling and indexing documents to query processing) across the network of users (or peers) who use the search engine. This strategy for constructing an IR system poses several efficiency and effectiveness challenges which have been identified in past work. Accordingly, this thesis makes several contributions towards advancing the state of the art in P2P-IR effectiveness by improving the query processing and relevance scoring aspects of P2P web search. Federated search systems are a form of distributed information retrieval model that route the user’s information need, formulated as a query, to distributed resources and merge the retrieved result lists into a final list. P2P-IR networks are one form of federated search, routing queries and merging results among participating peers. The query is propagated through disseminated nodes to reach the peers that are most likely to contain relevant documents; the retrieved result lists are then merged at different points along the path from the relevant peers to the query initiator (that is, the customer).
However, query routing is considered one of the major challenges and a critical part of P2P-IR networks, as relevant peers might be missed by low-quality peer selection during query routing, which inevitably leads to less effective retrieval results. This motivates this thesis to study and propose query routing techniques to improve retrieval quality in such networks. Cluster-based semi-structured P2P-IR networks exploit the cluster hypothesis to organise the peers into similar semantic clusters where each such semantic cluster is managed by super-peers. In this thesis, I construct three semi-structured P2P-IR models and examine their retrieval effectiveness. I also leverage the cluster centroids at the super-peer level as content representations gathered from cooperative peers to propose a query routing approach called Inverted PeerCluster Index (IPI) that simulates the conventional inverted index of the centralised corpus to organise the statistics of peers’ terms. The results show a competitive retrieval quality in comparison to baseline approaches. Furthermore, I study the applicability of using the conventional Information Retrieval models as peer selection approaches where each peer can be considered as a big document of documents. The experimental evaluation shows comparable and significant results and indicates that document retrieval methods are very effective for peer selection, which supports the analogy between documents and peers. Additionally, Learning to Rank (LtR) algorithms are exploited to build a learned classifier for peer ranking at the super-peer level. The experiments show significant results compared with state-of-the-art resource selection methods and competitive results compared with corresponding classification-based approaches. Finally, I propose reputation-based query routing approaches that exploit the idea of providing feedback on a specific item in social community networks and managing it for future decision-making.
The system monitors users’ behaviours when they click or download documents from the final ranked list as implicit feedback and mines the given information to build a reputation-based data structure. The data structure is used to score peers and then rank them for query routing. I conduct a set of experiments to cover various scenarios including noisy feedback information (i.e., providing positive feedback on non-relevant documents) to examine the robustness of reputation-based approaches. The empirical evaluation shows significant results in almost all measurement metrics, with an approximate improvement of more than 56% compared to baseline approaches. Thus, based on the results, if one were to choose one technique, reputation-based approaches are clearly the natural choice, and they can also be deployed on any P2P network.
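As an illustration only, the idea of treating each peer as one large document and consulting an inverted index of peers' term statistics can be sketched as follows. This is a toy sketch, not the thesis's IPI implementation: the peer names, the sample documents, the functions `build_peer_index` and `route_query`, and the TF-IDF-style weighting are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical peers and their documents (illustrative data only).
PEERS = {
    "p1": ["protein fold structure", "protein contact map"],
    "p2": ["weather station network", "station sensor data"],
    "p3": ["protein weather"],
}


def build_peer_index(peers):
    """Map each term to {peer: term frequency}, treating a peer as one big document."""
    index = defaultdict(dict)
    for peer, docs in peers.items():
        counts = Counter(term for doc in docs for term in doc.lower().split())
        for term, tf in counts.items():
            index[term][peer] = tf
    return index


def route_query(index, n_peers, query, k=2):
    """Return up to k peers ranked by a simple TF-IDF-style score (an assumed weighting)."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue
        # Terms held by fewer peers are weighted more heavily.
        weight = math.log(1 + n_peers / len(postings))
        for peer, tf in postings.items():
            scores[peer] += tf * weight
    # Sort by descending score, breaking ties by peer name for determinism.
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return ranked[:k]


index = build_peer_index(PEERS)
print(route_query(index, len(PEERS), "protein structure"))  # ['p1', 'p3']
```

Peers that hold none of the query terms never receive a score, so the query is simply not forwarded to them; a reputation signal of the kind described in the abstract could, under the same assumptions, be added as an extra term in the per-peer score.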
Abstract:
A developed and sustainable agriculture requires permanent and reliable monitoring of climatic/meteorological elements in (agro)meteorological stations, which should be located close to agricultural, silvicultural or pastoral activities. An adequate network of meteorological stations is therefore a necessary condition to support innovation and development in any country. Developing countries, mainly those with a history of frequent conflicts, have a deficient number of weather stations, often poorly composed and improperly distributed within their territories, and without a regular operation that allows continuity of records for a sufficiently long period of time. The objective of this work was to build a network of meteorological and agro-meteorological stations in East Timor. To achieve this goal, the number and location of pre-existing stations, their structure and composition (number and type of sensors, communication system, …), the administrative division of the country and the available agro-ecological zoning, the agricultural and forestry practices in the country, the existing centres for agricultural research and the history of the weather records were taken into account. Several problems were found (some of the automatic stations were assembled incorrectly, other stations duplicated information regarding the same agricultural area, vast areas with relevant agro-ecological representativeness were not monitored, …). The elimination of 11 existing stations, the location of 7 new stations in places not covered until then, and the automation of 3 manual meteorological stations were proposed. Two networks were then proposed: a main one with 15 agro-meteorological stations (all automatized) and a secondary one composed of 32 weather stations (only two were manual). The set of 47 stations corresponded to a density of 329 km²/station.
Flexibility in the composition of each of the networks was safeguarded, so that they can respond effectively to any substantive change in conditions in a country in constant change. The national coverage by these networks was also discussed under a “management concept for weather stations”.
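The station counts and density quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures given above; the variable names are illustrative, and the resulting area is the value implied by the rounded density, not an official land-area figure.

```python
# Figures quoted in the abstract above.
MAIN_STATIONS = 15             # agro-meteorological stations in the main network
SECONDARY_STATIONS = 32        # weather stations in the secondary network
DENSITY_KM2_PER_STATION = 329  # reported area covered per station

total_stations = MAIN_STATIONS + SECONDARY_STATIONS
# Area implied by the quoted density; derived from rounded figures only.
implied_area_km2 = total_stations * DENSITY_KM2_PER_STATION

print(total_stations)    # 47
print(implied_area_km2)  # 15463
```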
Abstract:
In this article, we describe a novel methodology to extract semantic characteristics from protein structures using linear algebra in order to compose structural signature vectors which may be used efficiently to compare and classify protein structures into fold families. These signatures are built from the pattern of hydrophobic intrachain interactions using Singular Value Decomposition (SVD) and Latent Semantic Indexing (LSI) techniques. Considering proteins as documents and contacts as terms, we have built a retrieval system which is able to find conserved contacts in samples of the myoglobin fold family and to retrieve these proteins among proteins of varied folds with precision of up to 80%. The classifier is a web tool available at our laboratory website. Users can search for similar chains from a specific PDB, view and compare their contact maps and browse their structures using a JMol plug-in.
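The "proteins as documents, contacts as terms" analogy can be illustrated with a small truncated-SVD sketch. It is not the authors' pipeline: the contact matrix, the protein names, the chosen rank of 2, and the helper functions `lsi_signatures` and `nearest` are all hypothetical, and the toy matrix does not distinguish hydrophobic contacts from others. It assumes NumPy is available.

```python
import numpy as np

# Hypothetical toy data: rows are contacts (terms), columns are proteins (documents).
PROTEINS = ["globin_a", "globin_b", "other_c", "other_d"]
CONTACT_MATRIX = np.array([
    [1, 1, 0, 0],  # contact c1
    [1, 1, 0, 0],  # contact c2
    [1, 1, 0, 0],  # contact c3
    [0, 1, 1, 0],  # contact c4
    [0, 0, 1, 1],  # contact c5
    [0, 0, 0, 1],  # contact c6
], dtype=float)


def lsi_signatures(matrix, rank):
    """Return one rank-dimensional signature per column via truncated SVD."""
    _, singular_values, vt = np.linalg.svd(matrix, full_matrices=False)
    # Row j of the result is protein j in the latent space, scaled by singular values.
    return (singular_values[:rank, None] * vt[:rank]).T


def nearest(name, signatures):
    """Name of the other protein with the highest cosine similarity to `name`."""
    unit = signatures / np.linalg.norm(signatures, axis=1, keepdims=True)
    sims = unit @ unit[PROTEINS.index(name)]
    sims[PROTEINS.index(name)] = -np.inf  # exclude the query protein itself
    return PROTEINS[int(np.argmax(sims))]


SIGNATURES = lsi_signatures(CONTACT_MATRIX, rank=2)
print(nearest("globin_a", SIGNATURES))
print(nearest("other_d", SIGNATURES))
```

In this toy setting, proteins that share most of their contacts end up close together in the reduced space, which is the property a signature-based retrieval system of this kind relies on.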
Abstract:
This thesis work has been developed in collaboration between the Department of Physics and Astronomy of the University of Bologna and the IRCCS Rizzoli Orthopedic Institute during an internship period. The study aims to investigate the sensitivity of single-sided NMR in detecting structural differences of the articular cartilage tissue and their correlation with mechanical behavior. Suitable cartilage indicators for osteoarthritis (OA) severity (e.g., water and proteoglycans content, collagen structure) were explored through four NMR parameters: T2, T1, D, and Slp. Structural variations of the cartilage among its three layers (i.e., superficial, middle, and deep) were investigated by performing several NMR pulse sequences on bovine knee joint samples using the NMR-MOUSE device. Previously, cartilage degradation studies were carried out, performing tests in three different experimental setups. The monitoring of the parameters and the best experimental setup were determined. An automatized NMR procedure based on the acquisition of these quantitative parameters was implemented, tested, and used for the investigation of the layers of twenty bovine cartilage samples. Statistical and pattern recognition analyses on these parameters have been performed. The results obtained from the analyses are very promising: the discrimination of the three cartilage layers shows very good results in terms of significance, paving the way for extensive use of NMR single-sided devices for biomedical applications. These results will also be integrated with analyses of tissue mechanical properties for a complete evaluation of cartilage changes throughout OA disease. The use of low-priced and mobile devices towards clinical applications could concern the screening of diseases related to cartilage tissue. This could have a positive impact both economically (including for underdeveloped countries) and socially, providing screening possibilities to a large part of the population.
Abstract:
Air pollution is one of the greatest health risks in the world. At the same time, the strong correlation with climate change, as well as with Urban Heat Island and Heat Waves, intensifies the effects of all these phenomena. Good air quality and high levels of thermal comfort are major goals to be reached in urban areas in the coming years. Air quality forecasts help decision makers to improve air quality and public health strategies, mitigating the occurrence of acute air pollution episodes. Air quality forecasting approaches combine an ensemble of models to provide forecasts from global to regional air pollution and downscaling for selected countries and regions. The development of models dedicated to urban air quality issues requires a good set of data regarding the urban morphology and building material characteristics. Only a few examples of air quality forecast systems at urban scale exist in the literature, and often they are limited to selected cities. This thesis sets up a methodology for the development of a forecasting tool. The forecasting tool can be adapted to all cities and uses a new parametrization for vegetated areas. The parametrization method, based on aerodynamic parameters, produces the spatially varying urban roughness. At the core of the forecasting tool is an urban-scale dispersion model used in forecasting mode, together with the meteorological and background concentration forecasts provided by two regional numerical weather forecasting models. The tool produces the 1-day spatial forecast of NO2, PM10 and O3 concentrations, air temperature, air humidity and BLQ-Air index values. The tool is automatized to run every day; the maps produced are displayed on the e-Globus platform and updated every day. The results obtained indicate that the forecasting outputs were in good agreement with the observed measurements.