846 results for Semantic Publishing, Linked Data, Bibliometrics, Informetrics, Data Retrieval, Citations
Abstract:
This project is a step forward in the study of text mining, where enhanced text representation with semantic information plays a significant role. It develops effective methods for entity-oriented retrieval, semantic relation identification and text clustering using semantically annotated data. These methods are based on an enriched text representation generated by introducing semantic information extracted from Wikipedia into the input text data. The proposed methods are evaluated against several state-of-the-art benchmark methods on real-life datasets. In particular, this thesis improves the performance of entity-oriented retrieval, identifies different lexical forms of an entity relation and handles the clustering of documents with multiple feature spaces.
Abstract:
Many techniques in information retrieval produce counts from a sample, and it is common to analyse these counts as proportions of the whole - term frequencies are a familiar example. Proportions carry only relative information and are not free to vary independently of one another: for the proportion of one term to increase, one or more others must decrease. These constraints are hallmarks of compositional data. While there has long been discussion in other fields of how such data should be analysed, to our knowledge, Compositional Data Analysis (CoDA) has not been considered in IR. In this work we explore compositional data in IR through the lens of distance measures, and demonstrate that common measures, naïve to compositions, have some undesirable properties which can be avoided with composition-aware measures. As a practical example, these measures are shown to improve clustering. Copyright 2014 ACM.
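The contrast between a composition-naive and a composition-aware distance can be sketched as follows. This is a minimal illustration, assuming the Aitchison distance (Euclidean distance after a centred log-ratio transform) as the composition-aware measure; the abstract does not say which measures the authors evaluated, and the function names here are hypothetical. Zero counts would need smoothing before the logarithm and are not handled.

```python
import math

def closure(counts):
    """Rescale raw counts to proportions that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]

def clr(proportions):
    """Centred log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(p) for p in proportions]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def aitchison_distance(x, y):
    """Composition-aware distance: Euclidean distance in clr space."""
    cx, cy = clr(closure(x)), clr(closure(y))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cx, cy)))

def euclidean_distance(x, y):
    """Composition-naive distance computed directly on proportions."""
    px, py = closure(x), closure(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(px, py)))

# Term counts for two hypothetical documents (assumed data, all non-zero).
doc_a = [12, 3, 1]
doc_b = [6, 6, 4]
print(aitchison_distance(doc_a, doc_b))
print(euclidean_distance(doc_a, doc_b))
```

The Aitchison distance depends only on ratios between parts, so it is unchanged when either document's counts are multiplied by a constant.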
Abstract:
Due to the availability of a huge number of Web services, finding an appropriate Web service that meets the requirements of a service consumer is still a challenge. Moreover, a single Web service is sometimes unable to fully satisfy those requirements. In such cases, combinations of multiple inter-related Web services can be used. This paper proposes a method that first uses a semantic kernel model to find related services and then models these related Web services as nodes of a graph. An all-pairs shortest-path algorithm is applied to find the best compositions of Web services that are semantically related to the service consumer's requirement. Finally, individual and composite Web services are recommended for a service request. Empirical evaluation confirms that the proposed method significantly improves the accuracy of service discovery in comparison to traditional keyword-based discovery methods.
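The graph step can be illustrated with a short sketch. It assumes Floyd-Warshall as the all-pairs shortest-path algorithm and treats edge weights as a semantic dissimilarity between services; the abstract names neither, and the service names and weights below are made up for illustration.

```python
def all_pairs_shortest_paths(nodes, edges):
    """Floyd-Warshall over a directed graph.

    edges maps (source, target) to a non-negative cost.
    Returns (dist, nxt), where nxt allows path reconstruction.
    """
    inf = float("inf")
    dist = {u: {v: (0.0 if u == v else inf) for v in nodes} for u in nodes}
    nxt = {u: {v: None for v in nodes} for u in nodes}
    for (u, v), cost in edges.items():
        dist[u][v] = cost
        nxt[u][v] = v
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]
    return dist, nxt

def path(nxt, source, target):
    """Rebuild the cheapest chain of services, or [] if none exists."""
    if source != target and nxt[source][target] is None:
        return []
    route = [source]
    while source != target:
        source = nxt[source][target]
        route.append(source)
    return route

# Hypothetical services; lower cost means a closer semantic match.
services = ["Geocode", "Weather", "Forecast", "Alert"]
links = {
    ("Geocode", "Weather"): 0.2,
    ("Weather", "Forecast"): 0.3,
    ("Geocode", "Forecast"): 0.9,
    ("Forecast", "Alert"): 0.1,
}
dist, nxt = all_pairs_shortest_paths(services, links)
print(path(nxt, "Geocode", "Alert"), dist["Geocode"]["Alert"])
```

Each reconstructed path is one candidate composition; ranking compositions by total path cost is one plausible way to choose among them.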
Abstract:
The current study presents an algorithm to retrieve surface Soil Moisture (SM) from multi-temporal Synthetic Aperture Radar (SAR) data. The algorithm applies a Cumulative Density Function (CDF) transformation to the multi-temporal RADARSAT-2 backscatter coefficient (BC) to obtain relative SM values, and then converts these relative values into absolute SM values using soil information. The algorithm is tested in a semi-arid tropical region in South India using 30 RADARSAT-2 satellite images, SMOS L2 SM products, and 1262 SM field measurements in 50 plots spanning 4 years. Validation against the field data showed that the algorithm retrieves SM with an RMSE ranging from 0.02 to 0.06 m³/m³ for the majority of plots. Comparison with the SMOS SM showed good temporal behaviour, with an RMSE of approximately 0.05 m³/m³ and a correlation coefficient of approximately 0.9. The developed model is compared with the change detection and delta index models and found to perform better. The approach does not require calibration of any parameter to obtain relative SM and hence can easily be extended to any region for which a time series of SAR data is available.
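The two-stage idea, a CDF transform followed by conversion with soil information, can be sketched as below. This is only an illustration under stated assumptions: the empirical CDF is taken as the fraction of observations at or below each value, and the conversion is a linear scaling between a dry and a wet bound (`sm_dry`, `sm_wet`) that would come from soil data. The abstract does not give the actual conversion, and all numbers are invented.

```python
from bisect import bisect_right

def relative_sm(backscatter_series):
    """Empirical CDF of a backscatter time series for one plot.

    Each observation maps to the fraction of observations that are
    less than or equal to it, giving a relative wetness in (0, 1].
    """
    ordered = sorted(backscatter_series)
    n = len(ordered)
    return [bisect_right(ordered, value) / n for value in backscatter_series]

def absolute_sm(relative, sm_dry, sm_wet):
    """Assumed linear scaling between soil-derived bounds (m^3/m^3)."""
    return [sm_dry + r * (sm_wet - sm_dry) for r in relative]

# Hypothetical backscatter coefficients (dB) for one plot over four dates.
backscatter = [-14.0, -10.0, -12.0, -8.0]
rel = relative_sm(backscatter)
print(rel)
print(absolute_sm(rel, sm_dry=0.05, sm_wet=0.35))
```

Because the relative values depend only on rank within the series, this stage has no tunable parameter, which matches the abstract's remark that relative SM needs no calibration.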
Abstract:
A new algorithm based on a multiparameter neural network is proposed to retrieve wind speed (WS), sea surface temperature (SST), sea surface air temperature, and relative humidity (RH) simultaneously over the global oceans from Special Sensor Microwave Imager (SSM/I) observations. The retrieved geophysical parameters are used to estimate the surface latent heat flux and sensible heat flux over the global oceans using a bulk method. The neural network is trained and validated with matchups of SSM/I overpasses and National Data Buoy Center buoys under both clear and cloudy weather conditions. In addition, the data acquired by the 85.5-GHz channels of SSM/I are used as input variables of the neural network to improve its performance. The root-mean-square (rms) errors between the WS, SST, sea surface air temperature, and RH estimated from SSM/I observations and the buoy measurements are 1.48 m s⁻¹, 1.54 °C, 1.47 °C, and 7.85, respectively. The rms errors between the latent and sensible heat fluxes estimated from SSM/I observations and the Xisha Island (in the South China Sea) measurements are 3.21 and 30.54 W m⁻², whereas those between the SSM/I estimates and the buoy data are 4.9 and 37.85 W m⁻², respectively. Both sets of errors (those for WS, SST, and sea surface air temperature, in particular) are smaller than those of previous retrieval algorithms for SSM/I observations over the global oceans. Unlike previous methods, the present algorithm is capable of producing near-real-time estimates of surface latent and sensible heat fluxes for the global oceans from SSM/I data.
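How the four retrieved parameters feed a bulk method can be sketched with the standard bulk aerodynamic formulas. Everything numeric below is an assumption for illustration, not a value from the paper: constant air density, constant transfer coefficients of 1.2e-3, Bolton's approximation for saturation vapour pressure, a 0.98 salinity factor over seawater, and RH applied directly to saturation specific humidity.

```python
import math

RHO_AIR = 1.2        # air density, kg m^-3 (assumed constant)
CP_AIR = 1004.0      # specific heat of air, J kg^-1 K^-1
L_VAPOR = 2.5e6      # latent heat of vaporisation, J kg^-1
C_E = 1.2e-3         # moisture transfer coefficient (assumed)
C_H = 1.2e-3         # heat transfer coefficient (assumed)

def saturation_specific_humidity(temp_c, pressure_hpa=1013.25):
    """Saturation specific humidity (kg/kg) from Bolton's formula."""
    e_sat = 6.112 * math.exp(17.67 * temp_c / (temp_c + 243.5))  # hPa
    return 0.622 * e_sat / (pressure_hpa - 0.378 * e_sat)

def bulk_fluxes(wind_speed, sst_c, air_temp_c, rh_percent):
    """Return (latent, sensible) heat flux in W m^-2, positive upward."""
    q_surface = 0.98 * saturation_specific_humidity(sst_c)
    q_air = rh_percent / 100.0 * saturation_specific_humidity(air_temp_c)
    latent = RHO_AIR * L_VAPOR * C_E * wind_speed * (q_surface - q_air)
    sensible = RHO_AIR * CP_AIR * C_H * wind_speed * (sst_c - air_temp_c)
    return latent, sensible

# Hypothetical retrieved values for a single ocean pixel.
latent, sensible = bulk_fluxes(wind_speed=10.0, sst_c=28.0,
                               air_temp_c=26.0, rh_percent=80.0)
print(latent, sensible)
```

In practice the transfer coefficients vary with wind speed and atmospheric stability, so the constants here should be read as placeholders.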
Abstract:
The task in text retrieval is to find the subset of a collection of documents relevant to a user's information request, usually expressed as a set of words. Classically, documents and queries are represented as vectors of word counts. In its simplest form, relevance is defined to be the dot product between a document and a query vector, a measure of the number of common terms. A central difficulty in text retrieval is that the presence or absence of a word is not sufficient to determine relevance to a query. Linear dimensionality reduction has been proposed as a technique for extracting underlying structure from the document collection. In some domains (such as vision) dimensionality reduction reduces computational complexity. In text retrieval it is more often used to improve retrieval performance. We propose an alternative and novel technique that produces sparse representations constructed from sets of highly related words. Documents and queries are represented by their distance to these sets, and relevance is measured by the number of common clusters. This technique significantly improves retrieval performance, is efficient to compute and shares properties with the optimal linear projection operator and the independent components of documents.
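The classical baseline described at the start of this abstract, word-count vectors scored by a dot product, can be written in a few lines. This sketch covers only that baseline, not the proposed cluster-based representation, whose construction the abstract does not detail; the tokenisation by whitespace and the sample texts are assumptions.

```python
from collections import Counter

def word_counts(text):
    """Represent a text as a sparse vector of lower-cased word counts."""
    return Counter(text.lower().split())

def dot_product_relevance(document, query):
    """Sum over shared words of (count in document * count in query)."""
    doc_vec, query_vec = word_counts(document), word_counts(query)
    return sum(doc_vec[word] * count for word, count in query_vec.items())

# Hypothetical collection and query.
documents = [
    "the cat sat on the mat",
    "dogs chase the ball",
]
query = "cat mat"
scores = [dot_product_relevance(doc, query) for doc in documents]
print(scores)
```

The second document scores zero even if it were topically related, which is the difficulty the abstract points to: word overlap alone does not determine relevance.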
Abstract:
Personal communication devices are increasingly being equipped with sensors that are able to passively collect information from their surroundings, information that could be stored in fairly small local caches. We envision a system in which users of such devices use their collective sensing, storage, and communication resources to query the state of (possibly remote) neighborhoods. The goal of such a system is to achieve the highest query success ratio using the least communication overhead (power). We show that the use of Data Centric Storage (DCS), or directed placement, is a viable approach for achieving this goal, but only when the underlying network is well connected. Alternatively, we propose amorphous placement, in which sensory samples are cached locally and informed exchanges of cached samples are used to diffuse the sensory data throughout the whole network. In handling queries, the local cache is searched first for potential answers. If unsuccessful, the query is forwarded to one or more direct neighbors for answers. This technique leverages node mobility and caching capabilities to avoid the multi-hop communication overhead of directed placement. Using a simplified mobility model, we provide analytical lower and upper bounds on the ability of amorphous placement to achieve uniform field coverage in one and two dimensions. We show that combining informed shuffling of cached samples upon an encounter between two nodes with the querying of direct neighbors could lead to significant performance improvements. For instance, under realistic mobility models, our simulation experiments show that amorphous placement achieves a 10% to 40% better query answering ratio at a 25% to 35% savings in consumed power over directed placement.
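The query-handling order described here, local cache first and then direct neighbors, can be sketched as follows. The `Node` class, its fields, and the region identifiers are hypothetical; the informed shuffling of samples on encounters is not modelled because the abstract does not specify it.

```python
class Node:
    """A mobile device with a small cache of sensed samples."""

    def __init__(self, name):
        self.name = name
        self.cache = {}       # region id -> cached sensor sample
        self.neighbors = []   # nodes currently within direct range

    def query(self, region):
        """Return (sample, source); source is 'local', 'neighbor' or None."""
        if region in self.cache:
            return self.cache[region], "local"
        # One-hop lookup only: no multi-hop forwarding is attempted.
        for neighbor in self.neighbors:
            if region in neighbor.cache:
                return neighbor.cache[region], "neighbor"
        return None, None

# Hypothetical two-node encounter.
alice, bob = Node("alice"), Node("bob")
alice.neighbors.append(bob)
alice.cache["region-0"] = 19.0
bob.cache["region-1"] = 21.5
print(alice.query("region-0"), alice.query("region-1"), alice.query("region-9"))
```

Restricting the lookup to one hop is what keeps the communication cost low compared with routing a query to a designated storage node.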
Abstract:
AMSR-E satellite data and in-situ data were used to retrieve sea surface air temperature (Ta) over the Southern Ocean. The in-situ data were obtained from the 24th-26th Chinese Antarctic Expeditions during 2008-2010. First, the correlation between Ta and the brightness temperature (Tb) from the twelve channels of AMSR-E was analyzed, and no high correlation was found between Ta and Tb from any of the channels. The highest correlation was 0.38 (at 23.8 GHz). The modeling dataset was obtained by matching the in-situ data with Tb, and two methods were applied to build the retrieval model. In the multi-parameter regression method, the Tbs from all 12 channels were used in the model and the region was divided into two parts at latitude 50°S. The retrieval results were compared with the in-situ data. The Root Mean Square Error (RMS) and correlation for the high-latitude zone were 0.96 ℃ and 0.93, respectively, and those for the low-latitude zone were 1.29 ℃ and 0.96, respectively. An artificial neural network (ANN) method was also applied to retrieve Ta; its RMS and correlation were 1.26 ℃ and 0.98, respectively.
Abstract:
Objectives. We compared the mental health risk to unpaid caregivers bereaved of a care recipient with the risk to persons otherwise bereaved and to nonbereaved caregivers.
Methods. We linked prescription records for antidepressant and anxiolytic drugs to characteristics and life-event data of members of the Northern Ireland Longitudinal Study (n = 317 264). Using a case-control design, we fitted logistic regression models, stratified by age, to model relative likelihood of mental health problems, using the proxy measures of mental health–related prescription.
Results. Both caregivers and bereaved individuals were estimated to be at between 20% and 50% greater risk for mental health problems than noncaregivers in similar circumstances (for bereaved working-age caregivers, odds ratio = 1.41; 95% confidence interval = 1.27, 1.56). For older people, there was no evidence of additional risk to bereaved caregivers, though there was for working-age people. Older people appeared to recover more quickly from caregiver bereavement.
Conclusions. Caregivers were at risk for mental ill health while providing care and after the death of the care recipient. Targeted caregiver support needs to extend beyond the life of the care recipient.
Abstract:
The topic of Linked Open Data has received much attention in the library sector in recent years. Libraries run a wide variety of projects to put Linked Open Data to beneficial use for the institution and its customers. The starting point for this work is the thesis that Linked Open Data can unlock its greatest potential in the library sector. The work examines to what extent this statement also applies to public libraries and shows what opportunities could arise from it. It introduces the fundamentals of Linked Open Data (LOD) and reviews developments in the library sector, with particular attention to initiatives for handling library metadata and to the current state of development of LOD-capable library systems. It then presents a selection of LOD datasets that provide library metadata or whose data can be used as enrichment information in catalogue applications. Next, the OpenCat project of the public library of Fresnes (France) and the LOD project at the Deichmanske Bibliothek Oslo (Norway) are presented. This is followed by an overview of what the use of LOD could make possible in public libraries, together with initial recommendations for action for public libraries.
Abstract:
A retrieval model describes the transformation of a query into a set of documents. The question is: what drives this transformation? For semantic information retrieval models, this transformation is driven by the content and structure of the semantic models. In this case, Knowledge Organization Systems (KOSs) are the semantic models that encode the meaning employed for monolingual and cross-language retrieval. The focus of this research is the relationship between these meaning representations and their role and potential in augmenting the effectiveness of existing retrieval models. The proposed approach is unique in explicitly interpreting a semantic reference as a pointer to a concept in the semantic model that activates all its linked neighboring concepts. It is in fact the formalization of the information retrieval model and the integration of knowledge resources from the Linguistic Linked Open Data cloud that distinguish it from other approaches. Preprocessing the semantic model with Formal Concept Analysis enables the extraction of conceptual spaces (formal contexts) that are based on sub-graphs of the original structure of the semantic model. The types of conceptual spaces built in this case are limited by the KOSs' structural relations relevant to retrieval: exact match, broader, narrower, and related. They capture the definitional and relational aspects of the concepts in the semantic model. Also, each formal context is assigned an operational role in the flow of processes of the retrieval system, enabling a clear path towards the implementation of monolingual and cross-lingual systems. When a retrieval system was constructed by following this model's theoretical description, evaluation showed statistically significant results in both monolingual and bilingual settings when no methods for query expansion were used.
The test suite was run on the Cross-Language Evaluation Forum Domain Specific 2004-2006 collection with additional extensions to match the specifics of this model.
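The idea of a semantic reference activating a concept's linked neighbors can be illustrated with a toy structure. This is a loose sketch under assumptions: the KOS is a plain dictionary, the relation names follow SKOS-style naming for the four relations listed in the abstract, and the concepts are invented. It does not implement Formal Concept Analysis or the formal contexts the thesis builds.

```python
RELATIONS = ("exactMatch", "broader", "narrower", "related")

def activate(kos, concept, relations=RELATIONS):
    """Return the concept plus every neighbor linked by the given relations."""
    activated = {concept}
    links = kos.get(concept, {})
    for relation in relations:
        activated.update(links.get(relation, []))
    return activated

# Hypothetical KOS fragment; "de:Arbeitslosigkeit" stands in for a
# cross-language exact match.
toy_kos = {
    "unemployment": {
        "exactMatch": ["de:Arbeitslosigkeit"],
        "broader": ["labour market"],
        "narrower": ["youth unemployment"],
        "related": ["job creation"],
    },
}
print(sorted(activate(toy_kos, "unemployment")))
print(sorted(activate(toy_kos, "unemployment", relations=("exactMatch",))))
```

Restricting `relations` to a subset mimics giving each conceptual space its own operational role, for example using only exact matches for the cross-language step.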