Biblioteca Digital

885 results for Semantic Publishing, Linked Data, Bibliometrics, Informetrics, Data Retrieval, Citations

The onset of data-driven mental archeology


Mapping of Submerged Aquatic Vegetation in Rivers From Very High Resolution Image Data, Using Object Based Image Analysis Combined with Expert Knowledge

Abstract:

The use of remote sensing for monitoring submerged aquatic vegetation (SAV) in fluvial environments has been limited by the spatial and spectral resolution of available image data. The absorption of light in water also complicates the use of common image analysis methods. This paper presents the results of a study that uses very high resolution (VHR) image data, collected with a near-infrared-sensitive DSLR camera, to map the distribution of SAV species at three sites along the Desselse Nete, a lowland river in Flanders, Belgium. Plant species, including Ranunculus aquatilis L., Callitriche obtusangula Le Gall, Potamogeton natans L., Sparganium emersum L. and Potamogeton crispus L., were classified from the data using Object-Based Image Analysis (OBIA) and expert knowledge. A classification rule set based on a combination of spectral and structural image variation (e.g. texture and shape) was developed for images from two sites. Comparing the classifications with manually delineated ground truth maps gave an overall accuracy of 61% for both sites. Applying the rule set to a third, validation image gave an overall accuracy of 53%. These consistent results show promise for species-level mapping in such biodiverse environments, but also prompt a discussion on the assessment of classification accuracy.
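The "overall accuracy" figures quoted above are normally the share of samples on the diagonal of a confusion matrix. The sketch below shows that calculation in Python; the matrix values are invented for illustration and are not taken from the study, and the function name `overall_accuracy` is hypothetical.

```python
def overall_accuracy(confusion):
    """Fraction of correctly classified samples: diagonal sum / grand total."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical 3-class confusion matrix (rows: ground truth, columns: predicted).
# These counts are illustrative only, not data from the paper.
example = [
    [50, 10, 5],
    [8, 30, 12],
    [4, 6, 25],
]

print(f"overall accuracy: {overall_accuracy(example):.2f}")
```

A single figure like this hides per-class confusion, which is one reason the abstract calls for a discussion of how classification accuracy is assessed.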


New morpho-bathymetric and tectono-stratigraphic data on Naples and Salerno Gulfs (southern Tyrrhenian Sea, Italy) derived from bathymetric and seismic data analysis and integrated geological interpretation

Abstract:

New morpho-bathymetric and tectono-stratigraphic data on the Naples and Salerno Gulfs, derived from bathymetric and seismic data analysis and integrated geologic interpretation, are presented here. The CUBE (Combined Uncertainty and Bathymetry Estimator) method has been applied to complex morphologies, such as the Capri continental slope and the related geological structures occurring in the Salerno Gulf. The bathymetric data analysis has been carried out for marine geological maps of the whole Campania continental margin at scales ranging from 1:25,000 to 1:10,000, including focused examples in the Naples and Salerno Gulfs, Naples harbour, the Capri and Ischia Islands and the Salerno Valley. Seismic data analysis has allowed the main morpho-structural lineaments, recognized at a regional scale through multichannel profiles, to be correlated with morphological features cropping out at the sea bottom and evident from bathymetry. The main fault systems in the area have been represented on a tectonic sketch map, including the master fault located north of the Salerno Valley half graben. Some normal faults parallel to the master fault have been interpreted from the slope map derived from bathymetric data. A complex system of antithetic faults bounds two morpho-structural highs located 20 km south of Capri Island. Seismic interpretation has shown some hints of compressional reactivation of normal faults in an extensional setting involving the whole Campania continental margin.


Discovery driven analysis on semi-structured text data

Abstract:

Discovery Driven Analysis (DDA) is a common feature of OLAP technology for analyzing structured data. In essence, DDA helps analysts discover anomalous data by highlighting 'unexpected' values in the OLAP cube. By indicating to the analyst which dimensions to explore, DDA speeds up the process of discovering anomalies and their causes. However, DDA (and OLAP in general) is only applicable to structured data, such as records in databases. We propose a system that extends DDA technology to semi-structured text documents, that is, text documents accompanied by a small amount of structured data. Our system pipeline consists of two stages: first, the text part of each document is structured around user-specified dimensions using the semi-PLSA algorithm; then, we adapt DDA to these fully structured documents, thus enabling DDA on text documents. We present some applications of this system in OLAP analysis and show how scalability issues are solved. Results show that our system can handle datasets of documents of reasonable size in real time, without any need for pre-computation.
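The idea of highlighting 'unexpected' cube values can be sketched on a two-dimensional slice. The example below assumes one simple definition of "expected" (an additive model: row mean plus column mean minus grand mean) and flags cells whose residual is large relative to the spread of all residuals. This is an illustrative stand-in, not the model used by the authors; the function name `flag_unexpected` and the threshold are hypothetical.

```python
from statistics import mean, pstdev


def flag_unexpected(table, threshold=2.0):
    """Return (row, col, residual) for cells far from an additive expectation.

    Expected value of a cell = row mean + column mean - grand mean
    (an assumed, simplified model of what an analyst would anticipate).
    """
    n_rows, n_cols = len(table), len(table[0])
    grand = mean([v for row in table for v in row])
    row_means = [mean(row) for row in table]
    col_means = [mean([table[r][c] for r in range(n_rows)]) for c in range(n_cols)]

    residuals = [
        [table[r][c] - (row_means[r] + col_means[c] - grand) for c in range(n_cols)]
        for r in range(n_rows)
    ]
    spread = pstdev([v for row in residuals for v in row])
    if spread == 0:
        return []  # every cell matches its expectation exactly

    return [
        (r, c, residuals[r][c])
        for r in range(n_rows)
        for c in range(n_cols)
        if abs(residuals[r][c]) > threshold * spread
    ]


# Illustrative slice (e.g. product x region sales); one cell is anomalous.
cube_slice = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 50],
]

for row, col, residual in flag_unexpected(cube_slice):
    print(f"cell ({row}, {col}) deviates by {residual:+.1f}")
```

The flagged coordinates tell the analyst which dimension members to drill into, which is the guidance role the abstract attributes to DDA.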


RICH AND EFFICIENT VISUAL DATA REPRESENTATION

Abstract:

Increasing the size of training data has been shown to be very effective in many computer vision tasks. Using large-scale image datasets (e.g. ImageNet) with simple learning techniques (e.g. linear classifiers), one can achieve state-of-the-art performance in object recognition compared to sophisticated learning techniques on smaller image sets. Semantic search on visual data has become very popular. There are billions of images on the internet and the number is increasing every day. Dealing with large-scale image sets is demanding in itself: they take up so much memory that it becomes impossible to process the images with complex algorithms on single-CPU machines. Finding an efficient image representation can be a key to attacking this problem. Efficiency alone is not enough for image understanding, however; a representation should also be comprehensive and rich in semantic information. In this proposal we develop an approach to computing binary codes that provide a rich and efficient image representation. We demonstrate several tasks in which binary features can be very effective. We show how binary features can speed up large-scale image classification. We present techniques for learning the binary features from supervised image sets (with different types of semantic supervision: class labels and textual descriptions). We propose several problems that are very important in finding and using efficient image representations.
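To make the appeal of binary codes concrete, the sketch below hashes feature vectors to bit strings and compares them with Hamming distance. It uses random hyperplane projections, a generic and unsupervised technique chosen here only for illustration; the proposal learns its codes from supervised data, which this example does not attempt. All function names are hypothetical.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplane normals in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def binary_code(vec, planes):
    """Pack one bit per hyperplane: 1 if vec lies on its non-negative side."""
    code = 0
    for plane in planes:
        dot = sum(p * x for p, x in zip(plane, vec))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer-packed codes."""
    return bin(a ^ b).count("1")


planes = make_hyperplanes(dim=4, n_bits=16)
feature = [0.2, -1.3, 0.7, 0.5]          # stand-in for an image descriptor
scaled = [2 * x for x in feature]         # same direction, different magnitude
opposite = [-x for x in feature]          # opposite direction

print(hamming(binary_code(feature, planes), binary_code(scaled, planes)))
print(hamming(binary_code(feature, planes), binary_code(opposite, planes)))
```

Each image is reduced to a few machine words, and comparing two images costs one XOR and a bit count, which is why such codes suit collections too large to process with heavier representations.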


Top-K Query Processing in Edge-Labeled Graph Data

Abstract:

Edge-labeled graphs have proliferated rapidly over the last decade due to the increased popularity of social networks and the Semantic Web. In social networks, relationships between people are represented by edges and each edge is labeled with a semantic annotation. Hence, a single huge graph can express many different relationships between entities. The Semantic Web represents each fragment of knowledge as a triple (subject, predicate, object), which is conceptually identical to an edge from subject to object labeled with the predicate. A set of triples constitutes an edge-labeled graph on which knowledge inference is performed. Subgraph matching has been extensively used as a query language for patterns in the context of edge-labeled graphs. For example, in social networks, users can specify a subgraph matching query to find all people that have certain neighborhood relationships. Heavily used fragments of the SPARQL query language for the Semantic Web and graph queries of other graph DBMSs can also be viewed as subgraph matching over large graphs. Although subgraph matching has been extensively studied as a query paradigm in the Semantic Web and in social networks, a user can get a large number of answers in response to a query. These answers can be shown to the user in accordance with an importance ranking. In this thesis proposal, we present four different scoring models along with scalable algorithms to find the top-k answers via a suite of intelligent pruning techniques. The suggested models consist of a practically important subset of the SPARQL query language augmented with some additional useful features. The first model, called Substitution Importance Query (SIQ), identifies the top-k answers whose scores are calculated from the matched vertices' properties in each answer in accordance with a user-specified notion of importance.
The second model, called Vertex Importance Query (VIQ), identifies important vertices in accordance with a user-defined scoring method that builds on top of various subgraphs articulated by the user. Approximate Importance Query (AIQ), our third model, allows partial and inexact matchings and returns the top-k of them according to user-specified approximation terms and scoring functions. In the fourth model, called Probabilistic Importance Query (PIQ), a query consists of several sub-blocks: one mandatory block that must be mapped and other blocks that can be opportunistically mapped. The probability is calculated from various aspects of the answers, such as the number of mapped blocks and the vertices' properties in each block, and the top-k most probable answers are returned. An important distinguishing feature of our work is that we allow the user a huge amount of freedom in specifying: (i) what pattern and approximation they consider important, (ii) how to score answers, irrespective of whether they are vertices or substitutions, and (iii) how to combine and aggregate scores generated by multiple patterns and/or multiple substitutions. Because so much power is given to the user, indexing is more challenging than in situations where additional restrictions are imposed on the queries the user can ask. The proposed algorithms for the first model can also be used for answering SPARQL queries with ORDER BY and LIMIT, and the method for the second model also works for SPARQL queries with GROUP BY, ORDER BY and LIMIT. We test our algorithms on multiple real-world graph databases, showing that our algorithms are far more efficient than popular triple stores.
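As a rough illustration of the first scoring model, the sketch below enumerates substitutions for a one-edge pattern over a tiny triple set, scores each substitution from its matched vertices' properties, and keeps the k best, which is what a SPARQL query with ORDER BY and LIMIT would return. It is a naive baseline: it scores every match and applies none of the pruning or indexing the proposal describes. The triples, the `SCORE` property and the function names are all invented for the example.

```python
import heapq

# Toy edge-labeled graph as (subject, predicate, object) triples (made up).
TRIPLES = [
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("bob", "knows", "carol"),
    ("dave", "knows", "alice"),
    ("carol", "worksAt", "acme"),
]

# Hypothetical per-vertex property used by the scoring function.
SCORE = {"alice": 0.9, "bob": 0.4, "carol": 0.7, "dave": 0.1, "acme": 0.5}


def match_pattern(triples, predicate):
    """Yield substitutions for the one-edge pattern (?x, predicate, ?y)."""
    for subject, pred, obj in triples:
        if pred == predicate:
            yield {"x": subject, "y": obj}


def importance(substitution):
    """User-specified importance: sum of the matched vertices' scores."""
    return SCORE[substitution["x"]] + SCORE[substitution["y"]]


def top_k(triples, predicate, k, score=importance):
    """Naive top-k: score every substitution, keep the k highest."""
    return heapq.nlargest(k, match_pattern(triples, predicate), key=score)


for answer in top_k(TRIPLES, "knows", k=2):
    print(answer, round(importance(answer), 2))
```

Because every match is materialised and scored, the cost grows with the number of answers rather than with k, which is the inefficiency that pruning-based top-k algorithms aim to avoid.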


Identifying health inequalities in individuals with major mental illness (MMI) using routine data

Abstract:

Abstract and Summary of Thesis: Background: Individuals with Major Mental Illness (such as schizophrenia and bipolar disorder) experience increased rates of physical health comorbidity compared to the general population. They also experience inequalities in access to certain aspects of healthcare. This ultimately leads to premature mortality. Studies detailing patterns of physical health comorbidity are limited by their definitions of comorbidity, single disease approach to comorbidity and by the study of heterogeneous groups. To date the investigation of possible sources of healthcare inequalities experienced by individuals with Major Mental Illness (MMI) is relatively limited. Moreover studies detailing the extent of premature mortality experienced by individuals with MMI vary both in terms of the measure of premature mortality reported and age of the cohort investigated, limiting their generalisability to the wider population. Therefore local and national data can be used to describe patterns of physical health comorbidity, investigate possible reasons for health inequalities and describe mortality rates. These findings will extend existing work in this area. Aims and Objectives: To review the relevant literature regarding: patterns of physical health comorbidity, evidence for inequalities in physical healthcare and evidence for premature mortality for individuals with MMI. To examine the rates of physical health comorbidity in a large primary care database and to assess for evidence for inequalities in access to healthcare using both routine primary care prescribing data and incentivised national Quality and Outcome Framework (QOF) data. Finally to examine the rates of premature mortality in a local context with a particular focus on cause of death across the lifespan and effect of International Classification of Disease Version 10 (ICD 10) diagnosis and socioeconomic status on rates and cause of death. 
Methods: A narrative review of the literature surrounding patterns of physical health comorbidity, the evidence for inequalities in physical healthcare and premature mortality in MMI was undertaken. Rates of physical health comorbidity and multimorbidity in schizophrenia and bipolar disorder were examined using a large primary care dataset (Scottish Programme for Improving Clinical Effectiveness in Primary Care (SPICE)). Possible inequalities in access to healthcare were investigated by comparing patterns of prescribing in individuals with MMI and comorbid physical health conditions with prescribing rates in individuals with physical health conditions without MMI using SPICE data. Potential inequalities in access to health promotion advice (in the form of smoking cessation) and prescribing of Nicotine Replacement Therapy (NRT) were also investigated using SPICE data. Possible inequalities in access to incentivised primary healthcare were investigated using National Quality and Outcome Framework (QOF) data. Finally a pre-existing case register (Glasgow Psychosis Clinical Information System (PsyCIS)) was linked to Scottish Mortality data (available from the Scottish Government Website) to investigate rates and primary cause of death in individuals with MMI. Rate and primary cause of death were compared to the local population and impact of age, socioeconomic status and ICD 10 diagnosis (schizophrenia vs. bipolar disorder) were investigated. Results: Analysis of the SPICE data found that sixteen out of the thirty two common physical comorbidities assessed, occurred significantly more frequently in individuals with schizophrenia. In individuals with bipolar disorder fourteen occurred more frequently. 
The most prevalent chronic physical health conditions in individuals with schizophrenia and bipolar disorder were: viral hepatitis (Odds Ratios (OR) 3.99 95% Confidence Interval (CI) 2.82-5.64 and OR 5.90 95% CI 3.16-11.03 respectively), constipation (OR 3.24 95% CI 3.01-3.49 and OR 2.84 95% CI 2.47-3.26 respectively) and Parkinson’s disease (OR 3.07 95% CI 2.43-3.89 and OR 2.52 95% CI 1.60-3.97 respectively). Both groups had significantly increased rates of multimorbidity compared to controls: in the schizophrenia group OR for two comorbidities was 1.37 95% CI 1.29-1.45 and in the bipolar disorder group OR was 1.34 95% CI 1.20-1.49. In the studies investigating inequalities in access to healthcare there was evidence of: under-recording of cardiovascular-related conditions for example in individuals with schizophrenia: OR for Atrial Fibrillation (AF) was 0.62 95% CI 0.52 - 0.73, for hypertension 0.71 95% CI 0.67 - 0.76, for Coronary Heart Disease (CHD) 0.76 95% CI 0.69 - 0.83 and for peripheral vascular disease (PVD) 0.83 95% CI 0.72 - 0.97. Similarly in individuals with bipolar disorder OR for AF was 0.56 95% CI 0.41-0.78, for hypertension 0.69 95% CI 0.62 - 0.77 and for CHD 0.77 95% CI 0.66 - 0.91. There was also evidence of less intensive prescribing for individuals with schizophrenia and bipolar disorder who had comorbid hypertension and CHD compared to individuals with hypertension and CHD who did not have schizophrenia or bipolar disorder. Rate of prescribing of statins for individuals with schizophrenia and CHD occurred significantly less frequently than in individuals with CHD without MMI (OR 0.67 95% CI 0.56-0.80). Rates of prescribing of 2 or more anti-hypertensives were lower in individuals with CHD and schizophrenia and CHD and bipolar disorder compared to individuals with CHD without MMI (OR 0.66 95% CI 0.56-0.78 and OR 0.55 95% CI 0.46-0.67, respectively). 
Smoking was more common in individuals with MMI compared to individuals without MMI (OR 2.53 95% CI 2.44-2.63) and was particularly increased in men (OR 2.83 95% CI 2.68-2.98). Rates of ex-smoking and non-smoking were lower in individuals with MMI (OR 0.79 95% CI 0.75-0.83 and OR 0.50 95% CI 0.48-0.52 respectively). However, recorded rates of smoking cessation advice in smokers with MMI were significantly lower than the recorded rates of smoking cessation advice in smokers with diabetes (88.7% vs. 98.0%, p<0.001), smokers with CHD (88.9% vs. 98.7%, p<0.001) and smokers with hypertension (88.3% vs. 98.5%, p<0.001) without MMI. The odds ratio of NRT prescription was also significantly lower in smokers with MMI without diabetes compared to smokers with diabetes without MMI (OR 0.75 95% CI 0.69-0.81). Similar results were found for smokers with MMI without CHD compared to smokers with CHD without MMI (OR 0.34 95% CI 0.31-0.38) and smokers with MMI without hypertension compared to smokers with hypertension without MMI (OR 0.71 95% CI 0.66-0.76). At a national level, payment and population achievement rates for the recording of body mass index (BMI) in MMI were significantly lower than the payment and population achievement rates for BMI recording in diabetes throughout the whole of the UK combined (payment rate 92.7% (interquartile range (IQR) 89.3-95.8) vs. 95.5% (IQR 93.3-97.2), p<0.001, and population achievement rate 84.0% (IQR 76.3-90.0) vs. 92.5% (IQR 89.7-94.9), p<0.001) and for each country individually: for example, in Scotland the payment rate was 94.0% (IQR 91.4-97.2) vs. 96.3% (IQR 94.3-97.8), p<0.001. The exception rate was significantly higher for the recording of BMI in MMI than for BMI recording in diabetes for the UK combined (7.4% (IQR 3.3-15.9) vs. 2.3% (IQR 0.9-4.7), p<0.001) and for each country individually. For example, in Scotland the exception rate in MMI was 11.8% (IQR 5.4-19.3) compared to 3.5% (IQR 1.9-6.1) in diabetes.
Similar results were found for Blood Pressure (BP) recording: across the whole of the UK, payment and population achievement rates for BP recording in MMI were also significantly reduced compared to payment and population achievement rates for the recording of BP in chronic kidney disease (CKD): payment rate 94.1% (IQR 90.9-97.1) vs. 97.8% (IQR 96.3-98.9), p<0.001, and population achievement rate 87.0% (IQR 81.3-91.7) vs. 97.1% (IQR 95.5-98.4), p<0.001. Exception rates were again significantly higher for the recording of BP in MMI compared to CKD (6.4% (IQR 3.0-13.1) vs. 0.3% (IQR 0.0-1.0), p<0.001). There was also evidence of differences in rates of recording of BMI and BP in MMI across the UK. BMI and BP recording in MMI were significantly lower in Scotland compared to England (BMI: -1.5%, 99% CI -2.7 to -0.3%, p<0.001 and BP: -1.8%, 99% CI -2.7 to -0.9%, p<0.001). By contrast, rates of BMI and BP recording in diabetes and CKD were similar in Scotland and England (BMI: -0.5, 99% CI -1.0 to 0.05, p=0.004 and BP: 0.02, 99% CI -0.2 to 0.3, p=0.797). Data from the PsyCIS cohort showed an increase in Standardised Mortality Ratios (SMR) across the lifespan for individuals with MMI compared to the local Glasgow and wider Scottish populations (Glasgow SMR 1.8 95% CI 1.6-2.0 and Scotland SMR 2.7 95% CI 2.4-3.1). Increasing socioeconomic deprivation was associated with an increased overall rate of death in MMI (350.3 deaths/10,000 population/5 years in the least deprived quintile compared to 794.6 deaths/10,000 population/5 years in the most deprived quintile). No significant difference in rate of death for individuals with schizophrenia compared with bipolar disorder was reported (6.3% vs. 4.9%, p=0.086), but primary cause of death varied, with higher rates of suicide in individuals with bipolar disorder (22.4% vs. 11.7%, p=0.04).
Discussion: Local and national datasets can be used for epidemiological study to inform local practice and complement existing national and international studies. While the strengths of this thesis include the large data sets used and therefore their likely representativeness to the wider population, some limitations largely associated with using secondary data sources are acknowledged. While this thesis has confirmed evidence of increased physical health comorbidity and multimorbidity in individuals with MMI, it is likely that these findings represent a significant under reporting and likely under recognition of physical health comorbidity in this population. This is likely due to a combination of patient, health professional and healthcare system factors and requires further investigation. Moreover, evidence of inequality in access to healthcare in terms of: physical health promotion (namely smoking cessation advice), recording of physical health indices (BMI and BP), prescribing of medications for the treatment of physical illness and prescribing of NRT has been found at a national level. While significant premature mortality in individuals with MMI within a Scottish setting has been confirmed, more work is required to further detail and investigate the impact of socioeconomic deprivation on cause and rate of death in this population. It is clear that further education and training is required for all healthcare staff to improve the recognition, diagnosis and treatment of physical health problems in this population with the aim of addressing the significant premature mortality that is seen. Conclusions: Future work lies in the challenge of designing strategies to reduce health inequalities and narrow the gap in premature mortality reported in individuals with MMI. 
Models of care that allow a much more integrated approach to diagnosing, monitoring and treating both the physical and mental health of individuals with MMI, particularly in areas of social and economic deprivation, may be helpful. Strategies to engage this “hard to reach” population also need to be developed. While greater integration of psychiatric services with primary care and with specialist medical services is clearly vital, the evidence on how best to achieve this is limited. While the National Health Service (NHS) is currently undergoing major reform, attention needs to be paid to designing better ways to improve the current disconnect between primary and secondary care. This should then help to improve physical, psychological and social outcomes for individuals with MMI.
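The results above are reported as odds ratios (OR) with 95% confidence intervals. For readers unfamiliar with the statistic, the sketch below computes an unadjusted OR and a Wald interval from a 2x2 table. The counts are invented, and the thesis may well have used adjusted estimates (for example from logistic regression), so this is only the textbook calculation, not a reproduction of its analysis. The function name `odds_ratio_ci` is hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: group of interest, condition present   b: group of interest, condition absent
    c: comparison group, condition present    d: comparison group, condition absent
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Made-up counts for illustration only (not data from the thesis).
or_value, ci_low, ci_high = odds_ratio_ci(a=40, b=60, c=20, d=80)
print(f"OR {or_value:.2f} 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

An interval that excludes 1 corresponds to a difference that is significant at roughly the 5% level, which is how figures such as "OR 0.67 95% CI 0.56-0.80" in the abstract can be read.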


POWKist: visualising cultural heritage linked datasets.

Abstract:

A collaboration between dot.rural at the University of Aberdeen and the iSchool at Northumbria University, POWkist is a pilot study exploring potential uses of currently available linked datasets within the cultural heritage domain. Many privately held family history collections (shoebox archives) remain vulnerable unless a sustainable, affordable and accessible model of citizen-archivist digital preservation can be offered. Citizen-historians have used the web as a platform to preserve cultural heritage; however, with no accessible or sustainable model, these digital footprints have been ad hoc and rarely connected to broader historical research. Similarly, current approaches to connecting material on the web by exploiting linked datasets do not take into account the data characteristics of the cultural heritage domain. Funded by Semantic Media, the POWKist project is investigating how best to capture, curate, connect and present the contents of citizen-historians’ shoebox archives in an accessible and sustainable online collection. Using the Curios platform, an open-source digital archive, we have digitised a collection relating to a prisoner of war during WWII (1939-1945). Following a series of user group workshops, POWkist is now connecting these ‘made digital’ items with the broader web using a semantic technology model and identifying appropriate linked datasets of relevant content such as DBpedia (an archived linked dataset of Wikipedia) and Ordnance Survey Open Data. We are analysing the characteristics of cultural heritage linked datasets so that these materials can be better visualised, contextualised and presented in an attractive and comprehensive user interface. Our paper will consider the issues we have identified and the solutions we are developing, and will include a demonstration of our work in progress.


Seeing Turkish state formation processes: Mapping language and education census data

Abstract:

Language provides an interesting lens through which to look at state-building processes because of its cross-cutting nature. For example, in addition to its symbolic value and appeal, a national language has other roles in the process, including: (a) becoming the primary medium of communication, which permits the nation to function efficiently in its political and economic life, (b) promoting social cohesion, allowing the nation to develop a common culture, and (c) forming a primordial basis for self-determination. Moreover, because of its cross-cutting nature, language interventions are rarely isolated activities. Languages are adopted by speakers, taking root in and spreading between communities because they are legitimated by legislation and then reproduced through institutions like the education and military systems. Pádraig Ó Riagáin (1997) makes a case for this, observing that “Language policy is formulated, implemented, and accomplishes its results within a complex interrelated set of economic, social, and political processes which include, inter alia, the operation of other non-language state policies” (p. 45). In the Turkish case, the foundational role of language in the formation of the Turkish nation-state, together with its linkages to human rights issues, raises interesting questions about how socio-cultural practices become reproduced through institutional infrastructure formation. This dissertation is a country-level case study looking at Turkey’s nation-state building process through the lens of its language and education policy development processes, with a focus on the early years of the Republic between 1927 and 1970. This project examines how different groups self-identified or were identified (as the case may be) in official Turkish statistical publications (e.g., the Turkish annual statistical yearbooks and the population censuses) during the period when language and ethnicity data were made publicly available.
The overarching questions this dissertation explores include: 1. What were the geo-political conditions surrounding and influencing the development of the Turkish government’s language and education policies? 2. Are there any observable patterns in the geo-spatial distribution of language, literacy, and education participation rates over time? In what ways are these traditionally linked variables (language, literacy, education participation) problematic? 3. What do changes in population identifiers, e.g., language and ethnicity, suggest about the government’s approach towards nation-state building through the construction of a civic Turkish identity and institution building? Archival secondary source data were digitized and aggregated by categories relevant to this project at national and provincial levels and over time (primarily between 1927 and 2000). The data were then re-aggregated into values that could be longitudinally compared and then layered on aspatial administrative maps. This dissertation contributes to the existing body of social policy literature by taking an interdisciplinary approach in looking at the larger socio-economic contexts in which language and education policies are produced.


Should data monitoring committees assess efficacy when considering safety in trials in acute stroke?

Abstract:

The primary role of a trial's Data Monitoring Committee (DMC) is to ensure the safety of enrolled patients. In stroke trials, safety is typically monitored by comparing death and stroke-specific events between treatment groups. DMCs may also have a remit to monitor efficacy, depending on the aims of the trial. We hypothesised that functional outcome at the end of follow-up, a measure of efficacy, is also a powerful measure of safety, and tested this in a systematic review.



Effects of efficient fronto-temporal circuitry on lexical ambiguity resolution: converging evidence from cross-age comparisons in eye-tracking and ERP data

Abstract:

Eye-tracking was used to examine how younger and older adults use syntactic and semantic information to disambiguate noun/verb (NV) homographs (e.g., park). We find that young adults exhibit inflated first fixations to NV-homographs when only syntactic cues are available for disambiguation (i.e., in syntactic prose). This effect is eliminated with the addition of disambiguating semantic information. Older adults (60+) as a group fail to show the first fixation effect in syntactic prose; they instead reread NV homographs longer. This pattern mirrors that in prior event-related potential work (Lee & Federmeier, 2009, 2011), which reported a sustained frontal negativity to NV-homographs in syntactic prose for young adults, which was eliminated by semantic constraints. The frontal negativity was not observed in older adults as a group, although older adults with high verbal fluency showed the young-like pattern. Analyses of individual differences in eye-tracking patterns revealed a similar effect of verbal fluency in both young and older adults: high verbal fluency groups of both ages show larger first fixation effects, while low verbal fluency groups show larger downstream costs (rereading and/or refixating NV homographs). Jointly, the eye-tracking and ERP data suggest that effortful meaning selection recruits frontal brain areas important for suppressing contextually inappropriate meanings, which also slows eye movements. Efficacy of fronto-temporal circuitry, as captured by verbal fluency, predicts the success of engaging these mechanisms in both young and older adults. Failure to recruit these processes requires compensatory rereading or leads to comprehension failures (Lee & Federmeier, in press).


A test method for analysing disturbed ethernet data streams

Abstract:

Ethernet connections, which are widely used in computer networks, can suffer from electromagnetic interference. Typically, a degradation of the data transmission rate is perceived, because electromagnetic disturbances corrupt data frames on the network medium. This paper presents a software-based measuring method that allows a direct assessment of the effects on the link layer. The results can be linked directly to the physical interaction, without the influence of software-related effects on higher protocol layers. This provides a simple tool for quantitative analysis of the disturbance of an Ethernet connection based on time-domain data. An example shows how the data can be used for further investigation of mechanisms and for the detection of intentional electromagnetic attacks. © 2015 Author(s).
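One way to picture a link-layer, time-domain measurement is to check each captured frame's frame check sequence (FCS) and count failures per time interval. The sketch below does this in Python under several assumptions that do not come from the paper: that the capture delivers frames with their 4-byte FCS still attached (many network adapters discard bad frames before software sees them), that the FCS is the standard CRC-32 as computed by `zlib.crc32`, and that it is stored least significant byte first. The function names are hypothetical.

```python
import struct
import zlib
from collections import Counter


def with_fcs(frame_without_fcs: bytes) -> bytes:
    """Append a CRC-32 frame check sequence (assumed little-endian byte order)."""
    crc = zlib.crc32(frame_without_fcs) & 0xFFFFFFFF
    return frame_without_fcs + struct.pack("<I", crc)


def fcs_ok(frame: bytes) -> bool:
    """Recompute the CRC over the frame body and compare it with the trailing FCS."""
    body, fcs = frame[:-4], frame[-4:]
    return struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF) == fcs


def errors_per_second(captures):
    """Count corrupted frames per whole second.

    captures: iterable of (timestamp_in_seconds, frame_bytes_including_fcs).
    """
    counts = Counter()
    for timestamp, frame in captures:
        if not fcs_ok(frame):
            counts[int(timestamp)] += 1
    return dict(counts)


# Synthetic capture: one clean frame and a copy with a single flipped bit.
clean = with_fcs(b"\x00" * 60)
corrupted = bytes([clean[0] ^ 0x01]) + clean[1:]

capture = [(0.1, clean), (0.5, corrupted), (1.2, corrupted), (1.9, corrupted)]
print(errors_per_second(capture))
```

Plotting the per-interval error counts against the timing of an applied disturbance would give the kind of time-domain view the abstract describes, without involving retransmission behaviour of higher protocol layers.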


Un Servizio Web per l'analisi Semantica degli Open Data

Abstract:

The goal of this thesis is to automate, as far as possible, the machine understanding of Open Data. This was achieved through the design and development of the “Semantic Detector”, a solution that sits between the raw data (that is, the dataset) and any high-level software that uses these data, so that they can be effectively reused or suitably reorganised into a format that can be aggregated.


Query routing in cooperative semi-structured peer-to-peer information retrieval networks

Abstract:

Conventional web search engines are centralised in that a single entity crawls and indexes the documents selected for future retrieval, and the relevance models used to determine which documents are relevant to a given user query. As a result, these search engines suffer from several technical drawbacks such as handling scale, timeliness and reliability, in addition to ethical concerns such as commercial manipulation and information censorship. Alleviating the need to rely entirely on a single entity, Peer-to-Peer (P2P) Information Retrieval (IR) has been proposed as a solution, as it distributes the functional components of a web search engine – from crawling and indexing documents, to query processing – across the network of users (or, peers) who use the search engine. This strategy for constructing an IR system poses several efficiency and effectiveness challenges which have been identified in past work. Accordingly, this thesis makes several contributions towards advancing the state of the art in P2P-IR effectiveness by improving the query processing and relevance scoring aspects of a P2P web search. Federated search systems are a form of distributed information retrieval model that route the user’s information need, formulated as a query, to distributed resources and merge the retrieved result lists into a final list. P2P-IR networks are one form of federated search in routing queries and merging result among participating peers. The query is propagated through disseminated nodes to hit the peers that are most likely to contain relevant documents, then the retrieved result lists are merged at different points along the path from the relevant peers to the query initializer (or namely, customer). 
However, query routing is considered one of the major challenges and a critical component of P2P-IR networks: relevant peers may be missed through low-quality peer selection during query routing, which inevitably leads to less effective retrieval results. This motivates the thesis to study and propose query routing techniques that improve retrieval quality in such networks. Cluster-based semi-structured P2P-IR networks exploit the cluster hypothesis to organise peers into semantically similar clusters, each of which is managed by super-peers. In this thesis, I construct three semi-structured P2P-IR models and examine their retrieval effectiveness. I also leverage the cluster centroids at the super-peer level, as content representations gathered from cooperative peers, to propose a query routing approach called Inverted PeerCluster Index (IPI), which simulates the conventional inverted index of a centralised corpus to organise the statistics of peers’ terms. The results show competitive retrieval quality in comparison to baseline approaches. Furthermore, I study the applicability of conventional Information Retrieval models as peer selection approaches, where each peer is treated as one big document made up of its documents. The experimental evaluation shows comparable and significant results and indicates that document retrieval methods are very effective for peer selection, which revives the analogy between documents and peers. Additionally, Learning to Rank (LtR) algorithms are exploited to build a learned classifier for peer ranking at the super-peer level. The experiments show significant results against state-of-the-art resource selection methods and competitive results against corresponding classification-based approaches. Finally, I propose reputation-based query routing approaches that exploit the idea of providing feedback on a specific item in social community networks and managing it for future decision-making.
The system monitors users’ behaviour when they click or download documents from the final ranked list, treats it as implicit feedback, and mines this information to build a reputation-based data structure. The data structure is used to score peers and then rank them for query routing. I conduct a set of experiments covering various scenarios, including noisy feedback (i.e., positive feedback on non-relevant documents), to examine the robustness of reputation-based approaches. The empirical evaluation shows significant results on almost all evaluation metrics, with improvements of more than approximately 56% over baseline approaches. Thus, based on these results, if one were to choose a single technique, reputation-based approaches are clearly the natural choice; they can also be deployed on any P2P network.
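
The reputation-based routing idea in the abstract can be illustrated with a small sketch. The abstract does not specify the actual data structure or scoring formula, so everything here is an assumption made for illustration: the class name `ReputationTable`, its methods, the per-term feedback counts, and the additive score are all hypothetical and are not the thesis's method.

```python
from collections import defaultdict


class ReputationTable:
    """Hypothetical per-term reputation store for peers (illustrative only)."""

    def __init__(self):
        # term -> peer_id -> accumulated implicit-feedback weight
        self._scores = defaultdict(lambda: defaultdict(float))

    def record_feedback(self, query_terms, peer_id, weight=1.0):
        """Credit a peer for each distinct query term when a user clicks
        or downloads one of its documents (implicit positive feedback)."""
        for term in set(query_terms):
            self._scores[term][peer_id] += weight

    def score(self, query_terms, peer_id):
        """Sum the feedback the peer has accumulated for the query's terms."""
        return sum(
            self._scores[term].get(peer_id, 0.0)
            for term in set(query_terms)
            if term in self._scores  # avoid creating empty entries on lookup
        )

    def rank_peers(self, query_terms, peer_ids, top_k=None):
        """Order candidate peers by descending reputation, ties by peer id."""
        ranked = sorted(peer_ids, key=lambda p: (-self.score(query_terms, p), p))
        return ranked if top_k is None else ranked[:top_k]


# Assumed usage: feedback is recorded as users interact with results,
# then a later query is routed to the best-scoring peers.
table = ReputationTable()
table.record_feedback(["semantic", "web"], "peer_b")
table.record_feedback(["semantic"], "peer_b")
table.record_feedback(["web"], "peer_a")
selected = table.rank_peers(["semantic", "web"], ["peer_a", "peer_b", "peer_c"], top_k=2)
```

In this sketch, noisy feedback of the kind the abstract mentions would simply appear as extra calls to `record_feedback` for peers whose documents were not relevant; how the thesis makes its approach robust to that noise is not described in the abstract.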


