10 results for information extraction, unstructured data analysis, Semantic Web, data mining, text mining, big data, open data, text classification.

in Aston University Research Archive


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Exploratory analysis of data seeks to find common patterns to gain insights into the structure and distribution of the data. In geochemistry it is a valuable means to gain insights into the complicated processes making up a petroleum system. Typically, linear visualisation methods such as principal components analysis, linked plots, or brushing are used. These methods cannot be employed directly when dealing with missing data, and they struggle to capture global non-linear structures in the data, although they can do so locally. This thesis discusses a complementary approach based on a non-linear probabilistic model. The generative topographic mapping (GTM) enables the visualisation of the effects of very many variables on a single plot, which is able to incorporate more structure than a two-dimensional principal components plot. The model can deal with uncertainty and missing data, and allows for the exploration of the non-linear structure in the data. In this thesis a novel approach to initialise the GTM with arbitrary projections is developed. This makes it possible to combine GTM with algorithms like Isomap and fit complex non-linear structure like the Swiss-roll. Another novel extension is the incorporation of prior knowledge about the structure of the covariance matrix. This extension greatly enhances the modelling capabilities of the algorithm, resulting in a better fit to the data and better imputation capabilities for missing data. Additionally, an extensive benchmark study of the missing data imputation capabilities of GTM is performed. Further, a novel approach based on missing data is introduced to benchmark the fit of probabilistic visualisation algorithms on unlabelled data. Finally, the work is complemented by evaluating the algorithms on real-life datasets from geochemical projects.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The Electronic Product Code Information Service (EPCIS) is an EPCglobal standard that aims to bridge the gap between the physical world of RFID-tagged artifacts and the information systems that enable their tracking and tracing via the Electronic Product Code (EPC). Central to the EPCIS data model are "events" that describe specific occurrences in the supply chain. EPCIS events, recorded and registered against EPC-tagged artifacts, encapsulate the "what", "when", "where" and "why" of these artifacts as they flow through the supply chain. In this paper we propose an ontological model for representing EPCIS events on the Web of data. Our model provides a scalable approach for the representation, integration and sharing of EPCIS events as linked data via RESTful interfaces, thereby facilitating interoperability, collaboration and exchange of EPC-related data across enterprises on a Web scale.
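
To make the "what", "when", "where" and "why" structure of an event concrete, here is a minimal sketch in Python. It is not the EPCIS schema and not the ontological model proposed in the paper: the class name, field names, location labels and EPC values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class SupplyChainEvent:
    """Illustrative event record. Field names are hypothetical, not the EPCIS schema."""
    what: Tuple[str, ...]  # identifiers of the tagged artifacts involved
    when: datetime         # time at which the event occurred
    where: str             # location label
    why: str               # business context of the event


def trace(events: Iterable[SupplyChainEvent], epc: str) -> List[SupplyChainEvent]:
    """Return the events that mention `epc`, ordered by time."""
    return sorted((e for e in events if epc in e.what), key=lambda e: e.when)


# Made-up identifiers and locations, for illustration only.
EPC_1 = "urn:epc:id:sgtin:0000000.000000.1"
EPC_2 = "urn:epc:id:sgtin:0000000.000000.2"

events = [
    SupplyChainEvent((EPC_1,), datetime(2020, 1, 2, tzinfo=timezone.utc),
                     "warehouse-B", "receiving"),
    SupplyChainEvent((EPC_1, EPC_2), datetime(2020, 1, 1, tzinfo=timezone.utc),
                     "factory-A", "commissioning"),
    SupplyChainEvent((EPC_2,), datetime(2020, 1, 3, tzinfo=timezone.utc),
                     "shop-C", "receiving"),
]

history = trace(events, EPC_1)
```

Filtering by artifact and sorting by time is the simplest form of tracking and tracing; publishing such events as linked data, as the paper proposes, would additionally give each event and artifact a resolvable identifier.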

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper re-assesses three independently developed approaches that are aimed at solving the problem of zero-weights or non-zero slacks in Data Envelopment Analysis (DEA). The methods are weights restricted, non-radial and extended facet DEA models. Weights restricted DEA models are dual to envelopment DEA models with restrictions on the dual variables (DEA weights) aimed at avoiding zero values for those weights; non-radial DEA models are envelopment models which avoid non-zero slacks in the input-output constraints. Finally, extended facet DEA models recognize that only projections on facets of full dimension correspond to well-defined rates of substitution/transformation between all inputs/outputs, which in turn correspond to non-zero weights in the multiplier version of the DEA model. We demonstrate how these methods are equivalent, not only in their aim but also in the solutions they yield. In addition, we show that the aforementioned methods modify the production frontier by extending existing facets or creating unobserved facets. Further, we propose a new approach that uses weight restrictions to extend existing facets. This approach has some advantages in computational terms, because extended facet models normally make use of mixed integer programming models, which are computationally demanding.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The INTAMAP FP6 project has developed an interoperable framework for real-time automatic mapping of critical environmental variables by extending spatial statistical methods and employing open, web-based, data exchange protocols and visualisation tools. This paper gives an overview of the underlying problem and of the project, and discusses which problems it has solved and which open problems seem most relevant to address next. The interpolation problem that INTAMAP solves is the generic problem of spatial interpolation of environmental variables without user interaction, based on measurements of e.g. PM10, rainfall or gamma dose rate, at arbitrary locations or over a regular grid covering the area of interest. It deals with problems of varying spatial resolution of measurements, the interpolation of averages over larger areas, and with providing information on the interpolation error to the end-user. In addition, monitoring network optimisation is addressed in a non-automatic context.
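
The generic task described here, estimating a variable at arbitrary locations or over a regular grid from scattered measurements, can be sketched with inverse distance weighting. This is a deliberately simple stand-in and not the method used by INTAMAP, which extends spatial statistical methods and also reports the interpolation error; the function names and sample values below are assumptions made for illustration.

```python
import math


def idw(samples, x, y, power=2.0):
    """Inverse distance weighting estimate at (x, y).

    `samples` is a list of (xi, yi, value) measurements.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, value in samples:
        distance = math.hypot(x - xi, y - yi)
        if distance == 0.0:
            return value  # the target coincides with a measurement site
        weight = distance ** -power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def interpolate_grid(samples, xs, ys, power=2.0):
    """Evaluate the estimate on a regular grid, one row per y coordinate."""
    return [[idw(samples, x, y, power) for x in xs] for y in ys]


# Made-up measurements: (x, y, value).
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
grid = interpolate_grid(samples, xs=[0.0, 1.0, 2.0], ys=[0.0])
```

Unlike a statistical interpolator, this sketch gives no uncertainty estimate, which is one of the outputs the abstract highlights as important for the end-user.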

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Models are central tools for modern scientists and decision makers, and there are many existing frameworks to support their creation, execution and composition. Many frameworks are based on proprietary interfaces, and do not lend themselves to the integration of models from diverse disciplines. Web-based systems, or systems based on web services, such as Taverna and Kepler, allow composition of models based on standard web service technologies. At the same time the Open Geospatial Consortium has been developing its own service stack, which includes the Web Processing Service, designed to facilitate the execution of geospatial processing, including complex environmental models. The current Open Geospatial Consortium service stack employs Extensible Markup Language as a default data exchange standard, and widely-used encodings such as JavaScript Object Notation can often only be used when incorporated with Extensible Markup Language. Similarly, no successful engagement of the Web Processing Service standard with the well-supported technologies of Simple Object Access Protocol and Web Services Description Language has been seen. In this paper we propose a pure Simple Object Access Protocol/Web Services Description Language processing service which addresses some of the issues with the Web Processing Service specification and brings us closer to achieving a degree of interoperability between geospatial models, and thus realising the vision of a useful 'model web'.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The aim of this thesis was to investigate the impact of changing values and attitudes toward work and the workplace in Britain, West Germany, France and Japan. A cross-national approach was adopted in order to gain a better understanding of differences and similarities in behaviour and to identify aspects specific to each society. Although the relationship between work and leisure has been thoroughly examined and there is a growing body of literature on changes in the values associated with these two phenomena, little research has been carried out into leisure at work. Studies of work time have tended to consider it as a homogeneous block, whereas recent research suggests that more attention should be devoted to unravelling the multiple uses of time at the workplace. The present study sought to review and analyse this new approach to the study of work time, and special attention is devoted to an examination of definitions of leisure, recreation, free time and work within the context of the workplace. The cross-cultural comparative approach gave rise to several problems due to the number of countries involved and the unusual combination of factors being investigated. The main difficulties were differences in the amount and quality of literature available, the non-comparability of existing data, definitions of concepts and socio-linguistic terms, and problems over access to organizations for fieldwork. Much of the literature generalizes about patterns of behaviour and few authors isolate factors specific to particular societies. In this thesis new empirical work is therefore used to ascertain the extent to which generalizations can be made from the literature and characteristics peculiar to each of the four countries identified. White-collar employees in large, broadly comparable companies were studied using identical questionnaires in the appropriate language. Respondents selected were men and women, aged between 20 and 65 years, and either managers or non-managers.
Patterns of leisure at work were found to be broadly similar in the national contexts, but with the Japanese and the West Germans experiencing the least leisure at work, and the British and the French perceiving the most. The general trend seems to be toward convergence of attitudes regarding leisure at work in the four countries. Explanations for variations in practice were sought within the wider societal contexts of each country.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This article focuses on the deviations from normality of stock returns before and after a financial liberalisation reform, and shows the extent to which inference based on statistical measures of stock market efficiency can be affected by not controlling for breaks. Drawing from recent advances in the econometrics of structural change, it compares the distribution of the returns of five East Asian emerging markets when breaks in the mean and variance are either (i) imposed using certain official liberalisation dates or (ii) detected non-parametrically using a data-driven procedure. The results suggest that measuring deviations from normality of stock returns with no provision for potentially existing breaks incorporates substantial bias. This is likely to severely affect any inference based on the corresponding descriptive or test statistics.
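
The effect described above can be illustrated with a small simulation: two regimes that are each normal, but with different variances, look non-normal when pooled without accounting for the break. The sketch below uses synthetic data and a known break point; it does not reproduce the article's markets, its official liberalisation dates, or its data-driven break detection, and the helper names are assumptions.

```python
import random
import statistics


def skewness(xs):
    """Sample skewness based on population moments."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum(((x - mean) / sd) ** 3 for x in xs) / len(xs)


def excess_kurtosis(xs):
    """Sample kurtosis minus 3, so a normal distribution scores close to 0."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum(((x - mean) / sd) ** 4 for x in xs) / len(xs) - 3.0


# Synthetic returns: a variance break halfway through the sample.
rng = random.Random(0)
before = [rng.gauss(0.0, 1.0) for _ in range(5000)]
after = [rng.gauss(0.0, 3.0) for _ in range(5000)]
pooled = before + after

kurtosis_by_regime = (excess_kurtosis(before), excess_kurtosis(after))
kurtosis_pooled = excess_kurtosis(pooled)
```

Each regime has excess kurtosis near zero, while the pooled series shows clearly fat tails that come only from ignoring the break.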

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper proposes a new framework for evaluating the performance of employment offices based on the non-parametric technique of data envelopment analysis. This framework is explained using the assessment of the technical efficiency of 82 employment offices in Tunisia which are under the direction of the National Agency for Employment and Independent Work. We further investigated the exogenous factors that may explain part of the variation in efficiency scores using a bootstrapping approach over the period January 2006 to December 2008. Given the specialisation of employment offices, we used the proposed approach for the efficiency evaluation of graduate employment offices and multi-services employment offices separately.
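
As a rough illustration of technical efficiency in data envelopment analysis, the sketch below covers only the simplest case: one input, one output, and constant returns to scale, where a unit's efficiency is its output-to-input ratio divided by the best ratio observed. The office names, the choice of input and output, and the numbers are invented; the abstract does not state the paper's variables, and the general multi-input, multi-output case requires solving a linear program per unit. The bootstrapping step is not shown.

```python
def ccr_efficiency(units):
    """Efficiency scores for the one-input, one-output case.

    `units` maps a unit name to (input, output). Under constant returns
    to scale, efficiency is productivity relative to the best unit.
    """
    productivity = {name: out / inp for name, (inp, out) in units.items()}
    best = max(productivity.values())
    return {name: p / best for name, p in productivity.items()}


# Hypothetical offices: (staff employed, job seekers placed).
offices = {
    "office_A": (10, 200),
    "office_B": (8, 120),
    "office_C": (5, 100),
}

scores = ccr_efficiency(offices)
```

Here office_A and office_C define the frontier with a score of 1.0, and office_B scores 0.75, meaning it would need to raise its output by a third, or cut its input by a quarter, to reach the frontier.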

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The sharing of near real-time traceability knowledge in supply chains plays a central role in coordinating business operations and is a key driver for their success. However, before traceability datasets received from external partners can be integrated with datasets generated internally within an organisation, they need to be validated against information recorded for the physical goods received as well as against bespoke rules defined to ensure uniformity, consistency and completeness within the supply chain. In this paper, we present a knowledge-driven framework for the runtime validation of critical constraints on incoming traceability datasets encapsulated as EPCIS event-based linked pedigrees. Our constraints are defined using SPARQL queries and SPIN rules. We present a novel validation architecture based on the integration of the Apache Storm framework for real-time, distributed computation with popular Semantic Web/Linked data libraries, and exemplify our methodology on an abstraction of the pharmaceutical supply chain.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Supply chains comprise complex processes spanning multiple trading partners. The various operations involved generate a large number of events that need to be integrated in order to enable internal and external traceability. Further, provenance of artifacts and agents involved in the supply chain operations is now a key traceability requirement. In this paper we propose a Semantic Web/Linked data powered framework for the event-based representation and analysis of supply chain activities governed by the EPCIS specification. We specifically show how a new EPCIS event type called "Transformation Event" can be semantically annotated using EEM (the EPCIS Event Model) to generate linked data that can be exploited for internal event-based traceability in supply chains involving transformation of products. For integrating provenance with traceability, we propose a mapping from EEM to PROV-O. We exemplify our approach on an abstraction of the production processes that are part of the wine supply chain.