920 results for Data representation
Abstract:
The predominant fear in capital markets is that of a price spike. Commodity markets differ in that there is a fear of both upward and downward jumps; this results in implied volatility curves displaying distinct shapes when compared to equity markets. A novel functional data analysis (FDA) approach provides a framework to produce and interpret functional objects that characterise the underlying dynamics of oil future options. We use the FDA framework to examine implied volatility, jump risk, and pricing dynamics within crude oil markets. Examining a WTI crude oil sample for the 2007–2013 period, which includes the global financial crisis and the Arab Spring, we find strong evidence of converse jump dynamics during periods of demand-side and supply-side weakness. This is used as the basis for an FDA-derived Merton (1976) jump diffusion optimised delta hedging strategy, which exhibits superior portfolio management results compared with traditional methods.
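As context for the hedging strategy mentioned above, the following is a minimal sketch of Merton (1976) jump-diffusion pricing: the call price and its delta are Poisson-weighted sums of Black-Scholes values with jump-adjusted volatility and drift. This is illustrative only, not the paper's FDA-optimised strategy, and the parameter names and values (lam, mu_j, sig_j, etc.) are hypothetical.

```python
# Illustrative sketch (not the paper's implementation): Merton (1976)
# jump-diffusion call price and delta as a Poisson-weighted sum of
# Black-Scholes values. All parameter values are hypothetical.
from math import exp, log, sqrt, factorial
from scipy.stats import norm

def bs_call(S, K, T, r, sigma):
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    price = S * norm.cdf(d1) - K * exp(-r * T) * norm.cdf(d2)
    return price, norm.cdf(d1)                     # price and Black-Scholes delta

def merton_call(S, K, T, r, sigma, lam, mu_j, sig_j, n_terms=40):
    k = exp(mu_j + 0.5 * sig_j**2) - 1             # expected relative jump size
    lam_bar = lam * (1 + k)
    price = delta = 0.0
    for n in range(n_terms):
        w = exp(-lam_bar * T) * (lam_bar * T)**n / factorial(n)   # Poisson weight
        sigma_n = sqrt(sigma**2 + n * sig_j**2 / T)               # jump-adjusted volatility
        r_n = r - lam * k + n * log(1 + k) / T                    # jump-adjusted drift
        p, d = bs_call(S, K, T, r_n, sigma_n)
        price += w * p
        delta += w * d
    return price, delta                            # delta drives the hedge ratio

print(merton_call(S=100, K=100, T=0.5, r=0.02, sigma=0.3, lam=0.8, mu_j=-0.05, sig_j=0.2))
```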
Abstract:
Eye tracking has become a prominent technique for evaluating user interaction and behaviour with study objects in defined contexts. Common eye tracking data representation techniques offer valuable input regarding user interaction and eye gaze behaviour, namely through the measurement of fixations and saccades. However, these and other techniques may be insufficient for representing the data acquired in specific studies, namely because of the complexity of the study object being analysed. This paper contributes a summary of data representation and information visualization techniques used in data analysis within different contexts (advertising, websites, television news and video games). Additionally, several methodological approaches are presented, resulting from studies completed or under development at CETAC.MEDIA - Communication Sciences and Technologies Research Centre. In the studies described, traditional data representation techniques proved insufficient; new forms of representing data, based on common techniques, were therefore developed with the objective of improving communication and information strategies. For each of these studies, a brief summary of its contribution to the respective area is presented, along with the data representation techniques used and some of the results obtained.
Abstract:
Specific choices about how to represent complex networks can have a substantial impact on the execution time required to construct and analyse those structures. In this work we report a comparison of the effects of representing complex networks statically by adjacency matrices or dynamically by adjacency lists. Three theoretical models of complex networks are considered: two types of Erdos-Renyi model as well as the Barabasi-Albert model. We investigated the effect of the different representations on the construction and measurement of several topological properties (degree, clustering coefficient, shortest path length, and betweenness centrality). We found that the form of representation generally has a substantial effect on the execution time, with the sparse representation frequently resulting in remarkably superior performance.
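To make the contrast concrete, here is a small illustrative sketch (not the authors' code) of the two representations for an undirected Erdos-Renyi graph. Computing a node's degree touches every possible neighbour in the matrix form but only the existing edges in the list form, which is where the sparse representation's advantage comes from.

```python
# Minimal sketch contrasting a dense adjacency matrix with a sparse
# adjacency list for the same undirected graph; graph size and edge
# probability below are arbitrary example values.
import random

def random_graph(n, p, seed=0):
    """Erdos-Renyi G(n, p): each pair of nodes is linked with probability p."""
    rng = random.Random(seed)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]

def as_matrix(n, edges):
    m = [[0] * n for _ in range(n)]
    for i, j in edges:
        m[i][j] = m[j][i] = 1
    return m

def as_lists(n, edges):
    adj = [[] for _ in range(n)]
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    return adj

n = 1000
edges = random_graph(n, 0.01)
matrix, lists = as_matrix(n, edges), as_lists(n, edges)

# Degree: O(n) work per node on the matrix, O(degree) per node on the lists.
deg_matrix = [sum(row) for row in matrix]
deg_lists = [len(neigh) for neigh in lists]
assert deg_matrix == deg_lists
```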
Abstract:
An important feature of computer systems developed for the agricultural sector is their ability to accommodate the heterogeneity of data generated in different processes. Most problems related to this heterogeneity arise from the lack of a standard across the different computing solutions proposed. An efficient solution is to create a single standard for data exchange. The study of the process involved in cotton production was based on research by the Brazilian Agricultural Research Corporation (EMBRAPA) that documents all phases, compiled from several theoretical and practical studies related to the cotton crop. The proposed standard starts with the identification of the most important classes of data involved in the process and includes an ontology, a systematization of concepts related to the production of cotton fiber, resulting in a set of classes, relations, functions and instances. The results are used as a reference for the development of computational tools, transforming implicit knowledge into applications that support the knowledge described. This research is based on data from the Midwest of Brazil. The cotton production process was chosen as a case study because Brazil is one of the major players in this market and several improvements are still required for system integration in this segment.
Abstract:
Molybdenum isotopes are increasingly widely applied in Earth Sciences. They are primarily used to investigate the oxygenation of Earth's ocean and atmosphere. However, more and more fields of application are being developed, such as magmatic and hydrothermal processes, planetary sciences or the tracking of environmental pollution. Here, we present a proposal for a unifying presentation of Mo isotope ratios in studies of mass-dependent isotope fractionation. We suggest that the δ98/95Mo of the NIST SRM 3134 be defined as +0.25‰. The rationale is that the vast majority of published data are presented relative to reference materials that are similar, but not identical, and that are all slightly lighter than NIST SRM 3134. Our proposed data presentation allows a direct first-order comparison of almost all old data with future work while referring to an international measurement standard. In particular, canonical δ98/95Mo values such as +2.3‰ for seawater and −0.7‰ for marine Fe–Mn precipitates can be kept for discussion. As recent publications show that the ocean molybdenum isotope signature is homogeneous, the IAPSO ocean water standard or any other open ocean water sample is suggested as a secondary measurement standard, with a defined δ98/95Mo value of +2.34 ± 0.10‰ (2s).
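For reference, the conventional delta definition underlying these values can be written as below; the second relation is one reading of how reported values shift once NIST SRM 3134 is anchored at +0.25‰. This formulation is an assumption for illustration, not quoted from the abstract.

```latex
% Conventional delta notation (assumed); R denotes the ratio ^{98}Mo/^{95}Mo,
% and values are expressed in permil.
\delta^{98/95}\mathrm{Mo}_{\mathrm{sample/SRM\,3134}}
  = \left( \frac{R_{\mathrm{sample}}}{R_{\mathrm{NIST\ SRM\ 3134}}} - 1 \right) \times 1000

% One reading of the proposed anchoring: reported values sit on a scale on
% which NIST SRM 3134 itself is defined as +0.25 permil.
\delta^{98/95}\mathrm{Mo}_{\mathrm{reported}}
  \approx \delta^{98/95}\mathrm{Mo}_{\mathrm{sample/SRM\,3134}} + 0.25
```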
Abstract:
Increasing the size of training data in many computer vision tasks has been shown to be very effective. Using large scale image datasets (e.g. ImageNet) with simple learning techniques (e.g. linear classifiers), one can achieve state-of-the-art performance in object recognition compared to sophisticated learning techniques on smaller image sets. Semantic search on visual data has become very popular. There are billions of images on the internet and the number is increasing every day. Dealing with large scale image sets is demanding in itself: they take a significant amount of memory, which makes it impossible to process the images with complex algorithms on single-CPU machines. Finding an efficient image representation can be key to attacking this problem. Efficiency alone, however, is not enough for image understanding; the representation should also be comprehensive and rich in carrying semantic information. In this proposal we develop an approach to computing binary codes that provide a rich and efficient image representation. We demonstrate several tasks in which binary features can be very effective. We show how binary features can speed up large scale image classification. We present techniques to learn the binary features from supervised image sets (with different types of semantic supervision: class labels, textual descriptions). We propose several problems that are central to finding and using efficient image representations.
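As a generic illustration of why binary codes are attractive at this scale, the sketch below uses random-hyperplane hashing (not the learned codes proposed above) to binarise real-valued descriptors and rank a database by Hamming distance; the descriptor dimension, code length and data are placeholders.

```python
# Generic illustration: random-hyperplane hashing turns float image
# descriptors into short binary codes, and Hamming distance over the codes
# gives a cheap similarity-search primitive. Data here is synthetic.
import numpy as np

rng = np.random.default_rng(0)
dim, n_bits = 512, 64                                   # assumed descriptor size and code length
hyperplanes = rng.standard_normal((n_bits, dim))

def binarize(features):
    """Map (n, dim) float descriptors to (n, n_bits) binary codes."""
    return (features @ hyperplanes.T > 0).astype(np.uint8)

def hamming(codes, query):
    """Hamming distance between one query code and a matrix of codes."""
    return np.count_nonzero(codes != query, axis=1)

database = binarize(rng.standard_normal((10_000, dim))) # stand-in descriptors
query = binarize(rng.standard_normal((1, dim)))[0]
nearest = np.argsort(hamming(database, query))[:5]      # top-5 matches by code distance
print(nearest)
```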
Abstract:
A tag-based item recommendation method generates an ordered list of items, likely to interest a particular user, using the user's past tagging behaviour. However, users' tagging behaviour varies across tagging systems. A potential problem in generating quality recommendations is how to build user profiles that interpret user behaviour so it can be used effectively in recommendation models. Generally, recommendation methods are designed to work with specific types of user profiles and may not work well with different datasets. In this paper, we investigate several tagging data interpretation and representation schemes that can lead to building an effective user profile. We discuss the various benefits a scheme brings to a recommendation method by highlighting the representative features of user tagging behaviour on a specific dataset. Empirical analysis shows that each interpretation scheme forms a distinct data representation which eventually affects the recommendation result. Results on various datasets show that an interpretation scheme should be selected based on the dominant usage in the tagging data (i.e. whether tags or items are more abundant). This usage represents the characteristics of user tagging behaviour in the system. The results also demonstrate how the scheme is able to address the cold-start user problem.
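One minimal way to interpret tagging data into a user profile is sketched below: a weighted tag vector built from the user's tag counts, with candidate items scored by overlap against their tags. This toy example under assumed data is only an illustration of the general idea, not one of the schemes evaluated in the paper.

```python
# Toy sketch of a tag-based user profile and item scoring; the tags and
# items are invented example data.
from collections import Counter

user_tags = ["python", "data", "viz", "python", "pandas"]   # user's past tag assignments
item_tags = {
    "item_a": ["python", "pandas", "tutorial"],
    "item_b": ["cooking", "recipes"],
    "item_c": ["viz", "data", "d3"],
}

profile = Counter(user_tags)                                 # tag -> usage count

def score(tags, profile):
    """Sum the profile weights of the tags attached to an item."""
    return sum(profile.get(t, 0) for t in tags)

ranking = sorted(item_tags, key=lambda i: score(item_tags[i], profile), reverse=True)
print(ranking)                                               # items ordered by estimated interest
```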
Abstract:
Spatial data representation and compression have become a focal issue in computer graphics and image processing applications. Quadtrees, hierarchical data structures based on the principle of recursive decomposition of space, offer a compact and efficient representation of an image. For a given image, the choice of quadtree root node plays an important role in its quadtree representation and final data compression. The goal of this thesis is to present a heuristic algorithm for finding a root node of a region quadtree that reduces the number of leaf nodes compared with the standard quadtree decomposition. The empirical results indicate that the proposed algorithm improves the quadtree representation and data compression compared with the traditional method.
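For concreteness, a standard region-quadtree decomposition with leaf counting is sketched below; the thesis's root-selection heuristic itself is not reproduced here. Fewer leaf nodes corresponds to a more compact representation, which is the quantity the heuristic aims to reduce.

```python
# Standard region-quadtree sketch: recursively split a 2^k x 2^k binary
# image until each block is uniform, and count the leaf nodes produced.
import numpy as np

def count_leaves(img):
    """Number of leaf nodes in the region quadtree of a 2^k x 2^k binary image."""
    if img.min() == img.max():            # uniform block -> single leaf
        return 1
    h, w = img.shape
    h2, w2 = h // 2, w // 2
    return (count_leaves(img[:h2, :w2]) + count_leaves(img[:h2, w2:]) +
            count_leaves(img[h2:, :w2]) + count_leaves(img[h2:, w2:]))

img = np.zeros((64, 64), dtype=np.uint8)
img[10:40, 20:50] = 1                     # toy rectangular region
print(count_leaves(img))                  # leaf count for this decomposition
```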
Abstract:
This chapter addresses data modelling as a means of promoting statistical literacy in the early grades. Consideration is first given to the importance of increasing young children's exposure to statistical reasoning experiences and how data modelling can be a rich means of doing so. Selected components of data modelling are then reviewed, followed by a report on some findings from the third year of a three-year longitudinal study across grades one through three.
Abstract:
In this paper we describe the design of DNA Jewellery, which is a wearable tangible data representation of personal DNA profile data. An iterative design process was followed to develop a 3D form-language that could be mapped to standard DNA profile data, with the aim of retaining readability of data while also producing an aesthetically pleasing and unique result in the area of personalized design. The work explores design issues with the production of data tangibles, contributes to a growing body of research exploring tangible representations of data and highlights the importance of approaches that move between technology, art and design.
Abstract:
The Echology: Making Sense of Data initiative seeks to break new ground in arts practice by asking artists to innovate with respect to a) the possible forms of data representation in public art and b) the artist's role in engaging publics on environmental sustainability in new urban developments. Initiated by ANAT and Carbon Arts in 2011, Echology has seen three artists selected by national competition in 2012 for Lend Lease sites across Australia. In 2013, commissioning of one of these works, the Mussel Choir by Natalie Jeremijenko, began in Melbourne's Victoria Harbour development. This emerging practice of data-driven and environmentally engaged public artwork presents multiple challenges to established systems of public arts production and management, while also opening up new avenues for artists to forge new modes of collaboration. The experience of Echology, and in particular the Mussel Choir, is examined here to reveal opportunities for expansion of this practice through identification of the factors that lead to a resilient 'ecology of partnership' between stakeholders that include science and technology researchers, education providers, city administrators, and urban developers.
Abstract:
This thesis is an investigation into the nature of data analysis and computer software systems which support this activity.
The first chapter develops the notion of data analysis as an experimental science which has two major components: data-gathering and theory-building. The basic role of language in determining the meaningfulness of theory is stressed, and the informativeness of a language and data base pair is studied. The static and dynamic aspects of data analysis are then considered from this conceptual vantage point. The second chapter surveys the available types of computer systems which may be useful for data analysis. Particular attention is paid to the questions raised in the first chapter about the language restrictions imposed by the computer system and its dynamic properties.
The third chapter discusses the REL data analysis system, which was designed to satisfy the needs of the data analyzer in an operational relational data system. The major limitation on the use of such systems is the amount of access to data stored on a relatively slow secondary memory. This problem of the paging of data is investigated and two classes of data structure representations are found, each of which has desirable paging characteristics for certain types of queries. One representation is used by most of the generalized data base management systems in existence today, but the other is clearly preferred in the data analysis environment, as conceptualized in Chapter I.
This data representation has strong implications for a fundamental process of data analysis -- the quantification of variables. Since quantification is one of the few means of summarizing and abstracting, data analysis systems are under strong pressure to facilitate the process. Two implementations of quantification are studied: one analogous to the form of the lower predicate calculus and another more closely attuned to the data representation. A comparison of these indicates that use of the "label class" method results in an orders-of-magnitude improvement over the lower predicate calculus technique.