893 resultados para data analysis: algorithms and implementation
Resumo:
Distributed data aggregation is an important task, allowing the de- centralized determination of meaningful global properties, that can then be used to direct the execution of other applications. The resulting val- ues result from the distributed computation of functions like count, sum and average. Some application examples can found to determine the network size, total storage capacity, average load, majorities and many others. In the last decade, many di erent approaches have been pro- posed, with di erent trade-o s in terms of accuracy, reliability, message and time complexity. Due to the considerable amount and variety of ag- gregation algorithms, it can be di cult and time consuming to determine which techniques will be more appropriate to use in speci c settings, jus- tifying the existence of a survey to aid in this task. This work reviews the state of the art on distributed data aggregation algorithms, providing three main contributions. First, it formally de nes the concept of aggrega- tion, characterizing the di erent types of aggregation functions. Second, it succinctly describes the main aggregation techniques, organizing them in a taxonomy. Finally, it provides some guidelines toward the selection and use of the most relevant techniques, summarizing their principal characteristics.
Resumo:
A spreadsheet usually starts as a simple and singleuser software artifact, but, as frequent as in other software systems, quickly evolves into a complex system developed by many actors. Often, different users work on different aspects of the same spreadsheet: while a secretary may be only involved in adding plain data to the spreadsheet, an accountant may define new business rules, while an engineer may need to adapt the spreadsheet content so it can be used by other software systems.Unfortunately,spreadsheetsystemsdonotoffermodular mechanisms, and as a consequence, some of the previous tasks may be defined by adding intrusive “code” to the spreadsheet. In this paper we go through the design and implementation of an aspect-oriented language for spreadsheets so that users can work on different aspects of a spreadsheet in a modular way. For example, aspects can be defined in order to introduce new business rules to an existing spreadsheet, or to manipulate the spreadsheet data to be ported to another system. Aspects are defined as aspect-oriented program specifications that are dynamically woven into the underlying spreadsheet by an aspect weaver. In this aspect-oriented style of spreadsheet development, differentusers develop,orreuse,aspects withoutaddingintrusive code to the original spreadsheet. Such code is added/executed by the spreadsheet weaving mechanism proposed in this paper.
Resumo:
Although the ASP model has been around for over a decade, it has not achieved the expected high level of market uptake. This research project examines the past and present state of ASP adoption and identifies security as a primary factor influencing the uptake of the model. The early chapters of this document examine the ASP model and ASP security in particular. Specifically, the literature and technology review chapter analyses ASP literature, security technologies and best practices with respect to system security in general. Based on this investigation, a prototype to illustrate the range and types of technologies that encompass a security framework was developed and is described in detail. The latter chapters of this document evaluate the practical implementation of system security in an ASP environment. Finally, this document outlines the research outputs, including the conclusions drawn and recommendations with respect to system security in an ASP environment. The primary research output is the recommendation that by following best practices with respect to security, an ASP application can provide the same level of security one would expect from any other n-tier client-server application. In addition, a security evaluation matrix, which could be used to evaluate not only the security of ASP applications but the security of any n-tier application, was developed by the author. This thesis shows that perceptions with regard to fears of inadequate security of ASP solutions and solution data are misguided. Finally, based on the research conducted, the author recommends that ASP solutions should be developed and deployed on tried, tested and trusted infrastructure. Existing Application Programming Interfaces (APIs) should be used where possible and security best practices should be adhered to where feasible.
Resumo:
The results presented in this paper clearly indicate that precocene and azadirachtin are effective inhibitors of moulting and reproduction in the hemipteran Rhodnius prolixus. The time of application is important and only applications of these substances early in the intermoulting period cause their effects in nymphs. The inhibition of moulting is fully reversed by ecdysone therapy. Precocene and azadirachtin also affected drastically the oogenesis and egg deposition in this insect. Precocene-induced sterilization is reversed by application of juvenile hormone III. However, this hormone is unable to reverse the effect of azadirachtin on reproduction. Ecdysteroid titers in nymphs and adult females are decreased by these treatments. In vitro analysis suggest that precocene and azadirachtin may act directly on the prothoracic glands and ovaries producing ecdysteroids. Based on these and other findings the possible mode of action of these compounds on the development and reproduction of Rhodnius prolixus is discussed.
Resumo:
The use of Geographic Information Systems has revolutionalized the handling and the visualization of geo-referenced data and has underlined the critic role of spatial analysis. The usual tools for such a purpose are geostatistics which are widely used in Earth science. Geostatistics are based upon several hypothesis which are not always verified in practice. On the other hand, Artificial Neural Network (ANN) a priori can be used without special assumptions and are known to be flexible. This paper proposes to discuss the application of ANN in the case of the interpolation of a geo-referenced variable.
Resumo:
AIM: Although acute pain is frequently reported by patients admitted to the emergency room, it is often insufficiently evaluated by physicians and is thus undertreated. With the aim of improving the care of adult patients with acute pain, we developed and implemented abbreviated clinical practice guidelines (CG) for the staff of nurses and physicians in our hospital's emergency room. METHODS: Our algorithm is based upon the practices described in the international literature and uses a simultaneous approach of treating acute pain in a rapid and efficacious manner along with diagnostic and therapeutic procedures. RESULTS: Pain was assessed using either a visual analogue scale (VAS) or a numerical rating scale (NRS) at ER admission and again during the hospital stay. Patients were treated with paracetamol and/or NSAID (VAS/NRS <4) or intravenous morphine (VAS/NRS > or =04). The algorithm also outlines a specific approach for patients with headaches to minimise the risks inherent to a non-specific treatment. In addition, our algorithm addresses the treatment of paroxysmal pain in patients with chronic pain as well as acute pain in drug addicts. It also outlines measures for pain prevention prior to minor diagnostic or therapeutic procedures. CONCLUSIONS: Based on published guidelines, an abbreviated clinical algorithm (AA) was developed and its simple format permitted a widespread implementation. In contrast to international guidelines, our algorithm favours giving nursing staff responsibility for decision making aspects of pain assessment and treatment in emergency room patients.
Resumo:
This report provides an overview of the development and field testing of the S�_olta Quality Assurance Programme (QAP). It outlines the timeline, key roles and activities and draws upon evaluation data gathered at various stages of the action research and development process. It briefly describes the processes, tools, materials and the professional roles that have been developed to support implementation of the S�_olta QAP. It concludes with consideration of the context within which the S�_olta QAP will operate into the future and makes a set of recommendations to connect this research and development phase for S�_olta and the S�_olta QAP with national and international policy developments related to the improvement of the quality of early childhood care and education (ecce) in Ireland.
Resumo:
In an earlier investigation (Burger et al., 2000) five sediment cores near the RodriguesTriple Junction in the Indian Ocean were studied applying classical statistical methods(fuzzy c-means clustering, linear mixing model, principal component analysis) for theextraction of endmembers and evaluating the spatial and temporal variation ofgeochemical signals. Three main factors of sedimentation were expected by the marinegeologists: a volcano-genetic, a hydro-hydrothermal and an ultra-basic factor. Thedisplay of fuzzy membership values and/or factor scores versus depth providedconsistent results for two factors only; the ultra-basic component could not beidentified. The reason for this may be that only traditional statistical methods wereapplied, i.e. the untransformed components were used and the cosine-theta coefficient assimilarity measure.During the last decade considerable progress in compositional data analysis was madeand many case studies were published using new tools for exploratory analysis of thesedata. Therefore it makes sense to check if the application of suitable data transformations,reduction of the D-part simplex to two or three factors and visualinterpretation of the factor scores would lead to a revision of earlier results and toanswers to open questions . In this paper we follow the lines of a paper of R. Tolosana-Delgado et al. (2005) starting with a problem-oriented interpretation of the biplotscattergram, extracting compositional factors, ilr-transformation of the components andvisualization of the factor scores in a spatial context: The compositional factors will beplotted versus depth (time) of the core samples in order to facilitate the identification ofthe expected sources of the sedimentary process.Kew words: compositional data analysis, biplot, deep sea sediments
Resumo:
Many multivariate methods that are apparently distinct can be linked by introducing oneor more parameters in their definition. Methods that can be linked in this way arecorrespondence analysis, unweighted or weighted logratio analysis (the latter alsoknown as "spectral mapping"), nonsymmetric correspondence analysis, principalcomponent analysis (with and without logarithmic transformation of the data) andmultidimensional scaling. In this presentation I will show how several of thesemethods, which are frequently used in compositional data analysis, may be linkedthrough parametrizations such as power transformations, linear transformations andconvex linear combinations. Since the methods of interest here all lead to visual mapsof data, a "movie" can be made where where the linking parameter is allowed to vary insmall steps: the results are recalculated "frame by frame" and one can see the smoothchange from one method to another. Several of these "movies" will be shown, giving adeeper insight into the similarities and differences between these methods
Resumo:
Examples of compositional data. The simplex, a suitable sample space for compositional data and Aitchison's geometry. R, a free language and environment for statistical computing and graphics
Resumo:
Compositional data naturally arises from the scientific analysis of the chemicalcomposition of archaeological material such as ceramic and glass artefacts. Data of thistype can be explored using a variety of techniques, from standard multivariate methodssuch as principal components analysis and cluster analysis, to methods based upon theuse of log-ratios. The general aim is to identify groups of chemically similar artefactsthat could potentially be used to answer questions of provenance.This paper will demonstrate work in progress on the development of a documentedlibrary of methods, implemented using the statistical package R, for the analysis ofcompositional data. R is an open source package that makes available very powerfulstatistical facilities at no cost. We aim to show how, with the aid of statistical softwaresuch as R, traditional exploratory multivariate analysis can easily be used alongside, orin combination with, specialist techniques of compositional data analysis.The library has been developed from a core of basic R functionality, together withpurpose-written routines arising from our own research (for example that reported atCoDaWork'03). In addition, we have included other appropriate publicly availabletechniques and libraries that have been implemented in R by other authors. Availablefunctions range from standard multivariate techniques through to various approaches tolog-ratio analysis and zero replacement. We also discuss and demonstrate a smallselection of relatively new techniques that have hitherto been little-used inarchaeometric applications involving compositional data. The application of the libraryto the analysis of data arising in archaeometry will be demonstrated; results fromdifferent analyses will be compared; and the utility of the various methods discussed
Resumo:
”compositions” is a new R-package for the analysis of compositional and positive data.It contains four classes corresponding to the four different types of compositional andpositive geometry (including the Aitchison geometry). It provides means for computation,plotting and high-level multivariate statistical analysis in all four geometries.These geometries are treated in an fully analogous way, based on the principle of workingin coordinates, and the object-oriented programming paradigm of R. In this way,called functions automatically select the most appropriate type of analysis as a functionof the geometry. The graphical capabilities include ternary diagrams and tetrahedrons,various compositional plots (boxplots, barplots, piecharts) and extensive graphical toolsfor principal components. Afterwards, ortion and proportion lines, straight lines andellipses in all geometries can be added to plots. The package is accompanied by ahands-on-introduction, documentation for every function, demos of the graphical capabilitiesand plenty of usage examples. It allows direct and parallel computation inall four vector spaces and provides the beginner with a copy-and-paste style of dataanalysis, while letting advanced users keep the functionality and customizability theydemand of R, as well as all necessary tools to add own analysis routines. A completeexample is included in the appendix
Resumo:
We shall call an n × p data matrix fully-compositional if the rows sum to a constant, and sub-compositional if the variables are a subset of a fully-compositional data set1. Such data occur widely in archaeometry, where it is common to determine the chemical composition of ceramic, glass, metal or other artefacts using techniques such as neutron activation analysis (NAA), inductively coupled plasma spectroscopy (ICPS), X-ray fluorescence analysis (XRF) etc. Interest often centres on whether there are distinct chemical groups within the data and whether, for example, these can be associated with different origins or manufacturing technologies
Resumo:
R from http://www.r-project.org/ is ‘GNU S’ – a language and environment for statistical computingand graphics. The environment in which many classical and modern statistical techniques havebeen implemented, but many are supplied as packages. There are 8 standard packages and many moreare available through the cran family of Internet sites http://cran.r-project.org .We started to develop a library of functions in R to support the analysis of mixtures and our goal isa MixeR package for compositional data analysis that provides support foroperations on compositions: perturbation and power multiplication, subcomposition with or withoutresiduals, centering of the data, computing Aitchison’s, Euclidean, Bhattacharyya distances,compositional Kullback-Leibler divergence etc.graphical presentation of compositions in ternary diagrams and tetrahedrons with additional features:barycenter, geometric mean of the data set, the percentiles lines, marking and coloring ofsubsets of the data set, theirs geometric means, notation of individual data in the set . . .dealing with zeros and missing values in compositional data sets with R procedures for simpleand multiplicative replacement strategy,the time series analysis of compositional data.We’ll present the current status of MixeR development and illustrate its use on selected data sets
Resumo:
The low levels of unemployment recorded in the UK in recent years are widely cited asevidence of the country’s improved economic performance, and the apparent convergence of unemployment rates across the country’s regions used to suggest that the longstanding divide in living standards between the relatively prosperous ‘south’ and the more depressed ‘north’ has been substantially narrowed. Dissenters from theseconclusions have drawn attention to the greatly increased extent of non-employment(around a quarter of the UK’s working age population are not in employment) and themarked regional dimension in its distribution across the country. Amongst these dissenters it is generally agreed that non-employment is concentrated amongst oldermales previously employed in the now very much smaller ‘heavy’ industries (e.g. coal,steel, shipbuilding).This paper uses the tools of compositiona l data analysis to provide a much richer picture of non-employment and one which challenges the conventional analysis wisdom about UK labour market performance as well as the dissenters view of the nature of theproblem. It is shown that, associated with the striking ‘north/south’ divide in nonemployment rates, there is a statistically significant relationship between the size of the non-employment rate and the composition of non-employment. Specifically, it is shown that the share of unemployment in non-employment is negatively correlated with the overall non-employment rate: in regions where the non-employment rate is high the share of unemployment is relatively low. So the unemployment rate is not a very reliable indicator of regional disparities in labour market performance. Even more importantly from a policy viewpoint, a significant positive relationship is found between the size ofthe non-employment rate and the share of those not employed through reason of sicknessor disability and it seems (contrary to the dissenters) that this connection is just as strong for women as it is for men