Biblioteca Digital

879 resultados para Security of data

Rapid preconditioning of data for accelerating convex hull algorithms

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Given a dataset of two-dimensional points in the plane with integer coordinates, the method proposed reduces a set of n points down to a set of s points s ≤ n, such that the convex hull on the set of s points is the same as the convex hull of the original set of n points. The method is O(n). It helps any convex hull algorithm run faster. The empirical analysis of a practical case shows a percentage reduction in points of over 98%, that is reflected as a faster computation with a speedup factor of at least 4.

A survey of data mining techniques for social media analysis

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Social network has gained remarkable attention in the last decade. Accessing social network sites such as Twitter, Facebook LinkedIn and Google+ through the internet and the web 2.0 technologies has become more affordable. People are becoming more interested in and relying on social network for information, news and opinion of other users on diverse subject matters. The heavy reliance on social network sites causes them to generate massive data characterised by three computational issues namely; size, noise and dynamism. These issues often make social network data very complex to analyse manually, resulting in the pertinent use of computational means of analysing them. Data mining provides a wide range of techniques for detecting useful knowledge from massive datasets like trends, patterns and rules [44]. Data mining techniques are used for information retrieval, statistical modelling and machine learning. These techniques employ data pre-processing, data analysis, and data interpretation processes in the course of data analysis. This survey discusses different data mining techniques used in mining diverse aspects of the social network over decades going from the historical techniques to the up-to-date models, including our novel technique named TRCM. All the techniques covered in this survey are listed in the Table.1 including the tools employed as well as names of their authors.

Research data management and openness: the role of data sharing in developing institutional policies and practices

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Purpose: To investigate the relationship between research data management (RDM) and data sharing in the formulation of RDM policies and development of practices in higher education institutions (HEIs). Design/methodology/approach: Two strands of work were undertaken sequentially: firstly, content analysis of 37 RDM policies from UK HEIs; secondly, two detailed case studies of institutions with different approaches to RDM based on semi-structured interviews with staff involved in the development of RDM policy and services. The data are interpreted using insights from Actor Network Theory. Findings: RDM policy formation and service development has created a complex set of networks within and beyond institutions involving different professional groups with widely varying priorities shaping activities. Data sharing is considered an important activity in the policies and services of HEIs studied, but its prominence can in most cases be attributed to the positions adopted by large research funders. Research limitations/implications: The case studies, as research based on qualitative data, cannot be assumed to be universally applicable but do illustrate a variety of issues and challenges experienced more generally, particularly in the UK. Practical implications: The research may help to inform development of policy and practice in RDM in HEIs and funder organisations. Originality/value: This paper makes an early contribution to the RDM literature on the specific topic of the relationship between RDM policy and services, and openness – a topic which to date has received limited attention.

Application of data assimilation to ocean and climate prediction

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Ocean prediction systems are now able to analyse and predict temperature, salinity and velocity structures within the ocean by assimilating measurements of the ocean’s temperature and salinity into physically based ocean models. Data assimilation combines current estimates of state variables, such as temperature and salinity, from a computational model with measurements of the ocean and atmosphere in order to improve forecasts and reduce uncertainty in the forecast accuracy. Data assimilation generally works well with ocean models away from the equator but has been found to induce vigorous and unrealistic overturning circulations near the equator. A pressure correction method was developed at the University of Reading and the Met Office to control these circulations using ideas from control theory and an understanding of equatorial dynamics. The method has been used for the last 10 years in seasonal forecasting and ocean prediction systems at the Met Office and European Center for Medium-range Weather Forecasting (ECMWF). It has been an important element in recent re-analyses of the ocean heat uptake that mitigates climate change.

Parametric analysis of discrepant sets of data

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper a new parametric method to deal with discrepant experimental results is developed. The method is based on the fit of a probability density function to the data. This paper also compares the characteristics of different methods used to deduce recommended values and uncertainties from a discrepant set of experimental data. The methods are applied to the (137)Cs and (90)Sr published half-lives and special emphasis is given to the deduced confidence intervals. The obtained results are analyzed considering two fundamental properties expected from an experimental result: the probability content of confidence intervals and the statistical consistency between different recommended values. The recommended values and uncertainties for the (137)Cs and (90)Sr half-lives are 10,984 (24) days and 10,523 (70) days, respectively. (C) 2009 Elsevier B.V. All rights reserved.

On the efficiency of data representation on the modeling and characterization of complex networks

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Specific choices about how to represent complex networks can have a substantial impact on the execution time required for the respective construction and analysis of those structures. In this work we report a comparison of the effects of representing complex networks statically by adjacency matrices or dynamically by adjacency lists. Three theoretical models of complex networks are considered: two types of Erdos-Renyi as well as the Barabasi-Albert model. We investigated the effect of the different representations with respect to the construction and measurement of several topological properties (i.e. degree, clustering coefficient, shortest path length, and betweenness centrality). We found that different forms of representation generally have a substantial effect on the execution time, with the sparse representation frequently resulting in remarkably superior performance. (C) 2011 Elsevier B.V. All rights reserved.

Modelling the Provenance of Data in Autonomous Systems

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Determining the provenance of data, i.e. the process that led to that data, is vital in many disciplines. For example, in science, the process that produced a given result must be demonstrably rigorous for the result to be deemed reliable. A provenance system supports applications in recording adequate documentation about process executions to answer queries regarding provenance, and provides functionality to perform those queries. Several provenance systems are being developed, but all focus on systems in which the components are textitreactive, for example Web Services that act on the basis of a request, job submission system, etc. This limitation means that questions regarding the motives of autonomous actors, or textitagents, in such systems remain unanswerable in the general case. Such questions include: who was ultimately responsible for a given effect, what was their reason for initiating the process and does the effect of a process match what was intended to occur by those initiating the process? In this paper, we address this limitation by integrating two solutions: a generic, re-usable framework for representing the provenance of data in service-oriented architectures and a model for describing the goal-oriented delegation and engagement of agents in multi-agent systems. Using these solutions, we present algorithms to answer common questions regarding responsibility and success of a process and evaluate the approach with a simulated healthcare example.

HIV soroprevalence rates of Brazilian drug users : an analysis of selected variables based on ten years of data collection

Relevância:

100.00% 100.00%

Publicador:

Factors affecting the technical efficiency of production of the Brazilian banking system : a comparison of four statistical models in the context of data envelopment analysis

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper uses an output oriented Data Envelopment Analysis (DEA) measure of technical efficiency to assess the technical efficiencies of the Brazilian banking system. Four approaches to estimation are compared in order to assess the significance of factors affecting inefficiency. These are nonparametric Analysis of Covariance, maximum likelihood using a family of exponential distributions, maximum likelihood using a family of truncated normal distributions, and the normal Tobit model. The sole focus of the paper is on a combined measure of output and the data analyzed refers to the year 2001. The factors of interest in the analysis and likely to affect efficiency are bank nature (multiple and commercial), bank type (credit, business, bursary and retail), bank size (large, medium, small and micro), bank control (private and public), bank origin (domestic and foreign), and non-performing loans. The latter is a measure of bank risk. All quantitative variables, including non-performing loans, are measured on a per employee basis. The best fits to the data are provided by the exponential family and the nonparametric Analysis of Covariance. The significance of a factor however varies according to the model fit although it can be said that there is some agreements between the best models. A highly significant association in all models fitted is observed only for nonperforming loans. The nonparametric Analysis of Covariance is more consistent with the inefficiency median responses observed for the qualitative factors. The findings of the analysis reinforce the significant association of the level of bank inefficiency, measured by DEA residuals, with the risk of bank failure.

Information mining and visualization of data from the Brazilian Supreme Court (STF): a case study

Relevância:

100.00% 100.00%

Publicador:

Resumo:

EMAp - Escola de Matemática Aplicada

The onset of data-driven mental archeology

Relevância:

100.00% 100.00%

Publicador:

Influence of data structure on the estimation of the additive genetic direct and maternal covariance for early growth traits in Nellore cattle

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The objective of the present study was to investigate the effect of data structure on estimated genetic parameters and predicted breeding values of direct and maternal genetic effects for weaning weight (WW) and weight gain from birth to weaning (BWG), including or not the genetic covariance between direct and maternal effects. Records of 97,490 Nellore animals born between 1993 and 2006, from the Jacarezinho cattle raising farm, were used. Two different data sets were analyzed: DI_all, which included all available progenies of dams without their own performance; DII_all, which included DI_all + 20% of recorded progenies with maternal phenotypes. Two subsets were obtained from each data set (DI_all and DII_all): DI_1 and DII_1, which included only dams with three or fewer progenies; DI_5 and DII_5, which included only dams with five or more progenies. (Co)variance components and heritabilities were estimated by Bayesian inference through Gibbs sampling using univariate animal models. In general, for the population and traits studied, the proportion of dams with known phenotypic information and the number of progenies per dam influenced direct and maternal heritabilities, as well as the contribution of maternal permanent environmental variance to phenotypic variance. Only small differences were observed in the genetic and environmental parameters when the genetic covariance between direct and maternal effects was set to zero in the data sets studied. Thus, the inclusion or not of the genetic covariance between direct and maternal effects had little effect on the ranking of animals according to their breeding values for WW and BWG. Accurate estimation of genetic correlations between direct and maternal genetic effects depends on the data structure. Thus, this covariance should be set to zero in Nellore data sets in which the proportion of dams with phenotypic information is low, the number of progenies per dam is small, and pedigree relationships are poorly known. (c) 2012 Elsevier B.V. All rights reserved.

Teaching and Learning of Data Structures Supported by Computer: An Experience with the CADILAG tool

Relevância:

100.00% 100.00%

Publicador:

Simulations under Ideal and Nonideal Conditions for Characterization of a Passive Doppler Geographical Location System Using Extension of Data Reception Network

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Search for associated production of charginos and neutralinos in the trilepton final state using 2.3 fb(-1) of data

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

«
1
2
...
6
7
8
9
10
11
12
...
58
59
»