999 results for Zipf law
Abstract:
Temporal locality of reference in Web request streams emerges from two distinct phenomena: the popularity of Web objects and the temporal correlation of requests. Capturing these two elements of temporal locality is important because it enables cache replacement policies to adjust how they capitalize on temporal locality based on the relative prevalence of these phenomena. In this paper, we show that temporal locality metrics proposed in the literature are unable to delineate between these two sources of temporal locality. In particular, we show that the commonly used distribution of reference interarrival times is predominantly determined by the power law governing the popularity of documents in a request stream. To capture (and, more importantly, quantify) both sources of temporal locality in a request stream, we propose a new and robust metric that enables accurate delineation between locality due to popularity and locality due to temporal correlation. Using this metric, we characterize the locality of reference in a number of representative proxy cache traces. Our findings show that there are measurable differences between the degrees (and sources) of temporal locality across these traces, and that these differences are effectively captured by our proposed metric. We illustrate the significance of our findings by summarizing the performance of a novel Web cache replacement policy, called GreedyDual*, which exploits both long-term popularity and short-term temporal correlation in an adaptive fashion. Our trace-driven simulation experiments (which are detailed in an accompanying Technical Report) show the superior performance of GreedyDual* when compared to other Web cache replacement policies.
Abstract:
Power law distributions, also known as heavy-tailed distributions, model distinct real-life phenomena in the areas of biology, demography, computer science, economics, information theory, language, and astronomy, amongst others. In this paper, we present a review of the literature with a view to applications and possible explanations for the use of power laws in real phenomena. We also unravel some controversies around power laws.
Abstract:
In Phys. Rev. Letters (73:2), Mantegna et al. conclude on the basis of Zipf rank frequency data that noncoding DNA sequence regions are more like natural languages than coding regions. We argue on the contrary that an empirical fit to Zipf's "law" cannot be used as a criterion for similarity to natural languages. Although DNA is presumably an "organized system of signs" in Mandelbrot's (1961) sense, the observation of statistical features of the sort presented in the Mantegna et al. paper does not shed light on the similarity between DNA's "grammar" and natural language grammars, just as the observation of exact Zipf-like behavior cannot distinguish between the underlying processes of tossing an M-sided die and a finite-state branching process.
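The die-tossing point can be illustrated with a small simulation. The sketch below is not taken from the paper: the alphabet size, sample length, seed, and function names are all illustrative assumptions. It tosses a fair die whose faces are a few letters plus a space, splits the resulting string into "words", and fits a straight line to the log-log rank-frequency data. The fitted slope is clearly negative and roughly linear in log-log space, even though the text has no grammar at all.

```python
import math
import random
from collections import Counter


def random_die_text(n_symbols, m_letters=4, seed=0):
    """Toss an (m_letters + 1)-sided die n_symbols times.

    Each face is equally likely; one face is the space that ends a "word".
    """
    rng = random.Random(seed)
    alphabet = "abcdefghijklmnopqrstuvwxyz"[:m_letters] + " "
    return "".join(rng.choice(alphabet) for _ in range(n_symbols))


def rank_frequency(text):
    """Return word counts sorted from most to least frequent."""
    counts = Counter(text.split())
    return sorted(counts.values(), reverse=True)


def loglog_slope(freqs, max_rank=200):
    """Least-squares slope of log(frequency) against log(rank)."""
    points = [(math.log(rank), math.log(freq))
              for rank, freq in enumerate(freqs[:max_rank], start=1)]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


# Hypothetical parameters chosen only to keep the run short.
freqs = rank_frequency(random_die_text(200_000))
slope = loglog_slope(freqs)
print(f"distinct words: {len(freqs)}, log-log slope: {slope:.2f}")
```

Because equally long words are equally likely under this model, the rank-frequency curve is a staircase rather than a smooth line, so the fitted slope is only a rough summary.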
Abstract:
Power laws, also known as Pareto-like laws or Zipf-like laws, are commonly used to explain a variety of distinct real-world phenomena, often described merely by the signals they produce. In this paper, we study twelve cases, namely worldwide technological accidents, the annual revenue of America's largest private companies, the number of inhabitants in America's largest cities, the magnitude of earthquakes with minimum moment magnitude equal to 4, the total burned area in forest fires that occurred in Portugal, the net worth of the richest people in America, the frequency of occurrence of words in the novel Ulysses, by James Joyce, the total number of deaths in worldwide terrorist attacks, the number of linking root domains of the top internet domains, the number of linking root domains of the top internet pages, the total number of human victims of tornadoes that occurred in the U.S., and the number of inhabitants in the 60 most populated countries. The results demonstrate the emergence of statistical characteristics very close to power-law behavior. Furthermore, the parametric characterization reveals complex relationships present at a higher level of description.
Abstract:
The number of citations of a scientific publication or of an individual scientist has become an important factor of quality assessment in science. We report a study of the statistical distribution of the citation index of both scientific publications and scientists. We give numerical evidence that Tsallis (power law) statistics explains the entire distribution over eight orders of magnitude (10^-4 to 10^4). Also, we draw Zipf plots in order to analyze the statistical distribution of the citation index of Brazilian and international physicists and chemists. The relatively small group of Brazilian scientists seems better suited to explaining the dynamics of the citation index. In this case, we find that the distribution of the citation index can also be explained by a gradually truncated power law with similar parameters. We finally discuss possible mechanisms behind the citation index of scientists and scientific publications.
Abstract:
When the probability of measuring a particular value of some quantity varies inversely as a power of that value, the quantity is said to follow a power law, also known as Zipf's law or the Pareto distribution. The main objective of this thesis is to verify whether the extended sample of firms follows a power law (and, if so, within what limits). To this end, the data are arranged as a one-mode network, whose structural macro-properties are studied both at the overall level and with reference to the largest components (the individual distinct subnets). Closer analyses are then carried out on the fine structure of selected subnets, aimed essentially at highlighting the power of a network-based approach, also with a view to revealing relevant hidden properties of the underlying economic system, always, of course, within the limits of the modeling adopted. In short, what this work aims to achieve is the development of an alternative approach to handling big data with an intrinsic relational component (in this case equity holdings), towards their conversion into "big knowledge": starting from a set of cognitively inaccessible data and, by structuring the information as a network, arriving at knowledge that is sufficiently clear and justified.
Abstract:
Tropical cyclones are a continuing threat to life and property. Willoughby (2012) found that a Pareto (power-law) cumulative distribution fitted to the most damaging 10% of US hurricane seasons described their impacts well. Here, we find that damage follows a Pareto distribution because the assets at hazard follow a Zipf distribution, which can be thought of as a Pareto distribution with exponent 1. The Z-CAT model is an idealized hurricane catastrophe model that represents a coastline where populated places with Zipf-distributed assets are randomly scattered and damaged by virtual hurricanes with sizes and intensities generated through a Monte Carlo process. Results produce realistic Pareto exponents. The ability of the Z-CAT model to simulate different climate scenarios allowed testing of sensitivities to Maximum Potential Intensity, landfall rates and building structure vulnerability. The Z-CAT model results demonstrate that a statistically significant difference in damage is found only when changes in the parameters create a doubling of damage.
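The remark that a Zipf distribution can be thought of as a Pareto distribution with exponent 1 follows from a short rank-size argument. The notation below (N places, asset values x_r ordered by rank r, a constant C) is introduced here for illustration only and is not taken from the paper.

```latex
% Zipf's law in rank-size form: the r-th largest of N places holds assets
x_r = \frac{C}{r}, \qquad r = 1, \dots, N .
% Exactly r places hold assets of at least x_r, so the empirical
% complementary cumulative distribution is
P(X \ge x_r) = \frac{r}{N} = \frac{C}{N}\, x_r^{-1},
% which is the Pareto tail P(X \ge x) \propto x^{-\alpha} with \alpha = 1.
```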
Abstract:
The Queensland University of Technology (QUT) University Academic Board approved a new QUT Assessment Policy in September 2003, which requires a criterion-referenced approach as opposed to a norm-referenced approach to assessment across the university (QUT, MOPP, 2003). In 2004, the QUT Law School embarked upon a process of awareness-raising about criterion-referenced assessment amongst staff and from 2004 to 2005 staggered the implementation of criterion-referenced assessment in all first-year core undergraduate law units. This paper will briefly discuss the benefits and potential pitfalls of criterion-referenced assessment and the context for implementing it in the first-year law program, report on students' feedback on the introduction of criterion-referenced assessment, and describe the strategies adopted in 2005 to engage students more fully in criterion-referenced assessment processes to enhance their learning outcomes.