4 resultados para Web access characterization and modeling
em Boston University Digital Common
Resumo:
Temporal locality of reference in Web request streams emerges from two distinct phenomena: the popularity of Web objects and the {\em temporal correlation} of requests. Capturing these two elements of temporal locality is important because it enables cache replacement policies to adjust how they capitalize on temporal locality based on the relative prevalence of these phenomena. In this paper, we show that temporal locality metrics proposed in the literature are unable to delineate between these two sources of temporal locality. In particular, we show that the commonly-used distribution of reference interarrival times is predominantly determined by the power law governing the popularity of documents in a request stream. To capture (and more importantly quantify) both sources of temporal locality in a request stream, we propose a new and robust metric that enables accurate delineation between locality due to popularity and that due to temporal correlation. Using this metric, we characterize the locality of reference in a number of representative proxy cache traces. Our findings show that there are measurable differences between the degrees (and sources) of temporal locality across these traces, and that these differences are effectively captured using our proposed metric. We illustrate the significance of our findings by summarizing the performance of a novel Web cache replacement policy---called GreedyDual*---which exploits both long-term popularity and short-term temporal correlation in an adaptive fashion. Our trace-driven simulation experiments (which are detailed in an accompanying Technical Report) show the superior performance of GreedyDual* when compared to other Web cache replacement policies.
Resumo:
We present a thorough characterization of the access patterns in blogspace, which comprises a rich interconnected web of blog postings and comments by an increasingly prominent user community that collectively define what has become known as the blogosphere. Our characterization of over 35 million read, write, and management requests spanning a 28-day period is done at three different levels. The user view characterizes how individual users interact with blogosphere objects (blogs); the object view characterizes how individual blogs are accessed; the server view characterizes the aggregate access patterns of all users to all blogs. The more-interactive nature of the blogosphere leads to interesting traffic and communication patterns, which are different from those observed for traditional web content. We identify and characterize novel features of the blogosphere workload, and we show the similarities and differences between typical web server workloads and blogosphere server workloads. Finally, based on our main characterization results, we build a new synthetic blogosphere workload generator called GBLOT, which aims at mimicking closely a stream of requests originating from a population of blog users. Given the increasing share of blogspace traffic, realistic workload models and tools are important for capacity planning and traffic engineering purposes.
Resumo:
Web caching aims to reduce network traffic, server load, and user-perceived retrieval delays by replicating "popular" content on proxy caches that are strategically placed within the network. While key to effective cache utilization, popularity information (e.g. relative access frequencies of objects requested through a proxy) is seldom incorporated directly in cache replacement algorithms. Rather, other properties of the request stream (e.g. temporal locality and content size), which are easier to capture in an on-line fashion, are used to indirectly infer popularity information, and hence drive cache replacement policies. Recent studies suggest that the correlation between these secondary properties and popularity is weakening due in part to the prevalence of efficient client and proxy caches (which tend to mask these correlations). This trend points to the need for proxy cache replacement algorithms that directly capture and use popularity information. In this paper, we (1) present an on-line algorithm that effectively captures and maintains an accurate popularity profile of Web objects requested through a caching proxy, (2) propose a novel cache replacement policy that uses such information to generalize the well-known GreedyDual-Size algorithm, and (3) show the superiority of our proposed algorithm by comparing it to a host of recently-proposed and widely-used algorithms using extensive trace-driven simulations and a variety of performance metrics.
Resumo:
A growing wave of behavioral studies, using a wide variety of paradigms that were introduced or greatly refined in recent years, has generated a new wealth of parametric observations about serial order behavior. What was a mere trickle of neurophysiological studies has grown to a more steady stream of probes of neural sites and mechanisms underlying sequential behavior. Moreover, simulation models of serial behavior generation have begun to open a channel to link cellular dynamics with cognitive and behavioral dynamics. Here we summarize the major results from prominent sequence learning and performance tasks, namely immediate serial recall, typing, 2XN, discrete sequence production, and serial reaction time. These populate a continuum from higher to lower degrees of internal control of sequential organization. The main movement classes covered are speech and keypressing, both involving small amplitude movements that are very amenable to parametric study. A brief synopsis of classes of serial order models, vis-à-vis the detailing of major effects found in the behavioral data, leads to a focus on competitive queuing (CQ) models. Recently, the many behavioral predictive successes of CQ models have been joined by successful prediction of distinctively patterend electrophysiological recordings in prefrontal cortex, wherein parallel activation dynamics of multiple neural ensembles strikingly matches the parallel dynamics predicted by CQ theory. An extended CQ simulation model-the N-STREAMS neural network model-is then examined to highlight issues in ongoing attemptes to accomodate a broader range of behavioral and neurophysiological data within a CQ-consistent theory. Important contemporary issues such as the nature of working memory representations for sequential behavior, and the development and role of chunks in hierarchial control are prominent throughout.