183 results for Parallel or distributed processing
Abstract:
The proliferation of the web presents an unsolved problem of automatically analyzing billions of pages of natural language. We introduce a scalable algorithm that clusters hundreds of millions of web pages into hundreds of thousands of clusters on a single mid-range machine, using efficient algorithms and compressed document representations. It is applied to two web-scale crawls covering tens of terabytes: ClueWeb09 and ClueWeb12 contain 500 and 733 million web pages, respectively, and were clustered into 500,000 to 700,000 clusters. To the best of our knowledge, such fine-grained clustering has not been previously demonstrated. Previous approaches clustered a sample, which limits the maximum number of discoverable clusters. The proposed EM-tree algorithm uses the entire collection for clustering and produces several orders of magnitude more clusters than existing algorithms. Fine-grained clustering is necessary for meaningful clustering of massive collections, where the number of distinct topics grows linearly with collection size. These fine-grained clusters show improved cluster quality when assessed with two novel evaluations using ad hoc search relevance judgments and spam classifications for external validation. These evaluations solve the problem of assessing cluster quality where categorical labeling is unavailable or infeasible.
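The inner loop of clustering over compressed document representations can be sketched as follows. This is a minimal illustration assuming binary bit-vector signatures compared by Hamming distance, with hypothetical function names; it is not the authors' EM-tree implementation.

```python
import numpy as np

def hamming_assign(signatures, centers):
    # Nearest-center assignment over binary document signatures:
    # Hamming distance from every document to every cluster center.
    # signatures: (n_docs, n_bits) in {0, 1}; centers: (k, n_bits).
    dists = (signatures[:, None, :] != centers[None, :, :]).sum(axis=2)
    return dists.argmin(axis=1)

def update_centers(signatures, assign, k):
    # New center bit = majority vote over the assigned documents' bits.
    centers = np.zeros((k, signatures.shape[1]), dtype=np.uint8)
    for c in range(k):
        members = signatures[assign == c]
        if len(members):
            centers[c] = (members.mean(axis=0) >= 0.5).astype(np.uint8)
    return centers
```

Because signatures are small bit vectors, both steps stay memory- and cache-friendly, which is what makes a single-machine pass over hundreds of millions of documents plausible.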
Abstract:
Distributed Denial of Service (DDoS) attacks have become one of the biggest threats to resources on the Internet. The purpose of these attacks is to make servers deny services to legitimate users. These attacks are also used to consume network bandwidth. Current intrusion detection systems can only detect attacks; they cannot prevent them or track the location of intruders. Some schemes prevent attacks by simply discarding attack packets, which saves the victim from the attack, but network bandwidth is still wasted. In our opinion, DDoS requires a distributed solution to prevent this waste of resources. This paper presents a system that not only detects such attacks but also traces and blocks the multiple intruders (to save bandwidth as well) using Intelligent Software Agents. The system gives a dynamic response and can be integrated with existing network defense systems without disturbing the existing Internet model. We have implemented an agent-based network monitoring system for this purpose.
Abstract:
We study the rates of growth of the regret in online convex optimization. First, we show that a simple extension of the algorithm of Hazan et al. eliminates the need for a priori knowledge of the lower bound on the second derivatives of the observed functions. We then provide an algorithm, Adaptive Online Gradient Descent, which interpolates between the results of Zinkevich for linear functions and of Hazan et al. for strongly convex functions, achieving intermediate rates between √T and log T. Furthermore, we show strong optimality of the algorithm. Finally, we provide an extension of our results to general norms.
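The interpolation between the √T and log T regimes can be illustrated with a simplified sketch: plain projected online gradient descent whose step size uses a strong-convexity constant when one is available and falls back to a 1/√t schedule otherwise. This is a stand-in for the paper's Adaptive Online Gradient Descent, which instead tunes a per-round regularizer; the function and parameter names here are hypothetical.

```python
import numpy as np

def adaptive_ogd(rounds, x0, diameter=1.0):
    # rounds: sequence of (grad_fn, curvature) pairs, one per time step.
    # Step size eta_t = 1/(H t) when curvature H > 0 is known (log T
    # regret regime), else eta_t = diameter/sqrt(t) (sqrt T regime).
    x = np.asarray(x0, dtype=float)
    iterates = [x.copy()]
    for t, (grad_fn, curvature) in enumerate(rounds, start=1):
        eta = 1.0 / (curvature * t) if curvature > 0 else diameter / np.sqrt(t)
        x = x - eta * grad_fn(x)
        norm = np.linalg.norm(x)
        if norm > diameter:              # project back onto the feasible ball
            x = x * (diameter / norm)
        iterates.append(x.copy())
    return iterates
```

On a strongly convex loss the 1/(H t) schedule drives the iterate to the minimizer quickly; on merely convex losses the 1/√t schedule is the safe default.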
Abstract:
We consider the problem of prediction with expert advice in the setting where a forecaster is presented with several online prediction tasks. Instead of competing against the best expert separately on each task, we assume the tasks are related, and thus we expect that a few experts will perform well on the entire set of tasks. That is, our forecaster would like, on each task, to compete against the best expert chosen from a small set of experts. While we describe the "ideal" algorithm and its performance bound, we show that the computation required for this algorithm is as hard as computation of a matrix permanent. We present an efficient algorithm based on mixing priors, and prove a bound that is nearly as good for the sequential task presentation case. We also consider a harder case where the task may change arbitrarily from round to round, and we develop an efficient approximate randomized algorithm based on Markov chain Monte Carlo techniques.
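The basic building block underneath constructions like the mixing-priors forecaster is the exponentially weighted average over a fixed expert set. A minimal sketch (standard Hedge-style update, not the paper's multitask algorithm; names are illustrative):

```python
import numpy as np

def hedge(expert_losses, eta=0.5):
    # expert_losses: (T, n) array, losses in [0, 1] per round per expert.
    # Maintains a weight per expert; experts with low cumulative loss
    # come to dominate the mixture exponentially fast.
    n = expert_losses.shape[1]
    w = np.ones(n) / n
    total = 0.0
    for losses in expert_losses:
        total += float(w @ losses)       # forecaster's expected loss this round
        w = w * np.exp(-eta * losses)    # exponential reweighting
        w /= w.sum()
    return total, w
```

The multitask setting in the abstract effectively runs such a forecaster against structured priors over which small subset of experts is shared across tasks, which is where the permanent-hardness and the MCMC approximation enter.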
Abstract:
Unusual event detection in crowded scenes remains challenging because of the diversity of events and noise. In this paper, we present a novel approach for unusual event detection via sparse reconstruction of dynamic textures over an overcomplete basis set, with the dynamic texture described by local binary patterns from three orthogonal planes (LBP-TOP). The overcomplete basis set is learnt from the training data, where only normal items are observed. In the detection process, given a new observation, we compute the sparse coefficients using the Dantzig Selector algorithm, which was proposed in the literature of compressed sensing. Then the reconstruction errors are computed, based on which we detect the abnormal items. Our approach can be used to detect both local and global abnormal events. We evaluate our algorithm on the UCSD Abnormality Datasets for local anomaly detection, where it is shown to outperform current state-of-the-art approaches, and we also obtain promising results for rapid escape detection using the PETS2009 dataset.
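The detection pipeline — sparse-code a new observation against a dictionary learnt from normal data, then score it by reconstruction error — can be sketched as below. For brevity this uses iterative soft-thresholding (ISTA) as the l1 sparse-coding solver in place of the Dantzig Selector named in the abstract; all names are illustrative.

```python
import numpy as np

def ista(D, y, lam=0.1, n_iter=200):
    # Iterative soft-thresholding for min_a 0.5*||y - D a||^2 + lam*||a||_1.
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = D.T @ (D @ a - y)            # gradient of the quadratic term
        z = a - g / L
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return a

def anomaly_score(D, y, lam=0.1):
    # Reconstruction error of y under the normal-data dictionary D:
    # a large error means y is poorly explained by normal patterns.
    a = ista(D, y, lam)
    return np.linalg.norm(y - D @ a)
```

Observations well inside the span of normal atoms reconstruct almost exactly, while events unlike anything in training keep a large residual, which is the signal thresholded for detection.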
Abstract:
The growing importance of data processing for information extraction is vital for Web databases. Due to the sheer size and volume of these databases, retrieving the information relevant to a user's needs has become a cumbersome process. Information seekers are faced with information overload: too many results are returned for their queries. Conversely, too few or no results are returned when a very specific query is asked. This paper proposes a ranking algorithm that gives higher preference to a user's current search and also utilizes profile information in order to obtain relevant results for the user's query.
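One simple way to realize such a preference is a blended score that weights overlap with the current query more heavily than overlap with the stored profile. This is a hypothetical sketch, not the paper's actual formula; the weighting parameter `alpha` and all names are assumptions.

```python
def rank_results(results, query_terms, profile_terms, alpha=0.7):
    # results: dict mapping result name -> set of its terms.
    # Score = alpha * query overlap + (1 - alpha) * profile overlap,
    # so the current search dominates but the profile breaks ties.
    def score(doc_terms):
        q = len(doc_terms & query_terms) / max(len(query_terms), 1)
        p = len(doc_terms & profile_terms) / max(len(profile_terms), 1)
        return alpha * q + (1 - alpha) * p
    return sorted(results, key=lambda name: score(results[name]), reverse=True)
```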
Abstract:
Cockatoos are the distinctive family Cacatuidae, a major lineage of the order of parrots (Psittaciformes) and distributed throughout the Australasian region of the world. However, the evolutionary history of cockatoos is not well understood. We investigated the phylogeny of cockatoos based on three mitochondrial and three nuclear DNA genes obtained from 16 of 21 species of Cacatuidae. In addition, five novel mitochondrial genomes were used to estimate time of divergence and our estimates indicate Cacatuidae diverged from Psittacidae approximately 40.7 million years ago (95% CI 51.6–30.3 Ma) during the Eocene. Our data shows Cacatuidae began to diversify approximately 27.9 Ma (95% CI 38.1–18.3 Ma) during the Oligocene. The early to middle Miocene (20–10 Ma) was a significant period in the evolution of modern Australian environments and vegetation, in which a transformation from mainly mesic to xeric habitats (e.g., fire-adapted sclerophyll vegetation and grasslands) occurred. We hypothesize that this environmental transformation was a driving force behind the diversification of cockatoos. A detailed multi-locus molecular phylogeny enabled us to resolve the phylogenetic placements of the Palm Cockatoo (Probosciger aterrimus), Galah (Eolophus roseicapillus), Gang-gang Cockatoo (Callocephalon fimbriatum) and Cockatiel (Nymphicus hollandicus), which have historically been difficult to place within Cacatuidae. When the molecular evidence is analysed in concert with morphology, it is clear that many of the cockatoo species’ diagnostic phenotypic traits such as plumage colour, body size, wing shape and bill morphology have evolved in parallel or convergently across lineages.
Abstract:
Iterative Intersectioning is a body of art works that comes out of the collaboration between the author, electronic artist Jen Seevinck, and a community of print artists, most particularly Elizabeth Saunders (EJ) and Robert Oakman. The work shown here is concerned with the creative process of collaboration, specifically as this informs visual forms, through a focus on process. This process has facilitated a 'conversational' exchange between all artists and a corresponding evolution in the artworks. In each case the dialogue is either between the author, Jen, and EJ or between Jen and Robert. It consists of passing work between parties, interpreting it and working into it, before passing it back. The result is a series of art works including those shown here. The concept evolves in parallel to this. Importantly, at each of her iterations of creative work, the author Jen determines a similar 'treatment' or 'interpretation' across both print artists' works at that time. A synthesis of EJ's and Robert's creative interpretations -- at a high level -- occurs. In this sense the concept and works can be understood to intersect with one another.
Abstract:
Blood metaphors abound in everyday social discourse among both Aboriginal and non-Aboriginal people. However, ‘Aboriginal blood talk’, more specifically, is located within a contradictory and contested space in terms of the meanings and values that can be attributed to it by Aboriginal and non-Aboriginal people. In the colonial context, blood talk operated as a tool of oppression for Aboriginal people via blood quantum discourses, yet today, Aboriginal people draw upon notions of blood, namely bloodlines, in articulating their identities. This paper juxtaposes contemporary Aboriginal blood talk as expressed by Aboriginal people against colonial blood talk and critically examines the ongoing political and intellectual governance regarding the validity of this talk in articulating Aboriginalities.
Abstract:
This paper describes a concept for supporting distributed hands-on collaboration through interaction design for the physical and the digital workspace. The Blended Interaction Spaces concept creates distributed work environments in which collaborating parties all feel that they are present “here” rather than “there”. We describe the thinking and inspirations behind the Blended Interaction Spaces concept, and summarize findings from the fieldwork activities informing our design. We then exemplify the Blended Interaction Spaces concept through a prototype implementation of one of four concepts.
Abstract:
Classic identity negative priming (NP) refers to the finding that when an object is ignored, subsequent naming responses to it are slower than when it has not been previously ignored (Tipper, S.P., 1985. The negative priming effect: inhibitory priming by ignored objects. Q. J. Exp. Psychol. 37A, 571-590). It is unclear whether this phenomenon arises due to the involvement of abstract semantic representations that the ignored object accesses automatically. Contemporary connectionist models propose a key role for the anterior temporal cortex in the representation of abstract semantic knowledge (e.g., McClelland, J.L., Rogers, T.T., 2003. The parallel distributed processing approach to semantic cognition. Nat. Rev. Neurosci. 4, 310-322), suggesting that this region should be involved during performance of the classic identity NP task if it involves semantic access. Using high-field (4 T) event-related functional magnetic resonance imaging, we observed increased BOLD responses in the left anterolateral temporal cortex including the temporal pole that was directly related to the magnitude of each individual's NP effect, supporting a semantic locus. Additional signal increases were observed in the supplementary eye fields (SEF) and left inferior parietal lobule (IPL).
Abstract:
Solving large-scale all-to-all comparison problems using distributed computing is increasingly significant for various applications. Previous efforts to implement distributed all-to-all comparison frameworks have treated the two phases of data distribution and comparison task scheduling separately. This leads to high storage demands as well as poor data locality for the comparison tasks, thus creating a need to redistribute the data at runtime. Furthermore, most previous methods have been developed for homogeneous computing environments, so their overall performance is degraded even further when they are used in heterogeneous distributed systems. To tackle these challenges, this paper presents a data-aware task scheduling approach for solving all-to-all comparison problems in heterogeneous distributed systems. The approach formulates the requirements for data distribution and comparison task scheduling simultaneously as a constrained optimization problem. Then, metaheuristic data pre-scheduling and dynamic task scheduling strategies are developed along with an algorithmic implementation to solve the problem. The approach provides perfect data locality for all comparison tasks, avoiding rearrangement of data at runtime. It achieves load balancing among heterogeneous computing nodes, thus reducing the overall computation time. It also reduces data storage requirements across the network. The effectiveness of the approach is demonstrated through experimental studies.
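The interplay of data locality and heterogeneous node speeds can be illustrated with a toy greedy scheduler: each comparison task (i, j) goes to the node that would finish it earliest, where a missing input adds transfer work, so nodes already holding both inputs are naturally preferred. This is a simple earliest-finish-time heuristic for illustration only, not the paper's metaheuristic pre-scheduling; all names are hypothetical.

```python
def greedy_schedule(tasks, node_speed, node_data, transfer_cost=1.0):
    # tasks: dict mapping (i, j) -> comparison cost.
    # node_speed: dict node -> relative speed (higher = faster).
    # node_data: dict node -> set of data items currently held.
    finish = {n: 0.0 for n in node_speed}
    assignment = {}
    for (i, j), cost in tasks.items():
        def eta(n, i=i, j=j, cost=cost):
            # Estimated finish time: queued work plus this task,
            # inflated by one transfer per input the node lacks.
            missing = sum(1 for item in (i, j) if item not in node_data[n])
            return finish[n] + (cost + transfer_cost * missing) / node_speed[n]
        best = min(node_speed, key=eta)
        finish[best] = eta(best)
        assignment[(i, j)] = best
        node_data[best].update((i, j))   # inputs are now cached on that node
    return assignment, max(finish.values())
```

Unlike the paper's approach, this heuristic can still trigger runtime transfers; the paper's constrained-optimization formulation places data up front so that every task finds both inputs local.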
Abstract:
This paper describes a series of design games, specifically aimed at exploring shifts in human agency, how they are managed, and the impact this will have on the design of future context-aware applications. The games focussed on understanding information handling issues in dental practice, with participants from the University of Queensland Dental School playing an active role in the activities. The participatory design activities reveal how technology solutions impact on dental practices. By finding methods of representing technological possibilities in ways which can easily be understood, we enhance the contribution that dentists can make to the design process.