859 resultados para Parallel projection
Resumo:
The proliferation of the web presents an unsolved problem of automatically analyzing billions of pages of natural language. We introduce a scalable algorithm that clusters hundreds of millions of web pages into hundreds of thousands of clusters. It does this on a single mid-range machine using efficient algorithms and compressed document representations. It is applied to two web-scale crawls covering tens of terabytes. ClueWeb09 and ClueWeb12 contain 500 and 733 million web pages and were clustered into 500,000 to 700,000 clusters. To the best of our knowledge, such fine grained clustering has not been previously demonstrated. Previous approaches clustered a sample that limits the maximum number of discoverable clusters. The proposed EM-tree algorithm uses the entire collection in clustering and produces several orders of magnitude more clusters than the existing algorithms. Fine grained clustering is necessary for meaningful clustering in massive collections where the number of distinct topics grows linearly with collection size. These fine-grained clusters show an improved cluster quality when assessed with two novel evaluations using ad hoc search relevance judgments and spam classifications for external validation. These evaluations solve the problem of assessing the quality of clusters where categorical labeling is unavailable and unfeasible.
Resumo:
The unsteady incompressible viscous fluid flow between two parallel infinite disks which are located at a distance h(t*) at time t* has been studied. The upper disk moves towards the lower disk with velocity h'(t*). The lower disk is porous and rotates with angular velocity Omega(t*). A magnetic field B(t*) is applied perpendicular to the two disks. It has been found that the governing Navier-Stokes equations reduce to a set of ordinary differential equations if h(t*), a(t*) and B(t*) vary with time t* in a particular manner, i.e. h(t*) = H(1 - alpha t*)(1/2), Omega(t*) = Omega(0)(1 - alpha t*)(-1), B(t*) = B-0(1 - alpha t*)(-1/2). These ordinary differential equations have been solved numerically using a shooting method. For small Reynolds numbers, analytical solutions have been obtained using a regular perturbation technique. The effects of squeeze Reynolds numbers, Hartmann number and rotation of the disk on the flow pattern, normal force or load and torque have been studied in detail
Diffraction Of Elastic Waves By Two Parallel Rigid Strips Embedded In An Infinite Orthotropic Medium
Resumo:
The elastodynamic response of a pair of parallel rigid strips embedded in an infinite orthotropic medium due to elastic waves incident normally on the strips has been investigated. The mixed boundary value problem has been solved by the Integral Equation method. The normal stress and the vertical displacement have been derived in closed form. Numerical values of stress intensity factors at inner and outer edges of the strips and vertical displacement at points in the plane of the strips for several orthotropic materials have been calculated and plotted graphically to show the effect of material orthotropy.
Resumo:
We consider the problem of deciding whether the output of a boolean circuit is determined by a partial assignment to its inputs. This problem is easily shown to be hard, i.e., co-Image Image -complete. However, many of the consequences of a partial input assignment may be determined in linear time, by iterating the following step: if we know the values of some inputs to a gate, we can deduce the values of some outputs of that gate. This process of iteratively deducing some of the consequences of a partial assignment is called propagation. This paper explores the parallel complexity of propagation, i.e., the complexity of determining whether the output of a given boolean circuit is determined by propagating a given partial input assignment. We give a complete classification of the problem into those cases that are Image -complete and those that are unlikely to be Image complete.
Resumo:
The development of techniques for scaling up classifiers so that they can be applied to problems with large datasets of training examples is one of the objectives of data mining. Recently, AdaBoost has become popular among machine learning community thanks to its promising results across a variety of applications. However, training AdaBoost on large datasets is a major problem, especially when the dimensionality of the data is very high. This paper discusses the effect of high dimensionality on the training process of AdaBoost. Two preprocessing options to reduce dimensionality, namely the principal component analysis and random projection are briefly examined. Random projection subject to a probabilistic length preserving transformation is explored further as a computationally light preprocessing step. The experimental results obtained demonstrate the effectiveness of the proposed training process for handling high dimensional large datasets.
Resumo:
The paper presents two new algorithms for the direct parallel solution of systems of linear equations. The algorithms employ a novel recursive doubling technique to obtain solutions to an nth-order system in n steps with no more than 2n(n −1) processors. Comparing their performance with the Gaussian elimination algorithm (GE), we show that they are almost 100% faster than the latter. This speedup is achieved by dispensing with all the computation involved in the back-substitution phase of GE. It is also shown that the new algorithms exhibit error characteristics which are superior to GE. An n(n + 1) systolic array structure is proposed for the implementation of the new algorithms. We show that complete solutions can be obtained, through these single-phase solution methods, in 5n−log2n−4 computational steps, without the need for intermediate I/O operations.
Resumo:
A new method of specifying the syntax of programming languages, known as hierarchical language specifications (HLS), is proposed. Efficient parallel algorithms for parsing languages generated by HLS are presented. These algorithms run on an exclusive-read exclusive-write parallel random-access machine. They require O(n) processors and O(log2n) time, where n is the length of the string to be parsed. The most important feature of these algorithms is that they do not use a stack.
Resumo:
In this paper, the design and implementation of a single shared bus, shared memory multiprocessing system using Intel's single board computers is presented. The hardware configuration and the operating system developed to execute the parallel algorithms are discussed. The performance evaluation studies carried out on Image are outlined.
Resumo:
Lateral or transaxial truncation of cone-beam data can occur either due to the field of view limitation of the scanning apparatus or iregion-of-interest tomography. In this paper, we Suggest two new methods to handle lateral truncation in helical scan CT. It is seen that reconstruction with laterally truncated projection data, assuming it to be complete, gives severe artifacts which even penetrates into the field of view. A row-by-row data completion approach using linear prediction is introduced for helical scan truncated data. An extension of this technique known as windowed linear prediction approach is introduced. Efficacy of the two techniques are shown using simulation with standard phantoms. A quantitative image quality measure of the resulting reconstructed images are used to evaluate the performance of the proposed methods against an extension of a standard existing technique.
Resumo:
A new parallel algorithm for transforming an arithmetic infix expression into a par se tree is presented. The technique is based on a result due to Fischer (1980) which enables the construction of the parse tree, by appropriately scanning the vector of precedence values associated with the elements of the expression. The algorithm presented here is suitable for execution on a shared memory model of an SIMD machine with no read/write conflicts permitted. It uses O(n) processors and has a time complexity of O(log2n) where n is the expression length. Parallel algorithms for generating code for an SIMD machine are also presented.
Resumo:
Domain-invariant representations are key to addressing the domain shift problem where the training and test exam- ples follow different distributions. Existing techniques that have attempted to match the distributions of the source and target domains typically compare these distributions in the original feature space. This space, however, may not be di- rectly suitable for such a comparison, since some of the fea- tures may have been distorted by the domain shift, or may be domain specific. In this paper, we introduce a Domain Invariant Projection approach: An unsupervised domain adaptation method that overcomes this issue by extracting the information that is invariant across the source and tar- get domains. More specifically, we learn a projection of the data to a low-dimensional latent space where the distance between the empirical distributions of the source and target examples is minimized. We demonstrate the effectiveness of our approach on the task of visual object recognition and show that it outperforms state-of-the-art methods on a stan- dard domain adaptation benchmark dataset
Resumo:
Abstract is not available.
Resumo:
In the modern business environment, meeting due dates and avoiding delay penalties are very important goals that can be accomplished by minimizing total weighted tardiness. We consider a scheduling problem in a system of parallel processors with the objective of minimizing total weighted tardiness. Our aim in the present work is to develop an efficient algorithm for solving the parallel processor problem as compared to the available heuristics in the literature and we propose the ant colony optimization approach for this problem. An extensive experimentation is conducted to evaluate the performance of the ACO approach on different problem sizes with the varied tardiness factors. Our experimentation shows that the proposed ant colony optimization algorithm is giving promising results compared to the best of the available heuristics.
Resumo:
It is shown that the conclusions arrived at regarding the instability of an incompressible fluid cylinder in the presence of the magnetic field and the streaming velocity in a recent communication easily follow from the study of propagation characteristics of Alfvén surface waves along cylindrical plasma columns made earlier.
Resumo:
X-ray crystal structure analysis of 7-methoxycoumarin reveals that the reactive double bonds are rotated by about 65° with respect to each other, the centre-to-centre distance between the double bonds being 3.83 Å. In spite of this unfavourable arrangement, photodimerization occurs in the crystalline state yielding the syn-head-tail dimer as the only product. Lattice energy calculations on ground-state molecules in crystals throw light on the mechanism of the reaction.