224 results for LARGE DELETION


Relevance: 20.00%

Abstract:

Emerging high-dimensional data mining applications need to find interesting clusters embedded in arbitrarily oriented subspaces of lower dimensionality. High-dimensional data objects are difficult to cluster when they are sparse and skewed. Updates are quite common in dynamic databases and are usually processed in batch mode. In very large dynamic databases, it is necessary to perform incremental cluster analysis on the updates alone. We present an incremental algorithm for subspace clustering in very high dimensions, which handles both insertions and deletions of data points in the backend database.
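
The abstract does not detail the algorithm itself, but the core incremental idea can be sketched with additive per-cluster sufficient statistics (count, linear sum, squared sum), in the spirit of BIRCH-style clustering features: insertion and deletion of a point then update a cluster in O(d), and low-variance dimensions identify the subspace the cluster lives in. The class below is a hypothetical illustration, restricted to axis-parallel subspaces for brevity, not the paper's method.

```python
import numpy as np

class IncrementalCluster:
    """Additive sufficient statistics for one cluster, so points can be
    inserted or deleted in O(d) without re-clustering the database."""

    def __init__(self, dim):
        self.n = 0                  # number of member points
        self.ls = np.zeros(dim)     # linear sum of member points
        self.ss = np.zeros(dim)     # per-dimension squared sum

    def insert(self, x):
        self.n += 1
        self.ls += x
        self.ss += x * x

    def delete(self, x):            # x must be a current member
        self.n -= 1
        self.ls -= x
        self.ss -= x * x

    def centroid(self):
        return self.ls / self.n

    def variance(self):             # per-dimension variance
        return self.ss / self.n - self.centroid() ** 2

    def subspace(self, tol=1e-2):
        # dimensions in which the cluster is tightly packed define the
        # lower-dimensional subspace it is embedded in
        return np.where(self.variance() <= tol)[0]
```

On an update batch, each inserted point is routed to its nearest cluster (or seeds a new one) and each deleted point is subtracted from its cluster, so only the affected clusters need to be re-examined.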

Relevance: 20.00%

Abstract:

We present the first results of an observational programme undertaken to map the fine-structure line emission of singly ionized carbon ([CII] 157.7409 μm) over extended regions using a Fabry-Perot spectrometer newly installed at the focal plane of a 100 cm balloon-borne far-infrared telescope. This new combination of instruments has a velocity resolution of ~200 km s⁻¹ and an angular resolution of 1.5′. During the first flight, an area of 30′ × 15′ in Orion A was mapped. These observations extend over a larger area than previous observations, the map is fully sampled, and the spectral scanning method used enables reliable estimation of the continuum emission at frequencies adjacent to the [CII] line. The total [CII] line luminosity, calculated by considering emission down to 20% of the maximum line intensity, is 0.04% of the luminosity of the far-infrared continuum. We have compared the [CII] intensity distribution with the velocity-integrated intensity distributions of ¹³CO(1–0), [CI](1–0) and CO(3–2) from the literature. Comparison of the [CII], [CI] and radio continuum intensity distributions indicates that the large-scale [CII] emission originates mainly from the neutral gas, except at the position of M 43, where no [CI] emission corresponding to the [CII] emission is seen; a substantial part of the [CII] emission there originates from the ionized gas. The observed line intensities and ratios have been analyzed using the PDR models of Kaufman et al. (1999) to derive the incident UV flux and volume density at a few selected positions. The models reproduce the observations reasonably well at most positions except the [CII] peak (which coincides with the position of θ¹ Ori C). A possible reason for the failure is the simplifying assumption of a homogeneous plane-parallel slab in place of a more complicated geometry.

Relevance: 20.00%

Abstract:

Part classification and coding is still considered a laborious and time-consuming exercise. Given the crucial role it plays in developing automated CAPP systems, this article attempts to automate a few elements of the exercise using a shape-analysis model. In this study, a 24-vector directional template is contemplated to represent the feature elements of the parts (candidate and prototype). Various transformation processes, such as deformation, straightening, bypassing, insertion and deletion, are embedded in the proposed simulated annealing (SA)-like hybrid algorithm to match the candidate part with its prototype. For a candidate part, searching for its matching prototype in the information database is computationally expensive and requires a large search space. However, the proposed SA-like hybrid algorithm for solving the part classification problem considerably reduces the search space and ensures early convergence of the solution. The application of the proposed approach is illustrated by an example part, and the approach is applied to the classification of 100 candidate parts and their prototypes to demonstrate the effectiveness of the algorithm. © 2003 Elsevier Science Ltd. All rights reserved.
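
As a rough illustration of the matching step (not the authors' hybrid algorithm, whose deformation, straightening, bypassing, insertion and deletion moves are omitted here), the sketch below anneals over a prototype library, scoring candidates by a cyclic mismatch cost on an assumed 0–23 direction coding of the 24-vector template:

```python
import math
import random

def mismatch_cost(cand, proto):
    """Positional cost between two equal-length direction chains,
    using the cyclic distance on the 24-direction template."""
    cost = 0
    for a, b in zip(cand, proto):
        d = abs(a - b) % 24
        cost += min(d, 24 - d)
    return cost

def anneal_match(cand, prototypes, t0=10.0, cooling=0.95, steps=500):
    """SA-like search over the prototype library: worse matches are
    accepted with Boltzmann probability so the search can escape
    local minima. Returns (best_index, best_cost)."""
    current = random.randrange(len(prototypes))
    cur_cost = mismatch_cost(cand, prototypes[current])
    best, best_cost = current, cur_cost
    t = t0
    for _ in range(steps):
        nxt = random.randrange(len(prototypes))
        nxt_cost = mismatch_cost(cand, prototypes[nxt])
        # Metropolis acceptance rule at the current temperature
        if nxt_cost < cur_cost or random.random() < math.exp((cur_cost - nxt_cost) / t):
            current, cur_cost = nxt, nxt_cost
            if cur_cost < best_cost:
                best, best_cost = current, cur_cost
        t *= cooling                # geometric cooling schedule
    return best, best_cost
```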

Relevance: 20.00%

Abstract:

Delineation of homogeneous precipitation regions (regionalization) is necessary for investigating the frequency and spatial distribution of meteorological droughts. Conventional regionalization methods use statistics of precipitation as attributes to establish homogeneous regions. Therefore they cannot be used to form regions in ungauged areas, and they may not yield meaningful regions in areas with sparse rain-gauge density. Further, validating the regions for homogeneity in precipitation is not possible, since using the precipitation statistics both to form regions and subsequently to test regional homogeneity is not appropriate. To alleviate this problem, an approach based on fuzzy cluster analysis is presented. It allows delineation of homogeneous precipitation regions in data-sparse areas using, as attributes, large-scale atmospheric variables (LSAV) that influence precipitation in the study area. The LSAV, location parameters (latitude, longitude and altitude) and seasonality of precipitation are suggested as features for regionalization. The approach allows independent validation of the identified regions for homogeneity using statistics computed from the observed precipitation. Further, it can form regions even in ungauged areas, owing to the use of attributes that can be reliably estimated when no at-site precipitation data are available. The approach was applied to delineate homogeneous annual rainfall regions in India, and its effectiveness is illustrated by comparing the results with those obtained using rainfall statistics, regionalization based on hard cluster analysis, and the meteorological subdivisions of India. © 2011 Elsevier B.V. All rights reserved.
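
As a concrete sketch of the fuzzy clustering step (plain fuzzy c-means; the paper may use a different variant or validation procedure), with each site described by standardized LSAV, latitude/longitude/altitude and seasonality features:

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, iters=100, tol=1e-5, seed=0):
    """Fuzzy c-means on a site-by-feature matrix X. Returns (centers, U),
    where U[i, k] is the membership of site i in region k; a site can be
    assigned to its highest-membership region, or left fuzzy."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=X.shape[0])    # random fuzzy partition
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]  # weighted region centers
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        inv = np.fmax(d, 1e-12) ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)  # standard FCM update
        if np.abs(U_new - U).max() < tol:
            return centers, U_new
        U = U_new
    return centers, U
```

Because the features exclude at-site precipitation, the resulting regions can afterwards be tested for homogeneity with statistics computed from observed precipitation, which is the independent validation the abstract emphasizes.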

Relevance: 20.00%

Abstract:

We calculate the probability of large rapidity gaps in high-energy hadronic collisions using a model based on QCD mini-jets and soft gluon emission down into the infrared region. Comparing with other models, we find remarkable agreement among most predictions.
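
The abstract quotes no formulas, but in eikonal mini-jet models of this kind the gap probability is typically governed by the Poisson probability of no inelastic parton-parton collision at a given impact parameter; a schematic form (our notation, not necessarily the paper's) is

$$
P_{\mathrm{gap}}(s) \;\simeq\; \int d^{2}b\, w(b)\, e^{-\bar{n}(b,s)},
\qquad
\bar{n}(b,s) = A(b)\,\sigma_{\mathrm{jet}}(s),
$$

where $\bar{n}(b,s)$ is the average number of mini-jet collisions, $A(b)$ is the transverse overlap function (the natural place where soft gluon emission down into the infrared modifies the $b$-dependence), $\sigma_{\mathrm{jet}}(s)$ is the mini-jet cross section, and $w(b)$ is a normalized impact-parameter weight.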

Relevance: 20.00%

Abstract:

We propose a randomized algorithm for large-scale SVM learning which solves the problem by iterating over random subsets of the data. Crucial to the algorithm's scalability is the size of the subsets chosen. In the context of text classification we show that, by using ideas from random projections, a sample size of O(log n) can be used to obtain a solution which is close to optimal with high probability. Experiments on synthetic and real-life data sets demonstrate that the algorithm scales up SVM learners without loss in accuracy.
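
The abstract leaves the sampling scheme implicit; one hedged reading of "iterating over random subsets" is sketched below with scikit-learn's SVC as the base learner: each round refits on the current support vectors plus a fresh random working set of size O(log n). The constant c_log and the retention of support vectors are our assumptions, not the paper's algorithm.

```python
import numpy as np
from sklearn.svm import SVC

def randomized_svm(X, y, rounds=20, c_log=50, seed=0):
    """Sketch of subset-iteration SVM training: each round fits an SVM on
    the previous support vectors plus a fresh random sample of size
    O(log n), so no single fit ever sees the full data set.
    (A real implementation must ensure both classes appear in each subset.)"""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    sample_size = max(2, int(c_log * np.log(n)))   # O(log n) working set
    work = rng.choice(n, size=sample_size, replace=False)
    clf = None
    for _ in range(rounds):
        clf = SVC(kernel="linear", C=1.0).fit(X[work], y[work])
        support = work[clf.support_]               # keep current support vectors
        fresh = rng.choice(n, size=sample_size, replace=False)
        work = np.unique(np.concatenate([support, fresh]))
    return clf
```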

Relevance: 20.00%

Abstract:

In this paper we propose a novel, scalable, clustering-based Ordinal Regression formulation, which is an instance of a Second Order Cone Program (SOCP) with one Second Order Cone (SOC) constraint. The main contribution of the paper is a fast algorithm, CB-OR, which solves the proposed formulation more efficiently than general-purpose solvers. The other main contribution is to pose the problem of focused crawling as a large-scale Ordinal Regression problem and to solve it using the proposed CB-OR. Focused crawling is an efficient mechanism for discovering resources of interest on the web. Posing focused crawling as an Ordinal Regression problem avoids the need for a negative class and a topic hierarchy, which are the main drawbacks of existing focused crawling methods. Experiments on large synthetic and benchmark datasets show the scalability of CB-OR, and also show that the proposed focused crawler outperforms the state of the art.
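
CB-OR itself is not specified in the abstract, but the flavor of "an SOCP with one SOC constraint" can be shown with a toy threshold-based ordinal regression in cvxpy (our formulation, for illustration only): minimizing a bound t on ||w|| plays the role of margin maximization, and the single second-order cone is ||w||₂ ≤ t.

```python
import cvxpy as cp
import numpy as np

def ordinal_regression_socp(X, y, r, C=1.0):
    """Toy threshold-based ordinal regression as an SOCP with a single
    SOC constraint (illustrative only; not CB-OR). y holds labels in
    {0, ..., r-1}; b are the r-1 ordered thresholds between ranks."""
    n, d = X.shape
    w = cp.Variable(d)
    b = cp.Variable(r - 1)
    t = cp.Variable()
    lo = cp.Variable(n, nonneg=True)   # slack for the lower threshold
    hi = cp.Variable(n, nonneg=True)   # slack for the upper threshold
    cons = [cp.norm(w, 2) <= t]        # the single second-order cone
    cons += [b[k] <= b[k + 1] for k in range(r - 2)]
    for i in range(n):
        s = X[i] @ w
        if y[i] > 0:                   # score above the lower threshold
            cons.append(s >= b[y[i] - 1] + 1 - lo[i])
        if y[i] < r - 1:               # score below the upper threshold
            cons.append(s <= b[y[i]] - 1 + hi[i])
    prob = cp.Problem(cp.Minimize(t + C * cp.sum(lo + hi)), cons)
    prob.solve()
    return w.value, b.value
```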

Relevance: 20.00%

Abstract:

A methodology termed the “filtered density function” (FDF) is developed and implemented for large eddy simulation (LES) of chemically reacting turbulent flows. In this methodology, the effects of the unresolved scalar fluctuations are taken into account by considering the probability density function (PDF) of subgrid scale (SGS) scalar quantities. A transport equation is derived for the FDF in which the effect of chemical reactions appears in a closed form. The influences of scalar mixing and convection within the subgrid are modeled. The FDF transport equation is solved numerically via a Lagrangian Monte Carlo scheme in which the solutions of the equivalent stochastic differential equations (SDEs) are obtained. These solutions preserve the Itô-Gikhman nature of the SDEs. The consistency of the FDF approach, the convergence of its Monte Carlo solution and the performance of the closures employed in the FDF transport equation are assessed by comparisons with results obtained by direct numerical simulation (DNS) and by conventional LES procedures in which the first two SGS scalar moments are obtained by a finite difference method (LES-FD). These comparative assessments are conducted by implementations of all three schemes (FDF, DNS and LES-FD) in a temporally developing mixing layer and a spatially developing planar jet under both non-reacting and reacting conditions. In non-reacting flows, the Monte Carlo solution of the FDF yields results similar to those via LES-FD. The advantage of the FDF is demonstrated by its use in reacting flows. In the absence of a closure for the SGS scalar fluctuations, the LES-FD results are significantly different from those based on DNS. The FDF results show a much closer agreement with filtered DNS results. © 1998 American Institute of Physics.
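
To make the Lagrangian Monte Carlo step concrete, here is a minimal 1-D sketch of one Euler-Maruyama update for notional FDF particles, with IEM/LMSE-style relaxation standing in for the subgrid mixing closure (the constants, the interpolation of filtered fields to particles, and the specific mixing model are assumptions, not necessarily the paper's choices). The property the abstract highlights, a chemically closed source term, shows up as the reaction applied directly to each particle's scalar.

```python
import numpy as np

def fdf_particle_step(x, phi, u_bar, phi_bar, gamma_t, omega, dt,
                      reaction, c_phi=2.0, rng=None):
    """One Euler-Maruyama step for notional FDF particles (1-D sketch).
    x, phi       : particle positions and scalar values (arrays)
    u_bar, phi_bar, gamma_t, omega : filtered velocity, filtered scalar,
                   SGS diffusivity and SGS mixing frequency, interpolated
                   to the particle locations (assumed given)
    reaction     : closed chemical source term, applied per particle"""
    rng = rng or np.random.default_rng()
    # Position: drift with the resolved velocity, diffuse with the SGS
    # diffusivity; the Wiener increment scales as sqrt(dt).
    dW = np.sqrt(dt) * rng.standard_normal(x.shape)
    x_new = x + u_bar * dt + np.sqrt(2.0 * gamma_t) * dW
    # Scalar: IEM/LMSE relaxation toward the filtered mean, plus the
    # reaction source, which needs no closure (it is local in phi).
    phi_new = phi - c_phi * omega * (phi - phi_bar) * dt + reaction(phi) * dt
    return x_new, phi_new
```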