886 results for Search-based technique


Relevance:

30.00% 30.00%

Publisher:

Abstract:

Searching in a dataset for elements that are similar to a given query element is a core problem in applications that manage complex data, and has been aided by metric access methods (MAMs). A growing number of applications require indices that must be built faster and repeatedly, also providing faster response for similarity queries. The increase in the main memory capacity and its lowering costs also motivate using memory-based MAMs. In this paper, we propose the Onion-tree, a new and robust dynamic memory-based MAM that slices the metric space into disjoint subspaces to provide quick indexing of complex data. It introduces three major characteristics: (i) a partitioning method that controls the number of disjoint subspaces generated at each node; (ii) a replacement technique that can change the leaf node pivots in insertion operations; and (iii) range and k-NN extended query algorithms to support the new partitioning method, including a new visit order of the subspaces in k-NN queries. Performance tests with both real-world and synthetic datasets showed that the Onion-tree is very compact. Comparisons of the Onion-tree with the MM-tree and a memory-based version of the Slim-tree showed that the Onion-tree was always faster to build the index. The experiments also showed that the Onion-tree significantly improved range and k-NN query processing performance and was the most efficient MAM, followed by the MM-tree, which in turn outperformed the Slim-tree in almost all the tests. (C) 2010 Elsevier B.V. All rights reserved.
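The pruning idea that metric access methods rely on can be illustrated with a minimal sketch. It is not the Onion-tree: it uses a single pivot and the triangle inequality to skip distance computations during a range query. The function names `build_index` and `range_query` and the choice of Euclidean distance are assumptions made for illustration only.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.dist(a, b)


def build_index(points, pivot, dist=euclidean):
    """Precompute the distance from every point to one pivot (hypothetical helper)."""
    return [(p, dist(p, pivot)) for p in points]


def range_query(index, pivot, query, radius, dist=euclidean):
    """Return (hits, computed): points within `radius` of `query`,
    and how many point-to-query distances had to be evaluated.

    By the triangle inequality, |d(q, pivot) - d(p, pivot)| <= d(q, p),
    so a point whose lower bound exceeds the radius cannot qualify.
    """
    d_query_pivot = dist(query, pivot)
    hits, computed = [], 0
    for point, d_point_pivot in index:
        if abs(d_query_pivot - d_point_pivot) > radius:
            continue  # pruned without computing d(query, point)
        computed += 1
        if dist(query, point) <= radius:
            hits.append(point)
    return hits, computed


points = [(0, 0), (1, 0), (5, 5), (10, 10), (1, 1)]
pivot = (0, 0)
index = build_index(points, pivot)
hits, computed = range_query(index, pivot, (1, 0), 1.5)
```

In this toy run the two distant points are discarded using only the precomputed pivot distances. Tree-based MAMs apply the same bound at every node to discard whole subspaces.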

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A large amount of biological data has been produced in recent years. Important knowledge can be extracted from these data by the use of data analysis techniques. Clustering plays an important role in data analysis, by organizing similar objects from a dataset into meaningful groups. Several clustering algorithms have been proposed in the literature. However, each algorithm has its bias, being more adequate for particular datasets. This paper presents a mathematical formulation to support the creation of consistent clusters for biological data. Moreover, it shows a clustering algorithm to solve this formulation that uses GRASP (Greedy Randomized Adaptive Search Procedure). We compared the proposed algorithm with three other known algorithms. The proposed algorithm presented the best clustering results, as confirmed statistically. (C) 2009 Elsevier Ltd. All rights reserved.
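The general shape of a GRASP (randomized greedy construction followed by local search, repeated over several restarts) can be sketched as follows. This is not the paper's formulation: the 1-D k-medoids objective, the restricted-candidate-list size `rcl_size`, and all function names are illustrative assumptions, and the sketch assumes the data contain at least `k` distinct values.

```python
import random


def cost(points, medoids):
    """Sum of distances from each 1-D point to its nearest medoid."""
    return sum(min(abs(p - m) for m in medoids) for p in points)


def greedy_randomized_construction(points, k, rcl_size, rng):
    """Add medoids one at a time, picking at random among the best candidates."""
    medoids = []
    while len(medoids) < k:
        candidates = [p for p in points if p not in medoids]
        candidates.sort(key=lambda c: cost(points, medoids + [c]))
        medoids.append(rng.choice(candidates[:rcl_size]))
    return medoids


def local_search(points, medoids):
    """Swap one medoid for a non-medoid for as long as this lowers the cost."""
    best = cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for i in range(len(medoids)):
            for c in points:
                if c in medoids:
                    continue
                trial = medoids[:i] + [c] + medoids[i + 1:]
                trial_cost = cost(points, trial)
                if trial_cost < best:
                    medoids, best, improved = trial, trial_cost, True
    return medoids, best


def grasp(points, k, iterations=20, rcl_size=3, seed=0):
    """Keep the best local optimum found over several randomized restarts."""
    rng = random.Random(seed)
    best_medoids, best_cost = None, float("inf")
    for _ in range(iterations):
        start = greedy_randomized_construction(points, k, rcl_size, rng)
        medoids, c = local_search(points, start)
        if c < best_cost:
            best_medoids, best_cost = sorted(medoids), c
    return best_medoids, best_cost


data = [1, 2, 3, 10, 11, 12, 20, 21, 22]
medoids, total = grasp(data, k=3)
```

The restricted candidate list is what separates GRASP from a plain greedy heuristic: each restart begins from a different but still reasonable solution, so the local search explores several basins.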

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper presents a new technique and two algorithms to bulk-load data into multi-way dynamic metric access methods, based on the covering radius of representative elements employed to organize data in hierarchical data structures. The proposed algorithms are sample-based, and they always build a valid and height-balanced tree. We compare the proposed algorithms with existing ones, showing their behavior when bulk-loading data into the Slim-tree metric access method. After having identified the worst case of our first algorithm, we describe adequate counteractions, which lead to the second algorithm. Experiments performed to evaluate their performance show that our bulk-loading methods build trees faster than the sequential insertion method, and that they also significantly improve search performance. (C) 2009 Elsevier B.V. All rights reserved.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The evolution of commodity computing led to the possibility of efficient usage of interconnected machines to solve computationally-intensive tasks, which were previously solvable only by using expensive supercomputers. This, however, required new methods for process scheduling and distribution, considering the network latency, communication cost, heterogeneous environments and distributed computing constraints. An efficient distribution of processes over such environments requires an adequate scheduling strategy, as the cost of inefficient process allocation is unacceptably high. Therefore, knowledge and prediction of application behavior are essential to perform effective scheduling. In this paper, we overview the evolution of scheduling approaches, focusing on distributed environments. We also evaluate the current approaches for process behavior extraction and prediction, aiming at selecting an adequate technique for online prediction of application execution. Based on this evaluation, we propose a novel model for application behavior prediction, considering chaotic properties of such behavior and the automatic detection of critical execution points. The proposed model is applied and evaluated for process scheduling in cluster and grid computing environments. The obtained results demonstrate that prediction of the process behavior is essential for efficient scheduling in large-scale and heterogeneous distributed environments, outperforming conventional scheduling policies by a factor of 10, and even more in some cases. Furthermore, the proposed approach proves to be efficient for online predictions due to its low computational cost and good precision. (C) 2009 Elsevier B.V. All rights reserved.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper is about the use of natural language to communicate with computers. Most research pursuing this goal considers only requests expressed in English. A way to facilitate the use of several languages in natural language systems is by using an interlingua. An interlingua is an intermediary representation for natural language information that can be processed by machines. We propose to convert natural language requests into an interlingua [universal networking language (UNL)] and to execute these requests using software components. In order to achieve this goal, we propose OntoMap, an ontology-based architecture to perform the semantic mapping between UNL sentences and software components. OntoMap also performs component search and retrieval based on semantic information formalized in ontologies and rules.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Identifying the correct sense of a word in context is crucial for many tasks in natural language processing (machine translation is an example). State-of-the-art methods for Word Sense Disambiguation (WSD) build models using hand-crafted features that usually capture shallow linguistic information. Complex background knowledge, such as semantic relationships, is typically either not used, or used in a specialised manner, due to the limitations of the feature-based modelling techniques used. On the other hand, empirical results from the use of Inductive Logic Programming (ILP) systems have repeatedly shown that they can use diverse sources of background knowledge when constructing models. In this paper, we investigate whether this ability of ILP systems could be used to improve the predictive accuracy of models for WSD. Specifically, we examine the use of a general-purpose ILP system as a method to construct a set of features using semantic, syntactic and lexical information. This feature-set is then used by a common modelling technique in the field (a support vector machine) to construct a classifier for predicting the sense of a word. In our investigation we examine one-shot and incremental approaches to feature-set construction applied to monolingual and bilingual WSD tasks. The monolingual tasks use 32 verbs and 85 verbs and nouns (in English) from the SENSEVAL-3 and SemEval-2007 benchmarks, while the bilingual WSD task consists of 7 highly ambiguous verbs in translating from English to Portuguese. The results are encouraging: the ILP-assisted models show substantial improvements over those that simply use shallow features. In addition, incremental feature-set construction appears to identify smaller and better sets of features. Taken together, the results suggest that the use of ILP with diverse sources of background knowledge provides a way of making substantial progress in the field of WSD.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The problem of projecting multidimensional data into lower dimensions has been pursued by many researchers due to its potential application to data analyses of various kinds. This paper presents a novel multidimensional projection technique based on least square approximations. The approximations compute the coordinates of a set of projected points based on the coordinates of a reduced number of control points with defined geometry. We name the technique Least Square Projections (LSP). From an initial projection of the control points, LSP defines the positioning of their neighboring points through a numerical solution that aims at preserving a similarity relationship between the points given by a metric in mD. In order to perform the projection, a small number of distance calculations are necessary, and no repositioning of the points is required to obtain a final solution with satisfactory precision. The results show the capability of the technique to form groups of points by degree of similarity in 2D. We illustrate that capability through its application to mapping collections of textual documents from varied sources, a strategic yet difficult application. LSP is faster and more accurate than other existing high-quality methods, particularly where it was mostly tested, that is, for mapping text sets.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Public genealogical databases are becoming increasingly populated with historical data and records of the current population's ancestors. As this increasing amount of available information is used to link individuals to their ancestors, the resulting trees become deeper and more dense, which justifies the need for using organized, space-efficient layouts to display the data. Existing layouts are often only able to show a small subset of the data at a time. As a result, it is easy to become lost when navigating through the data or to lose sight of the overall tree structure. On the contrary, leaving space for unknown ancestors allows one to better understand the tree's structure, but leaving this space becomes expensive and allows fewer generations to be displayed at a time. In this work, we propose that the H-tree based layout be used in genealogical software to display ancestral trees. We will show that this layout presents an increase in the number of displayable generations, provides a nicely arranged, symmetrical, intuitive and organized fractal structure, increases the user's ability to understand and navigate through the data, and accounts for the visualization requirements necessary for displaying such trees. Finally, user-study results indicate potential for user acceptance of the new layout.
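How an H-tree arrangement places ancestors can be sketched in a few lines. This is a simplified illustration rather than the layout used in the paper: it assumes Ahnentafel numbering (person 1 is the root, and the parents of person n are 2n and 2n + 1), alternates horizontal and vertical branches, and halves the branch length after each vertical step. The function name `h_tree_layout` and the parameter `length` are hypothetical.

```python
def h_tree_layout(generations, length=1.0):
    """Return {person_index: (x, y)} for the root and `generations` levels of ancestors."""
    positions = {1: (0.0, 0.0)}
    # Each frontier entry: (person index, branch length, branch is horizontal?)
    frontier = [(1, length, True)]
    for _ in range(generations):
        next_frontier = []
        for person, seg, horizontal in frontier:
            x, y = positions[person]
            dx, dy = (seg, 0.0) if horizontal else (0.0, seg)
            father, mother = 2 * person, 2 * person + 1
            positions[father] = (x - dx, y - dy)
            positions[mother] = (x + dx, y + dy)
            # Shrink only after a vertical step so each "H" keeps its proportions.
            next_seg = seg if horizontal else seg / 2
            for parent in (father, mother):
                next_frontier.append((parent, next_seg, not horizontal))
        frontier = next_frontier
    return positions


two_generations = h_tree_layout(2)
four_generations = h_tree_layout(4)
```

Because the branch length halves after every horizontal-vertical pair, all ancestors stay inside a bounded square around the root, however many generations are drawn.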

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper describes a novel template-based meshing approach for generating good quality quadrilateral meshes from 2D digital images. This approach builds upon an existing image-based mesh generation technique called Imesh, which enables us to create a segmented triangle mesh from an image without the need for an image segmentation step. Our approach generates a quadrilateral mesh using an indirect scheme, which converts the segmented triangle mesh created by the initial steps of the Imesh technique into a quadrilateral one. The triangle-to-quadrilateral conversion makes use of template meshes of triangles. To ensure good element quality, the conversion step is followed by a smoothing step, which is based on a new optimization-based procedure. We show several examples of meshes generated by our approach, and present a thorough experimental evaluation of the quality of the meshes given as examples.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A method for linearly constrained optimization which modifies and generalizes recent box-constraint optimization algorithms is introduced. The new algorithm is based on a relaxed form of Spectral Projected Gradient iterations. Intercalated with these projected steps, internal iterations restricted to faces of the polytope are performed, which enhance the efficiency of the algorithm. Convergence proofs are given and numerical experiments are included and discussed. Software supporting this paper is available through the Tango Project web page: http://www.ime.usp.br/~egbirgin/tango/.
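A projected gradient iteration with a spectral (Barzilai-Borwein) step length can be sketched for the simpler box-constrained case. This is not the algorithm of the paper: it handles only bounds, omits the internal face iterations, and uses a plain monotone Armijo backtracking where spectral projected gradient methods normally use a nonmonotone line search. The names `spg_box` and `project_box` are assumptions.

```python
def project_box(x, lo, hi):
    """Project x onto the box lo <= x <= hi, component by component."""
    return [min(max(xi, l), h) for xi, l, h in zip(x, lo, hi)]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def spg_box(f, grad, x0, lo, hi, max_iter=200, tol=1e-10):
    """Minimize f over a box using projected steps with a spectral step length."""
    x = project_box(x0, lo, hi)
    g = grad(x)
    step = 1.0
    for _ in range(max_iter):
        # Projected gradient direction for the current spectral step.
        trial = project_box([xi - step * gi for xi, gi in zip(x, g)], lo, hi)
        d = [ti - xi for ti, xi in zip(trial, x)]
        if dot(d, d) ** 0.5 < tol:
            break  # projected step is (numerically) zero: stationary point
        # Monotone Armijo backtracking along d (a simplification).
        fx, slope, t = f(x), dot(g, d), 1.0
        while f([xi + t * di for xi, di in zip(x, d)]) > fx + 1e-4 * t * slope:
            t *= 0.5
        x_new = [xi + t * di for xi, di in zip(x, d)]
        g_new = grad(x_new)
        # Barzilai-Borwein step: <s, s> / <s, y>, safeguarded.
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        sy = dot(s, y)
        step = min(max(dot(s, s) / sy, 1e-10), 1e10) if sy > 0 else 1.0
        x, g = x_new, g_new
    return x


def objective(x):
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2


def gradient(x):
    return [2.0 * (x[0] - 3.0), 2.0 * (x[1] + 1.0)]


solution = spg_box(objective, gradient, [0.5, 0.5], [0.0, 0.0], [1.0, 1.0])
```

In this toy problem the unconstrained minimizer (3, -1) lies outside the unit box, so the iteration stops at the nearest corner, (1, 0).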

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Moving-least-squares (MLS) surfaces undergoing large deformations need periodic regeneration of the point set (point-set resampling) so as to keep the point-set density quasi-uniform. Previous work by the authors dealt with algebraic MLS surfaces, and proposed a resampling strategy based on defining the new points at the intersections of the MLS surface with a suitable set of rays. That strategy has very low memory requirements and is easy to parallelize. In this article, new resampling strategies with reduced CPU-time cost are explored. The basic idea is to choose as the set of rays the lines of a regular Cartesian grid, and to fully exploit this grid: as a data structure for search queries, as a spatial structure for traversing the surface in a continuation-like algorithm, and also as an approximation grid for an interpolated version of the MLS surface. It is shown that in this way a very simple and compact resampling technique is obtained, which cuts the resampling cost by half with affordable memory requirements.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Conventional procedures employed in the modeling of viscoelastic properties of polymers rely on the determination of the polymer's discrete relaxation spectrum from experimentally obtained data. In the past decades, several analytical regression techniques have been proposed to determine an explicit equation which describes the measured spectra. Taking a different approach, the procedure introduced herein constitutes a simulation-based computational optimization technique based on a non-deterministic search method arising from the field of evolutionary computation. Instead of comparing numerical results, the purpose of this paper is to highlight some subtle differences between both strategies and to focus on which properties of the exploited technique emerge as new possibilities for the field. In order to illustrate this, the cases tested show how the employed technique can outperform conventional approaches in terms of fitting quality. Moreover, in some instances, it produces equivalent results with far fewer fitting parameters, which is convenient for computational simulation applications. The problem formulation and the rationale of the highlighted method are herein discussed and constitute the main intended contribution. (C) 2009 Wiley Periodicals, Inc. J Appl Polym Sci 113: 122-135, 2009

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The concentrations of the water-soluble inorganic aerosol species, ammonium (NH₄⁺), nitrate (NO₃⁻), chloride (Cl⁻), and sulfate (SO₄²⁻), were measured from September to November 2002 at a pasture site in the Amazon Basin (Rondônia, Brazil) (LBA-SMOCC). Measurements were conducted using a semi-continuous technique (Wet-annular denuder/Steam-Jet Aerosol Collector: WAD/SJAC) and three integrating filter-based methods, namely (1) a denuder-filter pack (DFP: Teflon and impregnated Whatman filters), (2) a stacked-filter unit (SFU: polycarbonate filters), and (3) a High Volume dichotomous sampler (HiVol: quartz fiber filters). Measurements covered the late dry season (biomass burning), a transition period, and the onset of the wet season (clean conditions). Analyses of the particles collected on filters were performed using ion chromatography (IC) and Particle-Induced X-ray Emission spectrometry (PIXE). Season-dependent discrepancies were observed between the WAD/SJAC system and the filter-based samplers. During the dry season, when PM2.5 (D_p ≤ 2.5 µm) concentrations were ~100 µg m⁻³, aerosol NH₄⁺ and SO₄²⁻ measured by the filter-based samplers were on average two times higher than those determined by the WAD/SJAC. Concentrations of aerosol NO₃⁻ and Cl⁻ measured with the HiVol during daytime, and with the DFP during day- and nighttime, also exceeded those of the WAD/SJAC by a factor of two. In contrast, aerosol NO₃⁻ and Cl⁻ measured with the SFU during the dry season were nearly two times lower than those measured by the WAD/SJAC. These differences declined markedly during the transition period and towards the cleaner conditions during the onset of the wet season (PM2.5 ~5 µg m⁻³), when filter-based samplers measured on average 40-90% less than the WAD/SJAC.
The differences were not due to consistent systematic biases of the analytical techniques, but were apparently a result of prevailing environmental conditions and different sampling procedures. For the transition period and wet season, the significance of our results is reduced by a low number of data points. We argue that the observed differences are mainly attributable to (a) positive and negative filter sampling artifacts, (b) presence of organic compounds and organosulfates on filter substrates, and (c) a SJAC sampling efficiency of less than 100%.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

For the first time, nanograined Pb(1-1.5x)La(x)TiO(3) ferroelectric ceramics, with x=0.2, were produced by a process based on a high-pressure densification technique (HPD) that eliminates the need for high-temperature sintering. Our results showed the production of workable dense ceramics with average grain size around 100 nm and free from secondary phases. Regarding the dielectric measurements, the samples showed satisfactory dielectric losses as well as remarkable diffusivity in the dielectric curves. Moreover, ferroelectric hysteresis measurements showed that samples produced by the HPD technique can withstand the high electric fields necessary to switch the polarization and thus to induce piezoelectric activity. Our results clearly demonstrated the viability of the proposed method to produce nanograined ferroelectric bulk ceramics, thus opening the possibility of developing new technologies.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This work reports results from proton nuclear magnetic resonance (NMR), continuous-wave (CW-EPR) and pulsed electron paramagnetic resonance (P-EPR) and complex impedance spectroscopy of gelatin-based polymer gel electrolytes containing acetic acid, cross-linked with formaldehyde and plasticized with glycerol. Ionic conductivity of 2 × 10⁻⁵ S/cm was obtained at room temperature for samples prepared with 33 wt% of acetic acid. Proton (¹H) line shapes and spin-lattice relaxation times were measured as a function of temperature. The NMR results show that the proton mobility is dependent on acetic acid content in the plasticized polymer gel electrolytes. The CW-EPR spectra, which were recorded on samples doped with copper perchlorate, indicate the presence of the paramagnetic Cu²⁺ ions in axially distorted sites. The P-EPR technique, known as electron spin echo envelope modulation (ESEEM), was employed to show the involvement of both hydrogen and nitrogen atoms in the copper complexation of the gel electrolyte. (C) 2009 Elsevier Ltd. All rights reserved.