859 resultados para Fuzzy c-means algorithm
Resumo:
The amount of textual information digitally stored is growing every day. However, our capability of processing and analyzing that information is not growing at the same pace. To overcome this limitation, it is important to develop semiautomatic processes to extract relevant knowledge from textual information, such as the text mining process. One of the main and most expensive stages of the text mining process is the text pre-processing stage, where the unstructured text should be transformed to structured format such as an attribute-value table. The stemming process, i.e. linguistics normalization, is usually used to find the attributes of this table. However, the stemming process is strongly dependent on the language in which the original textual information is given. Furthermore, for most languages, the stemming algorithms proposed in the literature are computationally expensive. In this work, several improvements of the well know Porter stemming algorithm for the Portuguese language, which explore the characteristics of this language, are proposed. Experimental results show that the proposed algorithm executes in far less time without affecting the quality of the generated stems.
Resumo:
Conventional procedures employed in the modeling of viscoelastic properties of polymer rely on the determination of the polymer`s discrete relaxation spectrum from experimentally obtained data. In the past decades, several analytical regression techniques have been proposed to determine an explicit equation which describes the measured spectra. With a diverse approach, the procedure herein introduced constitutes a simulation-based computational optimization technique based on non-deterministic search method arisen from the field of evolutionary computation. Instead of comparing numerical results, this purpose of this paper is to highlight some Subtle differences between both strategies and focus on what properties of the exploited technique emerge as new possibilities for the field, In oder to illustrate this, essayed cases show how the employed technique can outperform conventional approaches in terms of fitting quality. Moreover, in some instances, it produces equivalent results With much fewer fitting parameters, which is convenient for computational simulation applications. I-lie problem formulation and the rationale of the highlighted method are herein discussed and constitute the main intended contribution. (C) 2009 Wiley Periodicals, Inc. J Appl Polym Sci 113: 122-135, 2009
Resumo:
Purpose - The purpose of this paper is to develop a novel unstructured simulation approach for injection molding processes described by the Hele-Shaw model. Design/methodology/approach - The scheme involves dual dynamic meshes with active and inactive cells determined from an initial background pointset. The quasi-static pressure solution in each timestep for this evolving unstructured mesh system is approximated using a control volume finite element method formulation coupled to a corresponding modified volume of fluid method. The flow is considered to be isothermal and non-Newtonian. Findings - Supporting numerical tests and performance studies for polystyrene described by Carreau, Cross, Ellis and Power-law fluid models are conducted. Results for the present method are shown to be comparable to those from other methods for both Newtonian fluid and polystyrene fluid injected in different mold geometries. Research limitations/implications - With respect to the methodology, the background pointset infers a mesh that is dynamically reconstructed here, and there are a number of efficiency issues and improvements that would be relevant to industrial applications. For instance, one can use the pointset to construct special bases and invoke a so-called ""meshless"" scheme using the basis. This would require some interesting strategies to deal with the dynamic point enrichment of the moving front that could benefit from the present front treatment strategy. There are also issues related to mass conservation and fill-time errors that might be addressed by introducing suitable projections. The general question of ""rate of convergence"" of these schemes requires analysis. Numerical results here suggest first-order accuracy and are consistent with the approximations made, but theoretical results are not available yet for these methods. Originality/value - This novel unstructured simulation approach involves dual meshes with active and inactive cells determined from an initial background pointset: local active dual patches are constructed ""on-the-fly"" for each ""active point"" to form a dynamic virtual mesh of active elements that evolves with the moving interface.
Resumo:
Nuclear (p,alpha) reactions destroying the so-called ""light-elements"" lithium, beryllium and boron have been largely studied in the past mainly because their role in understanding some astrophysical phenomena, i.e. mixing-phenomena occurring in young F-G stars [1]. Such mechanisms transport the surface material down to the region close to the nuclear destruction zone, where typical temperatures of the order of similar to 10(6) K are reached. The corresponding Gamow energy E(0)=1.22 (Z(x)(2)Z(X)(2)T(6)(2))(1/3) [2] is about similar to 10 keV if one considers the ""boron-case"" and replaces in the previous formula Z(x) = 1, Z(X) = 5 and T(6) = 5. Direct measurements of the two (11)B(p,alpha(0))(8)Be and (10)B(p,alpha)(7)Be reactions in correspondence of this energy region are difficult to perform mainly because the combined effects of Coulomb barrier penetrability and electron screening [3]. The indirect method of the Trojan Horse (THM) [4-6] allows one to extract the two-body reaction cross section of interest for astrophysics without the extrapolation-procedures. Due to the THM formalism, the extracted indirect data have to be normalized to the available direct ones at higher energies thus implying that the method is a complementary tool in solving some still open questions for both nuclear and astrophysical issues [7-12].
Resumo:
A novel cryptography method based on the Lorenz`s attractor chaotic system is presented. The proposed algorithm is secure and fast, making it practical for general use. We introduce the chaotic operation mode, which provides an interaction among the password, message and a chaotic system. It ensures that the algorithm yields a secure codification, even if the nature of the chaotic system is known. The algorithm has been implemented in two versions: one sequential and slow and the other, parallel and fast. Our algorithm assures the integrity of the ciphertext (we know if it has been altered, which is not assured by traditional algorithms) and consequently its authenticity. Numerical experiments are presented, discussed and show the behavior of the method in terms of security and performance. The fast version of the algorithm has a performance comparable to AES, a popular cryptography program used commercially nowadays, but it is more secure, which makes it immediately suitable for general purpose cryptography applications. An internet page has been set up, which enables the readers to test the algorithm and also to try to break into the cipher.
Resumo:
In this work, Ba(Zr(0.25)Ti(0.75))O(3) ceramic was prepared by solid-state reaction. This material was characterized by x-ray diffraction and Fourier transform Raman spectroscopy. The temperature dependent dielectric properties were investigated in the frequency range from 1 kHz to 1 MHz. The dielectric measurements indicated a diffuse phase transition. The broadening of the dielectric permittivity in the frequency range as well as its shifting at higher temperatures indicated a relaxor-like behaviour for this material. The diffusivity and the relaxation strength were estimated using the modified Curie-Weiss law. The optical properties were analysed by ultraviolet-visible (UV-vis) absorption spectroscopy and photoluminescence (PL) measurements at room temperature. The UV-vis spectrum indicated that the Ba(Zr(0.25)Ti(0.75))O(3) ceramic has an optical band gap of 2.98 eV. A blue PL emission was observed for this compound when excited with 350 nm wavelength. The polarity as well as the PL property of this material was attributed to the presence of polar [TiO(6)] distorted clusters into a globally cubic matrix.
Resumo:
The properties of complex networks are highly Influenced by border effects frequently found as a consequence of the finite nature of real-world networks as well as network Sampling Therefore, it becomes critical to devise effective means for sound estimation of net work topological and dynamical properties will le avoiding these types of artifacts. In the current work, an algorithm for minimization of border effects is proposed and discussed, and its potential IS Illustrated with respect to two real-world networks. namely bone canals and air transportation (C) 2009 Elsevier B.V. All rights reserved.
Resumo:
Two targets, reverse transcriptase (RT) and protease from HIV-1, were used during the past two decades to the discovery of non-nucleoside reverse transcriptase inhibitors (NNRTI) and protease inhibitors (PI) that belong to the arsenal of the antiretroviral therapy. Herein these enzymes were chosen as templates for conducting a computer-aided ligand design. Ligand and structure-based drug designs were the starting points to select compounds from a database bearing more than five million compounds by means of cheminformatic tools. New promising lead structures are retrieved from the database, which are open to acquisition and test. Classes of molecules already described as NNRTI or PI in the literature also came out and were useful to prove the reliability of the workflow, and thus validating the work carried out so far. (c) 2007 Elsevier Masson SAS. All rights reserved.
Resumo:
This paper proposes an improved voice activity detection (VAD) algorithm using wavelet and support vector machine (SVM) for European Telecommunication Standards Institution (ETS1) adaptive multi-rate (AMR) narrow-band (NB) and wide-band (WB) speech codecs. First, based on the wavelet transform, the original IIR filter bank and pitch/tone detector are implemented, respectively, via the wavelet filter bank and the wavelet-based pitch/tone detection algorithm. The wavelet filter bank can divide input speech signal into several frequency bands so that the signal power level at each sub-band can be calculated. In addition, the background noise level can be estimated in each sub-band by using the wavelet de-noising method. The wavelet filter bank is also derived to detect correlated complex signals like music. Then the proposed algorithm can apply SVM to train an optimized non-linear VAD decision rule involving the sub-band power, noise level, pitch period, tone flag, and complex signals warning flag of input speech signals. By the use of the trained SVM, the proposed VAD algorithm can produce more accurate detection results. Various experimental results carried out from the Aurora speech database with different noise conditions show that the proposed algorithm gives considerable VAD performances superior to the AMR-NB VAD Options 1 and 2, and AMR-WB VAD. (C) 2009 Elsevier Ltd. All rights reserved.
Resumo:
The focus of study in this paper is the class of packing problems. More specifically, it deals with the placement of a set of N circular items of unitary radius inside an object with the aim of minimizing its dimensions. Differently shaped containers are considered, namely circles, squares, rectangles, strips and triangles. By means of the resolution of non-linear equations systems through the Newton-Raphson method, the herein presented algorithm succeeds in improving the accuracy of previous results attained by continuous optimization approaches up to numerical machine precision. The computer implementation and the data sets are available at http://www.ime.usp.br/similar to egbirgin/packing/. (C) 2009 Elsevier Ltd, All rights reserved.
Resumo:
This paper presents the formulation of a combinatorial optimization problem with the following characteristics: (i) the search space is the power set of a finite set structured as a Boolean lattice; (ii) the cost function forms a U-shaped curve when applied to any lattice chain. This formulation applies for feature selection in the context of pattern recognition. The known approaches for this problem are branch-and-bound algorithms and heuristics that explore partially the search space. Branch-and-bound algorithms are equivalent to the full search, while heuristics are not. This paper presents a branch-and-bound algorithm that differs from the others known by exploring the lattice structure and the U-shaped chain curves of the search space. The main contribution of this paper is the architecture of this algorithm that is based on the representation and exploration of the search space by new lattice properties proven here. Several experiments, with well known public data, indicate the superiority of the proposed method to the sequential floating forward selection (SFFS), which is a popular heuristic that gives good results in very short computational time. In all experiments, the proposed method got better or equal results in similar or even smaller computational time. (C) 2009 Elsevier Ltd. All rights reserved.
Resumo:
Large-scale simulations of parts of the brain using detailed neuronal models to improve our understanding of brain functions are becoming a reality with the usage of supercomputers and large clusters. However, the high acquisition and maintenance cost of these computers, including the physical space, air conditioning, and electrical power, limits the number of simulations of this kind that scientists can perform. Modern commodity graphical cards, based on the CUDA platform, contain graphical processing units (GPUs) composed of hundreds of processors that can simultaneously execute thousands of threads and thus constitute a low-cost solution for many high-performance computing applications. In this work, we present a CUDA algorithm that enables the execution, on multiple GPUs, of simulations of large-scale networks composed of biologically realistic Hodgkin-Huxley neurons. The algorithm represents each neuron as a CUDA thread, which solves the set of coupled differential equations that model each neuron. Communication among neurons located in different GPUs is coordinated by the CPU. We obtained speedups of 40 for the simulation of 200k neurons that received random external input and speedups of 9 for a network with 200k neurons and 20M neuronal connections, in a single computer with two graphic boards with two GPUs each, when compared with a modern quad-core CPU. Copyright (C) 2010 John Wiley & Sons, Ltd.
Resumo:
One of the key issues in e-learning environments is the possibility of creating and evaluating exercises. However, the lack of tools supporting the authoring and automatic checking of exercises for specifics topics (e.g., geometry) drastically reduces advantages in the use of e-learning environments on a larger scale, as usually happens in Brazil. This paper describes an algorithm, and a tool based on it, designed for the authoring and automatic checking of geometry exercises. The algorithm dynamically compares the distances between the geometric objects of the student`s solution and the template`s solution, provided by the author of the exercise. Each solution is a geometric construction which is considered a function receiving geometric objects (input) and returning other geometric objects (output). Thus, for a given problem, if we know one function (construction) that solves the problem, we can compare it to any other function to check whether they are equivalent or not. Two functions are equivalent if, and only if, they have the same output when the same input is applied. If the student`s solution is equivalent to the template`s solution, then we consider the student`s solution as a correct solution. Our software utility provides both authoring and checking tools to work directly on the Internet, together with learning management systems. These tools are implemented using the dynamic geometry software, iGeom, which has been used in a geometry course since 2004 and has a successful track record in the classroom. Empowered with these new features, iGeom simplifies teachers` tasks, solves non-trivial problems in student solutions and helps to increase student motivation by providing feedback in real time. (c) 2008 Elsevier Ltd. All rights reserved.
Resumo:
Given two strings A and B of lengths n(a) and n(b), n(a) <= n(b), respectively, the all-substrings longest common subsequence (ALCS) problem obtains, for every substring B` of B, the length of the longest string that is a subsequence of both A and B. The ALCS problem has many applications, such as finding approximate tandem repeats in strings, solving the circular alignment of two strings and finding the alignment of one string with several others that have a common substring. We present an algorithm to prepare the basic data structure for ALCS queries that takes O(n(a)n(b)) time and O(n(a) + n(b)) space. After this preparation, it is possible to build that allows any LCS length to be retrieved in constant time. Some trade-offs between the space required and a matrix of size O(n(b)(2)) the querying time are discussed. To our knowledge, this is the first algorithm in the literature for the ALCS problem. (C) 2007 Elsevier B.V. All rights reserved.
Resumo:
The bioactivity-guided fractionation of the crude extracts from leaves of Brazilian species Piper aduncum and Piper hostmannianum by means of bioautography using the fungi Cladosporium cladosporioides and C. sphaerospermum afforded prenylated methyl benzoate, chromenes, and dihydrobenzopyran derivatives as antifungal compounds. The isolation and structural elucidation of a new compound methyl 4-hydroxy-3-(2`-hydroperoxy-3`-methyl-3`-butenyl) benzoate were performed by application of chromatographic techniques and spectroscopic analyses. (C) 2009 Phytochemical Society of Europe. Published by Elsevier B.V. All rights reserved.