39 resultados para Data processing Computer science
em Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Resumo:
Maltose-binding protein is the periplasmic component of the ABC transporter responsible for the uptake of maltose/maltodextrins. The Xanthomonas axonopodis pv. citri maltose-binding protein MalE has been crystallized at 293 Kusing the hanging-drop vapour-diffusion method. The crystal belonged to the primitive hexagonal space group P6(1)22, with unit-cell parameters a = 123.59, b = 123.59, c = 304.20 angstrom, and contained two molecules in the asymetric unit. It diffracted to 2.24 angstrom resolution.
Resumo:
The zero-inflated negative binomial model is used to account for overdispersion detected in data that are initially analyzed under the zero-Inflated Poisson model A frequentist analysis a jackknife estimator and a non-parametric bootstrap for parameter estimation of zero-inflated negative binomial regression models are considered In addition an EM-type algorithm is developed for performing maximum likelihood estimation Then the appropriate matrices for assessing local influence on the parameter estimates under different perturbation schemes and some ways to perform global influence analysis are derived In order to study departures from the error assumption as well as the presence of outliers residual analysis based on the standardized Pearson residuals is discussed The relevance of the approach is illustrated with a real data set where It is shown that zero-inflated negative binomial regression models seems to fit the data better than the Poisson counterpart (C) 2010 Elsevier B V All rights reserved
Resumo:
This paper presents an approach for assisting low-literacy readers in accessing Web online information. The oEducational FACILITAo tool is a Web content adaptation tool that provides innovative features and follows more intuitive interaction models regarding accessibility concerns. Especially, we propose an interaction model and a Web application that explore the natural language processing tasks of lexical elaboration and named entity labeling for improving Web accessibility. We report on the results obtained from a pilot study on usability analysis carried out with low-literacy users. The preliminary results show that oEducational FACILITAo improves the comprehension of text elements, although the assistance mechanisms might also confuse users when word sense ambiguity is introduced, by gathering, for a complex word, a list of synonyms with multiple meanings. This fact evokes a future solution in which the correct sense for a complex word in a sentence is identified, solving this pervasive characteristic of natural languages. The pilot study also identified that experienced computer users find the tool to be more useful than novice computer users do.
Resumo:
The amount of textual information digitally stored is growing every day. However, our capability of processing and analyzing that information is not growing at the same pace. To overcome this limitation, it is important to develop semiautomatic processes to extract relevant knowledge from textual information, such as the text mining process. One of the main and most expensive stages of the text mining process is the text pre-processing stage, where the unstructured text should be transformed to structured format such as an attribute-value table. The stemming process, i.e. linguistics normalization, is usually used to find the attributes of this table. However, the stemming process is strongly dependent on the language in which the original textual information is given. Furthermore, for most languages, the stemming algorithms proposed in the literature are computationally expensive. In this work, several improvements of the well know Porter stemming algorithm for the Portuguese language, which explore the characteristics of this language, are proposed. Experimental results show that the proposed algorithm executes in far less time without affecting the quality of the generated stems.
Resumo:
Today several different unsupervised classification algorithms are commonly used to cluster similar patterns in a data set based only on its statistical properties. Specially in image data applications, self-organizing methods for unsupervised classification have been successfully applied for clustering pixels or group of pixels in order to perform segmentation tasks. The first important contribution of this paper refers to the development of a self-organizing method for data classification, named Enhanced Independent Component Analysis Mixture Model (EICAMM), which was built by proposing some modifications in the Independent Component Analysis Mixture Model (ICAMM). Such improvements were proposed by considering some of the model limitations as well as by analyzing how it should be improved in order to become more efficient. Moreover, a pre-processing methodology was also proposed, which is based on combining the Sparse Code Shrinkage (SCS) for image denoising and the Sobel edge detector. In the experiments of this work, the EICAMM and other self-organizing models were applied for segmenting images in their original and pre-processed versions. A comparative analysis showed satisfactory and competitive image segmentation results obtained by the proposals presented herein. (C) 2008 Published by Elsevier B.V.
Resumo:
This work proposes a method based on both preprocessing and data mining with the objective of identify harmonic current sources in residential consumers. In addition, this methodology can also be applied to identify linear and nonlinear loads. It should be emphasized that the entire database was obtained through laboratory essays, i.e., real data were acquired from residential loads. Thus, the residential system created in laboratory was fed by a configurable power source and in its output were placed the loads and the power quality analyzers (all measurements were stored in a microcomputer). So, the data were submitted to pre-processing, which was based on attribute selection techniques in order to minimize the complexity in identifying the loads. A newer database was generated maintaining only the attributes selected, thus, Artificial Neural Networks were trained to realized the identification of loads. In order to validate the methodology proposed, the loads were fed both under ideal conditions (without harmonics), but also by harmonic voltages within limits pre-established. These limits are in accordance with IEEE Std. 519-1992 and PRODIST (procedures to delivery energy employed by Brazilian`s utilities). The results obtained seek to validate the methodology proposed and furnish a method that can serve as alternative to conventional methods.
Resumo:
Searching in a dataset for elements that are similar to a given query element is a core problem in applications that manage complex data, and has been aided by metric access methods (MAMs). A growing number of applications require indices that must be built faster and repeatedly, also providing faster response for similarity queries. The increase in the main memory capacity and its lowering costs also motivate using memory-based MAMs. In this paper. we propose the Onion-tree, a new and robust dynamic memory-based MAM that slices the metric space into disjoint subspaces to provide quick indexing of complex data. It introduces three major characteristics: (i) a partitioning method that controls the number of disjoint subspaces generated at each node; (ii) a replacement technique that can change the leaf node pivots in insertion operations; and (iii) range and k-NN extended query algorithms to support the new partitioning method, including a new visit order of the subspaces in k-NN queries. Performance tests with both real-world and synthetic datasets showed that the Onion-tree is very compact. Comparisons of the Onion-tree with the MM-tree and a memory-based version of the Slim-tree showed that the Onion-tree was always faster to build the index. The experiments also showed that the Onion-tree significantly improved range and k-NN query processing performance and was the most efficient MAM, followed by the MM-tree, which in turn outperformed the Slim-tree in almost all the tests. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
Most multidimensional projection techniques rely on distance (dissimilarity) information between data instances to embed high-dimensional data into a visual space. When data are endowed with Cartesian coordinates, an extra computational effort is necessary to compute the needed distances, making multidimensional projection prohibitive in applications dealing with interactivity and massive data. The novel multidimensional projection technique proposed in this work, called Part-Linear Multidimensional Projection (PLMP), has been tailored to handle multivariate data represented in Cartesian high-dimensional spaces, requiring only distance information between pairs of representative samples. This characteristic renders PLMP faster than previous methods when processing large data sets while still being competitive in terms of precision. Moreover, knowing the range of variation for data instances in the high-dimensional space, we can make PLMP a truly streaming data projection technique, a trait absent in previous methods.
Resumo:
The goal of this paper is to study and propose a new technique for noise reduction used during the reconstruction of speech signals, particularly for biomedical applications. The proposed method is based on Kalman filtering in the time domain combined with spectral subtraction. Comparison with discrete Kalman filter in the frequency domain shows better performance of the proposed technique. The performance is evaluated by using the segmental signal-to-noise ratio and the Itakura-Saito`s distance. Results have shown that Kalman`s filter in time combined with spectral subtraction is more robust and efficient, improving the Itakura-Saito`s distance by up to four times. (C) 2007 Elsevier Ltd. All rights reserved.
Resumo:
Most post-processors for boundary element (BE) analysis use an auxiliary domain mesh to display domain results, working against the profitable modelling process of a pure boundary discretization. This paper introduces a novel visualization technique which preserves the basic properties of the boundary element methods. The proposed algorithm does not require any domain discretization and is based on the direct and automatic identification of isolines. Another critical aspect of the visualization of domain results in BE analysis is the effort required to evaluate results in interior points. In order to tackle this issue, the present article also provides a comparison between the performance of two different BE formulations (conventional and hybrid). In addition, this paper presents an overview of the most common post-processing and visualization techniques in BE analysis, such as the classical algorithms of scan line and the interpolation over a domain discretization. The results presented herein show that the proposed algorithm offers a very high performance compared with other visualization procedures.
Resumo:
In this work, an algorithm to compute the envelope of non-destructive testing (NDT) signals is proposed. This method allows increasing the speed and reducing the memory in extensive data processing. Also, this procedure presents advantage of preserving the data information for physical modeling applications of time-dependent measurements. The algorithm is conceived to be applied for analyze data from non-destructive testing. The comparison between different envelope methods and the proposed method, applied to Magnetic Bark Signal (MBN), is studied. (C) 2010 Elsevier Ltd. All rights reserved.
Resumo:
In this paper, processing methods of Fourier optics implemented in a digital holographic microscopy system are presented. The proposed methodology is based on the possibility of the digital holography in carrying out the whole reconstruction of the recorded wave front and consequently, the determination of the phase and intensity distribution in any arbitrary plane located between the object and the recording plane. In this way, in digital holographic microscopy the field produced by the objective lens can be reconstructed along its propagation, allowing the reconstruction of the back focal plane of the lens, so that the complex amplitudes of the Fraunhofer diffraction, or equivalently the Fourier transform, of the light distribution across the object can be known. The manipulation of Fourier transform plane makes possible the design of digital methods of optical processing and image analysis. The proposed method has a great practical utility and represents a powerful tool in image analysis and data processing. The theoretical aspects of the method are presented, and its validity has been demonstrated using computer generated holograms and images simulations of microscopic objects. (c) 2007 Elsevier B.V. All rights reserved.
Resumo:
The classical approach for acoustic imaging consists of beamforming, and produces the source distribution of interest convolved with the array point spread function. This convolution smears the image of interest, significantly reducing its effective resolution. Deconvolution methods have been proposed to enhance acoustic images and have produced significant improvements. Other proposals involve covariance fitting techniques, which avoid deconvolution altogether. However, in their traditional presentation, these enhanced reconstruction methods have very high computational costs, mostly because they have no means of efficiently transforming back and forth between a hypothetical image and the measured data. In this paper, we propose the Kronecker Array Transform ( KAT), a fast separable transform for array imaging applications. Under the assumption of a separable array, it enables the acceleration of imaging techniques by several orders of magnitude with respect to the fastest previously available methods, and enables the use of state-of-the-art regularized least-squares solvers. Using the KAT, one can reconstruct images with higher resolutions than was previously possible and use more accurate reconstruction techniques, opening new and exciting possibilities for acoustic imaging.
Resumo:
In this work, a system using active RFID tags to supervise truck bulk cargo is described. The tags are attached to the bodies of the trucks and readers are distributed in the cargo buildings and attached to weighs and the discharge platforms. PDAs with camera and support to a WiFi network are provided to the inspectors and access points are installed throughout the discharge area to allow effective confirmations of unload actions and the acquisition of pictures for future audit. Broadband radio equipments are used to establish efficient communication links between the weighs and cargo buildings which are usually located very far from each other in the field. A web application software was especially developed to enable robust communication between the equipments for efficient device management, data processing and reports generation to the operating personal. The system was deployed in a cargo station of a Brazilian seashore port. The obtained results prove the effectiveness of the proposed system.
Resumo:
Computer viruses are an important risk to computational systems endangering either corporations of all sizes or personal computers used for domestic applications. Here, classical epidemiological models for disease propagation are adapted to computer networks and, by using simple systems identification techniques a model called SAIC (Susceptible, Antidotal, Infectious, Contaminated) is developed. Real data about computer viruses are used to validate the model. (c) 2008 Elsevier Ltd. All rights reserved.