23 results for "Processing wikipedia data"

in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)


Relevance:

90.00%

Publisher:

Abstract:

Most multidimensional projection techniques rely on distance (dissimilarity) information between data instances to embed high-dimensional data into a visual space. When data are endowed with Cartesian coordinates, an extra computational effort is necessary to compute the needed distances, making multidimensional projection prohibitive in applications dealing with interactivity and massive data. The novel multidimensional projection technique proposed in this work, called Part-Linear Multidimensional Projection (PLMP), has been tailored to handle multivariate data represented in Cartesian high-dimensional spaces, requiring only distance information between pairs of representative samples. This characteristic renders PLMP faster than previous methods when processing large data sets while still being competitive in terms of precision. Moreover, knowing the range of variation for data instances in the high-dimensional space, we can make PLMP a truly streaming data projection technique, a trait absent in previous methods.
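The part-linear idea can be illustrated with a short sketch (not the authors' implementation): a handful of representative samples is embedded with a costly distance-based method (classical MDS here), and a linear map fitted by least squares then projects all remaining points. All function and variable names below are mine.

```python
import numpy as np

def plmp_like_projection(X, n_samples=50, seed=0):
    """Part-linear projection sketch: embed a few representative
    samples with classical MDS, then fit a linear map by least
    squares and apply it to the whole data set."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(n_samples, len(X)), replace=False)
    S = X[idx]                                # representative samples

    # Classical MDS on the samples only (the costly, distance-based step).
    D2 = np.square(np.linalg.norm(S[:, None] - S[None, :], axis=-1))
    J = np.eye(len(S)) - 1.0 / len(S)
    B = -0.5 * J @ D2 @ J                     # double-centred Gram matrix
    w, V = np.linalg.eigh(B)
    Y2 = V[:, -2:] * np.sqrt(np.maximum(w[-2:], 0))  # 2-D sample layout

    # Least-squares linear map from R^d to R^2, applied to every point.
    P, *_ = np.linalg.lstsq(S, Y2, rcond=None)
    return X @ P

X = np.random.default_rng(1).normal(size=(500, 10))
Y = plmp_like_projection(X)
print(Y.shape)  # (500, 2)
```

Because the distance computation touches only the samples, the per-point cost of projecting the full data set is a single matrix-vector product, which is what makes the streaming variant mentioned above plausible.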

Relevance:

80.00%

Publisher:

Abstract:

Genetic variation provides a basis upon which populations can be genetically improved. Management of animal genetic resources in order to minimize the loss of genetic diversity, both within and across breeds, has recently received attention at different levels, e.g., the breed, national and international levels. A major need for sustainable improvement and conservation programs is accurate estimates of population parameters, such as the rate of inbreeding and the effective population size. A software system (POPREP) is presented that automatically generates a typeset report. Key parameters for population management, such as age structure, generation interval, variance in family size, rate of inbreeding, and effective population size, form the core part of this report. The report includes a default text that describes the definition, computation and meaning of the various parameters. The report is summarized in two PDF files, named the Population Structure and Pedigree Analysis Reports. In addition, results (e.g., individual inbreeding coefficients, rate of inbreeding and effective population size) are stored in comma-separated-values files that are available for further processing. Pedigree data from eight livestock breeds from different species and countries were used to describe the potential of POPREP and to highlight areas for further research.
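Two of the key parameters named above have standard textbook forms; a minimal sketch (not POPREP's code) of the per-generation rate of inbreeding and the effective population size derived from it:

```python
def rate_of_inbreeding(f_prev, f_curr):
    """Per-generation rate of inbreeding:
    dF = (F_t - F_{t-1}) / (1 - F_{t-1})."""
    return (f_curr - f_prev) / (1.0 - f_prev)

def effective_population_size(delta_f):
    """Effective population size: Ne = 1 / (2 * dF)."""
    return 1.0 / (2.0 * delta_f)

# Mean inbreeding coefficient rising from 0.02 to 0.03 in one generation.
dF = rate_of_inbreeding(0.02, 0.03)
Ne = effective_population_size(dF)
print(round(dF, 4), round(Ne, 1))  # 0.0102 49.0
```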

Relevance:

80.00%

Publisher:

Abstract:

Optical monitoring systems are necessary to manufacture multilayer thin-film optical filters with low tolerance on the spectrum specification. Furthermore, direct monitoring is a must for better accuracy in the measurement of film thickness. Direct monitoring means acquiring spectrum data, in real time, from the optical component undergoing the film deposition itself. The high-vacuum evaporator chamber is the most popular equipment for depositing films on the surfaces of optical components. Inside the evaporator, at the top of the chamber, there is a metallic support with several holes in which the optical components are mounted. This support rotates to promote film homogenization. To measure the spectrum of the film being deposited, a light beam must pass through a glass witness undergoing the same deposition process, and a sample of the beam is collected by a spectrometer. As both the light source and the light collector are stationary, a synchronization system is required to identify the moment at which the optical component passes through the light beam.

Relevance:

80.00%

Publisher:

Abstract:

Objective: To assess time trends in the contribution of processed foods to food purchases made by Brazilian households and to explore the potential impact on the overall quality of the diet. Design: Application of a new classification of foodstuffs, based on the extent and purpose of food processing, to data collected by comparable probabilistic household budget surveys. The classification assigns foodstuffs to the following groups: unprocessed/minimally processed foods (Group 1); processed culinary ingredients (Group 2); or ultra-processed ready-to-eat or ready-to-heat food products (Group 3). Setting: Eleven metropolitan areas of Brazil. Subjects: Households; n 13 611 in 1987-8, n 16 014 in 1995-6 and n 13 848 in 2002-3. Results: Over the last three decades, the household consumption of Group 1 and Group 2 foods has been steadily replaced by consumption of Group 3 ultra-processed food products, both overall and in lower- and upper-income groups. In the 2002-3 survey, Group 3 items represented more than one-quarter of total energy (more than one-third for higher-income households). The overall nutrient profile of Group 3 items, compared with that of Group 1 and Group 2 items, revealed more added sugar, more saturated fat, more sodium, less fibre and much higher energy density. Conclusions: The high energy density and the unfavourable nutrient profile of Group 3 food products, and also their potential harmful effects on eating and drinking behaviours, indicate that governments and health authorities should use all possible methods, including legislation and statutory regulation, to halt and reverse the replacement of minimally processed foods and processed culinary ingredients by ultra-processed food products.

Relevance:

40.00%

Publisher:

Abstract:

Due to the imprecise nature of biological experiments, biological data are often characterized by the presence of redundant and noisy instances. This may be due to errors that occur during data collection, such as contamination of laboratory samples. This is the case for gene expression data, where the equipment and tools currently used frequently produce noisy measurements. Machine Learning algorithms have been successfully used in gene expression data analysis. Although many Machine Learning algorithms can deal with noise, detecting and removing noisy instances from the training data set can help the induction of the target hypothesis. This paper evaluates the use of distance-based pre-processing techniques for noise detection in gene expression data classification problems. The evaluation analyzes how effectively the investigated techniques remove noisy data, measured by the accuracy obtained by different Machine Learning classifiers on the pre-processed data.
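A minimal sketch of one classical distance-based noise filter (edited nearest neighbour); the paper evaluates several such techniques, and this toy version is meant only to show the idea:

```python
import numpy as np

def remove_noisy_instances(X, y, k=3):
    """Distance-based noise filter sketch (edited nearest neighbour):
    drop every instance whose k nearest neighbours mostly carry a
    different class label."""
    X, y = np.asarray(X, float), np.asarray(y)
    keep = []
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                         # exclude the point itself
        nn = np.argsort(d)[:k]                # indices of k nearest neighbours
        if np.sum(y[nn] == y[i]) >= (k + 1) // 2:
            keep.append(i)
    return X[keep], y[keep]

# Two clean clusters plus one mislabelled point inside the first cluster.
X = [[0, 0], [0.1, 0], [0, 0.1], [5, 5], [5.1, 5], [0.05, 0.05]]
y = [0, 0, 0, 1, 1, 1]                        # last label is noise
Xc, yc = remove_noisy_instances(X, y, k=3)
print(len(yc))  # 5
```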

Relevance:

40.00%

Publisher:

Abstract:

Maltose-binding protein is the periplasmic component of the ABC transporter responsible for the uptake of maltose/maltodextrins. The Xanthomonas axonopodis pv. citri maltose-binding protein MalE has been crystallized at 293 K using the hanging-drop vapour-diffusion method. The crystal belonged to the primitive hexagonal space group P6(1)22, with unit-cell parameters a = b = 123.59, c = 304.20 angstrom, and contained two molecules in the asymmetric unit. It diffracted to 2.24 angstrom resolution.

Relevance:

30.00%

Publisher:

Abstract:

Geographic Data Warehouses (GDW) are one of the main technologies used in decision-making processes and spatial analysis, and the literature proposes several conceptual and logical data models for GDW. However, little effort has been focused on studying how spatial data redundancy affects SOLAP (Spatial On-Line Analytical Processing) query performance over GDW. In this paper, we investigate this issue. First, we compare redundant and non-redundant GDW schemas and conclude that redundancy is associated with high performance losses. We then analyze indexing, aiming to improve SOLAP query performance on a redundant GDW. Comparisons of the SB-index approach, the star-join aided by R-tree and the star-join aided by GiST indicate that the SB-index significantly reduces query elapsed time, by 25% up to 99%, for SOLAP queries defined over the spatial predicates of intersection, enclosure and containment and applied to roll-up and drill-down operations. We also investigate the impact of increasing data volume on performance; the increase did not impair the SB-index, which still greatly reduced query elapsed time. Performance tests also show that the SB-index is far more compact than the star-join, requiring at most 0.20% of its volume. Moreover, we propose a specific enhancement of the SB-index to deal with spatial data redundancy. This enhancement improved performance by 80% to 91% for redundant GDW schemas.

Relevance:

30.00%

Publisher:

Abstract:

This paper describes a new food classification which assigns foodstuffs according to the extent and purpose of the industrial processing applied to them. Three main groups are defined: unprocessed or minimally processed foods (group 1), processed culinary and food industry ingredients (group 2), and ultra-processed food products (group 3). The use of this classification is illustrated by applying it to data collected in the Brazilian Household Budget Survey, which was conducted in 2002/2003 on a probabilistic sample of 48,470 Brazilian households. The average daily food availability was 1,792 kcal/person, of which 42.5% came from group 1 (mostly rice, beans, meat and milk), 37.5% from group 2 (mostly vegetable oils, sugar, and flours), and 20% from group 3 (mostly breads, biscuits, sweets, soft drinks, and sausages). The share of group 3 foods increased with income, and represented almost one third of all calories in higher-income households. The impact of the replacement of group 1 foods and group 2 ingredients by group 3 products on the overall quality of the diet, eating patterns and health is discussed.
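The reported energy shares follow directly from item-level availability; the per-item kcal figures below are hypothetical, chosen only so the totals match the averages quoted in the abstract (1,792 kcal/person; 42.5% / 37.5% / 20%):

```python
# Hypothetical daily availability (kcal/person, group) per food item.
items = {
    "rice": (400, 1), "beans": (200, 1), "meat": (90, 1), "milk": (72, 1),
    "vegetable oil": (350, 2), "sugar": (222, 2), "flour": (100, 2),
    "bread": (180, 3), "biscuits": (100, 3), "soft drinks": (78, 3),
}

total = sum(kcal for kcal, _ in items.values())
by_group = {g: 0.0 for g in (1, 2, 3)}
for kcal, group in items.values():
    by_group[group] += kcal
shares = {g: round(100 * v / total, 1) for g, v in by_group.items()}
print(total, shares)  # 1792 {1: 42.5, 2: 37.5, 3: 20.0}
```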

Relevance:

30.00%

Publisher:

Abstract:

Today, several different unsupervised classification algorithms are commonly used to cluster similar patterns in a data set based only on its statistical properties. Especially in image data applications, self-organizing methods for unsupervised classification have been successfully applied to cluster pixels or groups of pixels in order to perform segmentation tasks. The first important contribution of this paper is the development of a self-organizing method for data classification, named Enhanced Independent Component Analysis Mixture Model (EICAMM), built by proposing modifications to the Independent Component Analysis Mixture Model (ICAMM). The improvements were proposed by considering some of the model's limitations and by analyzing how it could be made more efficient. Moreover, a pre-processing methodology is also proposed, based on combining Sparse Code Shrinkage (SCS) for image denoising with the Sobel edge detector. In the experiments of this work, EICAMM and other self-organizing models were applied to segment images in their original and pre-processed versions. A comparative analysis showed satisfactory and competitive image segmentation results for the proposals presented herein. (C) 2008 Published by Elsevier B.V.
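The Sobel step of the pre-processing can be sketched directly; this is a plain textbook Sobel gradient magnitude, not the paper's full SCS + Sobel pipeline:

```python
import numpy as np

def sobel_edges(img):
    """Sobel edge-magnitude sketch: convolve (valid mode) with the
    horizontal and vertical Sobel kernels and return the gradient
    magnitude."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(kx * patch)
            gy[i, j] = np.sum(ky * patch)
    return np.hypot(gx, gy)

# A vertical step edge: the response peaks along the discontinuity.
img = np.zeros((5, 6))
img[:, 3:] = 1.0
mag = sobel_edges(img)
print(mag.shape)  # (3, 4)
```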

Relevance:

30.00%

Publisher:

Abstract:

Hot tensile and creep tests were carried out on Kanthal A1 alloy in the temperature range from 600 to 800 degrees C. Each of these sets of data was analyzed separately according to its own methodology, but an attempt was made to find a correlation between them. A new criterion proposed for converting hot tensile data to creep data makes it possible to analyze the two kinds of results according to the usual creep relations, such as those of Norton, Monkman-Grant, Larson-Miller and others. The remarkable compatibility verified between both sets of data by this procedure strongly suggests that hot tensile data can be converted to creep data and vice versa for Kanthal A1 alloy, as verified previously for other metallic materials.
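The Larson-Miller relation mentioned above has the standard form LMP = T(C + log10 t_r), with T in kelvin, t_r the rupture time in hours, and C commonly taken as 20. A small worked example (the numbers are mine, not the paper's data):

```python
import math

def larson_miller(T_kelvin, t_rupture_h, C=20.0):
    """Larson-Miller parameter: LMP = T * (C + log10(t_r))."""
    return T_kelvin * (C + math.log10(t_rupture_h))

# 1000 h rupture life at 600 degrees C (873.15 K).
lmp = larson_miller(873.15, 1000.0)

# Rupture time giving the same LMP at 800 degrees C (1073.15 K):
# invert the relation, t_r = 10 ** (LMP / T - C).
t_equiv = 10 ** (lmp / 1073.15 - 20.0)
print(round(lmp, 1), round(t_equiv, 3))
```

The same parameter value maps a long test at low temperature onto a much shorter equivalent test at high temperature, which is the usual way such time-temperature correlations are exploited.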

Relevance:

30.00%

Publisher:

Abstract:

This work proposes a method based on both pre-processing and data mining with the objective of identifying harmonic current sources in residential consumers. In addition, the methodology can also be applied to identify linear and nonlinear loads. It should be emphasized that the entire database was obtained through laboratory assays, i.e., real data were acquired from residential loads. The residential system created in the laboratory was fed by a configurable power source, and the loads and the power quality analyzers were placed at its output (all measurements were stored in a microcomputer). The data were then submitted to pre-processing based on attribute selection techniques, in order to reduce the complexity of identifying the loads. A new database was generated keeping only the selected attributes, and Artificial Neural Networks were trained to identify the loads. To validate the proposed methodology, the loads were fed both under ideal conditions (without harmonics) and by harmonic voltages within pre-established limits. These limits are in accordance with IEEE Std. 519-1992 and PRODIST (the energy delivery procedures employed by Brazilian utilities). The results validate the proposed methodology and furnish a method that can serve as an alternative to conventional methods.
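A common way to build harmonic features for such a classifier is an FFT over an integer number of fundamental cycles; a self-contained sketch (the sampling rate and waveform are illustrative, not taken from the paper's database):

```python
import numpy as np

def harmonic_amplitudes(current, fs, f0=60.0, n_harmonics=5):
    """FFT-based feature sketch: amplitude of the fundamental and of
    the first few harmonics of a sampled current waveform."""
    n = len(current)
    spectrum = np.abs(np.fft.rfft(current)) * 2.0 / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    amps = []
    for h in range(1, n_harmonics + 1):
        k = np.argmin(np.abs(freqs - h * f0))  # nearest FFT bin
        amps.append(spectrum[k])
    return amps

fs = 3840.0                                    # 64 samples per 60 Hz cycle
t = np.arange(0, 1.0, 1.0 / fs)                # one-second window
i_t = 10 * np.sin(2 * np.pi * 60 * t) + 3 * np.sin(2 * np.pi * 180 * t)
amps = harmonic_amplitudes(i_t, fs)
print([round(a, 2) for a in amps])  # ≈ [10.0, 0.0, 3.0, 0.0, 0.0]
```

With a window that spans whole cycles, each harmonic falls exactly on an FFT bin, so the recovered amplitudes (10 A fundamental, 3 A third harmonic) are clean features for a neural network.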

Relevance:

30.00%

Publisher:

Abstract:

We hypothesized that the processing of auditory information by the perisylvian polymicrogyric cortex may differ from that of normal cortex. To characterize auditory processing in bilateral perisylvian syndrome, we examined ten patients with perisylvian polymicrogyria (Group I) and seven control children (Group II). Group I was composed of four children with bilateral perisylvian polymicrogyria and six children with bilateral posterior perisylvian polymicrogyria. The evaluation included neurological and neuroimaging investigation, intellectual quotient and audiological assessment (audiometry and behavioral auditory tests). The results revealed a statistically significant difference between the groups in the behavioral auditory tests, such as the dichotic digits test, the nonverbal dichotic test (specifically in right attention), and the random gap detection/random gap detection expanded tests. Our data showed abnormalities in the auditory processing of children with perisylvian polymicrogyria, suggesting that the perisylvian polymicrogyric cortex is functionally abnormal. We also found a correlation between the severity of the auditory findings and the extent of the cortical abnormality. (C) 2009 Elsevier B.V. All rights reserved.

Relevance:

30.00%

Publisher:

Abstract:

The identification, modeling, and analysis of interactions between nodes of neural systems in the human brain have become the focus of many studies in neuroscience. The complex neural network structure and its correlations with brain functions have played a role in all areas of neuroscience, including the comprehension of cognitive and emotional processing. Indeed, understanding how information is stored, retrieved, processed, and transmitted is one of the ultimate challenges in brain research. In this context, in functional neuroimaging, connectivity analysis is a major tool for the exploration and characterization of the information flow between specialized brain regions. In most functional magnetic resonance imaging (fMRI) studies, connectivity analysis is carried out by first selecting regions of interest (ROI) and then calculating an average BOLD time series (across the voxels in each cluster). Some studies have shown that the average may not be a good choice and have suggested, as an alternative, the use of principal component analysis (PCA) to extract the principal eigen-time series from the ROI(s). In this paper, we introduce a novel approach called cluster Granger analysis (CGA) to study connectivity between ROIs. The main aim of this method is to employ multiple eigen-time series in each ROI to avoid temporal information loss during identification of Granger causality. Such information loss is inherent in averaging (e.g., to yield a single "representative" time series per ROI). This, in turn, may lead to a lack of power in detecting connections. The proposed approach is based on multivariate statistical analysis and integrates PCA and partial canonical correlation in a framework of Granger causality for clusters (sets) of time series. We also describe an algorithm for statistical significance testing based on bootstrapping.
By using Monte Carlo simulations, we show that the proposed approach outperforms conventional Granger causality analysis (i.e., using representative time series extracted by signal averaging or first principal components estimation from ROIs). The usefulness of the CGA approach in real fMRI data is illustrated in an experiment using human faces expressing emotions. With this data set, the proposed approach suggested the presence of significantly more connections between the ROIs than were detected using a single representative time series in each ROI. (c) 2010 Elsevier Inc. All rights reserved.
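The core notion of Granger causality that CGA builds on can be shown in a minimal bivariate, lag-1 sketch (ordinary least squares on synthetic data; CGA itself operates on clusters of eigen-time series, which this toy version does not attempt):

```python
import numpy as np

def granger_improvement(x, y, lag=1):
    """Lag-1 Granger sketch: how much do past values of x reduce the
    residual variance of an autoregressive model of y?  Returns a
    value in [0, 1]; near 0 means x adds no predictive information."""
    Y = y[lag:]
    own = np.column_stack([y[:-lag], np.ones(len(Y))])
    full = np.column_stack([y[:-lag], x[:-lag], np.ones(len(Y))])
    r_own = Y - own @ np.linalg.lstsq(own, Y, rcond=None)[0]
    r_full = Y - full @ np.linalg.lstsq(full, Y, rcond=None)[0]
    return 1.0 - np.var(r_full) / np.var(r_own)

# Synthetic pair: y is driven by the past of x, but not vice versa.
rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y = np.empty_like(x)
y[0] = 0.0
for t in range(1, len(x)):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.normal()

print(granger_improvement(x, y) > 0.5)  # x strongly Granger-causes y
print(granger_improvement(y, x) < 0.1)  # but not the reverse
```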

Relevance:

30.00%

Publisher:

Abstract:

We assess the effects of chemical processing, ethylene oxide sterilization, and threading on the surface and mechanical properties of bovine undecalcified bone screws. In addition, we evaluate the possibility of manufacturing bone screws with predefined dimensions. Scanning electron microscopy images show that chemical processing and ethylene oxide treatment cause collagen fiber amalgamation on the bone surface. Processed screws withstand higher ultimate loads under bending and torsion than the in natura bone group, with no change in pull-out strength between groups. Threading significantly reduces deformation and bone strength under torsion. Metrological data demonstrate the possibility of manufacturing bone screws with standardized dimensions.

Relevance:

30.00%

Publisher:

Abstract:

The purpose of this study was to evaluate ex vivo the accuracy of an electronic apex locator during root canal length determination in primary molars. Methods: One calibrated examiner determined the root canal length in 15 primary molars (total = 34 root canals) with different stages of root resorption. Root canal length was measured both visually, with the placement of a K-file 1 mm short of the apical foramen or the apical resorption bevel, and electronically using an electronic apex locator (Digital Signal Processing). Data were analyzed statistically using the intraclass correlation (ICC) test. Results: Comparing the actual and electronic root canal length measurements in the primary teeth showed a high correlation (ICC = 0.95). Conclusions: The Digital Signal Processing apex locator is useful and accurate for apical foramen location during root canal length measurement in primary molars. (Pediatr Dent 2009;37:320-2) Received April 15, 2008 | Last Revision August 21, 2008 | Revision Accepted August 22, 2008
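The ICC statistic reported above has a textbook one-way random-effects form; a small sketch with hypothetical paired measurements (illustrative values, not the study's data):

```python
import numpy as np

def icc_oneway(measurements):
    """One-way random-effects ICC(1,1) sketch for an n-subjects by
    k-raters table: ICC = (MSB - MSW) / (MSB + (k - 1) * MSW)."""
    m = np.asarray(measurements, float)
    n, k = m.shape
    grand = m.mean()
    ssb = k * np.sum((m.mean(axis=1) - grand) ** 2)     # between subjects
    ssw = np.sum((m - m.mean(axis=1, keepdims=True)) ** 2)  # within subjects
    msb = ssb / (n - 1)
    msw = ssw / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Hypothetical visual vs. electronic canal lengths (mm) for five canals.
data = [[14.0, 14.1], [12.5, 12.4], [15.0, 15.1], [13.2, 13.2], [11.8, 11.9]]
print(icc_oneway(data) > 0.9)  # high agreement between the two methods
```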